DocumentMonitor never transitions status to 'missing' when the underlying file vanishes #791

@kovtcharov

Summary

DocumentMonitor._check_documents logs a warning when an indexed file is gone from disk but never updates the DB row's indexing_status. As a result, /api/documents continues to report the doc as complete, and downstream chat requests hand the now-stale RAG-cached chunks to the LLM even though the source file no longer exists.

Location

src/gaia/ui/document_monitor.py:168-176

if file_info is None:
    # File deleted or inaccessible
    if status != "missing":
        logger.warning(
            "Indexed file no longer accessible: %s (doc_id=%s)",
            filepath,
            doc_id,
        )
    continue   # ← status never changes; 'missing' branch is unreachable

The class docstring at line 58 says "File deleted → log warning (does not remove from library)". That design choice is sensible, but the code also never flips the status flag, so the status != "missing" guard above is effectively dead code: the status of an indexed doc stays complete, the guard always passes, and the warning fires on every 30s poll forever.
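The always-true guard can be shown in isolation. This is a minimal sketch, not the GAIA code: poll_once, the docs dict, and the stat_file callable are hypothetical stand-ins for _check_documents, the DB rows, and whatever produces file_info.

```python
import logging

logger = logging.getLogger("document_monitor_sketch")


def poll_once(docs, stat_file):
    """Model of the current loop: warn on a missing file, never update status.

    docs maps doc_id -> {"filepath": str, "status": str} (hypothetical shape).
    stat_file returns None when the file is gone, mirroring file_info above.
    Returns the number of warnings emitted during this poll.
    """
    warnings = 0
    for doc_id, row in docs.items():
        file_info = stat_file(row["filepath"])
        if file_info is None:
            if row["status"] != "missing":
                logger.warning(
                    "Indexed file no longer accessible: %s (doc_id=%s)",
                    row["filepath"],
                    doc_id,
                )
                warnings += 1
            continue  # status is left untouched, as in the quoted code
    return warnings


docs = {"doc-1": {"filepath": "/tmp/gone.pdf", "status": "complete"}}
# Every poll sees status == "complete", so every poll warns.
warning_counts = [poll_once(docs, lambda path: None) for _ in range(3)]
```

Three simulated polls produce three warnings for the same document, and the row's status is still complete afterwards.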

Expected

Two options, either of which is consistent:

  1. Flip status: self._db.update_document_status(doc_id, "missing") when file_info is None. Warning only fires once. /api/documents can show a "missing" badge so the user knows to re-attach or remove.
  2. Don't warn: drop the status != "missing" guard entirely and log with logger.debug(...) instead. The current code is in a confused middle state.

Option 1 is more useful and matches what get_document_status / document_status already return via the indexing_status field.
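Option 1 could look roughly like the sketch below. It is an illustration under assumptions, not a patch against document_monitor.py: InMemoryDB and check_documents are hypothetical, os.path.exists stands in for however file_info is actually computed, and only the update_document_status(doc_id, "missing") call shape is taken from this issue.

```python
import logging
import os
import tempfile

logger = logging.getLogger("document_monitor_sketch")


class InMemoryDB:
    """Hypothetical stand-in for the real DB wrapper used by the monitor."""

    def __init__(self, rows):
        # doc_id -> {"filepath": str, "indexing_status": str}
        self.rows = rows

    def update_document_status(self, doc_id, status):
        self.rows[doc_id]["indexing_status"] = status


def check_documents(db):
    """Warn once per vanished file and record the 'missing' status.

    Returns the doc_ids that were newly marked as missing in this poll.
    """
    newly_missing = []
    for doc_id, row in db.rows.items():
        if os.path.exists(row["filepath"]):
            continue
        if row["indexing_status"] != "missing":
            logger.warning(
                "Indexed file no longer accessible: %s (doc_id=%s)",
                row["filepath"],
                doc_id,
            )
            # Persisting the status is what makes the guard meaningful.
            db.update_document_status(doc_id, "missing")
            newly_missing.append(doc_id)
    return newly_missing


# A path inside a fresh temp directory that was never created on disk.
missing_path = os.path.join(tempfile.mkdtemp(), "deleted.pdf")
db = InMemoryDB(
    {"doc-1": {"filepath": missing_path, "indexing_status": "complete"}}
)
first_poll = check_documents(db)
second_poll = check_documents(db)
```

The first poll warns and flips the status, and the second poll stays silent. The sketch does not cover a file that reappears later; whether that should reset the status is a separate decision for the real fix.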

Reproduce

  1. Upload any file via /api/documents/upload-path.
  2. Delete the file on disk.
  3. Wait >30s.
  4. Observe: /api/documents still shows indexing_status: "complete"; server logs fire the warning every 30s.

Impact

  • Log spam (once per poll per missing file).
  • Users have no UI indication that a linked doc is gone — RAG will keep returning cached chunks from a vanished file.

Labels: agent-ui, bug, rag
