Summary
DocumentMonitor._check_documents logs a warning when an indexed file is gone from disk but never updates the DB row's indexing_status — so /api/documents continues to report the doc as complete, and downstream chat requests hand the (now-stale) RAG-cached chunks to the LLM even though the source file no longer exists.
Location
src/gaia/ui/document_monitor.py:168-176
if file_info is None:
    # File deleted or inaccessible
    if status != "missing":
        logger.warning(
            "Indexed file no longer accessible: %s (doc_id=%s)",
            filepath,
            doc_id,
        )
    continue  # ← status never changes; 'missing' branch is unreachable
The class docstring at line 58 says "File deleted → log warning (does not remove from library)". That design choice is sensible, but the code also never flips the status flag, so the status != "missing" guard above never suppresses anything: status stays complete, and the warning fires on every 30s poll forever.
Expected
Two options, either of which is consistent:
- Flip status: call self._db.update_document_status(doc_id, "missing") when file_info is None. The warning then fires only once, and /api/documents can show a "missing" badge so the user knows to re-attach or remove the document.
- Don't warn: drop the status != "missing" guard entirely and log the event with logger.debug(...) instead. The current code sits in a confused middle state.
Option 1 is more useful and matches what get_document_status / document_status already return via the indexing_status field.
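A minimal sketch of option 1, not the actual GAIA code. InMemoryDB, get_file_info and check_documents are hypothetical stand-ins written for illustration, and the update_document_status(doc_id, status) signature is assumed from the call proposed above. The only behavioral change is the status flip inside the existing guard.

```python
import logging
import os
from typing import Dict, Optional

logger = logging.getLogger("document_monitor_sketch")


class InMemoryDB:
    """Hypothetical stand-in for the real database wrapper."""

    def __init__(self) -> None:
        # doc_id -> {"filepath": ..., "indexing_status": ...}
        self.rows: Dict[str, Dict[str, str]] = {}

    def update_document_status(self, doc_id: str, status: str) -> None:
        # Assumed signature, taken from the call proposed in this issue.
        self.rows[doc_id]["indexing_status"] = status


def get_file_info(filepath: str) -> Optional[os.stat_result]:
    """Return stat info, or None if the file is deleted or inaccessible."""
    try:
        return os.stat(filepath)
    except OSError:
        return None


def check_documents(db: InMemoryDB) -> None:
    """One poll: flag vanished files as 'missing', warning only on the transition."""
    for doc_id, row in db.rows.items():
        filepath = row["filepath"]
        status = row["indexing_status"]
        if get_file_info(filepath) is None:
            if status != "missing":
                logger.warning(
                    "Indexed file no longer accessible: %s (doc_id=%s)",
                    filepath,
                    doc_id,
                )
                # The flip that the current code lacks.
                db.update_document_status(doc_id, "missing")
            continue
```

With the flip in place, the guard becomes meaningful: the first poll after deletion warns and records "missing", and later polls skip the warning. The sketch does not decide what should happen if the file reappears; that is left open here.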
Reproduce
- Upload any file via /api/documents/upload-path.
- Delete the file on disk.
- Wait >30s.
- Observe: /api/documents still shows indexing_status: "complete"; server logs fire the warning every 30s.
Impact
- Log spam (once per poll per missing file).
- Users have no UI indication that a linked doc is gone — RAG will keep returning cached chunks from a vanished file.