[Bug]: index update does not process deleted documents #1613

@mmaitre314

Do you need to file an issue?

  • I have searched the existing issues and this bug is not already filed.
  • My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
  • I believe this is a legitimate bug, not just a question. If this is a question, please use the Discussions area.

Describe the bug

Two issues related to handling deleted documents during index updates:

The code below raises a ValueError when an update contains only documents to delete and no documents to add:

# Fail on empty delta dataset
if delta_dataset.new_inputs.empty:
    error_msg = "Incremental Indexing Error: No new documents to process."
    raise ValueError(error_msg)

(related to #1600)

In the same function, only delta_dataset.new_inputs is referenced; delta_dataset.deleted_inputs is never used, so document deletion appears to be unimplemented. If that is the case, I may be able to provide a PR with some guidance.
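To illustrate the first point, here is a minimal sketch of a guard that only fails when there is nothing to add and nothing to delete. It is not the project's code: `InputDelta` and `check_delta` are hypothetical names, and plain lists stand in for the DataFrames used in the real function.

```python
from dataclasses import dataclass, field


@dataclass
class InputDelta:
    """Hypothetical stand-in for the delta dataset (lists replace DataFrames)."""

    new_inputs: list = field(default_factory=list)
    deleted_inputs: list = field(default_factory=list)


def check_delta(delta: InputDelta) -> None:
    """Raise only when the delta has neither new nor deleted documents."""
    if not delta.new_inputs and not delta.deleted_inputs:
        error_msg = "Incremental Indexing Error: No new or deleted documents to process."
        raise ValueError(error_msg)


# A delete-only delta passes the guard instead of raising.
check_delta(InputDelta(deleted_inputs=[{"id": "doc-1"}]))
```

With DataFrames, the equivalent condition would presumably test both `new_inputs.empty` and `deleted_inputs.empty`.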

Steps to reproduce

Run an index update with only documents to delete and no documents to add.

Expected Behavior

Updates containing only document deletions succeed and deleted documents are removed from the index.
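The expected behavior could be sketched roughly as follows. This is an assumption about the desired semantics, not an existing API: `apply_delta` is a hypothetical helper, and the index is reduced to a plain mapping from document ID to text. A real index would likely need the same treatment for any data derived from the deleted documents.

```python
def apply_delta(
    index_docs: dict[str, str],
    new_docs: dict[str, str],
    deleted_ids: set[str],
) -> dict[str, str]:
    """Return a copy of the index with deleted documents removed and new ones added."""
    # Drop every document whose ID is marked as deleted.
    updated = {
        doc_id: text
        for doc_id, text in index_docs.items()
        if doc_id not in deleted_ids
    }
    # Add (or overwrite) the new documents.
    updated.update(new_docs)
    return updated


# Delete-only update: no new documents, one deletion.
index = {"a": "alpha", "b": "beta"}
result = apply_delta(index, new_docs={}, deleted_ids={"b"})
print(result)  # {'a': 'alpha'}
```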

GraphRAG Config Used

N/A

Logs and screenshots

No response

Additional Information

Labels

bug, triage
