[Bug]: File removed from the brain not listed but still exists #1336

Closed
matthieujacq opened this issue Oct 5, 2023 · 3 comments
Labels: area: backend, bug, Stale

@matthieujacq

What happened?

Quick steps to reproduce

  • Upload a large file to a brain (for example an .epub ebook)
  • Remove it right after
  • Try adding it again

What happens

The file cannot be uploaded again (error message: "Failed to upload file already exists"), even though it no longer appears in the list.

What should happen

  • either: we should be able to re-upload the document
  • or: if the deletion fails, an error message should be displayed and the file should stay in the list

Screen recording

Screen.Recording.2023-10-05.at.16.55.40.mov

Relevant log output

No response

Twitter / LinkedIn details

No response

@matthieujacq matthieujacq added the bug label Oct 5, 2023
@dosubot dosubot bot added the area: backend label Oct 5, 2023
@dosubot

dosubot bot commented Oct 5, 2023

🤖

Hello,

Thank you for reporting this issue. It seems like you're encountering a problem where a file that has been removed from the brain cannot be re-uploaded because the system still recognizes it as existing in the storage, even though it's not listed in the brain anymore.

This behavior is due to how the Quivr system handles file uploads and deletions. When a file is uploaded, it is stored in a storage system and a reference to it is added to the brain. When the file is removed from the brain, only the reference in the brain is removed, not the file in the storage. This is why the system indicates that the file already exists even though it is not listed in the brain anymore.

Here is the relevant code that handles file uploads:

try:
    fileInStorage = upload_file_storage(file_content, filename_with_brain_id)
    logger.info(f"File {fileInStorage} uploaded successfully")

except Exception as e:
    if "The resource already exists" in str(e):
        raise HTTPException(
            status_code=403,
            detail=f"File {uploadFile.filename} already exists in storage.",
        )
    else:
        raise HTTPException(
            status_code=500, detail="Failed to upload file to storage."
        )

(Source: backend/routes/upload_routes.py)
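The mismatch described above can be reproduced with a small in-memory model. This is only a sketch under stated assumptions: `InMemoryStorage`, `Brain`, `reproduce`, and the `brain_id/file_name` key format are hypothetical stand-ins, not Quivr's actual classes or storage layout. It shows that when removal drops only the reference, a second upload of the same file hits the "already exists" branch while the brain's list is empty.

```python
class InMemoryStorage:
    """Hypothetical stand-in for the object storage (not Quivr's real client)."""

    def __init__(self):
        self.objects = {}

    def upload(self, key, content):
        if key in self.objects:
            # Same message text that the upload handler above matches on.
            raise RuntimeError("The resource already exists")
        self.objects[key] = content
        return key


class Brain:
    """Hypothetical brain that only tracks references (file names)."""

    def __init__(self, brain_id, storage):
        self.brain_id = brain_id
        self.storage = storage
        self.files = []

    def add_file(self, name, content):
        # Assumed key format: "<brain_id>/<file_name>".
        self.storage.upload(f"{self.brain_id}/{name}", content)
        self.files.append(name)

    def remove_file(self, name):
        # Only the reference is dropped; the stored object is left behind.
        self.files.remove(name)


def reproduce():
    """Upload, remove, upload again; return the listed files and the error text."""
    brain = Brain("brain-1", InMemoryStorage())
    brain.add_file("book.epub", b"dummy content")
    brain.remove_file("book.epub")
    try:
        brain.add_file("book.epub", b"dummy content")
    except RuntimeError as exc:
        return brain.files, str(exc)
    return brain.files, None
```

Calling `reproduce()` returns an empty file list together with the storage error, which mirrors the report: the file is not listed, yet it cannot be uploaded again.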

As for the deletion process, the system first retrieves the vector IDs associated with the file name from the 'vectors' table. For each vector ID, it deletes the corresponding entry from the 'brains_vectors' table, then checks whether the vector is still associated with any other brain. If it is not, the vector is deleted from the 'vectors' table as well. Note that these steps only touch database tables: nothing in this function removes the file itself from storage, which is why a re-upload can still fail with "already exists" after the file has disappeared from the brain's list.

Here is the relevant code that handles file deletions:

def delete_file_from_brain(self, brain_id, file_name: str):
    # First, get the vector_ids associated with the file_name
    vector_response = (
        self.db.table("vectors")
        .select("id")
        .filter("metadata->>file_name", "eq", file_name)
        .execute()
    )
    vector_ids = [item["id"] for item in vector_response.data]

    # For each vector_id, delete the corresponding entry from the 'brains_vectors' table
    for vector_id in vector_ids:
        self.db.table("brains_vectors").delete().filter(
            "vector_id", "eq", vector_id
        ).filter("brain_id", "eq", brain_id).execute()

        # Check if the vector is still associated with any other brains
        associated_brains_response = (
            self.db.table("brains_vectors")
            .select("brain_id")
            .filter("vector_id", "eq", vector_id)
            .execute()
        )
        associated_brains = [
            item["brain_id"] for item in associated_brains_response.data
        ]

        # If the vector is not associated with any other brains, delete it from 'vectors' table
        if not associated_brains:
            self.db.table("vectors").delete().filter(
                "id", "eq", vector_id
            ).execute()

    return {"message": f"File {file_name} in brain {brain_id} has been deleted."}

(Source: backend/models/databases/supabase/brains.py)
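The function above only edits the 'vectors' and 'brains_vectors' tables. One possible way to get either of the outcomes requested in the report is to remove the stored object first and drop the reference only if that succeeds. The following is a minimal sketch, not the project's fix: `delete_file`, `StorageDeleteError`, the `remove_object` callable, and the `brain_id/file_name` key format are all assumptions made for illustration.

```python
class StorageDeleteError(Exception):
    """Hypothetical error raised when the stored object could not be removed."""


def delete_file(brain_files, remove_object, brain_id, file_name):
    """Delete the stored object, then the brain's reference to it.

    brain_files:   dict mapping brain id -> set of listed file names
    remove_object: callable taking a storage key; raises on failure
    """
    key = f"{brain_id}/{file_name}"  # assumed key format
    try:
        remove_object(key)
    except Exception as exc:
        # Storage cleanup failed: keep the reference so the file stays listed
        # and the caller can show an error message.
        raise StorageDeleteError(
            f"Could not delete {file_name} from storage."
        ) from exc

    # Storage is clean, so the reference can be dropped and a later
    # re-upload of the same file will not collide.
    brain_files[brain_id].discard(file_name)
    return {"message": f"File {file_name} in brain {brain_id} has been deleted."}
```

With a plain dict as storage, `delete_file(files, objects.pop, "b1", "book.epub")` removes both the object and the reference. If `remove_object` raises, the reference is left untouched, so the list and the storage never disagree.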

I hope this clarifies the current behavior of the system. We understand that this can be confusing and we will consider improving the system to handle such cases better in the future. For now, you might want to ensure that the file is completely deleted from the storage before trying to re-upload it.

Thank you for your understanding and patience. If you have any other questions or need further clarification, please don't hesitate to ask.

Best regards,
Dosu


@matthieujacq matthieujacq moved this to Bugs & Improvements in Quivr's Roadmap Oct 5, 2023
@matthieujacq matthieujacq self-assigned this Oct 5, 2023
mamadoudicko added a commit that referenced this issue Oct 31, 2023
Issue: #1435

- feat(knowledgeTab): update structure
- refactor: change AddKnowledge structure
- feat: change AddKnowledge component structure
- feat: rework sources logic
- feat: change knowledge tab upload process
- fix: change knowledge tab fetch, create, update logic
- feat: improve added knowledge ui
- style: improve responsivity

Fix: 
- #1516
- #1336
- #1204




https://github.com/StanGirard/quivr/assets/63923024/f2917bf3-4ff8-42c6-8149-0b36287441b4

github-actions bot commented Nov 4, 2023

Thanks for your contributions, we'll be closing this issue as it has gone stale. Feel free to reopen if you'd like to continue the discussion.

@github-actions github-actions bot added the Stale label Nov 4, 2023
@mamadoudicko mamadoudicko moved this from Bugs & Improvements to Todo in Quivr's Roadmap Nov 6, 2023
@github-actions github-actions bot removed the Stale label Nov 6, 2023
@mamadoudicko mamadoudicko moved this from Todo to In Progress in Quivr's Roadmap Nov 6, 2023
@mamadoudicko mamadoudicko moved this from In Progress to To Be Validated in Quivr's Roadmap Nov 6, 2023
@Chloeroumengas Chloeroumengas moved this from To Be Validated to Done in Quivr's Roadmap Nov 7, 2023

github-actions bot commented Dec 6, 2023

Thanks for your contributions, we'll be closing this issue as it has gone stale. Feel free to reopen if you'd like to continue the discussion.

@github-actions github-actions bot added the Stale label Dec 6, 2023
@github-actions github-actions bot closed this as not planned Dec 11, 2023
coolCatalyst added a commit to coolCatalyst/quivr that referenced this issue Jun 1, 2024

3 participants