Collaborator

@avirajsingh7 avirajsingh7 commented Jun 19, 2025

Summary

Target issue #220
Explain the motivation for making this change. What existing problem does the pull request solve?

Adds a new endpoint (remove/{doc_id}/permanent) which:

  • Soft deletes the document: its metadata and reference are retained in the database, but it is marked as deleted (deleted_at is set). The actual file stored in cloud storage (e.g., S3) is permanently deleted, and this action is irreversible.
  • Deletes any active collections the document is part of, using the collections delete interface. Notably, this means all OpenAI Vector Stores and Assistants to which this document belongs will be deleted.
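The behaviour described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: Document, Collection, InMemoryStorage, and permanent_delete are hypothetical stand-ins, the order of the three steps is an assumption, and the real collections delete interface also removes the associated OpenAI Vector Stores and Assistants, which the sketch only models by marking the collection as deleted.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Document:
    # Hypothetical stand-in for the document row kept in the database.
    id: int
    object_store_url: str
    deleted_at: Optional[datetime] = None


@dataclass
class Collection:
    # Hypothetical stand-in for a collection that groups documents.
    id: int
    document_ids: set
    deleted_at: Optional[datetime] = None


class InMemoryStorage:
    """Stand-in for cloud storage (e.g., S3): maps object URL to bytes."""

    def __init__(self) -> None:
        self.objects = {}

    def delete(self, url: str) -> None:
        # Irreversible: once popped, the bytes cannot be recovered.
        self.objects.pop(url, None)


def permanent_delete(doc, collections, storage, now=None):
    """Soft delete `doc`, remove its file, and delete active collections using it.

    Returns the ids of the collections that were deleted.
    """
    now = now or datetime.now(timezone.utc)

    # 1. Delete every active collection that contains the document.
    affected = []
    for collection in collections:
        if collection.deleted_at is None and doc.id in collection.document_ids:
            collection.deleted_at = now
            affected.append(collection.id)

    # 2. Permanently remove the stored file.
    storage.delete(doc.object_store_url)

    # 3. Soft delete: the row stays, but deleted_at is set.
    doc.deleted_at = now
    return affected
```

After the call, the document row still exists with deleted_at set, while the stored object is gone and cannot be restored.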

Checklist

Before submitting a pull request, please ensure that you mark these tasks.

  • Ran fastapi run --reload app/main.py or docker compose up in the repository root and tested.
  • If you've fixed a bug or added code, it is tested and has test cases.

codecov bot commented Jun 19, 2025

Codecov Report

Attention: Patch coverage is 97.40260% with 2 lines in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
backend/app/core/cloud/storage.py | 71.42% | 2 Missing ⚠️

Comment on lines +31 to +36
def aws_credentials():
    os.environ["AWS_ACCESS_KEY_ID"] = "testing"
    os.environ["AWS_SECRET_ACCESS_KEY"] = "testing"
    os.environ["AWS_SECURITY_TOKEN"] = "testing"
    os.environ["AWS_SESSION_TOKEN"] = "testing"
    os.environ["AWS_DEFAULT_REGION"] = settings.AWS_DEFAULT_REGION
Collaborator

I think at some point we will also need to create a .env.test, since more test cases may need similar behaviour as we add them.
This is duplicated from backend/app/tests/api/routes/documents/test_route_document_upload.py.

Collaborator Author

@avirajsingh7 avirajsingh7 Jun 20, 2025

Agreed.
For the time being, we can go with this approach.
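One way to avoid repeating the credential setup in each test module, short of a .env.test file, might look like the sketch below. It is an illustration only: FAKE_AWS_ENV and fake_aws_env are hypothetical names, and the default region is a placeholder where the original snippet reads settings.AWS_DEFAULT_REGION. It relies on unittest.mock.patch.dict from the standard library, which restores os.environ when the block exits.

```python
import os
from contextlib import contextmanager
from unittest.mock import patch

# Dummy values so no real AWS credentials are ever picked up in tests.
FAKE_AWS_ENV = {
    "AWS_ACCESS_KEY_ID": "testing",
    "AWS_SECRET_ACCESS_KEY": "testing",
    "AWS_SECURITY_TOKEN": "testing",
    "AWS_SESSION_TOKEN": "testing",
}


@contextmanager
def fake_aws_env(region: str = "us-east-1"):
    """Temporarily set fake AWS variables; the region default is an assumption."""
    env = {**FAKE_AWS_ENV, "AWS_DEFAULT_REGION": region}
    # patch.dict restores the previous environment on exit, even on error.
    with patch.dict(os.environ, env):
        yield env
```

A shared fixture in a common conftest.py could wrap this context manager so that both document test modules reuse it instead of each defining aws_credentials.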

Comment on lines +87 to +89
a_crud = OpenAIAssistantCrud()
d_crud = DocumentCrud(session, current_user.id)
c_crud = CollectionCrud(session, current_user.id)
Collaborator

Not a big fan of names like a_crud and d_crud. I raised the same point in jerome's PR, but I don't know what the best approach is in OOP.

Collaborator Author

@AkhileshNegi that's right,
but this naming is already used across the document module, so I followed it here to keep things consistent.
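For reference, one possible alternative to the single-letter prefixes is sketched below. It is only an illustration of the naming question, not a proposal taken from this thread: the three Crud classes are empty stand-ins for the real ones in the repository, and CrudBundle and build_cruds are hypothetical names.

```python
from dataclasses import dataclass


class OpenAIAssistantCrud:
    """Stand-in for the real class; takes no arguments, as in the diff."""


class DocumentCrud:
    """Stand-in for the real class; keeps the (session, owner id) signature."""

    def __init__(self, session, owner_id):
        self.session = session
        self.owner_id = owner_id


class CollectionCrud:
    """Stand-in for the real class; keeps the (session, owner id) signature."""

    def __init__(self, session, owner_id):
        self.session = session
        self.owner_id = owner_id


@dataclass(frozen=True)
class CrudBundle:
    # Descriptive attribute names replace a_crud, d_crud, and c_crud.
    assistant: OpenAIAssistantCrud
    document: DocumentCrud
    collection: CollectionCrud


def build_cruds(session, owner_id) -> CrudBundle:
    """Create the three CRUD helpers a route needs in one place."""
    return CrudBundle(
        assistant=OpenAIAssistantCrud(),
        document=DocumentCrud(session, owner_id),
        collection=CollectionCrud(session, owner_id),
    )
```

A route could then refer to cruds.document or cruds.collection, at the cost of diverging from the naming already used elsewhere in the module.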

@avirajsingh7 avirajsingh7 force-pushed the feat/s3_permanent_delete branch from 6973b06 to cbb95bd June 20, 2025 04:57

@@ -0,0 +1,5 @@
This operation soft deletes the document — meaning its metadata and reference are retained in the database, but it is marked as deleted. The actual file stored in cloud storage (e.g., S3) is permanently deleted, and this action is irreversible.
Collaborator

Not a very significant change, but the line stating that the file gets permanently deleted from S3 should come first.

@AkhileshNegi AkhileshNegi merged commit cff0579 into main Jun 20, 2025
2 checks passed
@AkhileshNegi AkhileshNegi deleted the feat/s3_permanent_delete branch June 20, 2025 06:29
Labels

enhancement New feature or request ready-for-review

Projects

Status: Closed

Development

Successfully merging this pull request may close these issues.

Deleting the document should also remove it from s3 or cloud storage

3 participants