Cleanup task for obsolete attachments #1648
Comments
Needs discussion on whether this should be implemented via the _vacuum endpoint, or incorporated into compact. Since compact isn't doing much work currently (the obsolete revision bodies it targets are handled by auto-expiry), it might make more sense to incorporate it there. That way users can use compact as a one-stop API for 'clean up data I don't need'. If we really needed to isolate the tasks, we could do that using a parameter on the compact call. |
Is there any progress on this issue? |
This didn't make it into the 1.3 release, but should be available on master as soon as we get 1.3 out the door, and will be included in 1.4. |
Original issue was filed as #64. The main design issue remaining is handling concurrency issues during GC processing, specifically:
|
Needs followup to confirm 2.0 CBL behaviour - specifically whether there's a 1-to-1 association from attachment to document. If so, the attachment lifecycle can be modified to follow the document lifecycle (i.e. clean up attachments on document removal). |
@adamcfraser looks like attachments are still shared among docs based on the content-addressable store approach. To verify this, I used mobile-training-todo to add two docs with the same attachment: Doc id -lwoJM-z8EjO2tswcm5S9yp
Doc id -MCw8q3tzq5EYSdrivfWjy1
and they both appear to refer to attachment doc: |
There seem to be two overall approaches to this problem:
Approach A: Modify Couchbase Lite to generate per-document attachment digests
Overview
This proposed fix would make each attachment digest unique per document, so that as soon as a document was deleted, all of its attachments could be deleted immediately. Currently the attachment digests are shared among documents, which can save space. However, this makes them tricky to clean up, since there is currently no way to know whether other docs refer to an attachment.
Pros
Cons
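To make the difference concrete, here is a minimal sketch of the two digest styles. It assumes the shared digest is a base64-encoded SHA-1 of the attachment body with a `sha1-` prefix; the `per_document_digest` function is purely hypothetical (nothing in this thread says how Couchbase Lite would derive such a digest) and simply mixes the document ID into the hash.

```python
import base64
import hashlib


def shared_digest(content: bytes) -> str:
    """Content-addressable digest: identical bytes map to the same key for every doc."""
    raw = hashlib.sha1(content).digest()
    return "sha1-" + base64.b64encode(raw).decode("ascii")


def per_document_digest(doc_id: str, content: bytes) -> str:
    """Hypothetical per-document digest: the doc ID is hashed together with the body."""
    h = hashlib.sha1()
    h.update(doc_id.encode("utf-8"))
    h.update(b"\x00")  # separator so doc ID and body bytes cannot run together
    h.update(content)
    return "sha1-" + base64.b64encode(h.digest()).decode("ascii")


image = b"same image bytes"
doc_a = "-lwoJM-z8EjO2tswcm5S9yp"
doc_b = "-MCw8q3tzq5EYSdrivfWjy1"

# Shared scheme: both docs point at one stored blob, so deleting doc_a
# cannot safely delete the blob without knowing about doc_b.
same_key = shared_digest(image) == shared_digest(image)

# Per-document scheme: each doc gets its own blob key, so the blob can be
# removed together with its document, at the cost of storing the bytes twice.
distinct_keys = per_document_digest(doc_a, image) != per_document_digest(doc_b, image)
```

The trade-off the sketch illustrates is the one described above: per-document keys give up deduplication in exchange for a simple attachment lifecycle.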
Approach B: Implement a ref-counting scheme
Overview
This proposed fix would keep the content-addressable-store approach to attachments, but add a "reference counter" to each attachment, so that the system could safely determine when an attachment can be reaped. For example:
Pros
Cons
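A rough sketch of the ref-counting idea, as an in-memory toy rather than anything resembling the Sync Gateway implementation. The class and method names (`RefCountedAttachmentStore`, `add_reference`, `remove_reference`) are made up for illustration, and the sketch deliberately ignores the concurrency concerns raised earlier in this thread: a real version would need atomic counter updates in the bucket.

```python
import hashlib


class RefCountedAttachmentStore:
    """Toy content-addressable store with a reference count per attachment."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}     # digest -> attachment body
        self.refcounts: dict[str, int] = {}   # digest -> number of referring docs

    def add_reference(self, content: bytes) -> str:
        """Store the body if it is new and bump its counter. Returns the digest."""
        digest = hashlib.sha1(content).hexdigest()
        self.blobs.setdefault(digest, content)
        self.refcounts[digest] = self.refcounts.get(digest, 0) + 1
        return digest

    def remove_reference(self, digest: str) -> bool:
        """Drop one reference. Returns True if the blob was reaped."""
        if digest not in self.refcounts:
            raise KeyError(f"unknown attachment digest: {digest}")
        self.refcounts[digest] -= 1
        if self.refcounts[digest] > 0:
            return False  # another document still refers to this attachment
        # Last reference gone: safe to delete the stored body.
        del self.refcounts[digest]
        del self.blobs[digest]
        return True


store = RefCountedAttachmentStore()
key_a = store.add_reference(b"photo")   # doc A attaches the photo
key_b = store.add_reference(b"photo")   # doc B attaches the same bytes
reaped_first = store.remove_reference(key_a)    # doc A deleted, blob kept
reaped_second = store.remove_reference(key_b)   # doc B deleted, blob reaped
```

The blob survives the first document deletion and is removed only when the last referring document goes away, which is the property the shared-digest scheme currently lacks.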
|
@snej any thoughts on the approaches mentioned in #1648 (comment)? @djpongh there's not any actionable work until we figure out an overall approach, so I'm going to re-assign to you to drive that forward. |
Unfortunately, this is too complicated to try and get into our 2.1 release. |
This is insane ... We almost moved to attachments for images when we read about it in the main documentation, it looked like it would simplify and solve issues with storage. It was only when we read the details in the 1.5 rest api with a pointer to this bug that we discovered it doesn't work. Can the original documentation be updated to say that you don't support attachments! You clearly don't if attachments can't actually be deleted. |
@djpongh @adamcfraser When can we expect this feature to be added/fixed? |
Migrated to CBG-795 |
When a document is deleted or an attachment is removed, the binary attachment document remains in the underlying CBS bucket.
For apps that regularly create and delete docs with attachments, this can have a big impact on storage.
There is already a _vacuum DB endpoint which is meant to purge orphaned attachments, but it is currently not implemented.