S3 objects are now flagged when they may require cleanup #59

Merged
sjones4 merged 1 commit into Corymbia:maint-4.4 from issue-s3-object-cleanup on May 21, 2018

Conversation

Member

@sjones4 sjones4 commented May 16, 2018

This pull request adds a new flag to S3 object entities to indicate when version cleanup may be needed. When an object version is set as the latest (i.e. a version is added or removed), we now also set the "cleanup required" flag. The cleanup flag is only meaningful on the latest version of an object.
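As a rough illustration of the flagging rule, here is a minimal in-memory sketch. It is not the Eucalyptus implementation: `ObjectVersion`, `Bucket`, and the field names `is_latest` and `cleanup_required` are hypothetical stand-ins for the real object entity, and the sketch simply assumes that the newest remaining version becomes the latest.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ObjectVersion:
    # Hypothetical stand-in for one row of the object entity table.
    key: str
    version_id: str
    is_latest: bool = False
    cleanup_required: bool = False  # only meaningful when is_latest is True


@dataclass
class Bucket:
    # Versions per object key, oldest first (an assumption of this sketch).
    versions: Dict[str, List[ObjectVersion]] = field(default_factory=dict)

    def _set_latest(self, key: str) -> Optional[ObjectVersion]:
        """Mark the newest remaining version as latest and flag it for cleanup."""
        history = self.versions.get(key, [])
        for version in history:
            version.is_latest = False
        if not history:
            return None
        latest = history[-1]
        latest.is_latest = True
        latest.cleanup_required = True
        return latest

    def add_version(self, key: str, version_id: str) -> Optional[ObjectVersion]:
        self.versions.setdefault(key, []).append(ObjectVersion(key, version_id))
        return self._set_latest(key)

    def remove_version(self, key: str, version_id: str) -> Optional[ObjectVersion]:
        self.versions[key] = [
            v for v in self.versions.get(key, []) if v.version_id != version_id
        ]
        return self._set_latest(key)
```

In this sketch both adding and removing a version go through `_set_latest`, so the flag is always set on whichever version ends up as the latest.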

The bucket reaper task now performs version cleanup only for these flagged objects rather than for all objects in a bucket. The task clears the cleanup flag after processing each object, so progress is made even when there are many objects to check.
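The reaper behaviour described above might look roughly like the following sketch. It uses an in-memory SQLite database purely so the example is self-contained; the table name `objects`, its columns, the `batch_size` parameter, and the `cleanup_versions` callback are all assumptions made for illustration, not the actual schema or task code.

```python
import sqlite3


def make_db():
    """Create an in-memory table standing in for the object entity table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE objects ("
        "id INTEGER PRIMARY KEY, bucket TEXT, object_key TEXT, "
        "is_latest INTEGER, cleanup_required INTEGER)"
    )
    return conn


def reap_bucket(conn, bucket, cleanup_versions, batch_size=100):
    """Clean up flagged latest objects in one bucket; return how many were processed."""
    rows = conn.execute(
        "SELECT id, object_key FROM objects "
        "WHERE bucket = ? AND is_latest = 1 AND cleanup_required = 1 "
        "LIMIT ?",
        (bucket, batch_size),
    ).fetchall()
    for row_id, object_key in rows:
        cleanup_versions(bucket, object_key)  # hypothetical per-object cleanup
        # Clear the flag and commit per object, so a later run skips this row
        # even if the current run is interrupted part-way through.
        conn.execute(
            "UPDATE objects SET cleanup_required = 0 WHERE id = ?", (row_id,)
        )
        conn.commit()
    return len(rows)
```

Because each flag is cleared as soon as its object is processed, repeated runs work through a large backlog in bounded batches instead of revisiting every object.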

Periodic cleanup is generally not required, because cleanup is performed when an object is updated. Having a flag allows object version cleanup to be triggered manually (via a database change) should it ever be necessary.
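A manual trigger could be as simple as one UPDATE statement against the metadata table. The sketch below uses Python's `sqlite3` module for illustration only; the table and column names (`objects`, `bucket`, `is_latest`, `cleanup_required`) are hypothetical, and the real schema and database (postgres) will differ.

```python
import sqlite3


def flag_bucket_for_cleanup(conn, bucket):
    """Set the cleanup flag on every latest object in a bucket; return rows touched."""
    cur = conn.execute(
        "UPDATE objects SET cleanup_required = 1 "
        "WHERE bucket = ? AND is_latest = 1",
        (bucket,),
    )
    conn.commit()
    return cur.rowcount
```

Only latest versions are flagged here, since the flag has no meaning on older versions.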

@sjones4 sjones4 added this to To do in Eucalyptus 4.4.4 via automation May 16, 2018
@sjones4 sjones4 moved this from To do to Needs Review in Eucalyptus 4.4.4 May 16, 2018
cdonati commented May 17, 2018

@sjones4 I'm curious about how object metadata is implemented. Is it stored separately from object data in a key/value store? Do metadata properties have an index that allows the bucket reaper task to look up objects that have the flag set, or does the task still need to iterate through all of the keys when looking for that flag?

Member Author

sjones4 commented May 17, 2018

@cdonati the object storage gateway service does not store any object data; it is just a front end. The data is stored by Walrus or Ceph (S3). The object storage gateway's object metadata is persisted in PostgreSQL. This flag is added to the main object entity table, so although a full table scan may be used, it is still much more efficient than the current approach (which performs some processing for every object).

cdonati commented May 17, 2018

Cool. Thanks for the explanation. At some point, I'd love to hear more about the implementation details during a sprint planning session or something. I would be interested in learning about how the metadata is kept consistent with the underlying object data since that would be relevant for our GCS implementation.

@sjones4 sjones4 merged commit 4334dda into Corymbia:maint-4.4 May 21, 2018
Eucalyptus 4.4.4 automation moved this from Needs Review to Done May 21, 2018
@sjones4 sjones4 deleted the issue-s3-object-cleanup branch May 21, 2018 05:06
obino pushed a commit that referenced this pull request Oct 24, 2018
YAML format support for network and region configuration

2 participants