rgw: make incomplete multipart upload part of bucket check efficient #57083
jenkins test make check
jenkins test api
very nice. a lot of things could benefit from concurrency like this, but it gets complicated with optional_yield
@adamemerson and i have been working on some concurrency primitives for use with c++20 coroutines (#50005 for example). something similar for optional_yield could help to simplify efforts like this by hiding the complexity needed to support the two different runtimes
edit: the co_throttle class from #49720 would be a better algorithm here because it enforces bounded concurrency like your use of max_aio
p.s. expecting minor conflicts from #55592 which removes our fork of the spawn library
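For readers unfamiliar with the term, the bounded concurrency mentioned above can be sketched roughly as follows. This is only an illustration in Python's asyncio, not the optional_yield or C++20 coroutine machinery discussed in this thread, and not the co_throttle class itself. The names `check_shards` and `check_one` are hypothetical; `max_aio` is borrowed from the comment and is assumed to mean the maximum number of shard operations in flight at once.

```python
import asyncio
from typing import Awaitable, Callable, List


async def check_shards(num_shards: int,
                       check_one: Callable[[int], Awaitable[int]],
                       max_aio: int) -> List[int]:
    """Run check_one(shard) for every shard, with at most max_aio in flight.

    check_one is a hypothetical per-shard coroutine; its return value is
    collected in shard order.
    """
    limit = asyncio.Semaphore(max_aio)

    async def bounded(shard: int) -> int:
        # The semaphore blocks here once max_aio checks are already running.
        async with limit:
            return await check_one(shard)

    # gather() preserves the order of its arguments in the result list.
    return await asyncio.gather(*(bounded(s) for s in range(num_shards)))
```

A semaphore is the simplest way to cap in-flight work in asyncio; a dedicated throttle primitive would typically add error propagation and cancellation on top of the same idea.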
looks great if it works. do we have any useful test coverage of this stuff?
apparently since fixed in https://tracker.ceph.com/issues/65585
jenkins test make check
Previously, the incomplete multipart portion of bucket check would list all entries in the _multipart_ namespace across all shards and then analyze them in memory before taking further action. Since all index entries for a given multipart upload are on the same shard by design, we can work on this asynchronously, shard by shard. Furthermore, since all entries for a given multipart upload are sequential in the bucket index, we can use a small window to analyze each of the uploads. This should make the operation quicker and use much less memory in the worst cases.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
I've rebased on latest main; we'll see if that fixes the CI issue(s).
@mkogan1 : I was looking into what it'd take to port this to Quincy. It'd be non-trivial. But maybe it'd be easy to make a version that would simply process each shard sequentially. I was curious as to whether a HEAD |
jenkins test api
https://jenkins.ceph.com/job/ceph-api/73295/
commented on https://tracker.ceph.com/issues/47612, but it's over 3 years old now. cc @Pegonzal @epuertat can someone please look into this?
jenkins test api
https://jenkins.ceph.com/job/ceph-windows-pull-requests/39437/consoleFull
asked about this on #ceph-devel slack
jenkins test windows
Thanks, @cbodley!
@ivancich hi, I ran a test of bucket check [--fix] performance and memory footprint before and after this PR's commit. The real memory footprint is lower, but the bucket check operation takes longer.
Previously, the incomplete multipart portion of bucket check would list all entries in the multipart namespace across all shards and then analyze them in memory before taking further action.
Since all index entries for a given multipart upload are on the same shard by design, we can work on this asynchronously, shard by shard. Furthermore, since all entries for a given multipart upload are sequential in the bucket index, we can use a small window to analyze each of the uploads.
This should make the operation quicker and use much less memory in the worst cases.
Fixes: https://tracker.ceph.com/issues/65769
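The "small window" idea in the description can be sketched as below. This is a minimal illustration in Python, not the code in this PR. It assumes a shard's listing arrives in index order, that every entry for one upload shares a prefix up to its last dot, and that a complete upload carries an entry ending in `.meta`. The entry naming, `META_SUFFIX`, `upload_key`, and `uploads_missing_meta` are all hypothetical names chosen for the example.

```python
from itertools import groupby
from typing import Iterable, Iterator, List, Tuple

META_SUFFIX = ".meta"  # hypothetical marker for an upload's metadata entry


def upload_key(entry: str) -> str:
    """Drop the final dotted component, leaving a per-upload prefix.

    The entry format is an assumption made for this sketch.
    """
    return entry.rsplit(".", 1)[0]


def uploads_missing_meta(
        shard_entries: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (upload key, entries) for uploads that have no meta entry.

    Because entries for one upload are adjacent in the listing, groupby()
    only ever holds a single upload's entries in memory at a time.
    """
    for key, group in groupby(shard_entries, key=upload_key):
        window = list(group)  # the small window: one upload's entries
        if not any(entry.endswith(META_SUFFIX) for entry in window):
            yield key, window
```

Since the function consumes any iterable, it could be fed from a paginated listing of one shard, so memory use would track the largest single upload rather than the whole namespace.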