mimic: rgw: resharding fixes #26789
Dynamic resharding used to leave behind stale bucket instances; walk through the metadata pool and identify these instances by comparing the reshard status. If the reshard status is done, these instances are ok to be cleared. For reshard status of none we compare against the bucket entry point to ensure that we don't match the current entry point. Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com> (cherry picked from commit 0c35a6f)
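The staleness rule described in this commit message can be sketched roughly as follows. This is only an illustration under stated assumptions: `ReshardStatus`, `BucketInstance`, `is_stale` and `find_stale` are hypothetical stand-ins and are not the types or functions used in the actual rgw code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; not Ceph's real types.
enum class ReshardStatus { None, InProgress, Done };

struct BucketInstance {
  std::string instance_id;
  ReshardStatus status;
};

// An instance is treated as stale when its reshard status is Done, or when
// the status is None and it is not the instance the bucket entry point
// currently refers to. In-progress instances are left alone.
bool is_stale(const BucketInstance& inst,
              const std::string& current_instance_id) {
  assert(!inst.instance_id.empty());  // precondition of this sketch
  switch (inst.status) {
    case ReshardStatus::Done:
      return true;
    case ReshardStatus::None:
      return inst.instance_id != current_instance_id;
    case ReshardStatus::InProgress:
      return false;
  }
  return false;
}

// Collect the ids of all stale instances of one bucket.
std::vector<std::string> find_stale(
    const std::vector<BucketInstance>& instances,
    const std::string& current_instance_id) {
  std::vector<std::string> stale;
  for (const auto& inst : instances) {
    if (is_stale(inst, current_instance_id)) {
      stale.push_back(inst.instance_id);
    }
  }
  return stale;
}
```

The check against the entry point is what keeps a live, never-resharded bucket instance (status none) from being reported as stale.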
Add a delete command as well that clears the resharded instances. We print out the JSON status to indicate success or error state. Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com> (cherry picked from commit bf8f885)
Sort through and batch bucket instances so that multiple calls to reading current bucket info and locking can be avoided. For the most trivial case, when the bucket is already deleted, we exit early with all the stale instances. When the bucket reshard is in progress, we only process the stale entries with status done; if the bucket is available for locking, we lock it and mark the other instances as well. Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com> (cherry picked from commit fb9c049) Conflicts: src/rgw/rgw_bucket.cc Get rid of the following c++17isms: - split_tenant auto return type -> trailing return type - tuple destructuring bind for split tenant with std::tie
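The two backport adjustments named in the conflict note (a trailing return type instead of a deduced `auto` return type, and `std::tie` instead of a structured binding) look roughly like this. The body of `split_tenant` and the `describe` caller are assumptions made for illustration; only the function name and the two techniques come from the commit message.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical helper: splits "tenant/bucket" into (tenant, bucket).
// A name without '/' yields an empty tenant. The return type is spelled
// out as a trailing return type rather than left to `auto` deduction.
auto split_tenant(const std::string& name)
    -> std::pair<std::string, std::string> {
  const auto pos = name.find('/');
  if (pos == std::string::npos) {
    return std::make_pair(std::string(), name);
  }
  return std::make_pair(name.substr(0, pos), name.substr(pos + 1));
}

// Hypothetical caller. With structured bindings this would read:
//   auto [tenant, bucket] = split_tenant(name);
// std::tie gives the same unpacking into pre-declared variables.
std::string describe(const std::string& name) {
  assert(!name.empty());  // precondition of this sketch
  std::string tenant, bucket;
  std::tie(tenant, bucket) = split_tenant(name);
  return tenant.empty() ? bucket : bucket + " (tenant " + tenant + ")";
}
```

`std::tie` requires the variables to be declared (and default-constructed) first, which is the main cost compared with a structured binding.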
The function cls_rgw_bucket_init was renamed to cls_rgw_bucket_init_index in order to better describe its functionality. Signed-off-by: J. Eric Ivancich <ivancich@redhat.com> (cherry picked from commit 20868bd)
Because RGWRados::cls_rgw_init_index is never called, remove it. Signed-off-by: J. Eric Ivancich <ivancich@redhat.com> (cherry picked from commit 4593778)
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com> (cherry picked from commit 48e22fb)
Clean up old bucket index shards when a resharding is complete. Also, when a resharding fails, clean up unfinished bucket index shards. Do both clean-ups asynchronously. Signed-off-by: J. Eric Ivancich <ivancich@redhat.com> (cherry picked from commit f84f70d) Conflicts: src/rgw/rgw_rados.h merge conflict as bucket_placement functions were moved after the rgw rados refactor
…ously We can now take advantage of the new asynchronous bucket shard removal code: where we used to remove each shard synchronously, we now remove them asynchronously. This would be a huge win when we have tens of thousands of shards. Signed-off-by: J. Eric Ivancich <ivancich@redhat.com> (cherry picked from commit cb0da45) Conflicts: src/rgw/rgw_rados.cc conflicts with placement set and rgw rados refactor
This fixes a typo in a log message. It's a separate commit so downstream commits point to the right upstream commits via cherry-pick. Signed-off-by: J. Eric Ivancich <ivancich@redhat.com> (cherry picked from commit 7d1768f)
Adding this to the next mimic suite (as we're merging another last-minute rados PR). If this passes QE, maybe it can go in?
+1 from my side, as long as one of the RGW core devs approves.
👍 I'd love to see this also, but I'll defer to @ivancich to double-check his commits.
My commits look good.
jenkins test make check
Jenkins seems to be failing with a dashboard frontend build failure triggered by some nodejs thingy.
Tested on a vstart cluster, working as expected. New reshards do not create stale instances anymore, and older stale instances are picked up by the admin tool as expected.
It looks like Yuri is testing it. So it seems reasonable that if it passes QA it can be merged given the jenkins issue(s).
Jenkins issues will hopefully be fixed by #26814.
jenkins retest this please
https://tracker.ceph.com/issues/37554
https://tracker.ceph.com/issues/37447