
mimic: rgw: resharding fixes #26789

Merged 9 commits on Mar 8, 2019

Conversation

theanalyst
Member

https://tracker.ceph.com/issues/37554
https://tracker.ceph.com/issues/37447

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

theanalyst and others added 9 commits March 6, 2019 15:45
Dynamic resharding used to leave behind stale bucket instances; walk through the
metadata pool and identify these instances by comparing the reshard status. If
the reshard status is done, these instances are ok to be cleared. For reshard
status of none we compare against the bucket entry point to ensure that we don't
match the current entry point.

Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
(cherry picked from commit 0c35a6f)
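
The classification rule described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the RGW implementation: `ReshardStatus`, `BucketInstance`, `is_stale`, and `find_stale` are hypothetical names, and treating an in-progress instance as not stale is an assumption made for the sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the RGW types.
enum class ReshardStatus { None, InProgress, Done };

struct BucketInstance {
  std::string bucket_name;
  std::string instance_id;
  ReshardStatus status;
};

// An instance is considered stale if its reshard status is Done, or if its
// status is None and it is not the instance the bucket entry point refers to.
// Assumption: instances with a reshard in progress are left alone.
bool is_stale(const BucketInstance& inst,
              const std::string& current_instance_id) {
  switch (inst.status) {
    case ReshardStatus::Done:
      return true;
    case ReshardStatus::None:
      return inst.instance_id != current_instance_id;
    case ReshardStatus::InProgress:
      return false;
  }
  return false;
}

// Collect the stale instances of one bucket, preserving input order.
std::vector<BucketInstance> find_stale(
    const std::vector<BucketInstance>& all,
    const std::string& current_instance_id) {
  std::vector<BucketInstance> stale;
  for (const auto& inst : all) {
    if (is_stale(inst, current_instance_id)) {
      stale.push_back(inst);
    }
  }
  return stale;
}
```

Comparing against the entry point's instance id is what keeps the live instance from being flagged when its status is none.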
Add a delete command as well that clears the resharded instances. We print out
the JSON status to indicate success or error state.

Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
(cherry picked from commit bf8f885)
Sort and batch bucket instances so that repeated calls to read the current
bucket info and to take the lock can be avoided. In the most trivial case, when
the bucket is already deleted, we exit early with all the stale instances. When
a bucket reshard is in progress, we only process the stale entries with status
done. If the bucket is available for locking, we lock it and mark the other
instances as well.

Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
(cherry picked from commit fb9c049)

 Conflicts:
	src/rgw/rgw_bucket.cc
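
A rough sketch of the batching idea from this commit message, under stated assumptions: `Instance`, `BucketState`, `batch_by_bucket`, and `stale_in_batch` are hypothetical names, locking is reduced to a `Lockable` state, and "the other instances" is read here as instances with status none that are not the current one.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical types for illustration only; not the RGW data structures.
enum class ReshardStatus { None, InProgress, Done };
enum class BucketState { Deleted, Resharding, Lockable };

struct Instance {
  std::string bucket;
  std::string id;
  ReshardStatus status;
};

using Batches = std::map<std::string, std::vector<Instance>>;

// Group instances by bucket name so that per-bucket work (reading the
// current bucket info, taking the lock) is done once per batch.
Batches batch_by_bucket(const std::vector<Instance>& all) {
  Batches out;
  for (const auto& inst : all) {
    out[inst.bucket].push_back(inst);
  }
  return out;
}

// Decide which instances of one batch are stale, given the bucket's state.
std::vector<Instance> stale_in_batch(const std::vector<Instance>& batch,
                                     BucketState state,
                                     const std::string& current_id) {
  if (state == BucketState::Deleted) {
    return batch;  // bucket is gone: every remaining instance is stale
  }
  std::vector<Instance> stale;
  for (const auto& inst : batch) {
    if (inst.status == ReshardStatus::Done) {
      stale.push_back(inst);
    } else if (state == BucketState::Lockable &&
               inst.status == ReshardStatus::None &&
               inst.id != current_id) {
      // Assumption: only a lockable (idle) bucket has these also marked.
      stale.push_back(inst);
    }
  }
  return stale;
}
```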
Get rid of the following C++17isms:
- split_tenant auto return type -> trailing return type
- structured binding of the split_tenant result -> std::tie
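
For illustration, the two rewrites look roughly like this. The `split_tenant` below is a hypothetical stand-in that splits a `tenant/bucket` string; its signature and behavior are assumptions, not the RGW function.

```cpp
#include <string>
#include <tuple>
#include <utility>

// Hypothetical split_tenant, written with an explicit trailing return type
// instead of a deduced `auto` return type.
auto split_tenant(const std::string& key)
    -> std::pair<std::string, std::string> {
  const auto pos = key.find('/');
  if (pos == std::string::npos) {
    return {std::string(), key};  // no tenant part
  }
  return {key.substr(0, pos), key.substr(pos + 1)};
}

std::string bucket_part(const std::string& key) {
  std::string tenant, bucket;
  // C++17 would allow: auto [tenant, bucket] = split_tenant(key);
  // std::tie assigns into pre-declared variables instead.
  std::tie(tenant, bucket) = split_tenant(key);
  return bucket;
}
```

The `std::tie` form requires the variables to be declared (and default-constructible) before the call, which is the main practical difference from a structured binding.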
The function cls_rgw_bucket_init was renamed to
cls_rgw_bucket_init_index in order to better describe its
functionality.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit 20868bd)
Because RGWRados::cls_rgw_init_index is never called, remove it.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit 4593778)
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit 48e22fb)
Clean up old bucket index shards when a resharding is complete. Also,
when a resharding fails, clean up unfinished bucket index shards. Do
both clean-ups asynchronously.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit f84f70d)

 Conflicts:
	src/rgw/rgw_rados.h
merge conflict as bucket_placement functions were moved after the rgw rados
	refactor
…ously

We can now take advantage of the new asynchronous bucket shard removal
code: where we used to remove each shard synchronously, we now remove
them asynchronously. This should be a huge win when we have tens of
thousands of shards.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit cb0da45)

 Conflicts:
	src/rgw/rgw_rados.cc
conflicts with placement set and rgw rados refactor
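
The general pattern of issuing removals asynchronously and collecting the results afterwards can be sketched as below. This is not the RGW code: `std::async` is used here purely as a stand-in for whatever asynchronous I/O mechanism the real implementation uses, and `remove_shards_async`, `remove_shard`, and `max_in_flight` are hypothetical names.

```cpp
#include <cstddef>
#include <functional>
#include <future>
#include <vector>

// Issue up to max_in_flight removals at a time, then wait for that batch.
// remove_shard is a hypothetical callback returning 0 on success.
// Returns 0 if every removal succeeded, otherwise the first error collected.
int remove_shards_async(std::size_t num_shards,
                        const std::function<int(std::size_t)>& remove_shard,
                        std::size_t max_in_flight) {
  std::vector<std::future<int>> in_flight;
  int first_error = 0;

  // Wait for all outstanding removals and remember the first failure.
  auto drain = [&]() {
    for (auto& f : in_flight) {
      const int r = f.get();
      if (r != 0 && first_error == 0) {
        first_error = r;
      }
    }
    in_flight.clear();
  };

  for (std::size_t shard = 0; shard < num_shards; ++shard) {
    in_flight.push_back(
        std::async(std::launch::async, remove_shard, shard));
    if (in_flight.size() >= max_in_flight) {
      drain();
    }
  }
  drain();
  return first_error;
}
```

Bounding the number of outstanding requests keeps resource use predictable when the shard count is large; the bound itself is a design choice of this sketch, not something stated in the commit.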
This fixes a typo in a log message. It's a separate commit so
downstream commits point to the right upstream commits via
cherry-pick.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit 7d1768f)
@theanalyst theanalyst changed the title from "mimic: resharding fixes" to "mimic: rgw: resharding fixes" Mar 6, 2019
@theanalyst theanalyst requested a review from cbodley March 6, 2019 16:36
@theanalyst
Member Author

Adding this to the next mimic suite (as we're merging another last-minute rados PR). If this passes QE, maybe it can go in?

@smithfarm
Contributor

If this passes QE maybe it can go in?

+1 from my side, as long as one of the RGW core devs approves.

@cbodley
Contributor

cbodley commented Mar 6, 2019

👍 i'd love to see this also, but i'll defer to @ivancich to doublecheck his commits

Member

@ivancich ivancich left a comment

My commits look good.

@ivancich
Member

ivancich commented Mar 6, 2019

jenkins test make check

@theanalyst theanalyst requested a review from yuriw March 7, 2019 09:42
@theanalyst
Member Author

jenkins seems to be failing with a dashboard frontend build failure triggered by some nodejs thingy

@theanalyst
Member Author

Tested on a vstart cluster, working as expected. New reshards no longer create stale instances, and older stale instances are picked up by the admin tool as expected.

@ivancich
Member

ivancich commented Mar 7, 2019

It looks like Yuri is testing it. So it seems reasonable that if it passes QA it can be merged given the jenkins issue(s).

@theanalyst
Member Author

jenkins issues will hopefully be fixed by #26814

@sebastian-philipp
Contributor

jenkins retest this please

@yuriw
Contributor

yuriw commented Mar 7, 2019

@yuriw yuriw merged commit 9143490 into ceph:mimic Mar 8, 2019