
rgw multisite: fix the incremental bucket sync init #11553

Merged

merged 1 commit into ceph:master from wip-datasync-status on Oct 20, 2016

Conversation

@Aran85 (Contributor) commented Oct 19, 2016

In `RGWBucketShardFullSyncCR::operate`, inc_marker is assigned the remote bilog's max_marker, but the sync_status's inc_marker is never assigned, so the next incremental sync step always syncs from a null log position, i.e. from the beginning of the log.

Signed-off-by: Zengran Zhang <zhangzengran@h3c.com>
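
To make the failure mode concrete, here is a minimal C++ sketch of the pattern being described; the struct and function names are simplified stand-ins for the actual Ceph coroutine code, not the real source:

```cpp
#include <string>

// Hypothetical, simplified stand-ins for the RGW bucket-shard sync types.
struct inc_sync_marker {
  std::string position;  // bilog position that incremental sync resumes from
};

struct bucket_shard_sync_status {
  inc_sync_marker inc_marker;  // the persisted status incremental sync reads
};

void finish_full_sync(bucket_shard_sync_status& sync_status,
                      const std::string& remote_max_marker) {
  inc_sync_marker inc_marker;            // local copy, as in the bug report
  inc_marker.position = remote_max_marker;

  // Bug: only the local copy was updated; sync_status.inc_marker stayed
  // empty, so incremental sync started from a null marker (the very
  // beginning of the bilog).
  //
  // Fix: assign the marker back into the status that gets persisted.
  sync_status.inc_marker = inc_marker;
}
```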

@cbodley (Contributor) commented Oct 19, 2016

thanks, this looks correct. i'll run some tests to validate

@cbodley self-assigned this on Oct 19, 2016
@cbodley (Contributor) commented Oct 19, 2016

@Aran85 I created a ticket for this bug at http://tracker.ceph.com/issues/17624 so we can make sure this gets backported to jewel. could you please add `Fixes: http://tracker.ceph.com/issues/17624` to your commit message?

@Aran85 (Contributor, Author) commented Oct 20, 2016

@cbodley done

In `RGWBucketShardFullSyncCR::operate`, inc_marker is assigned the remote bilog's max_marker, but the sync_status's inc_marker is never assigned, so the next incremental sync step always syncs from a null log position, i.e. from the beginning of the log.

Fixes: http://tracker.ceph.com/issues/17624

Signed-off-by: Zengran Zhang <zhangzengran@h3c.com>
@cbodley (Contributor) commented Oct 20, 2016

tested successfully using two vstart clusters with one zone each. i started a gateway on the first zone and created a bucket with ~20 objects. i then started the second zone's gateway, and watched it go through full sync. without this fix, the first request for incremental sync uses an empty marker. with the fix, it uses a valid marker 👍

that said, this points out another issue with `RGWRunBucketSyncCoroutine`: it shouldn't be keeping a cached copy of the sync status outside of the lock. rather than each of the `InitBucketShardSyncStatus`, `BucketShardFullSync`, and `BucketShardIncrementalSync` coroutines taking and releasing that lock, `RunBucketSync` should acquire it at the beginning and hold it over all of the other calls.
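
A rough sketch of that restructuring, with a plain mutex standing in for the rados shard lease and the coroutine machinery elided (all names here are illustrative, not the actual Ceph API):

```cpp
#include <mutex>

// Illustrative states for a bucket shard's sync progress.
enum class SyncState { Init, FullSync, IncrementalSync, Done };

std::mutex status_lock;              // stand-in for the shard's sync lease
SyncState state = SyncState::Init;   // stand-in for the cached sync status

void init_sync_status() { state = SyncState::FullSync; }
void full_sync()        { state = SyncState::IncrementalSync; }
void incremental_sync() { state = SyncState::Done; }

void run_bucket_sync() {
  // Acquire the lock once, up front, and hold it across every phase.
  // The cached status can't go stale mid-run, because no other worker
  // can modify it while we hold the lock.
  std::lock_guard<std::mutex> lease(status_lock);

  if (state == SyncState::Init)            init_sync_status();
  if (state == SyncState::FullSync)        full_sync();
  if (state == SyncState::IncrementalSync) incremental_sync();
}  // lock released once, after all phases complete
```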

`RunBucketSync` also doesn't appear to handle the parallel full and incremental sync correctly, so we'll want to get that working for #10995. i'll work on a design that covers both of these issues and share it with the ceph-devel list

@cbodley merged commit d77fae5 into ceph:master on Oct 20, 2016
@Aran85 deleted the wip-datasync-status branch on October 21, 2016 00:24
@cbodley (Contributor) commented Oct 21, 2016

@Aran85 this was causing a segfault in the `radosgw-admin bucket sync init` command because `RGWBucketSyncStatusManager::init_sync_status()` was returning a coroutine that had a pointer to a temporary variable. i opened #11594 with the fix
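
The crash pattern here is the classic dangling pointer to a caller's temporary. A minimal illustration of the shape of the bug (hypothetical names, not the Ceph code):

```cpp
#include <string>

// A coroutine-like object that stores a pointer to its parameters and
// dereferences it later, when operate() runs.
struct InitStatusCR {
  const std::string* params;
  std::size_t operate() { return params->size(); }  // UB if params dangles
};

InitStatusCR* make_init_cr(const std::string& params) {
  return new InitStatusCR{&params};  // keeps a pointer to the argument
}

int main() {
  // The temporary string is destroyed at the end of this statement,
  // so cr->params dangles before operate() ever runs: the same
  // use-after-free shape as the reported segfault.
  InitStatusCR* cr = make_init_cr(std::string("bucket:shard0"));
  cr->operate();  // undefined behavior: reads freed memory
  delete cr;
}
```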

i also opened #11598 that does the initial lease refactoring for `RGWRunBucketSyncCoroutine`
