New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rgw: add missing RGWPeriod::reflect() based on new atomic update_latest_epoch() #14915

Merged
merged 5 commits into from Jul 14, 2017

Conversation

Projects
None yet
4 participants
@cbodley
Contributor

cbodley commented May 2, 2017

tests of #14688 were failing because a period update --commit in test_multi_period_incremental_sync() was mistakenly removing another zone from the period. this was happening because RGWPeriodPuller had store the latest period without also calling RGWPeriod::reflect() to update the local zonegroup object. so the zone modify --master and period update --commit were operating on an old version of the zonegroup

RGWPeriodPuller could not safely call RGWPeriod::reflect(), because it couldn't determine whether it had the latest epoch of that period (calling reflect() on an older epoch would overwrite the more recent zonegroup object). this was explained by the comment // XXX: if this is a newer epoch, we should overwrite the existing latest_epoch. but there's no way to do that atomically

this patch set adds a new RGWPeriod::update_latest_epoch() to provide atomic updates to the period's latest_epoch object, which callers can use to safely determine whether to call RGWPeriod::reflect() and RGWRealm::notify_new_period()

Fixes: http://tracker.ceph.com/issues/19816
Fixes: http://tracker.ceph.com/issues/19817

@cbodley

This comment has been minimized.

Contributor

cbodley commented May 3, 2017

turns out this doesn't actually fix my issue in multisite-for-teuthology, but the change is still valid and passed the rgw suite: http://pulpito.ceph.com/cbodley-2017-05-02_14:01:34-rgw-wip-19817---basic-smithi/

(2 known s3test failures due to apache, the rest were valgrind failures in the mon)

@cbodley cbodley requested review from yehudasa and oritwas May 3, 2017

@oritwas oritwas self-assigned this May 25, 2017

@oritwas

This comment has been minimized.

Contributor

oritwas commented Jun 6, 2017

jenkins test this please

cbodley added some commits May 2, 2017

rgw: remove unused RGWPeriod::use_next_epoch
Signed-off-by: Casey Bodley <cbodley@redhat.com>
rgw: period latest_epoch ops take optional objv tracker
Signed-off-by: Casey Bodley <cbodley@redhat.com>
rgw: add atomic RGWPeriod::update_latest_epoch
update_latest_epoch() uses RGWObjVersionTracker to implement atomic
updates to the period's latest_epoch, returning -EEXIST if we already
have an epoch >= the one given

Fixes: http://tracker.ceph.com/issues/19816

Signed-off-by: Casey Bodley <cbodley@redhat.com>
rgw: remove set_latest_epoch from RGWPeriod::store_info
split the latest_epoch update out of RGWPeriod::store_info(), so callers
that need to call the atomic update_latest_epoch() can do so and
interpret its result separately

Signed-off-by: Casey Bodley <cbodley@redhat.com>
rgw: call update_latest_epoch() on all period updates
when updating the period, callers use the atomic result of
update_latest_epoch() to determine whether they need to call
RGWPeriod::reflect() and RGWRealm::notify_new_period()

this adds a missing call to RGWPeriod::reflect() to RGWPeriodPuller,
which was previously not safe to do without atomic updates to
latest_epoch

Fixes: http://tracker.ceph.com/issues/19817

Signed-off-by: Casey Bodley <cbodley@redhat.com>
@cbodley

This comment has been minimized.

Contributor

cbodley commented Jun 16, 2017

rebased

@yuriw yuriw merged commit 34fcd22 into ceph:master Jul 14, 2017

4 checks passed

Signed-off-by all commits in this PR are signed
Details
Unmodified Submodules submodules for project are unmodified
Details
arm64 make check arm64 make check succeeded
Details
make check make check succeeded
Details

@cbodley cbodley deleted the cbodley:wip-19817 branch Jul 14, 2017

@smithfarm

This comment has been minimized.

Contributor

smithfarm commented Aug 22, 2017

@oritwas @cbodley So this is fixing failures in the tests introduced by #14688? If that's the case, does it make sense to backport it to jewel given that #14688 is not being so backported?

@smithfarm

This comment has been minimized.

Contributor

smithfarm commented Aug 22, 2017

I guess the answer is "yes" because the tests in #14688 found a real bug that is being addressed by this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment