rgw multisite: fixes for meta sync across periods #13070

Merged

merged 11 commits into ceph:master from cbodley:wip-rgw-meta-sync-periods on Apr 26, 2017

Conversation

@cbodley
Contributor

cbodley commented Jan 23, 2017

fixes for the incremental metadata sync logic that detects the end of one period and advances to the next

the "rgw: RGWMetaSyncShardCR uses fake error to break out of RGWBackoffControlCR" commit is admittedly hacky, but i couldn't find a good way to modify the RGWBackoffControlCR interface to support this use (should RGWBackoffControlCR really keep retrying on success? suggestions welcome!)

passes the updated test_multi_period_incremental_sync() test in #13067
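The retry-on-success question above can be illustrated with a simplified control loop. This is a hypothetical sketch (plain code, not the actual RGWBackoffControlCR coroutine; `BackoffController` and its members are invented names) of the behavior the PR eventually settles on: back off and retry on error, but return as soon as the wrapped operation succeeds.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>

// Hypothetical sketch of a backoff controller that retries on error
// but exits on success. The real RGWBackoffControlCR is a coroutine;
// this just models the control flow under discussion.
struct BackoffController {
  int delay = 1;      // current backoff, in arbitrary ticks
  int max_delay = 16;
  int attempts = 0;   // how many times we invoked the operation

  // Run `op` until it returns 0 (success). Any nonzero result is
  // treated as a retryable error with exponential backoff.
  int run(const std::function<int()>& op) {
    while (true) {
      ++attempts;
      int r = op();
      if (r == 0) {
        return 0;  // success: do NOT keep retrying
      }
      delay = std::min(delay * 2, max_delay);
      // sleep(delay) elided in this sketch
    }
  }
};
```

With this shape, a caller like RGWMetaSyncShardCR can signal "end of period" by simply returning success, rather than by returning a fake error.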

@cbodley cbodley requested a review from yehudasa Jan 23, 2017

@cbodley

Contributor

cbodley commented Jan 23, 2017

bool operator!=(const Cursor& lhs, const Cursor& rhs)
{
  return lhs.history != rhs.history || lhs.epoch != rhs.epoch;
}

@yehudasa

yehudasa Jan 23, 2017

Member

or can just do return !(lhs == rhs)
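The suggestion is the usual idiom of defining `operator!=` in terms of `operator==`. A minimal self-contained sketch, using a simplified stand-in for the real period history Cursor (the `history` string here is a placeholder for the actual history pointer):

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for the period history Cursor: the real type
// points at a history entry and records an epoch within it.
struct Cursor {
  std::string history;  // placeholder for the history pointer
  int epoch = 0;
};

bool operator==(const Cursor& lhs, const Cursor& rhs) {
  return lhs.history == rhs.history && lhs.epoch == rhs.epoch;
}

// operator!= delegates to operator==, as suggested in review,
// so the two can never drift out of sync.
bool operator!=(const Cursor& lhs, const Cursor& rhs) {
  return !(lhs == rhs);
}
```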

@cbodley cbodley referenced this pull request Jan 25, 2017

Merged

rgw multisite: automated mdlog trimming #13111

@cbodley

Contributor

cbodley commented Jan 30, 2017

teuthology results: all green except for 10 dead jobs

@cbodley

Contributor

cbodley commented Jan 30, 2017

jenkins test this please

@cbodley

Contributor

cbodley commented Jan 30, 2017

@yehudasa look okay to merge?

@yehudasa

Member

yehudasa commented Feb 16, 2017

@cbodley there are some cases that I'm not sure we really cover, and maybe we need to think about them in the context of this PR: when we switch to a new period, the max marker of the previous period should be the point that the new master had reached, and not necessarily where the original master was. I don't think we currently do a rollback of changes that happened on the original master but were lost due to the switch to the new master. This can affect all the zones, not just the new master.

@@ -1576,10 +1571,6 @@ class RGWMetaSyncShardCR : public RGWCoroutine {
&max_marker, INCREMENTAL_MAX_ENTRIES,
&log_entries, &truncated));
for (log_iter = log_entries.begin(); log_iter != log_entries.end(); ++log_iter) {
if (!period_marker.empty() && period_marker < log_iter->id) {

@cbodley

cbodley Feb 17, 2017

Contributor

it looks like my diagnosis in this commit was wrong - this line is not comparing markers from different periods. rather, if the old and new master zones agree on the last marker of this period, we won't see any log entries with a marker greater than period_marker, and this block will never execute

@cbodley

Contributor

cbodley commented Mar 7, 2017

@yehudasa updated to fix handling of the period markers - test_multi.py is passing reliably (including the updated test in #13067)

i also changed the interaction between RGWBackoffControlCR and RGWMetaSyncShardCR. RGWBackoffControlCR no longer retries on success, so RGWMetaSyncShardCR can just return success when it reaches the end of a period. this breaks out of the RGWBackoffControlCR loop and returns control back to RGWMetaSyncCR, which can advance to the next period. the previous version of this pull request returned a fake error in this case, but #13546, which recently merged, causes it to retry on all errors

this change to RGWBackoffControlCR exposed some odd behavior in RGWDataSyncCR that relied on retrying after success. in the first call to RGWDataSyncCR, RGWInitDataSyncStatusCoroutine would create the sync status objects and return success. but it didn't initialize the sync_status.sync_markers used to spawn RGWDataSyncShardControlCR, so we'd loop over an empty map, spawn nothing, and return success. once RGWBackoffControlCR called us back, we would re-read the sync status object to get the correct sync_status.sync_markers, and continue as normal.

i modified RGWInitDataSyncStatusCoroutine a bit so that RGWDataSyncCR could pass in a pointer to its rgw_data_sync_status to avoid having to read it back from rados
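The fix described above can be sketched roughly as follows. These are simplified hypothetical types and a plain function standing in for the coroutine: init fills the caller's status in place, so the caller can spawn per-shard control immediately instead of looping over an empty map and re-reading the status from rados.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Simplified stand-ins for the data sync status types.
struct rgw_data_sync_marker { std::string marker; };
struct rgw_data_sync_status {
  uint32_t num_shards = 0;
  std::map<uint32_t, rgw_data_sync_marker> sync_markers;
};

// Sketch of the fixed flow: the caller passes a pointer to its own
// rgw_data_sync_status, and init populates sync_markers directly.
int init_data_sync_status(uint32_t num_shards, rgw_data_sync_status* status) {
  status->num_shards = num_shards;
  for (uint32_t i = 0; i < num_shards; ++i) {
    status->sync_markers[i] = rgw_data_sync_marker{};  // empty markers
  }
  // (persisting the status objects to rados is elided in this sketch)
  return 0;
}
```

After init returns, `status->sync_markers` is already populated, so there is nothing left to read back before spawning the shard coroutines.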

@cbodley cbodley changed the title from rgw multisite: fixes for meta sync across periods to [DNM] rgw multisite: fixes for meta sync across periods Mar 15, 2017

@cbodley

Contributor

cbodley commented Mar 22, 2017

updates:

  • rgw_sync_status_marker includes a realm_epoch so that markers from old periods can be ignored in comparisons against other periods
  • RGWPeriod::update_sync_status() reads markers from its sync status using RGWMetaSyncStatusManager::read_sync_status(), rather than getting the latest markers from its local mdlog (which would include entries that it hasn't synced)

passes tests in #13067 (which have a new sleep() after period commit, which is expected to race with metadata sync by design)
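The realm_epoch change in the first bullet can be sketched with simplified hypothetical types (the struct and function names here are invented for illustration): a marker records the realm epoch of the period it was written in, and a marker from an older epoch is treated as empty rather than compared against a newer period's positions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Sketch of a sync status marker tagged with the realm epoch of the
// period it belongs to. Markers are only meaningful within one
// period, so cross-period comparisons must be avoided.
struct sync_status_marker {
  uint32_t realm_epoch = 0;
  std::string marker;
};

// Return the marker string, or empty if the marker was recorded in an
// older period than the one currently being synced.
std::string effective_marker(const sync_status_marker& m,
                             uint32_t current_realm_epoch) {
  if (m.realm_epoch < current_realm_epoch) {
    return "";  // stale: pretend we have no position in this period
  }
  return m.marker;
}
```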

@cbodley cbodley changed the title from [DNM] rgw multisite: fixes for meta sync across periods to rgw multisite: fixes for meta sync across periods Mar 22, 2017

@cbodley

Contributor

cbodley commented Apr 10, 2017

rebased, ping @yehudasa

const auto current_epoch = current_period.get_realm_epoch();
if (current_epoch != status.sync_info.realm_epoch) {
  // no sync status markers for the current period
  const int behind = current_epoch - status.sync_info.realm_epoch;

@yehudasa

yehudasa Apr 10, 2017

Member

@cbodley is it possible somehow that current_epoch < status.sync_info.realm_epoch?

@cbodley

cbodley Apr 10, 2017

Contributor

probably not possible, because RGWPeriod::commit() has a check for if (realm_epoch != current_period.get_realm_epoch() + 1) just prior to calling update_sync_status(). that would mean that meta sync had advanced past the current_period

@yehudasa

yehudasa Apr 10, 2017

Member

@cbodley how about adding assert there to cover that case?

@cbodley

cbodley Apr 10, 2017

Contributor

added assert(current_epoch > status.sync_info.realm_epoch);
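The exchange above boils down to an invariant: by the time update_sync_status() runs, the current period's realm epoch must be strictly greater than the one recorded in the sync status, so the "behind" count is always positive. A hedged sketch of that check (invented function name, not the actual rgw code):

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the epoch comparison discussed above: how many realm
// epochs the sync status lags behind the current period. The assert
// encodes the invariant that meta sync can never be ahead of the
// current period at this call site, since RGWPeriod::commit() has
// already validated the epoch before calling update_sync_status().
int epochs_behind(uint32_t current_epoch, uint32_t status_epoch) {
  assert(current_epoch > status_epoch);
  return static_cast<int>(current_epoch - status_epoch);
}
```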

@yehudasa

lgtm

@cbodley

Contributor

cbodley commented Apr 10, 2017

jenkins test this please

cbodley added some commits Aug 23, 2016

rgw: add == and != operators for period history cursor
RGWMetaSyncCR was using operator== but it always returned true!

Signed-off-by: Casey Bodley <cbodley@redhat.com>
rgw: RGWBackoffControlCR only retries until success
Signed-off-by: Casey Bodley <cbodley@redhat.com>
rgw: clean up RGWInitDataSyncStatusCoroutine
RGWInitDataSyncStatusCoroutine operates on a given rgw_data_sync_status
pointer, which saves us from having to read it back from rados

Signed-off-by: Casey Bodley <cbodley@redhat.com>
rgw: change metadata read_sync_status interface
makes the same change to read_sync_status() in RGWMetaSyncStatusManager,
needed to support multiple concurrent readers for the rest interface

Signed-off-by: Casey Bodley <cbodley@redhat.com>
rgw: use RGWShardCollectCR for RGWReadSyncStatusCoroutine
Signed-off-by: Casey Bodley <cbodley@redhat.com>
rgw: store realm epoch with sync status markers
sync status markers can't be compared between periods, so we need to
record the current period's realm epoch with its markers. when the
rgw_meta_sync_info.realm_epoch is more recent than the marker's
realm_epoch, we must treat the marker as empty

Signed-off-by: Casey Bodley <cbodley@redhat.com>

cbodley added some commits Mar 20, 2017

rgw: period commit uses sync status markers
Signed-off-by: Casey Bodley <cbodley@redhat.com>
rgw: remove rgw_realm_reconfigure_delay
when the master zone is changed, this config variable was increasing the
window of time where the old master zone would continue to handle
requests to modify metadata. those changes would not be reflected by the
new metadata master zone, and would be lost to the cluster

it was an attempt to optimize for the unlikely case of multiple period
changes in a short period of time, but the logic in reload() handles this
case correctly as is

Signed-off-by: Casey Bodley <cbodley@redhat.com>
rgw: require --yes-i-really-mean-it to promote zone with stale metadata
if a zone is promoted to master before it has a chance to sync from the
previous master zone, any metadata entries after its sync position will
be lost

print an error if 'period commit' is trying to promote a zone that is
more than one period behind the current master, and only allow the
commit to proceed if the --yes-i-really-mean-it flag is provided

Signed-off-by: Casey Bodley <cbodley@redhat.com>
test/rgw: sync status ignores shard markers from previous periods
Signed-off-by: Casey Bodley <cbodley@redhat.com>
@cbodley

Contributor

cbodley commented Apr 26, 2017

rebased and still passing test_multi_period_incremental_sync()

the only conflicts were in test_multi.py, so i'm not planning to run through teuthology again

@cbodley

Contributor

cbodley commented Apr 26, 2017

@yehudasa @mattbenjamin could one of you please override the failed build check? the only make check failure was 148 - osd-scrub-repair.sh (Failed)

@yehudasa yehudasa merged commit 97157f9 into ceph:master Apr 26, 2017

3 of 4 checks passed

  • default: Build finished.
  • Signed-off-by: all commits in this PR are signed
  • Unmodified Submodules: submodules for project are unmodified
  • arm build: successfully built on arm

@cbodley cbodley deleted the cbodley:wip-rgw-meta-sync-periods branch Apr 26, 2017

dongbula pushed a commit to dongbula/ceph that referenced this pull request Jul 22, 2017

Merge pull request ceph#13070 from cbodley/wip-rgw-meta-sync-periods
rgw multisite: fixes for meta sync across periods

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>