rgw: sync status compares the current master period #12907

Merged
merged 3 commits into ceph:master on Jan 18, 2017

2 participants
@theanalyst
Member

theanalyst commented Jan 12, 2017

Previously, sync status compared the master's oldest period against the current local period, leading to an error message. Fix this by getting the current period from the realm.

Fixes: http://tracker.ceph.com/issues/18064

@theanalyst

Member

theanalyst commented Jan 12, 2017

still incomplete, and seeing that metadata sync still always reports that it is behind by a few shards

@theanalyst theanalyst requested a review from cbodley Jan 12, 2017

@cbodley

thanks for taking this on!

ret = sync.read_master_log_shards_info(&master_period, &master_shards_info);
/* Set the master zonegroup as the remote */
RGWPeriod current_period(local_period);

@cbodley

cbodley Jan 12, 2017

Contributor

the naming of local_period and current_period here is confusing. the local/current period is already initialized as RGWRados::current_period, and can be queried with store->get_current_period_id(). the master zone is aggressive about making sure other zones have the latest period, so it's safe to trust our local copy instead of querying it from the master's realm

@theanalyst

theanalyst Jan 12, 2017

Member

ah ok, didn't know that, will modify

status.push_back(string("failed to fetch realm info from master: ") + cpp_strerror(-ret));
return;
}
ret = sync.read_master_log_shards_info(&master_period, &master_shards_info);

@cbodley

cbodley Jan 12, 2017

Contributor

sync_status.sync_info.period records our sync progress in the master's mdlogs, which are split up by period. master_period from sync.read_master_log_shards_info() just tells new zones which period to start on for incremental sync, so we shouldn't consult master_period here at all

the comparison below should just be if (sync_status.sync_info.period != store->get_current_period_id())

@theanalyst

theanalyst Jan 12, 2017

Member

yeah, earlier we were checking the current_period against master_period from the above call, which is why I thought the md sync status was always wrong (since that is the first period to start sync from). Will modify to use get_current_period_id().
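
The check agreed on in this thread can be sketched in isolation as follows. This is a minimal illustration, not the actual rgw_admin code: `SyncInfo` and `Store` are hypothetical stand-ins for `sync_status.sync_info` and the store object, and the status message wording is made up for the example. Only the comparison itself comes from the review comment above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for sync_status.sync_info: the period whose mdlog
// this zone is currently syncing.
struct SyncInfo {
  std::string period;
};

// Hypothetical stand-in for the store, exposing only the local copy of the
// realm's current period ID.
struct Store {
  std::string current_period_id;
  const std::string& get_current_period_id() const { return current_period_id; }
};

// Appends a line to `status` when sync has not yet reached the current period.
// The message wording is illustrative only.
void check_sync_period(const SyncInfo& sync_info, const Store& store,
                       std::vector<std::string>& status) {
  if (sync_info.period != store.get_current_period_id()) {
    status.push_back("sync is on period " + sync_info.period +
                     ", current period is " + store.get_current_period_id());
  }
}

// Usage example: nothing is reported once the two periods match.
void check_sync_period_example() {
  std::vector<std::string> status;
  check_sync_period(SyncInfo{"period-2"}, Store{"period-2"}, status);
  assert(status.empty());
  check_sync_period(SyncInfo{"period-1"}, Store{"period-2"}, status);
  assert(status.size() == 1);
}
```

Note that master_period does not appear anywhere in the sketch, which matches the point that it should not be consulted for this check.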

@cbodley

Contributor

cbodley commented Jan 12, 2017

looks good, thanks. can we drop the changes to do_realm_get()?

rgw_admin: get master's period from store's current period info
This ensures that we get the current period in contrast to the admin log
which gets the master's earliest period.

Fixes: http://tracker.ceph.com/issues/18064
Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
@theanalyst

Member

theanalyst commented Jan 13, 2017

Dropped the other commits. The sync status no longer shows the period error, but I'm now often seeing the metadata in the secondary behind by one shard:

          realm 3b284993-df62-448a-acdd-02fc217b71bf (gold)
      zonegroup 89cd2f99-0f4b-4c0d-86d1-9fc2c9071c45 (us)
           zone 97ea1518-abe3-4c6b-acfd-2113929ac77a (us-west)
  metadata sync syncing
                full sync: 0/64 shards
                incremental sync: 64/64 shards
                metadata is behind on 1 shards
      data sync source: b4c5323d-30a2-4a4f-b84e-59dac54f5ba8 (us-east-1)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source

and this shard seems to be mdlog from the master's first period, which was the zone user creation. Though the zone user is created in the secondary, I'm not seeing an mdlog entry for it.
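
For readers unfamiliar with how a line like "metadata is behind on 1 shards" can arise, here is a rough sketch: for each shard, compare the marker reported by the master with the marker the local zone has synced to. The type alias, the function name, and the lexicographic comparison of markers are all assumptions made for illustration and may differ from what radosgw-admin actually does.

```cpp
#include <cassert>
#include <map>
#include <string>

// shard id -> log marker. Markers are modeled as plain strings that compare
// lexicographically (an assumption made for this sketch).
using ShardMarkers = std::map<int, std::string>;

// Counts the shards whose master marker is ahead of the local marker.
// Shards the master did not report are skipped.
int count_shards_behind(const ShardMarkers& master, const ShardMarkers& local) {
  int behind = 0;
  for (const auto& entry : local) {
    auto it = master.find(entry.first);
    if (it != master.end() && it->second > entry.second) {
      ++behind;
    }
  }
  return behind;
}

// Usage example: if the master markers come from a different period than the
// one being synced, a shard can look permanently behind.
void count_shards_behind_example() {
  ShardMarkers local_current_period{{0, ""}, {1, "1_0005"}};
  ShardMarkers master_first_period{{0, "1_0001"}, {1, "1_0005"}};
  ShardMarkers master_current_period{{0, ""}, {1, "1_0005"}};
  assert(count_shards_behind(master_first_period, local_current_period) == 1);
  assert(count_shards_behind(master_current_period, local_current_period) == 0);
}
```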

@cbodley

Contributor

cbodley commented Jan 13, 2017

and this shard seems to be mdlog from the master's first period

okay, that's an issue with RGWRemoteMetaLog::read_master_log_shards_info(). it's reading the rgw_mdlog_info from the master, then requesting the mdlog shards for rgw_mdlog_info::period, which is the master's oldest mdlog period. what we want is the shards for the current period - so we should either pass in our store->get_current_period_id() or an empty string (which the master will interpret to mean 'current period'). something like this?

-int RGWRemoteMetaLog::read_master_log_shards_info(string *master_period, map<int, RGWMetadataLogInfo> *shards_info)
+int RGWRemoteMetaLog::read_master_log_shards_info(const string& period, map<int, RGWMetadataLogInfo> *shards_info)
 {
   if (store->is_meta_master()) {
     return 0;
   }
 
   rgw_mdlog_info log_info;
   int ret = read_log_info(&log_info);
   if (ret < 0) {
     return ret;
   }
-
-  *master_period = log_info.period;
 
-  return run(new RGWReadRemoteMDLogInfoCR(&sync_env, log_info.period, log_info.num_shards, shards_info));
+  return run(new RGWReadRemoteMDLogInfoCR(&sync_env, period, log_info.num_shards, shards_info));
 }
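
To illustrate the "empty string means current period" convention mentioned above, here is a small model of a master whose mdlog is split up by period. `MasterModel` and `read_log_shards` are hypothetical names invented for this sketch; the thread only states that the master interprets an empty period as the current one.

```cpp
#include <cassert>
#include <map>
#include <string>

using ShardMarkers = std::map<int, std::string>;  // shard id -> latest marker

// Hypothetical model of a master zone that keeps one set of mdlog shards
// per period.
struct MasterModel {
  std::string oldest_period;
  std::string current_period;
  std::map<std::string, ShardMarkers> mdlog_by_period;

  // Returns the shard markers for `period`. An empty `period` is resolved
  // to the current period; an unknown period yields an empty result.
  ShardMarkers read_log_shards(const std::string& period) const {
    const std::string& key = period.empty() ? current_period : period;
    auto it = mdlog_by_period.find(key);
    return it == mdlog_by_period.end() ? ShardMarkers{} : it->second;
  }
};

// Usage example: asking for the oldest period returns different markers
// than asking for the current one.
void read_log_shards_example() {
  MasterModel master;
  master.oldest_period = "p1";
  master.current_period = "p2";
  master.mdlog_by_period["p1"][0] = "marker-old";
  master.mdlog_by_period["p2"][0] = "marker-new";

  assert(master.read_log_shards("").at(0) == "marker-new");
  assert(master.read_log_shards("p2").at(0) == "marker-new");
  assert(master.read_log_shards(master.oldest_period).at(0) == "marker-old");
}
```

In this model, passing the local current period ID and passing an empty string give the same result, as long as the local zone and the master agree on the current period.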
@theanalyst

Member

theanalyst commented Jan 13, 2017

I see. I thought there were other consumers of master_period, but it looks like there aren't, in which case we can make this change.

rgw: allow getting master log shards info on specified period
This is needed for rgw admin's sync status or else we end up always
publishing that we're behind since we are always checking against
master's first period to sync from

Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
@theanalyst

Member

theanalyst commented Jan 13, 2017

Synced state:

          realm d01f13a6-45ad-45fb-8b0b-cd03a05382c8 (gold)
      zonegroup 6b1b12f5-051b-4c81-b6bf-b5f3ba82bba2 (us)
           zone 6ad3f4a2-afed-49e8-8eae-1c92b83c46c4 (us-west)
  metadata sync syncing
                full sync: 0/64 shards
                incremental sync: 64/64 shards
                metadata is caught up with master
      data sync source: 56301478-7c86-4c66-ab96-e85bcaf6d6e6 (us-east-1)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source

Syncing state:

      zonegroup 6b1b12f5-051b-4c81-b6bf-b5f3ba82bba2 (us)
           zone 6ad3f4a2-afed-49e8-8eae-1c92b83c46c4 (us-west)
  metadata sync syncing
                full sync: 0/64 shards
                incremental sync: 64/64 shards
                metadata is behind on 3 shards
                oldest incremental change not applied: 2017-01-13 16:24:39.0.337583s
      data sync source: 56301478-7c86-4c66-ab96-e85bcaf6d6e6 (us-east-1)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source
rgw_admin: read master log shards from master's current period
Also make the sync output look similar to the output of data sync
Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
@theanalyst

Member

theanalyst commented Jan 14, 2017

test this please

@theanalyst theanalyst changed the title from wip: rgw: sync status compares the current master period to rgw: sync status compares the current master period Jan 16, 2017

@cbodley cbodley merged commit a18c7e1 into ceph:master Jan 18, 2017

3 checks passed

Signed-off-by all commits in this PR are signed
Unmodifed Submodules submodules for project are unmodified
default Build finished.