
rgw multisite: trim data logs as peer zones catch up #10372

Merged
merged 13 commits on Oct 4, 2016

Conversation

cbodley
Contributor

cbodley commented Jul 20, 2016

No description provided.

void RGWOp_MDLog_Status::execute()
{
// construct a temporary status manager to read the sync status
RGWMetaSyncStatusManager sync(store, store->get_async_rados());
Member

@cbodley shouldn't we have an existing status manager through s->store?
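
A hedged sketch of the reuse being suggested here, assuming the store exposes its manager through an accessor like RGWRados::get_meta_sync_manager() (that accessor name, and the http_ret/status members, are illustrative rather than confirmed):

void RGWOp_MDLog_Status::execute()
{
  // reuse the manager owned by the store instead of building a temporary one
  RGWMetaSyncStatusManager *sync = store->get_meta_sync_manager();  // hypothetical accessor
  http_ret = sync->read_sync_status(&status);
}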

@yehudasa
Member

@cbodley overall looks good. Meta should be pretty similar, will need to get bucket index trimming too.

RGWSimpleRadosReadCR won't currently fail with ENOENT, but instead
passes an empty object to handle_data(). add an empty_on_enoent flag to
the constructor, defaulting to true, to make this behavior optional for
callers that do want to fail on ENOENT

Signed-off-by: Casey Bodley <cbodley@redhat.com>
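
A minimal, self-contained illustration of the empty_on_enoent behavior described above; the struct and helper below are invented for the example and are not the actual coroutine code:

#include <cerrno>

struct object_data { /* whatever the read returns */ };

// maps a raw read result onto the empty_on_enoent behavior: by default a
// missing object becomes a successful read of an empty value
int complete_read(int read_ret, object_data *result, bool empty_on_enoent = true)
{
  if (read_ret == -ENOENT && empty_on_enoent) {
    *result = object_data{};  // hand an empty object to handle_data()
    return 0;                 // the caller never sees ENOENT
  }
  return read_ret;            // empty_on_enoent=false: fail on ENOENT
}
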
this allows us to limit the number of outstanding requests for shard
markers

there also appeared to be issues with spawning the shard CRs
from RGWReadDataSyncStatusCoroutine::handle_data(), because
handle_data() was returning before the shard CRs completed

Signed-off-by: Casey Bodley <cbodley@redhat.com>
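
A generic sketch of the windowing idea using plain C++20 threads rather than the RGW coroutine framework; fetch_shard_markers, read_shard_marker, and the parameters are invented for the illustration:

#include <semaphore>
#include <thread>
#include <vector>

// keep at most max_concurrent shard-marker requests in flight at once
void fetch_shard_markers(int num_shards, int max_concurrent)
{
  std::counting_semaphore<> window(max_concurrent);
  std::vector<std::jthread> workers;
  for (int shard = 0; shard < num_shards; ++shard) {
    window.acquire();                 // blocks while the window is full
    workers.emplace_back([&window](int s) {
      // read_shard_marker(s);        // hypothetical per-shard request
      (void)s;
      window.release();               // free a slot for the next spawn
    }, shard);
  }
}                                     // jthreads join on scope exit
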
also took out the redundant 'rgw' from the 'rgw meta sync:' log prefix

Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
REST handlers for sync status need to return ENOENT errors. the only
other callers are in radosgw-admin, so the ENOENT errors are ignored at
those call sites instead

Signed-off-by: Casey Bodley <cbodley@redhat.com>
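
An illustrative radosgw-admin call site for the pattern just described (the variable names are invented): the admin command swallows ENOENT because a missing status object simply means sync hasn't started, while the REST handler propagates it:

int ret = sync.read_sync_status(&sync_status);
if (ret < 0 && ret != -ENOENT) {
  std::cerr << "ERROR: read_sync_status() returned ret=" << ret << std::endl;
  return -ret;
}
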
Signed-off-by: Casey Bodley <cbodley@redhat.com>
RGWDataSyncStatusManager::read_sync_status() now operates on the given
parameter, rather than its internal member variable. this allows
multiple concurrent readers, which is needed for the REST interface

Signed-off-by: Casey Bodley <cbodley@redhat.com>
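
A sketch of the interface change (the surrounding class body is elided; rgw_data_sync_status is the status type used by the data sync code):

struct rgw_data_sync_status;  // defined in the rgw data sync headers

class RGWDataSyncStatusManager {
 public:
  // before: int read_sync_status();  // filled an internal member variable
  // after: the caller passes its own status object, so multiple readers
  // (e.g. concurrent REST requests) can run at the same time
  int read_sync_status(rgw_data_sync_status *sync_status);
};
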
RGWCoroutinesManager::run() is not reentrant, so concurrent users of
read_sync_status() must use different managers

Signed-off-by: Casey Bodley <cbodley@redhat.com>
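
A hedged fragment of the pattern this commit describes (the constructor arguments and surrounding code are illustrative): each reader drives its own coroutine manager, since a single RGWCoroutinesManager::run() cannot be entered concurrently:

// inside read_sync_status(): build a private manager for this call
RGWCoroutinesManager crs(store->ctx(), store->get_cr_registry());  // args illustrative
int ret = crs.run(new RGWReadDataSyncStatusCoroutine(&sync_env, sync_status));
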
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
@cbodley
Contributor Author

cbodley commented Jul 22, 2016

updated to address your comments. the REST API now uses the existing SyncStatusManagers, which required these new commits:

  • rgw: expose sync managers through RGWRados
  • rgw: change read_sync_status interface
  • rgw: use separate cr manager for read_sync_status

also changed so that an rgw_sync_log_trim_interval of 0 will disable the trim thread
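
A hedged sketch of that behavior; the thread type and member names are invented for the illustration, and only the rgw_sync_log_trim_interval option comes from the comment above:

int interval = cct->_conf->rgw_sync_log_trim_interval;
if (interval > 0) {
  // hypothetical trim thread; started only for a positive interval
  sync_log_trimmer = new RGWSyncLogTrimThread(this, interval);
  sync_log_trimmer->start();
}  // interval == 0: the trim thread is never started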

@cbodley
Contributor Author

cbodley commented Sep 20, 2016

@yehudasa this was part of the latest wip-cbodley-testing run, but there was a single s3-tests failure that hit the assert() that detects deadlocks in RGWCoroutinesManager::run(). at some point during the run, osd.2 started failing every op with -ECANCELED, leading to a cascade of coroutine errors, so I'm not sure whether the cr deadlock was due to that unique failure case or was caused by new coroutines added by this PR. I'll run this branch through more local testing to see if I can learn more.

edit: actually, the assertion was hit inside the RGWMetaSyncProcessorThread, and this PR doesn't make any changes to meta sync coroutines
