
rgw multisite: trim data logs as peer zones catch up #10372

Merged
merged 13 commits on Oct 4, 2016

Conversation

cbodley
Contributor

cbodley commented Jul 20, 2016

No description provided.

void RGWOp_MDLog_Status::execute()
{
// construct a temporary status manager to read the sync status
RGWMetaSyncStatusManager sync(store, store->get_async_rados());
Member

@cbodley shouldn't we have an existing status manager through s->store?
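
A hedged sketch of the reuse being suggested here, assuming the store exposes its manager through an accessor like RGWRados::get_meta_sync_manager() (that accessor name, and the http_ret/status members, are illustrative rather than confirmed):

void RGWOp_MDLog_Status::execute()
{
  // reuse the manager owned by the store instead of building a temporary one
  RGWMetaSyncStatusManager *sync = store->get_meta_sync_manager();  // hypothetical accessor
  http_ret = sync->read_sync_status(&status);
}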

@yehudasa
Member

@cbodley overall looks good. Meta should be pretty similar, will need to get bucket index trimming too.

RGWSimpleRadosReadCR won't currently fail with ENOENT, but instead
passes an empty object to handle_data(). add an empty_on_enoent flag to
the constructor, defaulting to true, to make this behavior optional for
callers that do want to fail on ENOENT

Signed-off-by: Casey Bodley <cbodley@redhat.com>
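
A minimal, self-contained illustration of the empty_on_enoent behavior described above; the struct and helper below are invented for the example and are not the actual coroutine code:

#include <cerrno>

struct object_data { /* whatever the read returns */ };

// maps a raw read result onto the empty_on_enoent behavior: by default a
// missing object becomes a successful read of an empty value
int complete_read(int read_ret, object_data *result, bool empty_on_enoent = true)
{
  if (read_ret == -ENOENT && empty_on_enoent) {
    *result = object_data{};  // hand an empty object to handle_data()
    return 0;                 // the caller never sees ENOENT
  }
  return read_ret;            // empty_on_enoent=false: fail on ENOENT
}
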
this allows us to limit the number of outstanding requests for shard
markers

there also appeared to be issues with spawning the shard CRs
from RGWReadDataSyncStatusCoroutine::handle_data(), because
handle_data() was returning before the shard CRs completed

Signed-off-by: Casey Bodley <cbodley@redhat.com>
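
A generic sketch of the windowing idea using plain C++20 threads rather than the RGW coroutine framework; fetch_shard_markers, read_shard_marker, and the parameters are invented for the illustration:

#include <semaphore>
#include <thread>
#include <vector>

// keep at most max_concurrent shard-marker requests in flight at once
void fetch_shard_markers(int num_shards, int max_concurrent)
{
  std::counting_semaphore<> window(max_concurrent);
  std::vector<std::jthread> workers;
  for (int shard = 0; shard < num_shards; ++shard) {
    window.acquire();                 // blocks while the window is full
    workers.emplace_back([&window](int s) {
      // read_shard_marker(s);        // hypothetical per-shard request
      (void)s;
      window.release();               // free a slot for the next spawn
    }, shard);
  }
}                                     // jthreads join on scope exit
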
also took out the redundant 'rgw' from the 'rgw meta sync:' log prefix

Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
REST handlers for sync status need to return ENOENT errors. the only
other callers are in radosgw-admin, so the ENOENT errors are ignored at
those call sites instead

Signed-off-by: Casey Bodley <cbodley@redhat.com>
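
An illustrative radosgw-admin call site for the pattern just described (the variable names are invented): the admin command swallows ENOENT because a missing status object simply means sync hasn't started, while the REST handler propagates it:

int ret = sync.read_sync_status(&sync_status);
if (ret < 0 && ret != -ENOENT) {
  std::cerr << "ERROR: read_sync_status() returned ret=" << ret << std::endl;
  return -ret;
}
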
Signed-off-by: Casey Bodley <cbodley@redhat.com>
RGWDataSyncStatusManager::read_sync_status() now operates on the given
parameter, rather than its internal member variable. this allows
multiple concurrent readers, which is needed for the REST interface

Signed-off-by: Casey Bodley <cbodley@redhat.com>
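
A sketch of the interface change (the surrounding class body is elided; rgw_data_sync_status is the status type used by the data sync code):

struct rgw_data_sync_status;  // defined in the rgw data sync headers

class RGWDataSyncStatusManager {
 public:
  // before: int read_sync_status();  // filled an internal member variable
  // after: the caller passes its own status object, so multiple readers
  // (e.g. concurrent REST requests) can run at the same time
  int read_sync_status(rgw_data_sync_status *sync_status);
};
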
RGWCoroutinesManager::run() is not reentrant, so concurrent users of
read_sync_status() must use different managers

Signed-off-by: Casey Bodley <cbodley@redhat.com>
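
A hedged fragment of the pattern this commit describes (the constructor arguments and surrounding code are illustrative): each reader drives its own coroutine manager, since a single RGWCoroutinesManager::run() cannot be entered concurrently:

// inside read_sync_status(): build a private manager for this call
RGWCoroutinesManager crs(store->ctx(), store->get_cr_registry());  // args illustrative
int ret = crs.run(new RGWReadDataSyncStatusCoroutine(&sync_env, sync_status));
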
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
@cbodley
Contributor Author

cbodley commented Jul 22, 2016

updated to address your comments. the REST API now uses the existing SyncStatusManagers, which required these new commits:

  • rgw: expose sync managers through RGWRados
  • rgw: change read_sync_status interface
  • rgw: use separate cr manager for read_sync_status

also changed so that an rgw_sync_log_trim_interval of 0 will disable the trim thread
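
A hedged sketch of that behavior; the thread type and member names are invented for the illustration, and only the rgw_sync_log_trim_interval option comes from the comment above:

int interval = cct->_conf->rgw_sync_log_trim_interval;
if (interval > 0) {
  // hypothetical trim thread; started only for a positive interval
  sync_log_trimmer = new RGWSyncLogTrimThread(this, interval);
  sync_log_trimmer->start();
}  // interval == 0: the trim thread is never started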

@cbodley
Contributor Author

cbodley commented Sep 20, 2016

@yehudasa this was part of the latest wip-cbodley-testing run, but there was a single s3-tests failure that hit the assert() that detects deadlocks in RGWCoroutinesManager::run(). at some point during the run, osd.2 started failing every op with -ECANCELED, leading to a cascade of coroutine errors, so I'm not sure whether the cr deadlock was due to that unique failure case or was caused by new coroutines added by this PR. I'll run this branch through more local testing to see if I can learn more.

edit: actually, the assertion was hit inside the RGWMetaSyncProcessorThread, and this PR doesn't make any changes to meta sync coroutines
