New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rgw: multisite log tracing #16492
rgw: multisite log tracing #16492
Conversation
@yehudasa I spent some time trying to find something worrisome in this; I haven't found it so far. I don't -think- sending updates to the mgr at 10s intervals should be costly at either end? I did wonder whether we have a modern idiom replacing RWLock? |
@yehudasa ok, one practical question; this change creates |
@mattbenjamin while we update the status every 10 seconds, it is librados that decides when to send these updates, following its own internal config |
@mattbenjamin I don't recall any such periodic mgr thread that can be used. I do think that it'd be nice to have something we could hook into, but maybe that's out of scope. |
17ff646
to
2e66311
Compare
@mattbenjamin rebased |
src/rgw/rgw_sync_trace.h
Outdated
}; | ||
|
||
/* | ||
* a container to RGWSyncTraceNodeRef, responsible to keep track |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's start sentence with capital letter. Let's make a
capital.
src/rgw/rgw_sync_trace.h
Outdated
* of live nodes, and when last ref is dropped, calls ->finish() | ||
* so that node moves to the retired list in the manager | ||
*/ | ||
class RGWSyncTraceNodeContainer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this wrapper adds an extra allocation to create and indirection to access. consider using a custom deleter with shared_ptr
instead? this works for me:
-using RGWSTNCRef = std::shared_ptr<RGWSyncTraceNodeContainer>;
-RGWSTNCRef RGWSyncTraceManager::add_node(RGWSyncTraceNode *node)
+RGWSyncTraceNodeRef RGWSyncTraceManager::add_node(RGWSyncTraceNode *node)
{
RWLock::WLocker wl(lock);
RGWSyncTraceNodeRef& ref = nodes[node->handle];
ref.reset(node);
- return RGWSTNCRef(new RGWSyncTraceNodeContainer(ref));
+ // return a separate shared_ptr that calls finish() on the node instead of
+ // deleting it. the lambda capture holds a reference to the original 'ref'
+ auto deleter = [ref] (RGWSyncTraceNode *node) { node->finish(); };
+ return {node, deleter};
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the full commit that replaces uses of RGWSTNCRef
with RGWSyncTraceNodeRef
is available here: cbodley@af246d3
src/rgw/rgw_data_sync.cc
Outdated
ldout(sync_env->cct, 20) << __func__ << "(): sync status for bucket " | ||
<< bucket_shard_str{bs} << ": " << sync_status.state << dendl; | ||
tn->log(20, SSTR("sync status for bucket " | ||
<< bucket_shard_str{bs} << ": " << sync_status.state)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we should leave bucket_shard_str
part out of these messages, now that it's already included in the log message. for example, this one looks like:
data:sync:shard[75]:entry[eslbge-29:d2560962-42d3-4295-ac0c-efbbce6b79d8.4134.9]:bucket[eslbge-29:d2560962-42d3-4295-ac0c-efbbce6b79d8.4134.9]: sync status for bucket eslbge-29:d2560962-42d3-4295-ac0c-efbbce6b79d8.4134.9: 0
@@ -1537,6 +1537,9 @@ OPTION(rgw_sync_log_trim_interval, OPT_INT) // time in seconds between attempts | |||
|
|||
OPTION(rgw_sync_data_inject_err_probability, OPT_DOUBLE) // range [0, 1] | |||
OPTION(rgw_sync_meta_inject_err_probability, OPT_DOUBLE) // range [0, 1] | |||
OPTION(rgw_sync_trace_history_size, OPT_INT) // max number of complete sync trace entries to keep |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yehudasa How good it would be to start comment with Capital letters?
// Max number of complete sync trace entries to keep
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amitkumar50 on a scale of 0 to 100? 3
@cbodley incorporated your changes, comments. @mattbenjamin replaced RWLock with ceph::shunique_lock. |
src/rgw/rgw_sync_trace.h
Outdated
mutable boost::shared_mutex lock; | ||
using shunique_lock = ceph::shunique_lock<decltype(lock)>; | ||
|
||
std::atomic<uint64_t> count; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we init count to 0 here
struct bucket_str_noinstance { | ||
const rgw_bucket& b; | ||
bucket_str_noinstance(const rgw_bucket& b) : b(b) {} | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: newline after struct definition
src/rgw/rgw_sync.h
Outdated
@@ -420,18 +426,13 @@ class RGWMetaSyncSingleEntryCR : public RGWCoroutine { | |||
|
|||
bool error_injection; | |||
|
|||
RGWSyncTraceNodeRef tn; | |||
|
|||
public: | |||
RGWMetaSyncSingleEntryCR(RGWMetaSyncEnv *_sync_env, | |||
const string& _raw_key, const string& _entry_marker, | |||
const RGWMDLogStatus& _op_status, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while you're at it can you fix the alignment here, (not a part of this changeset)
src/rgw/rgw_sync_trace.cc
Outdated
} catch (boost::bad_expression const& e) { | ||
ldout(cct, 5) << "NOTICE: sync trace: bad expression: bad regex search term" << dendl; | ||
} catch (...) { | ||
ldout(cct, 5) << "NOTICE: sync trace: regex_search() through exception" << dendl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
threw
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
So that we could then move nodes to the retired node list in the manager. Also, more tracing implementation. Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Dump trace node messages when either rgw or rgw_sync log level is high enough (will only show one message if both are set). Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
avoid use after free Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
So that we can just see which resources are getting synced right now Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
following code review Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
29f56b8
to
88c2adf
Compare
@cbodley @theanalyst rebased and took care of the issues mentioned |
To prevent lock getting freed before other members that use it for cleanup. Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
these temporary errors get retried automatically, so no admin intervention is required. logging them only serves to waste space in omap and obscure the more serious sync errors Fixes: http://tracker.ceph.com/issues/22473 Signed-off-by: Casey Bodley <cbodley@redhat.com> (cherry picked from commit ca4510b) Conflicts: src/rgw/rgw_data_sync.cc ("multisite log tracing" feature - see ceph#16492 - is not being backported to luminous)
these temporary errors get retried automatically, so no admin intervention is required. logging them only serves to waste space in omap and obscure the more serious sync errors Fixes: http://tracker.ceph.com/issues/22473 Signed-off-by: Casey Bodley <cbodley@redhat.com> (cherry picked from commit ca4510b) Conflicts: src/rgw/rgw_data_sync.cc ("multisite log tracing" feature - see ceph#16492 - is not being backported to luminous)
these temporary errors get retried automatically, so no admin intervention is required. logging them only serves to waste space in omap and obscure the more serious sync errors Fixes: http://tracker.ceph.com/issues/22473 Resolves: rhbz#1530665 Signed-off-by: Casey Bodley <cbodley@redhat.com> (cherry picked from commit ca4510b) Conflicts: src/rgw/rgw_data_sync.cc ("multisite log tracing" feature - see ceph#16492 - is not being backported to luminous)
these temporary errors get retried automatically, so no admin intervention is required. logging them only serves to waste space in omap and obscure the more serious sync errors Fixes: http://tracker.ceph.com/issues/22473 Signed-off-by: Casey Bodley <cbodley@redhat.com> (cherry picked from commit ca4510b) Conflicts: src/rgw/rgw_data_sync.cc ("multisite log tracing" feature - see ceph#16492 - is not being backported to luminous)
these temporary errors get retried automatically, so no admin intervention is required. logging them only serves to waste space in omap and obscure the more serious sync errors Fixes: http://tracker.ceph.com/issues/22473 Resolves: rhbz#1527132 Signed-off-by: Casey Bodley <cbodley@redhat.com> (cherry picked from commit ca4510b) Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> Conflicts: src/rgw/rgw_data_sync.cc ("multisite log tracing" feature - see ceph#16492 - is not being backported to luminous/jewel)
these temporary errors get retried automatically, so no admin intervention is required. logging them only serves to waste space in omap and obscure the more serious sync errors Fixes: http://tracker.ceph.com/issues/22473 Resolves: rhbz#1527132 Signed-off-by: Casey Bodley <cbodley@redhat.com> (cherry picked from commit ca4510b) Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> Conflicts: src/rgw/rgw_data_sync.cc ("multisite log tracing" feature - see ceph#16492 - is not being backported to luminous/jewel)
A new framework that tracks in memory of current rgw sync process. The new system allows following a specific sync entity (vs. the current flat log dump). Each entity has an id that is a concatenation of the path within the execution tree. An entity would roughly reflect a sync role (meta, meta shard, meta entry, data, data shard, bucket shard, object), and it is possible to look at the history of that entity's point of view. New admin socket commands were added:
All commands can get an extra param that is a regex that can be used to search for a specific entity (matching the history of that entity). E.g.,
We keep info about all current running nodes, and also keep some history of complete nodes. We can see a view of current nodes that are marked as active (where we identified and deal with actual meta/data sync):
There's a active_short option that can be used to just get a plain name of the entities that are current syncing (e.g., list of /). This list is being sent to the service map periodically and can be retrieved via the
ceph service status
command.