New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rbd-mirror A/A: coordinate image syncs with leader #14745
Conversation
Implementation details:
|
Nit: typo of "separate" in "rbd-mirror: refactor ImageSyncThrottler" commit message |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My original thinking was that these messages would be part of InstanceWatcher
since that would allow direct RPC between the instance requesting the sync and the leader that can grant it. Sending the RPC via the LeaderWatcher
would result in a broadcast message.
The InstanceWatcher
already has the notion of repeating messages until acked which seems like it could be re-used here. Also, I think you would be able to eliminate the listener pattern by instantiating the InstanceSyncThrottler
when the leader is acquired and by passing the pointer to InstanceWatcher
(and clearing it when the leader role is lost).
} | ||
sync_holder->m_on_finish->complete(r); | ||
m_throttler->finish_op(sync_holder->m_local_image_id); | ||
Mutex::Locker locker(m_lock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I would prefer to clean up the local state before firing the completion down to the instance throttler
@@ -100,6 +100,7 @@ std::vector<MockImageSync *> MockImageSync::instances; | |||
|
|||
|
|||
// template definitions | |||
#include "tools/rbd_mirror/InstanceSyncThrottler.cc" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A template specialization of InstanceSyncThrottle should be defined above and eliminate the need to pull in the CC file here. That would also allow the test cases below to specifically test ImageSyncThrottler instead of its interactions w/ InstanceSyncThrottler.
@@ -68,6 +136,10 @@ struct UnknownPayload { | |||
typedef boost::variant<HeartbeatPayload, | |||
LockAcquiredPayload, | |||
LockReleasedPayload, | |||
SyncRequestPayload, | |||
SyncRequestAckPayload, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: perhaps this message could be called SyncStartPayload
(from leader -> instance to grant permission to start the sync) and we could eliminate the previous use of SyncStartPayload
below. The Ack
suffix just seems odd to me since it's not an acknowledgement of the request but rather a command to start.
@dillaman Thank you for the review. Using |
@dillaman updated |
std::list<C_SyncHolder *> m_sync_queue; | ||
std::map<PoolImageId, C_SyncHolder *> m_inflight_syncs; | ||
|
||
std::map<std::string, C_SyncHolder *> m_waiting_syncs; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: seems like you can remove this collection since once you schedule the sync, you hand the management over to InstanceWatcher
for starting / canceling. This collection just seems to be used for an assert.
|
||
class ProgressContext; | ||
|
||
/** | ||
* Manage concurrent image-syncs | ||
*/ | ||
template <typename ImageCtxT = librbd::ImageCtx> | ||
class ImageSyncThrottler : public md_config_obs_t { | ||
class ImageSyncThrottler { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: should this class be the InstanceSyncThrottler
since it's associated w/ the instance and the original version (now named InstanceSyncThrottler
) should remain the ImageSyncThrottler
(or LeaderSyncThrottler
)? Right now, the InstanceXYZ
classes are tied to a specific instance of rbd-mirror.
Alternatively, if you pass the InstanceWatcher
reference down to the bootstrap state machine, it could instantly create a ImageSync
state machine whose first state is just to invoke InstanceWatcher::notify_sync_start
and wait. That would eliminate this pass-through class entirely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
InstanceSyncThrottler
name was chosen when I was working on previous version, when this class was handling sync notifications via the leader object. Now this name really looks wrong. I wouldn't like to change it to ImageSyncThrottler
because it knows nothing about 'image', and even that operations it throttles are syncs. I am going to change it to just Throttler
.
I will evaluate your idea to remove ImageSyncThrottler
and use InstanceWatcher::notify_sync_xyz
methods directly in ImageSync
state machine. Thanks.
<< ": canceling request to local throttler" << dendl; | ||
if (on_start != nullptr) { | ||
if (instance_id == m_instance_id) { | ||
send_sync_start(sync_id, on_start); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: can you just bubble up -ESTALE for the local instance as well to avoid the special case?
return; | ||
} | ||
|
||
// This should only be possible when notify_finish_sync for the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this situation occur?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A user completes a sync by sending notify_finish_sync
(which returns immediately, after queuing the sync notification to the leader), and after this (by some reason) calls notify_start_sync
for the same image again. At this moment the previous notification for this sync_id may still be in progress.
This could be avoided by using some unique value for sync_id instead of image_id by the caller, but this will complicate things on the caller side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK. Would it be possible to remove the special on_complete
callback and just wrap the on_start
context in that case with this special handling?
@@ -72,8 +74,19 @@ class InstanceWatcher : protected librbd::Watcher { | |||
const std::string &peer_image_id, | |||
bool schedule_delete, Context *on_notify_ack); | |||
|
|||
void notify_start_sync(const std::string &request_id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: perhaps swap to "noun_verb" form like the methods above? notify_sync_start
, notify_sync_cancel
, etc
@trociny I think the optimization to directly pass-through image sync requests for the local leader's images might be adding undo complications. The code complexity would most likely be reduced if we just eliminated all those special cases and just send the message and allowed it to be received back by the watcher. |
assert(sync_ctx->on_complete == nullptr); | ||
sync_ctx->on_complete = new FunctionContext( | ||
[this, sync_id, on_start] (int r) { | ||
if (r == -ECANCELED) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this state possible? If chaining off a previous sync request that was canceled and somehow we raced to restart the sync request, should the previous -ECANCLED
be applied to the new sync request?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the notification for the previous request is completed, on_complete
is always called with r =0 (see handle_notify_complete_sync
). This case is when the caller cancels this (not started yet) sync (handled in notify_cancel_sync
, if on_complete
is not null).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would that be a race condition back to the throttler? Since the cancel method doesn't take an "on_finish" callback, it could just assume it's all properly cleaned up. However, eventually, the "on_start" will get invoked with an -ECANCELED parameter.
If contexts were passed to the finish and cancel methods, perhaps these boundary cases would be clearer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dillaman I think I see some problems with my current code here. I am not sure we are talking about the same thing though, but lets discuss this when I update the code fixing the problems I see and addressing other your comments.
@dillaman Here is a new version. The main changes:
Some details for the last item. Now, when requesting a sync slot, the instance sends SyncRequest messages and waits for ack (resending after timeout). When the leader throttler grants the sync slot, the SyncRequest is acked and the leader sends SyncStart message (resending after timeout until it is acked). The SyncStart is acked by the instance when the sync is complete. A situation is possible, when the leader receives a duplicate 'SyncRequest' resent after timeout, after it has already granted the sync slot and sent 'SyncStart'. In this case a duplicate 'SyncStart' will be sent, which is handled on the receiver side by completing the previous 'SyncStart' with -ESTALE. In this scheme it looks like 'SyncComplete' notifications are not necessary and just lead to duplicate sync throttler finish_op. It looks I may remove them, so |
@trociny In general it sounds good to me. |
@dillaman updated (unnecessary 'SyncComplet' messages are removed) |
@@ -785,8 +785,14 @@ void InstanceWatcher<I>::handle_image_acquire( | |||
const std::string &peer_image_id, Context *on_finish) { | |||
dout(20) << "global_image_id=" << global_image_id << dendl; | |||
|
|||
m_instance_replayer->acquire_image(global_image_id, peer_mirror_uuid, | |||
peer_image_id, on_finish); | |||
auto ctx = new FunctionContext( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably would be a good idea to track these queued callbacks via an AsyncOpTracker
to prevent lock release / shut down races.
src/tools/rbd_mirror/Throttler.cc
Outdated
// -*- mode:C++; tab-width:8; c-basic-offset:2; indent-tabs-mode:t -*- | ||
// vim: ts=8 sw=2 smarttab | ||
|
||
#include "Throttler.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: why the name change to something so generic? This seems to be heavily tied to sync throttling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now it can throttle any ops, it has nothing specific to image sync (or just sync). That is why so generic name. If you think it is a bad idea I will change it. ImageSyncThrottler? SyncThrottler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured that was the idea, but it is still tied to a specific config option for max sync and it dump status for number of syncs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good point. I forgot about this. I will change it (return back) to ImageSyncThrottler.
@@ -85,10 +85,14 @@ template <typename I> | |||
void BootstrapRequest<I>::cancel() { | |||
dout(20) << dendl; | |||
|
|||
Mutex::Locker locker(m_lock); | |||
m_canceled = true; | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: no need for new block
m_client_meta, m_work_queue, ctx, | ||
m_progress_ctx); | ||
if (m_canceled) { | ||
m_ret_val = -ECANCELED; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: was this change needed? If the image sync wasn't started, no need to worry about a race w/ the result code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was no special need, this just looked more readable to me. And yes even if there no need to worry about a race I prefer to keep the lock in the cases like this if it does not make trouble. Because when I look at the code later it always looks alarming to me when I see variables like these accessed without lock and I need to go through the code to ensure it is safe.
Ok, I will return it back to make changes less intrusive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries -- just double checking
@@ -72,8 +74,18 @@ class InstanceWatcher : protected librbd::Watcher { | |||
const std::string &peer_image_id, | |||
bool schedule_delete, Context *on_notify_ack); | |||
|
|||
void notify_sync_request(const std::string &sync_id, Context *on_sync_start); | |||
bool cancel_notify_sync_request(const std::string &sync_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: perhaps just cancel_sync_request
?
4baf9a0
to
967ad82
Compare
@dillaman updated |
retest this please |
Remove unnecessary ImageSyncThrottler dependency. Signed-off-by: Mykola Golub <mgolub@mirantis.com>
Signed-off-by: Mykola Golub <mgolub@mirantis.com>
Would have been possible when release_image had lead to cancel sync (i.e. cancel_sync_request). Signed-off-by: Mykola Golub <mgolub@mirantis.com>
Fixes: http://tracker.ceph.com/issues/18789 Signed-off-by: Mykola Golub <mgolub@mirantis.com>
rebased to resolve conflicts with master |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm -- test failures seem unrelated to this change
Fixes: http://tracker.ceph.com/issues/18789
Signed-off-by: Mykola Golub mgolub@mirantis.com