Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

librbd: optionally unregister "laggy" journal clients #10378

Merged
merged 8 commits into from Sep 7, 2016

Conversation

trociny
Copy link
Contributor

@trociny trociny commented Jul 21, 2016

@dillaman
Copy link

@trociny Instead of unregistering the laggy client (which would work), I put a placeholder ClientState in the registration record [1]. My original idea was to set this to DISCONNECTED (via a new cls method) and adjust the JournalTrimmer to ignore clients with the state set to DISCONNECTED when determining the minimum set.

The rbd-mirror daemon can then detect if and when it's client state is set to DISCONNECTED, stop replay (if in-progress), and re-initiate an image sync w/o the need to split-brain. The sync point record supports a "from snapshot" so it can be modified to only resync from the last known good snapshot (if any). For example, if the system gets disconnected during the initial sync, it can recover w/o starting over from scratch by reconnecting to the journal, creating a second sync point record tied to the first, and once the first sync is complete, it would start with the second (similar to how VM/storage live migration transfers the full base, and then one or more deltas until it can cut over).

The iterative / recoverable image sync wouldn't be part of this effort, but I wanted to lay out my full plans.

[1] https://github.com/ceph/ceph/blob/master/src/cls/journal/cls_journal_types.h#L79

@trociny
Copy link
Contributor Author

trociny commented Jul 22, 2016

@dillaman Got it! Thanks. Will redo.

@trociny trociny force-pushed the wip-14738 branch 2 times, most recently from 6f922c7 to 8799931 Compare July 27, 2016 13:13
@trociny
Copy link
Contributor Author

trociny commented Jul 27, 2016

@dillaman How do you like this version?

@@ -1250,6 +1250,7 @@ OPTION(rbd_journal_object_flush_interval, OPT_INT, 0) // maximum number of pendi
OPTION(rbd_journal_object_flush_bytes, OPT_INT, 0) // maximum number of pending bytes per journal object
OPTION(rbd_journal_object_flush_age, OPT_DOUBLE, 0) // maximum age (in seconds) for pending commits
OPTION(rbd_journal_pool, OPT_STR, "") // pool for journal objects
OPTION(rbd_journal_client_object_sets_behind_max, OPT_INT, 0) // maximum number of object sets a journal client can be behind before it is automatically unregistered

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: perhaps rbd_journal_max_concurrent_object_sets?

@dillaman
Copy link

@trociny Looking good to me. The only missing piece is the smarts in rbd::mirror::ImageReplayer to catch a disconnect while we are actively replaying.

@trociny
Copy link
Contributor Author

trociny commented Aug 1, 2016

@dillaman Updated, still not tested well. Also there are potential issues to discuss:

  • A laggy client is disconnected, resync is started, the client is re-registered. While it is resyncing the master commit position grows and the mirror client is unregistered again before resync is complete. Should we add a way the rbd-mirror could tell that the client cannot be disconnected (during cync/bootstrap)?
  • What to do when disconnect is detected when replaying (or just after resync)? Doing resync immediately does not look like a good idea (mirroring is likely overloaded, and resync will only increase the load). Currently it just stops image replay (though without manual flag, so will be restarted on the next pool rescan). May be schedule resync after configurable interval?

@trociny
Copy link
Contributor Author

trociny commented Aug 3, 2016

Added some tests. Also, experimental rbd journal disconnect command is added. It is used in functional tests, not sure though it is useful enough to add it upstream. @dillaman do you think it would be useful to have such command? It could be also implemented as rbd mirror image disconnect.

const std::string &client_id = c.id;
uint64_t object_set = 0;
if (!c.commit_position.object_positions.empty()) {
auto position = *(c.commit_position.object_positions.begin());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: auto &

@@ -986,6 +993,7 @@ void JournalMetadata::committed(uint64_t commit_tid,

ldout(m_cct, 20) << "updated commit position: " << commit_position << ", "
<< "on_safe=" << m_commit_position_ctx << dendl;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: remove empty line

@trociny
Copy link
Contributor Author

trociny commented Aug 10, 2016

@dillaman updated. Now when a journal client is disconnected, rbd-mirror just stops image replayer, until 'mirror image resync' command is issued by user. Also rbd_mirror_resync_after_disconnect configuration option is added to automatically start the resync.

@@ -985,6 +1084,11 @@ Shell::Action action_import(
{"journal", "import"}, {}, "Import image journal.", "",
&get_import_arguments, &execute_import);

Shell::Action action_disconnect(
{"journal", "client", "disconnect"}, {},
"Flag image journal client disconnected.", "",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: "... as disconnected"


if (client.state != cls::journal::CLIENT_STATE_CONNECTED) {
dout(0) << "client flagged disconnected, stopping image replay" << dendl;
stop(nullptr, false, "disconnected");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here: should we set an error code so that an admin can see the problem?

@dillaman
Copy link

@trociny few minor comments, but otherwise lgtm

@trociny
Copy link
Contributor Author

trociny commented Aug 15, 2016

@dillaman Updated. I think I addressed all your comments. The only thing is renaming rbd_mirror_resync_after_disconnect config option to rbd_resync_to_primary_mirror_after_disconnect. I agree that using rbd_mirror prefix is unfortunate here, still I am not very happy with rbd_resync_to_primary_mirror_after_disconnect name. I changed it to rbd_mirroring_resync_after_disconnect instead. What do you think? If you still like rbd_resync_to_primary_mirror_after_disconnect more I will change it.

@dillaman
Copy link

@trociny That's fine with me -- just wanted to ensure it didn't overlap with "rbd_mirror".

@dillaman
Copy link

dillaman commented Sep 5, 2016

@trociny rebase required

Mykola Golub added 6 commits September 5, 2016 08:51
Signed-off-by: Mykola Golub <mgolub@mirantis.com>
Signed-off-by: Mykola Golub <mgolub@mirantis.com>
Signed-off-by: Mykola Golub <mgolub@mirantis.com>
Signed-off-by: Mykola Golub <mgolub@mirantis.com>
…cceeded

Signed-off-by: Mykola Golub <mgolub@mirantis.com>
Mykola Golub added 2 commits September 5, 2016 12:48
Signed-off-by: Mykola Golub <mgolub@mirantis.com>
…nnect

Signed-off-by: Mykola Golub <mgolub@mirantis.com>
@trociny
Copy link
Contributor Author

trociny commented Sep 5, 2016

@dillaman Testing after rebase I noticed the issue with ImageReplayer<I>::handle_remote_journal_metadata_updated(): it might be called when ImageReplayer was stopping and shutting journal down, and metadata was null so it crashed.

I have updated handle_remote_journal_metadata_updated to check the current ImageReplayer state and return if it is not running.

Now, retesting this locally. I will let you know about test results.

@trociny
Copy link
Contributor Author

trociny commented Sep 5, 2016

@dillaman it passed local tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants