New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mon/MonClient: scale backoff interval down when we have a healthy mon session #16576

Merged
merged 3 commits into from Jul 26, 2017

Conversation

Projects
None yet
3 participants
@liewegas
Member

liewegas commented Jul 25, 2017

@liewegas liewegas requested a review from jecluis Jul 25, 2017

@liewegas liewegas added this to the luminous milestone Jul 25, 2017

@liewegas liewegas requested a review from tchaikov Jul 25, 2017

@jecluis

This looks fine, despite a note on tick() and un-backoff sending chills down my spine. :)

@@ -770,11 +769,24 @@ void MonClient::tick()
send_log();
}
_un_backoff();

This comment has been minimized.

@jecluis

jecluis Jul 26, 2017

Member

At this point, we may have just called _reopen_session(), and increased reopen_interval_multiplier as we started hunting. And in turn we're likely going to reduce the backoff, this tick around, to a value that may be smaller than the one set in _start_hunting(). It may not make a big difference in the end, but seems a distortion of the backoff's purpose. OTOH, should this line be moved to just after/before the send_keepalive(), I think the purpose would be maintained, allowing _start_hunting() to build on the current value instead of being overridden.

This comment has been minimized.

@liewegas

liewegas Jul 26, 2017

Member

_start_hunting starts at the min value, and the min is applied when we un-backoff, so it should never be less than that. This does change things to be slightly more aggressive, though, since we can tick at any point after we finally connect. We only un-backoff after we've authenticated though.

Oh.. at least that's what I meant to do. Repushed with an update!

@tchaikov

i am posting #16588 to explain what i have in my mind. probably we can just review/test/merge it if it looks sane.

void MonClient::_un_backoff()
{
// un-backoff our reconnect interval
reopen_interval_multiplier = std::min(

This comment has been minimized.

@tchaikov

tchaikov Jul 26, 2017

Contributor

s/min/max/

@@ -770,11 +769,26 @@ void MonClient::tick()
send_log();
}
if (have_session()) {

This comment has been minimized.

@tchaikov

tchaikov Jul 26, 2017

Contributor

have_session() is a method of MonConnection. probably we can use the 1f83eb3 in #16588?

This comment has been minimized.

@liewegas

liewegas Jul 26, 2017

Member

right, i misread.. the _hunting() check above already does what I want.

liewegas and others added some commits Jul 25, 2017

mon/MonClient: factor un-backoff into helper; add missing config option
- move this into a helper
- use the backoff factor for the division instead of hard-coding 2.0
- add a min config option instead of hard-coding 1.0

Signed-off-by: Sage Weil <sage@redhat.com>
mon/MonClient: un-backoff on tick when we have a authenticated session
This means that if we do some backoff, then authenticate, and are healthy
for an extended period of time, a subsequent failure won't leave us
starting our hunting sequence with a large backoff.

Fixes: http://tracker.ceph.com/issues/20371
Signed-off-by: Sage Weil <sage@redhat.com>
mon/MonClient: do not send_log if conn is not active anymore
the log message to be sent will be appended to waiting_for_session
instead. but we will send the logs anyway when the MonClient is
authorized with the new connection. so, avoid doing this as it's not
necessary.

also refactor the schedule_tick() call into a scope_guard, so it is
always called upon the return of the tick() method.

Signed-off-by: Kefu Chai <kchai@redhat.com>
@liewegas

This comment has been minimized.

Member

liewegas commented Jul 26, 2017

updated, and pulled in your patch

@tchaikov tchaikov merged commit c59fd70 into ceph:master Jul 26, 2017

4 checks passed

Signed-off-by all commits in this PR are signed
Details
Unmodified Submodules submodules for project are unmodified
Details
make check make check succeeded
Details
make check (arm64) make check succeeded
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment