mgr: clean up daemon start process #16020

Merged
4 commits merged into ceph:master on Jul 14, 2017

3 participants
@jcsp
Contributor

jcsp commented Jun 29, 2017

No description provided.

@jcsp jcsp added the mgr label Jun 29, 2017

@jcsp jcsp requested a review from liewegas Jun 29, 2017

@jcsp

Contributor

jcsp commented Jun 30, 2017

Rebased

@jcsp

Contributor

jcsp commented Jun 30, 2017

Mangled this in the rebase, updated.

jcsp added some commits Jun 29, 2017

mgr: wait for mon digest on startup

This is to avoid starting the python modules
before the mon_status and health information
is available.

Fixes: http://tracker.ceph.com/issues/20383
Signed-off-by: John Spray <john.spray@redhat.com>

mon: emit clog messages on manager changes

Signed-off-by: John Spray <john.spray@redhat.com>

mon: send mgrdigest promptly to new active mgr

Previously, active mgrs ended up waiting around
until the next periodic message. This is a lot
more noticeable now that the mgr isn't considered
active until it has loaded the digest data.

Signed-off-by: John Spray <john.spray@redhat.com>

mgr: send beacon to mon as soon as done with Mgr::init

...instead of waiting for next periodic beacon.

Signed-off-by: John Spray <john.spray@redhat.com>
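
The startup ordering these commit messages describe (wait for the mon digest, only then start the Python modules, then send a beacon immediately) can be illustrated with a toy model. This is only a sketch under stated assumptions, not the Ceph implementation: `MgrStartup`, `handle_mon_digest`, `init`, `send_beacon`, and `demo` are hypothetical names invented for illustration, and the real change lives in the C++ mgr and mon code.

```python
import threading


class MgrStartup:
    """Toy model of the startup ordering; hypothetical, not Ceph code."""

    def __init__(self, send_beacon):
        self._digest_loaded = threading.Event()
        self._send_beacon = send_beacon  # hypothetical callback to the mon
        self.digest = None
        self.events = []

    def handle_mon_digest(self, digest):
        # Called when the digest (mon_status and health data) arrives.
        self.digest = digest
        self.events.append("digest")
        self._digest_loaded.set()

    def init(self, timeout=None):
        # Block until the digest is available, so that modules never
        # start without mon_status and health information.
        if not self._digest_loaded.wait(timeout):
            raise TimeoutError("no mon digest received")
        self.events.append("modules_started")
        # Send a beacon right away instead of waiting for the next
        # periodic one.
        self._send_beacon()
        self.events.append("beacon_sent")


def demo():
    mgr = MgrStartup(send_beacon=lambda: None)
    # Simulate the mon delivering the digest shortly after startup.
    timer = threading.Timer(0.05, mgr.handle_mon_digest,
                            args=({"health": "HEALTH_OK"},))
    timer.start()
    mgr.init(timeout=5)
    timer.join()
    return mgr.events
```

In this model, `init` cannot reach the module-start step until the digest handler has run, which mirrors why a delayed digest from the mon would directly delay the mgr becoming active.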
@jcsp

Contributor

jcsp commented Jul 4, 2017

Rebased

@liewegas

Member

liewegas commented Jul 5, 2017

Five instances of this in my run:

"[WRN] Manager daemon x is unresponsive. No standby daemons available." in cluster log

http://pulpito.ceph.com/sage-2017-07-03_15:41:59-rados-wip-sage-testing-distro-basic-smithi/

@liewegas liewegas modified the milestone: luminous Jul 6, 2017

@jcsp

Contributor

jcsp commented Jul 10, 2017

Investigating the failures. It looks like the new logging is making it more obvious that a mgr is failing, rather than this being a new bug. We had similar issues with the equivalent MDS log change, where things were failing out on laggy mon clusters.

@jcsp

Contributor

jcsp commented Jul 14, 2017

Pretty sure the test failures are just MgrMonitor not handling laggy mon clusters properly: http://tracker.ceph.com/issues/20629

@jcsp

Contributor

jcsp commented Jul 14, 2017

@liewegas I think this is good to merge?

@liewegas liewegas merged commit 7e14253 into ceph:master Jul 14, 2017

4 checks passed

Signed-off-by: all commits in this PR are signed
Unmodified Submodules: submodules for project are unmodified
make check: make check succeeded
make check (arm64): make check succeeded