fs: MDSMonitor: handle MDSBeacon messages properly #5199

Merged
merged 4 commits into from Sep 9, 2015

smithfarm
Contributor

msg.get_session() should always return a non-zero pointer in
Monitor.dispatch()

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 16e8e2c)

Conflicts:
    src/mon/Monitor.cc
        Monitor::_ms_dispatch(Message *m) is bool in firefly
so the peon can remove the ignored mdsbeacon request from
routed_requests on seeing this reply, and hence no longer resends the
request.

Fixes: ceph#11590
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 72a37b3)
* s/ignore/reply/
* s/out/ignore/

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit f00ecb8)

Conflicts:
    src/mon/MDSMonitor.cc
        Do not compare known daemon health with m->get_health()
The MDS (Beacon) always expects a reply to its mdsbeacon messages from
the lead mon, and it uses the reply delay as a measure of the Beacon's
lagginess. On a peon, the MDSMonitor removes the route session when it sees
a reply (route message) from the leader, so a reply to an mdsbeacon stops
the peon from resending that mdsbeacon request to the leader.

If the MDSMonitor re-forwards unreplied requests after they are outdated,
requests that reflect an old or even wrong state of the MDSs may mislead
the lead monitor. For example, the MDSs that sent the outdated messages
could already be dead.

Fixes: ceph#11590
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit b3555e9)
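
The routing behaviour described in the commit message above can be illustrated with a minimal sketch. This is not the Ceph code: `PeonSketch`, `forward`, `handle_reply`, and `pending_resends` are hypothetical names, and requests are modelled as plain strings. It only shows why a reply from the leader keeps a stale beacon from being re-forwarded later.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a peon monitor's table of forwarded requests.
// All names are illustrative assumptions, not the Ceph API.
struct PeonSketch {
    std::map<std::uint64_t, std::string> routed;  // tid -> request awaiting a reply
    std::uint64_t next_tid = 1;

    // Forward a request to the leader and remember it until a reply arrives.
    std::uint64_t forward(const std::string& request) {
        std::uint64_t tid = next_tid++;
        routed[tid] = request;
        return tid;
    }

    // A routed reply from the leader retires the matching request.
    // Returns true if a pending request with this tid existed.
    bool handle_reply(std::uint64_t tid) {
        return routed.erase(tid) > 0;
    }

    // Requests that would be forwarded again later: everything still unreplied.
    std::vector<std::string> pending_resends() const {
        std::vector<std::string> out;
        for (const auto& kv : routed) {
            out.push_back(kv.second);
        }
        return out;
    }
};
```

In this model, a beacon that the leader answers (even one it otherwise ignores) disappears from the table, while an unanswered beacon stays pending and could be re-sent after its contents have gone stale.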
@smithfarm smithfarm added bug-fix cephfs Ceph File System labels Jul 10, 2015
@smithfarm smithfarm added this to the firefly milestone Jul 10, 2015
@smithfarm smithfarm self-assigned this Jul 10, 2015
@smithfarm
Contributor Author

@tchaikov pls review

@gregsfortytwo
Member

ping @tchaikov

@smithfarm
Contributor Author

@tchaikov, @gregsfortytwo This has passed a rados suite (see http://tracker.ceph.com/issues/11644#rados for details). OK to merge, do you think?

@smithfarm smithfarm assigned tchaikov and unassigned smithfarm Sep 7, 2015
@tchaikov
Contributor

tchaikov commented Sep 9, 2015

sorry for the latency. lgtm.

@tchaikov tchaikov assigned smithfarm and unassigned tchaikov Sep 9, 2015
smithfarm added a commit that referenced this pull request Sep 9, 2015
MDSMonitor: handle MDSBeacon messages properly

Reviewed-by: Kefu Chai <kchai@redhat.com>
@smithfarm smithfarm merged commit 79403ba into ceph:firefly Sep 9, 2015
@smithfarm smithfarm deleted the wip-11980-firefly branch September 9, 2015 04:05
@ghost ghost changed the title MDSMonitor: handle MDSBeacon messages properly fs: MDSMonitor: handle MDSBeacon messages properly Oct 24, 2015