mimic: mon: test if gid exists in pending for prepare_beacon #24272

Merged
merged 1 commit into from Oct 19, 2018

batrick
Member

@batrick batrick commented Sep 25, 2018

If the GID does not exist in the pending fsmap, send a null map. The bug was
introduced by 624efc6, which (correctly) made preprocess_beacon look only at
the current fsmap. prepare_beacon had relied on preprocess_beacon performing
that check against pending.
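
For readers unfamiliar with the two-phase beacon handling, the check can be sketched roughly as follows. This is a toy model, not the actual MDSMonitor code: `ToyFSMap`, `ToyMonitor`, `BeaconResult`, and `handle_beacon` are hypothetical names invented for illustration, and the `Ignored` outcome is an assumption about the read-only phase rather than what the monitor really does there.

```cpp
#include <cstdint>
#include <set>

// Hypothetical stand-ins for illustration; not the real Ceph types.
using Gid = std::uint64_t;

struct ToyFSMap {
  std::set<Gid> gids;
  bool gid_exists(Gid g) const { return gids.count(g) > 0; }
};

enum class BeaconResult { Accepted, NullMapSent, Ignored };

struct ToyMonitor {
  ToyFSMap current;  // last committed map
  ToyFSMap pending;  // proposed next map; a failed GID may already be gone

  // Read-only phase: consults only the committed map.
  bool preprocess_beacon(Gid g) const { return current.gid_exists(g); }

  // Update phase: re-check against pending instead of trusting the
  // earlier check, and answer with a null map if the GID is missing.
  BeaconResult prepare_beacon(Gid g) const {
    if (!pending.gid_exists(g)) {
      return BeaconResult::NullMapSent;
    }
    return BeaconResult::Accepted;
  }

  // Toy dispatcher: run the read-only phase, then the update phase.
  BeaconResult handle_beacon(Gid g) const {
    if (!preprocess_beacon(g)) {
      return BeaconResult::Ignored;  // assumption; simplified for the sketch
    }
    return prepare_beacon(g);
  }
};
```

In this model, a GID that is still in `current` but was already removed from `pending` (for example by a concurrent `mds fail`) passes the read-only phase and is then caught by the update phase.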

Running:

    while sleep 0.5; do bin/ceph mds fail 0; done

is sufficient to reproduce this bug. You will see:

    2018-09-07 15:33:30.350 7fffe36a8700  5 mon.a@0(leader).mds e69 preprocess_beacon mdsbeacon(24412/a up:reconnect seq 2 v69) v7 from mds.0 127.0.0.1:6813/2891525302 compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
    2018-09-07 15:33:30.350 7fffe36a8700 10 mon.a@0(leader).mds e69 preprocess_beacon: GID exists in map: 24412
    2018-09-07 15:33:30.350 7fffe36a8700  5 mon.a@0(leader).mds e69 _note_beacon mdsbeacon(24412/a up:reconnect seq 2 v69) v7 noting time
    2018-09-07 15:33:30.350 7fffe36a8700  7 mon.a@0(leader).mds e69 prepare_update mdsbeacon(24412/a up:reconnect seq 2 v69) v7
    2018-09-07 15:33:30.350 7fffe36a8700 12 mon.a@0(leader).mds e69 prepare_beacon mdsbeacon(24412/a up:reconnect seq 2 v69) v7 from mds.0 127.0.0.1:6813/2891525302
    2018-09-07 15:33:30.350 7fffe36a8700 15 mon.a@0(leader).mds e69 prepare_beacon got health from gid 24412 with 0 metrics.
    2018-09-07 15:33:30.350 7fffe36a8700  5 mon.a@0(leader).mds e69 mds_beacon mdsbeacon(24412/a up:reconnect seq 2 v69) v7 is not in fsmap (state up:reconnect)

in the mon leader log. The last line indicates the problem was safely handled.

Fixes: http://tracker.ceph.com/issues/35848

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit f26752a)

Conflicts:
    src/mon/MDSMonitor.cc
@batrick batrick added the cephfs Ceph File System label Sep 25, 2018
@batrick batrick added this to the mimic milestone Sep 25, 2018
@smithfarm smithfarm added the core label Sep 26, 2018
@yuriw
Contributor

yuriw commented Oct 2, 2018

@yuriw
Contributor

yuriw commented Oct 4, 2018

@neha-ojha
Member

@batrick I see core and cephfs labels on this PR. This has already passed rados testing, but is there any other suite that this PR needs to be tested with?

@batrick
Member Author

batrick commented Oct 10, 2018

Putting this in core was a mistake. This should be tested with cephfs suites.

@yuriw
Contributor

yuriw commented Oct 15, 2018

@batrick
Member Author

batrick commented Oct 19, 2018

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Contributor

@yuriw yuriw left a comment

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

@yuriw yuriw merged commit 9bd0b05 into ceph:mimic Oct 19, 2018
@batrick batrick deleted the i35858 branch July 16, 2020 02:34