-
Notifications
You must be signed in to change notification settings - Fork 6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mds: register with mgr only after added to FSMap #31400
Conversation
@liewegas I had to make several changes. Please have another look. |
4ea8ff0
to
e66e40c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assuming the mds always starts in standby before going to creating or replay or whatever, then this looks good!
54d8d04
to
baa5175
Compare
I tested the changes on my local system (because of https://tracker.ceph.com/issues/42011) and got a slightly different error when running the dashboard backend API tests:
I guess this is out of scope of this PR and I'll create another tracker issue for that one. |
Found the same issue here: https://jenkins.ceph.com/job/ceph-dashboard-pr-backend/456/ |
This also cleans up the output to be more readable/useful in debug output. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This is a trivial refactor. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Few things here: - Make explicit the check for getting removed from the MDSMap. This was only done before by checking if MDS held a rank which does not check the case where a standby is removed from the FSMap. - Use mds_info_t::dump to simplify various debug output. - Add a few sanity asserts for invalid state transitions. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
The Monitors send an empty MDSMap to an MDS it is removing. The MDS can't diagnose the cause. Instead suggest looking at the cluster/monitor logs. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This commit undoes the service daemon registration for the MDS. It doesn't look absolutely necessary and it causes the MDS to be listed twice in the `ceph versions` output: $ ceph versions ... "mds": { "ceph version v15.0.0-6915-g0143b904676 (0143b9046763ea1801efa8358a0c033ec862cea9) octopus (dev)": 3 }, "mds": { "unknown": 3 }, "overall": { "ceph version v15.0.0-6915-g0143b904676 (0143b9046763ea1801efa8358a0c033ec862cea9) octopus (dev)": 10, "unknown": 3 } Fixing that requires looking for duplicates or ignoring MDSs in the service daemons when the mon processes `ceph versions`. I have a feeling that it wasn't actually designed to be used by the MDS this way however. Additionally, the reason for "unknown" version is because the metadata sent to the mgr does not include "ceph_version". Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Note that we now sub to the mgrmap after init because the MgrClient connection to the mgr is driven by receipt of the MgrMap. This is important so that the MDS does not have metadata with the mgr when the mons are ignoring the MDS otherwise due to CompatSet incompatibilities. Fixes: https://tracker.ceph.com/issues/41538 Fixes: https://tracker.ceph.com/issues/42635 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/31400/head: mds: establish session with mgr only after added to FSMap mds: do not register as a service daemon mds: do not try to diagnose cause of MDSMap removal mds: fix handling of initial MDS states mds: remove unnecessary const qualifier mds: cleanup type decl and map iteration mds: define stream operator for mds_info_t Reviewed-by: Sage Weil <sage@redhat.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. sorry for being late. was traveling last week.
Fixes: https://tracker.ceph.com/issues/41538
Fixes: https://tracker.ceph.com/issues/42635
Signed-off-by: Patrick Donnelly pdonnell@redhat.com
Checklist
Show available Jenkins commands
jenkins retest this please
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test dashboard backend
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox