mon/ConfigMonitor: update crush_location from osd entity #52088
For osd entities, crush_location needs to refer to osd's parent (host) so that 'ceph config set' using osd/host mask can work.

Fixes: https://tracker.ceph.com/issues/48750
Signed-off-by: Didier Gazen <didier.gazen@aero.obs-mip.fr>
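To illustrate why the host level matters for an osd/host mask, here is a minimal sketch of how a location mask could be checked against a crush_location map. It is not the ConfigMonitor code: `toy_mask_matches` and the `Location` alias are hypothetical names, and the `type:name` mask shape (for example `host:node1`) is assumed for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// CRUSH type -> bucket name, e.g. {"host": "node1", "root": "default"}.
using Location = std::map<std::string, std::string>;

// Hypothetical helper (not a Ceph API): returns true when a "type:name"
// mask names a bucket that is present in the daemon's location map.
bool toy_mask_matches(const std::string& mask, const Location& loc) {
  auto colon = mask.find(':');
  if (colon == std::string::npos) {
    return false;  // not a "type:name" mask
  }
  const std::string type = mask.substr(0, colon);
  const std::string name = mask.substr(colon + 1);
  auto it = loc.find(type);
  // Without a "host" entry in loc, a "host:..." mask can never match.
  return it != loc.end() && it->second == name;
}
```

In this model, a location map that lacks the `host` key makes every `host:...` mask fail, which is the symptom the description above refers to.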
jenkins test make check
@@ -920,6 +920,7 @@ bool ConfigMonitor::refresh_config(MonSession *s)

   string device_class;
   if (s->name.is_osd()) {
+    osdmap.crush->get_full_location(s->entity_name.to_str(), &crush_location);
Is the problem that the remote_host branch above doesn't fill out the host level of the location?
Yes, that's the problem: get_full_location(dev_name, ...) returns the fully qualified location of the device dev_name starting at its parent (not including dev_name itself).
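The "starting at its parent" behaviour described above can be sketched with a toy hierarchy. This is not Ceph's CrushWrapper: `ToyItem`, `ToyCrush`, `toy_full_location` and `example_crush` are hypothetical names, and the three-item tree is made up for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical toy model: each item records its CRUSH type and the name of
// its parent bucket (empty string at the root).
struct ToyItem {
  std::string type;    // e.g. "osd", "host", "root"
  std::string parent;  // containing bucket, "" for the root
};

using ToyCrush = std::map<std::string, ToyItem>;
using Location = std::map<std::string, std::string>;  // type -> bucket name

// Walk upward from `name` and record every ancestor. The walk starts at the
// parent, so `name` itself never appears in the result.
Location toy_full_location(const ToyCrush& crush, const std::string& name) {
  Location loc;
  auto it = crush.find(name);
  if (it == crush.end()) {
    return loc;  // unknown item: empty location
  }
  std::string cur = it->second.parent;
  while (!cur.empty()) {
    auto p = crush.find(cur);
    if (p == crush.end()) {
      break;  // dangling parent reference, stop the walk
    }
    loc[p->second.type] = cur;
    cur = p->second.parent;
  }
  return loc;
}

// Made-up example tree: root "default" -> host "node1" -> "osd.0".
ToyCrush example_crush() {
  return {
    {"default", {"root", ""}},
    {"node1", {"host", "default"}},
    {"osd.0", {"osd", "node1"}},
  };
}
```

Calling `toy_full_location(example_crush(), "osd.0")` in this model yields `host=node1` and `root=default`, with no `osd` entry, so looking up the osd's own name is what brings the host level into the result.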
Seems reasonable to me, but someone more familiar with the code should probably review it.
From @kamoltat, ref: https://trello.com/c/7y6uj4bo
RADOS approved; all failures/dead jobs are unrelated and known failures.
Failures:
7332480 -> Bug #58946: cephadm: KeyError: 'osdspec_affinity' (Ceph - Mgr - Dashboard)
Deads:
7332357 -> Bug #61164: Error reimaging machines: reached maximum tries (100) after waiting for 600 seconds (Infrastructure)