
mgr/cephadm: allow setting mon crush locations through mon service spec #49103

Merged: 6 commits merged into ceph:main on Apr 25, 2023

Conversation

@adk3798 (Contributor) commented Nov 28, 2022

Fixes: https://tracker.ceph.com/issues/58101
Fixes: https://tracker.ceph.com/issues/58100

This PR handles a few things:

  1. Mon service specs now accept a crush_locations dict that specifies the crush location any mon deployed on a given host should use. For example:
service_type: mon
service_name: mon
placement:
  count: 5
spec:
  crush_locations:
    vm-00:
    - datacenter=a
    vm-01:
    - datacenter=b
    - rack=2
    vm-02:
    - datacenter=a

would tell cephadm that any mon deployed on vm-00 should have the crush location "datacenter=a", any mon on vm-01 "datacenter=b,rack=2", and so on.

  2. Whatever the first bucket=loc pair in the list for a given host is will be passed as an arg to the --set-crush-location flag while deploying the mon. This is pretty much necessary to keep replacing tiebreaker mons in stretch mode clusters from being painful: in stretch mode a mon is not allowed to join the cluster without a crush location, but you can't set a crush location for an unknown mon (i.e. one that hasn't been added yet).
  3. To handle users wanting to specify multiple crush locations for a single mon (since --set-crush-location only allows setting one bucket=loc pair), logic is added here to set crush locations through mon commands, which allows passing all provided bucket=loc pairs. A sketch of this command-based path follows this list.
  4. This covers a different issue (see https://tracker.ceph.com/issues/58100): we previously didn't redo service-level configuration operations unless we deployed a new daemon for that service. That meant spec adjustments that likely won't cause new daemons to be deployed, such as changing a cert, would not be acted on at all. There is a change here to handle service-level config operations more explicitly, including tracking whether we've done them since the spec was last updated. It's a bit of a separate thing, but since setting these crush locations ended up falling under the umbrella of service-level configuration operations, I ended up just handling it here.
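
For illustration, the command-based path from item 3 might look roughly like the minimal sketch below. This is a hedged sketch only: it assumes the mgr module's check_mon_command-style interface and the stretch-mode "ceph mon set_location" command; the helper name and exact argument shape are illustrative, not copied from this PR.

# Illustrative sketch (not the PR's code): apply every configured
# bucket=loc pair to an existing mon via mon commands, since the
# --set-crush-location deploy flag can only carry the first pair.
def apply_crush_locations(mgr, mon_name, pairs):
    # pairs e.g. ['datacenter=b', 'rack=2'], taken from the service spec
    for pair in pairs:
        mgr.check_mon_command({
            'prefix': 'mon set_location',  # stretch-mode mon command (assumed prefix)
            'name': mon_name,
            'args': [pair],
        })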

What is still missing/unsolved

  1. Still need documentation.
  2. Some of the actions don't trigger automatically. For example, in order to get --set-crush-location set on an already existing monitor you may need to tell cephadm to redeploy the monitor (I believe this is linked to these crush locations not being properly considered a "dependency" for the mons). Or, for the command-based setting of the crush locations, you might need to re-apply the same spec to get it to trigger for all mons (this might be related to the timing of deploying mons, when they appear in the monmap, and when we attempt the service level events).
  3. Perhaps a small teuthology test would be appropriate for this? Not sure if unit tests cover this very well.


@adk3798 requested a review from a team as a code owner January 12, 2023 20:22
@adk3798 changed the title from "[WIP] mgr/cephadm: allow setting mon crush locations through mon service spec" to "mgr/cephadm: allow setting mon crush locations through mon service spec" Jan 12, 2023
@adk3798 (Contributor, Author) commented Jan 12, 2023

This is no longer WIP and is now open for full review. Docs and a teuthology test have been added.

@anthonyeleven (Contributor) left a comment

You're CRUSHING this ;)

6 review comments on doc/cephadm/services/mon.rst (outdated, resolved)
github-actions bot commented: This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@anthonyeleven (Contributor) left a comment

Docs LGTM


github-actions bot commented Mar 8, 2023: This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

Review comments on src/cephadm/tests/test_cephadm.py (outdated, resolved), src/pybind/mgr/cephadm/inventory.py (2, resolved), and src/pybind/mgr/cephadm/tests/test_cephadm.py (resolved)
@rkachach (Contributor) left a comment

LGTM

@phlogistonjohn (Contributor) left a comment

I have a few minor comments, nothing blocking IMO, and most could be done in follow-up work.

I took a specific look at the embedded shell script in the test. I think it looks OK, but it is complex. I honestly probably would have looked for a way to do it in python rather than bash... but I see nothing wrong with the current version.

2 review comments on src/pybind/mgr/cephadm/services/cephadmservice.py (outdated, resolved), quoting this snippet:
    current_crush_locs = [m['crush_location'] for m in quorum_status['monmap']['mons'] if m['name'] == dd.daemon_id][0]
except Exception as e:
    logger.info(f'Failed setting crush location for mon {dd.daemon_id}: {e}')
desired_crush_locs = '{' + ','.join(mon_spec.crush_locations[dd.hostname]) + '}'
A contributor commented:

Does it matter if, for example, the order changed?
Perhaps if instead of creating a string of the desired crush locations, we parse the current_crush_locs into a comparable object?

@adk3798 (Contributor, Author) replied:

I don't think the order matters, so you're probably right that we should check for that and only set locations if some are actually missing, not just in a different order. Want to push this into a follow-up PR though.
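
A minimal sketch of what that order-insensitive comparison could look like (hypothetical helper names; it assumes the current location string has the '{datacenter=a,rack=2}' shape built by desired_crush_locs above):

# Hypothetical follow-up sketch: compare crush locations as dicts so a
# mere ordering difference doesn't count as a mismatch.
def parse_crush_locs(loc_str):
    # '{datacenter=a,rack=2}' -> {'datacenter': 'a', 'rack': '2'}
    inner = loc_str.strip('{}')
    return dict(p.split('=', 1) for p in inner.split(',')) if inner else {}

def locations_match(current_str, desired_pairs):
    # desired_pairs e.g. ['datacenter=a', 'rack=2'] from the service spec
    desired = dict(p.split('=', 1) for p in desired_pairs)
    return parse_crush_locs(current_str) == desired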

Comment on lines +207 to +214
crush_locations:
host1:
- datacenter=a
host2:
- datacenter=b
- rack=2
host3:
- datacenter=a
A contributor commented:
FWIW, a future enhancement that might avoid having to spell out every host would be to allow host labels to be translated into crush locations. Then you could label your hosts and have the labels select the crush locations. I think that would be a bit more expressive/scalable. Something for another day.

@adk3798 (Contributor, Author) replied:

Definitely a good idea, and I don't think it would be too hard to build on top of the current work. Will try to do this in a follow-up.

…ns field

In order to allow having cephadm set the crush locations for the mons, helping with setting up stretch mode with a cephadm cluster.

Signed-off-by: Adam King <adking@redhat.com>
…n in spec

Necessary for stretch mode tiebreaker mon replacement.

Fixes: https://tracker.ceph.com/issues/58101

Signed-off-by: Adam King <adking@redhat.com>
Previously, the service config function was only called when we deployed a new daemon for that service. That meant that updates to the spec that don't affect daemon placement, such as changing a cert, wouldn't trigger the service level config to happen again. With this change, we now mark the service as needing its config function run if a daemon for the service is added/removed or if the spec is updated.

Fixes: https://tracker.ceph.com/issues/58100

Signed-off-by: Adam King <adking@redhat.com>
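
Conceptually, the change above behaves like a per-service dirty flag. A minimal illustrative sketch, where the class and method names are assumptions for illustration rather than the PR's actual data structures:

# Illustrative only: mark a service dirty whenever its spec changes or a
# daemon is added/removed, and clear the flag once the service-level
# config function has run.
class ServiceConfigTracker:
    def __init__(self):
        self._needs_config = set()

    def mark_dirty(self, service_name):
        self._needs_config.add(service_name)

    def mark_configured(self, service_name):
        self._needs_config.discard(service_name)

    def needs_config(self, service_name):
        return service_name in self._needs_config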
The part of this that added the --set-crush-location flag when deploying the mon was handled in another commit. This piece finishes the functionality by having cephadm set the location through commands, handling the case where multiple bucket=loc pairs are specified for a single monitor.

Fixes: https://tracker.ceph.com/issues/58101

Signed-off-by: Adam King <adking@redhat.com>
This adds a test for the new feature where mon crush locations can be set through the orchestrator using the mon service spec.

Signed-off-by: Adam King <adking@redhat.com>
Signed-off-by: Adam King <adking@redhat.com>
@adk3798 (Contributor, Author) commented Apr 17, 2023

https://pulpito.ceph.com/adking-2023-04-05_05:00:01-orch:cephadm-wip-adk-testing-2023-04-04-2123-distro-default-smithi/

Reruns of failed/dead jobs: https://pulpito.ceph.com/adking-2023-04-07_13:59:11-orch:cephadm-wip-adk-testing-2023-04-04-2123-distro-default-smithi/

After reruns: 6 failed and 2 dead jobs

  • 2 dead jobs are an issue with the workload test after the upgrade has completed timing out. This happened in other runs on both quincy and reef done at the same time (with totally separate sets of patches), so I think it's something outside of ceph and not a blocker for merging PRs
  • 1 failure is the new keepalive-nfs test introduced in a PR in the run that doesn't work yet
  • 2 failures on staggered upgrade. Originally broken by https://tracker.ceph.com/issues/58535 and a PR in the run attempted to fix it, but the fix is not working yet
  • 2 infra failures, one pulling the container image needed for the test and another installing the kernel
  • 1 failure on testing jaeger tracing stuff. It seems the jaeger-agent daemons weren't coming up. Will need to investigate this further, but I don't think it's related to any PRs in the run and shouldn't block merging.

Specifically for this PR, the new test for setting the crush locations passed:
https://pulpito.ceph.com/adking-2023-04-05_05:00:01-orch:cephadm-wip-adk-testing-2023-04-04-2123-distro-default-smithi/7232515
https://pulpito.ceph.com/adking-2023-04-05_05:00:01-orch:cephadm-wip-adk-testing-2023-04-04-2123-distro-default-smithi/7232578

@adk3798 (Contributor, Author) commented Apr 24, 2023

https://pulpito.ceph.com/adking-2023-04-20_12:02:37-orch:cephadm-wip-adk-testing-2023-04-20-0105-distro-default-smithi/

Reruns of all but upgrade-with-workload jobs (which are known to timeout after multiple hours currently, so reruns would waste a lot of resources): https://pulpito.ceph.com/adking-2023-04-22_17:05:01-orch:cephadm-wip-adk-testing-2023-04-20-0105-distro-default-smithi/

After reruns, 2 dead and 5 failed jobs:

  • 2 dead jobs were infra failures: "Failed to download metadata for repo 'CentOS-PowerTools'"
  • 2 failed jobs on staggered upgrade tests, https://tracker.ceph.com/issues/58535 + fix for it not working yet
  • 2 failed jobs on new keepalive-only test. It's a new test that doesn't work yet, shouldn't affect being able to merge other PRs
  • 1 failed test deploying jaeger-tracing. Needs investigation, but doesn't seem related to any PRs in the run.

@adk3798 merged commit c7d382b into ceph:main Apr 25, 2023
10 of 12 checks passed
yuvalif pushed a commit to yuvalif/ceph that referenced this pull request Apr 30, 2023
mgr/cephadm: allow setting mon crush locations through mon service spec

Reviewed-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
Reviewed-by: Redouane Kachach <rkachach@redhat.com>