
mgr/cephadm: nvme-of follow up work #52691

Merged: 8 commits merged into ceph:main on Aug 15, 2023
Conversation

@adk3798 (Contributor) commented Jul 28, 2023

The first piece of the nvme-of deployment work in cephadm was already merged into main. This PR is a follow-up that fixes a handful of issues found in the original implementation.

This depends on having an nvme-of container with the changes being proposed in ceph/ceph-nvmeof#166. For now, I have a temporary commit that points the default nvme-of image to an image on my personal quay with the changes present. That should be removed before merging (the signed-off-by check will fail as long as that commit is present).

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Available Jenkins commands:
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

adk3798 requested a review from a team as a code owner on July 28, 2023 at 21:35
@phlogistonjohn (Contributor) left a comment

Generally LGTM; happy to approve once the PR is out of WIP state. A few minor feedback items follow.

Review threads on src/pybind/mgr/cephadm/services/nvmeof.py (outdated, resolved) and src/pybind/mgr/cephadm/serve.py (resolved).
- cephadm.shell:
    host.a:
      - ceph osd pool create foo
      - rbd pool init foo
Contributor commented:

Just curious, what is the difference between the above and ceph osd pool application enable foo rbd?

adk3798 (Contributor Author) replied:

Very little, if anything at all. One "initializes" a pool for rbd use and the other lets you "enable" the use of rbd on the pool (from their descriptions). I think we can use them interchangeably in this context.
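
For reference, here is a minimal python-rados sketch, assuming a reachable cluster and the pool foo from the snippet above; it issues the application-enable variant through librados, which per the discussion marks the pool for rbd use in essentially the same way as rbd pool init foo. This is illustrative only, not code from the PR.

    import json
    import rados

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        # Same effect as the CLI `ceph osd pool application enable foo rbd`
        ret, outbuf, outs = cluster.mon_command(json.dumps({
            'prefix': 'osd pool application enable',
            'pool': 'foo',
            'app': 'rbd',
        }), b'')
        assert ret == 0, outs
    finally:
        cluster.shutdown()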

@phlogistonjohn (Contributor) commented:

The current changes LGTM, modulo the temporary container image hack. Once there's a usable upstream build with the change and you drop that temporary commit, please ping me for final approval!

Commit messages:

This is going to be used as the rados_id
to be set when connecting to the cluster using
the keyring we generate for the nvmeof daemon.

The python librados library defaults the name
to "client.admin" and so if we don't provide
a name or rados_id, we'll only be able to
use nvmeof with the "client.admin" keyring.

Signed-off-by: Adam King <adking@redhat.com>
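
To illustrate the point about rados_id, a minimal python-rados sketch; the id "nvmeof.foo" and the keyring path below are hypothetical stand-ins for whatever cephadm generates for the daemon.

    import rados

    # With no name/rados_id, python librados authenticates as client.admin:
    #   cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')

    # Passing rados_id makes librados use the daemon's own identity instead
    # ("client." is prepended automatically, giving client.nvmeof.foo):
    cluster = rados.Rados(
        conffile='/etc/ceph/ceph.conf',
        rados_id='nvmeof.foo',  # hypothetical daemon id
        conf={'keyring': '/etc/ceph/ceph.client.nvmeof.foo.keyring'},
    )
    cluster.connect()
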
Before, we were just using the client.admin keyring
as a temporary workaround while we figured out
how to get the keyring to work. We should swap
over to using the keyring we actually generated
for the nvmeof daemon.

Signed-off-by: Adam King <adking@redhat.com>
The ok-to-stop function works for certain daemons
by checking if there are at least a certain number
(typically 1) of daemons actually running,
and saying it's not ok-to-stop if that won't
be true after the removals. This case breaks down
when all the daemons are in error state, making
it so cephadm will refuse to remove a set of
daemons that aren't even working because they're
not "ok to stop". Since ok-to-stop works in a
yes-or-no fashion, for a case like this where we
want to be willing to remove a certain subset
(or potentially all) of the currently deployed
daemons, it's easier to keep this logic as part
of applying the service.

Signed-off-by: Adam King <adking@redhat.com>
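
An illustrative sketch of the yes/no counting check the commit above describes; this is not cephadm's actual code, and the Daemon class and status values are hypothetical.

    from dataclasses import dataclass

    @dataclass
    class Daemon:
        name: str
        status: str  # 'running' or 'error'

    def ok_to_stop(daemons, to_remove, min_running=1):
        # Count daemons that would still be running after the removals
        running = {d.name for d in daemons if d.status == 'running'}
        remaining = running - {d.name for d in to_remove}
        return len(remaining) >= min_running

    # If every daemon is in error state, zero are "running", so the check
    # refuses the removal even though nothing working would be lost:
    broken = [Daemon('nvmeof.a', 'error'), Daemon('nvmeof.b', 'error')]
    print(ok_to_stop(broken, broken))  # False
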
Similar to what is done for iscsi, a basic deployment
test to make sure we can deploy the daemon and
it comes up in running state with no issues.

Signed-off-by: Adam King <adking@redhat.com>
Rather than giving full admin privileges,
try to be a bit more strict by limiting it
to profile rbd mon caps and full OSD
privileges for rbd tagged pools. I also wanted
to include an OSD cap like

allow all pool="*" object_prefix "nvmeof.state"

but this caused a failure in the nvme-of daemon

RADOS permission error (Failed to operate write op for oid nvmeof.None.state)

Signed-off-by: Adam King <adking@redhat.com>
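
For context, a hedged sketch of the cap profile this commit describes; the exact strings cephadm generates may differ, and the entity name is hypothetical.

    # Caps for a generated entity like client.nvmeof.foo (hypothetical name):
    caps = {
        'mon': 'profile rbd',
        # full OSD privileges, restricted to pools tagged with the
        # rbd application:
        'osd': 'allow all tag rbd *=*',
    }
    # The stricter OSD cap mentioned above, which broke the daemon:
    #   allow all pool="*" object_prefix "nvmeof.state"
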
This is the IP the nvmeof daemon will bind
to, so it should be the IP of the host we're
deploying the nvmeof daemon on, not the IP
of the active mgr.

Signed-off-by: Adam King <adking@redhat.com>
"igw_id" was leftover from the nvmeof implementation
being taken heavily from the iscsi implementation. "igw"
means nothing in this context, so we can change the name.

Signed-off-by: Adam King <adking@redhat.com>
0.0.2 includes a patch that allows the nvmeof
daemon to use non-admin keyrings, so we should
use it over 0.0.1.

Signed-off-by: Adam King <adking@redhat.com>
@adk3798 (Contributor Author) commented Aug 8, 2023

> The current changes LGTM, modulo the temporary container image hack. Once there's a usable upstream build with the change and you drop that temporary commit, please ping me for final approval!

This is no longer pointing towards a temp image. An official nvmeof image, 0.0.2, was made with the necessary changes (ceph/ceph-nvmeof#172) and we now point to that.

@phlogistonjohn (Contributor) left a comment

LGTM

@adk3798 (Contributor Author) commented Aug 14, 2023

jenkins retest this please

@adk3798 (Contributor Author) commented Aug 15, 2023

https://pulpito.ceph.com/adking-2023-08-15_13:44:09-orch:cephadm-wip-adk-testing-2023-08-14-1902-distro-default-smithi/

7 failed jobs and 3 dead jobs (as of writing this, 1 is dead and 2 are still running, but the 2 running jobs will time out)

  • 4 failures were mds_upgrade_sequence upgrading from reef -> main. The upgrade from reef doesn't work yet, as we need to implement something that can pull the latest reef build of the cephadm binary.
  • 2 failures were https://tracker.ceph.com/issues/62084, a known issue.
  • 1 failure was https://tracker.ceph.com/issues/59704, a known issue.
  • 2 dead jobs were the mds upgrade sequence starting from quincy. They time out doing a umount command. These were broken the same way before when upgrading from pacific, and we wanted to see if upgrading from quincy would somehow fix it, but it hasn't.
  • 1 dead job was an infra/teuthology issue: Error reimaging machines: Expected smithi172's OS to be centos 8 but found rhel 8.6

Overall, nothing related to the mds upgrade sequence or jaeger-tracing can be merged yet, but there is nothing here to block merging most of the other things in the run.

adk3798 merged commit 307cc75 into ceph:main on Aug 15, 2023 (10 of 12 checks passed).
adk3798 mentioned this pull request on Aug 15, 2023 (14 tasks).