mgr/cephadm: fix tcmu-runner cephadm_stray_daemon #43833

Merged
sebastian-philipp merged 1 commit into ceph:master from melissa-kun-li:prevent-cephadm-stray-daemon-tcmurunner
Jan 6, 2022

Conversation

@melissa-kun-li
Contributor

Adds tcmu-runner daemon to managed to prevent cephadm from raising a cephadm_stray_daemon health warning whenever an image is added to a target

Fixes: https://tracker.ceph.com/issues/51111
Signed-off-by: Melissa Li <melissali@redhat.com>
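
To illustrate the idea behind the fix, here is a minimal sketch of how a stray-daemon check could behave before and after tcmu-runner is added to the managed set. It is not the actual cephadm code: the function `find_stray_daemons` and all daemon names below are hypothetical, chosen only to show why a daemon that is running but missing from `managed` would be reported as stray.

```python
from typing import Iterable, Set


def find_stray_daemons(running: Iterable[str], managed: Iterable[str]) -> Set[str]:
    """Return the daemons that are reported as running but are not managed.

    Hypothetical helper for illustration; not part of cephadm.
    """
    managed_set = set(managed)
    return {name for name in running if name not in managed_set}


# Illustrative daemon names only (assumed, not taken from a real cluster).
running = ["osd.0", "iscsi.gw1", "tcmu-runner.gw1:datapool/block0"]

# Before the fix: tcmu-runner is not in the managed set, so it looks stray.
managed_before = ["osd.0", "iscsi.gw1"]
stray_before = find_stray_daemons(running, managed_before)

# After the fix: tcmu-runner is treated as managed, so nothing is stray.
managed_after = managed_before + ["tcmu-runner.gw1:datapool/block0"]
stray_after = find_stray_daemons(running, managed_after)

print(sorted(stray_before))
print(sorted(stray_after))
```

In this sketch the warning condition corresponds to a non-empty result, so adding the tcmu-runner entry to the managed set is what makes the warning go away.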

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

@melissa-kun-li melissa-kun-li requested a review from a team as a code owner November 6, 2021 03:53
@melissa-kun-li melissa-kun-li force-pushed the prevent-cephadm-stray-daemon-tcmurunner branch from 4d37464 to 009c15c Compare November 7, 2021 02:54
@melissa-kun-li
Contributor Author

jenkins test make check

melissa-kun-li added a commit to melissa-kun-li/ceph that referenced this pull request Nov 23, 2021
melissa-kun-li added a commit to melissa-kun-li/ceph that referenced this pull request Dec 1, 2021
@melissa-kun-li
Contributor Author

Test on this branch: melissa-kun-li@30fe259 just with the symlink change, no PR changes:
https://pulpito.ceph.com/melissali-2021-12-01_15:57:33-rbd:iscsi-master-distro-basic-smithi/

2021-12-01T16:28:19.622 INFO:tasks.cram.client.0.smithi026.stdout:/home/ubuntu/cephtest/archive/cram.client.0/gwcli_create.t: failed
2021-12-01T16:28:19.622 INFO:tasks.cram.client.0.smithi026.stdout:--- gwcli_create.t
2021-12-01T16:28:19.623 INFO:tasks.cram.client.0.smithi026.stdout:+++ gwcli_create.t.err
...

2021-12-01T16:28:19.635 INFO:tasks.cram.client.0.smithi026.stdout: Map the LUN
2021-12-01T16:28:19.635 INFO:tasks.cram.client.0.smithi026.stdout: ===========
2021-12-01T16:28:19.635 INFO:tasks.cram.client.0.smithi026.stdout:   $ sudo podman exec $ISCSI_CONTAINER gwcli iscsi-targets/iqn.2003-01.com.redhat.iscsi-gw:ceph-gw/hosts/iqn.1994-05.com.redhat:client disk disk=datapool/block0
2021-12-01T16:28:19.636 INFO:tasks.cram.client.0.smithi026.stdout:+  No such path /iscsi-targets/iqn.2003-01.com.redhat.iscsi-gw:ceph-gw/hosts/iqn.1994-05.com.redhat:client
2021-12-01T16:28:19.636 INFO:tasks.cram.client.0.smithi026.stdout:+  [255]
2021-12-01T16:28:19.636 INFO:tasks.cram.client.0.smithi026.stdout:   $ sudo podman exec $ISCSI_CONTAINER gwcli ls iscsi-targets/ | grep 'o- hosts' | awk -F'[' '{print $2}'
2021-12-01T16:28:19.636 INFO:tasks.cram.client.0.smithi026.stdout:-  Auth: ACL_ENABLED, Hosts: 1]
2021-12-01T16:28:19.636 INFO:tasks.cram.client.0.smithi026.stdout:+  Auth: ACL_ENABLED, Hosts: 0]
2021-12-01T16:28:19.637 INFO:tasks.cram.client.0.smithi026.stdout:   $ sudo podman exec $ISCSI_CONTAINER gwcli ls iscsi-targets/ | grep 'o- iqn.1994-05.com.redhat:client' | awk -F'[' '{print $2}'
2021-12-01T16:28:19.637 INFO:tasks.cram.client.0.smithi026.stdout:-  Auth: None, Disks: 1(300M)]
2021-12-01T16:28:19.637 INFO:tasks.cram.client.0.smithi026.stdout:# Ran 1 tests, 0 skipped, 1 failed.
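
For readers unfamiliar with the `grep ... | awk -F'[' '{print $2}'` pipeline in the log above, the following sketch mimics the awk step in Python. It shows why the compared values end with a trailing `]`. The sample input line is an assumption; the exact tree drawing that gwcli prints is not shown in the log.

```python
def second_bracket_field(line: str) -> str:
    """Mimic `awk -F'[' '{print $2}'`: split on '[' and return the second field."""
    fields = line.split("[")
    # awk prints an empty string when the line has fewer than two fields.
    return fields[1] if len(fields) > 1 else ""


# Illustrative line only; the layout before the '[' is assumed.
sample = "o- hosts ..... [Auth: ACL_ENABLED, Hosts: 0]"
summary = second_bracket_field(sample)
print(summary)
```

The cram test then compares this extracted text with the expected value, which is why `Hosts: 0]` versus `Hosts: 1]` shows up as a diff.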

Test on this branch: melissa-kun-li@2191585 (includes PR changes, deleted rbd/iscsi/supported-random-distros, added symlink to single-container-host.yaml):
https://pulpito.ceph.com/melissali-2021-12-01_06:19:34-rbd:iscsi-master-distro-basic-smithi/6537907/

Same error as the branch without the PR changes.

@ideepika
Member

ideepika commented Dec 2, 2021

@sebastian-philipp @melissa-kun-li iSCSI gateway creation itself is failing somehow

2021-12-01T16:28:19.623 INFO:tasks.cram.client.0.smithi026.stdout:   > IP=`hostname -i | awk '{print $1}'`
2021-12-01T16:28:19.624 INFO:tasks.cram.client.0.smithi026.stdout:   > sudo podman exec $ISCSI_CONTAINER gwcli iscsi-targets/iqn.2003-01.com.redhat.iscsi-gw:ceph-gw/gateways create ip_addresses=$IP gateway_name=$HOST
2021-12-01T16:28:19.624 INFO:tasks.cram.client.0.smithi026.stdout:   $ sudo podman exec $ISCSI_CONTAINER gwcli ls iscsi-targets/ | grep 'o- gateways' | awk -F'[' '{print $2}'

@melissa-kun-li can you run this test on ceph master+symlink change, also would be good to have the symlink change in a different PR.

@melissa-kun-li
Contributor Author

> @melissa-kun-li can you run this test on ceph master+symlink change, also would be good to have the symlink change in a different PR.

Ran with just the symlink change on master using my branch new-rbd-iscsi (melissa-kun-li@30fe259) and got the issue in #43833 (comment)

I did:
teuthology-suite -v --ceph-repo https://github.com/ceph/ceph-ci.git --suite-repo https://github.com/melissa-kun-li/ceph.git -c master -m smithi -s rbd/iscsi -k distro -p 50 --suite-branch new-rbd-iscsi -S 0f9ed11e6727bf4fccef4a0481cd9391f27d3cff --force-priority

If there's another thing you meant, please let me know

@ideepika
Member

ideepika commented Dec 6, 2021

> @melissa-kun-li can you run this test on ceph master+symlink change, also would be good to have the symlink change in a different PR.
>
> Ran with just the symlink change on master using my branch new-rbd-iscsi (melissa-kun-li@30fe259) and got the issue in #43833 (comment)
>
> I did: teuthology-suite -v --ceph-repo https://github.com/ceph/ceph-ci.git --suite-repo https://github.com/melissa-kun-li/ceph.git -c master -m smithi -s rbd/iscsi -k distro -p 50 --suite-branch new-rbd-iscsi -S 0f9ed11e6727bf4fccef4a0481cd9391f27d3cff --force-priority
>
> If there's another thing you meant, please let me know

Thanks, I will take a look

@ideepika
Member

ideepika commented Dec 6, 2021

@melissa-kun-li I tried reproducing on master, but could not: http://pulpito.front.sepia.ceph.com/ideepika-2021-12-06_19:34:00-rbd:iscsi-wip-yuval-test1-distro-default-smithi/ Can you rebase and retry, or share the branch you are using for testing?

Adds tcmu-runner daemon to `managed` to prevent cephadm from raising a cephadm_stray_daemon health warning whenever an image is added to a target

Fixes: https://tracker.ceph.com/issues/51111
Signed-off-by: Melissa Li <melissali@redhat.com>
@melissa-kun-li melissa-kun-li force-pushed the prevent-cephadm-stray-daemon-tcmurunner branch from 009c15c to 12e0eaa Compare December 22, 2021 17:05
@melissa-kun-li
Contributor Author

@ideepika thank you for looking into it! I saw your PR: #44243 . I will try testing again. If we can merge this PR, we can look at the rbd/iscsi tests separately

@ideepika
Member

> @ideepika thank you for looking into it! I saw your PR: #44243 . I will try testing again. If we can merge this PR, we can look at the rbd/iscsi tests separately

@melissa-kun-li let me know if you need help with this PR regarding iscsi stuff

@sebastian-philipp sebastian-philipp added the wip-swagner-testing My Teuthology tests label Jan 5, 2022
@melissa-kun-li
Contributor Author

@sebastian-philipp
Contributor

https://pulpito.ceph.com/swagner-2022-01-06_10:04:35-orch:cephadm-wip-swagner-testing-2022-01-05-1435-distro-default-smithi/

Somehow one of the PRs in this run broke NFS and RGW ingress. I really doubt this PR caused this.

@idryomov
Contributor

idryomov commented Jan 11, 2022

It looks like this introduced a bug:

https://tracker.ceph.com/issues/53830

Note the "No such path /iscsi-targets/iqn.2003-01.com.redhat.iscsi-gw:ceph-gw/hosts/iqn.1994-05.com.redhat:client" error that is mentioned in the previous comments.

Melissa's successful run linked in comment #43833 (comment) didn't actually test this PR's code because it was scheduled against the master branch.

I'll confirm by testing the revert and report back.

@idryomov
Contributor

Sorry for the noise, it still fails with this PR taken out. It's just that nothing seems to have changed on the RBD side in that date range...
