quincy: cephadm: handle exceptions applying extra services during bootstrap #50904

Merged
merged 1 commit into from Apr 10, 2023

adk3798
Contributor

@adk3798 adk3798 commented Apr 5, 2023

backport tracker: https://tracker.ceph.com/issues/59302


backport of #50548
parent tracker: https://tracker.ceph.com/issues/59082

this backport was staged using ceph-backport.sh version 16.0.0.6848
find the latest version at https://github.com/ceph/ceph/blob/main/src/script/ceph-backport.sh

Otherwise we risk hitting a mismatch between the cephadm binary version
and the container image version we're bootstrapping on, resulting in
bootstrap failing. Example in the tracker.

Fixes: https://tracker.ceph.com/issues/59082

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit a57fc00)
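
For readers unfamiliar with the change, the general idea can be sketched roughly as below. This is only an illustration, not the code from the commit: `bootstrap_extra_services`, `apply_spec`, `flaky_apply`, and the spec names are all hypothetical and assumed for the example. It shows a failure while applying an extra service being caught and logged so that it does not abort the bootstrap as a whole.

```python
import logging
from typing import Callable, List

logger = logging.getLogger("bootstrap-sketch")


def bootstrap_extra_services(
    specs: List[str],
    apply_spec: Callable[[str], None],
) -> List[str]:
    """Apply each extra service spec, collecting failures instead of raising.

    Both this function and `apply_spec` are hypothetical stand-ins, not
    cephadm's real API.
    """
    failed = []
    for spec in specs:
        try:
            apply_spec(spec)
        except Exception as e:
            # A failure here (for example one caused by a binary/image
            # version mismatch) is reported, but bootstrap carries on.
            logger.warning("Applying %s failed: %s", spec, e)
            failed.append(spec)
    return failed


def flaky_apply(spec: str) -> None:
    # Hypothetical applier that rejects one spec to simulate a failure.
    if spec == "grafana":
        raise RuntimeError("unknown option")


failed = bootstrap_extra_services(
    ["prometheus", "grafana", "alertmanager"], flaky_apply
)
```

In this sketch the remaining specs are still applied after the failure, and the caller can decide what to do with the list of specs that failed.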
@adk3798 adk3798 requested a review from a team as a code owner April 5, 2023 19:52
@adk3798 adk3798 added this to the quincy milestone Apr 5, 2023
@adk3798
Contributor Author

adk3798 commented Apr 10, 2023

https://pulpito.ceph.com/adking-2023-04-06_03:45:59-orch:cephadm-wip-adk3-testing-2023-04-05-1604-quincy-distro-default-smithi/

The 4 dead jobs are a general issue I'm seeing with the upgrade-with-workload task. It seems the upgrade completes but the workload test at the end never does, for some reason. This started happening on multiple branches simultaneously; for example, it came up in the reef baseline run (https://pulpito.ceph.com/yuriw-2023-04-08_15:50:42-orch-reef-distro-default-smithi/7235797). Since it is showing up in multiple stable branches at once, each with a different codebase, I think it's likely not an actual bug within ceph and shouldn't block PR merging, although it will require investigation.

Rerun of failed jobs: https://pulpito.ceph.com/adking-2023-04-08_15:01:19-orch:cephadm-wip-adk3-testing-2023-04-05-1604-quincy-distro-default-smithi/

After reruns, 2 failures. Both are the same test, which fails because it expects a command to return a zero return code along with an error message, and instead the command is returning a nonzero return code. To clarify, the command is meant to fail (it's checking on a nonexistent cluster); it's just that the test expects it to fail a different way. I think the test was backported without the corresponding change to the actual function's return code, so this shouldn't block merging.
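
To make the mismatch concrete, here is a minimal sketch. `check_cluster` and its message text are hypothetical, assumed purely for illustration, and are not the actual command or test. It models a function on this branch that signals failure through a nonzero return code, while the test's expectation is a zero return code plus an error message.

```python
from typing import Tuple


def check_cluster(exists: bool) -> Tuple[int, str]:
    """Hypothetical command returning (return code, message).

    Models the behaviour on this branch: failure is reported through a
    nonzero return code.
    """
    if not exists:
        return 1, "Error: cluster not found"
    return 0, ""


# The command is run against a nonexistent cluster, so it is meant to fail.
rc, msg = check_cluster(exists=False)

# What the backported test expects: zero return code plus an error message.
test_expectation_met = rc == 0 and "Error" in msg
```

Here the command fails as intended, but `test_expectation_met` is false because the failure arrives as a nonzero return code rather than as a message with a zero return code.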

@adk3798 adk3798 merged commit c6aaf21 into ceph:quincy Apr 10, 2023
11 checks passed
2 participants