
quincy: ceph-volume: fix issue with fast device allocs when there are multiple PVs per VG #50879

Merged

merged 2 commits into ceph:quincy on Apr 10, 2023

Conversation

@guits (Contributor) commented Apr 5, 2023

backport tracker: https://tracker.ceph.com/issues/59312


backport of #50279
parent tracker: https://tracker.ceph.com/issues/58857

this backport was staged using ceph-backport.sh version 16.0.0.6848
find the latest version at https://github.com/ceph/ceph/blob/main/src/script/ceph-backport.sh

Adds a test case to reproduce a bug with get_physical_fast_allocs for
clusters that have multiple fast device PVs in a single VG (deployed
prior to v15.2.8). Also fixes other test cases for this function
to more accurately represent reality.

Signed-off-by: Cory Snyder <csnyder@1111systems.com>
(cherry picked from commit 02592cb)
ceph-volume: fix issue with fast device allocs when there are multiple PVs per VG

Fixes a regression with fast device allocations when there are multiple PVs
per VG. This is the case for clusters that were deployed prior to v15.2.8.

Fixes: https://tracker.ceph.com/issues/58857
Signed-off-by: Cory Snyder <csnyder@1111systems.com>
(cherry picked from commit efcf71b)
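The gist of the regression, as a minimal, hypothetical sketch (not the actual ceph-volume code): when fast device slots are counted per PV instead of per VG, a VG that spans multiple PVs (as on clusters deployed before v15.2.8) is counted more than once. All names below (`slots_per_fast_device`, the device paths, `osds_per_vg`) are illustrative assumptions.

```python
from collections import defaultdict

def slots_per_fast_device(pvs, osds_per_vg):
    """Illustrative sketch only: count allocation targets per VG, not per PV.

    `pvs` is a list of (pv_name, vg_name) pairs. A per-PV count treats each
    PV as its own allocation target, over-counting VGs that span multiple
    PVs; grouping PVs by VG first yields one target per VG.
    """
    # Buggy-style accounting: one allocation target per PV
    per_pv = {pv: osds_per_vg for pv, _vg in pvs}

    # Fixed-style accounting: group PVs by their VG, then allocate per VG
    vgs = defaultdict(list)
    for pv, vg in pvs:
        vgs[vg].append(pv)
    per_vg = {vg: osds_per_vg for vg in vgs}
    return per_pv, per_vg

# One VG backed by two NVMe PVs: per-PV counting sees two targets,
# per-VG counting correctly sees one.
per_pv, per_vg = slots_per_fast_device(
    [("/dev/nvme0n1", "vg_fast"), ("/dev/nvme1n1", "vg_fast")], osds_per_vg=4
)
```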
@guits guits requested a review from a team as a code owner April 5, 2023 16:01
@guits guits added this to the quincy milestone Apr 5, 2023
@adk3798 (Contributor) commented Apr 10, 2023

https://pulpito.ceph.com/adking-2023-04-06_03:45:59-orch:cephadm-wip-adk3-testing-2023-04-05-1604-quincy-distro-default-smithi/

4 dead jobs are a general issue I'm seeing with the upgrade-with-workload task. The upgrade completes, but the workload test at the end never finishes for some reason. This started happening on multiple branches simultaneously; for example, it came up in the reef baseline run (https://pulpito.ceph.com/yuriw-2023-04-08_15:50:42-orch-reef-distro-default-smithi/7235797). Because it shows up in multiple stable branches with different codebases at once, it's likely not an actual bug within Ceph and shouldn't block PR merging, although it will require investigation.

Rerun of failed jobs: https://pulpito.ceph.com/adking-2023-04-08_15:01:19-orch:cephadm-wip-adk3-testing-2023-04-05-1604-quincy-distro-default-smithi/

After reruns, 2 failures remain. Both are the same test, which fails because it expects a command to return a zero exit code along with an error message, but the command instead returns a nonzero exit code. To clarify, the command is meant to fail (it checks a nonexistent cluster); the test just expects it to fail in a different way. This looks like the test was backported without the corresponding change to the function's return code, and it shouldn't block merging.
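The expectation mismatch described above can be sketched as follows. This is a hypothetical illustration, not the actual orch test: the command, the helper name `check_nonexistent_cluster`, and the assertions are all assumed for the example.

```python
import subprocess

def check_nonexistent_cluster(cmd):
    """Run a command that is expected to fail against a nonexistent cluster.

    Hypothetical sketch of the mismatch: the backported test asserted a zero
    exit code plus an error message on stderr, while the command now exits
    nonzero. The updated expectation checks the nonzero exit code instead.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    # Old expectation (breaks once the command exits nonzero):
    #   assert result.returncode == 0 and result.stderr
    # Updated expectation:
    assert result.returncode != 0
    return result

# e.g. probing a path that does not exist fails with a nonzero exit code
check_nonexistent_cluster(["ls", "/no/such/cluster"])
```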

@adk3798 adk3798 merged commit 2910ef3 into ceph:quincy Apr 10, 2023
11 checks passed