-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OSD prepare job cannot create vols with /dev/mapper alias on PVCs #11301
OSD prepare job cannot create vols with /dev/mapper alias on PVCs #11301
Conversation
Remove the condition that select the devices name prefixed with /dev/mapper as it is preventing the ceph-volume command to fail. Closes: rook#11301 Signed-off-by: Ausias Armesto <ausiasarmesto@gmail.com>
dda64c7
to
989671e
Compare
Remove the condition that select the devices name prefixed with /dev/mapper as it is preventing the ceph-volume command to fail. Closes: rook#11301 Signed-off-by: Ausias Armesto <ausiasarmesto@gmail.com>
989671e
to
574240b
Compare
Remove the condition that select the devices name prefixed with /dev/mapper as it is preventing the ceph-volume command to fail. Closes: rook#11301 Signed-off-by: Ausias Armesto <ausiasarmesto@gmail.com>
574240b
to
3beb9d3
Compare
Remove the condition that select the devices name prefixed with /dev/mapper as it is preventing the ceph-volume command to fail. Closes: rook#11301 Signed-off-by: Ausias Armesto <ausiasarmesto@gmail.com> Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
3beb9d3
to
9b39405
Compare
You also need to remove this check from the unit test https://github.com/rook/rook/blob/master/pkg/daemon/ceph/osd/volume_test.go |
If I understand correctly, this only affects PVC-based OSDs. Is that correct and intended? If so, I would request the title and commit subject be sure to clarify that it is only the PVC OSDs. |
I also think @satoru-takeuchi 's input will be invaluable here. As long as this is limited to PVC-based OSDs, I think this is safe, but I'm not sure if there are side-effects I'm not foreseeing. |
…ing job prepare execution.\nRemove the condition that select the devices name prefixed with /dev/mapper as it is preventing the ceph-volume command to fail.\nCloses: rook#11301\Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
…ing job prepare execution. Remove the condition that select the devices name prefixed with /dev/mapper as it is preventing the ceph-volume command to fail. Closes: rook#11301 Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
89eb49e
to
33cd542
Compare
…ing job prepare execution. Remove the condition that select the devices name prefixed with /dev/mapper as it is preventing the ceph-volume command to fail. Closes: rook#11301 Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
33cd542
to
50407bc
Compare
Remove the condition that select the devices name prefixed with /dev/mapper as it is preventing the ceph-volume command to fail. Closes: rook#11301 Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
50407bc
to
b12c16f
Compare
@ausias-armesto Does this PR ready to be reviewed? In addition, please reflect this comment if necessary. |
I see this commit body:
The way I read this, I think |
Yes, that is the case that I am facing, for PVC based OSD. I've updated the commit message and fixed the unit test so you can review again. |
Remove the condition that select the devices name prefixed with /dev/mapper as it is causing the ceph-volume command to fail. Closes: rook#11301 Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
b12c16f
to
c2947ea
Compare
Hi @satoru-takeuchi @BlaineEXE , |
@ausias-armesto I've not reviewed this yet. Please wait for a while. I'll look into it in the next week. |
@ausias-armesto Please wait for several days to complete the review. The OSD creation code is complicated very much and I'm afraid regressions. |
I assume the root cause is not the treatment of "/dev/mapper" but the improper handling of I guess you can create an OSD without applying this PR with the following configuration which you mentioned. Could you confirm this?
And the following configuration results in the problem that you saw. The difference is the number of
This can be verified by reading your log of the prepare pod.
I guess that your problem will disappear if |
@satoru-takeuchi I'm investigating a case where an OSD cannot be created on a PVC due to the same issue where the device has mappings under /dev/mappings. But when using a PVC I don't believe we need to use those mappings since the PVC gives us the exact path. Shouldn't we remove the dev mappings handling code for OSDs on PVCs? Or why would we need it? |
You're right. This logic is not necessary. So this patch makes sense not as a bug fix but as a cleanup. How about merging this PR anyway? In addition, I'll submit a PR to handle |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Marking changes requested until discussion is finalized.
@@ -586,12 +579,6 @@ func (a *OsdAgent) initializeDevicesLVMMode(context *clusterd.Context, devices * | |||
|
|||
logger.Infof("configuring new LVM device %s", name) | |||
deviceArg := path.Join("/dev", name) | |||
// ceph-volume prefers to use /dev/mapper/<name> if the device has this kind of alias | |||
for _, devlink := range device.PersistentDevicePaths { | |||
if strings.HasPrefix(devlink, "/dev/mapper") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we remove the dev mappings handling code for OSDs on PVCs? Or why would we need it?
You're right. This logic is not necessary. So this patch makes sense not as a bug fix but as a cleanup. How about merging this PR anyway? In addition, I'll submit a PR to handle
osdsPerDevice
properly.
I wonder if there is still a scenario where we need to keep the /dev/mapper detection. If the user has specified useAllDevices or similar settings, the /dev/mapper path is still preferred. But in the case where the user has specified explicit paths such as with local PVs or named devices in the cluster CR, there is no need for this search since they can be specified already as such with the /dev/mapper path. If we add this check back in for the non-pvc case here, we should add it for raw mode in the initializeDevicesRawMode()
method where it is missing now. Or perhaps we keep it simple, consistent, and predictable by merging this PR without further changes.
@satoru-takeuchi @ausias-armesto thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@travisn I only focused on LV-backed PVC by mistake and now I have a concern.
This logic is not necessary for LV-backed OSD for both OSD on device
nd OSD on PVC because this configuration always uses raw mode and ceph-volume raw prepare
accepts all device names. However, if a new OSD creation uses lvm mode, /dev/mapper prefix must be used because ceph-volume lvm prepare
only accept the device names prefixed by /dev/mapper. This limitation exists for both OSD on device and OSD on PCV.
Please wait for a while. I'll verify my concern later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@travisn , I can only test on LV-backed PVC, and cannot provide further information about how would behave in other contexts. On the other side, changing osdsPerDevice
to 1
has not changed anything, is still not working.
Thanks for following this issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ausias-armesto Could you show me your cluster.yaml and the log of osd prepare pod when you specified osdsPerDevice: 1
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
osd-prepare-1.log
osd-prepare-2.log
osd-prepare-3.log
helm-cluster.yaml.txt
The disk partitions for servers running on osd2 and osd3 looks like this:
root@server-stage-fsn1-002 /home/debian # lsblk -f
NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINT
sda
├─sda1 linux_raid_member 1.2 rescue:0 492095b8-f9ee-e7c0-9e1e-e19193d3c00f
│ └─md0 ext3 1.0 ce3f530e-9eb8-4d67-ac74-44fe208f9328 877.3M 6% /boot
└─sda2 linux_raid_member 1.2 rescue:1 d9d405cd-528f-6f96-99b3-9b7fb7469523
└─md1 LVM2_member LVM2 001 ECx2sJ-9et7-Sg4F-oC59-nge2-LLDQ-3WACPR
├─vg0-root ext4 1.0 0545917f-d52f-4902-81f6-88c5f02e84ba 37.9G 17% /
├─vg0-swap swap 1 236c4901-9f82-4399-a8d6-b61147818169
├─vg0-tmp reiserfs 3.6 9b4771e7-73a5-41f9-ab4f-56972f9be3e1 4.6G 8% /tmp
└─vg0-data
sdb
├─sdb1 linux_raid_member 1.2 rescue:0 492095b8-f9ee-e7c0-9e1e-e19193d3c00f
│ └─md0 ext3 1.0 ce3f530e-9eb8-4d67-ac74-44fe208f9328 877.3M 6% /boot
└─sdb2 linux_raid_member 1.2 rescue:1 d9d405cd-528f-6f96-99b3-9b7fb7469523
└─md1 LVM2_member LVM2 001 ECx2sJ-9et7-Sg4F-oC59-nge2-LLDQ-3WACPR
├─vg0-root ext4 1.0 0545917f-d52f-4902-81f6-88c5f02e84ba 37.9G 17% /
├─vg0-swap swap 1 236c4901-9f82-4399-a8d6-b61147818169
├─vg0-tmp reiserfs 3.6 9b4771e7-73a5-41f9-ab4f-56972f9be3e1 4.6G 8% /tmp
└─vg0-data
The server running on osd1, has 2 extra disks sdc
and sdd
which are correctly detected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic is not necessary for LV-backed OSD for both OSD on device and OSD on PVC because this configuration always uses raw mode and ceph-volume raw prepare accepts all device names. However, if a new OSD creation uses lvm mode, /dev/mapper prefix must be used because ceph-volume lvm prepare only accept the device names prefixed by /dev/mapper. This limitation exists for both OSD on device and OSD on PCV.
Please wait for a while. I'll verify my concern later.
My worry proved groundless because of the following reasons.
ceph-volume lvm batch
accepts only "disk", "part", "lvm", and "loop". In other words, lvm mode is used only for these disk types.- Rook supports only raw mode for OSD on LV. So "/dev/mapper" prefix (will be removed by this PR) is not necessary.
- loop devices are not provided by device mapper. So "/dev/mapper" prefix is not necessary too.
So we can merge this PR if the current conflict is fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ausias-armesto Could you specify LV devices explicitly by devices.name
field? Rook doesn't pick up LV devices specified by filters like deviceFilter
by design. As a side note, deviceClas
should be deviceClass.
nodes:
...
- name: "server-stage-fsn1-002"
devices:
- name: "/dev/vg0/data"
config:
osdsPerDevice: "1"
deviceClass: "hdd"
- name: "server-stage-hel1-001"
devices:
- name: "/dev/vg0/data"
config:
osdsPerDevice: "1"
deviceClas: "hdd"
@ausias-armesto Can you rebase to remove the merge commit from the PR? A single commit is preferred. See these instructions. |
Remove the condition that select the devices name prefixed with /dev/mapper as it is causing the ceph-volume command to fail. Closes: rook#11301 Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
9e5f032
to
15049c4
Compare
This pull request has merge conflicts that must be resolved before it can be merged. @ausias-armesto please rebase it. https://rook.io/docs/rook/latest/Contributing/development-flow/#updating-your-fork |
Remove the condition that select the devices name prefixed with /dev/mapper as it is causing the ceph-volume command to fail. Closes: rook#11301 Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
15049c4
to
bc41656
Compare
@ausias-armesto Commitlint claims "body must have leading blank line [body-leading-blank]", please check it. https://github.com/rook/rook/actions/runs/3693055432/jobs/6252601418#step:4:20 Other test failures seem to be intermittent. |
The CI was having issues earlier today. If you rebase when you update the commit message, the CI should pass now as well. |
Remove the condition that select the devices name prefixed with /dev/mapper as it is causing the ceph-volume command to fail. Closes: rook#11301 Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
bc41656
to
34c9513
Compare
Remove the condition that select the devices name prefixed with /dev/mapper as it is causing the ceph-volume command to fail. Closes: rook#11301 Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
34c9513
to
42731d6
Compare
It removes the condition that select the devices name prefixed with /dev/mapper as it is causing the ceph-volume command to fail. Closes: rook#11301 Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
42731d6
to
f55c457
Compare
Thi problem will be handled by #11438 |
OSD prepare job cannot create vols with /dev/mapper alias on PVCs (backport #11301)
Description of your changes:
The kubernetes job
rook-ceph-osd-prepare-<server-name>
is not able to complete correctly when trying to create a OSD whose device is created by a LVM and has an alias starting by/dev/mapper
. The job fails trying to create aceph-volume
Which issue is resolved by this Pull Request:
Resolves #11297
Checklist:
skip-ci
on the PR.