Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OSD prepare job cannot create vols with /dev/mapper alias on PVCs #11301

Merged

Conversation

ausias-armesto
Copy link
Contributor

@ausias-armesto ausias-armesto commented Nov 11, 2022

Description of your changes:
The kubernetes job rook-ceph-osd-prepare-<server-name> is not able to complete correctly when trying to create a OSD whose device is created by a LVM and has an alias starting by /dev/mapper. The job fails trying to create a ceph-volume

Which issue is resolved by this Pull Request:
Resolves #11297

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide).
  • Skip Tests for Docs: If this is only a documentation change, add the label skip-ci on the PR.
  • Reviewed the developer guide on Submitting a Pull Request
  • Pending release notes updated with breaking and/or notable changes for the next minor release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

ausias-armesto added a commit to ausias-armesto/rook that referenced this pull request Nov 12, 2022
Remove the condition that select the devices name prefixed with /dev/mapper as it is preventing the ceph-volume command to fail.

Closes: rook#11301
Signed-off-by: Ausias Armesto <ausiasarmesto@gmail.com>
@ausias-armesto ausias-armesto force-pushed the fix/11297-create-volumes-with-mapper-alias branch from dda64c7 to 989671e Compare November 12, 2022 09:22
ausias-armesto added a commit to ausias-armesto/rook that referenced this pull request Nov 12, 2022
Remove the condition that select the devices name prefixed with /dev/mapper as it is preventing the ceph-volume command to fail.

Closes: rook#11301
Signed-off-by: Ausias Armesto <ausiasarmesto@gmail.com>
@ausias-armesto ausias-armesto force-pushed the fix/11297-create-volumes-with-mapper-alias branch from 989671e to 574240b Compare November 12, 2022 09:28
ausias-armesto added a commit to ausias-armesto/rook that referenced this pull request Nov 12, 2022
Remove the condition that select the devices name prefixed with /dev/mapper as it is preventing the ceph-volume command to fail.

Closes: rook#11301
Signed-off-by: Ausias Armesto <ausiasarmesto@gmail.com>
@ausias-armesto ausias-armesto force-pushed the fix/11297-create-volumes-with-mapper-alias branch from 574240b to 3beb9d3 Compare November 12, 2022 09:30
ausias-armesto added a commit to ausias-armesto/rook that referenced this pull request Nov 12, 2022
Remove the condition that select the devices name prefixed with /dev/mapper as it is preventing the ceph-volume command to fail.

Closes: rook#11301
Signed-off-by: Ausias Armesto <ausiasarmesto@gmail.com>
Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
@ausias-armesto ausias-armesto force-pushed the fix/11297-create-volumes-with-mapper-alias branch from 3beb9d3 to 9b39405 Compare November 12, 2022 09:34
@parth-gr
Copy link
Member

You also need to remove this check from the unit test https://github.com/rook/rook/blob/master/pkg/daemon/ceph/osd/volume_test.go
So the unit-test CI should pass

@BlaineEXE
Copy link
Member

If I understand correctly, this only affects PVC-based OSDs. Is that correct and intended? If so, I would request the title and commit subject be sure to clarify that it is only the PVC OSDs.

@BlaineEXE
Copy link
Member

I also think @satoru-takeuchi 's input will be invaluable here. As long as this is limited to PVC-based OSDs, I think this is safe, but I'm not sure if there are side-effects I'm not foreseeing.

ausias-armesto added a commit to ausias-armesto/rook that referenced this pull request Nov 16, 2022
…ing job prepare execution.\nRemove the condition that select the devices name prefixed with /dev/mapper as it is preventing the ceph-volume command to fail.\nCloses: rook#11301\Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
ausias-armesto added a commit to ausias-armesto/rook that referenced this pull request Nov 16, 2022
…ing job prepare execution.

Remove the condition that select the devices name prefixed with /dev/mapper as it is preventing the ceph-volume command to fail.

Closes: rook#11301

Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
@ausias-armesto ausias-armesto force-pushed the fix/11297-create-volumes-with-mapper-alias branch from 89eb49e to 33cd542 Compare November 16, 2022 14:41
ausias-armesto added a commit to ausias-armesto/rook that referenced this pull request Nov 16, 2022
…ing job prepare execution.

Remove the condition that select the devices name prefixed with /dev/mapper as it is preventing the ceph-volume command to fail.

Closes: rook#11301

Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
@ausias-armesto ausias-armesto force-pushed the fix/11297-create-volumes-with-mapper-alias branch from 33cd542 to 50407bc Compare November 16, 2022 15:12
ausias-armesto added a commit to ausias-armesto/rook that referenced this pull request Nov 16, 2022
Remove the condition that select the devices name prefixed with /dev/mapper as it is preventing the ceph-volume command to fail.

Closes: rook#11301

Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
@ausias-armesto ausias-armesto force-pushed the fix/11297-create-volumes-with-mapper-alias branch from 50407bc to b12c16f Compare November 16, 2022 15:26
@satoru-takeuchi
Copy link
Member

@ausias-armesto Does this PR ready to be reviewed?

In addition, please reflect this comment if necessary.

@BlaineEXE BlaineEXE changed the title OSD prepare job cannot create volumes with alias /dev/mapper OSD prepare job cannot create /dev/mapper volumes on PVCs Nov 16, 2022
@BlaineEXE BlaineEXE changed the title OSD prepare job cannot create /dev/mapper volumes on PVCs OSD prepare job cannot create vols with /dev/mapper alias on PVCs Nov 16, 2022
@BlaineEXE
Copy link
Member

I see this commit body:

Remove the condition that select the devices name prefixed with /dev/mapper as it is preventing the ceph-volume command to fail.

The way I read this, I think preventing is intended to be causing> is that right?

@ausias-armesto
Copy link
Contributor Author

ausias-armesto commented Nov 16, 2022

If I understand correctly, this only affects PVC-based OSDs. Is that correct and intended? If so, I would request the title and commit subject be sure to clarify that it is only the PVC OSDs.

Yes, that is the case that I am facing, for PVC based OSD. I've updated the commit message and fixed the unit test so you can review again.
I've also changed 'preventing' by 'causing'

ausias-armesto added a commit to ausias-armesto/rook that referenced this pull request Nov 16, 2022
Remove the condition that select the devices name prefixed with /dev/mapper as it is causing the ceph-volume command to fail.

Closes: rook#11301
Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
@ausias-armesto ausias-armesto force-pushed the fix/11297-create-volumes-with-mapper-alias branch from b12c16f to c2947ea Compare November 16, 2022 20:55
@ausias-armesto
Copy link
Contributor Author

Hi @satoru-takeuchi @BlaineEXE ,
Is there anything else that I can do to get approved this PR

@satoru-takeuchi
Copy link
Member

@ausias-armesto I've not reviewed this yet. Please wait for a while. I'll look into it in the next week.

@satoru-takeuchi
Copy link
Member

@ausias-armesto Please wait for several days to complete the review. The OSD creation code is complicated very much and I'm afraid regressions.

@satoru-takeuchi
Copy link
Member

I assume the root cause is not the treatment of "/dev/mapper" but the improper handling of osdsPerDevice field. Although OSD on LV doesn't support osdsPerDevice field (it's not documented yet and I'll fix it later), users can specify this field and Rook tries to osd by mistake.

I guess you can create an OSD without applying this PR with the following configuration which you mentioned. Could you confirm this?

          storage:
            useAllNodes: false
            useAllDevices: false
            # deviceFilter:
            nodes:
              - name: "server-stage-fsn1-002"
                devices:
                  - name: "/dev/vg0/data"
                    config:
                      osdsPerDevice: "1"
                      deviceClas: "hdd"

And the following configuration results in the problem that you saw. The difference is the number of osdsPerDevice.

          storage:
            useAllNodes: false
            useAllDevices: false
            # deviceFilter:
            nodes:
              - name: "server-stage-fsn1-002"
                devices:
                  - name: "/dev/vg0/data"
                    config:
                      osdsPerDevice: "3"
                      deviceClas: "hdd"

This can be verified by reading your log of the prepare pod.

2022-11-10 22:04:17.199022 D | cephosd: won't use raw mode since osd per device is 3

I guess that your problem will disappear if osdsPerDevice is not specified or is set to 1. Please try this.

@travisn
Copy link
Member

travisn commented Dec 5, 2022

@satoru-takeuchi I'm investigating a case where an OSD cannot be created on a PVC due to the same issue where the device has mappings under /dev/mappings. But when using a PVC I don't believe we need to use those mappings since the PVC gives us the exact path. Shouldn't we remove the dev mappings handling code for OSDs on PVCs? Or why would we need it?

@satoru-takeuchi
Copy link
Member

@travisn

Shouldn't we remove the dev mappings handling code for OSDs on PVCs? Or why would we need it?

You're right. This logic is not necessary. So this patch makes sense not as a bug fix but as a cleanup. How about merging this PR anyway? In addition, I'll submit a PR to handle osdsPerDevice properly.

Copy link
Member

@travisn travisn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marking changes requested until discussion is finalized.

@@ -586,12 +579,6 @@ func (a *OsdAgent) initializeDevicesLVMMode(context *clusterd.Context, devices *

logger.Infof("configuring new LVM device %s", name)
deviceArg := path.Join("/dev", name)
// ceph-volume prefers to use /dev/mapper/<name> if the device has this kind of alias
for _, devlink := range device.PersistentDevicePaths {
if strings.HasPrefix(devlink, "/dev/mapper") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@travisn

Shouldn't we remove the dev mappings handling code for OSDs on PVCs? Or why would we need it?

You're right. This logic is not necessary. So this patch makes sense not as a bug fix but as a cleanup. How about merging this PR anyway? In addition, I'll submit a PR to handle osdsPerDevice properly.

I wonder if there is still a scenario where we need to keep the /dev/mapper detection. If the user has specified useAllDevices or similar settings, the /dev/mapper path is still preferred. But in the case where the user has specified explicit paths such as with local PVs or named devices in the cluster CR, there is no need for this search since they can be specified already as such with the /dev/mapper path. If we add this check back in for the non-pvc case here, we should add it for raw mode in the initializeDevicesRawMode() method where it is missing now. Or perhaps we keep it simple, consistent, and predictable by merging this PR without further changes.
@satoru-takeuchi @ausias-armesto thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@travisn I only focused on LV-backed PVC by mistake and now I have a concern.

This logic is not necessary for LV-backed OSD for both OSD on device
nd OSD on PVC because this configuration always uses raw mode and ceph-volume raw prepare accepts all device names. However, if a new OSD creation uses lvm mode, /dev/mapper prefix must be used because ceph-volume lvm prepare only accept the device names prefixed by /dev/mapper. This limitation exists for both OSD on device and OSD on PCV.

Please wait for a while. I'll verify my concern later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@travisn , I can only test on LV-backed PVC, and cannot provide further information about how would behave in other contexts. On the other side, changing osdsPerDevice to 1 has not changed anything, is still not working.

Thanks for following this issue

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ausias-armesto Could you show me your cluster.yaml and the log of osd prepare pod when you specified osdsPerDevice: 1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

osd-prepare-1.log
osd-prepare-2.log
osd-prepare-3.log
helm-cluster.yaml.txt

The disk partitions for servers running on osd2 and osd3 looks like this:

root@server-stage-fsn1-002 /home/debian # lsblk -f
NAME           FSTYPE            FSVER    LABEL    UUID                                   FSAVAIL FSUSE% MOUNTPOINT
sda                                                                                                      
├─sda1         linux_raid_member 1.2      rescue:0 492095b8-f9ee-e7c0-9e1e-e19193d3c00f                  
│ └─md0        ext3              1.0               ce3f530e-9eb8-4d67-ac74-44fe208f9328    877.3M     6% /boot
└─sda2         linux_raid_member 1.2      rescue:1 d9d405cd-528f-6f96-99b3-9b7fb7469523                  
  └─md1        LVM2_member       LVM2 001          ECx2sJ-9et7-Sg4F-oC59-nge2-LLDQ-3WACPR                
    ├─vg0-root ext4              1.0               0545917f-d52f-4902-81f6-88c5f02e84ba     37.9G    17% /
    ├─vg0-swap swap              1                 236c4901-9f82-4399-a8d6-b61147818169                  
    ├─vg0-tmp  reiserfs          3.6               9b4771e7-73a5-41f9-ab4f-56972f9be3e1      4.6G     8% /tmp
    └─vg0-data                                                                                           
sdb                                                                                                      
├─sdb1         linux_raid_member 1.2      rescue:0 492095b8-f9ee-e7c0-9e1e-e19193d3c00f                  
│ └─md0        ext3              1.0               ce3f530e-9eb8-4d67-ac74-44fe208f9328    877.3M     6% /boot
└─sdb2         linux_raid_member 1.2      rescue:1 d9d405cd-528f-6f96-99b3-9b7fb7469523                  
  └─md1        LVM2_member       LVM2 001          ECx2sJ-9et7-Sg4F-oC59-nge2-LLDQ-3WACPR                
    ├─vg0-root ext4              1.0               0545917f-d52f-4902-81f6-88c5f02e84ba     37.9G    17% /
    ├─vg0-swap swap              1                 236c4901-9f82-4399-a8d6-b61147818169                  
    ├─vg0-tmp  reiserfs          3.6               9b4771e7-73a5-41f9-ab4f-56972f9be3e1      4.6G     8% /tmp
    └─vg0-data                                                                                           

The server running on osd1, has 2 extra disks sdc and sdd which are correctly detected.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@travisn @ausias-armesto

This logic is not necessary for LV-backed OSD for both OSD on device and OSD on PVC because this configuration always uses raw mode and ceph-volume raw prepare accepts all device names. However, if a new OSD creation uses lvm mode, /dev/mapper prefix must be used because ceph-volume lvm prepare only accept the device names prefixed by /dev/mapper. This limitation exists for both OSD on device and OSD on PCV.

Please wait for a while. I'll verify my concern later.

My worry proved groundless because of the following reasons.

  • ceph-volume lvm batch accepts only "disk", "part", "lvm", and "loop". In other words, lvm mode is used only for these disk types.
  • Rook supports only raw mode for OSD on LV. So "/dev/mapper" prefix (will be removed by this PR) is not necessary.
  • loop devices are not provided by device mapper. So "/dev/mapper" prefix is not necessary too.

So we can merge this PR if the current conflict is fixed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ausias-armesto Could you specify LV devices explicitly by devices.name field? Rook doesn't pick up LV devices specified by filters like deviceFilter by design. As a side note, deviceClas should be deviceClass.

          nodes:
...
            - name: "server-stage-fsn1-002"
              devices:
              - name: "/dev/vg0/data"
                config:
                    osdsPerDevice: "1"
                    deviceClass: "hdd"
            - name: "server-stage-hel1-001"
               devices:
              - name: "/dev/vg0/data"
                 config:
                   osdsPerDevice: "1"
                   deviceClas: "hdd"

@travisn
Copy link
Member

travisn commented Dec 12, 2022

@ausias-armesto Can you rebase to remove the merge commit from the PR? A single commit is preferred. See these instructions.

ausias-armesto added a commit to ausias-armesto/rook that referenced this pull request Dec 14, 2022
Remove the condition that select the devices name prefixed with /dev/mapper as it is causing the ceph-volume command to fail.

Closes: rook#11301
Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
@ausias-armesto ausias-armesto force-pushed the fix/11297-create-volumes-with-mapper-alias branch from 9e5f032 to 15049c4 Compare December 14, 2022 08:08
@mergify
Copy link

mergify bot commented Dec 14, 2022

This pull request has merge conflicts that must be resolved before it can be merged. @ausias-armesto please rebase it. https://rook.io/docs/rook/latest/Contributing/development-flow/#updating-your-fork

ausias-armesto added a commit to ausias-armesto/rook that referenced this pull request Dec 14, 2022
Remove the condition that select the devices name prefixed with /dev/mapper as it is causing the ceph-volume command to fail.

Closes: rook#11301
Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
@ausias-armesto ausias-armesto force-pushed the fix/11297-create-volumes-with-mapper-alias branch from 15049c4 to bc41656 Compare December 14, 2022 08:29
@satoru-takeuchi
Copy link
Member

@ausias-armesto Commitlint claims "body must have leading blank line [body-leading-blank]", please check it.

https://github.com/rook/rook/actions/runs/3693055432/jobs/6252601418#step:4:20

Other test failures seem to be intermittent.

@travisn
Copy link
Member

travisn commented Dec 14, 2022

The CI was having issues earlier today. If you rebase when you update the commit message, the CI should pass now as well.

ausias-armesto added a commit to ausias-armesto/rook that referenced this pull request Dec 15, 2022
Remove the condition that select the devices name prefixed with /dev/mapper as it is causing the ceph-volume command to fail.

Closes: rook#11301
Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
@ausias-armesto ausias-armesto force-pushed the fix/11297-create-volumes-with-mapper-alias branch from bc41656 to 34c9513 Compare December 15, 2022 14:59
ausias-armesto added a commit to ausias-armesto/rook that referenced this pull request Dec 15, 2022
Remove the condition that select the devices name prefixed with /dev/mapper as it is causing the ceph-volume command to fail.

Closes: rook#11301
Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
@ausias-armesto ausias-armesto force-pushed the fix/11297-create-volumes-with-mapper-alias branch from 34c9513 to 42731d6 Compare December 15, 2022 15:03
It removes the condition that select the devices name prefixed with /dev/mapper as it is causing the ceph-volume command to fail.

Closes: rook#11301
Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
@ausias-armesto ausias-armesto force-pushed the fix/11297-create-volumes-with-mapper-alias branch from 42731d6 to f55c457 Compare December 15, 2022 15:05
@travisn travisn merged commit 03bfd5e into rook:master Dec 15, 2022
mergify bot pushed a commit that referenced this pull request Dec 15, 2022
It removes the condition that select the devices name prefixed with /dev/mapper as it is causing the ceph-volume command to fail.

Closes: #11301
Signed-off-by: ausias-armesto <ausiasarmesto@gmail.com>
(cherry picked from commit f55c457)
@satoru-takeuchi
Copy link
Member

Although OSD on LV doesn't support osdsPerDevice field (it's not documented yet and I'll fix it later), users can specify this field and Rook tries to osd by mistake.

Thi problem will be handled by #11438

mergify bot added a commit that referenced this pull request Dec 16, 2022
OSD prepare job cannot create vols with /dev/mapper alias on PVCs (backport #11301)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OSD prepare job cannot create volumes with alias /dev/mapper
5 participants