pacific: krbd: make sure the device node is accessible after the mapping #39642

Merged
idryomov merged 1 commit into ceph:pacific from wip-rbd-map-sanity-check-pacific on Feb 24, 2021

Conversation

idryomov
Contributor

We have always assumed this to be the case and users' scripts and
orchestration tools have grown to depend on this. Let's add some
enforcement, prompted by [1]:

"I am running my Kubernetes worker node inside of an LXC container
which doesn't benefit from the device node created by the kernel, so
I'm using udev to create the /dev/rbd* device nodes inside of the LXC
container."

which, through an unfortunate interaction with the ceph-csi rbd plugin,
results in data loss for "volumeMode: Filesystem" PVs because it ends
up recreating the filesystem every time the PV is attached to the pod:

"When deleting the pod and re-creating it, I can see that the RBD
image is indeed being reformatted. This seems to be because when
blkid is being run to check if the image is formatted, the /dev/rbd*
device has not yet been created by udev. By the time the code gets
down to running mkfs, the device is there and the damage is done."

[1] ceph/ceph-csi#1820

Fixes: https://tracker.ceph.com/issues/49410
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit f6854ac)
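
For readers following along: the race is that `rbd map` can return before udev has created the `/dev/rbd*` node, so a caller that immediately probes the device (e.g. with blkid) concludes the image is unformatted and runs mkfs. Below is a minimal sketch of the kind of post-map check this is about, assuming a simple stat-based poll; the helper name, device path, and timeout are made up for illustration and this is not the actual krbd change:

```cpp
// Illustrative only: a hypothetical helper, not the real krbd code.
#include <chrono>
#include <cstdio>
#include <string>
#include <thread>
#include <sys/stat.h>

// Poll until the device node exists and is a block device, or the
// (illustrative) timeout expires.
static bool wait_for_device_node(const std::string &devnode,
                                 std::chrono::seconds timeout)
{
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (std::chrono::steady_clock::now() < deadline) {
    struct stat st;
    if (stat(devnode.c_str(), &st) == 0 && S_ISBLK(st.st_mode)) {
      return true;  // udev (or the kernel) has made the node usable
    }
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
  }
  std::fprintf(stderr, "%s did not appear in time\n", devnode.c_str());
  return false;
}

int main()
{
  // Hypothetical usage right after a map: fail the mapping instead of
  // letting callers race against udev and mis-detect an unformatted image.
  return wait_for_device_node("/dev/rbd0", std::chrono::seconds(10)) ? 0 : 1;
}
```

The point of the enforcement is that a wait like this happens inside the mapping path itself, so users' scripts and orchestration tools can keep assuming the device node is usable as soon as the map returns.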

@idryomov added this to the pacific milestone Feb 23, 2021
@idryomov changed the title krbd: make sure the device node is accessible after the mapping pacific: krbd: make sure the device node is accessible after the mapping Feb 23, 2021

@dillaman left a comment

👍

@idryomov merged commit c356ba8 into ceph:pacific Feb 24, 2021
@idryomov deleted the wip-rbd-map-sanity-check-pacific branch February 24, 2021 13:03