Description
Describe the bug
Mounting an encrypted RBD device fails with a "wrong fs type" error message. The issue seems to be intermittent.
The pod cannot be started, and describing the pod shows the following event:
Warning FailedMount 108s (x28 over 43m) kubelet MountVolume.MountDevice failed for volume "pvc-cc5127e5-2598-41a9-a742-f51290f28b08" : rpc error: code = Internal desc = mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t ext4 -o _netdev,defaults /dev/mapper/luks-rbd-0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b /var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com/2677416184d1804456c8cda2e754b18d3359f0d524484b48ec7f534aba3fe540/globalmount/0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com/2677416184d1804456c8cda2e754b18d3359f0d524484b48ec7f534aba3fe540/globalmount/0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b: wrong fs type, bad option, bad superblock on /dev/mapper/luks-rbd-0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b, missing codepage or helper program, or other error.
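For reference while triaging, the `exit status 32` in the event above can be decoded with the return-code table from the mount(8) man page. The snippet below is only a small illustrative helper (the names `MOUNT_EXIT_BITS` and `decode_mount_status` are made up for this sketch and are not part of ceph-csi or kubelet); it shows that 32 is the generic "mount failure" code, so it does not by itself narrow down the cause.

```python
# Return codes as documented in mount(8); the values are bit flags and can be ORed.
MOUNT_EXIT_BITS = {
    1: "incorrect invocation or permissions",
    2: "system error (out of memory, cannot fork, no more loop devices)",
    4: "internal mount bug",
    8: "user interrupt",
    16: "problems writing or locking /etc/mtab",
    32: "mount failure",
    64: "some mount succeeded",
}


def decode_mount_status(status: int) -> list[str]:
    """Return the mount(8) meaning of every bit set in an exit status."""
    if status == 0:
        return ["success"]
    return [text for bit, text in MOUNT_EXIT_BITS.items() if status & bit]


# Hypothetical usage with the status reported in the FailedMount event.
print(decode_mount_status(32))
```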
Environment details
- Image/version of Ceph CSI driver :
  repository: quay.io/cephcsi/cephcsi
  tag: v3.13.0
- Helm chart version :
  CHART: rook-ceph
  VERSION: v1.16.0
- Kernel version : 5.14.21-150500.55.83-default
- Mounter used for mounting PVC (for cephFS its fuse or kernel. for rbd its krbd or rbd-nbd) :
- Kubernetes cluster version : v1.31.1
- Ceph cluster version :
  cephVersion:
    image: quay.io/ceph/ceph:v19.2.0
Steps to reproduce
Steps to reproduce the behavior:
- Setup details: the exact steps to reproduce are not known; the issue occurs randomly. The cluster undergoes continuous reinstallation in the eric-eea-ns namespace, and in some cases a random pod cannot be started due to a PVC mount issue. We observe that after a k8s cluster re-installation the issue may occur more frequently for some days.
- Deployment to trigger the issue '....'
- See error
Actual results
PVC cannot be mounted
Expected behavior
The PVC can be created and mounted without the described issue.
Logs
In the eric-eea-ns namespace, the pod eric-eea-refdata-data-document-database-pg-1 cannot be started.
File: logs_eric-eea-ns_2025-03-14-01-12-26.tgz/describe/PODS/pods.txt
eric-eea-refdata-data-document-database-pg-1 0/3 ContainerCreating 0 43m <none> seliics07842e01 <none> <none>
When describing the pod (file: logs_eric-eea-ns_2025-03-14-01-12-26.tgz/describe/PODS/eric-eea-refdata-data-document-database-pg-1.yaml), the following is observed:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 44m default-scheduler 0/4 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling.
Normal Scheduled 44m default-scheduler Successfully assigned eric-eea-ns/eric-eea-refdata-data-document-database-pg-1 to seliics07842e01
Normal SuccessfulAttachVolume 44m attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-cc5127e5-2598-41a9-a742-f51290f28b08"
Warning FailedMount 108s (x28 over 43m) kubelet MountVolume.MountDevice failed for volume "pvc-cc5127e5-2598-41a9-a742-f51290f28b08" : rpc error: code = Internal desc = mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t ext4 -o _netdev,defaults /dev/mapper/luks-rbd-0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b /var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com/2677416184d1804456c8cda2e754b18d3359f0d524484b48ec7f534aba3fe540/globalmount/0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com/2677416184d1804456c8cda2e754b18d3359f0d524484b48ec7f534aba3fe540/globalmount/0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b: wrong fs type, bad option, bad superblock on /dev/mapper/luks-rbd-0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b, missing codepage or helper program, or other error.
Related PVC is pvc-cc5127e5-2598-41a9-a742-f51290f28b08
In the rook-ceph namespace, the following error is logged continuously:
File: logs_rook-ceph_2025-03-14-01-26-46/logs/err/csi-rbdplugin-qtscn_csi-rbdplugin.err.txt
I0314 00:28:52.535743 2028 mount_linux.go:452] `fsck` error fsck from util-linux 2.37.4
fsck: error 2 (No such file or directory) while executing fsck.ext4dev for /dev/mapper/luks-rbd-0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com/2677416184d1804456c8cda2e754b18d3359f0d524484b48ec7f534aba3fe540/globalmount/0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b: wrong fs type, bad option, bad superblock on /dev/mapper/luks-rbd-0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b, missing codepage or helper program, or other error.
E0314 00:28:52.540850 2028 nodeserver.go:842] ID: 290500 Req-ID: 0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b failed to mount device path (/dev/mapper/luks-rbd-0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b) to staging path (/var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com/2677416184d1804456c8cda2e754b18d3359f0d524484b48ec7f534aba3fe540/globalmount/0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b) for volume (0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b) error: mount failed: exit status 32
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com/2677416184d1804456c8cda2e754b18d3359f0d524484b48ec7f534aba3fe540/globalmount/0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b: wrong fs type, bad option, bad superblock on /dev/mapper/luks-rbd-0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b, missing codepage or helper program, or other error.
E0314 00:28:52.738512 2028 utils.go:271] ID: 290500 Req-ID: 0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b GRPC error: rpc error: code = Internal desc = mount failed: exit status 32
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com/2677416184d1804456c8cda2e754b18d3359f0d524484b48ec7f534aba3fe540/globalmount/0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b: wrong fs type, bad option, bad superblock on /dev/mapper/luks-rbd-0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b, missing codepage or helper program, or other error.
I0314 00:28:55.895428 2028 mount_linux.go:452] `fsck` error fsck from util-linux 2.37.4
In the /var/log/messages file, the following can be observed:
2025-03-14T01:28:52.739124+01:00 seliics07842e01 kubelet[29074]: E0314 01:28:52.739039 29074 csi_attacher.go:366] kubernetes.io/csi: attacher.MountDevice failed: rpc error: code = Internal desc = mount failed: exit status 32
2025-03-14T01:28:52.739262+01:00 seliics07842e01 kubelet[29074]: Mounting command: mount
2025-03-14T01:28:52.739321+01:00 seliics07842e01 kubelet[29074]: Mounting arguments: -t ext4 -o _netdev,defaults /dev/mapper/luks-rbd-0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b /var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com/2677416184d1804456c8cda2e754b18d3359f0d524484b48ec7f534aba3fe540/globalmount/0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b
2025-03-14T01:28:52.739366+01:00 seliics07842e01 kubelet[29074]: Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com/2677416184d1804456c8cda2e754b18d3359f0d524484b48ec7f534aba3fe540/globalmount/0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b: wrong fs type, bad option, bad superblock on /dev/mapper/luks-rbd-0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b, missing codepage or helper program, or other error.
2025-03-14T01:28:52.739420+01:00 seliics07842e01 kubelet[29074]: E0314 01:28:52.739262 29074 nestedpendingoperations.go:348] Operation for "{volumeName:kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b podName: nodeName:}" failed. No retries permitted until 2025-03-14 01:28:53.239241093 +0100 CET m=+1239617.995228529 (durationBeforeRetry 500ms). Error: MountVolume.MountDevice failed for volume "pvc-cc5127e5-2598-41a9-a742-f51290f28b08" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b") pod "eric-eea-refdata-data-document-database-pg-1" (UID: "a5774eb8-a4d2-4708-92c7-3d092b1580cb") : rpc error: code = Internal desc = mount failed: exit status 32
2025-03-14T01:28:52.739506+01:00 seliics07842e01 kubelet[29074]: Mounting command: mount
2025-03-14T01:28:52.739539+01:00 seliics07842e01 kubelet[29074]: Mounting arguments: -t ext4 -o _netdev,defaults /dev/mapper/luks-rbd-0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b /var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com/2677416184d1804456c8cda2e754b18d3359f0d524484b48ec7f534aba3fe540/globalmount/0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b
2025-03-14T01:28:52.739574+01:00 seliics07842e01 kubelet[29074]: Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com/2677416184d1804456c8cda2e754b18d3359f0d524484b48ec7f534aba3fe540/globalmount/0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b: wrong fs type, bad option, bad superblock on /dev/mapper/luks-rbd-0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b, missing codepage or helper program, or other error.
2025-03-14T01:28:52.749315+01:00 seliics07842e01 systemd[1]: cri-containerd-5f9ca82db63019c5d49dd0af0a95f5af5c4ddb47007cbddd691949f6981e7181.scope: Deactivated successfully.
2025-03-14T01:28:52.816663+01:00 seliics07842e01 systemd[1]: run-containerd-io.containerd.runtime.v2.task-k8s.io-5f9ca82db63019c5d49dd0af0a95f5af5c4ddb47007cbddd691949f6981e7181-rootfs.mount: Deactivated successfully.
2025-03-14T01:28:53.005385+01:00 seliics07842e01 systemd[1]: Started libcontainer container 3f0d37beb1e8fc21d22a93a9986ff5f6cba2204f846a74610c228a15adf461c8.
2025-03-14T01:28:53.312877+01:00 seliics07842e01 kubelet[29074]: I0314 01:28:53.312308 29074 operation_generator.go:538] "MountVolume.WaitForAttach entering for volume \"pvc-cc5127e5-2598-41a9-a742-f51290f28b08\" (UniqueName: \"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b\") pod \"eric-eea-refdata-data-document-database-pg-1\" (UID: \"a5774eb8-a4d2-4708-92c7-3d092b1580cb\") DevicePath \"\"" pod="eric-eea-ns/eric-eea-refdata-data-document-database-pg-1"
2025-03-14T01:28:53.314864+01:00 seliics07842e01 (udev-worker)[27815]: dm-23: Failed to create/update device symlink '/dev/mapper/luks-rbd-0001-0009-rook-ceph-0000000000000001-10ab64ad-094a-41f6-b65f-b9fa2b7848cb', ignoring: File exists
2025-03-14T01:28:53.315613+01:00 seliics07842e01 kubelet[29074]: I0314 01:28:53.315489 29074 operation_generator.go:548] "MountVolume.WaitForAttach succeeded for volume \"pvc-cc5127e5-2598-41a9-a742-f51290f28b08\" (UniqueName: \"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-b7c610dc-61bc-4136-b03d-ae5c86fef99b\") pod \"eric-eea-refdata-data-document-database-pg-1\" (UID: \"a5774eb8-a4d2-4708-92c7-3d092b1580cb\") DevicePath \"csi-d60a1d1fb2afebbee9fc3f3ea1f324638a6c760b88bff7ac604f4c2cb77635df\"" pod="eric-eea-ns/eric-eea-refdata-data-document-database-pg-1"
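One detail in the excerpt above may be easy to miss: the udev-worker symlink error on dm-23 names a different LUKS mapper device (…-10ab64ad-…) than the one that fails to mount (…-b7c610dc-…). The following is a rough sketch for scanning /var/log/messages for that pattern. The regular expressions and the `scan` helper are assumptions written for this report, not part of any tool, and the sample lines are abbreviated copies of the log lines above. It only lists the names side by side and does not establish whether the two events are related.

```python
import re

# Device-mapper names such as luks-rbd-0001-0009-rook-ceph-...-<uuid>.
MAPPER_RE = re.compile(r"/dev/mapper/(luks-rbd-[0-9A-Za-z-]+)")
# udev-worker lines reporting a symlink that could not be created or updated.
UDEV_RE = re.compile(
    r"\(udev-worker\)\[\d+\]: (dm-\d+): "
    r"Failed to create/update device symlink '([^']+)'"
)


def scan(lines):
    """Collect mapper names from mount attempts and from udev symlink errors."""
    mount_devices, udev_conflicts = set(), []
    for line in lines:
        udev = UDEV_RE.search(line)
        if udev:
            name = udev.group(2).rsplit("/", 1)[-1]
            udev_conflicts.append((udev.group(1), name))
        elif "Mounting arguments:" in line:
            mapper = MAPPER_RE.search(line)
            if mapper:
                mount_devices.add(mapper.group(1))
    return mount_devices, udev_conflicts


# Abbreviated sample lines taken from the excerpt above (staging path omitted).
SAMPLE = [
    "2025-03-14T01:28:52.739321+01:00 seliics07842e01 kubelet[29074]: "
    "Mounting arguments: -t ext4 -o _netdev,defaults "
    "/dev/mapper/luks-rbd-0001-0009-rook-ceph-0000000000000001-"
    "b7c610dc-61bc-4136-b03d-ae5c86fef99b",
    "2025-03-14T01:28:53.314864+01:00 seliics07842e01 (udev-worker)[27815]: "
    "dm-23: Failed to create/update device symlink "
    "'/dev/mapper/luks-rbd-0001-0009-rook-ceph-0000000000000001-"
    "10ab64ad-094a-41f6-b65f-b9fa2b7848cb', ignoring: File exists",
]

mount_devices, udev_conflicts = scan(SAMPLE)
for dm, name in udev_conflicts:
    relation = "same as" if name in mount_devices else "differs from"
    print(f"{dm}: symlink conflict for {name} ({relation} the device being mounted)")
```

To run it against the full log instead of the sample, pass an open file handle for /var/log/messages to `scan`.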
Log files are attached:
Additional context
The error messages and symptoms are the same as those reported in #3913
Info regarding the setup and attached log files:
- rook-ceph has its own dedicated namespace; every log from that namespace is collected in the attached file logs_rook-ceph_2025-03-14-01-26-46.tgz
- kube-system namespace logs are in the file logs_kube-system_2025-03-14-01-26-22.tgz
- The product under test is deployed in the eric-eea-ns namespace; its logs are collected in logs_eric-eea-ns_2025-03-14-01-12-26.tgz
- var/log/messages and dmesg logs are attached in seliics07842e01.zip.