[v3.11.0] pod with pvc failed to mount from ceph cluster. (stderr: unable to get monitor info from DNS SRV with service name: ceph-mon) #4771

thomasnew · 2024-08-16T06:59:29Z

Describe the bug

I used the example YAML files from the repository (ceph-csi-3.11.0/examples/cephfs).

Warning  FailedMount  1s  kubelet  MountVolume.MountDevice failed for volume "pvc-b9db7abb-12ee-4d36-8470-4ff6b8cef2f6" : rpc error: code = Internal desc = an error (exit status 32) occurred while running mount args: [-t ceph 10.220.8.44:6789,10.220.8.45:6789,10.220.8.46:6789,10.220.8.47:6789,10.220.8.48:6789:/volumes/csimll/csi-vol-a432fea2-3e0a-4277-afae-f1c2dc108571/a1f50d0f-2aa9-4b61-b955-34fad94165a2 /var/lib/kubelet/plugins/kubernetes.io/csi/cephfs.csi.ceph.com/eca464ba90308fdd53f2643a1b3b08b3c885b134ecca598a8df1a0e8f6f6a40b/globalmount -o name=iscse-cephfs-hdd,secretfile=/tmp/csi/keys/keyfile-894763626,mds_namespace=iscse-cephfs-hdd,discard,_netdev] stderr: unable to get monitor info from DNS SRV with service name: ceph-mon
2024-08-16T06:43:12.748+0000 7fb4f2ed8140 -1 failed for service _ceph-mon._tcp
mount error 22 = Invalid argument

Environment details

- Image/version of Ceph CSI driver : 3.11
- Helm chart version : 3.11
- Kernel version : 5.15.0-118-generic, ubuntu 22.04
- Mounter used for mounting PVC (for CephFS it's `fuse` or `kernel`; for RBD it's
  `krbd` or `rbd-nbd`) : cephfs csi
- Kubernetes cluster version : v1.28.12
- Ceph cluster version :  quincy (stable)

Steps to reproduce

Steps to reproduce the behavior:

1. Set up the CephFS CSI driver with Helm 3.
2. Create a PVC using ./ceph-csi-3.11.0/examples/cephfs/pvc.yaml; the PVC binds successfully.
3. Create a pod using ./ceph-csi-3.11.0/examples/cephfs/pod.yaml; the pod never reaches Running and stays in "ContainerCreating".

Logs

Warning  FailedMount  1s  kubelet  MountVolume.MountDevice failed for volume "pvc-b9db7abb-12ee-4d36-8470-4ff6b8cef2f6" : rpc error: code = Internal desc = an error (exit status 32) occurred while running mount args: [-t ceph 10.220.8.44:6789,10.220.8.45:6789,10.220.8.46:6789,10.220.8.47:6789,10.220.8.48:6789:/volumes/csimll/csi-vol-a432fea2-3e0a-4277-afae-f1c2dc108571/a1f50d0f-2aa9-4b61-b955-34fad94165a2 /var/lib/kubelet/plugins/kubernetes.io/csi/cephfs.csi.ceph.com/eca464ba90308fdd53f2643a1b3b08b3c885b134ecca598a8df1a0e8f6f6a40b/globalmount -o name=iscse-cephfs-hdd,secretfile=/tmp/csi/keys/keyfile-894763626,mds_namespace=iscse-cephfs-hdd,discard,_netdev] stderr: unable to get monitor info from DNS SRV with service name: ceph-mon
2024-08-16T06:43:12.748+0000 7fb4f2ed8140 -1 failed for service _ceph-mon._tcp
mount error 22 = Invalid argument


Madhu-1 · 2024-08-16T07:02:10Z

`discard`

This looks to be the problem. Can you remove `discard` from the StorageClass and try the steps again?

thomasnew · 2024-08-16T07:09:11Z

Great, thanks.
I deleted `discard` from the `mountOptions`.
The pod is now running.
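
For anyone hitting the same error, here is a minimal sketch of what a CephFS StorageClass might look like with `discard` taken out. Only the provisioner name and `fsName` come from the log in this issue; the StorageClass name, `clusterID`, and the secret names and namespace are placeholders, so treat this as an illustration rather than the exact manifest used here.

```yaml
# Sketch only: metadata.name, clusterID and the secret references are
# placeholders, not values taken from this issue.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-cephfs-sc                  # placeholder name
provisioner: cephfs.csi.ceph.com       # driver name seen in the kubelet plugin path
parameters:
  clusterID: <cluster-id>              # placeholder
  fsName: iscse-cephfs-hdd             # matches mds_namespace in the mount args
  csi.storage.k8s.io/provisioner-secret-name: <secret-name>        # placeholder
  csi.storage.k8s.io/provisioner-secret-namespace: <namespace>     # placeholder
  csi.storage.k8s.io/node-stage-secret-name: <secret-name>         # placeholder
  csi.storage.k8s.io/node-stage-secret-namespace: <namespace>      # placeholder
reclaimPolicy: Delete
allowVolumeExpansion: true
# No mountOptions entry: the "discard" option that appeared in the failing
# mount arguments has been removed.
```

Dynamically provisioned PVs take their mount options from the StorageClass at creation time, so a PVC created before the change will likely need to be deleted and recreated for the new options to apply.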

Madhu-1 closed this as completed Aug 16, 2024
