
Create a new cephfs PVC from snapshot fails #5060

@sle78

Description

Describe the bug

Creating a new CephFS PVC from a snapshot fails.

Environment details

  • Image/version of Ceph CSI driver : v3.13.0
  • Helm chart version :
  • Kernel version : 5.15.0-130-generic
  • Mounter used for mounting PVC (for CephFS it's fuse or kernel; for RBD it's
    krbd or rbd-nbd) : N/A
  • Kubernetes cluster version : v1.30.8
  • Ceph cluster version :
{
    "mon": {
        "ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable)": 3
    },
    "mgr": {
        "ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable)": 2
    },
    "osd": {
        "ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable)": 34
    },
    "mds": {
        "ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable)": 2
    },
    "rgw": {
        "ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable)": 2
    },
    "overall": {
        "ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable)": 43
    }
}

rook-ceph-provisioner stack

    Image:      registry.k8s.io/sig-storage/csi-attacher:v4.6.1
    Image:      registry.k8s.io/sig-storage/csi-snapshotter:v8.2.0
    Image:      registry.k8s.io/sig-storage/csi-resizer:v1.11.1
    Image:      registry.k8s.io/sig-storage/csi-provisioner:v5.0.1
    Image:      quay.io/cephcsi/cephcsi:v3.13.0
    Image:      quay.io/cephcsi/cephcsi:v3.13.0

Steps to reproduce

Create a new cephfs PVC from snapshot
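
For example, a minimal manifest (the namespace, StorageClass, and snapshot name are taken from the describe output below; the PVC name, access mode, and size are placeholders):

    kubectl apply -n data -f - <<EOF
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: restored-from-snapshot        # placeholder name
    spec:
      storageClassName: ceph-filesystem
      dataSource:
        apiGroup: snapshot.storage.k8s.io
        kind: VolumeSnapshot
        name: data-fs-datasp58f
      accessModes:
        - ReadWriteMany                   # assumption; use the workload's mode
      resources:
        requests:
          storage: 10Gi                   # placeholder size
    EOF

The resulting PVC stays Pending: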

Name:          create-mn-k8s-snapshot-bdf4f0a5-0b7f-4572-977a-ff3ef1db402d-a969c600-4ff1-4c58-be99-f2f09de50f92-0-0-eae92200-4353-440b-8123-217991f95a56
Namespace:     data
StorageClass:  ceph-filesystem
Status:        Pending
Volume:        
Annotations:   volume.beta.kubernetes.io/storage-provisioner: storage.cephfs.csi.ceph.com
               volume.kubernetes.io/storage-provisioner: storage.cephfs.csi.ceph.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      
Access Modes:  
VolumeMode:    Filesystem
DataSource:
  APIGroup:  snapshot.storage.k8s.io
  Kind:      VolumeSnapshot
  Name:      data-fs-datasp58f
Events:
  Warning  ProvisioningFailed    103s                storage.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-6b5b5b49c5-f5t88_47b8f641-6a63-4d09-ad08-0359f1b6a9ee  failed to provision volume with StorageClass "ceph-filesystem": rpc error: code = Aborted desc = clone from snapshot is pending
  Normal   Provisioning          30s (x8 over 104s)  storage.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-6b5b5b49c5-f5t88_47b8f641-6a63-4d09-ad08-0359f1b6a9ee  External provisioner is provisioning volume for claim "data/create-mn-k8s-snapshot-bdf4f0a5-0b7f-4572-977a-ff3ef1db402d-a969c600-4ff1-4c58-be99-f2f09de50f92-0-0-eae92200-4353-440b-8123-217991f95a56"
  Warning  ProvisioningFailed    28s (x7 over 98s)   storage.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-6b5b5b49c5-f5t88_47b8f641-6a63-4d09-ad08-0359f1b6a9ee  failed to provision volume with StorageClass "ceph-filesystem": rpc error: code = Aborted desc = clone from snapshot is already in progress
  Normal   ExternalProvisioning  8s (x9 over 104s)   persistentvolume-controller                                                                                     Waiting for a volume to be created either by the external provisioner 'storage.cephfs.csi.ceph.com' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.

Logs

The following errors repeat continuously.
csi-provisioner logs:

E0108 12:51:46.920488       1 controller.go:974] error syncing claim "e753bfee-ea1c-47af-bc73-2fd2e66576fb": failed to provision volume with StorageClass "ceph-filesystem": rpc error: code = Aborted desc = clone from snapshot is already in progress
I0108 12:51:46.920511       1 event.go:389] "Event occurred" object="rubrik-kupr/create-mn-k8s-snapshot-bdf4f0a5-0b7f-4572-977a-ff3ef1db402d-a969c600-4ff1-4c58-be99-f2f09de50f92-0-0-eae92200-4353-440b-8123-217991f95a56" fieldPath="" kind="PersistentVolumeClaim" apiVersion="v1" type="Warning" reason="ProvisioningFailed" message="failed to provision volume with StorageClass \"ceph-filesystem\": rpc error: code = Aborted desc = clone from snapshot is already in progress"

csi-cephfsplugin logs:

E0108 12:30:47.848741       1 omap.go:80] ID: 77 Req-ID: 0001-0007-storage-0000000000000001-56c9d391-0f05-4465-ae5f-b73cbc5bb949 omap not found (pool="ceph-filesystem-metadata", namespace="csi", name="csi.volume.56c9d391-0f05-4465-ae5f-b73cbc5bb949"): rados: ret=-2, No such file or directory
W0108 12:30:47.848803       1 voljournal.go:737] ID: 77 Req-ID: 0001-0007-storage-0000000000000001-56c9d391-0f05-4465-ae5f-b73cbc5bb949 unable to read omap keys: pool or key missing: key not found: rados: ret=-2, No such file or directory
E0108 12:30:47.862818       1 volume.go:164] ID: 77 Req-ID: 0001-0007-storage-0000000000000001-56c9d391-0f05-4465-ae5f-b73cbc5bb949 failed to get subvolume info for the vol csi-vol-56c9d391-0f05-4465-ae5f-b73cbc5bb949: rados: ret=-2, No such file or directory: "subvolume 'csi-vol-56c9d391-0f05-4465-ae5f-b73cbc5bb949' does not exist"
E0108 12:30:47.863001       1 controllerserver.go:529] ID: 77 Req-ID: 0001-0007-storage-0000000000000001-56c9d391-0f05-4465-ae5f-b73cbc5bb949 Error returned from newVolumeOptionsFromVolID: volume not found
E0108 12:47:19.229934       1 controllerserver.go:141] ID: 78 Req-ID: pvc-e753bfee-ea1c-47af-bc73-2fd2e66576fb failed to create clone from snapshot csi-snap-6a3144fa-0148-4f0c-89bc-d0a6c256fbc5: clone from snapshot is pending
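
One way to confirm whether the journal omap object from the first error is really gone (a diagnostic sketch, run from the rook-ceph-tools pod; the pool, RADOS namespace, and object name are copied from the log above):

    # List the CSI journal objects in the metadata pool's "csi" namespace.
    rados -p ceph-filesystem-metadata --namespace csi ls | grep csi.volume

    # Dump the omap keys/values of the object the plugin failed to read.
    rados -p ceph-filesystem-metadata --namespace csi listomapvals csi.volume.56c9d391-0f05-4465-ae5f-b73cbc5bb949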

Snapshots are taken successfully; the failure occurs when the snapshot content is cloned into the new PVC.
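
The clone's state can also be checked from the Ceph side (a sketch from the rook-ceph-tools pod; "ceph-filesystem" is assumed to be the CephFS volume name, "csi" is the default subvolume group used by ceph-csi, and the csi-vol-<uuid> clone name must be substituted from the subvolume listing):

    # List CSI-managed subvolumes; the stuck clone appears as csi-vol-<uuid>.
    ceph fs subvolume ls ceph-filesystem --group_name csi

    # Report the clone's state: pending, in-progress, complete, or failed.
    ceph fs clone status ceph-filesystem csi-vol-<uuid> --group_name csi

The Aborted errors above are expected while the clone has not reached "complete"; the provisioner keeps retrying, so a clone stuck in "pending" usually points at the cloner threads on the Ceph side (e.g. the mgr's mgr/volumes/max_concurrent_clones limit) rather than at the CSI driver itself.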
