Environment details
Image/version of Ceph CSI driver : all
Helm chart version : all
Kernel version : all
Mounter used for mounting PVC (for cephFS its fuse or kernel. for rbd its krbd or rbd-nbd) : rbd
Kubernetes cluster version : all
Ceph cluster version : all
Steps to reproduce
Steps to reproduce the behavior:
1. Create an RBD PVC.
2. Repeat the steps below more than 450 times:
   - Create Snapshot.
   - Restore Snapshot into PVC.
   - Delete Snapshot.
3. No more snapshots above 450 can be created.
Actual results
The final snapshot creation fails.
Expected behavior
Snapshots should be created without error.
Proposed solution 1:
During k8s Snapshot deletion:
1. Check the snapshot count on the parent RBD image (underlying the parent PVC). [Maybe skip this step and always flatten?]
2. Add a task to flatten the RBD image underlying the k8s snapshot itself.
3. Move the k8s snapshot's RBD image to trash and add a task to remove it [current process].
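The ordering in the proposal above can be sketched in Go as follows. This is only an illustration under stated assumptions: `imageOps`, `deleteSnapshotImage`, and `recorder` are hypothetical names that do not exist in CephCSI or go-ceph, and the sketch takes the "always flatten" variant, skipping the snapshot-count check.

```go
package main

import "fmt"

// imageOps is a hypothetical interface standing in for the RBD operations
// named in the proposal; it does not mirror any real CephCSI or go-ceph API.
type imageOps interface {
	AddFlattenTask(image string) error
	MoveToTrash(image string) error
	AddRemoveTask(image string) error
}

// deleteSnapshotImage applies the proposed order: schedule the flatten first,
// then move the image to trash and schedule its removal (the current process).
func deleteSnapshotImage(ops imageOps, image string) error {
	if err := ops.AddFlattenTask(image); err != nil {
		return fmt.Errorf("flatten task for %s: %w", image, err)
	}
	if err := ops.MoveToTrash(image); err != nil {
		return fmt.Errorf("move %s to trash: %w", image, err)
	}
	if err := ops.AddRemoveTask(image); err != nil {
		return fmt.Errorf("remove task for %s: %w", image, err)
	}
	return nil
}

// recorder is a fake imageOps that only records the order of calls.
type recorder struct{ calls []string }

func (r *recorder) AddFlattenTask(string) error { r.calls = append(r.calls, "flatten"); return nil }
func (r *recorder) MoveToTrash(string) error    { r.calls = append(r.calls, "trash"); return nil }
func (r *recorder) AddRemoveTask(string) error  { r.calls = append(r.calls, "remove"); return nil }

func main() {
	r := &recorder{}
	// "csi-snap-example" is a made-up image name.
	if err := deleteSnapshotImage(r, "csi-snap-example"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(r.calls)
}
```

Scheduling the flatten before the image goes to trash means the flatten request is never issued against a trashed image, which is the case that fails today.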
Rakshith-R changed the title from "RBD: flattening when max Snapshots limit is reached fails" to "RBD: Snapshots >450 cannot be taken on a PVC in certain cases" on Aug 22, 2023
Describe the bug
CephCSI follows the below logic before a k8s snapshot or pvc-pvc clone is created.
ceph-csi/internal/rbd/controllerserver.go
Lines 532 to 539 in a57fe08
But if the cloned image[1] is in trash[2], CephCSI still tries to add a task to flatten that cloned image (without checking whether it is in trash), and the task fails.
The failure is only logged. Once the snapshot count reaches 450+, CephCSI cannot create any more snapshots or pvc-pvc clones.
[1] cloned image: the image underlying a k8s snapshot or pvc-pvc clone.
[2] image in trash: this occurs when the image still has live child images of its own.
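The missing check described above can be illustrated with a small Go sketch. The `image` type and `flattenCandidates` function are hypothetical and are not taken from the CephCSI code base; the sketch only shows clones being split into those that can receive a flatten task and those already in trash, so the trashed ones are reported rather than failing with nothing but a log line.

```go
package main

import "fmt"

// image is a hypothetical, simplified view of an RBD clone; it is not a CephCSI type.
type image struct {
	Name    string
	InTrash bool
}

// flattenCandidates splits clones into those that can be given a flatten task
// and those that are in trash (where, per this issue, adding the task fails).
func flattenCandidates(clones []image) (flattenable, trashed []image) {
	for _, c := range clones {
		if c.InTrash {
			trashed = append(trashed, c)
			continue
		}
		flattenable = append(flattenable, c)
	}
	return flattenable, trashed
}

func main() {
	// Made-up image names for illustration only.
	clones := []image{
		{Name: "csi-snap-a", InTrash: true},
		{Name: "csi-vol-b", InTrash: false},
	}
	ok, skipped := flattenCandidates(clones)
	fmt.Printf("flatten: %d, in trash: %d\n", len(ok), len(skipped))
}
```

Separating the trashed clones does not resolve the limit by itself; it only makes the condition visible to the caller, which could then return an error or handle those images another way.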
@idryomov @Madhu-1 @nixpanic Can you please provide your opinions/suggestions regarding this?