
rbd: use DeepCopy() to create thick-provisioned volumes from a snapshot #2184

Merged
merged 3 commits into from Jun 23, 2021

Conversation

nixpanic
Member

@nixpanic nixpanic commented Jun 17, 2021

Describe what this PR does

While restoring a volume from a snapshot, the new volume should be thick-provisioned in case the parent volume was thick-provisioned too. This was not the case, as cloning from snapshots always makes volumes thin-provisioned.

Is there anything that requires special attention

Writing an e2e test case for this is not trivial. The existing validation functions used for the snapshot test cases are not small/modular enough. Hopefully #2178 can bring improvements there.

Related issues

Fixes: #2181
Depends-on: #2134 ([DNM] until this is merged)


Show available bot commands

These commands are normally not required, but in case of issues, leave any of
the following bot commands in an otherwise empty comment in this PR:

  • /retest ci/centos/<job-name>: retest the <job-name> after unrelated
    failure (please report the failure too!)
  • /retest all: run this in case the CentOS CI failed to start/report any test
    progress or results

@nixpanic nixpanic added the bug (Something isn't working), DNM (DO NOT MERGE) and backport-to-release-v3.3 labels Jun 17, 2021
@mergify mergify bot added the component/rbd Issues related to RBD label Jun 17, 2021
@nixpanic nixpanic force-pushed the rbd/thick-provisioning/snapshot branch from c15a6de to d843ff1 Compare June 17, 2021 18:07
if rbdVol.ThickProvision {
	err = parentVol.DeepCopy(rbdVol)
	if err != nil {
		return status.Errorf(codes.Internal, "failed to deep copy %q into %q: %w", parentVol, rbdVol, err)
Contributor

Suggested change
- return status.Errorf(codes.Internal, "failed to deep copy %q into %q: %w", parentVol, rbdVol, err)
+ return status.Errorf(codes.Internal, "failed to deep copy %q into %q: %v", parentVol, rbdVol, err)

Using %w inside status.Errorf triggers the govet linter error internal/rbd/controllerserver.go:499:11: printf: Errorf call has error-wrapping directive %w, because status.Errorf internally calls fmt.Sprintf()
https://github.com/grpc/grpc-go/blob/0257c8657362b76f24e7a8cfb61df48d4cb735d3/status/status.go#L62
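For reference, a minimal hypothetical sketch (not the actual ceph-csi code) of the pattern the suggestion arrives at: status.Errorf formats its arguments with fmt.Sprintf, which does not understand the %w error-wrapping verb, so %v is used to embed the error text in the gRPC status message.

package rbd

import (
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// restoreError is a hypothetical helper: %v embeds the text of the underlying
// error, while %w stays reserved for fmt.Errorf when callers need
// errors.Is/errors.As to inspect the wrapped error.
func restoreError(parent, vol string, err error) error {
	return status.Errorf(codes.Internal, "failed to deep copy %q into %q: %v", parent, vol, err)
}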

Contributor

It's also the reason for the go-test failure. @nixpanic ^^

Member Author

Thanks! I'll test it out and verify that make containerized-test passes.

Collaborator

Yes, it's true, we don't need to wrap it.

@nixpanic nixpanic force-pushed the rbd/thick-provisioning/snapshot branch from d843ff1 to bbc67ec Compare June 18, 2021 08:14
@humblec
Collaborator

humblec commented Jun 18, 2021

@nixpanic I see the DNM here, are you still testing it?

@nixpanic
Member Author

@nixpanic I see the DNM here, are you still testing it?

I think it still needs a way to restart in case the provisioner aborted the DeepCopy(), just like it is done for the PVC cloning situation.

@nixpanic nixpanic removed the DNM DO NOT MERGE label Jun 18, 2021
@nixpanic
Member Author

Manually tested with container image quay.io/nixpanic/cephcsi:testing_rbd_thick-provisioning_snapshot. Deleting the provisioner pod while restoring is in progress works, and I think there are no unused objects left behind.

The error messages are quite nice in the describe PVC output:

$ kubectl describe pvc/rbd-pvc-restore
Name:          rbd-pvc-restore
Namespace:     default
StorageClass:  ocs-storagecluster-ceph-rbd-thick
Status:        Bound
Volume:        pvc-b7c81ecf-9662-4b26-8b1d-bf7445597a97
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: openshift-storage.rbd.csi.ceph.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      10Gi
Access Modes:  RWO
VolumeMode:    Filesystem
DataSource:
  APIGroup:  snapshot.storage.k8s.io
  Kind:      VolumeSnapshot
  Name:      rbd-pvc-snapshot
Used By:     <none>
Events:
  Type     Reason                 Age                    From                                                                                                                Message
  ----     ------                 ----                   ----                                                                                                                -------
  Normal   Provisioning           8m37s                  openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-59bc76887b-fjwxb_cdceab9e-fc99-4647-b38a-1951032b5618  External provisioner is provisioning volume for claim "default/rbd-pvc-restore"
  Warning  ProvisioningFailed     8m5s                   openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-59bc76887b-2jr5c_b1768ebe-cb26-463d-a347-a5df1b7862ca  failed to provision volume with StorageClass "ocs-storagecluster-ceph-rbd-thick": rpc error: code = Aborted desc = restoring thick-provisioned volume "ocs-storagecluster-cephblockpool/csi-vol-0f807e84-d01c-11eb-9320-0a580a810215" has been interrupted, please retry
  Normal   Provisioning           8m4s (x2 over 8m5s)    openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-59bc76887b-2jr5c_b1768ebe-cb26-463d-a347-a5df1b7862ca  External provisioner is provisioning volume for claim "default/rbd-pvc-restore"
  Normal   ExternalProvisioning   7m13s (x8 over 8m37s)  persistentvolume-controller                                                                                         waiting for a volume to be created, either by external provisioner "openshift-storage.rbd.csi.ceph.com" or manually created by system administrator
  Normal   ProvisioningSucceeded  7m1s                   openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-59bc76887b-2jr5c_b1768ebe-cb26-463d-a347-a5df1b7862ca  Successfully provisioned volume pvc-b7c81ecf-9662-4b26-8b1d-bf7445597a97

@Madhu-1
Collaborator

Madhu-1 commented Jun 18, 2021

Looks like something is wrong. I see the PVC and PVC-restore in Bound state, but I don't see any backend volume associated with them.

NAME              STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
rbd-pvc           Bound    pvc-6a783b7c-1127-4f2b-bfba-dac91897b58f   1Gi        RWO            rook-ceph-block   6h36m
rbd-pvc-restore   Bound    pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253   1Gi        RWO            rook-ceph-block   45s
sh-4.4# rbd ls --pool=replicapool
csi-snap-9e764bb7-d01e-11eb-996f-0242ac110006
csi-vol-5b72e226-cfe7-11eb-b972-0242ac110005
kubectl logs po/csi-rbdplugin-provisioner-5bd88676fb-qbc42 -nrook-ceph -c csi-rbdplugin|grep -i "pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253"
I0618 10:19:24.800093       1 utils.go:163] ID: 20 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 GRPC call: /csi.v1.Controller/CreateVolume
I0618 10:19:24.804569       1 utils.go:167] ID: 20 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 GRPC request: {"capacity_range":{"required_bytes":1073741824},"name":"pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253","parameters":{"clusterID":"rook-ceph","csi.storage.k8s.io/pv/name":"pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253","csi.storage.k8s.io/pvc/name":"rbd-pvc-restore","csi.storage.k8s.io/pvc/namespace":"default","imageFeatures":"layering","imageFormat":"2","pool":"replicapool","thickProvision":"true"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}],"volume_content_source":{"Type":{"Snapshot":{"snapshot_id":"0001-0009-rook-ceph-0000000000000001-9e764bb7-d01e-11eb-996f-0242ac110006"}}}}
I0618 10:19:24.810686       1 rbd_util.go:997] ID: 20 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 setting disableInUseChecks: false image features: [layering] mounter: rbd
I0618 10:19:24.815263       1 omap.go:84] ID: 20 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 got omap values: (pool="replicapool", namespace="", name="csi.snap.9e764bb7-d01e-11eb-996f-0242ac110006"): map[csi.imageid:2ab8f2525ef csi.imagename:csi-snap-9e764bb7-d01e-11eb-996f-0242ac110006 csi.snapname:snapshot-64470390-2f1e-4605-a431-11e59d2b9e8b csi.source:csi-vol-5b72e226-cfe7-11eb-b972-0242ac110005 csi.volume.owner:default]
I0618 10:19:24.819565       1 omap.go:84] ID: 20 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 got omap values: (pool="replicapool", namespace="", name="csi.volumes.default"): map[]
I0618 10:19:24.857829       1 omap.go:148] ID: 20 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 set omap keys (pool="replicapool", namespace="", name="csi.volumes.default"): map[csi.volume.pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253:a7d853ff-d01e-11eb-996f-0242ac110006])
I0618 10:19:24.864206       1 omap.go:148] ID: 20 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 set omap keys (pool="replicapool", namespace="", name="csi.volume.a7d853ff-d01e-11eb-996f-0242ac110006"): map[csi.imagename:csi-vol-a7d853ff-d01e-11eb-996f-0242ac110006 csi.volname:pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 csi.volume.owner:default])
I0618 10:19:24.864257       1 rbd_journal.go:472] ID: 20 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 generated Volume ID (0001-0009-rook-ceph-0000000000000001-a7d853ff-d01e-11eb-996f-0242ac110006) and image name (csi-vol-a7d853ff-d01e-11eb-996f-0242ac110006) for request name (pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253)
I0618 10:19:24.868861       1 omap.go:84] ID: 20 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 got omap values: (pool="replicapool", namespace="", name="csi.snap.9e764bb7-d01e-11eb-996f-0242ac110006"): map[csi.imageid:2ab8f2525ef csi.imagename:csi-snap-9e764bb7-d01e-11eb-996f-0242ac110006 csi.snapname:snapshot-64470390-2f1e-4605-a431-11e59d2b9e8b csi.source:csi-vol-5b72e226-cfe7-11eb-b972-0242ac110005 csi.volume.owner:default]
I0618 10:19:40.523617       1 controllerserver.go:531] ID: 20 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 create volume pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 from snapshot csi-snap-9e764bb7-d01e-11eb-996f-0242ac110006
I0618 10:19:40.523664       1 controllerserver.go:557] ID: 20 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 created volume pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 from snapshot csi-snap-9e764bb7-d01e-11eb-996f-0242ac110006
I0618 10:19:40.523672       1 controllerserver.go:573] ID: 20 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 created volume pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 backed by image csi-vol-a7d853ff-d01e-11eb-996f-0242ac110006
I0618 10:19:40.541586       1 omap.go:148] ID: 20 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 set omap keys (pool="replicapool", namespace="", name="csi.volume.a7d853ff-d01e-11eb-996f-0242ac110006"): map[csi.imageid:2ab856634e09])
I0618 10:19:40.558507       1 rbd_util.go:631] ID: 20 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 clone depth is (0), configured softlimit (4) and hardlimit (8) for replicapool/csi-vol-a7d853ff-d01e-11eb-996f-0242ac110006
I0618 10:19:40.560841       1 utils.go:174] ID: 20 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 GRPC response: {"volume":{"capacity_bytes":1073741824,"content_source":{"Type":{"Snapshot":{"snapshot_id":"0001-0009-rook-ceph-0000000000000001-9e764bb7-d01e-11eb-996f-0242ac110006"}}},"volume_context":{"clusterID":"rook-ceph","csi.storage.k8s.io/pv/name":"pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253","csi.storage.k8s.io/pvc/name":"rbd-pvc-restore","csi.storage.k8s.io/pvc/namespace":"default","imageFeatures":"layering","imageFormat":"2","imageName":"csi-vol-a7d853ff-d01e-11eb-996f-0242ac110006","journalPool":"replicapool","pool":"replicapool","thickProvision":"true"},"volume_id":"0001-0009-rook-ceph-0000000000000001-a7d853ff-d01e-11eb-996f-0242ac110006"}}
I0618 10:19:40.788221       1 utils.go:163] ID: 22 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 GRPC call: /csi.v1.Controller/CreateVolume
I0618 10:19:40.789536       1 utils.go:167] ID: 22 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 GRPC request: {"capacity_range":{"required_bytes":1073741824},"name":"pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253","parameters":{"clusterID":"rook-ceph","csi.storage.k8s.io/pv/name":"pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253","csi.storage.k8s.io/pvc/name":"rbd-pvc-restore","csi.storage.k8s.io/pvc/namespace":"default","imageFeatures":"layering","imageFormat":"2","pool":"replicapool","thickProvision":"true"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}],"volume_content_source":{"Type":{"Snapshot":{"snapshot_id":"0001-0009-rook-ceph-0000000000000001-9e764bb7-d01e-11eb-996f-0242ac110006"}}}}
I0618 10:19:40.789862       1 rbd_util.go:997] ID: 22 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 setting disableInUseChecks: false image features: [layering] mounter: rbd
I0618 10:19:40.809644       1 omap.go:84] ID: 22 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 got omap values: (pool="replicapool", namespace="", name="csi.snap.9e764bb7-d01e-11eb-996f-0242ac110006"): map[csi.imageid:2ab8f2525ef csi.imagename:csi-snap-9e764bb7-d01e-11eb-996f-0242ac110006 csi.snapname:snapshot-64470390-2f1e-4605-a431-11e59d2b9e8b csi.source:csi-vol-5b72e226-cfe7-11eb-b972-0242ac110005 csi.volume.owner:default]
I0618 10:19:40.816018       1 omap.go:84] ID: 22 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 got omap values: (pool="replicapool", namespace="", name="csi.volumes.default"): map[csi.volume.pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253:a7d853ff-d01e-11eb-996f-0242ac110006]
I0618 10:19:40.826808       1 omap.go:84] ID: 22 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 got omap values: (pool="replicapool", namespace="", name="csi.volume.a7d853ff-d01e-11eb-996f-0242ac110006"): map[csi.imageid:2ab856634e09 csi.imagename:csi-vol-a7d853ff-d01e-11eb-996f-0242ac110006 csi.volname:pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 csi.volume.owner:default]
I0618 10:19:40.871054       1 rbd_journal.go:330] ID: 22 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 found existing volume (0001-0009-rook-ceph-0000000000000001-a7d853ff-d01e-11eb-996f-0242ac110006) with image name (csi-vol-a7d853ff-d01e-11eb-996f-0242ac110006) for request (pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253)
I0618 10:19:40.901254       1 rbd_util.go:484] ID: 22 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 rbd: delete csi-vol-a7d853ff-d01e-11eb-996f-0242ac110006 using mon 192.168.50.112:6789, pool replicapool
I0618 10:19:40.946496       1 rbd_util.go:457] ID: 22 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 executing [rbd task add trash remove replicapool/2ab856634e09 --id csi-rbd-provisioner --keyfile=/tmp/csi/keys/keyfile-642470564 -m 192.168.50.112:6789] for image (csi-vol-a7d853ff-d01e-11eb-996f-0242ac110006) using mon 192.168.50.112:6789, pool replicapool
I0618 10:19:41.588420       1 cephcmds.go:59] ID: 22 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 command succeeded: ceph [rbd task add trash remove replicapool/2ab856634e09 --id csi-rbd-provisioner --keyfile=***stripped*** -m 192.168.50.112:6789]
I0618 10:19:41.600398       1 omap.go:118] ID: 22 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 removed omap keys (pool="replicapool", namespace="", name="csi.volumes.default"): [csi.volume.pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253]
E0618 10:19:41.600490       1 utils.go:172] ID: 22 Req-ID: pvc-ac3e8cbd-ed51-472e-a3fe-bbfbf5346253 GRPC error: rpc error: code = Aborted desc = restoring thick-provisioned volume "replicapool/csi-vol-a7d853ff-d01e-11eb-996f-0242ac110006" has been interrupted, please retry
  • PVC Delete Request
kubectl delete -f pvc-restore.yaml 
persistentvolumeclaim "rbd-pvc-restore" deleted
I0618 10:25:30.332917       1 utils.go:163] ID: 28 Req-ID: 0001-0009-rook-ceph-0000000000000001-a7d853ff-d01e-11eb-996f-0242ac110006 GRPC call: /csi.v1.Controller/DeleteVolume
I0618 10:25:30.336803       1 utils.go:167] ID: 28 Req-ID: 0001-0009-rook-ceph-0000000000000001-a7d853ff-d01e-11eb-996f-0242ac110006 GRPC request: {"secrets":"***stripped***","volume_id":"0001-0009-rook-ceph-0000000000000001-a7d853ff-d01e-11eb-996f-0242ac110006"}
E0618 10:25:30.346446       1 omap.go:77] ID: 28 Req-ID: 0001-0009-rook-ceph-0000000000000001-a7d853ff-d01e-11eb-996f-0242ac110006 omap not found (pool="replicapool", namespace="", name="csi.volume.a7d853ff-d01e-11eb-996f-0242ac110006"): rados: ret=-2, No such file or directory
W0618 10:25:30.348455       1 voljournal.go:649] ID: 28 Req-ID: 0001-0009-rook-ceph-0000000000000001-a7d853ff-d01e-11eb-996f-0242ac110006 unable to read omap keys: pool or key missing: key not found: rados: ret=-2, No such file or directory
E0618 10:25:30.369835       1 rbd_journal.go:635] ID: 28 Req-ID: 0001-0009-rook-ceph-0000000000000001-a7d853ff-d01e-11eb-996f-0242ac110006 failed to get image id replicapool/csi-vol-a7d853ff-d01e-11eb-996f-0242ac110006: image not found: RBD image not found
I0618 10:25:30.432900       1 omap.go:118] ID: 28 Req-ID: 0001-0009-rook-ceph-0000000000000001-a7d853ff-d01e-11eb-996f-0242ac110006 removed omap keys (pool="replicapool", namespace="", name="csi.volumes.default"): [csi.volume.]
I0618 10:25:30.433643       1 utils.go:174] ID: 28 Req-ID: 0001-0009-rook-ceph-0000000000000001-a7d853ff-d01e-11eb-996f-0242ac110006 GRPC response: {}

@Madhu-1
Collaborator

Madhu-1 commented Jun 18, 2021

This is also leaving stale images in trash

kubectl get pvc,volumesnapshot
No resources found in default namespace.
sh-4.4# rbd ls --pool=replicapool
sh-4.4# rbd trash ls --pool=replicapool
2ab89b9a2db6 csi-vol-ef56157a-d020-11eb-996f-0242ac110006
sh-4.4# rbd trash rm 2ab89b9a2db6 --pool=replicapool
rbd: image has snapshots - these must be deleted with 'rbd snap purge' before the image can be removed.
Removing image: 0% complete...failed.
sh-4.4# rbd trash restore 2ab89b9a2db6 --pool=replicapool
sh-4.4# rbd ls --pool=replicapool
csi-vol-ef56157a-d020-11eb-996f-0242ac110006
sh-4.4# rbd info csi-vol-ef56157a-d020-11eb-996f-0242ac110006 --pool=replicapool
rbd image 'csi-vol-ef56157a-d020-11eb-996f-0242ac110006':
	size 1 GiB in 256 objects
	order 22 (4 MiB objects)
	snapshot_count: 1
	id: 2ab89b9a2db6
	block_name_prefix: rbd_data.2ab89b9a2db6
	format: 2
	features: layering
	op_features: 
	flags: 
	create_timestamp: Fri Jun 18 10:35:43 2021
	access_timestamp: Fri Jun 18 10:35:43 2021
	modify_timestamp: Fri Jun 18 10:35:43 2021
sh-4.4# rbd snap ls csi-vol-ef56157a-d020-11eb-996f-0242ac110006 --pool=replicapool
SNAPID  NAME                                           SIZE   PROTECTED  TIMESTAMP               
    28  csi-snap-ea77c5e2-d020-11eb-996f-0242ac110006  1 GiB             Fri Jun 18 10:35:44 2021
sh-4.4# rbd snap purge csi-vol-ef56157a-d020-11eb-996f-0242ac110006 --pool=replicapool
Removing all snapshots: 100% complete...done.
sh-4.4# rbd trash mv csi-vol-ef56157a-d020-11eb-996f-0242ac110006 --pool=replicapool
sh-4.4# rbd trash ls --pool=replicapool
2ab89b9a2db6 csi-vol-ef56157a-d020-11eb-996f-0242ac110006

sh-4.4# ceph rbd task add trash remove replicapool/2ab89b9a2db6                   
{"sequence": 14, "id": "f2b4acdd-09be-4d8e-8fec-8b07d7b09ec8", "message": "Removing image replicapool/2ab89b9a2db6 from trash", "refs": {"action": "trash remove", "pool_name": "replicapool", "pool_namespace": "", "image_id": "2ab89b9a2db6"}}
sh-4.4# rbd trash ls  --pool=replicapool

@Madhu-1
Collaborator

Madhu-1 commented Jun 18, 2021

rbd deep cp copies the snapshot present on the clone.

@humblec
Collaborator

humblec commented Jun 18, 2021

rbd deep cp copies the snapshot present on the clone.

It does, this is documented iic.
If we (deep) flatten and then deep copy, shouldn't it remove the snaps? 😕 ❓

@Madhu-1
Collaborator

Madhu-1 commented Jun 18, 2021

rbd deep cp copies the snapshot present on the clone.

To fix this, @nixpanic: during the CreateSnapshot operation, if it's a thick PVC, don't create a final snapshot on the clone, and do a deep copy during the CreateVolume operation?

@Madhu-1
Collaborator

Madhu-1 commented Jun 18, 2021

rbd deep cp copies the snapshot present on the clone.

It does, this is documented iic.
If we (deep) flatten and then deep copy, shouldn't it remove the snaps?

AFAIK this is not the way it works; we need to remove the snapshots.

@humblec
Collaborator

humblec commented Jun 18, 2021

rbd deep cp copies the snapshot present on the clone.

It does, this is documented iic.
If we (deep) flatten and then deep copy, shouldn't it remove the snaps?

AFAIK this is not the way it works; we need to remove the snapshots.

Fine, then we have to find a different method here to remove the snap references:

Btw, here is what the documentation of deep cp says:

deep cp (src-image-spec | src-snap-spec) dest-image-spec

    Deep copy the content of a src-image into the newly created dest-image. Dest-image will have the same size, object size, image format, and snapshots as src-image.

@nixpanic
Member Author

Looks like something is wrong. I see the PVC and PVC-restore in Bound state, but I don't see any backend volume associated with them.

That is weird. I do have one? Starting a pod that consumes the rbd-pvc-restore works for me... The related image is linked in the PV; I didn't check the logs for the name of the PV and its image, just the details of the PV:

$ oc create -f examples/rbd/pod-restore.yaml 
pod/csi-rbd-restore-demo-pod created

$ oc get pod
NAME                       READY   STATUS              RESTARTS   AGE
csi-rbd-restore-demo-pod   0/1     ContainerCreating   0          6s

$ oc get pod -w
NAME                       READY   STATUS              RESTARTS   AGE
csi-rbd-restore-demo-pod   0/1     ContainerCreating   0          11s
csi-rbd-restore-demo-pod   1/1     Running             0          15s

$ oc get pvc
NAME              STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                        AGE
rbd-pvc           Bound    pvc-e30e25f6-6cb1-4e5f-8ff8-0c18495286e1   10Gi       RWO            ocs-storagecluster-ceph-rbd-thick   76m
rbd-pvc-restore   Bound    pvc-b7c81ecf-9662-4b26-8b1d-bf7445597a97   10Gi       RWO            ocs-storagecluster-ceph-rbd-thick   64m

$ oc get pv/pvc-b7c81ecf-9662-4b26-8b1d-bf7445597a97 -oyaml | grep csi-vol
      imageName: csi-vol-23381a66-d01c-11eb-b3d4-0a580a830046

And in the toolbox pod:

sh-4.4# rbd du ocs-storagecluster-cephblockpool/csi-vol-5d3285f4-d01a-11eb-9320-0a580a810215
warning: fast-diff map is not enabled for csi-vol-5d3285f4-d01a-11eb-9320-0a580a810215. operation may be slow.
NAME                                         PROVISIONED USED   
csi-vol-5d3285f4-d01a-11eb-9320-0a580a810215      10 GiB 10 GiB 

@Madhu-1, can you tell me exactly what steps you did?

@Madhu-1
Collaborator

Madhu-1 commented Jun 18, 2021

Looks like something is wrong. I see the PVC and PVC-restore in Bound state, but I don't see any backend volume associated with them.

That is weird. I do have one? [...] @Madhu-1, can you tell me exactly what steps you did?

It happened only one time. I created the PVC, a snapshot, and a new PVC from the snapshot; I have attached the logs. If you want I can share more logs, I still have the setup, but the Ceph backend is not in the same state anymore as I deleted the PVC, snapshot and PVC-restore.

@nixpanic
Member Author

rbd deep cp copies the snapshot present on the clone.

To fix this, @nixpanic: during the CreateSnapshot operation, if it's a thick PVC, don't create a final snapshot on the clone, and do a deep copy during the CreateVolume operation?

Right, a thick-provisioned VolumeSnapshot does not need an RBD-snapshot in the end. That should make it possible to delete the restored images.

@humblec
Collaborator

humblec commented Jun 18, 2021

rbd deep cp copies the snapshot present on the clone.

To fix this, @nixpanic: during the CreateSnapshot operation, if it's a thick PVC, don't create a final snapshot on the clone, and do a deep copy during the CreateVolume operation?

Right, a thick-provisioned VolumeSnapshot does not need an RBD-snapshot in the end. That should make it possible to delete the restored images.

Please update the documentation of the process we are following here for the thick and other cases at the time of the snapshot and clone operations. It helps to get it evaluated later by the Ceph team, or to refer back to when we revisit these code paths.

@nixpanic nixpanic force-pushed the rbd/thick-provisioning/snapshot branch from 2be8a37 to d7e910b Compare June 21, 2021 12:32
@nixpanic
Member Author

nixpanic commented Jun 21, 2021

Now addressed the following points:

  • skipping the creation of an RBD snapshot on the RBD image used as VolumeContentSource
  • marking a restored PVC as thick-provisioned (rbd metadata)
  • tested restarting of provisioning
  • deleting PVCs restored from snapshot without leaving behind images in the Ceph cluster
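
To make the points above concrete, here is a rough, hypothetical Go sketch of the thick-provisioned restore path. Only ThickProvision, DeepCopy() and setThickProvisioned() appear in the diff fragments quoted in this review; the surrounding type and helpers are illustrative stubs, not the actual ceph-csi code.

package rbd

import (
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// rbdVolume is reduced to what the sketch needs; the real struct lives in
// internal/rbd and is considerably larger.
type rbdVolume struct {
	ThickProvision bool
	RbdImageName   string
}

// Stubs standing in for the real RBD-backed implementations.
func (rv *rbdVolume) DeepCopy(dest *rbdVolume) error { return nil }
func (rv *rbdVolume) setThickProvisioned() error     { return nil }
func (rv *rbdVolume) createCloneFromSnapshot() error { return nil }

func restoreFromSnapshot(parentVol, rbdVol *rbdVolume) error {
	if !rbdVol.ThickProvision {
		// Thin-provisioned volumes keep using the RBD snapshot + clone path.
		return rbdVol.createCloneFromSnapshot()
	}
	// Thick-provisioned restore: copy the full contents of the parent image;
	// no RBD snapshot is created on the destination image.
	if err := parentVol.DeepCopy(rbdVol); err != nil {
		return status.Errorf(codes.Internal, "failed to deep copy %q into %q: %v",
			parentVol.RbdImageName, rbdVol.RbdImageName, err)
	}
	// The metadata is set only after DeepCopy() finished, so a missing marker
	// on a retry means the copy was interrupted.
	return rbdVol.setThickProvisioned()
}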

@nixpanic
Member Author

@humblec, @Madhu-1, please review again. Thanks!

}
err = rbdVol.setThickProvisioned()
if err != nil {
	return status.Errorf(codes.Internal, "failed mark %q thick-provisioned: %s", rbdVol, err)
Collaborator

failed marking or failed to mark

@humblec
Collaborator

humblec commented Jun 22, 2021

skipping the creation of an RBD snapshot on the RBD image used as VolumeContentSource -> @nixpanic this is for the case when the content-source image is thick-provisioned, isn't it?

@humblec
Collaborator

humblec commented Jun 22, 2021

skipping the creation of an RBD snapshot on the RBD image used as VolumeContentSource -> @nixpanic this is for the case when the content-source image is thick-provisioned, isn't it?

Above is not the case? We are skipping the snapshot creation during the CreateSnapshot operation for the thick PVC, not sure about CreateVolume.

It's on the snapshot creation, but applicable only when the source is a thick-provisioned image and not in general?

@Madhu-1
Collaborator

Madhu-1 commented Jun 22, 2021

skipping the creation of an RBD snapshot on the RBD image used as VolumeContentSource -> @nixpanic this is for the case when the content-source image is thick-provisioned, isn't it?

Above is not the case? We are skipping the snapshot creation during the CreateSnapshot operation for the thick PVC, not sure about CreateVolume.

It's on the snapshot creation, but applicable only when the source is a thick-provisioned image and not in general?

Yes, only if the source is thick-provisioned.

@nixpanic
Member Author

@nixpanic add E2E to cover basic testing?

I would love that, but the validation of the snapshot functionality needs a lot of changes before that can be re-used. #2071 contains the request for a PR with a test case.

@nixpanic nixpanic force-pushed the rbd/thick-provisioning/snapshot branch from d7e910b to f95a420 Compare June 22, 2021 13:57
@mergify mergify bot dismissed Madhu-1’s stale review June 22, 2021 13:58

Pull request has been modified.

@nixpanic
Member Author

@Madhu-1, @humblec updated with your recommendations.

@nixpanic nixpanic requested a review from Madhu-1 June 22, 2021 14:13
@Madhu-1
Collaborator

Madhu-1 commented Jun 22, 2021

@nixpanic add E2E to cover basic testing?

I would love that, but the validation of the snapshot functionality needs a lot of changes before that can be re-used. #2071 contains the request for a PR with a test case.

Keep the issue open to add E2E or open a new issue to add E2E for this case?

@nixpanic
Member Author

@nixpanic add E2E to cover basic testing?

I would love that, but the validation of the snapshot functionality needs a lot of changes before that can be re-used. #2071 contains the request for a PR with a test case.

Keep the issue open to add E2E or open a new issue to add E2E for this case?

Keeping it open; this PR closes another one.

@nixpanic nixpanic force-pushed the rbd/thick-provisioning/snapshot branch from f95a420 to ff3e1b8 Compare June 22, 2021 15:43
@nixpanic
Member Author

/retest ci/centos/mini-e2e/k8s-1.19

@nixpanic
Member Author

/retest ci/centos/mini-e2e/k8s-1.19

failed due to #1969 (logs):

Jun 22 17:45:27.070: INFO: Waiting up to 3m0s for all (but 0) nodes to be ready

@nixpanic nixpanic force-pushed the rbd/thick-provisioning/snapshot branch from ff3e1b8 to 8511e26 Compare June 23, 2021 08:38
@nixpanic
Member Author

/retest ci/centos/upgrade-tests-cephfs

@nixpanic
Member Author

/retest ci/centos/upgrade-tests-cephfs

some infrastructure issue (logs)

@nixpanic
Member Author

/retest ci/centos/mini-e2e-helm/k8s-1.20

@nixpanic
Member Author

/retest ci/centos/mini-e2e/k8s-1.20

@nixpanic
Member Author

/retest ci/centos/mini-e2e/k8s-1.20

failed due to #1969:

Jun 23 10:28:07.779: INFO: Waiting up to 3m0s for all (but 0) nodes to be ready

@nixpanic
Member Author

/retest ci/centos/mini-e2e-helm/k8s-1.20

@nixpanic
Member Author

/retest ci/centos/mini-e2e-helm/k8s-1.20

Failed due to #1969 (logs)

Signed-off-by: Niels de Vos <ndevos@redhat.com>
In case restoring a snapshot of a thick-provisioned PVC failed during DeepCopy(), the image will exist, but only with partial contents. Only when the image has the thick-provisioned metadata set has it completed DeepCopy().

When the metadata is missing, the image is deleted, and an error is returned to the caller. Kubernetes will automatically retry provisioning on the ABORTED error, and the restore will be restarted from the beginning.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
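
A hypothetical sketch of the recovery check that commit message describes, reusing the stub rbdVolume from the earlier sketch; isThickProvisioned() and deleteImage() are illustrative stand-ins, not the literal ceph-csi helpers, although the Aborted error text matches the one visible in the logs above.

// Assumed stubs, in addition to the rbdVolume sketched earlier.
func (rv *rbdVolume) isThickProvisioned() (bool, error) { return true, nil }
func (rv *rbdVolume) deleteImage() error                { return nil }

// repairThickRestore decides whether an image left behind by a previous,
// possibly interrupted CreateVolume attempt can be reused.
func repairThickRestore(rbdVol *rbdVolume) error {
	thick, err := rbdVol.isThickProvisioned()
	if err != nil {
		return status.Errorf(codes.Internal, "failed to check %q: %v", rbdVol.RbdImageName, err)
	}
	if !thick {
		// The metadata is only set after DeepCopy() finished, so a missing
		// marker means the copy was interrupted: delete the partial image and
		// return Aborted so that Kubernetes retries from the beginning.
		if err := rbdVol.deleteImage(); err != nil {
			return status.Errorf(codes.Internal, "failed to delete %q: %v", rbdVol.RbdImageName, err)
		}
		return status.Errorf(codes.Aborted,
			"restoring thick-provisioned volume %q has been interrupted, please retry",
			rbdVol.RbdImageName)
	}
	// The marker is present, the previous DeepCopy() completed successfully.
	return nil
}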
When cloning a volume from a (CSI) snapshot, we use DeepCopy() and do
not need an RBD snapshot as source.

Suggested-by: Madhu Rajanna <madhupr007@gmail.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Labels
bug (Something isn't working), component/rbd (Issues related to RBD)
Development

Successfully merging this pull request may close these issues.

rbd: use deep cp instead of clone in thick PVC snapshot process
4 participants