nfs: add basic provisioner with create/delete procedures #2948
Conversation
For first few glances, it looks good to me,
attachRequired: false
volumeLifecycleModes:
- Persistent
- Ephemeral
attachRequired is false, and do we plan to support Ephemeral as well?
This is the default configuration from the csi-driver-nfs (NodePlugin)... Not sure if there are any special things needed for Ephemeral support?
I think for Ephemeral, CreateVolume needs to be taken care of in NodePublish: https://kubernetes-csi.github.io/docs/ephemeral-local-volumes.html#implementing-csi-ephemeral-inline-support.
right, in that case, we should probably not support Ephemeral volumes from the start
backend := res.Volume

log.DebugLog(ctx, "CephFS volume created: %s", backend)
debug logging of the volume is not required?
yes, this can probably be removed
Actually, I think it is useful to keep this as DebugLog(); each step in the process has such a log message. It will only log the volume-id instead of the whole volume.
return nil, status.Error(codes.InvalidArgument, err.Error())
}

err = nfsVolume.Connect(cr)
calling Destroy is missing?
there was no go-ceph connection yet. With the updated PR go-ceph is used, and Destroy() has been added.
return nil
}

func (nv *NFSVolume) GetExportPath() string {
add comments for all the exported functions?
yes, definitely need to do this!
still need to be addressed?
yes, indeed!
internal/nfs/controller/volume.go
}

// TODO: use new go-ceph API
_, _, err := util.ExecCommand(context.TODO(), "ceph", args...)
open a go-ceph issue to track this one?
it is at ceph/go-ceph#655
@@ -0,0 +1,74 @@
/*
Copyright 2022 The Ceph-CSI Authors.
why do we have this vendor change?
that is mentioned in the commit message: as tools/yamlgen uses the API to generate files under deploy/, we are vendoring our own API (I would like to prevent that, but don't know how).
if err != nil {
log.ErrorLog(ctx, "failed to retrieve admin credentials: %v", err)

return nil, status.Error(codes.InvalidArgument, err.Error())
in case of failures we need to take care of cleaning up the OMAP and the CephFS subvolume?
yes, failures should still be handled better. Calls are probably not idempotent at the moment.
Planning to address this in a followup PR?
not in this PR; it needs a lot of manual testing to address all possible cases. I prefer to get this merged soon, and then iterate on the improvements with smaller PRs.
force-pushed from 8454f46 to 04728c4
I do not think that is required. The calls should be made idempotent (they probably are not yet), and the actual complex volume creation is done in the CephFS subcomponent, which has locking already.
no, this is what the logs currently look like: CreateVolume
DeleteVolume
It should not be IPs; ideally it is a hostname (but it can be an IP address too). Not sure it makes sense to validate this. Mounting the volume will fail in kubernetes-csi/csi-driver-nfs in that case, hopefully with a useful error message.
force-pushed from 04728c4 to dd5f878
LGTM, left a few questions.
attachRequired: false
volumeLifecycleModes:
- Persistent
- Ephemeral
I think for Ephemeral, CreateVolume needs to be taken care of in NodePublish: https://kubernetes-csi.github.io/docs/ephemeral-local-volumes.html#implementing-csi-ephemeral-inline-support.
if err != nil {
log.ErrorLog(ctx, "failed to retrieve admin credentials: %v", err)

return nil, status.Error(codes.InvalidArgument, err.Error())
Planning to address this in a followup PR?
return nil
}

func (nv *NFSVolume) GetExportPath() string {
still need to be addressed?
// TODO: use new go-ceph API
_, stderr, err := util.ExecCommand(nv.ctx, "ceph", args...)
if err != nil {
return fmt.Errorf("executing ceph export command failed (%w): %s", err, stderr)
do we need to handle already-exported errors if we get any, or is this call idempotent?
This returns an error like EEXIST, but as this is a CLI and not go-ceph, error checking is not as clean. I plan to use go-ceph in a follow-up PR, and then improved error checking can be added too.
// TODO: use new go-ceph API
_, stderr, err := util.ExecCommand(nv.ctx, "ceph", args...)
if err != nil {
return fmt.Errorf("executing ceph export command failed (%w): %s", err, stderr)
do we need to handle already-deleted export errors if we get any, or is this call idempotent?
this needs to be checked once we use go-ceph for these calls
force-pushed from dd5f878 to 223bba1
@Madhu-1 @Rakshith-R, I think all comments have been addressed now. Please have a look again. Thanks!
force-pushed from 223bba1 to 1a1cf7d
FWIW, partial instructions for setting up are available in #2963
Move the printing of the version and other information to its own function. This reduces the complexity enough so that golang-ci does not complain about it anymore. Signed-off-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The API is extended for generation of the NFS CSIDriver object. The YAML file under deploy/ was created by `yamlgen`. The contents of the csidriver.yaml file are heavily based on the upstream CSIDriver from the Kubernetes csi-driver-nfs project. Because ./tools/yamlgen uses the API, it gets copied under vendor/. This causes two copies of the API to be included in the repository, but that cannot be prevented, it seems. See-also: https://github.com/kubernetes-csi/csi-driver-nfs/blob/master/deploy/csi-nfs-driverinfo.yaml Signed-off-by: Niels de Vos <ndevos@redhat.com>
These NFS Controller and Identity servers are the base for the new provisioner. The functionality is currently extremely limited, follow-up PRs will implement various CSI procedures. CreateVolume is implemented with the bare minimum. This makes it possible to create a volume, and mount it with the kubernetes-csi/csi-driver-nfs NodePlugin. DeleteVolume unexports the volume from the Ceph managed NFS-Ganesha service. In case the Ceph cluster provides multiple NFS-Ganesha deployments, things might not work as expected. This is going to be addressed in follow-up improvements. Lots of TODO comments need to be resolved before this can be declared "production ready". Unit- and e2e-tests are missing as well. Signed-off-by: Niels de Vos <ndevos@redhat.com>
Deployments can use --type=nfs to deploy the NFS Controller Server (provisioner). Signed-off-by: Niels de Vos <ndevos@redhat.com>
NFSVolume instances are short lived; they only exist for the duration of a single gRPC procedure. It is easier to store the calling Context in the NFSVolume struct than to pass it to some of the functions that require it. Signed-off-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
force-pushed from 1a1cf7d to f611e98
These NFS Controller and Identity servers are the base for the new
provisioner. The functionality is currently extremely limited, follow-up
PRs will implement various CSI procedures.
CreateVolume is implemented with the bare minimum. This makes it
possible to create a volume, and mount it with the
kubernetes-csi/csi-driver-nfs NodePlugin.
DeleteVolume unexports the volume from the Ceph managed NFS-Ganesha
service. In case the Ceph cluster provides multiple NFS-Ganesha
deployments, things might not work as expected. This is going to be
addressed in follow-up improvements.
Lots of TODO comments need to be resolved before this can be declared
"production ready". Unit- and e2e-tests are missing as well.
Updates: #2913
Show available bot commands
These commands are normally not required, but in case of issues, leave any of the following bot commands in an otherwise empty comment in this PR:
/retest ci/centos/<job-name>: retest the <job-name> after an unrelated failure (please report the failure too!)
/retest all: run this in case the CentOS CI failed to start/report any test progress or results