
NFS support violates CSI spec #1224

Closed
gnufied opened this issue Apr 20, 2023 · 9 comments · Fixed by #1236

Comments

@gnufied
Contributor

gnufied commented Apr 20, 2023

The NFS support in the CSI driver directly violates the CSI spec:

// If SP has VOLUME_MOUNT_GROUP node capability and CO provides
// this field then SP MUST ensure that the volume_mount_group
// parameter is passed as the group identifier to the underlying
// operating system mount system call,

The above wording makes it clear that the provided volume_mount_group should be used as a mount option for the mount system call. But the NFS driver instead uses it to recursively chown and chmod volumes.
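
For contrast, here is a rough, hypothetical sketch (not the driver's actual code; function name and values are made up) of what the spec asks for: the group is handed to the kernel as part of the mount call itself, e.g. as a gid= option on a CIFS mount, so no file has to be touched afterwards. NFS has no comparable mount option, which is exactly where the conflict comes from.

package main

import (
	"strings"

	"golang.org/x/sys/unix"
)

// Hypothetical sketch, not the driver's code: a spec-compliant use of
// volume_mount_group passes the group to the mount system call itself
// (the cifs filesystem honours gid=), instead of re-owning files afterwards.
func mountWithVolumeMountGroup(source, target, fstype, volumeMountGroup string) error {
	opts := []string{}
	if volumeMountGroup != "" {
		opts = append(opts, "gid="+volumeMountGroup)
	}
	return unix.Mount(source, target, fstype, 0, strings.Join(opts, ","))
}

func main() {
	// Illustrative values only.
	_ = mountWithVolumeMountGroup("//server/share", "/mnt/target", "cifs", "1000")
}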

The support in the CSI driver also breaks several Kubernetes conventions. The file or mount properties of a CSI driver should be configurable via the CSIDriver object, and fsGroupChangePolicy should come from the pod object (instead of being configurable in the StorageClass).
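
For reference, and only as an illustration using the upstream Go API types (k8s.io/api), the two knobs live in different places: fsGroupChangePolicy on the pod's securityContext (per workload, user-set), fsGroupPolicy on the CSIDriver spec (one fixed value per driver).

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	storagev1 "k8s.io/api/storage/v1"
)

func main() {
	fsGroup := int64(1000)
	changePolicy := corev1.FSGroupChangeOnRootMismatch

	// Per-pod and user-configurable: part of the workload's securityContext.
	podCtx := corev1.PodSecurityContext{
		FSGroup:             &fsGroup,
		FSGroupChangePolicy: &changePolicy,
	}

	// Per-driver, one fixed value: part of the cluster-scoped CSIDriver object.
	fsGroupPolicy := storagev1.FileFSGroupPolicy
	driverSpec := storagev1.CSIDriverSpec{
		FSGroupPolicy: &fsGroupPolicy,
	}

	fmt.Println(*podCtx.FSGroupChangePolicy, *driverSpec.FSGroupPolicy)
}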

@andyzhangx
Member

@gnufied thanks for raising the issue.

  1. volume_mount_group issue: since an NFS mount does not support a gid setting, using chown and chmod is the only way (see the sketch after this list).
  2. I think fsGroupChangePolicy was discussed here: feat: add fsGroupChangePolicy for nfs protocol #1013 (comment). The CSIDriver object is a managed (immutable) object on a managed k8s cluster (e.g. AKS), so fields in CSIDriver cannot be changed, and providing a fsGroupChangePolicy parameter is more feasible for different user requirements since CSIDriver can only use one fixed value.
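
For point 1, a rough sketch (not the driver's actual code; path, modes and GID are illustrative) of what such a chown/chmod fallback amounts to on an NFS mount:

package main

import (
	"io/fs"
	"log"
	"os"
	"path/filepath"
)

// Rough sketch of the fallback described in point 1: with no gid= style
// mount option available for NFS, the desired group has to be applied to
// every existing file after the mount, and the setgid bit on directories
// makes later files inherit that group.
func recursiveChownChmod(root string, gid int) error {
	return filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.Type()&fs.ModeSymlink != 0 {
			return os.Lchown(path, -1, gid) // do not follow symlinks
		}
		if err := os.Chown(path, -1, gid); err != nil {
			return err
		}
		if d.IsDir() {
			return os.Chmod(path, 0o775|os.ModeSetgid)
		}
		return os.Chmod(path, 0o664)
	})
}

func main() {
	if err := recursiveChownChmod("/mnt/nfs-volume", 1000); err != nil {
		log.Fatal(err)
	}
}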

@gnufied
Contributor Author

gnufied commented Apr 20, 2023

volume_mount_group issue: since an NFS mount does not support a gid setting, using chown and chmod is the only way

Then IMO it should be performed by the kubelet and not by the CSI driver. But strictly speaking that is not the only solution: for most of Kubernetes' lifetime, users of NFS have made do with supplementalGroups to access NFS volumes, and that solution works too.
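
A hedged sketch of that long-standing pattern (the GID is made up): the export's group is simply added to the pod's supplemental groups, so the pod's processes can use the share without any ownership changes on it.

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func main() {
	// Sketch only: 5555 stands in for whatever group owns the NFS export.
	sc := corev1.PodSecurityContext{
		SupplementalGroups: []int64{5555},
	}
	fmt.Println(sc.SupplementalGroups)
}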

I think fsGroupChangePolicy was discussed here: #1013 (comment). The CSIDriver object is a managed (immutable) object on a managed k8s cluster (e.g. AKS), so fields in CSIDriver cannot be changed, and providing a fsGroupChangePolicy parameter is more feasible for different user requirements since CSIDriver can only use one fixed value.

No, fsGroupChangePolicy is a property of the pod and hence is user configurable. I think you are thinking of fsGroupPolicy, which is indeed a property of the CSIDriver object.

The problem is workload portability. Kubernetes is designed to behave the same way regardless of where a workload is deployed. We try really hard to provide the same user experience for our users regardless of where they deploy their workloads, but we are not keeping that promise here by not respecting fields inside the pod spec.

Now coming back to the CSIDriver object: part of the problem is that we have too many different volume types under one umbrella. I think, at the very minimum, we should consider splitting some of them; or, if the Azure File driver wants to keep using the same name for different volume types, we will have to propose solutions that integrate neatly with the rest of Kubernetes.

@gnufied
Contributor Author

gnufied commented Apr 20, 2023

cc @msau42 @jingxu97 @jsafrane

@gnufied
Contributor Author

gnufied commented Apr 20, 2023

Then IMO it should be performed by the kubelet and not by the CSI driver. But strictly speaking that is not the only solution: for most of Kubernetes' lifetime, users of NFS have made do with supplementalGroups to access NFS volumes, and that solution works too.

Let me clarify why it is important that we don't duplicate what the kubelet is supposed to do in the driver. Several internal k8s bits depend on the kubelet's behaviour. Take subpath support, for example: that code subtly depends on the sgid bit being set on the parent directory.
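
To make the sgid dependency concrete, a minimal, hypothetical illustration (not kubelet code; the path is made up): once a directory carries the setgid bit, anything created inside it inherits the directory's group, which is what lets files and directories created later (e.g. for subpaths) end up in the fsGroup without further chowning.

package main

import (
	"log"
	"os"
	"path/filepath"
)

func main() {
	dir := "/mnt/volume/data" // hypothetical volume directory already chowned to fsGroup
	// Group-writable plus the setgid bit.
	if err := os.Chmod(dir, 0o770|os.ModeSetgid); err != nil {
		log.Fatal(err)
	}
	// Because of the setgid bit, this new file is group-owned by dir's group
	// (kernel behaviour), not by the creating process's primary group.
	f, err := os.Create(filepath.Join(dir, "created-later"))
	if err != nil {
		log.Fatal(err)
	}
	f.Close()
}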

Then there is the integration with CRI for id-mapped mounts, etc., that is being proposed. All of this depends on letting the kubelet do its own thing and not duplicating the functionality. If the kubelet changes its behaviour tomorrow (and all of this copied code is an internal kubelet detail), then we could break our users.

@andyzhangx
Member

@gnufied for fsGroupChangePolicy support, I figured out why I implemented it that way for NFS at the time: this driver (covering both SMB and NFS) supports CSIVolumeMountGroup, while CSIVolumeMountGroup is mainly for SMB, not for NFS, but NFS shares the same driver config as SMB. And due to this code snippet:

if !driverSupportsCSIVolumeMountGroup && c.supportsFSGroup(fsType, mounterArgs.FsGroup, fsGroupPolicy) {

https://github.com/kubernetes/kubernetes/blob/065428befa06c1d99deb4160b4490a8bf1a377ee/pkg/volume/csi/csi_mounter.go#L332

I had to make a workaround, otherwise volume.SetVolumeOwnership would never run on an Azure File NFS volume. The current Azure File NFS & SMB volume mount actually happens in the NodeStageVolume interface, so this fsGroupChangePolicy property is already a per-volume property now; the implementation is already slightly different from https://github.com/kubernetes-csi/csi-driver-nfs, which only implements the NodePublishVolume interface.
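
My reading of the kubelet behaviour linked above, as a simplified and hedged paraphrase (not the verbatim code; names here are illustrative): the capability is evaluated per driver, not per volume, so advertising it for SMB's benefit also switches off the kubelet's own fsGroup handling for the driver's NFS volumes, which is what forced the workaround.

package main

import "fmt"

// Simplified paraphrase of the kubelet decision linked above; function and
// variable names are illustrative, not kubelet's.
func applyFSGroup(driverSupportsVolumeMountGroup, fsTypeSupportsChown bool, fsGroup int64) {
	if driverSupportsVolumeMountGroup {
		// The kubelet only forwards fsGroup as volume_mount_group in
		// NodeStage/NodePublish and never calls SetVolumeOwnership.
		fmt.Println("pass", fsGroup, "to the driver as volume_mount_group")
		return
	}
	if fsTypeSupportsChown {
		// Only in this branch does the kubelet do the recursive
		// chown/chmod, honouring the pod's fsGroupChangePolicy.
		fmt.Println("kubelet SetVolumeOwnership for gid", fsGroup)
	}
}

func main() {
	// The whole driver advertises the capability for SMB's sake, so NFS
	// volumes also take the first branch and the kubelet does nothing.
	applyFSGroup(true, true, 1000)
}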

@jsafrane
Contributor

One thing I was thinking about to get out of this situation is to have two CSI drivers, one for NFS and a second for CIFS. They can be handled by the same azurefile-csi-driver binary, just with different cmdline options to set their own respective driver names + CSIDriver instances with their own fsGroupPolicy. As a downside, there would be two DaemonSets + Deployments for the controllers.

In the future, I could imagine even one Pod (driver + sidecars) handling two or more CSI drivers, but that would probably need bigger changes in the sidecars - the driver would expose two sockets + a sidecar would connect to both of them + route all CSI calls to the right one.
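
A minimal sketch of that direction (socket paths and driver names are made up, and the CSI service registration is left as a comment): one process, two endpoints, each with its own set of sidecars.

package main

import (
	"log"
	"net"

	"google.golang.org/grpc"
)

// One process serving two CSI endpoints, one per driver name. Each sidecar
// set would point at its own socket. Registration of the Identity/Controller/
// Node services for the respective driver is omitted here.
func serveEndpoint(socketPath string) {
	lis, err := net.Listen("unix", socketPath)
	if err != nil {
		log.Fatalf("listen on %s: %v", socketPath, err)
	}
	srv := grpc.NewServer()
	// csi.RegisterIdentityServer(srv, ...) etc. would go here.
	go func() {
		if err := srv.Serve(lis); err != nil {
			log.Fatalf("serve %s: %v", socketPath, err)
		}
	}()
}

func main() {
	serveEndpoint("/csi/smb.sock") // e.g. the existing file.csi.azure.com
	serveEndpoint("/csi/nfs.sock") // e.g. a separate NFS driver name
	select {}                      // keep the process alive
}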

@andyzhangx
Member

I have worked out #1236 to address this issue.

@jsafrane
Contributor

jsafrane commented May 9, 2023

@andyzhangx, if I understand #1236 correctly, we should then run two CSI driver deployments/daemon sets and have the NFS one run with --enable-volume-mount-group=false?

@andyzhangx
Member

@andyzhangx, if I understand #1236 correctly, we should then run two CSI driver deployments/daemon sets and have the NFS one run with --enable-volume-mount-group=false?

@jsafrane the NFS one should run with --set feature.enableVolumeMountGroup=false --set feature.fsGroupPolicy=File. Maintaining two drivers would cost more effort, though it would be more compliant; since you are the OpenShift owner, it's up to you.
