
[FEATURE] Support CSI fsGroupPolicy #2131

Closed
yasker opened this issue Dec 22, 2020 · 33 comments
Assignees
Labels
area/kubernetes Kubernetes related like K8s version compatibility component/longhorn-manager Longhorn manager (control plane) priority/0 Must be fixed in this release (managed by PO) require/doc Require updating the longhorn.io documentation require/knowledge-base Require adding knowledge base document
Milestone

Comments

@yasker
Copy link
Member

yasker commented Dec 22, 2020

https://kubernetes.io/blog/2020/12/14/kubernetes-release-1.20-fsgroupchangepolicy-fsgrouppolicy/

Ref:
#1221
#1842 (comment)

Note: A manual test plan is required because:

  1. We don't have a v1.20 test environment ready yet.
  2. We need to make sure we can speed up the fsGroup change.

Applicable Kubernetes version: v1.20+

@yasker yasker added kind/feature Feature request, new feature component/longhorn-manager Longhorn manager (control plane) area/driver area/kubernetes Kubernetes related like K8s version compatibility priority/0 Must be fixed in this release (managed by PO) require/auto-e2e-test Require adding/updating auto e2e test cases if they can be automated require/doc Require updating the longhorn.io documentation require/manual-test-plan Require adding/updating manual test cases if they can't be automated labels Dec 22, 2020
@yasker yasker added this to the v1.1.1 milestone Dec 22, 2020
@PhanLe1010
Copy link
Contributor

From the Kubernetes code perspective, it looks like everything should work fine.

There are 3 cases we need to consider:

  1. Kubernetes v1.19 with CSIVolumeFSGroupPolicy enabled
  2. Kubernetes v1.20 with CSIVolumeFSGroupPolicy disabled
  3. Kubernetes v1.20 with CSIVolumeFSGroupPolicy enabled

Case#1 and Case#3 should work because when CSIVolumeFSGroupPolicy is enabled, Kubernetes automatically fills in the default value for fsGroupPolicy: ReadWriteOnceWithFSType. Therefore, we don't hit this error.

Case#2 should also work because when CSIVolumeFSGroupPolicy is disabled, Kubernetes automatically clears the FSGroupPolicy field, but the kubelet checks the CSIVolumeFSGroupPolicy feature gate here and returns immediately. Therefore, we shouldn't see the error either.
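For anyone reproducing these three cases, one way to toggle the gate on a k3s test cluster is sketched below. The gate has to be toggled on both the API server (field defaulting) and the kubelet (mount behavior); the exact flags here are an assumption based on k3s' generic argument pass-through:

k3s server \
  --kube-apiserver-arg=feature-gates=CSIVolumeFSGroupPolicy=true \
  --kubelet-arg=feature-gates=CSIVolumeFSGroupPolicy=true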

@PhanLe1010
Copy link
Contributor

PhanLe1010 commented Dec 24, 2020

@khushboo-rancher already verified that Case#1 and Case#3 work as expected.
We are verifying Case#2.

@khushboo-rancher
Copy link
Contributor

khushboo-rancher commented Jan 4, 2021

Validated the basic Longhorn functionality with v1.20; it looks good.

As mentioned in #2131 (comment), Longhorn is validated with the cases below.

  1. Kubernetes v1.19 with CSIVolumeFSGroupPolicy enabled
  2. Kubernetes v1.20 with CSIVolumeFSGroupPolicy disabled
  3. Kubernetes v1.20 with CSIVolumeFSGroupPolicy enabled

Running the integration test on Case#2; will update with the result.

Update: The integration test looks good on a k3s v1.20 setup with CSIVolumeFSGroupPolicy disabled.

@joshimoo
Copy link
Contributor

joshimoo commented Jan 5, 2021

@PhanLe1010 the fuzzer you linked to is code used only for testing, as you can tell from the use of random values here :)
https://github.com/kubernetes/kubernetes/blob/af46c47ce925f4c4ad5cc8d1fca46c7b77d13b38/pkg/apis/storage/fuzzer/fuzzer.go#L43

So in the case where the feature gate is enabled and the driver object exists without a defined .Spec.FSGroupPolicy setting, we will trigger this error:
https://github.com/kubernetes/kubernetes/blob/af46c47ce925f4c4ad5cc8d1fca46c7b77d13b38/pkg/volume/csi/csi_plugin.go#L912

Do we define .Spec.FSGroupPolicy when creating the driver object?

@yasker yasker added the highlight Important feature/issue to highlight label Jan 8, 2021
@PhanLe1010
Copy link
Contributor

PhanLe1010 commented Jan 8, 2021

@joshimoo

the fuzzer you linked to is code used only for testing, as you can tell from the use of random values here :)
https://github.com/kubernetes/kubernetes/blob/af46c47ce925f4c4ad5cc8d1fca46c7b77d13b38/pkg/apis/storage/fuzzer/fuzzer.go#L43
So in the case where the feature gate is enabled and the driver object exists without a defined .Spec.FSGroupPolicy setting, we will trigger this error:
https://github.com/kubernetes/kubernetes/blob/af46c47ce925f4c4ad5cc8d1fca46c7b77d13b38/pkg/volume/csi/csi_plugin.go#L912

Thank you for pointing out the mistake in the code link. Yeah, it is testing code. The default values are filled in here instead: https://github.com/kubernetes/kubernetes/blob/af46c47ce925f4c4ad5cc8d1fca46c7b77d13b38/pkg/apis/storage/v1/defaults.go#L56. So it is not possible for the feature gate to be enabled while the driver object exists without a defined .Spec.FSGroupPolicy setting (I tested this behavior). Moreover, if K8s didn't fill in .Spec.FSGroupPolicy, it would break many existing CSI providers. I think the K8s team must have already considered that.

Do we define .Spec.FSGroupPolicy when creating the driver object?

No, we don't define it. We actually cannot define it, since we are using client-go v1.16, which doesn't have the .Spec.FSGroupPolicy field in the CSIDriverSpec struct. So, we depend on K8s to fill in the default value for .Spec.FSGroupPolicy.
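For the record, a quick way to confirm that the API server filled in the default (a minimal sketch; the CSIDriver name is the one Longhorn deploys):

# Should print "ReadWriteOnceWithFSType" on v1.20+ when Longhorn leaves the field unset
kubectl get csidriver driver.longhorn.io -o jsonpath='{.spec.fsGroupPolicy}'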

@PhanLe1010
Copy link
Contributor

Analysis

1. Running a pod as a non-root user and the problem

It is good practice to run pods as a non-root user to prevent bugs or malicious code from taking over the system. An example of a pod that runs as a non-root user could be:

apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
  volumes:
  - name: sec-ctx-vol
    emptyDir: {}
  containers:
  - name: sec-ctx-demo
    image: busybox
    command: [ "sh", "-c", "sleep 1h" ]
    volumeMounts:
    - name: sec-ctx-vol
      mountPath: /data/demo
    securityContext:
      allowPrivilegeEscalation: false

The above YAML file tells Kubernetes to run every process of every container inside the pod with user ID 1000 and primary group ID 3000. So far so good; processes are not running as root. However, there is a problem here. If the volume sec-ctx-vol contains files/directories that user ID 1000 and group ID 3000 don't have permission to access, the processes in the pod cannot access those files/directories. In the worst case, if the root of the volume sec-ctx-vol doesn't grant read/write permission to user ID 1000 and group ID 3000, no process can access the volume at all.

2. fsGroup and the problem

To solve the above problem, users can specify the field fsGroup as below:

spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000

Upon seeing the field fsGroup: 2000, Kubernetes will take the following actions:

  1. Make sure that all processes of the containers inside the pod are part of the supplementary group ID 2000
  2. Any new files created in the volume sec-ctx-vol will be owned by group ID 2000
  3. If the fsType of the PersistentVolume is defined and the PersistentVolume's accessModes is RWO, then Kubernetes will attempt to modify the volume's ownership and permissions (to group ID 2000 in this case) every time the volume is mounted to the pod.

So, adding the field fsGroup: 2000 solves the problem we saw in section 1. However, fsGroup introduces a new problem, as described in the K8s doc:

The side-effect of setting fsGroup is that, each time a volume is mounted, Kubernetes must recursively chown() and chmod() all the files and directories inside the volume - with a few exceptions noted below. This happens even if group ownership of the volume already matches the requested fsGroup, and can be pretty expensive for larger volumes with lots of small files, which causes pod startup to take a long time.

The bad news is that there is no workaround for this problem in K8s version v1.19 and before. The good news is that K8s v1.20 provides 2 options to solve this problem as discussed in the following sections.

3. pod.spec.securityContext.fsGroupChangePolicy

In section 1, K8s never modifies the volume's ownership and permissions before giving it to the pod. In section 2, K8s always recursively modifies the volume's ownership and permissions before giving it to the pod. Each approach has its own problem, as mentioned.

Kubernetes v1.20 introduces a new beta feature: the field fsGroupChangePolicy. When fsGroupChangePolicy is set to OnRootMismatch, the recursive permission and ownership change is skipped if the root of the volume already has the correct permissions. This means that if users don't change pod.spec.securityContext.fsGroup between pod restarts, K8s only has to check the permissions and ownership of the root, and the mounting process is much faster than always recursively changing the volume's ownership and permissions.

Note that K8s v1.20 still allows users to keep the previous behavior: by setting fsGroupChangePolicy to Always, K8s always changes the permissions and ownership of the volume when the volume is mounted.

Note: if users don't set fsGroupChangePolicy, the effect is the same as setting fsGroupChangePolicy to Always.
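For example, a minimal securityContext sketch using the new policy (the IDs are illustrative):

spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    # skip the recursive chown()/chmod() when the volume root already has group 2000
    fsGroupChangePolicy: "OnRootMismatch"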

4. csiDriver.spec.fsGroupPolicy

For certain volume types, such as NFS or Gluster, the cluster doesn't perform recursive permission changes even if the pod has an fsGroup. So, K8s v1.20 optimizes the permission-checking process even further by allowing CSI storage providers to explicitly indicate whether or not they support modifying a volume's ownership or permissions when the volume is being mounted.
CSI providers can specify the following values for csiDriver.spec.fsGroupPolicy:

  • None: Indicates that volumes will be mounted with no modifications, as the CSI volume driver does not support these operations.
  • File: Indicates that the CSI volume driver supports volume ownership and permission changes via fsGroup, and Kubernetes may use fsGroup to change the permissions and ownership of the volume to match the user-requested fsGroup in the pod's security policy, regardless of fstype or access mode.
  • ReadWriteOnceWithFSType: Indicates that volumes will be examined to determine if volume ownership and permissions should be modified to match the pod's security policy. Changes will only occur if the fsType is defined and the persistent volume's accessModes contains ReadWriteOnce.

Note:

  1. If undefined, csiDriver.spec.fsGroupPolicy defaults to ReadWriteOnceWithFSType, keeping the previous behavior.
  2. When csiDriver.spec.fsGroupPolicy is set to None or File, it overrides the pod.spec.securityContext.fsGroupChangePolicy setting. Users of the CSI provider cannot control the volume's permission-checking process in those cases.
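For reference, this is roughly what pinning the policy in a CSIDriver object would look like (a sketch only; as concluded below, Longhorn deliberately leaves the field unset and lets K8s default it):

apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: driver.longhorn.io
spec:
  attachRequired: true
  podInfoOnMount: true
  # setting this to None or File would override fsGroupChangePolicy in pods
  fsGroupPolicy: ReadWriteOnceWithFSType
  volumeLifecycleModes:
  - Persistent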

Conclusion

  1. As tested in [FEATURE] Support CSI fsGroupPolicy #2131 (comment), Longhorn works with Kubernetes v1.20 by default. Also, Kubernetes automatically sets csiDriver.spec.fsGroupPolicy=ReadWriteOnceWithFSType for the Longhorn CSI driver since Longhorn doesn't specify it.
  2. Longhorn should not change csiDriver.spec.fsGroupPolicy to a value other than the default ReadWriteOnceWithFSType because Longhorn only provides block device storage. Based on the filesystem and volume access mode, users of Longhorn block devices should be able to control the volume's permission-checking process by setting pod.spec.securityContext.fsGroupChangePolicy. If Longhorn set csiDriver.spec.fsGroupPolicy to None or File, it would override the pod.spec.securityContext.fsGroupChangePolicy setting.

In a nutshell, Longhorn doesn't need to do anything to support csiDriver.spec.fsGroupPolicy and pod.spec.securityContext.fsGroupChangePolicy in Kubernetes v1.20.


Testing

The above analysis is supported by the linked documents and the below tests.

1. Running the pod as non-root without setting pod.spec.securityContext.fsGroup

  1. Remove the read, write, and execute permissions for other users on the Longhorn volume's root, i.e., change the permissions of the Longhorn volume's root to drwxr-x---. To do that, we will first run the following pod as root:
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: sec-ctx-vol-pvc
    spec:
      accessModes:
        - ReadWriteOnce
      volumeMode: Filesystem
      storageClassName: longhorn
      resources:
        requests:
          storage: 30Gi  
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: security-context-demo
      namespace: default
    spec:
      restartPolicy: Always
      containers:
      - name: sec-ctx-demo
        command: [ "sh", "-c", "sleep 1h" ]
        image: ubuntu
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: sec-ctx-vol
          mountPath: /data/demo
      volumes:
      - name: sec-ctx-vol
        persistentVolumeClaim:
          claimName: sec-ctx-vol-pvc
    
  2. Exec into the pod and change the permissions of the volume root:
    root@security-context-demo:/data# id
    uid=0(root) gid=0(root) groups=0(root)
    root@security-context-demo:~# cd /data/
    root@security-context-demo:/data# ls -l
    total 4
    drwxr-xr-x 3 root root 4096 Feb 15 00:05 demo
    root@security-context-demo:/data# ls demo/ -l
    total 16
    drwx------ 2 root root 16384 Feb 15 00:05 lost+found
    root@security-context-demo:/data# chmod 750 demo/
    root@security-context-demo:/data# ls -l
    total 4
    drwxr-x--- 3 root root 4096 Feb 15 00:05 demo
    
  3. Now delete the above pod and re-deploy it as a non-root user by adding runAsUser and runAsGroup. Note that we don't add fsGroup:
    apiVersion: v1
    kind: Pod
    metadata:
      name: security-context-demo
      namespace: default
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 3000
      restartPolicy: Always
      containers:
      - name: sec-ctx-demo
        command: [ "sh", "-c", "sleep 1h" ]
        image: ubuntu
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: sec-ctx-vol
          mountPath: /data/demo
      volumes:
      - name: sec-ctx-vol
        persistentVolumeClaim:
          claimName: sec-ctx-vol-pvc
    
  4. Exec into the pod. Observe that we cannot access the directory /data/demo as user ID 1000 with primary group ID 3000:
    I have no name!@security-context-demo:/$ id
    uid=1000 gid=3000 groups=3000
    I have no name!@security-context-demo:/$ ls /data/ -l
    total 4
    drwxr-x--- 3 root root 4096 Feb 15 00:05 demo
    I have no name!@security-context-demo:/$ ls /data/demo -l
    ls: cannot open directory '/data/demo': Permission denied
    I have no name!@security-context-demo:/$ touch /data/demo/file.txt   
    touch: cannot touch '/data/demo/file.txt': Permission denied
    

2. Running the pod as non-root and setting pod.spec.securityContext.fsGroup

  1. Delete the above pod. Re-deploy the pod and add fsGroup: 2000 as below:
    apiVersion: v1
    kind: Pod
    metadata:
      name: security-context-demo
      namespace: default
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 3000
        fsGroup: 2000
      restartPolicy: Always
      containers:
      - name: sec-ctx-demo
        command: [ "sh", "-c", "sleep 1h" ]
        image: ubuntu
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: sec-ctx-vol
          mountPath: /data/demo
      volumes:
      - name: sec-ctx-vol
        persistentVolumeClaim:
          claimName: sec-ctx-vol-pvc
    
  2. Exec into the pod. We can see that Kubernetes performed the 3 actions mentioned in section 2 of the analysis above:
    I have no name!@security-context-demo:/$ id
    uid=1000 gid=3000 groups=3000,2000
    I have no name!@security-context-demo:/$ ls /data/ -l
    total 4
    drwxrws--- 3 root 2000 4096 Feb 15 00:05 demo
    I have no name!@security-context-demo:/$ cd /data/demo/
    I have no name!@security-context-demo:/data/demo$ ls -l
    total 16
    drwxrws--- 2 root 2000 16384 Feb 15 00:05 lost+found
    I have no name!@security-context-demo:/data/demo$ echo "hello world" > file.txt
    I have no name!@security-context-demo:/data/demo$ ls -l
    total 20
    -rw-r--r-- 1 1000 2000    12 Feb 15 00:46 file.txt
    drwxrws--- 2 root 2000 16384 Feb 15 00:05 lost+found
    
  3. One key takeaway is that Kubernetes recursively changed the ownership and permissions of every file/directory in the Longhorn volume.
  4. Now, to see the problem with recursively changing ownership and permissions, let's create a lot of files inside /data/demo/files. We can use the following script to generate the files:
    #!/bin/bash
    file_prefix="longhorn-file-"

    while [[ $# -gt 0 ]]; do
            key="$1"
            case $key in
                    -c|--count)
                    count="$2"
                    shift # past argument
                    shift # past value
                    ;;
                    -h|--help)
                    help="true"
                    shift
                    ;;
                    *)
                    echo "Error! invalid flag: ${key}"
                    help="true"
                    break
                    ;;
            esac
    done

    usage () {
            echo "USAGE: $0 --count 10000"
            echo "  [-c|--count] number of files to be created inside the current folder"
            echo "  [-h|--help] Usage message"
    }

    if [[ $help ]]; then
            usage
            exit 0
    fi

    set -e -x

    if [[ $count ]]; then
            i="0"
            while [ "$i" -lt "$count" ]; do
                    # create one 512-byte file of random data per iteration
                    dd if=/dev/urandom of="${file_prefix}${i}" count=1 bs=512
                    i=$((i+1))
            done
    fi
    
  5. Copy the script into /data/demo/files/create-file.sh:
    kubectl cp create-file.sh default/security-context-demo:data/demo/files
    
  6. Create 500000 files of 512 bytes inside /data/demo/files:
    I have no name!@security-context-demo:/$ cd /data/demo/files/
    I have no name!@security-context-demo:/data/demo/files$ ls -l
    total 4
    -rwxr-xr-x 1 1000 2000 663 Feb 15 01:02 create-file.sh
    I have no name!@security-context-demo:/data/demo/files$ ./create-file.sh --count 500000
    
  7. Let's see how long it takes Kubernetes to remount the Longhorn volume. Delete the pod and re-deploy it as in step 1 of this test case. Observe that it takes 3m30s for Kubernetes to finish mounting the Longhorn volume (one way to measure this is sketched after this list).
  8. Observe that it takes 15 minutes for Kubernetes to finish mounting the Longhorn volume if we create 2 million files inside /data/demo/files/.
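A client-side way to time the remount (a sketch; kubectl wait blocks until the pod is Ready, which includes the volume mount and any recursive chown):

time kubectl wait --for=condition=Ready pod/security-context-demo --timeout=30m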

3. Set pod.spec.securityContext.fsGroupChangePolicy=OnRootMismatch

  1. Delete the pod. Re-deploy it with pod.spec.securityContext.fsGroupChangePolicy=OnRootMismatch:
    apiVersion: v1
    kind: Pod
    metadata:
      name: security-context-demo
      namespace: default
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 3000
        fsGroup: 2000
        fsGroupChangePolicy: "OnRootMismatch"
      restartPolicy: Always
      containers:
      - name: sec-ctx-demo
        command: [ "sh", "-c", "sleep 1h" ]
        image: ubuntu
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: sec-ctx-vol
          mountPath: /data/demo
      volumes:
      - name: sec-ctx-vol
        persistentVolumeClaim:
          claimName: sec-ctx-vol-pvc
    
  2. Observe that it takes only 13s for Kubernetes to finish mounting the Longhorn volume and for the pod to reach the running state. This is because we didn't change pod.spec.securityContext.fsGroup, so it matches the volume root's group ownership and Kubernetes doesn't recursively perform the chown() and chmod().
  3. Let's change the fsGroup to 5000, delete the pod, and re-deploy it as:
    apiVersion: v1
    kind: Pod
    metadata:
      name: security-context-demo
      namespace: default
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 3000
        fsGroup: 5000
        fsGroupChangePolicy: "OnRootMismatch"
      restartPolicy: Always
      containers:
      - name: sec-ctx-demo
        command: [ "sh", "-c", "sleep 1h" ]
        image: ubuntu
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: sec-ctx-vol
          mountPath: /data/demo
      volumes:
      - name: sec-ctx-vol
        persistentVolumeClaim:
          claimName: sec-ctx-vol-pvc
    
  4. Because the fsGroup doesn't match the volume root's group ownership, Kubernetes recursively performs the chown() and chmod(). Observe that it takes 3m30s for Kubernetes to finish mounting the Longhorn volume.

4. Test csiDriver.spec.fsGroupPolicy

So far we have seen the behavior of Kubernetes when csiDriver.spec.fsGroupPolicy is set to ReadWriteOnceWithFSType, which is the default value. Let's see what happens when csiDriver.spec.fsGroupPolicy is set to None, which indicates that volumes will be mounted with no modifications.

  1. Delete the pod from the previous step. Delete the PVC sec-ctx-vol-pvc which will automatically delete the corresponding PV and Longhorn volume.

  2. Delete the Longhorn CSI driver CR:

    kubectl delete csidriver driver.longhorn.io
    
  3. Re-create the Longhorn CSI driver CR with spec.fsGroupPolicy set to None:

    apiVersion: storage.k8s.io/v1
    kind: CSIDriver
    metadata:
      annotations:
        driver.longhorn.io/kubernetes-version: v1.20.2+k3s1
        driver.longhorn.io/version: v1.1.0
      creationTimestamp: "2021-02-11T20:01:07Z"
      name: driver.longhorn.io
      resourceVersion: "50138"
      uid: 92358d9f-f82c-4157-975e-ec0120955591
    spec:
      attachRequired: true
      fsGroupPolicy: None
      podInfoOnMount: true
      volumeLifecycleModes:
      - Persistent
    
  4. Deploy the following PVC and pod:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: sec-ctx-vol-pvc
    spec:
      accessModes:
        - ReadWriteOnce
      volumeMode: Filesystem
      storageClassName: longhorn
      resources:
        requests:
          storage: 30Gi  
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: security-context-demo
      namespace: default
    spec:
      restartPolicy: Always
      containers:
      - name: sec-ctx-demo
        command: [ "sh", "-c", "sleep 1h" ]
        image: ubuntu
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: sec-ctx-vol
          mountPath: /data/demo
      volumes:
      - name: sec-ctx-vol
        persistentVolumeClaim:
          claimName: sec-ctx-vol-pvc
    
  5. Exec into the pod. Remove the read, write, and execute permissions for other users on the Longhorn volume's root, i.e., change the permissions of the volume root to drwxr-x---:

    root@security-context-demo:/data# id
    uid=0(root) gid=0(root) groups=0(root)
    root@security-context-demo:~# cd /data/
    root@security-context-demo:/data# chmod 750 demo/
    root@security-context-demo:/data# ls -l
    total 4
    drwxr-x--- 3 root root 4096 Feb 15 00:05 demo
    
  6. Now delete the above pod and re-deploy it as:

    apiVersion: v1
    kind: Pod
    metadata:
      name: security-context-demo
      namespace: default
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 3000
        fsGroup: 2000
        fsGroupChangePolicy: "Always"
      restartPolicy: Always
      containers:
      - name: sec-ctx-demo
        command: [ "sh", "-c", "sleep 1h" ]
        image: ubuntu
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: sec-ctx-vol
          mountPath: /data/demo
      volumes:
      - name: sec-ctx-vol
        persistentVolumeClaim:
          claimName: sec-ctx-vol-pvc
    
  7. Exec into the pod. Observe that we cannot access the directory /data/demo as user ID 1000 with primary group ID 3000 and supplementary group ID 2000:

    I have no name!@security-context-demo:/$ id
    uid=1000 gid=3000 groups=3000,2000
    I have no name!@security-context-demo:/$ ls /data/ -l
    total 4
    drwxr-x--- 3 root root 4096 Feb 15 00:05 demo
    I have no name!@security-context-demo:/$ ls /data/demo -l
    ls: cannot open directory '/data/demo': Permission denied
    I have no name!@security-context-demo:/$ touch /data/demo/file.txt   
    touch: cannot touch '/data/demo/file.txt': Permission denied
    

We can see that setting csiDriver.spec.fsGroupPolicy to None overrides the pod.spec.securityContext.fsGroupChangePolicy setting. From Longhorn's perspective, we don't want to override this, as explained in section 4 of the analysis above.
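To restore the default after this experiment, one option (a sketch, consistent with the workaround discussed later in this thread; the deployer pod label is an assumption) is to delete the modified CSIDriver CR and let the longhorn-driver-deployer recreate it:

kubectl delete csidriver driver.longhorn.io
# assuming the deployer pod carries an app=longhorn-driver-deployer label
kubectl -n longhorn-system delete pod -l app=longhorn-driver-deployer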

@yasker
Copy link
Member Author

yasker commented Feb 16, 2021

Thanks, @PhanLe1010, it's a well-written explanation.

I also feel that we should put this into either our documentation or the KB.

After that, I will leave it for QA to verify the result, then we can close it.

@yasker yasker added require/knowledge-base Require adding knowledge base document and removed kind/feature Feature request, new feature require/auto-e2e-test Require adding/updating auto e2e test cases if they can be automated require/manual-test-plan Require adding/updating manual test cases if they can't be automated labels Feb 16, 2021
@yasker
Copy link
Member Author

yasker commented Feb 18, 2021

FYI: During the team meeting, we decided that we should write a KB article for the issue.

@yasker yasker removed the highlight Important feature/issue to highlight label Feb 23, 2021
@carpenike
Copy link

Hey Folks,

I'm running into this problem and it seems as though something needs to be set in the csidriver. Even with the securityContext defined in the pod, I'm still getting the error kubernetes.io/csi: expected valid fsGroupPolicy, received nil value or empty string.

I've got this defined in spec.securityContext:

  securityContext:
    fsGroup: 1000
    runAsGroup: 1000
    runAsUser: 1000

@PhanLe1010
Copy link
Contributor

@carpenike Which Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version are you using? Also, which Longhorn version?

@carpenike
Copy link

@PhanLe1010 --

Kubeadm deployed on top of Ubuntu. Longhorn 1.1.0.

ryan@h2-0:~$ sudo kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2", GitCommit:"faecb196815e248d3ecfb03c680a4507229c2a56", GitTreeState:"clean", BuildDate:"2021-01-13T13:25:59Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}

@carpenike
Copy link

Yup! They're Ubuntu nodes so it's just an apt update process. The kubelet restarts on each node and reports in to the control plane with the updated version.

@carpenike
Copy link

Should I just remove the driver and re-create it?

@PhanLe1010
Copy link
Contributor

PhanLe1010 commented Feb 24, 2021

@carpenike Yes, the workaround is as simple as that. You can also just delete the pod longhorn-driver-deployer-xxxx and it will be restarted and recreate the CSIDriver CR.

However, could you hold on a moment? We would like to learn how to reproduce the issue. It would be great if you could help us.
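For reference, a sketch of that workaround (assuming the deployer pod carries an app=longhorn-driver-deployer label; the selector is an assumption):

# restart the deployer so it re-detects the server version and recreates the CSIDriver CR
kubectl -n longhorn-system delete pod -l app=longhorn-driver-deployer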

@carpenike
Copy link

Sure no problem. :)

@PhanLe1010
Copy link
Contributor

PhanLe1010 commented Feb 24, 2021

Thanks. What is your upgrade order?
Is it

  1. worker nodes (drain, upgrade kubelet and kubectl, uncordon) -> control plane nodes
    or
  2. control plane nodes -> worker nodes (drain, upgrade kubelet and kubectl, uncordon)

@carpenike
Copy link

Generally the first one.

@PhanLe1010
Copy link
Contributor

PhanLe1010 commented Feb 24, 2021

Ok, that explains why Longhorn is saying that your Kubernetes version is v1.19.8 in the CSIDriver CR. At the time the longhorn-driver-deployer pod started, it saw that the K8s server had the old v1.19 version. Later on, when the K8s server was upgraded to v1.20, the longhorn-driver-deployer pod was not restarted, so it didn't see the new v1.20 version and didn't update the CSIDriver CR.

@PhanLe1010
Copy link
Contributor

PhanLe1010 commented Feb 24, 2021

Can you try the workaround to see if the CSIDriver CR is updated with the new K8s version and has the fsGroupPolicy field?

delete the pod longhorn-driver-deployer-xxxx and it will be restarted and recreate the CSIDrive CR.

@carpenike
Copy link

Done. It seems like longhorn is caching the k8s version?

ryan on ﴱ ryan-k8s in ~ ❯ kubectl logs -n longhorn-system longhorn-driver-deployer-666c84fbb7-5b4qp                                                                                                                                                                                                                                                                 [18:50:46]
W0224 23:49:14.536576       1 client_config.go:541] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
time="2021-02-24T23:49:14Z" level=debug msg="Deploying CSI driver"
time="2021-02-24T23:49:14Z" level=debug msg="proc cmdline detection pod discover-proc-kubelet-cmdline in phase: Pending"
time="2021-02-24T23:49:15Z" level=debug msg="proc cmdline detection pod discover-proc-kubelet-cmdline in phase: Pending"
time="2021-02-24T23:49:16Z" level=debug msg="proc cmdline detection pod discover-proc-kubelet-cmdline in phase: Pending"
time="2021-02-24T23:49:17Z" level=debug msg="proc cmdline detection pod discover-proc-kubelet-cmdline in phase: Pending"
time="2021-02-24T23:49:18Z" level=debug msg="proc cmdline detection pod discover-proc-kubelet-cmdline in phase: Running"
time="2021-02-24T23:49:19Z" level=debug msg="proc cmdline detection pod discover-proc-kubelet-cmdline in phase: Running"
time="2021-02-24T23:49:20Z" level=info msg="Proc found: kubelet"
time="2021-02-24T23:49:20Z" level=info msg="Try to find arg [--root-dir] in cmdline: [/usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock --node-ip r720-0.holthome.net ]"
time="2021-02-24T23:49:20Z" level=warning msg="Cmdline of proc kubelet found: \"/usr/bin/kubelet\x00--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf\x00--kubeconfig=/etc/kubernetes/kubelet.conf\x00--config=/var/lib/kubelet/config.yaml\x00--container-runtime=remote\x00--container-runtime-endpoint=/run/containerd/containerd.sock\x00--node-ip\x00r720-0.holthome.net\x00\". But arg \"--root-dir\" not found. Hence default value will be used: \"/var/lib/kubelet\""
time="2021-02-24T23:49:20Z" level=info msg="Detected root dir path: /var/lib/kubelet"
time="2021-02-24T23:49:20Z" level=info msg="Upgrading Longhorn related components for CSI v1.1.0"
time="2021-02-24T23:49:20Z" level=debug msg="Detected CSI Driver driver.longhorn.io CSI version v1.1.0 Kubernetes version v1.19.8 has already been deployed"
time="2021-02-24T23:49:20Z" level=debug msg="Detected service csi-attacher CSI version v1.1.0 Kubernetes version v1.19.8 has already been deployed"
time="2021-02-24T23:49:20Z" level=debug msg="Detected deployment csi-attacher CSI version v1.1.0 Kubernetes version v1.19.8 has already been deployed"
time="2021-02-24T23:49:20Z" level=debug msg="Detected service csi-provisioner CSI version v1.1.0 Kubernetes version v1.19.8 has already been deployed"
time="2021-02-24T23:49:20Z" level=debug msg="Detected deployment csi-provisioner CSI version v1.1.0 Kubernetes version v1.19.8 has already been deployed"
time="2021-02-24T23:49:20Z" level=debug msg="Detected service csi-resizer CSI version v1.1.0 Kubernetes version v1.19.8 has already been deployed"
time="2021-02-24T23:49:20Z" level=debug msg="Detected deployment csi-resizer CSI version v1.1.0 Kubernetes version v1.19.8 has already been deployed"
time="2021-02-24T23:49:20Z" level=debug msg="Detected service csi-snapshotter CSI version v1.1.0 Kubernetes version v1.19.8 has already been deployed"
time="2021-02-24T23:49:20Z" level=debug msg="Detected deployment csi-snapshotter CSI version v1.1.0 Kubernetes version v1.19.8 has already been deployed"
time="2021-02-24T23:49:20Z" level=debug msg="Detected daemon set longhorn-csi-plugin CSI version v1.1.0 Kubernetes version v1.19.8 has already been deployed"
time="2021-02-24T23:49:20Z" level=debug msg="CSI deployment done"

@carpenike
Copy link

CSIDriver looks the same too:

ryan on ﴱ ryan-k8s in ~ ❯ kubectl get csidriver driver.longhorn.io -o yaml                                                                                                                                                                                                                                                                                          [18:51:59]
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  annotations:
    driver.longhorn.io/kubernetes-version: v1.19.8
    driver.longhorn.io/version: v1.1.0
  creationTimestamp: "2021-02-23T05:43:48Z"
  managedFields:
  - apiVersion: storage.k8s.io/v1beta1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:driver.longhorn.io/kubernetes-version: {}
          f:driver.longhorn.io/version: {}
      f:spec:
        f:attachRequired: {}
        f:podInfoOnMount: {}
        f:volumeLifecycleModes: {}
    manager: longhorn-manager
    operation: Update
    time: "2021-02-23T05:43:48Z"
  - apiVersion: storage.k8s.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        f:fsGroupPolicy: {}
    manager: kubectl-edit
    operation: Update
    time: "2021-02-24T17:38:28Z"
  name: driver.longhorn.io
  resourceVersion: "1510508"
  selfLink: /apis/storage.k8s.io/v1/csidrivers/driver.longhorn.io
  uid: de782a0c-3da5-49dd-8178-b1e4ba5a1133
spec:
  attachRequired: true
  podInfoOnMount: true
  volumeLifecycleModes:
  - Persistent

@PhanLe1010
Copy link
Contributor

Can you try:

kubectl proxy --port=8080
curl http://localhost:8080/version

and

kubectl version

@carpenike
Copy link

Figured it out -- the k8s kubelets had upgraded to 1.20.4 but the underlying API and services had not. Ran through the kubeadm upgrade process and now things are starting up as expected. Deleted the longhorn-driver and it's re-deploying everything.

https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/

This would likely not have been encountered with a fresh 1.20.x cluster.

PhanLe1010 added a commit to PhanLe1010/website that referenced this issue Feb 27, 2021
…inish mounting`

longhorn/longhorn#2131

Signed-off-by: Phan Le <phan.le@rancher.com>
@longhorn-io-github-bot
Copy link

longhorn-io-github-bot commented Feb 27, 2021

Pre-merged Checklist

  • Are the reproduce steps/test steps documented?

  • Is there a workaround for the issue? If so, is it documented?
    No for K8s v1.19-. Yes for K8s v1.20+

  • Does the PR include the explanation for the fix or the feature?

  • Does the PR include deployment change (YAML/Chart)? If so, have both YAML file and Chart been updated in the PR?

  • Is the backend code merged (Manager, Engine, Instance Manager, BackupStore etc)?
    The PR is at

  • Which areas/issues this PR might have potential impacts on?
    Area
    Issues

  • If labeled: require/LEP Has the Longhorn Enhancement Proposal PR submitted?
    The LEP PR is at

  • If labeled: area/ui Has the UI issue filed or ready to be merged?
    The UI issue/PR is at

  • If labeled: require/doc Has the necessary document PR submitted or merged?
    The Doc issue/PR is at Add KB for the issue volumes with lot of files take a long time to finish mounting website#265

  • If labeled: require/automation-e2e Has the end-to-end test plan been merged? Have QAs agreed on the automation test case?
    The automation skeleton PR is at
    The automation test case PR is at

  • If labeled: require/automation-engine Has the engine integration test been merged?
    The engine automation PR is at

  • If labeled: require/manual-test-plan Has the manual test plan been documented?
    The updated manual test plan is at

  • If the fix introduces code for backward compatibility, has a separate issue been filed with the label release/obsolete-compatibility?
    The compatibility issue is filed at

PhanLe1010 added a commit to PhanLe1010/website that referenced this issue Mar 6, 2021
…inish mounting`

longhorn/longhorn#2131

Signed-off-by: Phan Le <phan.le@rancher.com>
yasker pushed a commit to longhorn/website that referenced this issue Mar 6, 2021
…inish mounting`

longhorn/longhorn#2131

Signed-off-by: Phan Le <phan.le@rancher.com>
@khushboo-rancher khushboo-rancher self-assigned this Mar 10, 2021
@khushboo-rancher
Copy link
Contributor

Verified the scenarios from #2131 (comment); all the explanations hold true.

Validations - Pass

  1. The pod with fsGroupChangePolicy: "Always", or no fsGroupChangePolicy, or with different fsGroups, and csiDriver.spec.fsGroupPolicy: ReadWriteOnceWithFSType takes around 6 min to mount the path containing 500000 files. This confirms the recursive change of ownership and permissions.
  2. The pod with fsGroupChangePolicy: "OnRootMismatch" and csiDriver.spec.fsGroupPolicy: ReadWriteOnceWithFSType takes around 18 sec to mount the path containing 500000 files.
  3. If csiDriver.spec.fsGroupPolicy is set to None, the non-root users are unable to access the files in the mount path.
  4. If csiDriver.spec.fsGroupPolicy is set to File, the pod always takes around 6 min to mount the path irrespective of the fsGroupChangePolicy value.
  5. On upgrade of the cluster from K8s v1.19.8 to K8s v1.20.4, the fsGroupPolicy: ReadWriteOnceWithFSType gets updated in CSIDriver driver.longhorn.io.
