
[FEATURE] Support CSI fsGroupPolicy #2131

Closed
yasker opened this issue Dec 22, 2020 · 33 comments
Assignees
Labels
area/kubernetes Kubernetes related like K8s version compatibility component/longhorn-manager Longhorn manager (control plane) priority/0 Must be fixed in this release (managed by PO) require/doc Require updating the longhorn.io documentation require/knowledge-base Require adding knowledge base document
Milestone

Comments

@yasker
Copy link
Member

yasker commented Dec 22, 2020

https://kubernetes.io/blog/2020/12/14/kubernetes-release-1.20-fsgroupchangepolicy-fsgrouppolicy/

Ref:
#1221
#1842 (comment)

Note: A manual test plan is required because:

  1. We don't have a v1.20 test environment ready yet.
  2. We need to make sure we can speed up the fsGroup change.

Applicable Kubernetes version: v1.20+

@yasker yasker added kind/feature Feature request, new feature component/longhorn-manager Longhorn manager (control plane) area/driver area/kubernetes Kubernetes related like K8s version compatibility priority/0 Must be fixed in this release (managed by PO) require/auto-e2e-test Require adding/updating auto e2e test cases if they can be automated require/doc Require updating the longhorn.io documentation require/manual-test-plan Require adding/updating manual test cases if they can't be automated labels Dec 22, 2020
@yasker yasker added this to the v1.1.1 milestone Dec 22, 2020
@PhanLe1010
Copy link
Contributor

From the Kubernetes code perspective, it looks like everything should work fine.

There are 3 cases we need to consider:

  1. Kubernetes v1.19 with CSIVolumeFSGroupPolicy enabled
  2. Kubernetes v1.20 with CSIVolumeFSGroupPolicy disabled
  3. Kubernetes v1.20 with CSIVolumeFSGroupPolicy enabled

Case#1 and Case#3 should work because when CSIVolumeFSGroupPolicy is enabled, Kubernetes automatically fills in the default value for fsGroupPolicy: ReadWriteOnceWithFSType. Therefore, we don't hit this error.

Case#2 should also work because when CSIVolumeFSGroupPolicy is disabled, Kubernetes automatically clears the FSGroupPolicy field, but the kubelet checks the CSIVolumeFSGroupPolicy feature gate here and returns immediately. Therefore, we shouldn't see the error either.
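For anyone reproducing these three cases, one way to toggle the gate on a k3s test cluster is sketched below. The gate has to be toggled on both the API server (field defaulting) and the kubelet (mount behavior); the exact flags here are an assumption based on k3s' generic argument pass-through:

k3s server \
  --kube-apiserver-arg=feature-gates=CSIVolumeFSGroupPolicy=true \
  --kubelet-arg=feature-gates=CSIVolumeFSGroupPolicy=true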

@PhanLe1010
Copy link
Contributor

PhanLe1010 commented Dec 24, 2020

@khushboo-rancher already verified that Case#1 and Case#3 work as expected.
We are verifying Case#2.

@khushboo-rancher
Copy link
Contributor

khushboo-rancher commented Jan 4, 2021

Validated the basic Longhorn functionality with v1.20; it looks good.

As mentioned in #2131 (comment), Longhorn is validated with the cases below.

  1. Kubernetes v1.19 with CSIVolumeFSGroupPolicy enabled
  2. Kubernetes v1.20 with CSIVolumeFSGroupPolicy disabled
  3. Kubernetes v1.20 with CSIVolumeFSGroupPolicy enabled

Running the integration test on Case#2; will update with the result.

Update: The integration test looks good on a k3s v1.20 setup with CSIVolumeFSGroupPolicy disabled.

@joshimoo
Copy link
Contributor

joshimoo commented Jan 5, 2021

@PhanLe1010 the fuzzer you linked to is code used only for testing, as you can tell from the use of random values here :)
https://github.com/kubernetes/kubernetes/blob/af46c47ce925f4c4ad5cc8d1fca46c7b77d13b38/pkg/apis/storage/fuzzer/fuzzer.go#L43

So in the case where the feature gate is enabled and the driver object exists without a defined .Spec.FSGroupPolicy setting, we will trigger this error:
https://github.com/kubernetes/kubernetes/blob/af46c47ce925f4c4ad5cc8d1fca46c7b77d13b38/pkg/volume/csi/csi_plugin.go#L912

Do we define .Spec.FSGroupPolicy when creating the driver object?

@yasker yasker added the highlight Important feature/issue to highlight label Jan 8, 2021
@PhanLe1010
Copy link
Contributor

PhanLe1010 commented Jan 8, 2021

@joshimoo

the fuzzer you linked to is code used only for testing, as you can tell from the use of random values here :)
https://github.com/kubernetes/kubernetes/blob/af46c47ce925f4c4ad5cc8d1fca46c7b77d13b38/pkg/apis/storage/fuzzer/fuzzer.go#L43
So in the case where the feature gate is enabled and the driver object exists without a defined .Spec.FSGroupPolicy setting, we will trigger this error:
https://github.com/kubernetes/kubernetes/blob/af46c47ce925f4c4ad5cc8d1fca46c7b77d13b38/pkg/volume/csi/csi_plugin.go#L912

Thank you for pointing out the mistake in the code link. Yeah, it is testing code. The default values are filled in here instead: https://github.com/kubernetes/kubernetes/blob/af46c47ce925f4c4ad5cc8d1fca46c7b77d13b38/pkg/apis/storage/v1/defaults.go#L56. So it is not possible for the feature gate to be enabled while the driver object exists without a defined .Spec.FSGroupPolicy setting (I tested this behavior). Moreover, if K8s didn't fill in .Spec.FSGroupPolicy, it would break many existing CSI providers. I think the K8s team must have already considered that.

Do we define .Spec.FSGroupPolicy when creating the driver object?

No, we don't define it. We actually cannot define it, since we are using client-go v1.16, which doesn't have the .Spec.FSGroupPolicy field in the CSIDriverSpec struct. So, we depend on K8s to fill in the default value for .Spec.FSGroupPolicy.
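For the record, a quick way to confirm that the API server filled in the default (a minimal sketch; the CSIDriver name is the one Longhorn deploys):

# Should print "ReadWriteOnceWithFSType" on v1.20+ when Longhorn leaves the field unset
kubectl get csidriver driver.longhorn.io -o jsonpath='{.spec.fsGroupPolicy}'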

@PhanLe1010
Copy link
Contributor

Analysis

1. Running a pod as a non-root user and the problem

It is good practice to run pods as a non-root user to prevent bugs or malicious code from taking over the system. An example of a pod that runs as a non-root user could be:

apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
  volumes:
  - name: sec-ctx-vol
    emptyDir: {}
  containers:
  - name: sec-ctx-demo
    image: busybox
    command: [ "sh", "-c", "sleep 1h" ]
    volumeMounts:
    - name: sec-ctx-vol
      mountPath: /data/demo
    securityContext:
      allowPrivilegeEscalation: false

The above YAML file tells Kubernetes to run every process of every container inside the pod with user ID 1000 and primary group ID 3000. So far so good; processes are not running as root. However, there is a problem here. If the volume sec-ctx-vol contains files/directories that user ID 1000 and group ID 3000 don't have permission to access, the processes in the pod cannot access those files/directories. In the worst case, if the root of the volume sec-ctx-vol doesn't grant read/write permission to user ID 1000 and group ID 3000, no process can access the volume at all.

2. fsGroup and the problem

To solve the above problem, users can specify the field fsGroup as below:

spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000

Upon seeing the field fsGroup: 2000, Kubernetes will take the following actions:

  1. Make sure that all processes of the containers inside the pod are part of the supplementary group ID 2000
  2. Any new files created in the volume sec-ctx-vol will be owned by group ID 2000
  3. If the fsType of the PersistentVolume is defined and the PersistentVolume's accessModes is RWO, then Kubernetes will attempt to modify the volume's ownership and permissions (to group ID 2000 in this case) every time the volume is mounted to the pod.

So, adding the field fsGroup: 2000 solves the problem we saw in section 1. However, fsGroup introduces a new problem, as described in the K8s doc:

The side-effect of setting fsGroup is that, each time a volume is mounted, Kubernetes must recursively chown() and chmod() all the files and directories inside the volume - with a few exceptions noted below. This happens even if group ownership of the volume already matches the requested fsGroup, and can be pretty expensive for larger volumes with lots of small files, which causes pod startup to take a long time.

The bad news is that there is no workaround for this problem in K8s version v1.19 and before. The good news is that K8s v1.20 provides 2 options to solve this problem as discussed in the following sections.

3. pod.spec.securityContext.fsGroupChangePolicy

In section 1, K8s never modifies the volume's ownership and permissions before giving it to the pod. In section 2, K8s always recursively modifies the volume's ownership and permissions before giving it to the pod. Each approach has its own problem, as mentioned.

Kubernetes v1.20 introduces a new beta feature: the field fsGroupChangePolicy. When fsGroupChangePolicy is set to OnRootMismatch, the recursive permission and ownership change is skipped if the root of the volume already has the correct permissions. This means that if users don't change pod.spec.securityContext.fsGroup between pod restarts, K8s only has to check the permissions and ownership of the root, and the mounting process is much faster than always recursively changing the volume's ownership and permissions.

Note that K8s v1.20 still allows users to keep the previous behavior: by setting fsGroupChangePolicy to Always, K8s always changes the permissions and ownership of the volume when the volume is mounted.

Note: if users don't set fsGroupChangePolicy, the effect is the same as setting fsGroupChangePolicy to Always.
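For example, a minimal securityContext sketch using the new policy (the IDs are illustrative):

spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    # skip the recursive chown()/chmod() when the volume root already has group 2000
    fsGroupChangePolicy: "OnRootMismatch"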

4. csiDriver.spec.fsGroupPolicy

For certain volume types, such as NFS or Gluster, the cluster doesn't perform recursive permission changes even if the pod has an fsGroup. So, K8s v1.20 optimizes the permission-checking process even further by allowing CSI storage providers to explicitly indicate whether or not they support modifying a volume's ownership or permissions when the volume is being mounted.
CSI providers can specify the following values for csiDriver.spec.fsGroupPolicy:

  • None: Indicates that volumes will be mounted with no modifications, as the CSI volume driver does not support these operations.
  • File: Indicates that the CSI volume driver supports volume ownership and permission changes via fsGroup, and Kubernetes may use fsGroup to change the permissions and ownership of the volume to match the user-requested fsGroup in the pod's security policy, regardless of fstype or access mode.
  • ReadWriteOnceWithFSType: Indicates that volumes will be examined to determine if volume ownership and permissions should be modified to match the pod's security policy. Changes will only occur if the fsType is defined and the persistent volume's accessModes contains ReadWriteOnce.

Note:

  1. If undefined, csiDriver.spec.fsGroupPolicy defaults to ReadWriteOnceWithFSType, keeping the previous behavior.
  2. When csiDriver.spec.fsGroupPolicy is set to None or File, it overrides the pod.spec.securityContext.fsGroupChangePolicy setting. Users of the CSI provider cannot control the volume's permission-checking process in those cases.
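For reference, this is roughly what pinning the policy in a CSIDriver object would look like (a sketch only; as concluded below, Longhorn deliberately leaves the field unset and lets K8s default it):

apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: driver.longhorn.io
spec:
  attachRequired: true
  podInfoOnMount: true
  # setting this to None or File would override fsGroupChangePolicy in pods
  fsGroupPolicy: ReadWriteOnceWithFSType
  volumeLifecycleModes:
  - Persistent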

Conclusion

  1. As tested in [FEATURE] Support CSI fsGroupPolicy #2131 (comment), Longhorn works with Kubernetes v1.20 by default. Also, Kubernetes automatically sets csiDriver.spec.fsGroupPolicy=ReadWriteOnceWithFSType for the Longhorn CSI driver since Longhorn doesn't specify it.
  2. Longhorn should not change csiDriver.spec.fsGroupPolicy to a value other than the default ReadWriteOnceWithFSType because Longhorn only provides block device storage. Based on the filesystem and volume access mode, users of Longhorn block devices should be able to control the volume's permission-checking process by setting pod.spec.securityContext.fsGroupChangePolicy. If Longhorn set csiDriver.spec.fsGroupPolicy to None or File, it would override the pod.spec.securityContext.fsGroupChangePolicy setting.

In a nutshell, Longhorn doesn't need to do anything to support csiDriver.spec.fsGroupPolicy and pod.spec.securityContext.fsGroupChangePolicy in Kubernetes v1.20.


Testing

The above analysis is supported by the linked documents and the below tests.

1. Running the pod as non-root without setting pod.spec.securityContext.fsGroup

  1. Remove the read, write, and execute permissions for other users on the Longhorn volume's root, i.e., change the permissions of the Longhorn volume's root to drwxr-x---. To do that, we will first run the following pod as root:
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: sec-ctx-vol-pvc
    spec:
      accessModes:
        - ReadWriteOnce
      volumeMode: Filesystem
      storageClassName: longhorn
      resources:
        requests:
          storage: 30Gi  
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: security-context-demo
      namespace: default
    spec:
      restartPolicy: Always
      containers:
      - name: sec-ctx-demo
        command: [ "sh", "-c", "sleep 1h" ]
        image: ubuntu
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: sec-ctx-vol
          mountPath: /data/demo
      volumes:
      - name: sec-ctx-vol
        persistentVolumeClaim:
          claimName: sec-ctx-vol-pvc
    
  2. Exec into the pod and change the permissions of the volume root:
    root@security-context-demo:/data# id
    uid=0(root) gid=0(root) groups=0(root)
    root@security-context-demo:~# cd /data/
    root@security-context-demo:/data# ls -l
    total 4
    drwxr-xr-x 3 root root 4096 Feb 15 00:05 demo
    root@security-context-demo:/data# ls demo/ -l
    total 16
    drwx------ 2 root root 16384 Feb 15 00:05 lost+found
    root@security-context-demo:/data# chmod 750 demo/
    root@security-context-demo:/data# ls -l
    total 4
    drwxr-x--- 3 root root 4096 Feb 15 00:05 demo
    
  3. Now delete the above pod and re-deploy it as a non-root user by adding runAsUser and runAsGroup. Note that we don't add fsGroup:
    apiVersion: v1
    kind: Pod
    metadata:
      name: security-context-demo
      namespace: default
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 3000
      restartPolicy: Always
      containers:
      - name: sec-ctx-demo
        command: [ "sh", "-c", "sleep 1h" ]
        image: ubuntu
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: sec-ctx-vol
          mountPath: /data/demo
      volumes:
      - name: sec-ctx-vol
        persistentVolumeClaim:
          claimName: sec-ctx-vol-pvc
    
  4. Exec into the pod. Observe that we cannot access the directory /data/demo as user ID 1000 with primary group ID 3000:
    I have no name!@security-context-demo:/$ id
    uid=1000 gid=3000 groups=3000
    I have no name!@security-context-demo:/$ ls /data/ -l
    total 4
    drwxr-x--- 3 root root 4096 Feb 15 00:05 demo
    I have no name!@security-context-demo:/$ ls /data/demo -l
    ls: cannot open directory '/data/demo': Permission denied
    I have no name!@security-context-demo:/$ touch /data/demo/file.txt   
    touch: cannot touch '/data/demo/file.txt': Permission denied
    

2. Running the pod as non-root and setting pod.spec.securityContext.fsGroup

  1. Delete the above pod. Re-deploy the pod and add fsGroup: 2000 as below:
    apiVersion: v1
    kind: Pod
    metadata:
      name: security-context-demo
      namespace: default
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 3000
        fsGroup: 2000
      restartPolicy: Always
      containers:
      - name: sec-ctx-demo
        command: [ "sh", "-c", "sleep 1h" ]
        image: ubuntu
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: sec-ctx-vol
          mountPath: /data/demo
      volumes:
      - name: sec-ctx-vol
        persistentVolumeClaim:
          claimName: sec-ctx-vol-pvc
    
  2. Exec into the pod. We can see that Kubernetes performed the 3 actions mentioned in section 2 of the analysis above:
    I have no name!@security-context-demo:/$ id
    uid=1000 gid=3000 groups=3000,2000
    I have no name!@security-context-demo:/$ ls /data/ -l
    total 4
    drwxrws--- 3 root 2000 4096 Feb 15 00:05 demo
    I have no name!@security-context-demo:/$ cd /data/demo/
    I have no name!@security-context-demo:/data/demo$ ls -l
    total 16
    drwxrws--- 2 root 2000 16384 Feb 15 00:05 lost+found
    I have no name!@security-context-demo:/data/demo$ echo "hello world" > file.txt
    I have no name!@security-context-demo:/data/demo$ ls -l
    total 20
    -rw-r--r-- 1 1000 2000    12 Feb 15 00:46 file.txt
    drwxrws--- 2 root 2000 16384 Feb 15 00:05 lost+found
    
  3. One key takeaway is that Kubernetes recursively changed the ownership and permissions of every file/directory in the Longhorn volume.
  4. Now, to see the problem with recursively changing ownership and permissions, let's create a lot of files inside /data/demo/files. We can use the following script to generate the files:
    #!/bin/bash
    file_prefix="longhorn-file-"

    while [[ $# -gt 0 ]]; do
            key="$1"
            case $key in
                    -c|--count)
                    count="$2"
                    shift # past argument
                    shift # past value
                    ;;
                    -h|--help)
                    help="true"
                    shift
                    ;;
                    *)
                    echo "Error! invalid flag: ${key}"
                    help="true"
                    break
                    ;;
            esac
    done

    usage () {
            echo "USAGE: $0 --count 10000"
            echo "  [-c|--count] number of files to be created inside the current folder"
            echo "  [-h|--help] Usage message"
    }

    if [[ $help ]]; then
            usage
            exit 0
    fi

    set -e -x

    if [[ $count ]]; then
            i="0"
            while [ "$i" -lt "$count" ]; do
                    # create one 512-byte file of random data per iteration
                    dd if=/dev/urandom of="${file_prefix}${i}" count=1 bs=512
                    i=$((i+1))
            done
    fi
    
  5. Copy the script into /data/demo/files/create-file.sh:
    kubectl cp create-file.sh default/security-context-demo:data/demo/files
    
  6. Create 500000 files of 512 bytes inside /data/demo/files:
    I have no name!@security-context-demo:/$ cd /data/demo/files/
    I have no name!@security-context-demo:/data/demo/files$ ls -l
    total 4
    -rwxr-xr-x 1 1000 2000 663 Feb 15 01:02 create-file.sh
    I have no name!@security-context-demo:/data/demo/files$ ./create-file.sh --count 500000
    
  7. Let's see how long it takes Kubernetes to remount the Longhorn volume. Delete the pod and re-deploy it as in step 1 of this test case. Observe that it takes 3m30s for Kubernetes to finish mounting the Longhorn volume (one way to measure this is sketched after this list).
  8. Observe that it takes 15 minutes for Kubernetes to finish mounting the Longhorn volume if we create 2 million files inside /data/demo/files/.
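A client-side way to time the remount (a sketch; kubectl wait blocks until the pod is Ready, which includes the volume mount and any recursive chown):

time kubectl wait --for=condition=Ready pod/security-context-demo --timeout=30m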

3. Set pod.spec.securityContext.fsGroupChangePolicy=OnRootMismatch

  1. Delete the pod. Re-deploy it with pod.spec.securityContext.fsGroupChangePolicy=OnRootMismatch:
    apiVersion: v1
    kind: Pod
    metadata:
      name: security-context-demo
      namespace: default
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 3000
        fsGroup: 2000
        fsGroupChangePolicy: "OnRootMismatch"
      restartPolicy: Always
      containers:
      - name: sec-ctx-demo
        command: [ "sh", "-c", "sleep 1h" ]
        image: ubuntu
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: sec-ctx-vol
          mountPath: /data/demo
      volumes:
      - name: sec-ctx-vol
        persistentVolumeClaim:
          claimName: sec-ctx-vol-pvc
    
  2. Observe that it takes only 13s for Kubernetes to finish mounting the Longhorn volume and for the pod to reach the running state. This is because we didn't change pod.spec.securityContext.fsGroup, so it matches the volume root's group ownership and Kubernetes doesn't recursively perform the chown() and chmod().
  3. Let's change the fsGroup to 5000, delete the pod, and re-deploy it as:
    apiVersion: v1
    kind: Pod
    metadata:
      name: security-context-demo
      namespace: default
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 3000
        fsGroup: 5000
        fsGroupChangePolicy: "OnRootMismatch"
      restartPolicy: Always
      containers:
      - name: sec-ctx-demo
        command: [ "sh", "-c", "sleep 1h" ]
        image: ubuntu
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: sec-ctx-vol
          mountPath: /data/demo
      volumes:
      - name: sec-ctx-vol
        persistentVolumeClaim:
          claimName: sec-ctx-vol-pvc
    
  4. Because the fsGroup doesn't match the volume root's group ownership, Kubernetes recursively performs the chown() and chmod(). Observe that it takes 3m30s for Kubernetes to finish mounting the Longhorn volume.

4. Test csiDriver.spec.fsGroupPolicy

So far we have seen the behavior of Kubernetes when csiDriver.spec.fsGroupPolicy is set to ReadWriteOnceWithFSType, which is the default value. Let's see what happens when csiDriver.spec.fsGroupPolicy is set to None, which indicates that volumes will be mounted with no modifications.

  1. Delete the pod from the previous step. Delete the PVC sec-ctx-vol-pvc which will automatically delete the corresponding PV and Longhorn volume.

  2. Delete the Longhorn CSI driver CR:

    kubectl delete csidriver driver.longhorn.io
    
  3. Re-create the Longhorn CSI driver CR with spec.fsGroupPolicy set to None:

    apiVersion: storage.k8s.io/v1
    kind: CSIDriver
    metadata:
      annotations:
        driver.longhorn.io/kubernetes-version: v1.20.2+k3s1
        driver.longhorn.io/version: v1.1.0
      creationTimestamp: "2021-02-11T20:01:07Z"
      name: driver.longhorn.io
      resourceVersion: "50138"
      uid: 92358d9f-f82c-4157-975e-ec0120955591
    spec:
      attachRequired: true
      fsGroupPolicy: None
      podInfoOnMount: true
      volumeLifecycleModes:
      - Persistent
    
  4. Deploy the following PVC and pod:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: sec-ctx-vol-pvc
    spec:
      accessModes:
        - ReadWriteOnce
      volumeMode: Filesystem
      storageClassName: longhorn
      resources:
        requests:
          storage: 30Gi  
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: security-context-demo
      namespace: default
    spec:
      restartPolicy: Always
      containers:
      - name: sec-ctx-demo
        command: [ "sh", "-c", "sleep 1h" ]
        image: ubuntu
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: sec-ctx-vol
          mountPath: /data/demo
      volumes:
      - name: sec-ctx-vol
        persistentVolumeClaim:
          claimName: sec-ctx-vol-pvc
    
  5. Exec into the pod. Remove the read, write, and execute permissions for other users on the Longhorn volume's root, i.e., change the permissions of the volume root to drwxr-x---:

    root@security-context-demo:/data# id
    uid=0(root) gid=0(root) groups=0(root)
    root@security-context-demo:~# cd /data/
    root@security-context-demo:/data# chmod 750 demo/
    root@security-context-demo:/data# ls -l
    total 4
    drwxr-x--- 3 root root 4096 Feb 15 00:05 demo
    
  6. Now delete the above pod and re-deploy it as:

    apiVersion: v1
    kind: Pod
    metadata:
      name: security-context-demo
      namespace: default
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 3000
        fsGroup: 2000
        fsGroupChangePolicy: "Always"
      restartPolicy: Always
      containers:
      - name: sec-ctx-demo
        command: [ "sh", "-c", "sleep 1h" ]
        image: ubuntu
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: sec-ctx-vol
          mountPath: /data/demo
      volumes:
      - name: sec-ctx-vol
        persistentVolumeClaim:
          claimName: sec-ctx-vol-pvc
    
  7. Exec into the pod. Observe that we cannot access the directory /data/demo as user ID 1000 with primary group ID 3000 and supplementary group ID 2000:

    I have no name!@security-context-demo:/$ id
    uid=1000 gid=3000 groups=3000,2000
    I have no name!@security-context-demo:/$ ls /data/ -l
    total 4
    drwxr-x--- 3 root root 4096 Feb 15 00:05 demo
    I have no name!@security-context-demo:/$ ls /data/demo -l
    ls: cannot open directory '/data/demo': Permission denied
    I have no name!@security-context-demo:/$ touch /data/demo/file.txt   
    touch: cannot touch '/data/demo/file.txt': Permission denied
    

We can see that setting csiDriver.spec.fsGroupPolicy to None overrides the pod.spec.securityContext.fsGroupChangePolicy setting. From Longhorn's perspective, we don't want to override this, as explained in section 4 of the analysis above.
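To restore the default after this experiment, one option (a sketch, consistent with the workaround discussed later in this thread; the deployer pod label is an assumption) is to delete the modified CSIDriver CR and let the longhorn-driver-deployer recreate it:

kubectl delete csidriver driver.longhorn.io
# assuming the deployer pod carries an app=longhorn-driver-deployer label
kubectl -n longhorn-system delete pod -l app=longhorn-driver-deployer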

@yasker
Copy link
Member Author

yasker commented Feb 16, 2021

Thanks, @PhanLe1010, it's a well-written explanation.

I also feel that we should put this into either our documentation or the KB.

After that, I will leave it for QA to verify the result, then we can close it.

@yasker yasker added require/knowledge-base Require adding knowledge base document and removed kind/feature Feature request, new feature require/auto-e2e-test Require adding/updating auto e2e test cases if they can be automated require/manual-test-plan Require adding/updating manual test cases if they can't be automated labels Feb 16, 2021
@yasker
Copy link
Member Author

yasker commented Feb 18, 2021

FYI: During the team meeting, we decided that we should write a KB article for the issue.

@yasker yasker removed the highlight Important feature/issue to highlight label Feb 23, 2021
@carpenike
Copy link

Hey Folks,

I'm running into this problem and it seems as though something needs to be set in the csidriver. Even with the securityContext defined in the pod, I'm still getting the error kubernetes.io/csi: expected valid fsGroupPolicy, received nil value or empty string.

I've got this defined in spec.securityContext:

  securityContext:
    fsGroup: 1000
    runAsGroup: 1000
    runAsUser: 1000

@PhanLe1010
Copy link
Contributor

@carpenike Which Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version are you using? Also, which Longhorn version?

@carpenike
Copy link

@PhanLe1010 --

Kubeadm deployed on top of Ubuntu. Longhorn 1.1.0.

ryan@h2-0:~$ sudo kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2", GitCommit:"faecb196815e248d3ecfb03c680a4507229c2a56", GitTreeState:"clean", BuildDate:"2021-01-13T13:25:59Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}

@carpenike
Copy link

Yup! They're Ubuntu nodes so it's just an apt update process. The kubelet restarts on each node and reports in to the control plane with the updated version.

@carpenike
Copy link

Should I just remove the driver and re-create it?

@PhanLe1010
Copy link
Contributor

PhanLe1010 commented Feb 24, 2021

@carpenike Yes, the workaround is as simple as that. You can also just delete the pod longhorn-driver-deployer-xxxx and it will be restarted and recreate the CSIDriver CR.

However, could you hold on a moment? We would like to learn how to reproduce the issue. It would be great if you could help us.
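For reference, a sketch of that workaround (assuming the deployer pod carries an app=longhorn-driver-deployer label; the selector is an assumption):

# restart the deployer so it re-detects the server version and recreates the CSIDriver CR
kubectl -n longhorn-system delete pod -l app=longhorn-driver-deployer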

@carpenike
Copy link

Sure no problem. :)

@PhanLe1010
Copy link
Contributor

PhanLe1010 commented Feb 24, 2021

Thanks. What is your upgrade order?
Is it

  1. worker nodes (drain, upgrade kubelet and kubectl, uncordon) -> control plane nodes
    or
  2. control plane nodes -> worker nodes (drain, upgrade kubelet and kubectl, uncordon)

@carpenike
Copy link

Generally the first one.

@PhanLe1010
Copy link
Contributor

PhanLe1010 commented Feb 24, 2021

Ok, that explains why Longhorn is saying that your Kubernetes version is v1.19.8 in the CSIDriver CR. At the time the longhorn-driver-deployer pod started, it saw that the K8s server had the old v1.19 version. Later on, when the K8s server was upgraded to v1.20, the longhorn-driver-deployer pod was not restarted, so it didn't see the new v1.20 version and didn't update the CSIDriver CR.

@PhanLe1010
Copy link
Contributor

PhanLe1010 commented Feb 24, 2021

Can you try the workaround to see if the CSIDriver CR is updated with the new K8s version and has the fsGroupPolicy field?

delete the pod longhorn-driver-deployer-xxxx and it will be restarted and recreate the CSIDrive CR.

@carpenike
Copy link

Done. It seems like longhorn is caching the k8s version?

ryan on ﴱ ryan-k8s in ~ ❯ kubectl logs -n longhorn-system longhorn-driver-deployer-666c84fbb7-5b4qp                                                                                                                                                                                                                                                                 [18:50:46]
W0224 23:49:14.536576       1 client_config.go:541] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
time="2021-02-24T23:49:14Z" level=debug msg="Deploying CSI driver"
time="2021-02-24T23:49:14Z" level=debug msg="proc cmdline detection pod discover-proc-kubelet-cmdline in phase: Pending"
time="2021-02-24T23:49:15Z" level=debug msg="proc cmdline detection pod discover-proc-kubelet-cmdline in phase: Pending"
time="2021-02-24T23:49:16Z" level=debug msg="proc cmdline detection pod discover-proc-kubelet-cmdline in phase: Pending"
time="2021-02-24T23:49:17Z" level=debug msg="proc cmdline detection pod discover-proc-kubelet-cmdline in phase: Pending"
time="2021-02-24T23:49:18Z" level=debug msg="proc cmdline detection pod discover-proc-kubelet-cmdline in phase: Running"
time="2021-02-24T23:49:19Z" level=debug msg="proc cmdline detection pod discover-proc-kubelet-cmdline in phase: Running"
time="2021-02-24T23:49:20Z" level=info msg="Proc found: kubelet"
time="2021-02-24T23:49:20Z" level=info msg="Try to find arg [--root-dir] in cmdline: [/usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock --node-ip r720-0.holthome.net ]"
time="2021-02-24T23:49:20Z" level=warning msg="Cmdline of proc kubelet found: \"/usr/bin/kubelet\x00--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf\x00--kubeconfig=/etc/kubernetes/kubelet.conf\x00--config=/var/lib/kubelet/config.yaml\x00--container-runtime=remote\x00--container-runtime-endpoint=/run/containerd/containerd.sock\x00--node-ip\x00r720-0.holthome.net\x00\". But arg \"--root-dir\" not found. Hence default value will be used: \"/var/lib/kubelet\""
time="2021-02-24T23:49:20Z" level=info msg="Detected root dir path: /var/lib/kubelet"
time="2021-02-24T23:49:20Z" level=info msg="Upgrading Longhorn related components for CSI v1.1.0"
time="2021-02-24T23:49:20Z" level=debug msg="Detected CSI Driver driver.longhorn.io CSI version v1.1.0 Kubernetes version v1.19.8 has already been deployed"
time="2021-02-24T23:49:20Z" level=debug msg="Detected service csi-attacher CSI version v1.1.0 Kubernetes version v1.19.8 has already been deployed"
time="2021-02-24T23:49:20Z" level=debug msg="Detected deployment csi-attacher CSI version v1.1.0 Kubernetes version v1.19.8 has already been deployed"
time="2021-02-24T23:49:20Z" level=debug msg="Detected service csi-provisioner CSI version v1.1.0 Kubernetes version v1.19.8 has already been deployed"
time="2021-02-24T23:49:20Z" level=debug msg="Detected deployment csi-provisioner CSI version v1.1.0 Kubernetes version v1.19.8 has already been deployed"
time="2021-02-24T23:49:20Z" level=debug msg="Detected service csi-resizer CSI version v1.1.0 Kubernetes version v1.19.8 has already been deployed"
time="2021-02-24T23:49:20Z" level=debug msg="Detected deployment csi-resizer CSI version v1.1.0 Kubernetes version v1.19.8 has already been deployed"
time="2021-02-24T23:49:20Z" level=debug msg="Detected service csi-snapshotter CSI version v1.1.0 Kubernetes version v1.19.8 has already been deployed"
time="2021-02-24T23:49:20Z" level=debug msg="Detected deployment csi-snapshotter CSI version v1.1.0 Kubernetes version v1.19.8 has already been deployed"
time="2021-02-24T23:49:20Z" level=debug msg="Detected daemon set longhorn-csi-plugin CSI version v1.1.0 Kubernetes version v1.19.8 has already been deployed"
time="2021-02-24T23:49:20Z" level=debug msg="CSI deployment done"

@carpenike
Copy link

CSIDriver looks the same too:

ryan on ﴱ ryan-k8s in ~ ❯ kubectl get csidriver driver.longhorn.io -o yaml                                                                                                                                                                                                                                                                                          [18:51:59]
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  annotations:
    driver.longhorn.io/kubernetes-version: v1.19.8
    driver.longhorn.io/version: v1.1.0
  creationTimestamp: "2021-02-23T05:43:48Z"
  managedFields:
  - apiVersion: storage.k8s.io/v1beta1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:driver.longhorn.io/kubernetes-version: {}
          f:driver.longhorn.io/version: {}
      f:spec:
        f:attachRequired: {}
        f:podInfoOnMount: {}
        f:volumeLifecycleModes: {}
    manager: longhorn-manager
    operation: Update
    time: "2021-02-23T05:43:48Z"
  - apiVersion: storage.k8s.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        f:fsGroupPolicy: {}
    manager: kubectl-edit
    operation: Update
    time: "2021-02-24T17:38:28Z"
  name: driver.longhorn.io
  resourceVersion: "1510508"
  selfLink: /apis/storage.k8s.io/v1/csidrivers/driver.longhorn.io
  uid: de782a0c-3da5-49dd-8178-b1e4ba5a1133
spec:
  attachRequired: true
  podInfoOnMount: true
  volumeLifecycleModes:
  - Persistent

@PhanLe1010
Copy link
Contributor

Can you try:

kubectl proxy --port=8080
curl http://localhost:8080/version

and

kubectl version

@carpenike
Copy link

Figured it out -- the k8s kubelets had upgraded to 1.20.4 but the underlying API and services had not. Ran through the kubeadm upgrade process and now things are starting up as expected. Deleted the longhorn-driver and it's re-deploying everything.

https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/

This would likely not have been encountered with a fresh 1.20.x cluster.

PhanLe1010 added a commit to PhanLe1010/website that referenced this issue Feb 27, 2021
…inish mounting`

longhorn/longhorn#2131

Signed-off-by: Phan Le <phan.le@rancher.com>
@longhorn-io-github-bot
Copy link

longhorn-io-github-bot commented Feb 27, 2021

Pre-merged Checklist

  • Are the reproduce steps/test steps documented?

  • Is there a workaround for the issue? If so, is it documented?
    No for K8s v1.19-. Yes for K8s v1.20+

  • Does the PR include the explanation for the fix or the feature?

  • Does the PR include deployment change (YAML/Chart)? If so, have both YAML file and Chart been updated in the PR?

  • Is the backend code merged (Manager, Engine, Instance Manager, BackupStore etc)?
    The PR is at

  • Which areas/issues this PR might have potential impacts on?
    Area
    Issues

  • If labeled: require/LEP Has the Longhorn Enhancement Proposal PR submitted?
    The LEP PR is at

  • If labeled: area/ui Has the UI issue filed or ready to be merged?
    The UI issue/PR is at

  • If labeled: require/doc Has the necessary document PR submitted or merged?
    The Doc issue/PR is at Add KB for the issue volumes with lot of files take a long time to finish mounting website#265

  • If labeled: require/automation-e2e Has the end-to-end test plan been merged? Have QAs agreed on the automation test case?
    The automation skeleton PR is at
    The automation test case PR is at

  • If labeled: require/automation-engine Has the engine integration test been merged?
    The engine automation PR is at

  • If labeled: require/manual-test-plan Has the manual test plan been documented?
    The updated manual test plan is at

  • If the fix introduces code for backward compatibility, has a separate issue been filed with the label release/obsolete-compatibility?
    The compatibility issue is filed at

PhanLe1010 added a commit to PhanLe1010/website that referenced this issue Mar 6, 2021
…inish mounting`

longhorn/longhorn#2131

Signed-off-by: Phan Le <phan.le@rancher.com>
yasker pushed a commit to longhorn/website that referenced this issue Mar 6, 2021
…inish mounting`

longhorn/longhorn#2131

Signed-off-by: Phan Le <phan.le@rancher.com>
@khushboo-rancher khushboo-rancher self-assigned this Mar 10, 2021
@khushboo-rancher
Copy link
Contributor

Verified the scenarios from #2131 (comment); all the explanations hold true.

Validations - Pass

  1. The pod with fsGroupChangePolicy: "Always", or no fsGroupChangePolicy, or with different fsGroups, and csiDriver.spec.fsGroupPolicy: ReadWriteOnceWithFSType takes around 6 min to mount the path containing 500000 files. This confirms the recursive change of ownership and permissions.
  2. The pod with fsGroupChangePolicy: "OnRootMismatch" and csiDriver.spec.fsGroupPolicy: ReadWriteOnceWithFSType takes around 18 sec to mount the path containing 500000 files.
  3. If csiDriver.spec.fsGroupPolicy is set to None, the non-root users are unable to access the files in the mount path.
  4. If csiDriver.spec.fsGroupPolicy is set to File, the pod always takes around 6 min to mount the path irrespective of the fsGroupChangePolicy value.
  5. On upgrade of the cluster from K8s v1.19.8 to K8s v1.20.4, the fsGroupPolicy: ReadWriteOnceWithFSType gets updated in CSIDriver driver.longhorn.io.
