
[BUG] Pod mount took a long time even PV/PVC bound #2590

Closed
jenting opened this issue May 13, 2021 · 11 comments
Labels: area/csi, area/kubernetes, area/stability, component/longhorn-manager, investigation-needed, kind/bug, priority/0, require/manual-test-plan, severity/3

Comments

@jenting
Contributor

jenting commented May 13, 2021

Describe the bug

Pod mount takes a long time even though the PV/PVC are bound.
The way to troubleshoot it is to check the kubelet log, which generally shows what is happening.

To Reproduce

I can make it happen when longhorn-manager is busy handling requests. My setup is 1 control-plane node + 3 worker nodes, with Longhorn running on all 4 nodes on k3s v1.21.0+k3s1.

Steps to reproduce the behavior:

  1. Create lots of storage classes (around 30), each with a 1-minute recurring job (cron * * * * *); a sketch of one such StorageClass follows the log excerpts below.
  2. Create Pods that use these storage classes.
  3. The PV/PVC bind successfully, but the Pod takes a long time to mount the volume, or never manages to mount it.
  4. In the kubelet log, you will see either
2021-05-12T05:03:40.341728+00:00 jenting-longhorn-master-0 k3s[17498]: E0512 05:03:40.341637   17498 remote_runtime.go:394] "ExecSync cmd from runtime service failed" err="rpc error: code = DeadlineExceeded desc = failed to exec in container: timeout 1s exceeded: context deadline exceeded" containerID="2772def98866331ae76c100ab412ce1961a1e35e19bf4f9f9345077a0d733f9b" cmd=[sh -c ls /data/longhorn && /data/longhorn version --client-only]

or

2021-05-12T04:41:01.854806+00:00 jenting-longhorn-master-0 k3s[17498]: I0512 04:41:01.854710   17498 reconciler.go:295] attacherDetacher.AttachVolume started for volume "pvc-2f923abe-9a0d-4262-8d71-a85e1245724a" (UniqueName: "kubernetes.io/csi/driver.longhorn.io^pvc-2f923abe-9a0d-4262-8d71-a85e1245724a") from node "jenting-longhorn-master-0"
2021-05-12T04:41:02.442548+00:00 jenting-longhorn-master-0 k3s[17498]: E0512 04:41:02.441322   17498 csi_attacher.go:698] kubernetes.io/csi: attachment for pvc-2f923abe-9a0d-4262-8d71-a85e1245724a failed: rpc error: code = DeadlineExceeded desc = Failed to attach volume pvc-2f923abe-9a0d-4262-8d71-a85e1245724a to node jenting-longhorn-master-0
2021-05-12T04:41:02.442891+00:00 jenting-longhorn-master-0 k3s[17498]: E0512 04:41:02.441429   17498 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/csi/driver.longhorn.io^pvc-2f923abe-9a0d-4262-8d71-a85e1245724a podName: nodeName:}" failed. No retries permitted until 2021-05-12 04:43:04.441401417 +0000 UTC m=+71475.522372066 (durationBeforeRetry 2m2s). Error: "AttachVolume.Attach failed for volume \"pvc-2f923abe-9a0d-4262-8d71-a85e1245724a\" (UniqueName: \"kubernetes.io/csi/driver.longhorn.io^pvc-2f923abe-9a0d-4262-8d71-a85e1245724a\") from node \"jenting-longhorn-master-0\" : rpc error: code = DeadlineExceeded desc = Failed to attach volume pvc-2f923abe-9a0d-4262-8d71-a85e1245724a to node jenting-longhorn-master-0"
2021-05-12T04:41:02.443059+00:00 jenting-longhorn-master-0 k3s[17498]: I0512 04:41:02.441842   17498 event.go:291] "Event occurred" object="default/volume-test-1" kind="Pod" apiVersion="v1" type="Warning" reason="FailedAttachVolume" message="AttachVolume.Attach failed for volume \"pvc-2f923abe-9a0d-4262-8d71-a85e1245724a\" : rpc error: code = DeadlineExceeded desc = Failed to attach volume pvc-2f923abe-9a0d-4262-8d71-a85e1245724a to node jenting-longhorn-master-0"
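
For reference, a minimal sketch of one such StorageClass, assuming the v1.1-era recurringJobs StorageClass parameter (the class name, job name, and retain count below are placeholders):

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-recurring-01
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880"
  # assumed v1.1-style recurring job definition: one snapshot job every minute
  recurringJobs: '[{"name":"snap-1m", "task":"snapshot", "cron":"* * * * *", "retain":1}]'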

Expected behavior

The Pod should be able to mount the volume quickly.

Log
If applicable, add the Longhorn managers' log when the issue happens.

You can also attach a Support Bundle here. You can generate a Support Bundle using the link at the footer of the Longhorn UI.

Environment:

  • Longhorn version: v1.1.1
  • Installation method (e.g. Rancher Catalog App/Helm/Kubectl): Rancher Catalog App
  • Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: k3s v1.21.0+k3s1
    • Number of management node in the cluster: 1
    • Number of worker node in the cluster: 3
  • Node config
    • OS type and version: SLES15 SP2
    • CPU per node: 4
    • Memory per node: 8G
    • Disk type(e.g. SSD/NVMe): SSD
    • Network bandwidth between the nodes:
  • Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal): VMWare
  • Number of Longhorn volumes in the cluster: 0

Additional context

Since longhorn-manager is the control plane of the Longhorn cluster, I'm thinking about how to keep it from being affected by things like recurring jobs that run very frequently.

For example, if a recurring job is scheduled every 3 minutes but takes 5 minutes to finish, this leads to resource starvation. When we launch a new run of a recurring job, we should make sure the volume has no previous run still in progress.

Furthermore, the snapshot backup currently runs inside the longhorn-manager Pod; we should make it run as a separate job so it does not occupy longhorn-manager's memory.

Or, the simplest way is to run the longhorn-manager Pods on the control-plane nodes while the other Longhorn components run on the worker nodes, e.g. via a node selector as sketched below.
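
A rough illustration of that placement idea (not Longhorn's actual chart values; the control-plane label and toleration below are assumptions and vary by distro):

spec:
  template:
    spec:
      # hypothetical: pin the Pods to control-plane nodes
      nodeSelector:
        node-role.kubernetes.io/control-plane: "true"
      tolerations:
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule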

Furthermore, if possible, we should enhance the Longhorn support bundle to collect the host's kubelet log as well.

@jenting jenting changed the title [TASBUG] Pod mount took a long time even PV/PVC bound [BUG] Pod mount took a long time even PV/PVC bound May 13, 2021
@innobead innobead added area/stability System or volume stability component/longhorn-manager Longhorn manager (control plane) labels May 17, 2021
@surenraju-zz

I am adding the same comment from #2583

I notice this issue often. On random nodes, the CSI plugin gets stuck on the mount command. I also noticed that after restarting the longhorn-csi-plugin-____ Pod for the problematic node, the issue gets resolved, but it appears on another node after a few days.

Could this be related to the amount of data (about 10 GB) the RWX volume holds, or to the number of Pods associated with this volume? We are using RWX volumes with Argo Workflows: hundreds of Pods are launched against a specific RWX volume, run to complete some steps, and then new Pods are launched. All those hundreds of Pods consume the same RWX volume.

@innobead
Member

cc @joshimoo

@innobead innobead added the area/csi CSI related like control/node driver, sidecars label May 26, 2021
@innobead
Member

@khushboo-rancher please help reproduce this one and attach the support bundle as a debugging reference.

cc @kaxing

@jenting
Contributor Author

jenting commented Jun 8, 2021

Besides that, I encountered another issue on Kubernetes v1.21.0.

2021-05-12T05:03:40.341728+00:00 jenting-longhorn-master-0 k3s[17498]: E0512 05:03:40.341637   17498 remote_runtime.go:394] "ExecSync cmd from runtime service failed" err="rpc error: code = DeadlineExceeded desc = failed to exec in container: timeout 1s exceeded: context deadline exceeded" containerID="2772def98866331ae76c100ab412ce1961a1e35e19bf4f9f9345077a0d733f9b" cmd=[sh -c ls /data/longhorn && /data/longhorn version --client-only]

The workaround is to increase readinessProbe.timeoutSeconds on the engine-image-ei-xxx DaemonSet from 1 second to 3 seconds. According to https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#configure-probes, starting from Kubernetes 1.20 the exec probe timeoutSeconds is actually enforced. Therefore, on a cluster where the readinessProbe command takes longer than 1 second, the Pod can't mount the volume successfully. The only way to find the root cause is to dump the kubelet log.
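A sketch of that workaround on the engine-image DaemonSet (the probe command is the one shown in the kubelet log above; only timeoutSeconds changes):

readinessProbe:
  exec:
    command:
      - sh
      - -c
      - ls /data/longhorn && /data/longhorn version --client-only
  timeoutSeconds: 3   # increased from 1s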

@innobead innobead added this to the Planning milestone Jun 8, 2021
@innobead innobead added severity/3 Function working but has a major issue w/ workaround area/performance System, volume performance and removed area/stability System or volume stability labels Jun 8, 2021
@innobead innobead modified the milestones: Planning, v1.2.0 Jun 8, 2021
@joshimoo
Contributor

joshimoo commented Jun 8, 2021

@innobead @jenting If the current liveness probes that we use require more than 1s to return, we might want to consider using alternative probes. In a previous issue we discussed that exec probes are pretty heavy, while an HTTP probe is pretty cheap.
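
A sketch of what an HTTP-based readiness probe could look like (the path and port below are placeholders, not an existing Longhorn endpoint):

readinessProbe:
  httpGet:
    path: /healthz   # placeholder endpoint
    port: 9500       # placeholder port
  timeoutSeconds: 1
  periodSeconds: 10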

@innobead
Member

innobead commented Jun 17, 2021

@innobead @jenting If the current liveness probes that we use require more than 1s to return, we might want to consider using alternative probes. In a previous issue we discussed that exec probes are pretty heavy, while an HTTP probe is pretty cheap.

@jenting increased the readiness probe timeout and changed some liveness probes from exec to socket in longhorn/longhorn-manager#928. This is not a direct fix for this issue, but rather another finding from testing on 1.21.0.

@innobead innobead added the require/manual-test-plan Require adding/updating manual test cases if they can't be automated label Jun 17, 2021
@innobead innobead modified the milestones: v1.2.0, v1.3.0 Aug 12, 2021
@innobead innobead modified the milestones: v1.3.0, v1.4.0 Mar 31, 2022
@innobead innobead removed this from the v1.4.0 milestone Nov 7, 2022
@innobead innobead added this to the v1.5.0 milestone Nov 7, 2022
@innobead innobead modified the milestones: v1.5.0, v1.6.0 May 3, 2023
@innobead innobead added investigation-needed Need to identify the case before estimating and starting the development priority/0 Must be fixed in this release (managed by PO) area/stability System or volume stability area/kubernetes Kubernetes related like K8s version compatibility and removed area/performance System, volume performance labels Sep 14, 2023
@mantissahz
Contributor

When we launch a new run of a recurring job, we should make sure the volume has no previous run still in progress.
Furthermore, the snapshot backup currently runs inside the longhorn-manager Pod; we should make it run as a separate job so it does not occupy longhorn-manager's memory.

Recurring jobs have since been removed from the storage class.
Now I will try to change all readinessProbes and livenessProbes from executing a longhorn command to HTTP or TCP sockets.
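
For example, a TCP-socket probe of roughly the following shape (the port below is a placeholder):

livenessProbe:
  tcpSocket:
    port: 9500   # placeholder port
  initialDelaySeconds: 5
  periodSeconds: 10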

Related: #3907
@longhorn/qa Could anyone help reproduce this issue? I will continue trying to reproduce it as well.

@innobead innobead modified the milestones: v1.6.0, v1.7.0 Nov 29, 2023
@derekbit
Member

@longhorn/qa Can you help check if the issue is still valid? Thank you.

cc @roger-ryao @chriscchien @yangchiu

@khushboo-rancher
Contributor

@roger-ryao Could you please help with this whenever you get a chance?

@yangchiu
Member

yangchiu commented May 8, 2024

Verified passed on master-head (longhorn-manager 1f86d76) with the following steps:

  1. Create recurring jobs for backup and snapshot with cron * * * * *.
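A minimal sketch of these two RecurringJob CRs (assuming the v1beta2 RecurringJob schema; retain and concurrency values are arbitrary):

apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: snapshot
  namespace: longhorn-system
spec:
  task: snapshot
  cron: "* * * * *"
  retain: 1
  concurrency: 1
  groups: []
  labels: {}
---
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: backup
  namespace: longhorn-system
spec:
  task: backup
  cron: "* * * * *"
  retain: 1
  concurrency: 1
  groups: []
  labels: {}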
  2. Create storage class with the recurring jobs:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-test-recurring
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880"
  fromBackup: ""
  fsType: "ext4"
  recurringJobSelector: '[{"name":"snapshot", "isGroup":false},{"name":"backup", "isGroup":false}]'
  3. Create a statefulset with 50 replicas to create lots of volumes with the storage class:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: test-statefulset
  namespace: default
spec:
  selector:
    matchLabels:
      app: test-statefulset
  serviceName: test-statefulset
  replicas: 50
  template:
    metadata:
      labels:
        app: test-statefulset
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - image: ubuntu:22.04
          imagePullPolicy: IfNotPresent
          name: sleep
          args: ['/bin/sh', '-c', 'while true;do date;sleep 5; done']
          volumeMounts:
            - name: pod-data
              mountPath: /data
  volumeClaimTemplates:
    - metadata:
        name: pod-data
      spec:
        accessModes: ['ReadWriteOnce']
        storageClassName: 'longhorn-test-recurring'
        resources:
          requests:
            storage: 1Gi

  4. All 50 volumes can be created and attached without problems, and all 50 statefulset replica pods run without problems. Checking the k3s logs on each worker node with sudo journalctl -u k3s-agent.service --no-pager | grep -i deadline shows no "failed to attach" related errors.

@yangchiu yangchiu closed this as completed May 8, 2024
@yangchiu yangchiu self-assigned this May 8, 2024
Projects: Resolved/Scheduled
Development: No branches or pull requests
8 participants