This repository has been archived by the owner on Sep 7, 2022. It is now read-only.

PV attachment causes VM to hang #502

Closed
bradbeam opened this issue Aug 22, 2018 · 8 comments

Comments

@bradbeam

bradbeam commented Aug 22, 2018

Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug

What happened:

When a pod is rescheduled and volumes are moved from one VM to another, we're seeing approx 25s of unresponsiveness per volume attachment.

What you expected to happen:

Volume attachment/detachment happens seamlessly without interrupting / pausing the node.

How to reproduce it (as minimally and precisely as possible):

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: pvctest
  name: pvctest
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pvctest
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: pvctest
    spec:
      containers:
      - image: nginx:latest
        imagePullPolicy: Always
        name: pvctest
        resources: {}
        volumeMounts:
        - mountPath: /pvc1
          name: pvc1
        - mountPath: /pvc2
          name: pvc2
        - mountPath: /pvc3
          name: pvc3
        - mountPath: /pvc4
          name: pvc4
        - mountPath: /pvc5
          name: pvc5
        - mountPath: /pvc6
          name: pvc6
        - mountPath: /pvc7
          name: pvc7
        - mountPath: /pvc8
          name: pvc8
        - mountPath: /pvc9
          name: pvc9
      dnsPolicy: ClusterFirst
      volumes:
      - name: pvc1
        persistentVolumeClaim:
          claimName: pvctest1
      - name: pvc2
        persistentVolumeClaim:
          claimName: pvctest2
      - name: pvc3
        persistentVolumeClaim:
          claimName: pvctest3
      - name: pvc4
        persistentVolumeClaim:
          claimName: pvctest4
      - name: pvc5
        persistentVolumeClaim:
          claimName: pvctest5
      - name: pvc6
        persistentVolumeClaim:
          claimName: pvctest6
      - name: pvc7
        persistentVolumeClaim:
          claimName: pvctest7
      - name: pvc8
        persistentVolumeClaim:
          claimName: pvctest8
      - name: pvc9
        persistentVolumeClaim:
          claimName: pvctest9
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: pvctest
  name: pvctest1
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: pvctest
  name: pvctest2
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: pvctest
  name: pvctest3
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: pvctest
  name: pvctest4
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: pvctest
  name: pvctest5
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: pvctest
  name: pvctest6
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: pvctest
  name: pvctest7
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: pvctest
  name: pvctest8
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: pvctest
  name: pvctest9
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

On the first deployment everything comes up fine. After deleting the running pod, we see the behavior described above.

ubuntu@s01-mgmt01:~/brad$ kubectl get pods -n brad -o wide
NAME                       READY     STATUS              RESTARTS   AGE       IP        NODE
pvctest-69998f46c6-rs5qc   0/1       ContainerCreating   0          51s       <none>    s01-k8s01.cluster.local
ubuntu@s01-mgmt01:~/brad$ ping s01-pyrex01
PING s01-k8s01.cluster.local (10.0.0.20) 56(84) bytes of data.
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=1 ttl=64 time=17340 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=2 ttl=64 time=16341 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=3 ttl=64 time=15341 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=4 ttl=64 time=14341 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=5 ttl=64 time=13341 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=6 ttl=64 time=12341 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=7 ttl=64 time=11341 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=8 ttl=64 time=10341 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=9 ttl=64 time=9341 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=10 ttl=64 time=8342 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=11 ttl=64 time=7342 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=12 ttl=64 time=6342 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=13 ttl=64 time=5342 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=14 ttl=64 time=4342 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=15 ttl=64 time=3342 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=16 ttl=64 time=2342 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=17 ttl=64 time=1342 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=18 ttl=64 time=342 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=19 ttl=64 time=0.235 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=20 ttl=64 time=0.272 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=21 ttl=64 time=23123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=22 ttl=64 time=22123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=23 ttl=64 time=21123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=24 ttl=64 time=20123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=25 ttl=64 time=19123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=26 ttl=64 time=18123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=27 ttl=64 time=17123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=28 ttl=64 time=16123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=29 ttl=64 time=15123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=30 ttl=64 time=14123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=31 ttl=64 time=13123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=32 ttl=64 time=12123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=33 ttl=64 time=11123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=34 ttl=64 time=10123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=35 ttl=64 time=9123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=36 ttl=64 time=8123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=37 ttl=64 time=7123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=38 ttl=64 time=6123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=39 ttl=64 time=5123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=40 ttl=64 time=4123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=41 ttl=64 time=3123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=42 ttl=64 time=2123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=43 ttl=64 time=1123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=44 ttl=64 time=123 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=45 ttl=64 time=0.239 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=46 ttl=64 time=0.287 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=47 ttl=64 time=0.250 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=48 ttl=64 time=0.273 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=49 ttl=64 time=0.199 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=50 ttl=64 time=0.261 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=51 ttl=64 time=0.271 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=52 ttl=64 time=0.241 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=53 ttl=64 time=0.317 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=54 ttl=64 time=0.254 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=55 ttl=64 time=14894 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=56 ttl=64 time=13894 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=57 ttl=64 time=12894 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=58 ttl=64 time=11894 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=59 ttl=64 time=10894 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=60 ttl=64 time=9894 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=61 ttl=64 time=8894 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=62 ttl=64 time=7894 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=63 ttl=64 time=6894 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=64 ttl=64 time=5894 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=65 ttl=64 time=4894 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=66 ttl=64 time=3894 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=67 ttl=64 time=2894 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=68 ttl=64 time=1894 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=69 ttl=64 time=894 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=70 ttl=64 time=0.192 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=71 ttl=64 time=24935 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=72 ttl=64 time=23927 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=73 ttl=64 time=22919 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=74 ttl=64 time=21911 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=75 ttl=64 time=20903 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=76 ttl=64 time=19896 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=77 ttl=64 time=18888 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=78 ttl=64 time=17880 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=79 ttl=64 time=16872 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=80 ttl=64 time=15864 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=81 ttl=64 time=14856 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=82 ttl=64 time=13848 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=83 ttl=64 time=12840 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=84 ttl=64 time=11832 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=85 ttl=64 time=10824 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=86 ttl=64 time=9816 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=87 ttl=64 time=8808 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=88 ttl=64 time=7800 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=89 ttl=64 time=6792 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=90 ttl=64 time=5784 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=91 ttl=64 time=4776 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=92 ttl=64 time=3770 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=93 ttl=64 time=2763 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=94 ttl=64 time=1755 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=95 ttl=64 time=747 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=96 ttl=64 time=0.247 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=97 ttl=64 time=0.251 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=98 ttl=64 time=0.213 ms
64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=99 ttl=64 time=0.254 ms
Events:
  Type     Reason                  Age   From                                            Message
  ----     ------                  ----  ----                                            -------
  Normal   Scheduled               3m    default-scheduler                               Successfully assigned pvctest-64b5d5cb96-h6jxh to s01-k8s05.cluster.local
  Warning  FailedAttachVolume      3m    attachdetach-controller                         AttachVolume.Attach failed for volume "pvc-f9026429-a642-11e8-8ef3-005056a3dca5" : Failed to add disk 'scsi1:8'.
  Warning  FailedAttachVolume      3m    attachdetach-controller                         AttachVolume.Attach failed for volume "pvc-f9054441-a642-11e8-8ef3-005056a3dca5" : Failed to add disk 'scsi1:8'.
  Warning  FailedAttachVolume      2m    attachdetach-controller                         AttachVolume.Attach failed for volume "pvc-f911e697-a642-11e8-8ef3-005056a3dca5" : Failed to add disk 'scsi1:8'.
  Warning  FailedAttachVolume      2m    attachdetach-controller                         AttachVolume.Attach failed for volume "pvc-f9080a65-a642-11e8-8ef3-005056a3dca5" : Failed to add disk 'scsi1:8'.
  Warning  FailedAttachVolume      1m    attachdetach-controller                         AttachVolume.Attach failed for volume "pvc-f90033b5-a642-11e8-8ef3-005056a3dca5" : Failed to add disk 'scsi1:8'.
  Warning  FailedAttachVolume      1m    attachdetach-controller                         AttachVolume.Attach failed for volume "pvc-f90989d1-a642-11e8-8ef3-005056a3dca5" : Failed to add disk 'scsi1:8'.
  Warning  FailedAttachVolume      58s   attachdetach-controller                         AttachVolume.Attach failed for volume "pvc-f90e9d37-a642-11e8-8ef3-005056a3dca5" : Failed to add disk 'scsi1:8'.
  Warning  FailedAttachVolume      32s   attachdetach-controller                         AttachVolume.Attach failed for volume "pvc-f90cb795-a642-11e8-8ef3-005056a3dca5" : Failed to add disk 'scsi1:8'.
  Warning  FailedAttachVolume      6s    attachdetach-controller                         AttachVolume.Attach failed for volume "pvc-f9141b6b-a642-11e8-8ef3-005056a3dca5" : Failed to add disk 'scsi1:8'.
  Warning  FailedMount             1m    kubelet, s01-k8s05.cluster.local  Unable to mount volumes for pod "pvctest-64b5d5cb96-h6jxh_default(e4136d7d-a651-11e8-82ea-005056a3dca5)": timeout expired waiting for volumes to attach or mount for pod "default"/"pvctest-64b5d5cb96-h6jxh". list of unmounted volumes=[pvc1 pvc2 pvc3 pvc4 pvc5 pvc6 pvc7 pvc8 pvc9 default-token-82j8x]. list of unattached volumes=[pvc1 pvc2 pvc3 pvc4 pvc5 pvc6 pvc7 pvc8 pvc9 default-token-82j8x]
  Normal   SuccessfulAttachVolume  6s    attachdetach-controller                         AttachVolume.Attach succeeded for volume "pvc-f9026429-a642-11e8-8ef3-005056a3dca5"
  Normal   SuccessfulAttachVolume  5s    attachdetach-controller                         AttachVolume.Attach succeeded for volume "pvc-f9054441-a642-11e8-8ef3-005056a3dca5"
  Normal   SuccessfulMountVolume   5s    kubelet, s01-k8s05.cluster.local  MountVolume.SetUp succeeded for volume "default-token-82j8x"
  Normal   SuccessfulAttachVolume  5s    attachdetach-controller                         AttachVolume.Attach succeeded for volume "pvc-f911e697-a642-11e8-8ef3-005056a3dca5"
  Normal   SuccessfulAttachVolume  5s    attachdetach-controller                         AttachVolume.Attach succeeded for volume "pvc-f9080a65-a642-11e8-8ef3-005056a3dca5"
  Normal   SuccessfulAttachVolume  4s    attachdetach-controller                         AttachVolume.Attach succeeded for volume "pvc-f90033b5-a642-11e8-8ef3-005056a3dca5"
  Normal   SuccessfulAttachVolume  4s    attachdetach-controller                         AttachVolume.Attach succeeded for volume "pvc-f90989d1-a642-11e8-8ef3-005056a3dca5"
  Normal   SuccessfulAttachVolume  3s    attachdetach-controller                         AttachVolume.Attach succeeded for volume "pvc-f90e9d37-a642-11e8-8ef3-005056a3dca5"
  Normal   SuccessfulMountVolume   3s    kubelet, s01-k8s05.cluster.local  MountVolume.SetUp succeeded for volume "pvc-f911e697-a642-11e8-8ef3-005056a3dca5"
  Normal   SuccessfulAttachVolume  3s    attachdetach-controller                         AttachVolume.Attach succeeded for volume "pvc-f90cb795-a642-11e8-8ef3-005056a3dca5"
  Normal   SuccessfulAttachVolume  2s    attachdetach-controller                         AttachVolume.Attach succeeded for volume "pvc-f9141b6b-a642-11e8-8ef3-005056a3dca5"
  Normal   SuccessfulMountVolume   0s    kubelet, s01-k8s05.cluster.local  MountVolume.SetUp succeeded for volume "pvc-f9141b6b-a642-11e8-8ef3-005056a3dca5"

I changed the order of the events above slightly to reflect actual timing; FailedMount shows a 1m timestamp, but it didn't appear until after the last FailedAttachVolume.

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
ubuntu@puc-pyrex-mgmt01:~$ kubectl version
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.6", GitCommit:"9f8ebd171479bec0ada837d7ee641dec2f8c6dd1", GitTreeState:"clean", BuildDate:"2018-03-21T15:21:50Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.6+coreos.0", GitCommit:"c2aac694c2c7373dfe3ad81f47fc3cbc70a5a8fa", GitTreeState:"clean", BuildDate:"2018-03-21T21:54:22Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
    vSphere 6.5

  • OS (e.g. from /etc/os-release):

ubuntu@puc-pyrex-mgmt01:~$ cat /etc/os-release 
NAME="Ubuntu"
VERSION="16.04.3 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.3 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
  • Kernel (e.g. uname -a):
Linux s01-mgmt01.cluster.local 4.4.0-87-generic #110-Ubuntu SMP Tue Jul 18 12:55:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools:
    Kubespray

  • Others:

@cecilkootz

Ran into the same issue. Tracing the request through, the first step is to attach the VMDK on the target VM before detaching it from the source VM. Because of this, the target VM pauses for ~30 seconds while it tries to attach the VMDK, freezing the VM for that time. I think VMware should check whether the lock exists before pausing the VM to attach the VMDK, since this probably affects many other requests; but given that it doesn't, would the order of operations need to change?

Attempt to mount c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c at 21:25:28

2018-08-22T21:25:28.577Z| vmx| I125: VigorTransportProcessClientPayload: opID=55bb9a7b-01-19-8395 seq=197756: Receiving HotPlugManager.BeginBatch request.
2018-08-22T21:25:28.577Z| vmx| I125: VigorTransport_ServerSendResponse opID=55bb9a7b-01-19-8395 seq=197756: Completed HotPlugManager request.
2018-08-22T21:25:28.580Z| vmx| I125: VigorTransportProcessClientPayload: opID=55bb9a7b-01-19-8395 seq=197758: Receiving SCSI.IsChildPresent request.
2018-08-22T21:25:28.580Z| vmx| I125: VigorTransport_ServerSendResponse opID=55bb9a7b-01-19-8395 seq=197758: Completed SCSI request.
2018-08-22T21:25:28.583Z| vmx| I125: VigorTransportProcessClientPayload: opID=55bb9a7b-01-19-8395 seq=197760: Receiving Disk.SetBackingInfo request.
2018-08-22T21:25:28.583Z| vmx| I125: VigorTransport_ServerSendResponse opID=55bb9a7b-01-19-8395 seq=197760: Completed DiskEarlyDevice request.
2018-08-22T21:25:28.584Z| vmx| I125: VigorTransportProcessClientPayload: opID=55bb9a7b-01-19-8395 seq=197761: Receiving Disk.SetMode request.
2018-08-22T21:25:28.584Z| vmx| I125: VigorTransport_ServerSendResponse opID=55bb9a7b-01-19-8395 seq=197761: Completed DiskEarlyDevice request.
2018-08-22T21:25:28.584Z| vmx| I125: VigorTransportProcessClientPayload: opID=55bb9a7b-01-19-8395 seq=197762: Receiving Disk.SetDigest request.
2018-08-22T21:25:28.584Z| vmx| I125: VigorTransport_ServerSendResponse opID=55bb9a7b-01-19-8395 seq=197762: Completed DiskEarlyDevice request.
2018-08-22T21:25:28.584Z| vmx| I125: VigorTransportProcessClientPayload: opID=55bb9a7b-01-19-8395 seq=197763: Receiving Disk.SetCtkEnabled request.
2018-08-22T21:25:28.584Z| vmx| I125: VigorTransport_ServerSendResponse opID=55bb9a7b-01-19-8395 seq=197763: Completed DiskEarlyDevice request.
2018-08-22T21:25:28.585Z| vmx| I125: VigorTransportProcessClientPayload: opID=55bb9a7b-01-19-8395 seq=197764: Receiving Disk.SetVFlashEnabled request.
2018-08-22T21:25:28.585Z| vmx| I125: VigorTransport_ServerSendResponse opID=55bb9a7b-01-19-8395 seq=197764: Completed DiskEarlyDevice request.
2018-08-22T21:25:28.585Z| vmx| I125: VigorTransportProcessClientPayload: opID=55bb9a7b-01-19-8395 seq=197765: Receiving Disk.SetPresent request.
2018-08-22T21:25:28.585Z| vmx| I125: HotAdd: Adding scsi-hardDisk with mode 'independent-persistent' to scsi1:8
2018-08-22T21:25:28.586Z| vmx| I125: DISK: OPEN scsi1:8 '/vmfs/volumes/vsan:52d56d4e69a728b3-b0b94a23d9f60a63/0c227b5b-8d29-870c-fd19-a0369fdb353c/kubernetes-dynamic-pvc-f9026429-a642-11e8-8ef3-005056a3dca5.vmdk' independent-persistent R[]
2018-08-22T21:25:32.589Z| vmx| I125: OBJLIB-VSANOBJ: Object 'c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c' locked, retrying
2018-08-22T21:25:36.896Z| vmx| I125: OBJLIB-VSANOBJ: Object 'c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c' locked, retrying
2018-08-22T21:25:41.206Z| vmx| I125: OBJLIB-VSANOBJ: Object 'c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c' locked, retrying
2018-08-22T21:25:45.514Z| vmx| I125: OBJLIB-VSANOBJ: Object 'c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c' locked, retrying
2018-08-22T21:25:49.821Z| vmx| I125: OBJLIB-VSANOBJ: Object 'c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c' locked, retrying
2018-08-22T21:25:54.139Z| vmx| I125: OBJLIB-VSANOBJ: VsanObjLock: Could not lock object 'c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c': Failed to lock the file (4).
2018-08-22T21:25:54.139Z| vmx| I125: DISKLIB-VMFS  : "vsan://c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c" : failed to open (Failed to lock the file): ObjLib_Open failed. Type 3
2018-08-22T21:25:54.139Z| vmx| I125: DISKLIB-LINK  : "/vmfs/volumes/vsan:52d56d4e69a728b3-b0b94a23d9f60a63/0c227b5b-8d29-870c-fd19-a0369fdb353c/kubernetes-dynamic-pvc-f9026429-a642-11e8-8ef3-005056a3dca5.vmdk" : failed to open (Failed to lock the file).
2018-08-22T21:25:54.139Z| vmx| I125: DISKLIB-CHAIN : "/vmfs/volumes/vsan:52d56d4e69a728b3-b0b94a23d9f60a63/0c227b5b-8d29-870c-fd19-a0369fdb353c/kubernetes-dynamic-pvc-f9026429-a642-11e8-8ef3-005056a3dca5.vmdk" : failed to open (Failed to lock the file).
2018-08-22T21:25:54.142Z| vmx| I125: DISKLIB-VMFS  : "vsan://c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c" : open successful (524309) size = 10737418240, hd = 0. Type 3
2018-08-22T21:25:54.148Z| vmx| I125: DISKLIB-VMFS  : "vsan://c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c" : closed.
2018-08-22T21:25:54.148Z| vmx| I125: DISKLIB-LIB   : Failed to open '/vmfs/volumes/vsan:52d56d4e69a728b3-b0b94a23d9f60a63/0c227b5b-8d29-870c-fd19-a0369fdb353c/kubernetes-dynamic-pvc-f9026429-a642-11e8-8ef3-005056a3dca5.vmdk' with flags 0xa Failed to lock the file (16392).
2018-08-22T21:25:54.148Z| vmx| I125: DISK: Cannot open disk '/vmfs/volumes/vsan:52d56d4e69a728b3-b0b94a23d9f60a63/0c227b5b-8d29-870c-fd19-a0369fdb353c/kubernetes-dynamic-pvc-f9026429-a642-11e8-8ef3-005056a3dca5.vmdk': Failed to lock the file (16392).
2018-08-22T21:25:54.148Z| vmx| I125: VigorTransport_ServerSendResponse opID=55bb9a7b-01-19-8395 seq=197765: Completed Disk request.
2018-08-22T21:25:54.150Z| vmx| I125: VigorTransportProcessClientPayload: opID=55bb9a7b-01-19-8395 seq=197767: Receiving HotPlugManager.EndBatch request.
2018-08-22T21:25:54.150Z| vmx| I125: VigorTransport_ServerSendResponse opID=55bb9a7b-01-19-8395 seq=197767: Completed HotPlugManager request.

Removed volume c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c at 21:25:54

2018-08-22T21:25:54.912Z| vmx| I125: VigorTransportProcessClientPayload: opID=5e198a62-01-42-1726 seq=196535: Receiving HotPlugManager.BeginBatch request.
2018-08-22T21:25:54.912Z| vmx| I125: VigorTransport_ServerSendResponse opID=5e198a62-01-42-1726 seq=196535: Completed HotPlugManager request.
2018-08-22T21:25:54.912Z| vmx| I125: VigorTransportProcessClientPayload: opID=5e198a62-01-42-1726 seq=196537: Receiving Disk.SetPresent request.
2018-08-22T21:25:54.913Z| vcpu-0| I125: Destroying virtual dev for scsi1:10 vscsi=8312
2018-08-22T21:25:54.913Z| vcpu-0| I125: VMMon_VSCSIStopVports: No such target on adapter
2018-08-22T21:25:54.913Z| vcpu-0| I125: scsi1:10: numIOs = 0 numMergedIOs = 0 numSplitIOs = 0 ( 0.0%)
2018-08-22T21:25:54.913Z| vcpu-0| I125: Closing disk 'scsi1:10'
2018-08-22T21:25:54.913Z| vcpu-0| I125: SVMotion_DiskCloseCB: Closing disks in svmPhase 0. Not destroying mirror node.
2018-08-22T21:25:54.935Z| vcpu-0| I125: DISKLIB-VMFS  : "vsan://c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c" : closed.
2018-08-22T21:25:54.935Z| vcpu-0| A100: ConfigDB: Unsetting all entries with prefix "scsi1:10."
2018-08-22T21:25:54.935Z| vcpu-0| A100: ConfigDB: Unsetting all entries with prefix "sched.scsi1:10."
2018-08-22T21:25:54.946Z| vcpu-0| I125: VigorTransport_ServerSendResponse opID=5e198a62-01-42-1726 seq=196537: Completed DiskEarlyDevice request.
2018-08-22T21:25:54.946Z| vcpu-0| I125: PVSCSI: MSG: device scsi1:10 REMOVED
2018-08-22T21:25:54.946Z| vcpu-0| I125: PVSCSI: Posting device REMOVED msg at prodIdx=13
2018-08-22T21:25:54.947Z| vmx| I125: VigorTransportProcessClientPayload: opID=5e198a62-01-42-1726 seq=196538: Receiving Hbr.SetHBRDiskInfo request.
2018-08-22T21:25:54.947Z| vmx| I125: VigorTransport_ServerSendResponse opID=5e198a62-01-42-1726 seq=196538: Completed Hbr request.
2018-08-22T21:25:54.958Z| vmx| I125: VigorTransportProcessClientPayload: opID=5e198a62-01-42-1726 seq=196545: Receiving HotPlugManager.EndBatch request.
2018-08-22T21:25:54.958Z| vmx| I125: VigorTransport_ServerSendResponse opID=5e198a62-01-42-1726 seq=196545: Completed HotPlugManager request.

Successfully attaches volume c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c at 21:29:20

2018-08-22T21:29:20.463Z| vmx| I125: VigorTransportProcessClientPayload: opID=3ca02fa6-01-d-8704 seq=198939: Receiving HotPlugManager.BeginBatch request.
2018-08-22T21:29:20.463Z| vmx| I125: VigorTransport_ServerSendResponse opID=3ca02fa6-01-d-8704 seq=198939: Completed HotPlugManager request.
2018-08-22T21:29:20.465Z| vmx| I125: VigorTransportProcessClientPayload: opID=3ca02fa6-01-d-8704 seq=198941: Receiving SCSI.IsChildPresent request.
2018-08-22T21:29:20.465Z| vmx| I125: VigorTransport_ServerSendResponse opID=3ca02fa6-01-d-8704 seq=198941: Completed SCSI request.
2018-08-22T21:29:20.467Z| vmx| I125: VigorTransportProcessClientPayload: opID=3ca02fa6-01-d-8704 seq=198943: Receiving Disk.SetBackingInfo request.
2018-08-22T21:29:20.467Z| vmx| I125: VigorTransport_ServerSendResponse opID=3ca02fa6-01-d-8704 seq=198943: Completed DiskEarlyDevice request.
2018-08-22T21:29:20.468Z| vmx| I125: VigorTransportProcessClientPayload: opID=3ca02fa6-01-d-8704 seq=198944: Receiving Disk.SetMode request.
2018-08-22T21:29:20.468Z| vmx| I125: VigorTransport_ServerSendResponse opID=3ca02fa6-01-d-8704 seq=198944: Completed DiskEarlyDevice request.
2018-08-22T21:29:20.468Z| vmx| I125: VigorTransportProcessClientPayload: opID=3ca02fa6-01-d-8704 seq=198945: Receiving Disk.SetDigest request.
2018-08-22T21:29:20.468Z| vmx| I125: VigorTransport_ServerSendResponse opID=3ca02fa6-01-d-8704 seq=198945: Completed DiskEarlyDevice request.
2018-08-22T21:29:20.469Z| vmx| I125: VigorTransportProcessClientPayload: opID=3ca02fa6-01-d-8704 seq=198946: Receiving Disk.SetCtkEnabled request.
2018-08-22T21:29:20.469Z| vmx| I125: VigorTransport_ServerSendResponse opID=3ca02fa6-01-d-8704 seq=198946: Completed DiskEarlyDevice request.
2018-08-22T21:29:20.469Z| vmx| I125: VigorTransportProcessClientPayload: opID=3ca02fa6-01-d-8704 seq=198947: Receiving Disk.SetVFlashEnabled request.
2018-08-22T21:29:20.469Z| vmx| I125: VigorTransport_ServerSendResponse opID=3ca02fa6-01-d-8704 seq=198947: Completed DiskEarlyDevice request.
2018-08-22T21:29:20.469Z| vmx| I125: VigorTransportProcessClientPayload: opID=3ca02fa6-01-d-8704 seq=198948: Receiving Disk.SetPresent request.
2018-08-22T21:29:20.470Z| vmx| I125: HotAdd: Adding scsi-hardDisk with mode 'independent-persistent' to scsi1:8
2018-08-22T21:29:20.470Z| vmx| I125: DISK: OPEN scsi1:8 '/vmfs/volumes/vsan:52d56d4e69a728b3-b0b94a23d9f60a63/0c227b5b-8d29-870c-fd19-a0369fdb353c/kubernetes-dynamic-pvc-f9026429-a642-11e8-8ef3-005056a3dca5.vmdk' independent-persistent R[]
2018-08-22T21:29:20.475Z| vmx| I125: DISKLIB-VMFS  : "vsan://c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c" : open successful (10) size = 10737418240, hd = 13222844. Type 3
2018-08-22T21:29:20.475Z| vmx| I125: DISKLIB-DSCPTR: Opened [0]: "vsan://c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c" (0xa)
2018-08-22T21:29:20.475Z| vmx| I125: DISKLIB-LINK  : Opened '/vmfs/volumes/vsan:52d56d4e69a728b3-b0b94a23d9f60a63/0c227b5b-8d29-870c-fd19-a0369fdb353c/kubernetes-dynamic-pvc-f9026429-a642-11e8-8ef3-005056a3dca5.vmdk' (0xa): vmfs, 20971520 sectors / 10 GB.
2018-08-22T21:29:20.475Z| vmx| I125: DISKLIB-LIB   : Opened "/vmfs/volumes/vsan:52d56d4e69a728b3-b0b94a23d9f60a63/0c227b5b-8d29-870c-fd19-a0369fdb353c/kubernetes-dynamic-pvc-f9026429-a642-11e8-8ef3-005056a3dca5.vmdk" (flags 0xa, type vmfs).
2018-08-22T21:29:20.475Z| vmx| I125: DISK: Disk '/vmfs/volumes/vsan:52d56d4e69a728b3-b0b94a23d9f60a63/0c227b5b-8d29-870c-fd19-a0369fdb353c/kubernetes-dynamic-pvc-f9026429-a642-11e8-8ef3-005056a3dca5.vmdk' has UUID '60 00 c2 98 9b cf 06 1e-55 58 73 12 28 cf 39 2f'
2018-08-22T21:29:20.475Z| vmx| I125: DISK: OPEN '/vmfs/volumes/vsan:52d56d4e69a728b3-b0b94a23d9f60a63/0c227b5b-8d29-870c-fd19-a0369fdb353c/kubernetes-dynamic-pvc-f9026429-a642-11e8-8ef3-005056a3dca5.vmdk' Geo (1305/255/63) BIOS Geo (0/0/0)
2018-08-22T21:29:20.480Z| vmx| I125: UTIL: Change file descriptor limit from soft 8006,hard 8006 to soft 8121,hard 8121.
2018-08-22T21:29:20.480Z| vmx| I125: SCSI DEVICE (scsi1:8): Computed value of scsi1:8.useBounceBuffers: default
2018-08-22T21:29:20.480Z| vmx| I125: Creating virtual dev for 'scsi1:8'.
2018-08-22T21:29:20.480Z| vmx| I125: DumpDiskInfo: scsi1:8 createType=11, capacity = 20971520, numLinks = 1, allocationType = 2
2018-08-22T21:29:20.481Z| vmx| I125: SCSIDiskESXPopulateVDevDesc: Using FS backend
2018-08-22T21:29:20.481Z| vmx| I125: DISKUTIL: scsi1:8 : geometry=1305/255/63
2018-08-22T21:29:20.481Z| vmx| I125: SCSIFilterESXAttachCBRCInt: CBRC not enabled or opened without filters,skipping CBRC           filter attach.
2018-08-22T21:29:20.481Z| vmx| I125: SCSIFilterSBDAttachCBRC: device scsi1:8 is not SBD. Skipping CBRC attach SBD way.
2018-08-22T21:29:20.481Z| vmx| A100: ConfigDB: Setting scsi1:8.deviceType = "scsi-hardDisk"
2018-08-22T21:29:20.481Z| vmx| A100: ConfigDB: Setting scsi1:8.fileName = "/vmfs/volumes/vsan:52d56d4e69a728b3-b0b94a23d9f60a63/0c227b5b-8d29-870c-fd19-a0369fdb353c/kubernetes-dynamic-pvc-f9026429-a642-11e8-8ef3-005056a3dca5.vmdk"
2018-08-22T21:29:20.481Z| vmx| A100: ConfigDB: Setting scsi1:8.mode = "independent-persistent"
2018-08-22T21:29:20.481Z| vmx| A100: ConfigDB: Setting scsi1:8.present = "TRUE"
2018-08-22T21:29:20.481Z| vmx| A100: ConfigDB: Setting scsi1:8.redo = ""
2018-08-22T21:29:20.509Z| vmx| I125: VigorTransport_ServerSendResponse opID=3ca02fa6-01-d-8704 seq=198948: Completed DiskEarlyDevice request.
2018-08-22T21:29:20.509Z| vcpu-0| I125: PVSCSI: MSG: device scsi1:8 ADDED
2018-08-22T21:29:20.509Z| vcpu-0| I125: PVSCSI: Posting device ADDED msg at prodIdx=23
2018-08-22T21:29:20.512Z| vmx| I125: VigorTransportProcessClientPayload: opID=3ca02fa6-01-d-8704 seq=198955: Receiving HotPlugManager.EndBatch request.
2018-08-22T21:29:20.512Z| vmx| I125: VigorTransport_ServerSendResponse opID=3ca02fa6-01-d-8704 seq=198955: Completed HotPlugManager request.

@bradbeam
Author

To follow up, we tried upgrading to 1.10.5 and saw no change. We started looking through the code and suspect https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/volume/attachdetach/reconciler/reconciler.go#L155 (kubernetes#51066) is at the root of our problem. I reverted that patch, volume attachment happened much faster, and the node no longer froze. I'm going to run another test on 1.9.6 to see if it addresses the issue there.

@dgabrysch

@bradbeam please keep us updated :)

@bradbeam
Author

Just deployed a patched 1.9.6 with #51066 reverted and we're seeing good results: no node hangup, and ~30s for pod failover with 9 PVs.

@bradbeam
Author

bradbeam commented Aug 27, 2018

Do the settings mentioned in vmware-archive/vsphere-storage-for-docker#1532 (comment) apply in this use case?

set the "multi-writer" flag on a VMDK in the VMX config scsix:y.sharing = "multi-writer".

The only requirement is that the vmdk must also be created as a thick disk as dynamic growing is disallowed for a multi-writer disk.

Our default storage class is set to thin provisioned, and I'm not seeing any code paths in the vsphere cloud provider code to modify the vmdk to enable multi-writer.

I think fixing up/removing the multi-attach code is still the appropriate thing to do at the current time, but I am curious whether there's more going on under the covers to cause this issue. In particular, given @mwelch1's output above [1], the attempts to obtain a lock line up with the ~25s of node unresponsiveness I see [2].

There's also mention of a disk being set to eager zeroed thick in this VMware KB article [3] (see the StorageClass sketch after the references below).

[1]

2018-08-22T21:25:28.585Z| vmx| I125: HotAdd: Adding scsi-hardDisk with mode 'independent-persistent' to scsi1:8
2018-08-22T21:25:28.586Z| vmx| I125: DISK: OPEN scsi1:8 '/vmfs/volumes/vsan:52d56d4e69a728b3-b0b94a23d9f60a63/0c227b5b-8d29-870c-fd19-a0369fdb353c/kubernetes-dynamic-pvc-f9026429-a642-11e8-8ef3-005056a3dca5.vmdk' independent-persistent R[]
2018-08-22T21:25:32.589Z| vmx| I125: OBJLIB-VSANOBJ: Object 'c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c' locked, retrying
2018-08-22T21:25:36.896Z| vmx| I125: OBJLIB-VSANOBJ: Object 'c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c' locked, retrying
2018-08-22T21:25:41.206Z| vmx| I125: OBJLIB-VSANOBJ: Object 'c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c' locked, retrying
2018-08-22T21:25:45.514Z| vmx| I125: OBJLIB-VSANOBJ: Object 'c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c' locked, retrying
2018-08-22T21:25:49.821Z| vmx| I125: OBJLIB-VSANOBJ: Object 'c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c' locked, retrying
2018-08-22T21:25:54.139Z| vmx| I125: OBJLIB-VSANOBJ: VsanObjLock: Could not lock object 'c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c': Failed to lock the file (4).
2018-08-22T21:25:54.139Z| vmx| I125: DISKLIB-VMFS  : "vsan://c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c" : failed to open (Failed to lock the file): ObjLib_Open failed. Type 3
2018-08-22T21:25:54.139Z| vmx| I125: DISKLIB-LINK  : "/vmfs/volumes/vsan:52d56d4e69a728b3-b0b94a23d9f60a63/0c227b5b-8d29-870c-fd19-a0369fdb353c/kubernetes-dynamic-pvc-f9026429-a642-11e8-8ef3-005056a3dca5.vmdk" : failed to open (Failed to lock the file).
2018-08-22T21:25:54.139Z| vmx| I125: DISKLIB-CHAIN : "/vmfs/volumes/vsan:52d56d4e69a728b3-b0b94a23d9f60a63/0c227b5b-8d29-870c-fd19-a0369fdb353c/kubernetes-dynamic-pvc-f9026429-a642-11e8-8ef3-005056a3dca5.vmdk" : failed to open (Failed to lock the file).
2018-08-22T21:25:54.142Z| vmx| I125: DISKLIB-VMFS  : "vsan://c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c" : open successful (524309) size = 10737418240, hd = 0. Type 3
2018-08-22T21:25:54.148Z| vmx| I125: DISKLIB-VMFS  : "vsan://c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c" : closed.
2018-08-22T21:25:54.148Z| vmx| I125: DISKLIB-LIB   : Failed to open '/vmfs/volumes/vsan:52d56d4e69a728b3-b0b94a23d9f60a63/0c227b5b-8d29-870c-fd19-a0369fdb353c/kubernetes-dynamic-pvc-f9026429-a642-11e8-8ef3-005056a3dca5.vmdk' with flags 0xa Failed to lock the file (16392).
2018-08-22T21:25:54.148Z| vmx| I125: DISK: Cannot open disk '/vmfs/volumes/vsan:52d56d4e69a728b3-b0b94a23d9f60a63/0c227b5b-8d29-870c-fd19-a0369fdb353c/kubernetes-dynamic-pvc-f9026429-a642-11e8-8ef3-005056a3dca5.vmdk': Failed to lock the file (16392).

[2]

64 bytes from s01-k8s01.cluster.local (10.0.0.20): icmp_seq=21 ttl=64 time=23123 ms

[3]
https://kb.vmware.com/s/article/1034165
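
For reference, here is roughly what a thick-provisioned StorageClass would look like, assuming the diskformat parameter of the kubernetes.io/vsphere-volume provisioner; this is only a sketch to illustrate the thick-disk requirement, not our actual config (our default class is thin):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: vsphere-thick              # illustrative name only
provisioner: kubernetes.io/vsphere-volume
parameters:
  diskformat: eagerzeroedthick     # thick provisioning; our current default class uses thin

Even with a thick disk, something would still need to set scsix:y.sharing = "multi-writer" on the VM, and I don't see the cloud provider doing that anywhere.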

k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this issue Aug 29, 2018
Automatic merge from submit-queue (batch tested with PRs 67745, 67432, 67569, 67825, 67943). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

Fix VMWare VM freezing bug by reverting #51066

**What this PR does / why we need it**: kube-controller-manager, VSphere specific: When the controller tries to attach a Volume to Node A that is already attached to Node B, Node A freezes until the volume is attached.  Kubernetes continues to try to attach the volume as it thinks that it's 'multi-attachable' when it's not. #51066 is the culprit.


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes vmware-archive#500 / vmware-archive#502 (same issue)

**Special notes for your reviewer**:

- Repro:

vSphere installation, any k8s version from 1.8 and above, pod with an attached PV/PVC/VMDK:

1. cordon the node which the pod is in
2. `kubectl delete po/[pod] --force --grace-period=0`
3. the pod is immediately rescheduled to a new node. Grab the new node from a `kubectl describe [pod]` and attempt to Ping it or SSH into it.
4. you can see that pings/ssh fail to reach the new node. `kubectl get node` shows it as 'NotReady'. New node is frozen until the volume is attached - usually 1 minute freeze for 1 volume in a low-load cluster, and many minutes more with higher loads and more volumes involved.

- Patch verification:

Tested a custom patched 1.9.10 kube-controller-manager with #51066 reverted and the above bug is resolved - can't repro it anymore. New node doesn't freeze at all, and attaching happens quite quickly, in a few seconds.


**Release note**:

``` 
Fix VSphere VM Freezing bug by reverting #51066 

```
@divyenpatel

The fix for this issue is merged to the master branch - kubernetes#67825. Cherry picking for this change is in progress.

@bradbeam
Author

bradbeam commented Sep 4, 2018

@divyenpatel can you provide any clarification on my last comment?

@divyenpatel

@bradbeam Yes, we can apply the multi-writer setting in this case.
Currently the vSphere Cloud Provider only supports the ReadWriteOnce access mode (see the claim sketch below for the contrast).

When we support vSphere volumes with the ReadWriteMany and ReadOnlyMany access modes, we can consider using this approach.

I see you are using vSphere 6.5. The freezing issue you are observing may already be fixed in newer releases; I see lots of hot-disk related fixes in recent VDDK release notes.
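
To make the access-mode distinction concrete, a multi-writer style volume would correspond to a claim like the following; this is only an illustrative sketch, and such a claim is not supported by the vSphere Cloud Provider today:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvctest-shared             # illustrative name only
spec:
  accessModes:
  - ReadWriteMany                  # not supported by the vSphere Cloud Provider today
  resources:
    requests:
      storage: 10Gi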
