PV attachment causes VM to hang #502
Comments
Ran into the same issue. Tracing the request through, the first step is to attach the VMDK to the target VM before detaching it from the source VM. Because of this, the target VM pauses for ~30 seconds while it tries to attach the VMDK, freezing the VM for that time. I think VMware should check whether the lock exists before pausing the VM to attach the VMDK, since this probably affects many other requests, but given that it doesn't, would the order of operations need to change?

Attempt to mount c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c at 21:25:28
Removed volume c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c at 21:25:54
Successfully attached volume c1bb7d5b-7d49-34cd-a8a0-a0369fdb353c at 21:29:20
To follow up, we tried upgrading to 1.10.5 and saw no change. We started looking through the code and suspect https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/volume/attachdetach/reconciler/reconciler.go#L155 // kubernetes#51066 is at the root of our problem. I reverted that patch, volume attachment happened much faster, and the node no longer froze. I'm going to run another test on 1.9.6 to see if it addresses the issue there.
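The suspicion above is that the attach/detach reconciler issues the attach to the new node while the volume is still attached to the old one. Below is a minimal sketch of the kind of guard that reverting kubernetes#51066 effectively restores; the package, interface, and method names (`clusterState`, `NodesWithVolumeAttached`, and so on) are hypothetical stand-ins, not the actual reconciler API.

```go
package reconcilersketch

import "fmt"

// clusterState is a stand-in for the reconciler's desired/actual world caches.
type clusterState interface {
	NodesWithVolumeAttached(volumeID string) []string
	MultiAttachSupported(volumeID string) bool
	Attach(volumeID, node string) error
}

// reconcileAttach sketches the guard: do not attach a volume to a new node
// while it is still attached to a different node, unless the volume genuinely
// supports multi-attach.
func reconcileAttach(state clusterState, volumeID, targetNode string) error {
	for _, node := range state.NodesWithVolumeAttached(volumeID) {
		if node != targetNode && !state.MultiAttachSupported(volumeID) {
			// Defer the attach until the detach from the old node completes,
			// instead of issuing an attach that makes ESXi contend for the
			// VMDK lock and stuns the target VM.
			return fmt.Errorf("volume %s still attached to node %s; retry after detach", volumeID, node)
		}
	}
	return state.Attach(volumeID, targetNode)
}
```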
@bradbeam please keep us updated :)
Just deployed a patched 1.9.6 with 51066 reverted and we're seeing good results: no node hangup and ~30s for pod failover with 9 PVs.
Do the settings mentioned in vmware-archive/vsphere-storage-for-docker#1532 (comment) apply in this use case?
Our default storage class is set to thin provisioned, and I'm not seeing any code paths in the vSphere cloud provider code that modify the VMDK to enable multi-writer. I think fixing up/removing the multi-attach code is still the appropriate thing to do at the current time, but I am curious whether there's more going on under the covers to cause this issue. Especially given @mwelch1's output above [1], the attempts to obtain a lock seem to line up with the ~25s of node unresponsiveness I see [2]. There's also mention of a disk needing to be set to eager zeroed thick in this VMware KB article [3].

[1]
[2]
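To check whether provisioning or sharing mode plays a role, the disk backing can be inspected directly against vCenter. The following is a rough govmomi sketch (not part of the cloud provider); the vCenter URL and VM name are placeholders, and it simply prints each disk's file name, thin/eager-zeroed-thick flags, and sharing mode.

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/vmware/govmomi"
	"github.com/vmware/govmomi/find"
	"github.com/vmware/govmomi/vim25/soap"
	"github.com/vmware/govmomi/vim25/types"
)

// deref treats nil *bool values as false for readability.
func deref(b *bool) bool { return b != nil && *b }

func main() {
	ctx := context.Background()

	// Placeholder endpoint; substitute your own vCenter credentials and URL.
	u, err := soap.ParseURL("https://user:pass@vcenter.example.com/sdk")
	if err != nil {
		log.Fatal(err)
	}
	c, err := govmomi.NewClient(ctx, u, true) // true = skip TLS verification
	if err != nil {
		log.Fatal(err)
	}

	finder := find.NewFinder(c.Client, true)
	dc, err := finder.DefaultDatacenter(ctx)
	if err != nil {
		log.Fatal(err)
	}
	finder.SetDatacenter(dc)

	vm, err := finder.VirtualMachine(ctx, "k8s-node-01") // placeholder VM name
	if err != nil {
		log.Fatal(err)
	}

	devices, err := vm.Device(ctx)
	if err != nil {
		log.Fatal(err)
	}

	// Print provisioning and sharing details for every virtual disk on the VM.
	for _, dev := range devices.SelectByType((*types.VirtualDisk)(nil)) {
		disk := dev.(*types.VirtualDisk)
		if b, ok := disk.Backing.(*types.VirtualDiskFlatVer2BackingInfo); ok {
			fmt.Printf("%s thin=%v eagerlyScrub=%v sharing=%q\n",
				b.FileName, deref(b.ThinProvisioned), deref(b.EagerlyScrub), b.Sharing)
		}
	}
}
```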
Automatic merge from submit-queue (batch tested with PRs 67745, 67432, 67569, 67825, 67943). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

Fix VMWare VM freezing bug by reverting #51066

**What this PR does / why we need it**: kube-controller-manager, vSphere specific: when the controller tries to attach a volume to node A that is already attached to node B, node A freezes until the volume is attached. Kubernetes continues to try to attach the volume as it thinks it's multi-attachable when it's not. #51066 is the culprit.

**Which issue(s) this PR fixes**: Fixes vmware-archive#500 / vmware-archive#502 (same issue)

**Special notes for your reviewer**:

- Repro: vSphere installation, any k8s version from 1.8 and above, pod with an attached PV/PVC/VMDK:
  1. Cordon the node the pod is running on.
  2. `kubectl delete po/[pod] --force --grace-period=0`
  3. The pod is immediately rescheduled to a new node. Grab the new node from `kubectl describe [pod]` and attempt to ping it or SSH into it.
  4. Pings/SSH fail to reach the new node, and `kubectl get node` shows it as NotReady. The new node is frozen until the volume is attached - usually about a 1 minute freeze for 1 volume in a low-load cluster, and many minutes more with higher loads and more volumes involved.
- Patch verification: tested a custom patched 1.9.10 kube-controller-manager with #51066 reverted and the above bug is resolved - can't repro it anymore. The new node doesn't freeze at all, and attaching happens quite quickly, in a few seconds.

**Release note**:

```
Fix VSphere VM Freezing bug by reverting #51066
```
The fix for this issue has been merged to the master branch - kubernetes#67825. Cherry-picking of this change is in progress.
@divyenpatel can you provide any clarification on my last comment?
@bradbeam Yes, we can apply the multi-writer setting in this case, when we support vSphere volumes with ...

I see you are using vSphere 6.5. The freezing issue you are observing may already be fixed in newer releases; I see lots of Hot Disk related fixes in recent VDDK release notes.
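For reference, applying the multi-writer setting to a single disk through the vSphere API would look roughly like the sketch below. This is illustrative only: the cloud provider does not do this today, `enableMultiWriter` is a hypothetical helper continuing from the inspection sketch above, and VMware's KB notes additional requirements for multi-writer disks (such as eager zeroed thick provisioning).

```go
package vspheresketch

import (
	"context"
	"fmt"

	"github.com/vmware/govmomi/object"
	"github.com/vmware/govmomi/vim25/types"
)

// enableMultiWriter flips a disk's sharing mode to multi-writer and pushes the
// change with a device reconfigure. Hypothetical helper, for illustration only.
func enableMultiWriter(ctx context.Context, vm *object.VirtualMachine, disk *types.VirtualDisk) error {
	backing, ok := disk.Backing.(*types.VirtualDiskFlatVer2BackingInfo)
	if !ok {
		return fmt.Errorf("unexpected disk backing type %T", disk.Backing)
	}
	backing.Sharing = string(types.VirtualDiskSharingSharingMultiWriter)
	return vm.EditDevice(ctx, disk)
}
```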
Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug
What happened:
When a pod is rescheduled and its volumes are moved from one VM to another, we're seeing approximately 25 seconds of node unresponsiveness per volume attachment.
What you expected to happen:
Volume attachment/detachment happens seamlessly without interrupting / pausing the node.
How to reproduce it (as minimally and precisely as possible):
On the first deployment everything seems to come up fine. After deleting the running pod, we experience the behavior.
I changed the order of the above slightly to reflect the time; FailedMount shows up with a 1m timestamp, but didn't appear until after the last FailedAttachVolume.
Anything else we need to know?:
Environment:
- Kubernetes version (use `kubectl version`):
- Cloud provider or hardware configuration: vSphere 6.5
- OS (e.g. from /etc/os-release):
- Kernel (e.g. `uname -a`):
- Install tools: Kubespray
- Others: