[BUG] WaitForUpdatesEx doesn't return error in response #2724
Comments
Howdy 🖐 vrevelas ! Thank you for your interest in this project. We value your feedback and will respond soon. If you want to contribute to this project, please make yourself familiar with the |
Hi @dougm, thanks for taking a look. You're right, this issue is effectively a duplicate of #1604. This is a stack trace I pulled from Kubernetes when I hit the issue: https://gist.github.com/vrevelas/ac65eeb5837b320cbe0f4c20f43a94ab Kubernetes calls into govmomi at govmomi/task/wait:WaitForResult on line 15. Correct me if I'm wrong, but I don't think there's any way to pass {PropagateMissing: true} from that part of the API?
Ah, PR #1579 always sets it to true in Line 126 in b780c8c
That fix was after v0.20.3, so bumping to 0.27.x should fix it.
% git tag --contains e373feb8e90894dfbe39871e72c05946b4cf848f
prerelease-v0.21.0-58-g8d28646
prerelease-v0.22.1-247-g770fcba2
v0.22.0
v0.22.1
v0.22.2
v0.23.0
v0.23.1
v0.24.0
v0.24.1
v0.24.2
v0.25.0
v0.26.0
v0.26.1
v0.27.0
v0.27.1
v0.27.2
v0.27.3
v0.27.4
Ah, perfect! Line 126 in b780c8c
I used {PropagateMissing: true} in my previous comment, but it didn't occur to me to check whether the call stack included it 😄 I've raised the Kubernetes issue linked above to request that they upgrade their version of govmomi. Many thanks for your help!
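For anyone following along, here is a rough sketch of the idea behind an option like PropagateMissing: properties the server could not read (for example because of a NoPermission fault) are handed to the caller's callback alongside ordinary changes, so the wait can end with an error instead of hanging. All of the types and functions below (Fault, MissingProperty, PropertyChange, ObjectUpdate, Changes, CheckInfo) are simplified stand-ins invented for illustration. They are not govmomi's actual types or its implementation.

```go
package main

import "fmt"

// Fault is a hypothetical stand-in for a server-side fault such as NoPermission.
type Fault struct {
	Name        string
	PrivilegeID string
}

func (f Fault) Error() string {
	return fmt.Sprintf("%s: missing privilege %s", f.Name, f.PrivilegeID)
}

// MissingProperty is a property the server could not return, with the reason.
type MissingProperty struct {
	Path  string
	Fault Fault
}

// PropertyChange is a simplified change record; Val holds a value or a Fault.
type PropertyChange struct {
	Name string
	Val  interface{}
}

// ObjectUpdate is a simplified update: properties that changed, plus
// properties that could not be read.
type ObjectUpdate struct {
	ChangeSet  []PropertyChange
	MissingSet []MissingProperty
}

// Changes flattens an update into the slice handed to the caller's callback.
// Missing properties are included only when propagateMissing is true.
func Changes(u ObjectUpdate, propagateMissing bool) []PropertyChange {
	out := append([]PropertyChange(nil), u.ChangeSet...)
	if propagateMissing {
		for _, m := range u.MissingSet {
			out = append(out, PropertyChange{Name: m.Path, Val: m.Fault})
		}
	}
	return out
}

// CheckInfo is a hypothetical callback: it reports done once the "info"
// property arrives, and returns an error if that property carries a Fault.
func CheckInfo(changes []PropertyChange) (bool, error) {
	for _, c := range changes {
		if c.Name != "info" {
			continue
		}
		if f, ok := c.Val.(Fault); ok {
			return true, f
		}
		return true, nil
	}
	return false, nil
}

func main() {
	u := ObjectUpdate{MissingSet: []MissingProperty{
		{Path: "info", Fault: Fault{Name: "NoPermission", PrivilegeID: "System.Read"}},
	}}
	for _, propagate := range []bool{false, true} {
		done, err := CheckInfo(Changes(u, propagate))
		fmt.Println("propagate:", propagate, "done:", done, "err:", err)
	}
}
```

With propagation off, the callback never sees the fault and reports "not done", so a caller would keep waiting. With it on, the fault reaches the callback and can be returned as an error.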
Describe the bug
I was using the Rancher (v2.6.2) Kubernetes (version v1.20.11-rancher1-2) in-tree vsphere cloud provider with vSphere 6.7.0.50000 to provision persistent volumes. When creating a new PersistentVolumeClaim, the status of the claim would remain hung in a 'pending' state. I followed the troubleshooting guide and increased the log level of kube-controller-manager, but the logs still showed no error.
Desperate, I set up an instance of mitmproxy in reverse proxy mode to spy on traffic between Kubernetes and vSphere. This showed that Kubernetes was creating a disk, then using the WaitForUpdatesEx request to wait until the creation was complete. However, the first WaitForUpdatesEx request was returning the following response:
This request was immediately followed by a second WaitForUpdatesEx request, as if the client hadn't understood that the response contained an error. The response from vSphere for the second request was "not found". The client then sent a third WaitForUpdatesEx request. The response this time was completely empty - an EOF.
When deleting the PVC, Kubernetes would display an "unexpected EOF" error in the events of the PVC. The real problem though was when creating the PVC, which just hung with no error at all.
I found a few other bug reports related to govmomi and "unexpected EOF" which may or may not be related to this issue: #2611, #1025, and jetbrains-infra/packer-builder-vsphere#87.
Taking a look at the code, it looks like WaitForUpdatesEx doesn't contain any error handling for the contents of the response body - it just checks for connectivity errors from the RoundTrip:
govmomi/vim25/methods/methods.go
Line 18183 in f04d77d
resBody.Fault_ != nil, however that also failed to pick up the NoPermission fault, so it looks like the parsing of the response body will need to be improved. At this point the third party I am working with gave my account the System.Read privilege on the vCenter level (I previously only had it on the data center level) and I am no longer able to reproduce the issue.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
An error message stating that permission was denied due to lack of System.Read on the name-redacted folder should have been displayed when using kubectl describe pvc pvcname, or when viewing the kube-controller-manager logs.
Affected version
This was experienced with govmomi v0.20.3; however, I can confirm by reading the code that the issue should still exist in the most recent version, which at the time of writing is v0.27.2.
Screenshots/Debug Output
None, but I'm happy to supply any additional detail if required.
Additional context
NA