The object 'vim.VirtualMachine:vm-988369' has already been deleted or has not been completely created #2495
Comments
@jingxu97 @dzjiang91 Here is the PR - #2546. It was merged into the master branch a couple of days ago, so the fix will be available in 3.1.1 or 3.2.0, whichever is released first.
Fix will be available in the v3.1.1 patch release.
Thank you @chethanv28 and @divyenpatel. I am wondering whether the fix will be cherry-picked to the 3.0.x release? Thank you!
@jingxu97 Yes, the fix is being cherry-picked to the 3.0.3 release - #2561
Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug
What happened:
We recreated nodes one by one with the same node name (deleting a node and creating a new node with the same name but a different VM), and the vSphere controller was not restarted. After that, we observed PVC creation failures caused by a volume list failure:
What you expected to happen:
No failure of PVC requests after node recreation.
How to reproduce it (as minimally and precisely as possible):
IIUC, this failure is caused by a race condition, and I don't have a reliable way to reproduce it. It may be reproducible by recreating a node with the same name and then attempting PVC creation.
Anything else we need to know?:
From reading the code and logs, the vSphere CSI controller watches CSINode objects and keeps two maps: 'nodeNameToUUID' (node name to provider ID mapping) and 'nodeVMs' (provider ID to VM information mapping). When a CSINode is deleted, the controller removes the node name and its provider ID from both maps. However, during the machine recreation, the controller did not observe a CSINode removal; it only observed that the CSINode was updated with a new provider ID. In that case, it updates 'nodeNameToUUID' and 'nodeVMs' with the new provider ID without deleting the old machine from 'nodeVMs'.

During PVC creation, the vSphere controller tried to list attached volumes from all machines in 'nodeVMs'. Because 'nodeVMs' contained a machine that had already been removed, the list operation failed, as we observed.

Node first registered:

The node was registered again with a new provider ID. There were no node unregister logs in between, so the old machine with provider ID '4211836d-87b7-9c59-4988-3846de984083' was not removed from 'nodeVMs'.

Subsequent list volume requests failed after attempting to get information about an old machine that had already been deleted:
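To make the stale-entry behavior described above easier to follow, here is a minimal Go sketch. It is not the driver's actual code or the fix from the linked PR: only the map names 'nodeNameToUUID' and 'nodeVMs' come from this report, while `nodeCache`, `vmInfo`, `register`, `vmCount`, the `pruneStale` flag, and the placeholder IDs are all hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// vmInfo is a hypothetical stand-in for the cached VM information.
type vmInfo struct {
	MoRef string
}

// nodeCache mirrors the two maps named in the report (assumed shapes).
type nodeCache struct {
	mu             sync.Mutex
	nodeNameToUUID map[string]string // node name -> provider ID
	nodeVMs        map[string]vmInfo // provider ID -> VM information
}

func newNodeCache() *nodeCache {
	return &nodeCache{
		nodeNameToUUID: make(map[string]string),
		nodeVMs:        make(map[string]vmInfo),
	}
}

// register records a node. When pruneStale is true and the node name is
// already mapped to a different provider ID, the old nodeVMs entry is
// deleted first. With pruneStale false, the old entry is left behind,
// which is the situation described in this issue.
func (c *nodeCache) register(nodeName, uuid string, vm vmInfo, pruneStale bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if old, ok := c.nodeNameToUUID[nodeName]; ok && old != uuid && pruneStale {
		delete(c.nodeVMs, old)
	}
	c.nodeNameToUUID[nodeName] = uuid
	c.nodeVMs[uuid] = vm
}

// vmCount returns how many VMs a "list volumes" pass would iterate over.
func (c *nodeCache) vmCount() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return len(c.nodeVMs)
}

func main() {
	// Same node name re-registered with a new provider ID, no unregister in between.
	stale := newNodeCache()
	stale.register("node-1", "uuid-old", vmInfo{MoRef: "vm-old"}, false)
	stale.register("node-1", "uuid-new", vmInfo{MoRef: "vm-new"}, false)
	fmt.Println("without pruning:", stale.vmCount())

	pruned := newNodeCache()
	pruned.register("node-1", "uuid-old", vmInfo{MoRef: "vm-old"}, true)
	pruned.register("node-1", "uuid-new", vmInfo{MoRef: "vm-new"}, true)
	fmt.Println("with pruning:", pruned.vmCount())
}
```

Without pruning, the cache still holds the deleted VM alongside the new one, so any operation that walks every entry in 'nodeVMs' will touch a VM that no longer exists. Dropping the old provider ID on update is one possible way to avoid that.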
Environment: