fall back to look up the node by name #1966

divyenpatel
Member

@divyenpatel divyenpatel commented Sep 12, 2022

What this PR does / why we need it:
Fall back to looking up the node by name during ControllerUnpublishVolume.

This fix is required for platforms such as TKGi, where control plane nodes are upgraded first and workload nodes are upgraded later. While a workload node is being drained for upgrade, ControllerUnpublishVolume is called with the node name, because that node has not yet been upgraded to publish the node VM UUID as its node ID. In that case the VM cannot be looked up by UUID, so this change adds a fallback that performs one more lookup to find the VM by name in the in-memory cache.

This change lets TKGi use the use-csinode-id feature (also known as decoupling CSI from CPI), released in v2.5.0.

Testing done:
Logs

{"level":"info","time":"2022-09-12T22:23:15.448894412Z","caller":"vanilla/controller.go:1173","msg":"ControllerUnpublishVolume: called with args {VolumeId:00e2181e-6133-444b-a504-6948aadb6587 NodeId:4202ec7e-5ab3-6cf0-dbaa-8ecf117e4ce9 Secrets:map[] XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}","TraceId":"db3cbe3d-ab0a-441f-9da1-47fa5168f078"}
{"level":"error","time":"2022-09-12T22:23:15.458989476Z","caller":"node/manager.go:152","msg":"Node not found with nodeName 4202ec7e-5ab3-6cf0-dbaa-8ecf117e4ce9","TraceId":"db3cbe3d-ab0a-441f-9da1-47fa5168f078","stacktrace":"sigs.k8s.io/vsphere-csi-driver/v2/pkg/common/cns-lib/node.(*defaultManager).GetNodeByName\n\t/build/pkg/common/cns-lib/node/manager.go:152\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/common/cns-lib/node.(*Nodes).GetNodeByName\n\t/build/pkg/common/cns-lib/node/nodes.go:181\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).ControllerUnpublishVolume.func1\n\t/build/pkg/csi/service/vanilla/controller.go:1251\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).ControllerUnpublishVolume\n\t/build/pkg/csi/service/vanilla/controller.go:1277\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_ControllerUnpublishVolume_Handler\n\t/go/pkg/mod/github.com/container-storage-interface/spec@v1.5.0/lib/go/csi/csi.pb.go:5723\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/grpc@v1.40.0/server.go:1297\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/grpc@v1.40.0/server.go:1626\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.2\n\t/go/pkg/mod/google.golang.org/grpc@v1.40.0/server.go:941"}
{"level":"info","time":"2022-09-12T22:23:15.459697682Z","caller":"vanilla/controller.go:1253","msg":"Performing node VM lookup using node VM UUID: \"4202ec7e-5ab3-6cf0-dbaa-8ecf117e4ce9\"","TraceId":"db3cbe3d-ab0a-441f-9da1-47fa5168f078"}
{"level":"info","time":"2022-09-12T22:23:16.674390528Z","caller":"volume/manager.go:757","msg":"DetachVolume: volumeID: \"00e2181e-6133-444b-a504-6948aadb6587\", vm: \"VirtualMachine:vm-53 [VirtualCenterHost: 10.168.197.125, UUID: 4202ec7e-5ab3-6cf0-dbaa-8ecf117e4ce9, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-3, VirtualCenterHost: 10.168.197.125]]\", opId: \"08e460cb\"","TraceId":"db3cbe3d-ab0a-441f-9da1-47fa5168f078"}
{"level":"info","time":"2022-09-12T22:23:16.674567967Z","caller":"volume/manager.go:792","msg":"DetachVolume: Volume detached successfully. volumeID: \"00e2181e-6133-444b-a504-6948aadb6587\", vm: \"VirtualMachine:vm-53 [VirtualCenterHost: 10.168.197.125, UUID: 4202ec7e-5ab3-6cf0-dbaa-8ecf117e4ce9, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-3, VirtualCenterHost: 10.168.197.125]]\", opId: \"08e460cb\"","TraceId":"db3cbe3d-ab0a-441f-9da1-47fa5168f078"}
{"level":"info","time":"2022-09-12T22:23:16.674613441Z","caller":"vanilla/controller.go:1274","msg":"ControllerUnpublishVolume successful for volume ID: 00e2181e-6133-444b-a504-6948aadb6587","TraceId":"db3cbe3d-ab0a-441f-9da1-47fa5168f078"}

Special notes for your reviewer:

Release note:

fall back to look up the node by name

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Sep 12, 2022
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Sep 12, 2022
@divyenpatel divyenpatel force-pushed the fallback-to-lookup-node-by-name branch from 0811a20 to 59e364a Compare September 12, 2022 19:21
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Sep 12, 2022
@svcbot-qecnsdp

Started vanilla Block pipeline... Build Number: 1431

@svcbot-qecnsdp

Block vanilla build status: FAILURE 
Stage before exit: e2e-tests 
Jenkins E2E Test Results: 
JUnit report was created: /home/worker/workspace/Block-Vanilla/Results/1431/vsphere-csi-driver/tests/e2e/junit.xml

Ran 1 of 630 Specs in 417.845 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 629 Skipped
PASS

Ginkgo ran 1 suite in 8m10.522497841s
Test Suite Passed
--
JUnit report was created: /home/worker/workspace/Block-Vanilla/Results/1431/vsphere-csi-driver/tests/e2e/junit.xml

Ran 13 of 630 Specs in 4588.348 seconds
SUCCESS! -- 13 Passed | 0 Failed | 0 Pending | 617 Skipped
PASS

Ginkgo ran 1 suite in 1h16m46.947578911s
Test Suite Passed
--
/home/worker/workspace/Block-Vanilla/Results/1431/vsphere-csi-driver/tests/e2e/csi_cns_telemetry.go:203

Ran 41 of 630 Specs in 546.912 seconds
FAIL! -- 40 Passed | 1 Failed | 0 Pending | 589 Skipped


Ginkgo ran 1 suite in 9m25.395300407s
Test Suite Failed

@svcbot-qecnsdp

Started Vanilla block pre-checkin pipeline... Build Number: 1338

@divyenpatel divyenpatel force-pushed the fallback-to-lookup-node-by-name branch from 59e364a to e6e4810 Compare September 12, 2022 22:08
@svcbot-qecnsdp

Started Vanilla block pre-checkin pipeline... Build Number: 1339

@svcbot-qecnsdp

Build ID: 1339
Block vanilla build status: FAILURE 
Stage before exit: e2e-tests 
Jenkins E2E Test Results: 
JUnit report was created: /home/worker/workspace/csi-block-vanilla-pre-check-in/1339/vsphere-csi-driver/tests/e2e/junit.xml

Ran 1 of 630 Specs in 318.147 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 629 Skipped
PASS

Ginkgo ran 1 suite in 6m13.202093657s
Test Suite Passed
--
JUnit report was created: /home/worker/workspace/csi-block-vanilla-pre-check-in/1339/vsphere-csi-driver/tests/e2e/junit.xml

Ran 12 of 630 Specs in 3958.477 seconds
SUCCESS! -- 12 Passed | 0 Failed | 0 Pending | 618 Skipped
PASS

Ginkgo ran 1 suite in 1h6m15.659342457s
Test Suite Passed
--
/home/worker/workspace/csi-block-vanilla-pre-check-in/1339/vsphere-csi-driver/tests/e2e/vsphere_volume_expansion.go:3241

Ran 40 of 630 Specs in 923.342 seconds
FAIL! -- 35 Passed | 5 Failed | 0 Pending | 590 Skipped


Ginkgo ran 1 suite in 15m41.682716359s
Test Suite Failed

@shalini-b
Collaborator

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Sep 13, 2022
@shalini-b
Collaborator

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 13, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chethanv28, divyenpatel, shalini-b

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [chethanv28,divyenpatel,shalini-b]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 7fad798 into kubernetes-sigs:master Sep 13, 2022
adikul30 pushed a commit to adikul30/vsphere-csi-driver that referenced this pull request Sep 22, 2022
divyenpatel added a commit to divyenpatel/vsphere-csi-driver that referenced this pull request Sep 28, 2022
divyenpatel added a commit to divyenpatel/vsphere-csi-driver that referenced this pull request Sep 28, 2022
divyenpatel added a commit that referenced this pull request Sep 28, 2022
[cherry-pick 2.5] fall back to look up node by name (#1966)
divyenpatel added a commit that referenced this pull request Sep 28, 2022
[cherry-pick 2.6]fall back to look up node by name (#1966)