running vSphere CSI node daemonset pods with hostNetwork true #1217

divyenpatel
Member

What this PR does / why we need it:
This PR configures the vSphere CSI node DaemonSet Pods with hostNetwork: true and dnsPolicy: ClusterFirstWithHostNet.

This change is required to prevent vSAN file share volumes (vSphere CSI RWM volumes) from becoming inaccessible when a vSphere CSI node DaemonSet Pod restarts.
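
For readers without the diff at hand, the two fields named above sit in the Pod template of the DaemonSet. The following is a minimal sketch, not the manifest from this PR: the DaemonSet name, namespace, labels, and the container entry are placeholders assumed for illustration; only hostNetwork and dnsPolicy come from the description above.

```yaml
# Sketch only. Names, namespace, labels, and image are assumed placeholders,
# not copied from the PR diff.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: vsphere-csi-node          # assumed name
  namespace: kube-system          # assumed namespace
spec:
  selector:
    matchLabels:
      app: vsphere-csi-node       # assumed label
  template:
    metadata:
      labels:
        app: vsphere-csi-node
    spec:
      # The two settings this PR describes:
      hostNetwork: true                     # Pod shares the node's network namespace
      dnsPolicy: ClusterFirstWithHostNet    # keep cluster DNS while on the host network
      containers:
        - name: vsphere-csi-node            # placeholder container entry
          image: example.invalid/vsphere-csi-driver:placeholder
```

In Kubernetes, a Pod with hostNetwork: true falls back to the node's DNS configuration unless dnsPolicy is explicitly set to ClusterFirstWithHostNet, which is why the two settings appear together.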

Which issue this PR fixes
fixes #1216

Testing done:
Deployed the vSphere CSI Driver node DaemonSet Pods with this change and confirmed that volumes on the node remain accessible after multiple restarts of the vSphere CSI Driver node DaemonSet.

Special notes for your reviewer:

  • We need to patch all prior releases that support RWM volumes with this change.
  • We need to provide customers with a workaround to recover frozen volumes if they have restarted the vSphere CSI node DaemonSet Pods.
  • We need to provide steps that customers can execute before they upgrade the vSphere CSI Driver. Upgrading the driver also restarts the vSphere CSI node DaemonSet Pods, which can result in frozen volumes.

Release note:

running vSphere CSI node Daemonset pods with hostNetwork true

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 9, 2021
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Aug 9, 2021
@xing-yang
Contributor

/approve

@SandeepPissay
Contributor

/approve

@divyenpatel can you trigger E2E pipelines?

@divyenpatel
Member Author

jtest block-vanilla
jtest file-vanilla

@svcbot-qecnsdp

Started Vanilla block pipeline...

@svcbot-qecnsdp

Started vanilla file pipeline...

@svcbot-qecnsdp

File vanilla build status: FAILURE 
Stage before exit: testbed-deploy 

@svcbot-qecnsdp

Block vanilla build status: FAILURE 
Stage before exit: e2e-tests 
Jenkins E2E Test Results: 
JUnit report was created: /home/worker/workspace/github-csi-block-vanilla/Results/614/vsphere-csi-driver/tests/e2e/junit.xml

Ran 1 of 214 Specs in 398.438 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 213 Skipped
PASS

Ginkgo ran 1 suite in 7m9.186236068s
Test Suite Passed
--
JUnit report was created: /home/worker/workspace/github-csi-block-vanilla/Results/614/vsphere-csi-driver/tests/e2e/junit.xml

Ran 8 of 214 Specs in 3522.707 seconds
SUCCESS! -- 8 Passed | 0 Failed | 0 Pending | 206 Skipped
PASS

Ginkgo ran 1 suite in 58m54.857096238s
Test Suite Passed
--
/home/worker/workspace/github-csi-block-vanilla/Results/614/vsphere-csi-driver/tests/e2e/vsphere_volume_expansion.go:2527

Ran 38 of 214 Specs in 934.418 seconds
FAIL! -- 31 Passed | 7 Failed | 0 Pending | 176 Skipped


Ginkgo ran 1 suite in 15m49.649034379s
Test Suite Failed

@svcbot-qecnsdp

Started Vanilla file pre-checkin pipeline... Build Number: 53

@svcbot-qecnsdp

Build ID: 53
File vanilla build status: FAILURE 
Stage before exit: e2e-tests 
Jenkins E2E Test Results: 
Ran 25 of 214 Specs in 9089.614 seconds
FAIL! -- 18 Passed | 7 Failed | 0 Pending | 189 Skipped
--- FAIL: TestE2E (9089.69s)
FAIL

Ginkgo ran 1 suite in 2h32m12.209536556s
Test Suite Failed
make: Leaving directory `/home/worker/workspace/csi-file-vanilla-pre-check-in/53/vsphere-csi-driver`

@chethanv28 chethanv28 added the release-2.3.0-candidate Indicates PR needs to be cherry-picked for 2.3.0 release label Aug 10, 2021
@svcbot-qecnsdp

Started Vanilla file pre-checkin pipeline... Build Number: 55

@svcbot-qecnsdp

Build ID: 55
File vanilla build status: FAILURE 
Stage before exit: e2e-tests 
Jenkins E2E Test Results: 
/home/worker/workspace/csi-file-vanilla-pre-check-in/55/vsphere-csi-driver/tests/e2e/vsphere_file_volume_basic_mount.go:343

Ran 4 of 75 Specs in 2247.717 seconds
FAIL! -- 1 Passed | 3 Failed | 0 Pending | 71 Skipped

Ginkgo ran 1 suite in 38m26.868490835s
Test Suite Failed
make: Leaving directory `/home/worker/workspace/csi-file-vanilla-pre-check-in/55/vsphere-csi-driver`

@svcbot-qecnsdp

Started Vanilla file pre-checkin pipeline... Build Number: 56

@svcbot-qecnsdp

Build ID: 56
File vanilla build status: SUCCESS 
Stage before exit: e2e-tests 
Jenkins E2E Test Results: 

Ran 6 of 214 Specs in 590.377 seconds
SUCCESS! -- 6 Passed | 0 Failed | 0 Pending | 208 Skipped
PASS

Ginkgo ran 1 suite in 10m47.149727125s
Test Suite Passed
make: Leaving directory `/home/worker/workspace/csi-file-vanilla-pre-check-in/56/vsphere-csi-driver`

@divyenpatel
Member Author

All file share tests passed with this change. There was a DNS issue in the file share setup; @marunachalam is fixing the pipeline by correcting the CoreDNS config to forward DNS resolution to the nameserver when the domain is vsanfs-sh.prv.

After adding the following to the CoreDNS config, the failed test cases passed.

vsanfs-sh.prv:53 {
    errors
    cache 30
    forward . 10.161.191.241
}
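
As a rough illustration of where such a stanza would live, the sketch below places it in a CoreDNS ConfigMap. The ConfigMap name coredns and the namespace kube-system are the kubeadm defaults and are assumed here, not confirmed for this testbed; the existing default server block is deliberately left out rather than guessed.

```yaml
# Sketch only. ConfigMap name and namespace are assumed (kubeadm defaults).
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    # ... the cluster's existing default server block stays unchanged ...
    # Extra server block from the comment above: forward queries for the
    # vsanfs-sh.prv zone to the nameserver used by the file share setup.
    vsanfs-sh.prv:53 {
        errors
        cache 30
        forward . 10.161.191.241
    }
```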

Collaborator

@chethanv28 chethanv28 left a comment

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 10, 2021
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chethanv28, divyenpatel, SandeepPissay, xing-yang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [SandeepPissay,chethanv28,divyenpatel,xing-yang]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-2.3.0-candidate Indicates PR needs to be cherry-picked for 2.3.0 release release-2.3.0-cherry-picked Indicates PR is cherry-picked for 2.3.0 release size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
6 participants