Pod stuck in Terminating if trident volume is not mounted #572

Closed
Missxiaoguo opened this issue May 5, 2021 · 1 comment

@Missxiaoguo

Describe the bug
A pod stays in the "Terminating" state indefinitely because unmounting its volume fails with the error "volume not mounted".

2021-05-03T03:09:35.742478381Z stderr F time="2021-01-13T03:09:35Z" level=error msg="GRPC error: rpc error: code = NotFound desc = volume not mounted"

2021-05-03T03:09:36.325 compute-0 kubelet[24343]: info E0113 03:09:35.796487 24343 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/csi/csi.trident.netapp.io^pvc-80a43fd6-9066-4b98-8a86-5de82fe2cf1b podName:97a388a2-3dd4-4ca0-9694-72683feeea4d nodeName:}" failed. No retries permitted until 2021-01-13 03:09:36.296442309 +0000 UTC m=+2077.724479529 (durationBeforeRetry 500ms). Error: "UnmountVolume.TearDown failed for volume "mon-elasticsearch-master" (UniqueName: "kubernetes.io/csi/csi.trident.netapp.io^pvc-80a43fd6-9066-4b98-8a86-5de82fe2cf1b") pod "97a388a2-3dd4-4ca0-9694-72683feeea4d" (UID: "97a388a2-3dd4-4ca0-9694-72683feeea4d") : kubernetes.io/csi: mounter.TearDownAt failed: rpc error: code = NotFound desc = volume not mounted"

Environment
Provide accurate information about the environment to help us reproduce the issue.

  • Trident version: v20.04.0
  • Trident installation flags used: tridentctl install -n trident --generate-custom-yaml
  • Container runtime: containerd/1.3.3
  • Kubernetes version: v1.18.1
  • Kubernetes orchestrator: self-managed cluster
  • Kubernetes enabled feature gates: none
  • OS: CentOS 7.6
  • NetApp backend types: ontap-nas

To Reproduce
I am not sure why the volume became unmounted on our system.
To simulate the scenario, manually unmount the volume path, i.e.
umount /var/lib/kubelet/pods/97a388a2-3dd4-4ca0-9694-72683feeea4d/volumes/kubernetes.io~csi/pvc-80a43fd6-9066-4b98-8a86-5de82fe2cf1b/mount
Then delete the pod without using force (kubectl delete pod -n ).

Expected behavior
The pod terminates successfully even if the volume is not mounted.

Additional context
Trident returns an error if the volume path is not mounted, see https://github.com/NetApp/trident/blob/master/frontend/csi/node_server.go#L171-L173

This causes the volume teardown to fail, see https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/csi/csi_mounter.go#L364-#L366

Trident should handle the case where the volume is not mounted and allow k8s to complete the teardown, the same way the rbd plugin does (it deletes the volume path if it is not a mount point), see
https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/rbd/disk_manager.go#L112-#L115

Missxiaoguo added the bug label May 5, 2021
gnarl added the tracked label May 5, 2021
@clintonk
Contributor

This is fixed in commit 10efaf2 and will be included in the 21.10.0 release.
