Pods using csi volumes fail to terminate if csi driver pods have been evicted #895
Comments
@juan-lee thanks for raising this issue. This is an issue across all CSI drivers, and it seems like a good enhancement to have in Kubernetes to control the order in which pods are scheduled/deleted. A similar issue arises during a node scale-up event, when workload pods start running before the CSI driver is running on the node. Kubelet retries the volume mount and it eventually succeeds, but this can sometimes take on the order of minutes.
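One partial mitigation, not a full fix for the ordering problem: give the driver DaemonSet the built-in system-node-critical priority class so it schedules ahead of workload pods and is among the last candidates for eviction. A minimal sketch, with illustrative names and a placeholder image:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: csi-driver            # illustrative name
  namespace: kube-system      # system-node-critical is only allowed here by default
spec:
  selector:
    matchLabels:
      app: csi-driver
  template:
    metadata:
      labels:
        app: csi-driver
    spec:
      # Built-in priority class: schedules ahead of workload pods and
      # makes the driver one of the last candidates for eviction.
      priorityClassName: system-node-critical
      tolerations:
      - operator: Exists      # keep running on tainted or draining nodes
      containers:
      - name: driver
        image: example.com/csi-driver:latest   # placeholder image
```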
Managing the driver on the host solves the problem where the service doesn't stop until the node goes away, but it introduces a problem where the service keeps running even if the CSI driver is uninstalled.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues. This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned". In response to this:
> /close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
What steps did you take and what happened:
Pods that mount volumes provided by CSI drivers fail to terminate if the corresponding CSI driver has been evicted or is missing. kubelet gets blocked when trying to unmount because the CSI driver isn't available.
I observed this issue when cluster-autoscaler tried to rebalance pods to another node. My pod was stuck terminating because the secrets-store-csi driver pod was evicted before kubelet could unmount the pod's volume.
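For reference, the affected pods use an inline secrets-store CSI volume like the following (names and the SecretProviderClass are illustrative). Unmounting such a volume at pod termination requires the driver pod to still be running on the same node:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-using-keyvault            # illustrative name
spec:
  containers:
  - name: app
    image: example.com/app:latest     # placeholder image
    volumeMounts:
    - name: secrets-store-inline
      mountPath: /mnt/secrets
      readOnly: true
  volumes:
  - name: secrets-store-inline
    csi:
      driver: secrets-store.csi.k8s.io
      readOnly: true
      volumeAttributes:
        secretProviderClass: my-keyvault-class   # placeholder
```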
What did you expect to happen:
kubelet's dependencies (such as CSI driver pods) shouldn't be evicted before the pods that depend on those drivers have successfully terminated.
Anything else you would like to add:
Consider having CSI DaemonSets install drivers onto the host, managed by systemd, similar to CNI drivers (see the sketch after the KEP link below). This would prevent drivers from disappearing before kubelet is done with them.
KEP: kubernetes/enhancements#1003
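A minimal sketch of that DaemonSet-installer idea, modeled on how CNI plugins are commonly installed. All names, paths, and images here are hypothetical, and a host-side systemd unit (not shown) would still be needed to actually run the copied binary:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: csi-driver-installer          # hypothetical installer, CNI-style
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: csi-driver-installer
  template:
    metadata:
      labels:
        app: csi-driver-installer
    spec:
      initContainers:
      - name: install
        image: example.com/csi-driver:latest   # placeholder image
        # Copy the driver binary onto the host; a host-managed systemd
        # service (not shown) would run it, so the driver survives even
        # if this DaemonSet's pods are evicted.
        command: ["cp", "/csi-driver", "/host/opt/csi/bin/csi-driver"]
        volumeMounts:
        - name: host-bin
          mountPath: /host/opt/csi/bin
      containers:
      - name: pause
        image: registry.k8s.io/pause:3.9       # keeps the pod alive
      volumes:
      - name: host-bin
        hostPath:
          path: /opt/csi/bin
          type: DirectoryOrCreate
```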
Which provider are you using:
Azure Key Vault
Environment:
- Kubernetes version (use `kubectl version`): v1.21.9