iSCSI PV does not recover when a node goes down. #63475
Comments
/sig storage
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/remove-lifecycle stale
@dElogics AFAIK this fundamental issue has been fixed only for several cloud providers, as if bare-metal clusters were second-class citizens...
@adampl can you mention which providers it has been fixed for? I am chasing a simple bug where, in OpenStack, the volumes are not being freed from a node that has been shut down...
I don't remember now, but probably AWS and/or GCE. Now I'm not even sure whether it's actually fixed or it's just a difference in behavior. Some providers actually remove the node from the cluster when it's shut down, which AFAIK somehow helps Kubernetes force-detach the volume. Generally, several solutions for this problem have appeared over time as pull requests (like #67977), but they were eventually put on hold or canceled in favor of something better (like #65392).
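For illustration (not part of the original thread), here is a minimal sketch of the manual recovery path this comment hints at, assuming the downed node is named node-1 and the stuck replacement pod is named my-pod; both names are placeholders:

```bash
# Sketch of a manual workaround, assuming the failed node ("node-1") is not
# coming back on its own and a replacement pod ("my-pod") is stuck in
# ContainerCreating because its volume is still considered attached elsewhere.

# Force-delete the pod object still bound to the dead node, so the controllers
# stop waiting for a clean termination/unmount there.
kubectl delete pod my-pod --grace-period=0 --force

# Remove the dead node object from the cluster, mimicking what some cloud
# providers do automatically when an instance is terminated; this lets the
# attach/detach controller treat the volume as detachable again.
kubectl delete node node-1
```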
I had a similar problem with iSCSI when the node crashed due to a kernel panic.
@aizuddin85 any workarounds?
There is a timer in provisioner.go that will eventually release the lock, after 6 minutes if I recall correctly. I forgot to note which line of the codebase it was.
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
cc: @humblec
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
@fejta-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/kind bug
What happened:
Suppose a pod is bound to a PV and the node running that pod goes down. Under the default configuration, the PV should be force-detached from the downed node within 6 minutes (I think this depends on --attach-detach-reconcile-sync-period) and the pod should successfully relocate to another node; however, on the other node the pod stays stuck in the ContainerCreating state unless the downed node comes back up.
Looking at the bug reports #57497, #50004 and #50200, this bug appears to be fixed in many PV drivers, but not in the iSCSI one.
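To make the description concrete, here is a minimal sketch of the kind of setup involved, using a hypothetical iSCSI target portal 10.0.0.5:3260 and IQN iqn.2018-05.example:storage.target1; none of these names or values come from the original report:

```bash
# Sketch: an iSCSI-backed PV, a claim bound to it, and a single-replica
# Deployment using that claim (a Deployment so the pod is recreated on another
# node after the original node goes down). All names and addresses are hypothetical.
# Note: the cluster in this report is v1.8, where Deployments live under
# apps/v1beta2 rather than apps/v1.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: iscsi-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  iscsi:
    targetPortal: 10.0.0.5:3260
    iqn: iqn.2018-05.example:storage.target1
    lun: 0
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: iscsi-pvc
spec:
  storageClassName: ""          # bind statically to the PV above, no dynamic provisioning
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: iscsi-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: iscsi-test
  template:
    metadata:
      labels:
        app: iscsi-test
    spec:
      containers:
        - name: app
          image: busybox
          command: ["sh", "-c", "sleep 3600"]
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: iscsi-pvc
EOF
```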
How to reproduce it (as minimally and precisely as possible):
Make a node unavailable: take its IP away, pkill kubelet, crash the kernel, power off the hardware/VM, etc.
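A sketch of one way to trigger the failure and observe the symptom, assuming the pod from the setup above is running on a node called node-1 (the node name and the systemd-based kubelet shutdown are assumptions about the environment):

```bash
# Sketch: simulate a node failure and watch whether the replacement pod starts.
# "node-1" is a placeholder for whichever node is currently running the pod.

# On node-1 (one of several ways to make the node unavailable):
systemctl stop kubelet     # or: pkill kubelet, drop the node's IP, power it off, ...

# From a machine that can still reach the API server:
kubectl get nodes -w                      # wait for node-1 to turn NotReady
kubectl get pods -o wide -w               # a replacement pod is scheduled on another node
kubectl describe pod <replacement-pod>    # with the iSCSI plugin it remains in
                                          # ContainerCreating, with events indicating the
                                          # volume is still attached/mounted on the dead node
```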
Environment:
Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.2", GitCommit:"bdaeafa71f6c7c04636251031f93464384d54963", GitTreeState:"clean", BuildDate:"2017-10-24T19:48:57Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.2", GitCommit:"bdaeafa71f6c7c04636251031f93464384d54963", GitTreeState:"clean", BuildDate:"2017-10-24T19:38:10Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Bare metal; 512 GB RAM, 4-processor nodes.
PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
NAME="Debian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
Kernel (e.g. uname -a):
Linux aws-prod132 4.9.0-3-amd64 #1 SMP Debian 4.9.30-2+deb9u5 (2017-09-19) x86_64 GNU/Linux