Pod from DaemonSet stuck in Unknown state because of NodeLost #44458
@kubernetes/sig-apps-bugs
The above should work. The behavior was modified in this PR. Not deleting the unreachable pod was a decision made in 1.5 in the interest of providing safety guarantees. The relevant rationale doc is https://github.com/kubernetes/community/blob/master/contributors/design-proposals/pod-safety.md
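For reference, the manual force deletion discussed in the pod-safety doc can be issued like this; a minimal sketch, using the pod name from this issue and assuming canal runs in the kube-system namespace (an assumption):

```sh
# Force-delete a pod stuck in Unknown on an unreachable node.
# --grace-period=0 --force skips the graceful-termination handshake
# with the kubelet, so only use it when the node is truly gone
# (otherwise the container may keep running on the node).
kubectl delete pod canal-node-36b0t --namespace kube-system \
  --grace-period=0 --force
```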
I had a similar issue, as described in #41916: restarting the apiserver on the master node causes the DaemonSet pod running on that node to exit and become stuck in Unknown.
Closing this issue, because the behavior is as intended. The way around it would be to delete the node from Kubernetes to keep it in sync with the underlying infrastructure, as sketched below.
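A minimal sketch of that workaround, assuming the lost node is named node-1 (a hypothetical name):

```sh
# Find the node the stuck pod was bound to.
kubectl get pod canal-node-36b0t --namespace kube-system -o wide

# Delete the node object; the pod garbage collector then removes pods
# bound to a node that no longer exists, and the DaemonSet controller
# stops counting the stale pod.
kubectl delete node node-1
```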
But should the node be automatically deleted from the cluster if the kubelet does not respond for some period of time?
Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see http://kubernetes.io/docs/troubleshooting/.): no
What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.): unknown daemonset nodelost
Is this a BUG REPORT or FEATURE REQUEST? (choose one):
Kubernetes version (use `kubectl version`):

Environment:
- Kernel (e.g. `uname -a`): 4.9.16-coreos-r1

What happened:
`canal-node.yaml` contains a DaemonSet: https://github.com/projectcalico/canal/blob/master/k8s-install/canal.yaml

I cannot even recreate the DaemonSet: it reports that it needs 16 pods and has 1 running (for interesting definitions of "running", apparently), but it does not start any additional pods.
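The desired-versus-current mismatch shows up in the DaemonSet status; a sketch, assuming the DaemonSet is named canal-node and lives in kube-system (both assumptions):

```sh
# Compare desired, current, and ready pod counts.
kubectl get daemonset canal-node --namespace kube-system

# Check events for why no additional pods are being scheduled.
kubectl describe daemonset canal-node --namespace kube-system
```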
What you expected to happen:

`canal-node-36b0t` should have been deleted when its node was lost, or when the DaemonSet was deleted. At the latest, it should have been deleted when I ran `kubectl delete --force --now`.

How to reproduce it (as minimally and precisely as possible):
Run a DaemonSet, make the node fail, be stuck with a stale pod. (Note that I did not actually try to reproduce this.)
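One way to simulate the node failure without real hardware loss is to stop the kubelet; a sketch (untested, per the note above; assumes shell access to the node):

```sh
# On the affected node: stop the kubelet so it stops posting status.
sudo systemctl stop kubelet

# From a working machine: after the node controller's grace period the
# node goes NotReady and the DaemonSet pod's status turns Unknown.
kubectl get nodes --watch
kubectl get pods --namespace kube-system -o wide --watch
```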
Anything else we need to know:
Please ask for anything that might be missing.