Node that is restarted never reconnects to cluster #45753
Comments
@sjezewski please make sure the client and server are on the same version.
@kubernetes/sig-node-bugs
Can you make sure kubelet is auto-started after your VM is rebooted?
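One way to check the kubelet question above is sketched here. It assumes the kubelet runs as a systemd unit named kubelet, which the issue does not confirm; the function name kubelet_boot_check is hypothetical and only for illustration.

```shell
#!/bin/sh
# Sketch only: assumes kubelet is managed by systemd under the unit name
# "kubelet". Adjust the unit name if your image uses something else.

# Report whether the unit is enabled at boot and currently running.
# Returns 0 only when it is both "enabled" and "active".
kubelet_boot_check() {
  unit="${1:-kubelet}"
  # `systemctl is-enabled` prints e.g. enabled / disabled / static.
  enabled="$(systemctl is-enabled "$unit" 2>/dev/null)"
  # `systemctl is-active` prints e.g. active / inactive / failed.
  active="$(systemctl is-active "$unit" 2>/dev/null)"
  echo "unit=$unit enabled=${enabled:-unknown} active=${active:-unknown}"
  [ "$enabled" = "enabled" ] && [ "$active" = "active" ]
}

# Usage, run on the rebooted node:
#   kubelet_boot_check || journalctl -u kubelet -b --no-pager | tail -n 50
```

If the unit reports enabled and active but the node is still missing from kubectl get nodes, the kubelet log since boot is the next place to look.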
Issues go stale after 90d of inactivity. If this issue is safe to close now, please close it. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now, please close it. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see http://kubernetes.io/docs/troubleshooting/.):
No
What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.): restart node
Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT
Kubernetes version (use kubectl version):
Environment:
uname -a: Linux ip-172-20-34-246 4.4.41-k8s #1 SMP Mon Jan 9 15:34:39 UTC 2017 x86_64 GNU/Linux
Version 1.6.0-beta.1 (git-77f222d)
What happened:
As mentioned here, I need to restart a node as part of the NVIDIA GPU driver installation process.
However, when doing a restart (either via /sbin/shutdown -r or via the AWS UI), the node never seems to come back into the k8s cluster (it never shows up in the output of kubectl get nodes) ... UNLESS ... I kill the API server pod. After that, it takes ~2-3 min for the node to show up again, but it does show up in the output of kubectl get nodes.
I don't think it's just a matter of waiting. I've waited an hour after a restart and the node never reappeared. It seems I must kill the API server pod for the node to get detected again.
What you expected to happen:
After a node restart, the node would appear ready and part of the k8s cluster according to kubectl get nodes.
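The expected behavior can be checked mechanically by polling kubectl get nodes after the reboot. The following is a minimal sketch, not something from the original report: the function names node_status and wait_for_node, the default timeout, and the polling interval are all assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch only: helper names and defaults are hypothetical.

# Print the STATUS column for one node, reading
# `kubectl get nodes --no-headers` output on stdin.
# Prints nothing if the node is not listed at all.
node_status() {
  awk -v name="$1" '$1 == name { print $2 }'
}

# Poll until the node reports Ready or the timeout (seconds) expires.
# Returns 0 once the node is Ready, 1 on timeout.
wait_for_node() {
  name="$1"
  timeout="${2:-600}"
  interval="${3:-10}"
  elapsed=0
  node_state=""
  while [ "$elapsed" -lt "$timeout" ]; do
    node_state="$(kubectl get nodes --no-headers 2>/dev/null | node_status "$name")"
    case "$node_state" in
      # A cordoned node shows as "Ready,SchedulingDisabled".
      Ready|Ready,*)
        echo "$name is Ready after ${elapsed}s"
        return 0
        ;;
    esac
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "$name not Ready after ${timeout}s (last status: ${node_state:-absent})" >&2
  return 1
}

# Usage (the node name is a placeholder):
#   wait_for_node <node-name> 3600 30
```

A last status of "absent" corresponds to the symptom described in this report, where the node never shows up in the list at all, as opposed to being listed as NotReady.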
How to reproduce it (as minimally and precisely as possible):
I believe it's a matter of just restarting any VM. I've only tested on AWS though.
Anything else we need to know: