kubeadm: fix the bug that 'kubeadm upgrade' hangs in single node cluster #88434
/test pull-kubernetes-e2e-gce-100-performance
/assign @rajansandeep
// If we're dry-running, we don't need to wait for the new DNS addon to become ready
if !dryRun {
	nodes, err := client.CoreV1().Nodes().List(context.TODO(), metav1.ListOptions{
		FieldSelector: fields.Set{"spec.unschedulable": "false"}.AsSelector().String(),
i do not recall observing nodes being unschedulable while the CoreDNS addon is being upgraded during "kubeadm upgrade". is this something you have seen @SataQiu ?
According to this guide, https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/#upgrade-the-first-control-plane-node, running kubectl drain <cp-node-name> --ignore-daemonsets in step 2 makes the control plane node unschedulable.
In a single node cluster, that means the only node is marked as unschedulable. The Pods of the new DNS deployment then have nowhere to be scheduled, so the deployment never becomes ready and kubeadm waits for it indefinitely. That's why the program is stuck.
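To make the scenario above concrete, here is a minimal sketch of the decision involved. It does not use client-go; the `node` type and the `schedulableNodes` and `shouldWaitForDNS` helpers are hypothetical names invented for illustration. The filter mimics what listing nodes with the field selector `spec.unschedulable=false` returns. Skipping the wait when that list is empty is an assumption about how the list could be used, since the rest of the change is not visible in the snippet under review.

```go
package main

// node is a hypothetical, stripped-down stand-in for a Kubernetes Node object.
// Only the field relevant to this discussion is modelled.
type node struct {
	name          string
	unschedulable bool // mirrors spec.unschedulable, which `kubectl drain` sets to true
}

// schedulableNodes returns the nodes that are not cordoned, analogous to
// listing nodes with the field selector "spec.unschedulable=false".
func schedulableNodes(nodes []node) []node {
	var out []node
	for _, n := range nodes {
		if !n.unschedulable {
			out = append(out, n)
		}
	}
	return out
}

// shouldWaitForDNS reports whether an upgrade should block until the new DNS
// deployment is ready. Assumption: waiting only makes sense when at least one
// node can accept the new Pods, and never during a dry run.
func shouldWaitForDNS(nodes []node, dryRun bool) bool {
	if dryRun {
		return false
	}
	return len(schedulableNodes(nodes)) > 0
}
```

With a single drained control plane node, `shouldWaitForDNS` returns false, so the upgrade would not block on a deployment that cannot be scheduled. As soon as one schedulable node exists, it returns true and the usual wait applies.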
that is true.
lgtm
/lgtm
Is it worth having a single node cluster upgrade test in
given the minimal bandwidth that we have to monitor and update our e2e tests, i'd argue that the maintenance burden of having single-CP upgrade tests for all branches would not be justified. but something to note here is that our current e2e tests do not drain/cordon at all, so possibly this is something that can be done first.
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: neolit123, SataQiu
What type of PR is this?
/kind bug
What this PR does / why we need it:
kubeadm: fix the bug that 'kubeadm upgrade' hangs in single node cluster
Which issue(s) this PR fixes:
Fixes kubernetes/kubeadm#2035
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: