Single node "best-effort drain" during upgrade #3445

Closed

TeddyAndrieux opened this issue Jul 13, 2021 · 0 comments · Fixed by #3447
Labels
kind:bug Something isn't working · release:blocker An issue that blocks a release until resolved · topic:lifecycle Issues related to upgrade or downgrade of MetalK8s

Comments

@TeddyAndrieux (Collaborator)

Component:

'lifecycle'

What happened:

Due to a suspected bug in kubelet, upgrading a single-node cluster with many Pods running fails, because it takes too long to restart the static Pods (such as the apiserver).

See: kubernetes/kubernetes#103658

What was expected:

Single-node upgrades should work properly.

Resolution proposal (optional):

To avoid this kind of issue, let's drain the node during the upgrade. This is not strictly needed, but since it is a single-node cluster, every service may/will have downtime during the upgrade process anyway.

A full drain may not be possible on a single node, since there is no other node to reschedule the evicted Pods onto (e.g. if some PodDisruptionBudget is configured in the cluster, as illustrated below).
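
For instance, here is a minimal sketch of a PodDisruptionBudget that would block eviction on a single node, assuming a recent official `kubernetes` Python client where the policy/v1 API is available; the `app=example` selector and the resource names are hypothetical:

```python
# Hypothetical PDB: with a single replica of `app=example` running,
# evicting it would violate `min_available=1`, so the eviction API
# refuses it (HTTP 429) and a normal drain keeps retrying until timeout.
from kubernetes import client, config

config.load_kube_config()
pdb = client.V1PodDisruptionBudget(
    metadata=client.V1ObjectMeta(name="example-pdb", namespace="default"),
    spec=client.V1PodDisruptionBudgetSpec(
        min_available=1,
        selector=client.V1LabelSelector(match_labels={"app": "example"}),
    ),
)
client.PolicyV1Api().create_namespaced_pod_disruption_budget("default", pdb)
```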

Add a "best-effort" drain, used for single-node clusters during the upgrade: it just cordons the node and evicts every Pod it can, and if a Pod cannot be evicted it does not retry or fail, it simply continues with the "classic upgrade process" (see the sketch below).
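
A minimal sketch of what such a best-effort drain could look like, assuming the official `kubernetes` Python client; `best_effort_drain` is an illustrative name, not the actual MetalK8s implementation:

```python
# "Best-effort drain" sketch: cordon the node, try to evict every
# evictable Pod once, and never fail the upgrade if an eviction is
# refused (e.g. by a PodDisruptionBudget).
from kubernetes import client, config
from kubernetes.client.rest import ApiException


def best_effort_drain(node_name: str) -> None:
    config.load_kube_config()
    core = client.CoreV1Api()

    # Cordon: mark the node unschedulable so nothing new lands on it.
    core.patch_node(node_name, {"spec": {"unschedulable": True}})

    pods = core.list_pod_for_all_namespaces(
        field_selector=f"spec.nodeName={node_name}"
    ).items

    for pod in pods:
        # Skip DaemonSet-managed and mirror (static) Pods, as a regular
        # drain would.
        owners = pod.metadata.owner_references or []
        if any(owner.kind == "DaemonSet" for owner in owners):
            continue
        if "kubernetes.io/config.mirror" in (pod.metadata.annotations or {}):
            continue

        eviction = client.V1Eviction(
            metadata=client.V1ObjectMeta(
                name=pod.metadata.name,
                namespace=pod.metadata.namespace,
            )
        )
        try:
            core.create_namespaced_pod_eviction(
                pod.metadata.name, pod.metadata.namespace, eviction
            )
        except ApiException:
            # Best-effort: do not retry or abort, just continue with the
            # classic upgrade process.
            continue
```

The key difference from a regular `kubectl drain` is that a refused eviction is simply skipped instead of being retried until a timeout, so the upgrade can always proceed.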

@TeddyAndrieux TeddyAndrieux added kind:bug Something isn't working topic:lifecycle Issues related to upgrade or downgrade of MetalK8s release:blocker An issue that blocks a release until resolved labels Jul 13, 2021
@TeddyAndrieux TeddyAndrieux added this to the MetalK8s 2.10.0 milestone Jul 13, 2021
TeddyAndrieux added a commit that referenced this issue Jul 13, 2021

When running in a single-node cluster we want to drain the node so that
we have as few Pods running on the node as possible.

Fixes: #3445
@bert-e bert-e closed this as completed in 5db464c Jul 15, 2021