Single node "best-effort drain" during upgrade #3445
Labels
kind:bug
Something isn't working
release:blocker
An issue that blocks a release until resolved
topic:lifecycle
Issues related to upgrade or downgrade of MetalK8s
Milestone
Component:
'lifecycle'
What happened:
Due to a "bug" (:question:) in kubelet, upgrading a single-node cluster with a bunch of Pods running fails because restarting the static Pods (like the apiserver) takes too much time.
See: kubernetes/kubernetes#103658
What was expected:
Single-node upgrade should work properly.
Resolution proposal (optional):
To avoid this kind of issue, let's drain the node during the upgrade, even though it is not strictly needed: since this is a single-node cluster, every service may/will have downtime during the upgrade process anyway.
A full drain may not be possible on a single node, since there is no other node to reschedule the required Pods onto (e.g. if some PodDisruptionBudget is configured in the cluster).
Add a "best-effort" drain, used for single-node clusters during the upgrade: this drain just cordons the node and evicts every Pod it can, and does not retry or fail when a Pod cannot be evicted; it simply continues with the "classic upgrade process".
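Roughly, the proposed flow could look like the sketch below. The helper names (`cordon`, `evict`) are placeholders, not the actual MetalK8s/Salt implementation; in practice eviction would go through the Kubernetes Eviction API, and a refusal (e.g. a PodDisruptionBudget violation) is what gets skipped instead of retried:

```python
def best_effort_drain(node, pods, cordon, evict):
    """Cordon `node`, then try to evict each Pod exactly once.

    Unlike a regular drain, an eviction failure (e.g. blocked by a
    PodDisruptionBudget) is recorded and skipped rather than retried,
    so the upgrade can always proceed. Returns the Pods left running.
    """
    cordon(node)  # mark the node unschedulable
    skipped = []
    for pod in pods:
        try:
            evict(pod)
        except Exception:
            # Eviction refused: do not retry, do not fail the upgrade.
            skipped.append(pod)
    return skipped


# Toy usage: one Pod is "protected" by a PDB and cannot be evicted.
def fake_evict(pod):
    if pod == "prometheus-0":
        raise RuntimeError("would violate PodDisruptionBudget")

left_over = best_effort_drain(
    "bootstrap",
    ["nginx-abc", "prometheus-0"],
    cordon=lambda node: None,
    evict=fake_evict,
)
print(left_over)  # the upgrade continues with these Pods still running
```

The key design point is that eviction errors are swallowed on purpose: on a single node there is nowhere else to schedule the Pod, so retrying would only reproduce the timeout this issue describes.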