Unschedulable Pod might take a long time to get the condition set #109796
Labels
kind/bug
Categorizes issue or PR as related to a bug.
needs-triage
Indicates an issue or PR lacks a `triage/foo` label and requires one.
sig/scheduling
Categorizes an issue or PR as relevant to SIG Scheduling.
What happened?
If there is a connection failure when updating the Pod status, we don't retry.
Furthermore, due to #108761, we won't have another chance until much later.
The problem is that if we don't mark the Pod as Unschedulable, other controllers (such as the cluster-autoscaler) won't react to these Pods.
What did you expect to happen?
The scheduler should provide stronger guarantees that an unschedulable Pod actually gets its condition set, for example by retrying the status update on transient errors.
How can we reproduce it (as minimally and precisely as possible)?
You need an unschedulable Pod (for example, a Pod with a non-matching node affinity) and a flaky connection to the apiserver.
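For the first part, a manifest along these lines would produce a permanently unschedulable Pod, assuming no node in the cluster carries the (made-up) label used in the affinity term:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: unschedulable-demo
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: no-such-label   # assumed to match no node in the cluster
            operator: In
            values: ["true"]
  containers:
  - name: pause
    image: registry.k8s.io/pause:3.9
```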
Anything else we need to know?
No response
Kubernetes version
master
This is particularly problematic in 1.24, due to #108761.
Cloud provider
Any
OS version
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)