The high availability of gardener resource manager should be disabled right after the kube-apiserver is destroyed #5464
Labels
area/ops-productivity
Operator productivity related (how to improve operations)
area/robustness
Robustness, reliability, resilience related
kind/enhancement
Enhancement, improvement, extension
How to categorize this issue?
/area ops-productivity robustness
/kind enhancement
What would you like to be added:
The gardener resource manager should be destroyed right after the kube-apiserver in the deletion and hibernation flows. Or at least its pod disruption budget (PDB) should be deleted.
Why is this needed:
Once the kube-apiserver of a shoot cluster is no longer deployed, none of the gardener resource manager replicas work any longer, as they cannot connect to the shoot cluster. Because of this and the PDB, none of the replicas can be gracefully deleted. Most of the time this is not a problem, but lifecycle operations on the seed clusters are unnecessarily prolonged until a forceful drain of the nodes is applied. Depending on other settings and on how many shoots with failed resource managers are deployed, an average-sized seed cluster might not complete its node rollout in 12 hours, whereas it previously took 2-3 hours.
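To illustrate why the PDB blocks graceful eviction, here is a minimal sketch of the arithmetic a pod disruption budget applies: allowed disruptions are the number of currently healthy pods minus the number the budget requires, never less than zero. The replica count and the required-healthy value are hypothetical numbers chosen for illustration, not taken from the actual gardener resource manager deployment, and the sketch assumes the replicas are reported as unhealthy once they lose the connection to the shoot.

```go
package main

import "fmt"

// disruptionsAllowed models the pod disruption budget arithmetic:
// evictions are permitted only while more pods are healthy than the
// budget requires. The result is clamped at zero.
func disruptionsAllowed(currentHealthy, desiredHealthy int) int {
	allowed := currentHealthy - desiredHealthy
	if allowed < 0 {
		return 0
	}
	return allowed
}

func main() {
	// Hypothetical setup: 3 replicas, budget requires 2 healthy pods.
	const desiredHealthy = 2

	// kube-apiserver still deployed: all 3 replicas are healthy.
	fmt.Println(disruptionsAllowed(3, desiredHealthy))

	// kube-apiserver destroyed: assumed 0 healthy replicas, so no
	// replica can be evicted gracefully during a node drain.
	fmt.Println(disruptionsAllowed(0, desiredHealthy))
}
```

With zero healthy replicas the budget never permits an eviction, so a node drain waits until it is forced. Deleting the PDB, or removing the deployment entirely, takes away that constraint.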