The high availability of gardener resource manager should be disabled right after the kube-apiserver is destroyed #5464

Closed
vpnachev opened this issue Feb 18, 2022 · 0 comments · Fixed by #5466
Labels
area/ops-productivity Operator productivity related (how to improve operations) area/robustness Robustness, reliability, resilience related kind/enhancement Enhancement, improvement, extension

Comments

@vpnachev
Member

How to categorize this issue?

/area ops-productivity robustness
/kind enhancement

What would you like to be added:
The resource manager should be destroyed right after the kube-apiserver in the deletion and hibernation flows, or at least its pod disruption budget should be deleted.

Why is this needed:
Once the kube-apiserver of a shoot cluster is no longer deployed, none of the gardener resource manager replicas work, because they cannot connect to the shoot cluster. Because of this and the PDB, none of the replicas can be gracefully deleted. Most of the time this is not a problem, but lifecycle operations on the seed clusters are unnecessarily prolonged until a forceful drain of the nodes is applied. Depending on other settings and on how many shoots with failed resource managers are deployed, an average-sized seed cluster might not complete its node rollout in 12 hours, whereas it previously took 2-3 hours.
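
To illustrate why the PDB blocks graceful deletion, here is a minimal sketch of the budget arithmetic. It is a simplified model, not Gardener or Kubernetes code: the function names are hypothetical, and the replica count (3) and minimum healthy count (2) are assumed values chosen for the example, since the actual PDB settings are not stated here.

```python
# Simplified model of how a PodDisruptionBudget limits voluntary evictions.
# All names and numbers are illustrative assumptions, not real Gardener settings.


def disruptions_allowed(current_healthy: int, desired_healthy: int) -> int:
    """Number of pods that may be evicted without violating the budget."""
    return max(0, current_healthy - desired_healthy)


def can_evict_gracefully(current_healthy: int, desired_healthy: int) -> bool:
    """A node drain can evict a pod only while the budget leaves room."""
    return disruptions_allowed(current_healthy, desired_healthy) > 0


# Assumed setup: 3 replicas, budget requires at least 2 healthy pods.
DESIRED_HEALTHY = 2

# kube-apiserver is running: all 3 replicas are healthy, one eviction fits.
print(can_evict_gracefully(current_healthy=3, desired_healthy=DESIRED_HEALTHY))

# kube-apiserver is gone: no replica is healthy, so every eviction is refused
# and the drain waits until it is forced.
print(can_evict_gracefully(current_healthy=0, desired_healthy=DESIRED_HEALTHY))
```

With zero healthy replicas the budget never leaves room for an eviction, which is why deleting the PDB (or the whole deployment) right after the kube-apiserver would unblock the drain.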

@vpnachev vpnachev added the kind/enhancement Enhancement, improvement, extension label Feb 18, 2022
@gardener-robot gardener-robot added area/ops-productivity Operator productivity related (how to improve operations) area/robustness Robustness, reliability, resilience related labels Feb 18, 2022