Switch default KCM --node-monitor-grace-period to 40s #7688

Closed
Tracked by #6529
vlerenc opened this issue Mar 22, 2023 · 3 comments · Fixed by #7883

Comments

Member

vlerenc commented Mar 22, 2023

What would you like to be added:

When we introduced DWD (dependency-watchdog), we set KCM's --node-monitor-grace-period to 120s. That is three times KCM's default of 40s. We should return to the default for both seeds and shoots. See also gardener/dependency-watchdog#79.

Why is this needed:

During "zone outage simulations" we saw that recovery is very slow. KCM takes 2m to put a node into the Unknown state, and only then do follow-up actions such as endpoint and load balancer deregistration start. This is especially notable for our Istio ingress gateway/API server, which appears available for 2m before it gets updated, even though it is dead the entire time.
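
To make the difference concrete, here is a minimal sketch of the timing argument above. It is a simplified model, not KCM's actual logic: it assumes a node stops sending heartbeats at t=0 and is marked Unknown once the grace period has elapsed, and it ignores KCM's polling interval and any downstream propagation delay. The function name `seconds_until_unknown` is hypothetical and used only for illustration.

```python
from datetime import timedelta


def seconds_until_unknown(grace_period: timedelta,
                          last_heartbeat_age: timedelta = timedelta(0)) -> float:
    """Rough time (in seconds) until a silent node is marked Unknown.

    Simplified assumption: the node is marked Unknown as soon as the time
    since its last heartbeat exceeds the grace period. Polling intervals
    and propagation delays are deliberately left out.
    """
    remaining = grace_period - last_heartbeat_age
    return max(remaining.total_seconds(), 0.0)


# The two values discussed in this issue.
DWD_ERA_SETTING = timedelta(seconds=120)
KCM_DEFAULT = timedelta(seconds=40)

for label, grace in (("120s", DWD_ERA_SETTING), ("40s", KCM_DEFAULT)):
    delay = seconds_until_unknown(grace)
    print(f"--node-monitor-grace-period={label}: "
          f"dead node still looks healthy for about {delay:.0f}s")
```

Under this model, going back to 40s shortens the window in which a dead node still looks healthy by 80s, which is the window during which endpoint and load balancer deregistration cannot start.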

Contributor

timuthy commented Apr 12, 2023

Should we change the default setting to 40s with Kubernetes v1.27?
cc @ary1992

Member

acumino commented May 16, 2023

Completed in #7883
/close

Contributor

gardener-prow bot commented May 16, 2023

@acumino: Closing this issue.

@gardener-prow gardener-prow bot closed this as completed May 16, 2023