Switch default KCM `--node-monitor-grace-period` to `40s` #7688

vlerenc · 2023-03-22T15:00:27Z

What would you like to be added:

When we introduced DWD (dependency-watchdog), we set KCM's `--node-monitor-grace-period` to `120s`. That is very long and far beyond KCM's default of `40s`. We should go back to that default, for both seeds and shoots. See also gardener/dependency-watchdog#79.

Why is this needed:

During "zone outage simulations" we saw that recovery happens very slowly. It takes 2m for KCM to put a node into the `Unknown` state, and only then do follow-up actions such as endpoint and load balancer deregistration move forward. This is especially notable for our Istio ingress gateway/the API server, which appears available for 2m before it gets updated, even though it is dead the entire time.
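
To make the difference concrete, here is a minimal Python sketch. It is a deliberately simplified model and not KCM's actual node lifecycle logic: it assumes a node is treated as `Unknown` once the time since its last observed heartbeat exceeds the grace period. The function name `is_marked_unknown` and the constants are hypothetical and only serve the illustration.

```python
# Simplified model (assumption): a node counts as Unknown once the time since
# its last observed heartbeat exceeds the configured grace period.

OLD_GRACE_S = 120      # value set when DWD was introduced
DEFAULT_GRACE_S = 40   # KCM default for --node-monitor-grace-period


def is_marked_unknown(seconds_since_last_heartbeat: float, grace_period_s: float) -> bool:
    """Return True if the node would be considered Unknown in this simplified model."""
    return seconds_since_last_heartbeat > grace_period_s


if __name__ == "__main__":
    # Walk through a zone outage: the node stops sending heartbeats at t=0.
    for elapsed in (30, 60, 90, 130):
        print(
            f"t={elapsed:>3}s  "
            f"grace=120s -> unknown={is_marked_unknown(elapsed, OLD_GRACE_S)}  "
            f"grace=40s -> unknown={is_marked_unknown(elapsed, DEFAULT_GRACE_S)}"
        )
```

In this model, a node that died 60s ago still looks healthy with the `120s` setting but is already `Unknown` with the `40s` default, so dependent steps such as endpoint deregistration could start roughly 80s earlier.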


timuthy · 2023-04-12T09:36:21Z

Should we change the default setting to 40s with Kubernetes v1.27?
cc @ary1992

acumino · 2023-05-16T08:32:45Z

Completed in #7883
/close

gardener-prow · 2023-05-16T08:32:48Z

@acumino: Closing this issue.

In response to this:

Completed in #7883
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

vlerenc changed the title ~~Switch default KCM --node-monitor-grace-period to 40s~~ Switch default KCM `--node-monitor-grace-period` to `40s` Mar 23, 2023

This was referenced Mar 23, 2023

DWD is seed-centrally installed/configured, but shoots have individual node monitor grace periods gardener/dependency-watchdog#79

Closed

☂️ [GEP-20] Highly Available Seed and Shoot Clusters #6529

Closed

rfranzke mentioned this issue Apr 13, 2023

☂️-Issue for "Support for Kubernetes v1.27" #7783

Closed

17 tasks

acumino mentioned this issue May 16, 2023

Support for Kubernetes v1.27 #7883

Merged

gardener-prow bot closed this as completed May 16, 2023
