Increase leader election settings for shoot's GRM #2667

Merged

@rfranzke rfranzke commented Aug 5, 2020

How to categorize this PR?

/area cost networking robustness
/kind enhancement
/priority normal

What this PR does / why we need it:
We run one gardener-resource-manager (GRM) per shoot control plane, and each GRM performs its leader election via ConfigMaps in the seed, by default every 2s. This can lead to a lot of `PUT /v1/configmaps` requests on the API server, and given that a seed is very busy anyway, we should not put unnecessary load on the API server with this leader election. The GRM's sync period is 1m anyway, so it doesn't matter much if determining the leader takes up to one minute.
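To get a feeling for the numbers, here is a rough back-of-the-envelope sketch. It assumes that each leader renews its lock about once per retry period (which is how client-go's leader election behaves, as far as I know) and it ignores any requests from non-leading replicas. The shoot count and the `10s` value are hypothetical and only there for comparison; they are not the values chosen in this PR.

```go
package main

import (
	"fmt"
	"time"
)

// lockUpdatesPerHour estimates how many lock updates the leaders of
// `instances` independent leader elections send per hour, assuming each
// leader renews its lock once per retryPeriod.
func lockUpdatesPerHour(instances int, retryPeriod time.Duration) float64 {
	if retryPeriod <= 0 {
		return 0
	}
	return float64(instances) * float64(time.Hour) / float64(retryPeriod)
}

func main() {
	// Hypothetical number of shoot control planes on one seed.
	const shoots = 500

	// 2s is the default mentioned above; 10s is an arbitrary comparison value.
	for _, period := range []time.Duration{2 * time.Second, 10 * time.Second} {
		fmt.Printf("retry period %v: ~%.0f lock updates per hour\n",
			period, lockUpdatesPerHour(shoots, period))
	}
}
```

The estimate scales linearly with the number of shoots and inversely with the retry period, so a longer retry period reduces the request volume proportionally.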

Which issue(s) this PR fixes:
Part of #1953

Special notes for your reviewer:
✅ Depends on gardener-attic/gardener-resource-manager#72 and a new release.
/invite @timebertt @vlerenc

Release note:

It is now possible to specify the leader election settings via the following command line parameters: `--leader-election-lease-duration` (default: `15s`), `--leader-election-renew-deadline` (default: `10s`), `--leader-election-retry-period` (default: `2s`).
The leader election performed by the `gardener-resource-manager` deployments in shoot namespaces in the seed now happens less frequently to avoid unnecessarily overloading the seed's API server.
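For illustration, here is a minimal sketch of how the three flags and their defaults fit together. It uses Go's standard `flag` package rather than the GRM's actual flag handling, and the type `leaderElectionSettings` and the function `parseSettings` are hypothetical names. The validation follows the constraints that client-go's leader election enforces as far as I know (lease duration greater than renew deadline, renew deadline greater than the retry period times a jitter factor of 1.2); treat that as an assumption, not as the GRM's implementation.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"time"
)

// jitterFactor is assumed to match the factor client-go applies to the retry period.
const jitterFactor = 1.2

// leaderElectionSettings is a hypothetical container for the three settings.
type leaderElectionSettings struct {
	LeaseDuration time.Duration
	RenewDeadline time.Duration
	RetryPeriod   time.Duration
}

// validate checks that the three durations are ordered consistently.
func (s leaderElectionSettings) validate() error {
	if s.RetryPeriod <= 0 {
		return fmt.Errorf("retry period must be positive, got %v", s.RetryPeriod)
	}
	if s.RenewDeadline <= time.Duration(jitterFactor*float64(s.RetryPeriod)) {
		return fmt.Errorf("renew deadline %v must be greater than %.1f x retry period %v",
			s.RenewDeadline, jitterFactor, s.RetryPeriod)
	}
	if s.LeaseDuration <= s.RenewDeadline {
		return fmt.Errorf("lease duration %v must be greater than renew deadline %v",
			s.LeaseDuration, s.RenewDeadline)
	}
	return nil
}

// parseSettings reads the flags named in the release note, using the documented defaults.
func parseSettings(args []string) (leaderElectionSettings, error) {
	var s leaderElectionSettings
	fs := flag.NewFlagSet("leader-election-sketch", flag.ContinueOnError)
	fs.DurationVar(&s.LeaseDuration, "leader-election-lease-duration", 15*time.Second,
		"how long a lease is valid before another candidate may take over")
	fs.DurationVar(&s.RenewDeadline, "leader-election-renew-deadline", 10*time.Second,
		"how long the leader keeps trying to renew before giving up")
	fs.DurationVar(&s.RetryPeriod, "leader-election-retry-period", 2*time.Second,
		"how long to wait between attempts to acquire or renew the lock")
	if err := fs.Parse(args); err != nil {
		return s, err
	}
	return s, s.validate()
}

func main() {
	settings, err := parseSettings(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid leader election settings:", err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", settings)
}
```

Note that raising only the retry period is rejected by this check: with `--leader-election-retry-period=30s` and the default renew deadline of `10s`, the other two values have to be increased as well.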

@gardener-robot added the area/cost, area/networking, area/robustness, kind/enhancement, and needs/review labels Aug 5, 2020

rfranzke commented Aug 5, 2020

/invite @zanetworker @wyb1 @istvanballok

vpnachev previously approved these changes Aug 5, 2020

/lgtm

ialidzhikov previously approved these changes Aug 5, 2020

/lgtm


rfranzke commented Aug 5, 2020

/ready

@rfranzke rfranzke requested a review from vpnachev August 5, 2020 13:25
@gardener-robot gardener-robot marked this pull request as ready for review August 5, 2020 13:25
@gardener-robot gardener-robot requested a review from a team as a code owner August 5, 2020 13:25

zanetworker left a comment

/lgtm

timebertt left a comment

Thanks!
/lgtm

WDYT about adding a release note saying that the GRM leader election config has changed and that ConfigMap updates should now be reduced?


rfranzke commented Aug 5, 2020

@timebertt I deliberately didn't add one because I wasn't sure how this information would help an operator/admin. It's rather an internal detail, isn't it?


timebertt commented Aug 5, 2020

As you like :)
Still, I think cutting down some of the large number of requests that we have seen, and that have caused some serious issues, is not an internal detail 😉


rfranzke commented Aug 5, 2020

OK, I've added a small note now.

Labels
area/cost, area/networking, area/robustness, kind/enhancement
9 participants