Bug 1858403: Use client-go leader election to write less. #239
We were previously using the controller-runtime defaults:

lease duration: 15s
renew deadline: 10s
retry period: 2s

This meant that the active leader was writing to etcd every 2 seconds to update the lease, which is excessive and led to the bug above.

We now implement leader election using the underlying client-go code to get access to ReleaseOnCancel, which is not presently exposed in controller-runtime. This lets us release the lock immediately on normal shutdown, eliminating the delay before another pod takes over as well as the startup delay during development.
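To put rough numbers on the description above, here is a minimal sketch of the arithmetic. It does not import client-go: `ElectionTiming` is a local stand-in that mirrors the `LeaseDuration`, `RenewDeadline`, and `RetryPeriod` fields of client-go's `leaderelection.LeaderElectionConfig`. The `hypotheticalRelaxed` values are an assumption for comparison only and are not taken from this PR. The shutdown model is also simplified: it ignores the standby's own retry interval and jitter.

```go
package main

import (
	"fmt"
	"time"
)

// ElectionTiming is a local stand-in for the timing fields of client-go's
// leaderelection.LeaderElectionConfig. It is not the real type.
type ElectionTiming struct {
	LeaseDuration time.Duration // how long a lease stays valid without a renewal
	RenewDeadline time.Duration // how long the leader keeps retrying a renewal
	RetryPeriod   time.Duration // interval between renewals (one write each)
}

// The controller-runtime defaults quoted in the PR description.
var controllerRuntimeDefaults = ElectionTiming{
	LeaseDuration: 15 * time.Second,
	RenewDeadline: 10 * time.Second,
	RetryPeriod:   2 * time.Second,
}

// Hypothetical relaxed values, for comparison only (NOT taken from the PR).
var hypotheticalRelaxed = ElectionTiming{
	LeaseDuration: 90 * time.Second,
	RenewDeadline: 60 * time.Second,
	RetryPeriod:   15 * time.Second,
}

// renewWritesPerHour estimates how many lease updates a healthy leader
// issues per hour, assuming one write per retry period.
func renewWritesPerHour(t ElectionTiming) int {
	return int(time.Hour / t.RetryPeriod)
}

// lockFreeAfterShutdown estimates the longest time the lock stays held after
// a clean leader shutdown. With release-on-cancel the leader gives the lock
// up immediately; without it, a standby must wait for the lease to expire.
// Simplification: the standby's own retry interval and jitter are ignored.
func lockFreeAfterShutdown(t ElectionTiming, releaseOnCancel bool) time.Duration {
	if releaseOnCancel {
		return 0
	}
	return t.LeaseDuration
}

func main() {
	cases := []struct {
		name   string
		timing ElectionTiming
	}{
		{"controller-runtime defaults", controllerRuntimeDefaults},
		{"hypothetical relaxed", hypotheticalRelaxed},
	}
	for _, c := range cases {
		fmt.Printf("%s: ~%d lease writes/hour; lock held up to %v after shutdown without release, %v with release\n",
			c.name,
			renewWritesPerHour(c.timing),
			lockFreeAfterShutdown(c.timing, false),
			lockFreeAfterShutdown(c.timing, true))
	}
}
```

With a 2s retry period this works out to roughly 1800 lease writes per hour from the leader alone. A longer lease reduces writes but lengthens the wait after an unclean exit, which is why releasing the lock on a clean shutdown matters more as the lease duration grows.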
@dgoodwin: This pull request references Bugzilla bug 1858403, which is valid. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/hold
/test unit
/test e2e-aws-upgrade
/test e2e-aws-upgrade
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: dgoodwin, joelddiaz
The full list of commands accepted by this bot can be found here. The pull request process is described here.
/hold cancel
@dgoodwin: All pull requests linked via external trackers have merged: Bugzilla bug 1858403 has been moved to the MODIFIED state. In response to this: