daemon: make consecutive quorum errors threshold configurable #16885
Merged: vadorovsky merged 1 commit into cilium:master from ctripcloud:make_quorum_threshold_configurable on Aug 2, 2021
maintainer-s-little-helper (bot) added the dont-merge/needs-release-note-label label (The author needs to describe the release impact of these changes.) on Jul 14, 2021
ArthurChiao force-pushed the make_quorum_threshold_configurable branch from 250a569 to dd475d9 on July 15, 2021 02:31
tklauser reviewed on Jul 19, 2021
ArthurChiao force-pushed the make_quorum_threshold_configurable branch 3 times, most recently from 265f9d6 to d324ad3 on July 19, 2021 14:25
ArthurChiao force-pushed the make_quorum_threshold_configurable branch from d324ad3 to 24a77b7 on July 19, 2021 14:29
When consecutive probes find the heartbeat (written by cilium-operator) missing from the kvstore, the clustermesh module in cilium-agent reconnects to the kvstore. In large clusters or meshes, e.g. clusters with thousands of nodes, the concurrent reconnects and the list+watch operations that follow put significant pressure on the kvstore, to the point of crashing it.

The threshold is currently hardcoded to 2. This patch makes it configurable, so users can choose between failing fast and being more patient when they hit kvstore/operator/k8s-control-plane problems.

Signed-off-by: ArthurChiao <arthurchiao@hotmail.com>
ArthurChiao force-pushed the make_quorum_threshold_configurable branch from 24a77b7 to 8979e92 on July 19, 2021 14:40
errordeveloper approved these changes on Jul 19, 2021
joestringer approved these changes on Jul 19, 2021
joestringer added the release-note/minor label (This PR changes functionality that users may find relevant to operating Cilium.) on Jul 19, 2021
maintainer-s-little-helper (bot) removed the dont-merge/needs-release-note-label label (The author needs to describe the release impact of these changes.) on Jul 19, 2021
tklauser approved these changes on Jul 19, 2021
borkmann approved these changes on Aug 2, 2021
borkmann added the ready-to-merge label (This PR has passed all tests and received consensus from code owners to merge.) on Aug 2, 2021
Labels
ready-to-merge
This PR has passed all tests and received consensus from code owners to merge.
release-note/minor
This PR changes functionality that users may find relevant to operating Cilium.