Bug 1883662: Tune sb-db raft cluster election-timer #812
At the current test scale (100 nodes, 3000+ services, 15K+ pods), the sb-db cluster partitions very frequently with the current default election timer of 5 seconds. This makes the cluster unstable, and once it becomes unstable it takes longer to recover. Given the huge number of flows being installed, using any utility tool to dump the logical flows from sb-db causes a cluster partition with the current election timer. The nb-db raft cluster was hitting a similar issue as well.

Signed-off-by: Anil Vishnoi <avishnoi@redhat.com>
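As a rough illustration of what raising the election timer involves, here is a minimal Python sketch. It is not the change made in this PR. It assumes the timer is changed at runtime with `ovs-appctl ... cluster/change-election-timer` on the raft leader, and that ovsdb-server rejects a single change that more than doubles the current value, so a large increase has to be applied in steps. The control socket path and the 16000 ms target are hypothetical values chosen for the example.

```python
from typing import List

# Assumption: one change request may at most double the current timer value.
MAX_GROWTH_FACTOR = 2


def election_timer_steps(current_ms: int, target_ms: int) -> List[int]:
    """Return the successive timer values needed to go from current_ms to target_ms."""
    if current_ms <= 0:
        raise ValueError("current_ms must be positive")
    if target_ms < current_ms:
        raise ValueError("this sketch only covers raising the timer")
    steps: List[int] = []
    value = current_ms
    while value < target_ms:
        # Never ask for more than double the value currently in effect.
        value = min(value * MAX_GROWTH_FACTOR, target_ms)
        steps.append(value)
    return steps


def change_timer_commands(ctl_path: str, db_name: str,
                          current_ms: int, target_ms: int) -> List[str]:
    """Build one ovs-appctl command line per step (to be run on the raft leader)."""
    return [
        f"ovs-appctl -t {ctl_path} cluster/change-election-timer {db_name} {ms}"
        for ms in election_timer_steps(current_ms, target_ms)
    ]


if __name__ == "__main__":
    # Hypothetical socket path and target value, for illustration only.
    for cmd in change_timer_commands("/var/run/ovn/ovnsb_db.ctl",
                                     "OVN_Southbound", 5000, 16000):
        print(cmd)
```

The sketch only prints the commands. In practice each step would have to be committed by the cluster (visible through `cluster/status`) before the next one is issued.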
Force-pushed from cf33277 to baf8543.
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: dcbw, vishnoianil
@vishnoianil: This pull request references Bugzilla bug 1883662, which is invalid:
/bugzilla refresh
@dcbw: This pull request references Bugzilla bug 1883662, which is valid. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug.
/retest
ovn-step-registry failure is
/retest Please review the full test history for this PR and help us cut down flakes.
/skip
/retest Please review the full test history for this PR and help us cut down flakes.
3 similar comments
@vishnoianil: All pull requests linked via external trackers have merged: Bugzilla bug 1883662 has been moved to the MODIFIED state.