Fix handling of "leader changed" errors #11426
Force-pushed from 68c5f91 to a9e3e5f
@hickeyma @technosophos @mattfarina Folks, @cenkalti tested a bunch of scenarios recreating the problem(s) with etcd with this patch.
This reverts commit ebc79fa. Signed-off-by: Cenk Alti <cenkalti@gmail.com>
Signed-off-by: Cenk Alti <cenkalti@gmail.com>
Force-pushed from b6938a2 to b5378b3
Hey @technosophos @hickeyma. Bumping this up for visibility. It's a small fix. Can you take a look, please?
@hickeyma I tested this fix manually by setting up a custom k8s cluster with 3 etcd nodes and running I checked the https://github.com/helm/acceptance-testing project to see if I could write this scenario as an acceptance test, but I couldn't find a solution, as the clusters are created with
LGTM, thanks @cenkalti
Needs another maintainer approval for merge.
LGTM
@technosophos @mattfarina Can you take a look, please?
Looks good.
@hickeyma feel free to merge whenever you feel good about it.
Hi, any chance this could be patched into the 3.2.x release?
What this PR does / why we need it:
This PR aims to fix the temporary "etcdserver: leader changed" errors returned by kube-apiserver, which #11401 previously attempted to fix, by adding a single retry when this kind of error is detected.
/cc @dims
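For readers who want a concrete picture of the "single retry" idea described above, here is a minimal sketch in Go. It is not the code from this PR: the names `retryOnce`, `isLeaderChanged`, and `errLeaderChanged` are hypothetical, and detecting the error by matching the message text is an assumption made for illustration.

```go
package main

import (
	"errors"
	"strings"
)

// leaderChangedMsg is the message fragment this sketch looks for.
const leaderChangedMsg = "etcdserver: leader changed"

// errLeaderChanged is an illustrative stand-in for the transient error
// surfaced by kube-apiserver (hypothetical; only used for demonstration).
var errLeaderChanged = errors.New(leaderChangedMsg)

// isLeaderChanged reports whether err looks like a transient etcd leader
// change. Matching on the message text is an assumption of this sketch.
func isLeaderChanged(err error) bool {
	return err != nil && strings.Contains(err.Error(), leaderChangedMsg)
}

// retryOnce calls fn and, only if the first attempt fails with a
// "leader changed" error, calls fn exactly one more time.
// Any other result (success or a different error) is returned as-is.
func retryOnce(fn func() error) error {
	err := fn()
	if isLeaderChanged(err) {
		return fn()
	}
	return err
}
```

A caller would wrap the API request in a closure, for example `err := retryOnce(func() error { return doRequest() })`, where `doRequest` is a placeholder for the real call. Limiting this to one retry keeps the behavior predictable: a leader election that has already completed lets the second attempt succeed, while a persistent failure still surfaces to the user.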
Special notes for your reviewer:
I also reverted the previous fix, which didn't solve the issue. One commit is the revert; the other is the new fix. It's better to review the commits separately.
If applicable: