kubeadm: run MemberAdd/Remove for etcd clients with exp-backoff retry #79677
What this PR does / why we need it:
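Implement exponential backoff retry around the MemberAdd call.
This solves a kubeadm problem when concurrently joining control-plane nodes with stacked etcd members.
From experiment, a few retries spaced milliseconds apart are sufficient to achieve the concurrent join of a 3xCP cluster.
Apply the same backoff to MemberRemove in case the concurrent removal of members fails for similar reasons.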
Which issue(s) this PR fixes:
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
NOTE: should be backported to 1.15.
When a new etcd member is added, the etcd cluster can enter a voting state in which any other member added at the exact same time fails immediately with an error. Implement exponential backoff retry around the MemberAdd call. This solves a kubeadm problem when concurrently joining control-plane nodes with stacked etcd members. From experiment, a few retries spaced milliseconds apart are sufficient to achieve the concurrent join of a 3xCP cluster. Apply the same backoff to MemberRemove in case the concurrent removal of members fails for similar reasons.
[APPROVALNOTIFIER] This PR is APPROVED
actually, we are discussing using the same logic in etcdadm because, to the best of our knowledge, etcd simply does not support concurrent joins.
thanks for the +1s, will send a cherry pick soon.