Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enhance leader election doc #77585

Merged
merged 1 commit into from
May 16, 2019

Conversation

andyxning
Copy link
Member

@andyxning andyxning commented May 8, 2019

What type of PR is this?
/kind client-go

What this PR does / why we need it:
When we use leader election pkg under two operator app instances with lease id being Host-A and Host-B, i.e., hostname for lease id. If Host-A is the leader and Host-B is always trying to acquire the lease. After some time of running, we both stop operator app instances running on Host-A and Host-B. After ten minutes later which is longer than lease duration, we then first start operator app instances running on Host-B. But, it can not acquire the lease in the first lease duration time since it is first acquired by an instance running on Host-A.

This will trigger a problem as there is some time that no operator instances can run.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels May 8, 2019
@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels May 8, 2019
@andyxning
Copy link
Member Author

/test pull-kubernetes-bazel-test

Copy link
Member

@neolit123 neolit123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. and removed needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels May 8, 2019
@jennybuckley
Copy link

/assign @cheftako
/cc @jpbetz

@andyxning
Copy link
Member Author

/test pull-kubernetes-integration

@andyxning
Copy link
Member Author

@neolit123 @cheftako @jpbetz CI is green now. PTAL.

@neolit123
Copy link
Member

the change makes sense, but i will defer to the client-go maintainers.

@andyxning
Copy link
Member Author

// 2. Record obtained, check the Identity & Time
if !reflect.DeepEqual(le.observedRecord, *oldLeaderElectionRecord) {
le.observedRecord = *oldLeaderElectionRecord
le.observedTime = le.clock.Now()
}
if len(oldLeaderElectionRecord.HolderIdentity) > 0 &&
le.observedTime.Add(le.config.LeaseDuration).After(now.Time) &&
!(firstStart && oldLeaderElectionRecord.RenewTime.Add(le.config.LeaseDuration).Before(now.Time)) &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This time stamp is not meaningful if it was collected on another machine. The implementation of this client only acts on locally collected time stamps and cannot rely on the accuracy of time stamp in the record for correctness. The client needs to wait the full lease duration without observing a change to the record before it can attempt to take over. Start isn’t different. This is documented here:

// leader (a.k.a. fencing). A client observes timestamps captured locally to
// infer the state of the leader election. Thus the implementation is tolerant
// to arbitrary clock skew, but is not tolerant to arbitrary clock skew rate.

10 minute lease duration sounds too long. It’s defaulted to 15 seconds in core components:

LeaseDuration: metav1.Duration{Duration: 15 * time.Second},

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mikedanese So it is designed by purpose not a bug, actually?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This time stamp is not meaningful if it was collected on another machine. The implementation of this client only acts on locally collected time stamps and cannot rely on the accuracy of time stamp in the record for correctness. The client needs to wait the full lease duration without observing a change to the record before it can attempt to take over. Start isn’t different.

This absolutely needs to be added to the leader election package documentation, IMO. It is more cleaer in describing a actual scenario.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the intended behavior:

// LeaseDuration is the duration that non-leader candidates will
// wait to force acquire leadership. This is measured against time of
// last observed ack.
LeaseDuration time.Duration

And "A client observes timestamps captured locally to infer the state of the leader election" are both documented but an example would definitely make this clearer. We should also document reasonable values for LeaseDruation, RenewDeadline and RetryPeriod.

An ack is any change of the record observed by the client. The RenewTime could change to the previous day, and it would still reset the LeaseDuration countdown for non-leader clients waiting to become leader.

For a starting process, the countdown should start at the time that the initial record is read.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mikedanese In a distributed system, time coherence in different nodes are hard.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mikedanese I have updated the PR title and content to add more doc about this behavior. PTAL.

@mikedanese mikedanese assigned mikedanese and unassigned cheftako May 13, 2019
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels May 14, 2019
@andyxning andyxning force-pushed the fix_leader_election_start branch 2 times, most recently from 87b3d05 to 7f1ddb0 Compare May 14, 2019 02:47
@andyxning andyxning changed the title fix leader election start enhance leader election doc May 14, 2019
@andyxning
Copy link
Member Author

/test pull-kubernetes-e2e-gce-100-performance

@timothysc timothysc removed their request for review May 14, 2019 14:16
@@ -22,6 +22,19 @@ limitations under the License.
// leader (a.k.a. fencing). A client observes timestamps captured locally to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe break the first sentence of this paragraph into the previous paragraph so that discussion on the finer points of timing is cleanly separated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

// timestamps and cannot rely on the accuracy of timestamp in the record for
// correctness.
//
// A client needs to wait a full LeaseDuration without observing a change
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably better placed in the LeaseDuration field documentation

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

@andyxning
Copy link
Member Author

@mikedanese Done. PTAL.

@andyxning
Copy link
Member Author

/test pull-kubernetes-kubemark-e2e-gce-big

1 similar comment
@andyxning
Copy link
Member Author

/test pull-kubernetes-kubemark-e2e-gce-big

@andyxning
Copy link
Member Author

/retest

@mikedanese mikedanese added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label May 15, 2019
@k8s-ci-robot k8s-ci-robot removed the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label May 15, 2019
@mikedanese
Copy link
Member

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 15, 2019
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andyxning, mikedanese

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 15, 2019
@k8s-ci-robot k8s-ci-robot merged commit c2847e8 into kubernetes:master May 16, 2019
@andyxning andyxning deleted the fix_leader_election_start branch May 19, 2019 01:47
nikhita pushed a commit to nikhita/kubernetes that referenced this pull request Jun 13, 2019
…n_start

enhance leader election doc

Kubernetes-commit: c2847e8
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants