New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

leaderelection: Allow leader elected code to step down on a context cancel #71490

Merged
merged 3 commits into from Jan 12, 2019

Conversation

@smarterclayton
Copy link
Contributor

smarterclayton commented Nov 28, 2018

The current code simply exits without continuing to renew the lease, which means participants using a slower lease duration might have to wait multiple minutes before a new leader is elected. Allow an optional flag to be set on LeaderElectionConfig that will release the lease when the calling context is cancelled. Callers must ensure their lease guarded code has exited before the context is cancelled, or other processes may acquire the lease before this lease has released (this is why the behavior is opt-in).

Add an example command that demonstrates how cancellation could be done.

This supports operators, custom controllers, and eventually our own controllers in being able to rapidly react when they are updated (i.e. you update your operator via a helm update, but you don't want to have to wait until the lease expires).

/kind feature

The leaderelection package allows the lease holder to release its lease when the calling context is cancelled. This allows
faster handoff when a leader-elected process is gracefully terminated. 
@smarterclayton

This comment has been minimized.

Copy link
Contributor Author

smarterclayton commented Nov 28, 2018

/assign @mikedanese

@smarterclayton

This comment has been minimized.

Copy link
Contributor Author

smarterclayton commented Nov 28, 2018

@deads2k as well

// you call cancel. Otherwise, you could have a background
// loop still running and another process could
// get elected before your background loop finished, violating
// the stated goal of the lease.

This comment has been minimized.

@justinsb

justinsb Nov 28, 2018

Member

This seems like a fairly hard guarantee to make. Do we have an example, or are you planning a follow-on PR to make this easier?

This comment has been minimized.

@smarterclayton

smarterclayton Nov 28, 2018

Author Contributor

There's a couple of options

  1. most controllers are reentrant, so it doesn't actually matter (the ones I would be concerned with in controller manager are the ones that talk to external systems and allocate things, and even they are probably ok)
  2. have a global "disable client" option when you get a client (basically a .Stop() on client) which is pretty easy to make work - this would be the best option for most people
  3. some controllers are correctly written even today to drain, it's reasonably easy to test them

The ones who would consume this first are things like 1. I need to look at 2.

This comment has been minimized.

@smarterclayton

smarterclayton Dec 27, 2018

Author Contributor

I implemented 2 - it should be fairly easy to use (I made the example demonstrate it)

LeaseDuration: 60 * time.Second,
RenewDeadline: 15 * time.Second,
RetryPeriod: 5 * time.Second,
Callbacks: leaderelection.LeaderCallbacks{

This comment has been minimized.

@justinsb

justinsb Nov 28, 2018

Member

For example, one way we could make this error-proof is to have leaderelection run the controller loop that we're likely running here.

This comment has been minimized.

@smarterclayton

smarterclayton Nov 28, 2018

Author Contributor

Only if exit of the run loop guarantees drain (some do, some don't)

go func() {
<-ch
log.Printf("Received termination, signaling shutdown")
cancel()

This comment has been minimized.

@justinsb

justinsb Nov 28, 2018

Member

I'm a big fan of using context instead of passing the channel around 👍

config, err = rest.InClusterConfig()
}
if err != nil {
log.Fatalf("failed to create client: %v", err)

This comment has been minimized.

@justinsb

justinsb Nov 28, 2018

Member

We should make sure that klog is a drop-in for log as well, I guess :-)

This comment has been minimized.

@smarterclayton

smarterclayton Nov 28, 2018

Author Contributor

I can flip the example, not hard. we don't do a ton of examples, but I wouldn't expect someone looking at this to have to use klog.

This comment has been minimized.

@justinsb

justinsb Nov 28, 2018

Member

Right - this was more an action-item for myself as someone wanting to get involved in klog :-)

@smarterclayton smarterclayton force-pushed the smarterclayton:step_down branch from 69acd06 to a27198f Nov 28, 2018

@smarterclayton

This comment has been minimized.

Copy link
Contributor Author

smarterclayton commented Dec 1, 2018

/retest

},
Client: kubernetes.NewForConfigOrDie(config).CoreV1(),
LockConfig: resourcelock.ResourceLockConfig{
Identity: id,

This comment has been minimized.

@mikedanese

mikedanese Dec 5, 2018

Member

Doesn't this need to be unique per instance of this main?

This comment has been minimized.

@smarterclayton

smarterclayton Dec 9, 2018

Author Contributor

Yes, as a test example you have to pass that in as an argument.

leaderElectionRecord := rl.LeaderElectionRecord{
LeaderTransitions: le.observedRecord.LeaderTransitions,
}
if err := le.config.Lock.Update(leaderElectionRecord); err != nil {

This comment has been minimized.

@mikedanese

mikedanese Dec 5, 2018

Member

tryAcquireOrRenew calls Get() which refreshes resource version. We might want to refresh here as well to avoid conflict errors.

This comment has been minimized.

@smarterclayton

smarterclayton Dec 9, 2018

Author Contributor

So when writing this I was biasing to "best effort" and "step down as quickly as possible before we get nuked by the sig kill". If we refresh, we could potentially have to wait 2 round trips vs one, with the potential for longer behavior.

Since we only step down if we're the leader, shouldn't we have an updated resource version in the lock already as the last writer?

This comment has been minimized.

@mikedanese

mikedanese Jan 10, 2019

Member

shouldn't we have an updated resource version in the lock already as the last writer?

Usually. Tradeoff SGTM.

@@ -284,7 +313,8 @@ func (le *LeaderElector) tryAcquireOrRenew() bool {
le.observedRecord = *oldLeaderElectionRecord
le.observedTime = le.clock.Now()
}
if le.observedTime.Add(le.config.LeaseDuration).After(now.Time) &&
if len(oldLeaderElectionRecord.HolderIdentity) > 0 &&

This comment has been minimized.

@mikedanese

mikedanese Dec 5, 2018

Member

What are the compatibility implications here?

This comment has been minimized.

@smarterclayton

smarterclayton Dec 9, 2018

Author Contributor

I'll add a test to prove behavior for this, but older clients watching observe the lease as still held (they don't refresh). So they assume the holder still holds. If a new client comes up (N+1) while old clients (N) are still running, the new client would have to acquire the lease, then step down, and the old clients would simply be less aggressive in acquiring.

@smarterclayton smarterclayton force-pushed the smarterclayton:step_down branch from a27198f to deeedef Dec 27, 2018

@k8s-ci-robot k8s-ci-robot added size/XL and removed size/L labels Dec 27, 2018

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

k8s-ci-robot commented Dec 27, 2018

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: smarterclayton

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@smarterclayton

This comment has been minimized.

Copy link
Contributor Author

smarterclayton commented Dec 27, 2018

Additional changes (rounding out the "how do I use this practically from a controller").

  • refactor how transport wrappers are provided so that we can simplify the client - it's fairly error prone
  • fixed a minor bug where we weren't copying the rest.Config passed to the discovery client, so we could unintentionally mutate the rest.Config for a subsequent call
  • added a ContextCircuitBreaker transport wrapper that prevents new API requests from being made once the context is closed. This ensures that no new api requests are made once we release the lease, although it's possible that requests based on an earlier observed state are still waiting to be sent (which should be safe).

smarterclayton added some commits Nov 28, 2018

leaderelection: Allow leader elected code to step down on a context c…
…ancel

The current code simply exits without continuing to renew the lease, which means
participants using a slower lease duration might have to wait multiple minutes
before a new leader is elected. Allow an optional flag to be set on
LeaderElectionConfig that will release the lease when the calling context is
cancelled. Callers *must* ensure their lease guarded code has completed before
the context is cancelled, or other processes may acquire the lease before this
lease has released.

Add an example command that demonstrates how cancellation could be done.

As a convenience to users, make event recorder optional - not all users of the
lock code will need a recorder.
Make wrapping a client transport more pleasant
Properly wrapping a transport can be tricky. Make the normal case
(adding a non-nil transport wrapper to a config) easier with a helper.
Also enforce a rough ordering, which in the future we can use to
simplify the WrapTransport mechanism down into an array of functions
we execute in order and avoid wrapping altogether.

@smarterclayton smarterclayton force-pushed the smarterclayton:step_down branch from deeedef to ac48b43 Dec 27, 2018

@smarterclayton smarterclayton force-pushed the smarterclayton:step_down branch from ac48b43 to 8ac5f2c Dec 27, 2018

@smarterclayton

This comment has been minimized.

Copy link
Contributor Author

smarterclayton commented Dec 27, 2018

Ready for review as a complete "a controller can easily step down" PR now.

@smarterclayton

This comment has been minimized.

Copy link
Contributor Author

smarterclayton commented Dec 28, 2018

/retest

var reportedLeader string
var lock rl.Interface
for i := range tests {
test := &tests[i]

This comment has been minimized.

@mikedanese

mikedanese Jan 10, 2019

Member

Is this necessary? I looks like a _, t reference won't change out from underneath the test.

This comment has been minimized.

@smarterclayton

smarterclayton Jan 11, 2019

Author Contributor

if it ever gets made parallel it would be - I can change if you like.

// ContextCircuitBreaker prevents new requests after the provided context is finished.
// err is returned when the context is closed, allowing the caller to provide a context
// appropriate error.
func ContextCircuitBreaker(ctx context.Context, err error) WrapperFunc {

This comment has been minimized.

@mikedanese

mikedanese Jan 10, 2019

Member

ContextCanceller or Cancellable? When I here circuit breaker, I think of error thresholds.

This comment has been minimized.

@smarterclayton

smarterclayton Jan 10, 2019

Author Contributor

Fair. Stopper or Disable also work. I like Cancellable for succinctness.

This comment has been minimized.

@smarterclayton

smarterclayton Jan 11, 2019

Author Contributor

Went with ContextCanceller since it read well

@mikedanese

This comment has been minimized.

Copy link
Member

mikedanese commented Jan 10, 2019

mostly lgtm.

Add transport wrapper that blocks api calls after context close
The ContextCanceller transport wrapper blocks all API requests
after the provided context is closed. Used with the leader election
step down, a controller can ensure that new requests are not made
after the client has stepped down.

@smarterclayton smarterclayton force-pushed the smarterclayton:step_down branch from 8ac5f2c to fe74efb Jan 11, 2019

@mikedanese

This comment has been minimized.

Copy link
Member

mikedanese commented Jan 11, 2019

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Jan 11, 2019

@fejta-bot

This comment has been minimized.

Copy link

fejta-bot commented Jan 11, 2019

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

2 similar comments
@fejta-bot

This comment has been minimized.

Copy link

fejta-bot commented Jan 12, 2019

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@fejta-bot

This comment has been minimized.

Copy link

fejta-bot commented Jan 12, 2019

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@k8s-ci-robot k8s-ci-robot merged commit 5e14bf6 into kubernetes:master Jan 12, 2019

17 of 18 checks passed

pull-kubernetes-kubemark-e2e-gce-big Job triggered.
Details
cla/linuxfoundation smarterclayton authorized
Details
pull-kubernetes-bazel-build Job succeeded.
Details
pull-kubernetes-bazel-test Job succeeded.
Details
pull-kubernetes-cross Skipped
pull-kubernetes-e2e-gce Job succeeded.
Details
pull-kubernetes-e2e-gce-100-performance Job succeeded.
Details
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
Details
pull-kubernetes-e2e-kops-aws Job succeeded.
Details
pull-kubernetes-e2e-kubeadm-gce Skipped
pull-kubernetes-godeps Job succeeded.
Details
pull-kubernetes-integration Job succeeded.
Details
pull-kubernetes-local-e2e Skipped
pull-kubernetes-local-e2e-containerized Skipped
pull-kubernetes-node-e2e Job succeeded.
Details
pull-kubernetes-typecheck Job succeeded.
Details
pull-kubernetes-verify Job succeeded.
Details
tide In merge pool.
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment