Missing cancellation support in LeaderElector #2207

Closed
dgloeckner opened this issue May 7, 2020 · 4 comments · Fixed by #4125

Comments

@dgloeckner

dgloeckner commented May 7, 2020

After calling run(), the LeaderElector eventually enters the loop that renews its lease.
There does not seem to be an API for gracefully shutting down a leader elector. There should be a way to make the LeaderElector break out of the renewal loop, e.g. by calling cancel() on the leader elector.

The following code does not cancel the leader elector:

ExecutorService executorService = Executors.newSingleThreadExecutor();
CountDownLatch startLeading = new CountDownLatch(1);
try (KubernetesClient kc = new DefaultKubernetesClient()) {
	LeaderElector leaderElector = kc.leaderElector()
			.withConfig(new LeaderElectionConfigBuilder()
					.withName("Sample Leader Election configuration")
					.withLeaseDuration(Duration.ofSeconds(15L))
					.withLock(new LeaseLock(namespace, leaseName, identity))
					.withRenewDeadline(Duration.ofSeconds(10L))
					.withRetryPeriod(Duration.ofSeconds(2L))
					.withLeaderCallbacks(new LeaderCallbacks(
							startLeading::countDown,
							() -> log.info("STOPPED LEADERSHIP"),
							newLeader -> log.info("New leader elected {}", newLeader)))
					.withReleaseOnCancel(true)
					.build())
			.build();
	// The future takes care of renewing the lease until cancelled.
	Future<?> makeMeTheLeader = executorService.submit(leaderElector::run);
	startLeading.await();
	// We have the lock.
	...
	// Release the lock.
	makeMeTheLeader.cancel(true);
}

Note: the official Kubernetes Java client has a similar issue, see kubernetes-client/java#944
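The symptom above (cancel(true) having no lasting effect) can be reproduced without Kubernetes at all. The following is a hypothetical stand-alone sketch, not the actual LeaderElector code: if a renewal loop catches InterruptedException and keeps looping without restoring the interrupt flag, interrupting its thread via Future.cancel(true) does not stop it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class InterruptSwallowDemo {

  /** Returns true if the loop keeps iterating after Future.cancel(true). */
  public static boolean survivesCancel() throws InterruptedException {
    ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
      Thread t = new Thread(r);
      t.setDaemon(true); // daemon, so the never-ending loop cannot hang the JVM on exit
      return t;
    });
    CountDownLatch started = new CountDownLatch(1);
    CountDownLatch iterations = new CountDownLatch(3);
    Future<?> renewal = pool.submit(() -> {
      started.countDown();
      while (true) { // hypothetical renewal loop
        try {
          Thread.sleep(10); // stand-in for "renew the lease"
        } catch (InterruptedException e) {
          // swallowed: the interrupt flag is cleared here and never restored
        }
        iterations.countDown();
      }
    });
    started.await();           // make sure the loop is actually running
    renewal.cancel(true);      // delivers exactly one interrupt to the worker
    // If the loop honoured the interrupt, this would time out and return false.
    return iterations.await(2, TimeUnit.SECONDS);
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("loop survived cancel(true): " + survivesCancel());
  }
}
```

Whether the real LeaderElector loop swallows the interrupt in exactly this way is an assumption; the observable behaviour reported here matches this pattern.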


@manusa
Member

manusa commented May 11, 2020

In your example you can achieve this in a "not so clean" way by killing the thread that runs the LeaderElector loop. Exception handling in LeaderElector should take care of cleaning up local resources, and the lock in the cluster will become outdated after the configured period, so a new leader takes its place (this is demonstrated in this example).

However, a cleaner way to stop the LeaderElector could easily be added; we should definitely consider it.

@dgloeckner
Author

Hi there,

I guess by "killing the thread" you mean I should set its interrupted flag, right?

I did that by calling cancel(true) on the Future (see above). In my experience the leader keeps leading ;). In the official K8s client (see the linked issue) this approach did work: the leader stopped after logging a nasty error message.

@manusa
Member

manusa commented May 11, 2020

I guess by "killing the thread" you mean I should set its interrupted flag, right?

Yes, if it's the same Java application (as in our provided example). But there could be several instances of the same application running (in which case it would mean killing the leader application), or several Pods running (in which case it would mean deleting the Pod).

I don't know what happened in your example, but this is basically what's implemented in the provided example (which kills the thread that acquired the lock every 2.5 seconds):

private Future<?> killLeaders(CountDownLatch leadersToKillCountDown) {
  System.out.printf("\rSpawning thread to kill %s leader candidates %s millis after they become leaders%n",
      THREADS_TO_KILL, WAIT_TO_KILL_TIME);
  return executorService.scheduleWithFixedDelay(() -> {
    final String currentLeader = leaderReference.get();
    Optional.ofNullable(currentLeader)
        .map(leaderCandidates::get)
        .ifPresent(leader -> {
          try {
            Thread.sleep(WAIT_TO_KILL_TIME);
            leader.cancel(true);
            leaderCandidates.remove(currentLeader);
            leadersToKillCountDown.countDown();
          } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
          }
        });
  }, 0L, TASK_SLEEP, TimeUnit.MILLISECONDS);
}

You can check out the example by running mvn exec:java -Pleader-election -pl kubernetes-examples. Again, as stated in the previous comment, this is not the cleanest approach, but it does get the job done. The leader will "keep leading" until a new leader gets elected (the lock duration shouldn't be that long anyway; the thread may die for any other reason, and a long lock duration would then cause problems).

Anyway, adding a cancel method is something to be considered.
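For reference, a cooperative cancellation flag is one way such a cancel method could work. This is an illustrative sketch only (the class name CancellableElector, the renewals counter, and the constructor parameter are all made up for this example; this is not the fabric8 API): the renewal loop checks a flag on every iteration, so stopping it no longer depends on interrupt handling deep inside the loop.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of a cancellable renewal loop, not the fabric8 implementation.
public class CancellableElector {
  private final AtomicBoolean cancelled = new AtomicBoolean(false);
  private final long retryPeriodMs;
  private int renewals; // single-writer counter; read by others only after join()

  public CancellableElector(long retryPeriodMs) {
    this.retryPeriodMs = retryPeriodMs;
  }

  /** Renewal loop that checks the flag on every iteration instead of relying on interrupts. */
  public void run() {
    while (!cancelled.get()) {
      renewals++; // stand-in for renewing the lease on the cluster
      try {
        Thread.sleep(retryPeriodMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return; // an interrupt is also honoured as cancellation
      }
    }
    // a releaseOnCancel-style option could release the lock here
  }

  /** Gracefully stops the loop at the next iteration boundary. */
  public void cancel() {
    cancelled.set(true);
  }

  public int renewals() {
    return renewals;
  }
}
```

The commit messages referenced below (mentioning "easy cancel" and no longer holding a thread when using start) suggest the eventual fix went in a similarly cooperative direction.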

@stale

stale bot commented Oct 17, 2020

This issue has been automatically marked as stale because it has not had any activity for 90 days. It will be closed if no further activity occurs within 7 days. Thank you for your contributions!

@stale stale bot added the status/stale label Oct 17, 2020
@stale stale bot removed the status/stale label Oct 19, 2020
shawkins added a commit to shawkins/kubernetes-client that referenced this issue May 6, 2022
…and easy cancel

also removing holding a thread when using start
and adding more jitter support
manusa pushed a commit to shawkins/kubernetes-client that referenced this issue May 19, 2022
…and easy cancel

also removing holding a thread when using start
and adding more jitter support
and a cleanup of the retry logic
@manusa manusa added this to the 6.0.0 milestone May 19, 2022
manusa pushed a commit that referenced this issue May 19, 2022
@manusa manusa closed this as completed May 19, 2022