core: SynchronizationContext exposed by LoadBalancer.Helper #4971

Merged
merged 9 commits into from Oct 23, 2018

Contributor

@zhangkun83 zhangkun83 commented Oct 18, 2018

Provides a SynchronizationContext for scheduling tasks, with and without delay, from LoadBalancer implementations. This absorbs and extends the internal utility ChannelExecutor. It supersedes Helper.runSerialized(), which is now deprecated.

Motivation

I see multiple cases that schedule tasks with a delay while requiring the task to run in the "Channel Executor". There has been repeated work to wrap scheduled tasks and handle races between cancellation and task run (see the diff in GrpclbState.java for example). The LoadBalancer implementation (e.g., GrpclbLoadBalancer) also has to acquire the ScheduledExecutorService from somewhere and release it upon shutdown.

The upcoming HealthCheckLoadBalancer (#4932), which would use a back-off policy to retry health-checking streams, would have to do all the things above. At this point I think we need to provide something that combines runSerialized() with a scheduled executor with the same synchronization guarantees.

Design details

SynchronizationContext is similar to ScheduledExecutorService but tailored for use in LoadBalancer and potentially other cases outside of LoadBalancer. It offers task queuing, serialization, and delayed scheduling. It guarantees non-reentrancy and happens-before ordering among tasks. It owns no thread, but runs tasks on the caller's or caller-provided threads.
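
As a rough illustration of the queuing and non-reentrancy guarantees described above, here is a minimal sketch. `MiniSyncContext` and `MiniSyncContextDemo` are hypothetical names made up for this example and are not the grpc-java classes; the sketch only assumes that tasks are queued and that at most one thread drains the queue at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical, simplified stand-in for a serializing context (not the grpc-java class). */
final class MiniSyncContext implements Executor {
  private final Queue<Runnable> queue = new ConcurrentLinkedQueue<>();
  // The thread currently draining the queue, or null if nobody is draining.
  private final AtomicReference<Thread> drainingThread = new AtomicReference<>();

  /** Enqueues a task without running it; it runs on a later drain(). */
  void executeLater(Runnable task) {
    queue.add(task);
  }

  /** Runs queued tasks on the calling thread unless a drain is already in progress. */
  void drain() {
    do {
      if (!drainingThread.compareAndSet(null, Thread.currentThread())) {
        return; // The active drainer (possibly this thread, re-entrantly) will run the task.
      }
      try {
        Runnable task;
        while ((task = queue.poll()) != null) {
          try {
            task.run();
          } catch (Throwable t) {
            // Keep draining; a fuller implementation would report this to a handler.
            System.err.println("Task failed: " + t);
          }
        }
      } finally {
        drainingThread.set(null);
      }
      // Re-check: a task may have been added after the last poll() but before the release.
    } while (!queue.isEmpty());
  }

  @Override
  public void execute(Runnable task) {
    executeLater(task);
    drain();
  }
}

final class MiniSyncContextDemo {
  /** Returns the order in which an outer task and a task submitted from inside it ran. */
  static List<String> runNested() {
    MiniSyncContext ctx = new MiniSyncContext();
    List<String> order = new ArrayList<>();
    ctx.execute(() -> {
      order.add("outer-start");
      // Submitted from inside a running task: queued, not run inline.
      ctx.execute(() -> order.add("inner"));
      order.add("outer-end");
    });
    return order;
  }
}
```

`MiniSyncContextDemo.runNested()` returns `[outer-start, outer-end, inner]`: the inner task is deferred until the outer task has finished, which is the non-reentrancy property.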

All channel-level state mutations and callback methods on LoadBalancer are done in a SynchronizationContext, which was previously referred to as "Channel Executor".

SynchronizationContext.schedule() returns a ScheduledHandle for status checking and cancellation. The ScheduledFuture returned by ScheduledExecutorService.schedule() is too broad for our use cases (e.g., the blocking get() should never be used).

SynchronizationContext.schedule() requires a ScheduledExecutorService, which is now available through Helper.getScheduledExecutorService(). LoadBalancers don't need to worry about where to get a ScheduledExecutorService any more.
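
To make the shape of the delayed-scheduling path concrete, the following is a sketch under stated assumptions, not the code in this PR. `DelayedScheduler` and `Handle` are hypothetical names; the serializing executor is replaced by a direct executor for brevity, and the sketch assumes `cancel()` is called from within the serialized context so that it cannot race with the task starting.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: delay on a timer, then hand the task to a serializing executor. */
final class DelayedScheduler {
  /** Narrow handle: status check and cancellation only, no blocking get(). */
  static final class Handle {
    volatile boolean cancelled;
    volatile boolean started;
    volatile ScheduledFuture<?> future;

    void cancel() {
      cancelled = true;
      ScheduledFuture<?> f = future;
      if (f != null) {
        f.cancel(false); // Do not interrupt a task that is already running.
      }
    }

    boolean isPending() {
      return !(started || cancelled);
    }
  }

  static Handle schedule(
      Executor serializer, ScheduledExecutorService timer,
      Runnable task, long delay, TimeUnit unit) {
    Handle handle = new Handle();
    // The cancellation check runs inside the serializer, just before the task itself.
    Runnable guarded = () -> {
      if (!handle.cancelled) {
        handle.started = true;
        task.run();
      }
    };
    // The timer thread only forwards the task; it never runs the task body directly.
    handle.future = timer.schedule(() -> serializer.execute(guarded), delay, unit);
    return handle;
  }

  /** Returns {firstTaskRan, firstHandlePending, cancelledHandlePending}. */
  static boolean[] demo() {
    ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
    try {
      Executor direct = Runnable::run; // Stand-in for a real serializing executor.
      CountDownLatch ran = new CountDownLatch(1);
      Handle first = schedule(direct, timer, ran::countDown, 10, TimeUnit.MILLISECONDS);
      Handle second = schedule(direct, timer, () -> {
        throw new IllegalStateException("cancelled task should not run");
      }, 1, TimeUnit.HOURS);
      second.cancel();
      boolean firstRan;
      try {
        firstRan = ran.await(5, TimeUnit.SECONDS);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        firstRan = false;
      }
      return new boolean[] {firstRan, first.isPending(), second.isPending()};
    } finally {
      timer.shutdownNow();
    }
  }
}
```

The handle exposes only `cancel()` and `isPending()`, which is the narrowing relative to ScheduledFuture that the description argues for.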

Alternatives

Alternatively, we could keep Helper.runSerialized() and add something like Helper.runSerializedWithDelay(), but having them on their own interface allows a clean fake implementation by FakeClock for tests, and allows other components (potentially InternalSubchannel for reconnection backoff) to use it too.

Instead of asking the caller of schedule() to provide the ScheduledExecutorService, we considered having SynchronizationContext take a ScheduledExecutorService at construction. It would be inconvenient for LoadBalancer implementations that don't use schedule(), as they would be forced to provide a fake ScheduledExecutorService (which is cumbersome).

Instead of making SynchronizationContext a (semi-)concrete class, we considered making it a pure abstract class. However, we found it nontrivial to implement execute() correctly with the non-reentrancy guarantee.

* submitted.
*/
public final ScheduledContext scheduleNow(Runnable task) {
return schedule(task, 0, TimeUnit.NANOSECONDS);

Member

I'm not excited about making an easy way to make a zero-delay task. This just abuses the scheduled executor and is a strong code smell to me.

Contributor Author

I can instead abstract it and make schedule() call scheduleNow() when delay <= 0. Is it better?

Member

No. It would be surprising for a task to suddenly run in the current thread when the delay is 0. They are fundamentally different.

Contributor Author

It's the same as the current runSerialized(). I still don't understand the issue.

Member

I'm not sure what you're saying is the same. Today runSerialized() runs on the current thread:
https://github.com/grpc/grpc-java/blob/v1.15.0/core/src/main/java/io/grpc/internal/ManagedChannelImpl.java#L1236-L1238

And any schedule() would run on a separate thread. I'm against having schedule() turn into running on the current thread based on the timeout.

Contributor Author

Fair enough. I have decoupled schedule() from scheduleNow().

/**
* Returns the current time in nanos from the same clock that {@link #schedule} uses.
*/
public abstract long currentTimeNanos();

Member

This is a weird API to expose, since it will agree with neither currentTimeMillis nor nanoTime. Based on the documentation it would appear to be similar to nanoTime(), but the actual implementation uses an epoch of 1970 like currentTimeMillis, except if currentTimeMillis and nanoTime get out-of-sync. It seems this should just be nanoTime().

(I don't care really if it has a different offset than nanoTime(), but aligning it to 1970 seems like a bad idea since it can't be guaranteed to align with 1970.)
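
To illustrate why a nanoTime()-style clock needs no particular epoch, here is a small sketch. `MonotonicDeadline` is a hypothetical helper invented for this example; it only assumes that callers pass readings from one monotonic clock such as `System.nanoTime()` and compare them by subtraction.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: a deadline expressed on an arbitrary-offset monotonic clock. */
final class MonotonicDeadline {
  private final long deadlineNanos;

  /** nowNanos is a reading such as System.nanoTime(); its absolute value has no meaning. */
  MonotonicDeadline(long nowNanos, long delay, TimeUnit unit) {
    this.deadlineNanos = nowNanos + unit.toNanos(delay);
  }

  /** Uses subtraction rather than a direct comparison, so wrap-around stays correct. */
  long remainingNanos(long nowNanos) {
    return deadlineNanos - nowNanos;
  }

  boolean isExpired(long nowNanos) {
    return remainingNanos(nowNanos) <= 0;
  }
}
```

Typical use would be `new MonotonicDeadline(System.nanoTime(), 5, TimeUnit.SECONDS)` followed by `isExpired(System.nanoTime())`. Only differences between readings are used, so the clock's offset (1970 or otherwise) never matters.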

/**
* Schedules a task to run as soon as possible.
*
* <p>Non-reentrancy is guaranteed. Although task may run inline, but if this method is called

Member

s/Although task may run inline, but if/If/ . That seems more clear.

public abstract ScheduledContext scheduleNow(Runnable task);

/**
* Schedules a task to be run after a delay. Unlike {@link #scheduleNow}, the task will typically

Member

The "typically" is hard to reason about. Could we just say, "Unlike {@link #scheduleNow}, will never be run inline."? (Or maybe even, just "Will never be run inline.")

And make it a semi-concrete class. And it absorbs ChannelExecutor.

TODO: unit test on new methods on SynchronizationContext.
private final PriorityBlockingQueue<ScheduledTask> tasks =
new PriorityBlockingQueue<ScheduledTask>();
// Must keep the ordering of tasks as they are required by ControlPlaneScheduler.scheduleNow().
private final LinkedBlockingQueue<ScheduledTask> tasks = new LinkedBlockingQueue<>();

Member

Can we simply use two queues instead? One for pending (ready to be executed) tasks and one for scheduled (for a future time) tasks? We'd keep the previous PriorityBlockingQueue and then just add a LinkedBlockingQueue for execute(). That more closely matches what would happen in practice and makes the code more clear.

Contributor Author

Done.

*
* <p>The default implementation logs a warning.
*/
protected void handleUncaughtThrowable(Throwable t) {

Member

Nit: the class could be made final and be passed a Thread.UncaughtExceptionHandler (with a note that the thread will not die after executing the handler, which is different from its documentation).

Contributor Author

Done.

/**
* Enqueues a task that will be run when {@link #drain} is called.
*/
public final void executeLater(Runnable runnable) {

Member

It might be good to point out this is useful for adding things from within a lock and then calling drain outside the lock.

Contributor Author

Done.
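
For readers following this thread, the pattern being suggested (enqueue while holding a lock, drain after releasing it) might look roughly like the sketch below. `LockedCounter` and `TaskQueue` are hypothetical names for illustration only; `TaskQueue` is a minimal stand-in for the queue-and-drain part of the context, not the class in this PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical component that mutates state under a lock and defers its callbacks. */
final class LockedCounter {
  /** Minimal stand-in for the executeLater()/drain() pair. */
  static final class TaskQueue {
    private final Queue<Runnable> queue = new ConcurrentLinkedQueue<>();
    private final AtomicReference<Thread> drainer = new AtomicReference<>();

    void executeLater(Runnable task) {
      queue.add(task);
    }

    void drain() {
      do {
        if (!drainer.compareAndSet(null, Thread.currentThread())) {
          return; // Another drain is in progress and will pick up the task.
        }
        try {
          Runnable task;
          while ((task = queue.poll()) != null) {
            task.run();
          }
        } finally {
          drainer.set(null);
        }
      } while (!queue.isEmpty());
    }
  }

  private final Object lock = new Object();
  private final TaskQueue tasks = new TaskQueue();
  private final List<String> events = new ArrayList<>(); // Touched only by drained tasks.
  private int value;

  void increment() {
    synchronized (lock) {
      value++;
      int snapshot = value;
      // Only enqueue here; running the callback now would execute it under the lock.
      tasks.executeLater(
          () -> events.add("value=" + snapshot + " lockHeld=" + Thread.holdsLock(lock)));
    }
    // The lock is released, so callbacks cannot deadlock against it.
    tasks.drain();
  }

  List<String> events() {
    return events;
  }
}
```

Each recorded event reports `lockHeld=false`, showing that the callback ran only after the lock was released.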

@zhangkun83 zhangkun83 changed the title core: ControlPlaneScheduler exposed by LoadBalancer.Helper core: SynchronizationContext exposed by LoadBalancer.Helper Oct 23, 2018
Contributor Author

@zhangkun83 zhangkun83 left a comment

Thanks @ejona86. All comments are addressed.

* what's documented on {@link UncaughtExceptionHandler#uncaughtException}, the thread is
* not terminated when the handler is called.
*/
public SynchronizationContext(UncaughtExceptionHandler uncaughtExceptionHandler) {

Member

We could provide a zero-arg version that just logs by default. But we can do that at any time. This seems fine for now.

@zhangkun83 zhangkun83 merged commit 7582049 into grpc:master Oct 23, 2018
@ericgribkoff ericgribkoff mentioned this pull request Oct 24, 2018
@lock lock bot locked as resolved and limited conversation to collaborators Jan 21, 2019