Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EPOLL - decouple schedule tasks from epoll_wait life cycle #7834

Merged
merged 3 commits into from Aug 14, 2019

Conversation

Scottmitch
Copy link
Member

@Scottmitch Scottmitch commented Apr 3, 2018

Motivation:
EPOLL supports decoupling the timed wakeup mechanism from the selector call. The EPOLL transport takes advantage of this in order to offer more fine grained timer resolution. However we are current calling timerfd_settime on each call to epoll_wait and this is expensive. We don't have to re-arm the timer on every call to epoll_wait and instead only have to arm the timer when a task is scheduled with an earlier expiration than any other existing scheduled task.

Modifications:

  • Before scheduled tasks are added to the task queue, we determine if the new
    duration is the soonest to expire, and if so update with timerfd_settime. We
    also drain all the tasks at the end of the event loop to make sure we service
    any expired tasks and get an accurate next time delay.
  • EpollEventLoop maintains a volatile variable which represents the next deadline to expire. This variable is modified inside the event loop thread (before calling epoll_wait) and out side the event loop thread (immediately to ensure proper wakeup time).
  • Execute the task queue before the schedule task priority queue. This means we
    may delay the processing of scheduled tasks but it ensures we transfer all
    pending tasks from the task queue to the scheduled priority queue to run the
    soonest to expire scheduled task first.
  • Deprecate IORatio on EpollEventLoop, and drain the executor and scheduled queue on each event loop wakeup. Coupling the amount of time we are allowed to drain the executor queue to a proportion of time we process inbound IO may lead to unbounded queue sizes and unpredictable latency.

Result:
Fixes #7829

  • In most cases this results in less calls to timerfd_settime
  • Less event loop wakeups just to check for scheduled tasks executed outside the event loop
  • More predictable executor queue and scheduled task queue draining
  • More accurate and responsive scheduled task execution

@Scottmitch
Copy link
Member Author

more testing / analysis is required, but figured I would push this upstream to see if everyone is on board.

@carl-mastrangelo - would you mind running the same benchmarks you ran for PR #7816?

/**
* Represents a task that is scheduled to expire some time in the future.
*/
public interface ScheduledTask {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to get the deadline, not the delay.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Scottmitch that said should we just make this part of the Abstract base class and mark as unstable ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What abstract base class? The issue is we need it to be accessible outside the package (PromiseTask and ScheduledFutureTask are package private). I considered putting a protected interface in AbstractScheduledEventExecutor and can do this to reduce the exposure of this API. I will push this change and we can discuss more if necessary.

On a related note ... the relationship between ScheduledFutureTask and AbstractScheduledEventExecutor is very tightly coupled. I would like to tease these two apart in general as there a assumptions needed about using the "correct" nanoTime that may be made less "leaky" for the next major release (I added something to the doc).

@@ -102,23 +102,23 @@ public static FileDescriptor newTimerFd() {
public static native void eventFdWrite(int fd, long value);
public static native void eventFdRead(int fd);
static native void timerFdRead(int fd);
static native void timerFdSetTime(int fd, int sec, int nsec);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should have a throws as it may throw a ClosedChannelException

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch (its actually ChannelException at the moment).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However this is a runtime exception ... would you prefer a IOException using the errno like in epollWait?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I think this would be better (using NativeIOException)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just went with IOException for now ... avoids additional conditional checking and the string description of the error is already included in the IoExecution thrown by the JNI code.

* with the time source (e.g. calling System.nanoTime()) which can be expensive.
*/
private final AtomicLong nextDeadlineNanos = new AtomicLong(MAXIMUM_DEADLINE);
private final AtomicInteger wakenUp = new AtomicInteger();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just use an AtomicBoolean as we use it as a boolean.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered this, however it involves additional conditional behind the scenes. Since this is internal I think we can just use AtomicInteger.

I also moved away from the AtomicFieldUpdater because:

  • the excepted number of instances for this class is low
  • the atomic field updaters have an additional "instance check" which can be avoided.

@normanmaurer
Copy link
Member

@Scottmitch Thanks for this... in general this looks ok. That said I would be interested to see how it compares to #7816 and if it worth it compared to the "easier" solution that was used in #7816

@Scottmitch
Copy link
Member Author

Agreed more analysis is necessary. This PR does a lot of things PR #7816 doesn't do, so lets see if I can quantify it some how :)

}

static long deadlineToDelayNanos(long deadlineNanos) {
return Math.max(0, deadlineNanos - nanoTime());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think you need to allow negative values here. If you were sorting task based on their delay rather than their deadline, you still need to keep track of which task is most tardy.

Copy link
Member Author

@Scottmitch Scottmitch Apr 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this could likely be changed, but is pre-existing behavior and doesn't impact this PR. Note that we are sorting, and determining "next to expire", based upon deadline (not delay).

Lets discuss in a followup issue: #7840

Copy link
Member

@carl-mastrangelo carl-mastrangelo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, no other concerns from me.

@Scottmitch
Copy link
Member Author

@carl-mastrangelo - thanks for review! would you be able to run the same benchmark you did in PR #7816? I'm also planning on doing some analysis, but the more data points the better.

@trustin
Copy link
Member

trustin commented Apr 19, 2018

Sorry I didn't have a chance to review your PR yet, @Scottmitch. Will find some time soon.

@M-AJ
Copy link

M-AJ commented Apr 19, 2018

How can we get this artifact jar or artificatId for a maven dependency?

@normanmaurer
Copy link
Member

@M-AJ I can build one for you if you are interested otherwise you would just apply the patch and build it yourself.

@M-AJ
Copy link

M-AJ commented Apr 20, 2018

@normanmaurer If you can build one, that'd be great! Really appreciate it! I've tried building it locally but the process takes a while for me.

@Scottmitch
Copy link
Member Author

Scottmitch commented Apr 21, 2018

I just rebased to make getting recent fixes easier ... sorry super busy at the moment ... this is still on the backlog and hopefully will get to this soon. However it would be much appreciated if others could also try this out though 👍

@normanmaurer
Copy link
Member

@M-AJ you want a netty-all jar or just the epoll jar ?

@M-AJ
Copy link

M-AJ commented Apr 23, 2018

@normanmaurer netty-all would be perfect. Appreciate it!

@M-AJ
Copy link

M-AJ commented Apr 28, 2018

Hey @normanmaurer, any update on the build? Can't wait to try it out and see how it performs!

@normanmaurer
Copy link
Member

Sorry it took me some time :(

Try this one:
https://drive.google.com/open?id=1S4Ro2xJu0Mf2NepbaFR4zM75wKXdMhDx

@Scottmitch
Copy link
Member Author

@normanmaurer - I updated the commit message. let me know if it makes more sense now.

// is being shutdown, so the timerFd and associated polling mechanism will be destroyed anyways.
}

protected final ScheduledTask peekScheduledTaskDelayNanos() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing Javadoc?

* Get the time stamp relative to some monotonically increasing time source for which this {@link ScheduledTask}
* expires.
* @return the time stamp relative to some monotonically increasing time source for which this
* {@link ScheduledTask} expires.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: indentation

protected interface ScheduledTaskRunnable extends Runnable, ScheduledTask {
/**
* Determine if this object is the result of adding or removing a scheduled task.
* @return {@code true} if this object is the result of adding or removing a scheduled task.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this method return false on removal?

* occur on the event loop thread do not interact with this method.
* @param runnable The {@link ScheduledTaskRunnable} to execute.
*/
protected void executeScheduledRunnable(ScheduledTaskRunnable runnable) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious if the parameter type of this method can be widen to ScheduledTask or even Runnable. If so, we could hide ScheduledTaskRunnable (and even ScheduledTask) completely from the public API, i.e. less maintenance burden.

If a subclass needs the information like deadline, we could just add it as an additional parameter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this suggestion was implemented makes things more complex IMHO. We are trying to communicate from here that the type of task is for scheduled execution, and this information is used conditionally at higher layers (e.g. should we wake up). This has been changed to introduce 3 methods (e.g. beforeScheduledTaskSubmitted, afterScheduledTaskSubmitted, wakesUpForScheduledRunnable) and 1 additional type is exposed NonWakeupRunnable publicly which may require conditional allocation/wrapping at higher layers. IOW we are losing information by just allocating a Runnable here and have to compensate by complicating method signatures and type hierarchy at other layers.

I'll submit a followup PR (since this plus an additional PR was recently reverted) so we can take a look. /cc @normanmaurer @njhill

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Scottmitch. I gave this quite a lot of thought so would be good to discuss more. A few initial responses though:

  • The prior communication depended on the notion of an intermediate task which happens to involve enqueing some other scheduled task onto the internal scheduled-task queue (i.e. not the scheduled task itself), which is something we should not be exposing at all imho... that seems like an internal detail which EL impls shouldn't be aware of.
  • I don't see wakesUpForScheduledRunnable that you referenced, do you mean wakesUpForRunnable or executeScheduledRunnable?
  • The only reason I left wakesUpForRunnable there is that it's protected and so would be a breaking change to remove it, I'd suggest that we deprecate it. A public NonWakeupRunnable interface serves a similar purpose but makes much more sense imho. Whether a given task is required to be run immediately is in general a property of that task and should be independent of the executor/EL implementation. We could consider a lazyExecute(Runnable) method in addition to this (might be be nice to have on the EventExecutor interface in netty 5).
  • I actually think we should remove executeScheduledRunnable altogether (possible since it's package-private), and instead just override schedule(...) in SingleThreadEventExecutor. This would further simplify things and avoid the double Runnable wrapping. Not sure it's also what you had in mind but it's a small change, I'll also push an example of that
  • Only the changes to EpollEventLoop itself were reverted for the release, the reworked superclass abstractions are still there. I'm still very interested to find out what the issues were ... @normanmaurer any luck with a reproducer or more clues? :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Scottmitch I made the changes alluded to above here 3923176 but took things a bit further still, uplifting the before/after hooks to AbstractScheduledEventExecutor and introducing higher level idea of lazyExecute / LazyRunnable which could have general applicability.

I really think this is both simpler to exploit and more generally useful for subclasses (i.e. event loop impls).

/**
* The next task to expire (e.g. minimum delay) has been removed from the scheduled priority queue.
*/
protected void minimumDelayScheduledTaskRemoved(@SuppressWarnings("unused") ScheduledTask task) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto - can ScheduledTask be widen to Runnable?

@normanmaurer
Copy link
Member

@netty-bot test this please

Motivation:
EPOLL supports decoupling the timed wakeup mechanism from the selector call. The EPOLL transport takes advantage of this in order to offer more fine grained timer resolution. However we are current calling timerfd_settime on each call to epoll_wait and this is expensive. We don't have to re-arm the timer on every call to epoll_wait and instead only have to arm the timer when a task is scheduled with an earlier expiration than any other existing scheduled task.

Modifications:
- Before scheduled tasks are added to the task queue, we determine if the new
  duration is the soonest to expire, and if so update with timerfd_settime. We
also drain all the tasks at the end of the event loop to make sure we service
any expired tasks and get an accurate next time delay.
- EpollEventLoop maintains a volatile variable which represents the next deadline to expire. This variable is modified inside the event loop thread (before calling epoll_wait) and out side the event loop thread (immediately to ensure proper wakeup time).
- Execute the task queue before the schedule task priority queue. This means we
  may delay the processing of scheduled tasks but it ensures we transfer all
pending tasks from the task queue to the scheduled priority queue to run the
soonest to expire scheduled task first.
- Deprecate IORatio on EpollEventLoop, and drain the executor and scheduled queue on each event loop wakeup. Coupling the amount of time we are allowed to drain the executor queue to a proportion of time we process inbound IO may lead to unbounded queue sizes and unpredictable latency.

Result:
Fixes netty#7829
- In most cases this results in less calls to timerfd_settime
- Less event loop wakeups just to check for scheduled tasks executed outside the event loop
- More predictable executor queue and scheduled task queue draining
- More accurate and responsive scheduled task execution
@normanmaurer normanmaurer force-pushed the decouple_timer_and_poll branch 2 times, most recently from dc4ceba to 834763e Compare July 2, 2019 13:34
@normanmaurer
Copy link
Member

This is sitting here for a long time now... Let me merge this.

@normanmaurer normanmaurer merged commit 1fa7a5e into netty:4.1 Aug 14, 2019
@normanmaurer normanmaurer added this to the 4.1.40.Final milestone Aug 14, 2019
Copy link
Member

@njhill njhill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Scottmitch @normanmaurer apologies for the belated review of this. I think the general change is great, as well as much of the impl.

I made a few comments inline but in general have some reservations regarding the new abstractions exposed by the superclasses to allow EventLoop impls to integrate with the common timed-task scheduling logic.

I'll put more detail as a comment directly in the PR but my immediate concern is that once these APIs ship we can't easily change them, and I think they could be simpler.

// Just ignore as we use ET mode for the eventfd and timerfd.
//
// See also https://stackoverflow.com/a/12492308/1074097
} else if (fd == timerFd.intValue()) {
// consume wakeup event, necessary because the timer is added with ET mode.
Native.timerFdRead(fd);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be my misunderstanding, but wouldn't ET imply this read isn't needed?

* @return {@code true} if at least {@link Runnable#run()} was called.
*/
private boolean runExistingTasksFrom(Queue<Runnable> taskQueue) {
return taskQueue.offer(BOOKEND_TASK) ? runExistingTasksUntilBookend(taskQueue)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking the same as @carl-mastrangelo's earlier comment here - that the bookend could be easily avoided by just polling size() times. It would be cheaper and simpler, e.g. there's no need to deal specially with the bounded/full case. The JCTools impls won't involve any iteration when called from the consumer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea! calling from consumer thread for MPSC shouldn't require looping.

@@ -186,6 +186,31 @@ static jint netty_epoll_native_epollCreate(JNIEnv* env, jclass clazz) {
return efd;
}

static void netty_epoll_native_timerFdSetTime(JNIEnv* env, jclass clazz, jint timerFd, jint tvSec, jint tvNsec) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you ever consider changing to use TFD_TIMER_ABSTIME? We could just measure/store the offset of System.nanoTime() from the system clock during initialization and then use deadlines rather than delays across the board to further avoid continual absolute/relative conversion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I didn't consider this.

// lower value.
nextDeadline = nextDeadlineNanos.get();
if (nextDeadline - candidateNextDeadline < 0) {
setTimerFd(deadlineToDelayNanos(nextDeadline));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that this fully addresses the race condition. I think access to setTimerFd probably needs to be synchronized.

@Override
public void run() {
try {
assertEquals(1, Native.epollWait(epoll, eventArray, timerFd, -1, -1));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of removing this test why not just adjust it to use the new equivalent Native.epollWaitNoTimeout method?

@normanmaurer
Copy link
Member

normanmaurer commented Aug 15, 2019 via email

@njhill
Copy link
Member

njhill commented Aug 15, 2019

@normanmaurer sure, stand by :)

@njhill
Copy link
Member

njhill commented Aug 16, 2019

@normanmaurer see #9470

if (scheduledTask.deadlineNanos() <= nanoTime) {
scheduledTaskQueue.remove();
ScheduledFutureTask<?> scheduledTask = scheduledTaskQueue.peek();
if (scheduledTask != null && scheduledTask.deadlineNanos() <= nanoTime) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace direct comparison using nanoTime with differences ie scheduledTask.deadlineNanos() - nanoTime <= 0

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Netty's ScheduledFutureTask offsets all time stamps and they are generally compared using equality for deadlines while giving up half of the domain (e.g. overflow earlier). I agree it is confusing relative to typical use of System.nanoTime() and we may want to revisit how time duration/expiration is done in Netty 5 in this area.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"earlier" being after only 300 years, rather than 600 :)

njhill added a commit to njhill/netty that referenced this pull request Aug 19, 2019
Motivation

The epoll transport was updated in netty#7834 to decouple setting of the
timerFd from the event loop, so that scheduling delayed tasks does not
require waking up epoll_wait. To achieve this, new overridable hooks
were added in the AbstractScheduledEventExecutor and
SingleThreadEventExecutor superclasses.

However, the minimumDelayScheduledTaskRemoved hook has no current
purpose and I can't envisage a _practical_ need for it. Removing
it would reduce complexity and avoid supporting this specific
API indefinitely. We can add something similar later if needed
but the opposite is not true.

There also isn't a _nice_ way to use the abstractions for
wakeup-avoidance optimizations in other EventLoops that don't have a
decoupled timer.

This PR replaces executeScheduledRunnable and
wakesUpForScheduledRunnable
with two new methods before/afterFutureTaskScheduled that have slightly
different semantics:
 - They only apply to additions; given the current internals there's no
practical use for removals
 - They allow per-submission wakeup decisions via a boolean return val,
which makes them easier to exploit from other existing EL impls (e.g.
NIO/KQueue)
 - They are subjectively "cleaner", taking just the deadline parameter
and not exposing Runnables
 - For current EL/queue impls, only the "after" hook is really needed,
but specialized blocking queue impls can conditionally wake on task
submission (I have one lined up)

Also included are further optimization/simplification/fixes to the
timerFd manipulation logic.

Modifications

- Remove AbstractScheduledEventExecutor#minimumDelayScheduledTaskRemoved()
and supporting methods
- Uplift NonWakeupRunnable and corresponding default wakesUpForTask()
impl from SingleThreadEventLoop to SingleThreadEventExecutor
- Change executeScheduledRunnable() to be package-private, and have a
final impl in SingleThreadEventExecutor which triggers new overridable
hooks before/afterFutureTaskScheduled()
- Remove unnecessary use of bookend tasks while draining the task queue
- Use new hooks to add simpler wake-up avoidance optimization to
NioEventLoop (primarily to demonstrate utility/simplicity)
- Reinstate removed EpollTest class

In EpollEventLoop:
 - Refactor to use only the new afterFutureTaskScheduled() hook for
updating timerFd
 - Fix setTimerFd race condition using a monitor
 - Set nextDeadlineNanos to a negative value while the EL is awake and
use this to block timer changes from outside the EL. Restore the
known-set value prior to sleeping, updating timerFd first if necessary
 - Don't read from timerFd when processing expiry event

Result

- Cleaner API for integrating with different EL/queue timing impls
- Fixed race condition to avoid missing scheduled wakeups
- Eliminate unnecessary timerFd updates while EL is awake, and
unnecessary expired timerFd reads
- Avoid unnecessary scheduled-task wakeups when using NIO transport

I did not yet further explore the suggestion of using
TFD_TIMER_ABSTIME for the timerFd.
normanmaurer pushed a commit that referenced this pull request Aug 21, 2019
Motivation

The epoll transport was updated in #7834 to decouple setting of the
timerFd from the event loop, so that scheduling delayed tasks does not
require waking up epoll_wait. To achieve this, new overridable hooks
were added in the AbstractScheduledEventExecutor and
SingleThreadEventExecutor superclasses.

However, the minimumDelayScheduledTaskRemoved hook has no current
purpose and I can't envisage a _practical_ need for it. Removing
it would reduce complexity and avoid supporting this specific
API indefinitely. We can add something similar later if needed
but the opposite is not true.

There also isn't a _nice_ way to use the abstractions for
wakeup-avoidance optimizations in other EventLoops that don't have a
decoupled timer.

This PR replaces executeScheduledRunnable and
wakesUpForScheduledRunnable
with two new methods before/afterFutureTaskScheduled that have slightly
different semantics:
 - They only apply to additions; given the current internals there's no
practical use for removals
 - They allow per-submission wakeup decisions via a boolean return val,
which makes them easier to exploit from other existing EL impls (e.g.
NIO/KQueue)
 - They are subjectively "cleaner", taking just the deadline parameter
and not exposing Runnables
 - For current EL/queue impls, only the "after" hook is really needed,
but specialized blocking queue impls can conditionally wake on task
submission (I have one lined up)

Also included are further optimization/simplification/fixes to the
timerFd manipulation logic.

Modifications

- Remove AbstractScheduledEventExecutor#minimumDelayScheduledTaskRemoved()
and supporting methods
- Uplift NonWakeupRunnable and corresponding default wakesUpForTask()
impl from SingleThreadEventLoop to SingleThreadEventExecutor
- Change executeScheduledRunnable() to be package-private, and have a
final impl in SingleThreadEventExecutor which triggers new overridable
hooks before/afterFutureTaskScheduled()
- Remove unnecessary use of bookend tasks while draining the task queue
- Use new hooks to add simpler wake-up avoidance optimization to
NioEventLoop (primarily to demonstrate utility/simplicity)
- Reinstate removed EpollTest class

In EpollEventLoop:
 - Refactor to use only the new afterFutureTaskScheduled() hook for
updating timerFd
 - Fix setTimerFd race condition using a monitor
 - Set nextDeadlineNanos to a negative value while the EL is awake and
use this to block timer changes from outside the EL. Restore the
known-set value prior to sleeping, updating timerFd first if necessary
 - Don't read from timerFd when processing expiry event

Result

- Cleaner API for integrating with different EL/queue timing impls
- Fixed race condition to avoid missing scheduled wakeups
- Eliminate unnecessary timerFd updates while EL is awake, and
unnecessary expired timerFd reads
- Avoid unnecessary scheduled-task wakeups when using NIO transport

I did not yet further explore the suggestion of using
TFD_TIMER_ABSTIME for the timerFd.
@normanmaurer normanmaurer removed this from the 4.1.40.Final milestone Sep 12, 2019
@Scottmitch Scottmitch deleted the decouple_timer_and_poll branch January 9, 2020 23:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve Timer Wakeup for Native Transports
8 participants