New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid unnecessary event loop wakeups #9605
Conversation
In hindsight would have been nice to have had this in the last release! |
Can one of the admins verify this patch? |
@netty-bot test this please |
@normanmaurer I noticed a mistake in the original commit, the task queue must be checked after the volatile deadline is updated to avoid possible race condition, now fixed. This was careless of me, apologies. I also verified that it's not a problem in the |
@netty-bot test this please |
@normanmaurer sorry for the churn, I just pushed another commit to eliminate the infamous 1 second default delay when there's no future tasks scheduled. I looked through the history of this and am fairly certain it serves no purpose now. Let me know if you're not convinced and I can try to convince you :) In fact the way it was implemented meant that if scheduled tasks were not being used, the timerfd was being re-armed every time the EL went to sleep, which is a significant overhead as was demonstrated last year in the context of @carl-mastrangelo's re-arm skipping optimization. |
d7d8d7b
to
7642ec3
Compare
Benchmark results to illustrate the above: Before:
After:
|
transport-native-epoll/src/main/java/io/netty/channel/epoll/EpollEventLoop.java
Outdated
Show resolved
Hide resolved
transport-native-epoll/src/main/java/io/netty/channel/epoll/EpollEventLoop.java
Outdated
Show resolved
Hide resolved
@normanmaurer interestingly while playing with some benchmarks of this I also saw some hangs at the end of the test i.e. during shutdown. Digging further I found it was due to a kind of indirectly related issue ... opened #9616 with fix for this. Hope it's the same root cause as what you're encountering... (though not holding my breath!) |
@njhill was just about to write that it is related to shutdown as I the future returned by shutdownGracefully never completed in some cases (noticed while taking some jstacks). Let me apply your fix as well and see how it goes :) |
@njhill yes this fixes the issue ... Good job 🎉 |
@normanmaurer I pushed another commit which indirectly addresses your comments :) Realized that it's straightforward to unify the deadline and wakenUp volatiles, and the end result is cleaner and more elegant imho. We might want to reevaluate the net benefit of allowing timerfd updates from outside of the EL (i.e. #9590), since this PR provides a lot of the benefit with less synchronization complexity/cost. In any case this seems like it would be a nicer incremental change to make first, WDYT? (there are some orthogonal tweaks in the other PR that would still be nice to include regardless, e.g. processing of scheduled task queue directly rather than copying to the main taskQueue first) |
@njhill can you rebase please ? |
1a196b3
to
f152983
Compare
@normanmaurer done |
@netty-bot test this please |
1 similar comment
@netty-bot test this please |
@njhill sorry but please rebase one last time :) |
f152983
to
4780639
Compare
@normanmaurer done |
@netty-bot test this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me test this one today
@njhill want to squash and adjust commit message ? |
@njhill all tests pass in our internal test suite... This is good to go. Good work! |
Motivation The recently-introduced event loop scheduling hooks can be exploited by the epoll transport to avoid waking the event loop when scheduling future tasks if there is a timer already set to wake up sooner. There is also a "default" timeout which will wake the event loop after 1 second if there are no pending future tasks. The performance impact of these wakeups themselves is likely negligible but there's significant overhead in having to re-arm the timer every time the event loop goes to sleep (see netty#7816). It's not 100% clear why this timeout was there originally but we're sure it's no longer needed. Modification Combine the existing volatile wakenUp and non-volatile prevDeadlineNanos fields into a single AtomicLong that stores the next scheduled wakeup time while the event loop is in epoll_wait, and is -1 while it is awake. Use this as a guard to debounce wakeups from both immediate scheduled tasks and future scheduled tasks, the latter using the new before/afterScheduledTaskSubmitted overrides and based on whether the new deadline occurs prior to an already-scheduled timer. A similar optimization was already added to NioEventLoop, but it still uses two separate volatiles. We should consider similar streamlining of that in a future update. Result Fewer event loop wakeups when scheduling future tasks, greatly reduced overhead when no future tasks are scheduled.
4780639
to
aac896d
Compare
@normanmaurer I squashed and reworded the commit. I will update the PR wording to match when next in front of my comp. |
@netty-bot test this please |
@njhill thanks a lot! Can you please take a look how to port this to master ? |
Motivation
The recently-introduced event loop scheduling hooks can be exploited by
the epoll transport to avoid waking the event loop when scheduling
future tasks if there is a timer already set to wake up sooner.
There is also a "default" timeout which will wake the event
loop after 1 second if there are no pending future tasks. The
performance impact of these wakeups themselves is likely negligible but
there's significant overhead in having to re-arm the timer every time
the event loop goes to sleep (see #7816). It's not 100% clear why this
timeout was there originally but we're sure it's no longer needed.
Modification
Combine the existing volatile wakenUp and non-volatile prevDeadlineNanos
fields into a single AtomicLong that stores the next scheduled wakeup
time while the event loop is in epoll_wait, and is -1 while it is awake.
Use this as a guard to debounce wakeups from both immediate scheduled
tasks and future scheduled tasks, the latter using the new
before/afterScheduledTaskSubmitted overrides and based on whether the
new deadline occurs prior to an already-scheduled timer.
A similar optimization was already added to NioEventLoop, but it still
uses two separate volatiles. We should consider similar streamlining of
that in a future update.
Result
Fewer event loop wakeups when scheduling future tasks, greatly reduced
overhead when no future tasks are scheduled.