Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pluggable algorithm to choose next EventLoop #2470

Closed
wants to merge 1 commit into from

Conversation

@buchgr
Copy link
Contributor

commented May 7, 2014

Hello,

here is a first draft of what I think how #1230 should be implemented.

EpollEventLoopGroup and NioEventLoopGroup get additional constructors to accept an EventLoopChooser object. At its heart the EventLoopChooser has a method EventLoop next(SocketAddress remoteAddress), which is called on each accepted channel and maps each channel to an EventLoop. The remoteAddress parameter is the IP address of the accepted channel. That's useful in cases where a developer wants to map channels to event loops based on IP address or subnet. However, it might also be null, in cases where e.g. a task is assigned an event loop. So developers have to ensure that their code works in both cases. Most developers will simply want to ignore the parameter and assign channels to event loops based on some other criteria, e.g. provided by a MetricsProvider.

So what's a MetricsProvider? A MetricsProvider is mostly (85,7% to be precise) a MetricsCollector and the latter one has lots of methods that are called in the different Channel implementations to collect data about their inside. Like the number of registered channels, transferred bytes, etc. Each event loop gets its own MetricsCollector, that way a developer can implement a MetricsCollector to provide metrics on a per event loop basis and use this information in the EventLoopChooser#next(SocketAddress remoteAddress) method to decide which channel is mapped to which event loop.

In order to make metrics computation easier, some basic data structures for (what I believe to be) common use cases are provided. One such data structure is what I call the TickValueHistory.

A TickValueHistory object simply provides the sum of the last n ticks (think n seconds) through a call to TickValueHistory#value(). It can be used to answer queries like "how many bytes were transferred in the last n seconds?". It's usage is illustrated in the DefaultMetricsProvider class. Basically, a timer increments an integer currentTick every second and whenever bytes are read from a channel TickValueHistory#update(currentTick, bytesRead) is called. The main advantage of this design is that it avoids lots of calls to System.nanoTime() and reduces the "hot path" to two simple integer additions. Furthermore, I am aware that on the accuracy of Netty's scheduled task times can't be counted on (1 second, might really be 2 seconds if the event loop is real busy). I am not sure how much of a problem this will be in practice and so I plan to do some experiments to see it for myself. If times turn out to be really inaccurate, an easy fix would be to simply call System.nanoTime() every tick (think "netty second") and normalize the sum to a real second (that is, divide the sum by the actual time passed).
Also I did not use Fenwick Trees (as opposed to what I wrote in my proposal). The reason being that I think that the generality they provide is simply not needed. Basically, with a Fenwick Tree one could get the sum for arbitrary periods of time (at a greater cost), while the current implementation only allows to look back a specific period of time. In other words, with Fenwick trees the numPastTicks parameter would not be in the constructor but in the value method.

What's still missing besides tests and comments?

  • Add another data structure to provide a moving average. That's useful to answer questions like "What was the average pipeline traversal time in the last m seconds?"
  • Add more metric collection methods. I would like to have some more discussions about what's really useful tough, in order to not add metric hooks prematurely.
  • Add metric hooks to all transports.
  • Benchmarks and lots of testing
  • Some EventLoopChooser implementations to demonstrate the usefulness of all of this.
  • Address @normanmaurer's and @trustin's concerns :)
@ghost

This comment has been minimized.

Copy link

commented May 7, 2014

Build result for #2470 at 6ffbdbad362628fb60be7c1df697f99fe9f4f75b: Failure

@ghost

This comment has been minimized.

Copy link

commented May 7, 2014

Build result for #2470 at d12a8e3305904fb2bd4c9b24b49306a6a092289c: Failure

@normanmaurer

This comment has been minimized.

Copy link
Member

commented May 7, 2014

@jakobbuchgraber after reading your comments I think all of this makes a lot of sense... Now looking at the code directly...

success = true;
} catch (Exception e) {
// TODO: Think about if this is a good exception type
throw new IllegalStateException("failed to create a child event loop", e);
} finally {
if (!success) {
for (int j = 0; j < i; j ++) {
children[j].shutdownGracefully();
for (EventExecutor e : this.chooser.children()) {

This comment has been minimized.

Copy link
@normanmaurer

normanmaurer May 7, 2014

Member

We basically used the (int j = ....) stuff to make sure the JVM does not create an iterator for this kind of work and so produce more GC

This comment has been minimized.

Copy link
@buchgr

buchgr May 7, 2014

Author Contributor

@normanmaurer How come? My impression is that this code is hardly ever executed at all. Only in cases when Netty crashes or is shutting down?

This comment has been minimized.

Copy link
@normanmaurer

normanmaurer May 7, 2014

Member

true... I only wanted to highlight why we had it like this :)

@normanmaurer

View changes

common/src/main/java/io/netty/util/metrics/NoMetricsProvider.java Outdated

import io.netty.util.concurrent.EventExecutor;

public class NoMetricsProvider implements MetricsProvider {

This comment has been minimized.

Copy link
@normanmaurer

normanmaurer May 7, 2014

Member

make the constructor private and provide a static final instance. There is no need to create a new instance of this when an EventLoop wants to use it.

This comment has been minimized.

Copy link
@normanmaurer

normanmaurer May 7, 2014

Member

Also what I currently wonder is if it makes sense to have these metrics just called by an ChannelHandler implementation. As far as I can see all of those would be possible to get updated by the methods provided by such a handler.

This comment has been minimized.

Copy link
@buchgr

buchgr May 7, 2014

Author Contributor

@normanmaurer
Ad 1) Good point
Ad 2) Also a good point. I thought about the same thing yesterday evening, when I (by accident ) saw Nifty using this approach (here ChannelStatistics.java).

The downside of this approach is that the channel pipeline gets longer and it requires a little extra work by the developer. However, it also comes with some considerable advantages:

  • It's more extensible by developers. We could make MetricsCollector a marker interface or only provide very basic methods (channelRegistered, bytes read/write) and leave any additional methods up to the creativity of developers.
  • It's easier to maintain and in my opinion also cleaner, because it does not touch any code in the transports.

I will put some more thought into this, to make sure we don't overlook something. I am in favor of this tough. / cc: @trustin .

This comment has been minimized.

Copy link
@normanmaurer

normanmaurer May 7, 2014

Member

Maybe also @andrewcox want to share his experience :)

@benjchristensen

This comment has been minimized.

Copy link

commented May 7, 2014

/cc @spodila who is exploring similar logic for RxJava event loop Schedulers.

@normanmaurer

This comment has been minimized.

Copy link
Member

commented May 7, 2014

@spodila welcome :)

@spodila

This comment has been minimized.

Copy link

commented May 7, 2014

Thanks. :)

@@ -27,14 +35,12 @@
* Abstract base class for {@link EventExecutorGroup} implementations that handles their tasks with multiple threads at
* the same time.
*/
public abstract class MultithreadEventExecutorGroup extends AbstractEventExecutorGroup {
public abstract class MultithreadEventExecutorGroup<T extends EventExecutor> extends AbstractEventExecutorGroup {

This comment has been minimized.

Copy link
@buchgr

buchgr May 7, 2014

Author Contributor

@normanmaurer What do you think of this change? I introduced the generic parameter, because EventExecutorChooser#children() returns a List<T> and generic collections are invariant. Otherwise the EventExecutor* naming, that is used internally would have been exposed to the user (https://github.com/netty/netty/pull/2470/files#diff-642dc25c87844818aa9bd66f58448c32R21).

I think if I returned an array instead, we would not need generics. Might actually be the smart thing to do, if I think about it now?

This comment has been minimized.

Copy link
@normanmaurer

normanmaurer May 7, 2014

Member

The problem with an array is that the user may modify it ?

This comment has been minimized.

Copy link
@buchgr

buchgr May 7, 2014

Author Contributor

The EventLoopChooser object is only accessible from inside the
Multithread*Group. It's not exposed via any "public" API. So that
should not be a problem?

On Wed, May 7, 2014 at 8:18 PM, Norman Maurer notifications@github.comwrote:

In
common/src/main/java/io/netty/util/concurrent/MultithreadEventExecutorGroup.java:

@@ -27,14 +35,12 @@

  • Abstract base class for {@link EventExecutorGroup} implementations that handles their tasks with multiple threads at
  • the same time.
    */
    -public abstract class MultithreadEventExecutorGroup extends AbstractEventExecutorGroup {
    +public abstract class MultithreadEventExecutorGroup extends AbstractEventExecutorGroup {

The problem with an array is that the user may modify it ?


Reply to this email directly or view it on GitHubhttps://github.com//pull/2470/files#r12392937
.

Mit freundlichen Grüßen / Best Regards
Jakob Buchgraber

This comment has been minimized.

Copy link
@normanmaurer

normanmaurer May 7, 2014

Member

true... so yeah +1 :)

@ghost

This comment has been minimized.

Copy link

commented May 25, 2014

Build result for #2470 at 224941f627e83741f80840308d8947208f1629d9: Failure

@ghost

This comment has been minimized.

Copy link

commented May 25, 2014

Build result for #2470 at 3319f9b423bcb35ab6952935ce7afe38807c8bed: Success

@buchgr

This comment has been minimized.

Copy link
Contributor Author

commented May 25, 2014

I just pushed an updated commit. Here's what's new:

  • Made the NoMetricsCollector a singleton.
  • Removed all calls to MetricsCollector methods from the transports, since from now on metrics collection will be done by adding an additional ChannelHandler to the pipeline. Furthermore I made MetricsCollector a marker interface. The idea behind this step is to allow a developer to attach an object to an event loop, that takes care of collecting the metrics he needs and that he can then use to collect the metrics in whatever way in his own ChannelHandler.
  • Removed the MetricsProvider and MetricsProviderFactory interfaces. With the MetricsCollector being a marker interface now, they just don't make any sense.
  • Added a new class ExpMovingAverage which calculates the "exponential moving average" very similiar to the load average on a unix system (see code comments).
  • Renamed TickValueHistory to CumulativeHistory and simplified the code a lot.
  • Added lots of comments to document the most important classes.

Furthermore, I did some testing on the accuracy of Netty's Task Scheduler and was pleasantly surprised that even under high load its timing was generally very accurate (to +/- 1/100 seconds). However, I do realize that its accuracy depends mostly on how long it takes for ChannelHandlers to execute.

I also did some load testing to see if frequent calls to System.nanoTime() instead of using timers to update the metrics would impact performance.

Basically I did something like this:

public class DefaultMetricsCollector implements MetricsCollector {
     private long last;

     public void readBytes(long bytes) {
         if (last != 0) {
            long now = System.nanoTime();
            // update every second
            if (now - last > 1000000000) {
                bytesRead.tick();
                last = System.nanoTime();
            }
        } else {
            last = System.nanoTime();
        }

         bytesRead.update(bytes);
     }
}

I didn't notice any difference in performance, however I didn't use a profiler and my Netty application didn't do much computation either besides sending back some bytes.

I haven't done much work on my GSoC project this week as I was busy with university stuff and @normanmaurer is on vacation anyway :)

@ghost

This comment has been minimized.

Copy link

commented May 25, 2014

Build result for #2470 at fe0e10b6843cd292b91a397afb77a48e38bbcebb: Success

@normanmaurer

This comment has been minimized.

Copy link
Member

commented May 30, 2014

@jakobbuchgraber so what is the status of this ... ?

@ghost

This comment has been minimized.

Copy link

commented Jun 1, 2014

Build result for #2470 at c3c86707d6fb08d7d759744d74863215d435fffc: Success

@ghost

This comment has been minimized.

Copy link

commented Jun 12, 2014

Build result for #2470 at d4f2a8b1a6d27643a3fadedda7294354f95c378a: Success

@buchgr buchgr changed the title [GSoC] Netty: Pluggable algorithm to choose next EventLoop [GSoC] Pluggable algorithm to choose next EventLoop Jul 21, 2014

@buchgr buchgr changed the title [GSoC] Pluggable algorithm to choose next EventLoop Pluggable algorithm to choose next EventLoop Jul 21, 2014

Pluggable algorithm to choose next EventLoop
Motivation:

Currently when a new event loop is needed to register a channel our EventLoopGroup implementations just
use a round-robin like algorithm to choose the next event loop. Unfortunately this is not always good
enough as different event loops may become more busy than others over time. This is especially true when
an application handles different kinds of connections (long-living and short-living). This change allows
developers to plug in their own algorithms to choose the next event loop and to collect metrics about the
busyness of the event loops. See #1230

Modifications:

- TODO

Result:

- Developers can plug in their own algorithms into MultithreadEventLoopGroup constructors.
- Developers can gather metrics/feedback from the event loops and use the data in their custom algorithms.
@ghost

This comment has been minimized.

Copy link

commented Aug 5, 2014

Build result for #2470 at bd266ed: Success

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.