Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

netty: change default transport to Epoll if available, otherwise using Nio #5581

Merged
merged 12 commits into from Apr 16, 2019

Conversation

creamsoup
Copy link
Contributor

@creamsoup creamsoup commented Apr 11, 2019

Motivation:
To support TCP_USER_TIMEOUT(proposal). Nio doesn't support TCP_USER_TIMEOUT while Epoll and KQueue supports TCP_USER_TIME. Since most users/servers are using linux based system, adding Epoll is necessary. KQueue maybe supported later, but not in this PR.

To make it backward compatible, for cases when channelType or eventLoop is mixed in with default and user provided object(s). We will fallback to Nio when not all the values (channelType, eventLoop) are provided. Which should be same as current behavior. In later version (possibly 1.22.0) will throw exception since this is error prone.

@creamsoup
Copy link
Contributor Author

creamsoup commented Apr 12, 2019

i am not sure why gradle spotless plugin starts to complain about gradle build file.

https://source.cloud.google.com/results/invocations/44d0ea08-f2ed-4d56-975d-b14268fc301a/targets/grpc%2Fjava%2Fmaster%2Fpresubmit%2Flinux_artifacts/log

one of the gradle file had indentation issue due to using tab. resolved.

@creamsoup creamsoup requested a review from ejona86 April 12, 2019 16:26
@creamsoup creamsoup marked this pull request as ready for review April 12, 2019 16:26
Copy link
Member

@ejona86 ejona86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline, let's detect when users fail to specify all of channelType/worker/boss and issue a warning and fallback to using nio. In the future we will throw in build() when that happens.

netty/src/main/java/io/grpc/netty/Utils.java Outdated Show resolved Hide resolved
netty/src/main/java/io/grpc/netty/Utils.java Outdated Show resolved Hide resolved
netty/src/main/java/io/grpc/netty/Utils.java Outdated Show resolved Hide resolved
netty/src/main/java/io/grpc/netty/Utils.java Outdated Show resolved Hide resolved
netty/src/main/java/io/grpc/netty/NettyServer.java Outdated Show resolved Hide resolved
netty/src/main/java/io/grpc/netty/Utils.java Outdated Show resolved Hide resolved
netty/src/main/java/io/grpc/netty/Utils.java Outdated Show resolved Hide resolved
Class
.forName("io.netty.channel.epoll.EpollServerSocketChannel")
.asSubclass(ServerChannel.class);
logger.log(Level.FINE, "Using EpollServerSocketChannel");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this log? It seems a bit weird in this context.

logger.log(Level.FINE, "Epoll is not available", getEpollUnavailabilityCause());
}
return available;
} catch (Exception e) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ClassNotFound should probably be special-cased, since that is quite normal (although it would still be fair to log it at FINE). Otherwise this exception should be logged, probably at WARNING.

logger.log(
Level.WARNING,
"Both EventLoopGroup and ChannelType should be provided, otherwise client may not start "
+ "depends on the gRPC default value.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sentence is broken between "start" and "depends".

Include in the message that we are falling back to Nio for compatibility, and that this will cause an exception in the future.

+ "depends on the gRPC default value.");

if (eventLoopGroup == null) {
eventLoopGroup = SharedResourceHolder.get(Utils.NIO_WORKER_EVENT_LOOP_GROUP);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't be mutating the builder. That can cause quite a bit of confusion if build() is called twice. For example, the way this code is, the second call to build() would not acquire a reference to the event loop and so it may be shut down while still in use.


if (eventLoopGroup == null) {
eventLoopGroup = SharedResourceHolder.get(Utils.NIO_WORKER_EVENT_LOOP_GROUP);
logger.log(Level.FINE, String.format("Using fall back EventLoopGroup %s", eventLoopGroup));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will always call String.format(), even if not logging at FINE level. That should be avoided. Ditto next log.

@@ -108,17 +107,17 @@
long maxConnectionIdleInNanos,
long maxConnectionAgeInNanos, long maxConnectionAgeGraceInNanos,
boolean permitKeepAliveWithoutCalls, long permitKeepAliveTimeInNanos,
InternalChannelz channelz) {
InternalChannelz channelz, boolean usingSharedBossGroup, boolean usingSharedWorkerGroup) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having these bools passed in this deep can be sort of annoying. You may consider passing in ObjectPools instead. That way this code wouldn't need any conditionals and some of the handling elsewhere looks like it would improve.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. it looks much simpler

@njhill
Copy link
Contributor

njhill commented Apr 14, 2019

Suggestion: Why not just have a fixed runtime dependency on netty-transport-native-epoll from grpc-netty? Without this, the benefit is kind of limited in that many users won't "automatically" use epoll. I expect many of those who would know/think to add the dependency would already know to customize their server/channel configuration. Yes it is one additional library pulled in for everyone which might not be used, but:

  • It's not very big
  • There are already other hardwired, possibly less-used libs such as the opencensus apis, and this would only be there if they already depend on grpc-netty, unlike those which have a dep from grpc-core
  • I assume the majority of production deployments run on linux x86_64 anyhow

An additional minor benefit would be to remove the need for reflection. Also, there is no harm in this being there if they are running on non linux-x84_64 platforms. In fact as questioned a while back, using the os selector for this in the build seems incorrect - the build env in general is arbitrary in relation to the runtime platform.

@ejona86
Copy link
Member

ejona86 commented Apr 15, 2019

@njhill, hmm... We were thinking most users should be using grpc-netty-shaded, which means they would get what they need.

I agree that it may make sense to depend on native-epoll for the Java classes. And I didn't realize the native transport was so small. Although we may also want kqueue support. It seems both of them depend on netty-unix-common, which is fairly small. There's no "uber" jar for it though. It looks like the linux/osx versions of netty-unix-common isn't for normal use, because it only contains a .a?

Hmm... I'm not sure how Maven/Gradle handle mixing dependencies with different classifiers.

I think I'm favoring this current more conservative approach for the moment. We've seen small performance gains from epoll; we're mainly doing this for better socket option support when available, like TCP_USER_TIMEOUT. Once we see the world doesn't fall over, we might make it a stronger dependency; and go through the effort of making sure Maven/Gradle will behave in a useful way.

@ejona86
Copy link
Member

ejona86 commented Apr 15, 2019

@creamsoup, labeling this PR as "all" seems misleading. There's literally a single line that isn't in the netty module. I'd suggest just labeling it "netty", as it also makes the rest of the description more clear.

The PR should also include some sort of description as to why it is being made.

@creamsoup creamsoup changed the title all: change default transport to Epoll if available, otherwise using Nio netty: change default transport to Epoll if available, otherwise using Nio Apr 15, 2019
@ejona86
Copy link
Member

ejona86 commented Apr 15, 2019

@creamsoup, @njhill brought up a good point about auto-detection logic. We don't want any auto-detection logic for our epoll dependency from grpc-netty-shaded. We can probably just always use the "linux-x86_64" classifier; see the main build.gradle.

@njhill
Copy link
Contributor

njhill commented Apr 15, 2019

@ejona86 netty-transport-unix-common is a common dep of epoll and kqueue but itself doesn't contain any native libs and the artifact doesn't have a classifier. What might be misleading (it confused me in the past) is the existence of the non-classifier netty-transport-native-epoll jar. This is identical to the linux-x86_64 one minus the single .so file, and as such kind of useless. I guess if there was a hard dep but you weren't using epoll you could save 24K by using it instead (i.m.o. this isn't worth the confusion it introduces where folks might think they are using epoll by including the dep, even though it won't work without the classifier specified).

So I think it wouldn't cause problems to similarly depend on kqueue in addition to epoll (no problem having diff classifiers for diff artifacts), though I would expect it's rarer for performance to be such a concern in macos contexts.

@ejona86
Copy link
Member

ejona86 commented Apr 15, 2019

@njhill, netty-transport-unix-common does have platform-specific classifier versions. I was saying though that it doesn't look like you need to use them; they just contain platform-specific goo used during compilation of other modules (like /META-INF/native/lib/libnetty-unix-common.a).

My bringing up of Kqueue is that it is another "useless dependency" for certain people. It's also small, though. Such small unnecessary dependencies could potentially add up, but they do seem minor compared to the rest of the netty artifacts (gosh it is big).

@njhill
Copy link
Contributor

njhill commented Apr 15, 2019

@ejona86 apologies, you are right. But yes doesn't seem they are used at runtime.

netty/src/main/java/io/grpc/netty/Utils.java Outdated Show resolved Hide resolved
netty/src/main/java/io/grpc/netty/NettyChannelBuilder.java Outdated Show resolved Hide resolved
netty/src/main/java/io/grpc/netty/NettyServer.java Outdated Show resolved Hide resolved
@ejona86
Copy link
Member

ejona86 commented Apr 15, 2019

@njhill, I may not have made it clear: I'm really happy you pointed out that the epoll jars were small. Although we'll use reflection for the moment and we need to do some more research into Maven/Guava behavior, I will likely try to swap us to a strong dependency in the future to avoid the reflection.

@creamsoup creamsoup merged commit a48ebb1 into grpc:master Apr 16, 2019
@creamsoup creamsoup deleted the epoll branch April 16, 2019 00:53
@Moriadry-zz Moriadry-zz mentioned this pull request May 13, 2019
6 tasks
@lock lock bot locked as resolved and limited conversation to collaborators Jul 15, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants