NumberFormatException in Native code trying to get the remote address #2867

Closed
opinali opened this issue Sep 5, 2014 · 19 comments

@opinali
Contributor

opinali commented Sep 5, 2014

I'm having this problem in a test where connections come from a HAProxy server (which runs in a separate instance of GCE, connected to the app through internal IPs). No error happens if I remove HAProxy and receive the same request directly from the app, or if I configure to use the Jetty core instead; seems to be some problem with the native epoll code. The stack trace (4.0.23-Final):

    java.lang.Throwable.fillInStackTrace(Throwable.java:Unknown line)
    java.lang.Throwable.fillInStackTrace(Throwable.java:783)
    java.lang.Throwable.<init>(Throwable.java:265)
    java.lang.Exception.<init>(Exception.java:66)
    java.lang.RuntimeException.<init>(RuntimeException.java:62)
    java.lang.IllegalArgumentException.<init>(IllegalArgumentException.java:53)
    java.lang.NumberFormatException.<init>(NumberFormatException.java:55)
    java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    java.lang.Integer.parseInt(Integer.java:492)
    java.lang.Integer.parseInt(Integer.java:527)
    sun.net.util.IPAddressUtil.textToNumericFormatV4(IPAddressUtil.java:112)
    java.net.InetAddress.getAllByName(InetAddress.java:1129)
    java.net.InetAddress.getAllByName(InetAddress.java:1098)
    java.net.InetAddress.getByName(InetAddress.java:1048)
    java.net.InetSocketAddress.<init>(InetSocketAddress.java:220)
    io.netty.channel.epoll.Native.remoteAddress(Native.java:Unknown line)
    io.netty.channel.epoll.EpollSocketChannel.<init>(EpollSocketChannel.java:77)
    io.netty.channel.epoll.EpollServerSocketChannel$EpollServerSocketUnsafe.epollInReady(EpollServerSocketChannel.java:109)
    io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326)
    io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264)

This is difficult to diagnose because the exception dies at the native layer; it doesn't get captured by the EpollEventLoop, so it's not processed by the pipeline. I noticed the problem by using the JVM profiler, which captures traces like the above. I could then run the app with jdb, set a breakpoint at java.net.InetAddress:1129, and see what's going on:

epollEventLoopGroup-2-1[1] locals
Method arguments:
host = "::ffff:10.186.55.168"
reqAddr = null
Local variables:
ipv6Expected = false
addr = null
numericZone = -1
ifname = null

We have a weird, IPv6-looking hostname here; I didn't try to investigate further. The internal network between these instances is IPv4.

@normanmaurer normanmaurer added this to the 4.0.24.Final milestone Sep 5, 2014
@normanmaurer
Member

@opinali thanks for reporting. Will look into it.

@normanmaurer normanmaurer self-assigned this Sep 5, 2014
@normanmaurer
Member

@opinali very strange, as "::ffff:10.186.55.168" is a valid IP address, and if you pass it into InetAddress.getByName(...) you will get a valid Inet4Address. Any more info?
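For readers who want to check this claim themselves, here is a minimal sketch. The class and method names (`MappedLiteralDemo`, `resolve`) are made up for illustration and are not part of Netty or the JDK; only `InetAddress.getByName` is the real API under discussion.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper class, for illustration only.
class MappedLiteralDemo {
    // Resolves an IP literal. For a literal address the JDK parses the text
    // directly, so no DNS lookup is involved.
    static InetAddress resolve(String literal) {
        try {
            return InetAddress.getByName(literal);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a valid address: " + literal, e);
        }
    }

    public static void main(String[] args) {
        InetAddress addr = resolve("::ffff:10.186.55.168");
        // The IPv4-mapped IPv6 literal comes back as an IPv4 address object.
        System.out.println(addr instanceof Inet4Address);
        System.out.println(addr.getHostAddress());
    }
}
```

The result is correct either way; the thread is about how much work the JDK does internally to get there.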

@opinali
Contributor Author

opinali commented Sep 5, 2014

You're correct, it's a legit hostname that will resolve to /10.186.55.168... I have now diagnosed this better: there's no functional bug anywhere, but there is a performance bug :-( because this exception is used internally and then ignored. IPAddressUtil.textToNumericFormatV4() will do this at line 112 (and other lines):

    val = Integer.parseInt(s[i]);

then at the end, we have

    } catch(NumberFormatException e) {
        return null;
    }

So the only reason this exception trace appeared on my radar is that using exceptions as a control structure (basically what this code is doing as it tentatively parses a string that may or may not be a valid number) is very expensive, so this exception is captured by JVM profiling. This trace accounts for 0.22% of total CPU in my server, but that's relatively high in a program that's running on an 8-core machine and where the Netty epoll layer consumes 95% of all CPU; the exception therefore consumes 4.4% (0.22 / 5) of the "rest" of my application's processing time.
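To make the cost difference concrete, here is a hedged sketch of the two styles. It is not the JDK's code: `OctetParser` and both method names are hypothetical. The first method validates characters up front and never throws; the second leans on `NumberFormatException`, similar in spirit to the snippet quoted above. Assuming the address string is split on dots (as the `s[i]` in the snippet suggests), the first segment of "::ffff:10.186.55.168" would be "::ffff:10", which is the kind of input that triggers the exception path.

```java
// Hypothetical illustration, not the JDK implementation.
class OctetParser {
    // Returns the value 0..255, or -1 if s is not a plain decimal octet.
    // No exception is created on the failure path.
    static int parseOctet(String s) {
        if (s == null || s.isEmpty() || s.length() > 3) {
            return -1;
        }
        int value = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return -1;
            }
            value = value * 10 + (c - '0');
        }
        return value <= 255 ? value : -1;
    }

    // Exception-driven variant: every malformed input allocates an exception
    // and fills in its stack trace before it is discarded.
    static int parseOctetWithException(String s) {
        try {
            int v = Integer.parseInt(s);
            return (v >= 0 && v <= 255) ? v : -1;
        } catch (NumberFormatException e) {
            return -1;
        }
    }
}
```

Both methods reject "::ffff:10" and accept "168"; they are not claimed to agree on every input (a leading "+" is one difference). The point is only that the first one reaches the same answer for these cases without building a stack trace.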

I'm going to close this bug. You may want to investigate whether Netty could avoid this call; maybe it could, since apparently this doesn't happen without the native epoll, but it's not really your problem. I will try to raise this as a bug against the JDK.

@opinali opinali closed this as completed Sep 5, 2014
@normanmaurer
Member

What a mess :( Let me implement a workaround. Also, please add the link to the JDK bug report here once you open it.

@normanmaurer normanmaurer reopened this Sep 6, 2014
@normanmaurer
Member

@opinali I think I have a good workaround. Stay tuned

@Scottmitch
Member

@normanmaurer - public static Inet6Address getByName(String ip, boolean ipv4Mapped) from PR #2863 may be of some use here. The goal is for it to take in any valid IP string and produce an Inet6Address. It does this by manually building the bytes corresponding to the input string (not using regex or split operations). It has some limitations: it does not support netmasks/zone indices or ports, and it is a bit lenient in that some "invalid" IP addresses are still translated (i.e. it allows hex digits in an IPv4 address). We could fix these limitations and extract the IPv4 logic if necessary.

@normanmaurer
Member

@Scottmitch I will just use the raw bytes, which should be faster anyway ;)


@normanmaurer
Member

Ouch :(

Let me implement a workaround here... Also, please add the JDK bug report link here.


normanmaurer pushed a commit that referenced this issue Sep 7, 2014
Motivation:

InetAddress.getByName(...) uses exceptions for control flow when trying to parse IPv4-mapped-on-IPv6 addresses. This is quite expensive.

Modifications:

Detect IPv4-mapped-on-IPv6 addresses at the JNI level and convert them to IPv4 addresses before passing them to InetAddress.getByName(...) (via the InetSocketAddress constructor).

Result:

Eliminate the performance problem caused by exception creation when parsing IPv4-mapped-on-IPv6 addresses.
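
The detection step described in this commit message can be sketched in Java as follows. This is an illustration only: the actual change was made in Netty's JNI (C) code, which is not shown in this thread, and the names `MappedAddressSketch`, `isIpv4Mapped` and `fromRawBytes` are hypothetical. The sketch assumes the raw 16-byte address is available, recognises the `::ffff:a.b.c.d` layout (ten zero bytes followed by 0xff 0xff), and builds the address from bytes so that no text parsing takes place.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;

// Hypothetical sketch; not the code from the Netty commit.
class MappedAddressSketch {
    // True if the 16 bytes follow the IPv4-mapped layout:
    // bytes 0..9 are zero, bytes 10..11 are 0xff, bytes 12..15 hold the IPv4 address.
    static boolean isIpv4Mapped(byte[] addr) {
        if (addr.length != 16) {
            return false;
        }
        for (int i = 0; i < 10; i++) {
            if (addr[i] != 0) {
                return false;
            }
        }
        return addr[10] == (byte) 0xff && addr[11] == (byte) 0xff;
    }

    // Builds an InetAddress from raw bytes, using only the last four bytes
    // when the input is an IPv4-mapped IPv6 address.
    static InetAddress fromRawBytes(byte[] addr) {
        byte[] raw = isIpv4Mapped(addr) ? Arrays.copyOfRange(addr, 12, 16) : addr;
        try {
            return InetAddress.getByAddress(raw);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("address must be 4 or 16 bytes long", e);
        }
    }
}
```

Because `InetAddress.getByAddress` takes bytes rather than a string, the `Integer.parseInt` path seen in the stack trace is never reached in this sketch.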
@normanmaurer
Member

@opinali please test proposed fix in #2871

@opinali
Contributor Author

opinali commented Sep 8, 2014

Thanks @normanmaurer! I tried to test this by building netty-all and copying it to the app, but then I get this error:

java.lang.UnsatisfiedLinkError: /tmp/libnetty-transport-native-epoll3343384921387662299.so: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.14' not found (required by /tmp/libnetty-transport-native-epoll3343384921387662299.so)

I checked my Linux and it has libc6 2.13-38+deb7u4. Is this a new requirement of the newer Netty, or a missing build option? (I just ran mvn install.) The machine where I built Netty has 2.19-0ubuntu6.3, so it doesn't seem likely to be related.

@normanmaurer
Member

So do you get the error during the build, or when you try to run your app with the jar?

@opinali
Contributor Author

opinali commented Sep 8, 2014

(This Linux with the very old libc6 is Debian Wheezy, which unfortunately is the current stable release...)

@normanmaurer
Member

@opinali also, could you please run 'ldd --version' on both machines?

@opinali
Contributor Author

opinali commented Sep 8, 2014

@normanmaurer This error happens at runtime; build runs OK. Here's ldd --version for the building machine: ldd (Ubuntu EGLIBC 2.19-0ubuntu6.3) 2.19

Now for the machine running the app: ldd (Debian EGLIBC 2.13-38+deb7u4) 2.13

This may not be a new problem, since it's the first time I've tried my own build including the native epoll; in my previous Netty hacks I had only overridden individual classes by placing them in the classpath before the netty-all jar. And I haven't updated my build machine since that time; it's [Google's] Ubuntu Trusty.

Meanwhile I will try to deploy on other GCE image types like centos or backport-debian; maybe they have a newer libc6 so I can test the epoll change.

@opinali
Contributor Author

opinali commented Sep 8, 2014

BTW, sometimes I need -DskipTests=true to make a full build, or I get an error like this:

Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 14.053 sec <<< FAILURE! - in io.netty.testsuite.transport.socket.DatagramUnicastTest
testSimpleSend(io.netty.testsuite.transport.socket.DatagramUnicastTest) Time elapsed: 3.017 sec <<< ERROR!
java.net.BindException: Address already in use

I suppose some test is not smart enough to scan for unused ports. And this error immediately follows the one above, so maybe it's related:

Running io.netty.testsuite.transport.socket.SocketSslEchoTest
09:59:30.371 [main] WARN i.n.t.t.socket.SocketSslEchoTest - OpenSSL is unavailable and thus will not be tested.
java.lang.UnsatisfiedLinkError: /tmp/libnetty-tcnative4691387174260672690.so: libssl.so.10: cannot open shared object file: No such file or directory

(I certainly have OpenSSL installed, "openssl version" => OpenSSL 1.0.1f 6 Jan 2014)

Oddly, I only had this problem when trying to rebuild at 543c9d2; I got a full clean build for the master tip / netty 5.0. Maybe it's just bad luck / good luck if it's some transient problem.

@opinali
Contributor Author

opinali commented Sep 8, 2014

Another small update: I ran the full rebuild again for 543c9d2 and it now succeeded without skipping tests, so it's definitely just a flaky test, at least here.

@normanmaurer
Member

@opinali I think your GLIBC error occurs because you built against a newer GLIBC and are trying to run the result on a machine with an older one. I just built a jar (containing this patch) on my CentOS 6.5 machine, which I usually use when releasing. Maybe I could upload it or send you the jar via email so you can test? If older jars worked, this should work too. WDYT?

@opinali
Contributor Author

opinali commented Sep 9, 2014

@normanmaurer I didn't look at the glibc problem further, but I've succeeded in running this build on CentOS 7, which has a modern glibc. So the good news is that everything works as expected; I don't see those traces anymore :)

normanmaurer pushed a commit that referenced this issue Sep 9, 2014
normanmaurer pushed a commit that referenced this issue Sep 9, 2014
normanmaurer pushed a commit that referenced this issue Sep 9, 2014
normanmaurer pushed a commit that referenced this issue Sep 9, 2014
@normanmaurer
Member

@opinali fixed in the 4.0, 4.1 and master branches. Thanks for reporting and testing :)

pulllock pushed a commit to pulllock/netty that referenced this issue Oct 19, 2023