NumberFormatException in Native code trying to get the remote address #2867
Comments
@opinali thanks for reporting. Will look into it. |
@opinali very strange as "::ffff:10.186.55.168" is a valid ip address and if you try to pass it into InetAddress.getByName(...) you will get a valid Inet4Address. Any more infos ? |
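To make the observation above concrete, here is a minimal sketch of the behaviour being described: handing the IPv4-mapped literal to InetAddress.getByName(...) and checking what comes back. The class and method names (Ipv4MappedLiteralDemo, parse) are made up for illustration and are not part of Netty or the JDK.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical demo class, not Netty code.
public class Ipv4MappedLiteralDemo {

    // Parses a textual IP literal. For literal addresses getByName only
    // validates the format; it does not perform a DNS lookup.
    static InetAddress parse(String literal) {
        try {
            return InetAddress.getByName(literal);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a valid address: " + literal, e);
        }
    }

    public static void main(String[] args) {
        InetAddress addr = parse("::ffff:10.186.55.168");
        // The JDK converts IPv4-mapped literals to Inet4Address.
        System.out.println(addr instanceof Inet4Address);
        System.out.println(addr.getHostAddress());
    }
}
```

The result is correct either way; the issue discussed in this thread is only about what the parsing costs internally.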
You're correct, it's a legit hostname that will resolve to /10.186.55.168... I have now diagnosed this better: there's no bug anywhere, except that there is a performance bug :-( because this exception is used internally but ignored: IPAddressUtil.textToNumericFormatV4() will do this at line 112 (and other lines)
then at the end, we have
So the only reason this exception trace appeared on my radar is that using exceptions as a control structure (basically what this code does as it tentatively parses a string that may or may not be a valid number) is very expensive, so the exception is captured by JVM profiling. This trace accounts for 0.22% of total CPU on my server, which is relatively high in a program running on an 8-core machine where the Netty epoll layer consumes 95% of all CPU; the exception therefore consumes 4.4% of the remaining 5% that my application uses for its own processing. I'm going to close this bug. You may investigate whether Netty could avoid this call (maybe it could, since apparently this doesn't happen without the native epoll), but it's not really your problem. I will try to raise this as a bug against the JDK. |
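The pattern described above can be sketched roughly as follows. This is not the JDK's IPAddressUtil source, only an illustration with hypothetical names (OctetParse, parseOctetWithException, parseOctetChecked) of the difference between letting a NumberFormatException signal "not a number" and checking the characters first.

```java
// Hypothetical illustration, not JDK or Netty code.
public class OctetParse {

    // Exception-driven: malformed input such as "ffff" is rejected by
    // catching NumberFormatException, which allocates an exception and
    // fills in a stack trace on every failed attempt.
    static int parseOctetWithException(String s) {
        try {
            int v = Integer.parseInt(s);
            return (v >= 0 && v <= 255) ? v : -1;
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    // Check-first: scans the characters and returns -1 without throwing.
    // Assumption: only 1 to 3 plain decimal digits are accepted, so this is
    // stricter than Integer.parseInt (it rejects "+5" and "0010").
    static int parseOctetChecked(String s) {
        if (s.isEmpty() || s.length() > 3) {
            return -1;
        }
        int v = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return -1;
            }
            v = v * 10 + (c - '0');
        }
        return v <= 255 ? v : -1;
    }
}
```

Both methods agree on typical inputs such as "168", "256" and "ffff"; only the first pays for an exception when the input is not numeric, which is the kind of cost a profiler can pick up.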
What a mess :( Let me implement a workaround. Also, please add the link to the JDK bug report here once you open it. |
@opinali I think I have a good workaround. Stay tuned |
@normanmaurer - |
@Scottmitch I will just use the raw bytes , which should be faster anyway ;)
|
Ouch :( Let me implement a workaround here ... Also please add the jdk bug report link here .
|
Motivation: InetAddress.getByName(...) uses exceptions for control flow when trying to parse IPv4-mapped-on-IPv6 addresses. This is quite expensive. Modifications: Detect IPv4-mapped-on-IPv6 addresses at the JNI level and convert them to IPv4 addresses before passing them to InetAddress.getByName(...) (via the InetSocketAddress constructor). Result: Eliminates the performance problem caused by exception creation when parsing IPv4-mapped-on-IPv6 addresses.
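The actual change lives in Netty's JNI (C) code and is not reproduced here. As a rough Java-level sketch of the same idea, assuming the address is available as 16 raw bytes, one can detect the ::ffff:0:0/96 prefix and hand only the last four bytes to InetAddress.getByAddress(...), which takes raw bytes and therefore involves no text parsing. The names MappedAddress, isIpv4Mapped, unmap and toInetAddress are hypothetical.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;

// Hypothetical sketch, not the Netty patch.
public class MappedAddress {

    // An IPv4-mapped IPv6 address is 80 zero bits, then 16 one bits,
    // then the 32-bit IPv4 address.
    static boolean isIpv4Mapped(byte[] b) {
        if (b.length != 16) {
            return false;
        }
        for (int i = 0; i < 10; i++) {
            if (b[i] != 0) {
                return false;
            }
        }
        return b[10] == (byte) 0xff && b[11] == (byte) 0xff;
    }

    // Returns the trailing 4 bytes for mapped addresses, the input otherwise.
    static byte[] unmap(byte[] b) {
        return isIpv4Mapped(b) ? Arrays.copyOfRange(b, 12, 16) : b;
    }

    // Builds the InetAddress from raw bytes, so no string parsing happens.
    static InetAddress toInetAddress(byte[] raw) {
        try {
            return InetAddress.getByAddress(unmap(raw));
        } catch (UnknownHostException e) {
            // Only thrown when the array length is neither 4 nor 16.
            throw new IllegalArgumentException("bad address length: " + raw.length, e);
        }
    }
}
```

For the address from this report, the raw bytes would be ten zero bytes, 0xff, 0xff, then 10, 186, 55, 168, and toInetAddress would yield the IPv4 address 10.186.55.168.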
Thanks @normanmaurer ! I tried to test this by building netty-all and copying it to the app, but then I get this error: java.lang.UnsatisfiedLinkError: /tmp/libnetty-transport-native-epoll3343384921387662299.so: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.14' not found (required by /tmp/libnetty-transport-native-epoll3343384921387662299.so) I checked my Linux and it has libc6 2.13-38+deb7u4. Is this a new requirement of the newer Netty or some missing build option? (I just ran mvn install) The machine where I've built Netty has 2.19-0ubuntu6.3, so it doesn't seem likely related. |
So do you get the error during the build, or when you try to run your app with the jar? On 8 September 2014 at 06:35:33, Osvaldo Pinali Doederlein (notifications@github.com) wrote: java.lang.UnsatisfiedLinkError: /tmp/libnetty-transport-native-epoll3343384921387662299.so: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.14' not found (required by /tmp/libnetty-transport-native-epoll3343384921387662299.so) |
(This Linux with the very old libc6 is Debian Wheezy, which unfortunately is the current stable release...) |
@opinali also could you please run 'ldd --version' on both machines ? |
@normanmaurer This error happens at runtime; build runs OK. Here's ldd --version for the building machine: ldd (Ubuntu EGLIBC 2.19-0ubuntu6.3) 2.19 Now for the machine running the app: ldd (Debian EGLIBC 2.13-38+deb7u4) 2.13 This may not be a new problem since it's the first time I try my own build including the native epoll; in my previous Netty hacks I had only overridden individual classes by placing them in the classpath before the netty-all jar. And I haven't updated my building machine since that time, it's [Google's] Ubuntu Trusty. Meanwhile I will try to deploy in other GCE image types like centos or backport-debian, maybe they have newer libc6 so I can test the epoll change. |
BTW, sometimes I need -DskipTests=true to make a full build, or I get error like this: Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 14.053 sec <<< FAILURE! - in io.netty.testsuite.transport.socket.DatagramUnicastTest I suppose some test is not smart enough to scan for unused ports. And this error follows immediately the above, so maybe it's related: Running io.netty.testsuite.transport.socket.SocketSslEchoTest (I certainly have OpenSSL installed, "openssl version" => OpenSSL 1.0.1f 6 Jan 2014) Oddly I just had this problem when trying to rebuild at 543c9d2, I got a full clean build for the master tip / netty 5.0. Maybe just bad luck / good luck if it's some transient problem. |
Another small update, I ran the full rebuild again for 543c9d2 and it succeeded now without skipping tests, so definitely just a flaky test at least here. |
@opinali I think your GLIBC error occurs because you built against a newer GLIBC and are trying to run the result on an older one. I just built a jar (which contains this patch) on my CentOS 6.5 machine, which I usually use for releases. Maybe I could just upload the jar or send it to you via email so you can test? If older jars worked, this should work too. WDYT? |
@normanmaurer I didn't look at the glibc problem further, but I've succeeded in running this build on CentOS 7, which has a modern glibc. So the good news is that everything works as expected; I don't see those traces anymore :) |
@opinali fixed in 4.0, 4.1 and master branch. Thanks for reporting and testing :) |
I'm having this problem in a test where connections come from a HAProxy server (which runs in a separate instance of GCE, connected to the app through internal IPs). No error happens if I remove HAProxy and receive the same request directly from the app, or if I configure to use the Jetty core instead; seems to be some problem with the native epoll code. The stack trace (4.0.23-Final):
This is difficult to diagnose because the exception dies at the native layer; it doesn't get captured by the EpollEventLoop, so it's not processed by the pipeline. I noticed the problem by using the JVM profiler, which captures traces like the above. Then I can run the app with jdb, set a breakpoint at java.net.InetAddress:1129, and see what's going on:
epollEventLoopGroup-2-1[1] locals
Method arguments:
host = "::ffff:10.186.55.168"
reqAddr = null
Local variables:
ipv6Expected = false
addr = null
numericZone = -1
ifname = null
We have a weird, IPv6-looking hostname here, I didn't try to investigate further. The internal network between these instances is IPv4.