io.netty.resolver.dns.DnsNameResolverTimeoutException when application is running in Kubernetes #13705
Comments
The links you provided were very exciting for me. Do you have debug logging enabled for the netty DNS resolver? It would be very interesting to know whether you also see 2 DNS queries at the same time 5 minutes before your timeout exception.
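For anyone who wants to capture that, here is a minimal sketch of turning on debug output for the resolver package. It assumes Netty has fallen back to `java.util.logging`; Netty prefers SLF4J or Log4j when they are on the classpath, in which case the equivalent is to set the `io.netty.resolver.dns` logger to DEBUG in that framework's configuration. The class name `DnsResolverDebugLogging` is hypothetical and only used for illustration.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, not part of Netty.
final class DnsResolverDebugLogging {
    // Keep a strong reference: java.util.logging only holds loggers weakly,
    // so a configured level can be lost if the logger is garbage collected.
    static final Logger DNS_LOGGER = Logger.getLogger("io.netty.resolver.dns");

    static void enable() {
        // Netty's JDK logging bridge maps debug() to Level.FINE.
        DNS_LOGGER.setLevel(Level.FINE);
        // The default root console handler only prints INFO and above,
        // so attach a dedicated handler that lets FINE records through.
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(Level.FINE);
        DNS_LOGGER.addHandler(handler);
        // Avoid printing INFO+ records twice (once here, once via the root handler).
        DNS_LOGGER.setUseParentHandlers(false);
    }

    public static void main(String[] args) {
        enable();
        DNS_LOGGER.fine("debug logging for io.netty.resolver.dns is enabled");
    }
}
```

Call `DnsResolverDebugLogging.enable()` early during startup, before the first lookup, so the queries leading up to a timeout are logged.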
@arthurzenika would you be able to test a patched netty version or explicitly configure the ...
Hi @mayrstefan and @normanmaurer, thanks for getting back to us. We would be really happy to help debug this problem. We are focused on a temporary production fix at the moment, but we will soon be able to spend some time setting up a test environment where we can debug this. (Unfortunately this bug is somewhat random: it doesn't happen often and only happens on our production clusters, which is not ideal for reproducing it.)
@arthurzenika thanks a lot! Please ping me once you have time to debug this. The changes I would make are minimal.
We have the same issue when using redisson-3.24.0.jar and netty 4.1.100:
javakit-test-8556c4f9dc-wqcqq:/tmp/test/BOOT-INF/lib# ls -al | grep rediss
-rw-r--r-- 1 root root 2390733 Dec 18 11:30 redisson-3.24.0.jar
javakit-test-8556c4f9dc-wqcqq:/tmp/test/BOOT-INF/lib# ls -al | grep netty
-rw-r--r-- 1 root root 280507 Dec 18 11:30 grpc-netty-1.41.0.jar
-rw-r--r-- 1 root root 7995061 Dec 18 11:30 grpc-netty-shaded-1.41.0.jar
-rw-r--r-- 1 root root 4473 Dec 18 11:30 netty-all-4.1.100.Final.jar
-rw-r--r-- 1 root root 306739 Dec 18 11:30 netty-buffer-4.1.100.Final.jar
-rw-r--r-- 1 root root 345293 Dec 18 11:30 netty-codec-4.1.100.Final.jar
-rw-r--r-- 1 root root 66908 Dec 18 11:30 netty-codec-dns-4.1.100.Final.jar
-rw-r--r-- 1 root root 37778 Dec 18 11:30 netty-codec-haproxy-4.1.100.Final.jar
-rw-r--r-- 1 root root 657672 Dec 18 11:30 netty-codec-http-4.1.100.Final.jar
-rw-r--r-- 1 root root 486355 Dec 18 11:30 netty-codec-http2-4.1.100.Final.jar
-rw-r--r-- 1 root root 44692 Dec 18 11:30 netty-codec-memcache-4.1.100.Final.jar
-rw-r--r-- 1 root root 113931 Dec 18 11:30 netty-codec-mqtt-4.1.100.Final.jar
-rw-r--r-- 1 root root 45961 Dec 18 11:30 netty-codec-redis-4.1.100.Final.jar
-rw-r--r-- 1 root root 21293 Dec 18 11:30 netty-codec-smtp-4.1.100.Final.jar
-rw-r--r-- 1 root root 120979 Dec 18 11:30 netty-codec-socks-4.1.100.Final.jar
-rw-r--r-- 1 root root 34547 Dec 18 11:30 netty-codec-stomp-4.1.100.Final.jar
-rw-r--r-- 1 root root 19774 Dec 18 11:30 netty-codec-xml-4.1.100.Final.jar
-rw-r--r-- 1 root root 660474 Dec 18 11:30 netty-common-4.1.100.Final.jar
-rw-r--r-- 1 root root 561288 Dec 18 11:30 netty-handler-4.1.100.Final.jar
-rw-r--r-- 1 root root 25492 Dec 18 11:30 netty-handler-proxy-4.1.100.Final.jar
-rw-r--r-- 1 root root 26516 Dec 18 11:30 netty-handler-ssl-ocsp-4.1.100.Final.jar
-rw-r--r-- 1 root root 37795 Dec 18 11:30 netty-resolver-4.1.100.Final.jar
-rw-r--r-- 1 root root 171593 Dec 18 11:30 netty-resolver-dns-4.1.100.Final.jar
-rw-r--r-- 1 root root 9094 Dec 18 11:30 netty-resolver-dns-classes-macos-4.1.100.Final.jar
-rw-r--r-- 1 root root 19546 Dec 18 11:30 netty-resolver-dns-native-macos-4.1.100.Final-osx-aarch_64.jar
-rw-r--r-- 1 root root 19279 Dec 18 11:30 netty-resolver-dns-native-macos-4.1.100.Final-osx-x86_64.jar
-rw-r--r-- 1 root root 3953120 Dec 18 11:30 netty-tcnative-boringssl-static-2.0.31.Final.jar
-rw-r--r-- 1 root root 489999 Dec 18 11:30 netty-transport-4.1.100.Final.jar
-rw-r--r-- 1 root root 147139 Dec 18 11:30 netty-transport-classes-epoll-4.1.100.Final.jar
-rw-r--r-- 1 root root 108428 Dec 18 11:30 netty-transport-classes-kqueue-4.1.100.Final.jar
-rw-r--r-- 1 root root 40892 Dec 18 11:30 netty-transport-native-epoll-4.1.100.Final-linux-aarch_64.jar
-rw-r--r-- 1 root root 39373 Dec 18 11:30 netty-transport-native-epoll-4.1.100.Final-linux-x86_64.jar
-rw-r--r-- 1 root root 25582 Dec 18 11:30 netty-transport-native-kqueue-4.1.100.Final-osx-aarch_64.jar
-rw-r--r-- 1 root root 25020 Dec 18 11:30 netty-transport-native-kqueue-4.1.100.Final-osx-x86_64.jar
-rw-r--r-- 1 root root 43968 Dec 18 11:30 netty-transport-native-unix-common-4.1.100.Final.jar
-rw-r--r-- 1 root root 18192 Dec 18 11:30 netty-transport-rxtx-4.1.100.Final.jar
-rw-r--r-- 1 root root 50764 Dec 18 11:30 netty-transport-sctp-4.1.100.Final.jar
-rw-r--r-- 1 root root 32137 Dec 18 11:30 netty-transport-udt-4.1.100.Final.jar
-rw-r--r-- 1 root root 402675 Dec 18 11:30 rxnetty-0.4.9.jar
-rw-r--r-- 1 root root 53987 Dec 18 11:30 rxnetty-contexts-0.4.9.jar
-rw-r--r-- 1 root root 29155 Dec 18 11:30 rxnetty-servo-0.4.9.jar
We have the same issue occasionally. The netty version is 4.1.74. The project is deployed in Kubernetes. The URI of the gateway route looks like http://kubernetes-service-name.namespace. Below is a screenshot of the log.
We think we are hitting this issue. Any update and/or workaround?
Hello, same issue:
I think the workaround at the moment is to enable TCP fallback when you see a UDP timeout:
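The snippet that followed this comment is not preserved here. To illustrate what "TCP fallback on a UDP timeout" means at the protocol level, here is a minimal JDK-only sketch. It is not Netty's implementation. The class `DnsTcpFallbackSketch` and its methods are hypothetical names, and it assumes a DNS server listening on port 53 over both UDP and TCP. In Netty itself this is a setting on `DnsNameResolverBuilder` (the TCP fallback channel is configured through `socketChannelType(...)`); check the Javadoc for your version for the retry-on-timeout option.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not part of Netty.
final class DnsTcpFallbackSketch {

    /** Encodes a single-question A-record query in DNS wire format (RFC 1035). */
    static byte[] buildQuery(int id, String hostname) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(id >> 8); out.write(id);          // transaction ID
        out.write(0x01); out.write(0x00);           // flags: recursion desired
        out.write(0); out.write(1);                 // QDCOUNT = 1
        for (int i = 0; i < 6; i++) out.write(0);   // ANCOUNT, NSCOUNT, ARCOUNT = 0
        for (String label : hostname.split("\\.")) {
            byte[] bytes = label.getBytes(StandardCharsets.US_ASCII);
            out.write(bytes.length);                // length-prefixed label
            out.write(bytes, 0, bytes.length);
        }
        out.write(0);                               // root label ends the name
        out.write(0); out.write(1);                 // QTYPE = A
        out.write(0); out.write(1);                 // QCLASS = IN
        return out.toByteArray();
    }

    /** Prefixes the message with its 2-byte big-endian length, as DNS over TCP requires. */
    static byte[] frameForTcp(byte[] message) {
        byte[] framed = new byte[message.length + 2];
        framed[0] = (byte) (message.length >> 8);
        framed[1] = (byte) message.length;
        System.arraycopy(message, 0, framed, 2, message.length);
        return framed;
    }

    /** Tries UDP first; if no reply arrives within the timeout, retries the same query over TCP. */
    static byte[] query(InetAddress server, byte[] query, int timeoutMillis) throws IOException {
        try (DatagramSocket udp = new DatagramSocket()) {
            udp.setSoTimeout(timeoutMillis);
            udp.send(new DatagramPacket(query, query.length, server, 53));
            DatagramPacket reply = new DatagramPacket(new byte[512], 512);
            udp.receive(reply);
            return Arrays.copyOf(reply.getData(), reply.getLength());
        } catch (SocketTimeoutException udpTimedOut) {
            // The UDP packet (or its reply) was lost: fall through to TCP.
        }
        try (Socket tcp = new Socket()) {
            tcp.connect(new InetSocketAddress(server, 53), timeoutMillis);
            tcp.setSoTimeout(timeoutMillis);
            tcp.getOutputStream().write(frameForTcp(query));
            tcp.getOutputStream().flush();
            DataInputStream in = new DataInputStream(tcp.getInputStream());
            byte[] reply = new byte[in.readUnsignedShort()]; // 2-byte length prefix
            in.readFully(reply);
            return reply;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: <dns-server-ip> <hostname>");
            return;
        }
        byte[] reply = query(InetAddress.getByName(args[0]), buildQuery(0x1234, args[1]), 2000);
        System.out.println("received " + reply.length + " bytes");
    }
}
```

The sketch does not validate the reply (transaction ID, truncation bit, response code); it only shows the retry path. TCP avoids the dropped-UDP-packet case because a lost segment is retransmitted by the transport instead of surfacing as a resolver timeout.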
Is this issue contagious? If you have 2 pods running in the same cluster and this issue occurs in one of them, can it break something in the way the cluster resolves DNS over UDP and impact additional pods?
Hello,
Hi, the same issue happened in Docker on Azure.
@normanmaurer |
Expected behavior
For DNS resolution to be reliable.
Actual behavior
Steps to reproduce
There are some code elements in reactor/reactor-netty#2978
The related Kubernetes issue is probably well described here: https://www.weave.works/blog/racy-conntrack-and-dns-lookup-timeouts, https://blog.quentin-machu.fr/2018/06/24/5-15s-dns-lookups-on-kubernetes/ and kubernetes/kubernetes#56903
Minimal yet complete reproducer code (or URL to code)
There are some code elements in reactor/reactor-netty#2978
Netty version
netty 4.1.100.Final
JVM version (e.g. java -version)
JDK 17.0.6
OS version (e.g. uname -a)
Linux example-worker-1 4.19.0-25-amd64 #1 SMP Debian 4.19.289-2 (2023-08-08) x86_64 GNU/Linux