DNS cache not updated after unsuccessful reconnects #8574
Each time the log has "Resolved address", that is a poll of the addresses from DNS. We have code to avoid hammering DNS more than once every 30 seconds, and the times match that. It looks like it is working to me. I'd double-check your DNS TTLs and networkaddress.cache.ttl.

It looks like you are using pick-first. For pick-first all the addresses are in one subchannel and we iterate over each address in turn trying to find one that works. If all the attempts fail, we just propagate a single attempt's error message and hope it is representative of the group.
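As a quick way to verify the JVM-side setting mentioned above, here is a minimal sketch (the value 10 is only an example): networkaddress.cache.ttl is a Java security property, so it is read via java.security.Security rather than as a system property.

```java
import java.security.Security;

public class DnsCacheTtlCheck {
  public static void main(String[] args) {
    // networkaddress.cache.ttl is a *security* property, so it must be set
    // in $JAVA_HOME/conf/security/java.security or programmatically before
    // the first hostname is resolved and cached.
    Security.setProperty("networkaddress.cache.ttl", "10");
    System.out.println("JVM DNS cache TTL: "
        + Security.getProperty("networkaddress.cache.ttl") + "s");
  }
}
```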
Hi @ejona86, thanks for your reply. I have checked the DNS TTL (using "dig" within the container) and it's 5s. The DNS behaviour within the JVM is as expected: the dns-resolver-thread is already printing the new pod IPs, but grpc is still trying to connect to the old ones. These are the old pod IPs:
Here are the new pod IPs:
I see the problem now. You are using a custom IP resolver rather than gRPC's DNS resolver. Since this is etcd-related, it seems etcd-io/jetcd#814 is likely where the IP resolver is coming from. It looks like that resolver does resolution in its constructor (which is broken for hostnames because that is a blocking operation, but fine for IP addresses).
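For illustration only, here is a rough, hypothetical sketch (not jetcd's actual resolver) of the alternative being described: the lookup happens in start()/refresh() rather than in the constructor, so gRPC's refresh() call after failed connection attempts can pick up new addresses.

```java
import io.grpc.EquivalentAddressGroup;
import io.grpc.NameResolver;
import io.grpc.Status;

import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a resolver that re-resolves on refresh().
final class RefreshingHostResolver extends NameResolver {
  private final String host;
  private final int port;
  private Listener2 listener;

  RefreshingHostResolver(String host, int port) {
    this.host = host;
    this.port = port;
    // No lookup here: resolving in the constructor would block channel
    // creation and freeze the address list for the channel's lifetime.
  }

  @Override public String getServiceAuthority() {
    return host;
  }

  @Override public void start(Listener2 listener) {
    this.listener = listener;
    resolve();
  }

  @Override public void refresh() {
    // gRPC calls refresh() when connections fail; this is where stale
    // pod IPs get replaced.
    resolve();
  }

  private void resolve() {
    // In a real resolver this blocking lookup should be offloaded to an
    // Executor; it is kept inline here only to keep the sketch short.
    try {
      List<EquivalentAddressGroup> groups = new ArrayList<>();
      for (InetAddress addr : InetAddress.getAllByName(host)) {
        groups.add(new EquivalentAddressGroup(new InetSocketAddress(addr, port)));
      }
      listener.onResult(ResolutionResult.newBuilder().setAddresses(groups).build());
    } catch (UnknownHostException e) {
      listener.onError(Status.UNAVAILABLE.withDescription("DNS lookup failed").withCause(e));
    }
  }

  @Override public void shutdown() {}
}
```

A resolver that instead captures InetAddress.getAllByName(host) once in its constructor would keep reporting the same addresses forever, which matches the behaviour described in the logs.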
Thanks for your help, your analysis is much appreciated.
Hi,
I use grpc-java as part of jetcd to connect to an etcd cluster within Kubernetes.
When scaling down and up again all etcd endpoints, I would expect the grpc client to reconnect.
Restarting the etcd endpoints means new pod IPs, and the Kubernetes internal DNS updates the headless service records pretty fast.
Based on ticket #1463 I think the grpc client should refresh the DNS names
after trying all configured endpoints.
In the provided logs I see that all three endpoints are tried in a loop, but always the old pod IPs.
Also interesting: The "No route to host" log is only seen for the first endpoint etcd-0, but the message "Started transport NettyClientTransport" is seen as round robin over all endpoints.
The JVM is already configured with networkaddress.cache.ttl=10.
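For context, the client setup is roughly of this shape (a sketch with hypothetical hostnames; jetcd builds a gRPC channel over the listed endpoints, which resolve through the Kubernetes headless service):

```java
import io.etcd.jetcd.Client;

public class EtcdClientSetup {
  public static void main(String[] args) {
    // Hypothetical headless-service endpoints; the real names depend on the
    // StatefulSet/service configuration.
    Client client = Client.builder()
        .endpoints(
            "http://etcd-0.etcd-headless:2379",
            "http://etcd-1.etcd-headless:2379",
            "http://etcd-2.etcd-headless:2379")
        .build();

    // Expectation: when the pods restart with new IPs, the underlying gRPC
    // channel re-resolves these names after its connection attempts fail.
    client.close();
  }
}
```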
What version of gRPC-Java are you using?
1.39.0
What is your environment?
Linux, K8s
What did you expect to see?
After trying to connect to all endpoints, grpc should refresh DNS and get the new pod IPs
What did you see instead?
grpc keeps using the old pod names/IPs
Steps to reproduce the bug
Shutdown all server endpoints, start them again (with new IPs) and wait for client to reconnect
grpc.log