
Problems: TCP connecter does not Round Robin Multiple DNS A records, TCP connecter does not check for DNS TTL expiration #2297

Open
cristicbz opened this issue Jan 7, 2017 · 4 comments


@cristicbz

Does 0MQ connect to all IPs returned by DNS lookup, or only to the first one?

I'm trying to connect a DEALER to multiple ROUTER backends. The router.local DNS name resolves to the IP addresses of all my backends, 10.0.1.1, 10.0.1.2 ... It seems that if there is a connection failure, the hostname gets re-resolved and the next IP is picked up (which is great!), but I would really have liked socket.connect("tcp://router.local:8081") to behave the same as for address in lookup(hostname): socket.connect(address).

Is there some way to tell 0MQ: "stay connected to all the endpoints this hostname resolves to, respecting the TTL"? Our setup is headless services in Kubernetes: scaling a service up causes the DNS to resolve to new IPs, and updating a service triggers a rolling update (with a corresponding rolling replacement of addresses in the DNS A records).

@bluca
Member

bluca commented Jan 7, 2017

It connects only to the first address returned by getaddrinfo, see: https://github.com/zeromq/libzmq/blob/master/src/tcp_address.cpp#L545

In theory it could be implemented by changing the internal address structure in tcp_address.cpp/hpp to a list and making some adjustments (well, maybe more than a few) in tcp_connecter.cpp, but I am worried about the semantic implications.

Having multiple connects, as the documentation explains, implies round-robin on sends for most socket types. This works fine and is not confusing, because the user has to call connect manually for each endpoint, so there is no surprise. But if this started to happen behind the scenes, without any intervention from the user, purely depending on what the DNS returns, I can see it getting icky very quickly.

So if you would like to implement it and send a PR, by all means please do and we'll merge it, but I think it should be behind a socket option and disabled by default.

For your use case, wouldn't it be doable to just do the DNS resolution in your application, and connect multiple times using the IPs rather than the hostname?
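
Something along these lines, as an untested sketch (assuming the plain libzmq C API, a DEALER socket, IPv4 only, and the router.local:8081 endpoint from above):

```c
#include <zmq.h>
#include <netdb.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>

//  Resolve 'hostname' ourselves and connect the socket once per A record.
static int connect_all (void *socket, const char *hostname, const char *port)
{
    struct addrinfo hints, *res, *ai;
    memset (&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;          //  A records only, for brevity
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo (hostname, port, &hints, &res) != 0)
        return -1;

    for (ai = res; ai; ai = ai->ai_next) {
        char ip [INET_ADDRSTRLEN];
        struct sockaddr_in *sin = (struct sockaddr_in *) ai->ai_addr;
        inet_ntop (AF_INET, &sin->sin_addr, ip, sizeof ip);

        char endpoint [64];
        snprintf (endpoint, sizeof endpoint, "tcp://%s:%s", ip, port);
        zmq_connect (socket, endpoint);  //  one connect per resolved IP
    }
    freeaddrinfo (res);
    return 0;
}

int main (void)
{
    void *ctx = zmq_ctx_new ();
    void *dealer = zmq_socket (ctx, ZMQ_DEALER);
    connect_all (dealer, "router.local", "8081");
    //  ... use the socket ...
    zmq_close (dealer);
    zmq_ctx_term (ctx);
    return 0;
}
```

That is, one zmq_connect per A record, which gives you the documented round-robin behaviour across the backends.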

@cristicbz
Author

Thanks for the answer @bluca!

But if this started to happen behind-the-scenes, without any intervention of the user, just depending on what the DNS returns,

Round robin is the right semantics DNS-wise too, so it's tempting to just match it up, but the change in behaviour might be surprising; agreed that putting this behind a socket option makes sense.

For your use case, wouldn't it be doable to just do the DNS resolution in your application, and connect multiple times using the IPs rather than the hostname?

So, to work around this I think I'd need a thread which polls the DNS after its TTL and re-resolves. If the DNS changes from (A, B) to (A, C), I'd have to do a diff and call zmq_disconnect(socket, "B") and zmq_connect(socket, "C"). Duplicating this endpoint information is inconvenient, but not terrible. The bigger issue is that the sockets are not thread-safe, so I can't do this from a separate thread :(

It seems like sticking a TTL timer (with some reasonable minimum) per hostname on the I/O thread event loop to re-resolve and perform these diffs for all the sockets would be much more elegant (and we'd respect DNS semantics!). Even without the "connect multiple times" socket option set, the TTL timer would be useful to react to DNS changes. Does this sound reasonable?

I'm not very familiar with the 0MQ codebase, I could give this a try with some mentoring if that's on the table :)

@bluca
Member

bluca commented Jan 7, 2017

So, to work around this I think I'd need a thread which polls the DNS after its TTL and re-resolves. If the DNS changes from (A, B) to (A, C), I'd have to do a diff and call zmq_disconnect(socket, "B") and zmq_connect(socket, "C"). Duplicating this endpoint information is inconvenient, but not terrible. The bigger issue is that the sockets are not thread-safe, so I can't do this from a separate thread :(

If you are using CZMQ with zloop for your sockets, you can add a zloop_timer () per socket with a callback, so it won't run in another thread. If you are using zpoller, you can add a zactor that just sleeps and writes back into its pipe, and read the zactor pipe from the poller; a bit more verbose code-wise, but the same result. It should be pretty easy to implement.
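
Roughly like this, as an untested sketch (the hostname, port and 30-second refresh interval are placeholders, and the actual diff bookkeeping is only outlined in the comments):

```c
#include <czmq.h>

typedef struct {
    zsock_t *dealer;
    const char *hostname;   //  e.g. "router.local"
    int port;               //  e.g. 8081
    zlistx_t *current_ips;  //  IPs we are currently connected to
} dns_state_t;

//  Timer callback: runs on the same thread as the socket, so it is safe
//  to connect/disconnect from here.
static int s_refresh_dns (zloop_t *loop, int timer_id, void *arg)
{
    (void) loop;
    (void) timer_id;
    dns_state_t *state = (dns_state_t *) arg;

    //  Re-resolve state->hostname here (e.g. with getaddrinfo) and diff
    //  the result against state->current_ips:
    //    zsock_connect (state->dealer, "tcp://%s:%d", new_ip, state->port);
    //    zsock_disconnect (state->dealer, "tcp://%s:%d", old_ip, state->port);
    //  then update state->current_ips.
    return 0;               //  returning -1 would stop the loop
}

int main (void)
{
    dns_state_t state = { 0 };
    state.dealer = zsock_new_dealer (NULL);
    state.hostname = "router.local";
    state.port = 8081;
    state.current_ips = zlistx_new ();

    zloop_t *loop = zloop_new ();
    //  Re-check the DNS every 30 seconds, forever (times = 0 repeats)
    zloop_timer (loop, 30 * 1000, 0, s_refresh_dns, &state);
    zloop_start (loop);     //  socket readers would be added here too

    zloop_destroy (&loop);
    zlistx_destroy (&state.current_ips);
    zsock_destroy (&state.dealer);
    return 0;
}
```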

It seems like sticking a TTL timer (with some reasonable minimum) per hostname on the I/O thread event loop to re-resolve and perform these diffs for all the sockets would be much more elegant (and we'd respect DNS semantics!). Even without the "connect multiple times" socket option set, the TTL timer would be useful to react to DNS changes. Does this sound reasonable?

Sounds good. There is already support for timer events, so it should be trivial to add one to the tcp connecter class (see zmq::tcp_connecter_t::timer_event).
Then each connecter, if the endpoint was a hostname, can have its own TTL event.

The only thing is that we want to keep behaviour changes to a minimum, to avoid tripping up users who are upgrading from one version to another. So the best approach would be to have a socket option to turn on the TTL expiry check, and another to do the multiple connects, with both disabled by default at the beginning.
Then, when we do a major release that breaks the API, we can consider enabling them by default.
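
To make the intent concrete, a purely hypothetical snippet of what the opt-in could look like from the application side (neither option exists in libzmq today; the names are the draft ones from the TODO list below):

```c
#include <zmq.h>

int main (void)
{
    void *ctx = zmq_ctx_new ();
    void *dealer = zmq_socket (ctx, ZMQ_DEALER);

    int on = 1;
    //  Proposed draft options - these do NOT exist yet
    zmq_setsockopt (dealer, ZMQ_DNS_CONNECT_ALL, &on, sizeof on);
    zmq_setsockopt (dealer, ZMQ_DNS_TTL_CHECK, &on, sizeof on);

    //  With both enabled, a single connect would fan out to every A record
    //  and re-resolve/re-connect when the TTL expires
    zmq_connect (dealer, "tcp://router.local:8081");

    zmq_close (dealer);
    zmq_ctx_term (ctx);
    return 0;
}
```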

So a rough TODO list:

  • add ZMQ_DNS_TTL_CHECK and ZMQ_DNS_CONNECT_ALL (suggestions for better names very welcome) socket options (as draft - see src/zmq_draft.hpp)
  • change the tcp_address class to store all addresses returned by getaddrinfo, and tcp_connecter to connect to them all if the option is set to true
  • change the tcp_address class to query the DNS server for TTLs and store them - IIRC getaddrinfo does not provide this info, so the res* APIs will have to be used, which will be fun for cross-platform compatibility, so start with the platform you work on and stub the others (see the rough sketch after this list)
  • change tcp_connecter to add the TTL expiry event if the option is set to true
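
For the TTL part, a rough POSIX-only sketch of how the res_*/ns_* calls could be used (link with -lresolv on Linux; the helper name is made up, and Windows would need something like DnsQuery instead, hence the cross-platform fun):

```c
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>

//  Return the smallest TTL among the A records for 'hostname', or -1 on error.
//  Re-resolving when this expires keeps us honest with the DNS.
static int lowest_a_record_ttl (const char *hostname)
{
    unsigned char answer [4096];
    int len = res_query (hostname, ns_c_in, ns_t_a, answer, sizeof answer);
    if (len < 0)
        return -1;

    ns_msg handle;
    if (ns_initparse (answer, len, &handle) < 0)
        return -1;

    int min_ttl = -1;
    int count = ns_msg_count (handle, ns_s_an);
    for (int i = 0; i < count; i++) {
        ns_rr rr;
        if (ns_parserr (&handle, ns_s_an, i, &rr) < 0)
            continue;
        if (ns_rr_type (rr) != ns_t_a)
            continue;
        int ttl = (int) ns_rr_ttl (rr);
        if (min_ttl < 0 || ttl < min_ttl)
            min_ttl = ttl;
    }
    return min_ttl;
}
```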

I'm not very familiar with the 0MQ codebase, I could give this a try with some mentoring if that's on the table :)

Sure, I'm happy to help, and thanks for tackling this. I'm also online on IRC during weekday working hours, GMT+00.

@bluca bluca changed the title DNS Round Robin Multiple A records Problems: TCP connecter does not Round Robin Multiple NDS A records, TCP connecter does not check for DNS TTL expiration Jan 7, 2017
@bluca bluca changed the title Problems: TCP connecter does not Round Robin Multiple NDS A records, TCP connecter does not check for DNS TTL expiration Problems: TCP connecter does not Round Robin Multiple DNS A records, TCP connecter does not check for DNS TTL expiration Jan 7, 2017
@cristicbz
Author

Ah, this is turning out to be quite the rabbit hole. If we're going to do periodic DNS lookups, it feels like they should be async too. I may just work around this in client code for now, but I still think this would be an absolute killer feature for something like a Kubernetes cluster: it could completely remove the need for all the protocol-unaware local LBs that k8s sticks in my cluster, and it would make it super convenient to write 0MQ micro-services.

Thanks for all the info though! I may circle back to this if I ever get the time. I'm surprised this hasn't been an issue for other people (I googled for a long time before filing this issue).
