Option to shuffle addresses from DNS #1694

Closed
wants to merge 16 commits into
from

Conversation

Projects
None yet
6 participants
@dreckard
Contributor

dreckard commented Jul 21, 2017

This patch adds an option CURLOPT_DNS_SHUFFLE_ADDRESSES to explicitly request shuffling of addresses returned for a hostname when there is more than one. This can be useful in situations where the application knows that a round robin approach is appropriate and is willing to accept the consequences of potentially discarding some preference order returned by the system's implementation.

I came across this post from a few years ago which discusses the address ordering issue and mentions shuffling the addresses after they are returned but rejects it as too problematic. I agree that it is not the right approach for general use but I think enough use cases exist for it to be available as an option.

In my case I am working on retrofitting an existing application to use libcurl instead of its own old socket based networking implementation. Its only network task is to connect to a specific hostname which is known to be configured for use with a round robin approach.

I'd be happy to write the docs/tests if this seems merge worthy but I thought I should ask for opinions first.

@mention-bot

This comment has been minimized.

Show comment Hide comment
@mention-bot

mention-bot Jul 21, 2017

@dreckard, thanks for your PR! By analyzing the history of the files in this pull request, we identified @bagder, @dfandrich and @captain-caveman2k to be potential reviewers.

@dreckard, thanks for your PR! By analyzing the history of the files in this pull request, we identified @bagder, @dfandrich and @captain-caveman2k to be potential reviewers.

@bagder

This comment has been minimized.

Show comment Hide comment
@bagder

bagder Jul 21, 2017

Member

Thanks for your contribution. I think it is an interesting addition that others might be interested in as well.

Have you thought about how this code should behave on re-use of the cached DNS data? My first cursory look on your code hints that you shuffle the order only once so that repeated uses of the cache will use the same address?

Member

bagder commented Jul 21, 2017

Thanks for your contribution. I think it is an interesting addition that others might be interested in as well.

Have you thought about how this code should behave on re-use of the cached DNS data? My first cursory look on your code hints that you shuffle the order only once so that repeated uses of the cache will use the same address?

@coveralls

This comment has been minimized.

Show comment Hide comment
@coveralls

coveralls Jul 21, 2017

Coverage Status

Coverage decreased (-0.09%) to 75.189% when pulling 6ed9930 on dreckard:dns_shuffle_master into 42a4cd4 on curl:master.

Coverage Status

Coverage decreased (-0.09%) to 75.189% when pulling 6ed9930 on dreckard:dns_shuffle_master into 42a4cd4 on curl:master.

@dreckard

This comment has been minimized.

Show comment Hide comment
@dreckard

dreckard Jul 21, 2017

Contributor

Right about the code not currently reshuffling on reuse of cached names. I did think about that and I didn't see a really decisive argument either way so I went for the simpler option of leaving it out.

The most compelling case I could think of where reshuffling would be preferred is when retrying requests after a failure but the existing fallback logic should usually allow at least a few other addresses to be attempted (timeoutms_per_addr) so I thought the benefit there would be marginal.

Contributor

dreckard commented Jul 21, 2017

Right about the code not currently reshuffling on reuse of cached names. I did think about that and I didn't see a really decisive argument either way so I went for the simpler option of leaving it out.

The most compelling case I could think of where reshuffling would be preferred is when retrying requests after a failure but the existing fallback logic should usually allow at least a few other addresses to be attempted (timeoutms_per_addr) so I thought the benefit there would be marginal.

@bagder

This comment has been minimized.

Show comment Hide comment
@bagder

bagder Jul 26, 2017

Member

Well, if you ask for a page every second from a round-robin-DNS served site using libcurl with a cached DNS entry, you'll now use the same IP (first) for 60 consecutive requests, then do a new DNS resolve and the following 60 requests again to the (new) first IP. That could be a little surprising or just exactly what you'd expect.

It mostly just needs to be clearly documented that it only shuffles them in the actual resolve moment and not later, even if it gets used several times.

Member

bagder commented Jul 26, 2017

Well, if you ask for a page every second from a round-robin-DNS served site using libcurl with a cached DNS entry, you'll now use the same IP (first) for 60 consecutive requests, then do a new DNS resolve and the following 60 requests again to the (new) first IP. That could be a little surprising or just exactly what you'd expect.

It mostly just needs to be clearly documented that it only shuffles them in the actual resolve moment and not later, even if it gets used several times.

@dreckard

This comment has been minimized.

Show comment Hide comment
@dreckard

dreckard Jul 29, 2017

Contributor

Ok agreed, sounds good. I'll try to draft up the docs additions for it in the next few days.

Contributor

dreckard commented Jul 29, 2017

Ok agreed, sounds good. I'll try to draft up the docs additions for it in the next few days.

dreckard added some commits Jul 19, 2017

lib: add option CURLOPT_DNS_SHUFFLE_ADDRESSES
Add new option CURLOPT_DNS_SHUFFLE_ADDRESSES to allow shuffling
address order if multiple A records are returned for a name.
lib: made addr_shuffle function non-static
unit1607: fixed failures due to hacky .c include
@dreckard

This comment has been minimized.

Show comment Hide comment
@dreckard

dreckard Aug 4, 2017

Contributor

Added man page changes and a basic unit test.

Contributor

dreckard commented Aug 4, 2017

Added man page changes and a basic unit test.

Merge branch 'master' of https://github.com/curl/curl into dns_shuffl…
…e_master

# Conflicts:
#	include/curl/curl.h
#	lib/url.c

@bagder bagder deleted a comment from coveralls Sep 21, 2017

@bagder bagder deleted a comment from coveralls Sep 21, 2017

@bagder bagder deleted a comment from coveralls Sep 21, 2017

@bagder bagder deleted a comment from coveralls Sep 21, 2017

@bagder bagder deleted a comment from coveralls Sep 21, 2017

@bagder bagder deleted a comment from coveralls Sep 21, 2017

@bagder bagder deleted a comment from coveralls Sep 21, 2017

@bagder

This comment has been minimized.

Show comment Hide comment
@bagder

bagder Sep 21, 2017

Member

I missed merging this PR before the merge window closed, but it will open again on the release date (October 4) and it would be nice if you could fix the merge conflict until then.

Member

bagder commented Sep 21, 2017

I missed merging this PR before the merge window closed, but it will open again on the release date (October 4) and it would be nice if you could fix the merge conflict until then.

@dreckard

This comment has been minimized.

Show comment Hide comment
@dreckard

dreckard Sep 25, 2017

Contributor

Yup will do.

Contributor

dreckard commented Sep 25, 2017

Yup will do.

dreckard added some commits Sep 30, 2017

@dreckard

This comment has been minimized.

Show comment Hide comment
@dreckard

dreckard Oct 1, 2017

Contributor

Fixed.

Contributor

dreckard commented Oct 1, 2017

Fixed.

@jay

This comment has been minimized.

Show comment Hide comment
@jay

jay Feb 21, 2018

Member

I notice this is labeled next-feature-window and has been open for a while. I don't like the name CURLOPT_DNS_SHUFFLE_ADDRESSES because I think it sounds ambiguous. I think CURLOPT_SHUFFLE_ADDRESSES and --shuffle-addresses would be better. Otherwise if a user is reading a script I think it will sound to them like it's shuffling the dns- server addresses, which can also be specified.

edit:
or CURLOPT_SHUFFLE_HOST_ADDRESSES. longer but easier to figure out

Member

jay commented Feb 21, 2018

I notice this is labeled next-feature-window and has been open for a while. I don't like the name CURLOPT_DNS_SHUFFLE_ADDRESSES because I think it sounds ambiguous. I think CURLOPT_SHUFFLE_ADDRESSES and --shuffle-addresses would be better. Otherwise if a user is reading a script I think it will sound to them like it's shuffling the dns- server addresses, which can also be specified.

edit:
or CURLOPT_SHUFFLE_HOST_ADDRESSES. longer but easier to figure out

dreckard added some commits Mar 3, 2018

Merge branch 'master' of https://github.com/dreckard/curl into dns_sh…
…uffle_master

# Conflicts:
#	include/curl/curl.h
#	lib/url.c
#	tests/data/Makefile.inc
#	tests/data/test1607
#	tests/unit/unit1607.c
Merge branch 'dns_shuffle_master' of https://github.com/dreckard/curl
…into dns_shuffle_master

# Conflicts:
#	tests/unit/unit1607.c
@dreckard

This comment has been minimized.

Show comment Hide comment
@dreckard

dreckard Mar 6, 2018

Contributor

Fixed the conflicts again. I'm not hugely attached to the name so I can go ahead and change it to CURLOPT_SHUFFLE_HOST_ADDRESSES in the near future.

Several of the resolution related options are currently using the CURLOPT_DNS_ prefix so I was hoping to remain consistent with those but it's a fair point that there's ambiguity introduced.

Contributor

dreckard commented Mar 6, 2018

Fixed the conflicts again. I'm not hugely attached to the name so I can go ahead and change it to CURLOPT_SHUFFLE_HOST_ADDRESSES in the near future.

Several of the resolution related options are currently using the CURLOPT_DNS_ prefix so I was hoping to remain consistent with those but it's a fair point that there's ambiguity introduced.

@bagder

This comment has been minimized.

Show comment Hide comment
@bagder

bagder Mar 16, 2018

Member

ok @dreckard and @jay, let's finally get this feature merged this window!

@dreckard, can you please rebase this, fix the conflicts (again) and squash the commits that should be squashed? I'm sorry for having dragged this out so that you've had to do this multiple times.

Member

bagder commented Mar 16, 2018

ok @dreckard and @jay, let's finally get this feature merged this window!

@dreckard, can you please rebase this, fix the conflicts (again) and squash the commits that should be squashed? I'm sorry for having dragged this out so that you've had to do this multiple times.

+CURLcode curl_easy_setopt(CURL *handle, CURLOPT_DNS_SHUFFLE_ADDRESSES, long onoff);
+.fi
+.SH DESCRIPTION
+When a name is resolved and more than one A record is returned, shuffle all

This comment has been minimized.

Show comment Hide comment
@bagder

bagder Mar 17, 2018

Member

This means "more than one IP address", right? Because A record implies IPv4 address, but surely this also shuffles IPv6 addresses fine?

@bagder

bagder Mar 17, 2018

Member

This means "more than one IP address", right? Because A record implies IPv4 address, but surely this also shuffles IPv6 addresses fine?

This comment has been minimized.

Show comment Hide comment
@dreckard

dreckard Mar 17, 2018

Contributor

Yes it does also shuffle IPv6 addresses so good point.

@dreckard

dreckard Mar 17, 2018

Contributor

Yes it does also shuffle IPv6 addresses so good point.

@bagder bagder closed this in d95f3dc Mar 17, 2018

@bagder

This comment has been minimized.

Show comment Hide comment
@bagder

bagder Mar 17, 2018

Member

Thank you. I squashed and rebased your work, and you will also see that I edited it slightly. Amongst others so that torture tests rune fine on it.

Member

bagder commented Mar 17, 2018

Thank you. I squashed and rebased your work, and you will also see that I edited it slightly. Amongst others so that torture tests rune fine on it.

@dreckard

This comment has been minimized.

Show comment Hide comment
@dreckard

dreckard Mar 17, 2018

Contributor

Ah great, thanks! I just committed the option name change suggested by @jay before seeing the notifications for these messages, if it's still desired. I think the tool command line option is still needed as well, sorry for missing that earlier.

Contributor

dreckard commented Mar 17, 2018

Ah great, thanks! I just committed the option name change suggested by @jay before seeing the notifications for these messages, if it's still desired. I think the tool command line option is still needed as well, sorry for missing that earlier.

@dfandrich

This comment has been minimized.

Show comment Hide comment
@dfandrich

dfandrich Mar 18, 2018

Collaborator

Test 1608 now fails with a memory leak under Polar SSL: https://curl.haxx.se/dev/log.cgi?id=20180318182706-28132#prob75 Was this introduced with this commit or did it just tickle a pre-existing problem?

Collaborator

dfandrich commented Mar 18, 2018

Test 1608 now fails with a memory leak under Polar SSL: https://curl.haxx.se/dev/log.cgi?id=20180318182706-28132#prob75 Was this introduced with this commit or did it just tickle a pre-existing problem?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment