Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

connect: happy eyeballs cleanup #5089

Closed
wants to merge 2 commits into from
Closed

Conversation

@bagder
Copy link
Member

bagder commented Mar 12, 2020

Make sure each separate index in connn->tempaddr[] is used for a fixed
family (and only that family) during the connection process.

Fixes #5083
Fixes #4954

@jay

This comment has been minimized.

Copy link
Member

jay commented Mar 12, 2020

This appears to solve the issue reported in #5083 but causes a different issue. If I put the immediate fail family first then the other family is never tried. For example:

curld --resolve localhost:9999:[::1],[::2],127.0.0.1, -v localhost:9999
* Added localhost:9999:[::1],[::2],127.0.0.1, to DNS cache
* STATE: INIT => CONNECT handle 0x8a9b88; line 1619 (connection #-5000)
* Added connection 0. The cache now contains 1 members
* Hostname localhost was found in DNS cache
* family0 == v6, family1 == v4
*   Trying ::1:9999...
* Immediate connect fail for ::1: Unknown error 99 (0x63)
*   Trying ::2:9999...
* Immediate connect fail for ::2: Unknown error 99 (0x63)
* Closing connection 0
* The cache now contains 0 members
* Expire cleared (transfer 0x8a9b88)
curl: (7) Couldn't connect to server
Make sure each separate index in connn->tempaddr[] is used for a fixed
family (and only that family) during the connection process.

Fixes #5083
Fixes #4954
@bagder bagder force-pushed the bagder/cleanup-happy-eyeballs branch from 6a16489 to 2430ef4 Mar 12, 2020
@bagder

This comment has been minimized.

Copy link
Member Author

bagder commented Mar 12, 2020

Thanks, it broke when I totally separated the traversing of the two families so only one family was iterated over in case of only immediate failures in that loop. Now it will go over both families.

@jay

This comment has been minimized.

Copy link
Member

jay commented Mar 13, 2020

That seems to work but now it goes IPv6 (immediate fail), IPv6 (immediate fail), IPv4 (connrefused). Is that what was intended, I would have expected trying one IPv6 followed by one IPv6 IPv4 and then for results IPv6 (immediate fail), IPv4 (connrefused), IPv6 (immediate fail)

* Added localhost:9999:[::1],[::2],127.0.0.1, to DNS cache
* STATE: INIT => CONNECT handle 0x959b88; line 1619 (connection #-5000)
* Added connection 0. The cache now contains 1 members
* Hostname localhost was found in DNS cache
* family0 == v6, family1 == v4
*   Trying ::1:9999...
* Immediate connect fail for ::1: Unknown error 99 (0x63)
*   Trying ::2:9999...
* Immediate connect fail for ::2: Unknown error 99 (0x63)
*   Trying 127.0.0.1:9999...
* STATE: CONNECT => WAITCONNECT handle 0x959b88; line 1675 (connection #0)
* connect to 127.0.0.1 port 9999 failed: Connection refused
* Failed to connect to localhost port 9999: Connection refused
* multi_done
* Closing connection 0
* The cache now contains 0 members
* Expire cleared (transfer 0x959b88)
curl: (7) Failed to connect to localhost port 9999: Connection refused

If I move IPv6 to the second family then #5083 fixed as I noted already, however also it is interesting that now connection refused (from the first family) is still the error code, probably since it happens after the immediate connect fail. It seems to me that is more correct and may inadvertently solve mitigate #5080.

* Added localhost:9999:127.0.0.1,[::1],[::2], to DNS cache
* STATE: INIT => CONNECT handle 0x1e59b88; line 1619 (connection #-5000)
* Added connection 0. The cache now contains 1 members
* Hostname localhost was found in DNS cache
* family0 == v4, family1 == v6
*   Trying 127.0.0.1:9999...
* STATE: CONNECT => WAITCONNECT handle 0x1e59b88; line 1675 (connection #0)
*   Trying ::1:9999...
* Immediate connect fail for ::1: Unknown error 99 (0x63)
*   Trying ::2:9999...
* Immediate connect fail for ::2: Unknown error 99 (0x63)
* connect to 127.0.0.1 port 9999 failed: Connection refused
* Failed to connect to localhost port 9999: Connection refused
* multi_done
* Closing connection 0
* The cache now contains 0 members
* Expire cleared (transfer 0x1e59b88)
curl: (7) Failed to connect to localhost port 9999: Connection refused
@bagder

This comment has been minimized.

Copy link
Member Author

bagder commented Mar 13, 2020

Is that what was intended

Yes, on immediate fail it'll continue to the next within the same family. Not entirely sure it is the best approach but seems consistent. At least the "immediate fail" ones are fast, and fairly rare.

it is interesting that now connection refused (from the first family) is still the error code, probably since it happens after the immediate connect fail

Since curl potentially does two connects in parallel, it is certainly not clear how to decide which single error code to store when both fail. Is it simply the last one or can we do a better heuristic?

@jay

This comment has been minimized.

Copy link
Member

jay commented Mar 14, 2020

Since curl potentially does two connects in parallel, it is certainly not clear how to decide which single error code to store when both fail. Is it simply the last one or can we do a better heuristic?

It's documented as the last connect error though I agree this may be somewhat of a special case.

@bagder bagder closed this in dbd16c3 Mar 15, 2020
@bagder bagder deleted the bagder/cleanup-happy-eyeballs branch Mar 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

2 participants
You can’t perform that action at this time.