HTTP-01 IPv6 to IPv4 fallback not working properly #2770
Comments
One false-positive for this issue I've seen so far is a host with an A and AAAA record failing an HTTP-01 challenge because the webserver on the AAAA IP returned a 404 while the A webserver had the correct webroot configured. This doesn't meet the conditions for the retry because the failure is at the HTTP challenge validation level and not the IP connectivity level. |
That seems like a situation where it's consistent with our other behavior to treat the validation as failed due to a misconfigured server. |
@jsha I agree, that's why I called it a false positive. |
This also happened to me, although with
|
Regarding this IPv6 preference: https://community.letsencrypt.org/t/certbot-ipv6-address-on-domain-misconfigured-and-challenges-fail-prefer-ipv6/34626 In this case the user has a domain with both records, A and AAAA, but the web server is only configured for IPv4. The IPv6 request reaches the web server but not the right virtual host, so the challenge obviously fails. I don't know whether it is worth falling back to IPv4 in this case. |
@sahsanu - Thanks for commenting. That thread is the same one I mentioned earlier in this thread as a false positive (I should have linked to it, apologies). In this case I don't expect a fallback and everything appears to be working as intended. |
Got it! I didn't understand that part. :-) |
Just another "false positive": https://community.letsencrypt.org/t/404-on-well-known-acme-challenge-but-accessable-from-browser/34730 I can understand the decision to prefer AAAA if both records are available for a domain, but sadly we are still living in an IPv4 world. The use case for IPv6 is very limited, as the majority of domestic ISPs don't provide IPv6 to their customers. Also, a lot of people get dedicated, VPS, or shared hosting where the hoster auto-configures DNS with both addresses (IPv4 and IPv6), but they don't care about IPv6 (yet) and don't configure their services to use it properly, so I'm afraid we will see a lot of cases with this "false positive" issue ;). |
I will be posting an announcement in the community forum about the IPv6 preference today. Hopefully that will help clear up the confusion. Ultimately if you run a website that publishes an AAAA address that doesn't work you're going to run into problems sooner or later! |
From the entry in https://community.letsencrypt.org/t/unable-to-update-challenge-the-challenge-is-not-pending/35118/3, looking in the logs, it appears that the problem was a timeout. So one possibility is that the fallback doesn't happen correctly on timeouts, perhaps because the first try uses up all of the available time? |
@jsha That's indeed a possibility. We didn't increase the timeout to accommodate making two back-to-back requests. |
Same problem here. The IPv6 address times out since the HTTP server isn't listening on that interface, and the IPv4 address is never checked. Why not run both in parallel and let the faster one win? If you want to give IPv6 a preference, you can start that check a second before the IPv4 one. |
Any news on this issue? In my case, the IPv6 address is unreachable (out of my direct control). For example curl -v6 returns "Immediate connect fail", "Network is unreachable". Thanks |
No news - if the issue isn't assigned & placed into a milestone for a sprint then it's safe to assume it isn't being actively worked on yet. |
I’ve just stumbled upon this. My server apparently drops its IPv6 connectivity from time to time (no idea why at this point), and then the challenge verification fails by timing out. I would have expected a fallback to IPv4, but apparently no. |
I've updated this issue description to reflect my understanding after some debugging & working on a fix. The fallback problem is isolated to HTTP-01 and I have opened #2852 with a proposed fix. |
The implementation of the dialer used by the HTTP01 challenge, constructed with `resolveAndConstructDialer`, used the same wrapped `net.Dialer` for both the initial IPv6 connection, and any subsequent IPv4 fallback connections. This caused the IPv4 fallback to never succeed for cases where the initial IPv6 connection expended the `validationTimeout`. This commit updates the http01Dialer (newly renamed from `dialer` since it is in fact specific to HTTP01 challenges) to use a fresh dialer for each connection. To facilitate testing the http01Dialer maintains a count of how many dialer instances it has constructed. We use this in a unit test to ensure the correct behaviour without a great deal of new mocking/interfaces. Resolves #2770
Hi, |
Hi @derekatkins - the fallback behaviour is a server-side change, and has been deployed to production already. The catch is that it's not a complete solution for 100% of all broken IPv6 configurations. In practice there are a handful of cases where IPv6 will not validate for ACME and is broken, but in which the actual IPv6 connectivity works enough to prevent a fallback from occurring. At this point we've decided that we can't invest any more resources in improving the fallback and are not pursuing additional improvements to the server-side code.
@derekatkins I recommend that you resolve the IPv6 connectivity or remove the AAAA record entirely. Unfortunately these are the only two options that will be able to fix your problem. If you need further help diagnosing the problem I recommend starting a new forum topic in the Let's Encrypt Community Forum. Thanks! |
Interesting. I would think that a lack-of-connect would trigger the fallback after the connect() times out -- which is my case. I'll work on getting the AAAA records removed (I don't control the DNS) until I get IPv6 working again. |
I just ran into this issue as well. The error messages were not descriptive enough in the main client to even clue me in to why I was receiving timeouts, but only on one of my domains (I have several with shared IP addresses). After an entire day of investigating, it became apparent that it was because that was the only domain with dual-stack listed in DNS, and there were some routing issues upstream with IPv6 between Let's Encrypt and my servers. In this particular case, because no TCP connection could even be made and the attempt simply timed out, shouldn't that qualify as a downgrade-to-IPv4 condition? This is EXACTLY how web browsers handle this exact situation. Sadly, even today in 2018, there are still routing issues with the IPv6 global network at the backbone/BGP level, and because of this, it literally took my production web site offline: I could not renew certs through Let's Encrypt and simply hit the rate limit (only 5?), when it really seems like an IPv4 fallback should have been preferred. |
Any updates on this? I rely on IPv4 connection too. I don't have AAAA record defined, only A, and still does not work. |
It sounds like you have a different problem. I recommend posting on https://community.letsencrypt.org/. Thanks! |
Another thumbs up for this problem. We do have two domains with IPv6 on port 443 enabled and those update certificates correctly. The remaining domains are for our own use, however, not published to clients, so without IPv4 (no need for it, only universities use IPv6 there). I can symlink all challenge dirs into one, but an -ipv4only option for certbot would be cooler... |
Hi @navara! It sounds like you've got a configuration problem. I recommend posting on https://community.letsencrypt.org/. I'm going to lock this conversation for now - I think most followups are best sent to the forum. Thanks all! |
A user in IRC noticed that they were suffering HTTP-01 validation failures for a domain that previously worked. Investigating it appears the domain had an AAAA record and an A record but the AAAA address wasn't working. I expected the IPv6 to IPv4 fallback code would have masked this issue but looking at the validation records it did not, there is no `addressTried`, and the `addressUsed` is the v6 address:
The VA logged:
The root cause is the VA's HTTP-01 dialer wrapper is re-using the same underlying net.Dialer with an expended timeout between the initial and subsequent fallback connection.