Seeing exception 'Couldn't resolve the following domains to an IPv4 record' when creating 7 domain certificate #46
Comments
I have just released |
Thanks for the quick reply. Sorry it took so long on this end.
Running nslookup redbrunch.org immediately produces the correct results. I ran the client a second time after doing the nslookup, and the SSL cert was created. (Love computers.) I then reordered the list of domains within the .yml file, deleted the new cert, and ran it again. Failed with
Some sort of timing or timeout issue when doing the DNS lookups? |
Could you provide a Wireshark compatible trace of a failure / success? |
Tracked it down to using 8.8.8.8 (Google DNS) to resolve IP addresses. Doing DNS lookups one by one is no problem, but sending 32 requests at once appears to trip some sort of spam/DDoS filter: only 16 requests receive responses. My server lives in the middle of a very large Softlayer server farm, so Google probably already has the address block containing my IP addresses on a watch list.
Attached are 3 Wireshark trace files (CSV format) showing only 16 responses to 32 queries. The console for the 3rd try is as follows...
This would account for why changing the order of domains in the .yml file caused different domains to fail and why, on occasion, the request would actually complete. Is there, or can there be, an option to use a DNS server other than 8.8.8.8?
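For anyone wanting to repeat the "16 responses to 32 queries" count on their own trace, here is a minimal Python sketch that pairs DNS queries with responses in a Wireshark CSV export. It assumes the default export columns (including "Protocol" and "Info") and the usual Info text of the form "Standard query 0x... A name" / "Standard query response 0x... ...". Matching on the transaction ID alone is a simplification, and the function name is made up for illustration.

```python
import csv
import io
import re

# Info column text Wireshark prints for DNS packets (assumed format), e.g.
#   "Standard query 0x1a2b A example.org"
#   "Standard query response 0x1a2b A example.org A 192.0.2.1"
DNS_INFO = re.compile(r"^Standard query (response )?(0x[0-9a-fA-F]+)\b(.*)$")


def unanswered_queries(csv_text):
    """Return [(transaction_id, query_description)] for queries that never
    got a response. Pairs packets by transaction ID only (a simplification)."""
    pending = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row.get("Protocol") != "DNS":
            continue
        match = DNS_INFO.match(row.get("Info") or "")
        if not match:
            continue
        is_response, txid, rest = match.groups()
        if is_response:
            # A response arrived: the query with this ID is answered.
            pending.pop(txid, None)
        else:
            # Retransmissions reuse the ID and simply overwrite the entry.
            pending[txid] = rest.strip()
    return list(pending.items())
```

Calling `unanswered_queries(open("trace.csv").read())` (the file name is hypothetical) lists the lookups that were dropped, which makes it easy to see whether the same domains fail on every run or whether the failures move around with the ordering.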
Thanks for taking a deeper look. I have finally figured out how to read the Windows Registry without any extension. amphp/dns#40 will use the local system config. We'll have a release soon, probably today. |
I have just released |
OK, so getting there. :-)
Good news: Acme-client is now using the Windows DNS settings rather than 8.8.8.8.
Bad news: Still seeing random DNS timeout/failure issues. Acme-client is now correctly accessing an internal Softlayer DNS server at 10.0.80.11. I have no idea what software that DNS server is running. Many times, I was seeing this...
Here is a Wireshark file. Once, I was able to get the multi-domain cert issued, but not a single-domain one.
Reran just to pick up the single domain and got this...
Here is the Wireshark file for that. I let everything rest for a while (15+ minutes) and tried again, and was able to pick up the final single-domain cert.
Here is the Wireshark file for this final run. A couple of thoughts; no idea if they are valid or not. :-)
- Are the DNAME queries needed? There appears to be some question as to whether the DNAME record is now obsolete.
- Is it possible to throttle the DNS queries? Yes, requests would take longer to run, but it might not trigger what appear to be DDoS protections in the DNS server.
- The "Could not obtain directory" error message seems to be tied to failed DNS queries. Is the message correct?
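To illustrate what throttling the DNS queries could look like, here is a minimal Python sketch that caps the number of lookups in flight with a semaphore. It is not how acme-client or amphp/dns works (those are PHP), and the names `resolve_all`, `resolve_ipv4`, and `max_concurrent` are made up for this example. It only resolves IPv4 addresses through the system resolver.

```python
import asyncio
import socket


async def resolve_ipv4(name):
    """Resolve one name to its IPv4 addresses via the system resolver."""
    loop = asyncio.get_running_loop()
    infos = await loop.getaddrinfo(
        name, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr).
    return sorted({info[4][0] for info in infos})


async def resolve_all(names, max_concurrent=4, resolver=resolve_ipv4):
    """Resolve every name, keeping at most max_concurrent lookups in flight.

    Returns a dict mapping each name to a list of addresses, or to the
    exception raised if that lookup failed.
    """
    gate = asyncio.Semaphore(max_concurrent)

    async def one(name):
        async with gate:  # wait here while max_concurrent lookups are running
            try:
                return name, await resolver(name)
            except OSError as exc:  # socket.gaierror is a subclass of OSError
                return name, exc

    return dict(await asyncio.gather(*(one(n) for n in names)))
```

For example, `asyncio.run(resolve_all(domains, max_concurrent=4))` would issue the lookups for a 32-name list in waves of four rather than all at once. Whether that stays under a given resolver's limit is an open question, since the limit itself is unknown here.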
Do you know whether 10.0.80.11 is really the right one? It's now searching all interfaces for nameservers, not sure if that's the right thing to do, but it was required to make our tests work on AppVeyor.
I'm not sure about that one. Will defer that question to @DaveRandom and @bwoebi. But we plan to send the 4 packets / requests currently needed for one resolution in a single packet in the future. Could you ask your service provider about the failures? I think every server environment should be able to handle 10 concurrent name resolutions without running into DDoS protections.
OK, so ran some more tests tonight. 10.0.80.11 is the correct internal primary DNS server for Softlayer. nslookup confirms the IP address is correct. Launched Acme-client and was seeing the same errors as above. Switched primary DNS for my server to my internal BIND instance (127.0.0.1). Acme-client ran multiple times with no errors and I was able to create SSL certs. Switched back to 10.0.80.11 and saw the errors again.
Checked with Softlayer and got the following response: For security reasons we can not provide you with the version of Bind that our resolvers run.
I'm running BIND 9.9.... I'm guessing they are on some version of BIND 9.10 or 9.11 ... or some customized software. So the issue, IMHO, appears to be a version(s) of BIND(?) DNS server software not responding fast enough or as expected. Or some Softlayer virtual network issue that's causing problems with both 10.0.80.11 and 8.8.8.8. :-)
At this point, I would close this issue or put it on hold. Once your new DNS "send the ... requests ... in a single packet" is in place, notify me and I will try the various DNS servers again. In the meantime, I'll use 127.0.0.1 and BIND 9.9... as primary DNS, which seems to be working just fine. Thanks for your help and support!
Yes, 10.0.80.11 is definitely right according to http://knowledgelayer.softlayer.com/faqs/13#26, too. I'm not sure whether it's a BIND version issue or a configuration issue. |
BIND 9.10 implemented a Response Rate Limiting feature to prevent DNS amplification attacks, and that feature may be causing this problem. See https://kb.isc.org/article/AA-00994/0/Using-the-Response-Rate-Limiting-Feature-in-BIND-9.10.html. I have been using BIND 9.9 for DNS and am seeing no problems.
Do you have a timeout specified using |
I'm running on Windows, not Linux. In Windows, DNS servers are configured under the network adapter settings. AFAIK there are no timeout setting options ... or at least none that I've set. |
There is one, but we don't support that one yet. Ok, fine then. |
Seeing same issue as #33 but with just 7 domains. Reduce list to 6 and all is well. Running version 0.2.11 under Windows 2008 Server and PHP 7.1 in a virtual environment, using a .yml file.
Error comes thru as something like...
Sometimes it has problems resolving just redbrunch.org domain.
The corresponding part of the .yml file is
Eliminate one of the domains, and all is well. Using the letsencrypt staging server produces the same results.
Seems to be some sort of timing or time-out issue, but have not been able to narrow it down.