Time limit exceeded / did not return the expected TXT record #241

Closed
steadicat opened this issue Jul 5, 2016 · 6 comments

steadicat commented Jul 5, 2016

This is similar to #212, but slightly different. I'm using the manual DNS challenge with Google Domains. I get:

2016/07/05 16:24:29 [INFO] acme: Please create the following TXT record in your bevelapp.com. zone:
2016/07/05 16:24:29 [INFO] acme: _acme-challenge.bevelapp.com. 120 IN TXT "..."
2016/07/05 16:24:29 [INFO] acme: Press 'Enter' when you are done

2016/07/05 16:25:31 [INFO][bevelapp.com] Checking DNS record propagation...
2016/07/05 16:27:34 [INFO] acme: You can now remove this TXT record from your bevelapp.com. zone:
2016/07/05 16:27:34 [INFO] acme: _acme-challenge.bevelapp.com. 120 IN TXT "..."
2016/07/05 16:27:34 [bevelapp.com] Could not obtain certificates
    Time limit exceeded. Last error: NS ns-cloud-a2.googledomains.com. did not return the expected TXT record

Yet if I query the DNS server directly, I get the right response:

$ dig _acme-challenge.bevelapp.com TXT @ns-cloud-a2.googledomains.com
_acme-challenge.bevelapp.com. 60 IN TXT "..."

I tried this a dozen times, even bumping up the DNS timeout to 5 minutes and the TXT record TTL down to 1 minute, and I get the same error every time. The only thing that changes is the name server in the error message. Sometimes it's one of the domain’s nameservers (e.g. ns-cloud-a1.googledomains.com), sometimes it’s ns*.google.com (?!?).

Note that the error message seems really confused, because it says “you can now remove this TXT record”, right before “did not return the expected TXT record”.

The workaround I found is to NOT “press enter when you are done” as Lego instructs me, but to wait until I can confirm that the DNS records have fully propagated, and only then press Enter.
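
One way to do that confirmation before pressing Enter is to ask each of the domain's nameservers directly, the same way the dig command above does. The following is only a minimal sketch, assuming `dig` is installed and on the PATH; the function names are illustrative and not part of Lego, and the list of nameservers is something you supply yourself.

```python
import shlex
import subprocess


def dig_command(fqdn, nameserver):
    """Build the argv for a direct, non-recursive TXT query via dig."""
    return ["dig", "+short", "+norecurse", "TXT", fqdn, "@" + nameserver]


def parse_txt_values(dig_output):
    """Turn `dig +short` TXT output into a set of unquoted record values."""
    values = set()
    for line in dig_output.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue
        # A TXT record may be printed as several quoted chunks on one line.
        values.add("".join(shlex.split(line)))
    return values


def propagated_everywhere(fqdn, expected, nameservers, run=None):
    """Return True only if every nameserver already serves the expected value."""
    if run is None:
        # Default runner: actually invoke dig (needs network access).
        def run(argv):
            result = subprocess.run(argv, capture_output=True, text=True, timeout=10)
            return result.stdout
    return all(
        expected in parse_txt_values(run(dig_command(fqdn, ns)))
        for ns in nameservers
    )
```

For the domain in this report, the call would look like `propagated_everywhere("_acme-challenge.bevelapp.com.", value, ["ns-cloud-a1.googledomains.com", "ns-cloud-a2.googledomains.com"])`, where `value` is the TXT content that Lego printed. Only press Enter once it returns True.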

To sum up, I believe there are a few different issues here:

  1. There’s a bug somewhere that breaks the DNS record propagation checking. It seems Lego fails whenever it gets an unexpected response, instead of retrying until the DNS has propagated or the timeout has expired.
  2. There’s a bug with looking up the authoritative nameserver; it gives a wrong result half the time.
  3. The error message is confusing: it says that the TXT record succeeded and failed at the same time.
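
The retry behaviour described in point 1 can be sketched as a polling loop that treats every unexpected response as "not propagated yet" and gives up only when the time limit is reached. This is a sketch of the expected behaviour, not Lego's actual code; `wait_for`, `check`, and the injectable `sleep`/`clock` parameters are hypothetical names chosen for illustration.

```python
import time


def wait_for(check, timeout, interval, sleep=time.sleep, clock=time.monotonic):
    """Call check() until it succeeds or `timeout` seconds have passed.

    check() returns (done, message). A (False, ...) result or an exception
    counts as "not propagated yet" and is retried; only the deadline ends
    the loop with a failure.
    """
    deadline = clock() + timeout
    last_error = "no attempt made"
    while True:
        try:
            done, message = check()
            if done:
                return True, message
            last_error = message
        except Exception as exc:  # e.g. a transient lookup failure
            last_error = str(exc)
        if clock() >= deadline:
            return False, "Time limit exceeded. Last error: " + last_error
        sleep(interval)
```

The `timeout` and `interval` arguments play the role of a propagation timeout and a polling interval: a larger timeout tolerates slow providers, and the interval controls how often the nameservers are queried.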
@steadicat
Author

Ok I figured out what might be triggering these issues. I had a CNAME wildcard record pointing to ghs.googlehosted.com. This might explain why Lego was using ns*.google.com as the domain server to check. Seems like there's a bug in there somewhere. Why not use the SOA or NS record of the naked domain?
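
To illustrate why a wildcard CNAME could matter here, the toy model below mimics the standard wildcard rule: a `*.parent` record answers for any name under `parent` that has no records of its own. It is a deliberately simplified, in-memory sketch (no real DNS queries, no closest-encloser logic), and it does not claim to reproduce Lego's lookup code; `lookup`, `resolve_txt`, and the dictionary layout are assumptions made for this example.

```python
def lookup(zone, name):
    """Return the records for `name`, falling back to a wildcard owner.

    Simplified wildcard rule: `*.parent` is consulted only when `name`
    itself has no records at all.
    """
    if name in zone:
        return zone[name]
    labels = name.split(".")
    for i in range(1, len(labels)):
        wildcard = "*." + ".".join(labels[i:])
        if wildcard in zone:
            return zone[wildcard]
    return {}


def resolve_txt(zone, name, max_hops=8):
    """Follow CNAMEs starting at `name`; return (final_name, txt_values)."""
    for _ in range(max_hops):
        records = lookup(zone, name)
        if "CNAME" in records and "TXT" not in records:
            name = records["CNAME"]  # the query moves to the CNAME target
            continue
        return name, records.get("TXT", [])
    raise RuntimeError("CNAME chain too long")


# Toy data: only the wildcard from this report exists at first.
zone = {"*.bevelapp.com.": {"CNAME": "ghs.googlehosted.com."}}
before = resolve_txt(zone, "_acme-challenge.bevelapp.com.")

# Once the explicit TXT record is visible, the wildcard no longer applies.
zone["_acme-challenge.bevelapp.com."] = {"TXT": ["placeholder"]}
after = resolve_txt(zone, "_acme-challenge.bevelapp.com.")
```

In this model, `before` ends up at `ghs.googlehosted.com.` with no TXT values, while `after` stays on `_acme-challenge.bevelapp.com.`. A checker that sees the first answer would go on to query the nameservers of the CNAME target's zone, which would be consistent with the `ns*.google.com` names appearing in the error.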

@xenolf
Member

xenolf commented Jul 10, 2016

Hey @steadicat !
We are not using the SOA or NS records because boulder doesn't do that either. We are trying to emulate a recursive lookup so that we check propagation the way boulder would see it.
I will have a look at the manual DNS solver though. This output really is confusing.

@lenovouser
Contributor

@xenolf I am getting this on the latest binary release and also when I compile it myself from the master branch. I am using CloudFlare as my DNS provider and am not using a CNAME entry for any of the domains that I am requesting certificates for.

@janeczku
Contributor

@lenovouser Can you open a separate ticket for the CloudFlare provider and paste the error output and any other observations?

@jipperinbham
Contributor

fwiw, CloudFlare considers a 5-minute propagation delay to be "within reason", so it may be a good idea to adjust the timeout in the CloudFlare provider accordingly.

@ldez
Member

ldez commented Nov 2, 2018

Since 1.1.0 you can change the propagation timeout and the polling interval.

@ldez ldez closed this as completed Nov 2, 2018