Refactor DNS check #116
* Gets a list of all authoritative nameservers by looking up the NS RRs for the root domain (zone apex)
* Verifies that the expected TXT record exists on all nameservers before sending off the challenge to the ACME server
// checkDnsPropagation checks if the expected TXT record has been propagated to
// all authoritative nameservers. If not it waits and retries for some time.
func checkDnsPropagation(domain, fqdn, value string) error {
	authoritativeNss, err := lookupNameservers(toFqdn(domain))
I think this should be lookupNameservers(toFqdn(fqdn)) instead.
The discussion in a previous thread still applies to the current commit. I'm commenting again so it appears here. [ cc: @janeczku ]
The validation domain (fqdn) and the domain are owned by the same DNS authority, hence they share the same set of nameservers. We use the domain to look up the authoritative nameservers because we expect the domain to exist, while the fqdn record may not yet have propagated through DNS.
It is legal to delegate the validation domain to a different set of nameservers, in which case lookupNameservers(toFqdn(domain)) and lookupNameservers(toFqdn(fqdn)) will have different results.
It is also legal to complete a DNS-01 challenge by publishing a TXT record for fqdn even if the public authoritative nameservers have no records at all for domain (e.g. split horizon DNS), as I mentioned in a second thread on the previous PR. It's not correct to require domain to exist.
I disagree. The ACME spec could not be clearer:
"The client constructs the validation domain name by prepending the label "_acme-challenge" to the domain name being validated, then provisions a TXT record with the digest value under that name."
Where does it tell you to delegate the validation domain to another zone?
Also, doing so would undermine the whole objective of the DNS challenge, which is to establish proof of ownership of the certificate domain by the client.
"The client must demonstrate to the server both (1) that it holds the private key of the account key pair, and (2) that it has authority over the identifier being claimed."
If the validation domain is by itself a root domain (delegated to a different zone), then by provisioning a TXT record under it, you do not prove authority over the (parent) certificate domain, which may be owned by a third party. (e.g. _acme-challenge.co.uk. -> you, but co.uk. -> ns1.nic.uk)
Lastly, why are we even having this academic discussion? Lego solves DNS challenge by provisioning a TXT record under the certificate domain. No zone delegation is happening, which is why we don't need to handle such case in the DNS check (irrespective of whether that would be technically legal).
It is also legal to complete a DNS-01 challenge by publishing a TXT record for fqdn even if the public authoritative nameservers have no records at all for domain (e.g. split horizon DNS), as I mentioned in a second thread on the previous PR. It's not correct to require domain to exist.
We do not require the certificate domain to exist. E.g., if the certificate domain www.example.com has no records (NXDOMAIN), the DNS check will still pass.
The ACME spec could not be clearer:
"The client constructs the validation domain name by prepending the label "_acme-challenge" to the domain name being validated, then provisions a TXT record with the digest value under that name."
Exactly. You don't provision records on the domain name being validated, you provision records for the validation domain name. The DNS-01 section doesn't concern the domain name at all – only the validation domain name, fqdn in this context. DNS-01 doesn't make any requirements of the domain name, only of the validation domain name.
Where does it tell you to delegate the validation domain to another zone?
Where does it tell you I can't delegate it?
Also, doing so would undermine the whole objective of the DNS challenge, which is to establish proof of ownership of the certificate domain by the client.
The DNS-01 challenge does not require asserting control over the certificate domain. If that were the goal, it could require the TXT record to be published on the certificate domain instead. (It's already specified to ignore non-matching responses, so there's no conflict even if you've already got TXT records on that name.) Why use a separate validation domain name at all?
There are legitimate use cases for delegating _acme-challenge – as I mentioned in the previous thread, the situation where it is impractical to modify bar.com from a cron job because of grumpy change control officers (and the compliance reasons behind their grumpiness). It is desirable in such situations to delegate _acme-challenge.foo.bar.com out of the bar.com zone.
If the validation domain is by itself a root domain (delegated to a different zone), then by provisioning a TXT record under it, you do not prove authority over the (parent) certificate domain, which may be owned by a third party. (e.g. _acme-challenge.co.uk. -> you, but co.uk. -> ns1.nic.uk)
DNS-01 does not concern itself with the certificate domain name, it concerns itself with the validation domain name. If you can control the validation domain name, regardless of which zone or nameservers you use to do it, that is 100% sufficient to satisfy a DNS-01 challenge. Delegation is a part of DNS, and I maintain that it is useful to delegate ACME challenges.
As for _acme-challenge.co.uk, please see the complex history of underscores in DNS. (Or just try registering that domain name.) It is not an accident that _acme-challenge contains an underscore.
Lastly, why are we even having this academic discussion? Lego solves DNS challenge by provisioning a TXT record under the certificate domain. No zone delegation is happening, which is why we don't need to handle such case in the DNS check (irrespective of whether that would be technically legal).
It provisions a TXT record under the validation domain. RFC2136_ZONE=_acme-challenge.foo.bar.com lego --dns=rfc2136 can easily create records in a delegated zone, and of course there's --dns=manual.
We're having this discussion because delegation is a situation where this check would fail while boulder would succeed, and I consider that a bug in this check.
Where does it tell you to delegate the validation domain to another zone?
Where does it tell you I can't delegate it?
A TXT record cannot coexist with an NS record for the same name if the name is a subdomain and not the root domain. Hence, by telling you to create a TXT record under the validation domain, the spec implies that you should not create an NS record there, which your delegation scenario would require.
But again, this discussion is out of scope for this PR as Lego's DNS providers don't delegate the validation domain anywhere.
A TXT record cannot coexist with an NS record for the same name if the name is a subdomain and not the root domain.
It can and does. (Edit: Apologies, I read your statement wrong. You're correct that it can't exist as a sibling to the IN NS delegation in the parent zone, but as I illustrate, that's not at issue.)
Scenario: issuing a certificate for is-delegated.willglynn.com.
willglynn.com IN NS ns1.willglynn.com, which has willglynn.com IN SOA ns1.willglynn.com. Within the willglynn.com zone, is-delegated.willglynn.com does not exist, but _acme-challenge.is-delegated.willglynn.com IN NS ns1.lerfjhax.com.
ns1.lerfjhax.com has a zone _acme-challenge.is-delegated.willglynn.com IN SOA ns1.lerfjhax.com. Within that zone, _acme-challenge.is-delegated.willglynn.com IN NS ns1.lerfjhax.com, agreeing with the delegation, and there exists an _acme-challenge.is-delegated.willglynn.com IN TXT record as well.
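The scenario can be modeled without any network access. The sketch below is only an illustration: `delegations` is a hypothetical in-memory stand-in for real NS lookups, and `closestNameservers` is a made-up helper that walks from a name toward the root until it finds a delegation. It shows why starting the walk at the certificate domain and starting it at the validation domain yield different nameserver sets.

```go
package main

import (
	"fmt"
	"strings"
)

// delegations maps a zone name to its nameservers. It mirrors the scenario
// described above and stands in for real NS queries (hypothetical data model).
var delegations = map[string][]string{
	"willglynn.com.": {"ns1.willglynn.com."},
	"_acme-challenge.is-delegated.willglynn.com.": {"ns1.lerfjhax.com."},
}

// closestNameservers walks from name toward the root, dropping one label per
// step, and returns the first zone that has a delegation entry.
func closestNameservers(name string) (zone string, servers []string) {
	for name != "" {
		if ns, ok := delegations[name]; ok {
			return name, ns
		}
		i := strings.Index(name, ".")
		if i < 0 {
			break
		}
		name = name[i+1:]
	}
	return "", nil
}

func main() {
	domain := "is-delegated.willglynn.com."
	fqdn := "_acme-challenge." + domain

	zone, ns := closestNameservers(domain)
	fmt.Println("domain:", zone, ns)

	zone, ns = closestNameservers(fqdn)
	fmt.Println("fqdn:  ", zone, ns)
}
```

In this model the walk from `domain` ends at willglynn.com., while the walk from `fqdn` stops one step earlier at the delegated zone, so a check that only asks the first set of servers never sees the TXT record.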
Asking lookupNameservers(toFqdn(domain)) about the validation domain returns a referral, since ns1.willglynn.com knows it is not an authority for the validation domain:
$ dig txt _acme-challenge.is-delegated.willglynn.com @ns1.willglynn.com
; <<>> DiG 9.8.3-P1 <<>> txt _acme-challenge.is-delegated.willglynn.com @ns1.willglynn.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 988
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;_acme-challenge.is-delegated.willglynn.com. IN TXT
;; AUTHORITY SECTION:
_acme-challenge.is-delegated.willglynn.com. 300 IN NS ns1.lerfjhax.com.
;; Query time: 62 msec
;; SERVER: 205.251.198.58#53(205.251.198.58)
;; WHEN: Tue Feb 9 13:25:06 2016
;; MSG SIZE rcvd: 87
Asking the nameserver which is actually an authority for the validation domain name – i.e. lookupNameservers(toFqdn(fqdn)) – returns the TXT record:
$ dig txt _acme-challenge.is-delegated.willglynn.com @ns1.lerfjhax.com
; <<>> DiG 9.8.3-P1 <<>> txt _acme-challenge.is-delegated.willglynn.com @ns1.lerfjhax.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 24113
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 2
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;_acme-challenge.is-delegated.willglynn.com. IN TXT
;; ANSWER SECTION:
_acme-challenge.is-delegated.willglynn.com. 60 IN TXT "1Mvyj8Fkgch0YMz0PyTk83hrPIHrYgR-pbvN3KuevTw"
;; AUTHORITY SECTION:
_acme-challenge.is-delegated.willglynn.com. 259200 IN NS ns1.lerfjhax.com.
;; ADDITIONAL SECTION:
ns1.lerfjhax.com. 86400 IN A 208.79.90.50
ns1.lerfjhax.com. 86400 IN AAAA 2607:f2f8:a428::2
;; Query time: 76 msec
;; SERVER: 2607:f2f8:a428::2#53(2607:f2f8:a428::2)
;; WHEN: Tue Feb 9 13:25:08 2016
;; MSG SIZE rcvd: 187
And yes, this works:
2016/02/09 13:25:14 [INFO][is-delegated.willglynn.com] The server validated our request
2016/02/09 13:25:14 [INFO] acme: You can now remove this TXT record from your DNS zone:
2016/02/09 13:25:14 [INFO] acme: _acme-challenge.is-delegated.willglynn.com. 120 IN TXT "..."
2016/02/09 13:25:14 [INFO][is-delegated.willglynn.com] acme: Validations succeeded; requesting certificates
2016/02/09 13:25:14 [INFO] acme: Requesting issuer cert from https://acme-staging.api.letsencrypt.org/acme/issuer-cert
2016/02/09 13:25:14 [INFO][is-delegated.willglynn.com] Server responded with a certificate.
But again, this discussion is out of scope for this PR as Lego's DNS providers don't delegate the validation domain anywhere.
…unless you delegate the validation domain ahead of time and only need lego to create the TXT record in the delegated zone. As I said, RFC2136_ZONE=_acme-challenge.foo.bar.com lego --dns=rfc2136 can do that right now.
…unless you delegate the validation domain ahead of time and only need lego to create the TXT record in the delegated zone. As I said, RFC2136_ZONE=_acme-challenge.foo.bar.com lego --dns=rfc2136 can do that right now.
You are right, that's a valid (if edgy) scenario.
I like this strategy. Unfortunately, it is insufficient to mimic the behavior of typical recursive resolvers:
This particular example is rather contrived, but still, this approach isn't 100% equivalent to how
The example is not only contrived. You're breaking the ACME spec when you create a CNAME record instead of a TXT record at the validation domain.
The spec describes validation as:
I read that as "query for
If the spec intends to require the TXT record to be provisioned directly on the validation name, we should clarify the language in the spec and fix
I agree, there is something to be wished for in terms of clarity of the spec. E.g. the description as it relates to the client behavior is very specific:
I don't see how one could read that as "... the client may alternatively create a CNAME record under the validation domain and a TXT record in the target zone."
But apparently the consensus indeed seems to be that CNAME'ing the validation domain is legal: https://www.ietf.org/mail-archive/web/acme/current/msg00827.html
Anyhow, seeing that currently none of the Lego DNS providers supports this edge case, I think there is a point to be made that we can always factor in support for this in the DNS check later when necessary.
I read the text as normative – suggestions, really – and then worked backwards from how the validation section is written.
Well, the We could accommodate this case with a flag that says "instead of checking and proceeding automatically, wait for manual confirmation", or by making the automatic check recursive (i.e. following CNAMEs up to a limited depth in the pursuit of TXT records).
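The second option, following CNAMEs up to a limited depth, could look something like the sketch below. It is an assumption-laden illustration rather than the check that was eventually merged: `record` and `resolveTXT` are hypothetical, and records come from an in-memory map instead of DNS queries.

```go
package main

import (
	"errors"
	"fmt"
)

// record is a hypothetical, simplified view of what a name resolves to:
// either a CNAME target or a set of TXT values.
type record struct {
	cname string
	txt   []string
}

// resolveTXT follows CNAMEs starting at name until it reaches a name that is
// not a CNAME, allowing at most maxDepth hops. The depth limit also protects
// against CNAME loops.
func resolveTXT(records map[string]record, name string, maxDepth int) ([]string, error) {
	for depth := 0; depth <= maxDepth; depth++ {
		rec, ok := records[name]
		if !ok {
			return nil, fmt.Errorf("no records for %s", name)
		}
		if rec.cname == "" {
			return rec.txt, nil
		}
		name = rec.cname
	}
	return nil, errors.New("CNAME chain too deep")
}

func main() {
	records := map[string]record{
		"_acme-challenge.foo.bar.com.": {cname: "foo.validation.example.net."},
		"foo.validation.example.net.":  {txt: []string{"token"}},
	}
	values, err := resolveTXT(records, "_acme-challenge.foo.bar.com.", 5)
	fmt.Println(values, err)
}
```

A fixed hop limit is the simplest guard; a real check would also need to decide which nameservers to ask at each hop, since the CNAME target may live in a different zone.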
Will, I have pushed a commit to the PR that adds support for the scenarios discussed yesterday:
I tested
Please let me know if you are 👍 with the PR as it is so we can move forward and ask @xenolf to land this.
👍 I think this is good to merge. It works better than the current code and it addresses the two cases I had in mind as clear sources of trouble. This check still makes some assumptions that might be problematic in conceivable situations, like that the set of nameservers doesn't change throughout the operation, but as far as I can tell so far, such issues will shake themselves out just by retrying the command. That's a big improvement over a check that will never succeed and which requires recompiling I've been hacking on a way to test code like this (DNS fixtures modeling a constellation of authoritative servers which you can update on command), but it's not ready yet, and in any case seems like the subject of a separate PR.
😋 🎉 Yipeeeeh
Looking forward to checking that out once it's ready.
Replaces #96