Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor DNS check #116

Merged
merged 2 commits into from
Feb 11, 2016
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
151 changes: 113 additions & 38 deletions acme/dns_challenge.go
Original file line number Diff line number Diff line change
Expand Up @@ -6,17 +6,18 @@ import (
"errors"
"fmt"
"log"
"net"
"strings"
"time"

"github.com/miekg/dns"
)

type preCheckDNSFunc func(domain, fqdn string) bool
type preCheckDNSFunc func(domain, fqdn, value string) error

var preCheckDNS preCheckDNSFunc = checkDNS
var preCheckDNS preCheckDNSFunc = checkDnsPropagation

var preCheckDNSFallbackCount = 5
var recursionMaxDepth = 10

// DNS01Record returns a DNS record which will fulfill the `dns-01` challenge
func DNS01Record(domain, keyAuth string) (fqdn string, value string, ttl int) {
Expand Down Expand Up @@ -60,50 +61,121 @@ func (s *dnsChallenge) Solve(chlng challenge, domain string) error {
}
}()

fqdn, _, _ := DNS01Record(domain, keyAuth)
fqdn, value, _ := DNS01Record(domain, keyAuth)

preCheckDNS(domain, fqdn)
logf("[INFO][%s] Checking DNS record propagation...", domain)

if err = preCheckDNS(domain, fqdn, value); err != nil {
return err
}

return s.validate(s.jws, domain, chlng.URI, challenge{Resource: "challenge", Type: chlng.Type, Token: chlng.Token, KeyAuthorization: keyAuth})
}

func checkDNS(domain, fqdn string) bool {
// check if the expected DNS entry was created. If not wait for some time and try again.
m := new(dns.Msg)
m.SetQuestion(domain+".", dns.TypeSOA)
c := new(dns.Client)
in, _, err := c.Exchange(m, "google-public-dns-a.google.com:53")
// checkDnsPropagation checks if the expected TXT record has been propagated to
// all authoritative nameservers. If not it waits and retries for some time.
func checkDnsPropagation(domain, fqdn, value string) error {
authoritativeNss, err := lookupNameservers(toFqdn(domain))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be lookupNameservers(toFqdn(fqdn)) instead.

The discussion in a previous thread still applies to the current commit. I'm commenting again so it appears here. [ cc: @janeczku ]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Validation domain (fqdn) and domain are owned by the same DNS authority, hence they share the same set of nameservers. The reason we are using the domain to lookup the authoritative NS is that we expect the domain to exist, while the fqdn record may not yet have propagated in the DNS system.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is legal to delegate the validation domain to a different set of nameservers, in which case lookupNameservers(toFqdn(domain)) and lookupNameservers(toFqdn(fqdn)) will have different results.

It is also legal to complete a DNS-01 challenge by publishing a TXT record for fqdn even if the public authoritative nameservers have no records at all for domain (e.g. split horizon DNS), as I mentioned in a second thread on the previous PR. It's not correct to require domain to exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree. The ACME spec could not be more clearer:

The client constructs the validation domain name by prepending the label
"_acme-challenge" to the domain name being validated, then provisions a TXT
record with the digest value under that name."

Where does it tell you to delegate the validation domain to another zone?

Also, doing so would undermine the whole objective of the DNS challenge, which is to establish proof of ownership of the certificate domain by the client.

"The client must demonstrate to the server both (1) that it holds the private key of the account key pair, and (2) that it has authority over the identifier being claimed."

If the validation domain is by itself a root domain (delegated to a different zone), then by provisioning a TXT record under it, you do not prove authority over the (parent) certificate domain, which may be owned by a third party. (e.g. _acme-challenge.co.uk. -> you, but co.uk. -> ns1.nic.uk)

Lastly, why are we even having this academic discussion? Lego solves DNS challenge by provisioning a TXT record under the certificate domain. No zone delegation is happening, which is why we don't need to handle such case in the DNS check (irrespective of whether that would be technically legal).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is also legal to complete a DNS-01 challenge by publishing a TXT record for fqdn even if the public authoritative nameservers have no records at all for domain (e.g. split horizon DNS), as I mentioned in a second thread on the previous PR. It's not correct to require domain to exist.

We do not require the certificate domain to exist. E.g, if the certificate domain www.example.com has no records (NXDOMAIN) the DNS check will still pass.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ACME spec could not be more clearer:

The client constructs the validation domain name by prepending the label "_acme-challenge" to the domain name being validated, then provisions a TXT record with the digest value under that name."

Exactly. You don't provision records on the domain name being validated, you provision records for the validation domain name. The DNS-01 section doesn't concern the domain name at all – only the validation domain name, fqdn in this context. DNS-01 doesn't make any requirements of the domain name, only of the validation domain name.

Where does it tell you to delegate the validation domain to another zone?

Where does it tell you I can't delegate it?

Also, doing so would undermine the whole objective of the DNS challenge, which is to establish proof of ownership of the certificate domain by the client.

The DNS-01 challenge does not require asserting control over the certificate domain. If that were the goal, it could require the TXT record to be published on the certificate domain instead. (It's already specified to ignore non-matching responses, so there's no conflict even if you've already got TXT records on that name.) Why use a separate validation domain name at all?

There are legitimate use cases for delegating _acme-challenge – as I mentioned in the previous thread, the situation where it is impractical to modify bar.com from a cron job because of grumpy change control officers (and the compliance reasons behind their grumpiness). It is desirable in such situations to delegate _acme-challenge.foo.bar.com out of the bar.com zone.

If the validation domain is by itself a root domain (delegated to a different zone), then by provisioning a TXT record under it, you do not prove authority over the (parent) certificate domain, which may be owned by a third party. (e.g. _acme-challenge.co.uk. -> you, but co.uk. -> ns1.nic.uk)

DNS-01 does not concern itself with the certificate domain name, it concerns itself with the validation domain name. If you can control the validation domain name, regardless of which zone or nameservers you use to do it, that is 100% sufficient to satisfy a DNS-01 challenge. Delegation is a part of DNS, and I maintain that it is useful to delegate ACME challenges.

As for _acme-challenge.co.uk, please see the complex history of underscores in DNS. (Or just try registering that domain name.) It is not an accident that _acme-challenge contains an underscore.

Lastly, why are we even having this academic discussion? Lego solves DNS challenge by provisioning a TXT record under the certificate domain. No zone delegation is happening, which is why we don't need to handle such case in the DNS check (irrespective of whether that would be technically legal).

It provisions a TXT record under the validation domain. RFC2136_ZONE=_acme-challenge.foo.bar.com lego --dns=rfc2136 can easily create records in a delegated zone, and of course there's --dns=manual.

We're having this discussion because delegation is a situation where this check would fail while boulder would succeed, and I consider that a bug in this check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does it tell you to delegate the validation domain to another zone?

Where does it tell you I can't delegate it?

A TXT record can not co-exist with a NS record for the same name if the name is a subdomain and not the root domain. Hence the spec describing you to create a TXT record under the validation domain by implication means that you should not create a NS record there which would be necessary for your scenario of delegation.

But again, this discussion is out of scope for this PR as Lego's DNS providers don't delegate the validation domain anywhere.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A TXT record can not co-exist with a NS record for the same name if the name is a subdomain and not the root domain.

It can and does. (Edit: Apologies, I read your statement wrong. You're correct that it can't exist as a sibling to the IN NS delegation in the parent zone, but as I illustrate, that's not at issue.)

Scenario: issuing a certificate for is-delegated.willglynn.com.

willglynn.com IN NS ns1.willglynn.com, which has willglynn.com IN SOA ns1.willglynn.com. Within the willglynn.com zone, is-delegated.willglynn.com does not exist, but _acme-challenge.is-delegated.willglynn.com IN NS ns1.lerfjhax.com.

ns1.lerfjhax.com has a zone _acme-challenge.is-delegated.willglynn.com IN SOA ns1.lerfjhax.com. Within that zone, _acme-challenge.is-delegated.willglynn.com IN NS ns1.lerfjhax.com, agreeing with the delegation, and there exists an _acme-challenge.is-delegated.willglynn.com IN TXT record as well.

Asking lookupNameservers(toFqdn(domain)) about the validation domain returns a referral, since ns1.willglynn.com knows it is not an authority for the validation domain:

$ dig txt _acme-challenge.is-delegated.willglynn.com @ns1.willglynn.com

; <<>> DiG 9.8.3-P1 <<>> txt _acme-challenge.is-delegated.willglynn.com @ns1.willglynn.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 988
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;_acme-challenge.is-delegated.willglynn.com. IN TXT

;; AUTHORITY SECTION:
_acme-challenge.is-delegated.willglynn.com. 300 IN NS ns1.lerfjhax.com.

;; Query time: 62 msec
;; SERVER: 205.251.198.58#53(205.251.198.58)
;; WHEN: Tue Feb  9 13:25:06 2016
;; MSG SIZE  rcvd: 87

Asking the nameserver which is actually an authority for the validation domain name – i.e. lookupNameservers(toFqdn(fqdn)) – returns the TXT record:

$ dig txt _acme-challenge.is-delegated.willglynn.com @ns1.lerfjhax.com

; <<>> DiG 9.8.3-P1 <<>> txt _acme-challenge.is-delegated.willglynn.com @ns1.lerfjhax.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 24113
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 2
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;_acme-challenge.is-delegated.willglynn.com. IN TXT

;; ANSWER SECTION:
_acme-challenge.is-delegated.willglynn.com. 60 IN TXT "1Mvyj8Fkgch0YMz0PyTk83hrPIHrYgR-pbvN3KuevTw"

;; AUTHORITY SECTION:
_acme-challenge.is-delegated.willglynn.com. 259200 IN NS ns1.lerfjhax.com.

;; ADDITIONAL SECTION:
ns1.lerfjhax.com.   86400   IN  A   208.79.90.50
ns1.lerfjhax.com.   86400   IN  AAAA    2607:f2f8:a428::2

;; Query time: 76 msec
;; SERVER: 2607:f2f8:a428::2#53(2607:f2f8:a428::2)
;; WHEN: Tue Feb  9 13:25:08 2016
;; MSG SIZE  rcvd: 187

And yes, this works:

2016/02/09 13:25:14 [INFO][is-delegated.willglynn.com] The server validated our request
2016/02/09 13:25:14 [INFO] acme: You can now remove this TXT record from your DNS zone:
2016/02/09 13:25:14 [INFO] acme: _acme-challenge.is-delegated.willglynn.com. 120 IN TXT "..."
2016/02/09 13:25:14 [INFO][is-delegated.willglynn.com] acme: Validations succeeded; requesting certificates
2016/02/09 13:25:14 [INFO] acme: Requesting issuer cert from https://acme-staging.api.letsencrypt.org/acme/issuer-cert
2016/02/09 13:25:14 [INFO][is-delegated.willglynn.com] Server responded with a certificate.

But again, this discussion is out of scope for this PR as Lego's DNS providers don't delegate the validation domain anywhere.

…unless you delegate the validation domain ahead of time and only need lego to create the TXT record in the delegated zone. As I said, RFC2136_ZONE=_acme-challenge.foo.bar.com lego --dns=rfc2136 can do that right now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

…unless you delegate the validation domain ahead of time and only need lego to create the TXT record in the delegated zone. As I said, RFC2136_ZONE=_acme-challenge.foo.bar.com lego --dns=rfc2136 can do that right now.

You are right, thats a valid (if edgy) scenario.

if err != nil {
return false
return err
}

var authorativeNS string
for _, answ := range in.Answer {
soa := answ.(*dns.SOA)
authorativeNS = soa.Ns
if err = waitFor(30, 2, func() (bool, error) {
return checkAuthoritativeNss(fqdn, value, authoritativeNss)
}); err != nil {
return err
}

fallbackCnt := 0
for fallbackCnt < preCheckDNSFallbackCount {
m.SetQuestion(fqdn, dns.TypeTXT)
in, _, err = c.Exchange(m, authorativeNS+":53")
return nil
}

// checkAuthoritativeNss queries each of the given nameservers for the expected TXT record.
func checkAuthoritativeNss(fqdn, value string, nameservers []string) (bool, error) {
for _, ns := range nameservers {
r, err := dnsQuery(fqdn, dns.TypeTXT, ns, false)
if err != nil {
return false
return false, err
}

if len(in.Answer) > 0 {
return true
if r.Rcode != dns.RcodeSuccess {
return false, fmt.Errorf("%s returned RCode %s", ns, dns.RcodeToString[r.Rcode])
}

fallbackCnt++
if fallbackCnt >= preCheckDNSFallbackCount {
return false
var found bool
for _, rr := range r.Answer {
if txt, ok := rr.(*dns.TXT); ok {
if strings.Join(txt.Txt, "") == value {
found = true
break
}
}
}

time.Sleep(time.Second * time.Duration(fallbackCnt))
if !found {
return false, fmt.Errorf("%s did not return the expected TXT record", ns)
}
}

return false
return true, nil
}

// dnsQuery sends a DNS query to the given nameserver.
func dnsQuery(fqdn string, rtype uint16, nameserver string, recursive bool) (in *dns.Msg, err error) {
m := new(dns.Msg)
m.SetQuestion(fqdn, rtype)
m.SetEdns0(4096, false)
if !recursive {
m.RecursionDesired = false
}

in, err = dns.Exchange(m, net.JoinHostPort(nameserver, "53"))
if err == dns.ErrTruncated {
tcp := &dns.Client{Net: "tcp"}
in, _, err = tcp.Exchange(m, nameserver)
}

return
}

// lookupNameservers returns the authoritative nameservers for the given domain name.
func lookupNameservers(fqdn string) ([]string, error) {
var err error
var r *dns.Msg
var authoritativeNss []string
resolver := "google-public-dns-a.google.com"

r, err = dnsQuery(fqdn, dns.TypeSOA, resolver, true)
if err != nil {
return nil, err
}

// If there is a SOA RR in the Answer section then fqdn is the root domain.
for _, rr := range r.Answer {
if soa, ok := rr.(*dns.SOA); ok {
r, err = dnsQuery(soa.Hdr.Name, dns.TypeNS, resolver, true)
if err != nil {
return nil, err
}

for _, rr := range r.Answer {
if ns, ok := rr.(*dns.NS); ok {
authoritativeNss = append(authoritativeNss, strings.ToLower(ns.Ns))
}
}

return authoritativeNss, nil
}
}

// Strip of the left most label to get the parent domain.
offset, _ := dns.NextLabel(fqdn, 0)
next := fqdn[offset:]
// Only the TLD label left. This should not happen if the domain DNS is healthy.
if dns.CountLabel(next) < 2 {
return nil, fmt.Errorf("Could not determine root domain")
}

return lookupNameservers(fqdn[offset:])
}

// toFqdn converts the name into a fqdn appending a trailing dot.
Expand All @@ -124,22 +196,25 @@ func unFqdn(name string) string {
return name
}

// waitFor polls the given function 'f', once per second, up to 'timeout' seconds.
func waitFor(timeout int, f func() (bool, error)) error {
start := time.Now().Second()
// waitFor polls the given function 'f', once every 'interval' seconds, up to 'timeout' seconds.
func waitFor(timeout, interval int, f func() (bool, error)) error {
var lastErr string
timeup := time.After(time.Duration(timeout) * time.Second)
for {
time.Sleep(1 * time.Second)

if delta := time.Now().Second() - start; delta >= timeout {
return fmt.Errorf("Time limit exceeded (%d seconds)", delta)
select {
case <-timeup:
return fmt.Errorf("Time limit exceeded. Last error: %s", lastErr)
default:
}

stop, err := f()
if err != nil {
return err
}
if stop {
return nil
}
if err != nil {
lastErr = err.Error()
}

time.Sleep(time.Duration(interval) * time.Second)
}
}
2 changes: 1 addition & 1 deletion acme/dns_challenge_route53.go
Original file line number Diff line number Diff line change
Expand Up @@ -68,7 +68,7 @@ func (r *DNSProviderRoute53) changeRecord(action, fqdn, value string, ttl int) e
return err
}

return waitFor(90, func() (bool, error) {
return waitFor(90, 5, func() (bool, error) {
status, err := r.client.GetChange(resp.ChangeInfo.ID)
if err != nil {
return false, err
Expand Down
141 changes: 138 additions & 3 deletions acme/dns_challenge_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -6,13 +6,75 @@ import (
"net/http"
"net/http/httptest"
"os"
"reflect"
"sort"
"strings"
"testing"
"time"
)

var lookupNameserversTestsOK = []struct {
fqdn string
nss []string
}{
{"books.google.com.ng.",
[]string{"ns1.google.com.", "ns2.google.com.", "ns3.google.com.", "ns4.google.com."},
},
{"www.google.com.",
[]string{"ns1.google.com.", "ns2.google.com.", "ns3.google.com.", "ns4.google.com."},
},
{"physics.georgetown.edu.",
[]string{"ns1.georgetown.edu.", "ns2.georgetown.edu.", "ns3.georgetown.edu."},
},
}

var lookupNameserversTestsErr = []struct {
fqdn string
error string
}{
// invalid tld
{"_null.n0n0.",
"Could not determine root domain",
},
// invalid domain
{"_null.com.",
"Could not determine root domain",
},
}

var checkAuthoritativeNssTests = []struct {
fqdn, value string
ns []string
ok bool
}{
// TXT RR w/ expected value
{"8.8.8.8.asn.routeviews.org.", "151698.8.8.024", []string{"asnums.routeviews.org."},
true,
},
// No TXT RR
{"ns1.google.com.", "", []string{"ns2.google.com."},
false,
},
}

var checkAuthoritativeNssTestsErr = []struct {
fqdn, value string
ns []string
error string
}{
// TXT RR /w unexpected value
{"8.8.8.8.asn.routeviews.org.", "fe01=", []string{"asnums.routeviews.org."},
"did not return the expected TXT record",
},
// No TXT RR
{"ns1.google.com.", "fe01=", []string{"ns2.google.com."},
"did not return the expected TXT record",
},
}

func TestDNSValidServerResponse(t *testing.T) {
preCheckDNS = func(domain, fqdn string) bool {
return true
preCheckDNS = func(domain, fqdn, value string) error {
return nil
}
privKey, _ := generatePrivateKey(rsakey, 512)

Expand All @@ -39,7 +101,80 @@ func TestDNSValidServerResponse(t *testing.T) {
}

func TestPreCheckDNS(t *testing.T) {
if !preCheckDNS("api.letsencrypt.org", "acme-staging.api.letsencrypt.org") {
err := preCheckDNS("api.letsencrypt.org", "acme-staging.api.letsencrypt.org", "fe01=")
if err != nil {
t.Errorf("preCheckDNS failed for acme-staging.api.letsencrypt.org")
}
}

func TestLookupNameserversOK(t *testing.T) {
for _, tt := range lookupNameserversTestsOK {
nss, err := lookupNameservers(tt.fqdn)
if err != nil {
t.Fatalf("#%s: got %q; want nil", tt.fqdn, err)
}

sort.Strings(nss)
sort.Strings(tt.nss)

if !reflect.DeepEqual(nss, tt.nss) {
t.Errorf("#%s: got %v; want %v", tt.fqdn, nss, tt.nss)
}
}
}

func TestLookupNameserversErr(t *testing.T) {
for _, tt := range lookupNameserversTestsErr {
_, err := lookupNameservers(tt.fqdn)
if err == nil {
t.Fatalf("#%s: expected %q (error); got <nil>", tt.fqdn, tt.error)
}

if !strings.Contains(err.Error(), tt.error) {
t.Errorf("#%s: expected %q (error); got %q", tt.fqdn, tt.error, err)
continue
}
}
}

func TestCheckAuthoritativeNss(t *testing.T) {
for _, tt := range checkAuthoritativeNssTests {
ok, _ := checkAuthoritativeNss(tt.fqdn, tt.value, tt.ns)
if ok != tt.ok {
t.Errorf("#%s: got %t; want %t", tt.fqdn, tt.ok)
}
}
}

func TestCheckAuthoritativeNssErr(t *testing.T) {
for _, tt := range checkAuthoritativeNssTestsErr {
_, err := checkAuthoritativeNss(tt.fqdn, tt.value, tt.ns)
if err == nil {
t.Fatalf("#%s: expected %q (error); got <nil>", tt.fqdn, tt.error)
}
if !strings.Contains(err.Error(), tt.error) {
t.Errorf("#%s: expected %q (error); got %q", tt.fqdn, tt.error, err)
continue
}
}
}

func TestWaitForTimeout(t *testing.T) {
c := make(chan error)
go func() {
err := waitFor(3, 1, func() (bool, error) {
return false, nil
})
c <- err
}()

timeout := time.After(4 * time.Second)
select {
case <-timeout:
t.Fatal("timeout exceeded")
case err := <-c:
if err == nil {
t.Errorf("expected timeout error; got <nil>", err)
}
}
}