Query DNS to Determine Apex Domains #134

Merged
merged 5 commits into from
Jun 24, 2021
Contributor

@jriggins jriggins commented Jun 16, 2021

From https://datatracker.ietf.org/doc/html/rfc7719#section-6 (with emphasis on the last sentence).

Origin:

      (a) "The domain name that appears at the top of a zone (just below
      the cut that separates the zone from its parent).  The name of the
      zone is the same as the name of the domain at the zone's origin."
      (Quoted from [RFC2181], Section 6.)  These days, this sense of
      "origin" and "apex" (defined below) are often used
      interchangeably.

      (b) The domain name within which a given relative domain name
      appears in zone files.  Generally seen in the context of
      "$ORIGIN", which is a control entry defined in [RFC1035],
      Section 5.1, as part of the master file format.  For example, if
      the $ORIGIN is set to "example.org.", then a master file line for
      "www" is in fact an entry for "www.example.org.".

   Apex:  The point in the tree at an owner of an SOA and corresponding
      authoritative NS RRset.  This is also called the "zone apex".
      [RFC4033] defines it as "the name at the child's side of a zone
      cut".  The "apex" can usefully be thought of as a data-theoretic
      description of a tree structure, and "origin" is the name of the
      same concept when it is implemented in zone files.  The
      distinction is not always maintained in use, however, and one can
      find uses that conflict subtly with this definition.  [RFC1034]
      uses the term "top node of the zone" as a synonym of "apex", but
      that term is not widely used.  These days, the first sense of
      "origin" (above) and "apex" are often used interchangeably.

Based on the above, I believe we should treat DNS as the source of truth for determining apex domains. We currently use the Public Suffix gem's parser for this, but it fails in some cases.

Taken from

# PublicSuffix.domain pulls out the apex-level domain name.
# E.g. PublicSuffix.domain("techblog.netflix.com") # => "netflix.com"
# It's aware of multi-step top-level domain names:
# E.g. PublicSuffix.domain("blog.digital.gov.uk") # => "digital.gov.uk"
# For apex-level domain names, DNS providers do not support CNAME records.

However, techblog.netflix.com is actually an apex domain!

dig techblog.netflix.com ns +qr techblog.netflix.com soa +noqr +nostats

; <<>> DiG 9.10.6 <<>> techblog.netflix.com ns +qr techblog.netflix.com soa +noqr +nostats
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63963
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;techblog.netflix.com.          IN      NS

;; ANSWER SECTION:
techblog.netflix.com.   32148   IN      NS      pdns154.ultradns.biz.
techblog.netflix.com.   32148   IN      NS      pdns154.ultradns.com.
techblog.netflix.com.   32148   IN      NS      pdns154.ultradns.net.
techblog.netflix.com.   32148   IN      NS      pdns154.ultradns.org.

;; Query time: 9 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Wed Jun 16 10:57:27 CDT 2021
;; MSG SIZE  rcvd: 254

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 18480
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;techblog.netflix.com.          IN      SOA

;; ANSWER SECTION:
techblog.netflix.com.   44872   IN      SOA     pdns154.ultradns.com. nicaadmin.netflix.com. 2017042349 86400 86400 86400 86400

But the current logic doesn't think so:

[4] pry(main)> host = "techblog.netflix.com"
=> "techblog.netflix.com"
[5] pry(main)> unicode_host = Addressable::IDNA.to_unicode(host)
PublicSuffix.domain(unicode_host, :default_rule => nil, :ignore_private => true) == unicode_host

=> false

This PR attempts to make Domain#apex_domain? return true in such cases by consulting DNS records for confirmation.
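As a rough illustration of the idea (not the code in this PR), a DNS-based check could look like the sketch below. It follows the RFC 7719 definition quoted above: a name is treated as a zone apex when it owns an SOA record and an NS RRset. The method name `zone_apex?` and the `resolver:` keyword are hypothetical; only `Resolv::DNS#getresources` and the `Resolv::DNS::Resource::IN` record classes come from Ruby's standard library.

```ruby
require "resolv"

# Hypothetical helper, not part of the gem: returns true when `host` owns both
# an SOA record and at least one NS record, i.e. it looks like a zone apex.
# `resolver` is any object responding to #getresources(name, type), such as
# Resolv::DNS; passing it in allows the check to run against a stub.
def zone_apex?(host, resolver: Resolv::DNS.new)
  soa = resolver.getresources(host, Resolv::DNS::Resource::IN::SOA)
  ns  = resolver.getresources(host, Resolv::DNS::Resource::IN::NS)
  !soa.empty? && !ns.empty?
end
```

Calling `zone_apex?("techblog.netflix.com")` with the default resolver performs live lookups, so the result depends on the resolver in use and on the state of DNS at the time of the call.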

I have a temporary child zone set up at child.jr4legacy.com, which is a child of the jr4legacy.com zone but is also a zone apex. I have set up a temporary Pages site at https://child.jr4legacy.com.

When running check:

Before:

bundle exec ./script/check child.jr4legacy.com
host: child.jr4legacy.com
uri: http://child.jr4legacy.com/
nameservers: :default
dns_resolves?: true
proxied?: false
cloudflare_ip?: false
fastly_ip?: false
old_ip_address?: false
a_record?: true
cname_record?: false
mx_records_present?: false
valid_domain?: true
apex_domain?: false
should_be_a_record?: false
cname_to_github_user_domain?: false
cname_to_pages_dot_github_dot_com?: false
cname_to_fastly?: false
pointed_to_github_pages_ip?: true
non_github_pages_ip_present?: false
pages_domain?: false
served_by_pages?: true
valid?: false
reason: Your site's DNS settings are using a custom subdomain, child.jr4legacy.com,
  that's not set up with a correct CNAME record. We recommend you set this CNAME record
  to point at [YOUR USERNAME].github.io. For more information, see https://help.github.com/articles/setting-up-a-custom-domain-with-github-pages/.
  (InvalidCNAMEError)
https?: true
enforces_https?: false
https_error:
https_eligible?: true
caa_error:
dns_zone_soa?: true

After:

bundle exec ./script/check child.jr4legacy.com
host: child.jr4legacy.com
uri: http://child.jr4legacy.com/
nameservers: :default
dns_resolves?: true
proxied?: false
cloudflare_ip?: false
fastly_ip?: false
old_ip_address?: false
a_record?: true
cname_record?: false
mx_records_present?: false
valid_domain?: true
apex_domain?: true
should_be_a_record?: true
cname_to_github_user_domain?: false
cname_to_pages_dot_github_dot_com?: false
cname_to_fastly?: false
pointed_to_github_pages_ip?: true
non_github_pages_ip_present?: false
pages_domain?: false
served_by_pages?: true
valid?: true
reason:
https?: true
enforces_https?: false
https_error:
https_eligible?: true
caa_error:
dns_zone_soa?: true
dns_zone_ns?: true

@jriggins jriggins changed the title from "Jriggins/zone apex" to "Query DNS to Determine Apex Domains" Jun 16, 2021
@jriggins jriggins force-pushed the jriggins/zone-apex branch from be55a21 to 59fd3c1 June 16, 2021 16:47
@jriggins jriggins force-pushed the jriggins/zone-apex branch from 59fd3c1 to 93499f6 June 16, 2021 17:03

# PublicSuffix.domain pulls out the apex-level domain name.
# E.g. PublicSuffix.domain("techblog.netflix.com") # => "netflix.com"
# It's aware of multi-step top-level domain names:
# E.g. PublicSuffix.domain("blog.digital.gov.uk") # => "digital.gov.uk"
# For apex-level domain names, DNS providers do not support CNAME records.
#
# TODO: Should we even use this here vs allowing DNS to be source of truth?
Contributor

I would keep it as a fallback mechanism.

Kind of in line with https://stackoverflow.com/a/16395268.
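One way to read the fallback suggestion is sketched below, purely as an assumption about how the two checks could be combined. The method name `apex_domain_with_fallback?` and the `dns_check:` and `suffix_check:` callables are hypothetical stand-ins for a DNS lookup and a Public Suffix comparison, so the sketch needs no gems.

```ruby
# Hypothetical sketch: trust DNS when it confirms a zone apex, otherwise defer
# to the suffix-based answer. Both checks are callables taking a host string
# and returning true or false.
def apex_domain_with_fallback?(host, dns_check:, suffix_check:)
  dns_says_apex =
    begin
      dns_check.call(host)
    rescue StandardError
      false # a failed lookup is treated as "unknown", so the fallback decides
    end
  dns_says_apex || suffix_check.call(host)
end
```

With this shape, a DNS outage degrades to the previous suffix-only behavior instead of raising.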

Contributor

@yoannchaudet yoannchaudet left a comment

@jriggins It's looking good to me and I think that's a good change. Semantically it is valid.

@jriggins jriggins self-assigned this Jun 16, 2021
@jriggins jriggins marked this pull request as ready for review June 16, 2021 23:33
@jriggins jriggins requested a review from yoannchaudet June 16, 2021 23:48
@jriggins jriggins merged commit 84fc565 into master Jun 24, 2021
@jriggins jriggins deleted the jriggins/zone-apex branch June 24, 2021 18:41
3 participants