Non leading underscore in Name::from_utf8 parsing fails #1904

Open
nrempel opened this issue Mar 10, 2023 · 10 comments
Labels
compliance (Not compliant to DNS standard operations), has workaround

Comments

@nrempel

nrempel commented Mar 10, 2023

Hello. I'm wondering where I can find more information about this particular parsing logic:

https://github.com/bluejekyll/trust-dns/blob/5492bdedba3479b480fcb904844568fd82d12500/crates/proto/src/rr/domain/name.rs#L495-L512

// Error, underscore in the end
assert!(Name::from_utf8("dis_allowed.example.com.").is_err());

Underscores are allowed at the beginning of a label but not anywhere else.

I couldn't find any mention of this in https://www.rfc-editor.org/rfc/rfc1035.html. Could anyone point me to the relevant rfc?

Thanks!
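
For readers following along, the rule described in this comment can be modelled in a few lines of Rust. This is only a sketch under assumptions, not the trust-dns implementation: `label_ok_strict` and `name_ok_strict` are hypothetical helpers, the "leading underscore only" rule is taken from the test assertion quoted above, and restricting the remaining characters to ASCII letters, digits and hyphens is a simplification made for illustration.

```rust
/// Hypothetical model of the strict rule discussed in this issue:
/// an underscore is tolerated only as the first character of a label.
fn label_ok_strict(label: &str) -> bool {
    // DNS labels are between 1 and 63 octets long.
    if label.is_empty() || label.len() > 63 {
        return false;
    }
    // Skip one leading underscore, if present (the SRV-style convention).
    let rest = label.strip_prefix('_').unwrap_or(label);
    // Assumption for this sketch: the remainder is letters, digits, hyphens.
    rest.chars().all(|c| c.is_ascii_alphanumeric() || c == '-')
}

/// Applies the label rule to every label of a dotted name.
fn name_ok_strict(name: &str) -> bool {
    // A trailing dot marks a fully qualified name; drop it before splitting.
    let name = name.strip_suffix('.').unwrap_or(name);
    !name.is_empty() && name.split('.').all(label_ok_strict)
}
```

Under this model, `name_ok_strict("_sip._tcp.example.com.")` is accepted while `name_ok_strict("dis_allowed.example.com.")` is rejected, which mirrors the assertion in the linked test.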

@cpu
Contributor

cpu commented Mar 10, 2023

In general my understanding is that underscores are forbidden for host names but may occur in domain names in other contexts as the DNS is general beyond host names. Notably there's a pattern of using a leading _ to distinguish from a host name (e.g. for SRV records). I suspect that's what this parsing logic is attempting to honour. I think the best reference for this scoping hack is https://www.rfc-editor.org/rfc/rfc8552#section-1.1 but would be curious if someone else can dig up better chapter and verse to cite :-)

@nrempel
Author

nrempel commented Mar 10, 2023

Thank you!

I was under the impression that underscores are forbidden in host names but all other labels allow them. The leading underscore seems like more of a convention than a hard requirement, but maybe I'm wrong?

For instance, Shopify now (newly?) requires creating a TXT record at shopify_verification.domain.com with a token. Should I be parsing shopify_verification with Name::from_str_relaxed in this case or is there a better type to represent more flexible labels?

Edit: or is this an opportunity to loosen the requirements of Name::from_utf8 so that parsing this label doesn't fail?

@djc
Collaborator

djc commented Mar 13, 2023

> I was under the impression that underscores are forbidden in host names but all other labels allow them.

That makes it sound like you think "host names" is used to refer to the first label? I'm not sure that is the case.

The aforelinked RFC 8552 section 1.1 (Scoped Interpretation of DNS Resource Records through "Underscored" Naming of Attribute Leaves) says:

> Because the DNS rules for a "host" (host name) do not allow use of the underscore character, the underscored name is distinguishable from all legal host names [RFC0952].

RFC 952 ("DOD INTERNET HOST TABLE SPECIFICATION") says:

> 1. A "name" (Net, Host, Gateway, or Domain name) is a text string up to 24 characters drawn from the alphabet (A-Z), digits (0-9), minus sign (-), and period (.). Note that periods are only allowed when they serve to delimit components of "domain style names". (See RFC-921, "Domain Name System Implementation Schedule", for background).

So the way I understand it, names in general probably allow underscores, but names that specifically refer to an (IP?) endpoint would not?

It's not completely obvious to me if/how we should change the trust-dns-proto API based on these observations, though.
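
To make the RFC 952 wording quoted in this comment concrete, here is a minimal Rust sketch of the character rule. `is_rfc952_name` is a hypothetical helper, not part of trust-dns-proto, and it models only the quoted sentence (alphabet, digits, minus sign, periods as delimiters, 24-character limit); any further restrictions in the RFC are left out.

```rust
/// Sketch of the RFC 952 character rule quoted above. Hypothetical helper;
/// it only models the quoted sentence, not the full specification.
fn is_rfc952_name(name: &str) -> bool {
    // "a text string up to 24 characters"
    if name.is_empty() || name.len() > 24 {
        return false;
    }
    // Periods may only delimit components, so no component may be empty.
    name.split('.').all(|component| {
        !component.is_empty()
            && component
                .chars()
                // Letters (case-insensitively), digits and the minus sign.
                .all(|c| c.is_ascii_alphanumeric() || c == '-')
    })
}
```

With this check, `is_rfc952_name("example.com")` holds, while `is_rfc952_name("dis_allowed.com")` and `is_rfc952_name("_sip.example.com")` do not, because the underscore is outside the permitted alphabet.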

@nrempel
Author

nrempel commented Mar 13, 2023

> That makes it sound like you think "host names" is used to refer to the first label? I'm not sure that is the case.

Just in the case of shopify_verification.domain.com

Basically, from my reading, shopify_verification.domain.com seems to be a perfectly valid domain for a TXT record. Should Name::from_utf8 be updated so that it successfully parses shopify_verification.domain.com? It seems like a strange determination that only leading underscores can be parsed.

nrempel changed the title from "Non leading underscore in Name parsing fails" to "Non leading underscore in Name::from_utf8 parsing fails" on Mar 13, 2023

@djc
Collaborator

djc commented Mar 16, 2023

@bluejekyll curious about your thoughts on this! I feel like making from_utf8() more lenient is probably a decent option, although we should probably add some language to the documentation around these issues.

@bluejekyll
Member

Sorry for the late response. When I implemented a lot of this stuff, I did it strictly... it turns out that in the wild people do things that aren't to the standard, and other nameservers/clients out there are much more flexible. I think that at the end of the day, we're probably too strict and should just allow it.

@bluejekyll
Member

Basically the question becomes: do we want to restrict the name in some way in any record, or leave it up to the caller to determine whether _ is valid and stay out of it? Shopify is probably wrong, but who am I to challenge a multi-billion dollar company.

bluejekyll added the "has workaround" and "compliance" (Not compliant to DNS standard operations) labels on Mar 31, 2023

@nrempel
Author

nrempel commented Apr 4, 2023

Thanks @bluejekyll!

I actually think that sticking to the spec is ideal here. We do have a way to work around this using from_str_relaxed.

However, I couldn't find any RFC that definitively prohibits non-leading underscores.

So I think we could improve the docs here to reference the relevant RFC sections and/or make this more lenient if there is no hard rule against non-leading underscores.

Do we know that shopify_verification.domain.com for sure violates some specification?

@bluejekyll
Member

The problem is this notion of “hostname” vs. other names. We can’t know if something is a hostname or not, it’s a distinction that would only be in context from the API user. If that’s the case, then trust-dns shouldn’t try to determine if it’s a hostname or not.

Is Shopify’s usage wrong? By my reading of the RFCs I’d say yes, but I can see someone else saying TXT records are never “host names” so therefore it’s ok. Which is why, given their real world usage, I’m inclined to say that we’re being too strict.
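
One way to picture "leave it up to the caller" is a validation function that takes the policy as a parameter instead of guessing whether a name is a host name. The Rust below is a sketch under assumptions: `LabelPolicy`, `label_allowed` and `name_allowed` are hypothetical names that do not exist in trust-dns, and the character set is limited to ASCII for brevity.

```rust
/// Hypothetical policy chosen by the caller; not a trust-dns type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum LabelPolicy {
    /// Letters, digits and hyphens only (host-name style).
    HostName,
    /// Additionally accept underscores anywhere in the label.
    AllowUnderscore,
}

/// Checks a single label against the caller's policy.
fn label_allowed(label: &str, policy: LabelPolicy) -> bool {
    // DNS labels are between 1 and 63 octets long.
    if label.is_empty() || label.len() > 63 {
        return false;
    }
    label.chars().all(|c| match c {
        'a'..='z' | 'A'..='Z' | '0'..='9' | '-' => true,
        // The only policy-dependent character in this sketch.
        '_' => policy == LabelPolicy::AllowUnderscore,
        _ => false,
    })
}

/// Checks every label of a dotted name against the caller's policy.
fn name_allowed(name: &str, policy: LabelPolicy) -> bool {
    // A trailing dot marks a fully qualified name; drop it before splitting.
    let name = name.strip_suffix('.').unwrap_or(name);
    !name.is_empty() && name.split('.').all(|label| label_allowed(label, policy))
}
```

A caller looking up a TXT record could pass `LabelPolicy::AllowUnderscore` and accept `shopify_verification.domain.com`, while a caller validating a host name could pass `LabelPolicy::HostName` and reject it. The library itself never has to decide which context applies.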

@darnuria
Contributor

darnuria commented Jul 2, 2023

I think this is good material, and a clarification that could end up in this future RFC from NLnet Labs https://github.com/NLnetLabs/draft-koekkoek-dnsop-zone-file-format, although that draft seems paused for now while they first build their own SIMD zone parser https://github.com/NLnetLabs/simdzone.

At work (gandi.net, a registrar) I also see a lot of records using underscores, either for TXT (like __domain_key) or for SRV.

SVCB/HTTPS records will also accept _.

https://datatracker.ietf.org/doc/draft-ietf-dnsop-svcb-https/

> 2.3. SVCB query names
>
> When querying the SVCB RR, a service is translated into a QNAME by
> prepending the service name with a label indicating the scheme,
> prefixed with an underscore, resulting in a domain name like
> "_examplescheme.api.example.com.". This follows the Attrleaf naming
> pattern [Attrleaf], so the scheme MUST be registered appropriately
> with IANA (see Section 11).
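
The translation described in the quoted section can be sketched in Rust as follows. `svcb_query_name` is a hypothetical helper written for this illustration; it covers only the step quoted above (prepending an underscore-prefixed scheme label) and is not taken from any library.

```rust
/// Builds an Attrleaf-style query name by prefixing the scheme with an
/// underscore and prepending it as a new label. Hypothetical helper.
fn svcb_query_name(scheme: &str, service_name: &str) -> Option<String> {
    // The new label is "_" + scheme and must fit the 63-octet label limit;
    // a dot inside the scheme would silently create an extra label.
    if scheme.is_empty() || scheme.len() > 62 || scheme.contains('.') {
        return None;
    }
    Some(format!("_{}.{}", scheme, service_name))
}
```

For the example in the draft, `svcb_query_name("examplescheme", "api.example.com.")` yields `_examplescheme.api.example.com.`, a name whose first label starts with an underscore by construction.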

My reading led me to this RFC: Scoped Interpretation of DNS Resource Records through "Underscored" Naming of Attribute Leaves
https://datatracker.ietf.org/doc/html/rfc8552

As @cpu already pointed out in #1904 (comment)

I would also point to this one, which tries to fix it all: "DNS AttrLeaf Changes: Fixing Specifications That Use Underscored Node Names" https://www.rfc-editor.org/rfc/rfc8553.html

Found via https://www.bortzmeyer.org/8553.html

Check out: https://www.rfc-editor.org/rfc/rfc8553.html#section-2

Sorry for the edits: last minute reading was worth editing myself. :)

fabian-z added a commit to fabian-z/mail-auth that referenced this issue Dec 19, 2023
This allows DNS labels used for lookups to contain underscores,
which may not be allowed as host names.

Prevents false TempError result, which masks underlying
"proto error: Label contains invalid characters: Err(Errors
{ invalid_mapping, disallowed_by_std3_ascii_rules })"

See also hickory-dns/hickory-dns#1904
hickory-dns/hickory-dns#2009