resolved: don't cache NXDOMAIN for SUDN resolver.arpa #31646

Merged
merged 1 commit into from
Mar 7, 2024

rpigott
Contributor

@rpigott rpigott commented Mar 6, 2024

The name resolver.arpa is reserved for RFC9462 "Discovery of Designated Resolvers" (DDR). This relies on regular dns queries for SVCB records under the special use domain name resolver.arpa. Unfortunately, older nameservers (or broken ones) won't know about this SUDN and will likely return NXDOMAIN. If this is cached, the cache entry will become an impediment for any clients trying to discover designated resolvers through the stub-resolver, or potentially even sd-resolved itself, were it to implement DDR.

Partially for this reason, the RFC recommendation is that "clients MUST NOT perform A or AAAA queries for resolver.arpa". This enforces that condition within sd-resolved, and avoids caching any such erroneous NXDOMAIN.

N.B. Although A and AAAA are prohibited, I think validating resolvers might reasonably query for dnssec records, even though it seems impossible for the special domain _dns.resolver.arpa to actually be signed.
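The two rules described above can be sketched as follows. This is only an illustration in Python (systemd-resolved itself is written in C); every function and constant name here is hypothetical and not taken from the patch, and applying the A/AAAA rule to names below resolver.arpa as well as to the domain itself is an assumption of the sketch.

```python
RCODE_NXDOMAIN = 3  # DNS response code for "name does not exist"

SUDN = ["resolver", "arpa"]  # the special-use domain name from RFC 9462


def split_labels(name):
    """Split a domain name into lowercase labels, ignoring a trailing dot."""
    return [label for label in name.lower().rstrip(".").split(".") if label]


def in_resolver_arpa(name):
    """True for resolver.arpa itself and for any name below it."""
    return split_labels(name)[-len(SUDN):] == SUDN


def may_query(name, rtype):
    """Refuse A and AAAA lookups for the SUDN (assumed to cover subdomains)."""
    return not (in_resolver_arpa(name) and rtype.upper() in {"A", "AAAA"})


def may_cache(name, rcode):
    """Never cache NXDOMAIN for the SUDN; cache everything else as usual."""
    return not (in_resolver_arpa(name) and rcode == RCODE_NXDOMAIN)
```

With these helpers, `may_query("_dns.resolver.arpa", "A")` is false while the same name with type `"SVCB"` is allowed, and an NXDOMAIN answer for `_dns.resolver.arpa` is not stored.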

@github-actions github-actions bot added resolve please-review PR is ready for (re-)review by a maintainer labels Mar 6, 2024
@rpigott
Contributor Author

rpigott commented Mar 6, 2024

Cloudflare returns such an erroneous NXDOMAIN for me:

$ dig @1.1.1.1 A _dns.resolver.arpa
[...]
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 30608
[...]

Google DNS and Quad9 give the correct NOERROR response.
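For readers who want to see what the `status:` field above corresponds to on the wire, here is a small standard-library sketch. It builds a query for the SVCB type that DDR actually uses (type 64) and decodes the response code from the low four bits of the header flags. It sends nothing over the network; the function names are made up for this example.

```python
import struct

TYPE_SVCB = 64  # SVCB record type
CLASS_IN = 1    # Internet class
RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN"}


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, qtype, query_id=0x1234):
    """Build a one-question DNS query with the recursion-desired bit set."""
    flags = 0x0100  # RD
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, CLASS_IN)
    return header + question


def response_status(message):
    """Return the textual status of a DNS message from its header flags."""
    _query_id, flags = struct.unpack("!HH", message[:4])
    rcode = flags & 0x000F
    return RCODES.get(rcode, "RCODE%d" % rcode)
```

The bytes returned by `build_query("_dns.resolver.arpa", TYPE_SVCB)` could be sent over UDP to port 53 of a resolver, and `response_status` applied to the reply.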

@poettering poettering added reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks and removed please-review PR is ready for (re-)review by a maintainer labels Mar 6, 2024
@github-actions github-actions bot added please-review PR is ready for (re-)review by a maintainer and removed reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks labels Mar 6, 2024
@pemensik
Contributor

pemensik commented Mar 7, 2024

I think this cache record, whether positive or negative, should be flushed when the DNS server configuration changes. The maximum TTL for these records could also be capped specially, at something like 5 or 10 minutes. But there would usually be no reason for the response to change unless the servers in use changed too.

@pemensik
Contributor

pemensik commented Mar 7, 2024

It seems to me that the negative cache in particular should be flushed every time the servers change. Their responses might differ, and not only for this special domain name.

@pemensik
Contributor

pemensik commented Mar 7, 2024

It would make sense to also add a resolver.arpa NTA for validating resolvers. That is what would be required for delv _dns.resolver.arpa @dns.google. Unlike the home.arpa domain, resolver.arpa does not officially exist, so it needs either a positive or a negative trust anchor.

@rpigott
Contributor Author

rpigott commented Mar 7, 2024

It would make sense to also add a resolver.arpa NTA for validating resolvers.

This is already included in the patch

I think this cache record, whether positive or negative, should be flushed when the DNS server configuration changes.

Pretty sure we already flush all the dns scoped caches on server config change.
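As a rough illustration of what "flush the scoped cache on server config change" means, here is a toy model. It is not taken from the systemd source; the `Scope` class and its methods are hypothetical and only show the intended behavior.

```python
class Scope:
    """Toy model of a DNS scope: a set of servers plus one shared cache."""

    def __init__(self, servers):
        self.servers = tuple(servers)
        self.cache = {}  # maps (name, rtype) to a cached answer

    def put(self, key, answer):
        self.cache[key] = answer

    def get(self, key):
        return self.cache.get(key)

    def set_servers(self, servers):
        """Replace the server list and drop every cached answer if it changed."""
        servers = tuple(servers)
        if servers != self.servers:
            self.servers = servers
            self.cache.clear()
```

In this model a stale answer for `("_dns.resolver.arpa", "SVCB")` survives only as long as the server list stays the same.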

@pemensik
Contributor

pemensik commented Mar 7, 2024

Pretty sure we already flush all the dns scoped caches on server config change.

Then in what cases would not caching resolver.arpa help? Do you expect it to change often on the same server?

@rpigott
Contributor Author

rpigott commented Mar 7, 2024

The cases where DDR is implemented poorly by the upstream resolver (Cloudflare). Basically, we don't want to cache NXDOMAIN here because it is wrong, and we know it is wrong. You could argue we shouldn't accommodate this I guess, but I don't think there is a downside.

This is a failure I encountered while exploring DDR. Consider it preparatory work for implementing DDR in sd-resolved. The NTA saves us from rejecting DDR records while validating, and the caching change prevents our caches from getting poisoned for the crime of querying these DNSSEC-related records.

If anyone knows how to contact Cloudflare or report bugs to them, they might care to know about this, and about #31484.

@poettering poettering added good-to-merge/with-minor-suggestions and removed please-review PR is ready for (re-)review by a maintainer labels Mar 7, 2024
The name resolver.arpa is reserved for RFC9462 "Discovery of Designated
Resolvers" (DDR). This relies on regular dns queries for SVCB records at
the special use domain name _dns.resolver.arpa. Unfortunately, older
nameservers (or broken ones) won't know about this SUDN and will likely
return NXDOMAIN. If this is cached, the cache entry will become an
impediment for any clients trying to discover designated resolvers
through the stub-resolver, or potentially even sd-resolved itself, were
it to implement DDR.

The RFC recommendation is that "clients MUST NOT perform A or AAAA
queries for resolver.arpa", and "resolvers SHOULD respond to queries of
any type other than SVCB for _dns.resolver.arpa. with NODATA and queries
of any type for any domain name under resolver.arpa with NODATA." which
should help avoid potential compatibility issues. This enforces that
condition within sd-resolved, and avoids caching any such erroneous
NXDOMAIN.

The RFC also recommends requests for this domain should never be
forwarded, to prevent authentication failures. Since there isn't much
point in establishing secure communication to the local stub, we still
allow SVCB to be forwarded from the stub, in case the client cares to
implement some other authentication method and understands the
consequences of skipping the local stub. Normal clients are not
expected to implement DDR, but this change will protect sd-resolved's
own caches in case they try.

Although A and AAAA are prohibited, I think validating resolvers
might reasonably query for dnssec records, even though the resolver.arpa
zone does not exist (it is declared to be a locally served zone). For
this reason, I have also added resolver.arpa to the builtin dnssec NTA.
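The effect of a negative trust anchor can be sketched like this: if a name is at or below any NTA, DNSSEC validation is skipped for it. This is a hedged illustration only; the set below contains just the one domain named in this commit message, and the function names are invented for the example rather than taken from systemd.

```python
# Illustrative NTA set; only resolver.arpa is taken from the text above.
NEGATIVE_TRUST_ANCHORS = {"resolver.arpa"}


def covering_nta(name):
    """Return the NTA that covers `name`, or None if no NTA applies."""
    labels = name.lower().rstrip(".").split(".")
    # Test the name itself, then each parent domain in turn.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in NEGATIVE_TRUST_ANCHORS:
            return candidate
    return None


def needs_dnssec_validation(name):
    """Names covered by an NTA are treated as unsigned and not validated."""
    return covering_nta(name) is None
```

Under this sketch `_dns.resolver.arpa` is covered by the `resolver.arpa` anchor, so an unsigned answer for it is not rejected by the validator.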
@bluca bluca merged commit abcc94b into systemd:main Mar 7, 2024
46 of 48 checks passed
@rpigott rpigott deleted the sudn-resolver branch March 8, 2024 00:23
@pemensik
Contributor

It seems to me this should not just be left uncached, but should get special handling and offer whatever information was received from each server. A single query from the local stub should trigger a query to each server that does not already have a cached result. Each result should be cached separately, and the results put together when anyone queries.

The answer to this query is meant to be (possibly) different for each queried server, so it must not be stored in the normal cache.
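The proposal in this comment could look roughly like the sketch below. It illustrates the suggestion only, not anything systemd-resolved does; `discover`, the per-server `cache` dictionary, and the `ask` callback are all hypothetical names.

```python
def discover(servers, cache, ask):
    """Collect DDR answers from every server, querying only uncached ones.

    servers: iterable of server addresses
    cache:   dict mapping a server address to its list of records
    ask:     callable taking a server address and returning a list of records
    """
    combined = []
    for server in servers:
        if server not in cache:
            cache[server] = ask(server)  # one upstream query per uncached server
        combined.extend(cache[server])   # merge per-server results in order
    return combined
```

Results are concatenated in server order here; the sketch does not attempt to rank records that come from different servers.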

@rpigott
Contributor Author

rpigott commented Mar 11, 2024

"Normal" clients already aren't in a good position to implement DDR, since the local stub will prevent authentication by design (the upstream resolver will not have our local address in its subject alternative names). Or, to put it another way, there is no RFC-compliant way for clients to make use of these records from behind a stub. In fact, if we wanted to be RFC compliant, we should refuse these queries outright on the stub, according to RFC9462 § 6.1. I'm also quite confident that if this domain were intended to have special caching behavior, the document(s) would have said so.

Currently we operate one cache per-scope and I don't think that will be a problem. The built-in expectation then is that each server address configured on the same scope (a single link, say) belongs to one logical service set and will return the same data, similar to how e.g. 8.8.8.8 and 8.8.4.4 do (they are two endpoints to the same logical service, dns.google). I'm not sure there is a sensible way to combine the responses from multiple upstream resolvers anyway, because there is no real guidance on the relative priority of SVCB records between services.
