Skip to content

resolved: more CNAME redirect fixes#19009

Merged
poettering merged 12 commits intosystemd:mainfrom
poettering:one-more-cname-fix
Mar 16, 2021
Merged

resolved: more CNAME redirect fixes#19009
poettering merged 12 commits intosystemd:mainfrom
poettering:one-more-cname-fix

Conversation

@poettering
Copy link
Copy Markdown
Member

A fix for #18972.

This look like more than it is. The important fix is a oneliner. The other stuff are trivialities, plus some TTL clean-ups. There's one more fix that restores an optimization path that we accidentally dropped back in 7820b32 (the optimization is not important though, the cache makes it usually unnecessary. But it's nicer to do it, since very low TTL RRs otherwise might add noise).

We nowadays cache full answer RRset combinations instead of just the
exact matching rrset. This means we should not cache RRs that are not
immediate answers to our question for longer then their own RRs. Or in
other words: let's determine the shortest TTL of all RRs in the whole
answer, and use that as cache lifetime.
When responding from DNS cache, let's slightly tweak how the TTL is
lowered: as before let's round down when converting from our internal µs
to the external seconds. (This is preferable, since records should
better be cached too short instead of too long.) Let's avoid rounding
down to zero though, since that has special semantics in many cases (in
particular mDNS). Let's just use 1s in that case.
@eworm-de
Copy link
Copy Markdown
Contributor

Commit 533ec8f references the wrong issue to be fixed, no?

@poettering
Copy link
Copy Markdown
Member Author

oops, typo

Previously by mistake we'd always match every single reply we get in a
CNAME chain to the original question from the stub client. That's
broken, we need to test it against the CNAME query we are currently
looking at.

The effect of this incorrect matching was that we'd assign the RRs to
the wrong section since we'd assume they'd be auxiliary answers instead
of primary answers.

Fixes: systemd#18972
When doing a CNAME/DNAME redirect let's first check if the answer we
already have fully answers the redirected question already. If so, let's
use that. If not, let's properly restart things.

This simply removes one call to dns_answer_reset() that was placed too
early: instead of resetting when we detect a CNAME/DNAME redirect, do so
only after checking if the answer we already have doesn't match the
reply, and then decide to *actually* follow it. Or in other words: rely
on the dns_answer_reset() call in dns_query_go() which we'll call to
actually begin with the redirected question.

This fixes an optimization path which was broken back in 7820b32.

(This doesn't really matter as much as one might think, since our cache
stepped in anyway and answered the questions before going back to the
network. However, this adds noise if RRs with very short TTLs are cached
– which some CDNs do – and is of course relavant when people turn off
the local cache.)
@poettering poettering added this to the v248 milestone Mar 15, 2021
@poettering
Copy link
Copy Markdown
Member Author

@eworm-de fixed now

@yuwata
Copy link
Copy Markdown
Member

yuwata commented Mar 16, 2021

LGTM.

@poettering
Copy link
Copy Markdown
Member Author

poettering commented Mar 16, 2021

@mcatanzaro any chance you could give this a whirl, so that we can merge this? @yuwata liked it, but given this is so late in the cycle I'd love a test from someone who knows the isue well before we merge it. Thanks!

@mcatanzaro
Copy link
Copy Markdown
Contributor

Sure, I'll test soon.

@mcatanzaro
Copy link
Copy Markdown
Contributor

OK, I've reenabled my DNS cache and am running this code now. I should notice within a few hours if it doesn't work reliably.

@mcatanzaro
Copy link
Copy Markdown
Contributor

I'm reasonably confident it's fixed. Thanks Lennart!

@poettering
Copy link
Copy Markdown
Member Author

thanks for testing and reported back. Let's merge.

@poettering poettering merged commit 69bedd0 into systemd:main Mar 16, 2021
@thergbway
Copy link
Copy Markdown

FYI It seems like systemd-resolved now correctly follows https://www.rfc-editor.org/rfc/rfc2181#section-5.2:
Should an authoritative source send such a malformed RRSet, the client should treat the RRs for all purposes as if all TTLs in the RRSet had been set to the value of the lowest TTL in the RRSet.
it was fixed in b974211

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

Stub resolver cannot resolve www.vox.com or bugzilla.redhat.com, missing A records in response from stub resolver

6 participants