Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CAA logic gets confused if the CNAME is one label above requested #1048

Closed
rolandshoemaker opened this issue Oct 27, 2015 · 8 comments

Comments

@rolandshoemaker
Copy link
Member

commented Oct 27, 2015

letsencrypt#1138 has exposed an interesting bug in the CAA/CNAME logic.

If va.getCAASet is called with a domain that has a CNAME that is one label above then a infinite loop will be entered that looks something like

CAA magicfish1990.org [nothing] ->
CNAME magicfish1990.org [jp-2.magicfish1990.org] ->
CAA jp-2.magicfish1990.org [nothing] ->
CNAME jp-2.magicfish1990.org [nothing] ->
CAA magicfish1990.org [nothing] ...

This is both a bug and an incorrect implementation of the CAA spec. Since the CAA loop is ignorant of if it is doing a CAA query for a domain or a CNAME the chain of labels it is working down can get switched mid-way through, causing the infinite looping behaviour in the case above.

A quick clarification of what is going wrong, if we have a domain c.d.com that has a CNAME to diff.other.com the CAA query chain should look like this

CAA c.d.com [nothing]
CNAME c.d.com [diff.other.com]
CAA diff.other.com [nothing]
CNAME diff.other.com [nothing]
CAA d.com [nothing]
...

but instead what happens is

CAA c.d.com [nothing]
CNAME c.d.com [diff.other.com]
CAA diff.other.com [nothing]
CNAME diff.other.com [nothing]
CAA other.com [nothing]
...
@rolandshoemaker

This comment has been minimized.

Copy link
Member Author

commented Oct 27, 2015

cc @jsha -- this should probably be marked GA.

@jsha

This comment has been minimized.

Copy link
Contributor

commented Oct 27, 2015

Yep, agree that with the infinite loop this is definitely a GA bug. A user could craft a DNS setup like that and flood the VA.

@jsha jsha added this to the General Availability milestone Oct 27, 2015

@jsha jsha self-assigned this Oct 28, 2015

@jsha

This comment has been minimized.

Copy link
Contributor

commented Oct 29, 2015

Talked some more about this: VA actually won't go into an infinite loop and it will hit a maxCNAMES limit, throwing a 500. Also, this allows a subdomain to work around CAA on a parent domain by CNAME'ing somewhere else.

In general we want to treat any case of a user-triggered 500 as a bug that needs fixing: our 500's count should always be near zero in ideal operation. However, there are a number of similar bugs where we currently allow a user-triggered 500 (largely given error-messages label here), and we can't reasonably give them all priority for GA.

However, I think the bypass of CAA is worth fixing for launch, since we plan on supporting CAA.

@tomclegg

This comment has been minimized.

Copy link
Contributor

commented Oct 30, 2015

RFC6844 s4 says: if domain has CNAME to target, and target has CAA, then R(domain) = R(target) -- where R is the algorithm we're describing, i.e., it includes checking target's parents -- otherwise check domain's parent.

Then it goes on to give an example that doesn't specifically mention whether "search alias(x.y.z)" is meant to imply a search for parent(alias(x.y.z)). But in this context it seems much more reasonable to interpret "search ... in the following order" as "do the following sequence of DNS lookups", i.e., "alias(x.y.z)" just represents a single DNS lookup.

Not checking parent(alias(...)) seems much more in keeping with the way CNAME is handled in the rest of the world: you don't behave differently depending on whether "lookup RR type X" happened to involve following a chain of CNAMEs, so you can let your upstream resolver handle CNAME transparently. Most likely the RFC just has a mistake in the algorithm description.

The current getCAASet() not only follows parent(alias(x.y.z)) first, it also forgets to come back and check parent(x.y.z) after striking out at alias(x.y.z), so it doesn't comply with the RFC no matter what the RFC was trying to say.

tl;dr

getCAASet() should lose all CNAME logic. Just LookupCAA, try parent, repeat. LookupCAA will let the upstream resolver follow CNAME, just as LookupHost does. (Or can we not trust the upstream resolver to follow CNAME?)

CAA c.d.com [nothing] [resolver noticed CNAME diff.other.com, followed it, found nothing]
CAA d.com [nothing]
[RFC says lookup 'com' but we terminate because it's a public suffix]
@rolandshoemaker

This comment has been minimized.

Copy link
Member Author

commented Oct 30, 2015

Agreed, this seems to be the cleanest approach. Unbound will follow CNAME/DNAME records for us and will only require a little extra work to properly support it on our end.

I'm not sure if Unbound can be configured to properly terminate a CNAME/DNAME loop but if it can follow them I'd be surprised if it doesn't have a option somewhere 😕.

@jsha

This comment has been minimized.

Copy link
Contributor

commented Oct 30, 2015

Sounds good, will do.

jsha added a commit that referenced this issue Oct 31, 2015

jsha added a commit that referenced this issue Oct 31, 2015

jsha added a commit that referenced this issue Nov 2, 2015

@kelunik

This comment has been minimized.

Copy link
Contributor

commented Nov 3, 2015

Is this the same issue mentioned here or does it need another issue?

@jsha

This comment has been minimized.

Copy link
Contributor

commented Nov 3, 2015

This issue, plus another temporary issue we had with whitelisting, are
both mentioned in that thread. I'll chime in and clarify.

On 11/03/2015 03:59 AM, Niklas Keller wrote:

Is this the same issue mentioned here
https://community.letsencrypt.org/t/not-actually-whitelisted-yet/2247/6?u=kelunik
or does it need another issue?


Reply to this email directly or view it on GitHub
#1048 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.