NXDOMAIN instead of NOERROR rcode when asked for existing CNAME record #870

Closed
appliedprivacy opened this issue Apr 1, 2023 · 18 comments
Comments

@appliedprivacy

appliedprivacy commented Apr 1, 2023

Describe the bug

Unbound answers a CNAME query with rcode NXDOMAIN instead of NOERROR, even though the response also includes the existing record.

Expected rcode: NOERROR

Also: When asked for a CNAME, unbound asks the authoritative NS for an A record.

Expected qtype: CNAME

To reproduce
Steps to reproduce the behavior:

  1. start unbound so it has an empty cache when the query reaches unbound (config is provided at the end of this bugreport)
  2. ask unbound for this existing CNAME DNS record: dig _acme-challenge.bender-doh.applied-privacy.net CNAME (the response has rcode NXDOMAIN)
  3. ask unbound again without flushing the cache first, you will get a NOERROR rcode
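
The steps above can also be scripted without dig. The following is a minimal sketch using only the Python standard library; the helper names (build_query, rcode_of, ask) are made up for this illustration, and it assumes the resolver listens on plain UDP port 53 as in the config below.

```python
import socket
import struct

QTYPE_CNAME = 5  # CNAME type code from RFC 1035
RCODES = {0: "NOERROR", 3: "NXDOMAIN"}


def build_query(name: str, qtype: int, query_id: int = 0x1234) -> bytes:
    """Encode a single-question DNS query with the RD flag set."""
    # Header: id, flags (RD=1), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class IN


def rcode_of(response: bytes) -> str:
    """Return the rcode name from the low 4 bits of the flags word."""
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F
    return RCODES.get(code, f"RCODE{code}")


def ask(server: str, name: str, qtype: int = QTYPE_CNAME, timeout: float = 3.0) -> str:
    """Send one UDP query to server:53 and return the rcode name."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, qtype), (server, 53))
        response, _ = sock.recvfrom(4096)
    return rcode_of(response)
```

Calling ask("127.0.0.1", "_acme-challenge.bender-doh.applied-privacy.net") twice against a freshly started resolver would show the rcode of the first and the second query.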

Others on the mailing list have confirmed seeing the same issue.

While looking into the PCAP files from stub -> unbound and unbound -> authoritative, I also noticed that the CNAME query sent to unbound results in unbound asking the authoritative for an A record, which does not exist. This mismatch between inbound and outbound qtype might be related to the root cause of the bug.

Expected behavior

unbound should ask the authoritative nameserver for a CNAME record not an A record.
unbound should answer with a NOERROR rcode for existing CNAMEs, like other resolvers do (for example PowerDNS Recursor).

System:

  • Unbound version:
pkg info unbound
unbound-1.17.1_2
Name           : unbound
Version        : 1.17.1_2
Installed on   : Sat Feb 18 22:20:01 2023 CET
Origin         : dns/unbound
Architecture   : FreeBSD:13:amd64
  • OS: FreeBSD 13.1
  • unbound -V output:
Version 1.17.1

Configure line: --with-libexpat=/usr/local --with-ssl=/usr --enable-dnscrypt --disable-dnstap --with-libnghttp2 --with-dynlibmodule --enable-ecdsa --disable-event-api --enable-gost --with-libevent --disable-subnet --disable-tfo-client --disable-tfo-server --with-pthreads --prefix=/usr/local --localstatedir=/var --mandir=/usr/local/man --infodir=/usr/local/share/info/ --build=amd64-portbld-freebsd13.1
Linked libs: libevent 2.1.12-stable (it uses kqueue), OpenSSL 1.1.1o-freebsd  3 May 2022
Linked modules: dns64 dynlib respip validator iterator
DNSCrypt feature available

Additional information

Mailing list discussions:

unbound.conf

server:
	verbosity: 0
	access-control: 109.70.100.0/24 allow
	access-control: ::1/128 allow
	access-control: 127.0.0.1/24 allow
	edns-tcp-keepalive: yes
	incoming-num-tcp: 200

	# plain UDP
	interface: 127.0.0.1@53
	interface: ::1@53
	interface: 109.70.100.133@53
	
	num-threads: 2
	msg-cache-size: 100m
	rrset-cache-size: 200m
	key-cache-size: 10m
	neg-cache-size: 10m

	harden-below-nxdomain: yes
	minimal-responses: yes

	prefetch: yes
	prefetch-key: yes
	aggressive-nsec: yes

	use-caps-for-id: yes
	hide-identity: yes
	hide-version: yes
	hide-trustanchor: yes

	qname-minimisation: yes


	# The following line will configure unbound to perform cryptographic
	# DNSSEC validation using the root trust anchor.
	auto-trust-anchor-file: "/usr/local/etc/unbound/root.key"

	extended-statistics: yes
	statistics-cumulative: no
	statistics-interval: 0

remote-control:
	control-enable: yes

# root on loopback
auth-zone:
	name: "."
	master: "k.root-servers.net"
	fallback-enabled: yes
	for-downstream: no
	for-upstream: yes
	zonefile: "root.zone"

@wcawijngaards
Member

The A query is made because qname-minimisation is turned on. It first attempts to locate the data with query type A to hide the query type. With qname minimisation turned off (qname-minimisation: no), it likely works and asks for the CNAME directly.
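
As a rough illustration of that behaviour, here is a simplified sketch of which outgoing queries a minimising resolver might send. It is not Unbound's actual algorithm: the function name minimised_queries, the zone_cut parameter and the one-label-per-step strategy are assumptions made for this example.

```python
def minimised_queries(qname, zone_cut, qtype, probe_qtype="A"):
    """Return the (name, qtype) pairs a minimising resolver might send.

    One label is added per step below zone_cut, always with probe_qtype,
    and the original qtype is only revealed in the final query.
    """
    labels = qname.rstrip(".").split(".")
    cut = zone_cut.rstrip(".")
    cut_labels = cut.split(".") if cut else []
    if cut_labels and labels[-len(cut_labels):] != cut_labels:
        raise ValueError("qname is not below zone_cut")
    full_name = ".".join(labels) + "."
    queries = []
    for n in range(len(cut_labels) + 1, len(labels) + 1):
        # Probe each intermediate name (and the full name) with the hiding type.
        queries.append((".".join(labels[-n:]) + ".", probe_qtype))
    if qtype != probe_qtype or not queries:
        # Only now send the type the client actually asked for.
        queries.append((full_name, qtype))
    return queries
```

For the name in this report and a zone cut at applied-privacy.net., the sketch yields two type A probes followed by the type CNAME query, which matches the A queries seen in the PCAP.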

The upstream server is problematic: it does not implement the DNS standard correctly. If the domain has answers for other query types, the response is not NXDOMAIN. The query for type A is then to be answered with a reply that is called NOERROR/NODATA. This has rcode NOERROR and no records in the answer section. The authority section holds a SOA record, and the TTL of that SOA record gives the message its TTL.
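
The distinction between the two negative answers can be sketched as follows. This is a simplification for illustration only: the function name classify is made up, and referrals (which also carry rcode NOERROR and an empty answer section) are ignored.

```python
NOERROR, NXDOMAIN = 0, 3  # rcode values from RFC 1035


def classify(rcode: int, answer_count: int) -> str:
    """Classify a response by rcode and number of answer records."""
    if rcode == NXDOMAIN:
        return "NXDOMAIN"  # the name does not exist, for any type
    if rcode == NOERROR and answer_count == 0:
        return "NODATA"    # the name exists, but not with the asked type
    if rcode == NOERROR:
        return "ANSWER"
    return "OTHER"
```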

@bleve

bleve commented Apr 3, 2023

You are wrong. The server is not giving a wrong answer. NXDOMAIN is the correct answer when the CNAME destination is missing. So unbound should not return that to the client; it should return the destination of the CNAME when the CNAME is being queried.

So this is a special case where unbound currently gives the wrong answer, because it expects the CNAME destination to actually exist when it shouldn't.

@wcawijngaards
Member

So, I tested it some more, thank you for the details for that.

When I test this with the config, I get NXDOMAIN and then NXDOMAIN again. Did I perhaps not copy a necessary part of the config, one that apparently causes the problem? So the second query also gets NXDOMAIN for me.

This is for a query where the qtype is CNAME, so there is uncertainty about which CNAME is referred to: the first one or the last one. Unbound currently responds with an NXDOMAIN, it seems. That means it gives information regarding the last element in the chain: the CNAME that does not exist at the target name of the first CNAME.

The first query, of type A, is still because of qname minimisation.

@he32
Contributor

he32 commented Apr 3, 2023

I agree with @bleve, this is not an instance of an RFC 8020 violation at the publishing name server (if that was ever the suggestion).

If the original query was for _acme-challenge.bender-doh.applied-privacy.net. A, it would be correct to return NXDOMAIN, because the target of the CNAME record, bender-doh.acme-dns-challenge.applied-privacy.net., does not exist in the DNS. However, CNAME queries are somewhat "special", in that the recursor is not being asked to recurse through the CNAME record's target, but instead should just return the CNAME record at the queried-for name as is, irrespective of whether the target of the CNAME record exists or not. So unbound in the recursor role must take on the responsibility which comes along with converting the original CNAME query type to something else, due to the particular semantics of processing a CNAME query.

@wcawijngaards
Member

wcawijngaards commented Apr 3, 2023

Is there a reference for this behaviour, i.e. whether or not the response for qtype CNAME should be NXDOMAIN? Unbound currently takes the NXDOMAIN as referring to the final element of the CNAME chain; that is what RFC 8020 says too, in section 2. But it does not talk about an exception for qtype CNAME.

This seems to suggest the issue is about that rcode in the response for qtype CNAME. Not about the change in rcode for another query, or about qtype A queries. Because that is what the top post talks about.

@he32
Contributor

he32 commented Apr 3, 2023

As for a reference for the "don't recurse through the CNAME target when the query type is CNAME", I would have to dig for it. Give me some time for that. Even though it's not the authoritative source, that's the way BIND does it. And ... to me that is really the only thing that makes sense -- when you ask for a CNAME record and it exists, it should be returned, irrespective of whether the target for the CNAME record exists. However, for other record types, the presence of a CNAME record is "transparent", and recursion through the target of the CNAME record is implied.

@wcawijngaards
Member

Well, Unbound returns the CNAME chain, but the rcode differs in case the destination does not exist, for queries of type CNAME. I agree, it is a good idea to have similar output.

@he32
Contributor

he32 commented Apr 3, 2023

Well, we can go back to RFC 1034 which says:

CNAME RRs cause special action in DNS software.  When a name server
fails to find a desired RR in the resource set associated with the
domain name, it checks to see if the resource set consists of a CNAME
record with a matching class.  If so, the name server includes the CNAME
record in the response and restarts the query at the domain name
specified in the data field of the CNAME record.  The one exception to
this rule is that queries which match the CNAME type are not restarted.

This indicates that qtype=CNAME queries should not recurse through the target of the CNAME record.
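
The quoted rule can be sketched over a toy in-memory zone. This is an illustration under simplifying assumptions, not how any real name server is implemented: the function name resolve and the dictionary layout are made up, and details such as empty non-terminals, wildcards and DNSSEC are left out.

```python
def resolve(zone, qname, qtype, max_chain=8):
    """Return (rcode, answers) following the RFC 1034 CNAME rule.

    zone maps owner name -> {rtype: [rdata, ...]}.
    """
    answers = []
    name = qname
    for _ in range(max_chain):
        rrsets = zone.get(name)
        if rrsets is None:
            # The current name (possibly a CNAME target) does not exist.
            return "NXDOMAIN", answers
        if qtype in rrsets:
            # Desired RR found. For qtype CNAME this branch is always taken
            # at the first name, so the query is never restarted.
            answers += [(name, qtype, rdata) for rdata in rrsets[qtype]]
            return "NOERROR", answers
        if "CNAME" in rrsets:
            # Include the CNAME and restart the query at its target.
            target = rrsets["CNAME"][0]
            answers.append((name, "CNAME", target))
            name = target
            continue
        return "NOERROR", answers  # NODATA
    return "SERVFAIL", answers  # chain too long


ZONE = {
    "_acme-challenge.bender-doh.applied-privacy.net.": {
        "CNAME": ["bender-doh.acme-dns-challenge.applied-privacy.net."],
    },
}
```

With this toy zone, a type A query for the name ends in NXDOMAIN after following the CNAME, while a type CNAME query returns the record with NOERROR.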

@he32
Contributor

he32 commented Apr 3, 2023

So, I can perhaps suggest a conceptually simple change: with qtype=CNAME and when query minimization is turned on, do not change the qtype for the outgoing queries when doing the recursion.

@wcawijngaards
Member

wcawijngaards commented Apr 4, 2023

The upstream server has a malformed response; it is server 157.53.224.1 (ns1.desec.io.) for applied-privacy.net. It returns a response to a query for bender-doh.applied-privacy.net. IN A with the A record before the CNAME. It should be the CNAME and then the item after the CNAME, so CNAME then A in the answer section. Unbound deals with this by removing the A record and chasing the CNAME target itself, so it would not really hamper resolution, but I would consider it malformed. The query is done here because the qname minimisation passes by the intermediate label. The outcome does not affect this particular issue. It was visible in the logs.

The output of the query:

;; flags: qr aa ; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 0 
;; QUESTION SECTION:
;; bender-doh.applied-privacy.net.	IN	A

;; ANSWER SECTION:
bender-dpriv1.appliedprivacy.net.	86400	IN	A	146.255.56.101
bender-dpriv1.appliedprivacy.net.	86400	IN	RRSIG	A 13 3 86400 20230413000000 20230323000000 12467 appliedprivacy.net. YpiUm+ZDcpZTIotIUF7ec7AllUqtmo5qp7y9DHIzAhi5jI24tJ5U7/oRqfpZGXwAIpUWO/8eFJpDmLLqARA1Jw==
bender-doh.applied-privacy.net.	86400	IN	CNAME	bender-dpriv1.appliedprivacy.net.
bender-doh.applied-privacy.net.	86400	IN	RRSIG	CNAME 13 3 86400 20230413000000 20230323000000 38828 applied-privacy.net. Cbcs2iTPqBdZeu7/GVtcrwo9yhT99lGOauxCoxV81qvgevtQiQ41fkGlFEDuACmFuW3fyCy8Jw3FyZa5HLEkkw==

;; AUTHORITY SECTION:

;; ADDITIONAL SECTION:

;; Query time: 0 msec
;; EDNS: version 0; flags: do ; udp: 1232
;; MSG SIZE  rcvd: 347
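
The ordering property in question can be checked mechanically. The sketch below only tests whether each CNAME appears before the records at its target; it takes no position on whether that ordering is required. The function name cname_chain_in_order and the tuple layout are made up for this illustration, and RRSIG records are simply skipped.

```python
def cname_chain_in_order(qname, answer):
    """Return True if every CNAME precedes the records at its target.

    answer is a list of (owner, rtype, rdata) in wire order.
    """
    current = qname.lower()
    for owner, rtype, rdata in answer:
        if rtype == "RRSIG":
            continue  # signatures are ignored in this simplified check
        if owner.lower() != current:
            return False  # a record appeared before the CNAME leading to it
        if rtype == "CNAME":
            current = rdata.lower()  # the chain continues at the target
    return True


# Answer section in the order shown in the output above (RRSIG data omitted).
OBSERVED = [
    ("bender-dpriv1.appliedprivacy.net.", "A", "146.255.56.101"),
    ("bender-dpriv1.appliedprivacy.net.", "RRSIG", ""),
    ("bender-doh.applied-privacy.net.", "CNAME", "bender-dpriv1.appliedprivacy.net."),
    ("bender-doh.applied-privacy.net.", "RRSIG", ""),
]
```

For OBSERVED the check fails, and it passes once the CNAME and its RRSIG are moved in front of the A record.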

@bleve

bleve commented Apr 4, 2023

There is no such issue with my test record, cnametest.bleve.fi.

@wcawijngaards
Member

The issue was that qname minimisation considered the NXDOMAIN returned by the upstream for the qtype A query as the answer to the type CNAME question, and stopped there instead of checking the response with qtype CNAME. The fix makes sure that only an NXDOMAIN that is pertinent to the question is used. In this case, the NXDOMAIN is passed by, the query is then sent with type CNAME, and this makes the response NOERROR. For the top post, that means the initial query gets rcode NOERROR, and so does the second query after it, with the CNAME in the answer to the question.
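
The reasoning can be sketched as a small decision function. This is explicitly not the committed patch, only a hypothetical illustration of "use the NXDOMAIN only when it is pertinent to the question"; the function names and the returned strings are invented for this example.

```python
def nxdomain_applies_to_qname(qname, answer):
    """True if an NXDOMAIN is about qname itself, not about a CNAME target.

    answer is a list of (owner, rtype, rdata). If qname owns a CNAME in the
    answer section, the NXDOMAIN describes the end of the chain instead.
    """
    return not any(
        owner.lower() == qname.lower() and rtype == "CNAME"
        for owner, rtype, _ in answer
    )


def next_step(qname, original_qtype, probe_qtype, rcode, answer):
    """Decide what to do with the response to a minimised probe query."""
    if rcode == "NXDOMAIN" and nxdomain_applies_to_qname(qname, answer):
        return "answer NXDOMAIN"  # the queried name really does not exist
    if probe_qtype != original_qtype:
        # The probe result is not pertinent to the question that was asked.
        return "requery with original qtype"
    return "use this response"
```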

Thank you for the information about the details for this failure! The fix is committed to the code repo.

@pspacek

pspacek commented Apr 4, 2023

The upstream server has a malformed response; it is server 157.53.224.1 (ns1.desec.io.) for applied-privacy.net. It returns a response to a query for bender-doh.applied-privacy.net. IN A with the A record before the CNAME. It should be the CNAME and then the item after the CNAME, so CNAME then A in the answer section. Unbound deals with this by removing the A record and chasing the CNAME target itself, so it would not really hamper resolution, but I would consider it malformed.

Hi @wcawijngaards. Do you have a reference which says that RRs MUST be ordered in this way? Maybe my memory is failing, but I thought that sections are unordered sets of records...

@wcawijngaards
Member

No, CNAMEs and DNAMEs are in order, also the RRSIG follows the RRset that it signs. But I have no immediate reference for that. I guess the 4035 or so for RRSIGs, and 1034 or so for CNAME. Do you mean the ordering of NSECs in the authority section? That seems to be unordered. For the additional section, there is a bit of talk about ordering, and also implementation, eg. NSD would order the addresses in a delegation.

@pspacek

pspacek commented Apr 4, 2023

No, CNAMEs and DNAMEs are in order, also the RRSIG follows the RRset that it signs. But I have no immediate reference for that. I guess the 4035 or so for RRSIGs, and 1034 or so for CNAME.

I remain doubtful about illegality.

I've checked RFC 1034 & 1035 briefly and cannot see that it imposes strict order. AFAIK DNAME algorithm is the last update to the canonical algorithm, and https://datatracker.ietf.org/doc/html/rfc6672#section-3.2 says "copy the CNAME RR into the answer section", not "append" or anything else about strict order.

As for RRSIGs, e.g. Knot DNS puts all RRSIGs at the very end of section and to my knowledge nothing complained (yet?).
https://datatracker.ietf.org/doc/html/rfc4035#section-3.1.1 has an explicit SHOULD and not MUST in this regard.

So, I think this is at best under-specified but not outright illegal.

@wcawijngaards
Member

So RFC 1035, 4.1 talks about 'list of concatenated resource records (RRs)'. Not an unordered set, and it also is a list on the wire. Also the query processing algorithm from section 4 in RFC2672 creates the CNAME records in order and puts them in the answer section one after another. And then the answer after that. Also the example, in 1034 3.6.2, shows them in order, and the example in 6.2.7. And RFC1034 4.3.1 says 'The answer to the query, possibly preface by one or more CNAME RRs..'. 6.2.2 says that RRs are not ordered, about the ordering of RRs in an RRset. Also RFC 4035, says 'The name server MUST place the NS RRset before the NSEC RRset and its associated RRSIG RR(s)' in 3.1.4.

@pspacek

pspacek commented Apr 4, 2023

Thank you for your time @wcawijngaards.

First, we can agree to disagree. Now I can see where you are coming from - I was just curious why Unbound is being strict here. The rest of this text is just an attempt to explain why I interpret it differently - feel free to ignore.

So RFC 1035, 4.1 talks about 'list of concatenated resource records (RRs)'. Not an unordered set, and it also is a list on the wire.

That section defines wire format, and honestly it seems like a stretch to say that it defines strict order for the data.

As an extreme example, say that the RFC text was defining the wire format for a "bunch of 32-bit integers" and said "store it as a list of 32-bit big endian integers". Does that serialization format imply that it is a list? Or a set? I think the serialization format does not define that property.
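
A minimal sketch of that thought experiment, with made-up helper names pack_u32s and unpack_u32s: the same collection of integers can be serialized in two different orders, and whether the two messages count as equal depends on how the reader interprets them, not on the wire format.

```python
import struct


def pack_u32s(values):
    """Serialize integers as consecutive 32-bit big endian words."""
    items = list(values)
    return struct.pack(f"!{len(items)}I", *items)


def unpack_u32s(data):
    """Decode consecutive 32-bit big endian words, in wire order."""
    count = len(data) // 4
    return list(struct.unpack(f"!{count}I", data))


wire_a = pack_u32s([3, 1, 2])
wire_b = pack_u32s([1, 2, 3])

# Read as lists, the two messages differ; read as sets, they are the same.
same_as_list = unpack_u32s(wire_a) == unpack_u32s(wire_b)
same_as_set = set(unpack_u32s(wire_a)) == set(unpack_u32s(wire_b))
```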

Also the query processing algorithm from section 4 in RFC2672 creates the CNAME records in order and puts them in the answer section one after another. And then the answer after that.

Here we clearly disagree about the interpretation of the text. RFC 2672 is obsoleted by RFC 6672, but even the original text said copy, not append. IMHO it says the record must be present in the resulting answer section, not in what order.

Also the example, in 1034 3.6.2, shows them in order, and the example in 6.2.7.

Well, I can't see any text around the examples which would impose order - and I think we can agree that example has to provide some ordering when it is written down :-)

And RFC1034 4.3.1 says 'The answer to the query, possibly preface by one or more CNAME RRs..'.

Possibly. I'm not so sure it enforces strict ordering of CNAMEs... The text seems like high-level description and not exact spec, but I take your point.

6.2.2 says that RRs are not ordered, about the ordering of RRs in an RRset.

I don't see it in 6.2.2. Do you mean 6.2.1?

If so, 6.2.1 says:

The difference in ordering of the RRs in the answer section is not significant.

Well, it might be talking only about the example at hand, but it can be also interpreted literally.

Also RFC 4035, says 'The name server MUST place the NS RRset before the NSEC RRset and its associated RRSIG RR(s)' in 3.1.4.

I interpret that as instruction for determining truncation point (= NS has higher priority when TC=1 has to be set), but I agree that this specific case has order specified.

@edmonds
Contributor

edmonds commented Apr 4, 2023

The question of whether there is an ordering between RRsets in the answer sections comes up from time to time. There was a large email thread in dnsop from 2015 here:

https://mailarchive.ietf.org/arch/msg/dnsop/7KoE8Dr-SxuNToskxbvAwJ3BQLQ/

and an attempt at a specification clarification document here:

https://datatracker.ietf.org/doc/html/draft-jabley-dnsop-ordered-answers-00

6 participants