Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
dnsmasq: Add upstream patch fixing SERVFAIL issues with multiple servers
This fixes FS#391 for lede-17.01 Signed-off-by: Baptiste Jonglez <git@bitsofnetworks.org>
- Loading branch information
Showing
1 changed file
with
130 additions
and
0 deletions.
There are no files selected for viewing
130 changes: 130 additions & 0 deletions
130
package/network/services/dnsmasq/patches/000-fix-servfail-handling.patch
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,130 @@ | ||
From 68f6312d4bae30b78daafcd6f51dc441b8685b1e Mon Sep 17 00:00:00 2001 | ||
From: Baptiste Jonglez <git@bitsofnetworks.org> | ||
Date: Mon, 6 Feb 2017 21:09:11 +0000 | ||
Subject: [PATCH] Stop treating SERVFAIL as a successful response from upstream | ||
servers. | ||
|
||
This effectively reverts most of 51967f9807 ("SERVFAIL is an expected | ||
error return, don't try all servers.") and 4ace25c5d6 ("Treat REFUSED (not | ||
SERVFAIL) as an unsuccessful upstream response"). | ||
|
||
With the current behaviour, as soon as dnsmasq receives a SERVFAIL from an | ||
upstream server, it stops trying to resolve the query and simply returns | ||
SERVFAIL to the client. With this commit, dnsmasq will instead try to | ||
query other upstream servers upon receiving a SERVFAIL response. | ||
|
||
According to RFC 1034 and 1035, the semantic of SERVFAIL is that of a | ||
temporary error condition. Recursive resolvers are expected to encounter | ||
network or resources issues from time to time, and will respond with | ||
SERVFAIL in this case. Similarly, if a validating DNSSEC resolver [RFC | ||
4033] encounters issues when checking signatures (unknown signing | ||
algorithm, missing signatures, expired signatures because of a wrong | ||
system clock, etc), it will respond with SERVFAIL. | ||
|
||
Note that all those behaviours are entirely different from a negative | ||
response, which would provide a definite indication that the requested | ||
name does not exist. In our case, if an upstream server responds with | ||
SERVFAIL, another upstream server may well provide a positive answer for | ||
the same query. | ||
|
||
Thus, this commit will increase robustness whenever some upstream servers | ||
encounter temporary issues or are misconfigured. | ||
|
||
Quoting RFC 1034, Section 4.3.1. "Queries and responses": | ||
|
||
If recursive service is requested and available, the recursive response | ||
to a query will be one of the following: | ||
|
||
- The answer to the query, possibly preface by one or more CNAME | ||
RRs that specify aliases encountered on the way to an answer. | ||
|
||
- A name error indicating that the name does not exist. This | ||
may include CNAME RRs that indicate that the original query | ||
name was an alias for a name which does not exist. | ||
|
||
- A temporary error indication. | ||
|
||
Here is Section 5.2.3. of RFC 1034, "Temporary failures": | ||
|
||
In a less than perfect world, all resolvers will occasionally be unable | ||
to resolve a particular request. This condition can be caused by a | ||
resolver which becomes separated from the rest of the network due to a | ||
link failure or gateway problem, or less often by coincident failure or | ||
unavailability of all servers for a particular domain. | ||
|
||
And finally, RFC 1035 specifies RRCODE 2 for this usage, which is now more | ||
widely known as SERVFAIL (RFC 1035, Section 4.1.1. "Header section format"): | ||
|
||
RCODE Response code - this 4 bit field is set as part of | ||
responses. The values have the following | ||
interpretation: | ||
(...) | ||
|
||
2 Server failure - The name server was | ||
unable to process this query due to a | ||
problem with the name server. | ||
|
||
For the DNSSEC-related usage of SERVFAIL, here is RFC 4033 | ||
Section 5. "Scope of the DNSSEC Document Set and Last Hop Issues": | ||
|
||
A validating resolver can determine the following 4 states: | ||
(...) | ||
|
||
Insecure: The validating resolver has a trust anchor, a chain of | ||
trust, and, at some delegation point, signed proof of the | ||
non-existence of a DS record. This indicates that subsequent | ||
branches in the tree are provably insecure. A validating resolver | ||
may have a local policy to mark parts of the domain space as | ||
insecure. | ||
|
||
Bogus: The validating resolver has a trust anchor and a secure | ||
delegation indicating that subsidiary data is signed, but the | ||
response fails to validate for some reason: missing signatures, | ||
expired signatures, signatures with unsupported algorithms, data | ||
missing that the relevant NSEC RR says should be present, and so | ||
forth. | ||
(...) | ||
|
||
This specification only defines how security-aware name servers can | ||
signal non-validating stub resolvers that data was found to be bogus | ||
(using RCODE=2, "Server Failure"; see [RFC4035]). | ||
|
||
Notice the difference between a definite negative answer ("Insecure" | ||
state), and an indefinite error condition ("Bogus" state). The second | ||
type of error may be specific to a recursive resolver, for instance | ||
because its system clock has been incorrectly set, or because it does not | ||
implement newer cryptographic primitives. Another recursive resolver may | ||
succeed for the same query. | ||
|
||
There are other similar situations in which the specified behaviour is | ||
similar to the one implemented by this commit. | ||
|
||
For instance, RFC 2136 specifies the behaviour of a "requestor" that wants | ||
to update a zone using the DNS UPDATE mechanism. The requestor tries to | ||
contact all authoritative name servers for the zone, with the following | ||
behaviour specified in RFC 2136, Section 4: | ||
|
||
4.6. If a response is received whose RCODE is SERVFAIL or NOTIMP, or | ||
if no response is received within an implementation dependent timeout | ||
period, or if an ICMP error is received indicating that the server's | ||
port is unreachable, then the requestor will delete the unusable | ||
server from its internal name server list and try the next one, | ||
repeating until the name server list is empty. If the requestor runs | ||
out of servers to try, an appropriate error will be returned to the | ||
requestor's caller. | ||
--- | ||
src/forward.c | 3 ++- | ||
1 file changed, 2 insertions(+), 1 deletion(-) | ||
|
||
--- a/src/forward.c | ||
+++ b/src/forward.c | ||
@@ -853,7 +853,8 @@ void reply_query(int fd, int family, tim | ||
we get a good reply from another server. Kill it when we've | ||
had replies from all to avoid filling the forwarding table when | ||
everything is broken */ | ||
- if (forward->forwardall == 0 || --forward->forwardall == 1 || RCODE(header) != REFUSED) | ||
+ if (forward->forwardall == 0 || --forward->forwardall == 1 || | ||
+ (RCODE(header) != REFUSED && RCODE(header) != SERVFAIL)) | ||
{ | ||
int check_rebind = 0, no_cache_dnssec = 0, cache_secure = 0, bogusanswer = 0; | ||
|