Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resolved: add an option to disable the stub resolver #4061

Merged
merged 3 commits into from Oct 7, 2016

Conversation

7 participants
@dm0-
Copy link
Contributor

commented Aug 30, 2016

We've had multiple reports of the DNS stub resolver breaking existing DNS server installations that attempt to listen on 0.0.0.0:53. (See coreos/bugs#1545 for one report.)

This patch makes starting the stub resolver optional so that other aspects of systemd-resolved can still be used, e.g. generating resolv.conf from a DHCP response. It defaults to preserving the current behavior and enabling the stub resolver.

@poettering

This comment has been minimized.

Copy link
Member

commented Aug 30, 2016

This request appears very strange to me. The reason we listen on 127.0.0.53:53 is precisely because it doesn't conflict with other daemons listening on port 127.0.0.1:53 or 0.0.0.0:53.

AFAICS there are precisely two modes of operation for DNS servers to listen on port 53:

  1. Listen generically on 0.0.0.0:53 without caring for specific interfaces. resolved listening on 127.0.0.53:53 does not conflict with that at all (try it out: just run nc -4 -l -u 53 on a system where resolved is running).
  2. Enumerate through all interfaces, and listen on all configured IP addresses. This should also not conflict with resolved listening on 127.0.0.53:53, as that IP address is not in fact listed on any local interface as local IP address. We listen on it using the fact that anything routed to the "lo" loopback device is considered local, without the address being actively configured locally.

Mode 2 is what bind does. Other DNS servers do mode 1 above. But what does your DNS server do? What does it do so that in causes a conflict with resolved?

@dm0-

This comment has been minimized.

Copy link
Contributor Author

commented Aug 30, 2016

The following command will fail consistently with EADDRINUSE when systemd-resolved is running.

sudo ncat -4 -l 0.0.0.0 53

It looks like it is just doing:

socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EADDRINUSE (Address already in use)

The linked issue has steps to reproduce with SkyDNS.

Another reported conflict was from running "BIND on all interfaces", but I don't have more information than that at the moment.

@poettering

This comment has been minimized.

Copy link
Member

commented Aug 30, 2016

hmm, i see, so this is not a conflict with the UDP socket, but with the TCP socket... I figure we can resolve this, as we actually don't need the TCP socket IRL...

@poettering

This comment has been minimized.

Copy link
Member

commented Aug 31, 2016

Hmm, so here's what I'd propose: We make this indeed an option as your patch suggests, but beef it up a bit. Instead of being a plain boolean, we turn it into an enum, taking four values: "no", "yes", "tcp-only", "udp-only". Of these "no" would turn off listening on both TCP and UDP. "yes" would turn it off on both, and the other two only on one of the two. We'd then make "udp-only" the default. With that we should get implicit compatibility with any other local DNS servers, as the conflict only arises for the TCP socket, not for the UDP socket, apparently. But at the same time we don't lose the stub functionality by default. For the stub TCP is kinda unnecessary anyway, as most clients have to deal with with servers not supporting TCP anyway, and for local clients the benefits TCP provides are not interesting: we support and advertise huge packets via EDNS0 anyway, as the UDP transport on the local host supports MTUs of 64K just fine.

The enum should be parsed using the "extended bool" semantics we use in systemd at various places. That means, we first try to parse the values as boolean. If that fails we check for the two special additional values. There's already infrastructure to make this easy to make use of via the DEFINE_STRING_TABLE_LOOKUP_WITH_BOOLEAN macro.

@dm0-

This comment has been minimized.

Copy link
Contributor Author

commented Aug 31, 2016

I've updated this PR to implement an option with the four settings you suggested. It's based off how the DNSSEC setting implemented the extended boolean option in the same configuration file, except I didn't add it in the shared directory since no other services use this setting.

I've also added two additional commits. The first just drops an apparently unused prototype, and the second prepares some headers so the new change doesn't break their inherited #include usage. (The latter could possibly be dropped if the new option enum is placed into its own header.)

As a side note: In testing this, SkyDNS does indeed still fail with EADDRINUSE when the DNS stub is only listening on UDP.

@poettering

This comment has been minimized.

Copy link
Member

commented Sep 1, 2016

hmm, any idea why skydns fails then too if "nc -l -u 53" does not?

@dm0-

This comment has been minimized.

Copy link
Contributor Author

commented Sep 1, 2016

The difference seems to be that SkyDNS does not set SO_REUSEADDR.

Digging around shows that this apparently goes back to Go's bundled net package. After a quick look, it seems that they call the socket and bind syscalls in the same socket function without a way to set other socket options in between. They'll actually set SO_REUSEADDR for UDP listeners when it's a multicast address, but 0.0.0.0 does not match the test.

@marineam

This comment has been minimized.

Copy link
Contributor

commented Sep 1, 2016

I dug into the UDP+SO_REUSEADDR to figure out what that actually meant. For TCP SO_REUSEADDR appears to still conflict with sockets in the TCP_LISTEN state. For UDP the SO_REUSEADDR flag on the two sockets will ignore the conflict entirely. But which socket actually receives packets appears to not be well defined, the commit that added support for SO_REUSEPORT back in 3.9 claims:

Note that SO_REUSEADDR already allows multiple UDP sockets to bind to
the same port, however there is no provision to prevent hijacking and
nothing to distribute packets across all the sockets sharing the same
bound port.  This patch does not change the semantics of SO_REUSEADDR,
but provides usable functionality of it for unicast.

torvalds/linux@ba418fa

I'm not sure what "usable functionality" means, possibly a partial chance at random selection between multiple equally weighted listeners? Though in my testing I didn't see that. One always won, perhaps optimizations of SO_REUSEPORT in recent kernels further broke SO_REUSEADDR but I haven't checked. When not balancing and merely routing between 0.0.0.0:53 and 127.0.0.53:53 things work as expected. Additionally I found it interesting in the case of sockets deliberately conflicting with resolved like ncat -u -l 127.0.0.53 53 resolved always won because SO_BINDTODEVICE gives it a slightly higher priority than ncat.

Anyway, so unlike TCP, UDP with SO_REUSEADDR is just messy and confusing. The kernel's prioritization of sockets will appropriately route traffic in this particular scenario but in general I would expect UDP apps to never ever touch SO_REUSEADDR and it surprises me that ncat does. My hunch that the only reason is ncat, and classic nc before it, shares the same code path for both TCP and UDP. Whatever the reason the result is ncat and nc are really bad tools for testing UDP.

So in short, yes 0.0.0.0:53 does and should conflict with 127.0.0.53:53. Using SO_REUSEADDR for UDP requires specific knowledge of what else is on the host to determine if the resulting behavior is safe or not.

@marineam

This comment has been minimized.

Copy link
Contributor

commented Sep 1, 2016

One last thought: 127.0.0.53 not being specifically listed on lo doesn't make it any less of a local address and thus conflicts with other things listening locally on 0.0.0.0. It does mean it won't conflict with things that listen by enumerating addresses and listening on each individually. dnsmasq's --bind-interfaces and --bind-dynamic options will do that but by default it simply uses 0.0.0.0.

@dm0-

This comment has been minimized.

Copy link
Contributor Author

commented Sep 12, 2016

Is there anything else to do for this pull request?

@Habbie

This comment has been minimized.

Copy link

commented Sep 25, 2016

we actually don't need the TCP socket IRL...

what is this based on? How will you deal with big responses?

goto eaddrinuse;
if (r < 0)
return r;
if (m->dns_stub_mode == DNS_STUB_YES || m->dns_stub_mode == DNS_STUB_UDP_ONLY) {

This comment has been minimized.

Copy link
@keszybz

keszybz Oct 7, 2016

Member

if (IN_SET(m->dns_stub_mode, DNS_STUB_YES, DNS_STUB_UDP_ONLY)) {

goto eaddrinuse;
if (r < 0)
return r;
if (m->dns_stub_mode == DNS_STUB_YES || m->dns_stub_mode == DNS_STUB_TCP_ONLY) {

This comment has been minimized.

Copy link
@keszybz

keszybz Oct 7, 2016

Member

Similarly here.

@@ -213,6 +213,18 @@
(such as 127.0.0.1 or ::1), in order to avoid duplicate local caching.</para></listitem>
</varlistentry>

<varlistentry>
<term><varname>DNSStub=</varname></term>
<listitem><para>Takes a boolean argument or one of <literal>udp-only</literal> and <literal>tcp-only</literal>.

This comment has been minimized.

Copy link
@keszybz

keszybz Oct 7, 2016

Member

Please name the options "udp" and "tcp". For example, we have DHCP=ipv4 and DHCP=ipv6, w/o "-only", so it's better to follow convention, and also simpler is better.

I'd also rename the option to DNSStubListener=, after all resolved as a whole is a stub resolver, so "DSNStub" is a bit confusing. @poettering?

This comment has been minimized.

Copy link
@poettering

poettering Oct 7, 2016

Member

yeah, I guess both changes suggested by @keszybz make sense!

@poettering

This comment has been minimized.

Copy link
Member

commented Oct 7, 2016

patch looks great to me. With the minor changes suggested by @keszybz this is good to get merged!

@dm0-

This comment has been minimized.

Copy link
Contributor Author

commented Oct 7, 2016

Okay, I've amended the commit to rename the option to DNSStubListener with tcp and udp settings. I've also renamed the internal symbols so the code matches the user-visible configuration.

@poettering

This comment has been minimized.

Copy link
Member

commented Oct 7, 2016

@martinpitt something appears wrong with the ubuntu CI?

@poettering

This comment has been minimized.

Copy link
Member

commented Oct 7, 2016

the ubuntu CI failures appear unrelated. Merging.

@poettering poettering merged commit 3157b2d into systemd:master Oct 7, 2016

2 of 5 checks passed

xenial-amd64 autopkgtest finished (failure)
Details
xenial-i386 autopkgtest finished (failure)
Details
xenial-s390x autopkgtest finished (failure)
Details
default Build finished.
Details
semaphoreci The build passed on Semaphore.
Details
@evverx

This comment has been minimized.

Copy link
Member

commented Oct 7, 2016

something appears wrong with the ubuntu CI?

dh_install: usr/bin/systemd-mount exists in debian/install/deb but is not installed to anywhere
dh_install: usr/lib/x86_64-linux-gnu/libnss_mymachines.so.2 exists in debian/install/deb but is not installed to anywhere
dh_install: usr/lib/x86_64-linux-gnu/libnss_resolve.so.2 exists in debian/install/deb but is not installed to anywhere
dh_install: usr/lib/x86_64-linux-gnu/libnss_systemd.so.2 exists in debian/install/deb but is not installed to anywhere
dh_install: usr/lib/x86_64-linux-gnu/libnss_myhostname.so.2 exists in debian/install/deb but is not installed to anywhere
dh_install: usr/lib/x86_64-linux-gnu/libudev.so exists in debian/install/deb but is not installed to anywhere
dh_install: usr/lib/x86_64-linux-gnu/libsystemd.so exists in debian/install/deb but is not installed to anywhere
dh_install: missing files, aborting
debian/rules:233: recipe for target 'override_dh_install' failed
make[1]: *** [override_dh_install] Error 2

I guess this PR doesn't contain those commits.

@poettering

This comment has been minimized.

Copy link
Member

commented Oct 7, 2016

@evverx ah, interesting! So the PR was faulty in this regard, but it doesn't matter, as the merged version should have that change now, did I get this right?

@evverx

This comment has been minimized.

Copy link
Member

commented Oct 7, 2016

Yes, the merged version have that change.

but it doesn't matter

Yes. (There is one exception: we didn't run tests :-))

@dm0- dm0- deleted the dm0-:coreos-1545 branch Oct 8, 2016

@fpqc

This comment has been minimized.

Copy link

commented Oct 27, 2016

So hey, it doesn't appear in the documentation where to turn this option on, any info?

@fpqc

This comment has been minimized.

Copy link

commented Oct 28, 2016

@keszybz Yeah I just gave up and decided to use systemd-networkd instead of dnsmasq, it's much better.

Thanks though!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.