resolved: Mitigate DVE-2018-0001, by retrying NXDOMAIN without EDNS0. #8608
So without this, one cannot resolve and auth with the captive portal at Starbucks in the US. They appear to be deploying Aruba Networks captive portals at the moment. Between asserting broken DNS behavior and coffee... coffee wins.
Urks, so this is of course frickin ugly, but I guess if this is what we need to do, this is what we need to do...
OK, sounds conceptually OK, but please make the two indicated changes: drop down to UDP mode in a single jump, please.
And add a check `t->scope->dnssec_mode != DNSSEC_YES`
so that we don't second-guess DNSSEC replies in strict DNSSEC mode.
It kinda sucks that this means in permissive DNSSEC mode we'll never be able to do DNSSEC for NXDOMAIN anymore though.
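The two requested changes can be modelled as a small decision function. This is only a sketch in Python, not systemd's C code: the `FeatureLevel` ladder is deliberately simplified and `level_after_nxdomain` is a hypothetical name. Only the strict-mode check mirrors the `t->scope->dnssec_mode != DNSSEC_YES` condition quoted above.

```python
from enum import Enum, IntEnum


class DnssecMode(Enum):
    # Mirrors the three values of resolved's DNSSEC= setting.
    NO = "no"
    ALLOW_DOWNGRADE = "allow-downgrade"
    YES = "yes"


class FeatureLevel(IntEnum):
    # Simplified, hypothetical ladder; the real one has more steps.
    UDP = 1
    EDNS0 = 2
    DO = 3


def level_after_nxdomain(rcode, current, dnssec_mode):
    """Return the level to retry at, or None to accept the reply as-is."""
    if rcode != "NXDOMAIN":
        return None  # only NXDOMAIN triggers the workaround
    if dnssec_mode is DnssecMode.YES:
        return None  # strict mode: never second-guess the reply
    if current <= FeatureLevel.UDP:
        return None  # already at plain UDP, nothing left to drop
    return FeatureLevel.UDP  # one jump straight down, no intermediate steps
```

With this shape, a permissive-mode lookup that gets NXDOMAIN at the `DO` level is retried exactly once at plain UDP, while a strict-mode lookup keeps the NXDOMAIN it received.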
I've tested this against a DNSSEC-signed domain in permissive mode, and it was asserted correctly. The captive portals in question simply do not know how to rewrite EDNS0/DNSSEC queries and pass them through unmodified. Thus, to access the captive portal, one must request/resolve something in a non-DNSSEC-signed domain, e.g. like […] Thus I'm not sure the extra check […] IMHO we should still assert DNSSEC for NXDOMAIN in permissive mode for DNSSEC-signed domains, and users simply must access something non-DNSSEC-signed to get to the captive portal. Thoughts?
If people enable DNSSEC strict mode they basically say "fuck captive portals". It's a no-compromise mode, enabled by folks who do not want to compromise on security, but captive portals by their nature really are a compromise on security, since they generally mean rewriting DNS and/or HTTP. I don't think anyone is helped if we second-guess NXDOMAIN in strict DNSSEC mode. It sounds like something the security nerds who care about strict DNSSEC mode would just be pissed about, and hence not worth doing. I mean, if you pick DNSSEC strict mode you are in for a hard time anyway... And regular people would never pick DNSSEC strict mode in the first place... In an ideal world, NetworkManager's captive portal detection would tell resolved to turn off DNSSEC on that interface until connectivity is verified, at which point it should be turned on again (if the user said so on that interface).
@xnox: That check is definitely required. Ubuntu bionic seems to use the same patch, and it produces SERVFAIL responses if DNSSEC=yes and the upstream returns an NXDOMAIN. The log is spammed with those retry messages, followed by "DNSSEC validation failed for question test.asdf IN SOA: incompatible-server" for the SOA and DS of every segment which resulted in an NXDOMAIN. See also https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1796501.
Apparently Ubuntu 18 pulled this patch into their build. As far as I can tell, this patch will lead to a reported DNS violation for every request that gets answered with NXDOMAIN, and it will then downgrade from EDNS to plain DNS and try again. Is this correct? I wish there was a solution that is less obnoxious in production, i.e. something that doesn't dump a lot of messages in a log for a failed DNS query.
@ofosos yes, it will retry with reduced "feature level" until plain DNS and then fail with SERVFAIL because it can't do DNSSEC at that level anymore.
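A toy model of the behaviour described here may make the SERVFAIL outcome easier to follow. It is a sketch under stated assumptions, not resolved's implementation: the three-step `LEVELS` list is a simplification, and `resolve_with_stepwise_downgrade` and `query_fn` are hypothetical names.

```python
# Simplified, hypothetical feature-level ladder, most capable first.
LEVELS = ["DO", "EDNS0", "UDP"]


def resolve_with_stepwise_downgrade(query_fn, name, dnssec_strict):
    """Retry on NXDOMAIN one level at a time; return (final rcode, attempts)."""
    attempts = []
    for level in LEVELS:
        rcode = query_fn(name, level)
        attempts.append((level, rcode))
        if rcode != "NXDOMAIN":
            break  # anything other than NXDOMAIN stops the downgrade
    # In strict mode a reply obtained below the DO level cannot be
    # validated, so the lookup is reported as failed instead.
    if dnssec_strict and level != "DO":
        return "SERVFAIL", attempts
    return rcode, attempts


def honest_server(name, level):
    # The name really does not exist, whatever the feature level.
    return "NXDOMAIN"


def broken_portal(name, level):
    # Only answers usefully when the query carries no EDNS0 OPT record.
    return "NOERROR" if level == "UDP" else "NXDOMAIN"
```

Against `honest_server` with `dnssec_strict=True`, every lookup for a nonexistent name costs three queries and ends in SERVFAIL, which is the log spam and failure mode reported above. Against `broken_portal` in non-strict mode, the third attempt succeeds.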
Some captive portals lie and do not respond with the captive portal IP address if the query is sent with EDNS0 enabled and the DO bit set to zero. Thus retry all domain name lookups with less secure methods upon NXDOMAIN, unless strict DNSSEC validation is enabled.

Bug-Ubuntu: https://bugs.launchpad.net/ubuntu/bionic/+source/systemd/+bug/1766969
Bug-Ubuntu: https://bugs.launchpad.net/ubuntu/bionic/+source/systemd/+bug/1727237
Bug-DNS: https://github.com/dns-violations/dns-violations/blob/master/2018/DVE-2018-0001.md
(cherry picked from commit cc0a0eb)
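For readers unfamiliar with the wire format, the difference between a plain query and one "with EDNS0 enabled and DO bit set to zero" is a single OPT pseudo-record in the additional section. The sketch below builds both variants by hand; `build_query` and `encode_name` are hypothetical helper names, and the 1232-byte advertised payload size is an arbitrary assumption.

```python
import struct


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, qid=0x1234, edns0=False, do_bit=False, payload_size=1232):
    """Build a DNS query for an A record, optionally with an EDNS0 OPT record."""
    arcount = 1 if edns0 else 0
    # Header: ID, flags (RD=1), QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, arcount)
    # Question: name, QTYPE=A (1), QCLASS=IN (1).
    question = encode_name(name) + struct.pack("!HH", 1, 1)
    if not edns0:
        return header + question
    # OPT record: root name, TYPE=41, CLASS carries the UDP payload size,
    # TTL carries extended RCODE, version and flags (DO is the top flag bit),
    # followed by RDLENGTH=0.
    flags = 0x8000 if do_bit else 0
    opt = b"\x00" + struct.pack("!HHBBHH", 41, payload_size, 0, 0, flags, 0)
    return header + question + opt
```

The EDNS0 variant is 11 bytes longer than the plain one. Retrying "without EDNS0" means resending the same question without those trailing bytes and with ARCOUNT set back to zero.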
@poettering requested changes are now done. Drops straight to UDP, and doesn't do anything if strict DNSSEC mode is on.
+1 to updated patch - definite improvement. I am curious if we could restrict how often it runs more - say only on a new network, or special case something else about arubanetworks. To that end, was arubanetworks.com contacted? Possible fixes on their end? Happy to reach out if not.
Can anyone rerun the autopkgtests? I'm pretty sure the failure was unrelated to the PR.
I've been running this updated patch in the Ubuntu 19.10 dev release with my PPA: https://launchpad.net/~bryanquigley/+archive/ubuntu/1796501/+packages Everything seems to be working fine (DNSSEC=yes for me). I did report the issue to one of Starbucks' vendors, but have not heard back yet.
This patch has been included in Ubuntu for a while, and the log messages have widely been reported as annoying: […] And since its assumption (DNS violation) is actually incorrect in the vast majority of cases (i.e. for everyone not using a broken captive portal), it's highly misleading. Additionally, the DNS retry appears to be causing other problems with delays in DNS resolution: […] I haven't yet had time to look at a proper way of working around broken captive portals, but I would not recommend applying this patch in its current form.
This is an updated version of systemd#8608 with more restrictive logic. To quote the original bug: Some captive portals lie and do not respond with the captive portal IP address if the query is sent with EDNS0 enabled and the DO bit set to zero. Thus retry "secure" domain name lookups with less secure methods upon NXDOMAIN. https://github.com/dns-violations/dns-violations/blob/master/2018/DVE-2018-0001.md Yes, this fix sucks hard, but I guess this is what we need to do to make sure resolved works IRL. Replaces: systemd#8608
I included a forward-port of this (quite reworked) in #17535. Let's close this one.
This is an updated version of systemd#8608 with more restrictive logic. To quote the original bug: Some captive portals lie and do not respond with the captive portal IP address if the query is sent with EDNS0 enabled and the DO bit set to zero. Thus retry "secure" domain name lookups with less secure methods upon NXDOMAIN. https://github.com/dns-violations/dns-violations/blob/master/2018/DVE-2018-0001.md Yes, this fix sucks hard, but I guess this is what we need to do to make sure resolved works IRL. Heavily based on the original patch from Dimitri John Ledkov, and I copied the commentary verbatim. Replaces: systemd#8608
I am at the point of dropping this patch myself, because the internet is a vile place, and neither this, nor captive portals, nor DNSSEC work at all =( and I am sad. Let me check your draft PR to see what you are doing there.
Anyone using the old 18.04 release will face this. In the background it creates issues for kube-proxy, which internally uses iptables and depends on the underlying networking of the OS. Fix which worked for me: […]
This sounds odd, as a new enough kube-proxy knows how to look for and use […]
@Rajpratik71 also, this issue by itself shouldn't actually be causing any problems for kube-proxy... so please do explain what you are observing. Also note that the excessive error messages were fixed in the 18.04 release recently, see https://launchpad.net/ubuntu/+source/systemd/237-3ubuntu10.45. Have you upgraded to the latest systemd?
Calico (v3.18.0) as CNI, which depends on kube-proxy, is unable to forward traffic: services running as NodePort and ClusterIP are inaccessible even from inside the cluster. Internal communication between pods happens using DNS. This issue is consistent on AWS EC2, while the same configuration/combination of services works fine in a bare-metal environment.
This sounds strange. I will ping people familiar with Kubernetes & kube-proxy. Kubernetes itself uses the systemd-resolved resolv.conf directly (https://github.com/kubernetes/kubernetes/blob/76f2a4d5fd8364edbb31a3611178c918644f415c/cmd/kubeadm/app/componentconfigs/kubelet.go#L59) when it is active, to access the underlying resolvers to pass through to Kubernetes. And so should kube-proxy... The solution of changing /etc/resolv.conf to point at /run/systemd/resolve/resolv.conf instead of the stub is correct for k8s-only deployments. But I thought we fixed it all a year ago, see kubernetes/kubernetes@28b9a4e. I will check with the maintainers of microk8s & the Canonical distribution of k8s in AWS w.r.t. this. Do use the mitigation via changing the symlink for now.
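Before applying the symlink mitigation it can help to check which file `/etc/resolv.conf` currently points at. The following is a minimal sketch: the two `/run/systemd/resolve/` paths are the ones systemd-resolved provides, while the function names and the `"stub"`/`"uplink"`/`"other"` labels are made up for illustration.

```python
import os
import posixpath

# Stub file: lists only the local 127.0.0.53 stub listener.
STUB = "/run/systemd/resolve/stub-resolv.conf"
# Uplink file: lists the actual upstream DNS servers known to resolved.
UPLINK = "/run/systemd/resolve/resolv.conf"


def classify_link(link_path, target):
    """Classify a symlink target, resolving it relative to the link's directory."""
    resolved = posixpath.normpath(
        posixpath.join(posixpath.dirname(link_path), target)
    )
    if resolved == STUB:
        return "stub"
    if resolved == UPLINK:
        return "uplink"
    return "other"


def classify_resolv_conf(path="/etc/resolv.conf"):
    """Inspect the real file system; a regular file counts as 'other'."""
    if not os.path.islink(path):
        return "other"
    return classify_link(path, os.readlink(path))
```

A result of `"stub"` means lookups from that host go through the resolved stub listener and are therefore subject to the retry behaviour discussed in this thread; `"uplink"` means the mitigation is already in place.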
Some captive portals lie and do not respond with the captive portal IP
address if the query is sent with EDNS0 enabled and the DO bit set to zero. Thus retry
"secure" domain name lookups with less secure methods upon NXDOMAIN.
Bug-Ubuntu: https://bugs.launchpad.net/ubuntu/bionic/+source/systemd/+bug/1727237
Bug-DNS: https://github.com/dns-violations/dns-violations/blob/master/2018/DVE-2018-0001.md