
systemd-resolved stops resolving after some time with "DNSSEC validation failed[...]: incompatible-server" #6490

Closed
cstrotm opened this issue Jul 31, 2017 · 42 comments

Comments

@cstrotm (Contributor) commented Jul 31, 2017

Submission type

  • Bug report

systemd version the issue has been seen with

233

Used distribution

Fedora 26

In case of bug report: Expected behaviour you didn't see

systemd-resolved works for about 20 minutes without issues; after that it stops resolving, complaining in the journal about "incompatible-server". A restart of systemd-resolved solves the issue for another 20-30 minutes:

systemd-resolved[903]: DNSSEC validation failed for question denic.de IN AAAA: incompatible-server

In case of bug report: Unexpected behaviour you saw

systemd-resolved[903]: DNSSEC validation failed for question denic.de IN AAAA: incompatible-server

In case of bug report: Steps to reproduce the problem

Set up systemd-resolved as the system resolver with DNSSEC validation enabled:

[Resolve]
DNS=172.22.1.1
LLMNR=yes
DNSSEC=yes
Cache=yes

The upstream resolver is a Knot-DNS resolver.

What is the best way to debug this?

@poettering (Member) commented Jul 31, 2017

It appears that resolved comes to the conclusion that your DNS server doesn't properly do DNSSEC. Please run systemd-resolved with the "SYSTEMD_LOG_LEVEL=debug" env var (by doing systemctl edit systemd-resolved, adding the two lines [Service] and Environment=SYSTEMD_LOG_LEVEL=debug, and then running systemctl restart systemd-resolved) to see in the logs why it comes to that conclusion.
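The drop-in Lennart describes can be sketched as a small shell sequence. The file name override.conf and the scratch-root default are my own choices so the commands can be dry-run safely; on the real system you would use ROOT=/ and run as root:

```shell
# Equivalent of `systemctl edit systemd-resolved`: create a drop-in that sets
# the debug log level. ROOT defaults to a scratch directory for a safe dry run.
ROOT="${ROOT:-$(mktemp -d)}"
DROPIN_DIR="$ROOT/etc/systemd/system/systemd-resolved.service.d"
mkdir -p "$DROPIN_DIR"
cat > "$DROPIN_DIR/override.conf" <<'EOF'
[Service]
Environment=SYSTEMD_LOG_LEVEL=debug
EOF
cat "$DROPIN_DIR/override.conf"
# On the real system, apply it with:
#   systemctl daemon-reload && systemctl restart systemd-resolved
```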

@poettering added the needs-reporter-feedback and resolve labels on Jul 31, 2017
@cstrotm (Contributor, Author) commented Jul 31, 2017

I'll add the debug log level and will report my findings. Thanks

@cstrotm (Contributor, Author) commented Aug 3, 2017

With debug logging enabled, the issue appears much less frequently, but it happened twice this morning.

The journal shows:

Using degraded feature set (UDP+EDNS0+DO) for DNS server 172.22.1.1.
[...]
DNSSEC validation failed for question detectportal.firefox.com IN A: incompatible-server

The systemd-resolved unit contains:

[Service]
Environment=SYSTEMD_LOG_LEVEL=debug

and the environment variable is active:

$ sudo systemctl show systemd-resolved | grep debug
Environment=SYSTEMD_LOG_LEVEL=debug

but the journal does not contain a trace of the DNS resolution.

Anything I might be doing wrong?

@poettering (Member) commented Aug 7, 2017

Any chance you can provide a longer log excerpt around the lookup?

@cstrotm (Contributor, Author) commented Aug 7, 2017

Hello Lennart,

Sure, here is a full session from a few days ago (with the debug logging environment variable present as discussed above, but the journal log does not show any additional information):

Aug 04 14:32:52 thinkcentreM900.home.strotmann.de systemd[1]: Started Network Name Resolution.
Aug 04 14:32:54 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: Switching to system DNS server 172.22.1.1.
Aug 04 14:32:57 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: Using degraded feature set (UDP+EDNS0+DO) for DNS server 172.22.1.1.
Aug 04 14:33:04 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question www.dragonflybsd.org.home.strotmann.de IN AAAA: no-signature
Aug 04 14:33:04 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question www.dragonflybsd.org.home.strotmann.de IN A: no-signature
Aug 04 14:33:04 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question www.dragonflybsd.org.home.strotmann.de IN A: no-signature
Aug 04 14:33:04 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question www.dragonflybsd.org.home.strotmann.de IN AAAA: no-signature
Aug 04 14:33:08 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question www.dragonflybsd.org IN AAAA: incompatible-server
Aug 04 14:33:08 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question www.dragonflybsd.org IN A: incompatible-server
Aug 04 14:34:31 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question 123.1.42.172.in-addr.arpa IN PTR: incompatible-server
Aug 04 14:34:40 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question 1.122.168.192.in-addr.arpa IN PTR: incompatible-server
Aug 04 14:34:40 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: Got LLMNR TCP packet on unknown scope. Ignoring.
Aug 04 14:34:40 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question 226.128.15.51.in-addr.arpa IN PTR: incompatible-server
Aug 04 14:34:41 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: Got mDNS UDP packet on unknown scope. Ignoring.
Aug 04 14:34:43 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question 65.157.161.35.in-addr.arpa IN PTR: incompatible-server
Aug 04 14:34:43 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question 212.109.45.5.in-addr.arpa IN PTR: incompatible-server
Aug 04 14:34:44 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question pbs.twimg.com IN A: incompatible-server
Aug 04 14:34:44 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question pbs.twimg.com IN AAAA: incompatible-server
Aug 04 14:34:44 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question pbs.twimg.com IN AAAA: incompatible-server
Aug 04 14:34:47 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question pbs.twimg.com IN A: incompatible-server
Aug 04 14:34:47 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question pbs.twimg.com.home.strotmann.de IN A: incompatible-server
Aug 04 14:34:47 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question pbs.twimg.com.home.strotmann.de IN AAAA: incompatible-server
Aug 04 14:34:47 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question pbs.twimg.com.home.strotmann.de IN A: incompatible-server
Aug 04 14:34:47 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question pbs.twimg.com.home.strotmann.de IN AAAA: incompatible-server
Aug 04 14:34:53 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question 1.42.244.104.in-addr.arpa IN PTR: incompatible-server
Aug 04 14:34:53 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question 192.174.198.91.in-addr.arpa IN PTR: incompatible-server
Aug 04 14:34:53 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question 208.174.198.91.in-addr.arpa IN PTR: incompatible-server
Aug 04 14:34:53 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question 125.253.30.192.in-addr.arpa IN PTR: incompatible-server
Aug 04 14:34:54 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question 123.1.42.172.in-addr.arpa IN PTR: incompatible-server
Aug 04 14:35:30 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: Got mDNS UDP packet on unknown scope. Ignoring.
Aug 04 14:36:43 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question fedoraproject.org IN A: incompatible-server
Aug 04 14:36:43 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question fedoraproject.org IN AAAA: incompatible-server
Aug 04 14:36:43 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question fedoraproject.org IN A: incompatible-server
Aug 04 14:36:43 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question fedoraproject.org IN AAAA: incompatible-server
Aug 04 14:36:43 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question fedoraproject.org.home.strotmann.de IN A: incompatible-server
Aug 04 14:36:43 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question fedoraproject.org.home.strotmann.de IN AAAA: incompatible-server
Aug 04 14:36:43 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question fedoraproject.org.home.strotmann.de IN A: incompatible-server
Aug 04 14:36:43 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: DNSSEC validation failed for question fedoraproject.org.home.strotmann.de IN AAAA: incompatible-server
Aug 04 14:39:35 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: Got mDNS UDP packet on unknown scope. Ignoring.
Aug 04 14:39:35 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: Got mDNS UDP packet on unknown scope. Ignoring.
Aug 04 14:41:43 thinkcentreM900.home.strotmann.de systemd-resolved[3285]: Grace period over, resuming full feature set (UDP+EDNS0+DO+LARGE) for DNS server 172.22.1.1.
Aug 04 14:46:01 thinkcentreM900.home.strotmann.de systemd[1]: Stopping Network Name Resolution...
Aug 04 14:46:01 thinkcentreM900.home.strotmann.de systemd[1]: Stopped Network Name Resolution.
Aug 04 14:46:01 thinkcentreM900.home.strotmann.de systemd[1]: Starting Network Name Resolution...
Aug 04 14:46:01 thinkcentreM900.home.strotmann.de systemd-resolved[3436]: Positive Trust Anchors:
Aug 04 14:46:01 thinkcentreM900.home.strotmann.de systemd-resolved[3436]: . IN DS 19036 8 2 49aac11d7b6f6446702e54a1607371607a1a41855200fd2ce1cdde32f24e8fb5
Aug 04 14:46:01 thinkcentreM900.home.strotmann.de systemd-resolved[3436]: . IN DS 20326 8 2 e06d44b80b8f1d39a95c0b0d7c65d08458e880409bbc683457104237c7f8ec8d
Aug 04 14:46:01 thinkcentreM900.home.strotmann.de systemd-resolved[3436]: Negative trust anchors: 10.in-addr.arpa 16.172.in-addr.arpa 17.172.in-addr.arpa 18.172.in-addr.arpa 19.172.in-addr.arpa 20.172.in-addr.arpa 21.172.in-addr.arpa 22.172.in-addr.arpa 2
Aug 04 14:46:01 thinkcentreM900.home.strotmann.de systemd-resolved[3436]: Using system hostname 'thinkcentreM900.home.strotmann.de'.
Aug 04 14:46:01 thinkcentreM900.home.strotmann.de systemd[1]: Started Network Name Resolution.
Aug 04 14:46:43 thinkcentreM900.home.strotmann.de systemd-resolved[3436]: Switching to system DNS server 172.22.1.1.

At 14:46:01 I restarted systemd-resolved, and the previously failing domains started to resolve without issues again.

@poettering (Member) commented Dec 8, 2017

Hmm, is this still reproducible with current systemd? If so, the logs shown already state that a degraded DNS feature level is used for the server. resolved initially uses the most powerful feature level, and then downgrades step by step if it notices that its requests don't work. Now, given that you are already one level down, something must have happened earlier that made resolved think the server is bad. Hence, if you can reproduce this with v235, any chance you can trigger it again and then look for messages before the failure that show when and why resolved decided to downgrade?
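One way to spot those earlier downgrade messages is to filter the journal for feature-level transitions. A minimal sketch, using sample lines quoted in this issue in place of real `journalctl -u systemd-resolved` output:

```shell
# Filter the feature-level trail out of a journal excerpt; the variable stands
# in for `journalctl -u systemd-resolved --since today` on a live system.
journal='Switching to system DNS server 172.22.1.1.
Using degraded feature set (UDP+EDNS0+DO) for DNS server 172.22.1.1.
DNSSEC validation failed for question denic.de IN AAAA: incompatible-server
Grace period over, resuming full feature set (UDP+EDNS0+DO+LARGE) for DNS server 172.22.1.1.'
printf '%s\n' "$journal" | grep -E 'feature set|feature level|incompatible-server'
```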

@ott (Contributor) commented Dec 12, 2017

Yes, the problem is reproducible with current systemd:

Looking up RR for 0.debian.pool.ntp.org IN A.
Looking up RR for 0.debian.pool.ntp.org IN AAAA.
Sent message type=method_call sender=n/a destination=org.freedesktop.DBus object=/org/freedesktop/DBus interface=org.freedesktop.DBus member=AddMatch cookie=4 reply_cookie=0 error-name=n/a error-message=n/a
Got message type=method_return sender=org.freedesktop.DBus destination=:1.405 object=n/a interface=n/a member=n/a cookie=6 reply_cookie=4 error-name=n/a error-message=n/a
Sent message type=method_call sender=n/a destination=org.freedesktop.DBus object=/org/freedesktop/DBus interface=org.freedesktop.DBus member=GetNameOwner cookie=5 reply_cookie=0 error-name=n/a error-message=n/a
Got message type=method_return sender=org.freedesktop.DBus destination=:1.405 object=n/a interface=n/a member=n/a cookie=7 reply_cookie=5 error-name=n/a error-message=n/a
Cache miss for 0.debian.pool.ntp.org IN A
Transaction 62314 for <0.debian.pool.ntp.org IN A> scope dns on /.
Using feature level UDP+EDNS0+DO+LARGE for transaction 62314.
Using DNS server 192.0.2.1 for transaction 62314.
Sending query packet with id 62314.
Cache miss for 0.debian.pool.ntp.org IN AAAA
Transaction 46571 for <0.debian.pool.ntp.org IN AAAA> scope dns on /.
Using feature level UDP+EDNS0+DO+LARGE for transaction 46571.
Using DNS server 192.0.2.1 for transaction 46571.
Sending query packet with id 46571.
Processing incoming packet on transaction 46571. (rcode=SUCCESS)
Verified we get a response at feature level UDP+EDNS0+DO from DNS server 192.0.2.1.
Requesting SOA to validate transaction 46571 (0.debian.pool.ntp.org, unsigned empty non-SOA/NS/DS response).
Cache miss for 0.debian.pool.ntp.org IN SOA
Transaction 51946 for <0.debian.pool.ntp.org IN SOA> scope dns on /.
Using feature level UDP+EDNS0+DO+LARGE for transaction 51946.
Using DNS server 192.0.2.1 for transaction 51946.
Sending query packet with id 51946.
Processing incoming packet on transaction 62314. (rcode=SUCCESS)
Requesting SOA to validate transaction 62314 (0.debian.pool.ntp.org, unsigned non-SOA/NS RRset <0.debian.pool.ntp.org IN A 5.196.192.58>).
Requesting SOA to validate transaction 62314 (0.debian.pool.ntp.org, unsigned non-SOA/NS RRset <0.debian.pool.ntp.org IN A 148.251.154.36>).
Requesting SOA to validate transaction 62314 (0.debian.pool.ntp.org, unsigned non-SOA/NS RRset <0.debian.pool.ntp.org IN A 5.9.38.226>).
Requesting SOA to validate transaction 62314 (0.debian.pool.ntp.org, unsigned non-SOA/NS RRset <0.debian.pool.ntp.org IN A 212.18.3.19>).
Processing incoming packet on transaction 51946. (rcode=SUCCESS)
Requesting DS to validate transaction 51946 (pool.ntp.org, unsigned SOA/NS RRset).
Cache miss for pool.ntp.org IN DS
Transaction 51242 for <pool.ntp.org IN DS> scope dns on /.
Using feature level UDP+EDNS0+DO+LARGE for transaction 51242.
Using DNS server 192.0.2.1 for transaction 51242.
Sending query packet with id 51242.
Requesting DS to validate transaction 51946 (0.debian.pool.ntp.org, unsigned empty SOA/NS response).
Cache miss for 0.debian.pool.ntp.org IN DS
Transaction 40019 for <0.debian.pool.ntp.org IN DS> scope dns on /.
Using feature level UDP+EDNS0+DO+LARGE for transaction 40019.
Using DNS server 192.0.2.1 for transaction 40019.
Sending query packet with id 40019.
Processing incoming packet on transaction 51242. (rcode=SUCCESS)
Requesting DS to validate transaction 51242 (ntp.org, unsigned SOA/NS RRset).
Cache miss for ntp.org IN DS
Transaction 16040 for <ntp.org IN DS> scope dns on /.
Using feature level UDP+EDNS0+DO+LARGE for transaction 16040.
Using DNS server 192.0.2.1 for transaction 16040.
Sending query packet with id 16040.
Requesting parent SOA to validate transaction 51242 (pool.ntp.org, unsigned empty DS response).
Cache miss for ntp.org IN SOA
Transaction 30353 for <ntp.org IN SOA> scope dns on /.
Using feature level UDP+EDNS0+DO+LARGE for transaction 30353.
Using DNS server 192.0.2.1 for transaction 30353.
Sending query packet with id 30353.
Processing incoming packet on transaction 16040. (rcode=SUCCESS)
Requesting DNSKEY to validate transaction 16040 (org, RRSIG with key tag: 1862).
Cache miss for org IN DNSKEY
Transaction 24794 for <org IN DNSKEY> scope dns on /.
Using feature level UDP+EDNS0+DO+LARGE for transaction 24794.
Using DNS server 192.0.2.1 for transaction 24794.
Sending query packet with id 24794.
Requesting DNSKEY to validate transaction 16040 (d7fdd278p5up3itk58hk4vor3le6df4s.org, RRSIG with key tag: 1862).
Requesting DNSKEY to validate transaction 16040 (h9p7u7tr2u91d0v0ljs9l1gidnp90u3h.org, RRSIG with key tag: 1862).
Processing incoming packet on transaction 30353. (rcode=SUCCESS)
Requesting DS to validate transaction 30353 (ntp.org, unsigned SOA/NS RRset).
Processing incoming packet on transaction 24794. (rcode=SUCCESS)
Requesting DS to validate transaction 24794 (org, DNSKEY with key tag: 9795).
Cache miss for org IN DS
Transaction 14213 for <org IN DS> scope dns on /.
Using feature level UDP+EDNS0+DO+LARGE for transaction 14213.
Using DNS server 192.0.2.1 for transaction 14213.
Sending query packet with id 14213.
Requesting DS to validate transaction 24794 (org, DNSKEY with key tag: 1862).
Requesting DS to validate transaction 24794 (org, DNSKEY with key tag: 6368).
Requesting DS to validate transaction 24794 (org, DNSKEY with key tag: 17883).
Processing incoming packet on transaction 14213. (rcode=SUCCESS)
Requesting DNSKEY to validate transaction 14213 (org, RRSIG with key tag: 46809).
Cache miss for . IN DNSKEY
Transaction 17231 for <. IN DNSKEY> scope dns on /.
Using feature level UDP+EDNS0+DO+LARGE for transaction 17231.
Using DNS server 192.0.2.1 for transaction 17231.
Sending query packet with id 17231.
Processing incoming packet on transaction 17231. (rcode=SUCCESS)
Requesting DS to validate transaction 17231 (., DNSKEY with key tag: 20326).
Requesting DS to validate transaction 17231 (., DNSKEY with key tag: 19036).
Requesting DS to validate transaction 17231 (., DNSKEY with key tag: 46809).
Validating response from transaction 17231 (. IN DNSKEY).
Looking at . IN DNSKEY 257 3 RSASHA256 AwEAAaz/tAm8yTn4Mfeh5eyI96WSVexTBAvkMgJzkKTOiW1vkIbzxeF3+/4RgWOq7HrxRixHlFlEx
OLAJr5emLvN7SWXgnLh4+B5xQlNVz8Og8kvArMtNROxVQuCaSnIDdD5LKyWbRd2n9WGe2R8PzgCmr
3EgVLrjyBxWezF0jLHwVN8efS3rCj/EWgvIWgb9tarpVUDK/b58Da+sqqls3eNbuv7pr+eoZG+SrD
K6nWeL3c6H5Apxz7LjVc1uTIdsIXxuOLYA4/ilBmSVIzuDWfdRUfhHdY6+cn8HFRm+2hM8AnXGXws
9555KrUB5qihylGa8subX2Nn6UwNR1AkUTV74bU=
-- Flags: SEP ZONE_KEY
-- Key tag: 20326: validated
Found verdict for lookup . IN DNSKEY: secure
Added positive authenticated cache entry for . IN DNSKEY 5835s on */INET/192.0.2.1
Added positive authenticated cache entry for . IN DNSKEY 5835s on */INET/192.0.2.1
Added positive authenticated cache entry for . IN DNSKEY 5835s on */INET/192.0.2.1
Transaction 17231 for <. IN DNSKEY> on scope dns on / now complete with from network (authenticated).
Validating response from transaction 14213 (org IN DS).
Looking at org IN DS 9795 7 1 364dfab3daf254cab477b5675b10766ddaa24982: validated
Found verdict for lookup org IN DS: secure
Added positive authenticated cache entry for org IN DS 6331s on */INET/192.0.2.1
Added positive authenticated cache entry for org IN DS 6331s on */INET/192.0.2.1
Transaction 14213 for <org IN DS> on scope dns on / now complete with from network (authenticated).
Validating response from transaction 24794 (org IN DNSKEY).
Looking at org IN DNSKEY 257 3 RSASHA1-NSEC3-SHA1 AwEAAZTjbIO5kIpxWUtyXc8avsKyHIIZ+LjC2Dv8naO+Tz6X2fqzDC1bdq7HlZwtka
qTkMVVJ+8gE9FIreGJ4c8G1GdbjQgbP1OyYIG7OHTc4hv5T2NlyWr6k6QFz98Q4zwF
IGTFVvwBhmrMDYsOTtXakK6QwHovA1+83BsUACxlidpwB0hQacbD6x+I2RCDzYuTzj
64Jv0/9XsX6AYV3ebcgn4hL1jIR2eJYyXlrAoWxdzxcW//5yeL5RVWuhRxejmnSVnC
uxkfS4AQ485KH2tpdbWcCopLJZs6tw8q3jWcpTGzdh/v3xdYfNpQNcPImFlxAun3Bt
ORPA2r8ti6MNoJEHU=
-- Flags: SEP ZONE_KEY
-- Key tag: 9795: validated
Found verdict for lookup org IN DNSKEY: secure
Added positive authenticated cache entry for org IN DNSKEY 900s on */INET/192.0.2.1
Added positive authenticated cache entry for org IN DNSKEY 900s on */INET/192.0.2.1
Added positive authenticated cache entry for org IN DNSKEY 900s on */INET/192.0.2.1
Added positive authenticated cache entry for org IN DNSKEY 900s on */INET/192.0.2.1
Transaction 24794 for <org IN DNSKEY> on scope dns on / now complete with from network (authenticated).
Validating response from transaction 16040 (ntp.org IN DS).
Looking at d7fdd278p5up3itk58hk4vor3le6df4s.org IN NSEC3 1 1 1 d399eaab D7G4M04B3TRTAB4IVM3SEUCUT3FOTVPB ( ): validated
Found verdict for lookup d7fdd278p5up3itk58hk4vor3le6df4s.org IN NSEC3: secure
Looking at h9p7u7tr2u91d0v0ljs9l1gidnp90u3h.org IN NSEC3 1 1 1 d399eaab H9PARR669T6U8O1GSG9E1LMITK4DEM0T ( NS SOA RRSIG DNSKEY NSEC3PARAM ): validated
Found verdict for lookup h9p7u7tr2u91d0v0ljs9l1gidnp90u3h.org IN NSEC3: secure
Looking at org IN SOA a0.org.afilias-nst.info noc.afilias-nst.info 2012778125 1800 900 604800 86400: validated
Found verdict for lookup org IN SOA: secure
Data is NSEC3 opt-out via NSEC/NSEC3 for transaction 16040 (ntp.org IN DS)
Found verdict for lookup ntp.org IN DS: insecure
Added positive authenticated cache entry for d7fdd278p5up3itk58hk4vor3le6df4s.org IN NSEC3 31s on */INET/192.0.2.1
Added positive authenticated cache entry for h9p7u7tr2u91d0v0ljs9l1gidnp90u3h.org IN NSEC3 31s on */INET/192.0.2.1
Added positive authenticated cache entry for org IN SOA 31s on */INET/192.0.2.1
Added NODATA cache entry for ntp.org IN DS 31s
Transaction 16040 for <ntp.org IN DS> on scope dns on / now complete with from network (unsigned).
Validating response from transaction 30353 (ntp.org IN SOA).
Looking at ntp.org IN SOA ns1.everett.org postmaster.ntp.org 2017121100 1860 780 1209600 82800: no-signature
Found verdict for lookup ntp.org IN SOA: insecure
Looking at ntp.org IN NS ns4.p20.dynect.net: no-signature
Found verdict for lookup ntp.org IN NS: insecure
Looking at ns1.p20.dynect.net IN A 208.78.70.20: no-signature
Looking at ns1.p20.dynect.net IN AAAA 2001:500:90:1::20: no-signature
Looking at ns2.p20.dynect.net IN A 204.13.250.20: no-signature
Looking at ns3.p20.dynect.net IN A 208.78.71.20: no-signature
Looking at ns3.p20.dynect.net IN AAAA 2001:500:94:1::20: no-signature
Looking at ns4.p20.dynect.net IN A 204.13.251.20: no-signature
Looking at dns1.udel.edu IN A 128.175.13.16: no-signature
Looking at dns2.udel.edu IN A 128.175.13.17: no-signature
Looking at anyns.pch.net IN A 204.61.216.4: missing-key
Looking at anyns.pch.net IN AAAA 2001:500:14:6004:ad::1: missing-key
Added positive unauthenticated cache entry for ntp.org IN SOA 3486s on */INET/192.0.2.1
Transaction 30353 for <ntp.org IN SOA> on scope dns on / now complete with from network (unsigned).
Validating response from transaction 51242 (pool.ntp.org IN DS).
Looking at ntp.org IN SOA ns1.everett.org postmaster.ntp.org 2017121100 1860 780 1209600 82800: no-signature
Found verdict for lookup ntp.org IN SOA: insecure
Found verdict for lookup pool.ntp.org IN DS: insecure
Added NODATA cache entry for pool.ntp.org IN DS 3486s
Transaction 51242 for <pool.ntp.org IN DS> on scope dns on / now complete with from network (unsigned).
Timeout reached on transaction 40019.
Retrying transaction 40019.
Cache miss for 0.debian.pool.ntp.org IN DS
Transaction 40019 for <0.debian.pool.ntp.org IN DS> scope dns on /.
Using feature level UDP+EDNS0+DO+LARGE for transaction 40019.
Sending query packet with id 40019.
Timeout reached on transaction 40019.
Retrying transaction 40019.
Cache miss for 0.debian.pool.ntp.org IN DS
Transaction 40019 for <0.debian.pool.ntp.org IN DS> scope dns on /.
Using feature level UDP+EDNS0+DO+LARGE for transaction 40019.
Sending query packet with id 40019.
Timeout reached on transaction 40019.
Retrying transaction 40019.
Cache miss for 0.debian.pool.ntp.org IN DS
Transaction 40019 for <0.debian.pool.ntp.org IN DS> scope dns on /.
Lost too many UDP packets, downgrading feature level...
Using degraded feature set (UDP+EDNS0+DO) for DNS server 192.0.2.1.
Using feature level UDP+EDNS0+DO for transaction 40019.
Sending query packet with id 40019.
Processing incoming packet on transaction 40019. (rcode=SERVFAIL)
Server returned error SERVFAIL, retrying transaction with reduced feature level UDP+EDNS0.
Retrying transaction 40019.
Cache miss for 0.debian.pool.ntp.org IN DS
Transaction 40019 for <0.debian.pool.ntp.org IN DS> scope dns on /.
Using feature level UDP+EDNS0 for transaction 40019.
Sending query packet with id 40019.
Timeout reached on transaction 40019.
Retrying transaction 40019.
Cache miss for 0.debian.pool.ntp.org IN DS
Transaction 40019 for <0.debian.pool.ntp.org IN DS> scope dns on /.
Using feature level UDP+EDNS0 for transaction 40019.
Sending query packet with id 40019.
Timeout reached on transaction 40019.
Retrying transaction 40019.
Cache miss for 0.debian.pool.ntp.org IN DS
Transaction 40019 for <0.debian.pool.ntp.org IN DS> scope dns on /.
Using feature level UDP+EDNS0 for transaction 40019.
Sending query packet with id 40019.
Timeout reached on transaction 40019.
Retrying transaction 40019.
Cache miss for 0.debian.pool.ntp.org IN DS
Transaction 40019 for <0.debian.pool.ntp.org IN DS> scope dns on /.
Using feature level UDP+EDNS0 for transaction 40019.
Sending query packet with id 40019.
Timeout reached on transaction 40019.
Retrying transaction 40019.
Removing cache entry for d7fdd278p5up3itk58hk4vor3le6df4s.org IN NSEC3 (expired 0s ago)
Removing cache entry for h9p7u7tr2u91d0v0ljs9l1gidnp90u3h.org IN NSEC3 (expired 0s ago)
Removing cache entry for org IN SOA (expired 0s ago)
Removing cache entry for ntp.org IN DS (expired 0s ago)
Cache miss for 0.debian.pool.ntp.org IN DS
Transaction 40019 for <0.debian.pool.ntp.org IN DS> scope dns on /.
Using feature level UDP+EDNS0 for transaction 40019.
Sending query packet with id 40019.
Timeout reached on transaction 40019.
Retrying transaction 40019.
Cache miss for 0.debian.pool.ntp.org IN DS
Transaction 40019 for <0.debian.pool.ntp.org IN DS> scope dns on /.
Using feature level UDP+EDNS0 for transaction 40019.
Sending query packet with id 40019.
Timeout reached on transaction 40019.
Retrying transaction 40019.
Cache miss for 0.debian.pool.ntp.org IN DS
Transaction 40019 for <0.debian.pool.ntp.org IN DS> scope dns on /.
Using feature level UDP+EDNS0 for transaction 40019.
Sending query packet with id 40019.
Got message type=signal sender=org.freedesktop.DBus destination=n/a object=/org/freedesktop/DBus interface=org.freedesktop.DBus member=NameOwnerChanged cookie=8 reply_cookie=0 error-name=n/a error-message=n/a
Sent message type=method_call sender=n/a destination=org.freedesktop.DBus object=/org/freedesktop/DBus interface=org.freedesktop.DBus member=RemoveMatch cookie=6 reply_cookie=0 error-name=n/a error-message=n/a
Got message type=method_return sender=org.freedesktop.DBus destination=:1.405 object=n/a interface=n/a member=n/a cookie=9 reply_cookie=6 error-name=n/a error-message=n/a
Client of active query vanished, aborting query.
Freeing transaction 62314.
Freeing transaction 46571.
Freeing transaction 51946.
Freeing transaction 51242.
Freeing transaction 30353.
Freeing transaction 16040.
Freeing transaction 24794.
Freeing transaction 14213.
Freeing transaction 17231.
Freeing transaction 40019.

I tried to resolve 0.debian.pool.ntp.org manually. It seems that IPv6 does not work on a.ntpns.org. The recursive resolver that systemd-resolved forwards its queries to is dnsmasq; it seems to prefer IPv6 addresses and has a longer timeout than systemd-resolved. There are two problems here: BIND does not seem to try IPv6 and IPv4 in parallel, and systemd-resolved's timeouts are too short, so it concludes that the resolver it forwards its queries to is broken and degrades the feature set. I will have a closer look at BIND, but I think we should change systemd-resolved as well. In the area where I live, a 750 ms RTT is not uncommon with one of the mobile network operators, and I think systemd-resolved's timeout values might be too low.
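The mismatch described above can be modeled in miniature: if the upstream only answers after its own (longer) recursion timeout, a client with a shorter per-transaction timeout never sees the answer, however correct the upstream is. This sketch scales the numbers down (2 s upstream vs. 1 s client, standing in for roughly 10 s BIND vs. 5 s resolved):

```shell
# `timeout 1` plays the role of resolved's per-transaction timeout; the
# 2-second sleep stands in for a slow upstream recursion that does succeed.
if timeout 1 sh -c 'sleep 2; echo NOERROR'; then
  echo "client saw the answer"
else
  echo "client timed out; resolved retries and eventually downgrades"
fi
```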

@cstrotm (Contributor, Author) commented Dec 13, 2017

@poettering I'm very busy this week, but I will try v235 next week or over the holidays and report back.

@ott (Contributor) commented Dec 13, 2017

BIND seems to have a default query timeout of 10 seconds, so systemd-resolved does not work well with it and often degrades the feature set. Unbound seems to work better with BIND. I will have a look at Unbound's infra cache module and see whether we can adapt systemd-resolved's timeout and feature-set algorithm to be more compatible with BIND and similar DNS resolver software.

I switched from Unbound to systemd-resolved a few days ago, and DNS resolution often stops working when I enable DNSSEC. To a user who doesn't know all the details, it looks like the Internet is broken.

@msoltyspl (Contributor) commented Jul 3, 2018

Just wanted to add that I observed the same with v238. The issues are exactly the same; a restart solves the problem (and the upstream servers queried are BIND 9 instances). systemd-resolved slowly degrades (but never attempts to upgrade again after some time?) with what look like 5 s timeouts. I'm not sure how BIND 9 itself behaves when resolving takes time (according to the documentation this is controlled by resolver-query-timeout, with a default and possible minimum of 10 s, settable up to 30 s), but that seems to confuse systemd's resolver about losing / not receiving expected replies. Whether BIND 9 internally extends the delay up to 30 s(?) for failed queries or not, I'm not sure (I read somewhere that the ancient BIND 8 behaved this way).

Some way to control systemd-resolved's behaviour (timeouts, or the policy for how to retry / how hard to try) would be nice to have. Otherwise any timeout, for any reason, is essentially a slow but one-way ticket, with resolved degrading itself to unnecessarily limited functionality or, in the case of DNSSEC=yes, a non-working state requiring a restart.

Options such as (global + per-link):

MaxRetries=
ReplyTimeout=
NeverDowngrade=

etc.

EDIT: some clarifications

@crosser commented Aug 11, 2018

I confirm this issue on Bionic with systemd 237-3ubuntu10.3. Suddenly all DNS resolution stops working, and I have to restart the systemd-resolved service to restore sanity.

@ott (Contributor) commented Aug 31, 2018

I'm also affected by this issue. Unfortunately, fixing it seems to require some major changes in the way systemd-resolved determines the features that an upstream resolver supports. You are invited to help with #9384.

@BryanQuigley commented Oct 19, 2018

I have not been able to reproduce this since specifying FallbackDNS.

Does anyone have a reliable (with a specified time it will take) way to reproduce it?
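For reference, the mitigation mentioned above is the FallbackDNS= option in resolved.conf; the addresses below are only examples (documentation and Google public resolver addresses), not a recommendation:

```ini
[Resolve]
DNS=192.0.2.1
FallbackDNS=8.8.8.8 8.8.4.4
DNSSEC=yes
```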

@dud225 commented Jan 16, 2019

Yes:

$ cat /etc/systemd/resolved.conf
[Resolve]
LLMNR=no
MulticastDNS=no
DNSSEC=yes
$ resolvectl
Global
       LLMNR setting: no
MulticastDNS setting: no
  DNSOverTLS setting: no
      DNSSEC setting: yes
    DNSSEC supported: yes
  Current DNS Server: 8.8.8.8
Fallback DNS Servers: 8.8.8.8
                      8.8.4.4
                      2001:4860:4860::8888
                      2001:4860:4860::8844
[...]

There is no per-link setting.

$ resolvectl flush-caches
$ resolvectl reset-server-features
$ resolvectl query blog.haschek.at
blog.haschek.at: resolve call failed: DNSSEC validation failed: incompatible-server
$ resolvectl query google.com
google.com: resolve call failed: DNSSEC validation failed: incompatible-server
$ resolvectl
Global
       LLMNR setting: no
MulticastDNS setting: no
  DNSOverTLS setting: no
      DNSSEC setting: yes
    DNSSEC supported: no
  Current DNS Server: 8.8.8.8
Fallback DNS Servers: 8.8.8.8
                      8.8.4.4
                      2001:4860:4860::8888
                      2001:4860:4860::8844
[...]

resolved needs to be reset to make it usable again:

$ resolvectl reset-server-features
$ resolvectl query google.com
google.com: 216.58.204.110                     -- link: enp0s31f6

-- Information acquired via protocol DNS in 55.1ms.
-- Data is authenticated: no

@LDVSOFT
Copy link

@LDVSOFT LDVSOFT commented Jan 24, 2019

Please, has anyone found a solution or workaround for this issue? It seems that, for some mysterious reason, systemd-resolved considers some DNS servers not to support DNSSEC when they actually do (verified using dig +dnssec).

@RolandRosenfeld
Copy link

@RolandRosenfeld RolandRosenfeld commented Jan 27, 2019

I do not have a solution for this issue, but for me it happens approximately every third day on my office workstation (Debian stretch with systemd 239-12~bpo9+1).
I have DNSSEC enabled and forward to a server running BIND 9 (Debian package), configured via both its IPv6 and IPv4 addresses in the DNS= setting, with some other resolvers configured as FallbackDNS.
The workstation receives some mail and uses SpamAssassin with several external DNS blacklists, so there are DNS requests around the clock.
At some point during the night systemd-resolved runs into the "incompatible-server" state, and from then on most network activity fails, since DNS resolution is completely broken.

Since this is an unacceptable situation on my office workstation, I had to disable DNSSEC or switch over to a different resolver.

@ott
Copy link
Contributor

@ott ott commented Mar 10, 2019

@RolandRosenfeld This is a known issue. As I said, it would require some changes to systemd-resolved.

@eddebc
Copy link

@eddebc eddebc commented Apr 2, 2019

Quick and dirty "workaround".

@gertvdijk
Copy link

@gertvdijk gertvdijk commented Apr 2, 2019

Also observing this on Ubuntu Bionic, 237-3ubuntu10.15 and a DNSSEC aware PowerDNS Recursor 4.1.11-1pdns.stretch (with dnssec=validate) with both the workstation and the recursor only having IPv4 connectivity if that matters.

Would love to help out getting to the bottom of this - it's currently frustrating my DNSSEC deployment (with SSHFP records I finally hope to be using...)

@eddebc wow... 😢

@BryanQuigley
Copy link

@BryanQuigley BryanQuigley commented Apr 2, 2019

Ubuntu 18.04 and up are made much worse by the partial fix for #8608, just FYI (the earlier merge proposal is what's in Ubuntu's systemd). Once the better fix is merged we plan to backport it to 18.04 LTS.

@gertvdijk
Copy link

@gertvdijk gertvdijk commented Apr 2, 2019

@BryanQuigley Thanks for the pointer! 👍

@LDVSOFT
Copy link

@LDVSOFT LDVSOFT commented Apr 3, 2019

Quick and dirty "workaround".

Is this a private repo, cannot access it?

@gertvdijk
Copy link

@gertvdijk gertvdijk commented Apr 3, 2019

Quick and dirty "workaround".

Is this a private repo, cannot access it?

I can access this perfectly fine, anonymously, not signed in.

@toreanderson
Copy link
Contributor

@toreanderson toreanderson commented May 15, 2019

I am also experiencing this issue quite often, even though I am using an upstream recursive resolver known to be DNSSEC capable (CloudFlare's 1.1.1.1).

What seems to be happening is that some query is answered with SERVFAIL by the upstream recursive resolver. In response, systemd-resolved downgrades the server as being DNSSEC-incompatible. When this has happened, it is no longer possible to resolve anything - all subsequent queries fail with incompatible-server (probably due to DNSSEC=yes in /etc/systemd/resolved.conf). I need to issue resolvectl reset-server-features to get a working resolver back.

SERVFAIL can occur for all sorts of transient reasons. It can also occur consistently when querying for specific FQDNs while there are no problems for others. Therefore, using a single SERVFAIL as a trigger to disable DNSSEC completely, and by extension the entire system resolver (when DNSSEC=yes), appears to be an extreme overreaction by systemd-resolved.
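The downgrade chain described above can be modeled with a tiny sketch (a simplified model under assumed feature-level names, not systemd-resolved's actual code): each SERVFAIL steps the server one level down, and once the level no longer includes DO (the DNSSEC OK bit), every validated lookup can only fail with incompatible-server:

```shell
# Simplified model of the pre-fix behaviour; level names are assumptions.
level="TLS+EDNS0+DO"
downgrade() {
  case "$level" in
    "TLS+EDNS0+DO") level="UDP+EDNS0+DO" ;;
    "UDP+EDNS0+DO") level="UDP+EDNS0" ;;
    "UDP+EDNS0")    level="UDP" ;;
  esac
}
downgrade   # transient SERVFAIL #1
downgrade   # transient SERVFAIL #2 -- the DO-capable level is now gone
case "$level" in
  *"+DO") echo "DNSSEC validation possible at $level" ;;
  *)      echo "DNSSEC validation failed: incompatible-server (level $level)" ;;
esac
```

In this model there is no path back up; with DNSSEC=yes that is exactly why resolvectl reset-server-features is needed to recover.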

I am attaching debug-level logs from systemd-journald and a PCAP with the DNS protocol traffic between systemd-resolved and 1.1.1.1 during the minute in which the issue occurred below.

Note that at the time I was simply using my computer normally and not trying to do anything specific to provoke the issue to occur.

The issue appears to have triggered around these log lines:

2019-05-15T14:06:21.607863+0200 sloth.fud.no systemd-resolved[5353]: Processing incoming packet on transaction 63106 (rcode=SERVFAIL).
2019-05-15T14:06:21.607913+0200 sloth.fud.no systemd-resolved[5353]: Server returned error SERVFAIL, retrying transaction with reduced feature level UDP.
2019-05-15T14:06:21.607937+0200 sloth.fud.no systemd-resolved[5353]: Retrying transaction 63106.
2019-05-15T14:06:21.608192+0200 sloth.fud.no systemd-resolved[5353]: Cache miss for services.mozilla.com IN DS
2019-05-15T14:06:21.608224+0200 sloth.fud.no systemd-resolved[5353]: Transaction 63106 for <services.mozilla.com IN DS> scope dns on */*.
2019-05-15T14:06:21.608250+0200 sloth.fud.no systemd-resolved[5353]: Using feature level UDP for transaction 63106.
2019-05-15T14:06:21.608280+0200 sloth.fud.no systemd-resolved[5353]: Sending query packet with id 63106.
2019-05-15T14:06:22.024048+0200 sloth.fud.no systemd-resolved[5353]: Processing incoming packet on transaction 63106 (rcode=SUCCESS).
2019-05-15T14:06:22.024161+0200 sloth.fud.no systemd-resolved[5353]: Downgrading transaction feature level fixed an RCODE error, downgrading server 1.1.1.1 too.
2019-05-15T14:06:22.024201+0200 sloth.fud.no systemd-resolved[5353]: Not validating response for 63106, used server feature level does not support DNSSEC.
2019-05-15T14:06:22.024278+0200 sloth.fud.no systemd-resolved[5353]: DNSSEC validation failed for question services.mozilla.com IN DS: incompatible-server

The failing query it complains about appears to be this one from the PCAP (note how it took five seconds to complete, so the SERVFAIL probably indicates an upstream query timeout):

  117 2019-05-15 14:06:17,241011  100.66.1.86 → 1.1.1.1      DNS 93 Standard query 0x82f6 DS services.mozilla.com OPT
  182 2019-05-15 14:06:21,607603      1.1.1.1 → 100.66.1.86  DNS 93 Standard query response 0x82f6 Server failure DS services.mozilla.com OPT

This was clearly a transient issue, after having issued resolvectl reset-server-features I can again look up this hostname without problems:

$ resolvectl query services.mozilla.com -t SOA
services.mozilla.com IN SOA ns-679.awsdns-20.net awsdns-hostmaster.amazon.com 1 7200 900 1209600 86400 -- link: tun0

-- Information acquired via protocol DNS in 402.3ms.
-- Data is authenticated: no

My /etc/systemd/resolved.conf file contains the following:

$ egrep -v '^($|#)' /etc/systemd/resolved.conf
[Resolve]
DNS=1.1.1.1
DNSSEC=yes
Cache=no

I'm running systemd-241-8.git9ef65cb.fc30.x86_64 (Fedora packaging).

@toreanderson
Copy link
Contributor

@toreanderson toreanderson commented May 20, 2019

In case it is of any help to others, here is the workaround I use to alleviate this issue.

It watches for log messages from systemd-resolved indicating that the upstream server has been flagged as DNSSEC-incompatible. If such a message appears, the server features are immediately reset (thus re-enabling DNSSEC).

It is an ugly hack, but it works. To use it, drop it into /etc/systemd/system/systemd-resolved-autofix-dnssec.service and run systemctl enable --now systemd-resolved-autofix-dnssec.service.

# /etc/systemd/system/systemd-resolved-autofix-dnssec.service
[Service]
ExecStart=sh -c 'journalctl -n0 -fu systemd-resolved | grep -m1 "DNSSEC validation failed.*incompatible-server" && resolvectl reset-server-features'
Restart=always

[Install]
WantedBy=systemd-resolved.service
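The grep pattern used by the unit can be sanity-checked offline against a sample journal line (the sample text is assumed, modeled on the messages quoted earlier in this thread):

```shell
# Assumed sample of the journal message this thread is about:
sample='systemd-resolved[5353]: DNSSEC validation failed for question example.com IN A: incompatible-server'
# Same pattern as in the ExecStart= line above:
echo "$sample" | grep -q "DNSSEC validation failed.*incompatible-server" \
  && echo "pattern matches: server features would be reset"
```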

@toreanderson
Copy link
Contributor

@toreanderson toreanderson commented May 21, 2019

@poettering I just realised this issue still has the needs-reporter-feedback label you added back in July 2017. I believe that others and I have since provided the feedback requested, so that the label may now be removed. If not, please advise which additional information is still needed - I will try to provide it as soon as I can.

@q2dg
Copy link

@q2dg q2dg commented Nov 20, 2019

This issue is still seen in Fedora 31 (with Systemd v243-4.gitef67743.fc31)

@anthr76
Copy link

@anthr76 anthr76 commented Dec 4, 2019

Also currently facing this issue.

systemd 243 (243.162-2-arch)

@Avamander
Copy link

@Avamander Avamander commented Dec 26, 2019

I just encountered this issue, disabling DNSSEC, restarting, enabling it and restarting fixed it very temporarily. This is incredibly annoying. Using systemd 242 on Ubuntu 19.10

@ott
Copy link
Contributor

@ott ott commented Dec 26, 2019

@Avamander It is a known issue, and has been for years. Unfortunately, it is mostly ignored by the developers. It would be helpful if someone with contacts at Red Hat, or otherwise with the systemd developers, could highlight the issue. However, as I wrote, it is not an easy fix and requires a rewrite of systemd-resolved's DNS server feature detection.

Please don't add a comment to this issue if you merely have the same problem; use the emoji reactions to indicate that you are also affected.

@Avamander
Copy link

@Avamander Avamander commented Dec 26, 2019

Emoji reactions only work if someone really pays attention, and it is clear that that isn't happening here.

At the very least, if there aren't any immediate plans to fix this, the documentation should be updated to warn about a critical bug that will bite anyone who uses it.

@ThomasCr
Copy link

@ThomasCr ThomasCr commented Apr 13, 2020

I still have the issue on Ubuntu 20.04

root@srv1:~# dpkg -l|grep systemd
ii  dbus-user-session                    1.12.16-2ubuntu2                     amd64        simple interprocess messaging system (systemd --user integration)
ii  libnss-systemd:amd64                 245.4-2ubuntu1                       amd64        nss module providing dynamic user and group name resolution
ii  libpam-systemd:amd64                 245.4-2ubuntu1                       amd64        system and service manager - PAM module
ii  libsystemd0:amd64                    245.4-2ubuntu1                       amd64        systemd utility library
ii  networkd-dispatcher                  2.0.1-1                              all          Dispatcher service for systemd-networkd connection status changes
ii  python3-systemd                      234-3build2                          amd64        Python 3 bindings for systemd
ii  systemd                              245.4-2ubuntu1                       amd64        system and service manager
ii  systemd-sysv                         245.4-2ubuntu1                       amd64        system and service manager - SysV links
ii  systemd-timesyncd                    245.4-2ubuntu1                       amd64        minimalistic service to synchronize local time with NTP servers

@mickevi
Copy link

@mickevi mickevi commented Nov 2, 2020

I got this issues when running fedora 33 under virtualbox.

@kpfleming
Copy link
Contributor

@kpfleming kpfleming commented Nov 2, 2020

That's because Fedora started using systemd-resolved in Fedora 33. Every Fedora 33 user is going to experience this.

@toreanderson
Copy link
Contributor

@toreanderson toreanderson commented Nov 2, 2020

@kpfleming That will not be the case, as F33 disables DNSSEC by default.

@Avamander
Copy link

@Avamander Avamander commented Nov 2, 2020

It doesn't change the fact that people expect this functionality to work properly. It really doesn't.

@letoams
Copy link

@letoams letoams commented Nov 2, 2020

@kpfleming That will not be the case, as F33 disables DNSSEC by default.

You cannot "disable DNSSEC" on the operating system. DNS libraries and DNS applications determine this by themselves. The real problem appears when these programs and libraries use the information from /etc/resolv.conf to find a real DNS server, and all they get is a nameserver entry pointing to systemd-resolved via "nameserver 127.0.0.53".
While it is fine for systemd-resolved not to validate (DNSSEC=no), it contains several bugs that prevent these real DNS applications/libraries from getting the required DNSSEC records (it returns a non-DNSSEC cache entry in response to a DNSSEC query).

Additionally, there seem to be some negative-caching issues for me on Fedora 33 that prevent me from using systemd-resolved with my iPhone as a hotspot. It appears some DNS queries are lost while my laptop is connecting to the iPhone, so the hotspot connection does not work at all due to DNS failures.

@toreanderson
Copy link
Contributor

@toreanderson toreanderson commented Nov 2, 2020

@letoams In that case you are talking about a different issue than this one. This issue is about systemd-resolved's tendency to, when configured with DNSSEC=yes, degrade into a state where no queries are answered because they all fail with incompatible-server (even though the upstream server supports DNSSEC just fine). This issue will certainly not impact all Fedora 33 users, because Fedora 33's default setting is DNSSEC=no.

I encourage you to submit a new issue about the behaviour you're describing (assuming nobody already has).

@Avamander
Copy link

@Avamander Avamander commented Nov 2, 2020

tendency to, when configured

More like an absolute certainty. It's been totally broken for nearly three years now.

@toreanderson
Copy link
Contributor

@toreanderson toreanderson commented Nov 20, 2020

poettering@ca8fe05 will in all likelihood fix this issue 🎉

poettering added a commit to poettering/systemd that referenced this issue Nov 20, 2020
This adjusts our feature level handling: when DNSSEC strict mode is on,
let's never lower the feature level below the lowest DNSSEC mode.

Also, when asking whether DNSSEC is supported, always say yes in strict
mode. This means that error reporting about transactions that fail
because of missing DNSSEC RRs will not report "incompatible-server" but
instead "missing-signature" or suchlike.

The main difference here is that DNSSEC failures become local to a
transaction, instead of propagating into the feature level we reuse for
future transactions. This is beneficial with routers that implement
"mostly a DNS proxy", i.e. that propagate most DNS requests 1:1 to their
upstream servers, but synthesize local answers for a select few domains.
For example, AVM Fritz!Boxes operate that way: they proxy most traffic
1:1 upstream in an DNSSEC-compatible fashion, but synthesize the
"fritz.box" locally, so that it can be used to configure the router.
This local domain cannot be DNSSEC verified, it comes without
signatures. Previously this would mean once that domain was resolved
feature level would be downgraded, and we'd thus fail all future DNSSEC
attempts. With this change, the immediate lookup for "fritz.box" will
fail validation, but that comes without prejudice for all other unrelated
future lookups.

(While we are at it, also make a couple of other downgrade paths a bit
tighter.)

Fixes: systemd#10570 systemd#14435 systemd#6490
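The clamping behaviour this commit message describes can be sketched as follows (a model under assumed level names, not the actual implementation): in strict DNSSEC mode the downgrade path refuses to drop below the lowest DO-capable level, so a transient SERVFAIL no longer poisons future transactions:

```shell
# Sketch of the post-fix behaviour; level names are assumptions.
strict_dnssec=yes
level="UDP+EDNS0+DO"   # lowest DNSSEC-capable level in this model
downgrade() {
  if [ "$strict_dnssec" = "yes" ] && [ "$level" = "UDP+EDNS0+DO" ]; then
    echo "strict mode: refusing to downgrade below $level"
  else
    level="UDP+EDNS0"  # would strip the DO bit
  fi
}
downgrade   # a transient SERVFAIL affects only its own transaction
echo "feature level stays at $level"
```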
@keszybz keszybz removed the needs-reporter-feedback label Nov 27, 2020
poettering added a commit to poettering/systemd that referenced this issue Dec 2, 2020
poettering added a commit to poettering/systemd that referenced this issue Dec 3, 2020
poettering added a commit to poettering/systemd that referenced this issue Dec 4, 2020
poettering added a commit to poettering/systemd that referenced this issue Dec 7, 2020
@MartinX3
Copy link

@MartinX3 MartinX3 commented Jan 18, 2021

Still an issue with 247.2

poettering added a commit to poettering/systemd that referenced this issue Feb 16, 2021
@poettering
Copy link
Member

@poettering poettering commented Feb 17, 2021

Fixed by #18624
