Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resolve: TCP fast open is broken? #9283

Closed
yuwata opened this issue Jun 13, 2018 · 6 comments
Closed

resolve: TCP fast open is broken? #9283

yuwata opened this issue Jun 13, 2018 · 6 comments
Labels
bug 🐛 Programming errors, that need preferential fixing resolve
Milestone

Comments

@yuwata
Copy link
Member

yuwata commented Jun 13, 2018

DNSSEC=yes is set.

Before #8849 is merged, fedoraproject.org can be resolved.

$ resolvectl query fedoraproject.org
fedoraproject.org: 152.19.134.198
                   8.43.85.67
                   209.132.181.15
                   152.19.134.142
                   140.211.169.206
                   209.132.190.2
                   209.132.181.16
                   67.219.144.68
                   140.211.169.196
                   185.141.165.254

-- Information acquired via protocol DNS in 156.6ms.
-- Data is authenticated: yes

However, after #8849 is merged, (e.g. 73c718a), it is failed:

$ resolvectl query fedoraproject.org
fedoraproject.org: resolve call failed: DNSSEC validation failed: failed-auxiliary
$ resolvectl query github.com
github.com: resolve call failed: DNSSEC validation failed: failed-auxiliary

With #9280, some hosts can be resolved, but not all.

$ resolvectl query fedoraproject.org
fedoraproject.org: resolve call failed: DNSSEC validation failed: failed-auxiliary
$ resolvectl query github.com
github.com: 192.30.255.112
            192.30.255.113

-- Information acquired via protocol DNS in 7.4ms.
-- Data is authenticated: no

Below is the result of resolvectl.

resolvectl
Global
       LLMNR setting: no
MulticastDNS setting: yes
  PrivateDNS setting: no
      DNSSEC setting: yes
    DNSSEC supported: yes
Fallback DNS Servers: 8.8.8.8
                      8.8.4.4
                      2001:4860:4860::8888
                      2001:4860:4860::8844
          DNSSEC NTA: 10.in-addr.arpa
                      16.172.in-addr.arpa
                      168.192.in-addr.arpa
                      17.172.in-addr.arpa
                      18.172.in-addr.arpa
                      19.172.in-addr.arpa
                      20.172.in-addr.arpa
                      21.172.in-addr.arpa
                      22.172.in-addr.arpa
                      23.172.in-addr.arpa
                      24.172.in-addr.arpa
                      25.172.in-addr.arpa
                      26.172.in-addr.arpa
                      27.172.in-addr.arpa
                      28.172.in-addr.arpa
                      29.172.in-addr.arpa
                      30.172.in-addr.arpa
                      31.172.in-addr.arpa
                      corp
                      d.f.ip6.arpa
                      home
                      internal
                      intranet
                      lan
                      local
                      private
                      test

Link 3 (wlp2s0)
      Current Scopes: none
       LLMNR setting: yes
MulticastDNS setting: no
  PrivateDNS setting: no
      DNSSEC setting: yes
    DNSSEC supported: yes

Link 2 (enp0s25)
      Current Scopes: DNS
       LLMNR setting: yes
MulticastDNS setting: no
  PrivateDNS setting: no
      DNSSEC setting: yes
    DNSSEC supported: yes
  Current DNS Server: 10.224.254.1
         DNS Servers: 10.224.254.1
                      10.224.253.11
          DNS Domain: kuins.net
@yuwata yuwata added resolve bug 🐛 Programming errors, that need preferential fixing labels Jun 13, 2018
@yuwata yuwata added this to the v239 milestone Jun 13, 2018
@yuwata
Copy link
Member Author

yuwata commented Jun 13, 2018

Below is a part of journal log (with b961fbd).

Jun 13 16:10:23 systemd-resolved[3973]: Retrying transaction 27899.
Jun 13 16:10:23 systemd-resolved[3973]: Switching to DNS server 10.224.253.11 for interface enp0s25.
Jun 13 16:10:23 systemd-resolved[3973]: Cache miss for org IN DNSKEY
Jun 13 16:10:23 systemd-resolved[3973]: Transaction 27899 for <org IN DNSKEY> scope dns on enp0s25/*.
Jun 13 16:10:23 systemd-resolved[3973]: Using feature level UDP+EDNS0+DO+LARGE for transaction 27899.
Jun 13 16:10:23 systemd-resolved[3973]: Using DNS server 10.224.253.11 for transaction 27899.
Jun 13 16:10:23 systemd-resolved[3973]: Sending query packet with id 27899.
Jun 13 16:10:28 systemd-resolved[3973]: Timeout reached on transaction 27899.
Jun 13 16:10:28 systemd-resolved[3973]: Retrying transaction 27899.
Jun 13 16:10:28 systemd-resolved[3973]: Switching to DNS server 10.224.254.1 for interface enp0s25.
Jun 13 16:10:28 systemd-resolved[3973]: Cache miss for org IN DNSKEY
Jun 13 16:10:28 systemd-resolved[3973]: Transaction 27899 for <org IN DNSKEY> scope dns on enp0s25/*.
Jun 13 16:10:28 systemd-resolved[3973]: Lost too many UDP packets, downgrading feature level...
Jun 13 16:10:28 systemd-resolved[3973]: Using degraded feature set (UDP+EDNS0+DO) for DNS server 10.224.254.1.
Jun 13 16:10:28 systemd-resolved[3973]: Using feature level UDP+EDNS0+DO for transaction 27899.
Jun 13 16:10:28 systemd-resolved[3973]: Using DNS server 10.224.254.1 for transaction 27899.
Jun 13 16:10:28 systemd-resolved[3973]: Sending query packet with id 27899.
Jun 13 16:10:28 systemd-resolved[3973]: Processing incoming packet on transaction 27899. (rcode=SUCCESS)
Jun 13 16:10:28 systemd-resolved[3973]: Reply truncated, retrying via TCP.
Jun 13 16:10:28 systemd-resolved[3973]: Using feature level UDP+EDNS0+DO for transaction 27899.
Jun 13 16:10:28 systemd-resolved[3973]: Transaction 27899 for <org IN DNSKEY> on scope dns on enp0s25/* now complete with <EINPROGRESS> from none (unsigned).
Jun 13 16:10:28 systemd-resolved[3973]: Auxiliary DNSSEC RR query failed with errno
Jun 13 16:10:28 systemd-resolved[3973]: DNSSEC validation failed for question fedoraproject.org IN DS: failed-auxiliary
Jun 13 16:10:28 systemd-resolved[3973]: Transaction 47800 for <fedoraproject.org IN DS> on scope dns on enp0s25/* now complete with <dnssec-failed> from network (unsigned).
Jun 13 16:10:28 systemd-resolved[3973]: Auxiliary DNSSEC RR query failed validation: failed-auxiliary
Jun 13 16:10:28 systemd-resolved[3973]: DNSSEC validation failed for question fedoraproject.org IN DNSKEY: failed-auxiliary
Jun 13 16:10:28 systemd-resolved[3973]: Transaction 3071 for <fedoraproject.org IN DNSKEY> on scope dns on enp0s25/* now complete with <dnssec-failed> from network (unsigned).
Jun 13 16:10:28 systemd-resolved[3973]: Auxiliary DNSSEC RR query failed validation: failed-auxiliary
Jun 13 16:10:28 systemd-resolved[3973]: DNSSEC validation failed for question fedoraproject.org IN A: failed-auxiliary
Jun 13 16:10:28 systemd-resolved[3973]: Transaction 35618 for <fedoraproject.org IN A> on scope dns on enp0s25/* now complete with <dnssec-failed> from network (unsigned).

@yuwata
Copy link
Member Author

yuwata commented Jun 13, 2018

Hmm, it seems storage that 'UDP+EDNS0+DO' is used just after 'retrying via TCP'.

Processing incoming packet on transaction 27899. (rcode=SUCCESS)
Reply truncated, retrying via TCP.
Using feature level UDP+EDNS0+DO for transaction 27899.

@yuwata
Copy link
Member Author

yuwata commented Jun 13, 2018

Before #8849, the spurious log is also found. But works fine...

Jun 13 17:40:55 systemd-resolved[11069]: Transaction 18944 for <org IN DNSKEY> scope dns on enp0s25/*.
Jun 13 17:40:55 systemd-resolved[11069]: Using feature level UDP+EDNS0+DO+LARGE for transaction 18944.
Jun 13 17:40:55 systemd-resolved[11069]: Using DNS server 10.224.253.11 for transaction 18944.
Jun 13 17:40:55 systemd-resolved[11069]: Sending query packet with id 18944.
Jun 13 17:40:58 systemd-resolved[11069]: Timeout reached on transaction 18944.
Jun 13 17:40:58 systemd-resolved[11069]: Retrying transaction 18944.
Jun 13 17:40:58 systemd-resolved[11069]: Switching to DNS server 10.224.254.1 for interface enp0s25.
Jun 13 17:40:58 systemd-resolved[11069]: Cache miss for org IN DNSKEY
Jun 13 17:40:58 systemd-resolved[11069]: Transaction 18944 for <org IN DNSKEY> scope dns on enp0s25/*.
Jun 13 17:40:58 systemd-resolved[11069]: Lost too many UDP packets, downgrading feature level...
Jun 13 17:40:58 systemd-resolved[11069]: Using degraded feature set (UDP+EDNS0+DO) for DNS server 10.224.254.1.
Jun 13 17:40:58 systemd-resolved[11069]: Using feature level UDP+EDNS0+DO for transaction 18944.
Jun 13 17:40:58 systemd-resolved[11069]: Using DNS server 10.224.254.1 for transaction 18944.
Jun 13 17:40:58 systemd-resolved[11069]: Sending query packet with id 18944.
Jun 13 17:40:58 systemd-resolved[11069]: Processing incoming packet on transaction 18944. (rcode=SUCCESS)
Jun 13 17:40:58 systemd-resolved[11069]: Reply truncated, retrying via TCP.
Jun 13 17:40:58 systemd-resolved[11069]: Using feature level UDP+EDNS0+DO for transaction 18944.
Jun 13 17:40:58 systemd-resolved[11069]: Processing incoming packet on transaction 18944. (rcode=SUCCESS)
Jun 13 17:40:58 systemd-resolved[11069]: Requesting DS to validate transaction 18944 (org, DNSKEY with key tag: 17883).

@yuwata
Copy link
Member Author

yuwata commented Jun 13, 2018

98767d7 is fine.
91ccab1 does not work.

yuwata added a commit to yuwata/systemd that referenced this issue Jun 13, 2018
@yuwata yuwata changed the title resolve: DNSSEC is broken? resolve: TCP fast open is broken? Jun 13, 2018
@poettering
Copy link
Member

/cc @irtimmer

@irtimmer
Copy link
Contributor

I think I've found the issue. I probably made some mistake when changing the read and write functions to use negative error return codes. In case TCP Fast Open can't send data it will return the error EINPROGRESS, which get replaced by EAGAIN, but wasn't handled correctly. Therefore resolved closed the connection instead of invoking the write function again.

irtimmer added a commit to irtimmer/systemd that referenced this issue Jun 14, 2018
During the transition from system functions using errno to our own read and write functions with negative return codes some errors where introduced. This patch correctly convert errno to negative return codes for read and write and fix checks still using errno instead of the return code.

Closes systemd#9283
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug 🐛 Programming errors, that need preferential fixing resolve
Development

No branches or pull requests

3 participants