Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resolved: "resolvectl query" not working with for some addresses my corporate network #19394

Closed
NTMan opened this issue Apr 22, 2021 · 23 comments
Labels

Comments

@NTMan
Copy link

NTMan commented Apr 22, 2021

$ systemd-resolve --version
systemd 248 (v248-2.fc35)
+PAM +AUDIT +SELINUX -APPARMOR +IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 +PWQUALITY +P11KIT +QRENCODE +BZIP2 +LZ4 +XZ +ZLIB +ZSTD +XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified

$ systemd-resolve git.sbis.ru
git.sbis.ru: 10.76.168.67                      -- link: vpn0

-- Information acquired via protocol DNS in 31.1ms.
-- Data is authenticated: no; Data was acquired via local or encrypted transport: no
-- Data from: cache network

$ systemd-resolve online.sbis.ru
online.sbis.ru: resolve call failed: Lookup failed due to system error: Invalid argument

$ systemd-resolve -4 online.sbis.ru
online.sbis.ru: 91.213.144.193                 -- link: vpn0

-- Information acquired via protocol DNS in 31.1ms.
-- Data is authenticated: no; Data was acquired via local or encrypted transport: no
-- Data from: network

$ resolvectl dns
Global:
Link 2 (enp5s0):
Link 3 (wlp4s0):
Link 4 (vpn0): 10.76.4.153

$ resolvectl domain
Global:
Link 2 (enp5s0):
Link 3 (wlp4s0):
Link 4 (vpn0):

Distro: Fedora Rawhide

@poettering
Copy link
Member

Please enable debug logging in resolved (systemctl service-log-level systemd-resolved debug), reproduce the issue, and provide the logs this generated (i.e. journalctl -u systemd-resolved)

@poettering poettering added needs-reporter-feedback ❓ There's an unanswered question, the reporter needs to answer resolve labels Apr 27, 2021
@NTMan
Copy link
Author

NTMan commented Apr 27, 2021

Please enable debug logging in resolved (systemctl service-log-level systemd-resolved debug), reproduce the issue, and provide the logs this generated (i.e. journalctl -u systemd-resolved)

Apr 28 02:53:43 fedora systemd-resolved[1006]: Regular transaction 54184 for <online.sbis.ru IN A> scope dns on vpn0/* (validate=yes).
Apr 28 02:53:43 fedora systemd-resolved[1006]: Using feature level UDP+EDNS0 for transaction 54184.
Apr 28 02:53:43 fedora systemd-resolved[1006]: Using DNS server 10.76.4.153 for transaction 54184.
Apr 28 02:53:43 fedora systemd-resolved[1006]: Announcing packet size 512 in egress EDNS(0) packet.
Apr 28 02:53:43 fedora systemd-resolved[1006]: Emitting UDP, link MTU is 1300, socket MTU is 0, minimal MTU is 40
Apr 28 02:53:43 fedora systemd-resolved[1006]: Sending query packet with id 54184 of size 43.
Apr 28 02:53:43 fedora systemd-resolved[1006]: varlink-21: varlink: changing state processing-method → pending-method
Apr 28 02:53:43 fedora systemd-resolved[1006]: Received dns UDP packet of size 127, ifindex=4, ttl=0, fragsize=0
Apr 28 02:53:43 fedora systemd-resolved[1006]: Processing incoming packet of size 127 on transaction 30066 (rcode=SUCCESS).
Apr 28 02:53:43 fedora systemd-resolved[1006]: Regular transaction 30066 for <online.sbis.ru IN AAAA> on scope dns on vpn0/* now complete with <success> from network (unsigned; non-confidential).
Apr 28 02:53:43 fedora systemd-resolved[1006]: Received dns UDP packet of size 143, ifindex=4, ttl=0, fragsize=0
Apr 28 02:53:43 fedora systemd-resolved[1006]: Processing incoming packet of size 143 on transaction 54184 (rcode=SUCCESS).
Apr 28 02:53:43 fedora systemd-resolved[1006]: Added positive unauthenticated non-confidential cache entry for online.sbis.ru IN A 1s on vpn0/INET/10.76.4.153
Apr 28 02:53:43 fedora systemd-resolved[1006]: Regular transaction 54184 for <online.sbis.ru IN A> on scope dns on vpn0/* now complete with <success> from network (unsigned; non-confidential).
Apr 28 02:53:43 fedora systemd-resolved[1006]: Freeing transaction 30066.
Apr 28 02:53:43 fedora systemd-resolved[1006]: varlink-21: Sending message: {"error":"io.systemd.System","parameters":{"errno":22}}
Apr 28 02:53:43 fedora systemd-resolved[1006]: varlink-21: varlink: changing state pending-method → idle-server
Apr 28 02:53:43 fedora systemd-resolved[1006]: Freeing transaction 54184.

Full log is here:
systemd-resolved-log.txt

@poettering
Copy link
Member

hmm, weird. Any chance you can install debug symbols for this, and install gdb and then attach gdb to the running instance and set a breakpoint to vl_method_resolve_hostname_complete and then step through it, so that we can track down where the EINVAL comes from?

It's a bit of a learning curve if you don't know gdb though

@poettering
Copy link
Member

when i try to resolve the same hostname here, all works, so this is really strange

@NTMan
Copy link
Author

NTMan commented Apr 28, 2021

hmm, weird. Any chance you can install debug symbols for this, and install gdb and then attach gdb to the running instance and set a breakpoint to vl_method_resolve_hostname_complete and then step through it, so that we can track down where the EINVAL comes from?

(gdb) break vl_method_resolve_hostname_complete
Breakpoint 1 at 0x563a41b5b590: file ../src/resolve/resolved-varlink.c, line 146.
(gdb) c
Continuing.

Breakpoint 1, vl_method_resolve_hostname_complete (q=0x563a43f45e80) at ../src/resolve/resolved-varlink.c:146
146	static void vl_method_resolve_hostname_complete(DnsQuery *q) {
(gdb) s
147	        _cleanup_(dns_resource_record_unrefp) DnsResourceRecord *canonical = NULL;
(gdb) s
148	        _cleanup_(json_variant_unrefp) JsonVariant *array = NULL;
(gdb) s
149	        _cleanup_free_ char *normalized = NULL;
(gdb) s
154	        assert(q);
(gdb) s
156	        if (q->state != DNS_TRANSACTION_SUCCESS) {
(gdb) s
157	                r = reply_query_state(q);
(gdb) s
reply_query_state (q=0x563a43f45e80) at ../src/resolve/resolved-varlink.c:24
24	        assert(q);
(gdb) s
25	        assert(q->varlink_request);
(gdb) s
27	        switch (q->state) {
(gdb) s
74	                                      JSON_BUILD_OBJECT(JSON_BUILD_PAIR("rcode", JSON_BUILD_INTEGER(q->answer_rcode))));
(gdb) s
0x00007f298b9ff9a2 in varlink_errorb (v=0x563a439c1dd0, error_id=0x563a41b7fe62 "io.systemd.Resolve.DNSError") at ../src/shared/varlink.c:1661
1661	int varlink_errorb(Varlink *v, const char *error_id, ...) {
(gdb) s
1662	        _cleanup_(json_variant_unrefp) JsonVariant *parameters = NULL;
(gdb) s
1666	        assert_return(v, -EINVAL);
(gdb) s
1667	        assert_return(error_id, -EINVAL);
(gdb) s
1669	        va_start(ap, error_id);
(gdb) s
1670	        r = json_buildv(&parameters, ap);
(gdb) s
0x00007f298b9c15c5 in json_buildv (ret=0x7ffc8e84c228, ap=0x7ffc8e84c230) at ../src/shared/json.c:3230
3230	int json_buildv(JsonVariant **ret, va_list ap) {

@NTMan
Copy link
Author

NTMan commented Apr 28, 2021

when i try to resolve the same hostname here, all works, so this is really strange

It happens only when my VPN link is up.

The last working for me systemd-resolve is 247.3-3.fc35

[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: 91.213.144.193                 -- link: vpn0

-- Information acquired via protocol DNS in 29.0ms.
-- Data is authenticated: no
[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: resolve call failed: Lookup failed due to system error: Invalid argument
[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: 91.213.144.193                 -- link: vpn0

-- Information acquired via protocol DNS in 29.4ms.
-- Data is authenticated: no
[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: 91.213.144.193                 -- link: vpn0

-- Information acquired via protocol DNS in 29.1ms.
-- Data is authenticated: no
[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: 91.213.144.193                 -- link: vpn0

-- Information acquired via protocol DNS in 29.4ms.
-- Data is authenticated: no
[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: 91.213.144.193                 -- link: vpn0

-- Information acquired via protocol DNS in 29.0ms.
-- Data is authenticated: no
[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: 91.213.144.193                 -- link: vpn0

-- Information acquired via protocol DNS in 28.9ms.
-- Data is authenticated: no
[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: 91.213.144.193                 -- link: vpn0

-- Information acquired via protocol DNS in 29.4ms.
-- Data is authenticated: no
[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: resolve call failed: Lookup failed due to system error: Invalid argument
[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: 91.213.144.193                 -- link: vpn0

-- Information acquired via protocol DNS in 29.5ms.
-- Data is authenticated: no
[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: 91.213.144.193                 -- link: vpn0

-- Information acquired via protocol DNS in 29.2ms.
-- Data is authenticated: no
[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: 91.213.144.193                 -- link: vpn0

-- Information acquired via protocol DNS in 29.1ms.
-- Data is authenticated: no
[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: 91.213.144.193                 -- link: vpn0

-- Information acquired via protocol DNS in 29.2ms.
-- Data is authenticated: no
[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: resolve call failed: Lookup failed due to system error: Invalid argument
[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: 91.213.144.193                 -- link: vpn0

-- Information acquired via protocol DNS in 29.0ms.
-- Data is authenticated: no
[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: 91.213.144.193                 -- link: vpn0

-- Information acquired via protocol DNS in 29.3ms.
-- Data is authenticated: no
[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: 91.213.144.193                 -- link: vpn0

-- Information acquired via protocol DNS in 28.9ms.
-- Data is authenticated: no
[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: 91.213.144.193                 -- link: vpn0

-- Information acquired via protocol DNS in 29.4ms.
-- Data is authenticated: no
[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: 91.213.144.193                 -- link: vpn0

-- Information acquired via protocol DNS in 29.2ms.
-- Data is authenticated: no
[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: 91.213.144.193                 -- link: vpn0

-- Information acquired via protocol DNS in 29.0ms.
-- Data is authenticated: no
[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: resolve call failed: Lookup failed due to system error: Invalid argument
[mikhail@fedora ~]$ systemd-resolve online.sbis.ru
online.sbis.ru: 91.213.144.193                 -- link: vpn0

-- Information acquired via protocol DNS in 29.4ms.
-- Data is authenticated: no

Yes as you can see, some of the requests also did not resolved, but this was more likely was a rare exception. But starting with version 248~rc1-1.fc35 all queries stopping resolving.

@codewiz
Copy link

codewiz commented Jun 18, 2021

This is happening to me with Dallas airport's WiFi. I can't resolve the captive portal's hostname with resolved 248.3:

resolvectl query vgw1-01.dal-dfw.lpv.attwifi.com
vgw1-01.dal-dfw.lpv.attwifi.com: resolve call failed: Lookup failed due to system error: Invalid argument

But oddly, resolving only v4 works just fine!

bernie@giskard:~% resolvectl query -4 vgw1-01.dal-dfw.lpv.attwifi.com
vgw1-01.dal-dfw.lpv.attwifi.com: 12.203.54.178 -- link: wlp0s20f3

-- Information acquired via protocol DNS in 7.2ms.
-- Data is authenticated: no; Data was acquired via local or encrypted transport: no
-- Data from: network

There's no AAAA record:

bernie@giskard:~% resolvectl query -6 vgw1-01.dal-dfw.lpv.attwifi.com
vgw1-01.dal-dfw.lpv.attwifi.com: resolve call failed: 'vgw1-01.dal-dfw.lpv.attwifi.com' does not have any RR of the requested type

There's a single nameserver on the network:

bernie@giskard:~% resolvectl
Global
       Protocols: -LLMNR +mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub

Link 2 (enp0s31f6)
Current Scopes: none
     Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported

Link 3 (wlp0s20f3)
    Current Scopes: DNS mDNS/IPv4 mDNS/IPv6
         Protocols: +DefaultRoute +LLMNR mDNS=resolve -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 10.44.0.1
       DNS Servers: 10.44.0.1

Dig also has no issues with this server:

bernie@giskard:~% dig @10.44.0.1 vgw1-01.dal-dfw.lpv.attwifi.com

; <<>> DiG 9.16.16-RH <<>> @10.44.0.1 vgw1-01.dal-dfw.lpv.attwifi.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 61631
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 2

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: 9b241a4dc34f93660100000060ccd8099a172166a63d8e1e (good)
;; QUESTION SECTION:
;vgw1-01.dal-dfw.lpv.attwifi.com. IN    A

;; ANSWER SECTION:
vgw1-01.dal-dfw.lpv.attwifi.com. 5 IN   A       12.203.54.178

;; ADDITIONAL SECTION:
rpz.dns-override.       1       IN      SOA     localhost. root.localhost. 1623999719 3600 900 604800 60

;; Query time: 29 msec
;; SERVER: 10.44.0.1#53(10.44.0.1)
;; WHEN: Fri Jun 18 12:30:22 CDT 2021
;; MSG SIZE  rcvd: 170

Here's the debug log for one query:

Jun 18 12:34:27 giskard systemd-resolved[21136]: Got message type=method_call sender=:1.225 destination=org.freedesktop.resolve1 path=/org/freedesktop/resolve1 interface=org.freedesktop.resolve1.Manager member=ResolveHostname cookie=2 reply_cookie=0 signature=isit error-name=n/a error-message=n/a                                                                                 
Jun 18 12:34:27 giskard systemd-resolved[21136]: idn2_lookup_u8: vgw1-01.dal-dfw.lpv.attwifi.com → vgw1-01.dal-dfw.lpv.attwifi.com
Jun 18 12:34:27 giskard systemd-resolved[21136]: Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=GetConnectionUnixProcessID cookie=140 reply_cookie=0 signature=s error-name=n/a error-message=n/a                                                                                             
Jun 18 12:34:27 giskard systemd-resolved[21136]: Got message type=method_return sender=org.freedesktop.DBus destination=:1.191 path=n/a interface=n/a member=n/a cookie=4294967295 reply_cookie=140 signature=u error-name=n/a error-message=n/a                                                                                                                                          
Jun 18 12:34:27 giskard systemd-resolved[21136]: D-Bus hostname resolution request from client PID 24123 (n/a) with UID 4294967295
Jun 18 12:34:27 giskard systemd-resolved[21136]: Looking up RR for vgw1-01.dal-dfw.lpv.attwifi.com IN A.
Jun 18 12:34:27 giskard systemd-resolved[21136]: Looking up RR for vgw1-01.dal-dfw.lpv.attwifi.com IN AAAA.
Jun 18 12:34:27 giskard systemd-resolved[21136]: Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=AddMatch cookie=141 reply_cookie=0 signature=s error-name=n/a error-message=n/a                                                                                                               
Jun 18 12:34:27 giskard systemd-resolved[21136]: Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=GetNameOwner cookie=142 reply_cookie=0 signature=s error-name=n/a error-message=n/a                                                                                                           
Jun 18 12:34:27 giskard systemd-resolved[21136]: Got message type=method_return sender=org.freedesktop.DBus destination=:1.191 path=n/a interface=n/a member=n/a cookie=4294967295 reply_cookie=142 signature=s error-name=n/a error-message=n/a                                                                                                                                          
Jun 18 12:34:27 giskard systemd-resolved[21136]: Removing cache entry for vgw1-01.dal-dfw.lpv.attwifi.com IN A (expired 5s ago)
Jun 18 12:34:27 giskard systemd-resolved[21136]: Cache miss for vgw1-01.dal-dfw.lpv.attwifi.com IN A
Jun 18 12:34:27 giskard systemd-resolved[21136]: Firing regular transaction 62628 for <vgw1-01.dal-dfw.lpv.attwifi.com IN A> scope dns on wlp0s20f3/* (validate=yes).
Jun 18 12:34:27 giskard systemd-resolved[21136]: Using feature level UDP+EDNS0 for transaction 62628.
Jun 18 12:34:27 giskard systemd-resolved[21136]: Using DNS server 10.44.0.1 for transaction 62628.
Jun 18 12:34:27 giskard systemd-resolved[21136]: Announcing packet size 512 in egress EDNS(0) packet.
Jun 18 12:34:27 giskard systemd-resolved[21136]: Emitting UDP, link MTU is 1500, socket MTU is 0, minimal MTU is 40
Jun 18 12:34:27 giskard systemd-resolved[21136]: Sending query packet with id 62628 of size 60.
Jun 18 12:34:27 giskard systemd-resolved[21136]: Cache miss for vgw1-01.dal-dfw.lpv.attwifi.com IN AAAA
Jun 18 12:34:27 giskard systemd-resolved[21136]: Firing regular transaction 25145 for <vgw1-01.dal-dfw.lpv.attwifi.com IN AAAA> scope dns on wlp0s20f3/* (validate=yes).
Jun 18 12:34:27 giskard systemd-resolved[21136]: Using feature level UDP+EDNS0 for transaction 25145.
Jun 18 12:34:27 giskard systemd-resolved[21136]: Using DNS server 10.44.0.1 for transaction 25145.
Jun 18 12:34:27 giskard systemd-resolved[21136]: Announcing packet size 512 in egress EDNS(0) packet.
Jun 18 12:34:27 giskard systemd-resolved[21136]: Emitting UDP, link MTU is 1500, socket MTU is 0, minimal MTU is 40
Jun 18 12:34:27 giskard systemd-resolved[21136]: Sending query packet with id 25145 of size 60.
Jun 18 12:34:27 giskard systemd-resolved[21136]: Got message type=method_return sender=org.freedesktop.DBus destination=:1.191 path=n/a interface=n/a member=n/a cookie=4294967295 reply_cookie=141 signature= error-name=n/a error-message=n/a                                                                                                                                           
Jun 18 12:34:27 giskard systemd-resolved[21136]: Match type='signal',sender='org.freedesktop.DBus',path='/org/freedesktop/DBus',interface='org.freedesktop.DBus',member='NameOwnerChanged',arg0=':1.225' successfully installed.
Jun 18 12:34:27 giskard systemd-resolved[21136]: Received dns UDP packet of size 142, ifindex=3, ttl=0, fragsize=0
Jun 18 12:34:27 giskard systemd-resolved[21136]: Processing incoming packet of size 142 on transaction 62628 (rcode=SUCCESS).
Jun 18 12:34:27 giskard systemd-resolved[21136]: Added positive unauthenticated non-confidential cache entry for vgw1-01.dal-dfw.lpv.attwifi.com IN A 1s on wlp0s20f3/INET/10.44.0.1
Jun 18 12:34:27 giskard systemd-resolved[21136]: Regular transaction 62628 for <vgw1-01.dal-dfw.lpv.attwifi.com IN A> on scope dns on wlp0s20f3/* now complete with <success> from network (unsigned; non-confidential).
Jun 18 12:34:27 giskard systemd-resolved[21136]: Received dns UDP packet of size 126, ifindex=3, ttl=0, fragsize=0
Jun 18 12:34:27 giskard systemd-resolved[21136]: Processing incoming packet of size 126 on transaction 25145 (rcode=SUCCESS).
Jun 18 12:34:27 giskard systemd-resolved[21136]: Regular transaction 25145 for <vgw1-01.dal-dfw.lpv.attwifi.com IN AAAA> on scope dns on wlp0s20f3/* now complete with <success> from network (unsigned; non-confidential).
Jun 18 12:34:27 giskard systemd-resolved[21136]: Freeing transaction 62628.
Jun 18 12:34:27 giskard systemd-resolved[21136]: Sent message type=error sender=n/a destination=:1.225 path=n/a interface=n/a member=n/a cookie=143 reply_cookie=2 signature=s error-name=org.freedesktop.DBus.Error.InvalidArgs error-message=Lookup failed due to system error: Invalid argument
Jun 18 12:34:27 giskard systemd-resolved[21136]: Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=RemoveMatch cookie=144 reply_cookie=0 signature=s error-name=n/a error-message=n/a
Jun 18 12:34:27 giskard systemd-resolved[21136]: Freeing transaction 25145.
bernie@giskard:~% resolvectl --version
systemd 248 (v248.3-1.fc34)
+PAM +AUDIT +SELINUX -APPARMOR +IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 +PWQUALITY +P11KIT +QRENCODE +BZIP2 +LZ4 +XZ +ZLIB +ZSTD +XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified

@codewiz
Copy link

codewiz commented Jul 11, 2021

I think the needs-reporter-feedback label should be dropped now (logs have been provided as requested).

@mrc0mmand mrc0mmand removed the needs-reporter-feedback ❓ There's an unanswered question, the reporter needs to answer label Jul 11, 2021
@jamincollins
Copy link

I'm seeing similar behavior with a Response Policy Zone (RPZ) in BIND.

$ resolvectl query desertoasis.asgardsrealm.net
desertoasis.asgardsrealm.net: resolve call failed: Lookup failed due to system error: Invalid argument

$ resolvectl query -4 desertoasis.asgardsrealm.net
desertoasis.asgardsrealm.net: 192.168.11.2     -- link: enp6s0f3u1u1

-- Information acquired via protocol DNS in 4.0ms.
-- Data is authenticated: no; Data was acquired via local or encrypted transport: no
-- Data from: network

$ resolvectl query -6 desertoasis.asgardsrealm.net
desertoasis.asgardsrealm.net: resolve call failed: 'desertoasis.asgardsrealm.net' does not have any RR of the requested type
$ resolvectl status
Global
           Protocols: +LLMNR +mDNS -DNSOverTLS DNSSEC=no/unsupported
    resolv.conf mode: stub
Fallback DNS Servers: 1.1.1.1#cloudflare-dns.com 9.9.9.10#dns.quad9.net 8.8.8.8#dns.google
                      2606:4700:4700::1111#cloudflare-dns.com 2620:fe::10#dns.quad9.net 2001:4860:4860::8888#dns.google

Link 2 (enp2s0f0)
Current Scopes: none
     Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported

Link 3 (wlp3s0)
Current Scopes: none
     Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported

Link 4 (enp6s0f3u1u1)
    Current Scopes: DNS
         Protocols: +DefaultRoute -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 192.168.11.10
       DNS Servers: 192.168.11.10
        DNS Domain: az

Like above dig has no issue:

$ dig desertoasis.asgardsrealm.net

; <<>> DiG 9.16.23 <<>> desertoasis.asgardsrealm.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16687
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 2

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;desertoasis.asgardsrealm.net.	IN	A

;; ANSWER SECTION:
desertoasis.asgardsrealm.net. 5	IN	A	192.168.11.2

;; ADDITIONAL SECTION:
rpz.			1	IN	SOA	localhost. root.localhost. 1594332896 10800 3600 604800 38400

;; Query time: 3 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Mon Jan 10 13:34:04 MST 2022
;; MSG SIZE  rcvd: 139

@g5pw
Copy link

g5pw commented Jan 31, 2022

We have the same problem with named RPZ usage

@jamincollins
Copy link

Looks like three different users are all experiencing this issue, any update? Is further information needed? If so, what information? I'm happy to provide any additional information.

@g5pw
Copy link

g5pw commented Feb 7, 2022

Could this be similar to #17384? I seem to have the same behavior via -4/-6 (-4 works, -6 fails)

@g5pw
Copy link

g5pw commented Feb 7, 2022

Sorry for double-posting. On a whim, I tried to add an AAAA record to the RPZ and resolvectl is working correctly now, so we have a workaround. @jamincollins could you try and replicate that?

@jamincollins
Copy link

@g5pw, from my testing the only way this resolves the above failure is if a AAAA record is created for the RPZ host(s). However, my testing also seems to show that if a AAAA record is created the IPv6 address in the response is preferred over the IPv4 address.

In my case this is a problem as the host in question does not actually have an IPv6 address.

@pyther
Copy link

pyther commented Mar 26, 2022

Adding a data point, I encountered this issue as well on Fedora 35 and can consistently reproduce it.

systemd-resolve --version
systemd 249 (v249.9-1.fc35)
+PAM +AUDIT +SELINUX -APPARMOR +IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 +PWQUALITY +P11KIT +QRENCODE +BZIP2 +LZ4 +XZ +ZLIB +ZSTD +XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified
# Normal response
[mg5@ink speedtest]$ systemd-resolve accounts.myq-cloud.com
accounts.myq-cloud.com: 40.71.237.168          -- link: enp3s0
                        (accounts-east.myq-cloud.com)

-- Information acquired via protocol DNS in 22.0ms.
-- Data is authenticated: no; Data was acquired via local or encrypted transport: yes
-- Data from: cache network

# Successful, but took 10 seconds
[mg5@ink speedtest]$ systemd-resolve accounts.myq-cloud.com
accounts.myq-cloud.com: 40.71.237.168          -- link: enp3s0
                        (accounts-east.myq-cloud.com)

-- Information acquired via protocol DNS in 10.1101s.
-- Data is authenticated: no; Data was acquired via local or encrypted transport: yes
-- Data from: cache network

# Failed
[mg5@ink speedtest]$ systemd-resolve accounts.myq-cloud.com
accounts.myq-cloud.com: resolve call failed: Lookup failed due to system error: Invalid argument

@jluebbe
Copy link
Contributor

jluebbe commented Aug 30, 2022

Sorry for double-posting. On a whim, I tried to add an AAAA record to the RPZ and resolvectl is working correctly now, so we have a workaround. @jamincollins could you try and replicate that?

I've had the same symptom with my RPZ (Invalid argument without -4/-6, working with explicit -4). The workaround of adding an AAAA record helped here.

@jamincollins
Copy link

@jluebbe what record would you add for a host that does not have an ipv6 address?

@jluebbe
Copy link
Contributor

jluebbe commented Aug 30, 2022

@jluebbe what record would you add for a host that does not have an ipv6 address?

My host actually has a v6 address (which I just didn't have in my RPZ before). I don't have a good suggestion for your case, sorry.

@g5pw
Copy link

g5pw commented Sep 14, 2022

Hey all, we're still seeing this bug, output from journalctl:

17:05:16 systemd-resolved[]: Got message type=method_call sender=:1.33057 destination=org.freedesktop.resolve1 path=/org/freedesktop/resolve1 interface=org.freedesktop.resolve1.Manager member=ResolveHostname cookie=2 reply_cookie=0 signature=isit error-name=n/a error-message=n/a
17:05:16 systemd-resolved[]: idn2_lookup_u8: <domain_name> → <domain_name>
17:05:16 systemd-resolved[]: Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=GetConnectionUnixProcessID cookie=42379 reply_cookie=0 signature=s error-name=n/a error-message=n/a
17:05:16 systemd-resolved[]: Got message type=method_return sender=org.freedesktop.DBus destination=:1.1 path=n/a interface=n/a member=n/a cookie=4294967295 reply_cookie=42379 signature=u error-name=n/a error-message=n/a
17:05:16 systemd-resolved[]: D-Bus hostname resolution request from client PID 284192 (n/a) with UID 4294967295
17:05:16 systemd-resolved[]: Looking up RR for <domain_name> IN A.
17:05:16 systemd-resolved[]: Looking up RR for <domain_name> IN AAAA.
17:05:16 systemd-resolved[]: Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=AddMatch cookie=42380 reply_cookie=0 signature=s error-name=n/a error-message=n/a
17:05:16 systemd-resolved[]: Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=GetNameOwner cookie=42381 reply_cookie=0 signature=s error-name=n/a error-message=n/a
17:05:16 systemd-resolved[]: Got message type=method_return sender=org.freedesktop.DBus destination=:1.1 path=n/a interface=n/a member=n/a cookie=4294967295 reply_cookie=42381 signature=s error-name=n/a error-message=n/a
17:05:16 systemd-resolved[]: Cache miss for <domain_name> IN AAAA
17:05:16 systemd-resolved[]: Firing regular transaction 16320 for <<domain_name> IN AAAA> scope dns on enp3s0/* (validate=yes).
17:05:16 systemd-resolved[]: Using feature level UDP+EDNS0 for transaction 16320.
17:05:16 systemd-resolved[]: Using DNS server <dns_ip> for transaction 16320.
17:05:16 systemd-resolved[]: Announcing packet size 1472 in egress EDNS(0) packet.
17:05:16 systemd-resolved[]: Emitting UDP, link MTU is 1500, socket MTU is 0, minimal MTU is 40
17:05:16 systemd-resolved[]: Sending query packet with id 16320 of size 50.
17:05:16 systemd-resolved[]: Cache miss for <domain_name> IN A
17:05:16 systemd-resolved[]: Firing regular transaction 51265 for <<domain_name> IN A> scope dns on enp3s0/* (validate=yes).
17:05:16 systemd-resolved[]: Using feature level UDP+EDNS0 for transaction 51265.
17:05:16 systemd-resolved[]: Using DNS server <dns_ip> for transaction 51265.
17:05:16 systemd-resolved[]: Announcing packet size 1472 in egress EDNS(0) packet.
17:05:16 systemd-resolved[]: Emitting UDP, link MTU is 1500, socket MTU is 0, minimal MTU is 40
17:05:16 systemd-resolved[]: Sending query packet with id 51265 of size 50.
17:05:16 systemd-resolved[]: Got message type=method_return sender=org.freedesktop.DBus destination=:1.1 path=n/a interface=n/a member=n/a cookie=4294967295 reply_cookie=42380 signature= error-name=n/a error-message=n/a
17:05:16 systemd-resolved[]: Match type='signal',sender='org.freedesktop.DBus',path='/org/freedesktop/DBus',interface='org.freedesktop.DBus',member='NameOwnerChanged',arg0=':1.33057' successfully installed.
17:05:16 systemd-resolved[]: Received dns UDP packet of size 92, ifindex=2, ttl=0, fragsize=0, sender=<dns_ip>, destination=<host_ip>
17:05:16 systemd-resolved[]: Processing incoming packet of size 92 on transaction 16320 (rcode=SUCCESS).
17:05:16 systemd-resolved[]: Regular transaction 16320 for <<domain_name> IN AAAA> on scope dns on enp3s0/* now complete with <success> from network (unsigned; non-confidential).
17:05:16 systemd-resolved[]: Received dns UDP packet of size 108, ifindex=2, ttl=0, fragsize=0, sender=<dns_ip>, destination=<host_ip>
17:05:16 systemd-resolved[]: Processing incoming packet of size 108 on transaction 51265 (rcode=SUCCESS).
17:05:16 systemd-resolved[]: Added positive unauthenticated non-confidential cache entry for <domain_name> IN A 1s on enp3s0/INET/<dns_ip>
17:05:16 systemd-resolved[]: Regular transaction 51265 for <<domain_name> IN A> on scope dns on enp3s0/* now complete with <success> from network (unsigned; non-confidential).
17:05:16 systemd-resolved[]: Freeing transaction 16320.
17:05:16 systemd-resolved[]: Sent message type=error sender=n/a destination=:1.33057 path=n/a interface=n/a member=n/a cookie=42382 reply_cookie=2 signature=s error-name=org.freedesktop.DBus.Error.InvalidArgs error-message=Lookup failed due to system error: Invalid argument
17:05:16 systemd-resolved[]: Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=RemoveMatch cookie=42383 reply_cookie=0 signature=s error-name=n/a error-message=n/a
17:05:16 systemd-resolved[]: Freeing transaction 51265.

version:

systemd 250 (v250.8-1.fc36)
+PAM +AUDIT +SELINUX -APPARMOR +IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN -IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 +PWQUALITY +P11KIT +QRENCODE +BZIP2 +LZ4 +XZ +ZLIB +ZSTD +BPF_FRAMEWORK +XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified

@cstrahan-blueshift
Copy link

cstrahan-blueshift commented Oct 31, 2022

Maybe I'm having the same problem? Can't resolve our AWS instance/load-balancer host names:

$ resolvectl query internal-firefly-release-lb-492455747.us-west-2.elb.amazonaws.com
internal-firefly-release-lb-492455747.us-west-2.elb.amazonaws.com: resolve call failed: 'internal-firefly-release-lb-492455747.us-west-2.elb.amazonaws.com' does not have any RR of the requested type

Using dig and specifying the DNS ip address does work, though.

Is there a workaround for this? I've been manually touching up my /etc/hosts up to this point (for about the past month, I think). It's been a bit embarrassing being "that Linux guy" at work who has to hold up our meetings while I futz with /etc/hosts 😅.

journalctl logs (after systemctl service-log-level systemd-resolved debug):

Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Got message type=method_call sender=:1.10872 destination=org.freedesktop.resolve1 path=/org/freedesktop/resolve1 interface=org.freedesktop.resolve1.Manager member=ResolveHostname cookie=2 reply_cookie=0 signature=isit error-name=n/a error-message=n/a
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: idn2_lookup_u8: internal-firefly-release-lb-492455747.us-west-2.elb.amazonaws.com → internal-firefly-release-lb-492455747.us-west-2.elb.amazonaws.com
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=GetConnectionUnixProcessID cookie=65 reply_cookie=0 signature=s error-name=n/a error-message=n/a
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Got message type=method_return sender=org.freedesktop.DBus destination=:1.10859 path=n/a interface=n/a member=n/a cookie=4294967295 reply_cookie=65 signature=u error-name=n/a error-message=n/a
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: D-Bus hostname resolution request from client PID 3002067 (n/a) with UID 4294967295
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Looking up RR for internal-firefly-release-lb-492455747.us-west-2.elb.amazonaws.com IN A.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Looking up RR for internal-firefly-release-lb-492455747.us-west-2.elb.amazonaws.com IN AAAA.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=AddMatch cookie=66 reply_cookie=0 signature=s error-name=n/a error-message=n/a
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=GetNameOwner cookie=67 reply_cookie=0 signature=s error-name=n/a error-message=n/a
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Got message type=method_return sender=org.freedesktop.DBus destination=:1.10859 path=n/a interface=n/a member=n/a cookie=4294967295 reply_cookie=67 signature=s error-name=n/a error-message=n/a
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Cache miss for internal-firefly-release-lb-492455747.us-west-2.elb.amazonaws.com IN AAAA
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Firing regular transaction 41360 for <internal-firefly-release-lb-492455747.us-west-2.elb.amazonaws.com IN AAAA> scope dns on wlp0s20f3/* (validate=yes).
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Using feature level UDP+EDNS0 for transaction 41360.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Using DNS server 192.168.1.1 for transaction 41360.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Announcing packet size 1472 in egress EDNS(0) packet.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Emitting UDP, link MTU is 1500, socket MTU is 0, minimal MTU is 40
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Sending query packet with id 41360 of size 94.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Cache miss for internal-firefly-release-lb-492455747.us-west-2.elb.amazonaws.com IN A
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Firing regular transaction 4003 for <internal-firefly-release-lb-492455747.us-west-2.elb.amazonaws.com IN A> scope dns on wlp0s20f3/* (validate=yes).
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Using feature level UDP+EDNS0 for transaction 4003.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Using DNS server 192.168.1.1 for transaction 4003.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Announcing packet size 1472 in egress EDNS(0) packet.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Emitting UDP, link MTU is 1500, socket MTU is 0, minimal MTU is 40
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Sending query packet with id 4003 of size 94.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Cache miss for internal-firefly-release-lb-492455747.us-west-2.elb.amazonaws.com IN A
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Firing regular transaction 23605 for <internal-firefly-release-lb-492455747.us-west-2.elb.amazonaws.com IN A> scope dns on tun0/* (validate=yes).
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Using feature level UDP+EDNS0 for transaction 23605.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Using DNS server 10.20.0.2 for transaction 23605.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Announcing packet size 1472 in egress EDNS(0) packet.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Emitting UDP, link MTU is 1500, socket MTU is 0, minimal MTU is 40
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Sending query packet with id 23605 of size 94.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Cache miss for internal-firefly-release-lb-492455747.us-west-2.elb.amazonaws.com IN AAAA
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Firing regular transaction 30648 for <internal-firefly-release-lb-492455747.us-west-2.elb.amazonaws.com IN AAAA> scope dns on tun0/* (validate=yes).
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Using feature level UDP+EDNS0 for transaction 30648.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Using DNS server 10.20.0.2 for transaction 30648.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Announcing packet size 1472 in egress EDNS(0) packet.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Emitting UDP, link MTU is 1500, socket MTU is 0, minimal MTU is 40
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Sending query packet with id 30648 of size 94.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Got message type=method_return sender=org.freedesktop.DBus destination=:1.10859 path=n/a interface=n/a member=n/a cookie=4294967295 reply_cookie=66 signature= error-name=n/a error-message=n/a
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Match type='signal',sender='org.freedesktop.DBus',path='/org/freedesktop/DBus',interface='org.freedesktop.DBus',member='NameOwnerChanged',arg0=':1.10872' successfully installed.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Received dns UDP packet of size 100, ifindex=3, ttl=0, fragsize=0
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Processing incoming packet of size 100 on transaction 4003 (rcode=SUCCESS).
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Not caching negative entry without a SOA record: internal-firefly-release-lb-492455747.us-west-2.elb.amazonaws.com IN A
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Regular transaction 4003 for <internal-firefly-release-lb-492455747.us-west-2.elb.amazonaws.com IN A> on scope dns on wlp0s20f3/* now complete with <success> from network (unsigned; non-confidential).
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Received dns UDP packet of size 172, ifindex=3, ttl=0, fragsize=0
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Processing incoming packet of size 172 on transaction 41360 (rcode=SUCCESS).
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Not caching negative entry for: internal-firefly-release-lb-492455747.us-west-2.elb.amazonaws.com IN AAAA, cache mode set to no-negative
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Regular transaction 41360 for <internal-firefly-release-lb-492455747.us-west-2.elb.amazonaws.com IN AAAA> on scope dns on wlp0s20f3/* now complete with <success> from network (unsigned; non-confidential).
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Freeing transaction 4003.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Freeing transaction 23605.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Added socket 19 to graveyard
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Freeing transaction 30648.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Added socket 20 to graveyard
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Sent message type=error sender=n/a destination=:1.10872 path=n/a interface=n/a member=n/a cookie=68 reply_cookie=2 signature=s error-name=org.freedesktop.resolve1.NoSuchRR error-message='internal-firefly-release-lb-492455747.us-west-2.elb.amazonaws.com' does not have any RR of the requested type
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=RemoveMatch cookie=69 reply_cookie=0 signature=s error-name=n/a error-message=n/a
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Freeing transaction 41360.
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Closing graveyard socket fd 20
Oct 31 17:37:11 8l05sc182305 systemd-resolved[2995963]: Closing graveyard socket fd 19

System details:

$ apt list --installed | grep systemd

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

libnss-systemd/jammy,now 249.11-0ubuntu3pop0~1663192488~22.04~d6043af amd64 [installed,automatic]
libpam-systemd/jammy,now 249.11-0ubuntu3pop0~1663192488~22.04~d6043af amd64 [installed,automatic]
libsystemd0/jammy,now 249.11-0ubuntu3pop0~1663192488~22.04~d6043af amd64 [installed,automatic]
python3-systemd/jammy,now 234-3ubuntu2 amd64 [installed,automatic]
systemd-sysv/jammy,now 249.11-0ubuntu3pop0~1663192488~22.04~d6043af amd64 [installed,automatic]
systemd-timesyncd/jammy,now 249.11-0ubuntu3pop0~1663192488~22.04~d6043af amd64 [installed,automatic]
systemd/jammy,now 249.11-0ubuntu3pop0~1663192488~22.04~d6043af amd64 [installed,automatic]

$ cat /etc/os-release
NAME="Pop!_OS"
VERSION="22.04 LTS"
ID=pop
ID_LIKE="ubuntu debian"
PRETTY_NAME="Pop!_OS 22.04 LTS"
VERSION_ID="22.04"
HOME_URL="https://pop.system76.com"
SUPPORT_URL="https://support.system76.com"
BUG_REPORT_URL="https://github.com/pop-os/pop/issues"
PRIVACY_POLICY_URL="https://system76.com/privacy"
VERSION_CODENAME=jammy
UBUNTU_CODENAME=jammy
LOGO=distributor-logo-pop-os

$ uname -a
Linux 8l05sc182305 5.19.16-76051916-generic #202210150742~1666053244~22.04~cf07008 SMP PREEMPT_DYNAMIC Tue O x86_64 x86_64 x86_64 GNU/Linux

@cstrahan-blueshift
Copy link

I would be more than happy to pair up with anyone interested in resolving this -- I can make myself available whenever.

@wdoekes
Copy link

wdoekes commented Jan 19, 2023

For bind with RPZ, this appears to work:
https://lists.isc.org/mailman/htdig/bind-users/2012-February/086764.html

Instead of this in your rpz.local zone:

override.example.com  IN A  1.2.3.4

You do this:

override.example.com        IN CNAME  override.rpz.my-domain.net
override.rpz.my-domain.net  IN A      1.2.3.4

Now A/AAAA lookups both return the CNAME. And the CNAME target (for a legit domain) can safely return only an A record.


Also, as mentioned above, regular queries work fine. But calls through the bus (resolvectl or libnss-resolve) only exhibit the issue.


As for the cause of the issue, I think we're moving through these parts:

void dns_transaction_process_reply(DnsTransaction *t, DnsPacket *p, bool encrypted) {
...
        /* After the superficial checks, actually parse the message. */
        r = dns_packet_extract(p);
        if (r < 0) {
...
                dns_transaction_complete(t, DNS_TRANSACTION_INVALID_REPLY);
                return;
        }

        dns_transaction_process_dnssec(t);
        return;
...
static void dns_transaction_process_dnssec(DnsTransaction *t) {
...
        dns_transaction_cache_answer(t);

        if (t->answer_rcode == DNS_RCODE_SUCCESS)
                dns_transaction_complete(t, DNS_TRANSACTION_SUCCESS);
        else
                dns_transaction_complete(t, DNS_TRANSACTION_RCODE_FAILURE);

        return;
void dns_transaction_complete(DnsTransaction *t, DnsTransactionState state) {
...
        log_debug("%s transaction %" PRIu16 " for <%s> on scope %s on %s/%s now complete with <%s> from %s (%s; %s).",
...

We get this log message:

Processing incoming packet of size 109 on transaction 39064 (rcode=SUCCESS)

And this:

Regular transaction 39064 for <override.example.com IN AAAA> on scope dns on vpn-xxx/* now complete with <success> from network (unsigned; non-confidential).

But not this, which should be between those two:

Not caching negative entry for: override.example.com IN AAAA, cache mode set to no-negative

Because of the last message with <success>, I assume we're in that last dns_transaction_complete(t, DNS_TRANSACTION_SUCCESS);. In which case something is different in dns_transaction_cache_answer where we're not hitting the Not caching... log.

static void dns_transaction_cache_answer(DnsTransaction *t) {
...
        dns_cache_put(&t->scope->cache,
                      t->scope->manager->enable_cache,
                      dns_transaction_key(t),
                      t->answer_rcode,
                      t->answer,
                      DNS_PACKET_CD(t->received) ? t->received : NULL, /* only cache full packets with CD on,
                                                                        * since our usecase for caching them
                                                                        * is "bypass" mode which is only
                                                                        * enabled for CD packets. */
                      t->answer_query_flags,
                      t->answer_dnssec_result,
                      t->answer_nsec_ttl,
                      t->received->family,
                      &t->received->sender);
int dns_cache_put(
                DnsCache *c,
                DnsCacheMode cache_mode,
                DnsResourceKey *key,
                int rcode,
                DnsAnswer *answer,
                DnsPacket *full_packet,
                uint64_t query_flags,
                DnssecResult dnssec_result,
                uint32_t nsec_ttl,
                int owner_family,
                const union in_addr_union *owner_address) {
...
        /* See https://tools.ietf.org/html/rfc2308, which say that a matching SOA record in the packet is used to
         * enable negative caching. We apply one exception though: if we are about to cache a weird rcode we do so
         * regardless of a SOA. */
        r = dns_answer_find_soa(answer, key, &soa, &flags);
        if (r < 0)
                goto fail;
        if (r == 0 && !weird_rcode)
                return 0;
...

Probably dns_answer_find_soa returns 0, because we get no example.com. SOA, but an rpz.local. one:

;; ADDITIONAL SECTION:
rpz.local.		1	IN	SOA	localhost. root.localhost. 1 604800 86400 2419200 86400

(And not in the authority section, but in the additional one. See other people's dig output.)


But, that was not the problem either. The problem is here:

int dns_answer_add(
                DnsAnswer *a,
                DnsResourceRecord *rr,
                int ifindex,
                DnsAnswerFlags flags,
                DnsResourceRecord *rrsig) {

        DnsAnswerItem tmp, *exist;

        assert(rr);

        if (!a)
                return -ENOSPC;
        if (a->n_ref > 1)
                return -EBUSY;

        tmp = (DnsAnswerItem) {
                .rr = rr,
                .ifindex = ifindex,
        };

        exist = set_get(a->set_items, &tmp);
        if (exist) {
                /* There's already an RR of the same RRset in place! Let's see if the TTLs more or less
                 * match. We don't really care if they match precisely, but we do care whether one is 0 and
                 * the other is not. See RFC 2181, Section 5.2. */
                if ((rr->ttl == 0) != (exist->rr->ttl == 0))
                        return -EINVAL;

This code is invoked from dns_transaction_complete()

With these values:

--- rr->ttl	2023-01-19 13:37:54.704345610 +0100
+++ expect->rr->ttl	2023-01-19 13:37:53.992350201 +0100
@@ -1,8 +1,8 @@
 {
-  n_ref = 1,
-  ttl = 0,
+  n_ref = 2,
+  ttl = 1,
   expiry = 18446744073709551615,
-  key = 0x55f8d52c35e0,
+  key = 0x55f8d52cbdb0,
   to_string = 0x0,
   n_skip_labels_signer = 255 '\377',
   n_skip_labels_source = 255 '\377',
@@ -12,63 +12,63 @@
   wire_format_size = 0,
   wire_format_rdata_offset = 0,
   {
-    generic = {data = 0x55f8d52eb810,
-    data_size = 94527216755152},
-    opt = {data = 0x55f8d52eb810,
-    data_size = 94527216755152},
-    srv = {name = 0x55f8d52eb810 "localhost",
-    priority = 8656,
+    generic = {data = 0x55f8d52d7220,
+    data_size = 94527216776608},
+    opt = {data = 0x55f8d52d7220,
+    data_size = 94527216776608},
+    srv = {name = 0x55f8d52d7220 "localhost",
+    priority = 30112,
#0  dns_answer_add (a=a@entry=0x55f8d52e0010, rr=0x55f8d52dfaf0, ifindex=18, flags=DNS_ANSWER_SECTION_ADDITIONAL, rrsig=0x0) at ../src/resolve/resolved-dns-answer.c:167
#1  0x000055f8d438925e in dns_answer_add_all (b=0x55f8d52d77d0, a=0x55f8d52e0010) at ../src/resolve/resolved-dns-answer.c:195
#2  dns_answer_merge (ret=<synthetic pointer>, b=0x55f8d52d77d0, a=<optimized out>) at ../src/resolve/resolved-dns-answer.c:447
#3  dns_answer_extend (a=0x55f8d52d7f30, b=0x55f8d52d77d0) at ../src/resolve/resolved-dns-answer.c:462
#4  0x000055f8d4352175 in dns_query_accept (q=0x55f8d52d7ee0, c=0x55f8d52cb190) at ../src/resolve/resolved-dns-query.c:849
#5  0x000055f8d4366c18 in dns_transaction_complete (t=0x55f8d52cf2c0, state=<optimized out>) at ../src/resolve/resolved-dns-transaction.c:437
#6  0x000055f8d436a55d in on_dns_packet (s=<optimized out>, fd=<optimized out>, revents=<optimized out>, userdata=0x55f8d52cf2c0) at ../src/resolve/resolved-dns-transaction.c:1457
#7  0x00007fce5059d8e0 in source_dispatch (s=s@entry=0x55f8d52cf530) at ../src/libsystemd/sd-event/sd-event.c:3526

So, because expect->rr->ttl holds 1 and rr->ttl holds 0, we get EINVAL.

This TTL of 1 is hardcoded in bind9. If we change it to 111, the problem persists. If we change it to 0, the problem goes away. But then bind is serving a SOA with a TTL of 0:

bind9-9.18.1$ diff -pu lib/ns/query.c{.orig,}
--- lib/ns/query.c.orig	2023-01-19 14:18:47.246309269 +0100
+++ lib/ns/query.c	2023-01-19 14:17:25.660158590 +0100
@@ -7197,7 +7197,7 @@ query_checkrpz(query_ctx_t *qctx, isc_re
 		if (qctx->rpz_st->m.rpz->addsoa) {
 			bool override_ttl =
 				dns_rdataset_isassociated(qctx->rdataset);
-			rresult = query_addsoa(qctx, override_ttl,
+			rresult = query_addsoa(qctx, override_ttl * 0,
 					       DNS_SECTION_ADDITIONAL);
 			if (rresult != ISC_R_SUCCESS) {
 				QUERY_ERROR(qctx, result);

Not sure where the problem is. But this should be sufficient debug info to get to the bottom of this.

Cheers,
Walter Doekes
OSSO B.V.

@poettering
Copy link
Member

Fixed by 8ec951e, i.e. merging multiple RRs with different TTLs into one rrset is no longer refused with EINVAL. Closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

10 participants