[SERVFAIL] Unbound with DoT enabled fails to resolve certain websites #1060

Closed
yuukiAme opened this issue May 2, 2024 · 4 comments

@yuukiAme

yuukiAme commented May 2, 2024

Describe the bug
Intermittently and seemingly at random, a few domains fail to load in my browsers (Firefox and Vivaldi).
The main example in this issue will be fitgirl-repacks.site. Another domain is medium.com.

To reproduce
Steps to reproduce the behavior:

  1. Install DietPi, then use dietpi-software to install Pi-hole and Unbound, which come with DietPi's custom scripts.
  2. Activate DoT (DNS over TLS) in Unbound as described in the DietPi docs.
  3. Run dig fitgirl-repacks.site @127.0.0.1 -p 5335 in a terminal on the Unbound server.
  4. The output shows SERVFAIL: the domain could not be resolved.

Expected behavior
The website fitgirl-repacks.site resolves to a public IP and loads in the browser normally.

System:

  • Unbound version: Version 1.17.1
  • OS:
root@DietPi:~# uname -a
Linux DietPi 6.1.21-v8+ #1642 SMP PREEMPT Mon Apr  3 17:24:16 BST 2023 aarch64 GNU/Linux

  • unbound -V output:
root@DietPi:~# unbound -V
Version 1.17.1

Configure line: --build=aarch64-linux-gnu --prefix=/usr --includedir=${prefix}/include --mandir=${prefix}/share/man --infodir=${prefix}/share/info --sysconfdir=/etc --localstatedir=/var --disable-option-checking --disable-silent-rules --libdir=${prefix}/lib/aarch64-linux-gnu --runstatedir=/run --disable-maintainer-mode --disable-dependency-tracking --with-pythonmodule --with-pyunbound --enable-subnet --enable-dnstap --enable-systemd --with-libnghttp2 --with-chroot-dir= --with-dnstap-socket-path=/run/dnstap.sock --disable-rpath --with-pidfile=/run/unbound.pid --with-libevent --enable-tfo-client --with-rootkey-file=/usr/share/dns/root.key --disable-flto --enable-tfo-server
Linked libs: libevent 2.1.12-stable (it uses epoll), OpenSSL 3.0.11 19 Sep 2023
Linked modules: dns64 python subnetcache respip validator iterator
TCP Fastopen feature available

BSD licensed, see LICENSE in source package for details.
Report bugs to unbound-bugs@nlnetlabs.nl or https://github.com/NLnetLabs/unbound/issues

Additional information
I understand that I'm using a custom script with Pi-hole and Unbound.
I have tried to read the journalctl log, but I don't understand it well enough.
As far as I can tell from the log, the public IP address of fitgirl-repacks.site is returned to Unbound, but dig still fails.
I need some help to understand why the query returned SERVFAIL.

I use Pi-hole and Unbound, with DoT forwarding to Cloudflare DNS and Quad9 DNS.
My ISP blocks fitgirl-repacks.site, among other domains that my country requires ISPs to block in their own DNS. nslookup and dig may return a public IP, but the ISP can still block the response when the query is unencrypted, which is why I'm using Unbound with DoT.

Unbound with DoT activated and forwarded to Cloudflare and Quad9

root@DietPi:~# cat /etc/unbound/unbound.conf.d/dietpi-dot.conf
# Adding DNS-over-TLS support
server:
tls-cert-bundle: /etc/ssl/certs/ca-certificates.crt
## Cloudflare
forward-zone:
name: "."
forward-tls-upstream: yes
forward-addr: 1.1.1.1@853#cloudflare-dns.com
forward-addr: 1.0.0.1@853#cloudflare-dns.com
## Quad9
forward-addr: 9.9.9.9@853#dns.quad9.net
forward-addr: 149.112.112.112@853#dns.quad9.net

I have the output of journalctl -u unbound from my DietPi, with verbosity: 4 set in /etc/unbound/unbound.conf.d/dietpi.conf, which acts as my unbound.conf.

Journalctl log here.

unbound_log_fitgirl.txt
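
A verbosity 4 log is long, so a small filter can help find the relevant lines. Below is a minimal Python sketch, not an official tool: the function name find_validation_failures is made up for this example, and the two phrases are the validator messages quoted later in this thread. Other Unbound versions may word them differently.

```python
from typing import Iterable, List

# Validator messages quoted later in this thread; treat the exact wording
# as an assumption for other Unbound versions.
FAILURE_PHRASES = [
    "Failed to match any usable DS to a DNSKEY",
    "Did not match a DS to a DNSKEY",
]


def find_validation_failures(lines: Iterable[str]) -> List[str]:
    """Return the log lines that contain one of the failure phrases."""
    hits = []
    for line in lines:
        lowered = line.lower()
        if any(phrase.lower() in lowered for phrase in FAILURE_PHRASES):
            hits.append(line.rstrip("\n"))
    return hits
```

It could be called on the attached file, for example find_validation_failures(open("unbound_log_fitgirl.txt")), or on lines piped in from journalctl -u unbound.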

Output of dig against Unbound (@127.0.0.1 -p 5335):

root@DietPi:~# dig fitgirl-repacks.site @127.0.0.1 -p 5335

; <<>> DiG 9.18.24-1-Debian <<>> fitgirl-repacks.site @127.0.0.1 -p 5335
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 22893
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;fitgirl-repacks.site.          IN      A

;; Query time: 0 msec
;; SERVER: 127.0.0.1#5335(127.0.0.1) (UDP)
;; WHEN: Thu May 02 15:42:35 +07 2024
;; MSG SIZE  rcvd: 49

Output of dig against Cloudflare, unencrypted (@1.1.1.1 -p 53):

root@DietPi:~# dig fitgirl-repacks.site @1.1.1.1 -p 53

; <<>> DiG 9.18.24-1-Debian <<>> fitgirl-repacks.site @1.1.1.1 -p 53
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4016
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;fitgirl-repacks.site.          IN      A

;; ANSWER SECTION:
fitgirl-repacks.site.   8       IN      A       190.115.31.179

;; Query time: 32 msec
;; SERVER: 1.1.1.1#53(1.1.1.1) (UDP)
;; WHEN: Thu May 02 15:43:07 +07 2024
;; MSG SIZE  rcvd: 65

Output of nslookup using Cloudflare DNS:

root@DietPi:~# nslookup fitgirl-repacks.site
Server:         1.1.1.1
Address:        1.1.1.1#53

Non-authoritative answer:
Name:   fitgirl-repacks.site
Address: 190.115.31.179
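
One way to narrow down a SERVFAIL like the one above is to repeat the query with the CD ("checking disabled") bit set, which asks the resolver to skip DNSSEC checking for that query (dig offers this as +cdflag). The sketch below builds such a query by hand with the Python standard library. The function names are made up for this example, the address 127.0.0.1 and port 5335 are taken from the dig command above, and the packet is deliberately minimal (no EDNS).

```python
import socket
import struct

RD_FLAG = 0x0100  # recursion desired
CD_FLAG = 0x0010  # checking disabled (RFC 4035)


def build_query(name: str, qtype: int = 1, checking_disabled: bool = False,
                query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet for `name` (qtype 1 = A, class IN)."""
    flags = RD_FLAG | (CD_FLAG if checking_disabled else 0)
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def rcode_of(response: bytes) -> int:
    """Extract the RCODE from a DNS response (0 = NOERROR, 2 = SERVFAIL)."""
    (flags,) = struct.unpack("!H", response[2:4])
    return flags & 0x000F


def query_rcode(name: str, server: str = "127.0.0.1", port: int = 5335,
                checking_disabled: bool = False) -> int:
    """Send the query over UDP and return the RCODE of the reply."""
    packet = build_query(name, checking_disabled=checking_disabled)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(5)
        sock.sendto(packet, (server, port))
        data, _ = sock.recvfrom(4096)
    return rcode_of(data)
```

If query_rcode(name) returns 2 while query_rcode(name, checking_disabled=True) returns 0, DNSSEC validation is a likely suspect. This is only a hint, not proof.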
@wcawijngaards
Member

The logs that you helpfully included show that the domain is simply failing DNSSEC validation. The domain has published a DS record in .site that refers to a particular KSK, but that KSK is not present in the domain's DNSKEY RRset. In the logs this shows up as the messages "Failed to match any usable DS to a DNSKEY" and "Did not match a DS to a DNSKEY, thus bogus", and Unbound then fails to resolve the domain.
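
To make the DS to DNSKEY check more concrete, here is a rough Python sketch of how a DS record is matched against a DNSKEY RRset, following RFC 4034 (key tag from Appendix B, digest over the owner name plus the DNSKEY RDATA). It is an illustration only, not Unbound's code: the function names are made up, and the keys in the usage note are dummy bytes, not the real keys of any domain.

```python
import hashlib
import struct


def dnskey_rdata(flags: int, protocol: int, algorithm: int, public_key: bytes) -> bytes:
    """DNSKEY RDATA in wire format: flags, protocol, algorithm, key bytes."""
    return struct.pack("!HBB", flags, protocol, algorithm) + public_key


def key_tag(rdata: bytes) -> int:
    """Key tag per RFC 4034 Appendix B (does not apply to algorithm 1)."""
    acc = 0
    for i, byte in enumerate(rdata):
        acc += byte if i & 1 else byte << 8
    acc += (acc >> 16) & 0xFFFF
    return acc & 0xFFFF


def owner_wire(name: str) -> bytes:
    """Owner name in canonical (lowercase) wire format."""
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    return b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"


def ds_digest(owner: str, rdata: bytes, digest_type: int = 2) -> str:
    """DS digest: hash(owner name | DNSKEY RDATA). 1=SHA-1, 2=SHA-256, 4=SHA-384."""
    hasher = {1: hashlib.sha1, 2: hashlib.sha256, 4: hashlib.sha384}[digest_type]
    return hasher(owner_wire(owner) + rdata).hexdigest().upper()


def ds_matches(owner, ds_key_tag, ds_algorithm, ds_digest_type, ds_digest_hex, dnskeys):
    """True if any (flags, protocol, algorithm, key) tuple matches the DS."""
    for flags, protocol, algorithm, public_key in dnskeys:
        rdata = dnskey_rdata(flags, protocol, algorithm, public_key)
        if (algorithm == ds_algorithm
                and key_tag(rdata) == ds_key_tag
                and ds_digest(owner, rdata, ds_digest_type) == ds_digest_hex.upper()):
            return True
    return False
```

With a dummy KSK such as (257, 3, 13, b"\x01" * 64), a DS built from that key matches as long as the key is in the list passed as dnskeys. Once the key is missing from the list, ds_matches returns False, which is the situation described above.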

One solution could be to turn off DNSSEC validation. It is possible to disable DNSSEC validation for specific domains by listing them with domain-insecure: "example.com"; these domains are then not validated and do not fail. If there are several such domains, add a domain-insecure line for each.
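
As a rough illustration of the one-line-per-domain layout, the Python sketch below renders such a stanza from a list of names. The helper name render_domain_insecure is made up for this example, and example.com is only a placeholder, as in the text above.

```python
from typing import Iterable


def render_domain_insecure(domains: Iterable[str]) -> str:
    """Return a server: stanza with one domain-insecure line per domain."""
    lines = ["server:"]
    for domain in domains:
        # Drop a trailing dot so every entry is written the same way.
        lines.append(f'    domain-insecure: "{domain.rstrip(".")}"')
    return "\n".join(lines) + "\n"


print(render_domain_insecure(["fitgirl-repacks.site", "example.com"]), end="")
```

The printed text could be saved as a file under /etc/unbound/unbound.conf.d/ and checked with unbound-checkconf before restarting Unbound. Keep the list short: every listed domain loses DNSSEC protection.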

The 0x20 feature (use-caps-for-id) is enabled, but since queries are forwarded over DoT, it is unnecessary. It looks harmless, but I would turn it off, I guess, with use-caps-for-id: no. It turns out not to produce the failure in this case, so this change is not required.
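
For background, the idea behind 0x20 encoding is to randomize the letter case of the query name and to accept only replies that echo exactly the same case, which makes spoofed answers harder to forge. The Python sketch below shows the idea only. It is not Unbound's implementation, and the function names are made up for this example.

```python
import random


def encode_0x20(name: str, rng: random.Random) -> str:
    """Randomize the case of each letter; digits, dots and hyphens are kept."""
    out = []
    for ch in name:
        if ch.isalpha():
            out.append(ch.upper() if rng.random() < 0.5 else ch.lower())
        else:
            out.append(ch)
    return "".join(out)


def echo_matches(sent: str, received: str) -> bool:
    """A reply is accepted only if it echoes the exact case that was sent."""
    return sent == received
```

A real resolver would use a cryptographically secure random source. Over a TLS connection to the forwarder the answer is already protected in transit, which is why the feature is described above as unnecessary with DoT.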

@yuukiAme
Author

yuukiAme commented May 2, 2024

Thank you so much for the response, @wcawijngaards.

Steps I took, as suggested in the comment above:

  1. I edited unbound.conf (dietpi.conf in my setup) and added/changed the following lines in the config file:

domain-insecure: fitgirl-repacks.site

use-caps-for-id: no

  2. Save the changes and restart unbound with systemctl restart unbound

  3. I checked with dig fitgirl-repacks.site @127.0.0.1 -p 5335

output

root@DietPi:~# dig fitgirl-repacks.site @127.0.0.1 -p 5335

; <<>> DiG 9.18.24-1-Debian <<>> fitgirl-repacks.site @127.0.0.1 -p 5335
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 64045
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;fitgirl-repacks.site.          IN      A

;; ANSWER SECTION:
fitgirl-repacks.site.   242     IN      A       190.115.31.179

;; Query time: 0 msec
;; SERVER: 127.0.0.1#5335(127.0.0.1) (UDP)
;; WHEN: Thu May 02 16:47:56 +07 2024
;; MSG SIZE  rcvd: 65
  4. dig returned a public IP address.

When I used my browser to access the website, it failed to load at all.

I checked from another device and the same thing happened. I also checked the website through a VPN provider, and it is online and active.

What am I missing here?

@wcawijngaards
Member

I guess this is something related to the web browser's settings to fetch DNS, like the port number that is used by dig, may not be set for the web browser. That I guess is part of the other system configuration.

@yuukiAme
Author

yuukiAme commented May 2, 2024

> I guess this is something related to the web browser's settings to fetch DNS, like the port number that is used by dig, may not be set for the web browser. That I guess is part of the other system configuration.

Thanks for the help again. I guess the issue is resolved, because Unbound no longer returns SERVFAIL for this specific website. I will check my other services to see if there is a conflict somewhere within the network.
