Don't inherit DNS from DHCP on Tentacool #364

Closed
samhh opened this issue Oct 30, 2022 · 25 comments

samhh commented Oct 30, 2022

Tentacool hosts Onix, my LAN DNS server. When Onix's container is updated it goes down, and Tentacool then fails to fetch the new image. I can work around this by supplying a second DNS server like 8.8.8.8 via DHCP, but this can let ads through on other devices. Ideally Tentacool would use hardcoded DNS servers rather than inheriting them from DHCP.

samhh commented Nov 7, 2022

To check the configured DNS servers, including any set by DHCP:

$ resolvconf -l

samhh commented Nov 7, 2022

The above output updates immediately, but no luck with c0878e6 or 199597e for 3cbfa1d.

Nov 07 23:31:00 tentacool systemd[1]: Starting podman-pihole.service...
Nov 07 23:31:00 tentacool podman-pihole-start[3386227]: Resolving "pihole/pihole" using unqualified-search registries (/etc/containers/registries.conf)
Nov 07 23:31:00 tentacool podman-pihole-start[3386227]: Trying to pull docker.io/pihole/pihole:2022.10...
Nov 07 23:31:00 tentacool podman-pihole-start[3386227]: Trying to pull quay.io/pihole/pihole:2022.10...
Nov 07 23:31:00 tentacool podman-pihole-start[3386227]: Error: 2 errors occurred while pulling:
Nov 07 23:31:00 tentacool podman-pihole-start[3386227]:  * initializing source docker://pihole/pihole:2022.10: pinging container registry registry-1.docker.io: Get "https://registry-1.docker.io/v2/": dial tcp: lookup registry-1.docker.io: no such host
Nov 07 23:31:00 tentacool podman-pihole-start[3386227]:  * initializing source docker://quay.io/pihole/pihole:2022.10: pinging container registry quay.io: Get "https://quay.io/v2/": dial tcp: lookup quay.io: no such host
Nov 07 23:31:00 tentacool systemd[1]: podman-pihole.service: Main process exited, code=exited, status=125/n/a
Nov 07 23:31:00 tentacool podman-pihole-post-stop[3386242]: Error: reading CIDFile: open /run/podman-pihole.ctr-id: no such file or directory
Nov 07 23:31:00 tentacool systemd[1]: podman-pihole.service: Control process exited, code=exited, status=125/n/a
Nov 07 23:31:00 tentacool systemd[1]: podman-pihole.service: Failed with result 'exit-code'.
Nov 07 23:31:00 tentacool systemd[1]: Failed to start podman-pihole.service.
Nov 07 23:31:00 tentacool systemd[1]: podman-pihole.service: Scheduled restart job, restart counter is at 5.
Nov 07 23:31:00 tentacool systemd[1]: Stopped podman-pihole.service.

Maybe let's try reloading resolvconf first somehow, or rebooting? Though I'd expect Nix to take care of that stuff.
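To see exactly which names the pull is failing to resolve, the journal lines above can be filtered. A minimal sketch, assuming the `no such host` wording shown in the log; `failed_lookups` is a hypothetical helper name, not part of Podman or systemd:

```shell
# failed_lookups: read log lines on stdin and print each hostname that
# failed to resolve, one per line, de-duplicated.
# Assumes the "lookup <host>: no such host" wording seen in the log above.
failed_lookups() {
  grep -o 'lookup [^ :]*: no such host' \
    | awk '{ print $2 }' \
    | sed 's/:$//' \
    | sort -u
}
```

Fed the log above (for example via `journalctl -u podman-pihole | failed_lookups`), this should list `quay.io` and `registry-1.docker.io`, i.e. the registry hosts rather than bare `docker.io`.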

@samhh samhh pinned this issue Nov 7, 2022

samhh commented Nov 8, 2022

$ systemctl status resolvconf

Suggests nothing has happened with the service for four days. Perhaps it indeed needs reloading.

samhh commented Nov 8, 2022

No luck with a unit restart after updating the nameservers but before upgrading Pi-hole, regardless of the order of the nameservers.

samhh commented Nov 8, 2022

Simpler repro:

$ resolvconf -l
nameserver 127.0.0.1
nameserver 8.8.8.8

<dhcp nameservers>

$ nix-shell -p dogdns

$ dog samhh.com
A samhh.com. <time> <ip>

# systemctl stop podman-pihole

$ dog samhh.com
Error [network]: Connection refused (os error 111)

# systemctl start podman-pihole

$ dog samhh.com
A samhh.com. <time> <ip>

samhh commented Nov 8, 2022

https://serverfault.com/a/513273:

The resolver will query the second name server only if the attempt to reach the first name server times out. In your case, it is not a time out issue, it is a resolution failure, so there is no need to query the remaining name servers.

I'm actually not sure now how I ever got it working before via DHCP. Presumably on startup something is checking for a working nameserver among those listed and with a Tentacool restart that's how it got beyond Onix.

samhh commented Nov 8, 2022

Perhaps using Unbound could work around this. It wouldn't have the same issue, as the new version would be fetched from nixpkgs before the service is restarted. (Relevant issue for a nixpkgs derivation of Pi-hole: NixOS/nixpkgs#61617)

  • Pi-hole container: up/build -> down/fetch -> fail
  • Unbound derivation: up/fetch -> down/build -> up -> success

Then again, it's another moving part which I don't really need as Onix maintains its own cache:

$ dog samhh.com --time -n tentacool
A samhh.com. <etc>
Ran in 1ms
$ dog samhh.com --time -n 8.8.8.8
A samhh.com. <etc>
Ran in 7ms

Blocky also has a nixpkgs derivation. Really, come to think of it, the best solution is anything that gets this out of its container.

samhh commented Nov 8, 2022

Downside of leaving Pi-hole... figuring out how to plug anything new into UniFi.

samhh commented Nov 8, 2022

AdGuard Home is another option.

samhh commented Nov 8, 2022

More broadly, it's really irritating that NixOS brings down containers before it fetches new images. It causes needless downtime on upgrades of services like Starmie. In other words, I want zero-downtime deployments, which'd solve this issue by proxy. Then again I guess that's what you get if you just don't use containers...

samhh commented Nov 8, 2022

Don't particularly want to lose niceties like Pi-hole integration on my phone.

samhh commented Nov 8, 2022

Back to the resolver. Putting an external nameserver first and restarting the resolvconf unit makes no difference, seemingly contradicting the idea that it runs them in order. Something else stateful at play?

samhh commented Nov 8, 2022

Unbound will only help if the relevant container entries have been cached before Onix goes down. Or if it will do failover unlike resolvconf.

samhh commented Nov 8, 2022

I suppose a hacky workaround would be to use /etc/hosts to bypass resolvconf for docker.io or wherever else containers might come from.

samhh commented Nov 8, 2022

dnsmasq lets you specify nameservers for specific hosts, which'd be a little less brittle, but that's bundled into the Pi-hole image.

samhh commented Nov 8, 2022

Another test of resolvconf changes:

$ # After a restart
$ dog samhh.com --time
<etc>
Ran in 20ms
$ dog samhh.com --time
<etc>
Ran in 0ms
# # rebuild with 8.8.8.8 placed at the top, validated with `resolvconf -l`
$ dog samhh.com --time
<etc>
Ran in 0ms
# systemctl restart resolvconf
$ dog samhh.com --time
<etc>
Ran in 0ms

It caches in Onix and then clearly keeps using Onix regardless of what else changes.

samhh commented Nov 8, 2022

No luck with /etc/hosts:

44e1704

$ cat /etc/hosts
127.0.0.1 localhost
::1 localhost
127.0.0.2 tentacool
::1 tentacool
3.228.146.75 docker.io

Exact same service error. Maybe Podman doesn't pick up on host network changes properly?
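One thing that may be worth ruling out: the lookup in the earlier service error is for `registry-1.docker.io`, whereas the entry added here is for `docker.io`, and `/etc/hosts` matches on exact names. A rough sketch for checking whether a hosts file covers a given name; `hosts_has` is a hypothetical helper:

```shell
# hosts_has NAME [FILE]
# Succeed if FILE (default: /etc/hosts) has an entry whose hostname or
# alias is exactly NAME. Comments are ignored.
hosts_has() {
  name=$1
  file=${2:-/etc/hosts}
  awk -v name="$name" '
    { sub(/#.*/, "") }   # strip trailing comments
    { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
    END { exit found ? 0 : 1 }
  ' "$file"
}
```

With the hosts file shown above, `hosts_has docker.io` would succeed while `hosts_has registry-1.docker.io` would not. This only checks the file itself; it says nothing about whether Podman consults it.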

samhh commented Nov 8, 2022

No luck with:

commit 26218d34028095c6cfd9fc5961d807874ce14f3b
Author: Sam A. Horvath-Hunt <hello@samhh.com>
Date:   Tue Nov 8 19:28:20 2022 +0000

    Live resolvconf updates

diff --git a/hosts/tentacool/network.nix b/hosts/tentacool/network.nix
index d2bdff0..11828c3 100644
--- a/hosts/tentacool/network.nix
+++ b/hosts/tentacool/network.nix
@@ -12,4 +12,8 @@
     # nameservers in order to configure this external nameserver.
     nameservers = [ "8.8.8.8" "127.0.0.1" ];
   };
+
+  # This needs to be set so that a full system restart isn't needed:
+  #   https://unix.stackexchange.com/a/487615
+  system.nssDatabases.hosts = [ "resolve" ];
 }

samhh commented Nov 8, 2022

Another idea as a workaround for now - pull the image before rebuilding whenever there's an Onix upgrade:

$ podman pull pihole/pihole:2022.10

Something's different about images which have already been successfully brought up as containers though:

$ podman pull pihole/pihole:2022.10
Trying to pull pihole/pihole:2022.10... etc
$ podman pull docker.io/pihole/pihole:2022.09.4
Trying to pull docker.io/pihole/pihole:2022.09.4... etc
$ podman pull pihole/pihole:2022.09.4
Resolved "pihole/pihole" as an alias (/home/sam/.cache/containers/short-name-aliases.conf)
Trying to pull pihole/pihole:2022.09.4... etc

I don't understand why this only applies to 2022.09.4 given the contents of the referenced file:

$ cat ~/.cache/containers/short-name-aliases.conf
[aliases]
  "pihole/pihole" = "docker.io/pihole/pihole"
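The pre-pull idea at the top of this comment could be wrapped so that the rebuild only happens once the image is already local. A sketch under a few assumptions: the pull is run as the same user that runs the service (rootless Podman keeps its own image store, separate from root's), the rebuild is a plain `nixos-rebuild switch`, and `PODMAN` / `REBUILD` are hypothetical override variables added here for dry runs:

```shell
# prepull_then_rebuild IMAGE
# Pull IMAGE while DNS is still up, and only rebuild if the pull succeeded.
# PODMAN and REBUILD are hypothetical overrides, handy for dry runs.
prepull_then_rebuild() {
  image=$1
  if ! ${PODMAN:-podman} pull "$image"; then
    echo "pull of $image failed; skipping rebuild" >&2
    return 1
  fi
  ${REBUILD:-nixos-rebuild switch}
}
```

For example, `prepull_then_rebuild docker.io/pihole/pihole:2022.10`. Using the fully qualified name sidesteps the short-name alias lookup entirely.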

samhh commented Nov 8, 2022

wut

$ resolvconf -l
nameserver 127.0.0.1
nameserver 8.8.8.8
<etc>
$ cat /etc/resolv.conf
nameserver 127.0.0.1
options edns0

samhh commented Nov 8, 2022

No luck with:

commit e8e0124be0aeb3c4fc9595bd2c9d96e48788141a
Author: Sam A. Horvath-Hunt <hello@samhh.com>
Date:   Tue Nov 8 20:24:07 2022 +0000

    Rotate Tentacool nameservers

diff --git a/hosts/tentacool/network.nix b/hosts/tentacool/network.nix
index ee636db..7f2bc32 100644
--- a/hosts/tentacool/network.nix
+++ b/hosts/tentacool/network.nix
@@ -11,5 +11,10 @@
     # points only at Onix and this machine specifically will override DHCP's
     # nameservers in order to configure this additional, external nameserver.
     nameservers = [ "127.0.0.1" "8.8.8.8" ];
+
+    # resolvconf doesn't failover to other nameservers if the first it tries
+    # completely fails. This makes it rotate between all available options,
+    # which changes that unwanted behaviour as a side effect.
+    resolvconf.extraOptions = [ "rotate" ];
   };
 }

samhh added a commit that referenced this issue Nov 9, 2022
This reverts commit 49850e4.

Currently blocked by an inability to deploy to Tentacool due to
nameserver issues, see #364.

samhh commented Nov 10, 2022

Temporary workaround:

  1. networking.nameservers = [ "8.8.8.8" ];
  2. Rebuild
  3. Upgrade Onix
  4. Rebuild
  5. Reset/remove networking.nameservers
  6. Rebuild

@samhh samhh unpinned this issue Nov 10, 2022

samhh commented Jan 30, 2023

I can work around this by supplying a second DNS server like 8.8.8.8 via DHCP, but this can let ads through on other devices.

Let's validate this. They could always use their own DNS anyway.

Edit: Yeah, this seems to let ads into the network. Block % went down too.

Edit 2: Makes sense, see also: https://www.reddit.com/r/pihole/comments/12dzrfq/small_question_while_updating_raspberrypis_os/jfaq80u/?context=3

samhh commented Aug 4, 2023

It would be nice to try UniFi's built-in adblocker, as it'd mean one fewer moving part, particularly for DNS. If it's trash then probably Blocky, as it's declarative and would solve this issue (+ #386).

samhh commented Aug 6, 2023

UniFi's adblocker was noticeably less effective. Went with Blocky: e22e0e3

@samhh samhh closed this as completed Aug 6, 2023