
Issue with running Pi-hole 4.1.1 in swarm (--cap-add) #392

Closed
3 tasks done
shanekorbel opened this issue Jan 14, 2019 · 26 comments


shanekorbel commented Jan 14, 2019

In raising this issue, I confirm the following (checkboxes confirmed):

How familiar are you with the source code relevant to this issue?:

1


Expected behaviour:

Up until 4.1.1 I had been running Pi-hole as a Docker swarm service on an x64 Linux cluster running Ubuntu 16.04 (a five-node swarm). This required a little modification of the normal install, but only in how the image is started, with no real functionality changes. It allowed me to scale Pi-hole to 2 replicas for redundancy, so Pi-hole could run on any of the five hosts in the swarm should one fail.

Actual behaviour:

Once I pulled the latest image (4.1.1), my service would no longer boot at all. I even removed the service and set it up again, thinking it was something I had accidentally changed while also configuring placement preferences. Eventually this led me to the image repository, where I noticed changes had been made that require flags which are not allowed in swarm:

Starting with the v4.1.1 release your Pi-hole container may encounter issues starting the DNS service unless ran with the following settings:

--cap-add=NET_ADMIN This previously optional argument is now required or strongly encouraged
Starting in a future version FTL DNS is going to check this setting automatically
--dns=127.0.0.1 --dns=1.1.1.1 The second server can be any DNS IP of your choosing, but the first dns must be 127.0.0.1
A WARNING stating "Misconfigured DNS in /etc/resolv.conf" may show in docker logs without this.
These are the raw docker run cli versions of the commands. We provide no official support for docker GUIs but the community forums may be able to help if you do not see a place for these settings. Remember, always consult your manual too!

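For reference, the upstream notes quoted above translate to a plain docker run invocation roughly like the following (a sketch; the image tag, published ports, and the second DNS server are illustrative):

```shell
# Sketch of the flags the 4.1.1 release notes ask for, as plain docker run.
# Image tag, ports, and the backup DNS server are illustrative.
docker run -d --name pihole \
  --cap-add=NET_ADMIN \
  --dns=127.0.0.1 --dns=1.1.1.1 \
  -p 53:53/tcp -p 53:53/udp -p 80:80/tcp \
  pihole/pihole:4.1.1
# --cap-add is the part that docker service create (swarm) does not accept.
```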

Steps to reproduce:

I realize this is a somewhat special configuration, since running Pi-hole as a Docker service is not the way it is typically run; however, swarm and compose are very similar in most regards other than the issue I found.

After some testing, and after pointing my service at a prior image, I have concluded that --cap-add=NET_ADMIN is what breaks Pi-hole running as a service. --cap-add is not supported in swarm for security reasons, and docker/moby says "it's being worked on", but it has been two-plus years of "working on it".

Since this is not a standard setup, I'm happy to articulate the issue further if it is not clear, or to set up a demo of it.

Debug token provided by uploading pihole -d log:

Service will not start; debug token unavailable.

Troubleshooting undertaken, and/or other relevant information:

Setting my image to 4.1 solves the issue and lets the service start up. I fear, though, that --cap-add won't be supported in swarm in the near or even medium future, and therefore I will be stuck on an old image that may develop compatibility issues.

docker inspect of the pihole task container:

{
"AppArmorProfile": "docker-default",
"Args": [],
"Config": {
"ArgsEscaped": true,
"AttachStderr": false,
"AttachStdin": false,
"AttachStdout": false,
"Cmd": null,
"Domainname": "",
"Entrypoint": [
"/s6-init"
],
"Env": [
"WEBPASSWORD=",
"TZ=Eastern",
"IPv6=False",
"ServerIP=127.0.0.1",
"DNS1=1.1.1.1",
"DNS2=1.0.0.1",
"PATH=/opt/pihole:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"S6OVERLAY_RELEASE=https://github.com/just-containers/s6-overlay/releases/download/v1.21.7.0/s6-overlay-amd64.tar.gz",
"PIHOLE_INSTALL=/root/ph_install.sh",
"PHP_ENV_CONFIG=/etc/lighttpd/conf-enabled/15-fastcgi-php.conf",
"PHP_ERROR_LOG=/var/log/lighttpd/error.log",
"S6_LOGGING=0",
"S6_KEEP_ENV=1",
"S6_BEHAVIOUR_IF_STAGE2_FAILS=2",
"FTL_CMD=debug",
"VERSION=v4.1",
"ARCH=amd64"
],
"ExposedPorts": {
"443/tcp": {},
"53/tcp": {},
"53/udp": {},
"67/udp": {},
"80/tcp": {}
},
"Healthcheck": {
"Test": [
"CMD-SHELL",
"dig @127.0.0.1 pi.hole || exit 1"
]
},
"Hostname": "HPC2",
"Image": "pihole/pihole:4.1@sha256:3c165a8656d22b75ad237d86ba3bdf0d121088c144c0b2d34a0775a9db2048d7",
"Labels": {
"com.docker.swarm.node.id": "qkldc61gsxbqniii2rpkqjwlb",
"com.docker.swarm.service.id": "4ym908jc96qxvqnxuczn48mlh",
"com.docker.swarm.service.name": "pihole",
"com.docker.swarm.task": "",
"com.docker.swarm.task.id": "nb9cs1ietpry76pxdqf6acolj",
"com.docker.swarm.task.name": "pihole.1.nb9cs1ietpry76pxdqf6acolj",
"image": "pihole/pihole:v4.1_amd64",
"maintainer": "adam@diginc.us",
"url": "https://www.github.com/pi-hole/docker-pi-hole"
},
"OnBuild": null,
"OpenStdin": false,
"StdinOnce": false,
"Tty": false,
"User": "",
"Volumes": null,
"WorkingDir": ""
},
"Created": "2019-01-14T14:14:46.320931851Z",
"Driver": "overlay2",
"ExecIDs": [
"8b81d48090e6d2cb05f7b79c8b1c70377680ca98fc20e7b5049e093252affa4f",
"ab6630a1ccff10ae554acac07fd365d031e5621c9bf8088ed362ecdf6605f0e7"
],
"GraphDriver": {
"Data": {
"LowerDir": "/var/lib/docker/overlay2/7a7e3c7c9dd580732dbb4e248812fe156f454846d4adfcec1590a1148d56f5a7-init/diff:/var/lib/docker/overlay2/e4411ae9935a55823ed2e088892172b3ca26eccb18c12ad4f1d5ca797562f6c9/diff:/var/lib/docker/overlay2/f8acc507dc6e36cea4a286b4d3342aec43a53e38c3b0432f269abd4f1941bc0c/diff:/var/lib/docker/overlay2/bd4ce6bcd15554f56994abae241c770df8ac35fa02e17d5535e35fb13d3cdac3/diff:/var/lib/docker/overlay2/f9eb95e392611ec5b7fe11292e60be4f7a752c23455977f74a1a7965305e1d85/diff:/var/lib/docker/overlay2/b9787676c7ff67d298173c596bf712c81ec9b76b9c76ab5b404e8cae0b4f802d/diff:/var/lib/docker/overlay2/a686ee271505c41b38fd3b6a14a33d24dbc573166cf43f14c31753d997b78230/diff:/var/lib/docker/overlay2/73b4f186c76919a4ab386630bdd895eb12f522fb0899af5ff34ff958ec51bd88/diff:/var/lib/docker/overlay2/e6962d8e85bacab08f8912263a1b36152245f249de77577ebb633befd14aa9e9/diff:/var/lib/docker/overlay2/db004dda0cee0d98f4e691e1bcfc340d88e8e8a9946771c0c531381cec4d6954/diff:/var/lib/docker/overlay2/bbc8131790dadd3afe9f405fe55ebe92e38a6a619f337961e6c6b7412281282f/diff",
"MergedDir": "/var/lib/docker/overlay2/7a7e3c7c9dd580732dbb4e248812fe156f454846d4adfcec1590a1148d56f5a7/merged",
"UpperDir": "/var/lib/docker/overlay2/7a7e3c7c9dd580732dbb4e248812fe156f454846d4adfcec1590a1148d56f5a7/diff",
"WorkDir": "/var/lib/docker/overlay2/7a7e3c7c9dd580732dbb4e248812fe156f454846d4adfcec1590a1148d56f5a7/work"
},
"Name": "overlay2"
},
"HostConfig": {
"AutoRemove": false,
"Binds": null,
"BlkioDeviceReadBps": null,
"BlkioDeviceReadIOps": null,
"BlkioDeviceWriteBps": null,
"BlkioDeviceWriteIOps": null,
"BlkioWeight": 0,
"BlkioWeightDevice": null,
"CapAdd": null,
"CapDrop": null,
"Cgroup": "",
"CgroupParent": "",
"ConsoleSize": [
0,
0
],
"ContainerIDFile": "",
"CpuCount": 0,
"CpuPercent": 0,
"CpuPeriod": 0,
"CpuQuota": 0,
"CpuRealtimePeriod": 0,
"CpuRealtimeRuntime": 0,
"CpuShares": 0,
"CpusetCpus": "",
"CpusetMems": "",
"DeviceCgroupRules": null,
"Devices": null,
"DiskQuota": 0,
"Dns": null,
"DnsOptions": null,
"DnsSearch": null,
"ExtraHosts": [
"HPC1:172.16.0.191",
"HPC2:172.16.0.192",
"HPC3:172.16.0.193",
"HPC4:172.16.0.194",
"HPC5:172.16.0.195"
],
"GroupAdd": null,
"IOMaximumBandwidth": 0,
"IOMaximumIOps": 0,
"IpcMode": "shareable",
"Isolation": "default",
"KernelMemory": 0,
"Links": null,
"LogConfig": {
"Config": {},
"Type": "json-file"
},
"MaskedPaths": [
"/proc/acpi",
"/proc/kcore",
"/proc/keys",
"/proc/latency_stats",
"/proc/timer_list",
"/proc/timer_stats",
"/proc/sched_debug",
"/proc/scsi",
"/sys/firmware"
],
"Memory": 0,
"MemoryReservation": 0,
"MemorySwap": 0,
"MemorySwappiness": null,
"Mounts": [
{
"Source": "/etc/pihole",
"Target": "/etc/pihole",
"Type": "bind"
},
{
"Source": "/etc/dnsmasq.d",
"Target": "/etc/dnsmasq.d",
"Type": "bind"
},
{
"Source": "/etc/hosts",
"Target": "/etc/hosts",
"Type": "bind"
}
],
"NanoCpus": 0,
"NetworkMode": "default",
"OomKillDisable": false,
"OomScoreAdj": 0,
"PidMode": "",
"PidsLimit": 0,
"PortBindings": {},
"Privileged": false,
"PublishAllPorts": false,
"ReadonlyPaths": [
"/proc/asound",
"/proc/bus",
"/proc/fs",
"/proc/irq",
"/proc/sys",
"/proc/sysrq-trigger"
],
"ReadonlyRootfs": false,
"RestartPolicy": {
"MaximumRetryCount": 0,
"Name": ""
},
"Runtime": "runc",
"SecurityOpt": null,
"ShmSize": 67108864,
"UTSMode": "",
"Ulimits": null,
"UsernsMode": "",
"VolumeDriver": "",
"VolumesFrom": null
},
"HostnamePath": "/var/lib/docker/containers/724d84a1d0424866984fd3d5e1063a49372b98ac617e6da776d7d3d15597b8ad/hostname",
"HostsPath": "/etc/hosts",
"Id": "724d84a1d0424866984fd3d5e1063a49372b98ac617e6da776d7d3d15597b8ad",
"Image": "sha256:d2cae28ed1651910f7a2317594bd6d566cda90613eb3911cde92860630f81d95",
"LogPath": "/var/lib/docker/containers/724d84a1d0424866984fd3d5e1063a49372b98ac617e6da776d7d3d15597b8ad/724d84a1d0424866984fd3d5e1063a49372b98ac617e6da776d7d3d15597b8ad-json.log",
"MountLabel": "",
"Mounts": [
{
"Destination": "/etc/pihole",
"Mode": "",
"Propagation": "rprivate",
"RW": true,
"Source": "/etc/pihole",
"Type": "bind"
},
{
"Destination": "/etc/dnsmasq.d",
"Mode": "",
"Propagation": "rprivate",
"RW": true,
"Source": "/etc/dnsmasq.d",
"Type": "bind"
},
{
"Destination": "/etc/hosts",
"Mode": "",
"Propagation": "rprivate",
"RW": true,
"Source": "/etc/hosts",
"Type": "bind"
}
],
"Name": "/pihole.1.nb9cs1ietpry76pxdqf6acolj",
"NetworkSettings": {
"Bridge": "",
"EndpointID": "",
"Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"HairpinMode": false,
"IPAddress": "",
"IPPrefixLen": 0,
"IPv6Gateway": "",
"LinkLocalIPv6Address": "",
"LinkLocalIPv6PrefixLen": 0,
"MacAddress": "",
"Networks": {
"ingress": {
"Aliases": [
"724d84a1d042"
],
"DriverOpts": null,
"EndpointID": "f8ee7b5f03473fd0b62c8e2ca6fbd142b02e7c3a0d86f50e86e4c5df39c6b2f1",
"Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"IPAMConfig": {
"IPv4Address": "10.255.1.179"
},
"IPAddress": "10.255.1.179",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"Links": null,
"MacAddress": "02:42:0a:ff:01:b3",
"NetworkID": "szxnv7y1azs975wkgra7mc18s"
}
},
"Ports": {
"443/tcp": null,
"53/tcp": null,
"53/udp": null,
"67/udp": null,
"80/tcp": null
},
"SandboxID": "d55da5cd2e11a3d5c62a689e24d17deaed2aac729c4db74633488609fe3fe8ad",
"SandboxKey": "/var/run/docker/netns/d55da5cd2e11",
"SecondaryIPAddresses": null,
"SecondaryIPv6Addresses": null
},
"Path": "/s6-init",
"Platform": "linux",
"ProcessLabel": "",
"ResolvConfPath": "/var/lib/docker/containers/724d84a1d0424866984fd3d5e1063a49372b98ac617e6da776d7d3d15597b8ad/resolv.conf",
"RestartCount": 0,
"State": {
"Dead": false,
"Error": "",
"ExitCode": 0,
"FinishedAt": "0001-01-01T00:00:00Z",
"Health": {
"FailingStreak": 0,
"Log": [
{
"End": "2019-01-14T09:57:37.641382259-05:00",
"ExitCode": 0,
"Output": "\n; <<>> DiG 9.10.3-P4-Debian <<>> @127.0.0.1 pi.hole\n; (1 server found)\n;; global options: +cmd\n;; Got answer:\n;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 14480\n;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1\n\n;; OPT PSEUDOSECTION:\n; EDNS: version: 0, flags:; udp: 4096\n;; QUESTION SECTION:\n;pi.hole.\t\t\tIN\tA\n\n;; ANSWER SECTION:\npi.hole.\t\t2\tIN\tA\t127.0.0.1\n\n;; Query time: 0 msec\n;; SERVER: 127.0.0.1#53(127.0.0.1)\n;; WHEN: Mon Jan 14 14:57:37 Eastern 2019\n;; MSG SIZE rcvd: 52\n\n",
"Start": "2019-01-14T09:57:37.39541478-05:00"
},
{
"End": "2019-01-14T09:58:07.820750563-05:00",
"ExitCode": 0,
"Output": "\n; <<>> DiG 9.10.3-P4-Debian <<>> @127.0.0.1 pi.hole\n; (1 server found)\n;; global options: +cmd\n;; Got answer:\n;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 8257\n;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1\n\n;; OPT PSEUDOSECTION:\n; EDNS: version: 0, flags:; udp: 4096\n;; QUESTION SECTION:\n;pi.hole.\t\t\tIN\tA\n\n;; ANSWER SECTION:\npi.hole.\t\t2\tIN\tA\t127.0.0.1\n\n;; Query time: 0 msec\n;; SERVER: 127.0.0.1#53(127.0.0.1)\n;; WHEN: Mon Jan 14 14:58:07 Eastern 2019\n;; MSG SIZE rcvd: 52\n\n",
"Start": "2019-01-14T09:58:07.651049203-05:00"
},
{
"End": "2019-01-14T09:58:38.046312332-05:00",
"ExitCode": 0,
"Output": "\n; <<>> DiG 9.10.3-P4-Debian <<>> @127.0.0.1 pi.hole\n; (1 server found)\n;; global options: +cmd\n;; Got answer:\n;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23608\n;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1\n\n;; OPT PSEUDOSECTION:\n; EDNS: version: 0, flags:; udp: 4096\n;; QUESTION SECTION:\n;pi.hole.\t\t\tIN\tA\n\n;; ANSWER SECTION:\npi.hole.\t\t2\tIN\tA\t127.0.0.1\n\n;; Query time: 0 msec\n;; SERVER: 127.0.0.1#53(127.0.0.1)\n;; WHEN: Mon Jan 14 14:58:37 Eastern 2019\n;; MSG SIZE rcvd: 52\n\n",
"Start": "2019-01-14T09:58:37.830246755-05:00"
},
{
"End": "2019-01-14T09:59:08.194864943-05:00",
"ExitCode": 0,
"Output": "\n; <<>> DiG 9.10.3-P4-Debian <<>> @127.0.0.1 pi.hole\n; (1 server found)\n;; global options: +cmd\n;; Got answer:\n;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 22643\n;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1\n\n;; OPT PSEUDOSECTION:\n; EDNS: version: 0, flags:; udp: 4096\n;; QUESTION SECTION:\n;pi.hole.\t\t\tIN\tA\n\n;; ANSWER SECTION:\npi.hole.\t\t2\tIN\tA\t127.0.0.1\n\n;; Query time: 0 msec\n;; SERVER: 127.0.0.1#53(127.0.0.1)\n;; WHEN: Mon Jan 14 14:59:08 Eastern 2019\n;; MSG SIZE rcvd: 52\n\n",
"Start": "2019-01-14T09:59:08.055375217-05:00"
},
{
"End": "2019-01-14T09:59:38.337787788-05:00",
"ExitCode": 0,
"Output": "\n; <<>> DiG 9.10.3-P4-Debian <<>> @127.0.0.1 pi.hole\n; (1 server found)\n;; global options: +cmd\n;; Got answer:\n;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17238\n;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1\n\n;; OPT PSEUDOSECTION:\n; EDNS: version: 0, flags:; udp: 4096\n;; QUESTION SECTION:\n;pi.hole.\t\t\tIN\tA\n\n;; ANSWER SECTION:\npi.hole.\t\t2\tIN\tA\t127.0.0.1\n\n;; Query time: 0 msec\n;; SERVER: 127.0.0.1#53(127.0.0.1)\n;; WHEN: Mon Jan 14 14:59:38 Eastern 2019\n;; MSG SIZE rcvd: 52\n\n",
"Start": "2019-01-14T09:59:38.203771777-05:00"
}
],
"Status": "healthy"
},
"OOMKilled": false,
"Paused": false,
"Pid": 377,
"Restarting": false,
"Running": true,
"StartedAt": "2019-01-14T14:14:51.821691216Z",
"Status": "running"
}
}


@dschaper dschaper transferred this issue from pi-hole/pi-hole Jan 14, 2019
diginc (Member) commented Jan 14, 2019

You can add -e FTL_CMD=debug to revert to the previous release's NET_ADMIN work around behavior.
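On the swarm side, that workaround replaces the capability with an environment variable; a sketch (service name and published ports are illustrative):

```shell
# Sketch: create the swarm service with the FTL_CMD=debug workaround
# instead of --cap-add, which docker service create does not support.
docker service create --name pihole \
  -e FTL_CMD=debug \
  -p 53:53/tcp -p 53:53/udp -p 80:80/tcp \
  pihole/pihole:4.1.1
# Note: routing-mesh publishing like this masks client source IPs;
# mode=host publishing avoids that (discussed later in the thread).
```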

shanekorbel (Author) commented Jan 14, 2019

You can add -e FTL_CMD=debug to revert to the previous release's NET_ADMIN work around behavior.

I can confirm this does work around the issue.

Thanks Adam!

Joniator commented Jan 16, 2019

What is your setup for swarm?
I can't get it running (relevant part of the compose-file), though with FTL_CMD set I can at least reach the web interface.

I think the issue is related to the gateway, since the container can't reach the default gateway, but I don't know how to tackle this.

ip -4 route | grep default | cut -d ' ' -f 3 in the container returns a different IP than the default gateway of the stack network.

TIA

Edit: I fixed it by changing the DNS settings to listen on all interfaces, not only on eth0. I'm still puzzled by the [✗] DNS service is NOT running message, though, and the container randomly stops responding without log entries; a restart fixes it, but it is still annoying.

pabloromeo commented Jan 20, 2019

Same issue here: I had been running pi-hole on Swarm successfully for months, until this problem came up.
I was able to get it back up using the info mentioned above, adding two environment variables:

- FTL_CMD=debug
- DNSMASQ_LISTENING=all

However, just as above, even though it is blocking ads, the log still shows the following warning:

WARNING Misconfigured DNS in /etc/resolv.conf: Primary DNS should be 127.0.0.1 (found 127.0.0.11)

And ultimately logs errors as well:

  [✗] DNS service is NOT running,
Starting crond,
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec],
[cont-init.d] 20-start.sh: exited 0.,
[cont-init.d] done.,
Starting lighttpd,
Starting pihole-FTL (debug),
[services.d] done.,
[services.d] starting services

My compose file does override DNS entries:

    dns:
      - 127.0.0.1
      - 1.1.1.1

Now, I'm not sure why pi-hole is expecting /etc/resolv.conf to always have the value 127.0.0.1, given that for user-defined networks docker always defines 127.0.0.11 instead, and internally manages resolution (according to their documentation: https://docs.docker.com/v17.09/engine/userguide/networking/configure-dns/).
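A quick way to observe that behaviour (a sketch; the container name is illustrative): on a user-defined network, the --dns servers become forwarders for Docker's embedded resolver rather than being written into the file, so the file itself never shows 127.0.0.1.

```shell
# Inspect what actually ends up in the container's resolv.conf.
# On user-defined networks, --dns values are served through the
# embedded resolver (127.0.0.11); they are not written to the file,
# so this typically shows "nameserver 127.0.0.11".
docker exec pihole cat /etc/resolv.conf
```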

Hopefully those errors in the logs won't affect pi-hole's stability.

scottjl commented Jan 21, 2019

I have the same issue, and I'm not in swarm mode: a single Docker container running on a Pi B+. A problem with the image?

Attaching to pihole
pihole    | [s6-init] making user provided files available at /var/run/s6/etc...exited 0.
pihole    | [s6-init] ensuring user provided files have correct perms...exited 0.
pihole    | [fix-attrs.d] applying ownership & permissions fixes...
pihole    | [fix-attrs.d] 01-resolver-resolv: applying...
pihole    | [fix-attrs.d] 01-resolver-resolv: exited 0.
pihole    | [fix-attrs.d] done.
pihole    | [cont-init.d] executing container initialization scripts...
pihole    | [cont-init.d] 20-start.sh: executing...
pihole    | stty: 'standard input': Inappropriate ioctl for device
pihole    |  ::: Starting docker specific setup for docker pihole/pihole
pihole    | WARNING Misconfigured DNS in /etc/resolv.conf: Two DNS servers are recommended, 127.0.0.1 and any backup server
pihole    | WARNING Misconfigured DNS in /etc/resolv.conf: Primary DNS should be 127.0.0.1 (found 127.0.0.11)
pihole    |
pihole    | nameserver 127.0.0.11
pihole    | options ndots:0
pihole    | stty: 'standard input': Inappropriate ioctl for device
pihole    |   [i] Existing PHP installation detected : PHP version 7.0.33-0+deb9u1

diginc (Member) commented Jan 22, 2019

127.0.0.11 is the default address in Docker's resolv.conf, I believe, so it seems swarm isn't letting you override it via the standard --dns method.

This sounds a lot like what happens on Synology, where the problem is worked around by force-overwriting /etc/resolv.conf with a read-only volume containing the settings you want:

http://tonylawrence.com/posts/unix/synology/running-pihole-inside-docker/
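Translated out of the Synology context, that workaround is roughly the following (a sketch; paths and the backup nameserver are illustrative, and as noted further down the thread the image's chown of the file can still fail):

```shell
# Sketch of the Synology-style workaround: pre-write a resolv.conf and
# bind-mount it read-only so nothing in the container can change it.
printf 'nameserver 127.0.0.1\nnameserver 1.1.1.1\n' > /srv/pihole/resolv.conf
docker run -d --name pihole \
  -v /srv/pihole/resolv.conf:/etc/resolv.conf:ro \
  pihole/pihole:4.1.1
```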

pabloromeo commented Jan 22, 2019

I did try mounting /etc/resolv.conf, but that caused other issues (permission related, since chown fails on that file).
The odd thing is that even with 127.0.0.11 (along with the --dns override to 127.0.0.1), the previous release was working just fine.

Not sure if it might have something to do with the way I personally have it set up:
I publish port 53 on the host network, and then use KeepAlived to actually resolve a virtual IP to the physical node pi-hole is running on at any moment.

    ports:
      - published: 53
        target: 53
        protocol: tcp
        mode: host
      - published: 53
        target: 53
        protocol: udp
        mode: host
      - "80:80/tcp"
      - "443:443/tcp"

But that setup no longer worked in the latest release, and the only way to get it to work again was by adding the FTL_CMD=debug and DNSMASQ_LISTENING=all environment variables.

4s3ti commented Feb 3, 2019

+1 here.
I got the same issue and managed to resolve it by adding the same env vars:

- FTL_CMD=debug
- DNSMASQ_LISTENING=all

My compose file and logs follow.

version: '3.3'

services:

  # <... more services omitted; only the core is shown ...>
  pihole:
    image: pihole/pihole:latest
    networks:
      - rp
    ports:
#     - 80:80
      - 53:53/tcp
      - 53:53/udp
    dns:
      - 127.0.0.1
      - 1.1.1.1
    cap_add:     # <-- this one is ignored by swarm
      - NET_ADMIN
    volumes:
      - /srv/docker-data/pihole/pihole:/etc/pihole/
      - /srv/docker-data/pihole/dnsmasq.d:/etc/dnsmasq.d/
      - /srv/docker-data/pihole/pihole.log:/var/log/pihole.log
#     - /srv/docker-data/pihole/resolv.conf:/etc/resolv.conf
      - /srv/docker-data/pihole/hosts.home:/etc/hosts.home
    environment:
      - ServerIP=10.20.60.12
      - TZ=Europe/Stockholm
      - WEBPASSWORD=***REDACTED****
      - DNS1=1.1.1.1
      - DNS2=1.0.0.1
      - FTL_CMD=debug
      - DNSMASQ_LISTENING=all
      - PROXY_LOCATION=pihole
      - VIRTUAL_HOST=pihole.home
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
      placement:
        constraints:
          - node.role == manager
      labels:
        - "traefik.enable=true"
        - "traefik.backend=hole"
        - "traefik.pihole.port=80"
        - "traefik.frontend.rule=Host:pihole.home"
        - "traefik.docker.network=homeswarm_rp"
        - "traefik.frontend.priority=1"
        - "traefik.frontend.headers.browserXSSFilter=true"
        - "traefik.frontend.headers.customFrameOptionsValue=SAMEORIGIN"
        - "traefik.frontend.headers.contentTypeNosniff=true"
        - "traefik.frontend.headers.ContentTypeNoSniff=true" 
        - "traefik.frontend.headers.forceSTSHeader=true"
        - "traefik.frontend.headers.STSSeconds=15552000"
        - "traefik.frontend.headers.STSIncludeSubdomains=true"
        - "traefik.frontend.headers.STSPreload=true"


networks: 
  rp:
    attachable: true
    driver: overlay

However, it still complains that the DNS service is not running.

root@docker01:/srv/docker-data/homeswarm # docker container logs homeswarm_pihole.1.bvcbej4nysqpqcu90k8p98aue 
[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] 01-resolver-resolv: applying... 
[fix-attrs.d] 01-resolver-resolv: exited 0.
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 20-start.sh: executing... 
stty: 'standard input': Inappropriate ioctl for device
 ::: Starting docker specific setup for docker pihole/pihole
WARNING Misconfigured DNS in /etc/resolv.conf: Two DNS servers are recommended, 127.0.0.1 and any backup server
WARNING Misconfigured DNS in /etc/resolv.conf: Primary DNS should be 127.0.0.1 (found 127.0.0.11)

nameserver 127.0.0.11
options ndots:0
stty: 'standard input': Inappropriate ioctl for device
  [i] Existing PHP installation detected : PHP version 7.0.33-0+deb9u1

  [i] Installing configs from /etc/.pihole...
  [i] Existing dnsmasq.conf found... it is not a Pi-hole file, leaving alone!
  [✓] Copying 01-pihole.conf to /etc/dnsmasq.d/01-pihole.conf
chown: cannot access '/etc/pihole/dhcp.leases': No such file or directory
Setting password: **REDACTED**
+ pihole -a -p '**REDACTED**'
  [✓] New password set
Using custom DNS servers: 1.1.1.1 & 1.0.0.1
DNSMasq binding to default interface: eth0
Added ENV to php:
			"PHP_ERROR_LOG" => "/var/log/lighttpd/error.log",
			"ServerIP" => "10.20.60.12",
			"VIRTUAL_HOST" => "pihole.home",
Using IPv4 and IPv6
::: Preexisting ad list /etc/pihole/adlists.list detected ((exiting setup_blocklists early))
https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
https://mirror1.malwaredomains.com/files/justdomains
http://sysctl.org/cameleon/hosts
https://zeustracker.abuse.ch/blocklist.php?download=domainblocklist
https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt
https://s3.amazonaws.com/lists.disconnect.me/simple_ad.txt
https://hosts-file.net/ad_servers.txt
::: Testing pihole-FTL DNS: FTL started!
::: Testing lighttpd config: Syntax OK
::: All config checks passed, cleared for startup ...
 ::: Docker start setup complete
  [i] Pi-hole blocking is enabled
  [i] Neutrino emissions detected...
  [✓] Pulling blocklist source list into range

  [i] Target: raw.githubusercontent.com (hosts)
  [✓] Status: Retrieval successful

  [i] Target: mirror1.malwaredomains.com (justdomains)
  [✓] Status: No changes detected

  [i] Target: sysctl.org (hosts)
  [✓] Status: No changes detected

  [i] Target: zeustracker.abuse.ch (blocklist.php?download=domainblocklist)
  [✓] Status: No changes detected

  [i] Target: s3.amazonaws.com (simple_tracking.txt)
  [✓] Status: No changes detected

  [i] Target: s3.amazonaws.com (simple_ad.txt)
  [✓] Status: No changes detected

  [i] Target: hosts-file.net (ad_servers.txt)
  [✓] Status: No changes detected

  [✓] Consolidating blocklists
  [✓] Extracting domains from blocklists
  [i] Number of domains being pulled in by gravity: 135414
  [✓] Removing duplicate domains
  [i] Number of unique domains trapped in the Event Horizon: 112784
  [i] Number of whitelisted domains: 1
  [i] Number of blacklisted domains: 0
  [i] Number of regex filters: 0
  [✓] Parsing domains into hosts format
  [✓] Cleaning up stray matter

  [✗] DNS service is NOT running
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
[cont-init.d] 20-start.sh: exited 0.
[cont-init.d] done.
[services.d] starting services
Starting lighttpd
Starting crond
Starting pihole-FTL (debug)
[services.d] done.

Right now it is working, even though it complains DNS is not running.
Let me know if you need any more details from my setup.

4s3ti commented Feb 3, 2019

Related to:

moby/moby#25885

4s3ti commented Feb 3, 2019

Even though it works with @pabloromeo's workaround, all the Pi-hole reporting gets messed up, as Pi-hole sees queries as originating from a "10.255.0.3" IP address instead of the correct host IP address.

diginc (Member) commented Feb 3, 2019

Is 10.255.0.3 one of swarm's docker networks? I'm unfamiliar with that range.

musicsnob commented Feb 5, 2019

What I've noticed, and I hope I'm not overstating the obvious, is that this issue only seems to happen on my (user-generated) bridge network.

When --network=dns-net, the --dns parameters are ignored, and I get the 127.0.0.11 error.

When --network=host, the --dns parameters are used, and everything starts except that the blocklists are not downloaded:

  [✗] Pulling blocklist source list into range
  [i] No source list found, or it is empty

The relevant part of my run command is:

docker run --name=pihole --hostname=pihole \
--network=dns-net \
--publish=53:53/tcp \
--publish=53:53/udp \
--publish=80:80 \
--publish=443:443 \
--cap-add=NET_ADMIN \
--dns=127.0.0.1 \
--dns=1.1.1.1 \

Korhm commented Feb 27, 2019

Hello,

I can confirm that without network_mode: "host" (or --network=host), the --dns parameter is ignored and /etc/resolv.conf is not updated (it keeps the default value 127.0.0.11).
With this configuration, Pi-hole does not answer DNS queries.

Here is a version of the compose file that works for me (DNS filtering only, no DHCP server):

version: '3'
services:
  pihole:
    container_name: pihole
    hostname: pihole.lebeau.ovh
    image: pihole/pihole:latest
    #ports:
    #  - "53:53/tcp"
    #  - "53:53/udp"
    #  - "8080:80/tcp"
    network_mode: "host"
    environment:
      TZ: 'Europe/Paris'
      IPv6: 'False'
      ServerIP: '192.168.1.10'
      INTERFACE: 'eth1'
      DNSMASQ_USER: 'pihole'
      WEB_PORT: '8080'
    volumes:
      - '/home/pihole/pihole/:/etc/pihole/'
      - '/home/pihole/dnsmasq.d/:/etc/dnsmasq.d/'
    dns:
      - 127.0.0.1
      - 192.168.1.1
    restart: unless-stopped
    cap_add:
      - NET_ADMIN

MadsBen commented Jan 7, 2020

A workaround on swarm for the lack of capabilities is to use macvlan.
Here's how a macvlan is created with Portainer.
https://www.portainer.io/2018/09/using-macvlan-portainer-io/
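Without Portainer, the same network can be sketched on the CLI (subnet, gateway, and parent interface are illustrative; for swarm scope, the usual pattern is a --config-only network on each node plus one --scope swarm network created with --config-from):

```shell
# Sketch: create a macvlan network on the docker CLI (single node).
# Subnet, gateway, parent interface, and network name are illustrative;
# the parent must be the host NIC on that subnet.
docker network create -d macvlan \
  --subnet=192.168.1.0/24 \
  --gateway=192.168.1.1 \
  -o parent=eth0 \
  pihole_macvlan
```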

And my compose file is as follows. The admin interface is behind a traefik proxy.

version: '3'

services:
  pihole:
    deploy:
      placement:
        constraints:
          - node.role == manager
      labels:
        - "traefik.enable=true"
        - "traefik.frontend.rule=Host:pihole.domain.com"
        - "traefik.frontend.priority=1"
        - "traefik.protocol=http"
        - "traefik.port=80"
        - "traefik.docker.network=proxy_default"
    networks:
      - "default"
      - "proxy_default"
    image: pihole/pihole:latest
    hostname: PiHole
    dns:
      - 127.0.0.1
      - 91.239.100.100
    #ports:
      #- "53:53/tcp"
      #- "53:53/udp"
      #- "81:80/tcp"
      #- "444:443/tcp"
      #- "67:67/udp"
    volumes:
      - pihole:/etc/pihole/
      - pihole_dnsmasq:/etc/dnsmasq.d/
      - /media/ssd/docker/volumes/ssd_pihole/pihole.log:/var/log/pihole.log
    environment:
      - ServerIP=X.X.X.X
      - PROXY_LOCATION=pihole
      - VIRTUAL_HOST=pihole.domain.com
      - VIRTUAL_PORT=80
      - TZ=Europe/Copenhagen
      - WEBPASSWORD=
      - FTL_CMD=debug
      - DNSMASQ_LISTENING=all
      - DNS1=91.239.100.100
      - DNS2=89.233.43.71

volumes:
    pihole:
      driver: local
      driver_opts:
        type: "nfs"
        o: "addr=X.X.X.X,rw,noatime,rsize=8192,wsize=8192,tcp,timeo=14,nolock"
        device: ":/path/to/volumes/pihole"           
    pihole_dnsmasq:
      driver: local
      driver_opts:
        type: "nfs"
        o: "addr=X.X.X.X,rw,noatime,rsize=8192,wsize=8192,tcp,timeo=14,nolock"
        device: ":/path/to/volumes/pihole_dnsmasq" 
networks:
    proxy_default:
      external:
        name: proxy_default
    default:
      external:
        name: pihole_network

patcsy88 commented Mar 4, 2020

even though it gets working with @pabloromeo workaround, all the pihole reporting gets messed up as pi hole sees the query as if they are originating from a "10.255.0.3" IP address instead of the correct hosts IP address.

Did a solution present itself? I have a couple of Pi-holes running in swarm, and the source IPs logged are from the Docker ingress network and NOT the actual client IPs querying the Pi-holes.

pabloromeo commented Mar 4, 2020

In my case I do get the proper IPs of the clients, and even host name resolution from the router (by configuring the router IP in the pihole UI).
What I had to do was publish the TCP/UDP 53 port on the host network directly, tell pihole to listen on all interfaces, and then have keepalived running in the cluster with a virtual IP pointing only to the current host listening on port 53 (the one actively running pihole). This single VIP is the one I configure in my router as the DNS server.
That way pihole can jump from one node to another in the swarm during a service update.

Because of that, I don't actually need to run more than one replica of the service, if it ever goes down a new one will come up automatically so I haven't needed additional redundancy.

Macvlan may be an option, but the problem with that is that each swarm node would need a different IP range to avoid conflicts, which would mean there wouldn't be a single fixed IP to specify in the router DNS config. That's why I went with keepalived and a single VIP.

MadsBen commented May 5, 2020

@pabloromeo Can you list your compose file for pihole?
I can't seem to get it to work by publishing the ports to the host (using the long format as above); when doing that, the container refuses to start.

pabloromeo commented May 9, 2020

@MadsBen sure. Take into account that I'm doing this in tandem with Keepalived running on the whole cluster, with a fixed virtual IP pointing at whichever node is running pihole at the moment (via a keepalived check script that tests whether the current node is listening on port 53; if so, that node should be the master for the VIP). Then I just configure my network router to use that VIP as the primary DNS.

The check_dns script for Keepalived is very simple:

#!/bin/bash

nc -z 127.0.0.1 53

Regarding your container not starting up, I'd recommend tailing its logs during startup. There should be error information there.

I run pihole using the following:

version: '3.4'

services:

  pihole:
    image: pihole/pihole:latest
    hostname: "{{.Node.Hostname}}"
    deploy:
      mode: replicated
      replicas: 1
      update_config:
        order: stop-first
    dns:
      - 127.0.0.1
      - 1.1.1.1
    volumes:
      - /mnt/nfs/state-storage/pihole/pihole:/etc/pihole
      - /mnt/nfs/state-storage/pihole/dnsmasq.d:/etc/dnsmasq.d
      - /etc/localtime:/etc/localtime:ro
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=America/Argentina/Buenos_Aires
      - FTL_CMD=debug
      - DNSMASQ_LISTENING=all
      - ServerIP=<IP of virtual IP assigned in Keepalived>
      - WEBPASSWORD=<password>
      - DNS1=1.1.1.1
      - DNS2=1.0.0.1
    ports:
      - published: 53
        target: 53
        protocol: tcp
        mode: host
      - published: 53
        target: 53
        protocol: udp
        mode: host
      - "80:80/tcp"
      - "443:443/tcp"

@MadsBen

MadsBen commented Jun 15, 2020

@pabloromeo Thx for your help. I already had keepalived running, but struggled with publishing the ports to the host for the docker container.

I'm running Ubuntu and it turns out that was the problem, as it already runs a dnsmasq listening on port 53...
The solution is listed here:
https://aarongodfrey.dev/software/running_pihole_in_docker_on_ubuntu_server/
After fixing this, the container is running and I get the proper IPs of the clients.
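For anyone else hitting this on Ubuntu: on recent releases it is usually systemd-resolved's DNS stub listener occupying port 53 (that's what the linked article addresses). A minimal sketch of the usual fix, assuming systemd-resolved is indeed the culprit on your host:

```shell
# Disable the stub listener that binds 127.0.0.53:53
# (handles both the commented default and an explicit setting)
sudo sed -i 's/^#\?DNSStubListener=.*/DNSStubListener=no/' /etc/systemd/resolved.conf

# Point /etc/resolv.conf at the real resolver config instead of the stub
sudo ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf

# Apply, then confirm nothing is left listening on port 53
sudo systemctl restart systemd-resolved
sudo ss -lntup 'sport = :53'
```

If the last command prints no listeners, the host port is free for pihole's host-mode publish.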

@scyto

scyto commented Jul 31, 2020

@pabloromeo this is slightly off topic: why the need for keepalived? If one has nodea and nodeb, pihole only runs on those, AND one specifies dns1=nodea and dns2=nodeb on every client, wouldn't DNS continue to resolve? Also, do you replicate that NFS share in any way so that you can cope with the loss of the machine running the share?

Also, to add value to this thread: if using Portainer you have to use the alternate (map) format for env vars or they don't seem to work,
i.e.

environment:
      foo1: bar1
      foo2: bar2
 

@pabloromeo

pabloromeo commented Jul 31, 2020

@scyto sure, that would also work if you run it on two nodes. My scenario was a bit different.
I run 6 nodes at the moment and pihole could run on any of those, so I wanted a single fixed virtual IP to configure on my router directly as primary DNS and not have to worry, while keeping the secondary one against Cloudflare's 1.1.1.1 in case pihole is totally down or the cluster down for maintenance.
The second reason is, as I mentioned above, I run pihole with port 53 published on the host network instead of the overlay.
That's so that the proper network IPs for devices are reported and the correct hostnames show up. If not, you just get a single docker IP, which is pretty useless for stats.
I guess you could do the same and have clients attempt to connect to node1, fail, and then try node2. I just assumed it was more efficient to avoid that fallback sequence and have a single target IP.
And finally, the third reason is that I was already using Keepalived for another purpose anyway, so I had it all set up. I run another VIP which points to basically any live node on the cluster. That's just so that my router port-forwarding rules can target that single place to send traffic to the cluster. If not, I'd have to tie that to one single node's IP, and that would become a SPoF in case I restarted that node in particular. I wanted a failsafe way to always be able to get to the cluster as long as at least one node was up.

If I remember correctly that was the thinking behind Keepalived :)

@scyto

scyto commented Aug 1, 2020

@pabloromeo thanks! That really helps my mental model. I discovered the single-client-IP issue when I set up my first swarm node last night. Next up: move to host networking or macvlan. Do you recommend running keepalived in docker or on the hosts directly?

@pabloromeo

pabloromeo commented Aug 1, 2020

@scyto in my case I run keepalived through docker too.
Not as a service in the swarm, though, since you need cap-add; just as a standard container.
Macvlan could work too, to get the same effect as host networking, I think.
The reason I didn't use it is that with macvlan you'd need a separate subnet range per host, meaning your pihole would have a different real network IP depending on which host it's running on, which would have complicated things for me. So I just went with the host network and a VIP pointing to the node actively listening on port 53.
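For reference, a hedged sketch of running keepalived as a plain container on each node. The image name, version, interface, and host paths below are assumptions — adjust them to whatever keepalived image and layout you actually use. NET_ADMIN is needed so VRRP can add/remove the VIP on the host interface:

```shell
# Hypothetical example: osixia/keepalived with bind-mounted config and check script.
docker run -d \
  --name keepalived \
  --restart unless-stopped \
  --net host \
  --cap-add NET_ADMIN \
  --cap-add NET_BROADCAST \
  -v /srv/keepalived/keepalived.conf:/container/service/keepalived/assets/keepalived.conf:ro \
  -v /srv/keepalived/check_dns.sh:/etc/keepalived/check_dns.sh:ro \
  osixia/keepalived:2.0.20
```

Running it with `--net host` (rather than as a swarm service on an overlay) is what lets VRRP advertisements and the VIP live on the real LAN.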

@petersjostrom

petersjostrom commented Jan 10, 2021

"...have keepalived running in the cluster with a virtual IP pointing only to the current host listening on port 53 (the one actively running pihole)."

I've searched for how to set this up, but can't find any solution. Can you explain how you did it? I understand how to set up a VIP on the hosts or using a docker-based keepalived image, but I can't figure out how to make sure the active VIP points to the node running the pihole docker container.

@christianerbsmehl

christianerbsmehl commented Jan 19, 2021

Docker Swarm now supports cap_add as of release 20.10 (see docker/cli#2663).
I was able to get it to work with --cap-add NET_ADMIN, but I had to set DNSMASQ_LISTENING=all. Setting FTL_CMD=debug was not necessary. Maybe this is related to the overlay network.
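So with engine 20.10+ on all nodes, the capability can be requested directly in the stack file. A minimal sketch (assuming `docker stack deploy`; image tag and env values are just illustrative):

```yaml
version: '3.8'

services:
  pihole:
    image: pihole/pihole:latest
    cap_add:
      - NET_ADMIN          # honored by swarm since engine 20.10
    environment:
      - DNSMASQ_LISTENING=all
    deploy:
      mode: replicated
      replicas: 1
```

Note that older engines silently ignore `cap_add` in stack files, so every node in the swarm needs to be on 20.10 or later for this to take effect.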

@pabloromeo

pabloromeo commented Jan 22, 2021

"...have keepalived running in the cluster with a virtual IP pointing only to the current host listening on port 53 (the one actively running pihole)."

I've searched for how to set this up, but can't find any solution. Can you explain how you did it? I understand how to set up a VIP on the hosts or using a docker-based keepalived image, but I can't figure out how to make sure the active VIP points to the node running the pihole docker container.

Ah, I see. The way I get the VIP to follow pihole is with a basic check script; all it does is check whether port 53 is in use.

check_dns.sh:

#!/bin/bash

nc -z 127.0.0.1 53

I make that script available to the docker container at runtime as /etc/keepalived/check_dns.sh.
Then in the keepalived config I set it up:

vrrp_script chk_dns {
  script "/etc/keepalived/check_dns.sh"
  interval 10
  weight 50
}

And within the VRRP instance block you add the track_script section to use that check:

vrrp_instance VI_2 {
  interface eth0
  state MASTER
  virtual_router_id 52

  .......

  track_script {
    chk_dns
  }
}

So nodes that aren't running anything on port 53 will fail chk_dns, and only the one running pihole will return a successful status code and become MASTER of that VIP.
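Putting the pieces together, a complete keepalived.conf might look like the following. This is a hypothetical sketch: the elided part of the instance block above is typically where the priority, peer authentication, and the VIP itself go, and the address and secret below are made-up placeholders:

```
vrrp_script chk_dns {
  script "/etc/keepalived/check_dns.sh"
  interval 10
  weight 50
}

vrrp_instance VI_2 {
  interface eth0
  state MASTER
  virtual_router_id 52
  priority 100

  authentication {
    auth_type PASS
    auth_pass <shared secret>       # same on every node
  }

  virtual_ipaddress {
    192.168.1.250/24                # placeholder VIP the router uses as DNS
  }

  track_script {
    chk_dns
  }
}
```

With `weight 50`, a node whose check succeeds gets 50 added to its effective priority, so the node actually serving port 53 outranks the rest and claims the VIP, regardless of which node the configuration nominally marks as MASTER.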
