dnsmasq fails to restart during nixos-rebuild switch #20863

Open
rycee opened this issue Dec 2, 2016 · 4 comments

@rycee
Member

rycee commented Dec 2, 2016

Issue description

When nixos-rebuild switch reconfigures the network interfaces, dnsmasq fails to restart. From the nixos-rebuild output:

...
warning: the following units failed: dnsmasq.service

● dnsmasq.service - Dnsmasq Daemon
   Loaded: loaded (/nix/store/26a85qwadw7hn4nk76n1a6rbhln4rdqx-unit-dnsmasq.service/dnsmasq.service; bad; vendor preset: enabled)
   Active: failed (Result: exit-code) since Fri 2016-12-02 18:56:03 CET; 2s ago
  Process: 25930 ExecStart=/nix/store/bgw0qjbj7p641lvq9gzlzgvg4w9aqj6m-dnsmasq-2.76/bin/dnsmasq -k --enable-dbus --user=dnsmasq -C /nix/store/12xj3ldj64jgcvsyl4ai0j22bchiar1g-dnsmasq.conf (code=exited, status=2)
  Process: 25822 ExecStartPre=/nix/store/2w3y5x590hj1ngf92phjl467fkbn4fj0-unit-script/bin/dnsmasq-pre-start (code=exited, status=0/SUCCESS)
 Main PID: 25930 (code=exited, status=2)

Dec 02 18:56:03 lambda systemd[1]: Starting Dnsmasq Daemon...
Dec 02 18:56:03 lambda dnsmasq-pre-start[25822]: dnsmasq: syntax check OK.
Dec 02 18:56:03 lambda dnsmasq[25930]: dnsmasq: unknown interface wlp1s0
Dec 02 18:56:03 lambda systemd[1]: dnsmasq.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Dec 02 18:56:03 lambda systemd[1]: Failed to start Dnsmasq Daemon.
Dec 02 18:56:03 lambda systemd[1]: dnsmasq.service: Unit entered failed state.
Dec 02 18:56:03 lambda systemd[1]: dnsmasq.service: Failed with result 'exit-code'.
warning: error(s) occurred while switching to the new configuration

Some journal entries from around the time:

[rycee@lambda:~]$ journalctl -S 18:55:26 -u systemd -u dnsmasq -u 'network-*'
-- Logs begin at Sat 2016-08-13 00:11:44 CEST, end at Fri 2016-12-02 19:15:03 CET. --
Dec 02 18:55:26 lambda systemd[1]: Stopping Dnsmasq Daemon...
Dec 02 18:55:26 lambda dnsmasq[22608]: exiting on receipt of SIGTERM
Dec 02 18:55:26 lambda systemd[1]: Stopped Dnsmasq Daemon.
Dec 02 18:55:26 lambda systemd[1]: Stopping Address configuration of enp3s0...
Dec 02 18:55:26 lambda systemd[1]: Stopping Address configuration of wlp1s0...
Dec 02 18:55:26 lambda systemd[1]: Stopped target All Network Interfaces.
Dec 02 18:55:26 lambda systemd[1]: Stopped Link configuration of enp3s0.
Dec 02 18:55:26 lambda systemd[1]: Stopped Link configuration of wlp1s0.
Dec 02 18:55:26 lambda systemd[1]: Stopped Extra networking commands..
Dec 02 18:55:26 lambda systemd[1]: Stopped Networking Setup.
Dec 02 18:55:26 lambda network-addresses-wlp1s0-pre-stop[25288]: deleting 10.1.20.1/24...
Dec 02 18:55:26 lambda systemd[1]: Stopped Address configuration of wlp1s0.
Dec 02 18:55:26 lambda network-addresses-enp3s0-pre-stop[25284]: deleting 10.1.10.1/24...
Dec 02 18:55:26 lambda systemd[1]: Stopped Address configuration of enp3s0.
Dec 02 18:56:03 lambda systemd[1]: Starting Dnsmasq Daemon...
Dec 02 18:56:03 lambda systemd[1]: Starting Address configuration of enp3s0...
Dec 02 18:56:03 lambda systemd[1]: Starting Address configuration of wlp1s0...
Dec 02 18:56:03 lambda systemd[1]: Starting Link configuration of enp3s0...
Dec 02 18:56:03 lambda systemd[1]: Starting Link configuration of wlp1s0...
Dec 02 18:56:03 lambda network-addresses-enp3s0-start[25870]: bringing up interface...
Dec 02 18:56:03 lambda network-addresses-wlp1s0-start[25878]: bringing up interface...
Dec 02 18:56:03 lambda network-link-wlp1s0-start[25883]: Configuring link...
Dec 02 18:56:03 lambda network-link-enp3s0-start[25882]: Configuring link...
Dec 02 18:56:03 lambda systemd[1]: Started Link configuration of enp3s0.
Dec 02 18:56:03 lambda systemd[1]: Started Link configuration of wlp1s0.
Dec 02 18:56:03 lambda dnsmasq-pre-start[25822]: dnsmasq: syntax check OK.
Dec 02 18:56:03 lambda network-addresses-enp3s0-start[25870]: checking ip 10.1.10.1/24...
Dec 02 18:56:03 lambda network-addresses-wlp1s0-start[25878]: checking ip 10.1.20.1/24...
Dec 02 18:56:03 lambda network-addresses-enp3s0-start[25870]: added ip 10.1.10.1/24...
Dec 02 18:56:03 lambda dnsmasq[25930]: dnsmasq: unknown interface wlp1s0
Dec 02 18:56:03 lambda systemd[1]: dnsmasq.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Dec 02 18:56:03 lambda systemd[1]: Failed to start Dnsmasq Daemon.
Dec 02 18:56:03 lambda systemd[1]: dnsmasq.service: Unit entered failed state.
Dec 02 18:56:03 lambda systemd[1]: dnsmasq.service: Failed with result 'exit-code'.
Dec 02 18:56:03 lambda network-addresses-wlp1s0-start[25878]: added ip 10.1.20.1/24...
Dec 02 18:56:03 lambda systemd[1]: Started Address configuration of enp3s0.
Dec 02 18:56:03 lambda systemd[1]: Started Address configuration of wlp1s0.
Dec 02 18:56:03 lambda systemd[1]: Reached target All Network Interfaces.
Dec 02 18:56:03 lambda systemd[1]: Starting Networking Setup...
Dec 02 18:56:03 lambda systemd[1]: Started Networking Setup.
Dec 02 18:56:03 lambda systemd[1]: Starting Extra networking commands....
Dec 02 18:56:03 lambda systemd[1]: Started Extra networking commands..

My dnsmasq configuration is

  services.dnsmasq = {
    enable = true;
    extraConfig = ''
      interface=enp3s0
      interface=wlp1s0

      bind-interfaces

      dhcp-range=enp3s0,10.1.10.10,10.1.10.250,24h
      dhcp-range=wlp1s0,10.1.20.10,10.1.20.250,24h

      dhcp-host=00:09:6b:60:94:b5,10.1.10.100
    '';
  };

I think the issue stems from my use of bind-interfaces and from the apparent fact that wlp1s0 is given its IP address after dnsmasq is started. From the dnsmasq FAQ:

In "bind-interfaces" mode, dnsmasq runs through all the network interfaces available when it starts, finds the set of IP addresses on those interfaces, filters that set using the access control configuration, and then binds the set of IP addresses. Only packets sent to the allowed addresses are delivered by the kernel to dnsmasq.

So presumably it fails hard when no IP addresses are assigned to an interface.
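
For comparison, a minimal sketch of the same configuration without bind-interfaces. It assumes that dnsmasq's bind-dynamic option, which the dnsmasq documentation describes as a hybrid mode that also picks up interfaces and addresses appearing after startup, is acceptable in this setup. It has not been tested against this issue, and it keeps the extraConfig option used in the report, which other NixOS releases may expose differently.

```nix
{
  services.dnsmasq = {
    enable = true;
    extraConfig = ''
      interface=enp3s0
      interface=wlp1s0

      # Assumption: bind-dynamic replaces bind-interfaces so that dnsmasq
      # starts listening on addresses that are assigned after it has started.
      bind-dynamic

      dhcp-range=enp3s0,10.1.10.10,10.1.10.250,24h
      dhcp-range=wlp1s0,10.1.20.10,10.1.20.250,24h

      dhcp-host=00:09:6b:60:94:b5,10.1.10.100
    '';
  };
}
```

Whether this avoids the "unknown interface" failure during a switch would still need to be confirmed on an affected machine.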

I think a solution (besides not using bind-interfaces) is to get dnsmasq to start slightly later, once all IP addresses have been assigned.
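
A rough sketch of what "start slightly later" could look like with the scripted networking shown in the journal above. The unit names network-addresses-enp3s0.service and network-addresses-wlp1s0.service are an assumption inferred from the journal identifiers (network-addresses-wlp1s0-start and so on), so check them with systemctl list-units before relying on this.

```nix
let
  # Assumed unit names, inferred from the journal output in this report.
  addressUnits = [
    "network-addresses-enp3s0.service"
    "network-addresses-wlp1s0.service"
  ];
in
{
  systemd.services.dnsmasq = {
    # Pull in the address units and order dnsmasq after them, so that the
    # addresses should be assigned before dnsmasq tries to bind to them.
    wants = addressUnits;
    after = addressUnits;
  };
}
```

The ordering only takes effect when the address units and dnsmasq are started in the same systemd transaction, so this may not cover every way a switch can restart the units.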

Steps to reproduce

I imagine this would be experienced by anybody having dnsmasq configured using bind-interfaces.

Technical details

  • System: 16.09.1186.046229b (Flounder)
  • Nix version: nix-env (Nix) 1.11.4
@nh2
Contributor

nh2 commented Oct 22, 2017

This just took down my site.

After installing dnsmasq (I also made a couple of other changes, but none of them should be related to networking), my server stopped responding to pings and I had to reboot it via the admin panel of my hosting provider (Hetzner).

After reboot, I saw in journalctl that this had happened:

Oct 22 21:24:16 benaco-node-1 network-addresses-eth0-pre-stop[28194]: deleting 1.2.3.4/26...
Oct 22 21:24:16 benaco-node-1 systemd[1]: Stopped Address configuration of eth0.

I confirmed via the server's VGA port that the rest of NixOS was running fine. It apparently just had no IP address any more until the reboot.

Possibly related: #25455

@rycee
Member Author

rycee commented Jan 10, 2018

For anybody finding this, I've been using the following workaround for quite some time:

systemd.services.dnsmasq = {
  requires = [ "network.target" ];
  wants = [ "systemd-networkd-wait-online.service" "network-online.target" ];
  after = [ "systemd-networkd-wait-online.service" "network-online.target" ];
};

This delays the start of dnsmasq long enough for it to succeed. It applies if you are using systemd-networkd to manage the network configuration.

@stale

stale bot commented Jun 5, 2020

Thank you for your contributions.

This has been automatically marked as stale because it has had no activity for 180 days.

If this is still important to you, we ask that you leave a comment below. Your comment can be as simple as "still important to me". This lets people see that at least one person still cares about this. Someone will have to do this at most twice a year if there is no other activity.

Here are suggestions that might help resolve this more quickly:

  1. Search for maintainers and people that previously touched the related code and @ mention them in a comment.
  2. Ask on the NixOS Discourse.
  3. Ask on the #nixos channel on irc.freenode.net.

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jun 5, 2020
@MordragT

MordragT commented Apr 14, 2022

I am not sure if this is related, but every time I try to switch my configuration I get the following warning/error.

warning: the following units failed: systemd-networkd-wait-online.service

× systemd-networkd-wait-online.service - Wait for Network to be Configured
     Loaded: loaded (/etc/systemd/system/systemd-networkd-wait-online.service; enabled; vendor preset: disabled)
    Drop-In: /nix/store/qbp6s2w0hr6vpgd542swzc28ilh8sd3n-system-units/systemd-networkd-wait-online.service.d
             └─overrides.conf
     Active: failed (Result: exit-code) since Thu 2022-04-14 14:25:07 CEST; 27ms ago
       Docs: man:systemd-networkd-wait-online.service(8)
    Process: 8511 ExecStart=/nix/store/4sjmk6209x5c6ns3b7193qpq03r7m8wv-systemd-250.4/lib/systemd/systemd-networkd-wait-online --timeout=120 (code=exited, status=1/FAILURE)
   Main PID: 8511 (code=exited, status=1/FAILURE)
         IP: 0B in, 0B out
        CPU: 4ms

Apr 14 14:23:07 tom-pc systemd[1]: Starting Wait for Network to be Configured...
Apr 14 14:25:07 tom-pc systemd-networkd-wait-online[8511]: Timeout occurred while waiting for network connectivity.
Apr 14 14:25:07 tom-pc systemd[1]: systemd-networkd-wait-online.service: Main process exited, code=exited, status=1/FAILURE
Apr 14 14:25:07 tom-pc systemd[1]: systemd-networkd-wait-online.service: Failed with result 'exit-code'.
Apr 14 14:25:07 tom-pc systemd[1]: Failed to start Wait for Network to be Configured.
warning: error(s) occurred while switching to the new configuration

Edit: I am also using systemd-networkd, and I am on a fully updated nixos-unstable.

@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Apr 14, 2022