hostapd service, degraded systemd dependencies #16090

Closed
qknight opened this issue Jun 9, 2016 · 23 comments

Comments

@qknight
Member

qknight commented Jun 9, 2016

Issue description

nixos-version: 16.03.714.69420c5 (Emu)

in nixos/modules/services/networking/hostapd.nix we have this code:

  systemd.services.hostapd =
      { description = "hostapd wireless AP";

        path = [ pkgs.hostapd ];
        wantedBy = [ "network.target" ];

        after = [ "${cfg.interface}-cfg.service" "nat.service" "bind.service" "dhcpd.service"];

        serviceConfig =
          { ExecStart = "${pkgs.hostapd}/bin/hostapd ${configFile}";
            Restart = "always";
          };
      };

when using this with the configuration below:


  services.hostapd = {
    enable = true;
    wpaPassphrase = pw.wpaPassphrase;
    interface = "wlp4s0";
    ssid="flux";
  };

we will see this in the boot log:

boot log

[  OK  ] Started NTP Daemon.
[  OK  ] Started CPU Frequency Governor Setup.
[  OK  ] Started Kernel Auditing.
[  OK  ] Found device I210 Gigabit Network Connection.
[  OK  ] Started Name Service Cache Daemon.
[  OK  ] Started SSH Daemon.
[  OK  ] Reached target User and Group Name Lookups.
         Starting Permit User Sessions...
         Starting Login Service...
[  OK  ] Reached target Host and Network Name Lookups.
[  OK  ] Reached target Local File Systems.
         Starting Create Volatile Files and Directories...
[  OK  ] Started Permit User Sessions.
[  OK  ] Started Create Volatile Files and Directories.
[  OK  ] Stopped hostapd wireless AP.
[  OK  ] Started hostapd wireless AP.
[  OK  ] Started D-Bus System Message Bus.
[  OK  ] Started Serial Getty on ttyS0.
[  OK  ] Started Getty on tty1.
[  OK  ] Reached target Login Prompts.
[  OK  ] Started Login Service.
[  OK  ] Started Firewall.
[  OK  ] Stopped hostapd wireless AP.
[  OK  ] Started hostapd wireless AP.
[  OK  ] Reached target Network (Pre).
[  OK  ] Stopped hostapd wireless AP.
[  OK  ] Started hostapd wireless AP.
[  OK  ] Found device QCA986x/988x 802.11ac Wireless Network Adapter.
         Starting Bridge Interface br0...
[  OK  ] Stopped hostapd wireless AP.
[  OK  ] Started hostapd wireless AP.
[FAILED] Failed to start Bridge Interface br0.
See 'systemctl status br0-netdev.service' for details.
[  OK  ] Stopped hostapd wireless AP.
[FAILED] Failed to start hostapd wireless AP.
See 'systemctl status hostapd.service' for details.
[  OK  ] Reached target All Network Interfaces.
         Starting Networking Setup...
         Starting DHCP Client...
[  OK  ] Started Networking Setup.
         Starting Extra networking commands....
[  OK  ] Started Extra networking commands..
[  OK  ] Started DHCP Client.
[  OK  ] Reached target Network.
[  OK  ] Reached target Multi-User System.

fix

@Clever suggested to use:
systemd.services.hostapd.after = [ "sys-subsystem-net-devices-wlp4s0.device" "nat.service" "bind.service" "dhcpd.service" ]; in the configuration.nix

this makes it work! we have to replace:
"${cfg.interface}-cfg.service"
by
sys-subsystem-net-devices-${cfg.interface}.device

note: systemctl's output shows no ${cfg.interface}-cfg.service on my system. instead i see sys-subsystem-net-devices-${cfg.interface}.device, which should be the dependency we want. guess: the unit was probably renamed at some point.
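
The same ordering can probably be written without hard-coding the interface name. This is a minimal sketch as a configuration.nix fragment, not the module patch itself. `wlanDevice` is a hypothetical name introduced here, and the sketch assumes the interface name needs no systemd escaping (true for wlp4s0).

```nix
{ config, ... }:

let
  # Same alias the hostapd module uses.
  cfg = config.services.hostapd;
  # Hypothetical helper: the systemd device unit for the wireless
  # interface, e.g. sys-subsystem-net-devices-wlp4s0.device
  wlanDevice = "sys-subsystem-net-devices-${cfg.interface}.device";
in
{
  # List-typed unit options are merged, so this adds to the module's
  # existing `after` list instead of replacing it.
  systemd.services.hostapd.after = [ wlanDevice ];
}
```

Because the lists are merged, the stale "${cfg.interface}-cfg.service" entry stays in the generated unit. Ordering against a unit that does not exist should be harmless.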

shall i create a PR for this fix?

@edolstra

@cleverca22
Contributor

the cfg is an alias, cfg = config.services.hostapd;

@groxxda
Contributor

groxxda commented Jun 9, 2016

Looking at your log there is something wrong with the service in my opinion. It seems as if hostapd is started and stopped at least 4 times.
I don't think wantedBy = network.target is correct. Most certainly wantedBy = multi-user.target and after = network.target is more accurate. I believe this should be enough so the sys-subsystem.. ordering can be removed completely.
Can you please check if this works and gets rid of the multiple start/stop messages?

@qknight
Member Author

qknight commented Jun 9, 2016

@groxxda

the line below won't work:

systemd.services.hostapd.after = [ "network.target" ]; 

i even tried:

systemd.services.hostapd.after = [ "sys-subsystem-net-devices-wlp4s0.device" "network.target" ];

what you proposed

using this config:

  systemd.services.hostapd.after = [ "network.target" ];
  systemd.services.hostapd.wantedBy= [ "multi-user.target" ];

it fails with:

[  OK  ] Started Journal Service.
         Starting Flush Journal to Persistent Storage...
[  OK  ] Started Flush Journal to Persistent Storage.
[  OK  ] Started Load Kernel Modules.
[  OK  ] Found device /dev/ttyS0.
         Starting Apply Kernel Variables...
[  OK  ] Started Apply Kernel Variables.
[  OK  ] Found device KINGSTON_SMS200S360G 1.
         Mounting /boot...
[  OK  ] Reached target System Initialization.
[  OK  ] Listening on Nix Daemon Socket.
[  OK  ] Listening on D-Bus System Message Bus Socket.
[  OK  ] Reached target Sockets.
[  OK  ] Reached target Basic System.
         Starting SSH Daemon...
         Starting NTP Daemon...
         Starting Cron Daemon...
         Starting Name Service Cache Daemon...
         Starting CPU Frequency Governor Setup...
         Starting Store Sound Card State...
         Starting Kernel Auditing...
[  OK  ] Started Kernel Log Daemon.
[  OK  ] Started Daily Cleanup of Temporary Directories.
[  OK  ] Reached target Timers.
         Starting Firewall...
[  OK  ] Mounted /boot.
[  OK  ] Started Cron Daemon.
[  OK  ] Started CPU Frequency Governor Setup.
[  OK  ] Started Store Sound Card State.
[  OK  ] Started NTP Daemon.
[  OK  ] Started Kernel Auditing.
[  OK  ] Started SSH Daemon.
[  OK  ] Started Name Service Cache Daemon.
[  OK  ] Found device I210 Gigabit Network Connection.
[  OK  ] Reached target User and Group Name Lookups.
         Starting Login Service...
         Starting Permit User Sessions...
[  OK  ] Reached target Host and Network Name Lookups.
[  OK  ] Reached target Local File Systems.
         Starting Create Volatile Files and Directories...
[  OK  ] Started Permit User Sessions.
[  OK  ] Started Create Volatile Files and Directories.
[  OK  ] Started D-Bus System Message Bus.
[  OK  ] Started Serial Getty on ttyS0.
[  OK  ] Started Getty on tty1.
[  OK  ] Reached target Login Prompts.
[  OK  ] Started Login Service.
[  OK  ] Started Firewall.
[  OK  ] Reached target Network (Pre).
[  OK  ] Found device QCA986x/988x 802.11ac Wireless Network Adapter.
         Starting Bridge Interface br0...
[FAILED] Failed to start Bridge Interface br0.
See 'systemctl status br0-netdev.service' for details.
[  OK  ] Reached target All Network Interfaces.
         Starting DHCP Client...
         Starting Networking Setup...
[  OK  ] Started Networking Setup.
         Starting Extra networking commands....
[  OK  ] Started Extra networking commands..
[  OK  ] Started DHCP Client.
[  OK  ] Reached target Network.
[  OK  ] Started hostapd wireless AP.
[  OK  ] Reached target Multi-User System.

what i proposed

systemd.services.hostapd.after = [ "sys-subsystem-net-devices-wlp4s0.device" "nat.service" "bind.service" "dhcpd.service" ];

results in

         Starting Update UTMP about System Boot/Shutdown...
         Starting Load/Save Random Seed...
[  OK  ] Started Create Static Device Nodes in /dev.
[  OK  ] Started Load/Save Random Seed.
[  OK  ] Started Update UTMP about System Boot/Shutdown.
[  OK  ] Started udev Coldplug all Devices.
[  OK  ] Reached target Local File Systems (Pre).
         Starting udev Kernel Device Manager...
[  OK  ] Started Setup Virtual Console.
[  OK  ] Started udev Kernel Device Manager.
[  OK  ] Started Journal Service.
         Starting Flush Journal to Persistent Storage...
[  OK  ] Started Flush Journal to Persistent Storage.
[  OK  ] Started Load Kernel Modules.
[  OK  ] Found device /dev/ttyS0.
[  OK  ] Found device KINGSTON_SMS200S360G 1.
         Mounting /boot...
         Starting Apply Kernel Variables...
[  OK  ] Started Apply Kernel Variables.
[  OK  ] Mounted /boot.
[  OK  ] Reached target Local File Systems.
         Starting Create Volatile Files and Directories...
[  OK  ] Reached target System Initialization.
[  OK  ] Started Daily Cleanup of Temporary Directories.
[  OK  ] Reached target Timers.
[  OK  ] Listening on D-Bus System Message Bus Socket.
[  OK  ] Listening on Nix Daemon Socket.
[  OK  ] Reached target Sockets.
[  OK  ] Reached target Basic System.
         Starting CPU Frequency Governor Setup...
         Starting SSH Daemon...
         Starting NTP Daemon...
         Starting Store Sound Card State...
         Starting Kernel Auditing...
         Starting Name Service Cache Daemon...
[  OK  ] Started Kernel Log Daemon.
         Starting Cron Daemon...
         Starting Firewall...
[  OK  ] Started Create Volatile Files and Directories.
[  OK  ] Started CPU Frequency Governor Setup.
[  OK  ] Started Store Sound Card State.
[  OK  ] Started Kernel Auditing.
[  OK  ] Started Cron Daemon.
[  OK  ] Started NTP Daemon.
[  OK  ] Found device I210 Gigabit Network Connection.
[  OK  ] Started SSH Daemon.
[  OK  ] Started Name Service Cache Daemon.
[  OK  ] Reached target Host and Network Name Lookups.
[  OK  ] Reached target User and Group Name Lookups.
         Starting Permit User Sessions...
         Starting Login Service...
[  OK  ] Started Permit User Sessions.
[  OK  ] Started Getty on tty1.
[  OK  ] Started Serial Getty on ttyS0.
[  OK  ] Reached target Login Prompts.
[  OK  ] Started D-Bus System Message Bus.
[  OK  ] Started Login Service.
[  OK  ] Started Firewall.
[  OK  ] Reached target Network (Pre).
[  OK  ] Found device QCA986x/988x 802.11ac Wireless Network Adapter.
[  OK  ] Started hostapd wireless AP.
         Starting Bridge Interface br0...
[  OK  ] Started Bridge Interface br0.
[  OK  ] Reached target All Network Interfaces.
         Starting DHCP Client...
         Starting Networking Setup...
[  OK  ] Started Networking Setup.
         Starting Extra networking commands....
[  OK  ] Started Extra networking commands..

works pretty well.

combination

this seems to work also:

systemd.services.hostapd.wantedBy= [ "multi-user.target" ];
systemd.services.hostapd.after = [ "sys-subsystem-net-devices-wlp4s0.device" "nat.service" "bind.service" "dhcpd.service" ];

so what should we do?

@groxxda
Contributor

groxxda commented Jun 10, 2016

My guess is that the problem is not directly related to hostapd but rather to the bridge br0.
Would you mind posting how you set up br0 and the output of systemctl status br0-netdev.service after the failure?
I'm not sure why hostapd has to be started before br0 and more importantly why the ordering with sys-subsystem-net-devices-wlp4s0.device works. This looks random to me 😕

@qknight
Member Author

qknight commented Jun 11, 2016

@groxxda

br0 setup

 networking = {
    hostName = "apu-nixi"; # Define your hostname.
    bridges.br0.interfaces = [ "enp1s0" "wlp4s0" ];
    firewall = {
      enable = true;
      allowPing = true;
      allowedTCPPorts = [ 22 ];
      #allowedUDPPorts = [ 5353 ];
    };

  };

@qknight
Member Author

qknight commented Jun 17, 2016

@groxxda ideas?

@groxxda
Contributor

groxxda commented Jun 17, 2016

@qknight
As you can see from your logs hostapd starts without errors in both cases and the problem is really with the bridge.
I'm not using scripted network interfaces myself, and have no detailed knowledge how they should work.
To investigate further it would be handy to see the outputs of

  • systemctl status br0-netdev.service or better journalctl -u br0-netdev.service
  • systemctl cat br0-netdev.service
  • systemctl list-dependencies --after br0-netdev.service (maybe in both cases: with your fixed hostapd.service and my simplified version)

and optionally

  • systemctl list-dependencies network.target

@qknight
Member Author

qknight commented Jul 1, 2016

@groxxda

configuration.nix

  systemd.services.hostapd.wantedBy= [ "multi-user.target" ];
  systemd.services.hostapd.after = [ "sys-subsystem-net-devices-wlp4s0.device" "nat.service" "bind.service" "dhcpd.service" ];

  services.hostapd = {
    enable = true;
    wpaPassphrase = pw.wpaPassphrase;
    interface = "wlp4s0";
    ssid="flux";
  };

systemctl status br0-netdev.service

systemctl status br0-netdev.service
● br0-netdev.service - Bridge Interface br0
   Loaded: loaded (/nix/store/qg3vglcr6pmq15hsjjrj6w6gvglh296j-unit-br0-netdev.service/br0-netdev.service; bad; vendor pres
   Active: active (exited) since Fri 2016-07-01 11:15:45 CEST; 2min 42s ago
  Process: 710 ExecStart=/nix/store/1n622x4m2q7gc0mijd0v5h50ky633q3d-unit-script/bin/br0-netdev-start (code=exited, status=
 Main PID: 710 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/br0-netdev.service

Jul 01 11:15:44 apu-nixi systemd[1]: Starting Bridge Interface br0...
Jul 01 11:15:44 apu-nixi br0-netdev-start[710]: Removing old bridge br0...
Jul 01 11:15:44 apu-nixi br0-netdev-start[710]: Adding bridge br0...
Jul 01 11:15:45 apu-nixi systemd[1]: Started Bridge Interface br0.

journalctl -u br0-netdev.service

-- Reboot --
Jul 01 11:15:44 apu-nixi systemd[1]: Starting Bridge Interface br0...
Jul 01 11:15:44 apu-nixi br0-netdev-start[710]: Removing old bridge br0...
Jul 01 11:15:44 apu-nixi br0-netdev-start[710]: Adding bridge br0...
Jul 01 11:15:45 apu-nixi systemd[1]: Started Bridge Interface br0.

systemctl cat br0-netdev.service

# /nix/store/qg3vglcr6pmq15hsjjrj6w6gvglh296j-unit-br0-netdev.service/br0-netdev.service
[Unit]
After=network-pre.target mstpd.service sys-subsystem-net-devices-enp1s0.device sys-subsystem-net-devices-wlp4s0.device netw
Before=network-interfaces.target sys-subsystem-net-devices-br0.device
BindsTo=sys-subsystem-net-devices-enp1s0.device sys-subsystem-net-devices-wlp4s0.device
Description=Bridge Interface br0

[Service]
Environment="LOCALE_ARCHIVE=/nix/store/0hjq5fy6ghbkzwdwszvy5k8hm31i5fcz-glibc-locales-2.23/lib/locale/locale-archive"
Environment="PATH=/nix/store/azal5a1w3l4kwrkwjrxva1mflw8cga27-iproute2-4.3.0/bin:/nix/store/lphk3qrh53kv2j69b3sry5jmhfbrz1n
Environment="TZDIR=/nix/store/majymj6knc0j74rrp7m6mlrrmr5h4wqa-tzdata-2015g/share/zoneinfo"



ExecStart=/nix/store/1n622x4m2q7gc0mijd0v5h50ky633q3d-unit-script/bin/br0-netdev-start 
ExecStopPost=/nix/store/74g07gm4zdi3d2jak0q2xc863gpxjhg7-unit-script/bin/br0-netdev-post-stop
RemainAfterExit=true
Type=oneshot

systemctl list-dependencies --after br0-netdev.service (my fixed hostapd.service)

br0-netdev.service
x ├─mstpd.service
x ├─network-addresses-enp1s0.service
x ├─network-addresses-wlp4s0.service
x ├─network-link-enp1s0.service
x ├─network-link-wlp4s0.service
● ├─sys-subsystem-net-devices-enp1s0.device
● ├─sys-subsystem-net-devices-wlp4s0.device
● ├─system.slice
● ├─systemd-journald.socket
● ├─basic.target
● │ ├─-.mount
x │ ├─tmp.mount
● │ ├─paths.target
● │ │ ├─systemd-ask-password-console.path
● │ │ └─systemd-ask-password-wall.path
● │ ├─slices.target
● │ │ ├─-.slice
● │ │ ├─system.slice
● │ │ └─user.slice
● │ ├─sockets.target
● │ │ ├─dbus.socket
● │ │ ├─nix-daemon.socket
x │ │ ├─syslog.socket
● │ │ ├─systemd-initctl.socket
● │ │ ├─systemd-journald-audit.socket
● │ │ ├─systemd-journald-dev-log.socket
● │ │ ├─systemd-journald.socket
● │ │ ├─systemd-udevd-control.socket
● │ │ └─systemd-udevd-kernel.socket
● │ └─sysinit.target
● │   ├─dev-hugepages.mount
● │   ├─dev-mqueue.mount
x │   ├─emergency.service
● │   ├─kmod-static-nodes.service
x │   ├─sys-fs-fuse-connections.mount
x │   ├─sys-kernel-config.mount
● │   ├─sys-kernel-debug.mount
● │   ├─systemd-journald.service
● │   ├─systemd-modules-load.service
● │   ├─systemd-random-seed.service
● │   ├─systemd-sysctl.service
● │   ├─systemd-tmpfiles-setup-dev.service
x │   ├─systemd-udev-settle.service
● │   ├─systemd-udev-trigger.service
● │   ├─systemd-udevd.service
● │   ├─systemd-update-utmp.service
● │   ├─systemd-vconsole-setup.service
x │   └─emergency.target
x │     └─emergency.service
● ├─network-pre.target
● │ └─firewall.service
● └─sysinit.target
●   ├─dev-hugepages.mount
●   ├─dev-mqueue.mount
x   ├─emergency.service
●   ├─kmod-static-nodes.service
x   ├─sys-fs-fuse-connections.mount
x   ├─sys-kernel-config.mount
●   ├─sys-kernel-debug.mount
●   ├─systemd-journald.service
●   ├─systemd-modules-load.service
●   ├─systemd-random-seed.service
●   ├─systemd-sysctl.service
●   ├─systemd-tmpfiles-setup-dev.service
x   ├─systemd-udev-settle.service
●   ├─systemd-udev-trigger.service
●   ├─systemd-udevd.service
●   ├─systemd-update-utmp.service
●   ├─systemd-vconsole-setup.service
x   └─emergency.target
x     └─emergency.service

x marks a red dot

systemctl list-dependencies network.target

● ├─br0-netdev.service
● ├─dhcpcd.service
● ├─hostapd.service
● ├─network-local-commands.service
● ├─network-setup.service
● ├─network-interfaces.target
● └─network-pre.target
●   └─firewall.service

@groxxda
Contributor

groxxda commented Jul 6, 2016

Well thanks for the outputs, but they are not really helpful to me because they don't show the error 😞
From the default config:

# For nl80211, this parameter can be used to request the AP interface to be
# added to the bridge automatically (brctl may refuse to do this before hostapd
# has been started to change the interface mode). If needed, the bridge
# interface is also created.
#bridge=br0

Maybe this is enough to get everything working.
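
On NixOS that option could presumably be passed through the module's extraConfig. This is a minimal, untested sketch, assuming services.hostapd.extraConfig is appended verbatim to the generated hostapd.conf and that the bridge is still called br0.

```nix
{
  # Merged with the existing services.hostapd settings from this thread.
  services.hostapd.extraConfig = ''
    bridge=br0
  '';
}
```

Whether wlp4s0 would then also have to be dropped from networking.bridges.br0.interfaces, so that only hostapd adds it to the bridge, has not been checked here.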

@luminoso

I have exactly the same problem: a race condition between the systemd bridge and the hostapd bridge. Have you managed to solve the issue?

@qknight
Member Author

qknight commented Jul 14, 2016

@luminoso
i have a workaround, see this thread for it. i'm still working on a proper fix and you are invited to help ;-)

@qknight
Member Author

qknight commented Aug 3, 2016

@groxxda

now the output but without the following two lines:

systemd.services.hostapd.wantedBy= [ "multi-user.target" ];
systemd.services.hostapd.after = [ "sys-subsystem-net-devices-wlp4s0.device" "nat.service" "bind.service" "dhcpd.service" ];

configuration.nix

  services.hostapd = {
    enable = true;
    wpaPassphrase = pw.wpaPassphrase;
    interface = "wlp4s0";
    ssid="flux";
  };

systemctl status br0-netdev.service

systemctl status br0-netdev.service
● br0-netdev.service - Bridge Interface br0
   Loaded: loaded (/nix/store/ac1i3ywzxm809sjvmqikvacbgf33zi5s-unit-br0-netdev.service/br0-netdev.service; bad; vendor preset: enabled)
   Active: failed (Result: exit-code) since Wed 2016-08-03 16:47:44 CEST; 2min 0s ago
  Process: 891 ExecStart=/nix/store/w9qj52sl052fnm3g35hcp8k8khhkv99a-unit-script/bin/br0-netdev-start (code=exited, status=2)
 Main PID: 891 (code=exited, status=2)

Aug 03 16:47:43 apu-nixi systemd[1]: Starting Bridge Interface br0...
Aug 03 16:47:43 apu-nixi br0-netdev-start[891]: Removing old bridge br0...
Aug 03 16:47:43 apu-nixi br0-netdev-start[891]: Adding bridge br0...
Aug 03 16:47:44 apu-nixi br0-netdev-start[891]: RTNETLINK answers: Operation not supported
Aug 03 16:47:44 apu-nixi systemd[1]: br0-netdev.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Aug 03 16:47:44 apu-nixi systemd[1]: Failed to start Bridge Interface br0.
Aug 03 16:47:44 apu-nixi systemd[1]: br0-netdev.service: Unit entered failed state.
Aug 03 16:47:44 apu-nixi systemd[1]: br0-netdev.service: Failed with result 'exit-code'.

journalctl -u br0-netdev.service

-- Reboot --
Aug 03 16:47:43 apu-nixi systemd[1]: Starting Bridge Interface br0...
Aug 03 16:47:43 apu-nixi br0-netdev-start[891]: Removing old bridge br0...
Aug 03 16:47:43 apu-nixi br0-netdev-start[891]: Adding bridge br0...
Aug 03 16:47:44 apu-nixi br0-netdev-start[891]: RTNETLINK answers: Operation not supported
Aug 03 16:47:44 apu-nixi systemd[1]: br0-netdev.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Aug 03 16:47:44 apu-nixi systemd[1]: Failed to start Bridge Interface br0.
Aug 03 16:47:44 apu-nixi systemd[1]: br0-netdev.service: Unit entered failed state.
Aug 03 16:47:44 apu-nixi systemd[1]: br0-netdev.service: Failed with result 'exit-code'.

systemctl cat br0-netdev.service

# /nix/store/ac1i3ywzxm809sjvmqikvacbgf33zi5s-unit-br0-netdev.service/br0-netdev.service
[Unit]
After=network-pre.target mstpd.service sys-subsystem-net-devices-enp1s0.device sys-subsystem-net-devices-wlp4s0.device network-addresses-enp1s0.service network-link-enp1s0.service network-addresses-wlp4s0.se
Before=network-interfaces.target sys-subsystem-net-devices-br0.device
BindsTo=sys-subsystem-net-devices-enp1s0.device sys-subsystem-net-devices-wlp4s0.device
Description=Bridge Interface br0

[Service]
Environment="LOCALE_ARCHIVE=/nix/store/6c9w544nky66z8ax5rzh941cl6rv0j9c-glibc-locales-2.23/lib/locale/locale-archive"
Environment="PATH=/nix/store/vcd3l2xdfffsbkk5i234zbdq2552w832-iproute2-4.3.0/bin:/nix/store/w8vzn0lsahbd9sfh0v30x65qwq6xrpa8-coreutils-8.25/bin:/nix/store/l65knk24c08q0lwdcf0yyh7x6l5shhqj-findutils-4.4.2/bin
Environment="TZDIR=/nix/store/8qmj1pz36ky95r25c74w6dx48vck3i3b-tzdata-2016e/share/zoneinfo"



ExecStart=/nix/store/w9qj52sl052fnm3g35hcp8k8khhkv99a-unit-script/bin/br0-netdev-start 
ExecStopPost=/nix/store/9v82lg0lhjqq7xaix6063bq3wdvv480f-unit-script/bin/br0-netdev-post-stop
RemainAfterExit=true
Type=oneshot





systemctl list-dependencies --after br0-netdev.service (without my hostapd.service fix)

br0-netdev.service
x ├─mstpd.service
x ├─network-addresses-enp1s0.service
x ├─network-addresses-wlp4s0.service
x ├─network-link-enp1s0.service
x ├─network-link-wlp4s0.service
● ├─sys-subsystem-net-devices-enp1s0.device
● ├─sys-subsystem-net-devices-wlp4s0.device
● ├─system.slice
● ├─systemd-journald.socket
● ├─basic.target
● │ ├─-.mount
x │ ├─tmp.mount
● │ ├─paths.target
● │ │ ├─systemd-ask-password-console.path
● │ │ └─systemd-ask-password-wall.path
● │ ├─slices.target
● │ │ ├─-.slice
● │ │ ├─system.slice
● │ │ └─user.slice
● │ ├─sockets.target
● │ │ ├─dbus.socket
● │ │ ├─nix-daemon.socket
x │ │ ├─syslog.socket
● │ │ ├─systemd-initctl.socket
● │ │ ├─systemd-journald-audit.socket
● │ │ ├─systemd-journald-dev-log.socket
● │ │ ├─systemd-journald.socket
● │ │ ├─systemd-udevd-control.socket
● │ │ └─systemd-udevd-kernel.socket
● │ └─sysinit.target
● │   ├─dev-hugepages.mount
● │   ├─dev-mqueue.mount
x │   ├─emergency.service
● │   ├─kmod-static-nodes.service
x │   ├─sys-fs-fuse-connections.mount
x │   ├─sys-kernel-config.mount
● │   ├─sys-kernel-debug.mount
● │   ├─systemd-journald.service
● │   ├─systemd-modules-load.service
● │   ├─systemd-random-seed.service
● │   ├─systemd-sysctl.service
● │   ├─systemd-tmpfiles-setup-dev.service
x │   ├─systemd-udev-settle.service
● │   ├─systemd-udev-trigger.service
● │   ├─systemd-udevd.service
● │   ├─systemd-update-utmp.service
● │   ├─systemd-vconsole-setup.service
x │   └─emergency.target
x │     └─emergency.service
● ├─network-pre.target
● │ └─firewall.service
● └─sysinit.target
●   ├─dev-hugepages.mount
●   ├─dev-mqueue.mount
x   ├─emergency.service
●   ├─kmod-static-nodes.service
x   ├─sys-fs-fuse-connections.mount
x   ├─sys-kernel-config.mount
●   ├─sys-kernel-debug.mount
●   ├─systemd-journald.service
●   ├─systemd-modules-load.service
●   ├─systemd-random-seed.service
●   ├─systemd-sysctl.service
●   ├─systemd-tmpfiles-setup-dev.service
x   ├─systemd-udev-settle.service
●   ├─systemd-udev-trigger.service
●   ├─systemd-udevd.service
●   ├─systemd-update-utmp.service
●   ├─systemd-vconsole-setup.service
x   └─emergency.target
x     └─emergency.service

x is actually a red dot

systemctl list-dependencies network.target

x ├─br0-netdev.service
● ├─dhcpcd.service
x ├─hostapd.service
● ├─network-local-commands.service
● ├─network-setup.service
● ├─network-interfaces.target
● └─network-pre.target
●   └─firewall.service

x is actually a red dot

@qknight
Member Author

qknight commented Aug 3, 2016

@groxxda

if i just add the code below it'll work:

systemd.services.hostapd.after = [ "sys-subsystem-net-devices-wlp4s0.device" ];

i think the device needs to be up in order to add it to a bridge.

@qknight
Member Author

qknight commented Aug 12, 2016

@luminoso could you please check if that also fixes your issue?

@luminoso

@qknight I'm sorry for the delay replying.
I was confused when I posted here; I thought I was posting in another topic. I am actually using Arch and noticed that a related problem was being discussed here for NixOS. I shouldn't have posted here. However, the problem is the same, a race condition, and I didn't find a way to fix it. I just used a timer to avoid the race condition (the ugly way).

@qknight
Member Author

qknight commented Oct 10, 2016

since nobody has a better idea, i close this as the fix:
systemd.services.hostapd.after = [ "sys-subsystem-net-devices-wlp4s0.device" ];
works for me!

@qknight qknight closed this as completed Oct 10, 2016
@mrobbetts
Contributor

Ah, I've been meaning to comment on this for a while but not gotten around to it.

I've been having trouble with a race condition related to this. I think hostapd was (sometimes) trying to come up before my wifi device was available at boot. A nixos-rebuild switch, depending on what needed restarting, would also (sometimes) fail for the same reason. Things would fail approaching half the time.

My data is limited to just me, but I applied this fix and haven't seen the problem since.

Shouldn't this change be more directly integrated into the hostapd package? It really looks like, logically, it should have been there all along. Or is there other logic which is supposed to regulate the ordering, that is just buggy for me?

@qknight
Member Author

qknight commented Oct 10, 2016

@mrobbetts yes, i could fix this in nixpkgs, but so far it seemed that i'm the only one affected.

the question is, should i?

@mrobbetts
Contributor

I hope so! From my point of view, it looks as though this is a real bug in the module. IMO, bugs ought to be fixed if they are low risk, even if nobody you know of is suffering from them, because they may suffer later and in unpredictable ways. Others may not agree with that, I know.

While I know the hostapd module is not fully developed (currently, the overwhelming majority of my hostapd configuration is poured into extraConfig, because not many of the configuration keys seem to be supported by the module directly), the module does at least know what interface you're using it with, so the needed change for this would presumably be very small.

Of course, I'm assuming this is the real bug. I'm not familiar enough with systemd to know if there is another/better way of handling this, or if it should actually be working but there's a bug elsewhere. If this all sounds logical to you, I'd definitely advocate towards fixing it.

@qknight
Member Author

qknight commented Jun 14, 2017

fixed in #26573

@clefru
Contributor

clefru commented Aug 22, 2018

This issue is not fixed, and the race is still present in the master branch. I have submitted a PR though, and I would welcome testing.

The root cause of this is that if hostapd.service loses the race against br0-netdev.service, the latter will call ip link set wlp4s0 master br0, which fails if hostapd hasn't put the device into AP mode yet. The failure bubbles up and the bridge setup fails altogether.

The fix is to order hostapd before br0-netdev.service by hanging it below network-link-wlp4s0.service, which br0-netdev.service depends on. This is mostly achieved by cleaning up the "after" clause and introducing a requiredBy clause. See the PR for a PDF of the cleaned-up systemd dependency tree.
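
A rough sketch of the kind of ordering described here, written as a configuration.nix fragment. It is not the PR's actual diff, which is not reproduced in this thread. It assumes the scripted-networking unit is named network-link-<interface>.service, as in the dependency listings above. `wlanDevice` and `linkService` are hypothetical names. The explicit `before` is also an assumption: requiredBy only adds a Requires= dependency, and in systemd a requirement does not by itself order the two units.

```nix
{ config, lib, ... }:

let
  cfg = config.services.hostapd;
  # Hypothetical helper names, introduced only for this sketch.
  wlanDevice = "sys-subsystem-net-devices-${cfg.interface}.device";
  linkService = "network-link-${cfg.interface}.service";
in
{
  systemd.services.hostapd = {
    # Replace (not extend) the module's `after` list, so that older
    # orderings such as dhcpd.service cannot form a cycle with the
    # `before` below.
    after = lib.mkForce [ wlanDevice ];
    # Pull hostapd in whenever the link service is started ...
    requiredBy = [ linkService ];
    # ... and start it first, since Requires= alone does not order units.
    before = [ linkService ];
  };
}
```

With this ordering the chain would be hostapd.service, then network-link-wlp4s0.service, then br0-netdev.service, so the interface should already be in AP mode when it is added to the bridge.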

@qknight
Member Author

qknight commented Sep 28, 2018

i've tried to reproduce my original problem using (18.03) 45f52f7 but it doesn't hit the race at all. that said, i tested:

  • without any patch -> works
  • tested with my patch -> works
  • tested with your patch -> works

i'll do some more investigation, but i don't understand why it works now.

update:

nixos-rebuild -I nixpkgs=/root/nixpkgs switch
warning: Nix search path entry '/root/nixpkgs' does not exist, ignoring

i missed the warning in the output, so the system was effectively rebuilt without any of my patches... who creates these defaults? why not 'abort' instead of warning?

@qknight
Member Author

qknight commented Sep 28, 2018

see #45464 (comment)
