Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FS#2637 - hostapd ubus reload makes OpenWrt as WiFi client very unreliable #7899

Closed
openwrt-bot opened this issue Nov 28, 2019 · 9 comments
Closed
Labels

Comments

@openwrt-bot
Copy link

@openwrt-bot openwrt-bot commented Nov 28, 2019

stintel:

Commit 60fb4c9 (hostapd: add ubus reload) breaks WiFi client connectivity when for some reason the association does not succeed immediately.

I started seeing this problem on all my Raspberry Pi Zero W devices, where WiFi takes a long time to come up due to the lack of entropy:

[ 32.392170] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready [ 104.672670] random: crng init done [ 104.685751] random: 3 urandom warning(s) missed due to ratelimiting [ 105.436537] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready

At this point, the Raspberry Pi is associated to the AP, but the DHCP client is not running. This is the output of //ifstatus lan//:

{ "up": false, "pending": false, "available": false, "autostart": true, "dynamic": false, "proto": "dhcp", "data": {
    },
    "errors": [
            {
                    "subsystem": "interface",
                    "code": "NO_DEVICE"
            }
    ]

}

Running //ifup lan// fixes the problem, but this is of course not easy to do on headless devices.

Later I also experienced this problem on a Raspberry Pi 3 A, which does not have the low entropy problem. In this case the delayed association was caused by a power outage, and the Raspberry Pi booted faster than the APs.

@openwrt-bot
Copy link
Author

@openwrt-bot openwrt-bot commented Dec 3, 2019

stintel:

I just noticed this on one of my devices, and am pretty sure it is caused by the same commit:

root@litebeam01:~# cat /etc/config/wireless

config wifi-device 'radio0'
option type 'mac80211'
option channel '36'
option hwmode '11a'
option path 'pci0000:00/0000:00:00.0'
option htmode 'VHT80'
option disabled '1'

config wifi-iface 'default_radio0'
option device 'radio0'
option network 'lan'
option mode 'ap'
option ssid 'OpenWrt'
option encryption 'none'

config wifi-device 'radio1'
option type 'mac80211'
option channel '11'
option hwmode '11g'
option path 'platform/ahb/18100000.wmac'
option htmode 'HT20'
option disabled '1'

config wifi-iface 'default_radio1'
option device 'radio1'
option network 'lan'
option mode 'ap'
option ssid 'OpenWrt'
option encryption 'none'

root@litebeam01:~# ps | grep phy 745 root 3444 S /usr/sbin/hostapd -s -n phy0 -P /var/run/hostapd-phy0.pid -g /var/run/hostapd-phy0/global -B 820 root 3444 S /usr/sbin/hostapd -s -n phy1 -P /var/run/hostapd-phy1.pid -g /var/run/hostapd-phy1/global -B 871 root 3444 S /usr/sbin/wpa_supplicant -s -n phy0 -P /var/run/wpa_supplicant-phy0.pid -g /var/run/wpa_supplicant-phy0/global -B 872 root 3444 S /usr/sbin/wpa_supplicant -s -n phy1 -P /var/run/wpa_supplicant-phy1.pid -g /var/run/wpa_supplicant-phy1/global -B 1806 root 1208 S grep phy

We should not unconditionally start these instances I'd say.

@openwrt-bot
Copy link
Author

@openwrt-bot openwrt-bot commented Dec 6, 2019

dangowrt:

Running the services unconditionally (to be precise: only in the presence of wifi hardware) was the intention of those commits and believe me, it's a feature, not a bug :)

The cause of the problem described by you above is probably that previously, we have been running hostapd/wpa_supplicant from those scripts and probably those services only fork to background once enough entropy had been collected.
I was hoping that introducing ubus wait_for ... would solve that problem, but apparently it didn't -- the service is running and the configuration comes up (according to your logs), just netifd had already given up. I will try to find a way to reproduce this, but I don't have any hardware as bad as RasbPi Zero at hand, so it may take a couple of days to find it...

@openwrt-bot
Copy link
Author

@openwrt-bot openwrt-bot commented Dec 6, 2019

stintel:

If you have a spare AP you could configure a client to connect to it. Turn of that AP, boot the client, and boot the AP 2 minutes afterwards.

@openwrt-bot
Copy link
Author

@openwrt-bot openwrt-bot commented Apr 6, 2020

dangowrt:

I've now tried exactly that and cannot reproduce the issue. Can it be that the wifi driver (maybe brcmfmac? cfg80211?) doesn't signal link status properly and our 'ip link set wlan0 up' call makes it come up prematurely?

@openwrt-bot
Copy link
Author

@openwrt-bot openwrt-bot commented Apr 16, 2020

stintel:

Any suggestions on how to check this would be appreciated.

@openwrt-bot
Copy link
Author

@openwrt-bot openwrt-bot commented Apr 25, 2020

dangowrt:

To test this, please try this:
Have an open/unencrypted AP next to the device.
On the device, boot fresh/unconfigured OpenWrt.
Enter commands:

ip link show dev wlan0
ip link set dev wlan0 up
ip link show dev wlan0
iw dev wlan0 connect "wrong_non_existing_ssid"
ip link show dev wlan0
sleep 1
ip link show dev wlan0
sleep 1
ip link show dev wlan0
iw dev wlan0 connect "ssid_of_open_AP"
ip link show dev wlan0
sleep 1
ip link show dev wlan0
sleep 1
ip link show dev wlan0

@openwrt-bot
Copy link
Author

@openwrt-bot openwrt-bot commented Jun 6, 2020

stintel:

https://drive.google.com/file/d/1SJQx9ukPT6ZeOmz1GpYwYlYgeWRBTYSO/view?usp=sharing

Since the device was not connected I could not do this over SSH, so here's a photo.

@openwrt-bot
Copy link
Author

@openwrt-bot openwrt-bot commented Jun 6, 2020

stintel:

As I noticed some changes were made to wifi reconfig recently, I tried with master r13499-7b4877c204 and at first sight I can no longer reproduce.

The crng is initialized a bit faster in this new version, I will try booting the RPi0W with the AP offline for 2 minutes.

[ 72.436719] random: crng init done [ 72.453561] random: 2 urandom warning(s) missed due to ratelimiting [ 73.233633] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready

FYI, on this version, I am not seeing this line:

[ 32.392170] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready

@openwrt-bot
Copy link
Author

@openwrt-bot openwrt-bot commented Jun 6, 2020

stintel:

Brought up the AP after ~25m uptime on the RPi0W and it instantly connected and requested an IP via DHCP.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant