First Boot Wizard crashes when radio without hardware exists in UCI
Problem Description
The First Boot Wizard (FBW) crashes during network scanning when there's a radio configured in UCI (/etc/config/wireless) but the corresponding physical hardware (phy) doesn't exist in the system.
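A read-only check can show whether a device is in this state before running the wizard. The sketch below is only an illustration: `report_stale_radios` and the `SYSFS_ROOT` override are hypothetical names, and it assumes each radio's UCI `path` maps to `/sys/devices/<path>/ieee80211/phyN`, the same layout the workaround script in this report relies on.

```shell
#!/bin/sh
# Hypothetical helper: read `uci show wireless` output on stdin and print
# the name of every section whose `path` has no phy under sysfs.
# SYSFS_ROOT is an illustrative override (defaults to /sys).
report_stale_radios() {
    _root="${SYSFS_ROOT:-/sys}"
    # Keep only lines such as: wireless.radio2.path='pci0000:01/0000:01:00.0'
    sed -n "s/^wireless\.\([^.]*\)\.path='\(.*\)'\$/\1 \2/p" |
    while read -r _radio _path; do
        # An unmatched glob stays literal, so ls fails when no phy exists.
        if ! ls -d "$_root/devices/$_path"/ieee80211/phy* >/dev/null 2>&1; then
            echo "$_radio"
        fi
    done
}
```

On a router this would be run as `uci show wireless | report_stale_radios`; it changes nothing in the configuration.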
Symptoms
- FBW starts scanning and detects mesh networks successfully
- Process crashes silently during config download phase
- Frontend shows "Connection attempt not yet started" indefinitely
- /tmp/scanning file remains true (no cleanup)
- No config files downloaded to /tmp/fbw/
Steps to Reproduce
- Have a router with a stale UCI radio configuration (e.g., radio2) pointing to non-existent PCI hardware
- Start First Boot Wizard scan via lime-app
- FBW detects networks but crashes when trying to process them
Environment
Hardware: Router with 2 physical radios (phy0, phy1)
UCI Config: 3 radios configured (radio0, radio1, radio2)
# Physical radios
root@LiMe-1d2ae2:~# ls /sys/class/ieee80211/
phy0 phy1
# UCI radios
root@LiMe-1d2ae2:~# uci show wireless | grep "^wireless.radio"
wireless.radio0=wifi-device
wireless.radio0.path='platform/ahb/18100000.wmac'
wireless.radio1=wifi-device
wireless.radio1.path='pci0000:00/0000:00:00.0'
wireless.radio2=wifi-device
wireless.radio2.path='pci0000:01/0000:01:00.0' # <-- Hardware doesn't exist
Error Log
root@LiMe-1d2ae2:~# /bin/firstbootwizard
[FBW] Scanning...
/usr/bin/lua: /usr/lib/lua/lime/wireless.lua:19: wireless.get_phy_mac(..) failed reading: /sys/class/ieee80211/phy2/macaddress
stack traceback:
[C]: in function 'assert'
/usr/lib/lua/lime/wireless.lua:19: in function 'get_phy_mac'
/usr/lib/lua/firstbootwizard.lua:110: in function 'func'
/usr/lib/lua/firstbootwizard/functools.lua:63: in function </usr/lib/lua/firstbootwizard/functools.lua:60>
(tail call): ?
/usr/lib/lua/firstbootwizard.lua:127: in function 'cb'
/usr/lib/lua/firstbootwizard/functools.lua:127: in function 'reduce'
/usr/lib/lua/firstbootwizard.lua:430: in function 'get_all_networks'
/bin/firstbootwizard:7: in main chunk
[C]: ?
Root Cause Analysis
The bug occurs in this call chain:
- firstbootwizard.lua:110 - fbw.get_own_macs() iterates over all 5GHz radios
- firstbootwizard/utils.lua:78 - extract_phys_from_radios("radio2") returns "phy2":
  function utils.extract_phys_from_radios(radio)
      return "phy"..radio.sub(radio, -1) -- Assumes radioN = phyN
  end
- firstbootwizard.lua:110 calls wireless.get_phy_mac("phy2")
- wireless.lua:19 - assert() crashes when file doesn't exist:
  function wireless.get_phy_mac(phy)
      local path = "/sys/class/ieee80211/"..phy.."/macaddress"
      local mac = assert(fs.readfile(path), "wireless.get_phy_mac(..) failed reading: "..path):gsub("\n","")
      return utils.split(mac, ":")
  end
Why the incorrect mapping happens
The code incorrectly assumes that radioN always corresponds to phyN:
- radio0 → phy0 ✅
- radio1 → phy1 ✅
- radio2 → phy2 ❌ (phy2 doesn't exist)
This is fragile because:
- Radio names are UCI configuration names (can be arbitrary)
- Phy names are kernel-assigned based on hardware detection order
- A radio can be removed/disabled in hardware but remain in UCI config
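Instead of guessing the phy from the radio name, it can be looked up from the radio's UCI `path`. The following is a minimal shell sketch of that idea, not the project's implementation: `phy_for_path` and `SYSFS_ROOT` are hypothetical names, and the `/sys/devices/<path>/ieee80211/phyN` layout is assumed from the workaround script in this report.

```shell
#!/bin/sh
# Hypothetical helper: print the phy name that belongs to a UCI radio path,
# or return 1 when no hardware is present for that path.
# SYSFS_ROOT is an illustrative override (defaults to /sys).
phy_for_path() {
    _path="$1"
    _root="${SYSFS_ROOT:-/sys}"
    for _dir in "$_root/devices/$_path"/ieee80211/phy*; do
        # An unmatched glob stays literal, so confirm the directory exists.
        if [ -d "$_dir" ]; then
            basename "$_dir"
            return 0
        fi
    done
    return 1
}
```

For example, `phy_for_path "$(uci get wireless.radio2.path)" || echo "radio2 has no phy"` would report the stale radio from the environment above without assuming that radio2 is phy2.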
Proposed Solutions
Solution 1: Graceful error handling (Quick fix)
Modify wireless.get_phy_mac() to return nil instead of crashing:
function wireless.get_phy_mac(phy)
    local path = "/sys/class/ieee80211/"..phy.."/macaddress"
    -- Check if phy exists before trying to read MAC
    if not fs.stat(path) then
        utils.log("wireless.get_phy_mac: phy "..phy.." does not exist, skipping")
        return nil
    end
    local mac = assert(fs.readfile(path), "wireless.get_phy_mac(..) failed reading: "..path):gsub("\n","")
    return utils.split(mac, ":")
end
Then update fbw.get_own_macs() to filter out nil results:
function fbw.get_own_macs()
    local radios = ft.map(utils.extract_prop(".name"), wireless.scandevices())
    local radios_5ghz = ft.filter(wireless.is5Ghz, radios)
    local phys = ft.map(utils.extract_phys_from_radios, radios_5ghz)
    local macs = ft.map(function(phy)
        local mac = wireless.get_phy_mac(phy)
        if mac then
            return table.concat(mac, ":")
        end
        return nil
    end, phys)
    -- Filter out nils
    return ft.filter(function(mac) return mac ~= nil end, macs)
end
Solution 2: Correct radio→phy mapping (Proper fix)
Don't assume radioN = phyN. Instead, derive the phy from the radio's device path:
function wireless.get_phy_from_radio(radio_name)
    local uci = config.get_uci_cursor()
    local path = uci:get("wireless", radio_name, "path")
    if not path then
        utils.log("wireless.get_phy_from_radio: no path for radio "..radio_name)
        return nil
    end
    -- Find phy by matching device path
    -- Caveat: sysfs links are usually relative (e.g. ../../../0000:00:00.0), so
    -- device_link may need to be canonicalized before this match can succeed
    for phy_dir in fs.dir("/sys/class/ieee80211/") do
        if phy_dir ~= "." and phy_dir ~= ".." then
            local device_link = fs.readlink("/sys/class/ieee80211/"..phy_dir.."/device")
            if device_link and device_link:find(path, 1, true) then
                return phy_dir
            end
        end
    end
    utils.log("wireless.get_phy_from_radio: phy not found for radio "..radio_name.." with path "..path)
    return nil
end
Solution 3: Filter radios during scandevices (Most robust)
Modify wireless.scandevices() to only return radios that have corresponding hardware:
function wireless.scandevices()
    local devices = {}
    local uci = config.get_uci_cursor()
    uci:foreach("wireless", "wifi-device", function(dev)
        -- Check if hardware exists for this radio
        local path = dev.path
        if path then
            local phy_exists = false
            -- Caveat: sysfs links are usually relative, so device_link may need
            -- to be canonicalized before this match can succeed
            for phy_dir in fs.dir("/sys/class/ieee80211/") do
                if phy_dir ~= "." and phy_dir ~= ".." then
                    local device_link = fs.readlink("/sys/class/ieee80211/"..phy_dir.."/device")
                    if device_link and device_link:find(path, 1, true) then
                        phy_exists = true
                        break
                    end
                end
            end
            if phy_exists then
                devices[dev[".name"]] = dev
            else
                utils.log("wireless.scandevices: skipping radio "..dev[".name"].." - hardware not found")
            end
        else
            utils.log("wireless.scandevices: skipping radio "..dev[".name"].." - no path defined")
        end
    end)
    -- ... rest of the function
end
Workaround
Users can work around this by cleaning up stale radio configurations:
# Identify and delete stale radios
for radio in $(uci show wireless | grep "=wifi-device" | cut -d. -f2 | cut -d= -f1); do
    path=$(uci get wireless.$radio.path 2>/dev/null)
    if [ -n "$path" ]; then
        # Check if hardware exists
        if ! ls -d /sys/devices/$path/ieee80211/phy* >/dev/null 2>&1; then
            echo "Stale radio: $radio (path: $path)"
            uci delete wireless.$radio
        fi
    fi
done
uci commit wireless
Related Issues
- #75 - lime-config: generate configs from scratch (stale configs problem)
- #1222 - wifi interface do not always get deconfigured correctly
Impact
- Severity: High - FBW completely breaks on affected systems
- Frequency: Medium - affects routers with hardware changes or stale configs
- User Experience: Critical - prevents initial network setup
Additional Context
This bug was discovered while debugging why FBW was stuck showing "Connection attempt not yet started" in lime-app. The silent crash leaves the system in an inconsistent state with /tmp/scanning locked to true, preventing subsequent scan attempts until manual cleanup.