Ubiquiti AC devices dont return any devices during a scan #888
Comments
Can we flag this as a bug? That would be helpful when looking at the list of issues. |
For my notes (and anyone interested), here's a comparison of the WMI events (basically the wifi chipset+firmware talking to the kernel) for a Mikrotik AC device and a UBNT AC device: Mikrotik AC:
UBNT AC:
The distinction in the scan data is that while both show an ever-changing frequency for the "foreign" events, the UBNT always shows the same frequency for the bss events, while the Mikrotik shows the same frequency as the preceding foreign event. That is why we get sensible scans from the Mikrotik devices but not the UBNT ones. |
I have a LiteBeam 5AC running nightly build 20231212-ee0dd54 which seems to be showing the same thing. The symptom is that the web UI's "WiFi Scan" almost never lists anything other than nodes on the channel I've already set. Since scan doesn't work at all, and you need a working Internet/meshmap to compensate, that is a significant limit on usefulness in an emergency situation... My node is currently using channel 160, frequency 5800. It's pointed at a nest of other AREDN nodes, on channel/freqs 137/5685, 142/5710, 149/5745, 152/5760, 160/5800, 162/5810, 182/5910, all with 10 MHz bandwidth. I know I can connect to the 149 and 160 nodes, but the "foreign" one doesn't appear in the scans. I got a list of WMI events by logging in as root with ssh (port 2222) and running echo 0x10002002 > /sys/module/ath10k_core/parameters/debug_mask, then iw wlan0 scan passive, and dumped them with dmesg. I'm attaching two logs here: one when I was configured to use channel 160, and one when I was using channel 149. litebeam5ac-149.txt litebeam5ac-160.txt Since WMI is a firmware/driver interface, does that mean that something is wrong with the firmware (supplied by Ubiquiti)? Or could there be a driver problem, in which case the Linux ath10k developers should be made aware of it? |
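The capture procedure from this comment (raise the ath10k debug mask, run a passive scan, dump the kernel log) could be wrapped in a small helper along these lines. This is only a sketch: the function name `capture_wmi_log`, the `ATH10K_DEBUG_MASK_FILE` override, and the default output path are illustrative assumptions, not part of AREDN. The mask value and the commands are the ones quoted in this thread.

```shell
# Sketch only: capture_wmi_log and ATH10K_DEBUG_MASK_FILE are hypothetical names.
# Usage (as root on the node): capture_wmi_log wlan0 /tmp/wmi-scan.log
capture_wmi_log() {
    wmi_iface="${1:-wlan0}"
    wmi_out="${2:-/tmp/wmi-scan.log}"
    wmi_mask_file="${ATH10K_DEBUG_MASK_FILE:-/sys/module/ath10k_core/parameters/debug_mask}"

    # Remember the current mask so it can be restored afterwards.
    wmi_old_mask="$(cat "$wmi_mask_file")" || return 1

    # Discard older kernel messages so the log covers only this scan.
    dmesg -c > /dev/null

    # Debug mask value taken from the thread.
    echo 0x10002002 > "$wmi_mask_file"

    # The scan results themselves are not needed here, only the driver log.
    iw "$wmi_iface" scan passive > /dev/null

    # Save the kernel log produced during the scan.
    dmesg > "$wmi_out"

    # Put the previous mask back to stop the extra logging.
    echo "$wmi_old_mask" > "$wmi_mask_file"
}
```

Restoring the previous mask is a design choice of this sketch; the original comment does not say whether the mask was reset after the capture.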
That's correct. It's busted for Ubiquiti -ac devices. When the firmware
(binary blob in the driver) was adopted it led to very bad performance. So
it was backed out of the AREDN code for the time being.
Orv W6BI
…On Wed, Feb 14, 2024, 11:33 AM Hollis Blanchard ***@***.***> wrote:
I have a LiteBeam 5AC running nightly build 20231212-ee0dd54 which seems
to be showing the same thing. The symptom is that the web UI's "WiFi Scan"
almost never lists anything other than nodes on the channel I've already
set.
Since scan doesn't work at all, and you need a working Internet/meshmap to
compensate, that is a significant limit on usefulness in an emergency
situation...
My node is currently using channel 160, frequency 5800. It's pointed at a
nest of other AREDN nodes, on channel/freqs 137/5685, 142/5710, 149/5745,
152/5760, 160/5800, 162/5810, 182/5910, all with 10 MHz bandwidth. I know I
can connect to the 149 and 160 nodes, but the "foreign" one doesn't appear
in the scans.
I got a list of WMI events by logging in as root with ssh (port 2222) and
running
echo 0x10002002 > /sys/module/ath10k_core/parameters/debug_mask, iw wlan0
scan passive, and dumped them with dmesg. I'm attaching two logs here:
one when I was configured to use channel 160, and one when I was using
channel 149.
litebeam5ac-149.txt
<https://github.com/aredn/aredn/files/14285237/litebeam5ac-149.txt>
litebeam5ac-160.txt
<https://github.com/aredn/aredn/files/14285238/litebeam5ac-160.txt>
Since WMI is a firmware/driver interface, does that mean that something is
wrong with the firmware (supplied by Ubiquiti)? Or could there be a driver
problem, in which case the Linux ath10k developers should be made aware of
it?
|
What is the next step with this?
|
Try testing with current openwrt image. Possibly, would also need to configure an adhoc connection in openwrt. Does the scan also not return anything with an iw passive scan? If so, this could be submitted as a defect back to openwrt or the package owner of iw. (Maybe already a defect upstream?) |
Yes, the problem occurs with passive scans (I included logs above). The latest OpenWRT build for LiteBeam AC Gen2 is 23.05.02, using ath10k-firmware-qca988x-ct. How can I figure out which firmware vendor and version the OpenWRT and AREDN builds are using, in order to compare them? @Orv is #1055 the "bad performance" issue that forced the switch away from firmware with working scan? I don't see mention of switching firmware there (from what to what?) so I'm not sure... |
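One way to compare the two builds is to look at what the driver reports at load time and which firmware packages are installed. The following is a hedged sketch, assuming the ath10k driver writes a `firmware ver` line to the kernel log when it loads (and that the line has not yet scrolled out of the log) and that `opkg` is available on the image; both are worth verifying on the actual device. The function name is illustrative.

```shell
# Sketch only: ath10k_firmware_info is a hypothetical helper name.
ath10k_firmware_info() {
    # Firmware version as reported by the driver when it loaded.
    dmesg | grep -i 'ath10k' | grep -i 'firmware ver'

    # Installed ath10k packages, if opkg is present on this image.
    if command -v opkg > /dev/null 2>&1; then
        opkg list-installed | grep -i 'ath10k'
    fi
}
```

Running it on both the OpenWRT and the AREDN image and comparing the output side by side would show whether the firmware version strings and package names differ.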
@slightlyunconventional I see logs above from AREDN images on the devices. After installing the openwrt image on the same device(s), is the problem reproducible? If the issue occurs in openwrt, then it is a problem upstream and a defect can be submitted. If the issue is not reproducible in openwrt, then the issue is introduced locally by AREDN -- some incompatibility introduced in using 3 different 'things': a) chip firmware (loaded into the chipset); b) ath10k-firmware* (linux driver); c) iw (user app). These 3 things are different in AREDN than in openwrt -- we've modified them to extend channels, and are reusing DD-WRT firmware loaded into the chip (the source code is proprietary and we don't have access) -- it is a black box. |
I flashed the OpenWRT image using the web interface without any trouble. A passive scan with the 802.11AC interface lists 41 networks, all in the 5GHz range, most with channel width 1 (80 MHz), some with width 0 ("20 or 40 MHz"). (The AREDN SSID isn't listed, but the SSID is blank for about half the networks. AREDN was using 10MHz channels...)
There's some stuff in debugfs, but I don't know what would help:
I'm pretty familiar with Linux (kernel and userspace), but not familiar with these devices or the wireless network stack. @ae6xe can you suggest ways to get more debug data that would be helpful? Thanks! |
See this ref to debug ath10k: https://wireless.wiki.kernel.org/en/users/drivers/ath10k/debug [update] |
Turns out OpenWRT has CONFIG_PACKAGE_ATH_DEBUG disabled, which explains why the command I showed you from that page didn't produce anything. I am now building OpenWRT for LiteBeam 5AC Gen2, with CONFIG_PACKAGE_ATH_DEBUG enabled, and am hoping that will boot and give us what we need. If we see the same broken WMI log, that suggests OpenWRT has a workaround higher up the stack that AREDN should grab. If the WMI log looks better, that suggests the firmware OpenWRT is using is better, and AREDN should adopt that. (I think that already matches @Orv 's comment though: maybe it's better at scanning, but worse in performance.) |
openwrt's implementation doesn't have the code/changes for extended channel support and 5/10 MHz channels. All 3 levels (chip firmware, linux driver, user commands) have changes to support these add-on features. A test on openwrt will confirm which side of the fence the issue exists -- pre-existing openwrt issue, or AREDN specific features introduced issue. While I'd speculate it is AREDN features introducing this issue, it is better to know and not speculate. We are adding features with a blindfold on, given we have no visibility to the proprietary source code of the Qualcomm chip firmware. |
I think I just did that test, right? On AREDN, It sounded like you wanted more detailed information from the WMI log from OpenWRT's firmware, so that's what I'm trying to obtain now. If you don't need that, I'll stop. |
I don't know if you've tried this, but have you run the AREDN scanning code while the device is set to 20MHz bandwidth? |
Got it. In the openwrt test, was it configured for ad-hoc mode, and not AP or station/client mode? Ad-hoc mode has been known to be lower quality, given it is not widely used. Although, I don't know 100% if this scan failure has a dependency on the 802.11 mode in use; maybe it doesn't. Similar to @aanon4's comment, it would help to compare apples to apples in openwrt versus AREDN (20 MHz channel, ad-hoc mode, stations connected/non-connected, same part 15 channel). A failure in AREDN and success in openwrt helps to rule out places to look for the root cause. |
Each time I changed the two settings in the web UI, rebooted, and ran this command: (The WMI logs I attached earlier were from the 10MHz cases, #2 and #3.) Please let me know if an OpenWRT comparison is still useful. If I can do it from LuCI that's fine, but if from the cmdline I'll need further assistance in getting the right set of |
Attaching WMI log for channel 149 bandwidth 20MHz: channel149-width20.log |
FYI. If I use the DD-WRT firmware blob for the Ubiquiti AC devices (we use the CT firmware) then you do get a full scan @ 10 MHz but at the cost of approximately half the bandwidth during normal operation ... so not an option. |
Here's an interesting thing to try. Put the AREDN device in 20 MHz mode. Then switch it back to 10 MHz mode. Do a scan. The first fails. Do another scan .. and you get back a list of devices. Well .. I tried this a couple of times and it worked both times. Not quite sure what this is telling me, but something ... |
Okay ... the trick appears to be that the Ubiquiti driver cannot scan when it has an active IBSS. If you disassociate the wifi using "iw wlan0 ibss leave" and then scan you get results. There may be a moment during re-association when scanning also works, but it's brief (and may not be really there .. difficult to tell in my testing). |
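The workaround described above (leave the IBSS, scan, then rejoin) might look roughly like this. It is a sketch rather than AREDN's actual code: the function name and arguments are illustrative, and the SSID, frequency, and channel-width token passed to `iw ... ibss join` are assumptions that have to match the node's real configuration.

```shell
# Sketch only: scan_outside_ibss is a hypothetical helper name.
# Usage: scan_outside_ibss wlan0 "<mesh SSID>" 5800 10MHz
scan_outside_ibss() {
    if [ "$#" -ne 4 ]; then
        echo "usage: scan_outside_ibss IFACE SSID FREQ_MHZ WIDTH" >&2
        return 2
    fi
    sc_iface="$1"
    sc_ssid="$2"
    sc_freq="$3"
    sc_width="$4"

    # Drop the ad-hoc association; per the thread, scans only work without it.
    iw dev "$sc_iface" ibss leave

    # Run the scan and keep its exit status for the caller.
    iw dev "$sc_iface" scan passive
    sc_status=$?

    # Rejoin the mesh whether or not the scan succeeded.
    iw dev "$sc_iface" ibss join "$sc_ssid" "$sc_freq" "$sc_width"

    return "$sc_status"
}
```

The node is off the mesh between the `leave` and the `join`, so anything relying on the link is interrupted for the duration of the scan.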
This suggests the error may be upstream. If a workaround drops the
adhoc connection, doesn't this risk breaking any active sockets and causing
a chain of application failures for end users?
|
There are various workarounds in the AREDN code to handle firmware blobs which don't behave as one might like ... but not much we can really do about it. |
Comment from another user:
|
Back in the day most wifi devices would disconnect while they did the scan. These days most chipsets don't do that in a visible way, although it still happens down in the low level firmware because there is still only one radio. Unfortunately with the Ubiquiti AC firmware we're back to the old way of doing things, so there is a momentary disconnect which must happen before the wifi scan can be started. |
If you run a scan on a Ubiquiti AC device, you don't see any devices. In fact, the underlying command for this, "iw wlan0 scan passive", doesn't return any devices except yourself (an active scan does the same). This works fine on non-AC devices and on AC devices from Mikrotik, so something low level is amiss here.
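A quick way to see the symptom is to tally scan results by frequency; on an affected node the output would be expected to show little beyond the node's own frequency. The helper below is an illustrative sketch (the name `scan_freq_summary` is made up here). It assumes `iw` prints a `freq:` line for each BSS, which is worth confirming against the installed `iw` version.

```shell
# Sketch only: scan_freq_summary is a hypothetical helper name.
# Usage: iw wlan0 scan passive | scan_freq_summary
scan_freq_summary() {
    # Count how many BSS entries were reported on each frequency,
    # then print "frequency count" pairs in ascending frequency order.
    awk '$1 == "freq:" { count[$2]++ }
         END { for (f in count) print f, count[f] }' | sort -n
}
```

Comparing this summary between a Mikrotik AC node and a Ubiquiti AC node pointed at the same set of neighbours would make the difference described here easy to spot.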