ath79: use Qualcomm ath10k for QCA9887 devices #4385


CodeFetch
Contributor

@CodeFetch CodeFetch commented Jul 20, 2021

Issues were observed with the Candelatech version of the QCA9887 driver.
It was said that this will not be fixed and that it may be better to use the
official driver instead (which is stable):
greearb/ath10k-ct#180 (comment)

Note: A remaining user of the qca9887-ct driver is the Meraki MR33, which is
not handled by this commit as it is a (probably unaffected) IPQ40xx-based device.

Signed-off-by: Vincent Wiemann <vincent.wiemann@ironai.com>
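
For readers unfamiliar with what such a switch amounts to, here is a minimal sketch of the package swap described above. The package names (`kmod-ath10k-ct`, `ath10k-firmware-qca9887-ct` and their non-ct counterparts) follow OpenWrt's usual naming but are not quoted in this thread, and the helper `use_mainline_ath10k` and the sample package list are purely hypothetical illustrations, not the actual diff of this commit.

```python
# Hypothetical mapping from the Candelatech (-ct) packages to the
# Qualcomm/mainline ones; names are assumed from OpenWrt's package naming.
CT_TO_MAINLINE = {
    "kmod-ath10k-ct": "kmod-ath10k",
    "ath10k-firmware-qca9887-ct": "ath10k-firmware-qca9887",
}


def use_mainline_ath10k(device_packages):
    """Replace -ct driver/firmware entries in a space-separated package list."""
    return " ".join(CT_TO_MAINLINE.get(pkg, pkg) for pkg in device_packages.split())


# Example package list for an imaginary QCA9887 board.
print(use_mainline_ath10k("kmod-usb2 kmod-ath10k-ct ath10k-firmware-qca9887-ct"))
```

Packages that are not in the mapping pass through unchanged, so the same helper leaves unrelated entries alone.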
@neheb
Contributor

neheb commented Jul 20, 2021

Funny.

Locally, I have a device that just crashes with the stock driver/firmware. At least with ath10k-ct it can restart and keep the wifi on.

IIRC, these crashes have something to do with 802.11w.

In any case, these Wave 1 devices are a mess.

@CodeFetch
Contributor Author

CodeFetch commented Jul 21, 2021

@neheb Which device is it? From what I know there are two revisions of the QCA9887. Version 1 seems to be wave 1, but version 2 is wave 2 even if it has a single stream only.
https://www.qualcomm.com/products/qca9887
Maybe that's the reason... Mine is wave 2 and works rock-solid with Qualcomm's driver.

Edit: Looking at the ath10k code it seems there is only revision 1.0 defined... Very strange...
Edit2: Hm... The Qualcomm page and datasheet say that it supports MU-MIMO, but that doesn't make sense with single stream, does it? What else is different in wave 2?
Edit3: As it only supports 80 MHz channels from what I know, this indeed looks like a wave 1 chip... Very confusing.

@neheb
Contributor

neheb commented Jul 21, 2021

It's a QCA9880 device.

My point is, all wave 1 ath10k devices have these issues. It might be something mitigated in silicon on some devices, no idea.

@CodeFetch
Contributor Author

@neheb The QCA9880-AR1A is known to have a bug in silicon: https://patchwork.kernel.org/project/ath10k/patch/20190906215423.23589-1-chunkeey@gmail.com/
The QCA9887, on the other hand, is still being produced.

@s-2
Contributor

s-2 commented Jul 21, 2021

The Qualcomm page and datasheet say that it supports MU-MIMO, but that doesn't make sense with single stream

Maybe that is how they can declare Wave 2 compatibility, when it boils down to trivial software support with no hardware support for actual MU-MIMO required for a single stream... 😂 Reminds me of Chinese "USB 2.0" hubs that only support 12 Mbps Full Speed mode, but the vendor would still claim they conform to the USB 2.0 Specification, which does not actually require the implementation of High Speed mode...

Maybe MU-MIMO on the website is just boilerplate they copied from other devices.

After all, this device somewhat reminds me of the MT7610E: first .11ac device, single stream, bad driver support, even by vendor (however they seem to run quite smoothly on recent OpenWrt meanwhile).

@adschm
Member

adschm commented Jul 22, 2021

Funny.

Locally, I have a device that just crashes with the stock driver/firmware. At least with ath10k-ct it can restart and keep the wifi on.

AFAIK that's kind of the reason why we chose to set -ct as default for all devices. Experiences are different for each device, and not rarely different experiences are made even for the same device. So, having all-ct saves us from the switching-back-and-forth problem.

Apart from that, build setup is much more predictable if you know it's just one default package to deal with. But that has been somewhat watered down by the -small-buffers variant lately.

Finally, the low-memory-patches have been removed for ath10k for 21.02 and newer, so out-of-the-box behavior will be degraded on many systems: 1e27bef

@CodeFetch
Contributor Author

@adschm

Experiences are different for each device, and not rarely different experiences are made even for the same device.

As far as I know, all QCA9887 devices are affected and run more stably with Qualcomm's firmware. Neheb has a different, older-generation chip (QCA9880-AR1A), which is known to have a silicon bug.

The QCA9887 is still in stock and e.g. the GL.iNet GL-AR750 is a very popular device which is running GL-iNet's OpenWrt version with Qualcomm's ath10k driver without known issues or workarounds.
We have tested the JT-OR750i for more than a year now with Qualcomm's firmware and it is stable.

Meanwhile, multiple devices with QCA9887 chips have issues with CT firmware. About a year ago I talked with Ben Greear about these issues and he didn't want to work on "Wave 1" chips anymore. The QCA9887 has only one chain, but the chip is not that old. So maybe it is not "Wave 1 silicon" or something like that. It does not work properly with CT firmware and likely never did.

So I guess OpenWrt actually introduced a regression for the QCA9887 chips when switching over to CT firmware. And this PR tries to fix it.

@an-kaly

an-kaly commented Jul 22, 2021

I also had frequent crashes due to ath10k on my TP-Link Archer C7 v4 since I upgraded to 21.02. I then reverted the firmware file to the 19.07 CT firmware, firmware-2-ct-full-community-22.bin.lede.019, and now have a much more stable system. What I have noticed is that the driver suddenly uses more memory and CPU resources, reports a high signal-to-noise ratio, and then reboots.

@adschm adschm added the target/ath79 pull request/issue for ath79 target label Jul 25, 2021
@rsalvaterra
Member

rsalvaterra commented Jul 26, 2021

@CodeFetch, all ath10k wave 1 devices have issues with 802.11w (and consequently WPA3) which are only mitigated by the ath10k-ct driver and respective firmware. I'm sure @neheb's devices are QCA9880-BR4A (I have two of these cards myself), since the only instance of AR1A hardware I know of is an oversized and unsupported card in the Archer C7 v1.
Defaulting to ath10k for this family of devices would imply a regression. NACK. Since QCA9887 does seem to be a wave 2 device, it should be fine.

@CodeFetch CodeFetch marked this pull request as draft July 26, 2021 20:30
@CodeFetch
Contributor Author

I've converted it to a draft as I haven't experienced the bug myself with OpenWrt master and the information about it isn't clear yet. So I'll do some tests and report back.

@adschm adschm added the work in progress pull request the author is still working on label Aug 17, 2021
@adschm
Member

adschm commented Aug 17, 2021

I'll put the "concerns" tag here, mostly to prevent somebody merging it by accident. It should be a conscious decision.

@adschm adschm added the concerns substantial concerns have been raised, merge/review with extra care label Aug 17, 2021
@chunkeey
Member

chunkeey commented Aug 30, 2021

Uhhh, more data here.

For the MR33: The QCA9887v1 there is only the "Air Marshal". From what I know, this is a marketing ploy on the post-9/11 "sky marshal", because Cisco sells it as follows:

"Air Marshal is Cisco Meraki's wireless intrusion prevention (WIPS) solution. Integrated into every Cisco Meraki access point and centrally managed from the cloud, Air Marshal detects and neutralizes wireless threats, delivering state of the art protection to the most security conscious distributed networks. " (see: Air Marshal WIPS)

The sad part here is that this radio isn't that useful. I guess it would be "ok" for sniffing (if ath10k supported that properly),
but it's not of much use for data transfer. So my best guess is that nobody noticed any instabilities since nobody really needs it.

As for whether the 9887 is wave-1 and seemingly also wave-2:
From what I know the original 9887/hw1.0 was wave-1. But Qualcomm needed that 1x1:1 device when the wave-1 chips were replaced by wave-2 technology. So they re-bagged the 9886/9888 wave-2 and these are ?still? selling as 9887v2. I stumbled on this as well when developing the MR33 OpenWrt port. We needed an actual MR33 to clarify that it is the old v1.

here's the ath10k id from a MR33:
ath10k_pci 0000:01:00.0: qca9887 hw1.0 target 0x4100016d chip_id 0x004000ff sub 0000:0000
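
To compare identification lines like this one across devices, a small parser can help. The following is only a sketch: the regular expression is an assumption derived from the single sample line above, and the names `PROBE_RE` and `parse_probe_line` are made up for illustration.

```python
import re

# Field layout assumed from the one sample line quoted above.
PROBE_RE = re.compile(
    r"ath10k_pci (?P<pci>\S+): (?P<chip>qca\w+) hw(?P<hw>[\d.]+) "
    r"target (?P<target>0x[0-9a-f]+) chip_id (?P<chip_id>0x[0-9a-f]+)"
)


def parse_probe_line(line):
    """Return the identification fields of an ath10k probe line, or None."""
    match = PROBE_RE.search(line)
    return match.groupdict() if match else None


mr33 = parse_probe_line(
    "ath10k_pci 0000:01:00.0: qca9887 hw1.0 target 0x4100016d "
    "chip_id 0x004000ff sub 0000:0000"
)
print(mr33["chip"], mr33["hw"], mr33["target"], mr33["chip_id"])
```

Running the same function over the dmesg output of another board gives a dictionary that can be compared field by field with the MR33 values.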

EDIT:
In my opinion, the MR33 would neither suffer nor really benefit much from a switch. Though, it would be great if the low-memory-footprint option for the QCA9887 were separately available "just for this wifi". That said, I don't think at this point that Qualcomm would port that feature from their SDK to the ath10k driver.

Cheers

@s-2
Contributor

s-2 commented Aug 31, 2021

So they re-bagged the 9886/9888 wave-2 and these are ?still? selling as 9887v2.

So technically, @greearb might be able to attempt compiling wave2 firmware with the number of RF chains reduced to 1 (ignoring the presumably defective ones that did not pass the final silicon tests), assuming this is even possible with the codebase?

I wonder what would happen if we forced loading QCA9888 firmware onto these devices. It would probably crash during initialisation, before the driver even has a chance to disable the second RF chain (again, if this is feasible via the API).

@CodeFetch could you compare the exact ath10k chip ID for JT-OR750i with the one from Meraki?

@chunkeey
Member

Hm, not quite sure what you meant there?

The chain/antennae settings for all ath10k supported devices are part of the individual boarddata(+pre-cal) file combination.

But looking at the patch/commit got me thinking... Because from what I have tested (in the past), the ath10k-ct driver will
work with Qualcomm firmware too. Here is a really old ath10k-ct driver on a 9887v1 (MR33) loading the latest Qualcomm firmware (which is already 2 years old), 10.2.4-1.0-00047.

[    8.732648] ath10k 4.19 driver, optimized for CT firmware, probing pci device: 0x50.
[    8.733265] ath10k_pci 0000:01:00.0: enabling device (0140 -> 0142)
[    8.739919] ath10k_pci 0000:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[    8.974785] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/fwcfg-pci-0000:01:00.0.txt failed with error -2
[...]
[    9.627654] ath10k_pci 0000:01:00.0: qca9887 hw1.0 target 0x4100016d chip_id 0x004000ff sub 0000:0000
[    9.627709] ath10k_pci 0000:01:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 1 testmode 0
[    9.638010] ath10k_pci 0000:01:00.0: firmware ver 10.2.4-1.0-00047 api 2 features no-p2p,ignore-otp,skip-clock-init,mfp,allows-mesh-bcast crc32 62f7565f
[    9.677405] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/QCA9887/hw1.0/board-2.bin failed with error -2
[    9.677445] ath10k_pci 0000:01:00.0: Falling back to user helper
[    9.723276] firmware ath10k!QCA9887!hw1.0!board-2.bin: firmware_loading_store: map pages failed
[    9.825840] ath10k_pci 0000:01:00.0: board_file api 1 bmi_id N/A crc32 546cca0d
[   10.831047] ath10k_pci 0000:01:00.0: wmi print 'P 135 V 16 T 433'
[   10.842573] ath10k_pci 0000:01:00.0: htt-ver 2.1 wmi-op 5 htt-op 2 cal file max-sta 128 raw 0 hwcrypto 1

(Note: It seems to me that everybody here has the older 9887v1, correct? The "new" 9887v2 will use 9888 pci-id values and chip-id)
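
To check from such a log which firmware actually got loaded, the "firmware ver" line can be picked apart. This is a sketch under assumptions: the pattern is derived only from the single line in the log above, `FW_RE` and `parse_firmware_line` are illustrative names, and reading the `mfp` entry as the management frame protection (802.11w) feature flag is my understanding of ath10k's feature names rather than something stated in this thread.

```python
import re

# Layout assumed from the "firmware ver" line in the log above.
FW_RE = re.compile(
    r"firmware ver (?P<version>\S+) api (?P<api>\d+) "
    r"features (?P<features>\S+) crc32 (?P<crc32>[0-9a-f]+)"
)


def parse_firmware_line(line):
    """Return version, api, feature list and crc32 of a firmware line, or None."""
    match = FW_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["features"] = info["features"].split(",")
    return info


fw = parse_firmware_line(
    "ath10k_pci 0000:01:00.0: firmware ver 10.2.4-1.0-00047 api 2 "
    "features no-p2p,ignore-otp,skip-clock-init,mfp,allows-mesh-bcast "
    "crc32 62f7565f"
)
# Assumption: "mfp" marks management frame protection support in the firmware.
print(fw["version"], "mfp" in fw["features"])
```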


7 participants