Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ath79: convert all users of ar832/37 to qca8k #4622

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

neheb
Copy link
Contributor

@neheb neheb commented Sep 30, 2021

qca8k is a DSA driver with potential speedups. On an Archer C7v2, I
observed a ~70mbps iperf speedup from device to router.

Signed-off-by: Rosen Penev rosenp@gmail.com

While this has successfully been tested on some devices, more testing is needed.

Devices that rely on builtin-switch will not work currently and are beyond the scope of this PR.

Review is definitely needed. All these changes were done manually. I'm sure bugs/mistakes crept in.

The best way to test would be to setup wifi access, delete /etc/config/network, and sysupgrade to avoid locking yourself out of the router.

depends on #4619

LED definitions were skipped as that can go in a separate PR. It's not clear if #4562 will get merged. If it does, they will be handled in a separate PR.

@aparcar aparcar added RFT pull request ready for testing target/ath79 pull request/issue for ath79 target labels Sep 30, 2021
@castillofrancodamian
Copy link

So the Archer C60 v1 with AR8229 wouldn't benefit from this?

@neheb
Copy link
Contributor Author

neheb commented Oct 1, 2021

nope. that uses the builtin switch.

@neheb
Copy link
Contributor Author

neheb commented Oct 1, 2021

ETH chip1: Qualcomm Atheros QCA9561
Switch: Qualcomm Atheros QCA9561
LAN speed: 100M
LAN ports: 4
WAN speed: 100M
WAN ports: 1 

sad.

@adschm adschm added the work in progress pull request the author is still working on label Oct 1, 2021
@adschm
Copy link
Member

adschm commented Oct 1, 2021

I'm putting "blocked" here, since as I understand it, this is only for testing and would break the devices still needing swconfig in its current state?

@adschm adschm added the blocked pull request blocked by/waiting for another change label Oct 1, 2021
@neheb
Copy link
Contributor Author

neheb commented Oct 1, 2021

@adschm I just tested. swconfig is not the issue. AR8216 is. Removed the swconfig removal. Other devices should work again.

@adschm adschm removed the blocked pull request blocked by/waiting for another change label Oct 1, 2021
@neheb
Copy link
Contributor Author

neheb commented Oct 1, 2021

@eisaev do you have any idea why this xiaomi switch setup is so strange?

Conf for cpu port0, Mode sgmii
Conf for cpu port5, Mode rgmii-rxid
Absent config (regs set to 0x0) for pad6. Not attached port6?
Assume qca8327. Set qca,package48 in switch node

for all other devices, port 6 is used for the second CPU eth, not port 5.

Also, how come this is ar8327 on a device with qca9563?

@neheb
Copy link
Contributor Author

neheb commented Oct 1, 2021

Patches went in the other PR. Removed from here.

@rbpp
Copy link

rbpp commented Oct 2, 2021

Testing on my C7 shows lan1 and lan3 not working properly. I have cameras connected to these ports and sometimes they connect after a while and sometimes they simply don't (log shows all ports are up though).

This C7 is a dumb router connected to main router via lan2. If I switch the cable from lan2 to lan1, I also lose connection to the router.

Also, is ip link set lan1 link eth1 supposed to be working? It does not work for me.

Others having similar issues?

@neheb
Copy link
Contributor Author

neheb commented Oct 2, 2021

I've seen that issue with the multi CPU DSA patch. Its current incarnation is broken for some reason.

Should not be an issue otherwise.

@neheb
Copy link
Contributor Author

neheb commented Oct 3, 2021

@Ansuel so it turns out CONFIG_AR8216 is needed for users of the builtin switch. However when I remove the removal, the router's switch takes extra long to come up. I see this in dmesg:

[   45.329946] ar8216: Unknown Atheros device [ver=255, rev=111]
[   89.354543] ar8216: Unknown Atheros device [ver=255, rev=111]
[  133.375392] ar8216: Unknown Atheros device [ver=255, rev=111]

Any idea why this driver is being loaded? I see no compatible strings that should cause this.

@Ansuel
Copy link
Member

Ansuel commented Oct 3, 2021

@Ansuel so it turns out CONFIG_AR8216 is needed for users of the builtin switch. However when I remove the removal, the router's switch takes extra long to come up. I see this in dmesg:

[   45.329946] ar8216: Unknown Atheros device [ver=255, rev=111]
[   89.354543] ar8216: Unknown Atheros device [ver=255, rev=111]
[  133.375392] ar8216: Unknown Atheros device [ver=255, rev=111]

Any idea why this driver is being loaded? I see no compatible strings that should cause this.

Bad driver... It does check for phy and delay boot. You can notice there is a while and a massive timeout.... We can consider adding a special binding to skip specific phy... If you want I can prepare a patch... Consider it won't be ever upstreamed, we can consider adding this hack... Or reduce the timeout/while times

@neheb
Copy link
Contributor Author

neheb commented Oct 3, 2021

reducing timeout sounds better.

@adschm
Copy link
Member

adschm commented Oct 7, 2021

ar832/37

I'm a little confused by that identifier, do you mean ar8327 and ar8337, i.e. ar83{2,3}7?

@rbpp
Copy link

rbpp commented Oct 7, 2021

Testing this PR on my C7 with its original version, I have problems with some devices on certain ports as reported before. I'm now testing this PR + patch "096-34-drivers-net-dsa-qca8k-add-support-for-multi-cpu" from Ansuel's PR #4036 and everything now works perfectly. Also, "ip link set lan1 link eth1" works too.

@neheb
Copy link
Contributor Author

neheb commented Oct 7, 2021

ar832/37

I'm a little confused by that identifier, do you mean ar8327 and ar8337, i.e. ar83{2,3}7?

Yes

@neheb
Copy link
Contributor Author

neheb commented Oct 11, 2021

Updated PR. This now requires the qca8k backport commit to be updated. Some DTS changes were made.

@neheb
Copy link
Contributor Author

neheb commented Sep 12, 2022

@mans0n that delay is because this PR no longer removes CONFIG_AR8216. Remove it and it'll work fine.

This PR in its current state cannot be merged. ath79 as a target is huge and has devices for which no DSA drivers exist(realtek switches).

@systemcrash
Copy link

@systemcrash we can still keep the swconfig.

Thing is that the regression DSA cause it too big for the benefits so if swconfig is still usable lets stick with it.

This gonna make the wikis a minefield.

Maybe we should just ask in the forums. Is it possible to run just an anonymous vote?

ath79 backs (one of) the largest amount of downloads for existing platforms:

https://downloads.openwrt.org/stats/

@Ansuel
Copy link
Member

Ansuel commented Oct 3, 2022

For anyone that wants to still test this....

Can you guys test this patch?

https://gist.github.com/Ansuel/70eb89a00443180fd2ab04c7ea0f2b3c

Should fix the increased port setup time.

@Leo-PL
Copy link
Contributor

Leo-PL commented Oct 3, 2022

@Ansuel I'll pick it up later today. Just did a big rebase of all my patches yesterday, before ipq40xx DSA was merged, but haven't flashed anything yet. I'll do one more rebase to get that one as well.

@Ansuel
Copy link
Member

Ansuel commented Oct 3, 2022

@Leo-PL as said it was reported increased time for the first port setup. This should fix the problem

@Ansuel
Copy link
Member

Ansuel commented Oct 6, 2022

@Leo-PL This should be the final patch

https://gist.github.com/Ansuel/d1c45184fd00b5e07fa1e1757d0fa996

The main 2 test are check the time and the correctness of the 2 command

  • ethtool -S lan1 (or whatever port)
  • bridge fdb

If all works correctly i would love some Tested-by tag as I need to push the fix upstream.

@Ansuel
Copy link
Member

Ansuel commented Oct 8, 2022

@Leo-PL did you find time to test the patch? the powerpc target confirmed that all works good so i'm waiting for ath79 so i can push the fix upstream

@Leo-PL
Copy link
Contributor

Leo-PL commented Oct 8, 2022

@Ansuel I wrote on IRC that I had no traffic after applying it, I need to rebase. Where can I find the latest backports for qca8k to base this on?

@Ansuel
Copy link
Member

Ansuel commented Oct 8, 2022 via email

@Leo-PL
Copy link
Contributor

Leo-PL commented Oct 8, 2022

Maybe then I shall drop multi-CPU patch and LED patch temporarily, and just use 5.15?

@Ansuel
Copy link
Member

Ansuel commented Oct 8, 2022 via email

@Leo-PL
Copy link
Contributor

Leo-PL commented Oct 9, 2022

Okay, I'll build just master with only this patch applied.

@Leo-PL
Copy link
Contributor

Leo-PL commented Oct 10, 2022

@Ansuel when flashed without clearing config, I had the same connection problems, but factory reset sorted them out. Then I found out, that your patch wasn't applied, because I put it only in hack-5.10, LOL.
Before:

root@OpenWrt:~# time ethtool -S lan1
NIC statistics:
     tx_packets: 1019
     tx_bytes: 108802
     rx_packets: 36472
     rx_bytes: 4487520
     RxBroad: 554369024
     RxPause: 0
     RxMulti: 4270456832
     RxFcsErr: 0
     RxAlignErr: 0
     RxRunt: 0
     RxFragment: 0
     Rx64Byte: 1803128700808462336
     Rx128Byte: 1474084453033705472
     Rx256Byte: 1879126944520339456
     Rx512Byte: 9370583449674448896
     Rx1024Byte: 3530822107858468864
     Rx1518Byte: 13402993966031306752
     RxMaxByte: 0
     RxTooLong: 0
     RxGoodByte: 4269505905235591168
     RxBadByte: 0
     RxOverFlow: 0
     Filtered: 14847523546510393344
     TxBroad: 216172782113783808
     TxPause: 0
     TxMulti: 1801439850948198400
     TxUnderRun: 0
     Tx64Byte: 936748722493063168
     Tx128Byte: 5837509542002294784
     Tx256Byte: 10448351135499550720
     Tx512Byte: 1513209474796486656
     Tx1024Byte: 72057594037927936
     Tx1518Byte: 0
     TxMaxByte: 0
     TxOverSize: 0
     TxByte: 3511964383930482688
     TxCollision: 0
     TxAbortCol: 0
     TxMultiCol: 0
     TxSingleCol: 0
     TxExcDefer: 0
     TxDefer: 0
     TxLateCol: 0
real    0m 2.87s
user    0m 0.00s
sys     0m 0.00s
root@OpenWrt:~# time bridge fdb
33:33:00:00:00:01 dev eth0 self permanent
33:33:00:00:00:02 dev eth0 self permanent
01:00:5e:00:00:01 dev eth1 self permanent
33:33:00:00:00:01 dev eth1 self permanent
33:33:00:00:00:02 dev eth1 self permanent
33:33:ff:4f:fb:d2 dev eth1 self permanent
33:33:ff:00:00:00 dev eth1 self permanent
<snip>
33:33:00:00:00:01 dev wlan0 self permanent
33:33:00:00:00:02 dev wlan0 self permanent
33:33:00:00:00:01 dev wlan1 self permanent
33:33:00:00:00:02 dev wlan1 self permanent
33:33:00:00:00:01 dev br-lan self permanent
33:33:00:00:00:02 dev br-lan self permanent
01:00:5e:00:00:01 dev br-lan self permanent
33:33:ff:00:00:01 dev br-lan self permanent
33:33:ff:4f:fb:d2 dev br-lan self permanent
33:33:ff:00:00:00 dev br-lan self permanent
33:33:00:01:00:02 dev br-lan self permanent
33:33:00:01:00:03 dev br-lan self permanent
real    9m 24.47s
user    0m 0.04s
sys     0m 0.00s

After:

root@OpenWrt:~# time ethtool -S lan1
NIC statistics:
     tx_packets: 79
     tx_bytes: 12270
     rx_packets: 1180
     rx_bytes: 128138
     RxBroad: 152
     RxPause: 0
     RxMulti: 1699
     RxFcsErr: 0
     RxAlignErr: 0
     RxRunt: 0
     RxFragment: 0
     Rx64Byte: 75
     Rx128Byte: 1458
     Rx256Byte: 276
     Rx512Byte: 140
     Rx1024Byte: 1
     Rx1518Byte: 2
     RxMaxByte: 0
     RxTooLong: 0
     RxGoodByte: 246190
     RxBadByte: 0
     RxOverFlow: 0
     Filtered: 772
     TxBroad: 4
     TxPause: 0
     TxMulti: 29
     TxUnderRun: 0
     Tx64Byte: 1
     Tx128Byte: 66
     Tx256Byte: 12
     Tx512Byte: 6
     Tx1024Byte: 1
     Tx1518Byte: 2
     TxMaxByte: 0
     TxOverSize: 0
     TxByte: 13348
     TxCollision: 0
     TxAbortCol: 0
     TxMultiCol: 0
     TxSingleCol: 0
     TxExcDefer: 0
     TxDefer: 0
     TxLateCol: 0
real    0m 0.13s
user    0m 0.00s
sys     0m 0.06s
root@OpenWrt:~# time bridge fdb
33:33:00:00:00:01 dev eth0 self permanent
33:33:00:00:00:02 dev eth0 self permanent
01:00:5e:00:00:01 dev eth1 self permanent
33:33:00:00:00:02 dev eth1 self permanent
33:33:00:00:00:01 dev eth1 self permanent
33:33:ff:4f:fb:d2 dev eth1 self permanent
33:33:ff:00:00:00 dev eth1 self permanent
<snip>
33:33:00:00:00:01 dev wlan0 self permanent
33:33:00:00:00:02 dev wlan0 self permanent
33:33:00:00:00:01 dev wlan1 self permanent
33:33:00:00:00:02 dev wlan1 self permanent
33:33:00:00:00:01 dev br-lan self permanent
33:33:00:00:00:02 dev br-lan self permanent
01:00:5e:00:00:01 dev br-lan self permanent
33:33:ff:00:00:01 dev br-lan self permanent
33:33:00:01:00:02 dev br-lan self permanent
33:33:00:01:00:03 dev br-lan self permanent
33:33:ff:4f:fb:d2 dev br-lan self permanent
33:33:ff:00:00:00 dev br-lan self permanent
real    0m 0.11s
user    0m 0.01s
sys     0m 0.04s

The results speak for themselves.

So,
Tested-by: Lech Perczak <lech.perczak@gmail.com>

Regarding the patch itself, I would optimize the loop condition in both loops replacing memcpy:

i <= QCA_HDR_MGMT_DATA2_LEN &&
i <= (len - QCA_HDR_MGMT_DATA1_LEN);

with single comparison like this, with data_len precalculated outside of the loop:

int data_len = min_t(int, QCA_HDR_MGMT_DATA2_LEN, len - QCA_HDR_MGMT_DATA1_LEN)
[...]
i <= data_len

With that in mind,
Reviewed-by: Lech Perczak <lech.perczak@gmail.com>

Edit: forgot to mention. This was done on Archer C7v2.

@Ansuel
Copy link
Member

Ansuel commented Oct 10, 2022

@Leo-PL thanks a lot for testing this! (the 9 minutes is really funny ahahhaha)
Also thanks for the suggestion of the macro, that makes the code more readable and it was indeed a concern i had. With the suggested change it's much better!

@Leo-PL
Copy link
Contributor

Leo-PL commented Oct 10, 2022

@Leo-PL thanks a lot for testing this! (the 9 minutes is really funny ahahhaha) Also thanks for the suggestion of the macro, that makes the code more readable and it was indeed a concern i had. With the suggested change it's much better!

Great minds think alike ;->

@Ansuel
Copy link
Member

Ansuel commented Oct 14, 2022

@Leo-PL patch got accepted upstream and i backported it on master. Now everything should work correctly.

@tiagogaspar8
Copy link
Contributor

@Ansuel I've been sort of daily driving this configuration and noticed today that eee is disabled by default, is there an easy patch that can be applied to change this? And shouldn't this be enabled upstream? From my linux knowledge eee is always enabled by default
Sorry to bother you with this but I didn't know who to ask for help

@Ansuel
Copy link
Member

Ansuel commented Dec 22, 2022

@Ansuel I've been sort of daily driving this configuration and noticed today that eee is disabled by default, is there an easy patch that can be applied to change this? And shouldn't this be enabled upstream? From my linux knowledge eee is always enabled by default
Sorry to bother you with this but I didn't know who to ask for help

Hi, EEE is disabled per QCOM qsdk configuration. I applied the same values of the original driver. AFAIK it was problematic with some devices. Feel free to enable it with ethtool

@tiagogaspar8
Copy link
Contributor

tiagogaspar8 commented Dec 25, 2022

Hi, I see, this is a great power-saving feature, it's sad that it has issues in this chip, especially since it's used with many routers...
Thanks for the help!
As for this pull request? any news?🙁🙁🙁

@neheb
Copy link
Contributor Author

neheb commented Dec 25, 2022

Nope. ath79 is too big of a target. There's also no support for some of the builtin switches. Conversion has to start out small.

@tiagogaspar8
Copy link
Contributor

Ok, I would be able to migrate for example the archer c7 v4 which is my code, but this requires one to turn off the swconfig driver as it interferes with the DSA one.
Since that's not possible and it's not possible to migrate the whole ath79 tree, would this be possible by subtargets, although a per device would be the best
May I suggest the rename of the current subtargets to old_generic/tiny/etc. And have the plain name ones, for example generic, for the converted devices? This would allow separate config and progressive but fast conversion.

@neheb
Copy link
Contributor Author

neheb commented Dec 25, 2022

subtarget is the best way forward I think.

@tiagogaspar8
Copy link
Contributor

Is there anything I can do/help with? 😄

@Leo-PL
Copy link
Contributor

Leo-PL commented Dec 27, 2022

subtarget is the best way forward I think.

@Ansuel do I recall correctly, that swconfig driver for ar8237 series was refactored into a package? If so, we could leverage DEVICE_PACKAGES to migrate individual devices. We could achieve the same by enforcing matching DT compatible strings in it, to avoid collisions with DSA driver, and then proceed with individual devices, but at cost of rootfs size. We'd need to include both in DEFAULT_PACKAGES.

@Ansuel
Copy link
Member

Ansuel commented Dec 27, 2022

@Leo-PL yes... also with no compatible the qca8k driver is not loaded at all... So we can just add the swconfig kmod for legacy device and add the related compatible (and NOT install the swconfig kmod)

@tew42
Copy link

tew42 commented May 24, 2023

Wanted to check in if this is becoming any more realistic now that 23.05 branch/5.15 kernel is here? From my earlier understanding the 5.10 kernel was part of the reason why this couldn't move forward (another being the potential need for subtargets, which I guess a new branch is also a good opportunity for).

Once I see some indications that 23.05 won't immediately brick my Archer A7, I'm also happy to help test things out.

@neheb
Copy link
Contributor Author

neheb commented May 24, 2023

Yes and no. The swconfig driver issue needs to be fixed first. It's complicated as some ath79 devices use swconfig for realtel switches with no DSA driver available.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
RFT pull request ready for testing target/ath79 pull request/issue for ath79 target work in progress pull request the author is still working on
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet