KSZ DSA driver transmission speed issue #86

Open
darshankiran opened this issue Jul 8, 2022 · 5 comments

@darshankiran

darshankiran commented Jul 8, 2022

We are using the KSZ9477 DSA driver and are facing an issue while transmitting data: the bitrate does not go above 2 Mbits/sec. The issue is not seen while receiving/downloading.

We tried loading the spi_ksz9877 driver instead and got the proper bandwidth/bitrate. The issue is seen only with the DSA driver.

Here are the iperf logs:

iperf3 -c 10.0.0.187
Connecting to host 10.0.0.187, port 5201
[ 5] local 10.0.0.25 port 51030 connected to 10.0.0.187 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 38.5 MBytes 323 Mbits/sec 0 1.53 MBytes
[ 5] 1.00-2.00 sec 33.8 MBytes 283 Mbits/sec 0 1.86 MBytes
[ 5] 2.00-3.00 sec 37.5 MBytes 315 Mbits/sec 0 2.32 MBytes
[ 5] 3.00-4.00 sec 33.8 MBytes 283 Mbits/sec 0 2.45 MBytes
[ 5] 4.00-5.00 sec 32.5 MBytes 273 Mbits/sec 0 2.57 MBytes
[ 5] 5.00-6.00 sec 35.0 MBytes 294 Mbits/sec 0 2.72 MBytes
[ 5] 6.00-7.00 sec 53.8 MBytes 451 Mbits/sec 0 3.02 MBytes
[ 5] 7.00-8.00 sec 35.0 MBytes 294 Mbits/sec 0 3.02 MBytes
[ 5] 8.00-9.00 sec 33.8 MBytes 283 Mbits/sec 0 3.02 MBytes
[ 5] 9.00-10.00 sec 36.2 MBytes 304 Mbits/sec 0 3.02 MBytes


[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 370 MBytes 310 Mbits/sec 0 sender
[ 5] 0.00-10.00 sec 369 MBytes 309 Mbits/sec receiver

iperf Done.

iperf3 -c 10.0.0.187 -R
Connecting to host 10.0.0.187, port 5201
Reverse mode, remote host 10.0.0.187 is sending
[ 5] local 10.0.0.25 port 51034 connected to 10.0.0.187 port 5201
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 277 KBytes 2.27 Mbits/sec
[ 5] 1.00-2.00 sec 264 KBytes 2.17 Mbits/sec
[ 5] 2.00-3.00 sec 243 KBytes 1.99 Mbits/sec
[ 5] 3.00-4.00 sec 212 KBytes 1.74 Mbits/sec
[ 5] 4.00-5.00 sec 249 KBytes 2.04 Mbits/sec
[ 5] 5.00-6.00 sec 246 KBytes 2.02 Mbits/sec
[ 5] 6.00-7.00 sec 320 KBytes 2.62 Mbits/sec
[ 5] 7.00-8.00 sec 291 KBytes 2.39 Mbits/sec
[ 5] 8.00-9.00 sec 273 KBytes 2.24 Mbits/sec
[ 5] 9.00-10.00 sec 160 KBytes 1.31 Mbits/sec


[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 2.57 MBytes 2.16 Mbits/sec 348 sender
[ 5] 0.00-10.00 sec 2.48 MBytes 2.08 Mbits/sec receiver

iperf Done.

Is this issue related to the driver?

Thanks
Darshan

@ewhac

ewhac commented Jul 29, 2022

You are not alone; we've noticed this, too.

@triha2work
Collaborator

The DSA driver requires no MAC driver change. The compromise is that MAC acceleration features cannot be used, because a tag needs to be added at the end of the frame. That means no hardware checksum generation and no scatter/gather transmit operation. By itself this slows down transmit operation, but not by much. However, older kernels have a weird problem where basic TCP transmission drastically impacts transmit performance. Newer kernels like 5.10 seem to correct this problem.
This is a software or kernel problem: the MAC driver can advertise that it supports hardware acceleration but then do everything in software before sending the frame, and this will bring performance up to the expected level.
The SAMA5D3 used in the KSZ9477 evaluation board has another issue: a new fix in the mainline kernel drastically reduces transmit performance because software needs to calculate the CRC for every frame sent. This causes TCP throughput to drop below 90 Mbps, while typical throughput is 150 Mbps. This can be worked around by not using the fix, as it is not needed for normal operation.
Now the numbers shown here are much lower so I do not know what is going on.
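
To give a rough sense of the per-frame work involved when the CRC has to be computed in software, here is a minimal user-space sketch of a bitwise CRC-32 using the polynomial of the Ethernet frame check sequence. It is an illustration only, not the kernel's or the MAC driver's code (real implementations are normally table-driven and faster), and the function name `crc32_sw` is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), the CRC used for the
 * Ethernet FCS. Hypothetical helper for illustration: every byte of the
 * frame goes through eight shift/xor steps, so the CPU cost grows linearly
 * with frame length. */
uint32_t crc32_sw(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;            /* standard initial value */

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return ~crc;                           /* final inversion */
}
```

Calling `crc32_sw` on the nine ASCII bytes `"123456789"` returns `0xCBF43926`, the usual check value for this CRC.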

@Bartel-C8

We found out that iperf3 is very resource intensive (CPU/flash) and that this poses an additional bottleneck.
We tried nuttcp and results were better.

Flash access is not that fast though; maybe that is a problem as well in later kernels? (But obviously SAMA5D3 related...)

@triha2work
Collaborator

iperf, iperf3, and nuttcp give the same result in my setup. Is the SAMA5D3 or the KSZ9477 evaluation board being used in these tests?

@sgidel

sgidel commented Jan 21, 2023

I also ran into this problem and have spent probably 70 hours trying to figure it out.
I finally found the cause: TCP segmentation offload. I use the in-tree DSA driver, not the driver in this repo, but the same should apply to it as well.
What seems to be happening is that packets that make use of TSO get dropped while small standard packets make it through. This is because the tail tag gets added to the end of the full TSO skb, so when the network hardware transmits the normal frame-sized chunks, the tail tag is missing from all but the last chunk. This also results in tons of checksum errors on the receiving device.
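
The effect described above can be modelled in a few lines of user-space C. This is a deliberately simplified sketch, not kernel code: it ignores headers entirely, and the function name `split_tail_tagged` and all sizes used with it are assumptions for illustration. It appends one tail tag to a large buffer, cuts the result into chunks of at most `mss` bytes, and counts how many chunks end with the complete tag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: payload_len bytes followed by a single tail tag of
 * tag_len bytes are cut into chunks of at most mss bytes. Returns the
 * number of chunks; *tagged receives how many chunks end with the whole tag. */
size_t split_tail_tagged(size_t payload_len, size_t tag_len, size_t mss,
                         size_t *tagged)
{
    assert(tag_len > 0 && mss >= tag_len);

    size_t total = payload_len + tag_len;
    size_t tag_start = payload_len;        /* offset of the first tag byte */
    size_t chunks = 0;

    *tagged = 0;
    for (size_t off = 0; off < total; off += mss) {
        size_t end = (off + mss < total) ? off + mss : total;

        /* Only a chunk that reaches the end of the buffer and also
         * contains the first tag byte ends with the complete tag. */
        if (end == total && off <= tag_start)
            (*tagged)++;
        chunks++;
    }
    return chunks;
}
```

With illustrative numbers (a 64000-byte payload, a 2-byte tag, and 1448-byte chunks) this yields 45 chunks, of which only one ends with the tag. If the tag happens to straddle a chunk boundary, no chunk ends with the complete tag at all.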
It looks like this bug has been fixed in newer kernel versions by disabling scatter-gather. I am using 5.10 though.
See https://elixir.bootlin.com/linux/v6.2-rc4/source/net/dsa/slave.c#L2352
My fix for 5.10 was to add slave->features &= ~NETIF_F_GSO_SOFTWARE; to slave_dsa_setup_tagger. However, the implementation in 6.2 seems cleaner and probably works better for other offloads, including RX, as well.
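
The idea of masking features off the DSA user port can be sketched as a small user-space model. Everything here is an assumption for illustration: the `DEMO_F_*` bit values are invented and are not the kernel's `NETIF_F_*` definitions, and `demo_user_port_features` is a hypothetical stand-in, not the function in net/dsa/slave.c. It mirrors the approach of disabling scatter-gather when the tagger needs room at the tail of the frame.

```c
#include <stdint.h>

/* Illustrative bit values only; NOT the kernel's real NETIF_F_* flags. */
#define DEMO_F_SG        (UINT64_C(1) << 0)
#define DEMO_F_FRAGLIST  (UINT64_C(1) << 1)
#define DEMO_F_HW_CSUM   (UINT64_C(1) << 2)

/* Hypothetical stand-in for the user-port feature setup: start from the
 * master's features, then drop scatter-gather style features whenever the
 * tagging protocol needs tailroom (i.e. uses a tail tag). */
uint64_t demo_user_port_features(uint64_t master_features,
                                 unsigned int needed_tailroom)
{
    uint64_t features = master_features;

    if (needed_tailroom)
        features &= ~(DEMO_F_SG | DEMO_F_FRAGLIST);

    return features;
}
```

In this model, a tagger with non-zero tailroom leaves only the bits unrelated to scatter-gather set, while a tagger without tailroom inherits the master's features unchanged.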
