FS#2573 - BT Home Hub 5A xrx200 performance degradation caused by RPS XPS commit #7670

Closed
openwrt-bot opened this issue Oct 29, 2019 · 2 comments

@openwrt-bot openwrt-bot commented Oct 29, 2019

bill888:

Known to affect BT Home Hub 5A. Other similar Lantiq xrx200 devices are likely to be affected too.

When OpenWrt 18.06 was released, WAN to LAN throughput was observed to have degraded: maximum throughput dropped from about 140 Mbps to about 80 Mbps.

Some investigative testing was conducted when 18.06.1 was released.
[[https://openwrt.ebilan.co.uk/viewtopic.php?f=7&t=1105|ebilan forum]]

It was discovered this commit was responsible for the fall in maximum throughput.
[[https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=916e33fa1e14b97daf8c9bf07a1b36f9767db679|netifd: update to the latest version, rewrite RPS/XPS handling]]

No one at the time thought it was a bug. In the UK, there is no support for VDSL vectoring, so the maximum DSL speed is 80 Mbps, which equates to about 76 Mbps in real-world speed tests through to the LAN ports.

Removing the above commit restored maximum possible throughput with 18.06.

mkresin recently took some time to look at this commit. Here are his comments:

First of all, it is about receive packet steering (RPS) and transmit
packet steering (XPS). RPS/XPS lets you specify which CPUs/cores should
process received and/or transmitted packets. The selection is expressed
as a bitmask read from right to left:

For example:

Only cpu 0 should handle something: 00000001 (decimal 1)
Only cpu 1 should handle something: 00000010 (decimal 2)
cpu0 & cpu 1 should handle something: 00000011 (decimal 3)
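
The bitmask arithmetic above can be sketched as a small shell helper. `cpus_to_mask` is a hypothetical name used only for illustration; it is not part of OpenWrt or netifd. It sets one bit per CPU index and prints the result in hexadecimal, which is the format the kernel's `rps_cpus` and `xps_cpus` files accept (for the masks 1, 2 and 3 discussed here, hex and decimal coincide).

```shell
#!/bin/sh
# Hypothetical helper: build an RPS/XPS CPU bitmask from a list of CPU indices.
# Bit N (counting from the right, starting at 0) selects CPU N.
cpus_to_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    # The sysfs files expect the mask in hexadecimal, without a 0x prefix.
    printf '%x\n' "$mask"
}

cpus_to_mask 0      # only CPU 0
cpus_to_mask 1      # only CPU 1
cpus_to_mask 0 1    # CPU 0 and CPU 1
```

The three calls print 1, 2 and 3, matching the examples listed above.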

I'm not yet sure what's the result of setting decimal 0.

The commit you already identified changes the logic so that packets are
not steered to the CPUs/cores handling the interrupts (see cat /proc/interrupts).

Fun fact: the check for which CPU/core handles the interrupts doesn't work
on lantiq; nevertheless, the correct CPU/core is returned by accident.

It seems to me that on some targets this change increases the maximum
packet throughput, while it causes a degradation on lantiq. I have no
idea which targets benefit from the change or for which targets it
introduces a degradation. Nor do I know why targets behave so
differently.

The first column of numeric values is from **Maurer's custom 18.06.4 for HH5A**, which does not contain the above RPS/XPS commit. The second column is from 18.06.4 for HH5A. The third column is from 19.07-snapshot-r10605 for HH5A.

Receive Packet Steering
/sys/class/net/br-lan/queues/rx-0/rps_cpus: 3 0 0
/sys/class/net/eth0.1/queues/rx-0/rps_cpus: 3 0 0
/sys/class/net/eth0.2/queues/rx-0/rps_cpus: 3 0 0
/sys/class/net/eth0/queues/rx-0/rps_cpus: 3 2 2
/sys/class/net/lo/queues/rx-0/rps_cpus: 3 0 0
/sys/class/net/wlan0/queues/rx-0/rps_cpus: 0 2 2
/sys/class/net/wlan1/queues/rx-0/rps_cpus: 0 2 2

RPS flow count
/sys/class/net/br-lan/queues/rx-0/rps_flow_cnt: 0 0 0
/sys/class/net/eth0.1/queues/rx-0/rps_flow_cnt: 0 0 0
/sys/class/net/eth0.2/queues/rx-0/rps_flow_cnt: 0 0 0
/sys/class/net/eth0/queues/rx-0/rps_flow_cnt: 0 0 0
/sys/class/net/lo/queues/rx-0/rps_flow_cnt: 0 0 0
/sys/class/net/wlan0/queues/rx-0/rps_flow_cnt: 0 0 0
/sys/class/net/wlan1/queues/rx-0/rps_flow_cnt: 0 0 0

Transmit Packet Steering
/sys/class/net/br-lan/queues/tx-0/xps_cpus: 3 0 0
/sys/class/net/eth0.1/queues/tx-0/xps_cpus: 3 0 0
/sys/class/net/eth0.2/queues/tx-0/xps_cpus: 3 0 0
/sys/class/net/eth0/queues/tx-0/xps_cpus: 3 2 2
/sys/class/net/lo/queues/tx-0/xps_cpus: 3 0 0
/sys/class/net/wlan0/queues/tx-0/xps_cpus: 0 2 2
/sys/class/net/wlan0/queues/tx-1/xps_cpus: 0 1 1
/sys/class/net/wlan0/queues/tx-2/xps_cpus: 0 2 2
/sys/class/net/wlan0/queues/tx-3/xps_cpus: 0 1 1
/sys/class/net/wlan1/queues/tx-0/xps_cpus: 0 2 2
/sys/class/net/wlan1/queues/tx-1/xps_cpus: 0 1 1
/sys/class/net/wlan1/queues/tx-2/xps_cpus: 0 2 2
/sys/class/net/wlan1/queues/tx-3/xps_cpus: 0 1 1
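
A listing like the one above can be collected on the router with a short loop. This is a minimal sketch, not the method used to produce the tables in this report; `dump_steering` and its optional root-directory argument are hypothetical names introduced here so the function can also be pointed at a test directory.

```shell
#!/bin/sh
# Print every RPS/XPS CPU mask found below a sysfs-style root directory.
# The root defaults to /sys/class/net; the parameter exists only so the
# function can be exercised against a scratch directory (an assumption
# of this sketch).
dump_steering() {
    root="${1:-/sys/class/net}"
    for f in "$root"/*/queues/rx-*/rps_cpus "$root"/*/queues/tx-*/xps_cpus; do
        # Skip patterns that matched nothing and files that cannot be read.
        [ -r "$f" ] || continue
        printf '%s: %s\n' "$f" "$(cat "$f" 2>/dev/null)"
    done
}

dump_steering
```

Running it once per firmware build and comparing the output gives one column of the tables above per run.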

'Best' possible throughput observed during testing from red WAN ethernet port to LAN port. (Results likely to be similar for VDSL to LAN port)

maurer's custom 18.06.4 for HH5A without above RPS/XPS commit for reference

C:\install\iperf>iperf3 -c 192.168.0.10 -t 10 -R
Connecting to host 192.168.0.10, port 5201
Reverse mode, remote host 192.168.0.10 is sending
[ 4] local 192.168.1.162 port 51716 connected to 192.168.0.10 port 5201
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 17.3 MBytes 145 Mbits/sec
[ 4] 1.00-2.00 sec 17.3 MBytes 145 Mbits/sec
[ 4] 2.00-3.00 sec 17.4 MBytes 146 Mbits/sec

C:\install\iperf>iperf3 -c 192.168.0.10 -t 10
Connecting to host 192.168.0.10, port 5201
[ 4] local 192.168.1.162 port 51720 connected to 192.168.0.10 port 5201
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 16.7 MBytes 140 Mbits/sec
[ 4] 1.00-2.00 sec 16.5 MBytes 139 Mbits/sec
[ 4] 2.00-3.00 sec 16.4 MBytes 137 Mbits/sec

=============

18.06.4 for HH5A for reference

C:\install\iperf>iperf3 -c 192.168.0.10 -t 10 -R
Connecting to host 192.168.0.10, port 5201
Reverse mode, remote host 192.168.0.10 is sending
[ 4] local 192.168.1.162 port 51757 connected to 192.168.0.10 port 5201
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 9.41 MBytes 78.8 Mbits/sec
[ 4] 1.00-2.00 sec 8.35 MBytes 70.0 Mbits/sec
[ 4] 2.00-3.00 sec 9.34 MBytes 78.5 Mbits/sec

C:\install\iperf>iperf3 -c 192.168.0.10 -t 10
Connecting to host 192.168.0.10, port 5201
[ 4] local 192.168.1.162 port 51759 connected to 192.168.0.10 port 5201
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 9.29 MBytes 77.9 Mbits/sec
[ 4] 1.00-2.00 sec 9.23 MBytes 77.4 Mbits/sec
[ 4] 2.00-3.00 sec 9.23 MBytes 77.4 Mbits/sec

19.07-snapshot r10605 for reference.

C:\install\iperf>iperf3 -c 192.168.0.10 -t 10 -R
Connecting to host 192.168.0.10, port 5201
Reverse mode, remote host 192.168.0.10 is sending
[ 4] local 192.168.1.162 port 52587 connected to 192.168.0.10 port 5201
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 8.94 MBytes 75.0 Mbits/sec
[ 4] 1.00-2.00 sec 8.84 MBytes 74.1 Mbits/sec
[ 4] 2.00-3.00 sec 8.89 MBytes 74.5 Mbits/sec

C:\install\iperf>iperf3 -c 192.168.0.10 -t 10
Connecting to host 192.168.0.10, port 5201
[ 4] local 192.168.1.162 port 52578 connected to 192.168.0.10 port 5201
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 8.92 MBytes 74.8 Mbits/sec
[ 4] 1.00-2.00 sec 8.80 MBytes 73.7 Mbits/sec
[ 4] 2.00-3.00 sec 8.86 MBytes 74.4 Mbits/sec

As far as I remember, all that needs to be done to get back the former
LAN speed is to force packet steering to both cores:

Temporary fix:

echo 3 > /sys/class/net/eth0/queues/rx-0/rps_cpus
echo 3 > /sys/class/net/eth0/queues/tx-0/xps_cpus
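
For interfaces with several queues (the wireless interfaces listed earlier have four TX queues each), the same write can be applied in a loop. The following is a sketch under stated assumptions: `set_steering` and its optional third argument are hypothetical and exist only for illustration and testing. Like any sysfs write, the setting does not survive a reboot.

```shell
#!/bin/sh
# Write one CPU mask to every RX and TX queue of a single interface.
# Arguments: interface name, hex mask, optional sysfs root (for testing only).
set_steering() {
    iface="$1"
    mask="$2"
    root="${3:-/sys/class/net}"
    for f in "$root/$iface"/queues/rx-*/rps_cpus "$root/$iface"/queues/tx-*/xps_cpus; do
        # Only touch files that exist and are writable (needs root on a real system).
        [ -w "$f" ] || continue
        echo "$mask" > "$f"
    done
}

# Example: steer eth0 traffic to both cores, as in the temporary fix above.
# Left commented out so that running this file does not change any settings.
# set_steering eth0 3
```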

The same might be true for the wireless interfaces. Changing the values
for VLAN or logical interfaces (eth0.x, br-lan) shouldn't have any impact.

Ideally, someone would run some more tests to check which values
result in the greatest overall throughput.

Following mkresin's suggestion, here are some test results:

18.06.4 for HH5A with above **temporary fix** applied via SSH:

C:\install\iperf>iperf3 -c 192.168.0.10 -t 10 -R
Connecting to host 192.168.0.10, port 5201
Reverse mode, remote host 192.168.0.10 is sending
[ 4] local 192.168.1.162 port 51787 connected to 192.168.0.10 port 5201
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 16.9 MBytes 142 Mbits/sec
[ 4] 1.00-2.00 sec 16.9 MBytes 141 Mbits/sec
[ 4] 2.00-3.00 sec 16.9 MBytes 141 Mbits/sec

C:\install\iperf>iperf3 -c 192.168.0.10 -t 10
Connecting to host 192.168.0.10, port 5201
[ 4] local 192.168.1.162 port 51791 connected to 192.168.0.10 port 5201
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.01 sec 16.5 MBytes 138 Mbits/sec
[ 4] 1.01-2.00 sec 16.2 MBytes 136 Mbits/sec
[ 4] 2.00-3.00 sec 16.2 MBytes 136 Mbits/sec

19.07-snapshot r10605 with above temporary fix

C:\install\iperf>iperf3 -c 192.168.0.10 -t 10 -R
Connecting to host 192.168.0.10, port 5201
Reverse mode, remote host 192.168.0.10 is sending
[ 4] local 192.168.1.162 port 52603 connected to 192.168.0.10 port 5201
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 16.2 MBytes 136 Mbits/sec
[ 4] 1.00-2.00 sec 16.2 MBytes 136 Mbits/sec
[ 4] 2.00-3.00 sec 16.2 MBytes 136 Mbits/sec

C:\install\iperf>iperf3 -c 192.168.0.10 -t 10
Connecting to host 192.168.0.10, port 5201
[ 4] local 192.168.1.162 port 52607 connected to 192.168.0.10 port 5201
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 12.1 MBytes 101 Mbits/sec
[ 4] 1.00-2.00 sec 12.1 MBytes 101 Mbits/sec
[ 4] 2.00-3.00 sec 12.0 MBytes 101 Mbits/sec
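
To compare runs like those above without reading the per-second lines by eye, the bandwidth column can be averaged with awk. This is only an illustrative sketch; `mean_mbits` is a hypothetical helper name. It averages every value followed by `Mbits/sec`, so iperf3's final sender/receiver summary lines would be counted too if they are included in the input.

```shell
#!/bin/sh
# Average all "<number> Mbits/sec" values read from standard input.
mean_mbits() {
    awk '{
        for (i = 2; i <= NF; i++)
            if ($i == "Mbits/sec") { sum += $(i - 1); n++ }
    }
    END { if (n) printf "%.1f\n", sum / n }'
}

# Sample input: the first three interval lines of the reference run quoted earlier.
mean_mbits <<'EOF'
[ 4] 0.00-1.00 sec 17.3 MBytes 145 Mbits/sec
[ 4] 1.00-2.00 sec 17.3 MBytes 145 Mbits/sec
[ 4] 2.00-3.00 sec 17.4 MBytes 146 Mbits/sec
EOF
```

For the sample lines this prints 145.3.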


@openwrt-bot openwrt-bot commented Nov 17, 2019

bill888:

FYI, a discussion of the above problem appears to have started here:

[[https://forum.openwrt.org/t/18-06-4-speed-fix-for-bt-homehub-5a/23643/13|OpenWrt forum]]


@openwrt-bot openwrt-bot commented Mar 7, 2020

bill888:

Fixed in master from 3rd March 2020
[[https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=d3868f15f876507db54afacdef22a7059011a54e|netifd: change RPS/XPS handling to all CPUs and disable by default]]

It will not be backported to 19.07 releases.
[[https://github.com//pull/2553#issuecomment-594771114|Discussion thread]]
