Throughput monitoring issue #33

Closed
mschirrmeister opened this issue Apr 9, 2023 · 14 comments

Comments

@mschirrmeister

Hello,

I am running the latest version, 9.011.00-NAPI, with kernel 6.1, and it shows wrong values for the throughput. The NIC is connected to a 1 Gbit switch.
With the kernel's default driver, r8169, throughput monitoring tools typically show around 115 MB/s. With the r8125 driver, they show several hundred gigabytes per second; the value fluctuates between 300 and 700 GB/s.

Driver

root@nightowl ~# ethtool -i ens4
driver: r8125
version: 9.011.00-NAPI
firmware-version:
expansion-rom-version:
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no

PCI device

root@nightowl ~# lspci -s 01:00.0 -k
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
	Subsystem: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller
	Kernel driver in use: r8125
	Kernel modules: r8169, r8125

Example of a wrong value:

  bwm-ng v0.6.3 (probing every 0.500s), press 'h' for help
  input: /proc/net/dev type: rate
  -         iface                   Rx                   Tx                Total
  ==============================================================================
               lo:           0.00  B/s            0.00  B/s            0.00  B/s
             ens4:         635.39 GB/s          546.15 KB/s          635.39 GB/s
             ens5:           0.00  B/s            0.00  B/s            0.00  B/s
  ------------------------------------------------------------------------------
            total:         635.39 GB/s          546.15 KB/s          635.39 GB/s

Any idea if I am doing something wrong, or is this a known issue?
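
For context on how such a figure can arise: the bwm-ng output above names /proc/net/dev as its input, so the displayed rate is presumably the difference between two reads of the interface byte counters divided by the probing interval. Below is a minimal Python sketch of that calculation, plus a sanity check against the link speed. The helper names and the counter values in the sample snapshots are made up for illustration; only the 0.5 s interval, the 1 Gbit link, and the 635.39 GB/s reading come from this report.

```python
# Illustrative /proc/net/dev snapshots taken 0.5 s apart (counter values are made up).
HEADER = (
    "Inter-|   Receive                                                |  Transmit\n"
    " face |bytes    packets errs drop fifo frame compressed multicast|"
    "bytes    packets errs drop fifo colls carrier compressed\n"
)
SAMPLE_T0 = HEADER + "  ens4: 1000000 1000 0 0 0 0 0 0 500000 500 0 0 0 0 0 0\n"
SAMPLE_T1 = HEADER + "  ens4: 58500000 40000 0 0 0 0 0 0 770000 800 0 0 0 0 0 0\n"


def read_counters(text):
    """Parse /proc/net/dev content into {iface: (rx_bytes, tx_bytes)}."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def byte_rate(previous, current, interval_s):
    """Bytes per second between two counter readings."""
    return (current - previous) / interval_s


def is_plausible(rate_bytes_per_s, link_bits_per_s):
    """A rate cannot be negative or exceed the link speed (8 bits per byte)."""
    return 0 <= rate_bytes_per_s <= link_bits_per_s / 8


INTERVAL_S = 0.5                 # bwm-ng probing interval from the output above
LINK_BITS_PER_S = 1_000_000_000  # 1 Gbit switch

rx0, _ = read_counters(SAMPLE_T0)["ens4"]
rx1, _ = read_counters(SAMPLE_T1)["ens4"]
rx_rate = byte_rate(rx0, rx1, INTERVAL_S)

print(rx_rate, is_plausible(rx_rate, LINK_BITS_PER_S))
print(is_plausible(635.39e9, LINK_BITS_PER_S))  # the reading reported above
```

A 1 Gbit link cannot carry more than 125 MB/s, so a reading in the hundreds of GB/s most likely reflects the byte counters the driver reports and not real traffic.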

@awesometic
Owner

Hello,

Can you try this Debian package?
I reverted some changes that came from this 9.011.00 version.

Please remove the .zip extension from the attached file; GitHub restricts which file types can be uploaded.

realtek-r8125-dkms_9.011.00-2_amd64.deb.zip

@mschirrmeister
Author

mschirrmeister commented Apr 12, 2023

The package does not install. Error is below.

DKMS make.log for realtek-r8125-9.011.00 for kernel 6.1.0-7-amd64 (amd64)
Wed Apr 12 08:50:18 AM CEST 2023
/bin/sh: 1: VER: not found
make -C src/ KVER=6.1.0-7-amd64 BASEDIR=/lib/modules/6.1.0-7-amd64 modules
make[1]: Entering directory '/var/lib/dkms/realtek-r8125/9.011.00/build/src'
make -C /lib/modules/6.1.0-7-amd64/build M=/var/lib/dkms/realtek-r8125/9.011.00/build/src modules
make[2]: Entering directory '/usr/src/linux-headers-6.1.0-7-amd64'
  CC [M]  /var/lib/dkms/realtek-r8125/9.011.00/build/src/r8125_n.o
  CC [M]  /var/lib/dkms/realtek-r8125/9.011.00/build/src/rtl_eeprom.o
  CC [M]  /var/lib/dkms/realtek-r8125/9.011.00/build/src/rtltool.o
/var/lib/dkms/realtek-r8125/9.011.00/build/src/r8125_n.c:13512:31: error: ‘rtl8125_get_stats’ undeclared here (not in a function); did you mean ‘rtl8125_get_stats64’?
13512 |         .ndo_get_stats      = rtl8125_get_stats,
      |                               ^~~~~~~~~~~~~~~~~
      |                               rtl8125_get_stats64
/var/lib/dkms/realtek-r8125/9.011.00/build/src/r8125_n.c:13468:1: warning: ‘rtl8125_get_stats64’ defined but not used [-Wunused-function]
13468 | rtl8125_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
      | ^~~~~~~~~~~~~~~~~~~
make[3]: *** [/usr/src/linux-headers-6.1.0-7-common/scripts/Makefile.build:255: /var/lib/dkms/realtek-r8125/9.011.00/build/src/r8125_n.o] Error 1
make[2]: *** [/usr/src/linux-headers-6.1.0-7-common/Makefile:2037: /var/lib/dkms/realtek-r8125/9.011.00/build/src] Error 2
make[2]: Leaving directory '/usr/src/linux-headers-6.1.0-7-amd64'
make[1]: *** [Makefile:188: modules] Error 2
make[1]: Leaving directory '/var/lib/dkms/realtek-r8125/9.011.00/build/src'
make: *** [Makefile:42: modules] Error 2
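
In a log like this, the trailing `make: *** ... Error` lines are consequences; the compiler diagnostics further up name the cause. As a hedged sketch, the following Python helper (a hypothetical name, not part of DKMS) pulls the `error:` and `warning:` lines out of a make.log. The sample text is copied from the log above.

```python
import re

# Matches GCC-style diagnostics: <file>:<line>:<col>: error|warning: <message>
DIAGNOSTIC = re.compile(
    r"^(?P<file>[^:\s]+):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<kind>error|warning): (?P<msg>.*)$"
)


def diagnostics(log_text):
    """Return (kind, file name, line number, message) for each compiler diagnostic."""
    found = []
    for line in log_text.splitlines():
        match = DIAGNOSTIC.match(line)
        if match:
            file_name = match["file"].rsplit("/", 1)[-1]
            found.append((match["kind"], file_name, int(match["line"]), match["msg"]))
    return found


SAMPLE_LOG = """\
/var/lib/dkms/realtek-r8125/9.011.00/build/src/r8125_n.c:13512:31: error: ‘rtl8125_get_stats’ undeclared here (not in a function); did you mean ‘rtl8125_get_stats64’?
13512 |         .ndo_get_stats      = rtl8125_get_stats,
/var/lib/dkms/realtek-r8125/9.011.00/build/src/r8125_n.c:13468:1: warning: ‘rtl8125_get_stats64’ defined but not used [-Wunused-function]
make[1]: *** [Makefile:188: modules] Error 2
"""

for kind, file_name, line_no, message in diagnostics(SAMPLE_LOG):
    print(f"{kind:7} {file_name}:{line_no}  {message}")
```

Read together, the two diagnostics show that the source defines only `rtl8125_get_stats64`, while the `.ndo_get_stats` hook still refers to `rtl8125_get_stats`.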

I will be unavailable for the next 3 weeks, so I can most likely run the next test at the beginning of May at the earliest.

@awesometic
Owner

Looks like the compiler options caused that.

Here is the new file: realtek-r8125-dkms_9.011.00-2_amd64.deb.zip

I checked it compiles normally. Sorry for the inconvenience 😅

@mschirrmeister
Author

Thanks. That one installs fine, but the problem is still there: it still shows GB/s. The number itself might be a little better, but it still goes too high and fluctuates more than with the r8169 driver.

@awesometic
Owner

awesometic commented Apr 12, 2023

Then we should check whether it happens on other kernel versions too.

I neutralized some conditions for kernel version 5.11.0 or above in the network statistics code; those conditions were not there in the previous version.
Maybe there is another spot I should look at, but in any case Realtek should be made aware of this error and release a new version if it also occurs on other systems.
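
As a side note on the 5.11.0 threshold: out-of-tree drivers commonly gate code on a comparison such as `LINUX_VERSION_CODE >= KERNEL_VERSION(5,11,0)`. Whether this driver uses exactly that mechanism is an assumption here, since the thread does not show the code. The Python sketch below mirrors the kernel's `KERNEL_VERSION` macro (recent kernels clamp the patch level to 255) and checks the kernels mentioned in this thread against the threshold; `parse_release` is a hypothetical helper.

```python
def kernel_version(major, minor, patch):
    """Mirror the KERNEL_VERSION(a, b, c) macro from linux/version.h."""
    return (major << 16) + (minor << 8) + min(patch, 255)


def parse_release(release):
    """Turn a release string such as '6.1.0-7-amd64' into (6, 1, 0)."""
    numeric = release.split("-", 1)[0]
    major, minor, patch = (int(part) for part in numeric.split("."))
    return major, minor, patch


THRESHOLD = kernel_version(5, 11, 0)

# Kernel releases named in this thread.
TESTED = ["6.1.0-7-amd64", "5.19.0-0.deb11.2-amd64", "5.15.94-x86"]

for release in TESTED:
    code = kernel_version(*parse_release(release))
    print(f"{release}: at or above 5.11.0 = {code >= THRESHOLD}")
```

All three releases are at or above 5.11.0, so under this assumption they would all compile the same branch of any such condition. A kernel older than 5.11 would be needed to exercise the other branch.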

@mschirrmeister
Author

It looks like this is kernel-dependent to some extent. I tested the following two kernels, and both have the problem as well.

  • 5.19.0-0.deb11.2-amd64
  • 5.15.94-x86

On 5.15.94-x86 it looks worse: the numbers again go up to 400 GB/s.

I thought about reporting it to Realtek too, but have not found a good way to do so yet. The only thing I found is a support email address for network cards. I might send an email there, and let's hope they can fix it.

@awesometic
Owner

Thank you for the test and for reporting it to Realtek. Let's wait for the new version.

@dream10201

dream10201 commented Apr 13, 2023

I have the same problem, and after running for a while, dmesg shows these errors:

[  993.729605] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G    B      OE      6.2.10-arch1-1 #1 3b64a9154b84a23b8badf9e10678249884a952c6
[  993.729609] Hardware name: Default string Default string/Default string, BIOS 1.010 09/27/2021
[  993.729611] ==================================================================
[ 1002.885181] ==================================================================
[ 1002.885190] BUG: KFENCE: use-after-free read in rtl8125_rx_interrupt+0x347/0x5c0 [r8125]

[ 1002.885207] Use-after-free read at 0x0000000024c7079d (in kfence-#222):
[ 1002.885210]  rtl8125_rx_interrupt+0x347/0x5c0 [r8125]
[ 1002.885221]  rtl8125_poll_msix_rx+0x45/0x90 [r8125]
[ 1002.885231]  __napi_poll+0x28/0x1b0
[ 1002.885238]  net_rx_action+0x2a2/0x360
[ 1002.885241]  __do_softirq+0xd1/0x2c8
[ 1002.885245]  __irq_exit_rcu+0xb7/0xe0
[ 1002.885250]  common_interrupt+0x86/0xa0
[ 1002.885252]  asm_common_interrupt+0x26/0x40
[ 1002.885257]  cpuidle_enter_state+0xe2/0x420
[ 1002.885261]  cpuidle_enter+0x2d/0x40
[ 1002.885263]  do_idle+0x1ed/0x270
[ 1002.885266]  cpu_startup_entry+0x1d/0x20
[ 1002.885269]  rest_init+0xc8/0xd0
[ 1002.885272]  arch_call_rest_init+0xe/0x30
[ 1002.885277]  start_kernel+0x734/0xb30
[ 1002.885280]  secondary_startup_64_no_verify+0xe5/0xeb

[ 1002.885286] kfence-#222: 0x000000001096ce9d-0x00000000efff5d14, size=232, cache=skbuff_head_cache

[ 1002.885289] allocated by task 412 on cpu 2 at 1002.503026s:
[ 1002.885295]  __alloc_skb+0x167/0x1d0
[ 1002.885299]  alloc_skb_with_frags+0x50/0x200
[ 1002.885301]  sock_alloc_send_pskb+0x203/0x250
[ 1002.885304]  __ip_append_data+0x998/0x1070
[ 1002.885308]  ip_make_skb+0x105/0x140
[ 1002.885310]  udp_sendmsg+0xacf/0xe90
[ 1002.885314]  udpv6_sendmsg+0x469/0x1050
[ 1002.885317]  sock_sendmsg+0x46/0x70
[ 1002.885319]  ____sys_sendmsg+0x17f/0x2f0
[ 1002.885321]  ___sys_sendmsg+0x9a/0xe0
[ 1002.885323]  __sys_sendmmsg+0xe3/0x210
[ 1002.885326]  __x64_sys_sendmmsg+0x21/0x30
[ 1002.885329]  do_syscall_64+0x5c/0x90
[ 1002.885332]  entry_SYSCALL_64_after_hwframe+0x72/0xdc

[ 1002.885336] freed by task 0 on cpu 0 at 1002.885162s:
[ 1002.885372]  tcp_data_queue+0x5a6/0xec0
[ 1002.885375]  tcp_rcv_established+0x210/0x730
[ 1002.885378]  tcp_v6_do_rcv+0xde/0x4c0
[ 1002.885380]  tcp_v6_rcv+0xc88/0xd00
[ 1002.885383]  ip6_protocol_deliver_rcu+0x6c/0x480
[ 1002.885385]  ip6_input_finish+0x43/0x60
[ 1002.885386]  ip6_sublist_rcv_finish+0x59/0x90
[ 1002.885388]  ip6_sublist_rcv+0x22f/0x2f0
[ 1002.885390]  ipv6_list_rcv+0x13f/0x170
[ 1002.885392]  __netif_receive_skb_list_core+0x1f6/0x2c0
[ 1002.885395]  netif_receive_skb_list_internal+0x1d1/0x310
[ 1002.885398]  napi_gro_receive+0xd0/0x210
[ 1002.885400]  rtl8125_rx_interrupt+0x33d/0x5c0 [r8125]
[ 1002.885410]  rtl8125_poll_msix_rx+0x45/0x90 [r8125]
[ 1002.885420]  __napi_poll+0x28/0x1b0
[ 1002.885423]  net_rx_action+0x2a2/0x360
[ 1002.885426]  __do_softirq+0xd1/0x2c8
[ 1002.885428]  __irq_exit_rcu+0xb7/0xe0
[ 1002.885430]  common_interrupt+0x86/0xa0
[ 1002.885432]  asm_common_interrupt+0x26/0x40
[ 1002.885435]  cpuidle_enter_state+0xe2/0x420
[ 1002.885438]  cpuidle_enter+0x2d/0x40
[ 1002.885440]  do_idle+0x1ed/0x270
[ 1002.885442]  cpu_startup_entry+0x1d/0x20
[ 1002.885444]  rest_init+0xc8/0xd0
[ 1002.885447]  arch_call_rest_init+0xe/0x30
[ 1002.885450]  start_kernel+0x734/0xb30
[ 1002.885452]  secondary_startup_64_no_verify+0xe5/0xeb
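
One way to read this report: the "freed by" path enters the network stack through `napi_gro_receive`, called from `rtl8125_rx_interrupt+0x33d`, and the faulting read happens at `rtl8125_rx_interrupt+0x347`. Stack-trace entries are return addresses, so the read sits just after that call returns. The Python sketch below only does the offset arithmetic on two lines copied from the trace; `parse_frame` is a hypothetical helper.

```python
import re

# Matches "symbol+0xOFFSET/0xSIZE" as printed in kernel stack traces.
FRAME = re.compile(r"(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_frame(line):
    """Return (symbol, offset, function_size) for one stack-trace line, or None."""
    match = FRAME.search(line)
    if match is None:
        return None
    symbol, offset, size = match.groups()
    return symbol, int(offset, 16), int(size, 16)


# Faulting read, and the driver frame on the "freed by" path.
read_frame = parse_frame("[ 1002.885210]  rtl8125_rx_interrupt+0x347/0x5c0 [r8125]")
free_frame = parse_frame("[ 1002.885400]  rtl8125_rx_interrupt+0x33d/0x5c0 [r8125]")

gap = read_frame[1] - free_frame[1]
print(f"read is {gap} bytes after the return address of the freeing call")
```

A gap of only 10 bytes suggests the driver still touches the skb after handing it to the stack, for example to read its length for the byte counters. That is a hypothesis drawn from the trace, not something confirmed in this thread, but it would also fit the implausible throughput values.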

@dream10201

@mschirrmeister
This is the official test version provided by Realtek. Maybe you can try it; I'm not near the device and can't test it.
[image]
r8125-9.011.01_20230412_b1.zip

@dream10201

@awesometic After a period of testing, the problem did not reproduce.

@awesometic
Owner

awesometic commented Apr 14, 2023

@dream10201

Thank you for your effort.

Can we merge that beta version into our repository? It will be open-sourced anyway, but I don't know if we can use the unpublished version 🤔

@dream10201

@awesometic
Maybe it would be better to create a patch file and apply it before compiling?

@mschirrmeister
Author

What @dream10201 posted here is also what Linda sent me in reply to my question to Realtek. She said that I can share the version here. I am on vacation until early May, but @dream10201 has already shared the driver. :-)

Linda also mentioned to me that they will apply the change in their next driver releases as well.

@awesometic
Owner

Fixed in version 9.011.01 :)
