FS#1170 - mt7621: kernel errors - rcu_sched detected stalls on CPUs/tasks - again #8656
Comments
kristrev: Here is a more detailed trace, triggered by a network restart during heavy load: [ 686.452081] INFO: rcu_sched self-detected stall on CPU
elphidium: I am having the same problem on OpenWrt SNAPSHOT r5938-6f425a28a4 / LuCI Master (git-18.023.74248-ee409b6).
nbd: Please try the latest version.
camel: The problem still exists.
nbd: This should be fixed now; please test.
kristrev:
After the work done for issue FS#804, the rcu_sched error seemed to be gone. However, I am now starting to see it again. Usually, at least for me, the error happens when there are large amounts of traffic and I do something with the network. My most reliable way to reproduce the error is as follows:
[ 2251.870000] INFO: rcu_bh detected stalls on CPUs/tasks:
[ 2251.870000] 2-...: (1 GPs behind) idle=ae1/140000000000001/0 softirq=212487/217796 fqs=4380
[ 2251.870000] (detected by 1, t=6002 jiffies, g=-146, c=-147, q=4)
[ 2251.870000] Task dump for CPU 2:
[ 2251.870000] openvpn-mover.s R running 0 2598 1 0x08100004
[ 2251.870000] Stack : 8fa69998 800ebe38 00000000 8fa69998 57512e2b 000001fd 00000000 80035454
[ 2251.870000] 00000000 800edbd4 8fa69998 804b0000 00000000 00000000 00000004 00000000
[ 2251.870000] 00000000 8ea17850 8efc7ec0 800376d4 00000000 00000000 778b8930 00000012
[ 2251.870000] 00000000 004077cd 778d4000 00000000 778d55e8 778d6f7c 00000000 8002b280
[ 2251.870000] ffbffeff ffffffff 00617772 706d742f 00000000 00000000 00000001 800379dc
[ 2251.870000] ...
[ 2251.870000] Call Trace:
[ 2251.870000] [<8000bc88>] __schedule+0x574/0x758
I can also sometimes trigger the issue by simply issuing the reboot command while the CPU is stressed. I have not applied any traffic shaping to my interface, and I see the error with both kernel 4.4 and 4.9 (i.e., LEDE 17.01 and master). I don't quite know how to proceed with debugging this.
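For anyone comparing stall reports across kernels or reproductions, a small script can pull out the fields that matter from a saved dmesg log. The following is a minimal sketch in Python, assuming the log lines look like the excerpt above; the function name `parse_stalls` and the regular expressions are illustrative and not part of any OpenWrt or kernel tool, and other kernel versions may format these lines differently.

```python
import re

# Header line, e.g. "INFO: rcu_bh detected stalls on CPUs/tasks:" or
# "INFO: rcu_sched self-detected stall on CPU".
HEADER_RE = re.compile(r"INFO: (rcu_\w+) (?:self-)?detected stall")
# Per-CPU line, e.g. "] 2-...: (1 GPs behind) idle=...".
CPU_RE = re.compile(r"\]\s+(\d+)-[^:]*: \((\d+) GPs behind\)")
# Detection line, e.g. "(detected by 1, t=6002 jiffies, ...)".
DETECT_RE = re.compile(r"\(detected by (\d+), t=(\d+) jiffies")


def parse_stalls(log_text):
    """Return one dict per RCU stall report found in log_text (hypothetical helper)."""
    stalls = []
    current = None
    for line in log_text.splitlines():
        m = HEADER_RE.search(line)
        if m:
            # A new stall report starts here.
            current = {
                "flavor": m.group(1),
                "stalled_cpus": [],
                "detected_by": None,
                "jiffies": None,
            }
            stalls.append(current)
            continue
        if current is None:
            continue  # Ignore lines before the first stall header.
        m = CPU_RE.search(line)
        if m:
            current["stalled_cpus"].append(int(m.group(1)))
            continue
        m = DETECT_RE.search(line)
        if m:
            current["detected_by"] = int(m.group(1))
            current["jiffies"] = int(m.group(2))
    return stalls


# Sample taken from the first three lines of the trace in this report.
SAMPLE = """\
[ 2251.870000] INFO: rcu_bh detected stalls on CPUs/tasks:
[ 2251.870000] 2-...: (1 GPs behind) idle=ae1/140000000000001/0 softirq=212487/217796 fqs=4380
[ 2251.870000] (detected by 1, t=6002 jiffies, g=-146, c=-147, q=4)
"""

if __name__ == "__main__":
    for stall in parse_stalls(SAMPLE):
        print(stall)
```

Running this over logs from several reproductions shows whether the stalled CPU and the RCU flavor (rcu_bh versus rcu_sched) stay the same between runs.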