Frequent Kernel Panics - Arch Linux #33

Open
jaredraby opened this issue Mar 18, 2014 · 7 comments

@jaredraby

I've been having frequent kernel panics with the latest commit of the driver (d30225b) and the latest Arch kernel, 3.13.6. I'm on a Lenovo Yoga 11s.

The panic can occur as soon as I log on, or it can take as long as 15 minutes, but it happens every time. I still have testing to do: I updated the Arch kernel and picked up the latest commit of the driver at the same time. I still need to revert the driver to the previous commit, and also the previous Arch kernel, for testing, but I wanted to raise it as an issue with Arch kernel 3.13.6 and commit d30225b.

I will try to get a kernel dump message next time it happens.

@lwfinger
Owner

I cannot help you without the traceback in the kernel dump.

@jaredraby
Author

This is the best I can do for now. I can't scroll so if you need something better I will do some more tinkering with it.

[Attached image: img_20140318_153118]

@lwfinger
Owner

Unfortunately, none of the routines in the traceback are part of r8723au.

If these crashes have recently started, then you might consider using the 'git bisect' commands to see what commit contributed to the crashes.

@ckuethe

ckuethe commented Mar 20, 2014

maybe this helps?

Lenovo Yoga13
Ubuntu 13.10
Kernel "3.11.0-19-generic #33-Ubuntu SMP Tue Mar 11 18:48:34 UTC 2014
x86_64 x86_64 x86_64 GNU/Linux"
git commit id bb718cb

just rebooted, was loading a ton of tabs in chrome... haven't tried to
reproduce it, as my tab state is now trashed.

[ 2351.274850] BUG: scheduling while atomic: swapper/1/0/0x10000500
[ 2351.274855] Modules linked in: 8723au(OF) cfg80211 dm_crypt vsock
ip6t_REJECT xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT
xt_LOG xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4
xt_conntrack rts5139(C) ip6table_filter ip6_tables joydev
nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat
nf_conntrack_ftp nf_conntrack hid_multitouch iptable_filter ip_tables
x_tables hid_sensor_hub parport_pc ppdev uvcvideo videobuf2_vmalloc
videobuf2_memops videobuf2_core videodev snd_hda_codec_hdmi
x86_pkg_temp_thermal snd_hda_codec_conexant rfcomm intel_powerclamp bnep
snd_hda_intel kvm_intel bluetooth kvm snd_hda_codec crct10dif_pclmul
crc32_pclmul snd_hwdep ghash_clmulni_intel snd_pcm aesni_intel aes_x86_64
lrw gf128mul glue_helper ablk_helper snd_page_alloc cryptd snd_seq_midi
snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd mei_me
psmouse soundcore lpc_ich mei serio_raw mac_hid nls_iso8859_1 coretemp lp
parport usbhid hid i915 microcode i2c_algo_bit drm_kms_helper ahci drm
libahci wmi video
[ 2351.274918] CPU: 1 PID: 0 Comm: swapper/1 Tainted: GF C O
3.11.0-19-generic #33-Ubuntu
[ 2351.274920] Hardware name: LENOVO 20175/INVALID, BIOS 66CN55WW 02/28/2013
[ 2351.274922] ffff88022f254580 ffff88022f243680 ffffffff816e87f1
ffff880223af2000
[ 2351.274926] ffff88022f243690 ffffffff816e2af1 ffff88022f2436f0
ffffffff816ed8d1
[ 2351.274929] ffff880223af3fd8 0000000000014580 ffff880223af3fd8
0000000000014580
[ 2351.274933] Call Trace:
[ 2351.274935] <IRQ> [<ffffffff816e87f1>] dump_stack+0x45/0x56
[ 2351.274944] [<ffffffff816e2af1>] __schedule_bug+0x4d/0x5b
[ 2351.274948] [<ffffffff816ed8d1>] __schedule+0x6d1/0x7e0
[ 2351.274953] [<ffffffff81092886>] __cond_resched+0x26/0x30
[ 2351.274956] [<ffffffff816eddda>] _cond_resched+0x3a/0x50
[ 2351.274961] [<ffffffff8118e6d8>] kmem_cache_alloc_trace+0x38/0x130
[ 2351.274973] [<ffffffffa061de9c>] ? rtw_addbareq_cmd+0x5e/0x122 [8723au]
[ 2351.274981] [<ffffffffa061de9c>] rtw_addbareq_cmd+0x5e/0x122 [8723au]
[ 2351.274992] [<ffffffffa062c790>] rtw_issue_addbareq_cmd+0x177/0x198 [8723au]
[ 2351.275006] [<ffffffffa0663b9d>] rtw_dump_xframe+0x5d/0x66a [8723au]
[ 2351.275019] [<ffffffffa06644ac>] rtl8723au_hal_xmit+0x143/0x1fb [8723au]
[ 2351.275033] [<ffffffffa0657ba1>] rtw_hal_xmit+0x1c/0x1f [8723au]
[ 2351.275046] [<ffffffffa064bf5c>] rtw_xmit+0x995/0x9fc [8723au]
[ 2351.275050] [<ffffffff815e1887>] ? kfree_skbmem+0x37/0x90
[ 2351.275061] [<ffffffffa0688eb6>] rtw_xmit_entry+0x1a1/0x2ab [8723au]
[ 2351.275065] [<ffffffff815f4368>] dev_hard_start_xmit+0x318/0x560
[ 2351.275068] [<ffffffff81611966>] sch_direct_xmit+0xe6/0x1c0
[ 2351.275071] [<ffffffff815f47b1>] dev_queue_xmit+0x201/0x4c0
[ 2351.275075] [<ffffffff815fbcfb>] neigh_resolve_output+0x11b/0x220
[ 2351.275079] [<ffffffff8162af51>] ip_finish_output+0x1b1/0x3b0
[ 2351.275083] [<ffffffff8162c448>] ip_output+0x58/0x90
[ 2351.275086] [<ffffffff8162bba5>] ip_local_out+0x25/0x30
[ 2351.275090] [<ffffffff8162befd>] ip_queue_xmit+0x13d/0x3e0
[ 2351.275093] [<ffffffff81642603>] tcp_transmit_skb+0x463/0x8c0
[ 2351.275097] [<ffffffff816453b4>] tcp_send_ack+0xa4/0xf0
[ 2351.275100] [<ffffffff81640781>] tcp_rcv_state_process+0xcf1/0xd00
[ 2351.275103] [<ffffffff81649378>] tcp_v4_do_rcv+0x268/0x470
[ 2351.275106] [<ffffffff8164b497>] tcp_v4_rcv+0x777/0x790
[ 2351.275110] [<ffffffff81626aa0>] ? ip_rcv_finish+0x350/0x350
[ 2351.275113] [<ffffffff816202f4>] ? nf_hook_slow+0x74/0x130
[ 2351.275116] [<ffffffff81626aa0>] ? ip_rcv_finish+0x350/0x350
[ 2351.275119] [<ffffffff81626b54>] ip_local_deliver_finish+0xb4/0x1f0
[ 2351.275122] [<ffffffff81626e28>] ip_local_deliver+0x48/0x80
[ 2351.275125] [<ffffffff816267cd>] ip_rcv_finish+0x7d/0x350
[ 2351.275128] [<ffffffff81627094>] ip_rcv+0x234/0x370
[ 2351.275131] [<ffffffff815f2656>] __netif_receive_skb_core+0x646/0x830
[ 2351.275134] [<ffffffff815f2858>] __netif_receive_skb+0x18/0x60
[ 2351.275137] [<ffffffff815f335d>] process_backlog+0xad/0x1a0
[ 2351.275139] [<ffffffff815f2bfc>] net_rx_action+0x11c/0x230
[ 2351.275143] [<ffffffff81067477>] __do_softirq+0xf7/0x240
[ 2351.275147] [<ffffffff816fa09c>] call_softirq+0x1c/0x30
[ 2351.275151] [<ffffffff81014bf5>] do_softirq+0x55/0x90
[ 2351.275153] [<ffffffff81067755>] irq_exit+0xb5/0xc0
[ 2351.275156] [<ffffffff816fa996>] do_IRQ+0x56/0xc0
[ 2351.275160] [<ffffffff816effed>] common_interrupt+0x6d/0x6d
[ 2351.275161] <EOI> [<ffffffff815a1812>] ? cpuidle_enter_state+0x52/0xc0
[ 2351.275170] [<ffffffff815a1808>] ? cpuidle_enter_state+0x48/0xc0
[ 2351.275173] [<ffffffff815a1949>] cpuidle_idle_call+0xc9/0x210
[ 2351.275177] [<ffffffff8101bafe>] arch_cpu_idle+0xe/0x30
[ 2351.275182] [<ffffffff810b5825>] cpu_startup_entry+0xe5/0x280
[ 2351.275185] [<ffffffff8103f187>] start_secondary+0x217/0x2c0


GDB has a 'break' feature; why doesn't it have 'fix' too?

lwfinger added a commit that referenced this issue Mar 20, 2014
In #33 (comment),
the following problem is reported:

[ 2351.274850] BUG: scheduling while atomic: swapper/1/0/0x10000500
[ 2351.274855] Modules linked in: 8723au(OF) cfg80211 dm_crypt vsock
ip6t_REJECT xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT
xt_LOG xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4
xt_conntrack rts5139(C) ip6table_filter ip6_tables joydev
nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat
nf_conntrack_ftp nf_conntrack hid_multitouch iptable_filter ip_tables
x_tables hid_sensor_hub parport_pc ppdev uvcvideo videobuf2_vmalloc
videobuf2_memops videobuf2_core videodev snd_hda_codec_hdmi
x86_pkg_temp_thermal snd_hda_codec_conexant rfcomm intel_powerclamp bnep
snd_hda_intel kvm_intel bluetooth kvm snd_hda_codec crct10dif_pclmul
crc32_pclmul snd_hwdep ghash_clmulni_intel snd_pcm aesni_intel aes_x86_64
lrw gf128mul glue_helper ablk_helper snd_page_alloc cryptd snd_seq_midi
snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd mei_me
psmouse soundcore lpc_ich mei serio_raw mac_hid nls_iso8859_1 coretemp lp
parport usbhid hid i915 microcode i2c_algo_bit drm_kms_helper ahci drm
libahci wmi video
[ 2351.274918] CPU: 1 PID: 0 Comm: swapper/1 Tainted: GF C O
3.11.0-19-generic #33-Ubuntu
[ 2351.274920] Hardware name: LENOVO 20175/INVALID, BIOS 66CN55WW 02/28/2013
[ 2351.274922] ffff88022f254580 ffff88022f243680 ffffffff816e87f1
ffff880223af2000
[ 2351.274926] ffff88022f243690 ffffffff816e2af1 ffff88022f2436f0
ffffffff816ed8d1
[ 2351.274929] ffff880223af3fd8 0000000000014580 ffff880223af3fd8
0000000000014580
[ 2351.274933] Call Trace:
[ 2351.274935] <IRQ> [<ffffffff816e87f1>] dump_stack+0x45/0x56
[ 2351.274944] [<ffffffff816e2af1>] __schedule_bug+0x4d/0x5b
[ 2351.274948] [<ffffffff816ed8d1>] __schedule+0x6d1/0x7e0
[ 2351.274953] [<ffffffff81092886>] __cond_resched+0x26/0x30
[ 2351.274956] [<ffffffff816eddda>] _cond_resched+0x3a/0x50
[ 2351.274961] [<ffffffff8118e6d8>] kmem_cache_alloc_trace+0x38/0x130
[ 2351.274973] [<ffffffffa061de9c>] ? rtw_addbareq_cmd+0x5e/0x122 [8723au]
[ 2351.274981] [<ffffffffa061de9c>] rtw_addbareq_cmd+0x5e/0x122 [8723au]
[ 2351.274992] [<ffffffffa062c790>] rtw_issue_addbareq_cmd+0x177/0x198
[8723au]
[ 2351.275006] [<ffffffffa0663b9d>] rtw_dump_xframe+0x5d/0x66a [8723au]
[ 2351.275019] [<ffffffffa06644ac>] rtl8723au_hal_xmit+0x143/0x1fb [8723au]
[ 2351.275033] [<ffffffffa0657ba1>] rtw_hal_xmit+0x1c/0x1f [8723au]
[ 2351.275046] [<ffffffffa064bf5c>] rtw_xmit+0x995/0x9fc [8723au]
[ 2351.275050] [<ffffffff815e1887>] ? kfree_skbmem+0x37/0x90
[ 2351.275061] [<ffffffffa0688eb6>] rtw_xmit_entry+0x1a1/0x2ab [8723au]
[ 2351.275065] [<ffffffff815f4368>] dev_hard_start_xmit+0x318/0x560
[ 2351.275068] [<ffffffff81611966>] sch_direct_xmit+0xe6/0x1c0
[ 2351.275071] [<ffffffff815f47b1>] dev_queue_xmit+0x201/0x4c0
[ 2351.275075] [<ffffffff815fbcfb>] neigh_resolve_output+0x11b/0x220
[ 2351.275079] [<ffffffff8162af51>] ip_finish_output+0x1b1/0x3b0
[ 2351.275083] [<ffffffff8162c448>] ip_output+0x58/0x90
[ 2351.275086] [<ffffffff8162bba5>] ip_local_out+0x25/0x30
[ 2351.275090] [<ffffffff8162befd>] ip_queue_xmit+0x13d/0x3e0
[ 2351.275093] [<ffffffff81642603>] tcp_transmit_skb+0x463/0x8c0
[ 2351.275097] [<ffffffff816453b4>] tcp_send_ack+0xa4/0xf0
[ 2351.275100] [<ffffffff81640781>] tcp_rcv_state_process+0xcf1/0xd00
[ 2351.275103] [<ffffffff81649378>] tcp_v4_do_rcv+0x268/0x470
[ 2351.275106] [<ffffffff8164b497>] tcp_v4_rcv+0x777/0x790
[ 2351.275110] [<ffffffff81626aa0>] ? ip_rcv_finish+0x350/0x350
[ 2351.275113] [<ffffffff816202f4>] ? nf_hook_slow+0x74/0x130
[ 2351.275116] [<ffffffff81626aa0>] ? ip_rcv_finish+0x350/0x350
[ 2351.275119] [<ffffffff81626b54>] ip_local_deliver_finish+0xb4/0x1f0
[ 2351.275122] [<ffffffff81626e28>] ip_local_deliver+0x48/0x80
[ 2351.275125] [<ffffffff816267cd>] ip_rcv_finish+0x7d/0x350
[ 2351.275128] [<ffffffff81627094>] ip_rcv+0x234/0x370
[ 2351.275131] [<ffffffff815f2656>] __netif_receive_skb_core+0x646/0x830
[ 2351.275134] [<ffffffff815f2858>] __netif_receive_skb+0x18/0x60
[ 2351.275137] [<ffffffff815f335d>] process_backlog+0xad/0x1a0
[ 2351.275139] [<ffffffff815f2bfc>] net_rx_action+0x11c/0x230
[ 2351.275143] [<ffffffff81067477>] __do_softirq+0xf7/0x240
[ 2351.275147] [<ffffffff816fa09c>] call_softirq+0x1c/0x30
[ 2351.275151] [<ffffffff81014bf5>] do_softirq+0x55/0x90
[ 2351.275153] [<ffffffff81067755>] irq_exit+0xb5/0xc0
[ 2351.275156] [<ffffffff816fa996>] do_IRQ+0x56/0xc0
[ 2351.275160] [<ffffffff816effed>] common_interrupt+0x6d/0x6d
[ 2351.275161] <EOI> [<ffffffff815a1812>] ? cpuidle_enter_state+0x52/0xc0
[ 2351.275170] [<ffffffff815a1808>] ? cpuidle_enter_state+0x48/0xc0
[ 2351.275173] [<ffffffff815a1949>] cpuidle_idle_call+0xc9/0x210
[ 2351.275177] [<ffffffff8101bafe>] arch_cpu_idle+0xe/0x30
[ 2351.275182] [<ffffffff810b5825>] cpu_startup_entry+0xe5/0x280
[ 2351.275185] [<ffffffff8103f187>] start_secondary+0x217/0x2c0

The problem was fixed by changing two kzalloc() calls in rtw_addbareq_cmd()
from GFP_KERNEL to GFP_ATOMIC.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
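
For readers unfamiliar with why the allocation flag matters: the trace shows rtw_addbareq_cmd() being reached from the network receive softirq, and a GFP_KERNEL allocation is allowed to sleep, which is not permitted there; GFP_ATOMIC allocations never sleep. The following is only a userspace sketch of that rule, not the driver's code. Every name in it (mock_kzalloc, mock_xmit_path, the MOCK_GFP_* values, the counters) is a hypothetical stand-in invented for illustration.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's GFP_KERNEL / GFP_ATOMIC flags. */
enum mock_gfp { MOCK_GFP_KERNEL, MOCK_GFP_ATOMIC };

/* Simulated "running in softirq/atomic context" state. */
static bool in_atomic_context = false;

/* Number of simulated "scheduling while atomic" reports so far. */
static int sched_bug_count = 0;

/*
 * Mock of kzalloc(): a request that may sleep (MOCK_GFP_KERNEL) while the
 * caller is in atomic context is counted as a bug. The real kernel would
 * try to reschedule here and print the BUG line seen in the report.
 */
void *mock_kzalloc(size_t size, enum mock_gfp flags)
{
    if (flags == MOCK_GFP_KERNEL && in_atomic_context) {
        sched_bug_count++;
        fprintf(stderr, "BUG: scheduling while atomic (simulated)\n");
    }
    return calloc(1, size); /* zeroed memory, as kzalloc() returns */
}

/*
 * Simulates a transmit path that runs in atomic context and allocates a
 * command buffer. Returns how many simulated bugs the call triggered.
 */
int mock_xmit_path(enum mock_gfp flags)
{
    int bugs_before = sched_bug_count;

    in_atomic_context = true;            /* entering softirq */
    void *cmd = mock_kzalloc(64, flags); /* 64 bytes is an arbitrary size */
    in_atomic_context = false;           /* leaving softirq */

    free(cmd);
    return sched_bug_count - bugs_before;
}
```

Calling mock_xmit_path(MOCK_GFP_KERNEL) reports one simulated bug, while mock_xmit_path(MOCK_GFP_ATOMIC) reports none, which mirrors the effect of the flag change described in the commit message. The trade-off in the real kernel is that an atomic allocation is more likely to fail under memory pressure, so callers need to handle a NULL return.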
@lwfinger
Owner

Thanks, that dump helps a lot. Please test this morning's commit.

Good question about GDB. If I were not writing so often to non-coders, I would make that my .sig.

@jaredraby
Author

Thanks for that dump. I haven't had any time to do testing since the original post. I'll load it up tonight and give it a shot and I'll post back my results. Thanks again

@ckuethe

ckuethe commented Mar 20, 2014

That seems to help. I'll throw some more traffic at it today, but so far
things are looking good.

