Kernel crash using 4.4.16 #1587

ezar · 2016-08-07T05:52:40Z

[24480.835708] INFO: task kworker/u8:2:20204 blocked for more than 120 seconds.
[24480.840274]       Not tainted 4.4.16-v7+ #899
[24480.844783] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[24480.849397] kworker/u8:2    D 805b527c     0 20204      2 0x00000000
[24480.853950] Workqueue: kmmcd mmc_rescan
[24480.858484] [<805b527c>] (__schedule) from [<805b57f4>] (schedule+0x50/0xa8)
[24480.863044] [<805b57f4>] (schedule) from [<8046bb38>] (__mmc_claim_host+0xb8/0x1cc)
[24480.867619] [<8046bb38>] (__mmc_claim_host) from [<8046bc7c>] (mmc_get_card+0x30/0x34)
[24480.872141] [<8046bc7c>] (mmc_get_card) from [<80473d50>] (mmc_sd_detect+0x2c/0x80)
[24480.876681] [<80473d50>] (mmc_sd_detect) from [<8046e28c>] (mmc_rescan+0xc8/0x324)
[24480.881210] [<8046e28c>] (mmc_rescan) from [<8003c644>] (process_one_work+0x154/0x458)
[24480.885715] [<8003c644>] (process_one_work) from [<8003c99c>] (worker_thread+0x54/0x500)
[24480.890224] [<8003c99c>] (worker_thread) from [<80042678>] (kthread+0xec/0x104)
[24480.894710] [<80042678>] (kthread) from [<8000fbc8>] (ret_from_fork+0x14/0x2c)
[24600.896680] INFO: task kworker/u8:2:20204 blocked for more than 120 seconds.
[24600.900070]       Not tainted 4.4.16-v7+ #899
[24600.903380] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[24600.906743] kworker/u8:2    D 805b527c     0 20204      2 0x00000000
[24600.910093] Workqueue: kmmcd mmc_rescan
[24600.913416] [<805b527c>] (__schedule) from [<805b57f4>] (schedule+0x50/0xa8)
[24600.916767] [<805b57f4>] (schedule) from [<8046bb38>] (__mmc_claim_host+0xb8/0x1cc)
[24600.920101] [<8046bb38>] (__mmc_claim_host) from [<8046bc7c>] (mmc_get_card+0x30/0x34)
[24600.923416] [<8046bc7c>] (mmc_get_card) from [<80473d50>] (mmc_sd_detect+0x2c/0x80)
[24600.926728] [<80473d50>] (mmc_sd_detect) from [<8046e28c>] (mmc_rescan+0xc8/0x324)
[24600.930040] [<8046e28c>] (mmc_rescan) from [<8003c644>] (process_one_work+0x154/0x458)
[24600.933328] [<8003c644>] (process_one_work) from [<8003c99c>] (worker_thread+0x54/0x500)
[24600.936636] [<8003c99c>] (worker_thread) from [<80042678>] (kthread+0xec/0x104)
[24600.939926] [<80042678>] (kthread) from [<8000fbc8>] (ret_from_fork+0x14/0x2c)

ezar · 2016-08-09T08:12:37Z

Nothing?

Ruffio · 2016-08-09T08:17:42Z

I think you need to explain in detail what happened. What were you (or the program) doing when it occurred? Can you reproduce it, and if so, how? Have you changed any configuration? Which OS are you running? Raspbian?

ezar · 2016-08-09T14:13:35Z

I'll try to answer all of these questions:

I can't reproduce it. It occurs sporadically, once or twice a day.
I think it started when I updated the kernel.

I'm using Raspbian Lite.

x29a · 2016-08-10T10:07:49Z

Which Raspberry Pi is this on, and which storage card do you use?

ezar · 2016-08-10T12:59:47Z

Pi3

Samsung Evo MB-MP16DA/EU
man:0x00001b oem:0x534d name:00000 hwrev:0x1 fwrev:0x0
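
For anyone comparing cards: in the SD card ID register, the oem field is two ASCII characters packed into a 16-bit value. A minimal sketch of decoding the value quoted above; the helper name `decode_oemid` is made up for this illustration and is not part of any tool mentioned in this thread.

```python
def decode_oemid(oemid_hex: str) -> str:
    """Decode a 16-bit SD OEM ID such as "0x534d" into its two ASCII characters.

    Hypothetical helper for illustration only.
    """
    value = int(oemid_hex, 16)          # int() accepts the "0x" prefix with base 16
    raw = value.to_bytes(2, "big")      # high byte is the first character
    return raw.decode("ascii", errors="replace")


print(decode_oemid("0x534d"))
```

For the card above this prints `SM`.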

bcutter · 2016-11-01T20:48:53Z

I also saw this a few days ago, according to /var/log/kern.log(.1):

Oct 29 12:37:07 RPi1Raspbian kernel: [4461281.508821] INFO: task kworker/u8:1:1675 blocked for more than 120 seconds.
Oct 29 12:37:10 RPi1Raspbian kernel: [4461281.516071]       Not tainted 4.4.13-v7+ #894
Oct 29 12:37:10 RPi1Raspbian kernel: [4461281.520720] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 29 12:37:10 RPi1Raspbian kernel: [4461281.528830] kworker/u8:1    D 805b4bbc     0  1675      2 0x00000000
Oct 29 12:37:10 RPi1Raspbian kernel: [4461281.535471] Workqueue: kmmcd mmc_rescan
Oct 29 12:37:10 RPi1Raspbian kernel: [4461281.539571] [<805b4bbc>] (__schedule) from [<805b5134>] (schedule+0x50/0xa8)
Oct 29 12:37:10 RPi1Raspbian kernel: [4461281.546892] [<805b5134>] (schedule) from [<8046ba94>] (__mmc_claim_host+0xb8/0x1cc)
Oct 29 12:37:10 RPi1Raspbian kernel: [4461281.554828] [<8046ba94>] (__mmc_claim_host) from [<8046bbd8>] (mmc_get_card+0x30/0x34)
Oct 29 12:37:10 RPi1Raspbian kernel: [4461281.563056] [<8046bbd8>] (mmc_get_card) from [<80473cac>] (mmc_sd_detect+0x2c/0x80)
Oct 29 12:37:10 RPi1Raspbian kernel: [4461281.571018] [<80473cac>] (mmc_sd_detect) from [<8046e1e8>] (mmc_rescan+0xc8/0x324)
Oct 29 12:37:10 RPi1Raspbian kernel: [4461281.578888] [<8046e1e8>] (mmc_rescan) from [<8003c644>] (process_one_work+0x154/0x458)
Oct 29 12:37:10 RPi1Raspbian kernel: [4461281.587076] [<8003c644>] (process_one_work) from [<8003c99c>] (worker_thread+0x54/0x500)
Oct 29 12:37:10 RPi1Raspbian kernel: [4461281.595448] [<8003c99c>] (worker_thread) from [<80042678>] (kthread+0xec/0x104)
Oct 29 12:37:10 RPi1Raspbian kernel: [4461281.603042] [<80042678>] (kthread) from [<8000fbc8>] (ret_from_fork+0x14/0x2c)

It only happened ONCE, and I'm not sure whether it was related to package updates (I can't cross-check against "/var/log/apt/history.log" because it's empty). I think there were some kernel-related updates recently, weren't there?

Ruffio · 2017-02-05T11:12:00Z

@ezar is this still an issue?

JamesH65 · 2017-05-18T12:13:30Z

Closing due to lack of activity. Reopen if you feel this issue is still relevant.

mtausk · 2018-05-17T17:13:10Z

Hello. Possibly still relevant.
Linux pi 4.14.17-v7+ #1090 SMP Mon Feb 5 21:02:18 GMT 2018 armv7l armv7l armv7l GNU/Linux
Ubuntu 16.04

May 17 18:48:11 kernel: [ 6141.921740] INFO: task kworker/u8:0:5 blocked for more than 120 seconds.
May 17 18:48:11 kernel: [ 6141.921764]       Tainted: G         C      4.14.17-v7+ #1090
May 17 18:48:11 kernel: [ 6141.921774] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 17 18:48:11 kernel: [ 6141.921785] kworker/u8:0    D    0     5      2 0x00000000
May 17 18:48:11 kernel: [ 6141.921812] Workqueue: writeback wb_workfn (flush-8:1-fuseblk)
May 17 18:48:11 kernel: [ 6141.921849] [<807762c0>] (__schedule) from [<80776938>] (schedule+0x50/0xa8)
May 17 18:48:11 kernel: [ 6141.921868] [<80776938>] (schedule) from [<8014ac8c>] (io_schedule+0x20/0x40)
May 17 18:48:11 kernel: [ 6141.921887] [<8014ac8c>] (io_schedule) from [<8021c4e4>] (__lock_page+0x100/0x128)
May 17 18:48:11 kernel: [ 6141.921906] [<8021c4e4>] (__lock_page) from [<8022bb78>] (write_cache_pages+0x30c/0x4d4)
May 17 18:48:11 kernel: [ 6141.921985] [<8022bb78>] (write_cache_pages) from [<7f4ba618>] (fuse_writepages+0x80/0xe4 [fuse])
May 17 18:48:11 kernel: [ 6141.922075] [<7f4ba618>] (fuse_writepages [fuse]) from [<8022e284>] (do_writepages+0x30/0x8c)
May 17 18:48:11 kernel: [ 6141.922094] [<8022e284>] (do_writepages) from [<802ba854>] (__writeback_single_inode+0x44/0x430)
May 17 18:48:11 kernel: [ 6141.922111] [<802ba854>] (__writeback_single_inode) from [<802bb148>] (writeback_sb_inodes+0x20c/0x4c4)
May 17 18:48:11 kernel: [ 6141.922126] [<802bb148>] (writeback_sb_inodes) from [<802bb490>] (__writeback_inodes_wb+0x90/0xd0)
May 17 18:48:11 kernel: [ 6141.922141] [<802bb490>] (__writeback_inodes_wb) from [<802bb714>] (wb_writeback+0x244/0x358)
May 17 18:48:11 kernel: [ 6141.922155] [<802bb714>] (wb_writeback) from [<802bc2ac>] (wb_workfn+0x36c/0x4d8)
May 17 18:48:11 kernel: [ 6141.922173] [<802bc2ac>] (wb_workfn) from [<801370d8>] (process_one_work+0x158/0x454)
May 17 18:48:11 kernel: [ 6141.922192] [<801370d8>] (process_one_work) from [<80137438>] (worker_thread+0x64/0x5b8)
May 17 18:48:11 kernel: [ 6141.922209] [<80137438>] (worker_thread) from [<8013d4a8>] (kthread+0x13c/0x16c)
May 17 18:48:11 kernel: [ 6141.922226] [<8013d4a8>] (kthread) from [<8010812c>] (ret_from_fork+0x14/0x28)
May 17 18:48:11 kernel: [ 6141.922313] INFO: task atop:671 blocked for more than 120 seconds.
May 17 18:48:11 kernel: [ 6141.922326]       Tainted: G         C      4.14.17-v7+ #1090
May 17 18:48:11 kernel: [ 6141.922335] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 17 18:48:11 kernel: [ 6141.922344] atop            D    0   671      1 0x00000000
May 17 18:48:11 kernel: [ 6141.922371] [<807762c0>] (__schedule) from [<80776938>] (schedule+0x50/0xa8)
May 17 18:48:11 kernel: [ 6141.922389] [<80776938>] (schedule) from [<80779a38>] (rwsem_down_read_failed+0x10c/0x15c)
May 17 18:48:11 kernel: [ 6141.922404] [<80779a38>] (rwsem_down_read_failed) from [<807790cc>] (down_read+0x5c/0x60)
May 17 18:48:11 rsyslogd-2007: action 'action 10' suspended, next retry is Thu May 17 18:49:41 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
May 17 18:48:11 kernel: [ 6141.922421] [<807790cc>] (down_read) from [<802f8468>] (proc_pid_cmdline_read+0xd4/0x4fc)
May 17 18:48:11 kernel: [ 6141.922441] [<802f8468>] (proc_pid_cmdline_read) from [<80289d74>] (__vfs_read+0x38/0x130)
May 17 18:48:11 kernel: [ 6141.922458] [<80289d74>] (__vfs_read) from [<80289f08>] (vfs_read+0x9c/0x168)
May 17 18:48:11 kernel: [ 6141.922473] [<80289f08>] (vfs_read) from [<8028a4b8>] (SyS_read+0x54/0xb0)
May 17 18:48:11 kernel: [ 6141.922489] [<8028a4b8>] (SyS_read) from [<80108080>] (ret_fast_syscall+0x0/0x28)
May 17 18:48:11 kernel: [ 6141.922655] INFO: task tmux:5348 blocked for more than 120 seconds.
May 17 18:48:11 kernel: [ 6141.922667]       Tainted: G         C      4.14.17-v7+ #1090
May 17 18:48:11 kernel: [ 6141.922676] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 17 18:48:11 kernel: [ 6141.922685] tmux            D    0  5348      1 0x00000000
May 17 18:48:11 kernel: [ 6141.922711] [<807762c0>] (__schedule) from [<80776938>] (schedule+0x50/0xa8)
May 17 18:48:11 kernel: [ 6141.922727] [<80776938>] (schedule) from [<80779a38>] (rwsem_down_read_failed+0x10c/0x15c)
May 17 18:48:11 kernel: [ 6141.922741] [<80779a38>] (rwsem_down_read_failed) from [<807790cc>] (down_read+0x5c/0x60)
May 17 18:48:11 kernel: [ 6141.922758] [<807790cc>] (down_read) from [<802f8468>] (proc_pid_cmdline_read+0xd4/0x4fc)
May 17 18:48:11 kernel: [ 6141.922775] [<802f8468>] (proc_pid_cmdline_read) from [<80289d74>] (__vfs_read+0x38/0x130)
May 17 18:48:11 kernel: [ 6141.922790] [<80289d74>] (__vfs_read) from [<80289f08>] (vfs_read+0x9c/0x168)
May 17 18:48:11 kernel: [ 6141.922805] [<80289f08>] (vfs_read) from [<8028a4b8>] (SyS_read+0x54/0xb0)
May 17 18:48:11 kernel: [ 6141.922820] [<8028a4b8>] (SyS_read) from [<80108080>] (ret_fast_syscall+0x0/0x28)
May 17 18:48:11 kernel: [ 6141.922835] INFO: task rtorrent main:6557 blocked for more than 120 seconds.
May 17 18:48:11 kernel: [ 6141.922846]       Tainted: G         C      4.14.17-v7+ #1090
May 17 18:48:11 kernel: [ 6141.922855] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

pelwell · 2018-05-17T17:17:40Z

What's the connection? The symptom - a hung task - can have many different causes.
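
As a rough illustration of that point, the hung-task reports in this thread can be compared by where each task is actually waiting. The sketch below is not an existing tool: the names `blocking_sites`, `HUNG_RE`, `FRAME_RE` and `SCHED_FRAMES` are invented here, and treating `__schedule`, `schedule` and `io_schedule` as uninteresting scheduler frames is an assumption. It assumes the ARM-style `(callee) from [<addr>] (caller+off/len)` frame format seen in the logs above.

```python
import re

# "INFO: task <name>:<pid> blocked for more than N seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<task>.+):(?P<pid>\d+) blocked for more than \d+ seconds"
)
# "[<addr>] (callee) from [<addr>] (caller+0x../0x.. [module])"
FRAME_RE = re.compile(
    r"\((?P<callee>[^)]+)\) from \[<[0-9a-f]+>\] \((?P<caller>[^)]+)\)"
)
# Frames assumed to be generic scheduler entry points (an assumption).
SCHED_FRAMES = {"__schedule", "schedule", "io_schedule"}


def blocking_sites(log_text):
    """Return (task, first non-scheduler caller) for each hung-task report."""
    results = []
    current = None
    for line in log_text.splitlines():
        hung = HUNG_RE.search(line)
        if hung:
            current = hung.group("task")
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            name = frame.group("caller").split("+")[0]
            if name not in SCHED_FRAMES:
                results.append((current, name))
                current = None  # only the first interesting frame per report
    return results


sample = """\
[24480.835708] INFO: task kworker/u8:2:20204 blocked for more than 120 seconds.
[24480.858484] [<805b527c>] (__schedule) from [<805b57f4>] (schedule+0x50/0xa8)
[24480.863044] [<805b57f4>] (schedule) from [<8046bb38>] (__mmc_claim_host+0xb8/0x1cc)
[24480.867619] [<8046bb38>] (__mmc_claim_host) from [<8046bc7c>] (mmc_get_card+0x30/0x34)
May 17 18:48:11 kernel: [ 6141.921740] INFO: task kworker/u8:0:5 blocked for more than 120 seconds.
May 17 18:48:11 kernel: [ 6141.921849] [<807762c0>] (__schedule) from [<80776938>] (schedule+0x50/0xa8)
May 17 18:48:11 kernel: [ 6141.921868] [<80776938>] (schedule) from [<8014ac8c>] (io_schedule+0x20/0x40)
May 17 18:48:11 kernel: [ 6141.921887] [<8014ac8c>] (io_schedule) from [<8021c4e4>] (__lock_page+0x100/0x128)
"""

for task, site in blocking_sites(sample):
    print(f"{task}: waiting in {site}")
```

On these excerpts the 4.4.16 report is waiting in `__mmc_claim_host` (the MMC rescan path), while the 4.14.17 report is waiting in `__lock_page` (FUSE writeback), so the two reports share only the hung-task message.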

JamesH65 closed this as completed May 18, 2017

mtausk mentioned this issue May 17, 2018

Kernel issue? #2557

Closed

ayasystems mentioned this issue Oct 11, 2022

raspberrypi-clk: Failed to get / change pllb frequency: -12 #3479

Open
