rk3288 kernel crash #96

Open
intgyl opened this issue May 26, 2018 · 1 comment

intgyl commented May 26, 2018

<4>[ 602.623812] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.0 #24
<4>[ 602.623829] [] (unwind_backtrace+0x0/0xe0) from [] (show_stack+0x10/0x14)
<4>[ 602.623840] [] (show_stack+0x10/0x14) from [] (warn_slowpath_common+0x4c/0x68)
<4>[ 602.623849] [] (warn_slowpath_common+0x4c/0x68) from [] (warn_slowpath_fmt+0x2c/0x3c)
<4>[ 602.623858] [] (warn_slowpath_fmt+0x2c/0x3c) from [] (watchdog_check_hardlockup_other_cpu+0xd0/0xf8)
<4>[ 602.623868] [] (watchdog_check_hardlockup_other_cpu+0xd0/0xf8) from [] (watchdog_timer_fn+0x38/0x168)
<4>[ 602.623879] [] (watchdog_timer_fn+0x38/0x168) from [] (__run_hrtimer+0x1a4/0x2b8)
<4>[ 602.623889] [] (__run_hrtimer+0x1a4/0x2b8) from [] (hrtimer_interrupt+0x11c/0x278)
<4>[ 602.623902] [] (hrtimer_interrupt+0x11c/0x278) from [] (arch_timer_handler_phys+0x28/0x30)
<4>[ 602.623914] [] (arch_timer_handler_phys+0x28/0x30) from [] (handle_percpu_devid_irq+0xf8/0x1b4)
<4>[ 602.623923] [] (handle_percpu_devid_irq+0xf8/0x1b4) from [] (generic_handle_irq+0x20/0x30)
<4>[ 602.623934] [] (generic_handle_irq+0x20/0x30) from [] (handle_IRQ+0x64/0x8c)
<4>[ 602.623943] [] (handle_IRQ+0x64/0x8c) from [] (gic_handle_irq+0x38/0x5c)
<4>[ 602.623951] [] (gic_handle_irq+0x38/0x5c) from [] (__irq_svc+0x40/0x70)
<4>[ 602.623957] Exception stack(0xc0badf08 to 0xc0badf50)
<4>[ 602.623964] df00: c0badf50 00000263 819e70f4 0000008e c1ed22e0 00000000
<4>[ 602.623972] df20: 819ddcd7 0000008e c0bbefd8 c0bac000 c0bbefe8 00000000 37f5f0ec c0badf50
<4>[ 602.623978] df40: c007935c c0591728 600d0013 ffffffff
<4>[ 602.623990] [] (__irq_svc+0x40/0x70) from [] (cpuidle_enter_state+0x54/0xec)
<4>[ 602.624000] [] (cpuidle_enter_state+0x54/0xec) from [] (cpuidle_idle_call+0x16c/0x27c)
<4>[ 602.624010] [] (cpuidle_idle_call+0x16c/0x27c) from [] (arch_cpu_idle+0x8/0x38)
<4>[ 602.624020] [] (arch_cpu_idle+0x8/0x38) from [] (cpu_idle_loop+0x1b8/0x224)
<4>[ 602.624030] [] (cpu_idle_loop+0x1b8/0x224) from [] (freezing_slow_path+0x0/0x80)
<4>[ 602.624039] [] (freezing_slow_path+0x0/0x80) from [] (0xffffffff)
<4>[ 602.624045] ---[ end trace 4019208e76fa2f15 ]---
<3>[ 605.664331] INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 3, t=2102 jiffies, g=3518, c=3517, q=10352)
<6>[ 605.664348] Backtrace for cpu 3 (current):
<4>[ 605.664357] CPU: 3 PID: 470 Comm: AudioOut_2 Tainted: G W 3.10.0 #24
<4>[ 605.664379] [] (unwind_backtrace+0x0/0xe0) from [] (show_stack+0x10/0x14)
<4>[ 605.664389] [] (show_stack+0x10/0x14) from [] (smp_send_all_cpu_backtrace+0x60/0xcc)
<4>[ 605.664401] [] (smp_send_all_cpu_backtrace+0x60/0xcc) from [] (print_other_cpu_stall+0x264/0x2e0)
<4>[ 605.664411] [] (print_other_cpu_stall+0x264/0x2e0) from [] (__rcu_pending+0x90/0x190)
<4>[ 605.664421] [] (__rcu_pending+0x90/0x190) from [] (rcu_check_callbacks+0x160/0x230)
<4>[ 605.664433] [] (rcu_check_callbacks+0x160/0x230) from [] (update_process_times+0x38/0x64)
<4>[ 605.664445] [] (update_process_times+0x38/0x64) from [] (tick_sched_timer+0xac/0xe0)
<4>[ 605.664456] [] (tick_sched_timer+0xac/0xe0) from [] (__run_hrtimer+0x1a4/0x2b8)
<4>[ 605.664466] [] (__run_hrtimer+0x1a4/0x2b8) from [] (hrtimer_interrupt+0x11c/0x278)
<4>[ 605.664478] [] (hrtimer_interrupt+0x11c/0x278) from [] (arch_timer_handler_phys+0x28/0x30)
<4>[ 605.664490] [] (arch_timer_handler_phys+0x28/0x30) from [] (handle_percpu_devid_irq+0xf8/0x1b4)
<4>[ 605.664502] [] (handle_percpu_devid_irq+0xf8/0x1b4) from [] (generic_handle_irq+0x20/0x30)
<4>[ 605.664513] [] (generic_handle_irq+0x20/0x30) from [] (handle_IRQ+0x64/0x8c)
<4>[ 605.664522] [] (handle_IRQ+0x64/0x8c) from [] (gic_handle_irq+0x38/0x5c)
<4>[ 605.664531] [] (gic_handle_irq+0x38/0x5c) from [] (__irq_svc+0x40/0x70)
<4>[ 605.664551] 3e20: 00000001 c1edf100 c0bb52a8 c1eee340 c1eee344 c0bb4cf4 00000001 c0bae100
<4>[ 605.664559] 3e40: 00000004 c1ee52c0 00000001 dcd53e60 c00861d8 c00861b8 000d0113 ffffffff
<4>[ 605.664570] [] (__irq_svc+0x40/0x70) from [] (smp_call_function_many+0x25c/0x2b8)
<4>[ 605.664580] [] (smp_call_function_many+0x25c/0x2b8) from [] (smp_call_function+0x44/0x6c)
<4>[ 605.664592] [] (smp_call_function+0x44/0x6c) from [] (cpuidle_latency_notify+0x14/0x20)
<4>[ 605.664604] [] (cpuidle_latency_notify+0x14/0x20) from [] (notifier_call_chain+0x38/0x68)
<4>[ 605.664614] [] (notifier_call_chain+0x38/0x68) from [] (__blocking_notifier_call_chain+0x44/0x58)
<4>[ 605.664623] [] (__blocking_notifier_call_chain+0x44/0x58) from [] (blocking_notifier_call_chain+0x14/0x18)
<4>[ 605.664635] [] (blocking_notifier_call_chain+0x14/0x18) from [] (pm_qos_update_target+0x110/0x124)
<4>[ 605.664648] [] (pm_qos_update_target+0x110/0x124) from [] (snd_pcm_hw_params+0x2d0/0x348)
<4>[ 605.664659] [] (snd_pcm_hw_params+0x2d0/0x348) from [] (snd_pcm_common_ioctl1+0x24c/0x488)
<4>[ 605.664668] [] (snd_pcm_common_ioctl1+0x24c/0x488) from [] (snd_pcm_playback_ioctl1+0x24c/0x274)
<4>[ 605.664680] [] (snd_pcm_playback_ioctl1+0x24c/0x274) from [] (do_vfs_ioctl+0x210/0x240)
<4>[ 605.664691] [] (do_vfs_ioctl+0x210/0x240) from [] (SyS_ioctl+0x50/0x6c)
<4>[ 605.664701] [] (SyS_ioctl+0x50/0x6c) from [] (ret_fast_syscall+0x0/0x30)
<6>[ 605.664706]

intgyl commented May 26, 2018

I found this patch, but I am not sure whether it addresses the same issue:
https://patchwork.kernel.org/patch/6305191/

Kwiboo pushed a commit to Kwiboo/linux-rockchip that referenced this issue Feb 26, 2019
… fault

Userspace can ask kprobe to intercept strings at any memory address,
including an invalid kernel address. In this case, fetch_store_strlen()
would crash since it uses the general usercopy function, and user access
functions are no longer allowed to access kernel memory.

For example, we can crash the kernel by doing something as below:

$ sudo kprobe 'p:do_sys_open +0(+0(%si)):string'

[  103.620391] BUG: GPF in non-whitelisted uaccess (non-canonical address?)
[  103.622104] general protection fault: 0000 [#1] SMP PTI
[  103.623424] CPU: 10 PID: 1046 Comm: cat Not tainted 5.0.0-rc3-00130-gd73aba1-dirty #96
[  103.625321] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-2-g628b2e6-dirty-20190104_103505-linux 04/01/2014
[  103.628284] RIP: 0010:process_fetch_insn+0x1ab/0x4b0
[  103.629518] Code: 10 83 80 28 2e 00 00 01 31 d2 31 ff 48 8b 74 24 28 eb 0c 81 fa ff 0f 00 00 7f 1c 85 c0 75 18 66 66 90 0f ae e8 48 63 ca 89 f8 <8a> 0c 31 66 66 90 83 c2 01 84 c9 75 dc 89 54 24 34 89 44 24 28 48
[  103.634032] RSP: 0018:ffff88845eb37ce0 EFLAGS: 00010246
[  103.635312] RAX: 0000000000000000 RBX: ffff888456c4e5a8 RCX: 0000000000000000
[  103.637057] RDX: 0000000000000000 RSI: 2e646c2f6374652f RDI: 0000000000000000
[  103.638795] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[  103.640556] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[  103.642297] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  103.644040] FS:  0000000000000000(0000) GS:ffff88846f000000(0000) knlGS:0000000000000000
[  103.646019] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  103.647436] CR2: 00007ffc79758038 CR3: 0000000463360006 CR4: 0000000000020ee0
[  103.649147] Call Trace:
[  103.649781]  ? sched_clock_cpu+0xc/0xa0
[  103.650747]  ? do_sys_open+0x5/0x220
[  103.651635]  kprobe_trace_func+0x303/0x380
[  103.652645]  ? do_sys_open+0x5/0x220
[  103.653528]  kprobe_dispatcher+0x45/0x50
[  103.654682]  ? do_sys_open+0x1/0x220
[  103.655875]  kprobe_ftrace_handler+0x90/0xf0
[  103.657282]  ftrace_ops_assist_func+0x54/0xf0
[  103.658564]  ? __call_rcu+0x1dc/0x280
[  103.659482]  0xffffffffc00000bf
[  103.660384]  ? __ia32_sys_open+0x20/0x20
[  103.661682]  ? do_sys_open+0x1/0x220
[  103.662863]  do_sys_open+0x5/0x220
[  103.663988]  do_syscall_64+0x60/0x210
[  103.665201]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  103.666862] RIP: 0033:0x7fc22fadccdd
[  103.668034] Code: 48 89 54 24 e0 41 83 e2 40 75 32 89 f0 25 00 00 41 00 3d 00 00 41 00 74 24 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 77 33 f3 c3 66 0f 1f 84 00 00 00 00 00 48 8d 44
[  103.674029] RSP: 002b:00007ffc7972c3a8 EFLAGS: 00000287 ORIG_RAX: 0000000000000101
[  103.676512] RAX: ffffffffffffffda RBX: 0000562f86147a21 RCX: 00007fc22fadccdd
[  103.678853] RDX: 0000000000080000 RSI: 00007fc22fae1428 RDI: 00000000ffffff9c
[  103.681151] RBP: ffffffffffffffff R08: 0000000000000000 R09: 0000000000000000
[  103.683489] R10: 0000000000000000 R11: 0000000000000287 R12: 00007fc22fce90a8
[  103.685774] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
[  103.688056] Modules linked in:
[  103.689131] ---[ end trace 43792035c28984a1 ]---

This can be fixed by using probe_mem_read() instead, as it can handle faults
on kernel memory addresses, which kprobes may legitimately be asked to read.
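
A side note on the register dump above, offered as a reading aid rather than as part of the patch: RSI holds 0x2e646c2f6374652f, and the bytes of that value in little-endian order are printable ASCII. The double dereference in the probe appears to have treated the first eight bytes of a path string as a pointer. The Python sketch below decodes the value and checks whether it is a canonical x86-64 address, assuming 48-bit virtual addresses (4-level paging). The helper name `is_canonical_48` is made up for this illustration.

```python
# Reading aid for the GPF above; this is not kernel code.
RSI = 0x2E646C2F6374652F  # value copied from the register dump


def is_canonical_48(addr: int) -> bool:
    """True if bits 63..47 are all equal (assumes 48-bit virtual addresses)."""
    top = addr >> 47  # the upper 17 bits of a 64-bit address
    return top == 0 or top == (1 << 17) - 1


# x86-64 is little-endian, so the lowest byte comes first in memory.
as_text = RSI.to_bytes(8, "little").decode("ascii")

print(repr(as_text))
print(is_canonical_48(RSI))
```

The value decodes to the string "/etc/ld." and fails the canonical check, which matches the "non-canonical address?" hint in the BUG line. RSP from the same dump (0xffff88845eb37ce0) passes the check.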

Link: http://lkml.kernel.org/r/20190125151051.7381-1-changbin.du@gmail.com

Cc: stable@vger.kernel.org
Fixes: 9da3f2b ("x86/fault: BUG() when uaccess helpers fault on kernel addresses")
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
rkchrome pushed a commit that referenced this issue Jun 27, 2019
commit f16eb8a upstream.

If an SSDT overlay is loaded via ConfigFS and then unloaded, the device
we would like to have an OF modalias for is already gone. Thus, acpi_get_name()
returns no allocated buffer in that case and the kernel crashes afterwards:

 ACPI: Host-directed Dynamic ACPI Table Unload
 ads7950 spi-PRP0001:00: Dropping the link to regulator.0
 BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
 #PF error: [normal kernel read fault]
 PGD 80000000070d6067 P4D 80000000070d6067 PUD 70d0067 PMD 0
 Oops: 0000 [#1] SMP PTI
 CPU: 0 PID: 40 Comm: kworker/u4:2 Not tainted 5.0.0+ #96
 Hardware name: Intel Corporation Merrifield/BODEGA BAY, BIOS 542 2015.01.21:18.19.48
 Workqueue: kacpi_hotplug acpi_device_del_work_fn
 RIP: 0010:create_of_modalias.isra.1+0x4c/0x150
 Code: 00 00 48 89 44 24 18 31 c0 48 8d 54 24 08 48 c7 44 24 10 00 00 00 00 48 c7 44 24 08 ff ff ff ff e8 7a b0 03 00 48 8b 4c 24 10 <0f> b6 01 84 c0 74 27 48 c7 c7 00 09 f4 a5 0f b6 f0 8d 50 20 f6 04
 RSP: 0000:ffffa51040297c10 EFLAGS: 00010246
 RAX: 0000000000001001 RBX: 0000000000000785 RCX: 0000000000000000
 RDX: 0000000000001001 RSI: 0000000000000286 RDI: ffffa2163dc042e0
 RBP: ffffa216062b1196 R08: 0000000000001001 R09: ffffa21639873000
 R10: ffffffffa606761d R11: 0000000000000001 R12: ffffa21639873218
 R13: ffffa2163deb5060 R14: ffffa216063d1010 R15: 0000000000000000
 FS:  0000000000000000(0000) GS:ffffa2163e000000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 0000000007114000 CR4: 00000000001006f0
 Call Trace:
  __acpi_device_uevent_modalias+0xb0/0x100
  spi_uevent+0xd/0x40

 ...

To fix this, let create_of_modalias() check the status returned
by acpi_get_name() and bail out in case of failure.

Fixes: 8765c5b ("ACPI / scan: Rework modalias creation when "compatible" is present")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201381
Reported-by: Ferry Toth <fntoth@gmail.com>
Tested-by: Ferry Toth <fntoth@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: 4.1+ <stable@vger.kernel.org> # 4.1+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
0lvin pushed a commit to free-z4u/roc-rk3328-cc-official that referenced this issue Sep 21, 2019
Pipe clock comes out of the phy and is available as long as
the phy is turned on. Clock controller fails to gate this
clock after the phy is turned off and generates a warning.

/ # [   33.048561] gcc_usb3_phy_pipe_clk status stuck at 'on'
[   33.048585] ------------[ cut here ]------------
[   33.052621] WARNING: CPU: 1 PID: 18 at ../drivers/clk/qcom/clk-branch.c:97 clk_branch_wait+0xf0/0x108
[   33.057384] Modules linked in:
[   33.066497] CPU: 1 PID: 18 Comm: kworker/1:0 Tainted: G        W       4.12.0-rc7-00024-gfe926e34c36d-dirty #96
[   33.069451] Hardware name: Qualcomm Technologies, Inc. DB820c (DT)
...
[   33.278565] [<ffff00000849b27c>] clk_branch_wait+0xf0/0x108
[   33.286375] [<ffff00000849b2f4>] clk_branch2_disable+0x28/0x34
[   33.291761] [<ffff0000084868dc>] clk_core_disable+0x5c/0x88
[   33.297660] [<ffff000008487d68>] clk_core_disable_lock+0x20/0x34
[   33.303129] [<ffff000008487d98>] clk_disable+0x1c/0x24
[   33.309384] [<ffff0000083ccd78>] qcom_qmp_phy_poweroff+0x20/0x48
[   33.314328] [<ffff0000083c53f4>] phy_power_off+0x80/0xdc
[   33.320492] [<ffff00000875c950>] dwc3_core_exit+0x94/0xa0
[   33.325784] [<ffff00000875c9ac>] dwc3_suspend_common+0x50/0x60
[   33.331080] [<ffff00000875ca04>] dwc3_runtime_suspend+0x48/0x6c
[   33.336810] [<ffff0000085b82f4>] pm_generic_runtime_suspend+0x28/0x38
[   33.342627] [<ffff0000085bace0>] __rpm_callback+0x150/0x254
[   33.349222] [<ffff0000085bae08>] rpm_callback+0x24/0x78
[   33.354604] [<ffff0000085b9fd8>] rpm_suspend+0xe0/0x4e4
[   33.359813] [<ffff0000085bb784>] pm_runtime_work+0xdc/0xf0
[   33.365028] [<ffff0000080d7b30>] process_one_work+0x12c/0x28c
[   33.370576] [<ffff0000080d7ce8>] worker_thread+0x58/0x3b8
[   33.376393] [<ffff0000080dd4a8>] kthread+0x100/0x12c
[   33.381776] [<ffff0000080836c0>] ret_from_fork+0x10/0x50

Fix this by disabling it as the first thing in phy_exit().

Fixes: e78f3d1 ("phy: qcom-qmp: new qmp phy driver for qcom-chipsets")
Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
Signed-off-by: Manu Gautam <mgautam@codeaurora.org>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
0lvin pushed a commit to free-z4u/roc-rk3328-cc-official that referenced this issue Oct 12, 2019
[   61.182439] UBSAN: Undefined behaviour in drivers/infiniband/hw/mlx5/qp.c:5366:34
[   61.183673] shift exponent 4294967288 is too large for 32-bit type 'unsigned int'
[   61.185530] CPU: 0 PID: 639 Comm: qp Not tainted 4.18.0-rc1-00037-g4aa1d69a9c60-dirty #96
[   61.186981] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
[   61.188315] Call Trace:
[   61.188661]  dump_stack+0xc7/0x13b
[   61.190427]  ubsan_epilogue+0x9/0x49
[   61.190899]  __ubsan_handle_shift_out_of_bounds+0x1ea/0x22f
[   61.197040]  mlx5_ib_create_wq+0x1c99/0x1d50
[   61.206632]  ib_uverbs_ex_create_wq+0x499/0x820
[   61.213892]  ib_uverbs_write+0x77e/0xae0
[   61.248018]  vfs_write+0x121/0x3b0
[   61.249831]  ksys_write+0xa1/0x120
[   61.254024]  do_syscall_64+0x7c/0x2a0
[   61.256178]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   61.259211] RIP: 0033:0x7f54bab70e99
[   61.262125] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89
[   61.268678] RSP: 002b:00007ffe1541c318 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[   61.271076] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f54bab70e99
[   61.273795] RDX: 0000000000000070 RSI: 0000000020000240 RDI: 0000000000000003
[   61.276982] RBP: 00007ffe1541c330 R08: 00000000200078e0 R09: 0000000000000002
[   61.280035] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004005c0
[   61.283279] R13: 00007ffe1541c420 R14: 0000000000000000 R15: 0000000000000000
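
The exponent in the UBSAN report, 4294967288, equals 2^32 - 8, which is what -8 looks like when stored in an unsigned 32-bit variable. One plausible reading (an assumption, since the commit message does not spell out the cause) is that a subtraction on user-supplied values underflowed before the result was used as a shift count. The Python sketch below reproduces the arithmetic and shows a range check of the kind that would reject such a value. `as_u32` and `checked_shift` are hypothetical names, not taken from the mlx5 driver.

```python
# Arithmetic illustration only; names are hypothetical, not from the mlx5 driver.
U32_MASK = 0xFFFFFFFF


def as_u32(value: int) -> int:
    """Wrap a Python int the way a 32-bit unsigned C variable would."""
    return value & U32_MASK


def checked_shift(base: int, exponent: int, width: int = 32) -> int:
    """Shift left only when the exponent is valid for the given width."""
    if not 0 <= exponent < width:
        raise ValueError(
            f"shift exponent {exponent} out of range for {width}-bit type"
        )
    return (base << exponent) & ((1 << width) - 1)


print(as_u32(-8))           # the exponent reported by UBSAN
print(checked_shift(1, 6))  # a valid shift goes through
```

Passing `as_u32(-8)` to `checked_shift` raises `ValueError` instead of performing the shift, which is the behaviour a validation step on the user-supplied size would need.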

Cc: <stable@vger.kernel.org> # 4.7
Fixes: 79b20a6 ("IB/mlx5: Add receive Work Queue verbs")
Cc: syzkaller <syzkaller@googlegroups.com>
Reported-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
scpcom pushed a commit to scpcom/linux that referenced this issue Apr 16, 2020
[ Upstream commit 65de65d ]

The IFF_BONDING flag marks a bonding master or a bonding slave device.
->ndo_add_slave() sets the IFF_BONDING flag and ->ndo_del_slave() clears
it.

bond0<--bond1

Both bond0 and bond1 are bonding devices and should keep the
IFF_BONDING flag until they are removed.
But bond1 would lose IFF_BONDING at ->ndo_del_slave() because that routine
does not check whether the slave device is itself of the bonding type.
This patch adds an interface type check before the IFF_BONDING flag is
removed.
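
The flag handling described above can be modelled in a few lines. This is a user-space Python sketch under stated assumptions, not the bonding driver's code: the `NetDev` class, the `is_bond` field, the function names and the bit value chosen for `IFF_BONDING` are all illustrative.

```python
# User-space model of the flag handling; not the bonding driver's code.
from dataclasses import dataclass

IFF_BONDING = 0x1  # illustrative bit value, not the kernel's definition


@dataclass
class NetDev:
    name: str
    is_bond: bool = False  # True for a device created with "type bond"
    priv_flags: int = 0


def new_bond(name: str) -> NetDev:
    # A bonding device carries IFF_BONDING from creation.
    return NetDev(name, is_bond=True, priv_flags=IFF_BONDING)


def add_slave(slave: NetDev) -> None:
    slave.priv_flags |= IFF_BONDING


def del_slave_unchecked(slave: NetDev) -> None:
    # Behaviour described as buggy: the flag is cleared unconditionally.
    slave.priv_flags &= ~IFF_BONDING


def del_slave_checked(slave: NetDev) -> None:
    # Behaviour described as the fix: keep the flag on bonding-type devices.
    if not slave.is_bond:
        slave.priv_flags &= ~IFF_BONDING


def has_flag_after_release(del_slave) -> bool:
    bond1 = new_bond("bond1")
    add_slave(bond1)   # models "ip link set bond1 master bond0"
    del_slave(bond1)   # models "ip link set bond1 nomaster"
    return bool(bond1.priv_flags & IFF_BONDING)


print(has_flag_after_release(del_slave_unchecked))
print(has_flag_after_release(del_slave_checked))
```

With the unchecked variant bond1 ends up without the flag even though it is still a bonding device. With the checked variant only non-bonding slaves lose it.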

Test commands:
    ip link add bond0 type bond
    ip link add bond1 type bond
    ip link set bond1 master bond0
    ip link set bond1 nomaster
    ip link del bond1 type bond
    ip link add bond1 type bond

Splat looks like:
[  226.665555] proc_dir_entry 'bonding/bond1' already registered
[  226.666440] WARNING: CPU: 0 PID: 737 at fs/proc/generic.c:361 proc_register+0x2a9/0x3e0
[  226.667571] Modules linked in: bonding af_packet sch_fq_codel ip_tables x_tables unix
[  226.668662] CPU: 0 PID: 737 Comm: ip Not tainted 5.4.0-rc3+ #96
[  226.669508] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  226.670652] RIP: 0010:proc_register+0x2a9/0x3e0
[  226.671612] Code: 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 39 01 00 00 48 8b 04 24 48 89 ea 48 c7 c7 a0 0b 14 9f 48 8b b0 e0 00 00 00 e8 07 e7 88 ff <0f> 0b 48 c7 c7 40 2d a5 9f e8 59 d6 23 01 48 8b 4c 24 10 48 b8 00
[  226.675007] RSP: 0018:ffff888050e17078 EFLAGS: 00010282
[  226.675761] RAX: dffffc0000000008 RBX: ffff88805fdd0f10 RCX: ffffffff9dd344e2
[  226.676757] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff88806c9f6b8c
[  226.677751] RBP: ffff8880507160f3 R08: ffffed100d940019 R09: ffffed100d940019
[  226.678761] R10: 0000000000000001 R11: ffffed100d940018 R12: ffff888050716008
[  226.679757] R13: ffff8880507160f2 R14: dffffc0000000000 R15: ffffed100a0e2c1e
[  226.680758] FS:  00007fdc217cc0c0(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000
[  226.681886] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  226.682719] CR2: 00007f49313424d0 CR3: 0000000050e46001 CR4: 00000000000606f0
[  226.683727] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  226.684725] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  226.685681] Call Trace:
[  226.687089]  proc_create_seq_private+0xb3/0xf0
[  226.687778]  bond_create_proc_entry+0x1b3/0x3f0 [bonding]
[  226.691458]  bond_netdev_event+0x433/0x970 [bonding]
[  226.692139]  ? __module_text_address+0x13/0x140
[  226.692779]  notifier_call_chain+0x90/0x160
[  226.693401]  register_netdevice+0x9b3/0xd80
[  226.694010]  ? alloc_netdev_mqs+0x854/0xc10
[  226.694629]  ? netdev_change_features+0xa0/0xa0
[  226.695278]  ? rtnl_create_link+0x2ed/0xad0
[  226.695849]  bond_newlink+0x2a/0x60 [bonding]
[  226.696422]  __rtnl_newlink+0xb9f/0x11b0
[  226.696968]  ? rtnl_link_unregister+0x220/0x220
[ ... ]

Fixes: 0b680e7 ("[PATCH] bonding: Add priv_flag to avoid event mishandling")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Kwiboo pushed a commit to Kwiboo/linux-rockchip that referenced this issue Dec 30, 2020
virt_wifi_newlink() calls netdev_upper_dev_link(), which internally
takes a reference on the lower interface.

The current code does not release that reference when the lower
interface is being deleted, so the reference count leaks.

Test commands:
    ip link add dummy0 type dummy
    ip link add vw1 link dummy0 type virt_wifi
    ip link del dummy0

Splat looks like:
[  133.787526][  T788] WARNING: CPU: 1 PID: 788 at net/core/dev.c:8274 rollback_registered_many+0x835/0xc80
[  133.788355][  T788] Modules linked in: virt_wifi cfg80211 dummy team af_packet sch_fq_codel ip_tables x_tables unix
[  133.789377][  T788] CPU: 1 PID: 788 Comm: ip Not tainted 5.4.0-rc3+ #96
[  133.790069][  T788] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  133.791167][  T788] RIP: 0010:rollback_registered_many+0x835/0xc80
[  133.791906][  T788] Code: 00 4d 85 ff 0f 84 b5 fd ff ff ba c0 0c 00 00 48 89 de 4c 89 ff e8 9b 58 04 00 48 89 df e8 30
[  133.794317][  T788] RSP: 0018:ffff88805ba3f338 EFLAGS: 00010202
[  133.795080][  T788] RAX: ffff88805e57e801 RBX: ffff88805ba34000 RCX: ffffffffa9294723
[  133.796045][  T788] RDX: 1ffff1100b746816 RSI: 0000000000000008 RDI: ffffffffabcc4240
[  133.797006][  T788] RBP: ffff88805ba3f4c0 R08: fffffbfff5798849 R09: fffffbfff5798849
[  133.797993][  T788] R10: 0000000000000001 R11: fffffbfff5798848 R12: dffffc0000000000
[  133.802514][  T788] R13: ffff88805ba3f440 R14: ffff88805ba3f400 R15: ffff88805ed622c0
[  133.803237][  T788] FS:  00007f2e9608c0c0(0000) GS:ffff88806cc00000(0000) knlGS:0000000000000000
[  133.804002][  T788] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  133.804664][  T788] CR2: 00007f2e95610603 CR3: 000000005f68c004 CR4: 00000000000606e0
[  133.805363][  T788] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  133.806073][  T788] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  133.806787][  T788] Call Trace:
[  133.807069][  T788]  ? generic_xdp_install+0x310/0x310
[  133.807612][  T788]  ? lock_acquire+0x164/0x3b0
[  133.808077][  T788]  ? is_bpf_text_address+0x5/0xf0
[  133.808640][  T788]  ? deref_stack_reg+0x9c/0xd0
[  133.809138][  T788]  ? __nla_validate_parse+0x98/0x1ab0
[  133.809944][  T788]  unregister_netdevice_many.part.122+0x13/0x1b0
[  133.810599][  T788]  rtnl_delete_link+0xbc/0x100
[  133.811073][  T788]  ? rtnl_af_register+0xc0/0xc0
[  133.811672][  T788]  rtnl_dellink+0x30e/0x8a0
[  133.812205][  T788]  ? is_bpf_text_address+0x5/0xf0
[ ... ]

[  144.110530][  T788] unregister_netdevice: waiting for dummy0 to become free. Usage count = 1

This patch adds a notifier routine that deletes the upper interface before
the lower interface is deleted.
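
The leak and the ordering fix can be illustrated with a toy reference-count model. This is a Python sketch under stated assumptions, not the virt_wifi driver's code: the `Dev` class and every function name here are hypothetical, and `link_upper` only stands in for the reference taken by netdev_upper_dev_link().

```python
# Toy reference-count model; not the virt_wifi driver's code.
class Dev:
    def __init__(self, name):
        self.name = name
        self.refcnt = 0
        self.uppers = []  # devices stacked on top of this one


def link_upper(upper, lower):
    """Stand-in for netdev_upper_dev_link(): the link pins the lower device."""
    lower.uppers.append(upper)
    lower.refcnt += 1


def unlink_upper(upper, lower):
    """Drop the link and the reference it held."""
    lower.uppers.remove(upper)
    lower.refcnt -= 1


def delete_lower(lower, notify_uppers):
    """Return the usage count left over once the lower device is deleted."""
    if notify_uppers:
        # Modelled fix: tear down every upper device first.
        for upper in list(lower.uppers):
            unlink_upper(upper, lower)
    return lower.refcnt


def leftover_refs(notify_uppers):
    dummy0, vw1 = Dev("dummy0"), Dev("vw1")
    link_upper(vw1, dummy0)  # models "ip link add vw1 link dummy0 type virt_wifi"
    return delete_lower(dummy0, notify_uppers)  # models "ip link del dummy0"


print(leftover_refs(notify_uppers=False))
print(leftover_refs(notify_uppers=True))
```

Without the notifier step one reference is left behind, which corresponds to the "Usage count = 1" message in the splat. With it the count drops to zero before the lower device goes away.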

Fixes: c7cdba3 ("mac80211-next: rtnetlink wifi simulation device")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 1962f86)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I6cd61c66c92363b7f59615ceb753b91bc3965287
scpcom pushed a commit to scpcom/linux that referenced this issue Feb 6, 2021
[ Upstream commit b9ad3e9 ]

syzkaller found that with CONFIG_DEBUG_KOBJECT_RELEASE=y, releasing a
struct slave device could result in the following splat:

  kobject: 'bonding_slave' (00000000cecdd4fe): kobject_release, parent 0000000074ceb2b2 (delayed 1000)
  bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
  ------------[ cut here ]------------
  ODEBUG: free active (active state 0) object type: timer_list hint: workqueue_select_cpu_near kernel/workqueue.c:1549 [inline]
  ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x98 kernel/workqueue.c:1600
  WARNING: CPU: 1 PID: 842 at lib/debugobjects.c:485 debug_print_object+0x180/0x240 lib/debugobjects.c:485
  Kernel panic - not syncing: panic_on_warn set ...
  CPU: 1 PID: 842 Comm: kworker/u4:4 Tainted: G S                5.9.0-rc8+ #96
  Hardware name: linux,dummy-virt (DT)
  Workqueue: netns cleanup_net
  Call trace:
   dump_backtrace+0x0/0x4d8 include/linux/bitmap.h:239
   show_stack+0x34/0x48 arch/arm64/kernel/traps.c:142
   __dump_stack lib/dump_stack.c:77 [inline]
   dump_stack+0x174/0x1f8 lib/dump_stack.c:118
   panic+0x360/0x7a0 kernel/panic.c:231
   __warn+0x244/0x2ec kernel/panic.c:600
   report_bug+0x240/0x398 lib/bug.c:198
   bug_handler+0x50/0xc0 arch/arm64/kernel/traps.c:974
   call_break_hook+0x160/0x1d8 arch/arm64/kernel/debug-monitors.c:322
   brk_handler+0x30/0xc0 arch/arm64/kernel/debug-monitors.c:329
   do_debug_exception+0x184/0x340 arch/arm64/mm/fault.c:864
   el1_dbg+0x48/0xb0 arch/arm64/kernel/entry-common.c:65
   el1_sync_handler+0x170/0x1c8 arch/arm64/kernel/entry-common.c:93
   el1_sync+0x80/0x100 arch/arm64/kernel/entry.S:594
   debug_print_object+0x180/0x240 lib/debugobjects.c:485
   __debug_check_no_obj_freed lib/debugobjects.c:967 [inline]
   debug_check_no_obj_freed+0x200/0x430 lib/debugobjects.c:998
   slab_free_hook mm/slub.c:1536 [inline]
   slab_free_freelist_hook+0x190/0x210 mm/slub.c:1577
   slab_free mm/slub.c:3138 [inline]
   kfree+0x13c/0x460 mm/slub.c:4119
   bond_free_slave+0x8c/0xf8 drivers/net/bonding/bond_main.c:1492
   __bond_release_one+0xe0c/0xec8 drivers/net/bonding/bond_main.c:2190
   bond_slave_netdev_event drivers/net/bonding/bond_main.c:3309 [inline]
   bond_netdev_event+0x8f0/0xa70 drivers/net/bonding/bond_main.c:3420
   notifier_call_chain+0xf0/0x200 kernel/notifier.c:83
   __raw_notifier_call_chain kernel/notifier.c:361 [inline]
   raw_notifier_call_chain+0x44/0x58 kernel/notifier.c:368
   call_netdevice_notifiers_info+0xbc/0x150 net/core/dev.c:2033
   call_netdevice_notifiers_extack net/core/dev.c:2045 [inline]
   call_netdevice_notifiers net/core/dev.c:2059 [inline]
   rollback_registered_many+0x6a4/0xec0 net/core/dev.c:9347
   unregister_netdevice_many.part.0+0x2c/0x1c0 net/core/dev.c:10509
   unregister_netdevice_many net/core/dev.c:10508 [inline]
   default_device_exit_batch+0x294/0x338 net/core/dev.c:10992
   ops_exit_list.isra.0+0xec/0x150 net/core/net_namespace.c:189
   cleanup_net+0x44c/0x888 net/core/net_namespace.c:603
   process_one_work+0x96c/0x18c0 kernel/workqueue.c:2269
   worker_thread+0x3f0/0xc30 kernel/workqueue.c:2415
   kthread+0x390/0x498 kernel/kthread.c:292
   ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:925

This is a potential use-after-free if the sysfs nodes are being accessed
whilst removing the struct slave, so wait for the object destruction to
complete before freeing the struct slave itself.

Fixes: 07699f9 ("bonding: add sysfs /slave dir for bond slave devices.")
Fixes: a068aab ("bonding: Fix reference count leak in bond_sysfs_slave_add.")
Cc: Qiushi Wu <wu000273@umn.edu>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20201120142827.879226-1-jamie@nuviainc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kwiboo pushed a commit to Kwiboo/linux-rockchip that referenced this issue Mar 5, 2024
…del video info

list_del corruption, ffffffc028662d18->next is LIST_POISON1 (dead000000000100)
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:47!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
Modules linked in: 8822es(O) sprdbt_tty
Process CtrlThread (pid: 3697, stack limit = 0x0000000060d302a5)
CPU: 1 PID: 3697 Comm: CtrlThread Tainted: G           O      4.19.232 #96
Hardware name: Rockchip RK3528 DEMO4 DDR4 V10 Board (DT)
pstate: 40400005 (nZcv daif +PAN -UAO)
pc : __list_del_entry_valid+0x64/0xb0
lr : __list_del_entry_valid+0x64/0xb0
sp : ffffff800fd1bc70
x29: ffffff800fd1bc70 x28: ffffffc05c468000
x27: 0000000000000000 x26: 0000000000000000
x25: 0000000046000000 x24: 0000000000000011
x23: ffffff800fd1be60 x22: ffffff80098188a0
x21: ffffff8009818000 x20: ffffffc0462af700
x19: ffffffc028662d00 x18: ffffffffffffffff
x17: 0000000000000000 x16: 0000000000000000
x15: ffffff800934a980 x14: 4f53494f505f5453
x13: 494c207369207478 x12: 656e3e2d38316432
x11: 3636383230636666 x10: 66666666202c6e6f
x9 : 6974707572726f63 x8 : 3030303030303030
x7 : 0000000000000058 x6 : ffffffc07f74aa18
x5 : ffffffc07f74aa18 x4 : 0000000000000000
x3 : ffffffc07f753908 x2 : ac674fb1e4701200
x1 : 0000000000000000 x0 : 000000000000004e
Call trace:
 __list_del_entry_valid+0x64/0xb0
 rockchip_update_system_status+0x168/0x250
 status_store+0x1c/0x38
 kobj_attr_store+0x14/0x28
 sysfs_kf_write+0x48/0x58
 kernfs_fop_write+0xf4/0x220
 __vfs_write+0x34/0x158
 vfs_write+0xb0/0x1d0
 ksys_write+0x64/0xe0
 __arm64_sys_write+0x14/0x20
 el0_svc_common.constprop.0+0x64/0x178
 el0_svc_compat_handler+0x18/0x20
 el0_svc_compat+0x8/0x34

Signed-off-by: Finley Xiao <finley.xiao@rock-chips.com>
Change-Id: I42e9c42d7e65c742226f82b9367466b2ed86550d