Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

error :modules/wifi/usi-bcm4329/v4.218.248.15/open-src/src/dhd/linux/dhd-cdc-sdmmc-gpl-3.0.31+/wl_iw.c #23

Closed
huaixiaoz opened this issue May 20, 2012 · 2 comments

Comments

@huaixiaoz
Copy link

CC [M]  /ics/allwinner/linux-allwinner/modules/wifi/usi-bcm4329/v4.218.248.15/open-src/src/dhd/linux/dhd-cdc-sdmmc-gpl-3.0.31+/wl_iw.o
/ics/allwinner/linux-allwinner/modules/wifi/usi-bcm4329/v4.218.248.15/open-src/src/dhd/linux/dhd-cdc-sdmmc-gpl-3.0.31+/wl_iw.c: In function ‘wl_iw_set_pmksa’:
/ics/allwinner/linux-allwinner/modules/wifi/usi-bcm4329/v4.218.248.15/open-src/src/dhd/linux/dhd-cdc-sdmmc-gpl-3.0.31+/wl_iw.c:5203:5: error: array subscript is above array bounds [-Werror=array-bounds]
/ics/allwinner/linux-allwinner/modules/wifi/usi-bcm4329/v4.218.248.15/open-src/src/dhd/linux/dhd-cdc-sdmmc-gpl-3.0.31+/wl_iw.c:5206:5: error: array subscript is above array bounds [-Werror=array-bounds]
cc1: all warnings being treated as errors

make[4]: *** [/ics/allwinner/linux-allwinner/modules/wifi/usi-bcm4329/v4.218.248.15/open-src/src/dhd/linux/dhd-cdc-sdmmc-gpl-3.0.31+/wl_iw.o] Error 1
make[3]: *** [_module_/ics/allwinner/linux-allwinner/modules/wifi/usi-bcm4329/v4.218.248.15/open-src/src/dhd/linux/dhd-cdc-sdmmc-gpl-3.0.31+] Error 2
make[3]: Leaving directory `/ics/allwinner/linux-allwinner'
make[2]: *** [modules] Error 2
make[2]: Leaving directory `/ics/allwinner/linux-allwinner/modules/wifi/usi-bcm4329/v4.218.248.15/open-src/src/dhd/linux/dhd-cdc-sdmmc-gpl-3.0.31+'
make[1]: *** [objdir] Error 2
make[1]: Leaving directory `/ics/allwinner/linux-allwinner/modules/wifi/usi-bcm4329/v4.218.248.15/open-src/src/dhd/linux'
make: *** [dhd-cdc-sdmmc-gpl] Error 2
@huaixiaoz
Copy link
Author

and i think the same error whith the last time ,
it's bcm4329 ugly driver ,
the file modules/wifi/usi-bcm4329/v4.218.248.15/open-src/src/dhd/linux/dhd-cdc-sdmmc-gpl-3.0.31+/wl_iw.c:5206:5: error: array subscript is above array bounds [-Werror=array-bounds]
you can check it ,
modules/wifi/usi-bcm4329/v4.218.248.15/open-src/src/dhd/linux/dhd-cdc-sdmmc-gpl-3.0.31+/wl_iw.c:5206:5
modules/wifi/usi-bcm4329/v4.218.248.15/open-src/src/dhd/linux/dhd-cdc-sdmmc-gpl-3.0.31+/wl_iw.c:5203:5
i changed bcopy(&pmkid_list.pmkids.pmkid[i+1].BSSID,
to bcopy(&pmkid_list.pmkids.pmkid[i].BSSID,
And passed , is it right or not ?

@turl
Copy link

turl commented May 21, 2012

Related: amery@8ac9318

@amery amery closed this as completed May 22, 2012
amery pushed a commit that referenced this issue Jun 16, 2012
The warning below triggers on AMD MCM packages because physical package
IDs on the cores of a _physical_ socket are the same. I.e., this field
says which CPUs belong to the same physical package.

However, the same two CPUs belong to two different internal, i.e.
"logical" nodes in the same physical socket which is reflected in the
CPU-to-node map on x86 with NUMA.

Which makes this check wrong on the above topologies so circumvent it.

[    0.444413] Booting Node   0, Processors  #1 #2 #3 #4 #5 Ok.
[    0.461388] ------------[ cut here ]------------
[    0.465997] WARNING: at arch/x86/kernel/smpboot.c:310 topology_sane.clone.1+0x6e/0x81()
[    0.473960] Hardware name: Dinar
[    0.477170] sched: CPU #6's mc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
[    0.486860] Booting Node   1, Processors  #6
[    0.491104] Modules linked in:
[    0.494141] Pid: 0, comm: swapper/6 Not tainted 3.4.0+ #1
[    0.499510] Call Trace:
[    0.501946]  [<ffffffff8144bf92>] ? topology_sane.clone.1+0x6e/0x81
[    0.508185]  [<ffffffff8102f1fc>] warn_slowpath_common+0x85/0x9d
[    0.514163]  [<ffffffff8102f2b7>] warn_slowpath_fmt+0x46/0x48
[    0.519881]  [<ffffffff8144bf92>] topology_sane.clone.1+0x6e/0x81
[    0.525943]  [<ffffffff8144c234>] set_cpu_sibling_map+0x251/0x371
[    0.532004]  [<ffffffff8144c4ee>] start_secondary+0x19a/0x218
[    0.537729] ---[ end trace 4eaa2a86a8e2da22 ]---
[    0.628197]  #7 #8 #9 #10 #11 Ok.
[    0.807108] Booting Node   3, Processors  #12 #13 #14 #15 #16 #17 Ok.
[    0.897587] Booting Node   2, Processors  #18 #19 #20 #21 #22 #23 Ok.
[    0.917443] Brought up 24 CPUs

We ran a topology sanity check test we have here on it and
it all looks ok... hopefully :).

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120529135442.GE29157@aftab.osrc.amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
amery pushed a commit that referenced this issue Sep 8, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
amery pushed a commit that referenced this issue Oct 6, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
amery pushed a commit that referenced this issue Oct 13, 2012
…d reasons

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
amery pushed a commit that referenced this issue Apr 30, 2013
…failed

This bug could be triggered if 1st interface configuration fails:
Apr  8 18:20:30 homeserver kernel: usb 5-1: new low-speed USB device number 2 using ohci_hcd
Apr  8 18:20:30 homeserver kernel: input: iMON Panel, Knob and Mouse(15c2:0036) as /devices/pci0000:00/0000:00:13.0/usb5/5-1/5-1:1.0/input/input2
Apr  8 18:20:30 homeserver kernel: Registered IR keymap rc-imon-pad
Apr  8 18:20:30 homeserver kernel: input: iMON Remote (15c2:0036) as /devices/pci0000:00/0000:00:13.0/usb5/5-1/5-1:1.0/rc/rc0/input3
Apr  8 18:20:30 homeserver kernel: rc0: iMON Remote (15c2:0036) as /devices/pci0000:00/0000:00:13.0/usb5/5-1/5-1:1.0/rc/rc0
Apr  8 18:20:30 homeserver kernel: imon:send_packet: packet tx failed (-32)
Apr  8 18:20:30 homeserver kernel: imon 5-1:1.0: remote input dev register failed
Apr  8 18:20:30 homeserver kernel: imon 5-1:1.0: imon_init_intf0: rc device setup failed
Apr  8 18:20:30 homeserver kernel: imon 5-1:1.0: unable to initialize intf0, err 0
Apr  8 18:20:30 homeserver kernel: imon:imon_probe: failed to initialize context!
Apr  8 18:20:30 homeserver kernel: imon 5-1:1.0: unable to register, err -19
Apr  8 18:20:30 homeserver kernel: BUG: unable to handle kernel NULL pointer dereference at 00000014
Apr  8 18:20:30 homeserver kernel: IP: [<c05c4e4c>] mutex_lock+0xc/0x30
Apr  8 18:20:30 homeserver kernel: *pde = 00000000
Apr  8 18:20:30 homeserver kernel: Oops: 0002 [#1] PREEMPT SMP
Apr  8 18:20:30 homeserver kernel: Modules linked in:
Apr  8 18:20:30 homeserver kernel: Pid: 367, comm: khubd Not tainted 3.8.3-htpc-00002-g79b1403 #23 Unknow Unknow/RS780-SB700
Apr  8 18:20:30 homeserver kernel: EIP: 0060:[<c05c4e4c>] EFLAGS: 00010296 CPU: 1
Apr  8 18:20:30 homeserver kernel: EIP is at mutex_lock+0xc/0x30
Apr  8 18:20:30 homeserver kernel: EAX: 00000014 EBX: 00000014 ECX: 00000000 EDX: f590e480
Apr  8 18:20:30 homeserver kernel: ESI: f5deac00 EDI: f590e480 EBP: f5f3ee00 ESP: f6577c28
Apr  8 18:20:30 homeserver kernel: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Apr  8 18:20:30 homeserver kernel: CR0: 8005003b CR2: 00000014 CR3: 0081b000 CR4: 000007d0
Apr  8 18:20:30 homeserver kernel: DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
Apr  8 18:20:30 homeserver kernel: DR6: ffff0ff0 DR7: 00000400
Apr  8 18:20:30 homeserver kernel: Process khubd (pid: 367, ti=f6576000 task=f649ea00 task.ti=f6576000)
Apr  8 18:20:30 homeserver kernel: Stack:
Apr  8 18:20:30 homeserver kernel: 00000000 f5deac00 c0448de4 f59714c0 f5deac64 c03b8ad2 f6577c90 00000004
Apr  8 18:20:30 homeserver kernel: f649ea00 c0205142 f6779820 a1ff7f08 f5deac00 00000001 f5f3ee1c 00000014
Apr  8 18:20:30 homeserver kernel: 00000004 00000202 15c20036 c07a03e8 fffee0ca f6795c00 f5f3ee1c f5deac00
Apr  8 18:20:30 homeserver kernel: Call Trace:
Apr  8 18:20:30 homeserver kernel: [<c0448de4>] ? imon_probe+0x494/0xde0
Apr  8 18:20:30 homeserver kernel: [<c03b8ad2>] ? rpm_resume+0xb2/0x4f0
Apr  8 18:20:30 homeserver kernel: [<c0205142>] ? sysfs_addrm_finish+0x12/0x90
Apr  8 18:20:30 homeserver kernel: [<c04170e9>] ? usb_probe_interface+0x169/0x240
Apr  8 18:20:30 homeserver kernel: [<c03b0ca0>] ? __driver_attach+0x80/0x80
Apr  8 18:20:30 homeserver kernel: [<c03b0ca0>] ? __driver_attach+0x80/0x80
Apr  8 18:20:30 homeserver kernel: [<c03b0a94>] ? driver_probe_device+0x54/0x1e0
Apr  8 18:20:30 homeserver kernel: [<c0416abe>] ? usb_device_match+0x4e/0x80
Apr  8 18:20:30 homeserver kernel: [<c03af314>] ? bus_for_each_drv+0x34/0x70
Apr  8 18:20:30 homeserver kernel: [<c03b0a0b>] ? device_attach+0x7b/0x90
Apr  8 18:20:30 homeserver kernel: [<c03b0ca0>] ? __driver_attach+0x80/0x80
Apr  8 18:20:30 homeserver kernel: [<c03b00ff>] ? bus_probe_device+0x5f/0x80
Apr  8 18:20:30 homeserver kernel: [<c03aeab7>] ? device_add+0x567/0x610
Apr  8 18:20:30 homeserver kernel: [<c041a7bc>] ? usb_create_ep_devs+0x7c/0xd0
Apr  8 18:20:30 homeserver kernel: [<c0413837>] ? create_intf_ep_devs+0x47/0x70
Apr  8 18:20:30 homeserver kernel: [<c04156c4>] ? usb_set_configuration+0x454/0x750
Apr  8 18:20:30 homeserver kernel: [<c03b0ca0>] ? __driver_attach+0x80/0x80
Apr  8 18:20:30 homeserver kernel: [<c041de8a>] ? generic_probe+0x2a/0x80
Apr  8 18:20:30 homeserver kernel: [<c03b0ca0>] ? __driver_attach+0x80/0x80
Apr  8 18:20:30 homeserver kernel: [<c0205aff>] ? sysfs_create_link+0xf/0x20
Apr  8 18:20:30 homeserver kernel: [<c04171db>] ? usb_probe_device+0x1b/0x40
Apr  8 18:20:30 homeserver kernel: [<c03b0a94>] ? driver_probe_device+0x54/0x1e0
Apr  8 18:20:30 homeserver kernel: [<c03af314>] ? bus_for_each_drv+0x34/0x70
Apr  8 18:20:30 homeserver kernel: [<c03b0a0b>] ? device_attach+0x7b/0x90
Apr  8 18:20:30 homeserver kernel: [<c03b0ca0>] ? __driver_attach+0x80/0x80
Apr  8 18:20:30 homeserver kernel: [<c03b00ff>] ? bus_probe_device+0x5f/0x80
Apr  8 18:20:30 homeserver kernel: [<c03aeab7>] ? device_add+0x567/0x610
Apr  8 18:20:30 homeserver kernel: [<c040e6df>] ? usb_new_device+0x12f/0x1e0
Apr  8 18:20:30 homeserver kernel: [<c040f4d8>] ? hub_thread+0x458/0x1230
Apr  8 18:20:30 homeserver kernel: [<c015554f>] ? dequeue_task_fair+0x9f/0xc0
Apr  8 18:20:30 homeserver kernel: [<c0131312>] ? release_task+0x1d2/0x330
Apr  8 18:20:30 homeserver kernel: [<c01477b0>] ? abort_exclusive_wait+0x90/0x90
Apr  8 18:20:30 homeserver kernel: [<c040f080>] ? usb_remote_wakeup+0x40/0x40
Apr  8 18:20:30 homeserver kernel: [<c0146ed2>] ? kthread+0x92/0xa0
Apr  8 18:20:30 homeserver kernel: [<c05c7877>] ? ret_from_kernel_thread+0x1b/0x28
Apr  8 18:20:30 homeserver kernel: [<c0146e40>] ? kthread_freezable_should_stop+0x50/0x50
Apr  8 18:20:30 homeserver kernel: Code: 89 04 24 89 f0 e8 05 ff ff ff 8b 5c 24 24 8b 74 24 28 8b 7c 24 2c 8b 6c 24 30 83 c4 34 c3 00 83 ec 08 89 1c 24 89 74 24 04 89 c3 <f0> ff 08 79 05 e8 ca 03 00 00 64 a1 70 d6 80 c0 8b 74 24 04 89
Apr  8 18:20:30 homeserver kernel: EIP: [<c05c4e4c>] mutex_lock+0xc/0x30 SS:ESP 0068:f6577c28
Apr  8 18:20:30 homeserver kernel: CR2: 0000000000000014
Apr  8 18:20:30 homeserver kernel: ---[ end trace df134132c967205c ]---

Signed-off-by: Kevin Baradon <kevin.baradon@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
amery pushed a commit that referenced this issue Jul 11, 2013
Several people reported the warning: "kernel BUG at kernel/timer.c:729!"
and the stack trace is:

	#7 [ffff880214d25c10] mod_timer+501 at ffffffff8106d905
	#8 [ffff880214d25c50] br_multicast_del_pg.isra.20+261 at ffffffffa0731d25 [bridge]
	#9 [ffff880214d25c80] br_multicast_disable_port+88 at ffffffffa0732948 [bridge]
	#10 [ffff880214d25cb0] br_stp_disable_port+154 at ffffffffa072bcca [bridge]
	#11 [ffff880214d25ce8] br_device_event+520 at ffffffffa072a4e8 [bridge]
	#12 [ffff880214d25d18] notifier_call_chain+76 at ffffffff8164aafc
	#13 [ffff880214d25d50] raw_notifier_call_chain+22 at ffffffff810858f6
	#14 [ffff880214d25d60] call_netdevice_notifiers+45 at ffffffff81536aad
	#15 [ffff880214d25d80] dev_close_many+183 at ffffffff81536d17
	#16 [ffff880214d25dc0] rollback_registered_many+168 at ffffffff81537f68
	#17 [ffff880214d25de8] rollback_registered+49 at ffffffff81538101
	#18 [ffff880214d25e10] unregister_netdevice_queue+72 at ffffffff815390d8
	#19 [ffff880214d25e30] __tun_detach+272 at ffffffffa074c2f0 [tun]
	#20 [ffff880214d25e88] tun_chr_close+45 at ffffffffa074c4bd [tun]
	#21 [ffff880214d25ea8] __fput+225 at ffffffff8119b1f1
	#22 [ffff880214d25ef0] ____fput+14 at ffffffff8119b3fe
	#23 [ffff880214d25f00] task_work_run+159 at ffffffff8107cf7f
	#24 [ffff880214d25f30] do_notify_resume+97 at ffffffff810139e1
	#25 [ffff880214d25f50] int_signal+18 at ffffffff8164f292

this is due to I forgot to check if mp->timer is armed in
br_multicast_del_pg(). This bug is introduced by
commit 9f00b2e (bridge: only expire the mdb entry
when query is received).

Same for __br_mdb_del().

Tested-by: poma <pomidorabelisima@gmail.com>
Reported-by: LiYonghua <809674045@qq.com>
Reported-by: Robert Hancock <hancockrwd@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
amery pushed a commit that referenced this issue Sep 20, 2013
When booting secondary CPUs, announce_cpu() is called to show which cpu has
been brought up. For example:

[    0.402751] smpboot: Booting Node   0, Processors  #1 #2 #3 #4 #5 OK
[    0.525667] smpboot: Booting Node   1, Processors  #6 #7 #8 #9 #10 #11 OK
[    0.755592] smpboot: Booting Node   0, Processors  #12 #13 #14 #15 #16 #17 OK
[    0.890495] smpboot: Booting Node   1, Processors  #18 #19 #20 #21 #22 #23

But the last "OK" is lost, because 'nr_cpu_ids-1' represents the maximum
possible cpu id. It should use the maximum present cpu id in case not all
CPUs booted up.

Signed-off-by: Libin <huawei.libin@huawei.com>
Cc: <guohanjun@huawei.com>
Cc: <wangyijing@huawei.com>
Cc: <fenghua.yu@intel.com>
Cc: <paul.gortmaker@windriver.com>
Link: http://lkml.kernel.org/r/1378378676-18276-1-git-send-email-huawei.libin@huawei.com
[ tweaked the changelog, removed unnecessary line break, tweaked the format to align the fields vertically. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
turl pushed a commit that referenced this issue Oct 5, 2013
The driver uses platform_driver_probe() to obtain platform data
if any. However, that function is placed in the .init section so
it must be called upon driver module initialization.

The problem was reported by Fenguang Wu resulting in a kernel
oops because the .init section was already freed.

[   48.966342] Switched to clocksource tsc
[   48.970002] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
[   48.970851] BUG: unable to handle kernel paging request at ffffffff82196446
[   48.970957] IP: [<ffffffff82196446>] classes_init+0x26/0x26
[   48.970957] PGD 1e76067 PUD 1e77063 PMD f388063 PTE 8000000002196163
[   48.970957] Oops: 0011 [#1]
[   48.970957] CPU: 0 PID: 17 Comm: kworker/0:1 Not tainted 3.11.0-rc7-00444-gc52dd7f #23
[   48.970957] Workqueue: events brcmf_driver_init
[   48.970957] task: ffff8800001d2000 ti: ffff8800001d4000 task.ti: ffff8800001d4000
[   48.970957] RIP: 0010:[<ffffffff82196446>]  [<ffffffff82196446>] classes_init+0x26/0x26
[   48.970957] RSP: 0000:ffff8800001d5d40  EFLAGS: 00000286
[   48.970957] RAX: 0000000000000001 RBX: ffffffff820c5620 RCX: 0000000000000000
[   48.970957] RDX: 0000000000000001 RSI: ffffffff816f7380 RDI: ffffffff820c56c0
[   48.970957] RBP: ffff8800001d5d50 R08: ffff8800001d2508 R09: 0000000000000002
[   48.970957] R10: 0000000000000000 R11: 0001f7ce298c5620 R12: ffff8800001c76b0
[   48.970957] R13: ffffffff81e91d40 R14: 0000000000000000 R15: ffff88000e0ce300
[   48.970957] FS:  0000000000000000(0000) GS:ffffffff81e84000(0000) knlGS:0000000000000000
[   48.970957] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   48.970957] CR2: ffffffff82196446 CR3: 0000000001e75000 CR4: 00000000000006b0
[   48.970957] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   48.970957] DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
[   48.970957] Stack:
[   48.970957]  ffffffff816f7df8 ffffffff820c5620 ffff8800001d5d60 ffffffff816eeec9
[   48.970957]  ffff8800001d5de0 ffffffff81073dc5 ffffffff81073d68 ffff8800001d5db8
[   48.970957]  0000000000000086 ffffffff820c5620 ffffffff824f7fd0 0000000000000000
[   48.970957] Call Trace:
[   48.970957]  [<ffffffff816f7df8>] ? brcmf_sdio_init+0x18/0x70
[   48.970957]  [<ffffffff816eeec9>] brcmf_driver_init+0x9/0x10
[   48.970957]  [<ffffffff81073dc5>] process_one_work+0x1d5/0x480
[   48.970957]  [<ffffffff81073d68>] ? process_one_work+0x178/0x480
[   48.970957]  [<ffffffff81074188>] worker_thread+0x118/0x3a0
[   48.970957]  [<ffffffff81074070>] ? process_one_work+0x480/0x480
[   48.970957]  [<ffffffff8107aa17>] kthread+0xe7/0xf0
[   48.970957]  [<ffffffff810829f7>] ? finish_task_switch.constprop.57+0x37/0xd0
[   48.970957]  [<ffffffff8107a930>] ? __kthread_parkme+0x80/0x80
[   48.970957]  [<ffffffff81a6923a>] ret_from_fork+0x7a/0xb0
[   48.970957]  [<ffffffff8107a930>] ? __kthread_parkme+0x80/0x80
[   48.970957] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
cc cc cc cc cc cc <cc> cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
[   48.970957] RIP  [<ffffffff82196446>] classes_init+0x26/0x26
[   48.970957]  RSP <ffff8800001d5d40>
[   48.970957] CR2: ffffffff82196446
[   48.970957] ---[ end trace 62980817cd525f14 ]---

Cc: <stable@vger.kernel.org> # 3.10.x, 3.11.x
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
amery pushed a commit that referenced this issue Nov 12, 2013
As the new x86 CPU bootup printout format code maintainer, I am
taking immediate action to improve and clean (and thus indulge
my OCD) the reporting of the cores when coming up online.

Fix padding to a right-hand alignment, cleanup code and bind
reporting width to the max number of supported CPUs on the
system, like this:

 [    0.074509] smpboot: Booting Node   0, Processors:      #1  #2  #3  #4  #5  #6  #7 OK
 [    0.644008] smpboot: Booting Node   1, Processors:  #8  #9 #10 #11 #12 #13 #14 #15 OK
 [    1.245006] smpboot: Booting Node   2, Processors: #16 #17 #18 #19 #20 #21 #22 #23 OK
 [    1.864005] smpboot: Booting Node   3, Processors: #24 #25 #26 #27 #28 #29 #30 #31 OK
 [    2.489005] smpboot: Booting Node   4, Processors: #32 #33 #34 #35 #36 #37 #38 #39 OK
 [    3.093005] smpboot: Booting Node   5, Processors: #40 #41 #42 #43 #44 #45 #46 #47 OK
 [    3.698005] smpboot: Booting Node   6, Processors: #48 #49 #50 #51 #52 #53 #54 #55 OK
 [    4.304005] smpboot: Booting Node   7, Processors: #56 #57 #58 #59 #60 #61 #62 #63 OK
 [    4.961413] Brought up 64 CPUs

and this:

 [    0.072367] smpboot: Booting Node   0, Processors:    #1 #2 #3 #4 #5 #6 #7 OK
 [    0.686329] Brought up 8 CPUs

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Libin <huawei.libin@huawei.com>
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Link: http://lkml.kernel.org/r/20130927143554.GF4422@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
amery pushed a commit that referenced this issue Nov 12, 2013
Turn it into (for example):

[    0.073380] x86: Booting SMP configuration:
[    0.074005] .... node   #0, CPUs:          #1   #2   #3   #4   #5   #6   #7
[    0.603005] .... node   #1, CPUs:     #8   #9  #10  #11  #12  #13  #14  #15
[    1.200005] .... node   #2, CPUs:    #16  #17  #18  #19  #20  #21  #22  #23
[    1.796005] .... node   #3, CPUs:    #24  #25  #26  #27  #28  #29  #30  #31
[    2.393005] .... node   #4, CPUs:    #32  #33  #34  #35  #36  #37  #38  #39
[    2.996005] .... node   #5, CPUs:    #40  #41  #42  #43  #44  #45  #46  #47
[    3.600005] .... node   #6, CPUs:    #48  #49  #50  #51  #52  #53  #54  #55
[    4.202005] .... node   #7, CPUs:    #56  #57  #58  #59  #60  #61  #62  #63
[    4.811005] .... node   #8, CPUs:    #64  #65  #66  #67  #68  #69  #70  #71
[    5.421006] .... node   #9, CPUs:    #72  #73  #74  #75  #76  #77  #78  #79
[    6.032005] .... node  #10, CPUs:    #80  #81  #82  #83  #84  #85  #86  #87
[    6.648006] .... node  #11, CPUs:    #88  #89  #90  #91  #92  #93  #94  #95
[    7.262005] .... node  #12, CPUs:    #96  #97  #98  #99 #100 #101 #102 #103
[    7.865005] .... node  #13, CPUs:   #104 #105 #106 #107 #108 #109 #110 #111
[    8.466005] .... node  #14, CPUs:   #112 #113 #114 #115 #116 #117 #118 #119
[    9.073006] .... node  #15, CPUs:   #120 #121 #122 #123 #124 #125 #126 #127
[    9.679901] x86: Booted up 16 nodes, 128 CPUs

and drop useless elements.

Change num_digits() to hpa's division-avoiding, cell-phone-typed
version which he went at great lengths and pains to submit on a
Saturday evening.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: huawei.libin@huawei.com
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20130930095624.GB16383@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
amery pushed a commit that referenced this issue Nov 20, 2013
When changing group_thread_cnt from sysfs entry, the kernel can oops.

The kernel messages are:
[  740.961389] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[  740.961444] IP: [<ffffffff81062570>] process_one_work+0x30/0x500
[  740.961476] PGD b9013067 PUD b651e067 PMD 0
[  740.961503] Oops: 0000 [#1] SMP
[  740.961525] Modules linked in: netconsole e1000e ptp pps_core
[  740.961577] CPU: 0 PID: 3683 Comm: kworker/u8:5 Not tainted 3.12.0+ #23
[  740.961602] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS 080015  11/09/2011
[  740.961646] task: ffff88013abe0000 ti: ffff88013a246000 task.ti: ffff88013a246000
[  740.961673] RIP: 0010:[<ffffffff81062570>]  [<ffffffff81062570>] process_one_work+0x30/0x500
[  740.961708] RSP: 0018:ffff88013a247e08  EFLAGS: 00010086
[  740.961730] RAX: ffff8800b912b400 RBX: ffff88013a61e680 RCX: ffff8800b912b400
[  740.961757] RDX: ffff8800b912b600 RSI: ffff8800b912b600 RDI: ffff88013a61e680
[  740.961782] RBP: ffff88013a247e48 R08: ffff88013a246000 R09: 000000000002c09d
[  740.961808] R10: 000000000000010f R11: 0000000000000000 R12: ffff88013b00cc00
[  740.961833] R13: 0000000000000000 R14: ffff88013b00cf80 R15: ffff88013a61e6b0
[  740.961861] FS:  0000000000000000(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000
[  740.961893] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  740.962001] CR2: 00000000000000b8 CR3: 00000000b24fe000 CR4: 00000000000407f0
[  740.962001] Stack:
[  740.962001]  0000000000000008 ffff8800b912b600 ffff88013b00cc00 ffff88013a61e680
[  740.962001]  ffff88013b00cc00 ffff88013b00cc18 ffff88013b00cf80 ffff88013a61e6b0
[  740.962001]  ffff88013a247eb8 ffffffff810639c6 0000000000012a80 ffff88013a247fd8
[  740.962001] Call Trace:
[  740.962001]  [<ffffffff810639c6>] worker_thread+0x206/0x3f0
[  740.962001]  [<ffffffff810637c0>] ? manage_workers+0x2c0/0x2c0
[  740.962001]  [<ffffffff81069656>] kthread+0xc6/0xd0
[  740.962001]  [<ffffffff81069590>] ? kthread_freezable_should_stop+0x70/0x70
[  740.962001]  [<ffffffff81722ffc>] ret_from_fork+0x7c/0xb0
[  740.962001]  [<ffffffff81069590>] ? kthread_freezable_should_stop+0x70/0x70
[  740.962001] Code: 89 e5 41 57 41 56 41 55 45 31 ed 41 54 53 48 89 fb 48 83 ec 18 48 8b 06 4c 8b 67 48 48 89 c1 30 c9 a8 04 4c 0f 45 e9 80 7f 58 00 <49> 8b 45 08 44 8b b0 00 01 00 00 78 0c 41 f6 44 24 10 04 0f 84
[  740.962001] RIP  [<ffffffff81062570>] process_one_work+0x30/0x500
[  740.962001]  RSP <ffff88013a247e08>
[  740.962001] CR2: 0000000000000008
[  740.962001] ---[ end trace 39181460000748de ]---
[  740.962001] Kernel panic - not syncing: Fatal exception

This can happen if there are some stripes left, fewer than MAX_STRIPE_BATCH.
A worker is queued to handle them.
But before calling raid5_do_work, raid5d handles those
stripes making conf->active_stripe = 0.
So mddev_suspend() can return.
We might then free old worker resources before the queued
raid5_do_work() handled them.  When it runs, it crashes.

	raid5d()		raid5_store_group_thread_cnt()
	queue_work		mddev_suspend()
				handle_strips
				active_stripe=0
				free(old worker resources)
	process_one_work
	raid5_do_work

To avoid this, we should only flush the worker resources before freeing them.

This fixes a bug introduced in 3.12 so is suitable for the 3.12.x
stable series.

Cc: stable@vger.kernel.org (3.12)
Fixes: b721420
Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Shaohua Li <shli@kernel.org>
amery pushed a commit that referenced this issue Dec 30, 2013
commit db4efbb upstream.

The driver uses platform_driver_probe() to obtain platform data
if any. However, that function is placed in the .init section so
it must be called upon driver module initialization.

The problem was reported by Fenguang Wu resulting in a kernel
oops because the .init section was already freed.

[   48.966342] Switched to clocksource tsc
[   48.970002] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
[   48.970851] BUG: unable to handle kernel paging request at ffffffff82196446
[   48.970957] IP: [<ffffffff82196446>] classes_init+0x26/0x26
[   48.970957] PGD 1e76067 PUD 1e77063 PMD f388063 PTE 8000000002196163
[   48.970957] Oops: 0011 [#1]
[   48.970957] CPU: 0 PID: 17 Comm: kworker/0:1 Not tainted 3.11.0-rc7-00444-gc52dd7f #23
[   48.970957] Workqueue: events brcmf_driver_init
[   48.970957] task: ffff8800001d2000 ti: ffff8800001d4000 task.ti: ffff8800001d4000
[   48.970957] RIP: 0010:[<ffffffff82196446>]  [<ffffffff82196446>] classes_init+0x26/0x26
[   48.970957] RSP: 0000:ffff8800001d5d40  EFLAGS: 00000286
[   48.970957] RAX: 0000000000000001 RBX: ffffffff820c5620 RCX: 0000000000000000
[   48.970957] RDX: 0000000000000001 RSI: ffffffff816f7380 RDI: ffffffff820c56c0
[   48.970957] RBP: ffff8800001d5d50 R08: ffff8800001d2508 R09: 0000000000000002
[   48.970957] R10: 0000000000000000 R11: 0001f7ce298c5620 R12: ffff8800001c76b0
[   48.970957] R13: ffffffff81e91d40 R14: 0000000000000000 R15: ffff88000e0ce300
[   48.970957] FS:  0000000000000000(0000) GS:ffffffff81e84000(0000) knlGS:0000000000000000
[   48.970957] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   48.970957] CR2: ffffffff82196446 CR3: 0000000001e75000 CR4: 00000000000006b0
[   48.970957] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   48.970957] DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
[   48.970957] Stack:
[   48.970957]  ffffffff816f7df8 ffffffff820c5620 ffff8800001d5d60 ffffffff816eeec9
[   48.970957]  ffff8800001d5de0 ffffffff81073dc5 ffffffff81073d68 ffff8800001d5db8
[   48.970957]  0000000000000086 ffffffff820c5620 ffffffff824f7fd0 0000000000000000
[   48.970957] Call Trace:
[   48.970957]  [<ffffffff816f7df8>] ? brcmf_sdio_init+0x18/0x70
[   48.970957]  [<ffffffff816eeec9>] brcmf_driver_init+0x9/0x10
[   48.970957]  [<ffffffff81073dc5>] process_one_work+0x1d5/0x480
[   48.970957]  [<ffffffff81073d68>] ? process_one_work+0x178/0x480
[   48.970957]  [<ffffffff81074188>] worker_thread+0x118/0x3a0
[   48.970957]  [<ffffffff81074070>] ? process_one_work+0x480/0x480
[   48.970957]  [<ffffffff8107aa17>] kthread+0xe7/0xf0
[   48.970957]  [<ffffffff810829f7>] ? finish_task_switch.constprop.57+0x37/0xd0
[   48.970957]  [<ffffffff8107a930>] ? __kthread_parkme+0x80/0x80
[   48.970957]  [<ffffffff81a6923a>] ret_from_fork+0x7a/0xb0
[   48.970957]  [<ffffffff8107a930>] ? __kthread_parkme+0x80/0x80
[   48.970957] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
cc cc cc cc cc cc <cc> cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
[   48.970957] RIP  [<ffffffff82196446>] classes_init+0x26/0x26
[   48.970957]  RSP <ffff8800001d5d40>
[   48.970957] CR2: ffffffff82196446
[   48.970957] ---[ end trace 62980817cd525f14 ]---

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
turl referenced this issue in allwinner-dev-team/linux-allwinner Jan 28, 2014
Printing the "start_ip" for every secondary cpu is very noisy on a large
system - and doesn't add any value. Drop this message.

Console log before:
Booting Node   0, Processors  #1
smpboot cpu 1: start_ip = 96000
 #2
smpboot cpu 2: start_ip = 96000
 #3
smpboot cpu 3: start_ip = 96000
 #4
smpboot cpu 4: start_ip = 96000
       ...
 torvalds#31
smpboot cpu 31: start_ip = 96000
Brought up 32 CPUs

Console log after:
Booting Node   0, Processors  #1 #2 #3 #4 #5 torvalds#6 torvalds#7 Ok.
Booting Node   1, Processors  torvalds#8 torvalds#9 torvalds#10 torvalds#11 torvalds#12 torvalds#13 torvalds#14 torvalds#15 Ok.
Booting Node   0, Processors  torvalds#16 torvalds#17 torvalds#18 torvalds#19 torvalds#20 torvalds#21 torvalds#22 torvalds#23 Ok.
Booting Node   1, Processors  torvalds#24 torvalds#25 torvalds#26 torvalds#27 torvalds#28 torvalds#29 torvalds#30 torvalds#31
Brought up 32 CPUs

Acked-by: Borislav Petkov <bp@amd64.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/4f452eb42507460426@agluck-desktop.sc.intel.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
turl referenced this issue in allwinner-dev-team/linux-allwinner Jan 28, 2014
This patch converts the remaining struct se_portal_group->session_lock
usage to use irqsave+irqrestore to address the following warnings for
hardware target mode interrupt context usage.  This change generate
other warnings for current iscsi-target mode still using ->session_lock
with spin_lock_bh, which will need to be converted in a seperate patch.

[  492.480728] [ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ]
[  492.488194] 3.0.0+ torvalds#23
[  492.490820] ------------------------------------------------------
[  492.497704] sh/7162 [HC0[0]:SC0[2]:HE0:SE0] is trying to acquire:
[  492.504493] (&(&se_tpg->session_lock)->rlock){+.....}, at: [<ffffffffa022364d>] transport_deregister_session+0x2d/0x163 [target_core_mod]
  492.518390]
[  492.518390] and this task is already holding:
[  492.524897] (&(&ha->hardware_lock)->rlock){-.-...}, at: [<ffffffffa00b9146>] qla_tgt_stop_phase1+0x5e/0x27e [qla2xxx]
[  492.536856] which would create a new lock dependency:
[  492.542481] (&(&ha->hardware_lock)->rlock){-.-...} -> (&(&se_tpg->session_lock)->rlock){+.....}
[  492.552321]
[  492.552321] but this new dependency connects a HARDIRQ-irq-safe lock:
[  492.561149] (&(&ha->hardware_lock)->rlock){-.-...}
[  492.566400] ... which became HARDIRQ-irq-safe at:
[  492.571841]   [<ffffffff81064720>] __lock_acquire+0x68f/0x921
[  492.578247]   [<ffffffff81064eff>] lock_acquire+0xe0/0x10d
[  492.584367]   [<ffffffff813a74c6>] _raw_spin_lock_irqsave+0x44/0x56
[  492.591358]   [<ffffffffa009b1be>] qla24xx_msix_default+0x5c/0x2aa [qla2xxx]
[  492.599227]   [<ffffffff81088582>] handle_irq_event_percpu+0x5a/0x197
[  492.606413]   [<ffffffff810886fb>] handle_irq_event+0x3c/0x5c
[  492.612822]   [<ffffffff8108a6dc>] handle_edge_irq+0xcc/0xf1
[  492.619138]   [<ffffffff810039b9>] handle_irq+0x83/0x8e
[  492.624971]   [<ffffffff8100333e>] do_IRQ+0x48/0xaf
[  492.630413]   [<ffffffff813a7cd3>] ret_from_intr+0x0/0x1a
[  492.636437]   [<ffffffff81001dc1>] cpu_idle+0x5b/0x8d
[  492.642073]   [<ffffffff81392709>] rest_init+0xad/0xb4
[  492.647809]   [<ffffffff81a1cbbc>] start_kernel+0x366/0x371
[  492.654030]   [<ffffffff81a1c2b1>] x86_64_start_reservations+0xb8/0xbc
[  492.661311]   [<ffffffff81a1c3b6>] x86_64_start_kernel+0x101/0x110
[  492.668204]
[  492.668205] to a HARDIRQ-irq-unsafe lock:
[  492.674324] (&(&se_tpg->session_lock)->rlock){+.....}
[  492.679862] ... which became HARDIRQ-irq-unsafe at:
[  492.685497] ...  [<ffffffff8106479a>] __lock_acquire+0x709/0x921
[  492.692209]   [<ffffffff81064eff>] lock_acquire+0xe0/0x10d
[  492.698330]   [<ffffffff813a75ed>] _raw_spin_lock_bh+0x31/0x40
[  492.704836]   [<ffffffffa021c208>] core_tpg_del_initiator_node_acl+0x89/0x336 [target_core_mod]
[  492.714546]   [<ffffffffa02fb075>] tcm_qla2xxx_drop_nodeacl+0x20/0x2d [tcm_qla2xxx]
[  492.723087]   [<ffffffffa02108d9>] target_fabric_nacl_base_release+0x22/0x24 [target_core_mod]
[  492.732698]   [<ffffffffa01661c8>] config_item_release+0x7d/0xa3 [configfs]
[  492.740465]   [<ffffffff811d48fe>] kref_put+0x43/0x4d
[  492.746101]   [<ffffffffa0166149>] config_item_put+0x19/0x1b [configfs]
[  492.753481]   [<ffffffffa0164987>] configfs_rmdir+0x1eb/0x258 [configfs]
[  492.760957]   [<ffffffff810ecc54>] vfs_rmdir+0x79/0xd0
[  492.766690]   [<ffffffff810eec4a>] do_rmdir+0xc2/0x111
[  492.772423]   [<ffffffff810eecd0>] sys_rmdir+0x11/0x13
[  492.778156]   [<ffffffff813ae4d2>] system_call_fastpath+0x16/0x1b
[  492.784953]

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
turl referenced this issue in allwinner-dev-team/linux-allwinner Jan 28, 2014
Following commit a79dd5a titled "tty/serial/pmac_zilog: Fix suspend & resume",
my Powerbook G4 Titanium showed the following stack dump:

[   36.878225] irq 23: nobody cared (try booting with the "irqpoll" option)
[   36.878251] Call Trace:
[   36.878291] [dfff3f00] [c000984c] show_stack+0x7c/0x194 (unreliable)
[   36.878322] [dfff3f40] [c00a6868] __report_bad_irq+0x44/0xf4
[   36.878339] [dfff3f60] [c00a6b04] note_interrupt+0x1ec/0x2ac
[   36.878356] [dfff3f80] [c00a48d0] handle_irq_event_percpu+0x250/0x2b8
[   36.878372] [dfff3fd0] [c00a496c] handle_irq_event+0x34/0x54
[   36.878389] [dfff3fe0] [c00a753] handle_fasteoi_irq+0xb4/0x124
[   36.878412] [dfff3ff0] [c000f5bc] call_handle_irq+0x18/0x28
[   36.878428] [deef1f10] [c000719c] do_IRQ+0x114/0x1cc
[   36.878446] [deef1f40] [c0015868] ret_from_except+0x0/0x1c
[   36.878484] --- Exception: 501 at 0xf497610
[   36.878489]     LR = 0xfdc3dd0
[   36.878497] handlers:
[   36.878510] [<c02b7424>] pmz_interrupt
[   36.878520] Disabling IRQ torvalds#23

From an E-mail exchange about this problem, Andreas Schwab noticed a typo
that resulted in the wrong condition being tested.

The patch also corrects 2 typos that incorrectly report why an error branch
is being taken.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
amery pushed a commit that referenced this issue Apr 3, 2014
Claudiu Manoil says:

====================
gianfar: Tx timeout issue

There's an older Tx timeout issue showing up on etsec2 devices
with 2 CPUs.  I pinned this issue down to processing overhead
incurred by supporting multiple Tx/Rx rings, as explained in
the 2nd patch below.  But before this, there's also a concurency
issue leading to Rx/Tx spurrious interrupts, addressed by the
'Tx NAPI' patch below.
The Tx timeout can be triggered with multiple Tx flows,
'iperf -c -N 8' commands, on a 2 CPUs etsec2 based (P1020) board.

Before the patches:
"""
root@p1020rdb-pc:~# iperf -c 172.16.1.3 -n 1000M -P 8 &
[...]
root@p1020rdb-pc:~# NETDEV WATCHDOG: eth1 (fsl-gianfar): transmit queue 1 timed out
WARNING: at net/sched/sch_generic.c:279
Modules linked in:
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.13.0-rc3-03386-g89ea59c #23
task: ed84ef40 ti: ed868000 task.ti: ed868000
NIP: c04627a8 LR: c04627a8 CTR: c02fb270
REGS: ed869d00 TRAP: 0700   Not tainted  (3.13.0-rc3-03386-g89ea59c)
MSR: 00029000 <CE,EE,ME>  CR: 44000022  XER: 20000000
[...]

root@p1020rdb-pc:~# [ ID] Interval       Transfer     Bandwidth
[  5]  0.0-19.3 sec  1000 MBytes    434 Mbits/sec
[  8]  0.0-39.7 sec  1000 MBytes    211 Mbits/sec
[  9]  0.0-40.1 sec  1000 MBytes    209 Mbits/sec
[  3]  0.0-40.2 sec  1000 MBytes    209 Mbits/sec
[ 10]  0.0-59.0 sec  1000 MBytes    142 Mbits/sec
[  7]  0.0-74.6 sec  1000 MBytes    112 Mbits/sec
[  6]  0.0-74.7 sec  1000 MBytes    112 Mbits/sec
[  4]  0.0-74.7 sec  1000 MBytes    112 Mbits/sec
[SUM]  0.0-74.7 sec  7.81 GBytes    898 Mbits/sec

root@p1020rdb-pc:~# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:04:9f:00:13:01
          inet addr:172.16.1.1  Bcast:172.16.255.255  Mask:255.255.0.0
          inet6 addr: fe80::204:9fff:fe00:1301/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:708722 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8717849 errors:6 dropped:0 overruns:1470 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:58118018 (55.4 MiB)  TX bytes:274069482 (261.3 MiB)
          Base address:0xa000

"""

After applying the patches:
"""
root@p1020rdb-pc:~# iperf -c 172.16.1.3 -n 1000M -P 8 &
[...]
root@p1020rdb-pc:~# [ ID] Interval       Transfer     Bandwidth
[  9]  0.0-70.5 sec  1000 MBytes    119 Mbits/sec
[  5]  0.0-70.5 sec  1000 MBytes    119 Mbits/sec
[  6]  0.0-70.7 sec  1000 MBytes    119 Mbits/sec
[  4]  0.0-71.0 sec  1000 MBytes    118 Mbits/sec
[  8]  0.0-71.1 sec  1000 MBytes    118 Mbits/sec
[  3]  0.0-71.2 sec  1000 MBytes    118 Mbits/sec
[ 10]  0.0-71.3 sec  1000 MBytes    118 Mbits/sec
[  7]  0.0-71.3 sec  1000 MBytes    118 Mbits/sec
[SUM]  0.0-71.3 sec  7.81 GBytes    942 Mbits/sec

root@p1020rdb-pc:~# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:04:9f:00:13:01
          inet addr:172.16.1.1  Bcast:172.16.255.255  Mask:255.255.0.0
          inet6 addr: fe80::204:9fff:fe00:1301/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:728446 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8690057 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:59732650 (56.9 MiB)  TX bytes:271554306 (258.9 MiB)
          Base address:0xa000
"""
v2: PATCH 2:
    Replaced CPP check with run-time condition to
    limit the number of queues. Updated comments.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
jwrdegoede pushed a commit to jwrdegoede/linux-sunxi that referenced this issue May 10, 2014
This is a workaround for a HW erratum on 82579 devices.
Erratum is linux-sunxi#23 in Intel 6 Series Chipset and Intel C200 Series Chipset
specification Update June 2013.

Problem: 82579 parts experience packet loss in Gig and 100 speeds
when interconnect between PHY and MAC is exiting K1 power saving state.
This was previously believed to only affect 1Gig speed, but has been observed
at 100Mbs also.

Workaround: Disable K1 for 82579 devices at Gig and 100 speeds.

Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
amery pushed a commit that referenced this issue Aug 4, 2014
Key being hashed is unmapped using the digest size instead of
initial length:

caam_jr ffe301000.jr: DMA-API: device driver frees DMA memory with different size [device address=0x000000002eeedac0] [map size=80 bytes] [unmap size=20 bytes]
------------[ cut here ]------------
WARNING: at lib/dma-debug.c:1090
Modules linked in: caamhash(+)
CPU: 0 PID: 1327 Comm: cryptomgr_test Not tainted 3.16.0-rc1 #23
task: eebda5d0 ti: ee26a000 task.ti: ee26a000
NIP: c0288790 LR: c0288790 CTR: c02d7020
REGS: ee26ba30 TRAP: 0700   Not tainted  (3.16.0-rc1)
MSR: 00021002 <CE,ME>  CR: 44022082  XER: 00000000

GPR00: c0288790 ee26bae0 eebda5d0 0000009f c1de3478 c1de382c 00000000 00021002
GPR08: 00000007 00000000 01660000 0000012f 82022082 00000000 c07a1900 eeda29c0
GPR16: 00000000 c61deea0 000c49a0 00000260 c07e1e10 c0da1180 00029002 c0d9ef08
GPR24: c07a0000 c07a4acc ee26bb38 ee2765c0 00000014 ee130210 00000000 00000014
NIP [c0288790] check_unmap+0x640/0xab0
LR [c0288790] check_unmap+0x640/0xab0
Call Trace:
[ee26bae0] [c0288790] check_unmap+0x640/0xab0 (unreliable)
[ee26bb30] [c0288c78] debug_dma_unmap_page+0x78/0x90
[ee26bbb0] [f929c3d4] ahash_setkey+0x374/0x720 [caamhash]
[ee26bc30] [c022fec8] __test_hash+0x228/0x6c0
[ee26bde0] [c0230388] test_hash+0x28/0xb0
[ee26be00] [c0230458] alg_test_hash+0x48/0xc0
[ee26be20] [c022fa94] alg_test+0x114/0x2e0
[ee26bea0] [c022cd1c] cryptomgr_test+0x4c/0x60
[ee26beb0] [c00497a4] kthread+0xc4/0xe0
[ee26bf40] [c000f2fc] ret_from_kernel_thread+0x5c/0x64
Instruction dump:
41de03e8 83da0020 3c60c06d 83fa0024 3863f520 813b0020 815b0024 80fa0018
811a001c 93c10008 93e1000c 4830cf6d <0fe00000> 3c60c06d 3863f0f4 4830cf5d
---[ end trace db1fae088c75c26c ]---
Mapped at:
 [<f929c15c>] ahash_setkey+0xfc/0x720 [caamhash]
 [<c022fec8>] __test_hash+0x228/0x6c0
 [<c0230388>] test_hash+0x28/0xb0
 [<c0230458>] alg_test_hash+0x48/0xc0
 [<c022fa94>] alg_test+0x114/0x2e0

Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Acked-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
amery pushed a commit that referenced this issue Aug 4, 2014
caam_jr ffe301000.jr: DMA-API: device driver frees DMA memory with different direction [device address=0x00000000062ad1ac] [size=28 bytes] [mapped with DMA_FROM_DEVICE] [unmapped with DMA_TO_DEVICE]
------------[ cut here ]------------
WARNING: at lib/dma-debug.c:1131
Modules linked in: caamhash(+) [last unloaded: caamhash]
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W     3.16.0-rc1 #23
task: c0789380 ti: effd2000 task.ti: c07d6000
NIP: c02885cc LR: c02885cc CTR: c02d7020
REGS: effd3d50 TRAP: 0700   Tainted: G        W      (3.16.0-rc1)
MSR: 00021002 <CE,ME>  CR: 44048082  XER: 00000000

GPR00: c02885cc effd3e00 c0789380 000000c6 c1de3478 c1de382c 00000000 00021002
GPR08: 00000007 00000000 01660000 0000012f 84048082 00000000 00000018 c07db080
GPR16: 00000006 00000100 0000002c eee567e0 c07e1e10 c0da1180 00029002 c0d96708
GPR24: c07a0000 c07a4acc effd3e58 ee29b140 0000001c ee130210 00000000 c0d96700
NIP [c02885cc] check_unmap+0x47c/0xab0
LR [c02885cc] check_unmap+0x47c/0xab0
Call Trace:
[effd3e00] [c02885cc] check_unmap+0x47c/0xab0 (unreliable)
[effd3e50] [c0288c78] debug_dma_unmap_page+0x78/0x90
[effd3ed0] [f9350974] ahash_done_ctx_dst+0xa4/0x200 [caamhash]
[effd3f00] [c0429640] caam_jr_dequeue+0x1c0/0x280
[effd3f50] [c002c94c] tasklet_action+0xcc/0x1a0
[effd3f80] [c002cb30] __do_softirq+0x110/0x220
[effd3fe0] [c002cf34] irq_exit+0xa4/0xe0
[effd3ff0] [c000d834] call_do_irq+0x24/0x3c
[c07d7d50] [c000489c] do_IRQ+0x8c/0x110
[c07d7d70] [c000f86c] ret_from_except+0x0/0x18
--- Exception: 501 at _raw_spin_unlock_irq+0x30/0x50
    LR = _raw_spin_unlock_irq+0x2c/0x50
[c07d7e40] [c0053084] finish_task_switch+0x74/0x130
[c07d7e60] [c058f278] __schedule+0x238/0x620
[c07d7f70] [c058fb50] schedule_preempt_disabled+0x10/0x20
[c07d7f80] [c00686a0] cpu_startup_entry+0x100/0x1b0
[c07d7fb0] [c074793c] start_kernel+0x338/0x34c
[c07d7ff0] [c00003d8] set_ivor+0x140/0x17c
Instruction dump:
7d495214 7d294214 806a0010 80c90010 811a001c 813a0020 815a0024 90610008
3c60c06d 90c1000c 3863f764 4830d131 <0fe00000> 3c60c06d 3863f0f4 4830d121
---[ end trace db1fae088c75c270 ]---
Mapped at:
 [<f9352454>] ahash_update_first+0x5b4/0xba0 [caamhash]
 [<c022ff28>] __test_hash+0x288/0x6c0
 [<c0230388>] test_hash+0x28/0xb0
 [<c02304a4>] alg_test_hash+0x94/0xc0
 [<c022fa94>] alg_test+0x114/0x2e0

Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Acked-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
amery pushed a commit that referenced this issue Aug 4, 2014
caam_jr ffe301000.jr: DMA-API: device driver frees DMA memory with different direction [device address=0x0000000006271dac] [size=28 bytes] [mapped with DMA_TO_DEVICE] [unmapped with DMA_FROM_DEVICE]
------------[ cut here ]------------
WARNING: at lib/dma-debug.c:1131
Modules linked in: caamhash(+) [last unloaded: caamhash]
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W     3.16.0-rc1 #23
task: c0789380 ti: effd2000 task.ti: c07d6000
NIP: c02885cc LR: c02885cc CTR: c02d7020
REGS: effd3d50 TRAP: 0700   Tainted: G        W      (3.16.0-rc1)
MSR: 00021002 <CE,ME>  CR: 44048082  XER: 00000000

GPR00: c02885cc effd3e00 c0789380 000000c6 c1de3478 c1de382c 00000000 00021002
GPR08: 00000007 00000000 01660000 0000012f 84048082 00000000 00000018 c07db080
GPR16: 00000006 00000100 0000002c c62517a0 c07e1e10 c0da1180 00029002 c0d95f88
GPR24: c07a0000 c07a4acc effd3e58 ee322bc0 0000001c ee130210 00000000 c0d95f8
NIP [c02885cc] check_unmap+0x47c/0xab0
LR [c02885cc] check_unmap+0x47c/0xab0
Call Trace:
[effd3e00] [c02885cc] check_unmap+0x47c/0xab0 (unreliable)
[effd3e50] [c0288c78] debug_dma_unmap_page+0x78/0x90
[effd3ed0] [f9624d84] ahash_done_ctx_src+0xa4/0x200 [caamhash]
[effd3f00] [c0429640] caam_jr_dequeue+0x1c0/0x280
[effd3f50] [c002c94c] tasklet_action+0xcc/0x1a0
[effd3f80] [c002cb30] __do_softirq+0x110/0x220
[effd3fe0] [c002cf34] irq_exit+0xa4/0xe0
[effd3ff0] [c000d834] call_do_irq+0x24/0x3c
[c07d7d50] [c000489c] do_IRQ+0x8c/0x110
[c07d7d70] [c000f86c] ret_from_except+0x0/0x18
--- Exception: 501 at _raw_spin_unlock_irq+0x30/0x50
    LR = _raw_spin_unlock_irq+0x2c/0x50
[c07d7e40] [c0053084] finish_task_switch+0x74/0x130
[c07d7e60] [c058f278] __schedule+0x238/0x620
[c07d7f70] [c058fb50] schedule_preempt_disabled+0x10/0x20
[c07d7f80] [c00686a0] cpu_startup_entry+0x100/0x1b0
[c07d7fb0] [c074793c] start_kernel+0x338/0x34c
[c07d7ff0] [c00003d8] set_ivor+0x140/0x17c
Instruction dump:
7d495214 7d294214 806a0010 80c90010 811a001c 813a0020 815a0024 90610008
3c60c06d 90c1000c 3863f764 4830d131 <0fe00000> 3c60c06d 3863f0f4 4830d121
---[ end trace db1fae088c75c280 ]---
Mapped at:
 [<f96251bc>] ahash_final_ctx+0x14c/0x7b0 [caamhash]
 [<c022ff4c>] __test_hash+0x2ac/0x6c0
 [<c0230388>] test_hash+0x28/0xb0
 [<c02304a4>] alg_test_hash+0x94/0xc0
 [<c022fa94>] alg_test+0x114/0x2e0

Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Acked-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
amery pushed a commit that referenced this issue Aug 4, 2014
Not initializing edesc->sec4_sg_bytes correctly causes ahash_done
callback to free unallocated DMA memory:

caam_jr ffe301000.jr: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x300900000000b44d] [size=46158 bytes]
WARNING: at lib/dma-debug.c:1080
Modules linked in: caamhash(+) [last unloaded: caamhash]
CPU: 0 PID: 1358 Comm: cryptomgr_test Tainted: G        W     3.16.0-rc1 #23
task: eed04250 ti: effd2000 task.ti: c6046000
NIP: c02889fc LR: c02889fc CTR: c02d7020
REGS: effd3d50 TRAP: 0700   Tainted: G        W      (3.16.0-rc1)
MSR: 00029002 <CE,EE,ME>  CR: 44048082  XER: 00000000

GPR00: c02889fc effd3e00 eed04250 00000091 c1de3478 c1de382c 00000000 00029002
GPR08: 00000007 00000000 01660000 00000000 22048082 00000000 00000018 c07db080
GPR16: 00000006 00000100 0000002c ee2497e0 c07e1e10 c0da1180 00029002 c0d912c8
GPR24: 00000014 ee2497c0 effd3e58 00000000 c078ad4c ee130210 30090000 0000b44d
NIP [c02889fc] check_unmap+0x8ac/0xab0
LR [c02889fc] check_unmap+0x8ac/0xab0
Call Trace:
[effd3e00] [c02889fc] check_unmap+0x8ac/0xab0 (unreliable)
[effd3e50] [c0288c78] debug_dma_unmap_page+0x78/0x90
[effd3ed0] [f9404fec] ahash_done+0x11c/0x190 [caamhash]
[effd3f00] [c0429640] caam_jr_dequeue+0x1c0/0x280
[effd3f50] [c002c94c] tasklet_action+0xcc/0x1a0
[effd3f80] [c002cb30] __do_softirq+0x110/0x220
[effd3fe0] [c002cf34] irq_exit+0xa4/0xe0
[effd3ff0] [c000d834] call_do_irq+0x24/0x3c
[c6047ae0] [c000489c] do_IRQ+0x8c/0x110
[c6047b00] [c000f86c] ret_from_except+0x0/0x18
--- Exception: 501 at _raw_spin_unlock_irq+0x30/0x50
    LR = _raw_spin_unlock_irq+0x2c/0x50
[c6047bd0] [c0590158] wait_for_common+0xb8/0x170
[c6047c10] [c059024c] wait_for_completion_interruptible+0x1c/0x40
[c6047c20] [c022fc78] do_one_async_hash_op.isra.2.part.3+0x18/0x40
[c6047c30] [c022ff98] __test_hash+0x2f8/0x6c0
[c6047de0] [c0230388] test_hash+0x28/0xb0
[c6047e00] [c0230458] alg_test_hash+0x48/0xc0
[c6047e20] [c022fa94] alg_test+0x114/0x2e0
[c6047ea0] [c022cd1c] cryptomgr_test+0x4c/0x60
[c6047eb0] [c00497a4] kthread+0xc4/0xe0
[c6047f40] [c000f2fc] ret_from_kernel_thread+0x5c/0x64
Instruction dump:
41de01c8 80a9002c 2f850000 40fe0008 80a90008 80fa0018 3c60c06d 811a001c
3863f4a4 813a0020 815a0024 4830cd01 <0fe00000> 81340048 2f890000 40feff48

Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Acked-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
amery pushed a commit that referenced this issue Aug 4, 2014
dst_dma not being properly initialized causes ahash_done_ctx_dst
to try to free unallocated DMA memory:

caam_jr ffe301000.jr: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x0000000006513340] [size=28 bytes]
WARNING: at lib/dma-debug.c:1080
Modules linked in: caamhash(+) [last unloaded: caamhash]
CPU: 0 PID: 1373 Comm: cryptomgr_test Tainted: G        W     3.16.0-rc1 #23
task: ee23e350 ti: effd2000 task.ti: ee1f6000
NIP: c02889fc LR: c02889fc CTR: c02d7020
REGS: effd3d50 TRAP: 0700   Tainted: G        W      (3.16.0-rc1)
MSR: 00029002 <CE,EE,ME>  CR: 44048082  XER: 00000000

GPR00: c02889fc effd3e00 ee23e350 0000008e c1de3478 c1de382c 00000000 00029002
GPR08: 00000007 00000000 01660000 00000000 24048082 00000000 00000018 c07db080
GPR16: 00000006 00000100 0000002c eeb4a7e0 c07e1e10 c0da1180 00029002 c0d9b3c8
GPR24: eeb4a7c0 00000000 effd3e58 00000000 c078ad4c ee130210 00000000 06513340
NIP [c02889fc] check_unmap+0x8ac/0xab0
LR [c02889fc] check_unmap+0x8ac/0xab0
Call Trace:
[effd3e00] [c02889fc] check_unmap+0x8ac/0xab0 (unreliable)
[effd3e50] [c0288c78] debug_dma_unmap_page+0x78/0x90
[effd3ed0] [f94b89ec] ahash_done_ctx_dst+0x11c/0x200 [caamhash]
[effd3f00] [c0429640] caam_jr_dequeue+0x1c0/0x280
[effd3f50] [c002c94c] tasklet_action+0xcc/0x1a0
[effd3f80] [c002cb30] __do_softirq+0x110/0x220
[effd3fe0] [c002cf34] irq_exit+0xa4/0xe0
[effd3ff0] [c000d834] call_do_irq+0x24/0x3c
[ee1f7ae0] [c000489c] do_IRQ+0x8c/0x110
[ee1f7b00] [c000f86c] ret_from_except+0x0/0x18
--- Exception: 501 at _raw_spin_unlock_irq+0x30/0x50
    LR = _raw_spin_unlock_irq+0x2c/0x50
[ee1f7bd0] [c0590158] wait_for_common+0xb8/0x170
[ee1f7c10] [c059024c] wait_for_completion_interruptible+0x1c/0x40
[ee1f7c20] [c022fc78] do_one_async_hash_op.isra.2.part.3+0x18/0x40
[ee1f7c30] [c022ffb8] __test_hash+0x318/0x6c0
[ee1f7de0] [c0230388] test_hash+0x28/0xb0
[ee1f7e00] [c02304a4] alg_test_hash+0x94/0xc0
[ee1f7e20] [c022fa94] alg_test+0x114/0x2e0
[ee1f7ea0] [c022cd1c] cryptomgr_test+0x4c/0x60
[ee1f7eb0] [c00497a4] kthread+0xc4/0xe0
[ee1f7f40] [c000f2fc] ret_from_kernel_thread+0x5c/0x64
Instruction dump:
41de01c8 80a9002c 2f850000 40fe0008 80a90008 80fa0018 3c60c06d 811a001c
3863f4a4 813a0020 815a0024 4830cd01 <0fe00000> 81340048 2f890000 40feff48

Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Acked-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
amery pushed a commit that referenced this issue Aug 4, 2014
state->buf_dma not being initialized can cause try_buf_map_to_sec4_sg
to try to free unallocated DMA memory:

caam_jr ffe301000.jr: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x000000002eb15068] [size=0 bytes]
WARNING: at lib/dma-debug.c:1080
Modules linked in: caamhash(+) [last unloaded: caamhash]
CPU: 0 PID: 1387 Comm: cryptomgr_test Tainted: G        W     3.16.0-rc1 #23
task: eed24e90 ti: eebd0000 task.ti: eebd0000
NIP: c02889fc LR: c02889fc CTR: c02d7020
REGS: eebd1a50 TRAP: 0700   Tainted: G        W      (3.16.0-rc1)
MSR: 00029002 <CE,EE,ME>  CR: 44042082  XER: 00000000

GPR00: c02889fc eebd1b00 eed24e90 0000008d c1de3478 c1de382c 00000000 00029002
GPR08: 00000007 00000000 01660000 00000000 24042082 00000000 c07a1900 eeda2a40
GPR16: 005d62a0 c078ad4c 00000000 eeb15068 c07e1e10 c0da1180 00029002 c0d97408
GPR24: c62497a0 00000014 eebd1b5 00000000 c078ad4c ee130210 00000000 2eb15068
NIP [c02889fc] check_unmap+0x8ac/0xab0
LR [c02889fc] check_unmap+0x8ac/0xab0
Call Trace:
[eebd1b00] [c02889fc] check_unmap+0x8ac/0xab0 (unreliable)
--- Exception: 0 at   (null)
    LR =   (null)
[eebd1b50] [c0288c78] debug_dma_unmap_page+0x78/0x90 (unreliable)
[eebd1bd0] [f956f738] ahash_final_ctx+0x6d8/0x7b0 [caamhash]
[eebd1c30] [c022ff4c] __test_hash+0x2ac/0x6c0
[eebd1de0] [c0230388] test_hash+0x28/0xb0
[eebd1e00] [c02304a4] alg_test_hash+0x94/0xc0
[eebd1e20] [c022fa94] alg_test+0x114/0x2e0
[eebd1ea0] [c022cd1c] cryptomgr_test+0x4c/0x60
[eebd1eb0] [c00497a4] kthread+0xc4/0xe0
[eebd1f40] [c000f2fc] ret_from_kernel_thread+0x5c/0x64
Instruction dump:
41de01c8 80a9002c 2f850000 40fe0008 80a90008 80fa0018 3c60c06d 811a001c
3863f4a4 813a0020 815a0024 4830cd01 <0fe00000> 81340048 2f890000 40feff48

Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Acked-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
amery pushed a commit that referenced this issue Oct 13, 2014
I'm getting the spew below when booting with Haswell (Xeon
E5-2699 v3) CPUs and the "Cluster-on-Die" (CoD) feature enabled
in the BIOS.  It seems similar to the issue that some folks from
AMD ran in to on their systems and addressed in this commit:

  161270f ("x86/smp: Fix topology checks on AMD MCM CPUs")

Both these Intel and AMD systems break an assumption which is
being enforced by topology_sane(): a socket may not contain more
than one NUMA node.

AMD special-cased their system by looking for a cpuid flag.  The
Intel mode is dependent on BIOS options and I do not know of a
way which it is enumerated other than the tables being parsed
during the CPU bringup process.  In other words, we have to trust
the ACPI tables <shudder>.

This detects the situation where a NUMA node occurs at a place in
the middle of the "CPU" sched domains.  It replaces the default
topology with one that relies on the NUMA information from the
firmware (SRAT table) for all levels of sched domains above the
hyperthreads.

This also fixes a sysfs bug.  We used to freak out when we saw
the "mc" group cross a node boundary, so we stopped building the
MC group.  MC gets exported as the 'core_siblings_list' in
/sys/devices/system/cpu/cpu*/topology/ and this caused CPUs with
the same 'physical_package_id' to not be listed together in
'core_siblings_list'.  This violates a statement from
Documentation/ABI/testing/sysfs-devices-system-cpu:

	core_siblings: internal kernel map of cpu#'s hardware threads
	within the same physical_package_id.

	core_siblings_list: human-readable list of the logical CPU
	numbers within the same physical_package_id as cpu#.

The sysfs effects here cause an issue with the hwloc tool where
it gets confused and thinks there are more sockets than are
physically present.

Before this patch, there are two packages:

# cd /sys/devices/system/cpu/
# cat cpu*/topology/physical_package_id | sort | uniq -c
     18 0
     18 1

But 4 _sets_ of core siblings:

# cat cpu*/topology/core_siblings_list | sort | uniq -c
      9 0-8
      9 18-26
      9 27-35
      9 9-17

After this set, there are only 2 sets of core siblings, which
is what we expect for a 2-socket system.

# cat cpu*/topology/physical_package_id | sort | uniq -c
     18 0
     18 1
# cat cpu*/topology/core_siblings_list | sort | uniq -c
     18 0-17
     18 18-35

Example spew:
...
	NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
	 #2  #3  #4  #5  #6  #7  #8
	.... node  #1, CPUs:    #9
	------------[ cut here ]------------
	WARNING: CPU: 9 PID: 0 at /home/ak/hle/linux-hle-2.6/arch/x86/kernel/smpboot.c:306 topology_sane.isra.2+0x74/0x90()
	sched: CPU #9's mc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
	Modules linked in:
	CPU: 9 PID: 0 Comm: swapper/9 Not tainted 3.17.0-rc1-00293-g8e01c4d-dirty torvalds#631
	Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRNDSDP1.86B.0036.R05.1407140519 07/14/2014
	0000000000000009 ffff88046ddabe00 ffffffff8172e485 ffff88046ddabe48
	ffff88046ddabe38 ffffffff8109691d 000000000000b001 0000000000000009
	ffff88086fc12580 000000000000b020 0000000000000009 ffff88046ddabe98
	Call Trace:
	[<ffffffff8172e485>] dump_stack+0x45/0x56
	[<ffffffff8109691d>] warn_slowpath_common+0x7d/0xa0
	[<ffffffff8109698c>] warn_slowpath_fmt+0x4c/0x50
	[<ffffffff81074f94>] topology_sane.isra.2+0x74/0x90
	[<ffffffff8107530e>] set_cpu_sibling_map+0x31e/0x4f0
	[<ffffffff8107568d>] start_secondary+0x1ad/0x240
	---[ end trace 3fe5f587a9fcde61 ]---
	#10 #11 #12 #13 #14 #15 #16 #17
	.... node  #2, CPUs:   #18 #19 #20 #21 #22 #23 #24 #25 #26
	.... node  #3, CPUs:   #27 #28 #29 #30 #31 #32 #33 #34 #35

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
[ Added LLC domain and s/match_mc/match_die/ ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: brice.goglin@gmail.com
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/20140918193334.C065EBCE@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
amery pushed a commit that referenced this issue Oct 29, 2014
This patch wires up the new syscall sys_bpf() on powerpc.

Passes the tests in samples/bpf:

    #0 add+sub+mul OK
    #1 unreachable OK
    #2 unreachable2 OK
    #3 out of range jump OK
    #4 out of range jump2 OK
    #5 test1 ld_imm64 OK
    #6 test2 ld_imm64 OK
    #7 test3 ld_imm64 OK
    #8 test4 ld_imm64 OK
    #9 test5 ld_imm64 OK
    #10 no bpf_exit OK
    #11 loop (back-edge) OK
    #12 loop2 (back-edge) OK
    #13 conditional loop OK
    #14 read uninitialized register OK
    #15 read invalid register OK
    #16 program doesn't init R0 before exit OK
    #17 stack out of bounds OK
    #18 invalid call insn1 OK
    #19 invalid call insn2 OK
    #20 invalid function call OK
    #21 uninitialized stack1 OK
    #22 uninitialized stack2 OK
    #23 check valid spill/fill OK
    #24 check corrupted spill/fill OK
    #25 invalid src register in STX OK
    #26 invalid dst register in STX OK
    #27 invalid dst register in ST OK
    #28 invalid src register in LDX OK
    #29 invalid dst register in LDX OK
    #30 junk insn OK
    #31 junk insn2 OK
    #32 junk insn3 OK
    #33 junk insn4 OK
    #34 junk insn5 OK
    #35 misaligned read from stack OK
    #36 invalid map_fd for function call OK
    #37 don't check return value before access OK
    #38 access memory with incorrect alignment OK
    #39 sometimes access memory with incorrect alignment OK
    #40 jump test 1 OK
    #41 jump test 2 OK
    #42 jump test 3 OK
    #43 jump test 4 OK

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
[mpe: test using samples/bpf]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
amery pushed a commit that referenced this issue Nov 3, 2014
The commit 2111667 ("ARM: pxa: call debug_ll_io_init for
earlyprintk") triggers in the current kernel the attached backtrace on
PXA/tosa early in the boot time when DEBUG_LL is enabled.

It is due to overlap between uart virtual memory defined in
DEBUG_UART_VIRT and mapped by debug_ll_io_init() and peripheral bus
mapped by pxa_map_io at the same address, 0xf2100000.

As hinted by Arnd, map early virtual memory for low level debug on
address 0xf6200000, even if that means 2 virtual mappings will give
access to the pxa internal UARTs (FFUART, BTUART, STUART, ...).

------------[ cut here ]------------
kernel BUG at /home/lumag/linux/mm/vmalloc.c:1143!
Internal error: Oops - BUG: 0 [#1] PREEMPT ARM
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 3.17.0-00032-g8e0d202-dirty #23
task: c062a5a8 ti: c0620000 task.ti: c0620000
PC is at vm_area_add_early+0x54/0x84
LR is at add_static_vm_early+0xc/0x60
pc : [<c03e1100>]    lr : [<c03d9ef4>]    psr: 800001d3
sp : c0621f04  ip : c03efa74  fp : c03edf84
r10: c0637e98  r9 : 40000001  r8 : c03da57c
r7 : c3ffcfb0  r6 : 00000000  r5 : c3ffcfb0  r4 : 02000000
r3 : c3ffcfd8  r2 : f2100000  r1 : f4000000  r0 : c3ffcfb0
Flags: Nzcv  IRQs off  FIQs off  Mode SVC_32  ISA ARM  Segment kernel
Control: 00007977  Table: a0004000  DAC: 00000017
Process swapper (pid: 0, stack limit = 0xc06201c8)
Stack: (0xc0621f04 to 0xc0622000)
1f00:          c3ffcfd8 40000001 c3ffcfd8 c03ee08c c03da570 c03db90c c0637d24
1f20: 00000000 c03ec7cc c066e654 a0700000 000a0700 c03db914 c03db90c c03daf84
1f40: 00000000 000a0000 c0000000 c03ec7cc 000a0700 c0700000 ffff100 000a3fff
1f60: 00001000 00000007 00000000 c03ec7cc c0008000 c03ed748 c0621fd4 c03d5d18
1f80: 69052d00 a03ec48c 00000000 c03d8ad0 0000006c 00007977 c036c6e8 00000001
1fa0: c0621fd4 c03ed744 c0628000 a0004000 69052d00 a03ec48c 00000000 c03d68d4
1fc0: 00000000 00000000 00000000 00000000 00000000 c03ed748 c0649894 c062801c
1fe0: c03ed744 c062b2f0 a0004000 69052d00 a03ec48c a0008040 00000000 00000000
[<c03e1100>] (vm_area_add_early) from [<c03d9ef4>] (add_static_vm_early+0xc/0x60)
[<c03d9ef4>] (add_static_vm_early) from [<c03da570>] (iotable_init.part.6+0xa8/0xb4)
[<c03da570>] (iotable_init.part.6) from [<c03db914>] (pxa25x_map_io+0x8/0x24)
[<c03db914>] (pxa25x_map_io) from [<c03daf84>] (paging_init+0x744/0x8d8)
[<c03daf84>] (paging_init) from [<c03d8ad0>] (setup_arch+0x354/0x608)
[<c03d8ad0>] (setup_arch) from [<c03d68d4>] (start_kernel+0xa8/0x3dc)
[<c03d68d4>] (start_kernel) from [<a0008040>] (0xa0008040)
Code: e5904008 e0811004 e1520001 2a000005 (e7f001f2)
---[ end trace f24b6c88ae00fa9a ]---
Kernel panic - not syncing: Attempted to kill the idle task!
---[ end Kernel panic - not syncing: Attempted to kill the idle task!

Cc: <stable@vger.kernel.org>
Reported-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Acked-by: Arnd Bergmann <arnd@arndb.de>
hramrach pushed a commit to hramrach/linux-sunxi that referenced this issue Aug 11, 2015
Nikolay has reported a hang when a memcg reclaim got stuck with the
following backtrace:

PID: 18308  TASK: ffff883d7c9b0a30  CPU: 1   COMMAND: "rsync"
  #0 __schedule at ffffffff815ab152
  linux-sunxi#1 schedule at ffffffff815ab76e
  linux-sunxi#2 schedule_timeout at ffffffff815ae5e5
  linux-sunxi#3 io_schedule_timeout at ffffffff815aad6a
  linux-sunxi#4 bit_wait_io at ffffffff815abfc6
  linux-sunxi#5 __wait_on_bit at ffffffff815abda5
  linux-sunxi#6 wait_on_page_bit at ffffffff8111fd4f
  linux-sunxi#7 shrink_page_list at ffffffff81135445
  linux-sunxi#8 shrink_inactive_list at ffffffff81135845
  linux-sunxi#9 shrink_lruvec at ffffffff81135ead
 linux-sunxi#10 shrink_zone at ffffffff811360c3
 linux-sunxi#11 shrink_zones at ffffffff81136eff
 linux-sunxi#12 do_try_to_free_pages at ffffffff8113712f
 linux-sunxi#13 try_to_free_mem_cgroup_pages at ffffffff811372be
 linux-sunxi#14 try_charge at ffffffff81189423
 linux-sunxi#15 mem_cgroup_try_charge at ffffffff8118c6f5
 linux-sunxi#16 __add_to_page_cache_locked at ffffffff8112137d
 linux-sunxi#17 add_to_page_cache_lru at ffffffff81121618
 linux-sunxi#18 pagecache_get_page at ffffffff8112170b
 linux-sunxi#19 grow_dev_page at ffffffff811c8297
 linux-sunxi#20 __getblk_slow at ffffffff811c91d6
 linux-sunxi#21 __getblk_gfp at ffffffff811c92c1
 linux-sunxi#22 ext4_ext_grow_indepth at ffffffff8124565c
 linux-sunxi#23 ext4_ext_create_new_leaf at ffffffff81246ca8
 linux-sunxi#24 ext4_ext_insert_extent at ffffffff81246f09
 linux-sunxi#25 ext4_ext_map_blocks at ffffffff8124a848
 linux-sunxi#26 ext4_map_blocks at ffffffff8121a5b7
 linux-sunxi#27 mpage_map_one_extent at ffffffff8121b1fa
 linux-sunxi#28 mpage_map_and_submit_extent at ffffffff8121f07b
 linux-sunxi#29 ext4_writepages at ffffffff8121f6d5
 linux-sunxi#30 do_writepages at ffffffff8112c490
 linux-sunxi#31 __filemap_fdatawrite_range at ffffffff81120199
 linux-sunxi#32 filemap_flush at ffffffff8112041c
 linux-sunxi#33 ext4_alloc_da_blocks at ffffffff81219da1
 linux-sunxi#34 ext4_rename at ffffffff81229b91
 linux-sunxi#35 ext4_rename2 at ffffffff81229e32
 linux-sunxi#36 vfs_rename at ffffffff811a08a5
 linux-sunxi#37 SYSC_renameat2 at ffffffff811a3ffc
 linux-sunxi#38 sys_renameat2 at ffffffff811a408e
 linux-sunxi#39 sys_rename at ffffffff8119e51e
 linux-sunxi#40 system_call_fastpath at ffffffff815afa89

Dave Chinner has properly pointed out that this is a deadlock in the
reclaim code because ext4 doesn't submit pages which are marked by
PG_writeback right away.

The heuristic was introduced by commit e62e384 ("memcg: prevent OOM
with too many dirty pages") and it was applied only when may_enter_fs
was specified.  The code has been changed by c3b94f4 ("memcg:
further prevent OOM with too many dirty pages") which has removed the
__GFP_FS restriction with a reasoning that we do not get into the fs
code.  But this is not sufficient apparently because the fs doesn't
necessarily submit pages marked PG_writeback for IO right away.

ext4_bio_write_page calls io_submit_add_bh but that doesn't necessarily
submit the bio.  Instead it tries to map more pages into the bio and
mpage_map_one_extent might trigger memcg charge which might end up
waiting on a page which is marked PG_writeback but hasn't been submitted
yet so we would end up waiting for something that never finishes.

Fix this issue by replacing __GFP_IO by may_enter_fs check (for case 2)
before we go to wait on the writeback.  The page fault path, which is
the only path that triggers memcg oom killer since 3.12, shouldn't
require GFP_NOFS and so we shouldn't reintroduce the premature OOM
killer issue which was originally addressed by the heuristic.

As per David Chinner the xfs is doing similar thing since 2.6.15 already
so ext4 is not the only affected filesystem.  Moreover he notes:

: For example: IO completion might require unwritten extent conversion
: which executes filesystem transactions and GFP_NOFS allocations. The
: writeback flag on the pages can not be cleared until unwritten
: extent conversion completes. Hence memory reclaim cannot wait on
: page writeback to complete in GFP_NOFS context because it is not
: safe to do so, memcg reclaim or otherwise.

Cc: stable@vger.kernel.org # 3.9+
[tytso@mit.edu: corrected the control flow]
Fixes: c3b94f4 ("memcg: further prevent OOM with too many dirty pages")
Reported-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Mar 10, 2019
commit d3d6a18 upstream.

wake_up_locked() may but does not have to be called with interrupts
disabled. Since the fuse filesystem calls wake_up_locked() without
disabling interrupts aio_poll_wake() may be called with interrupts
enabled. Since the kioctx.ctx_lock may be acquired from IRQ context,
all code that acquires that lock from thread context must disable
interrupts. Hence change the spin_trylock() call in aio_poll_wake()
into a spin_trylock_irqsave() call. This patch fixes the following
lockdep complaint:

=====================================================
WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
5.0.0-rc4-next-20190131 linux-sunxi#23 Not tainted
-----------------------------------------------------
syz-executor2/13779 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
0000000098ac1230 (&fiq->waitq){+.+.}, at: spin_lock include/linux/spinlock.h:329 [inline]
0000000098ac1230 (&fiq->waitq){+.+.}, at: aio_poll fs/aio.c:1772 [inline]
0000000098ac1230 (&fiq->waitq){+.+.}, at: __io_submit_one fs/aio.c:1875 [inline]
0000000098ac1230 (&fiq->waitq){+.+.}, at: io_submit_one+0xedf/0x1cf0 fs/aio.c:1908

and this task is already holding:
000000003c46111c (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq include/linux/spinlock.h:354 [inline]
000000003c46111c (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll fs/aio.c:1771 [inline]
000000003c46111c (&(&ctx->ctx_lock)->rlock){..-.}, at: __io_submit_one fs/aio.c:1875 [inline]
000000003c46111c (&(&ctx->ctx_lock)->rlock){..-.}, at: io_submit_one+0xeb6/0x1cf0 fs/aio.c:1908
which would create a new lock dependency:
 (&(&ctx->ctx_lock)->rlock){..-.} -> (&fiq->waitq){+.+.}

but this new dependency connects a SOFTIRQ-irq-safe lock:
 (&(&ctx->ctx_lock)->rlock){..-.}

... which became SOFTIRQ-irq-safe at:
  lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:3826
  __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline]
  _raw_spin_lock_irq+0x60/0x80 kernel/locking/spinlock.c:160
  spin_lock_irq include/linux/spinlock.h:354 [inline]
  free_ioctx_users+0x2d/0x4a0 fs/aio.c:610
  percpu_ref_put_many include/linux/percpu-refcount.h:285 [inline]
  percpu_ref_put include/linux/percpu-refcount.h:301 [inline]
  percpu_ref_call_confirm_rcu lib/percpu-refcount.c:123 [inline]
  percpu_ref_switch_to_atomic_rcu+0x3e7/0x520 lib/percpu-refcount.c:158
  __rcu_reclaim kernel/rcu/rcu.h:240 [inline]
  rcu_do_batch kernel/rcu/tree.c:2486 [inline]
  invoke_rcu_callbacks kernel/rcu/tree.c:2799 [inline]
  rcu_core+0x928/0x1390 kernel/rcu/tree.c:2780
  __do_softirq+0x266/0x95a kernel/softirq.c:292
  run_ksoftirqd kernel/softirq.c:654 [inline]
  run_ksoftirqd+0x8e/0x110 kernel/softirq.c:646
  smpboot_thread_fn+0x6ab/0xa10 kernel/smpboot.c:164
  kthread+0x357/0x430 kernel/kthread.c:247
  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352

to a SOFTIRQ-irq-unsafe lock:
 (&fiq->waitq){+.+.}

... which became SOFTIRQ-irq-unsafe at:
...
  lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:3826
  __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
  _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:144
  spin_lock include/linux/spinlock.h:329 [inline]
  flush_bg_queue+0x1f3/0x3c0 fs/fuse/dev.c:415
  fuse_request_queue_background+0x2d1/0x580 fs/fuse/dev.c:676
  fuse_request_send_background+0x58/0x120 fs/fuse/dev.c:687
  fuse_send_init fs/fuse/inode.c:989 [inline]
  fuse_fill_super+0x13bb/0x1730 fs/fuse/inode.c:1214
  mount_nodev+0x68/0x110 fs/super.c:1392
  fuse_mount+0x2d/0x40 fs/fuse/inode.c:1239
  legacy_get_tree+0xf2/0x200 fs/fs_context.c:590
  vfs_get_tree+0x123/0x450 fs/super.c:1481
  do_new_mount fs/namespace.c:2610 [inline]
  do_mount+0x1436/0x2c40 fs/namespace.c:2932
  ksys_mount+0xdb/0x150 fs/namespace.c:3148
  __do_sys_mount fs/namespace.c:3162 [inline]
  __se_sys_mount fs/namespace.c:3159 [inline]
  __x64_sys_mount+0xbe/0x150 fs/namespace.c:3159
  do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

other info that might help us debug this:

 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&fiq->waitq);
                               local_irq_disable();
                               lock(&(&ctx->ctx_lock)->rlock);
                               lock(&fiq->waitq);
  <Interrupt>
    lock(&(&ctx->ctx_lock)->rlock);

 *** DEADLOCK ***

1 lock held by syz-executor2/13779:
 #0: 000000003c46111c (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq include/linux/spinlock.h:354 [inline]
 #0: 000000003c46111c (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll fs/aio.c:1771 [inline]
 #0: 000000003c46111c (&(&ctx->ctx_lock)->rlock){..-.}, at: __io_submit_one fs/aio.c:1875 [inline]
 #0: 000000003c46111c (&(&ctx->ctx_lock)->rlock){..-.}, at: io_submit_one+0xeb6/0x1cf0 fs/aio.c:1908

the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
-> (&(&ctx->ctx_lock)->rlock){..-.} {
   IN-SOFTIRQ-W at:
                    lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:3826
                    __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline]
                    _raw_spin_lock_irq+0x60/0x80 kernel/locking/spinlock.c:160
                    spin_lock_irq include/linux/spinlock.h:354 [inline]
                    free_ioctx_users+0x2d/0x4a0 fs/aio.c:610
                    percpu_ref_put_many include/linux/percpu-refcount.h:285 [inline]
                    percpu_ref_put include/linux/percpu-refcount.h:301 [inline]
                    percpu_ref_call_confirm_rcu lib/percpu-refcount.c:123 [inline]
                    percpu_ref_switch_to_atomic_rcu+0x3e7/0x520 lib/percpu-refcount.c:158
                    __rcu_reclaim kernel/rcu/rcu.h:240 [inline]
                    rcu_do_batch kernel/rcu/tree.c:2486 [inline]
                    invoke_rcu_callbacks kernel/rcu/tree.c:2799 [inline]
                    rcu_core+0x928/0x1390 kernel/rcu/tree.c:2780
                    __do_softirq+0x266/0x95a kernel/softirq.c:292
                    run_ksoftirqd kernel/softirq.c:654 [inline]
                    run_ksoftirqd+0x8e/0x110 kernel/softirq.c:646
                    smpboot_thread_fn+0x6ab/0xa10 kernel/smpboot.c:164
                    kthread+0x357/0x430 kernel/kthread.c:247
                    ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
   INITIAL USE at:
                   lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:3826
                   __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline]
                   _raw_spin_lock_irq+0x60/0x80 kernel/locking/spinlock.c:160
                   spin_lock_irq include/linux/spinlock.h:354 [inline]
                   __do_sys_io_cancel fs/aio.c:2052 [inline]
                   __se_sys_io_cancel fs/aio.c:2035 [inline]
                   __x64_sys_io_cancel+0xd5/0x5a0 fs/aio.c:2035
                   do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
                   entry_SYSCALL_64_after_hwframe+0x49/0xbe
 }
 ... key      at: [<ffffffff8a574140>] __key.52370+0x0/0x40
 ... acquired at:
   lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:3826
   __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
   _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:144
   spin_lock include/linux/spinlock.h:329 [inline]
   aio_poll fs/aio.c:1772 [inline]
   __io_submit_one fs/aio.c:1875 [inline]
   io_submit_one+0xedf/0x1cf0 fs/aio.c:1908
   __do_sys_io_submit fs/aio.c:1953 [inline]
   __se_sys_io_submit fs/aio.c:1923 [inline]
   __x64_sys_io_submit+0x1bd/0x580 fs/aio.c:1923
   do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

the dependencies between the lock to be acquired
 and SOFTIRQ-irq-unsafe lock:
-> (&fiq->waitq){+.+.} {
   HARDIRQ-ON-W at:
                    lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:3826
                    __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
                    _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:144
                    spin_lock include/linux/spinlock.h:329 [inline]
                    flush_bg_queue+0x1f3/0x3c0 fs/fuse/dev.c:415
                    fuse_request_queue_background+0x2d1/0x580 fs/fuse/dev.c:676
                    fuse_request_send_background+0x58/0x120 fs/fuse/dev.c:687
                    fuse_send_init fs/fuse/inode.c:989 [inline]
                    fuse_fill_super+0x13bb/0x1730 fs/fuse/inode.c:1214
                    mount_nodev+0x68/0x110 fs/super.c:1392
                    fuse_mount+0x2d/0x40 fs/fuse/inode.c:1239
                    legacy_get_tree+0xf2/0x200 fs/fs_context.c:590
                    vfs_get_tree+0x123/0x450 fs/super.c:1481
                    do_new_mount fs/namespace.c:2610 [inline]
                    do_mount+0x1436/0x2c40 fs/namespace.c:2932
                    ksys_mount+0xdb/0x150 fs/namespace.c:3148
                    __do_sys_mount fs/namespace.c:3162 [inline]
                    __se_sys_mount fs/namespace.c:3159 [inline]
                    __x64_sys_mount+0xbe/0x150 fs/namespace.c:3159
                    do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
                    entry_SYSCALL_64_after_hwframe+0x49/0xbe
   SOFTIRQ-ON-W at:
                    lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:3826
                    __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
                    _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:144
                    spin_lock include/linux/spinlock.h:329 [inline]
                    flush_bg_queue+0x1f3/0x3c0 fs/fuse/dev.c:415
                    fuse_request_queue_background+0x2d1/0x580 fs/fuse/dev.c:676
                    fuse_request_send_background+0x58/0x120 fs/fuse/dev.c:687
                    fuse_send_init fs/fuse/inode.c:989 [inline]
                    fuse_fill_super+0x13bb/0x1730 fs/fuse/inode.c:1214
                    mount_nodev+0x68/0x110 fs/super.c:1392
                    fuse_mount+0x2d/0x40 fs/fuse/inode.c:1239
                    legacy_get_tree+0xf2/0x200 fs/fs_context.c:590
                    vfs_get_tree+0x123/0x450 fs/super.c:1481
                    do_new_mount fs/namespace.c:2610 [inline]
                    do_mount+0x1436/0x2c40 fs/namespace.c:2932
                    ksys_mount+0xdb/0x150 fs/namespace.c:3148
                    __do_sys_mount fs/namespace.c:3162 [inline]
                    __se_sys_mount fs/namespace.c:3159 [inline]
                    __x64_sys_mount+0xbe/0x150 fs/namespace.c:3159
                    do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
                    entry_SYSCALL_64_after_hwframe+0x49/0xbe
   INITIAL USE at:
                   lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:3826
                   __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
                   _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:144
                   spin_lock include/linux/spinlock.h:329 [inline]
                   flush_bg_queue+0x1f3/0x3c0 fs/fuse/dev.c:415
                   fuse_request_queue_background+0x2d1/0x580 fs/fuse/dev.c:676
                   fuse_request_send_background+0x58/0x120 fs/fuse/dev.c:687
                   fuse_send_init fs/fuse/inode.c:989 [inline]
                   fuse_fill_super+0x13bb/0x1730 fs/fuse/inode.c:1214
                   mount_nodev+0x68/0x110 fs/super.c:1392
                   fuse_mount+0x2d/0x40 fs/fuse/inode.c:1239
                   legacy_get_tree+0xf2/0x200 fs/fs_context.c:590
                   vfs_get_tree+0x123/0x450 fs/super.c:1481
                   do_new_mount fs/namespace.c:2610 [inline]
                   do_mount+0x1436/0x2c40 fs/namespace.c:2932
                   ksys_mount+0xdb/0x150 fs/namespace.c:3148
                   __do_sys_mount fs/namespace.c:3162 [inline]
                   __se_sys_mount fs/namespace.c:3159 [inline]
                   __x64_sys_mount+0xbe/0x150 fs/namespace.c:3159
                   do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
                   entry_SYSCALL_64_after_hwframe+0x49/0xbe
 }
 ... key      at: [<ffffffff8a60dec0>] __key.43450+0x0/0x40
 ... acquired at:
   lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:3826
   __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
   _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:144
   spin_lock include/linux/spinlock.h:329 [inline]
   aio_poll fs/aio.c:1772 [inline]
   __io_submit_one fs/aio.c:1875 [inline]
   io_submit_one+0xedf/0x1cf0 fs/aio.c:1908
   __do_sys_io_submit fs/aio.c:1953 [inline]
   __se_sys_io_submit fs/aio.c:1923 [inline]
   __x64_sys_io_submit+0x1bd/0x580 fs/aio.c:1923
   do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

stack backtrace:
CPU: 0 PID: 13779 Comm: syz-executor2 Not tainted 5.0.0-rc4-next-20190131 linux-sunxi#23
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 print_bad_irq_dependency kernel/locking/lockdep.c:1573 [inline]
 check_usage.cold+0x60f/0x940 kernel/locking/lockdep.c:1605
 check_irq_usage kernel/locking/lockdep.c:1650 [inline]
 check_prev_add_irq kernel/locking/lockdep_states.h:8 [inline]
 check_prev_add kernel/locking/lockdep.c:1860 [inline]
 check_prevs_add kernel/locking/lockdep.c:1968 [inline]
 validate_chain kernel/locking/lockdep.c:2339 [inline]
 __lock_acquire+0x1f12/0x4790 kernel/locking/lockdep.c:3320
 lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:3826
 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
 _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:144
 spin_lock include/linux/spinlock.h:329 [inline]
 aio_poll fs/aio.c:1772 [inline]
 __io_submit_one fs/aio.c:1875 [inline]
 io_submit_one+0xedf/0x1cf0 fs/aio.c:1908
 __do_sys_io_submit fs/aio.c:1953 [inline]
 __se_sys_io_submit fs/aio.c:1923 [inline]
 __x64_sys_io_submit+0x1bd/0x580 fs/aio.c:1923
 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Avi Kivity <avi@scylladb.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>
Fixes: e8693bc ("aio: allow direct aio poll comletions for keyed wakeups") # v4.19
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
[ bvanassche: added a comment ]
Reluctantly-Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Jul 26, 2019
[ Upstream commit d8655e7 ]

Commit 9da21b1 ("EDAC: Poll timeout cannot be zero, p2") assumes
edac_mc_poll_msec to be unsigned long, but the type of the variable still
remained as int. Setting edac_mc_poll_msec can trigger out-of-bounds
write.

Reproducer:

  # echo 1001 > /sys/module/edac_core/parameters/edac_mc_poll_msec

KASAN report:

  BUG: KASAN: global-out-of-bounds in edac_set_poll_msec+0x140/0x150
  Write of size 8 at addr ffffffffb91b2d00 by task bash/1996

  CPU: 1 PID: 1996 Comm: bash Not tainted 5.2.0-rc6+ linux-sunxi#23
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
  Call Trace:
   dump_stack+0xca/0x13e
   print_address_description.cold+0x5/0x246
   __kasan_report.cold+0x75/0x9a
   ? edac_set_poll_msec+0x140/0x150
   kasan_report+0xe/0x20
   edac_set_poll_msec+0x140/0x150
   ? dimmdev_location_show+0x30/0x30
   ? vfs_lock_file+0xe0/0xe0
   ? _raw_spin_lock+0x87/0xe0
   param_attr_store+0x1b5/0x310
   ? param_array_set+0x4f0/0x4f0
   module_attr_store+0x58/0x80
   ? module_attr_show+0x80/0x80
   sysfs_kf_write+0x13d/0x1a0
   kernfs_fop_write+0x2bc/0x460
   ? sysfs_kf_bin_read+0x270/0x270
   ? kernfs_notify+0x1f0/0x1f0
   __vfs_write+0x81/0x100
   vfs_write+0x1e1/0x560
   ksys_write+0x126/0x250
   ? __ia32_sys_read+0xb0/0xb0
   ? do_syscall_64+0x1f/0x390
   do_syscall_64+0xc1/0x390
   entry_SYSCALL_64_after_hwframe+0x49/0xbe
  RIP: 0033:0x7fa7caa5e970
  Code: 73 01 c3 48 8b 0d 28 d5 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 99 2d 2c 00 00 75 10 b8 01 00 00 00 04
  RSP: 002b:00007fff6acfdfe8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
  RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fa7caa5e970
  RDX: 0000000000000005 RSI: 0000000000e95c08 RDI: 0000000000000001
  RBP: 0000000000e95c08 R08: 00007fa7cad1e760 R09: 00007fa7cb36a700
  R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000005
  R13: 0000000000000001 R14: 00007fa7cad1d600 R15: 0000000000000005

  The buggy address belongs to the variable:
   edac_mc_poll_msec+0x0/0x40

  Memory state around the buggy address:
   ffffffffb91b2c00: 00 00 00 00 fa fa fa fa 00 00 00 00 fa fa fa fa
   ffffffffb91b2c80: 00 00 00 00 fa fa fa fa 00 00 00 00 fa fa fa fa
  >ffffffffb91b2d00: 04 fa fa fa fa fa fa fa 04 fa fa fa fa fa fa fa
                     ^
   ffffffffb91b2d80: 04 fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
   ffffffffb91b2e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Fix it by changing the type of edac_mc_poll_msec to unsigned int.
The reason why this patch adopts unsigned int rather than unsigned long
is msecs_to_jiffies() assumes arg to be unsigned int. We can avoid
integer conversion bugs and unsigned int will be large enough for
edac_mc_poll_msec.

Reviewed-by: James Morse <james.morse@arm.com>
Fixes: 9da21b1 ("EDAC: Poll timeout cannot be zero, p2")
Signed-off-by: Eiichi Tsukata <devel@etsukata.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
jwrdegoede pushed a commit to jwrdegoede/linux-sunxi that referenced this issue Aug 10, 2019
syzbot reported the following crash [0]:

BUG: KASAN: use-after-free in usb_free_coherent+0x79/0x80
drivers/usb/core/usb.c:928
Read of size 8 at addr ffff8881b18599c8 by task syz-executor.4/16007

CPU: 0 PID: 16007 Comm: syz-executor.4 Not tainted 5.3.0-rc2+ linux-sunxi#23
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0xca/0x13e lib/dump_stack.c:113
  print_address_description+0x6a/0x32c mm/kasan/report.c:351
  __kasan_report.cold+0x1a/0x33 mm/kasan/report.c:482
  kasan_report+0xe/0x12 mm/kasan/common.c:612
  usb_free_coherent+0x79/0x80 drivers/usb/core/usb.c:928
  yurex_delete+0x138/0x330 drivers/usb/misc/yurex.c:100
  kref_put include/linux/kref.h:65 [inline]
  yurex_release+0x66/0x90 drivers/usb/misc/yurex.c:392
  __fput+0x2d7/0x840 fs/file_table.c:280
  task_work_run+0x13f/0x1c0 kernel/task_work.c:113
  tracehook_notify_resume include/linux/tracehook.h:188 [inline]
  exit_to_usermode_loop+0x1d2/0x200 arch/x86/entry/common.c:163
  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
  syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
  do_syscall_64+0x45f/0x580 arch/x86/entry/common.c:299
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x413511
Code: 75 14 b8 03 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 04 1b 00 00 c3 48
83 ec 08 e8 0a fc ff ff 48 89 04 24 b8 03 00 00 00 0f 05 <48> 8b 3c 24 48
89 c2 e8 53 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01
RSP: 002b:00007ffc424ea2e0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000000000000007 RCX: 0000000000413511
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000006
RBP: 0000000000000001 R08: 0000000029a2fc22 R09: 0000000029a2fc26
R10: 00007ffc424ea3c0 R11: 0000000000000293 R12: 000000000075c9a0
R13: 000000000075c9a0 R14: 0000000000761938 R15: ffffffffffffffff

Allocated by task 2776:
  save_stack+0x1b/0x80 mm/kasan/common.c:69
  set_track mm/kasan/common.c:77 [inline]
  __kasan_kmalloc mm/kasan/common.c:487 [inline]
  __kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:460
  kmalloc include/linux/slab.h:552 [inline]
  kzalloc include/linux/slab.h:748 [inline]
  usb_alloc_dev+0x51/0xf95 drivers/usb/core/usb.c:583
  hub_port_connect drivers/usb/core/hub.c:5004 [inline]
  hub_port_connect_change drivers/usb/core/hub.c:5213 [inline]
  port_event drivers/usb/core/hub.c:5359 [inline]
  hub_event+0x15c0/0x3640 drivers/usb/core/hub.c:5441
  process_one_work+0x92b/0x1530 kernel/workqueue.c:2269
  worker_thread+0x96/0xe20 kernel/workqueue.c:2415
  kthread+0x318/0x420 kernel/kthread.c:255
  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

Freed by task 16007:
  save_stack+0x1b/0x80 mm/kasan/common.c:69
  set_track mm/kasan/common.c:77 [inline]
  __kasan_slab_free+0x130/0x180 mm/kasan/common.c:449
  slab_free_hook mm/slub.c:1423 [inline]
  slab_free_freelist_hook mm/slub.c:1470 [inline]
  slab_free mm/slub.c:3012 [inline]
  kfree+0xe4/0x2f0 mm/slub.c:3953
  device_release+0x71/0x200 drivers/base/core.c:1064
  kobject_cleanup lib/kobject.c:693 [inline]
  kobject_release lib/kobject.c:722 [inline]
  kref_put include/linux/kref.h:65 [inline]
  kobject_put+0x171/0x280 lib/kobject.c:739
  put_device+0x1b/0x30 drivers/base/core.c:2213
  usb_put_dev+0x1f/0x30 drivers/usb/core/usb.c:725
  yurex_delete+0x40/0x330 drivers/usb/misc/yurex.c:95
  kref_put include/linux/kref.h:65 [inline]
  yurex_release+0x66/0x90 drivers/usb/misc/yurex.c:392
  __fput+0x2d7/0x840 fs/file_table.c:280
  task_work_run+0x13f/0x1c0 kernel/task_work.c:113
  tracehook_notify_resume include/linux/tracehook.h:188 [inline]
  exit_to_usermode_loop+0x1d2/0x200 arch/x86/entry/common.c:163
  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
  syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
  do_syscall_64+0x45f/0x580 arch/x86/entry/common.c:299
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at ffff8881b1859980
  which belongs to the cache kmalloc-2k of size 2048
The buggy address is located 72 bytes inside of
  2048-byte region [ffff8881b1859980, ffff8881b185a180)
The buggy address belongs to the page:
page:ffffea0006c61600 refcount:1 mapcount:0 mapping:ffff8881da00c000
index:0x0 compound_mapcount: 0
flags: 0x200000000010200(slab|head)
raw: 0200000000010200 0000000000000000 0000000100000001 ffff8881da00c000
raw: 0000000000000000 00000000000f000f 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
  ffff8881b1859880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  ffff8881b1859900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ffff8881b1859980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                               ^
  ffff8881b1859a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff8881b1859a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

A quick look at the yurex_delete() shows that we drop the reference
to the usb_device before releasing any buffers associated with the
device. Delay the reference drop until we have finished the cleanup.

[0] https://lore.kernel.org/lkml/0000000000003f86d8058f0bd671@google.com/

Fixes: 6bc235a ("USB: add driver for Meywa-Denki & Kayac YUREX")
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tomoki Sekiyama <tomoki.sekiyama@gmail.com>
Cc: Oliver Neukum <oneukum@suse.com>
Cc: andreyknvl@google.com
Cc: gregkh@linuxfoundation.org
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: syzkaller-bugs@googlegroups.com
Cc: dtor@chromium.org
Reported-by: syzbot+d1fedb1c1fdb07fca507@syzkaller.appspotmail.com
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20190805111528.6758-1-suzuki.poulose@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
jwrdegoede pushed a commit to jwrdegoede/linux-sunxi that referenced this issue Aug 10, 2019
A deadlock with this stacktrace was observed.

The loop thread does a GFP_KERNEL allocation, it calls into dm-bufio
shrinker and the shrinker depends on I/O completion in the dm-bufio
subsystem.

In order to fix the deadlock (and other similar ones), we set the flag
PF_MEMALLOC_NOIO at loop thread entry.

PID: 474    TASK: ffff8813e11f4600  CPU: 10  COMMAND: "kswapd0"
   #0 [ffff8813dedfb938] __schedule at ffffffff8173f405
   #1 [ffff8813dedfb990] schedule at ffffffff8173fa27
   #2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec
   #3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186
   #4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f
   #5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8
   #6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81
   linux-sunxi#7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio]
   linux-sunxi#8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio]
   linux-sunxi#9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio]
  linux-sunxi#10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce
  linux-sunxi#11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778
  linux-sunxi#12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f
  linux-sunxi#13 [ffff8813dedfbec0] kthread at ffffffff810a8428
  linux-sunxi#14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242

  PID: 14127  TASK: ffff881455749c00  CPU: 11  COMMAND: "loop1"
   #0 [ffff88272f5af228] __schedule at ffffffff8173f405
   #1 [ffff88272f5af280] schedule at ffffffff8173fa27
   #2 [ffff88272f5af2a0] schedule_preempt_disabled at ffffffff8173fd5e
   #3 [ffff88272f5af2b0] __mutex_lock_slowpath at ffffffff81741fb5
   #4 [ffff88272f5af330] mutex_lock at ffffffff81742133
   #5 [ffff88272f5af350] dm_bufio_shrink_count at ffffffffa03865f9 [dm_bufio]
   #6 [ffff88272f5af380] shrink_slab at ffffffff811a86bd
   linux-sunxi#7 [ffff88272f5af470] shrink_zone at ffffffff811ad778
   linux-sunxi#8 [ffff88272f5af500] do_try_to_free_pages at ffffffff811adb34
   linux-sunxi#9 [ffff88272f5af590] try_to_free_pages at ffffffff811adef8
  linux-sunxi#10 [ffff88272f5af610] __alloc_pages_nodemask at ffffffff811a09c3
  linux-sunxi#11 [ffff88272f5af710] alloc_pages_current at ffffffff811e8b71
  linux-sunxi#12 [ffff88272f5af760] new_slab at ffffffff811f4523
  linux-sunxi#13 [ffff88272f5af7b0] __slab_alloc at ffffffff8173a1b5
  linux-sunxi#14 [ffff88272f5af880] kmem_cache_alloc at ffffffff811f484b
  linux-sunxi#15 [ffff88272f5af8d0] do_blockdev_direct_IO at ffffffff812535b3
  linux-sunxi#16 [ffff88272f5afb00] __blockdev_direct_IO at ffffffff81255dc3
  linux-sunxi#17 [ffff88272f5afb30] xfs_vm_direct_IO at ffffffffa01fe3fc [xfs]
  linux-sunxi#18 [ffff88272f5afb90] generic_file_read_iter at ffffffff81198994
  linux-sunxi#19 [ffff88272f5afc50] __dta_xfs_file_read_iter_2398 at ffffffffa020c970 [xfs]
  linux-sunxi#20 [ffff88272f5afcc0] lo_rw_aio at ffffffffa0377042 [loop]
  linux-sunxi#21 [ffff88272f5afd70] loop_queue_work at ffffffffa0377c3b [loop]
  linux-sunxi#22 [ffff88272f5afe60] kthread_worker_fn at ffffffff810a8a0c
  linux-sunxi#23 [ffff88272f5afec0] kthread at ffffffff810a8428
  linux-sunxi#24 [ffff88272f5aff50] ret_from_fork at ffffffff81745242

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Sep 13, 2019
commit d0a255e upstream.

A deadlock with this stacktrace was observed.

The loop thread does a GFP_KERNEL allocation, it calls into dm-bufio
shrinker and the shrinker depends on I/O completion in the dm-bufio
subsystem.

In order to fix the deadlock (and other similar ones), we set the flag
PF_MEMALLOC_NOIO at loop thread entry.

PID: 474    TASK: ffff8813e11f4600  CPU: 10  COMMAND: "kswapd0"
   #0 [ffff8813dedfb938] __schedule at ffffffff8173f405
   jwrdegoede#1 [ffff8813dedfb990] schedule at ffffffff8173fa27
   jwrdegoede#2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec
   jwrdegoede#3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186
   jwrdegoede#4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f
   jwrdegoede#5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8
   jwrdegoede#6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81
   linux-sunxi#7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio]
   linux-sunxi#8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio]
   linux-sunxi#9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio]
  linux-sunxi#10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce
  linux-sunxi#11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778
  linux-sunxi#12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f
  linux-sunxi#13 [ffff8813dedfbec0] kthread at ffffffff810a8428
  linux-sunxi#14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242

  PID: 14127  TASK: ffff881455749c00  CPU: 11  COMMAND: "loop1"
   #0 [ffff88272f5af228] __schedule at ffffffff8173f405
   jwrdegoede#1 [ffff88272f5af280] schedule at ffffffff8173fa27
   jwrdegoede#2 [ffff88272f5af2a0] schedule_preempt_disabled at ffffffff8173fd5e
   jwrdegoede#3 [ffff88272f5af2b0] __mutex_lock_slowpath at ffffffff81741fb5
   jwrdegoede#4 [ffff88272f5af330] mutex_lock at ffffffff81742133
   jwrdegoede#5 [ffff88272f5af350] dm_bufio_shrink_count at ffffffffa03865f9 [dm_bufio]
   jwrdegoede#6 [ffff88272f5af380] shrink_slab at ffffffff811a86bd
   linux-sunxi#7 [ffff88272f5af470] shrink_zone at ffffffff811ad778
   linux-sunxi#8 [ffff88272f5af500] do_try_to_free_pages at ffffffff811adb34
   linux-sunxi#9 [ffff88272f5af590] try_to_free_pages at ffffffff811adef8
  linux-sunxi#10 [ffff88272f5af610] __alloc_pages_nodemask at ffffffff811a09c3
  linux-sunxi#11 [ffff88272f5af710] alloc_pages_current at ffffffff811e8b71
  linux-sunxi#12 [ffff88272f5af760] new_slab at ffffffff811f4523
  linux-sunxi#13 [ffff88272f5af7b0] __slab_alloc at ffffffff8173a1b5
  linux-sunxi#14 [ffff88272f5af880] kmem_cache_alloc at ffffffff811f484b
  linux-sunxi#15 [ffff88272f5af8d0] do_blockdev_direct_IO at ffffffff812535b3
  linux-sunxi#16 [ffff88272f5afb00] __blockdev_direct_IO at ffffffff81255dc3
  linux-sunxi#17 [ffff88272f5afb30] xfs_vm_direct_IO at ffffffffa01fe3fc [xfs]
  linux-sunxi#18 [ffff88272f5afb90] generic_file_read_iter at ffffffff81198994
  linux-sunxi#19 [ffff88272f5afc50] __dta_xfs_file_read_iter_2398 at ffffffffa020c970 [xfs]
  linux-sunxi#20 [ffff88272f5afcc0] lo_rw_aio at ffffffffa0377042 [loop]
  linux-sunxi#21 [ffff88272f5afd70] loop_queue_work at ffffffffa0377c3b [loop]
  linux-sunxi#22 [ffff88272f5afe60] kthread_worker_fn at ffffffff810a8a0c
  linux-sunxi#23 [ffff88272f5afec0] kthread at ffffffff810a8428
  linux-sunxi#24 [ffff88272f5aff50] ret_from_fork at ffffffff81745242

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Sep 13, 2019
commit fc05481 upstream.

syzbot reported the following crash [0]:

BUG: KASAN: use-after-free in usb_free_coherent+0x79/0x80
drivers/usb/core/usb.c:928
Read of size 8 at addr ffff8881b18599c8 by task syz-executor.4/16007

CPU: 0 PID: 16007 Comm: syz-executor.4 Not tainted 5.3.0-rc2+ linux-sunxi#23
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0xca/0x13e lib/dump_stack.c:113
  print_address_description+0x6a/0x32c mm/kasan/report.c:351
  __kasan_report.cold+0x1a/0x33 mm/kasan/report.c:482
  kasan_report+0xe/0x12 mm/kasan/common.c:612
  usb_free_coherent+0x79/0x80 drivers/usb/core/usb.c:928
  yurex_delete+0x138/0x330 drivers/usb/misc/yurex.c:100
  kref_put include/linux/kref.h:65 [inline]
  yurex_release+0x66/0x90 drivers/usb/misc/yurex.c:392
  __fput+0x2d7/0x840 fs/file_table.c:280
  task_work_run+0x13f/0x1c0 kernel/task_work.c:113
  tracehook_notify_resume include/linux/tracehook.h:188 [inline]
  exit_to_usermode_loop+0x1d2/0x200 arch/x86/entry/common.c:163
  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
  syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
  do_syscall_64+0x45f/0x580 arch/x86/entry/common.c:299
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x413511
Code: 75 14 b8 03 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 04 1b 00 00 c3 48
83 ec 08 e8 0a fc ff ff 48 89 04 24 b8 03 00 00 00 0f 05 <48> 8b 3c 24 48
89 c2 e8 53 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01
RSP: 002b:00007ffc424ea2e0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000000000000007 RCX: 0000000000413511
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000006
RBP: 0000000000000001 R08: 0000000029a2fc22 R09: 0000000029a2fc26
R10: 00007ffc424ea3c0 R11: 0000000000000293 R12: 000000000075c9a0
R13: 000000000075c9a0 R14: 0000000000761938 R15: ffffffffffffffff

Allocated by task 2776:
  save_stack+0x1b/0x80 mm/kasan/common.c:69
  set_track mm/kasan/common.c:77 [inline]
  __kasan_kmalloc mm/kasan/common.c:487 [inline]
  __kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:460
  kmalloc include/linux/slab.h:552 [inline]
  kzalloc include/linux/slab.h:748 [inline]
  usb_alloc_dev+0x51/0xf95 drivers/usb/core/usb.c:583
  hub_port_connect drivers/usb/core/hub.c:5004 [inline]
  hub_port_connect_change drivers/usb/core/hub.c:5213 [inline]
  port_event drivers/usb/core/hub.c:5359 [inline]
  hub_event+0x15c0/0x3640 drivers/usb/core/hub.c:5441
  process_one_work+0x92b/0x1530 kernel/workqueue.c:2269
  worker_thread+0x96/0xe20 kernel/workqueue.c:2415
  kthread+0x318/0x420 kernel/kthread.c:255
  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

Freed by task 16007:
  save_stack+0x1b/0x80 mm/kasan/common.c:69
  set_track mm/kasan/common.c:77 [inline]
  __kasan_slab_free+0x130/0x180 mm/kasan/common.c:449
  slab_free_hook mm/slub.c:1423 [inline]
  slab_free_freelist_hook mm/slub.c:1470 [inline]
  slab_free mm/slub.c:3012 [inline]
  kfree+0xe4/0x2f0 mm/slub.c:3953
  device_release+0x71/0x200 drivers/base/core.c:1064
  kobject_cleanup lib/kobject.c:693 [inline]
  kobject_release lib/kobject.c:722 [inline]
  kref_put include/linux/kref.h:65 [inline]
  kobject_put+0x171/0x280 lib/kobject.c:739
  put_device+0x1b/0x30 drivers/base/core.c:2213
  usb_put_dev+0x1f/0x30 drivers/usb/core/usb.c:725
  yurex_delete+0x40/0x330 drivers/usb/misc/yurex.c:95
  kref_put include/linux/kref.h:65 [inline]
  yurex_release+0x66/0x90 drivers/usb/misc/yurex.c:392
  __fput+0x2d7/0x840 fs/file_table.c:280
  task_work_run+0x13f/0x1c0 kernel/task_work.c:113
  tracehook_notify_resume include/linux/tracehook.h:188 [inline]
  exit_to_usermode_loop+0x1d2/0x200 arch/x86/entry/common.c:163
  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
  syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
  do_syscall_64+0x45f/0x580 arch/x86/entry/common.c:299
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at ffff8881b1859980
  which belongs to the cache kmalloc-2k of size 2048
The buggy address is located 72 bytes inside of
  2048-byte region [ffff8881b1859980, ffff8881b185a180)
The buggy address belongs to the page:
page:ffffea0006c61600 refcount:1 mapcount:0 mapping:ffff8881da00c000
index:0x0 compound_mapcount: 0
flags: 0x200000000010200(slab|head)
raw: 0200000000010200 0000000000000000 0000000100000001 ffff8881da00c000
raw: 0000000000000000 00000000000f000f 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
  ffff8881b1859880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  ffff8881b1859900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ffff8881b1859980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                               ^
  ffff8881b1859a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff8881b1859a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

A quick look at the yurex_delete() shows that we drop the reference
to the usb_device before releasing any buffers associated with the
device. Delay the reference drop until we have finished the cleanup.

[0] https://lore.kernel.org/lkml/0000000000003f86d8058f0bd671@google.com/

Fixes: 6bc235a ("USB: add driver for Meywa-Denki & Kayac YUREX")
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tomoki Sekiyama <tomoki.sekiyama@gmail.com>
Cc: Oliver Neukum <oneukum@suse.com>
Cc: andreyknvl@google.com
Cc: gregkh@linuxfoundation.org
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: syzkaller-bugs@googlegroups.com
Cc: dtor@chromium.org
Reported-by: syzbot+d1fedb1c1fdb07fca507@syzkaller.appspotmail.com
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20190805111528.6758-1-suzuki.poulose@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
jwrdegoede pushed a commit to jwrdegoede/linux-sunxi that referenced this issue Nov 5, 2019
Turns out hotplugging CPUs that are in exclusive cpusets can lead to the
cpuset code feeding empty cpumasks to the sched domain rebuild machinery.

This leads to the following splat:

    Internal error: Oops: 96000004 [#1] PREEMPT SMP
    Modules linked in:
    CPU: 0 PID: 235 Comm: kworker/5:2 Not tainted 5.4.0-rc1-00005-g8d495477d62e linux-sunxi#23
    Hardware name: ARM Juno development board (r0) (DT)
    Workqueue: events cpuset_hotplug_workfn
    pstate: 60000005 (nZCv daif -PAN -UAO)
    pc : build_sched_domains (./include/linux/arch_topology.h:23 kernel/sched/topology.c:1898 kernel/sched/topology.c:1969)
    lr : build_sched_domains (kernel/sched/topology.c:1966)
    Call trace:
    build_sched_domains (./include/linux/arch_topology.h:23 kernel/sched/topology.c:1898 kernel/sched/topology.c:1969)
    partition_sched_domains_locked (kernel/sched/topology.c:2250)
    rebuild_sched_domains_locked (./include/linux/bitmap.h:370 ./include/linux/cpumask.h:538 kernel/cgroup/cpuset.c:955 kernel/cgroup/cpuset.c:978 kernel/cgroup/cpuset.c:1019)
    rebuild_sched_domains (kernel/cgroup/cpuset.c:1032)
    cpuset_hotplug_workfn (kernel/cgroup/cpuset.c:3205 (discriminator 2))
    process_one_work (./arch/arm64/include/asm/jump_label.h:21 ./include/linux/jump_label.h:200 ./include/trace/events/workqueue.h:114 kernel/workqueue.c:2274)
    worker_thread (./include/linux/compiler.h:199 ./include/linux/list.h:268 kernel/workqueue.c:2416)
    kthread (kernel/kthread.c:255)
    ret_from_fork (arch/arm64/kernel/entry.S:1167)
    Code: f860dae2 912802d6 aa1603e1 12800000 (f8616853)

The faulty line in question is:

  cap = arch_scale_cpu_capacity(cpumask_first(cpu_map));

and we're not checking the return value against nr_cpu_ids (we shouldn't
have to!), which leads to the above.

Prevent generate_sched_domains() from returning empty cpumasks, and add
some assertion in build_sched_domains() to scream bloody murder if it
happens again.

The above splat was obtained on my Juno r0 with the following reproducer:

  $ cgcreate -g cpuset:asym
  $ cgset -r cpuset.cpus=0-3 asym
  $ cgset -r cpuset.mems=0 asym
  $ cgset -r cpuset.cpu_exclusive=1 asym

  $ cgcreate -g cpuset:smp
  $ cgset -r cpuset.cpus=4-5 smp
  $ cgset -r cpuset.mems=0 smp
  $ cgset -r cpuset.cpu_exclusive=1 smp

  $ cgset -r cpuset.sched_load_balance=0 .

  $ echo 0 > /sys/devices/system/cpu/cpu4/online
  $ echo 0 > /sys/devices/system/cpu/cpu5/online

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dietmar.Eggemann@arm.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: hannes@cmpxchg.org
Cc: lizefan@huawei.com
Cc: morten.rasmussen@arm.com
Cc: qperret@google.com
Cc: tj@kernel.org
Cc: vincent.guittot@linaro.org
Fixes: 05484e0 ("sched/topology: Add SD_ASYM_CPUCAPACITY flag detection")
Link: https://lkml.kernel.org/r/20191023153745.19515-2-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
jwrdegoede pushed a commit to jwrdegoede/linux-sunxi that referenced this issue Jan 12, 2020
Before commit 4bfc0bb ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself")
cgroup bpf structures were released with
corresponding cgroup structures. It guaranteed the hierarchical order
of destruction: children were always first. It preserved attached
programs from being released before their propagated copies.

But with cgroup auto-detachment there are no such guarantees anymore:
cgroup bpf is released as soon as the cgroup is offline and there are
no live associated sockets. It means that an attached program can be
detached and released, while its propagated copy is still living
in the cgroup subtree. This will obviously lead to an use-after-free
bug.

To reproduce the issue the following script can be used:

  #!/bin/bash

  CGROOT=/sys/fs/cgroup

  mkdir -p ${CGROOT}/A ${CGROOT}/B ${CGROOT}/A/C
  sleep 1

  ./test_cgrp2_attach ${CGROOT}/A egress &
  A_PID=$!
  ./test_cgrp2_attach ${CGROOT}/B egress &
  B_PID=$!

  echo $$ > ${CGROOT}/A/C/cgroup.procs
  iperf -s &
  S_PID=$!
  iperf -c localhost -t 100 &
  C_PID=$!

  sleep 1

  echo $$ > ${CGROOT}/B/cgroup.procs
  echo ${S_PID} > ${CGROOT}/B/cgroup.procs
  echo ${C_PID} > ${CGROOT}/B/cgroup.procs

  sleep 1

  rmdir ${CGROOT}/A/C
  rmdir ${CGROOT}/A

  sleep 1

  kill -9 ${S_PID} ${C_PID} ${A_PID} ${B_PID}

On the unpatched kernel the following stacktrace can be obtained:

[   33.619799] BUG: unable to handle page fault for address: ffffbdb4801ab002
[   33.620677] #PF: supervisor read access in kernel mode
[   33.621293] #PF: error_code(0x0000) - not-present page
[   33.622754] Oops: 0000 [#1] SMP NOPTI
[   33.623202] CPU: 0 PID: 601 Comm: iperf Not tainted 5.5.0-rc2+ linux-sunxi#23
[   33.625545] RIP: 0010:__cgroup_bpf_run_filter_skb+0x29f/0x3d0
[   33.635809] Call Trace:
[   33.636118]  ? __cgroup_bpf_run_filter_skb+0x2bf/0x3d0
[   33.636728]  ? __switch_to_asm+0x40/0x70
[   33.637196]  ip_finish_output+0x68/0xa0
[   33.637654]  ip_output+0x76/0xf0
[   33.638046]  ? __ip_finish_output+0x1c0/0x1c0
[   33.638576]  __ip_queue_xmit+0x157/0x410
[   33.639049]  __tcp_transmit_skb+0x535/0xaf0
[   33.639557]  tcp_write_xmit+0x378/0x1190
[   33.640049]  ? _copy_from_iter_full+0x8d/0x260
[   33.640592]  tcp_sendmsg_locked+0x2a2/0xdc0
[   33.641098]  ? sock_has_perm+0x10/0xa0
[   33.641574]  tcp_sendmsg+0x28/0x40
[   33.641985]  sock_sendmsg+0x57/0x60
[   33.642411]  sock_write_iter+0x97/0x100
[   33.642876]  new_sync_write+0x1b6/0x1d0
[   33.643339]  vfs_write+0xb6/0x1a0
[   33.643752]  ksys_write+0xa7/0xe0
[   33.644156]  do_syscall_64+0x5b/0x1b0
[   33.644605]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fix this by grabbing a reference to the bpf structure of each ancestor
on the initialization of the cgroup bpf structure, and dropping the
reference at the end of releasing the cgroup bpf structure.

This will restore the hierarchical order of cgroup bpf releasing,
without adding any operations on hot paths.

Thanks to Josef Bacik for the debugging and the initial analysis of
the problem.

Fixes: 4bfc0bb ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself")
Reported-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
jwrdegoede pushed a commit to jwrdegoede/linux-sunxi that referenced this issue Mar 30, 2020
When experimenting with bpf_send_signal() helper in our production
environment (5.2 based), we experienced a deadlock in NMI mode:
   #5 [ffffc9002219f770] queued_spin_lock_slowpath at ffffffff8110be24
   #6 [ffffc9002219f770] _raw_spin_lock_irqsave at ffffffff81a43012
   linux-sunxi#7 [ffffc9002219f780] try_to_wake_up at ffffffff810e7ecd
   linux-sunxi#8 [ffffc9002219f7e0] signal_wake_up_state at ffffffff810c7b55
   linux-sunxi#9 [ffffc9002219f7f0] __send_signal at ffffffff810c8602
  linux-sunxi#10 [ffffc9002219f830] do_send_sig_info at ffffffff810ca31a
  linux-sunxi#11 [ffffc9002219f868] bpf_send_signal at ffffffff8119d227
  linux-sunxi#12 [ffffc9002219f988] bpf_overflow_handler at ffffffff811d4140
  linux-sunxi#13 [ffffc9002219f9e0] __perf_event_overflow at ffffffff811d68cf
  linux-sunxi#14 [ffffc9002219fa10] perf_swevent_overflow at ffffffff811d6a09
  linux-sunxi#15 [ffffc9002219fa38] ___perf_sw_event at ffffffff811e0f47
  linux-sunxi#16 [ffffc9002219fc30] __schedule at ffffffff81a3e04d
  linux-sunxi#17 [ffffc9002219fc90] schedule at ffffffff81a3e219
  linux-sunxi#18 [ffffc9002219fca0] futex_wait_queue_me at ffffffff8113d1b9
  linux-sunxi#19 [ffffc9002219fcd8] futex_wait at ffffffff8113e529
  linux-sunxi#20 [ffffc9002219fdf0] do_futex at ffffffff8113ffbc
  linux-sunxi#21 [ffffc9002219fec0] __x64_sys_futex at ffffffff81140d1c
  linux-sunxi#22 [ffffc9002219ff38] do_syscall_64 at ffffffff81002602
  linux-sunxi#23 [ffffc9002219ff50] entry_SYSCALL_64_after_hwframe at ffffffff81c00068

The above call stack is actually very similar to an issue
reported by Commit eac9153 ("bpf/stackmap: Fix deadlock with
rq_lock in bpf_get_stack()") by Song Liu. The only difference is
bpf_send_signal() helper instead of bpf_get_stack() helper.

The above deadlock is triggered with a perf_sw_event.
Similar to Commit eac9153, the below almost identical reproducer
used tracepoint point sched/sched_switch so the issue can be easily caught.
  /* stress_test.c */
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/mman.h>
  #include <pthread.h>
  #include <sys/types.h>
  #include <sys/stat.h>
  #include <fcntl.h>

  #define THREAD_COUNT 1000
  char *filename;
  void *worker(void *p)
  {
        void *ptr;
        int fd;
        char *pptr;

        fd = open(filename, O_RDONLY);
        if (fd < 0)
                return NULL;
        while (1) {
                struct timespec ts = {0, 1000 + rand() % 2000};

                ptr = mmap(NULL, 4096 * 64, PROT_READ, MAP_PRIVATE, fd, 0);
                usleep(1);
                if (ptr == MAP_FAILED) {
                        printf("failed to mmap\n");
                        break;
                }
                munmap(ptr, 4096 * 64);
                usleep(1);
                pptr = malloc(1);
                usleep(1);
                pptr[0] = 1;
                usleep(1);
                free(pptr);
                usleep(1);
                nanosleep(&ts, NULL);
        }
        close(fd);
        return NULL;
  }

  int main(int argc, char *argv[])
  {
        void *ptr;
        int i;
        pthread_t threads[THREAD_COUNT];

        if (argc < 2)
                return 0;

        filename = argv[1];

        for (i = 0; i < THREAD_COUNT; i++) {
                if (pthread_create(threads + i, NULL, worker, NULL)) {
                        fprintf(stderr, "Error creating thread\n");
                        return 0;
                }
        }

        for (i = 0; i < THREAD_COUNT; i++)
                pthread_join(threads[i], NULL);
        return 0;
  }
and the following command:
  1. run `stress_test /bin/ls` in one windown
  2. hack bcc trace.py with the following change:
     --- a/tools/trace.py
     +++ b/tools/trace.py
     @@ -513,6 +513,7 @@ BPF_PERF_OUTPUT(%s);
              __data.tgid = __tgid;
              __data.pid = __pid;
              bpf_get_current_comm(&__data.comm, sizeof(__data.comm));
     +        bpf_send_signal(10);
      %s
      %s
              %s.perf_submit(%s, &__data, sizeof(__data));
  3. in a different window run
     ./trace.py -p $(pidof stress_test) t:sched:sched_switch

The deadlock can be reproduced in our production system.

Similar to Song's fix, the fix is to delay sending signal if
irqs is disabled to avoid deadlocks involving with rq_lock.
With this change, my above stress-test in our production system
won't cause deadlock any more.

I also implemented a scale-down version of reproducer in the
selftest (a subsequent commit). With latest bpf-next,
it complains for the following potential deadlock.
  [   32.832450] -> #1 (&p->pi_lock){-.-.}:
  [   32.833100]        _raw_spin_lock_irqsave+0x44/0x80
  [   32.833696]        task_rq_lock+0x2c/0xa0
  [   32.834182]        task_sched_runtime+0x59/0xd0
  [   32.834721]        thread_group_cputime+0x250/0x270
  [   32.835304]        thread_group_cputime_adjusted+0x2e/0x70
  [   32.835959]        do_task_stat+0x8a7/0xb80
  [   32.836461]        proc_single_show+0x51/0xb0
  ...
  [   32.839512] -> #0 (&(&sighand->siglock)->rlock){....}:
  [   32.840275]        __lock_acquire+0x1358/0x1a20
  [   32.840826]        lock_acquire+0xc7/0x1d0
  [   32.841309]        _raw_spin_lock_irqsave+0x44/0x80
  [   32.841916]        __lock_task_sighand+0x79/0x160
  [   32.842465]        do_send_sig_info+0x35/0x90
  [   32.842977]        bpf_send_signal+0xa/0x10
  [   32.843464]        bpf_prog_bc13ed9e4d3163e3_send_signal_tp_sched+0x465/0x1000
  [   32.844301]        trace_call_bpf+0x115/0x270
  [   32.844809]        perf_trace_run_bpf_submit+0x4a/0xc0
  [   32.845411]        perf_trace_sched_switch+0x10f/0x180
  [   32.846014]        __schedule+0x45d/0x880
  [   32.846483]        schedule+0x5f/0xd0
  ...

  [   32.853148] Chain exists of:
  [   32.853148]   &(&sighand->siglock)->rlock --> &p->pi_lock --> &rq->lock
  [   32.853148]
  [   32.854451]  Possible unsafe locking scenario:
  [   32.854451]
  [   32.855173]        CPU0                    CPU1
  [   32.855745]        ----                    ----
  [   32.856278]   lock(&rq->lock);
  [   32.856671]                                lock(&p->pi_lock);
  [   32.857332]                                lock(&rq->lock);
  [   32.857999]   lock(&(&sighand->siglock)->rlock);

  Deadlock happens on CPU0 when it tries to acquire &sighand->siglock
  but it has been held by CPU1 and CPU1 tries to grab &rq->lock
  and cannot get it.

  This is not exactly the callstack in our production environment,
  but sympotom is similar and both locks are using spin_lock_irqsave()
  to acquire the lock, and both involves rq_lock. The fix to delay
  sending signal when irq is disabled also fixed this issue.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200304191104.2796501-1-yhs@fb.com
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Aug 12, 2020
[ Upstream commit e24c644 ]

I compiled with AddressSanitizer and I had these memory leaks while I
was using the tep_parse_format function:

    Direct leak of 28 byte(s) in 4 object(s) allocated from:
        #0 0x7fb07db49ffe in __interceptor_realloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dffe)
        jwrdegoede#1 0x7fb07a724228 in extend_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:985
        jwrdegoede#2 0x7fb07a724c21 in __read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1140
        jwrdegoede#3 0x7fb07a724f78 in read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1206
        jwrdegoede#4 0x7fb07a725191 in __read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1291
        jwrdegoede#5 0x7fb07a7251df in read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1299
        jwrdegoede#6 0x7fb07a72e6c8 in process_dynamic_array_len /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:2849
        linux-sunxi#7 0x7fb07a7304b8 in process_function /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3161
        linux-sunxi#8 0x7fb07a730900 in process_arg_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3207
        linux-sunxi#9 0x7fb07a727c0b in process_arg /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1786
        linux-sunxi#10 0x7fb07a731080 in event_read_print_args /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3285
        linux-sunxi#11 0x7fb07a731722 in event_read_print /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3369
        linux-sunxi#12 0x7fb07a740054 in __tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6335
        linux-sunxi#13 0x7fb07a74047a in __parse_event /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6389
        linux-sunxi#14 0x7fb07a740536 in tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6431
        linux-sunxi#15 0x7fb07a785acf in parse_event ../../../src/fs-src/fs.c:251
        linux-sunxi#16 0x7fb07a785ccd in parse_systems ../../../src/fs-src/fs.c:284
        linux-sunxi#17 0x7fb07a786fb3 in read_metadata ../../../src/fs-src/fs.c:593
        linux-sunxi#18 0x7fb07a78760e in ftrace_fs_source_init ../../../src/fs-src/fs.c:727
        linux-sunxi#19 0x7fb07d90c19c in add_component_with_init_method_data ../../../../src/lib/graph/graph.c:1048
        linux-sunxi#20 0x7fb07d90c87b in add_source_component_with_initialize_method_data ../../../../src/lib/graph/graph.c:1127
        linux-sunxi#21 0x7fb07d90c92a in bt_graph_add_source_component ../../../../src/lib/graph/graph.c:1152
        linux-sunxi#22 0x55db11aa632e in cmd_run_ctx_create_components_from_config_components ../../../src/cli/babeltrace2.c:2252
        linux-sunxi#23 0x55db11aa6fda in cmd_run_ctx_create_components ../../../src/cli/babeltrace2.c:2347
        linux-sunxi#24 0x55db11aa780c in cmd_run ../../../src/cli/babeltrace2.c:2461
        linux-sunxi#25 0x55db11aa8a7d in main ../../../src/cli/babeltrace2.c:2673
        linux-sunxi#26 0x7fb07d5460b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)

The token variable in the process_dynamic_array_len function is
allocated in the read_expect_type function, but is not freed before
calling the read_token function.

Free the token variable before calling read_token in order to plug the
leak.

Signed-off-by: Philippe Duplessis-Guindon <pduplessis@efficios.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/linux-trace-devel/20200730150236.5392-1-pduplessis@efficios.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Aug 20, 2020
[ Upstream commit d1bd7e0 ]

Booting the latest kernel with DEBUG_ATOMIC_SLEEP=y on a GICv4.1 enabled
box, I get the following kernel splat:

[    0.053766] BUG: sleeping function called from invalid context at mm/slab.h:567
[    0.053767] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/1
[    0.053769] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.8.0-rc3+ linux-sunxi#23
[    0.053770] Call trace:
[    0.053774]  dump_backtrace+0x0/0x218
[    0.053775]  show_stack+0x2c/0x38
[    0.053777]  dump_stack+0xc4/0x10c
[    0.053779]  ___might_sleep+0xfc/0x140
[    0.053780]  __might_sleep+0x58/0x90
[    0.053782]  slab_pre_alloc_hook+0x7c/0x90
[    0.053783]  kmem_cache_alloc_trace+0x60/0x2f0
[    0.053785]  its_cpu_init+0x6f4/0xe40
[    0.053786]  gic_starting_cpu+0x24/0x38
[    0.053788]  cpuhp_invoke_callback+0xa0/0x710
[    0.053789]  notify_cpu_starting+0xcc/0xd8
[    0.053790]  secondary_start_kernel+0x148/0x200

 # ./scripts/faddr2line vmlinux its_cpu_init+0x6f4/0xe40
its_cpu_init+0x6f4/0xe40:
allocate_vpe_l1_table at drivers/irqchip/irq-gic-v3-its.c:2818
(inlined by) its_cpu_init_lpis at drivers/irqchip/irq-gic-v3-its.c:3138
(inlined by) its_cpu_init at drivers/irqchip/irq-gic-v3-its.c:5166

It turned out that we're allocating memory using GFP_KERNEL (may sleep)
within the CPU hotplug notifier, which is indeed an atomic context. Bad
thing may happen if we're playing on a system with more than a single
CommonLPIAff group. Avoid it by turning this into an atomic allocation.

Fixes: 5e51684 ("irqchip/gic-v4.1: VPE table (aka GICR_VPROPBASER) allocation")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200630133746.816-1-yuzenghui@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
jwrdegoede pushed a commit to jwrdegoede/linux-sunxi that referenced this issue Nov 9, 2020
The atomic check hooks must look up the encoder to be used with a
connector from the connector's atomic state, and not assume that it's
the connector's current attached encoder. The latter one can change
under the atomic check func, or can be unset yet as in the case of MST
connectors.

This fixes
[    7.940719] Oops: 0000 [#1] SMP NOPTI
[    7.944407] CPU: 2 PID: 143 Comm: kworker/2:2 Not tainted 5.6.0-1023-oem linux-sunxi#23-Ubuntu
[    7.952102] Hardware name: Dell Inc. Latitude 7320/, BIOS 88.87.11 09/07/2020
[    7.959278] Workqueue: events output_poll_execute [drm_kms_helper]
[    7.965511] RIP: 0010:intel_psr_atomic_check+0x37/0xa0 [i915]
[    7.971327] Code: 80 2d 06 00 00 20 74 42 80 b8 34 71 00 00 00 74 39 48 8b 72 08 48 85 f6 74 30 80 b8 f8 71 00 00 00 74 27 4c 8b 87 80 04 00 00 <41> 8b 78 78 83 ff 08 77 19 31 c9 83 ff 05 77 19 48 81 c1 20 01 00
[    7.977541] input: PS/2 Generic Mouse as /devices/platform/i8042/serio1/input/input5
[    7.990154] RSP: 0018:ffffb864c073fac8 EFLAGS: 00010202
[    7.990155] RAX: ffff8c5d55ce0000 RBX: ffff8c5d54519000 RCX: 0000000000000000
[    7.990155] RDX: ffff8c5d55cb30c0 RSI: ffff8c5d89a0c800 RDI: ffff8c5d55fcf800
[    7.990156] RBP: ffffb864c073fac8 R08: 0000000000000000 R09: ffff8c5d55d9f3a0
[    7.990156] R10: ffff8c5d55cb30c0 R11: 0000000000000009 R12: ffff8c5d55fcf800
[    7.990156] R13: ffff8c5d55cb30c0 R14: ffff8c5d56989cc0 R15: ffff8c5d56989cc0
[    7.990158] FS:  0000000000000000(0000) GS:ffff8c5d8e480000(0000) knlGS:0000000000000000
[    8.047193] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    8.052970] CR2: 0000000000000078 CR3: 0000000856500005 CR4: 0000000000760ee0
[    8.060137] PKRU: 55555554
[    8.062867] Call Trace:
[    8.065361]  intel_digital_connector_atomic_check+0x53/0x130 [i915]
[    8.071703]  intel_dp_mst_atomic_check+0x5b/0x200 [i915]
[    8.077074]  drm_atomic_helper_check_modeset+0x1db/0x790 [drm_kms_helper]
[    8.083942]  intel_atomic_check+0x92/0xc50 [i915]
[    8.088705]  ? drm_plane_check_pixel_format+0x4f/0xb0 [drm]
[    8.094345]  ? drm_atomic_plane_check+0x7a/0x3a0 [drm]
[    8.099548]  drm_atomic_check_only+0x2b1/0x450 [drm]
[    8.104573]  drm_atomic_commit+0x18/0x50 [drm]
[    8.109070]  drm_client_modeset_commit_atomic+0x1c9/0x200 [drm]
[    8.115056]  drm_client_modeset_commit_force+0x55/0x160 [drm]
[    8.120866]  drm_fb_helper_restore_fbdev_mode_unlocked+0x54/0xb0 [drm_kms_helper]
[    8.128415]  drm_fb_helper_set_par+0x34/0x50 [drm_kms_helper]
[    8.134225]  drm_fb_helper_hotplug_event.part.0+0xb4/0xe0 [drm_kms_helper]
[    8.141150]  drm_fb_helper_hotplug_event+0x1c/0x30 [drm_kms_helper]
[    8.147481]  intel_fbdev_output_poll_changed+0x6f/0xa0 [i915]
[    8.153287]  drm_kms_helper_hotplug_event+0x2c/0x40 [drm_kms_helper]
[    8.159709]  output_poll_execute+0x1aa/0x1c0 [drm_kms_helper]
[    8.165506]  process_one_work+0x1e8/0x3b0
[    8.169561]  worker_thread+0x4d/0x400
[    8.173249]  kthread+0x104/0x140
[    8.176515]  ? process_one_work+0x3b0/0x3b0
[    8.180726]  ? kthread_park+0x90/0x90
[    8.184416]  ret_from_fork+0x1f/0x40

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2361
References: https://gitlab.freedesktop.org/drm/intel/-/issues/2486
Reported-by: William Tseng <william.tseng@intel.com>
Reported-by: Cooper Chiou <cooper.chiou@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201027160928.3665377-1-imre.deak@intel.com
(cherry picked from commit 00e5deb)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
jwrdegoede pushed a commit to jwrdegoede/linux-sunxi that referenced this issue Nov 18, 2020
This fix is for a failure that occurred in the DWARF unwind perf test.

Stack unwinders may probe memory when looking for frames.

Memory sanitizer will poison and track uninitialized memory on the
stack, and on the heap if the value is copied to the heap.

This can lead to false memory sanitizer failures for the use of an
uninitialized value.

Avoid this problem by removing the poison on the copied stack.

The full msan failure with track origins looks like:

==2168==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x559ceb10755b in handle_cfi elfutils/libdwfl/frame_unwind.c:648:8
    #1 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
    #2 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
    #3 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
    #4 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
    #5 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
    #6 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
    linux-sunxi#7 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
    linux-sunxi#8 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
    linux-sunxi#9 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
    linux-sunxi#10 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
    linux-sunxi#11 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
    linux-sunxi#12 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
    linux-sunxi#13 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
    linux-sunxi#14 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
    linux-sunxi#15 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
    linux-sunxi#16 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
    linux-sunxi#17 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
    linux-sunxi#18 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
    linux-sunxi#19 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
    linux-sunxi#20 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
    linux-sunxi#21 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
    linux-sunxi#22 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
    linux-sunxi#23 0x559cea95fbce in main tools/perf/perf.c:539:3

  Uninitialized value was stored to memory at
    #0 0x559ceb106acf in __libdwfl_frame_reg_set elfutils/libdwfl/frame_unwind.c:77:22
    #1 0x559ceb106acf in handle_cfi elfutils/libdwfl/frame_unwind.c:627:13
    #2 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
    #3 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
    #4 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
    #5 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
    #6 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
    linux-sunxi#7 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
    linux-sunxi#8 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
    linux-sunxi#9 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
    linux-sunxi#10 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
    linux-sunxi#11 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
    linux-sunxi#12 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
    linux-sunxi#13 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
    linux-sunxi#14 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
    linux-sunxi#15 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
    linux-sunxi#16 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
    linux-sunxi#17 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
    linux-sunxi#18 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
    linux-sunxi#19 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
    linux-sunxi#20 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
    linux-sunxi#21 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
    linux-sunxi#22 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
    linux-sunxi#23 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
    linux-sunxi#24 0x559cea95fbce in main tools/perf/perf.c:539:3

  Uninitialized value was stored to memory at
    #0 0x559ceb106a54 in handle_cfi elfutils/libdwfl/frame_unwind.c:613:9
    #1 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
    #2 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
    #3 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
    #4 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
    #5 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
    #6 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
    linux-sunxi#7 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
    linux-sunxi#8 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
    linux-sunxi#9 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
    linux-sunxi#10 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
    linux-sunxi#11 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
    linux-sunxi#12 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
    linux-sunxi#13 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
    linux-sunxi#14 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
    linux-sunxi#15 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
    linux-sunxi#16 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
    linux-sunxi#17 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
    linux-sunxi#18 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
    linux-sunxi#19 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
    linux-sunxi#20 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
    linux-sunxi#21 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
    linux-sunxi#22 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
    linux-sunxi#23 0x559cea95fbce in main tools/perf/perf.c:539:3

  Uninitialized value was stored to memory at
    #0 0x559ceaff8800 in memory_read tools/perf/util/unwind-libdw.c:156:10
    #1 0x559ceb10f053 in expr_eval elfutils/libdwfl/frame_unwind.c:501:13
    #2 0x559ceb1060cc in handle_cfi elfutils/libdwfl/frame_unwind.c:603:18
    #3 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
    #4 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
    #5 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
    #6 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
    linux-sunxi#7 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
    linux-sunxi#8 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
    linux-sunxi#9 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
    linux-sunxi#10 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
    linux-sunxi#11 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
    linux-sunxi#12 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
    linux-sunxi#13 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
    linux-sunxi#14 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
    linux-sunxi#15 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
    linux-sunxi#16 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
    linux-sunxi#17 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
    linux-sunxi#18 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
    linux-sunxi#19 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
    linux-sunxi#20 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
    linux-sunxi#21 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
    linux-sunxi#22 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
    linux-sunxi#23 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
    linux-sunxi#24 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
    linux-sunxi#25 0x559cea95fbce in main tools/perf/perf.c:539:3

  Uninitialized value was stored to memory at
    #0 0x559cea9027d9 in __msan_memcpy llvm/llvm-project/compiler-rt/lib/msan/msan_interceptors.cpp:1558:3
    #1 0x559cea9d2185 in sample_ustack tools/perf/arch/x86/tests/dwarf-unwind.c:41:2
    #2 0x559cea9d202c in test__arch_unwind_sample tools/perf/arch/x86/tests/dwarf-unwind.c:72:9
    #3 0x559ceabc9cbd in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:106:6
    #4 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
    #5 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
    #6 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
    linux-sunxi#7 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
    linux-sunxi#8 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
    linux-sunxi#9 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
    linux-sunxi#10 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
    linux-sunxi#11 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
    linux-sunxi#12 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
    linux-sunxi#13 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
    linux-sunxi#14 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
    linux-sunxi#15 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
    linux-sunxi#16 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
    linux-sunxi#17 0x559cea95fbce in main tools/perf/perf.c:539:3

  Uninitialized value was created by an allocation of 'bf' in the stack frame of function 'perf_event__synthesize_mmap_events'
    #0 0x559ceafc5f60 in perf_event__synthesize_mmap_events tools/perf/util/synthetic-events.c:445

SUMMARY: MemorySanitizer: use-of-uninitialized-value elfutils/libdwfl/frame_unwind.c:648:8 in handle_cfi
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: clang-built-linux@googlegroups.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandeep Dasgupta <sdasgup@google.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20201113182053.754625-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
jwrdegoede pushed a commit to jwrdegoede/linux-sunxi that referenced this issue Jan 25, 2021
Zygo reported the following KASAN splat:

  BUG: KASAN: use-after-free in btrfs_backref_cleanup_node+0x18a/0x420
  Read of size 8 at addr ffff888112402950 by task btrfs/28836

  CPU: 0 PID: 28836 Comm: btrfs Tainted: G        W         5.10.0-e35f27394290-for-next+ linux-sunxi#23
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
  Call Trace:
   dump_stack+0xbc/0xf9
   ? btrfs_backref_cleanup_node+0x18a/0x420
   print_address_description.constprop.8+0x21/0x210
   ? record_print_text.cold.34+0x11/0x11
   ? btrfs_backref_cleanup_node+0x18a/0x420
   ? btrfs_backref_cleanup_node+0x18a/0x420
   kasan_report.cold.10+0x20/0x37
   ? btrfs_backref_cleanup_node+0x18a/0x420
   __asan_load8+0x69/0x90
   btrfs_backref_cleanup_node+0x18a/0x420
   btrfs_backref_release_cache+0x83/0x1b0
   relocate_block_group+0x394/0x780
   ? merge_reloc_roots+0x4a0/0x4a0
   btrfs_relocate_block_group+0x26e/0x4c0
   btrfs_relocate_chunk+0x52/0x120
   btrfs_balance+0xe2e/0x1900
   ? check_flags.part.50+0x6c/0x1e0
   ? btrfs_relocate_chunk+0x120/0x120
   ? kmem_cache_alloc_trace+0xa06/0xcb0
   ? _copy_from_user+0x83/0xc0
   btrfs_ioctl_balance+0x3a7/0x460
   btrfs_ioctl+0x24c8/0x4360
   ? __kasan_check_read+0x11/0x20
   ? check_chain_key+0x1f4/0x2f0
   ? __asan_loadN+0xf/0x20
   ? btrfs_ioctl_get_supported_features+0x30/0x30
   ? kvm_sched_clock_read+0x18/0x30
   ? check_chain_key+0x1f4/0x2f0
   ? lock_downgrade+0x3f0/0x3f0
   ? handle_mm_fault+0xad6/0x2150
   ? do_vfs_ioctl+0xfc/0x9d0
   ? ioctl_file_clone+0xe0/0xe0
   ? check_flags.part.50+0x6c/0x1e0
   ? check_flags.part.50+0x6c/0x1e0
   ? check_flags+0x26/0x30
   ? lock_is_held_type+0xc3/0xf0
   ? syscall_enter_from_user_mode+0x1b/0x60
   ? do_syscall_64+0x13/0x80
   ? rcu_read_lock_sched_held+0xa1/0xd0
   ? __kasan_check_read+0x11/0x20
   ? __fget_light+0xae/0x110
   __x64_sys_ioctl+0xc3/0x100
   do_syscall_64+0x37/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f4c4bdfe427

  Allocated by task 28836:
   kasan_save_stack+0x21/0x50
   __kasan_kmalloc.constprop.18+0xbe/0xd0
   kasan_kmalloc+0x9/0x10
   kmem_cache_alloc_trace+0x410/0xcb0
   btrfs_backref_alloc_node+0x46/0xf0
   btrfs_backref_add_tree_node+0x60d/0x11d0
   build_backref_tree+0xc5/0x700
   relocate_tree_blocks+0x2be/0xb90
   relocate_block_group+0x2eb/0x780
   btrfs_relocate_block_group+0x26e/0x4c0
   btrfs_relocate_chunk+0x52/0x120
   btrfs_balance+0xe2e/0x1900
   btrfs_ioctl_balance+0x3a7/0x460
   btrfs_ioctl+0x24c8/0x4360
   __x64_sys_ioctl+0xc3/0x100
   do_syscall_64+0x37/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

  Freed by task 28836:
   kasan_save_stack+0x21/0x50
   kasan_set_track+0x20/0x30
   kasan_set_free_info+0x1f/0x30
   __kasan_slab_free+0xf3/0x140
   kasan_slab_free+0xe/0x10
   kfree+0xde/0x200
   btrfs_backref_error_cleanup+0x452/0x530
   build_backref_tree+0x1a5/0x700
   relocate_tree_blocks+0x2be/0xb90
   relocate_block_group+0x2eb/0x780
   btrfs_relocate_block_group+0x26e/0x4c0
   btrfs_relocate_chunk+0x52/0x120
   btrfs_balance+0xe2e/0x1900
   btrfs_ioctl_balance+0x3a7/0x460
   btrfs_ioctl+0x24c8/0x4360
   __x64_sys_ioctl+0xc3/0x100
   do_syscall_64+0x37/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

This occurred because we freed our backref node in
btrfs_backref_error_cleanup(), but then tried to free it again in
btrfs_backref_release_cache().  This is because
btrfs_backref_release_cache() will cycle through all of the
cache->leaves nodes and free them up.  However
btrfs_backref_error_cleanup() freed the backref node with
btrfs_backref_free_node(), which simply kfree()d the backref node
without unlinking it from the cache.  Change this to a
btrfs_backref_drop_node(), which does the appropriate cleanup and
removes the node from the cache->leaves list, so when we go to free the
remaining cache we don't trip over items we've already dropped.

Fixes: 75bfb9a ("Btrfs: cleanup error handling in build_backref_tree")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Jan 30, 2021
commit 49ecc67 upstream.

Zygo reported the following KASAN splat:

  BUG: KASAN: use-after-free in btrfs_backref_cleanup_node+0x18a/0x420
  Read of size 8 at addr ffff888112402950 by task btrfs/28836

  CPU: 0 PID: 28836 Comm: btrfs Tainted: G        W         5.10.0-e35f27394290-for-next+ linux-sunxi#23
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
  Call Trace:
   dump_stack+0xbc/0xf9
   ? btrfs_backref_cleanup_node+0x18a/0x420
   print_address_description.constprop.8+0x21/0x210
   ? record_print_text.cold.34+0x11/0x11
   ? btrfs_backref_cleanup_node+0x18a/0x420
   ? btrfs_backref_cleanup_node+0x18a/0x420
   kasan_report.cold.10+0x20/0x37
   ? btrfs_backref_cleanup_node+0x18a/0x420
   __asan_load8+0x69/0x90
   btrfs_backref_cleanup_node+0x18a/0x420
   btrfs_backref_release_cache+0x83/0x1b0
   relocate_block_group+0x394/0x780
   ? merge_reloc_roots+0x4a0/0x4a0
   btrfs_relocate_block_group+0x26e/0x4c0
   btrfs_relocate_chunk+0x52/0x120
   btrfs_balance+0xe2e/0x1900
   ? check_flags.part.50+0x6c/0x1e0
   ? btrfs_relocate_chunk+0x120/0x120
   ? kmem_cache_alloc_trace+0xa06/0xcb0
   ? _copy_from_user+0x83/0xc0
   btrfs_ioctl_balance+0x3a7/0x460
   btrfs_ioctl+0x24c8/0x4360
   ? __kasan_check_read+0x11/0x20
   ? check_chain_key+0x1f4/0x2f0
   ? __asan_loadN+0xf/0x20
   ? btrfs_ioctl_get_supported_features+0x30/0x30
   ? kvm_sched_clock_read+0x18/0x30
   ? check_chain_key+0x1f4/0x2f0
   ? lock_downgrade+0x3f0/0x3f0
   ? handle_mm_fault+0xad6/0x2150
   ? do_vfs_ioctl+0xfc/0x9d0
   ? ioctl_file_clone+0xe0/0xe0
   ? check_flags.part.50+0x6c/0x1e0
   ? check_flags.part.50+0x6c/0x1e0
   ? check_flags+0x26/0x30
   ? lock_is_held_type+0xc3/0xf0
   ? syscall_enter_from_user_mode+0x1b/0x60
   ? do_syscall_64+0x13/0x80
   ? rcu_read_lock_sched_held+0xa1/0xd0
   ? __kasan_check_read+0x11/0x20
   ? __fget_light+0xae/0x110
   __x64_sys_ioctl+0xc3/0x100
   do_syscall_64+0x37/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f4c4bdfe427

  Allocated by task 28836:
   kasan_save_stack+0x21/0x50
   __kasan_kmalloc.constprop.18+0xbe/0xd0
   kasan_kmalloc+0x9/0x10
   kmem_cache_alloc_trace+0x410/0xcb0
   btrfs_backref_alloc_node+0x46/0xf0
   btrfs_backref_add_tree_node+0x60d/0x11d0
   build_backref_tree+0xc5/0x700
   relocate_tree_blocks+0x2be/0xb90
   relocate_block_group+0x2eb/0x780
   btrfs_relocate_block_group+0x26e/0x4c0
   btrfs_relocate_chunk+0x52/0x120
   btrfs_balance+0xe2e/0x1900
   btrfs_ioctl_balance+0x3a7/0x460
   btrfs_ioctl+0x24c8/0x4360
   __x64_sys_ioctl+0xc3/0x100
   do_syscall_64+0x37/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

  Freed by task 28836:
   kasan_save_stack+0x21/0x50
   kasan_set_track+0x20/0x30
   kasan_set_free_info+0x1f/0x30
   __kasan_slab_free+0xf3/0x140
   kasan_slab_free+0xe/0x10
   kfree+0xde/0x200
   btrfs_backref_error_cleanup+0x452/0x530
   build_backref_tree+0x1a5/0x700
   relocate_tree_blocks+0x2be/0xb90
   relocate_block_group+0x2eb/0x780
   btrfs_relocate_block_group+0x26e/0x4c0
   btrfs_relocate_chunk+0x52/0x120
   btrfs_balance+0xe2e/0x1900
   btrfs_ioctl_balance+0x3a7/0x460
   btrfs_ioctl+0x24c8/0x4360
   __x64_sys_ioctl+0xc3/0x100
   do_syscall_64+0x37/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

This occurred because we freed our backref node in
btrfs_backref_error_cleanup(), but then tried to free it again in
btrfs_backref_release_cache().  This is because
btrfs_backref_release_cache() will cycle through all of the
cache->leaves nodes and free them up.  However
btrfs_backref_error_cleanup() freed the backref node with
btrfs_backref_free_node(), which simply kfree()d the backref node
without unlinking it from the cache.  Change this to a
btrfs_backref_drop_node(), which does the appropriate cleanup and
removes the node from the cache->leaves list, so when we go to free the
remaining cache we don't trip over items we've already dropped.

Fixes: 75bfb9a ("Btrfs: cleanup error handling in build_backref_tree")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Mar 4, 2021
[ Upstream commit cf9bf87 ]

According to Errata linux-sunxi#23 "The per-CPU GbE interrupt is limited to Core
0", we can't use the per-cpu interrupt mechanism on the Armada 3700
familly.

This is correctly checked for RSS configuration, but the initial queue
mapping is still done by having the queues spread across all the CPUs in
the system, both in the init path and in the cpu_hotplug path.

Fixes: 2636ac3 ("net: mvneta: Add network support for Armada 3700 SoC")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Mar 4, 2021
commit eddda68 upstream.

A weird KASAN problem that Zygo reported could have been easily caught
if we checked for basic things in our backref freeing code.  We have two
methods of freeing a backref node

- btrfs_backref_free_node: this just is kfree() essentially.
- btrfs_backref_drop_node: this actually unlinks the node and cleans up
  everything and then calls btrfs_backref_free_node().

We should mostly be using btrfs_backref_drop_node(), to make sure the
node is properly unlinked from the backref cache, and only use
btrfs_backref_free_node() when we know the node isn't actually linked to
the backref cache.  We made a mistake here and thus got the KASAN splat.

Make this style of issue easier to find by adding some ASSERT()'s to
btrfs_backref_free_node() and adjusting our deletion stuff to properly
init the list so we can rely on list_empty() checks working properly.

  BUG: KASAN: use-after-free in btrfs_backref_cleanup_node+0x18a/0x420
  Read of size 8 at addr ffff888112402950 by task btrfs/28836

  CPU: 0 PID: 28836 Comm: btrfs Tainted: G        W         5.10.0-e35f27394290-for-next+ linux-sunxi#23
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
  Call Trace:
   dump_stack+0xbc/0xf9
   ? btrfs_backref_cleanup_node+0x18a/0x420
   print_address_description.constprop.8+0x21/0x210
   ? record_print_text.cold.34+0x11/0x11
   ? btrfs_backref_cleanup_node+0x18a/0x420
   ? btrfs_backref_cleanup_node+0x18a/0x420
   kasan_report.cold.10+0x20/0x37
   ? btrfs_backref_cleanup_node+0x18a/0x420
   __asan_load8+0x69/0x90
   btrfs_backref_cleanup_node+0x18a/0x420
   btrfs_backref_release_cache+0x83/0x1b0
   relocate_block_group+0x394/0x780
   ? merge_reloc_roots+0x4a0/0x4a0
   btrfs_relocate_block_group+0x26e/0x4c0
   btrfs_relocate_chunk+0x52/0x120
   btrfs_balance+0xe2e/0x1900
   ? check_flags.part.50+0x6c/0x1e0
   ? btrfs_relocate_chunk+0x120/0x120
   ? kmem_cache_alloc_trace+0xa06/0xcb0
   ? _copy_from_user+0x83/0xc0
   btrfs_ioctl_balance+0x3a7/0x460
   btrfs_ioctl+0x24c8/0x4360
   ? __kasan_check_read+0x11/0x20
   ? check_chain_key+0x1f4/0x2f0
   ? __asan_loadN+0xf/0x20
   ? btrfs_ioctl_get_supported_features+0x30/0x30
   ? kvm_sched_clock_read+0x18/0x30
   ? check_chain_key+0x1f4/0x2f0
   ? lock_downgrade+0x3f0/0x3f0
   ? handle_mm_fault+0xad6/0x2150
   ? do_vfs_ioctl+0xfc/0x9d0
   ? ioctl_file_clone+0xe0/0xe0
   ? check_flags.part.50+0x6c/0x1e0
   ? check_flags.part.50+0x6c/0x1e0
   ? check_flags+0x26/0x30
   ? lock_is_held_type+0xc3/0xf0
   ? syscall_enter_from_user_mode+0x1b/0x60
   ? do_syscall_64+0x13/0x80
   ? rcu_read_lock_sched_held+0xa1/0xd0
   ? __kasan_check_read+0x11/0x20
   ? __fget_light+0xae/0x110
   __x64_sys_ioctl+0xc3/0x100
   do_syscall_64+0x37/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f4c4bdfe427
  RSP: 002b:00007fff33ee6df8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
  RAX: ffffffffffffffda RBX: 00007fff33ee6e98 RCX: 00007f4c4bdfe427
  RDX: 00007fff33ee6e98 RSI: 00000000c4009420 RDI: 0000000000000003
  RBP: 0000000000000003 R08: 0000000000000003 R09: 0000000000000078
  R10: fffffffffffff59d R11: 0000000000000202 R12: 0000000000000001
  R13: 0000000000000000 R14: 00007fff33ee8a34 R15: 0000000000000001

  Allocated by task 28836:
   kasan_save_stack+0x21/0x50
   __kasan_kmalloc.constprop.18+0xbe/0xd0
   kasan_kmalloc+0x9/0x10
   kmem_cache_alloc_trace+0x410/0xcb0
   btrfs_backref_alloc_node+0x46/0xf0
   btrfs_backref_add_tree_node+0x60d/0x11d0
   build_backref_tree+0xc5/0x700
   relocate_tree_blocks+0x2be/0xb90
   relocate_block_group+0x2eb/0x780
   btrfs_relocate_block_group+0x26e/0x4c0
   btrfs_relocate_chunk+0x52/0x120
   btrfs_balance+0xe2e/0x1900
   btrfs_ioctl_balance+0x3a7/0x460
   btrfs_ioctl+0x24c8/0x4360
   __x64_sys_ioctl+0xc3/0x100
   do_syscall_64+0x37/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

  Freed by task 28836:
   kasan_save_stack+0x21/0x50
   kasan_set_track+0x20/0x30
   kasan_set_free_info+0x1f/0x30
   __kasan_slab_free+0xf3/0x140
   kasan_slab_free+0xe/0x10
   kfree+0xde/0x200
   btrfs_backref_error_cleanup+0x452/0x530
   build_backref_tree+0x1a5/0x700
   relocate_tree_blocks+0x2be/0xb90
   relocate_block_group+0x2eb/0x780
   btrfs_relocate_block_group+0x26e/0x4c0
   btrfs_relocate_chunk+0x52/0x120
   btrfs_balance+0xe2e/0x1900
   btrfs_ioctl_balance+0x3a7/0x460
   btrfs_ioctl+0x24c8/0x4360
   __x64_sys_ioctl+0xc3/0x100
   do_syscall_64+0x37/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

  The buggy address belongs to the object at ffff888112402900
   which belongs to the cache kmalloc-128 of size 128
  The buggy address is located 80 bytes inside of
   128-byte region [ffff888112402900, ffff888112402980)
  The buggy address belongs to the page:
  page:0000000028b1cd08 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888131c810c0 pfn:0x112402
  flags: 0x17ffe0000000200(slab)
  raw: 017ffe0000000200 ffffea000424f308 ffffea0007d572c8 ffff888100040440
  raw: ffff888131c810c0 ffff888112402000 0000000100000009 0000000000000000
  page dumped because: kasan: bad access detected

  Memory state around the buggy address:
   ffff888112402800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
   ffff888112402880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  >ffff888112402900: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                   ^
   ffff888112402980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
   ffff888112402a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Link: https://lore.kernel.org/linux-btrfs/20201208194607.GI31381@hungrycats.org/
CC: stable@vger.kernel.org # 5.10+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Mar 4, 2021
[ Upstream commit cf9bf87 ]

According to Errata linux-sunxi#23 "The per-CPU GbE interrupt is limited to Core
0", we can't use the per-cpu interrupt mechanism on the Armada 3700
familly.

This is correctly checked for RSS configuration, but the initial queue
mapping is still done by having the queues spread across all the CPUs in
the system, both in the init path and in the cpu_hotplug path.

Fixes: 2636ac3 ("net: mvneta: Add network support for Armada 3700 SoC")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Mar 4, 2021
commit eddda68 upstream.

A weird KASAN problem that Zygo reported could have been easily caught
if we checked for basic things in our backref freeing code.  We have two
methods of freeing a backref node

- btrfs_backref_free_node: this just is kfree() essentially.
- btrfs_backref_drop_node: this actually unlinks the node and cleans up
  everything and then calls btrfs_backref_free_node().

We should mostly be using btrfs_backref_drop_node(), to make sure the
node is properly unlinked from the backref cache, and only use
btrfs_backref_free_node() when we know the node isn't actually linked to
the backref cache.  We made a mistake here and thus got the KASAN splat.

Make this style of issue easier to find by adding some ASSERT()'s to
btrfs_backref_free_node() and adjusting our deletion stuff to properly
init the list so we can rely on list_empty() checks working properly.

  BUG: KASAN: use-after-free in btrfs_backref_cleanup_node+0x18a/0x420
  Read of size 8 at addr ffff888112402950 by task btrfs/28836

  CPU: 0 PID: 28836 Comm: btrfs Tainted: G        W         5.10.0-e35f27394290-for-next+ linux-sunxi#23
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
  Call Trace:
   dump_stack+0xbc/0xf9
   ? btrfs_backref_cleanup_node+0x18a/0x420
   print_address_description.constprop.8+0x21/0x210
   ? record_print_text.cold.34+0x11/0x11
   ? btrfs_backref_cleanup_node+0x18a/0x420
   ? btrfs_backref_cleanup_node+0x18a/0x420
   kasan_report.cold.10+0x20/0x37
   ? btrfs_backref_cleanup_node+0x18a/0x420
   __asan_load8+0x69/0x90
   btrfs_backref_cleanup_node+0x18a/0x420
   btrfs_backref_release_cache+0x83/0x1b0
   relocate_block_group+0x394/0x780
   ? merge_reloc_roots+0x4a0/0x4a0
   btrfs_relocate_block_group+0x26e/0x4c0
   btrfs_relocate_chunk+0x52/0x120
   btrfs_balance+0xe2e/0x1900
   ? check_flags.part.50+0x6c/0x1e0
   ? btrfs_relocate_chunk+0x120/0x120
   ? kmem_cache_alloc_trace+0xa06/0xcb0
   ? _copy_from_user+0x83/0xc0
   btrfs_ioctl_balance+0x3a7/0x460
   btrfs_ioctl+0x24c8/0x4360
   ? __kasan_check_read+0x11/0x20
   ? check_chain_key+0x1f4/0x2f0
   ? __asan_loadN+0xf/0x20
   ? btrfs_ioctl_get_supported_features+0x30/0x30
   ? kvm_sched_clock_read+0x18/0x30
   ? check_chain_key+0x1f4/0x2f0
   ? lock_downgrade+0x3f0/0x3f0
   ? handle_mm_fault+0xad6/0x2150
   ? do_vfs_ioctl+0xfc/0x9d0
   ? ioctl_file_clone+0xe0/0xe0
   ? check_flags.part.50+0x6c/0x1e0
   ? check_flags.part.50+0x6c/0x1e0
   ? check_flags+0x26/0x30
   ? lock_is_held_type+0xc3/0xf0
   ? syscall_enter_from_user_mode+0x1b/0x60
   ? do_syscall_64+0x13/0x80
   ? rcu_read_lock_sched_held+0xa1/0xd0
   ? __kasan_check_read+0x11/0x20
   ? __fget_light+0xae/0x110
   __x64_sys_ioctl+0xc3/0x100
   do_syscall_64+0x37/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f4c4bdfe427
  RSP: 002b:00007fff33ee6df8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
  RAX: ffffffffffffffda RBX: 00007fff33ee6e98 RCX: 00007f4c4bdfe427
  RDX: 00007fff33ee6e98 RSI: 00000000c4009420 RDI: 0000000000000003
  RBP: 0000000000000003 R08: 0000000000000003 R09: 0000000000000078
  R10: fffffffffffff59d R11: 0000000000000202 R12: 0000000000000001
  R13: 0000000000000000 R14: 00007fff33ee8a34 R15: 0000000000000001

  Allocated by task 28836:
   kasan_save_stack+0x21/0x50
   __kasan_kmalloc.constprop.18+0xbe/0xd0
   kasan_kmalloc+0x9/0x10
   kmem_cache_alloc_trace+0x410/0xcb0
   btrfs_backref_alloc_node+0x46/0xf0
   btrfs_backref_add_tree_node+0x60d/0x11d0
   build_backref_tree+0xc5/0x700
   relocate_tree_blocks+0x2be/0xb90
   relocate_block_group+0x2eb/0x780
   btrfs_relocate_block_group+0x26e/0x4c0
   btrfs_relocate_chunk+0x52/0x120
   btrfs_balance+0xe2e/0x1900
   btrfs_ioctl_balance+0x3a7/0x460
   btrfs_ioctl+0x24c8/0x4360
   __x64_sys_ioctl+0xc3/0x100
   do_syscall_64+0x37/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

  Freed by task 28836:
   kasan_save_stack+0x21/0x50
   kasan_set_track+0x20/0x30
   kasan_set_free_info+0x1f/0x30
   __kasan_slab_free+0xf3/0x140
   kasan_slab_free+0xe/0x10
   kfree+0xde/0x200
   btrfs_backref_error_cleanup+0x452/0x530
   build_backref_tree+0x1a5/0x700
   relocate_tree_blocks+0x2be/0xb90
   relocate_block_group+0x2eb/0x780
   btrfs_relocate_block_group+0x26e/0x4c0
   btrfs_relocate_chunk+0x52/0x120
   btrfs_balance+0xe2e/0x1900
   btrfs_ioctl_balance+0x3a7/0x460
   btrfs_ioctl+0x24c8/0x4360
   __x64_sys_ioctl+0xc3/0x100
   do_syscall_64+0x37/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

  The buggy address belongs to the object at ffff888112402900
   which belongs to the cache kmalloc-128 of size 128
  The buggy address is located 80 bytes inside of
   128-byte region [ffff888112402900, ffff888112402980)
  The buggy address belongs to the page:
  page:0000000028b1cd08 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888131c810c0 pfn:0x112402
  flags: 0x17ffe0000000200(slab)
  raw: 017ffe0000000200 ffffea000424f308 ffffea0007d572c8 ffff888100040440
  raw: ffff888131c810c0 ffff888112402000 0000000100000009 0000000000000000
  page dumped because: kasan: bad access detected

  Memory state around the buggy address:
   ffff888112402800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
   ffff888112402880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  >ffff888112402900: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                   ^
   ffff888112402980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
   ffff888112402a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Link: https://lore.kernel.org/linux-btrfs/20201208194607.GI31381@hungrycats.org/
CC: stable@vger.kernel.org # 5.10+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Mar 26, 2022
[ Upstream commit 4224cfd ]

When bringing down the netdevice or system shutdown, a panic can be
triggered while accessing the sysfs path because the device is already
removed.

    [  755.549084] mlx5_core 0000:12:00.1: Shutdown was called
    [  756.404455] mlx5_core 0000:12:00.0: Shutdown was called
    ...
    [  757.937260] BUG: unable to handle kernel NULL pointer dereference at           (null)
    [  758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280

    crash> bt
    ...
    PID: 12649  TASK: ffff8924108f2100  CPU: 1   COMMAND: "amsd"
    ...
     linux-sunxi#9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778
        [exception RIP: dma_pool_alloc+0x1ab]
        RIP: ffffffff8ee11acb  RSP: ffff89240e1a3968  RFLAGS: 00010046
        RAX: 0000000000000246  RBX: ffff89243d874100  RCX: 0000000000001000
        RDX: 0000000000000000  RSI: 0000000000000246  RDI: ffff89243d874090
        RBP: ffff89240e1a39c0   R8: 000000000001f080   R9: ffff8905ffc03c00
        R10: ffffffffc04680d4  R11: ffffffff8edde9fd  R12: 00000000000080d0
        R13: ffff89243d874090  R14: ffff89243d874080  R15: 0000000000000000
        ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
    linux-sunxi#10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core]
    linux-sunxi#11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core]
    linux-sunxi#12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core]
    linux-sunxi#13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core]
    linux-sunxi#14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core]
    linux-sunxi#15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core]
    linux-sunxi#16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core]
    linux-sunxi#17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46
    linux-sunxi#18 [ffff89240e1a3d48] speed_show at ffffffff8f277208
    linux-sunxi#19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3
    linux-sunxi#20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf
    linux-sunxi#21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596
    linux-sunxi#22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10
    linux-sunxi#23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5
    linux-sunxi#24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff
    linux-sunxi#25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f
    linux-sunxi#26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92

    crash> net_device.state ffff89443b0c0000
      state = 0x5  (__LINK_STATE_START| __LINK_STATE_NOCARRIER)

To prevent this scenario, we also make sure that the netdevice is present.

Signed-off-by: suresh kumar <suresh2514@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Apr 25, 2022
We enabled UBSAN in the ubuntu kernel, and the cs35l41 driver triggers
a warning calltrace like below:

cs35l41-hda i2c-CSC3551:00-cs35l41-hda.0: bitoffset= 8, word_offset=23, bit_sum mod 32=0, otp_map[i].size = 24
cs35l41-hda i2c-CSC3551:00-cs35l41-hda.0: bitoffset= 0, word_offset=24, bit_sum mod 32=24, otp_map[i].size = 0
================================================================================
UBSAN: shift-out-of-bounds in linux-kernel-src/sound/soc/codecs/cs35l41-lib.c:836:8
shift exponent 64 is too large for 64-bit type 'long unsigned int'
CPU: 10 PID: 595 Comm: systemd-udevd Not tainted 5.15.0-23-generic linux-sunxi#23
Hardware name: LENOVO \x02MFG_IN_GO/\x02MFG_IN_GO, BIOS N3GET19W (1.00 ) 03/11/2022
Call Trace:
 <TASK>
 show_stack+0x52/0x58
 dump_stack_lvl+0x4a/0x5f
 dump_stack+0x10/0x12
 ubsan_epilogue+0x9/0x45
 __ubsan_handle_shift_out_of_bounds.cold+0x61/0xef
 ? regmap_unlock_mutex+0xe/0x10
 cs35l41_otp_unpack.cold+0x1c6/0x2b2 [snd_soc_cs35l41_lib]
 cs35l41_hda_probe+0x24f/0x33a [snd_hda_scodec_cs35l41]
 cs35l41_hda_i2c_probe+0x65/0x90 [snd_hda_scodec_cs35l41_i2c]

When both bitoffset and otp_map[i].size are 0, the line 836 will
result in GENMASK(-1, 0), this triggers the shift-out-of-bounds
calltrace.

Here add a checking, if both bitoffset and otp_map[i].size are 0,
do not run GENMASK() and directly set otp_val to 0, this will not
bring any function change on the driver but could avoid the calltrace.

Signed-off-by: Hui Wang <hui.wang@canonical.com>
Link: https://lore.kernel.org/r/20220324081839.62009-2-hui.wang@canonical.com
Signed-off-by: Mark Brown <broonie@kernel.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue May 10, 2022
[ Upstream commit 0b3d5d2 ]

We enabled UBSAN in the ubuntu kernel, and the cs35l41 driver triggers
a warning calltrace like below:

cs35l41-hda i2c-CSC3551:00-cs35l41-hda.0: bitoffset= 8, word_offset=23, bit_sum mod 32=0, otp_map[i].size = 24
cs35l41-hda i2c-CSC3551:00-cs35l41-hda.0: bitoffset= 0, word_offset=24, bit_sum mod 32=24, otp_map[i].size = 0
================================================================================
UBSAN: shift-out-of-bounds in linux-kernel-src/sound/soc/codecs/cs35l41-lib.c:836:8
shift exponent 64 is too large for 64-bit type 'long unsigned int'
CPU: 10 PID: 595 Comm: systemd-udevd Not tainted 5.15.0-23-generic linux-sunxi#23
Hardware name: LENOVO \x02MFG_IN_GO/\x02MFG_IN_GO, BIOS N3GET19W (1.00 ) 03/11/2022
Call Trace:
 <TASK>
 show_stack+0x52/0x58
 dump_stack_lvl+0x4a/0x5f
 dump_stack+0x10/0x12
 ubsan_epilogue+0x9/0x45
 __ubsan_handle_shift_out_of_bounds.cold+0x61/0xef
 ? regmap_unlock_mutex+0xe/0x10
 cs35l41_otp_unpack.cold+0x1c6/0x2b2 [snd_soc_cs35l41_lib]
 cs35l41_hda_probe+0x24f/0x33a [snd_hda_scodec_cs35l41]
 cs35l41_hda_i2c_probe+0x65/0x90 [snd_hda_scodec_cs35l41_i2c]

When both bitoffset and otp_map[i].size are 0, the line 836 will
result in GENMASK(-1, 0), this triggers the shift-out-of-bounds
calltrace.

Here add a checking, if both bitoffset and otp_map[i].size are 0,
do not run GENMASK() and directly set otp_val to 0, this will not
bring any function change on the driver but could avoid the calltrace.

Signed-off-by: Hui Wang <hui.wang@canonical.com>
Link: https://lore.kernel.org/r/20220324081839.62009-2-hui.wang@canonical.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Jun 10, 2022
[ Upstream commit 9f34290 ]

The CS35L41_NUM_OTP_ELEM is 100, but only 99 entries are defined in
the array otp_map_1/2[CS35L41_NUM_OTP_ELEM], this will trigger UBSAN
to report a shift-out-of-bounds warning in the cs35l41_otp_unpack()
since the last entry in the array will result in GENMASK(-1, 0).

UBSAN reports this problem:
 UBSAN: shift-out-of-bounds in /home/hwang4/build/jammy/jammy/sound/soc/codecs/cs35l41-lib.c:836:8
 shift exponent 64 is too large for 64-bit type 'long unsigned int'
 CPU: 10 PID: 595 Comm: systemd-udevd Not tainted 5.15.0-23-generic linux-sunxi#23
 Hardware name: LENOVO \x02MFG_IN_GO/\x02MFG_IN_GO, BIOS N3GET19W (1.00 ) 03/11/2022
 Call Trace:
  <TASK>
  show_stack+0x52/0x58
  dump_stack_lvl+0x4a/0x5f
  dump_stack+0x10/0x12
  ubsan_epilogue+0x9/0x45
  __ubsan_handle_shift_out_of_bounds.cold+0x61/0xef
  ? regmap_unlock_mutex+0xe/0x10
  cs35l41_otp_unpack.cold+0x1c6/0x2b2 [snd_soc_cs35l41_lib]
  cs35l41_hda_probe+0x24f/0x33a [snd_hda_scodec_cs35l41]
  cs35l41_hda_i2c_probe+0x65/0x90 [snd_hda_scodec_cs35l41_i2c]
  ? cs35l41_hda_i2c_remove+0x20/0x20 [snd_hda_scodec_cs35l41_i2c]
  i2c_device_probe+0x252/0x2b0

Fixes: 6450ef5 ("ASoC: cs35l41: CS35L41 Boosted Smart Amplifier")
Reviewed-by: Lucas Tanure <tanureal@opensource.cirrus.com>
Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Link: https://lore.kernel.org/r/20220328123535.50000-2-hui.wang@canonical.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
jwrdegoede pushed a commit to jwrdegoede/linux-sunxi that referenced this issue Sep 9, 2022
After modifying the QP to the Error state, all RX WR would be completed
with WC in IB_WC_WR_FLUSH_ERR status. Current implementation does not
wait for it is done, but destroy the QP and free the link group directly.
So there is a risk that accessing the freed memory in tasklet context.

Here is a crash example:

 BUG: unable to handle page fault for address: ffffffff8f220860
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD f7300e067 P4D f7300e067 PUD f7300f063 PMD 8c4e45063 PTE 800ffff08c9df060
 Oops: 0002 [#1] SMP PTI
 CPU: 1 PID: 0 Comm: swapper/1 Kdump: loaded Tainted: G S         OE     5.10.0-0607+ linux-sunxi#23
 Hardware name: Inspur NF5280M4/YZMB-00689-101, BIOS 4.1.20 07/09/2018
 RIP: 0010:native_queued_spin_lock_slowpath+0x176/0x1b0
 Code: f3 90 48 8b 32 48 85 f6 74 f6 eb d5 c1 ee 12 83 e0 03 83 ee 01 48 c1 e0 05 48 63 f6 48 05 00 c8 02 00 48 03 04 f5 00 09 98 8e <48> 89 10 8b 42 08 85 c0 75 09 f3 90 8b 42 08 85 c0 74 f7 48 8b 32
 RSP: 0018:ffffb3b6c001ebd8 EFLAGS: 00010086
 RAX: ffffffff8f220860 RBX: 0000000000000246 RCX: 0000000000080000
 RDX: ffff91db1f86c800 RSI: 000000000000173c RDI: ffff91db62bace00
 RBP: ffff91db62bacc00 R08: 0000000000000000 R09: c00000010000028b
 R10: 0000000000055198 R11: ffffb3b6c001ea58 R12: ffff91db80e05010
 R13: 000000000000000a R14: 0000000000000006 R15: 0000000000000040
 FS:  0000000000000000(0000) GS:ffff91db1f840000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffffffff8f220860 CR3: 00000001f9580004 CR4: 00000000003706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  <IRQ>
  _raw_spin_lock_irqsave+0x30/0x40
  mlx5_ib_poll_cq+0x4c/0xc50 [mlx5_ib]
  smc_wr_rx_tasklet_fn+0x56/0xa0 [smc]
  tasklet_action_common.isra.21+0x66/0x100
  __do_softirq+0xd5/0x29c
  asm_call_irq_on_stack+0x12/0x20
  </IRQ>
  do_softirq_own_stack+0x37/0x40
  irq_exit_rcu+0x9d/0xa0
  sysvec_call_function_single+0x34/0x80
  asm_sysvec_call_function_single+0x12/0x20

Fixes: bd4ad57 ("smc: initialize IB transport incl. PD, MR, QP, CQ, event, WR")
Signed-off-by: Yacan Liu <liuyacan@corp.netease.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Oct 4, 2022
[ Upstream commit e9b1a4f ]

After modifying the QP to the Error state, all RX WR would be completed
with WC in IB_WC_WR_FLUSH_ERR status. Current implementation does not
wait for it is done, but destroy the QP and free the link group directly.
So there is a risk that accessing the freed memory in tasklet context.

Here is a crash example:

 BUG: unable to handle page fault for address: ffffffff8f220860
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD f7300e067 P4D f7300e067 PUD f7300f063 PMD 8c4e45063 PTE 800ffff08c9df060
 Oops: 0002 [jwrdegoede#1] SMP PTI
 CPU: 1 PID: 0 Comm: swapper/1 Kdump: loaded Tainted: G S         OE     5.10.0-0607+ linux-sunxi#23
 Hardware name: Inspur NF5280M4/YZMB-00689-101, BIOS 4.1.20 07/09/2018
 RIP: 0010:native_queued_spin_lock_slowpath+0x176/0x1b0
 Code: f3 90 48 8b 32 48 85 f6 74 f6 eb d5 c1 ee 12 83 e0 03 83 ee 01 48 c1 e0 05 48 63 f6 48 05 00 c8 02 00 48 03 04 f5 00 09 98 8e <48> 89 10 8b 42 08 85 c0 75 09 f3 90 8b 42 08 85 c0 74 f7 48 8b 32
 RSP: 0018:ffffb3b6c001ebd8 EFLAGS: 00010086
 RAX: ffffffff8f220860 RBX: 0000000000000246 RCX: 0000000000080000
 RDX: ffff91db1f86c800 RSI: 000000000000173c RDI: ffff91db62bace00
 RBP: ffff91db62bacc00 R08: 0000000000000000 R09: c00000010000028b
 R10: 0000000000055198 R11: ffffb3b6c001ea58 R12: ffff91db80e05010
 R13: 000000000000000a R14: 0000000000000006 R15: 0000000000000040
 FS:  0000000000000000(0000) GS:ffff91db1f840000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffffffff8f220860 CR3: 00000001f9580004 CR4: 00000000003706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  <IRQ>
  _raw_spin_lock_irqsave+0x30/0x40
  mlx5_ib_poll_cq+0x4c/0xc50 [mlx5_ib]
  smc_wr_rx_tasklet_fn+0x56/0xa0 [smc]
  tasklet_action_common.isra.21+0x66/0x100
  __do_softirq+0xd5/0x29c
  asm_call_irq_on_stack+0x12/0x20
  </IRQ>
  do_softirq_own_stack+0x37/0x40
  irq_exit_rcu+0x9d/0xa0
  sysvec_call_function_single+0x34/0x80
  asm_sysvec_call_function_single+0x12/0x20

Fixes: bd4ad57 ("smc: initialize IB transport incl. PD, MR, QP, CQ, event, WR")
Signed-off-by: Yacan Liu <liuyacan@corp.netease.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Oct 26, 2022
commit 9cc205e upstream.

Fix port I/O string accessors such as `insb', `outsb', etc. which use
the physical PCI port I/O address rather than the corresponding memory
mapping to get at the requested location, which in turn breaks at least
accesses made by our parport driver to a PCIe parallel port such as:

PCI parallel port detected: 1415:c118, I/O at 0x1000(0x1008), IRQ 20
parport0: PC-style at 0x1000 (0x1008), irq 20, using FIFO [PCSPP,TRISTATE,COMPAT,EPP,ECP]

causing a memory access fault:

Unable to handle kernel access to user memory without uaccess routines at virtual address 0000000000001008
Oops [jwrdegoede#1]
Modules linked in:
CPU: 1 PID: 350 Comm: cat Not tainted 6.0.0-rc2-00283-g10d4879f9ef0-dirty linux-sunxi#23
Hardware name: SiFive HiFive Unmatched A00 (DT)
epc : parport_pc_fifo_write_block_pio+0x266/0x416
 ra : parport_pc_fifo_write_block_pio+0xb4/0x416
epc : ffffffff80542c3e ra : ffffffff80542a8c sp : ffffffd88899fc60
 gp : ffffffff80fa2700 tp : ffffffd882b1e900 t0 : ffffffd883d0b000
 t1 : ffffffffff000002 t2 : 4646393043330a38 s0 : ffffffd88899fcf0
 s1 : 0000000000001000 a0 : 0000000000000010 a1 : 0000000000000000
 a2 : ffffffd883d0a010 a3 : 0000000000000023 a4 : 00000000ffff8fbb
 a5 : ffffffd883d0a001 a6 : 0000000100000000 a7 : ffffffc800000000
 s2 : ffffffffff000002 s3 : ffffffff80d28880 s4 : ffffffff80fa1f50
 s5 : 0000000000001008 s6 : 0000000000000008 s7 : ffffffd883d0a000
 s8 : 0004000000000000 s9 : ffffffff80dc1d80 s10: ffffffd8807e4000
 s11: 0000000000000000 t3 : 00000000000000ff t4 : 393044410a303930
 t5 : 0000000000001000 t6 : 0000000000040000
status: 0000000200000120 badaddr: 0000000000001008 cause: 000000000000000f
[<ffffffff80543212>] parport_pc_compat_write_block_pio+0xfe/0x200
[<ffffffff8053bbc0>] parport_write+0x46/0xf8
[<ffffffff8050530e>] lp_write+0x158/0x2d2
[<ffffffff80185716>] vfs_write+0x8e/0x2c2
[<ffffffff80185a74>] ksys_write+0x52/0xc2
[<ffffffff80185af2>] sys_write+0xe/0x16
[<ffffffff80003770>] ret_from_syscall+0x0/0x2
---[ end trace 0000000000000000 ]---

For simplicity address the problem by adding PCI_IOBASE to the physical
address requested in the respective wrapper macros only, observing that
the raw accessors such as `__insb', `__outsb', etc. are not supposed to
be used other than by said macros.  Remove the cast to `long' that is no
longer needed on `addr' now that it is used as an offset from PCI_IOBASE
and add parentheses around `addr' needed for predictable evaluation in
macro expansion.  No need to make said adjustments in separate changes
given that current code is gravely broken and does not ever work.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Fixes: fab957c ("RISC-V: Atomic and Locking Code")
Cc: stable@vger.kernel.org # v4.15+
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2209220223080.29493@angie.orcam.me.uk
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
jwrdegoede pushed a commit to jwrdegoede/linux-sunxi that referenced this issue Mar 22, 2023
The UAF bug occurred because we were putting DFS root sessions in
cifs_umount() while DFS cache refresher was being executed.

Make DFS root sessions have same lifetime as DFS tcons so we can avoid
the use-after-free bug is DFS cache refresher and other places that
require IPCs to get new DFS referrals on.  Also, get rid of mount
group handling in DFS cache as we no longer need it.

This fixes below use-after-free bug catched by KASAN

[ 379.946955] BUG: KASAN: use-after-free in __refresh_tcon.isra.0+0x10b/0xc10 [cifs]
[ 379.947642] Read of size 8 at addr ffff888018f57030 by task kworker/u4:3/56
[ 379.948096]
[ 379.948208] CPU: 0 PID: 56 Comm: kworker/u4:3 Not tainted 6.2.0-rc7-lku linux-sunxi#23
[ 379.948661] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.16.0-0-gd239552-rebuilt.opensuse.org 04/01/2014
[ 379.949368] Workqueue: cifs-dfscache refresh_cache_worker [cifs]
[ 379.949942] Call Trace:
[ 379.950113] <TASK>
[ 379.950260] dump_stack_lvl+0x50/0x67
[ 379.950510] print_report+0x16a/0x48e
[ 379.950759] ? __virt_addr_valid+0xd8/0x160
[ 379.951040] ? __phys_addr+0x41/0x80
[ 379.951285] kasan_report+0xdb/0x110
[ 379.951533] ? __refresh_tcon.isra.0+0x10b/0xc10 [cifs]
[ 379.952056] ? __refresh_tcon.isra.0+0x10b/0xc10 [cifs]
[ 379.952585] __refresh_tcon.isra.0+0x10b/0xc10 [cifs]
[ 379.953096] ? __pfx___refresh_tcon.isra.0+0x10/0x10 [cifs]
[ 379.953637] ? __pfx___mutex_lock+0x10/0x10
[ 379.953915] ? lock_release+0xb6/0x720
[ 379.954167] ? __pfx_lock_acquire+0x10/0x10
[ 379.954443] ? refresh_cache_worker+0x34e/0x6d0 [cifs]
[ 379.954960] ? __pfx_wb_workfn+0x10/0x10
[ 379.955239] refresh_cache_worker+0x4ad/0x6d0 [cifs]
[ 379.955755] ? __pfx_refresh_cache_worker+0x10/0x10 [cifs]
[ 379.956323] ? __pfx_lock_acquired+0x10/0x10
[ 379.956615] ? read_word_at_a_time+0xe/0x20
[ 379.956898] ? lockdep_hardirqs_on_prepare+0x12/0x220
[ 379.957235] process_one_work+0x535/0x990
[ 379.957509] ? __pfx_process_one_work+0x10/0x10
[ 379.957812] ? lock_acquired+0xb7/0x5f0
[ 379.958069] ? __list_add_valid+0x37/0xd0
[ 379.958341] ? __list_add_valid+0x37/0xd0
[ 379.958611] worker_thread+0x8e/0x630
[ 379.958861] ? __pfx_worker_thread+0x10/0x10
[ 379.959148] kthread+0x17d/0x1b0
[ 379.959369] ? __pfx_kthread+0x10/0x10
[ 379.959630] ret_from_fork+0x2c/0x50
[ 379.959879] </TASK>

Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Cc: stable@vger.kernel.org # 6.2
Signed-off-by: Steve French <stfrench@microsoft.com>
jwrdegoede pushed a commit to jwrdegoede/linux-sunxi that referenced this issue Jun 12, 2023
The cited commit adds a compeletion to remove dependency on rtnl
lock. But it causes a deadlock for multiple encapsulations:

 crash> bt ffff8aece8a64000
 PID: 1514557  TASK: ffff8aece8a64000  CPU: 3    COMMAND: "tc"
  #0 [ffffa6d14183f368] __schedule at ffffffffb8ba7f45
  #1 [ffffa6d14183f3f8] schedule at ffffffffb8ba8418
  #2 [ffffa6d14183f418] schedule_preempt_disabled at ffffffffb8ba8898
  #3 [ffffa6d14183f428] __mutex_lock at ffffffffb8baa7f8
  #4 [ffffa6d14183f4d0] mutex_lock_nested at ffffffffb8baabeb
  #5 [ffffa6d14183f4e0] mlx5e_attach_encap at ffffffffc0f48c17 [mlx5_core]
  #6 [ffffa6d14183f628] mlx5e_tc_add_fdb_flow at ffffffffc0f39680 [mlx5_core]
  linux-sunxi#7 [ffffa6d14183f688] __mlx5e_add_fdb_flow at ffffffffc0f3b636 [mlx5_core]
  linux-sunxi#8 [ffffa6d14183f6f0] mlx5e_tc_add_flow at ffffffffc0f3bcdf [mlx5_core]
  linux-sunxi#9 [ffffa6d14183f728] mlx5e_configure_flower at ffffffffc0f3c1d1 [mlx5_core]
 linux-sunxi#10 [ffffa6d14183f790] mlx5e_rep_setup_tc_cls_flower at ffffffffc0f3d529 [mlx5_core]
 linux-sunxi#11 [ffffa6d14183f7a0] mlx5e_rep_setup_tc_cb at ffffffffc0f3d714 [mlx5_core]
 linux-sunxi#12 [ffffa6d14183f7b0] tc_setup_cb_add at ffffffffb8931bb8
 linux-sunxi#13 [ffffa6d14183f810] fl_hw_replace_filter at ffffffffc0dae901 [cls_flower]
 linux-sunxi#14 [ffffa6d14183f8d8] fl_change at ffffffffc0db5c57 [cls_flower]
 linux-sunxi#15 [ffffa6d14183f970] tc_new_tfilter at ffffffffb8936047
 linux-sunxi#16 [ffffa6d14183fac8] rtnetlink_rcv_msg at ffffffffb88c7c31
 linux-sunxi#17 [ffffa6d14183fb50] netlink_rcv_skb at ffffffffb8942853
 linux-sunxi#18 [ffffa6d14183fbc0] rtnetlink_rcv at ffffffffb88c1835
 linux-sunxi#19 [ffffa6d14183fbd0] netlink_unicast at ffffffffb8941f27
 linux-sunxi#20 [ffffa6d14183fc18] netlink_sendmsg at ffffffffb8942245
 linux-sunxi#21 [ffffa6d14183fc98] sock_sendmsg at ffffffffb887d482
 linux-sunxi#22 [ffffa6d14183fcb8] ____sys_sendmsg at ffffffffb887d81a
 linux-sunxi#23 [ffffa6d14183fd38] ___sys_sendmsg at ffffffffb88806e2
 linux-sunxi#24 [ffffa6d14183fe90] __sys_sendmsg at ffffffffb88807a2
 linux-sunxi#25 [ffffa6d14183ff28] __x64_sys_sendmsg at ffffffffb888080f
 linux-sunxi#26 [ffffa6d14183ff38] do_syscall_64 at ffffffffb8b9b6a8
 linux-sunxi#27 [ffffa6d14183ff50] entry_SYSCALL_64_after_hwframe at ffffffffb8c0007c
 crash> bt 0xffff8aeb07544000
 PID: 1110766  TASK: ffff8aeb07544000  CPU: 0    COMMAND: "kworker/u20:9"
  #0 [ffffa6d14e6b7bd8] __schedule at ffffffffb8ba7f45
  #1 [ffffa6d14e6b7c68] schedule at ffffffffb8ba8418
  #2 [ffffa6d14e6b7c88] schedule_timeout at ffffffffb8baef88
  #3 [ffffa6d14e6b7d10] wait_for_completion at ffffffffb8ba968b
  #4 [ffffa6d14e6b7d60] mlx5e_take_all_encap_flows at ffffffffc0f47ec4 [mlx5_core]
  #5 [ffffa6d14e6b7da0] mlx5e_rep_update_flows at ffffffffc0f3e734 [mlx5_core]
  #6 [ffffa6d14e6b7df8] mlx5e_rep_neigh_update at ffffffffc0f400bb [mlx5_core]
  linux-sunxi#7 [ffffa6d14e6b7e50] process_one_work at ffffffffb80acc9c
  linux-sunxi#8 [ffffa6d14e6b7ed0] worker_thread at ffffffffb80ad012
  linux-sunxi#9 [ffffa6d14e6b7f10] kthread at ffffffffb80b615d
 linux-sunxi#10 [ffffa6d14e6b7f50] ret_from_fork at ffffffffb8001b2f

After the first encap is attached, flow will be added to encap
entry's flows list. If neigh update is running at this time, the
following encaps of the flow can't hold the encap_tbl_lock and
sleep. If neigh update thread is waiting for that flow's init_done,
deadlock happens.

Fix it by holding lock outside of the for loop. If neigh update is
running, prevent encap flows from offloading. Since the lock is held
outside of the for loop, concurrent creation of encap entries is not
allowed. So remove unnecessary wait_for_completion call for res_ready.

Fixes: 95435ad ("net/mlx5e: Only access fully initialized flows in neigh update")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
jwrdegoede pushed a commit to jwrdegoede/linux-sunxi that referenced this issue Sep 20, 2023
The following processes run into a deadlock. CPU 41 was waiting for CPU 29
to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29
was hung by that spinlock with IRQs disabled.

  PID: 17360    TASK: ffff95c1090c5c40  CPU: 41  COMMAND: "mrdiagd"
  !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0
  !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0
  !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0
   # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0
   # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0
   # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0
   # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0
   # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0
   # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0
   # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0
   linux-sunxi#10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0
   linux-sunxi#11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0
   linux-sunxi#12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0
   linux-sunxi#13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0
   linux-sunxi#14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0
   linux-sunxi#15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0
   linux-sunxi#16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0
   linux-sunxi#17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0
   linux-sunxi#18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0
   linux-sunxi#19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0
   linux-sunxi#20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0
   linux-sunxi#21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0
   linux-sunxi#22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0
   linux-sunxi#23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0
   linux-sunxi#24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0
   linux-sunxi#25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0
   linux-sunxi#26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0
   linux-sunxi#27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0

  PID: 17355    TASK: ffff95c1090c3d80  CPU: 29  COMMAND: "mrdiagd"
  !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0
  !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0
   # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0
   # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0
   # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0
   # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0
   # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0
   # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0
   # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0
   # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0
   linux-sunxi#10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0
   linux-sunxi#11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0
   linux-sunxi#12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0
   linux-sunxi#13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0
   linux-sunxi#14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0
   linux-sunxi#15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0
   linux-sunxi#16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0
   linux-sunxi#17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0

The lock is used to synchronize different sysfs operations, it doesn't
protect any resource that will be touched by an interrupt. Consequently
it's not required to disable IRQs. Replace the spinlock with a mutex to fix
the deadlock.

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Link: https://lore.kernel.org/r/20230828221018.19471-1-junxiao.bi@oracle.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Jan 19, 2024
[ Upstream commit 37c3b9f ]

The cited commit adds a compeletion to remove dependency on rtnl
lock. But it causes a deadlock for multiple encapsulations:

 crash> bt ffff8aece8a64000
 PID: 1514557  TASK: ffff8aece8a64000  CPU: 3    COMMAND: "tc"
  #0 [ffffa6d14183f368] __schedule at ffffffffb8ba7f45
  jwrdegoede#1 [ffffa6d14183f3f8] schedule at ffffffffb8ba8418
  jwrdegoede#2 [ffffa6d14183f418] schedule_preempt_disabled at ffffffffb8ba8898
  jwrdegoede#3 [ffffa6d14183f428] __mutex_lock at ffffffffb8baa7f8
  jwrdegoede#4 [ffffa6d14183f4d0] mutex_lock_nested at ffffffffb8baabeb
  jwrdegoede#5 [ffffa6d14183f4e0] mlx5e_attach_encap at ffffffffc0f48c17 [mlx5_core]
  jwrdegoede#6 [ffffa6d14183f628] mlx5e_tc_add_fdb_flow at ffffffffc0f39680 [mlx5_core]
  linux-sunxi#7 [ffffa6d14183f688] __mlx5e_add_fdb_flow at ffffffffc0f3b636 [mlx5_core]
  linux-sunxi#8 [ffffa6d14183f6f0] mlx5e_tc_add_flow at ffffffffc0f3bcdf [mlx5_core]
  linux-sunxi#9 [ffffa6d14183f728] mlx5e_configure_flower at ffffffffc0f3c1d1 [mlx5_core]
 linux-sunxi#10 [ffffa6d14183f790] mlx5e_rep_setup_tc_cls_flower at ffffffffc0f3d529 [mlx5_core]
 linux-sunxi#11 [ffffa6d14183f7a0] mlx5e_rep_setup_tc_cb at ffffffffc0f3d714 [mlx5_core]
 linux-sunxi#12 [ffffa6d14183f7b0] tc_setup_cb_add at ffffffffb8931bb8
 linux-sunxi#13 [ffffa6d14183f810] fl_hw_replace_filter at ffffffffc0dae901 [cls_flower]
 linux-sunxi#14 [ffffa6d14183f8d8] fl_change at ffffffffc0db5c57 [cls_flower]
 linux-sunxi#15 [ffffa6d14183f970] tc_new_tfilter at ffffffffb8936047
 linux-sunxi#16 [ffffa6d14183fac8] rtnetlink_rcv_msg at ffffffffb88c7c31
 linux-sunxi#17 [ffffa6d14183fb50] netlink_rcv_skb at ffffffffb8942853
 linux-sunxi#18 [ffffa6d14183fbc0] rtnetlink_rcv at ffffffffb88c1835
 linux-sunxi#19 [ffffa6d14183fbd0] netlink_unicast at ffffffffb8941f27
 linux-sunxi#20 [ffffa6d14183fc18] netlink_sendmsg at ffffffffb8942245
 linux-sunxi#21 [ffffa6d14183fc98] sock_sendmsg at ffffffffb887d482
 linux-sunxi#22 [ffffa6d14183fcb8] ____sys_sendmsg at ffffffffb887d81a
 linux-sunxi#23 [ffffa6d14183fd38] ___sys_sendmsg at ffffffffb88806e2
 linux-sunxi#24 [ffffa6d14183fe90] __sys_sendmsg at ffffffffb88807a2
 linux-sunxi#25 [ffffa6d14183ff28] __x64_sys_sendmsg at ffffffffb888080f
 linux-sunxi#26 [ffffa6d14183ff38] do_syscall_64 at ffffffffb8b9b6a8
 linux-sunxi#27 [ffffa6d14183ff50] entry_SYSCALL_64_after_hwframe at ffffffffb8c0007c
 crash> bt 0xffff8aeb07544000
 PID: 1110766  TASK: ffff8aeb07544000  CPU: 0    COMMAND: "kworker/u20:9"
  #0 [ffffa6d14e6b7bd8] __schedule at ffffffffb8ba7f45
  jwrdegoede#1 [ffffa6d14e6b7c68] schedule at ffffffffb8ba8418
  jwrdegoede#2 [ffffa6d14e6b7c88] schedule_timeout at ffffffffb8baef88
  jwrdegoede#3 [ffffa6d14e6b7d10] wait_for_completion at ffffffffb8ba968b
  jwrdegoede#4 [ffffa6d14e6b7d60] mlx5e_take_all_encap_flows at ffffffffc0f47ec4 [mlx5_core]
  jwrdegoede#5 [ffffa6d14e6b7da0] mlx5e_rep_update_flows at ffffffffc0f3e734 [mlx5_core]
  jwrdegoede#6 [ffffa6d14e6b7df8] mlx5e_rep_neigh_update at ffffffffc0f400bb [mlx5_core]
  linux-sunxi#7 [ffffa6d14e6b7e50] process_one_work at ffffffffb80acc9c
  linux-sunxi#8 [ffffa6d14e6b7ed0] worker_thread at ffffffffb80ad012
  linux-sunxi#9 [ffffa6d14e6b7f10] kthread at ffffffffb80b615d
 linux-sunxi#10 [ffffa6d14e6b7f50] ret_from_fork at ffffffffb8001b2f

After the first encap is attached, flow will be added to encap
entry's flows list. If neigh update is running at this time, the
following encaps of the flow can't hold the encap_tbl_lock and
sleep. If neigh update thread is waiting for that flow's init_done,
deadlock happens.

Fix it by holding lock outside of the for loop. If neigh update is
running, prevent encap flows from offloading. Since the lock is held
outside of the for loop, concurrent creation of encap entries is not
allowed. So remove unnecessary wait_for_completion call for res_ready.

Fixes: 95435ad ("net/mlx5e: Only access fully initialized flows in neigh update")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants