cp: cannot stat `drivers/net/wireless/bcm4330/firmware/mw269v3_fw.bin': No such file or directory #20

huaixiaoz · 2012-05-19T13:07:53Z

Maybe you have forget put these into the fold ...
cp: cannot stat `drivers/net/wireless/bcm4330/firmware/mw269v3_fw.bin': No such file or directory

And This is the output error :

Image Name: Linux-3.0.31+
Created: Sat May 19 20:57:10 2012
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 4042472 Bytes = 3947.73 kB = 3.86 MB
Load Address: 40008000
Entry Point: 40008000
Image arch/arm/boot/uImage is ready
arch/arm/boot/uImage' ->output/uImage'
arch/arm/boot/zImage' ->output/zImage'
cp: cannot stat `drivers/net/wireless/bcm4330/firmware/mw269v3_fw.bin': No such file or directory

i used the old(3.08+ /based my table ) instead of it

The text was updated successfully, but these errors were encountered:

amery · 2012-05-19T15:10:58Z

file added in 9f2c962 please confirm it solves the issue

The warning below triggers on AMD MCM packages because physical package IDs on the cores of a _physical_ socket are the same. I.e., this field says which CPUs belong to the same physical package. However, the same two CPUs belong to two different internal, i.e. "logical" nodes in the same physical socket which is reflected in the CPU-to-node map on x86 with NUMA. Which makes this check wrong on the above topologies so circumvent it. [ 0.444413] Booting Node 0, Processors #1 #2 #3 #4 #5 Ok. [ 0.461388] ------------[ cut here ]------------ [ 0.465997] WARNING: at arch/x86/kernel/smpboot.c:310 topology_sane.clone.1+0x6e/0x81() [ 0.473960] Hardware name: Dinar [ 0.477170] sched: CPU #6's mc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency. [ 0.486860] Booting Node 1, Processors #6 [ 0.491104] Modules linked in: [ 0.494141] Pid: 0, comm: swapper/6 Not tainted 3.4.0+ #1 [ 0.499510] Call Trace: [ 0.501946] [<ffffffff8144bf92>] ? topology_sane.clone.1+0x6e/0x81 [ 0.508185] [<ffffffff8102f1fc>] warn_slowpath_common+0x85/0x9d [ 0.514163] [<ffffffff8102f2b7>] warn_slowpath_fmt+0x46/0x48 [ 0.519881] [<ffffffff8144bf92>] topology_sane.clone.1+0x6e/0x81 [ 0.525943] [<ffffffff8144c234>] set_cpu_sibling_map+0x251/0x371 [ 0.532004] [<ffffffff8144c4ee>] start_secondary+0x19a/0x218 [ 0.537729] ---[ end trace 4eaa2a86a8e2da22 ]--- [ 0.628197] #7 #8 #9 #10 #11 Ok. [ 0.807108] Booting Node 3, Processors #12 #13 #14 #15 #16 #17 Ok. [ 0.897587] Booting Node 2, Processors #18 #19 #20 #21 #22 #23 Ok. [ 0.917443] Brought up 24 CPUs We ran a topology sanity check test we have here on it and it all looks ok... hopefully :). Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20120529135442.GE29157@aftab.osrc.amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>

…d reasons commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 #24 [ffff8810343bfee8] kthread at ffffffff8108dd96 #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 9b469a6 upstream. Add 6 new devices and one modified device, based on information from laptop vendor Windows drivers. Sony provides a driver with two new devices using a Gobi 2k+ layout (1199:68a5 and 1199:68a9). The Sony driver also adds a non-standard QMI/net interface to the already supported 1199:9011 Gobi device. We do not know whether this is an alternate interface number or an additional interface which might be present, but that doesn't really matter. Lenovo provides a driver supporting 4 new devices: - MC7770 (1199:901b) with standard Gobi 2k+ layout - MC7700 (0f3d:68a2) with layout similar to MC7710 - MC7750 (114f:68a2) with layout similar to MC7710 - EM7700 (1199:901c) with layout similar to MC7710 Note regaring the three devices similar to MC7710: The Windows drivers only support interface #8 on these devices. The MC7710 can support QMI/net functions on interface #19 and #20 as well, and this driver is verified to work on interface #19 (a firmware bug is suspected to prevent #20 from working). We do not enable these additional interfaces until they either show up in a Windows driver or are verified to work in some other way. Therefore limiting the new devices to interface #8 for now. [bmork: backported to 3.4: use driver whitelisting] Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

…d reasons We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 #24 [ffff8810343bfee8] kthread at ffffffff8108dd96 #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org

The sm_lock spinlock is taken in the process context by mlx4_ib_modify_device, and in the interrupt context by update_sm_ah, so we need to take that spinlock with irqsave, and release it with irqrestore. Lockdeps reports this as follows: [ INFO: inconsistent lock state ] 3.5.0+ #20 Not tainted inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage. swapper/0/0 [HC1[1]:SC0[0]:HE0:SE1] takes: (&(&ibdev->sm_lock)->rlock){?.+...}, at: [<ffffffffa028af1d>] update_sm_ah+0xad/0x100 [mlx4_ib] {HARDIRQ-ON-W} state was registered at: [<ffffffff810b84a0>] mark_irqflags+0x120/0x190 [<ffffffff810b9ce7>] __lock_acquire+0x307/0x4c0 [<ffffffff810b9f51>] lock_acquire+0xb1/0x150 [<ffffffff815523b1>] _raw_spin_lock+0x41/0x50 [<ffffffffa028d563>] mlx4_ib_modify_device+0x63/0x240 [mlx4_ib] [<ffffffffa026d1fc>] ib_modify_device+0x1c/0x20 [ib_core] [<ffffffffa026c353>] set_node_desc+0x83/0xc0 [ib_core] [<ffffffff8136a150>] dev_attr_store+0x20/0x30 [<ffffffff81201fd6>] sysfs_write_file+0xe6/0x170 [<ffffffff8118da38>] vfs_write+0xc8/0x190 [<ffffffff8118dc01>] sys_write+0x51/0x90 [<ffffffff8155b869>] system_call_fastpath+0x16/0x1b ... *** DEADLOCK *** 1 lock held by swapper/0/0: stack backtrace: Pid: 0, comm: swapper/0 Not tainted 3.5.0+ #20 Call Trace: <IRQ> [<ffffffff810b7bea>] print_usage_bug+0x18a/0x190 [<ffffffff810b7370>] ? print_irq_inversion_bug+0x210/0x210 [<ffffffff810b7fb2>] mark_lock_irq+0xf2/0x280 [<ffffffff810b8290>] mark_lock+0x150/0x240 [<ffffffff810b84ef>] mark_irqflags+0x16f/0x190 [<ffffffff810b9ce7>] __lock_acquire+0x307/0x4c0 [<ffffffffa028af1d>] ? update_sm_ah+0xad/0x100 [mlx4_ib] [<ffffffff810b9f51>] lock_acquire+0xb1/0x150 [<ffffffffa028af1d>] ? update_sm_ah+0xad/0x100 [mlx4_ib] [<ffffffff815523b1>] _raw_spin_lock+0x41/0x50 [<ffffffffa028af1d>] ? update_sm_ah+0xad/0x100 [mlx4_ib] [<ffffffffa026b2fa>] ? ib_create_ah+0x1a/0x40 [ib_core] [<ffffffffa028af1d>] update_sm_ah+0xad/0x100 [mlx4_ib] [<ffffffff810c27c3>] ? is_module_address+0x23/0x30 [<ffffffffa028b05b>] handle_port_mgmt_change_event+0xeb/0x150 [mlx4_ib] [<ffffffffa028c177>] mlx4_ib_event+0x117/0x160 [mlx4_ib] [<ffffffff81552501>] ? _raw_spin_lock_irqsave+0x61/0x70 [<ffffffffa022718c>] mlx4_dispatch_event+0x6c/0x90 [mlx4_core] [<ffffffffa0221b40>] mlx4_eq_int+0x500/0x950 [mlx4_core] Reported by: Or Gerlitz <ogerlitz@mellanox.com> Tested-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>

Add 6 new devices and one modified device, based on information from laptop vendor Windows drivers. Sony provides a driver with two new devices using a Gobi 2k+ layout (1199:68a5 and 1199:68a9). The Sony driver also adds a non-standard QMI/net interface to the already supported 1199:9011 Gobi device. We do not know whether this is an alternate interface number or an additional interface which might be present, but that doesn't really matter. Lenovo provides a driver supporting 4 new devices: - MC7770 (1199:901b) with standard Gobi 2k+ layout - MC7700 (0f3d:68a2) with layout similar to MC7710 - MC7750 (114f:68a2) with layout similar to MC7710 - EM7700 (1199:901c) with layout similar to MC7710 Note regaring the three devices similar to MC7710: The Windows drivers only support interface #8 on these devices. The MC7710 can support QMI/net functions on interface #19 and #20 as well, and this driver is verified to work on interface #19 (a firmware bug is suspected to prevent #20 from working). We do not enable these additional interfaces until they either show up in a Windows driver or are verified to work in some other way. Therefore limiting the new devices to interface #8 for now. Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>

Currently IOP3XX_PERIPHERAL_VIRT_BASE conflicts with PCI_IO_VIRT_BASE: address size PCI_IO_VIRT_BASE 0xfee00000 0x200000 IOP3XX_PERIPHERAL_VIRT_BASE 0xfeffe000 0x2000 Fix by moving IOP3XX_PERIPHERAL_VIRT_BASE below PCI_IO_VIRT_BASE. The patch fixes the following kernel panic with 3.9-rc1 on iop3xx boards: [ 0.000000] Booting Linux on physical CPU 0x0 [ 0.000000] Initializing cgroup subsys cpu [ 0.000000] Linux version 3.9.0-rc1-iop32x (aaro@blackmetal) (gcc version 4.7.2 (GCC) ) #20 PREEMPT Tue Mar 5 16:44:36 EET 2013 [ 0.000000] bootconsole [earlycon0] enabled [ 0.000000] ------------[ cut here ]------------ [ 0.000000] kernel BUG at mm/vmalloc.c:1145! [ 0.000000] Internal error: Oops - BUG: 0 [#1] PREEMPT ARM [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 Not tainted (3.9.0-rc1-iop32x #20) [ 0.000000] PC is at vm_area_add_early+0x4c/0x88 [ 0.000000] LR is at add_static_vm_early+0x14/0x68 [ 0.000000] pc : [<c03e74a8>] lr : [<c03e1c40>] psr: 800000d3 [ 0.000000] sp : c03ffee4 ip : dfffdf88 fp : c03ffef4 [ 0.000000] r10: 00000002 r9 : 000000cf r8 : 00000653 [ 0.000000] r7 : c040eca8 r6 : c03e2408 r5 : dfffdf60 r4 : 00200000 [ 0.000000] r3 : dfffdfd8 r2 : feffe000 r1 : ff000000 r0 : dfffdf60 [ 0.000000] Flags: Nzcv IRQs off FIQs off Mode SVC_32 ISA ARM Segment kernel [ 0.000000] Control: 0000397f Table: a0004000 DAC: 00000017 [ 0.000000] Process swapper (pid: 0, stack limit = 0xc03fe1b8) [ 0.000000] Stack: (0xc03ffee4 to 0xc0400000) [ 0.000000] fee0: 00200000 c03fff0c c03ffef8 c03e1c40 c03e7468 00200000 fee00000 [ 0.000000] ff00: c03fff2c c03fff10 c03e23e4 c03e1c38 feffe000 c0408ee4 ff000000 c0408f04 [ 0.000000] ff20: c03fff3c c03fff30 c03e2434 c03e23b4 c03fff84 c03fff40 c03e2c94 c03e2414 [ 0.000000] ff40: c03f8878 c03f6410 ffff0000 000bffff 00001000 00000008 c03fff84 c03f6410 [ 0.000000] ff60: c04227e8 c03fffd4 a0008000 c03f8878 69052e30 c02f96eb c03fffbc c03fff88 [ 0.000000] ff80: c03e044c c03e268c 00000000 0000397f c0385130 00000001 ffffffff c03f8874 [ 0.000000] ffa0: dfffffff a0004000 69052e30 a03f61a0 c03ffff4 c03fffc0 c03dd5cc c03e0184 [ 0.000000] ffc0: 00000000 00000000 00000000 00000000 00000000 c03f8878 0000397d c040601c [ 0.000000] ffe0: c03f8874 c0408674 00000000 c03ffff8 a0008040 c03dd558 00000000 00000000 [ 0.000000] Backtrace: [ 0.000000] [<c03e745c>] (vm_area_add_early+0x0/0x88) from [<c03e1c40>] (add_static_vm_early+0x14/0x68) Tested-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

…rupt handler Mutexes should not be acquired in interrupt context. While the trylock fastpath is arguably safe on all implementations, the slowpath unlock path definitely isn't. This fixes the following lockdep splat: [ 13.044313] ------------[ cut here ]------------ [ 13.044367] WARNING: at /c/kernel-tests/src/tip/kernel/mutex.c:858 mutex_trylock+0x87/0x220() [ 13.044378] DEBUG_LOCKS_WARN_ON(in_interrupt()) [ 13.044378] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.0-rc4-00296-ga2963dd #20 [ 13.044379] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 13.044390] 0000000000000009 ffff88000de039f8 ffffffff81fc86d5 ffff88000de03a38 [ 13.044395] ffffffff810d511b ffff880000000018 ffff88000f33c690 0000000000000001 [ 13.044398] 00000000000003f0 ffff88000f4677c8 0000000000000000 ffff88000de03a98 [ 13.044400] Call Trace: [ 13.044412] <IRQ> [<ffffffff81fc86d5>] dump_stack+0x19/0x1b [ 13.044441] [<ffffffff810d511b>] warn_slowpath_common+0x6b/0x90 [ 13.044445] [<ffffffff810d51a6>] warn_slowpath_fmt+0x46/0x50 [ 13.044448] [<ffffffff81fd34d7>] mutex_trylock+0x87/0x220 [ 13.044482] [<ffffffff8186484d>] cirrus_dirty_update+0x1cd/0x330 [ 13.044486] [<ffffffff818649e8>] cirrus_imageblit+0x38/0x50 [ 13.044506] [<ffffffff8165782e>] soft_cursor+0x22e/0x240 [ 13.044510] [<ffffffff81656c31>] bit_cursor+0x581/0x5b0 [ 13.044525] [<ffffffff815de9f4>] ? vsnprintf+0x124/0x670 [ 13.044529] [<ffffffff81651333>] ? get_color.isra.16+0x43/0x130 [ 13.044532] [<ffffffff81653fca>] fbcon_cursor+0x18a/0x1d0 [ 13.044535] [<ffffffff816566b0>] ? update_attr.isra.2+0xa0/0xa0 [ 13.044556] [<ffffffff81754b82>] hide_cursor+0x32/0xa0 [ 13.044565] [<ffffffff81755bd3>] vt_console_print+0x103/0x3b0 [ 13.044569] [<ffffffff810d58ac>] ? print_time+0x9c/0xb0 [ 13.044576] [<ffffffff810d5960>] ? print_prefix+0xa0/0xc0 [ 13.044580] [<ffffffff810d63f6>] call_console_drivers.constprop.6+0x146/0x1f0 [ 13.044593] [<ffffffff815f9b38>] ? do_raw_spin_unlock+0xc8/0x100 [ 13.044597] [<ffffffff810d6f27>] console_unlock+0x2f7/0x460 [ 13.044600] [<ffffffff810d787a>] vprintk_emit+0x59a/0x5e0 [ 13.044615] [<ffffffff81fb676c>] printk+0x4d/0x4f [ 13.044650] [<ffffffff82ba5511>] print_local_APIC+0x28/0x41c [ 13.044672] [<ffffffff8114db55>] generic_smp_call_function_single_interrupt+0x145/0x2b0 [ 13.044688] [<ffffffff8106f9e7>] smp_call_function_single_interrupt+0x27/0x40 [ 13.044697] [<ffffffff81fd8f72>] call_function_single_interrupt+0x72/0x80 [ 13.044707] <EOI> [<ffffffff81078166>] ? native_safe_halt+0x6/0x10 [ 13.044717] [<ffffffff811425cd>] ? trace_hardirqs_on+0xd/0x10 [ 13.044738] [<ffffffff8104f669>] default_idle+0x59/0x120 [ 13.044742] [<ffffffff810501e8>] arch_cpu_idle+0x18/0x40 [ 13.044754] [<ffffffff811320c5>] cpu_startup_entry+0x235/0x410 [ 13.044763] [<ffffffff81f9e781>] rest_init+0xd1/0xe0 [ 13.044766] [<ffffffff81f9e6b5>] ? rest_init+0x5/0xe0 [ 13.044778] [<ffffffff82b93ec2>] start_kernel+0x425/0x493 [ 13.044781] [<ffffffff82b93810>] ? repair_env_string+0x5e/0x5e [ 13.044786] [<ffffffff82b93595>] x86_64_start_reservations+0x2a/0x2c [ 13.044789] [<ffffffff82b93688>] x86_64_start_kernel+0xf1/0x100 [ 13.044799] ---[ end trace 113ad28772af4058 ]--- Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Signed-off-by: Dave Airlie <airlied@redhat.com>

Several people reported the warning: "kernel BUG at kernel/timer.c:729!" and the stack trace is: #7 [ffff880214d25c10] mod_timer+501 at ffffffff8106d905 #8 [ffff880214d25c50] br_multicast_del_pg.isra.20+261 at ffffffffa0731d25 [bridge] #9 [ffff880214d25c80] br_multicast_disable_port+88 at ffffffffa0732948 [bridge] #10 [ffff880214d25cb0] br_stp_disable_port+154 at ffffffffa072bcca [bridge] #11 [ffff880214d25ce8] br_device_event+520 at ffffffffa072a4e8 [bridge] #12 [ffff880214d25d18] notifier_call_chain+76 at ffffffff8164aafc #13 [ffff880214d25d50] raw_notifier_call_chain+22 at ffffffff810858f6 #14 [ffff880214d25d60] call_netdevice_notifiers+45 at ffffffff81536aad #15 [ffff880214d25d80] dev_close_many+183 at ffffffff81536d17 #16 [ffff880214d25dc0] rollback_registered_many+168 at ffffffff81537f68 #17 [ffff880214d25de8] rollback_registered+49 at ffffffff81538101 #18 [ffff880214d25e10] unregister_netdevice_queue+72 at ffffffff815390d8 #19 [ffff880214d25e30] __tun_detach+272 at ffffffffa074c2f0 [tun] #20 [ffff880214d25e88] tun_chr_close+45 at ffffffffa074c4bd [tun] #21 [ffff880214d25ea8] __fput+225 at ffffffff8119b1f1 #22 [ffff880214d25ef0] ____fput+14 at ffffffff8119b3fe #23 [ffff880214d25f00] task_work_run+159 at ffffffff8107cf7f #24 [ffff880214d25f30] do_notify_resume+97 at ffffffff810139e1 #25 [ffff880214d25f50] int_signal+18 at ffffffff8164f292 this is due to I forgot to check if mp->timer is armed in br_multicast_del_pg(). This bug is introduced by commit 9f00b2e (bridge: only expire the mdb entry when query is received). Same for __br_mdb_del(). Tested-by: poma <pomidorabelisima@gmail.com> Reported-by: LiYonghua <809674045@qq.com> Reported-by: Robert Hancock <hancockrwd@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>

…s struct file The following call chain: ------------------------------------------------------------ nfs4_get_vfs_file - nfsd_open - dentry_open - do_dentry_open - __get_file_write_access - get_write_access - return atomic_inc_unless_negative(&inode->i_writecount) ? 0 : -ETXTBSY; ------------------------------------------------------------ can result in the following state: ------------------------------------------------------------ struct nfs4_file { ... fi_fds = {0xffff880c1fa65c80, 0xffffffffffffffe6, 0x0}, fi_access = {{ counter = 0x1 }, { counter = 0x0 }}, ... ------------------------------------------------------------ 1) First time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is NULL, hence nfsd_open() is called where we get status set to an error and fp->fi_fds[O_WRONLY] to -ETXTBSY. Thus we do not reach nfs4_file_get_access() and fi_access[O_WRONLY] is not incremented. 2) Second time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is NOT NULL (-ETXTBSY), so nfsd_open() is NOT called, but nfs4_file_get_access() IS called and fi_access[O_WRONLY] is incremented. Thus we leave a landmine in the form of the nfs4_file data structure in an incorrect state. 3) Eventually, when __nfs4_file_put_access() is called it finds fi_access[O_WRONLY] being non-zero, it decrements it and calls nfs4_file_put_fd() which tries to fput -ETXTBSY. ------------------------------------------------------------ ... [exception RIP: fput+0x9] RIP: ffffffff81177fa9 RSP: ffff88062e365c90 RFLAGS: 00010282 RAX: ffff880c2b3d99cc RBX: ffff880c2b3d9978 RCX: 0000000000000002 RDX: dead000000100101 RSI: 0000000000000001 RDI: ffffffffffffffe6 RBP: ffff88062e365c90 R8: ffff88041fe797d8 R9: ffff88062e365d58 R10: 0000000000000008 R11: 0000000000000000 R12: 0000000000000001 R13: 0000000000000007 R14: 0000000000000000 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #9 [ffff88062e365c98] __nfs4_file_put_access at ffffffffa0562334 [nfsd] #10 [ffff88062e365cc8] nfs4_file_put_access at ffffffffa05623ab [nfsd] #11 [ffff88062e365ce8] free_generic_stateid at ffffffffa056634d [nfsd] #12 [ffff88062e365d18] release_open_stateid at ffffffffa0566e4b [nfsd] #13 [ffff88062e365d38] nfsd4_close at ffffffffa0567401 [nfsd] #14 [ffff88062e365d88] nfsd4_proc_compound at ffffffffa0557f28 [nfsd] #15 [ffff88062e365dd8] nfsd_dispatch at ffffffffa054543e [nfsd] #16 [ffff88062e365e18] svc_process_common at ffffffffa04ba5a4 [sunrpc] #17 [ffff88062e365e98] svc_process at ffffffffa04babe0 [sunrpc] #18 [ffff88062e365eb8] nfsd at ffffffffa0545b62 [nfsd] #19 [ffff88062e365ee8] kthread at ffffffff81090886 #20 [ffff88062e365f48] kernel_thread at ffffffff8100c14a ------------------------------------------------------------ Cc: stable@vger.kernel.org Signed-off-by: Harshula Jayasuriya <harshula@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>

When booting secondary CPUs, announce_cpu() is called to show which cpu has been brought up. For example: [ 0.402751] smpboot: Booting Node 0, Processors #1 #2 #3 #4 #5 OK [ 0.525667] smpboot: Booting Node 1, Processors #6 #7 #8 #9 #10 #11 OK [ 0.755592] smpboot: Booting Node 0, Processors #12 #13 #14 #15 #16 #17 OK [ 0.890495] smpboot: Booting Node 1, Processors #18 #19 #20 #21 #22 #23 But the last "OK" is lost, because 'nr_cpu_ids-1' represents the maximum possible cpu id. It should use the maximum present cpu id in case not all CPUs booted up. Signed-off-by: Libin <huawei.libin@huawei.com> Cc: <guohanjun@huawei.com> Cc: <wangyijing@huawei.com> Cc: <fenghua.yu@intel.com> Cc: <paul.gortmaker@windriver.com> Link: http://lkml.kernel.org/r/1378378676-18276-1-git-send-email-huawei.libin@huawei.com [ tweaked the changelog, removed unnecessary line break, tweaked the format to align the fields vertically. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>

…s struct file commit e4daf1f upstream. The following call chain: ------------------------------------------------------------ nfs4_get_vfs_file - nfsd_open - dentry_open - do_dentry_open - __get_file_write_access - get_write_access - return atomic_inc_unless_negative(&inode->i_writecount) ? 0 : -ETXTBSY; ------------------------------------------------------------ can result in the following state: ------------------------------------------------------------ struct nfs4_file { ... fi_fds = {0xffff880c1fa65c80, 0xffffffffffffffe6, 0x0}, fi_access = {{ counter = 0x1 }, { counter = 0x0 }}, ... ------------------------------------------------------------ 1) First time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is NULL, hence nfsd_open() is called where we get status set to an error and fp->fi_fds[O_WRONLY] to -ETXTBSY. Thus we do not reach nfs4_file_get_access() and fi_access[O_WRONLY] is not incremented. 2) Second time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is NOT NULL (-ETXTBSY), so nfsd_open() is NOT called, but nfs4_file_get_access() IS called and fi_access[O_WRONLY] is incremented. Thus we leave a landmine in the form of the nfs4_file data structure in an incorrect state. 3) Eventually, when __nfs4_file_put_access() is called it finds fi_access[O_WRONLY] being non-zero, it decrements it and calls nfs4_file_put_fd() which tries to fput -ETXTBSY. ------------------------------------------------------------ ... [exception RIP: fput+0x9] RIP: ffffffff81177fa9 RSP: ffff88062e365c90 RFLAGS: 00010282 RAX: ffff880c2b3d99cc RBX: ffff880c2b3d9978 RCX: 0000000000000002 RDX: dead000000100101 RSI: 0000000000000001 RDI: ffffffffffffffe6 RBP: ffff88062e365c90 R8: ffff88041fe797d8 R9: ffff88062e365d58 R10: 0000000000000008 R11: 0000000000000000 R12: 0000000000000001 R13: 0000000000000007 R14: 0000000000000000 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #9 [ffff88062e365c98] __nfs4_file_put_access at ffffffffa0562334 [nfsd] #10 [ffff88062e365cc8] nfs4_file_put_access at ffffffffa05623ab [nfsd] #11 [ffff88062e365ce8] free_generic_stateid at ffffffffa056634d [nfsd] #12 [ffff88062e365d18] release_open_stateid at ffffffffa0566e4b [nfsd] #13 [ffff88062e365d38] nfsd4_close at ffffffffa0567401 [nfsd] #14 [ffff88062e365d88] nfsd4_proc_compound at ffffffffa0557f28 [nfsd] #15 [ffff88062e365dd8] nfsd_dispatch at ffffffffa054543e [nfsd] #16 [ffff88062e365e18] svc_process_common at ffffffffa04ba5a4 [sunrpc] #17 [ffff88062e365e98] svc_process at ffffffffa04babe0 [sunrpc] #18 [ffff88062e365eb8] nfsd at ffffffffa0545b62 [nfsd] #19 [ffff88062e365ee8] kthread at ffffffff81090886 #20 [ffff88062e365f48] kernel_thread at ffffffff8100c14a ------------------------------------------------------------ Signed-off-by: Harshula Jayasuriya <harshula@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

As the new x86 CPU bootup printout format code maintainer, I am taking immediate action to improve and clean (and thus indulge my OCD) the reporting of the cores when coming up online. Fix padding to a right-hand alignment, cleanup code and bind reporting width to the max number of supported CPUs on the system, like this: [ 0.074509] smpboot: Booting Node 0, Processors: #1 #2 #3 #4 #5 #6 #7 OK [ 0.644008] smpboot: Booting Node 1, Processors: #8 #9 #10 #11 #12 #13 #14 #15 OK [ 1.245006] smpboot: Booting Node 2, Processors: #16 #17 #18 #19 #20 #21 #22 #23 OK [ 1.864005] smpboot: Booting Node 3, Processors: #24 #25 #26 #27 #28 #29 #30 #31 OK [ 2.489005] smpboot: Booting Node 4, Processors: #32 #33 #34 #35 #36 #37 #38 #39 OK [ 3.093005] smpboot: Booting Node 5, Processors: #40 #41 #42 #43 #44 #45 #46 #47 OK [ 3.698005] smpboot: Booting Node 6, Processors: #48 #49 #50 #51 #52 #53 #54 #55 OK [ 4.304005] smpboot: Booting Node 7, Processors: #56 #57 #58 #59 #60 #61 #62 #63 OK [ 4.961413] Brought up 64 CPUs and this: [ 0.072367] smpboot: Booting Node 0, Processors: #1 #2 #3 #4 #5 #6 #7 OK [ 0.686329] Brought up 8 CPUs Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Libin <huawei.libin@huawei.com> Cc: wangyijing@huawei.com Cc: fenghua.yu@intel.com Cc: guohanjun@huawei.com Cc: paul.gortmaker@windriver.com Link: http://lkml.kernel.org/r/20130927143554.GF4422@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>

Turn it into (for example): [ 0.073380] x86: Booting SMP configuration: [ 0.074005] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 [ 0.603005] .... node #1, CPUs: #8 #9 #10 #11 #12 #13 #14 #15 [ 1.200005] .... node #2, CPUs: #16 #17 #18 #19 #20 #21 #22 #23 [ 1.796005] .... node #3, CPUs: #24 #25 #26 #27 #28 #29 #30 #31 [ 2.393005] .... node #4, CPUs: #32 #33 #34 #35 #36 #37 #38 #39 [ 2.996005] .... node #5, CPUs: #40 #41 #42 #43 #44 #45 #46 #47 [ 3.600005] .... node #6, CPUs: #48 #49 #50 #51 #52 #53 #54 #55 [ 4.202005] .... node #7, CPUs: #56 #57 #58 #59 #60 #61 #62 #63 [ 4.811005] .... node #8, CPUs: #64 #65 #66 #67 #68 #69 #70 #71 [ 5.421006] .... node #9, CPUs: #72 #73 #74 #75 #76 #77 #78 #79 [ 6.032005] .... node #10, CPUs: #80 #81 #82 #83 #84 #85 #86 #87 [ 6.648006] .... node #11, CPUs: #88 #89 #90 #91 #92 #93 #94 #95 [ 7.262005] .... node #12, CPUs: #96 #97 #98 #99 #100 #101 #102 #103 [ 7.865005] .... node #13, CPUs: #104 #105 #106 #107 #108 #109 #110 #111 [ 8.466005] .... node #14, CPUs: #112 #113 #114 #115 #116 #117 #118 #119 [ 9.073006] .... node #15, CPUs: #120 #121 #122 #123 #124 #125 #126 #127 [ 9.679901] x86: Booted up 16 nodes, 128 CPUs and drop useless elements. Change num_digits() to hpa's division-avoiding, cell-phone-typed version which he went at great lengths and pains to submit on a Saturday evening. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: huawei.libin@huawei.com Cc: wangyijing@huawei.com Cc: fenghua.yu@intel.com Cc: guohanjun@huawei.com Cc: paul.gortmaker@windriver.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20130930095624.GB16383@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>

…ssion() While running stress tests on adding and deleting ftrace instances I hit this bug: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: selinux_inode_permission+0x85/0x160 PGD 63681067 PUD 7ddbe067 PMD 0 Oops: 0000 [#1] PREEMPT CPU: 0 PID: 5634 Comm: ftrace-test-mki Not tainted 3.13.0-rc4-test-00033-gd2a6dde-dirty #20 Hardware name: /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006 task: ffff880078375800 ti: ffff88007ddb0000 task.ti: ffff88007ddb0000 RIP: 0010:[<ffffffff812d8bc5>] [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160 RSP: 0018:ffff88007ddb1c48 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000800000 RCX: ffff88006dd43840 RDX: 0000000000000001 RSI: 0000000000000081 RDI: ffff88006ee46000 RBP: ffff88007ddb1c88 R08: 0000000000000000 R09: ffff88007ddb1c54 R10: 6e6576652f6f6f66 R11: 0000000000000003 R12: 0000000000000000 R13: 0000000000000081 R14: ffff88006ee46000 R15: 0000000000000000 FS: 00007f217b5b6700(0000) GS:ffffffff81e21000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033^M CR2: 0000000000000020 CR3: 000000006a0fe000 CR4: 00000000000007f0 Call Trace: security_inode_permission+0x1c/0x30 __inode_permission+0x41/0xa0 inode_permission+0x18/0x50 link_path_walk+0x66/0x920 path_openat+0xa6/0x6c0 do_filp_open+0x43/0xa0 do_sys_open+0x146/0x240 SyS_open+0x1e/0x20 system_call_fastpath+0x16/0x1b Code: 84 a1 00 00 00 81 e3 00 20 00 00 89 d8 83 c8 02 40 f6 c6 04 0f 45 d8 40 f6 c6 08 74 71 80 cf 02 49 8b 46 38 4c 8d 4d cc 45 31 c0 <0f> b7 50 20 8b 70 1c 48 8b 41 70 89 d9 8b 78 04 e8 36 cf ff ff RIP selinux_inode_permission+0x85/0x160 CR2: 0000000000000020 Investigating, I found that the inode->i_security was NULL, and the dereference of it caused the oops. in selinux_inode_permission(): isec = inode->i_security; rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd); Note, the crash came from stressing the deletion and reading of debugfs files. I was not able to recreate this via normal files. But I'm not sure they are safe. It may just be that the race window is much harder to hit. What seems to have happened (and what I have traced), is the file is being opened at the same time the file or directory is being deleted. As the dentry and inode locks are not held during the path walk, nor is the inodes ref counts being incremented, there is nothing saving these structures from being discarded except for an rcu_read_lock(). The rcu_read_lock() protects against freeing of the inode, but it does not protect freeing of the inode_security_struct. Now if the freeing of the i_security happens with a call_rcu(), and the i_security field of the inode is not changed (it gets freed as the inode gets freed) then there will be no issue here. (Linus Torvalds suggested not setting the field to NULL such that we do not need to check if it is NULL in the permission check). Note, this is a hack, but it fixes the problem at hand. A real fix is to restructure the destroy_inode() to call all the destructor handlers from the RCU callback. But that is a major job to do, and requires a lot of work. For now, we just band-aid this bug with this fix (it works), and work on a more maintainable solution in the future. Link: http://lkml.kernel.org/r/20140109101932.0508dec7@gandalf.local.home Link: http://lkml.kernel.org/r/20140109182756.17abaaa8@gandalf.local.home Cc: stable@vger.kernel.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

`comedi_free_board_dev()` is called (via `comedi_auto_unconfig()` --> `comedi_release_hardware_device()`) when an auto-configured comedi device is removed. This destroys the main sysfs class device and then calls `comedi_device_cleanup()` to clean up the comedi device. For comedi devices that have comedi subdevices that asynchronous commands, the clean up involves destroying the sysfs class devices associated with those subdevices. There is a bug in the above sequence because the sysfs class devices associated with the comedi subdevices are children of the sysfs class device associated with the main comedi device. Therefore they will have been automatically destroyed when the main sysfs class device is destroyed. When they are destroyed again as part of the clean-up, they will not be found, leading to a warning and a stack trace similar to this: ------------[ cut here ]------------ WARNING: CPU: 1 PID: 1213 at fs/sysfs/group.c:214 sysfs_remove_group+0x4e/0xa7() sysfs group ffffffff817504c0 not found for kobject 'comedi4_subd0' Modules linked in: nfsd auth_rpcgss oid_registry exportfs nfs_acl lockd bridge stp llc sunrpc fuse binfmt_misc cpufreq_userspace sr_mod snd_hda_codec_analog cdrom powernow_k8 kvm_amd kvm amplc_pci230(C) 8255(C) comedi(C) pcmcia xhci_hcd ehci_pci pcmcia_core ohci_pci ohci_hcd ehci_hcd usbcore snd_hda_intel snd_hda_codec snd_pcm k8temp snd_page_alloc 8139too snd_timer snd soundcore mii usb_common forcedeth pata_amd CPU: 1 PID: 1213 Comm: kworker/u4:6 Tainted: G C 3.13.0-rc5-ija1+ #20 Hardware name: System manufacturer System Product Name/M2N-E, BIOS ASUS M2N-E ACPI BIOS Revision 5001 03/23/2010 Workqueue: sysfsd sysfs_schedule_callback_work 0000000000000000 ffff8800bf17fb38 ffffffff814672ce ffff8800bf17fb80 ffff8800bf17fb70 ffffffff8103470b ffffffff8114f780 0000000000000000 ffffffff817504c0 ffff8800bf39f410 ffff880139b68670 ffff8800bf17fbd0 Call Trace: [<ffffffff814672ce>] dump_stack+0x45/0x56 [<ffffffff8103470b>] warn_slowpath_common+0x7a/0x93 [<ffffffff8114f780>] ? sysfs_remove_group+0x4e/0xa7 [<ffffffff8103476b>] warn_slowpath_fmt+0x47/0x49 [<ffffffff8114e92d>] ? sysfs_get_dirent_ns+0x5e/0x66 [<ffffffff8114f780>] sysfs_remove_group+0x4e/0xa7 [<ffffffff8132aac0>] dpm_sysfs_remove+0x37/0x3b [<ffffffff81323781>] device_del+0x3e/0x173 [<ffffffff813238c3>] device_unregister+0xd/0x18 [<ffffffff8132392e>] device_destroy+0x33/0x37 [<ffffffffa0212086>] comedi_free_subdevice_minor+0x80/0x92 [comedi] [<ffffffffa02128bb>] comedi_device_detach+0x79/0x152 [comedi] [<ffffffffa020f223>] comedi_device_cleanup+0x36/0x57 [comedi] [<ffffffffa020f275>] comedi_free_board_dev+0x31/0x3c [comedi] [<ffffffffa0211f2a>] comedi_release_hardware_device+0x5a/0x73 [comedi] [<ffffffffa0212547>] comedi_auto_unconfig+0xe/0x10 [comedi] [<ffffffffa021357c>] comedi_pci_auto_unconfig+0x10/0x12 [comedi] [<ffffffff811d2335>] pci_device_remove+0x40/0x8a [<ffffffff813261d0>] __device_release_driver+0x84/0xda [<ffffffff81326244>] device_release_driver+0x1e/0x2b [<ffffffff811cdcb5>] pci_stop_bus_device+0x44/0x87 [<ffffffff811cdde2>] pci_stop_and_remove_bus_device+0xd/0x18 [<ffffffff811d3f3d>] remove_callback+0x20/0x2f [<ffffffff8114d1f7>] sysfs_schedule_callback_work+0xf/0x70 [<ffffffff81049498>] process_one_work+0x1d6/0x34c [<ffffffff81049a5f>] worker_thread+0x1cf/0x2b5 [<ffffffff81049890>] ? rescuer_thread+0x258/0x258 [<ffffffff8104e0e6>] kthread+0xd6/0xde [<ffffffff8104e010>] ? kthread_create_on_node+0x160/0x160 [<ffffffff81472cbc>] ret_from_fork+0x7c/0xb0 [<ffffffff8104e010>] ? kthread_create_on_node+0x160/0x160 ---[ end trace 94722aa2936a7adf ]--- To correct the bug, rearrange `comedi_free_board_dev()` to destroy the main sysfs class device *after* the clean-up operation. Thanks to Bernd Porr for finding the bug and his initial attempt to fix it. Reported-by: Bernd Porr <mail@berndporr.me.uk> Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Cc: Bernd Porr <mail@berndporr.me.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

When hibernating ->resume may not be called by usb core, but disconnect and probe instead, so we do not increase the counter after decreasing it in ->supend. As a result we free memory early, and get crash when unplugging usb dongle. BUG: unable to handle kernel paging request at 6b6b6b9f IP: [<c06909b0>] driver_sysfs_remove+0x10/0x30 *pdpt = 0000000034f21001 *pde = 0000000000000000 Pid: 20, comm: khubd Not tainted 3.1.0-rc1-wl+ torvalds#20 LENOVO 6369CTO/6369CTO EIP: 0060:[<c06909b0>] EFLAGS: 00010202 CPU: 1 EIP is at driver_sysfs_remove+0x10/0x30 EAX: 6b6b6b6b EBX: f52bba34 ECX: 00000000 EDX: 6b6b6b6b ESI: 6b6b6b6b EDI: c0a0ea20 EBP: f61c9e68 ESP: f61c9e64 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Process khubd (pid: 20, ti=f61c8000 task=f6138270 task.ti=f61c8000) Call Trace: [<c06909ef>] __device_release_driver+0x1f/0xa0 [<c0690b20>] device_release_driver+0x20/0x40 [<c068fd64>] bus_remove_device+0x84/0xe0 [<c068e12a>] ? device_remove_attrs+0x2a/0x80 [<c068e267>] device_del+0xe7/0x170 [<c06d93d4>] usb_disconnect+0xd4/0x180 [<c06d9d61>] hub_thread+0x691/0x1600 [<c0473260>] ? wake_up_bit+0x30/0x30 [<c0442a39>] ? complete+0x49/0x60 [<c06d96d0>] ? hub_disconnect+0xd0/0xd0 [<c06d96d0>] ? hub_disconnect+0xd0/0xd0 [<c0472eb4>] kthread+0x74/0x80 [<c0472e40>] ? kthread_worker_fn+0x150/0x150 [<c0809b3e>] kernel_thread_helper+0x6/0x10 Cc: stable@kernel.org Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>

Errata E20: UART: Character Timeout interrupt remains set under certain software conditions. Implication: The software servicing the UART can be trapped in an infinite loop. Signed-off-by: Marcus Folkesson <marcus.folkesson@gmail.com> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

If the dummy evgen failed init, the irq allocation functions which assume init succeeded may still be called - causing an OOPS due to wrong assumption. Here's the oops: [ 3.914332] BUG: unable to handle kernel NULL pointer dereference at 0000000000000148 [ 3.915310] IP: [<ffffffff810b3008>] __lock_acquire+0xac/0xe50 [ 3.915310] PGD 0 [ 3.915310] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 3.915310] CPU 1 [ 3.915310] Pid: 1, comm: swapper Not tainted 3.2.0-rc2-sasha-00279-gd7bfb12-dirty torvalds#20 [ 3.915310] RIP: 0010:[<ffffffff810b3008>] [<ffffffff810b3008>] __lock_acquire+0xac/0xe50 [ 3.915310] RSP: 0018:ffff880012499bc0 EFLAGS: 00010046 [ 3.915310] RAX: 0000000000000086 RBX: ffff880012490000 RCX: 0000000000000000 [ 3.915310] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000148 [ 3.915310] RBP: ffff880012499c90 R08: 0000000000000002 R09: 0000000000000000 [ 3.915310] R10: 0000000000000148 R11: 0000000000000000 R12: 0000000000000148 [ 3.915310] R13: 0000000000000002 R14: 0000000000000000 R15: 0000000000000000 [ 3.915310] FS: 0000000000000000(0000) GS:ffff880013c00000(0000) knlGS:0000000000000000 [ 3.915310] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 3.915310] CR2: 0000000000000148 CR3: 0000000002605000 CR4: 00000000000406e0 [ 3.915310] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 3.915310] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 3.915310] Process swapper (pid: 1, threadinfo ffff880012498000, task ffff880012490000) [ 3.915310] Stack: [ 3.915310] ffff880012490000 ffffffff81e6fd38 ffffffff00000000 0000000000000000 [ 3.915310] 0000000000000148 0000000012499c08 ffffffff00000000 000000000000002e [ 3.915310] 0000000000000001 ffff880012499ce0 ffffffff8161620e 0000000000000000 [ 3.915310] Call Trace: [ 3.915310] [<ffffffff81e6fd38>] ? retint_restore_args+0x13/0x13 [ 3.915310] [<ffffffff8161620e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 3.915310] [<ffffffff81e6fd38>] ? retint_restore_args+0x13/0x13 [ 3.915310] [<ffffffff81af8883>] ? iio_dummy_evgen_get_irq+0x33/0x8a [ 3.915310] [<ffffffff810b4255>] lock_acquire+0x8a/0xa7 [ 3.915310] [<ffffffff81af8883>] ? iio_dummy_evgen_get_irq+0x33/0x8a [ 3.915310] [<ffffffff81e6db81>] __mutex_lock_common+0x63/0x491 [ 3.915310] [<ffffffff81af8883>] ? iio_dummy_evgen_get_irq+0x33/0x8a [ 3.915310] [<ffffffff810b474d>] ? debug_check_no_locks_freed+0x135/0x14a [ 3.915310] [<ffffffff810b2c3a>] ? lock_is_held+0x92/0x9d [ 3.915310] [<ffffffff81e6dfe5>] mutex_lock_nested+0x36/0x3b [ 3.915310] [<ffffffff81af8883>] iio_dummy_evgen_get_irq+0x33/0x8a [ 3.915310] [<ffffffff81af8594>] iio_simple_dummy_events_register+0x1b/0x69 [ 3.915310] [<ffffffff82ad4a91>] iio_dummy_init+0x105/0x18d [ 3.915310] [<ffffffff82ad498c>] ? iio_init+0x7d/0x7d [ 3.915310] [<ffffffff82a8dc02>] do_one_initcall+0x7a/0x135 [ 3.915310] [<ffffffff82a8dda7>] kernel_init+0xea/0x16f [ 3.915310] [<ffffffff81e727c4>] kernel_thread_helper+0x4/0x10 [ 3.915310] [<ffffffff81e6fd38>] ? retint_restore_args+0x13/0x13 [ 3.915310] [<ffffffff82a8dcbd>] ? do_one_initcall+0x135/0x135 [ 3.915310] [<ffffffff81e727c0>] ? gs_change+0x13/0x13 [ 3.915310] Code: 95 50 ff ff ff 74 24 e8 1f 3f 56 00 85 c0 0f 84 4e 0d 00 00 be cf 0b 00 00 83 3d 63 7c 58 02 00 0f 85 3c 0d 00 00 e9 c1 0c 00 00 [ 3.915310] 81 3a a0 17 ca 82 b8 01 00 00 00 44 0f 44 e8 83 fe 01 77 0c [ 3.915310] RIP [<ffffffff810b3008>] __lock_acquire+0xac/0xe50 [ 3.915310] RSP <ffff880012499bc0> [ 3.915310] CR2: 0000000000000148 Acked-by: Jonathan Cameron <jic23@cam.ac.uk> Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

If the netdev is already in NETREG_UNREGISTERING/_UNREGISTERED state, do not update the real num tx queues. netdev_queue_update_kobjects() is already called via remove_queue_kobjects() at NETREG_UNREGISTERING time. So, when upper layer driver, e.g., FCoE protocol stack is monitoring the netdev event of NETDEV_UNREGISTER and calls back to LLD ndo_fcoe_disable() to remove extra queues allocated for FCoE, the associated txq sysfs kobjects are already removed, and trying to update the real num queues would cause something like below: ... PID: 25138 TASK: ffff88021e64c440 CPU: 3 COMMAND: "kworker/3:3" #0 [ffff88021f007760] machine_kexec at ffffffff810226d9 #1 [ffff88021f0077d0] crash_kexec at ffffffff81089d2d #2 [ffff88021f0078a0] oops_end at ffffffff813bca78 #3 [ffff88021f0078d0] no_context at ffffffff81029e72 #4 [ffff88021f007920] __bad_area_nosemaphore at ffffffff8102a155 #5 [ffff88021f0079f0] bad_area_nosemaphore at ffffffff8102a23e torvalds#6 [ffff88021f007a00] do_page_fault at ffffffff813bf32e torvalds#7 [ffff88021f007b10] page_fault at ffffffff813bc045 [exception RIP: sysfs_find_dirent+17] RIP: ffffffff81178611 RSP: ffff88021f007bc0 RFLAGS: 00010246 RAX: ffff88021e64c440 RBX: ffffffff8156cc63 RCX: 0000000000000004 RDX: ffffffff8156cc63 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88021f007be0 R8: 0000000000000004 R9: 0000000000000008 R10: ffffffff816fed00 R11: 0000000000000004 R12: 0000000000000000 R13: ffffffff8156cc63 R14: 0000000000000000 R15: ffff8802222a0000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 torvalds#8 [ffff88021f007be8] sysfs_get_dirent at ffffffff81178c07 torvalds#9 [ffff88021f007c18] sysfs_remove_group at ffffffff8117ac27 torvalds#10 [ffff88021f007c48] netdev_queue_update_kobjects at ffffffff813178f9 torvalds#11 [ffff88021f007c88] netif_set_real_num_tx_queues at ffffffff81303e38 torvalds#12 [ffff88021f007cc8] ixgbe_set_num_queues at ffffffffa0249763 [ixgbe] torvalds#13 [ffff88021f007cf8] ixgbe_init_interrupt_scheme at ffffffffa024ea89 [ixgbe] torvalds#14 [ffff88021f007d48] ixgbe_fcoe_disable at ffffffffa0267113 [ixgbe] torvalds#15 [ffff88021f007d68] vlan_dev_fcoe_disable at ffffffffa014fef5 [8021q] torvalds#16 [ffff88021f007d78] fcoe_interface_cleanup at ffffffffa02b7dfd [fcoe] torvalds#17 [ffff88021f007df8] fcoe_destroy_work at ffffffffa02b7f08 [fcoe] torvalds#18 [ffff88021f007e18] process_one_work at ffffffff8105d7ca torvalds#19 [ffff88021f007e68] worker_thread at ffffffff81060513 torvalds#20 [ffff88021f007ee8] kthread at ffffffff810648b6 torvalds#21 [ffff88021f007f48] kernel_thread_helper at ffffffff813c40f4 Signed-off-by: Yi Zou <yi.zou@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Printing the "start_ip" for every secondary cpu is very noisy on a large system - and doesn't add any value. Drop this message. Console log before: Booting Node 0, Processors #1 smpboot cpu 1: start_ip = 96000 #2 smpboot cpu 2: start_ip = 96000 #3 smpboot cpu 3: start_ip = 96000 #4 smpboot cpu 4: start_ip = 96000 ... torvalds#31 smpboot cpu 31: start_ip = 96000 Brought up 32 CPUs Console log after: Booting Node 0, Processors #1 #2 #3 #4 #5 torvalds#6 torvalds#7 Ok. Booting Node 1, Processors torvalds#8 torvalds#9 torvalds#10 torvalds#11 torvalds#12 torvalds#13 torvalds#14 torvalds#15 Ok. Booting Node 0, Processors torvalds#16 torvalds#17 torvalds#18 torvalds#19 torvalds#20 torvalds#21 torvalds#22 torvalds#23 Ok. Booting Node 1, Processors torvalds#24 torvalds#25 torvalds#26 torvalds#27 torvalds#28 torvalds#29 torvalds#30 torvalds#31 Brought up 32 CPUs Acked-by: Borislav Petkov <bp@amd64.org> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/4f452eb42507460426@agluck-desktop.sc.intel.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>

As the V4L2 based UVC webcam gadget (g_webcam) expects the 'videodev' to present when the 'webcam_bind' routine is called, so 'videodev' should be available as early as possible. Now, when 'g_webcam' is built as a module (i.e. not a part of kernel) the late availability of 'videodev' is OK, but if 'g_webcam' is built statically as a part of the kernel, the kernel crashes (a sample crash dump using Designware 2.0 UDC is provided below). To solve the same, this patch makes 'videodev_init' as a subsys initcall. Kernel Crash Dump: ------------------ designware_udc designware_udc: Device Synopsys UDC probed csr 90810000: plug 9081200 g_webcam gadget: uvc_function_bind Unable to handle kernel NULL pointer dereference at virtual address 000000e4 pgd = 80004000 [000000e4] *pgd=00000000 Internal error: Oops: 5 [#1] SMP Modules linked in: CPU: 0 Not tainted (3.3.0-rc3-13888-ge774c03-dirty torvalds#20) PC is at do_raw_spin_lock+0x10/0x16c LR is at _raw_spin_lock+0x10/0x14 pc : [<8019e344>] lr : [<804095c0>] psr: 60000013 sp : 8f839d20 ip : 8f839d50 fp : 8f839d4c r10: 80760a94 r9 : 8042de98 r8 : 00000154 r7 : 80760e94 r6 : 805cfc10 r5 : 8fb6a008 r4 : 8fb6a008 r3 : 805dd0c8 r2 : 8f839d48 r1 : 805cfc08 r0 : 000000e0 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 10c5387d Table: 0000404a DAC: 00000015 Process swapper/0 (pid: 1, stack limit = 0x8f8382f0) Stack: (0x8f839d20 to 0x8f83a000) 9d20: ffffffff ffffffff 8fb6a008 8fb6a008 805cfc10 80760e94 00000154 8042de98 9d40: 8f839d5c 8f839d50 804095c0 8019e340 8f839d7c 8f839d60 80222b28 804095bc 9d60: 8fb12b80 8fb6a008 8fb6a010 805cfc08 8f839dc4 8f839d80 80223db8 80222adc 9d80: 8f839da 8f839d90 8022baa0 8019e2e8 8fb6a008 8075e7f4 8fb6a008 8fb6a008 9da0: 00000000 8fb6a008 80760e94 00000154 8042de98 80760a94 8f839ddc 8f839dc8 9dc0: 802242a8 80223d1c 8fb12b80 8fb6a000 8f839e1c 8f839de0 8030132c 80224298 9de0: 80223ce8 803f2ee8 00000001 804f7750 8f839e1c 8f824008 805cff20 8f824000 9e00: 8fb6a000 ffffffff 00000000 8f8d4880 8f839e4c 8f839e20 80562e3c 80301100 9e20: 00000000 8fb13140 8f824008 805cff20 8042aa68 8f824000 8042aa8c 805e4d40 9e40: 8f839e64 8f839e50 802d20c4 80562ba8 805d0058 805cff20 8f839e8c 8f839e68 9e60: 80563034 802d206c 8042aa8c 805cff20 8f8d4880 00000000 805cfc08 8fb12a40 9e80: 8f839e9c 8f839e90 805630c4 80562ec4 8f839ebc 8f839ea0 802d2364 805630b0 9ea0: 805cfeac 8f8d4880 805cfbe8 807605e8 8f839ed4 8f839ec0 80562b3c 802d22cc 9ec0: 80562ac4 8f8d4880 8f839f04 8f839ed8 802d0b40 80562ad0 8f839ef4 805cff90 9ee0: 805cff90 805cfb98 00000000 00000000 805cfbe8 805e4d40 8f839f3c 8f839f08 9f00: 802cd078 802d0a1 00000000 802d0a0c 00000000 8fb9ba00 802d0a0c 805cff90 9f20: 00000013 00000000 00000000 805e4d40 8f839f5c 8f839f40 802cf390 802ccff0 9f40: 00000003 00000003 804fb598 00000000 8f839f74 8f839f60 802d2554 802cf2a0 9f60: 8f838000 8057731c 8f839f84 8f839f78 80562b90 802d24d0 8f839fdc 8f839f88 9f80: 800085d4 80562b84 805af2ac 805af2ac 80562b78 00000000 00000013 00000000 9fa0: 00000000 00000000 8f839fc4 8f839fb8 80043dd0 8057706c 8057731c 8002875c 9fc0: 00000013 00000000 00000000 00000000 8f839ff4 8f839fe0 805468d4 800085a0 9fe0: 00000000 80546840 00000000 8f839ff8 8002875c 8054684c 51155555 55545555 Backtrace: [<8019e334>] (do_raw_spin_lock+0x0/0x16c) from [<804095c0>] (_raw_spin_lock+0x10/0x14) r9:8042de98 r8:00000154 r7:80760e94 r6:805cfc10 r5:8fb6a008 r4:8fb6a008 [<804095b0>] (_raw_spin_lock+0x0/0x14) from [<80222b28>] (get_device_parent+0x58/0x1c0) [<80222ad0>] (get_device_parent+0x0/0x1c0) from [<80223db8>] (device_add+0xa8/0x57c) r6:805cfc08 r5:8fb6a010 r4:8fb6a008 r3:8fb12b80 [<80223d10>] (device_add+0x0/0x57c) from [<802242a8>] (device_register+0x1c/0x20) [<8022428c>] (device_register+0x0/0x20) from [<8030132c>] (__video_register_device+0x238/0x484) r4:8fb6a000 r3:8fb12b80 [<803010f4>] (__video_register_device+0x0/0x484) from [<80562e3c>] (uvc_function_bind+0x2a0/0x31c) [<80562b9c>] (uvc_function_bind+0x0/0x31c) from [<802d20c4>] (usb_add_function+0x64/0x118) [<802d2060>] (usb_add_function+0x0/0x118) from [<80563034>] (uvc_bind_config+0x17c/0x1ec) r5:805cff20 r4:805d0058 [<80562eb8>] (uvc_bind_config+0x0/0x1ec) from [<805630c4>] (webcam_config_bind+0x20/0x28) r8:8fb12a40 r7:805cfc08 r6:00000000 r5:8f8d4880 r4:805cff20 r3:8042aa8c [<805630a4>] (webcam_config_bind+0x0/0x28) from [<802d2364>] (usb_add_config+0xa4/0x124) [<802d22c0>] (usb_add_config+0x0/0x124) from [<80562b3c>] (webcam_bind+0x78/0xb4) r6:807605e8 r5:805cfbe8 r4:8f8d4880 r3:805cfeac [<80562ac4>] (webcam_bind+0x0/0xb4) from [<802d0b40>] (composite_bind+0x134/0x308) r4:8f8d4880 r3:80562ac4 [<802d0a0c>] (composite_bind+0x0/0x308) from [<802cd078>] (dw_udc_start+0x94/0x2bc) [<802ccfe4>] (dw_udc_start+0x0/0x2bc) from [<802cf390>] (usb_gadget_probe_driver+0xfc/0x180) [<802cf294>] (usb_gadget_probe_driver+0x0/0x180) from [<802d2554>] (usb_composite_probe+0x90/0xb4) r6:00000000 r5:804fb598 r4:00000003 r3:00000003 [<802d24c4>] (usb_composite_probe+0x0/0xb4) from [<80562b90>] (webcam_init+0x18/0x24) r5:8057731c r4:8f838000 [<80562b78>] (webcam_init+0x0/0x24) from [<800085d4>] (do_one_initcall+0x40/0x184) [<80008594>] (do_one_initcall+0x0/0x184) from [<805468d4>] (kernel_init+0x94/0x134) [<80546840>] (kernel_init+0x0/0x134) from [<8002875c>] (do_exit+0x0/0x6f8) r5:80546840 r4:00000000 Code: e1a0c00d e92ddbf0 e24cb004 e24dd008 (e5902004) ---[ end trace 7ecca37f36fbdc06 ]--- Signed-off-by: Bhupesh Sharma <bhupesh.sharma@st.com> Tested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

…ssion() While running stress tests on adding and deleting ftrace instances I hit this bug: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: selinux_inode_permission+0x85/0x160 PGD 63681067 PUD 7ddbe067 PMD 0 Oops: 0000 [#1] PREEMPT CPU: 0 PID: 5634 Comm: ftrace-test-mki Not tainted 3.13.0-rc4-test-00033-gd2a6dde-dirty #20 Hardware name: /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006 task: ffff880078375800 ti: ffff88007ddb0000 task.ti: ffff88007ddb0000 RIP: 0010:[<ffffffff812d8bc5>] [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160 RSP: 0018:ffff88007ddb1c48 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000800000 RCX: ffff88006dd43840 RDX: 0000000000000001 RSI: 0000000000000081 RDI: ffff88006ee46000 RBP: ffff88007ddb1c88 R08: 0000000000000000 R09: ffff88007ddb1c54 R10: 6e6576652f6f6f66 R11: 0000000000000003 R12: 0000000000000000 R13: 0000000000000081 R14: ffff88006ee46000 R15: 0000000000000000 FS: 00007f217b5b6700(0000) GS:ffffffff81e21000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033^M CR2: 0000000000000020 CR3: 000000006a0fe000 CR4: 00000000000007f0 Call Trace: security_inode_permission+0x1c/0x30 __inode_permission+0x41/0xa0 inode_permission+0x18/0x50 link_path_walk+0x66/0x920 path_openat+0xa6/0x6c0 do_filp_open+0x43/0xa0 do_sys_open+0x146/0x240 SyS_open+0x1e/0x20 system_call_fastpath+0x16/0x1b Code: 84 a1 00 00 00 81 e3 00 20 00 00 89 d8 83 c8 02 40 f6 c6 04 0f 45 d8 40 f6 c6 08 74 71 80 cf 02 49 8b 46 38 4c 8d 4d cc 45 31 c0 <0f> b7 50 20 8b 70 1c 48 8b 41 70 89 d9 8b 78 04 e8 36 cf ff ff RIP selinux_inode_permission+0x85/0x160 CR2: 0000000000000020 Investigating, I found that the inode->i_security was NULL, and the dereference of it caused the oops. in selinux_inode_permission(): isec = inode->i_security; rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd); Note, the crash came from stressing the deletion and reading of debugfs files. I was not able to recreate this via normal files. But I'm not sure they are safe. It may just be that the race window is much harder to hit. What seems to have happened (and what I have traced), is the file is being opened at the same time the file or directory is being deleted. As the dentry and inode locks are not held during the path walk, nor is the inodes ref counts being incremented, there is nothing saving these structures from being discarded except for an rcu_read_lock(). The rcu_read_lock() protects against freeing of the inode, but it does not protect freeing of the inode_security_struct. Now if the freeing of the i_security happens with a call_rcu(), and the i_security field of the inode is not changed (it gets freed as the inode gets freed) then there will be no issue here. (Linus Torvalds suggested not setting the field to NULL such that we do not need to check if it is NULL in the permission check). Note, this is a hack, but it fixes the problem at hand. A real fix is to restructure the destroy_inode() to call all the destructor handlers from the RCU callback. But that is a major job to do, and requires a lot of work. For now, we just band-aid this bug with this fix (it works), and work on a more maintainable solution in the future. Link: http://lkml.kernel.org/r/20140109101932.0508dec7@gandalf.local.home Link: http://lkml.kernel.org/r/20140109182756.17abaaa8@gandalf.local.home Cc: stable@vger.kernel.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

…ssion() commit 3dc91d4 upstream. While running stress tests on adding and deleting ftrace instances I hit this bug: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: selinux_inode_permission+0x85/0x160 PGD 63681067 PUD 7ddbe067 PMD 0 Oops: 0000 [#1] PREEMPT CPU: 0 PID: 5634 Comm: ftrace-test-mki Not tainted 3.13.0-rc4-test-00033-gd2a6dde-dirty #20 Hardware name: /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006 task: ffff880078375800 ti: ffff88007ddb0000 task.ti: ffff88007ddb0000 RIP: 0010:[<ffffffff812d8bc5>] [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160 RSP: 0018:ffff88007ddb1c48 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000800000 RCX: ffff88006dd43840 RDX: 0000000000000001 RSI: 0000000000000081 RDI: ffff88006ee46000 RBP: ffff88007ddb1c88 R08: 0000000000000000 R09: ffff88007ddb1c54 R10: 6e6576652f6f6f66 R11: 0000000000000003 R12: 0000000000000000 R13: 0000000000000081 R14: ffff88006ee46000 R15: 0000000000000000 FS: 00007f217b5b6700(0000) GS:ffffffff81e21000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033^M CR2: 0000000000000020 CR3: 000000006a0fe000 CR4: 00000000000007f0 Call Trace: security_inode_permission+0x1c/0x30 __inode_permission+0x41/0xa0 inode_permission+0x18/0x50 link_path_walk+0x66/0x920 path_openat+0xa6/0x6c0 do_filp_open+0x43/0xa0 do_sys_open+0x146/0x240 SyS_open+0x1e/0x20 system_call_fastpath+0x16/0x1b Code: 84 a1 00 00 00 81 e3 00 20 00 00 89 d8 83 c8 02 40 f6 c6 04 0f 45 d8 40 f6 c6 08 74 71 80 cf 02 49 8b 46 38 4c 8d 4d cc 45 31 c0 <0f> b7 50 20 8b 70 1c 48 8b 41 70 89 d9 8b 78 04 e8 36 cf ff ff RIP selinux_inode_permission+0x85/0x160 CR2: 0000000000000020 Investigating, I found that the inode->i_security was NULL, and the dereference of it caused the oops. in selinux_inode_permission(): isec = inode->i_security; rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd); Note, the crash came from stressing the deletion and reading of debugfs files. I was not able to recreate this via normal files. But I'm not sure they are safe. It may just be that the race window is much harder to hit. What seems to have happened (and what I have traced), is the file is being opened at the same time the file or directory is being deleted. As the dentry and inode locks are not held during the path walk, nor is the inodes ref counts being incremented, there is nothing saving these structures from being discarded except for an rcu_read_lock(). The rcu_read_lock() protects against freeing of the inode, but it does not protect freeing of the inode_security_struct. Now if the freeing of the i_security happens with a call_rcu(), and the i_security field of the inode is not changed (it gets freed as the inode gets freed) then there will be no issue here. (Linus Torvalds suggested not setting the field to NULL such that we do not need to check if it is NULL in the permission check). Note, this is a hack, but it fixes the problem at hand. A real fix is to restructure the destroy_inode() to call all the destructor handlers from the RCU callback. But that is a major job to do, and requires a lot of work. For now, we just band-aid this bug with this fix (it works), and work on a more maintainable solution in the future. Link: http://lkml.kernel.org/r/20140109101932.0508dec7@gandalf.local.home Link: http://lkml.kernel.org/r/20140109182756.17abaaa8@gandalf.local.home Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 543cc38 upstream. When hibernating ->resume may not be called by usb core, but disconnect and probe instead, so we do not increase the counter after decreasing it in ->supend. As a result we free memory early, and get crash when unplugging usb dongle. BUG: unable to handle kernel paging request at 6b6b6b9f IP: [<c06909b0>] driver_sysfs_remove+0x10/0x30 *pdpt = 0000000034f21001 *pde = 0000000000000000 Pid: 20, comm: khubd Not tainted 3.1.0-rc1-wl+ linux-sunxi#20 LENOVO 6369CTO/6369CTO EIP: 0060:[<c06909b0>] EFLAGS: 00010202 CPU: 1 EIP is at driver_sysfs_remove+0x10/0x30 EAX: 6b6b6b6b EBX: f52bba34 ECX: 00000000 EDX: 6b6b6b6b ESI: 6b6b6b6b EDI: c0a0ea20 EBP: f61c9e68 ESP: f61c9e64 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Process khubd (pid: 20, ti=f61c8000 task=f6138270 task.ti=f61c8000) Call Trace: [<c06909ef>] __device_release_driver+0x1f/0xa0 [<c0690b20>] device_release_driver+0x20/0x40 [<c068fd64>] bus_remove_device+0x84/0xe0 [<c068e12a>] ? device_remove_attrs+0x2a/0x80 [<c068e267>] device_del+0xe7/0x170 [<c06d93d4>] usb_disconnect+0xd4/0x180 [<c06d9d61>] hub_thread+0x691/0x1600 [<c0473260>] ? wake_up_bit+0x30/0x30 [<c0442a39>] ? complete+0x49/0x60 [<c06d96d0>] ? hub_disconnect+0xd0/0xd0 [<c06d96d0>] ? hub_disconnect+0xd0/0xd0 [<c0472eb4>] kthread+0x74/0x80 [<c0472e40>] ? kthread_worker_fn+0x150/0x150 [<c0809b3e>] kernel_thread_helper+0x6/0x10 Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

The test_generic_metric() missed to release entries in the pctx. Asan reported following leak (and more): Direct leak of 128 byte(s) in 1 object(s) allocated from: #0 0x7f4c9396980e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e) #1 0x55f7e748cc14 in hashmap_grow (/home/namhyung/project/linux/tools/perf/perf+0x90cc14) #2 0x55f7e748d497 in hashmap__insert (/home/namhyung/project/linux/tools/perf/perf+0x90d497) #3 0x55f7e7341667 in hashmap__set /home/namhyung/project/linux/tools/perf/util/hashmap.h:111 #4 0x55f7e7341667 in expr__add_ref util/expr.c:120 #5 0x55f7e7292436 in prepare_metric util/stat-shadow.c:783 #6 0x55f7e729556d in test_generic_metric util/stat-shadow.c:858 #7 0x55f7e712390b in compute_single tests/parse-metric.c:128 #8 0x55f7e712390b in __compute_metric tests/parse-metric.c:180 #9 0x55f7e712446d in compute_metric tests/parse-metric.c:196 #10 0x55f7e712446d in test_dcache_l2 tests/parse-metric.c:295 #11 0x55f7e712446d in test__parse_metric tests/parse-metric.c:355 #12 0x55f7e70be09b in run_test tests/builtin-test.c:410 #13 0x55f7e70be09b in test_and_print tests/builtin-test.c:440 #14 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661 #15 0x55f7e70c101a in cmd_test tests/builtin-test.c:807 #16 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312 #17 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364 #18 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408 #19 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538 #20 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308 Fixes: 6d432c4 ("perf tools: Add test_generic_metric function") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200915031819.386559-8-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

This fix is for a failure that occurred in the DWARF unwind perf test. Stack unwinders may probe memory when looking for frames. Memory sanitizer will poison and track uninitialized memory on the stack, and on the heap if the value is copied to the heap. This can lead to false memory sanitizer failures for the use of an uninitialized value. Avoid this problem by removing the poison on the copied stack. The full msan failure with track origins looks like: ==2168==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x559ceb10755b in handle_cfi elfutils/libdwfl/frame_unwind.c:648:8 #1 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4 #2 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7 #3 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10 #4 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17 #5 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17 #6 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14 linux-sunxi#7 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10 linux-sunxi#8 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8 linux-sunxi#9 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8 linux-sunxi#10 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26 linux-sunxi#11 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0) linux-sunxi#12 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2 linux-sunxi#13 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9 linux-sunxi#14 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9 linux-sunxi#15 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8 linux-sunxi#16 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9 linux-sunxi#17 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9 linux-sunxi#18 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4 linux-sunxi#19 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9 linux-sunxi#20 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11 linux-sunxi#21 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8 linux-sunxi#22 0x559cea95fbce in run_argv tools/perf/perf.c:409:2 linux-sunxi#23 0x559cea95fbce in main tools/perf/perf.c:539:3 Uninitialized value was stored to memory at #0 0x559ceb106acf in __libdwfl_frame_reg_set elfutils/libdwfl/frame_unwind.c:77:22 #1 0x559ceb106acf in handle_cfi elfutils/libdwfl/frame_unwind.c:627:13 #2 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4 #3 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7 #4 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10 #5 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17 #6 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17 linux-sunxi#7 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14 linux-sunxi#8 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10 linux-sunxi#9 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8 linux-sunxi#10 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8 linux-sunxi#11 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26 linux-sunxi#12 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0) linux-sunxi#13 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2 linux-sunxi#14 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9 linux-sunxi#15 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9 linux-sunxi#16 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8 linux-sunxi#17 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9 linux-sunxi#18 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9 linux-sunxi#19 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4 linux-sunxi#20 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9 linux-sunxi#21 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11 linux-sunxi#22 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8 linux-sunxi#23 0x559cea95fbce in run_argv tools/perf/perf.c:409:2 linux-sunxi#24 0x559cea95fbce in main tools/perf/perf.c:539:3 Uninitialized value was stored to memory at #0 0x559ceb106a54 in handle_cfi elfutils/libdwfl/frame_unwind.c:613:9 #1 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4 #2 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7 #3 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10 #4 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17 #5 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17 #6 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14 linux-sunxi#7 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10 linux-sunxi#8 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8 linux-sunxi#9 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8 linux-sunxi#10 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26 linux-sunxi#11 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0) linux-sunxi#12 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2 linux-sunxi#13 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9 linux-sunxi#14 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9 linux-sunxi#15 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8 linux-sunxi#16 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9 linux-sunxi#17 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9 linux-sunxi#18 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4 linux-sunxi#19 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9 linux-sunxi#20 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11 linux-sunxi#21 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8 linux-sunxi#22 0x559cea95fbce in run_argv tools/perf/perf.c:409:2 linux-sunxi#23 0x559cea95fbce in main tools/perf/perf.c:539:3 Uninitialized value was stored to memory at #0 0x559ceaff8800 in memory_read tools/perf/util/unwind-libdw.c:156:10 #1 0x559ceb10f053 in expr_eval elfutils/libdwfl/frame_unwind.c:501:13 #2 0x559ceb1060cc in handle_cfi elfutils/libdwfl/frame_unwind.c:603:18 #3 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4 #4 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7 #5 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10 #6 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17 linux-sunxi#7 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17 linux-sunxi#8 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14 linux-sunxi#9 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10 linux-sunxi#10 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8 linux-sunxi#11 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8 linux-sunxi#12 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26 linux-sunxi#13 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0) linux-sunxi#14 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2 linux-sunxi#15 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9 linux-sunxi#16 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9 linux-sunxi#17 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8 linux-sunxi#18 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9 linux-sunxi#19 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9 linux-sunxi#20 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4 linux-sunxi#21 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9 linux-sunxi#22 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11 linux-sunxi#23 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8 linux-sunxi#24 0x559cea95fbce in run_argv tools/perf/perf.c:409:2 linux-sunxi#25 0x559cea95fbce in main tools/perf/perf.c:539:3 Uninitialized value was stored to memory at #0 0x559cea9027d9 in __msan_memcpy llvm/llvm-project/compiler-rt/lib/msan/msan_interceptors.cpp:1558:3 #1 0x559cea9d2185 in sample_ustack tools/perf/arch/x86/tests/dwarf-unwind.c:41:2 #2 0x559cea9d202c in test__arch_unwind_sample tools/perf/arch/x86/tests/dwarf-unwind.c:72:9 #3 0x559ceabc9cbd in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:106:6 #4 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26 #5 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0) #6 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2 linux-sunxi#7 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9 linux-sunxi#8 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9 linux-sunxi#9 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8 linux-sunxi#10 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9 linux-sunxi#11 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9 linux-sunxi#12 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4 linux-sunxi#13 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9 linux-sunxi#14 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11 linux-sunxi#15 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8 linux-sunxi#16 0x559cea95fbce in run_argv tools/perf/perf.c:409:2 linux-sunxi#17 0x559cea95fbce in main tools/perf/perf.c:539:3 Uninitialized value was created by an allocation of 'bf' in the stack frame of function 'perf_event__synthesize_mmap_events' #0 0x559ceafc5f60 in perf_event__synthesize_mmap_events tools/perf/util/synthetic-events.c:445 SUMMARY: MemorySanitizer: use-of-uninitialized-value elfutils/libdwfl/frame_unwind.c:648:8 in handle_cfi Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: clang-built-linux@googlegroups.com Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandeep Dasgupta <sdasgup@google.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20201113182053.754625-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Calling btrfs_qgroup_reserve_meta_prealloc from btrfs_delayed_inode_reserve_metadata can result in flushing delalloc while holding a transaction and delayed node locks. This is deadlock prone. In the past multiple commits: * ae5e070 ("btrfs: qgroup: don't try to wait flushing if we're already holding a transaction") * 6f23277 ("btrfs: qgroup: don't commit transaction when we already hold the handle") Tried to solve various aspects of this but this was always a whack-a-mole game. Unfortunately those 2 fixes don't solve a deadlock scenario involving btrfs_delayed_node::mutex. Namely, one thread can call btrfs_dirty_inode as a result of reading a file and modifying its atime: PID: 6963 TASK: ffff8c7f3f94c000 CPU: 2 COMMAND: "test" #0 __schedule at ffffffffa529e07d #1 schedule at ffffffffa529e4ff #2 schedule_timeout at ffffffffa52a1bdd #3 wait_for_completion at ffffffffa529eeea <-- sleeps with delayed node mutex held #4 start_delalloc_inodes at ffffffffc0380db5 #5 btrfs_start_delalloc_snapshot at ffffffffc0393836 #6 try_flush_qgroup at ffffffffc03f04b2 linux-sunxi#7 __btrfs_qgroup_reserve_meta at ffffffffc03f5bb6 <-- tries to reserve space and starts delalloc inodes. linux-sunxi#8 btrfs_delayed_update_inode at ffffffffc03e31aa <-- acquires delayed node mutex linux-sunxi#9 btrfs_update_inode at ffffffffc0385ba8 linux-sunxi#10 btrfs_dirty_inode at ffffffffc038627b <-- TRANSACTIION OPENED linux-sunxi#11 touch_atime at ffffffffa4cf0000 linux-sunxi#12 generic_file_read_iter at ffffffffa4c1f123 linux-sunxi#13 new_sync_read at ffffffffa4ccdc8a linux-sunxi#14 vfs_read at ffffffffa4cd0849 linux-sunxi#15 ksys_read at ffffffffa4cd0bd1 linux-sunxi#16 do_syscall_64 at ffffffffa4a052eb linux-sunxi#17 entry_SYSCALL_64_after_hwframe at ffffffffa540008c This will cause an asynchronous work to flush the delalloc inodes to happen which can try to acquire the same delayed_node mutex: PID: 455 TASK: ffff8c8085fa4000 CPU: 5 COMMAND: "kworker/u16:30" #0 __schedule at ffffffffa529e07d #1 schedule at ffffffffa529e4ff #2 schedule_preempt_disabled at ffffffffa529e80a #3 __mutex_lock at ffffffffa529fdcb <-- goes to sleep, never wakes up. #4 btrfs_delayed_update_inode at ffffffffc03e3143 <-- tries to acquire the mutex #5 btrfs_update_inode at ffffffffc0385ba8 <-- this is the same inode that pid 6963 is holding #6 cow_file_range_inline.constprop.78 at ffffffffc0386be7 linux-sunxi#7 cow_file_range at ffffffffc03879c1 linux-sunxi#8 btrfs_run_delalloc_range at ffffffffc038894c linux-sunxi#9 writepage_delalloc at ffffffffc03a3c8f linux-sunxi#10 __extent_writepage at ffffffffc03a4c01 linux-sunxi#11 extent_write_cache_pages at ffffffffc03a500b linux-sunxi#12 extent_writepages at ffffffffc03a6de2 linux-sunxi#13 do_writepages at ffffffffa4c277eb linux-sunxi#14 __filemap_fdatawrite_range at ffffffffa4c1e5bb linux-sunxi#15 btrfs_run_delalloc_work at ffffffffc0380987 <-- starts running delayed nodes linux-sunxi#16 normal_work_helper at ffffffffc03b706c linux-sunxi#17 process_one_work at ffffffffa4aba4e4 linux-sunxi#18 worker_thread at ffffffffa4aba6fd linux-sunxi#19 kthread at ffffffffa4ac0a3d linux-sunxi#20 ret_from_fork at ffffffffa54001ff To fully address those cases the complete fix is to never issue any flushing while holding the transaction or the delayed node lock. This patch achieves it by calling qgroup_reserve_meta directly which will either succeed without flushing or will fail and return -EDQUOT. In the latter case that return value is going to be propagated to btrfs_dirty_inode which will fallback to start a new transaction. That's fine as the majority of time we expect the inode will have BTRFS_DELAYED_NODE_INODE_DIRTY flag set which will result in directly copying the in-memory state. Fixes: c53e965 ("btrfs: qgroup: try to flush qgroup space when we get -EDQUOT") CC: stable@vger.kernel.org # 5.10+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>

commit 4d14c5c upstream Calling btrfs_qgroup_reserve_meta_prealloc from btrfs_delayed_inode_reserve_metadata can result in flushing delalloc while holding a transaction and delayed node locks. This is deadlock prone. In the past multiple commits: * ae5e070 ("btrfs: qgroup: don't try to wait flushing if we're already holding a transaction") * 6f23277 ("btrfs: qgroup: don't commit transaction when we already hold the handle") Tried to solve various aspects of this but this was always a whack-a-mole game. Unfortunately those 2 fixes don't solve a deadlock scenario involving btrfs_delayed_node::mutex. Namely, one thread can call btrfs_dirty_inode as a result of reading a file and modifying its atime: PID: 6963 TASK: ffff8c7f3f94c000 CPU: 2 COMMAND: "test" #0 __schedule at ffffffffa529e07d jwrdegoede#1 schedule at ffffffffa529e4ff jwrdegoede#2 schedule_timeout at ffffffffa52a1bdd jwrdegoede#3 wait_for_completion at ffffffffa529eeea <-- sleeps with delayed node mutex held jwrdegoede#4 start_delalloc_inodes at ffffffffc0380db5 jwrdegoede#5 btrfs_start_delalloc_snapshot at ffffffffc0393836 jwrdegoede#6 try_flush_qgroup at ffffffffc03f04b2 linux-sunxi#7 __btrfs_qgroup_reserve_meta at ffffffffc03f5bb6 <-- tries to reserve space and starts delalloc inodes. linux-sunxi#8 btrfs_delayed_update_inode at ffffffffc03e31aa <-- acquires delayed node mutex linux-sunxi#9 btrfs_update_inode at ffffffffc0385ba8 linux-sunxi#10 btrfs_dirty_inode at ffffffffc038627b <-- TRANSACTIION OPENED linux-sunxi#11 touch_atime at ffffffffa4cf0000 linux-sunxi#12 generic_file_read_iter at ffffffffa4c1f123 linux-sunxi#13 new_sync_read at ffffffffa4ccdc8a linux-sunxi#14 vfs_read at ffffffffa4cd0849 linux-sunxi#15 ksys_read at ffffffffa4cd0bd1 linux-sunxi#16 do_syscall_64 at ffffffffa4a052eb linux-sunxi#17 entry_SYSCALL_64_after_hwframe at ffffffffa540008c This will cause an asynchronous work to flush the delalloc inodes to happen which can try to acquire the same delayed_node mutex: PID: 455 TASK: ffff8c8085fa4000 CPU: 5 COMMAND: "kworker/u16:30" #0 __schedule at ffffffffa529e07d jwrdegoede#1 schedule at ffffffffa529e4ff jwrdegoede#2 schedule_preempt_disabled at ffffffffa529e80a jwrdegoede#3 __mutex_lock at ffffffffa529fdcb <-- goes to sleep, never wakes up. jwrdegoede#4 btrfs_delayed_update_inode at ffffffffc03e3143 <-- tries to acquire the mutex jwrdegoede#5 btrfs_update_inode at ffffffffc0385ba8 <-- this is the same inode that pid 6963 is holding jwrdegoede#6 cow_file_range_inline.constprop.78 at ffffffffc0386be7 linux-sunxi#7 cow_file_range at ffffffffc03879c1 linux-sunxi#8 btrfs_run_delalloc_range at ffffffffc038894c linux-sunxi#9 writepage_delalloc at ffffffffc03a3c8f linux-sunxi#10 __extent_writepage at ffffffffc03a4c01 linux-sunxi#11 extent_write_cache_pages at ffffffffc03a500b linux-sunxi#12 extent_writepages at ffffffffc03a6de2 linux-sunxi#13 do_writepages at ffffffffa4c277eb linux-sunxi#14 __filemap_fdatawrite_range at ffffffffa4c1e5bb linux-sunxi#15 btrfs_run_delalloc_work at ffffffffc0380987 <-- starts running delayed nodes linux-sunxi#16 normal_work_helper at ffffffffc03b706c linux-sunxi#17 process_one_work at ffffffffa4aba4e4 linux-sunxi#18 worker_thread at ffffffffa4aba6fd linux-sunxi#19 kthread at ffffffffa4ac0a3d linux-sunxi#20 ret_from_fork at ffffffffa54001ff To fully address those cases the complete fix is to never issue any flushing while holding the transaction or the delayed node lock. This patch achieves it by calling qgroup_reserve_meta directly which will either succeed without flushing or will fail and return -EDQUOT. In the latter case that return value is going to be propagated to btrfs_dirty_inode which will fallback to start a new transaction. That's fine as the majority of time we expect the inode will have BTRFS_DELAYED_NODE_INODE_DIRTY flag set which will result in directly copying the in-memory state. Fixes: c53e965 ("btrfs: qgroup: try to flush qgroup space when we get -EDQUOT") CC: stable@vger.kernel.org # 5.10+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> [sudip: adjust context] Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 4d14c5c upstream Calling btrfs_qgroup_reserve_meta_prealloc from btrfs_delayed_inode_reserve_metadata can result in flushing delalloc while holding a transaction and delayed node locks. This is deadlock prone. In the past multiple commits: * ae5e070 ("btrfs: qgroup: don't try to wait flushing if we're already holding a transaction") * 6f23277 ("btrfs: qgroup: don't commit transaction when we already hold the handle") Tried to solve various aspects of this but this was always a whack-a-mole game. Unfortunately those 2 fixes don't solve a deadlock scenario involving btrfs_delayed_node::mutex. Namely, one thread can call btrfs_dirty_inode as a result of reading a file and modifying its atime: PID: 6963 TASK: ffff8c7f3f94c000 CPU: 2 COMMAND: "test" #0 __schedule at ffffffffa529e07d jwrdegoede#1 schedule at ffffffffa529e4ff jwrdegoede#2 schedule_timeout at ffffffffa52a1bdd jwrdegoede#3 wait_for_completion at ffffffffa529eeea <-- sleeps with delayed node mutex held jwrdegoede#4 start_delalloc_inodes at ffffffffc0380db5 jwrdegoede#5 btrfs_start_delalloc_snapshot at ffffffffc0393836 jwrdegoede#6 try_flush_qgroup at ffffffffc03f04b2 linux-sunxi#7 __btrfs_qgroup_reserve_meta at ffffffffc03f5bb6 <-- tries to reserve space and starts delalloc inodes. linux-sunxi#8 btrfs_delayed_update_inode at ffffffffc03e31aa <-- acquires delayed node mutex linux-sunxi#9 btrfs_update_inode at ffffffffc0385ba8 linux-sunxi#10 btrfs_dirty_inode at ffffffffc038627b <-- TRANSACTIION OPENED linux-sunxi#11 touch_atime at ffffffffa4cf0000 linux-sunxi#12 generic_file_read_iter at ffffffffa4c1f123 linux-sunxi#13 new_sync_read at ffffffffa4ccdc8a linux-sunxi#14 vfs_read at ffffffffa4cd0849 linux-sunxi#15 ksys_read at ffffffffa4cd0bd1 linux-sunxi#16 do_syscall_64 at ffffffffa4a052eb linux-sunxi#17 entry_SYSCALL_64_after_hwframe at ffffffffa540008c This will cause an asynchronous work to flush the delalloc inodes to happen which can try to acquire the same delayed_node mutex: PID: 455 TASK: ffff8c8085fa4000 CPU: 5 COMMAND: "kworker/u16:30" #0 __schedule at ffffffffa529e07d jwrdegoede#1 schedule at ffffffffa529e4ff jwrdegoede#2 schedule_preempt_disabled at ffffffffa529e80a jwrdegoede#3 __mutex_lock at ffffffffa529fdcb <-- goes to sleep, never wakes up. jwrdegoede#4 btrfs_delayed_update_inode at ffffffffc03e3143 <-- tries to acquire the mutex jwrdegoede#5 btrfs_update_inode at ffffffffc0385ba8 <-- this is the same inode that pid 6963 is holding jwrdegoede#6 cow_file_range_inline.constprop.78 at ffffffffc0386be7 linux-sunxi#7 cow_file_range at ffffffffc03879c1 linux-sunxi#8 btrfs_run_delalloc_range at ffffffffc038894c linux-sunxi#9 writepage_delalloc at ffffffffc03a3c8f linux-sunxi#10 __extent_writepage at ffffffffc03a4c01 linux-sunxi#11 extent_write_cache_pages at ffffffffc03a500b linux-sunxi#12 extent_writepages at ffffffffc03a6de2 linux-sunxi#13 do_writepages at ffffffffa4c277eb linux-sunxi#14 __filemap_fdatawrite_range at ffffffffa4c1e5bb linux-sunxi#15 btrfs_run_delalloc_work at ffffffffc0380987 <-- starts running delayed nodes linux-sunxi#16 normal_work_helper at ffffffffc03b706c linux-sunxi#17 process_one_work at ffffffffa4aba4e4 linux-sunxi#18 worker_thread at ffffffffa4aba6fd linux-sunxi#19 kthread at ffffffffa4ac0a3d linux-sunxi#20 ret_from_fork at ffffffffa54001ff To fully address those cases the complete fix is to never issue any flushing while holding the transaction or the delayed node lock. This patch achieves it by calling qgroup_reserve_meta directly which will either succeed without flushing or will fail and return -EDQUOT. In the latter case that return value is going to be propagated to btrfs_dirty_inode which will fallback to start a new transaction. That's fine as the majority of time we expect the inode will have BTRFS_DELAYED_NODE_INODE_DIRTY flag set which will result in directly copying the in-memory state. Fixes: c53e965 ("btrfs: qgroup: try to flush qgroup space when we get -EDQUOT") CC: stable@vger.kernel.org # 5.10+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

The evlist and the cpu/thread maps should be released together. Otherwise following error was reported by Asan. Note that this test still has memory leaks in DSOs so it still fails even after this change. I'll take a look at that too. # perf test -v 26 26: Object code reading : --- start --- test child forked, pid 154184 Looking at the vmlinux_path (8 entries long) symsrc__init: build id mismatch for vmlinux. symsrc__init: cannot get elf header. Using /proc/kcore for kernel data Using /proc/kallsyms for symbols Parsing event 'cycles' mmap size 528384B ... ================================================================= ==154184==ERROR: LeakSanitizer: detected memory leaks Direct leak of 439 byte(s) in 1 object(s) allocated from: #0 0x7fcb66e77037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154 #1 0x55ad9b7e821e in dso__new_id util/dso.c:1256 #2 0x55ad9b8cfd4a in __machine__addnew_vdso util/vdso.c:132 #3 0x55ad9b8cfd4a in machine__findnew_vdso util/vdso.c:347 #4 0x55ad9b845b7e in map__new util/map.c:176 #5 0x55ad9b8415a2 in machine__process_mmap2_event util/machine.c:1787 #6 0x55ad9b8fab16 in perf_tool__process_synth_event util/synthetic-events.c:64 linux-sunxi#7 0x55ad9b8fab16 in perf_event__synthesize_mmap_events util/synthetic-events.c:499 linux-sunxi#8 0x55ad9b8fbfdf in __event__synthesize_thread util/synthetic-events.c:741 linux-sunxi#9 0x55ad9b8ff3e3 in perf_event__synthesize_thread_map util/synthetic-events.c:833 linux-sunxi#10 0x55ad9b738585 in do_test_code_reading tests/code-reading.c:608 linux-sunxi#11 0x55ad9b73b25d in test__code_reading tests/code-reading.c:722 linux-sunxi#12 0x55ad9b6f28fb in run_test tests/builtin-test.c:428 linux-sunxi#13 0x55ad9b6f28fb in test_and_print tests/builtin-test.c:458 linux-sunxi#14 0x55ad9b6f4a53 in __cmd_test tests/builtin-test.c:679 linux-sunxi#15 0x55ad9b6f4a53 in cmd_test tests/builtin-test.c:825 linux-sunxi#16 0x55ad9b760cc4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313 linux-sunxi#17 0x55ad9b5eaa88 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365 linux-sunxi#18 0x55ad9b5eaa88 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409 linux-sunxi#19 0x55ad9b5eaa88 in main /home/namhyung/project/linux/tools/perf/perf.c:539 linux-sunxi#20 0x7fcb669acd09 in __libc_start_main ../csu/libc-start.c:308 ... SUMMARY: AddressSanitizer: 471 byte(s) leaked in 2 allocation(s). test child finished with 1 ---- end ---- Object code reading: FAILED! Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20210301140409.184570-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

I got several memory leak reports from Asan with a simple command. It was because VDSO is not released due to the refcount. Like in __dsos_addnew_id(), it should put the refcount after adding to the list. $ perf record true [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.030 MB perf.data (10 samples) ] ================================================================= ==692599==ERROR: LeakSanitizer: detected memory leaks Direct leak of 439 byte(s) in 1 object(s) allocated from: #0 0x7fea52341037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154 #1 0x559bce4aa8ee in dso__new_id util/dso.c:1256 #2 0x559bce59245a in __machine__addnew_vdso util/vdso.c:132 #3 0x559bce59245a in machine__findnew_vdso util/vdso.c:347 #4 0x559bce50826c in map__new util/map.c:175 #5 0x559bce503c92 in machine__process_mmap2_event util/machine.c:1787 #6 0x559bce512f6b in machines__deliver_event util/session.c:1481 linux-sunxi#7 0x559bce515107 in perf_session__deliver_event util/session.c:1551 linux-sunxi#8 0x559bce51d4d2 in do_flush util/ordered-events.c:244 linux-sunxi#9 0x559bce51d4d2 in __ordered_events__flush util/ordered-events.c:323 linux-sunxi#10 0x559bce519bea in __perf_session__process_events util/session.c:2268 linux-sunxi#11 0x559bce519bea in perf_session__process_events util/session.c:2297 linux-sunxi#12 0x559bce2e7a52 in process_buildids /home/namhyung/project/linux/tools/perf/builtin-record.c:1017 linux-sunxi#13 0x559bce2e7a52 in record__finish_output /home/namhyung/project/linux/tools/perf/builtin-record.c:1234 linux-sunxi#14 0x559bce2ed4f6 in __cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2026 linux-sunxi#15 0x559bce2ed4f6 in cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2858 linux-sunxi#16 0x559bce422db4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313 linux-sunxi#17 0x559bce2acac8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365 linux-sunxi#18 0x559bce2acac8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409 linux-sunxi#19 0x559bce2acac8 in main /home/namhyung/project/linux/tools/perf/perf.c:539 linux-sunxi#20 0x7fea51e76d09 in __libc_start_main ../csu/libc-start.c:308 Indirect leak of 32 byte(s) in 1 object(s) allocated from: #0 0x7fea52341037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154 #1 0x559bce520907 in nsinfo__copy util/namespaces.c:169 #2 0x559bce50821b in map__new util/map.c:168 #3 0x559bce503c92 in machine__process_mmap2_event util/machine.c:1787 #4 0x559bce512f6b in machines__deliver_event util/session.c:1481 #5 0x559bce515107 in perf_session__deliver_event util/session.c:1551 #6 0x559bce51d4d2 in do_flush util/ordered-events.c:244 linux-sunxi#7 0x559bce51d4d2 in __ordered_events__flush util/ordered-events.c:323 linux-sunxi#8 0x559bce519bea in __perf_session__process_events util/session.c:2268 linux-sunxi#9 0x559bce519bea in perf_session__process_events util/session.c:2297 linux-sunxi#10 0x559bce2e7a52 in process_buildids /home/namhyung/project/linux/tools/perf/builtin-record.c:1017 linux-sunxi#11 0x559bce2e7a52 in record__finish_output /home/namhyung/project/linux/tools/perf/builtin-record.c:1234 linux-sunxi#12 0x559bce2ed4f6 in __cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2026 linux-sunxi#13 0x559bce2ed4f6 in cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2858 linux-sunxi#14 0x559bce422db4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313 linux-sunxi#15 0x559bce2acac8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365 linux-sunxi#16 0x559bce2acac8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409 linux-sunxi#17 0x559bce2acac8 in main /home/namhyung/project/linux/tools/perf/perf.c:539 linux-sunxi#18 0x7fea51e76d09 in __libc_start_main ../csu/libc-start.c:308 SUMMARY: AddressSanitizer: 471 byte(s) leaked in 2 allocation(s). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210315045641.700430-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

[ Upstream commit 0ea9fd0 ] Rfkill block and unblock Intel USB Bluetooth [8087:0026] may make it stops working: [ 509.691509] Bluetooth: hci0: HCI reset during shutdown failed [ 514.897584] Bluetooth: hci0: MSFT filter_enable is already on [ 530.044751] usb 3-10: reset full-speed USB device number 5 using xhci_hcd [ 545.660350] usb 3-10: device descriptor read/64, error -110 [ 561.283530] usb 3-10: device descriptor read/64, error -110 [ 561.519682] usb 3-10: reset full-speed USB device number 5 using xhci_hcd [ 566.686650] Bluetooth: hci0: unexpected event for opcode 0x0500 [ 568.752452] Bluetooth: hci0: urb 0000000096cd309b failed to resubmit (113) [ 578.797955] Bluetooth: hci0: Failed to read MSFT supported features (-110) [ 586.286565] Bluetooth: hci0: urb 00000000c522f633 failed to resubmit (113) [ 596.215302] Bluetooth: hci0: Failed to read MSFT supported features (-110) Or kernel panics because other workqueues already freed skb: [ 2048.663763] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 2048.663775] #PF: supervisor read access in kernel mode [ 2048.663779] #PF: error_code(0x0000) - not-present page [ 2048.663782] PGD 0 P4D 0 [ 2048.663787] Oops: 0000 [jwrdegoede#1] SMP NOPTI [ 2048.663793] CPU: 3 PID: 4491 Comm: rfkill Tainted: G W 5.13.0-rc1-next-20210510+ linux-sunxi#20 [ 2048.663799] Hardware name: HP HP EliteBook 850 G8 Notebook PC/8846, BIOS T76 Ver. 01.01.04 12/02/2020 [ 2048.663801] RIP: 0010:__skb_ext_put+0x6/0x50 [ 2048.663814] Code: 8b 1b 48 85 db 75 db 5b 41 5c 5d c3 be 01 00 00 00 e8 de 13 c0 ff eb e7 be 02 00 00 00 e8 d2 13 c0 ff eb db 0f 1f 44 00 00 55 <8b> 07 48 89 e5 83 f8 01 74 14 b8 ff ff ff ff f0 0f c1 07 83 f8 01 [ 2048.663819] RSP: 0018:ffffc1d105b6fd80 EFLAGS: 00010286 [ 2048.663824] RAX: 0000000000000000 RBX: ffff9d9ac5649000 RCX: 0000000000000000 [ 2048.663827] RDX: ffffffffc0d1daf6 RSI: 0000000000000206 RDI: 0000000000000000 [ 2048.663830] RBP: ffffc1d105b6fd98 R08: 0000000000000001 R09: ffff9d9ace8ceac0 [ 2048.663834] R10: ffff9d9ace8ceac0 R11: 0000000000000001 R12: ffff9d9ac5649000 [ 2048.663838] R13: 0000000000000000 R14: 00007ffe0354d650 R15: 0000000000000000 [ 2048.663843] FS: 00007fe02ab19740(0000) GS:ffff9d9e5f8c0000(0000) knlGS:0000000000000000 [ 2048.663849] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2048.663853] CR2: 0000000000000000 CR3: 0000000111a52004 CR4: 0000000000770ee0 [ 2048.663856] PKRU: 55555554 [ 2048.663859] Call Trace: [ 2048.663865] ? skb_release_head_state+0x5e/0x80 [ 2048.663873] kfree_skb+0x2f/0xb0 [ 2048.663881] btusb_shutdown_intel_new+0x36/0x60 [btusb] [ 2048.663905] hci_dev_do_close+0x48c/0x5e0 [bluetooth] [ 2048.663954] ? __cond_resched+0x1a/0x50 [ 2048.663962] hci_rfkill_set_block+0x56/0xa0 [bluetooth] [ 2048.664007] rfkill_set_block+0x98/0x170 [ 2048.664016] rfkill_fop_write+0x136/0x1e0 [ 2048.664022] vfs_write+0xc7/0x260 [ 2048.664030] ksys_write+0xb1/0xe0 [ 2048.664035] ? exit_to_user_mode_prepare+0x37/0x1c0 [ 2048.664042] __x64_sys_write+0x1a/0x20 [ 2048.664048] do_syscall_64+0x40/0xb0 [ 2048.664055] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 2048.664060] RIP: 0033:0x7fe02ac23c27 [ 2048.664066] Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24 [ 2048.664070] RSP: 002b:00007ffe0354d638 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 2048.664075] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fe02ac23c27 [ 2048.664078] RDX: 0000000000000008 RSI: 00007ffe0354d650 RDI: 0000000000000003 [ 2048.664081] RBP: 0000000000000000 R08: 0000559b05998440 R09: 0000559b05998440 [ 2048.664084] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003 [ 2048.664086] R13: 0000000000000000 R14: ffffffff00000000 R15: 00000000ffffffff So move the shutdown callback to a place where workqueues are either flushed or cancelled to resolve the issue. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>

When the kernel is built with CONFIG_KASAN_HW_TAGS and the CPU supports MTE, memory accesses are checked at 16-byte granularity, and out-of-bounds accesses can result in tag check faults. Our current implementation of strlen() makes unaligned 16-byte accesses (within a naturally aligned 4096-byte window), and can trigger tag check faults. This can be seen at boot time, e.g. | BUG: KASAN: invalid-access in __pi_strlen+0x14/0x150 | Read at addr f4ff0000c0028300 by task swapper/0/0 | Pointer tag: [f4], memory tag: [fe] | | CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.13.0-09550-g03c2813535a2-dirty linux-sunxi#20 | Hardware name: linux,dummy-virt (DT) | Call trace: | dump_backtrace+0x0/0x1b0 | show_stack+0x1c/0x30 | dump_stack_lvl+0x68/0x84 | print_address_description+0x7c/0x2b4 | kasan_report+0x138/0x38c | __do_kernel_fault+0x190/0x1c4 | do_tag_check_fault+0x78/0x90 | do_mem_abort+0x44/0xb4 | el1_abort+0x40/0x60 | el1h_64_sync_handler+0xb0/0xd0 | el1h_64_sync+0x78/0x7c | __pi_strlen+0x14/0x150 | __register_sysctl_table+0x7c4/0x890 | register_leaf_sysctl_tables+0x1a4/0x210 | register_leaf_sysctl_tables+0xc8/0x210 | __register_sysctl_paths+0x22c/0x290 | register_sysctl_table+0x2c/0x40 | sysctl_init+0x20/0x30 | proc_sys_init+0x3c/0x48 | proc_root_init+0x80/0x9c | start_kernel+0x640/0x69c | __primary_switched+0xc0/0xc8 To fix this, we can reduce the (strlen-internal) MIN_PAGE_SIZE to 16 bytes when CONFIG_KASAN_HW_TAGS is selected. This will cause strlen() to align the base pointer downwards to a 16-byte boundary, and to discard the additional prefix bytes without counting them. All subsequent accesses will be 16-byte aligned 16-byte LDPs. While the comments say the body of the loop will access 32 bytes, this is performed as two 16-byte acceses, with the second made only if the first did not encounter a NUL byte, so the body of the loop will not over-read across a 16-byte boundary. No other string routines are affected. The other str*() routines will not make any access which straddles a 16-byte boundary, and the mem*() routines will only make acceses which straddle a 16-byte boundary when which is entirely within the bounds of the relevant base and size arguments. Fixes: 325a1de ("arm64: Import updated version of Cortex Strings' strlen") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Potapenko <glider@google.com Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20210712090043.20847-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>

commit 41d5854 upstream. I got several memory leak reports from Asan with a simple command. It was because VDSO is not released due to the refcount. Like in __dsos_addnew_id(), it should put the refcount after adding to the list. $ perf record true [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.030 MB perf.data (10 samples) ] ================================================================= ==692599==ERROR: LeakSanitizer: detected memory leaks Direct leak of 439 byte(s) in 1 object(s) allocated from: #0 0x7fea52341037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154 jwrdegoede#1 0x559bce4aa8ee in dso__new_id util/dso.c:1256 jwrdegoede#2 0x559bce59245a in __machine__addnew_vdso util/vdso.c:132 jwrdegoede#3 0x559bce59245a in machine__findnew_vdso util/vdso.c:347 jwrdegoede#4 0x559bce50826c in map__new util/map.c:175 jwrdegoede#5 0x559bce503c92 in machine__process_mmap2_event util/machine.c:1787 jwrdegoede#6 0x559bce512f6b in machines__deliver_event util/session.c:1481 linux-sunxi#7 0x559bce515107 in perf_session__deliver_event util/session.c:1551 linux-sunxi#8 0x559bce51d4d2 in do_flush util/ordered-events.c:244 linux-sunxi#9 0x559bce51d4d2 in __ordered_events__flush util/ordered-events.c:323 linux-sunxi#10 0x559bce519bea in __perf_session__process_events util/session.c:2268 linux-sunxi#11 0x559bce519bea in perf_session__process_events util/session.c:2297 linux-sunxi#12 0x559bce2e7a52 in process_buildids /home/namhyung/project/linux/tools/perf/builtin-record.c:1017 linux-sunxi#13 0x559bce2e7a52 in record__finish_output /home/namhyung/project/linux/tools/perf/builtin-record.c:1234 linux-sunxi#14 0x559bce2ed4f6 in __cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2026 linux-sunxi#15 0x559bce2ed4f6 in cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2858 linux-sunxi#16 0x559bce422db4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313 linux-sunxi#17 0x559bce2acac8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365 linux-sunxi#18 0x559bce2acac8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409 linux-sunxi#19 0x559bce2acac8 in main /home/namhyung/project/linux/tools/perf/perf.c:539 linux-sunxi#20 0x7fea51e76d09 in __libc_start_main ../csu/libc-start.c:308 Indirect leak of 32 byte(s) in 1 object(s) allocated from: #0 0x7fea52341037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154 jwrdegoede#1 0x559bce520907 in nsinfo__copy util/namespaces.c:169 jwrdegoede#2 0x559bce50821b in map__new util/map.c:168 jwrdegoede#3 0x559bce503c92 in machine__process_mmap2_event util/machine.c:1787 jwrdegoede#4 0x559bce512f6b in machines__deliver_event util/session.c:1481 jwrdegoede#5 0x559bce515107 in perf_session__deliver_event util/session.c:1551 jwrdegoede#6 0x559bce51d4d2 in do_flush util/ordered-events.c:244 linux-sunxi#7 0x559bce51d4d2 in __ordered_events__flush util/ordered-events.c:323 linux-sunxi#8 0x559bce519bea in __perf_session__process_events util/session.c:2268 linux-sunxi#9 0x559bce519bea in perf_session__process_events util/session.c:2297 linux-sunxi#10 0x559bce2e7a52 in process_buildids /home/namhyung/project/linux/tools/perf/builtin-record.c:1017 linux-sunxi#11 0x559bce2e7a52 in record__finish_output /home/namhyung/project/linux/tools/perf/builtin-record.c:1234 linux-sunxi#12 0x559bce2ed4f6 in __cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2026 linux-sunxi#13 0x559bce2ed4f6 in cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2858 linux-sunxi#14 0x559bce422db4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313 linux-sunxi#15 0x559bce2acac8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365 linux-sunxi#16 0x559bce2acac8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409 linux-sunxi#17 0x559bce2acac8 in main /home/namhyung/project/linux/tools/perf/perf.c:539 linux-sunxi#18 0x7fea51e76d09 in __libc_start_main ../csu/libc-start.c:308 SUMMARY: AddressSanitizer: 471 byte(s) leaked in 2 allocation(s). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210315045641.700430-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Attempting to defragment a Btrfs file containing a transparent huge page immediately deadlocks with the following stack trace: #0 context_switch (kernel/sched/core.c:4940:2) #1 __schedule (kernel/sched/core.c:6287:8) #2 schedule (kernel/sched/core.c:6366:3) #3 io_schedule (kernel/sched/core.c:8389:2) #4 wait_on_page_bit_common (mm/filemap.c:1356:4) #5 __lock_page (mm/filemap.c:1648:2) #6 lock_page (./include/linux/pagemap.h:625:3) linux-sunxi#7 pagecache_get_page (mm/filemap.c:1910:4) linux-sunxi#8 find_or_create_page (./include/linux/pagemap.h:420:9) linux-sunxi#9 defrag_prepare_one_page (fs/btrfs/ioctl.c:1068:9) linux-sunxi#10 defrag_one_range (fs/btrfs/ioctl.c:1326:14) linux-sunxi#11 defrag_one_cluster (fs/btrfs/ioctl.c:1421:9) linux-sunxi#12 btrfs_defrag_file (fs/btrfs/ioctl.c:1523:9) linux-sunxi#13 btrfs_ioctl_defrag (fs/btrfs/ioctl.c:3117:9) linux-sunxi#14 btrfs_ioctl (fs/btrfs/ioctl.c:4872:10) linux-sunxi#15 vfs_ioctl (fs/ioctl.c:51:10) linux-sunxi#16 __do_sys_ioctl (fs/ioctl.c:874:11) linux-sunxi#17 __se_sys_ioctl (fs/ioctl.c:860:1) linux-sunxi#18 __x64_sys_ioctl (fs/ioctl.c:860:1) linux-sunxi#19 do_syscall_x64 (arch/x86/entry/common.c:50:14) linux-sunxi#20 do_syscall_64 (arch/x86/entry/common.c:80:7) linux-sunxi#21 entry_SYSCALL_64+0x7c/0x15b (arch/x86/entry/entry_64.S:113) A huge page is represented by a compound page, which consists of a struct page for each PAGE_SIZE page within the huge page. The first struct page is the "head page", and the remaining are "tail pages". Defragmentation attempts to lock each page in the range. However, lock_page() on a tail page actually locks the corresponding head page. So, if defragmentation tries to lock more than one struct page in a compound page, it tries to lock the same head page twice and deadlocks with itself. Ideally, we should be able to defragment transparent huge pages. However, THP for filesystems is currently read-only, so a lot of code is not ready to use huge pages for I/O. For now, let's just return ETXTBUSY. This can be reproduced with the following on a kernel with CONFIG_READ_ONLY_THP_FOR_FS=y: $ cat create_thp_file.c #include <fcntl.h> #include <stdbool.h> #include <stdio.h> #include <stdint.h> #include <stdlib.h> #include <unistd.h> #include <sys/mman.h> static const char zeroes[1024 * 1024]; static const size_t FILE_SIZE = 2 * 1024 * 1024; int main(int argc, char **argv) { if (argc != 2) { fprintf(stderr, "usage: %s PATH\n", argv[0]); return EXIT_FAILURE; } int fd = creat(argv[1], 0777); if (fd == -1) { perror("creat"); return EXIT_FAILURE; } size_t written = 0; while (written < FILE_SIZE) { ssize_t ret = write(fd, zeroes, sizeof(zeroes) < FILE_SIZE - written ? sizeof(zeroes) : FILE_SIZE - written); if (ret < 0) { perror("write"); return EXIT_FAILURE; } written += ret; } close(fd); fd = open(argv[1], O_RDONLY); if (fd == -1) { perror("open"); return EXIT_FAILURE; } /* * Reserve some address space so that we can align the file mapping to * the huge page size. */ void *placeholder_map = mmap(NULL, FILE_SIZE * 2, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); if (placeholder_map == MAP_FAILED) { perror("mmap (placeholder)"); return EXIT_FAILURE; } void *aligned_address = (void *)(((uintptr_t)placeholder_map + FILE_SIZE - 1) & ~(FILE_SIZE - 1)); void *map = mmap(aligned_address, FILE_SIZE, PROT_READ | PROT_EXEC, MAP_SHARED | MAP_FIXED, fd, 0); if (map == MAP_FAILED) { perror("mmap"); return EXIT_FAILURE; } if (madvise(map, FILE_SIZE, MADV_HUGEPAGE) < 0) { perror("madvise"); return EXIT_FAILURE; } char *line = NULL; size_t line_capacity = 0; FILE *smaps_file = fopen("/proc/self/smaps", "r"); if (!smaps_file) { perror("fopen"); return EXIT_FAILURE; } for (;;) { for (size_t off = 0; off < FILE_SIZE; off += 4096) ((volatile char *)map)[off]; ssize_t ret; bool this_mapping = false; while ((ret = getline(&line, &line_capacity, smaps_file)) > 0) { unsigned long start, end, huge; if (sscanf(line, "%lx-%lx", &start, &end) == 2) { this_mapping = (start <= (uintptr_t)map && (uintptr_t)map < end); } else if (this_mapping && sscanf(line, "FilePmdMapped: %ld", &huge) == 1 && huge > 0) { return EXIT_SUCCESS; } } sleep(6); rewind(smaps_file); fflush(smaps_file); } } $ ./create_thp_file huge $ btrfs fi defrag -czstd ./huge Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>

[ Upstream commit 39fbef4 ] The following kernel crash can be triggered: [ 89.266592] ------------[ cut here ]------------ [ 89.267427] kernel BUG at fs/buffer.c:3020! [ 89.268264] invalid opcode: 0000 [jwrdegoede#1] SMP KASAN PTI [ 89.269116] CPU: 7 PID: 1750 Comm: kmmpd-loop0 Not tainted 5.10.0-862.14.0.6.x86_64-08610-gc932cda3cef4-dirty linux-sunxi#20 [ 89.273169] RIP: 0010:submit_bh_wbc.isra.0+0x538/0x6d0 [ 89.277157] RSP: 0018:ffff888105ddfd08 EFLAGS: 00010246 [ 89.278093] RAX: 0000000000000005 RBX: ffff888124231498 RCX: ffffffffb2772612 [ 89.279332] RDX: 1ffff11024846293 RSI: 0000000000000008 RDI: ffff888124231498 [ 89.280591] RBP: ffff8881248cc000 R08: 0000000000000001 R09: ffffed1024846294 [ 89.281851] R10: ffff88812423149f R11: ffffed1024846293 R12: 0000000000003800 [ 89.283095] R13: 0000000000000001 R14: 0000000000000000 R15: ffff8881161f7000 [ 89.284342] FS: 0000000000000000(0000) GS:ffff88839b5c0000(0000) knlGS:0000000000000000 [ 89.285711] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 89.286701] CR2: 00007f166ebc01a0 CR3: 0000000435c0e000 CR4: 00000000000006e0 [ 89.287919] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 89.289138] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 89.290368] Call Trace: [ 89.290842] write_mmp_block+0x2ca/0x510 [ 89.292218] kmmpd+0x433/0x9a0 [ 89.294902] kthread+0x2dd/0x3e0 [ 89.296268] ret_from_fork+0x22/0x30 [ 89.296906] Modules linked in: by running the following commands: 1. mkfs.ext4 -O mmp /dev/sda -b 1024 2. mount /dev/sda /home/test 3. echo "/dev/sda" > /sys/power/resume That happens because swsusp_check() calls set_blocksize() on the target partition which confuses the file system: Thread1 Thread2 mount /dev/sda /home/test get s_mmp_bh --> has mapped flag start kmmpd thread echo "/dev/sda" > /sys/power/resume resume_store software_resume swsusp_check set_blocksize truncate_inode_pages_range truncate_cleanup_page block_invalidatepage discard_buffer --> clean mapped flag write_mmp_block submit_bh submit_bh_wbc BUG_ON(!buffer_mapped(bh)) To address this issue, modify swsusp_check() to open the target block device with exclusive access. Signed-off-by: Ye Bin <yebin10@huawei.com> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit 273703e ] Commit 3158237 ("ath11k: Change number of TCL rings to one for QCA6390") avoids initializing the other entries of dp->tx_ring cause the corresponding TX rings on QCA6390/WCN6855 are not used, but leaves those ring masks in ath11k_hw_ring_mask_qca6390.tx unchanged. Normally this is OK because we will only get interrupts from the first TX ring on these chips and thus only the first entry of dp->tx_ring is involved. In case of one MSI vector, all DP rings share the same IRQ. For each interrupt, all rings have to be checked, which means the other entries of dp->tx_ring are involved. However since they are not initialized, system crashes. Fix this issue by simply removing those ring masks. crash stack: [ 102.907438] BUG: kernel NULL pointer dereference, address: 0000000000000028 [ 102.907447] #PF: supervisor read access in kernel mode [ 102.907451] #PF: error_code(0x0000) - not-present page [ 102.907453] PGD 1081f0067 P4D 1081f0067 PUD 1081f1067 PMD 0 [ 102.907460] Oops: 0000 [jwrdegoede#1] SMP DEBUG_PAGEALLOC NOPTI [ 102.907465] CPU: 0 PID: 3511 Comm: apt-check Kdump: loaded Tainted: G E 5.15.0-rc4-wt-ath+ linux-sunxi#20 [ 102.907470] Hardware name: AMD Celadon-RN/Celadon-RN, BIOS RCD1005E 10/08/2020 [ 102.907472] RIP: 0010:ath11k_dp_tx_completion_handler+0x201/0x830 [ath11k] [ 102.907497] Code: 3c 24 4e 8d ac 37 10 04 00 00 4a 8d bc 37 68 04 00 00 48 89 3c 24 48 63 c8 89 83 84 18 00 00 48 c1 e1 05 48 03 8b 78 18 00 00 <8b> 51 08 89 d6 83 e6 07 89 74 24 24 83 fe 03 74 04 85 f6 75 63 41 [ 102.907501] RSP: 0000:ffff9b7340003e08 EFLAGS: 00010202 [ 102.907505] RAX: 0000000000000001 RBX: ffff8e21530c0100 RCX: 0000000000000020 [ 102.907508] RDX: 0000000000000000 RSI: 00000000fffffe00 RDI: ffff8e21530c1938 [ 102.907511] RBP: ffff8e21530c0000 R08: 0000000000000001 R09: 0000000000000000 [ 102.907513] R10: ffff8e2145534c10 R11: 0000000000000001 R12: ffff8e21530c2938 [ 102.907515] R13: ffff8e21530c18e0 R14: 0000000000000100 R15: ffff8e21530c2978 [ 102.907518] FS: 00007f5d4297e740(0000) GS:ffff8e243d600000(0000) knlGS:0000000000000000 [ 102.907521] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 102.907524] CR2: 0000000000000028 CR3: 00000001034ea000 CR4: 0000000000350ef0 [ 102.907527] Call Trace: [ 102.907531] <IRQ> [ 102.907537] ath11k_dp_service_srng+0x5c/0x2f0 [ath11k] [ 102.907556] ath11k_pci_ext_grp_napi_poll+0x21/0x70 [ath11k_pci] [ 102.907562] __napi_poll+0x2c/0x160 [ 102.907570] net_rx_action+0x251/0x310 [ 102.907576] __do_softirq+0x107/0x2fc [ 102.907585] irq_exit_rcu+0x74/0x90 [ 102.907593] common_interrupt+0x83/0xa0 [ 102.907600] </IRQ> [ 102.907601] asm_common_interrupt+0x1e/0x40 Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1 Signed-off-by: Baochen Qiang <bqiang@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211026011605.58615-1-quic_bqiang@quicinc.com Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit 4224cfd ] When bringing down the netdevice or system shutdown, a panic can be triggered while accessing the sysfs path because the device is already removed. [ 755.549084] mlx5_core 0000:12:00.1: Shutdown was called [ 756.404455] mlx5_core 0000:12:00.0: Shutdown was called ... [ 757.937260] BUG: unable to handle kernel NULL pointer dereference at (null) [ 758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280 crash> bt ... PID: 12649 TASK: ffff8924108f2100 CPU: 1 COMMAND: "amsd" ... linux-sunxi#9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778 [exception RIP: dma_pool_alloc+0x1ab] RIP: ffffffff8ee11acb RSP: ffff89240e1a3968 RFLAGS: 00010046 RAX: 0000000000000246 RBX: ffff89243d874100 RCX: 0000000000001000 RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff89243d874090 RBP: ffff89240e1a39c0 R8: 000000000001f080 R9: ffff8905ffc03c00 R10: ffffffffc04680d4 R11: ffffffff8edde9fd R12: 00000000000080d0 R13: ffff89243d874090 R14: ffff89243d874080 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 linux-sunxi#10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core] linux-sunxi#11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core] linux-sunxi#12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core] linux-sunxi#13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core] linux-sunxi#14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core] linux-sunxi#15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core] linux-sunxi#16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core] linux-sunxi#17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46 linux-sunxi#18 [ffff89240e1a3d48] speed_show at ffffffff8f277208 linux-sunxi#19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3 linux-sunxi#20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf linux-sunxi#21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596 linux-sunxi#22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10 linux-sunxi#23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5 linux-sunxi#24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff linux-sunxi#25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f linux-sunxi#26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92 crash> net_device.state ffff89443b0c0000 state = 0x5 (__LINK_STATE_START| __LINK_STATE_NOCARRIER) To prevent this scenario, we also make sure that the netdevice is present. Signed-off-by: suresh kumar <suresh2514@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit 42dcbe7 ] The following WARN is triggered from kvm_vm_ioctl_set_clock(): WARNING: CPU: 10 PID: 579353 at arch/x86/kvm/../../../virt/kvm/kvm_main.c:3161 mark_page_dirty_in_slot+0x6c/0x80 [kvm] ... CPU: 10 PID: 579353 Comm: qemu-system-x86 Tainted: G W O 5.16.0.stable linux-sunxi#20 Hardware name: LENOVO 20UF001CUS/20UF001CUS, BIOS R1CET65W(1.34 ) 06/17/2021 RIP: 0010:mark_page_dirty_in_slot+0x6c/0x80 [kvm] ... Call Trace: <TASK> ? kvm_write_guest+0x114/0x120 [kvm] kvm_hv_invalidate_tsc_page+0x9e/0xf0 [kvm] kvm_arch_vm_ioctl+0xa26/0xc50 [kvm] ? schedule+0x4e/0xc0 ? __cond_resched+0x1a/0x50 ? futex_wait+0x166/0x250 ? __send_signal+0x1f1/0x3d0 kvm_vm_ioctl+0x747/0xda0 [kvm] ... The WARN was introduced by commit 03c0304 ("KVM: Warn if mark_page_dirty() is called without an active vCPU") but the change seems to be correct (unlike Hyper-V TSC page update mechanism). In fact, there's no real need to actually write to guest memory to invalidate TSC page, this can be done by the first vCPU which goes through kvm_guest_time_update(). Reported-by: Maxim Levitsky <mlevitsk@redhat.com> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220407201013.963226-1-vkuznets@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit 8659ab3 ] Warning log: [ 4.141392] Unexpected gfp: 0x4 (GFP_DMA32). Fixing up to gfp: 0xa20 (GFP_ATOMIC). Fix your code! [ 4.150340] CPU: 1 PID: 175 Comm: 1-0050 Not tainted 5.15.5-00039-g2fd9ae1b568c linux-sunxi#20 [ 4.158010] Hardware name: Freescale i.MX8QXP MEK (DT) [ 4.163155] Call trace: [ 4.165600] dump_backtrace+0x0/0x1b0 [ 4.169286] show_stack+0x18/0x68 [ 4.172611] dump_stack_lvl+0x68/0x84 [ 4.176286] dump_stack+0x18/0x34 [ 4.179613] kmalloc_fix_flags+0x60/0x88 [ 4.183550] new_slab+0x334/0x370 [ 4.186878] ___slab_alloc.part.108+0x4d4/0x748 [ 4.191419] __slab_alloc.isra.109+0x30/0x78 [ 4.195702] kmem_cache_alloc+0x40c/0x420 [ 4.199725] dma_pool_alloc+0xac/0x1f8 [ 4.203486] cdns3_allocate_trb_pool+0xb4/0xd0 pool_alloc_page(struct dma_pool *pool, gfp_t mem_flags) { ... page = kmalloc(sizeof(*page), mem_flags); page->vaddr = dma_alloc_coherent(pool->dev, pool->allocation, &page->dma, mem_flags); ... } kmalloc was called with mem_flags, which is passed down in cdns3_allocate_trb_pool() and have GFP_DMA32 flags. kmall_fix_flags() report warning. GFP_DMA32 is not useful at all. dma_alloc_coherent() will handle DMA memory region correctly by pool->dev. GFP_DMA32 can be removed safely. Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20220609154456.2871672-1-Frank.Li@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit 4e264be ] When a system with E810 with existing VFs gets rebooted the following hang may be observed. Pid 1 is hung in iavf_remove(), part of a network driver: PID: 1 TASK: ffff965400e5a340 CPU: 24 COMMAND: "systemd-shutdow" #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb jwrdegoede#1 [ffffaad04005fae8] schedule at ffffffff8b323e2d jwrdegoede#2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc jwrdegoede#3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930 jwrdegoede#4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf] jwrdegoede#5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513 jwrdegoede#6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa linux-sunxi#7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc linux-sunxi#8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e linux-sunxi#9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429 linux-sunxi#10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4 linux-sunxi#11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice] linux-sunxi#12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice] linux-sunxi#13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice] linux-sunxi#14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1 linux-sunxi#15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386 linux-sunxi#16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870 linux-sunxi#17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6 linux-sunxi#18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159 linux-sunxi#19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc linux-sunxi#20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d linux-sunxi#21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169 linux-sunxi#22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b RIP: 00007f1baa5c13d7 RSP: 00007fffbcc55a98 RFLAGS: 00000202 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1baa5c13d7 RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead RBP: 00007fffbcc55ca0 R8: 0000000000000000 R9: 00007fffbcc54e90 R10: 00007fffbcc55050 R11: 0000000000000202 R12: 0000000000000005 R13: 0000000000000000 R14: 00007fffbcc55af0 R15: 0000000000000000 ORIG_RAX: 00000000000000a9 CS: 0033 SS: 002b During reboot all drivers PM shutdown callbacks are invoked. In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE. In ice_shutdown() the call chain above is executed, which at some point calls iavf_remove(). However iavf_remove() expects the VF to be in one of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If that's not the case it sleeps forever. So if iavf_shutdown() gets invoked before iavf_remove() the system will hang indefinitely because the adapter is already in state __IAVF_REMOVE. Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE, as we already went through iavf_shutdown(). Fixes: 9745780 ("iavf: Add waiting so the port is initialized in remove") Fixes: a841733 ("iavf: Fix race condition between iavf_shutdown and iavf_remove") Reported-by: Marius Cornea <mcornea@redhat.com> Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

The cited commit adds a compeletion to remove dependency on rtnl lock. But it causes a deadlock for multiple encapsulations: crash> bt ffff8aece8a64000 PID: 1514557 TASK: ffff8aece8a64000 CPU: 3 COMMAND: "tc" #0 [ffffa6d14183f368] __schedule at ffffffffb8ba7f45 #1 [ffffa6d14183f3f8] schedule at ffffffffb8ba8418 #2 [ffffa6d14183f418] schedule_preempt_disabled at ffffffffb8ba8898 #3 [ffffa6d14183f428] __mutex_lock at ffffffffb8baa7f8 #4 [ffffa6d14183f4d0] mutex_lock_nested at ffffffffb8baabeb #5 [ffffa6d14183f4e0] mlx5e_attach_encap at ffffffffc0f48c17 [mlx5_core] #6 [ffffa6d14183f628] mlx5e_tc_add_fdb_flow at ffffffffc0f39680 [mlx5_core] linux-sunxi#7 [ffffa6d14183f688] __mlx5e_add_fdb_flow at ffffffffc0f3b636 [mlx5_core] linux-sunxi#8 [ffffa6d14183f6f0] mlx5e_tc_add_flow at ffffffffc0f3bcdf [mlx5_core] linux-sunxi#9 [ffffa6d14183f728] mlx5e_configure_flower at ffffffffc0f3c1d1 [mlx5_core] linux-sunxi#10 [ffffa6d14183f790] mlx5e_rep_setup_tc_cls_flower at ffffffffc0f3d529 [mlx5_core] linux-sunxi#11 [ffffa6d14183f7a0] mlx5e_rep_setup_tc_cb at ffffffffc0f3d714 [mlx5_core] linux-sunxi#12 [ffffa6d14183f7b0] tc_setup_cb_add at ffffffffb8931bb8 linux-sunxi#13 [ffffa6d14183f810] fl_hw_replace_filter at ffffffffc0dae901 [cls_flower] linux-sunxi#14 [ffffa6d14183f8d8] fl_change at ffffffffc0db5c57 [cls_flower] linux-sunxi#15 [ffffa6d14183f970] tc_new_tfilter at ffffffffb8936047 linux-sunxi#16 [ffffa6d14183fac8] rtnetlink_rcv_msg at ffffffffb88c7c31 linux-sunxi#17 [ffffa6d14183fb50] netlink_rcv_skb at ffffffffb8942853 linux-sunxi#18 [ffffa6d14183fbc0] rtnetlink_rcv at ffffffffb88c1835 linux-sunxi#19 [ffffa6d14183fbd0] netlink_unicast at ffffffffb8941f27 linux-sunxi#20 [ffffa6d14183fc18] netlink_sendmsg at ffffffffb8942245 linux-sunxi#21 [ffffa6d14183fc98] sock_sendmsg at ffffffffb887d482 linux-sunxi#22 [ffffa6d14183fcb8] ____sys_sendmsg at ffffffffb887d81a linux-sunxi#23 [ffffa6d14183fd38] ___sys_sendmsg at ffffffffb88806e2 linux-sunxi#24 [ffffa6d14183fe90] __sys_sendmsg at ffffffffb88807a2 linux-sunxi#25 [ffffa6d14183ff28] __x64_sys_sendmsg at ffffffffb888080f linux-sunxi#26 [ffffa6d14183ff38] do_syscall_64 at ffffffffb8b9b6a8 linux-sunxi#27 [ffffa6d14183ff50] entry_SYSCALL_64_after_hwframe at ffffffffb8c0007c crash> bt 0xffff8aeb07544000 PID: 1110766 TASK: ffff8aeb07544000 CPU: 0 COMMAND: "kworker/u20:9" #0 [ffffa6d14e6b7bd8] __schedule at ffffffffb8ba7f45 #1 [ffffa6d14e6b7c68] schedule at ffffffffb8ba8418 #2 [ffffa6d14e6b7c88] schedule_timeout at ffffffffb8baef88 #3 [ffffa6d14e6b7d10] wait_for_completion at ffffffffb8ba968b #4 [ffffa6d14e6b7d60] mlx5e_take_all_encap_flows at ffffffffc0f47ec4 [mlx5_core] #5 [ffffa6d14e6b7da0] mlx5e_rep_update_flows at ffffffffc0f3e734 [mlx5_core] #6 [ffffa6d14e6b7df8] mlx5e_rep_neigh_update at ffffffffc0f400bb [mlx5_core] linux-sunxi#7 [ffffa6d14e6b7e50] process_one_work at ffffffffb80acc9c linux-sunxi#8 [ffffa6d14e6b7ed0] worker_thread at ffffffffb80ad012 linux-sunxi#9 [ffffa6d14e6b7f10] kthread at ffffffffb80b615d linux-sunxi#10 [ffffa6d14e6b7f50] ret_from_fork at ffffffffb8001b2f After the first encap is attached, flow will be added to encap entry's flows list. If neigh update is running at this time, the following encaps of the flow can't hold the encap_tbl_lock and sleep. If neigh update thread is waiting for that flow's init_done, deadlock happens. Fix it by holding lock outside of the for loop. If neigh update is running, prevent encap flows from offloading. Since the lock is held outside of the for loop, concurrent creation of encap entries is not allowed. So remove unnecessary wait_for_completion call for res_ready. Fixes: 95435ad ("net/mlx5e: Only access fully initialized flows in neigh update") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

Christoph reported a divide by zero bug in mptcp_recvmsg(): divide error: 0000 [#1] PREEMPT SMP CPU: 1 PID: 19978 Comm: syz-executor.6 Not tainted 6.4.0-rc2-gffcc7899081b linux-sunxi#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 RIP: 0010:__tcp_select_window+0x30e/0x420 net/ipv4/tcp_output.c:3018 Code: 11 ff 0f b7 cd c1 e9 0c b8 ff ff ff ff d3 e0 89 c1 f7 d1 01 cb 21 c3 eb 17 e8 2e 83 11 ff 31 db eb 0e e8 25 83 11 ff 89 d8 99 <f7> 7c 24 04 29 d3 65 48 8b 04 25 28 00 00 00 48 3b 44 24 10 75 60 RSP: 0018:ffffc90000a07a18 EFLAGS: 00010246 RAX: 000000000000ffd7 RBX: 000000000000ffd7 RCX: 0000000000040000 RDX: 0000000000000000 RSI: 000000000003ffff RDI: 0000000000040000 RBP: 000000000000ffd7 R08: ffffffff820cf297 R09: 0000000000000001 R10: 0000000000000000 R11: ffffffff8103d1a0 R12: 0000000000003f00 R13: 0000000000300000 R14: ffff888101cf3540 R15: 0000000000180000 FS: 00007f9af4c09640(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b33824000 CR3: 000000012f241001 CR4: 0000000000170ee0 Call Trace: <TASK> __tcp_cleanup_rbuf+0x138/0x1d0 net/ipv4/tcp.c:1611 mptcp_recvmsg+0xcb8/0xdd0 net/mptcp/protocol.c:2034 inet_recvmsg+0x127/0x1f0 net/ipv4/af_inet.c:861 ____sys_recvmsg+0x269/0x2b0 net/socket.c:1019 ___sys_recvmsg+0xe6/0x260 net/socket.c:2764 do_recvmmsg+0x1a5/0x470 net/socket.c:2858 __do_sys_recvmmsg net/socket.c:2937 [inline] __se_sys_recvmmsg net/socket.c:2953 [inline] __x64_sys_recvmmsg+0xa6/0x130 net/socket.c:2953 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x47/0xa0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc RIP: 0033:0x7f9af58fc6a9 Code: 5c c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 4f 37 0d 00 f7 d8 64 89 01 48 RSP: 002b:00007f9af4c08cd8 EFLAGS: 00000246 ORIG_RAX: 000000000000012b RAX: ffffffffffffffda RBX: 00000000006bc050 RCX: 00007f9af58fc6a9 RDX: 0000000000000001 RSI: 0000000020000140 RDI: 0000000000000004 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000f00 R11: 0000000000000246 R12: 00000000006bc05c R13: fffffffffffffea8 R14: 00000000006bc050 R15: 000000000001fe40 </TASK> mptcp_recvmsg is allowed to release the msk socket lock when blocking, and before re-acquiring it another thread could have switched the sock to TCP_LISTEN status - with a prior connect(AF_UNSPEC) - also clearing icsk_ack.rcv_mss. Address the issue preventing the disconnect if some other process is concurrently performing a blocking syscall on the same socket, alike commit 4faeee0 ("tcp: deny tcp_disconnect() when threads are waiting"). Fixes: a6b118f ("mptcp: add receive buffer auto-tuning") Cc: stable@vger.kernel.org Reported-by: Christoph Paasch <cpaasch@apple.com> Closes: multipath-tcp/mptcp_net-next#404 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

[ Upstream commit ff59808 ] The pointer to mdev_bus_compat_class is statically defined at the top of mdev_core, and was originally (commit 7b96953 ("vfio: Mediated device Core driver") serialized by the parent_list_lock. The blamed commit removed this mutex, leaving the pointer initialization unserialized. As a result, the creation of multiple MDEVs in parallel (such as during boot) can encounter errors during the creation of the sysfs entries, such as: [ 8.337509] sysfs: cannot create duplicate filename '/class/mdev_bus' [ 8.337514] vfio_ccw 0.0.01d8: MDEV: Registered [ 8.337516] CPU: 13 PID: 946 Comm: driverctl Not tainted 6.4.0-rc7 linux-sunxi#20 [ 8.337522] Hardware name: IBM 3906 M05 780 (LPAR) [ 8.337525] Call Trace: [ 8.337528] [<0000000162b0145a>] dump_stack_lvl+0x62/0x80 [ 8.337540] [<00000001622aeb30>] sysfs_warn_dup+0x78/0x88 [ 8.337549] [<00000001622aeca6>] sysfs_create_dir_ns+0xe6/0xf8 [ 8.337552] [<0000000162b04504>] kobject_add_internal+0xf4/0x340 [ 8.337557] [<0000000162b04d48>] kobject_add+0x78/0xd0 [ 8.337561] [<0000000162b04e0a>] kobject_create_and_add+0x6a/0xb8 [ 8.337565] [<00000001627a110e>] class_compat_register+0x5e/0x90 [ 8.337572] [<000003ff7fd815da>] mdev_register_parent+0x102/0x130 [mdev] [ 8.337581] [<000003ff7fdc7f2c>] vfio_ccw_sch_probe+0xe4/0x178 [vfio_ccw] [ 8.337588] [<0000000162a7833c>] css_probe+0x44/0x80 [ 8.337599] [<000000016279f4da>] really_probe+0xd2/0x460 [ 8.337603] [<000000016279fa08>] driver_probe_device+0x40/0xf0 [ 8.337606] [<000000016279fb78>] __device_attach_driver+0xc0/0x140 [ 8.337610] [<000000016279cbe0>] bus_for_each_drv+0x90/0xd8 [ 8.337618] [<00000001627a00b0>] __device_attach+0x110/0x190 [ 8.337621] [<000000016279c7c8>] bus_rescan_devices_helper+0x60/0xb0 [ 8.337626] [<000000016279cd48>] drivers_probe_store+0x48/0x80 [ 8.337632] [<00000001622ac9b0>] kernfs_fop_write_iter+0x138/0x1f0 [ 8.337635] [<00000001621e5e14>] vfs_write+0x1ac/0x2f8 [ 8.337645] [<00000001621e61d8>] ksys_write+0x70/0x100 [ 8.337650] [<0000000162b2bdc4>] __do_syscall+0x1d4/0x200 [ 8.337656] [<0000000162b3c828>] system_call+0x70/0x98 [ 8.337664] kobject: kobject_add_internal failed for mdev_bus with -EEXIST, don't try to register things with the same name in the same directory. [ 8.337668] kobject: kobject_create_and_add: kobject_add error: -17 [ 8.337674] vfio_ccw: probe of 0.0.01d9 failed with error -12 [ 8.342941] vfio_ccw_mdev aeb9ca91-10c6-42bc-a168-320023570aea: Adding to iommu group 2 Move the initialization of the mdev_bus_compat_class pointer to the init path, to match the cleanup in module exit. This way the code in mdev_register_parent() can simply link the new parent to it, rather than determining whether initialization is required first. Fixes: 89345d5 ("vfio/mdev: embedd struct mdev_parent in the parent data structure") Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230626133642.2939168-1-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

The following processes run into a deadlock. CPU 41 was waiting for CPU 29 to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29 was hung by that spinlock with IRQs disabled. PID: 17360 TASK: ffff95c1090c5c40 CPU: 41 COMMAND: "mrdiagd" !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0 !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0 !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0 # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0 # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0 # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0 # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0 # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0 # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0 # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0 linux-sunxi#10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0 linux-sunxi#11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0 linux-sunxi#12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0 linux-sunxi#13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0 linux-sunxi#14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0 linux-sunxi#15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0 linux-sunxi#16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0 linux-sunxi#17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0 linux-sunxi#18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0 linux-sunxi#19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 linux-sunxi#20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 linux-sunxi#21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 linux-sunxi#22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 linux-sunxi#23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 linux-sunxi#24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 linux-sunxi#25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 linux-sunxi#26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 linux-sunxi#27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 PID: 17355 TASK: ffff95c1090c3d80 CPU: 29 COMMAND: "mrdiagd" !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0 !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0 # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0 # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0 # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0 # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0 # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0 # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0 # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0 # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 linux-sunxi#10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 linux-sunxi#11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 linux-sunxi#12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 linux-sunxi#13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 linux-sunxi#14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 linux-sunxi#15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 linux-sunxi#16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 linux-sunxi#17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 The lock is used to synchronize different sysfs operations, it doesn't protect any resource that will be touched by an interrupt. Consequently it's not required to disable IRQs. Replace the spinlock with a mutex to fix the deadlock. Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Link: https://lore.kernel.org/r/20230828221018.19471-1-junxiao.bi@oracle.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

Inject fault while probing mdpy.ko, if kstrdup() of create_dir() fails in kobject_add_internal() in kobject_init_and_add() in mdev_type_add() in parent_create_sysfs_files(), it will return 0 and probe successfully. And when rmmod mdpy.ko, the mdpy_dev_exit() will call mdev_unregister_parent(), the mdev_type_remove() may traverse uninitialized parent->types[i] in parent_remove_sysfs_files(), and it will cause below null-ptr-deref. If mdev_type_add() fails, return the error code and kset_unregister() to fix the issue. general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 2 PID: 10215 Comm: rmmod Tainted: G W N 6.6.0-rc2+ linux-sunxi#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 RIP: 0010:__kobject_del+0x62/0x1c0 Code: 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 51 01 00 00 48 b8 00 00 00 00 00 fc ff df 48 8b 6b 28 48 8d 7d 10 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 24 01 00 00 48 8b 75 10 48 89 df 48 8d 6b 3c e8 RSP: 0018:ffff88810695fd30 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffffffffa0270268 RCX: 0000000000000000 RDX: 0000000000000002 RSI: 0000000000000004 RDI: 0000000000000010 RBP: 0000000000000000 R08: 0000000000000001 R09: ffffed10233a4ef1 R10: ffff888119d2778b R11: 0000000063666572 R12: 0000000000000000 R13: fffffbfff404e2d4 R14: dffffc0000000000 R15: ffffffffa0271660 FS: 00007fbc81981540(0000) GS:ffff888119d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fc14a142dc0 CR3: 0000000110a62003 CR4: 0000000000770ee0 DR0: ffffffff8fb0bce8 DR1: ffffffff8fb0bce9 DR2: ffffffff8fb0bcea DR3: ffffffff8fb0bceb DR6: 00000000fffe0ff0 DR7: 0000000000000600 PKRU: 55555554 Call Trace: <TASK> ? die_addr+0x3d/0xa0 ? exc_general_protection+0x144/0x220 ? asm_exc_general_protection+0x22/0x30 ? __kobject_del+0x62/0x1c0 kobject_del+0x32/0x50 parent_remove_sysfs_files+0xd6/0x170 [mdev] mdev_unregister_parent+0xfb/0x190 [mdev] ? mdev_register_parent+0x270/0x270 [mdev] ? find_module_all+0x9d/0xe0 mdpy_dev_exit+0x17/0x63 [mdpy] __do_sys_delete_module.constprop.0+0x2fa/0x4b0 ? module_flags+0x300/0x300 ? __fput+0x4e7/0xa00 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7fbc813221b7 Code: 73 01 c3 48 8b 0d d1 8c 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a1 8c 2c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffe780e0648 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 RAX: ffffffffffffffda RBX: 00007ffe780e06a8 RCX: 00007fbc813221b7 RDX: 000000000000000a RSI: 0000000000000800 RDI: 000055e214df9b58 RBP: 000055e214df9af0 R08: 00007ffe780df5c1 R09: 0000000000000000 R10: 00007fbc8139ecc0 R11: 0000000000000206 R12: 00007ffe780e0870 R13: 00007ffe780e0ed0 R14: 000055e214df9260 R15: 000055e214df9af0 </TASK> Modules linked in: mdpy(-) mdev vfio_iommu_type1 vfio [last unloaded: mdpy] Dumping ftrace buffer: (ftrace buffer empty) ---[ end trace 0000000000000000 ]--- RIP: 0010:__kobject_del+0x62/0x1c0 Code: 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 51 01 00 00 48 b8 00 00 00 00 00 fc ff df 48 8b 6b 28 48 8d 7d 10 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 24 01 00 00 48 8b 75 10 48 89 df 48 8d 6b 3c e8 RSP: 0018:ffff88810695fd30 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffffffffa0270268 RCX: 0000000000000000 RDX: 0000000000000002 RSI: 0000000000000004 RDI: 0000000000000010 RBP: 0000000000000000 R08: 0000000000000001 R09: ffffed10233a4ef1 R10: ffff888119d2778b R11: 0000000063666572 R12: 0000000000000000 R13: fffffbfff404e2d4 R14: dffffc0000000000 R15: ffffffffa0271660 FS: 00007fbc81981540(0000) GS:ffff888119d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fc14a142dc0 CR3: 0000000110a62003 CR4: 0000000000770ee0 DR0: ffffffff8fb0bce8 DR1: ffffffff8fb0bce9 DR2: ffffffff8fb0bcea DR3: ffffffff8fb0bceb DR6: 00000000fffe0ff0 DR7: 0000000000000600 PKRU: 55555554 Kernel panic - not syncing: Fatal exception Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: disabled Rebooting in 1 seconds.. Fixes: da44c34 ("vfio/mdev: simplify mdev_type handling") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230918115551.1423193-1-ruanjinjie@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

The following call trace shows a deadlock issue due to recursive locking of mutex "device_mutex". First lock acquire is in target_for_each_device() and second in target_free_device(). PID: 148266 TASK: ffff8be21ffb5d00 CPU: 10 COMMAND: "iscsi_ttx" #0 [ffffa2bfc9ec3b18] __schedule at ffffffffa8060e7f #1 [ffffa2bfc9ec3ba0] schedule at ffffffffa8061224 #2 [ffffa2bfc9ec3bb8] schedule_preempt_disabled at ffffffffa80615ee #3 [ffffa2bfc9ec3bc8] __mutex_lock at ffffffffa8062fd7 #4 [ffffa2bfc9ec3c40] __mutex_lock_slowpath at ffffffffa80631d3 #5 [ffffa2bfc9ec3c50] mutex_lock at ffffffffa806320c #6 [ffffa2bfc9ec3c68] target_free_device at ffffffffc0935998 [target_core_mod] linux-sunxi#7 [ffffa2bfc9ec3c90] target_core_dev_release at ffffffffc092f975 [target_core_mod] linux-sunxi#8 [ffffa2bfc9ec3ca0] config_item_put at ffffffffa79d250f linux-sunxi#9 [ffffa2bfc9ec3cd0] config_item_put at ffffffffa79d2583 linux-sunxi#10 [ffffa2bfc9ec3ce0] target_devices_idr_iter at ffffffffc0933f3a [target_core_mod] linux-sunxi#11 [ffffa2bfc9ec3d00] idr_for_each at ffffffffa803f6fc linux-sunxi#12 [ffffa2bfc9ec3d60] target_for_each_device at ffffffffc0935670 [target_core_mod] linux-sunxi#13 [ffffa2bfc9ec3d98] transport_deregister_session at ffffffffc0946408 [target_core_mod] linux-sunxi#14 [ffffa2bfc9ec3dc8] iscsit_close_session at ffffffffc09a44a6 [iscsi_target_mod] linux-sunxi#15 [ffffa2bfc9ec3df0] iscsit_close_connection at ffffffffc09a4a88 [iscsi_target_mod] linux-sunxi#16 [ffffa2bfc9ec3df8] finish_task_switch at ffffffffa76e5d07 linux-sunxi#17 [ffffa2bfc9ec3e78] iscsit_take_action_for_connection_exit at ffffffffc0991c23 [iscsi_target_mod] linux-sunxi#18 [ffffa2bfc9ec3ea0] iscsi_target_tx_thread at ffffffffc09a403b [iscsi_target_mod] linux-sunxi#19 [ffffa2bfc9ec3f08] kthread at ffffffffa76d8080 linux-sunxi#20 [ffffa2bfc9ec3f50] ret_from_fork at ffffffffa8200364 Fixes: 36d4cb4 ("scsi: target: Avoid that EXTENDED COPY commands trigger lock inversion") Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Link: https://lore.kernel.org/r/20230918225848.66463-1-junxiao.bi@oracle.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

[ Upstream commit e3e82fc ] When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be dereferenced as wrong struct in irdma_free_pending_cqp_request(). PID: 3669 TASK: ffff88aef892c000 CPU: 28 COMMAND: "kworker/28:0" #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34 jwrdegoede#1 [fffffe0000549e40] nmi_handle at ffffffff810788b2 jwrdegoede#2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f jwrdegoede#3 [fffffe0000549eb8] do_nmi at ffffffff81079582 jwrdegoede#4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4 [exception RIP: native_queued_spin_lock_slowpath+1291] RIP: ffffffff8127e72b RSP: ffff88aa841ef778 RFLAGS: 00000046 RAX: 0000000000000000 RBX: ffff88b01f849700 RCX: ffffffff8127e47e RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff83857ec0 RBP: ffff88afe3e4efc8 R8: ffffed15fc7c9dfa R9: ffffed15fc7c9dfa R10: 0000000000000001 R11: ffffed15fc7c9df9 R12: 0000000000740000 R13: ffff88b01f849708 R14: 0000000000000003 R15: ffffed1603f092e1 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000 -- <NMI exception stack> -- jwrdegoede#5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b jwrdegoede#6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4 linux-sunxi#7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363 linux-sunxi#8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma] linux-sunxi#9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma] linux-sunxi#10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma] linux-sunxi#11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma] linux-sunxi#12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb linux-sunxi#13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6 linux-sunxi#14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278 linux-sunxi#15 [ffff88aa841efb88] device_del at ffffffff82179d23 linux-sunxi#16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice] linux-sunxi#17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice] linux-sunxi#18 [ffff88aa841efde8] process_one_work at ffffffff811c589a linux-sunxi#19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff linux-sunxi#20 [ffff88aa841eff10] kthread at ffffffff811d87a0 linux-sunxi#21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions") Link: https://lore.kernel.org/r/20231130081415.891006-1-lishifeng@sangfor.com.cn Suggested-by: "Ismail, Mustafa" <mustafa.ismail@intel.com> Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

When running autonuma with enabling multi-size THP, I encountered the following kernel crash issue: [ 134.290216] list_del corruption. prev->next should be fffff9ad42e1c490, but was dead000000000100. (prev=fffff9ad42399890) [ 134.290877] kernel BUG at lib/list_debug.c:62! [ 134.291052] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 134.291210] CPU: 56 PID: 8037 Comm: numa01 Kdump: loaded Tainted: G E 6.7.0-rc4+ linux-sunxi#20 [ 134.291649] RIP: 0010:__list_del_entry_valid_or_report+0x97/0xb0 ...... [ 134.294252] Call Trace: [ 134.294362] <TASK> [ 134.294440] ? die+0x33/0x90 [ 134.294561] ? do_trap+0xe0/0x110 ...... [ 134.295681] ? __list_del_entry_valid_or_report+0x97/0xb0 [ 134.295842] folio_undo_large_rmappable+0x99/0x100 [ 134.296003] destroy_large_folio+0x68/0x70 [ 134.296172] migrate_folio_move+0x12e/0x260 [ 134.296264] ? __pfx_remove_migration_pte+0x10/0x10 [ 134.296389] migrate_pages_batch+0x495/0x6b0 [ 134.296523] migrate_pages+0x1d0/0x500 [ 134.296646] ? __pfx_alloc_misplaced_dst_folio+0x10/0x10 [ 134.296799] migrate_misplaced_folio+0x12d/0x2b0 [ 134.296953] do_numa_page+0x1f4/0x570 [ 134.297121] __handle_mm_fault+0x2b0/0x6c0 [ 134.297254] handle_mm_fault+0x107/0x270 [ 134.300897] do_user_addr_fault+0x167/0x680 [ 134.304561] exc_page_fault+0x65/0x140 [ 134.307919] asm_exc_page_fault+0x22/0x30 The reason for the crash is that, the commit 85ce2c5 ("memcontrol: only transfer the memcg data for migration") removed the charging and uncharging operations of the migration folios and cleared the memcg data of the old folio. During the subsequent release process of the old large folio in destroy_large_folio(), if the large folio needs to be removed from the split queue, an incorrect split queue can be obtained (which is pgdat->deferred_split_queue) because the old folio's memcg is NULL now. This can lead to list operations being performed under the wrong split queue lock protection, resulting in a list crash as above. After the migration, the old folio is going to be freed, so we can remove it from the split queue in mem_cgroup_migrate() a bit earlier before clearing the memcg data to avoid getting incorrect split queue. [akpm@linux-foundation.org: fix comment, per Zi Yan] Link: https://lkml.kernel.org/r/61273e5e9b490682388377c20f52d19de4a80460.1703054559.git.baolin.wang@linux.alibaba.com Fixes: 85ce2c5 ("memcontrol: only transfer the memcg data for migration") Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Nhat Pham <nphamcs@gmail.com> Reviewed-by: Yang Shi <shy828301@gmail.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shakeel Butt <shakeelb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

[ Upstream commit 37c3b9f ] The cited commit adds a compeletion to remove dependency on rtnl lock. But it causes a deadlock for multiple encapsulations: crash> bt ffff8aece8a64000 PID: 1514557 TASK: ffff8aece8a64000 CPU: 3 COMMAND: "tc" #0 [ffffa6d14183f368] __schedule at ffffffffb8ba7f45 jwrdegoede#1 [ffffa6d14183f3f8] schedule at ffffffffb8ba8418 jwrdegoede#2 [ffffa6d14183f418] schedule_preempt_disabled at ffffffffb8ba8898 jwrdegoede#3 [ffffa6d14183f428] __mutex_lock at ffffffffb8baa7f8 jwrdegoede#4 [ffffa6d14183f4d0] mutex_lock_nested at ffffffffb8baabeb jwrdegoede#5 [ffffa6d14183f4e0] mlx5e_attach_encap at ffffffffc0f48c17 [mlx5_core] jwrdegoede#6 [ffffa6d14183f628] mlx5e_tc_add_fdb_flow at ffffffffc0f39680 [mlx5_core] linux-sunxi#7 [ffffa6d14183f688] __mlx5e_add_fdb_flow at ffffffffc0f3b636 [mlx5_core] linux-sunxi#8 [ffffa6d14183f6f0] mlx5e_tc_add_flow at ffffffffc0f3bcdf [mlx5_core] linux-sunxi#9 [ffffa6d14183f728] mlx5e_configure_flower at ffffffffc0f3c1d1 [mlx5_core] linux-sunxi#10 [ffffa6d14183f790] mlx5e_rep_setup_tc_cls_flower at ffffffffc0f3d529 [mlx5_core] linux-sunxi#11 [ffffa6d14183f7a0] mlx5e_rep_setup_tc_cb at ffffffffc0f3d714 [mlx5_core] linux-sunxi#12 [ffffa6d14183f7b0] tc_setup_cb_add at ffffffffb8931bb8 linux-sunxi#13 [ffffa6d14183f810] fl_hw_replace_filter at ffffffffc0dae901 [cls_flower] linux-sunxi#14 [ffffa6d14183f8d8] fl_change at ffffffffc0db5c57 [cls_flower] linux-sunxi#15 [ffffa6d14183f970] tc_new_tfilter at ffffffffb8936047 linux-sunxi#16 [ffffa6d14183fac8] rtnetlink_rcv_msg at ffffffffb88c7c31 linux-sunxi#17 [ffffa6d14183fb50] netlink_rcv_skb at ffffffffb8942853 linux-sunxi#18 [ffffa6d14183fbc0] rtnetlink_rcv at ffffffffb88c1835 linux-sunxi#19 [ffffa6d14183fbd0] netlink_unicast at ffffffffb8941f27 linux-sunxi#20 [ffffa6d14183fc18] netlink_sendmsg at ffffffffb8942245 linux-sunxi#21 [ffffa6d14183fc98] sock_sendmsg at ffffffffb887d482 linux-sunxi#22 [ffffa6d14183fcb8] ____sys_sendmsg at ffffffffb887d81a linux-sunxi#23 [ffffa6d14183fd38] ___sys_sendmsg at ffffffffb88806e2 linux-sunxi#24 [ffffa6d14183fe90] __sys_sendmsg at ffffffffb88807a2 linux-sunxi#25 [ffffa6d14183ff28] __x64_sys_sendmsg at ffffffffb888080f linux-sunxi#26 [ffffa6d14183ff38] do_syscall_64 at ffffffffb8b9b6a8 linux-sunxi#27 [ffffa6d14183ff50] entry_SYSCALL_64_after_hwframe at ffffffffb8c0007c crash> bt 0xffff8aeb07544000 PID: 1110766 TASK: ffff8aeb07544000 CPU: 0 COMMAND: "kworker/u20:9" #0 [ffffa6d14e6b7bd8] __schedule at ffffffffb8ba7f45 jwrdegoede#1 [ffffa6d14e6b7c68] schedule at ffffffffb8ba8418 jwrdegoede#2 [ffffa6d14e6b7c88] schedule_timeout at ffffffffb8baef88 jwrdegoede#3 [ffffa6d14e6b7d10] wait_for_completion at ffffffffb8ba968b jwrdegoede#4 [ffffa6d14e6b7d60] mlx5e_take_all_encap_flows at ffffffffc0f47ec4 [mlx5_core] jwrdegoede#5 [ffffa6d14e6b7da0] mlx5e_rep_update_flows at ffffffffc0f3e734 [mlx5_core] jwrdegoede#6 [ffffa6d14e6b7df8] mlx5e_rep_neigh_update at ffffffffc0f400bb [mlx5_core] linux-sunxi#7 [ffffa6d14e6b7e50] process_one_work at ffffffffb80acc9c linux-sunxi#8 [ffffa6d14e6b7ed0] worker_thread at ffffffffb80ad012 linux-sunxi#9 [ffffa6d14e6b7f10] kthread at ffffffffb80b615d linux-sunxi#10 [ffffa6d14e6b7f50] ret_from_fork at ffffffffb8001b2f After the first encap is attached, flow will be added to encap entry's flows list. If neigh update is running at this time, the following encaps of the flow can't hold the encap_tbl_lock and sleep. If neigh update thread is waiting for that flow's init_done, deadlock happens. Fix it by holding lock outside of the for loop. If neigh update is running, prevent encap flows from offloading. Since the lock is held outside of the for loop, concurrent creation of encap entries is not allowed. So remove unnecessary wait_for_completion call for res_ready. Fixes: 95435ad ("net/mlx5e: Only access fully initialized flows in neigh update") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

…_write() [ Upstream commit 2d1e952 ] If a user can make copy_from_user() fail, there is a potential for UAF/DF due to a lack of locking around the allocation, use and freeing of the data buffers. This issue is not theoretical. I managed to author a POC for it: BUG: KASAN: double-free in kfree+0x5c/0xac Free of addr ffff29280be5de00 by task poc/356 CPU: 1 PID: 356 Comm: poc Not tainted 6.1.0-00001-g961aa6552c04-dirty linux-sunxi#20 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace.part.0+0xe0/0xf0 show_stack+0x18/0x40 dump_stack_lvl+0x64/0x80 print_report+0x188/0x48c kasan_report_invalid_free+0xa0/0xc0 ____kasan_slab_free+0x174/0x1b0 __kasan_slab_free+0x18/0x24 __kmem_cache_free+0x130/0x2e0 kfree+0x5c/0xac mbox_test_message_write+0x208/0x29c full_proxy_write+0x90/0xf0 vfs_write+0x154/0x440 ksys_write+0xcc/0x180 __arm64_sys_write+0x44/0x60 invoke_syscall+0x60/0x190 el0_svc_common.constprop.0+0x7c/0x160 do_el0_svc+0x40/0xf0 el0_svc+0x2c/0x6c el0t_64_sync_handler+0xf4/0x120 el0t_64_sync+0x18c/0x190 Allocated by task 356: kasan_save_stack+0x3c/0x70 kasan_set_track+0x2c/0x40 kasan_save_alloc_info+0x24/0x34 __kasan_kmalloc+0xb8/0xc0 kmalloc_trace+0x58/0x70 mbox_test_message_write+0x6c/0x29c full_proxy_write+0x90/0xf0 vfs_write+0x154/0x440 ksys_write+0xcc/0x180 __arm64_sys_write+0x44/0x60 invoke_syscall+0x60/0x190 el0_svc_common.constprop.0+0x7c/0x160 do_el0_svc+0x40/0xf0 el0_svc+0x2c/0x6c el0t_64_sync_handler+0xf4/0x120 el0t_64_sync+0x18c/0x190 Freed by task 357: kasan_save_stack+0x3c/0x70 kasan_set_track+0x2c/0x40 kasan_save_free_info+0x38/0x5c ____kasan_slab_free+0x13c/0x1b0 __kasan_slab_free+0x18/0x24 __kmem_cache_free+0x130/0x2e0 kfree+0x5c/0xac mbox_test_message_write+0x208/0x29c full_proxy_write+0x90/0xf0 vfs_write+0x154/0x440 ksys_write+0xcc/0x180 __arm64_sys_write+0x44/0x60 invoke_syscall+0x60/0x190 el0_svc_common.constprop.0+0x7c/0x160 do_el0_svc+0x40/0xf0 el0_svc+0x2c/0x6c el0t_64_sync_handler+0xf4/0x120 el0t_64_sync+0x18c/0x190 Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>

Commit a5a9230 (fbdev: fbcon: Properly revert changes when vc_resize() failed) started restoring old font data upon failure (of vc_resize()). But it performs so only for user fonts. It means that the "system"/internal fonts are not restored at all. So in result, the very first call to fbcon_do_set_font() performs no restore at all upon failing vc_resize(). This can be reproduced by Syzkaller to crash the system on the next invocation of font_get(). It's rather hard to hit the allocation failure in vc_resize() on the first font_set(), but not impossible. Esp. if fault injection is used to aid the execution/failure. It was demonstrated by Sirius: BUG: unable to handle page fault for address: fffffffffffffff8 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD cb7b067 P4D cb7b067 PUD cb7d067 PMD 0 Oops: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 8007 Comm: poc Not tainted 6.7.0-g9d1694dc91ce linux-sunxi#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 RIP: 0010:fbcon_get_font+0x229/0x800 drivers/video/fbdev/core/fbcon.c:2286 Call Trace: <TASK> con_font_get drivers/tty/vt/vt.c:4558 [inline] con_font_op+0x1fc/0xf20 drivers/tty/vt/vt.c:4673 vt_k_ioctl drivers/tty/vt/vt_ioctl.c:474 [inline] vt_ioctl+0x632/0x2ec0 drivers/tty/vt/vt_ioctl.c:752 tty_ioctl+0x6f8/0x1570 drivers/tty/tty_io.c:2803 vfs_ioctl fs/ioctl.c:51 [inline] ... So restore the font data in any case, not only for user fonts. Note the later 'if' is now protected by 'old_userfont' and not 'old_data' as the latter is always set now. (And it is supposed to be non-NULL. Otherwise we would see the bug above again.) Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Fixes: a5a9230 ("fbdev: fbcon: Properly revert changes when vc_resize() failed") Reported-and-tested-by: Ubisectech Sirius <bugreport@ubisectech.com> Cc: Ubisectech Sirius <bugreport@ubisectech.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Helge Deller <deller@gmx.de> Cc: linux-fbdev@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20240208114411.14604-1-jirislaby@kernel.org

commit a5b862c upstream. l2cap_le_flowctl_init() can cause both div-by-zero and an integer overflow since hdev->le_mtu may not fall in the valid range. Move MTU from hci_dev to hci_conn to validate MTU and stop the connection process earlier if MTU is invalid. Also, add a missing validation in read_buffer_size() and make it return an error value if the validation fails. Now hci_conn_add() returns ERR_PTR() as it can fail due to the both a kzalloc failure and invalid MTU value. divide error: 0000 [jwrdegoede#1] PREEMPT SMP KASAN NOPTI CPU: 0 PID: 67 Comm: kworker/u5:0 Tainted: G W 6.9.0-rc5+ linux-sunxi#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Workqueue: hci0 hci_rx_work RIP: 0010:l2cap_le_flowctl_init+0x19e/0x3f0 net/bluetooth/l2cap_core.c:547 Code: e8 17 17 0c 00 66 41 89 9f 84 00 00 00 bf 01 00 00 00 41 b8 02 00 00 00 4c 89 fe 4c 89 e2 89 d9 e8 27 17 0c 00 44 89 f0 31 d2 <66> f7 f3 89 c3 ff c3 4d 8d b7 88 00 00 00 4c 89 f0 48 c1 e8 03 42 RSP: 0018:ffff88810bc0f858 EFLAGS: 00010246 RAX: 00000000000002a0 RBX: 0000000000000000 RCX: dffffc0000000000 RDX: 0000000000000000 RSI: ffff88810bc0f7c0 RDI: ffffc90002dcb66f RBP: ffff88810bc0f880 R08: aa69db2dda70ff01 R09: 0000ffaaaaaaaaaa R10: 0084000000ffaaaa R11: 0000000000000000 R12: ffff88810d65a084 R13: dffffc0000000000 R14: 00000000000002a0 R15: ffff88810d65a000 FS: 0000000000000000(0000) GS:ffff88811ac00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000100 CR3: 0000000103268003 CR4: 0000000000770ef0 PKRU: 55555554 Call Trace: <TASK> l2cap_le_connect_req net/bluetooth/l2cap_core.c:4902 [inline] l2cap_le_sig_cmd net/bluetooth/l2cap_core.c:5420 [inline] l2cap_le_sig_channel net/bluetooth/l2cap_core.c:5486 [inline] l2cap_recv_frame+0xe59d/0x11710 net/bluetooth/l2cap_core.c:6809 l2cap_recv_acldata+0x544/0x10a0 net/bluetooth/l2cap_core.c:7506 hci_acldata_packet net/bluetooth/hci_core.c:3939 [inline] hci_rx_work+0x5e5/0xb20 net/bluetooth/hci_core.c:4176 process_one_work kernel/workqueue.c:3254 [inline] process_scheduled_works+0x90f/0x1530 kernel/workqueue.c:3335 worker_thread+0x926/0xe70 kernel/workqueue.c:3416 kthread+0x2e3/0x380 kernel/kthread.c:388 ret_from_fork+0x5c/0x90 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Modules linked in: ---[ end trace 0000000000000000 ]--- Fixes: 6ed58ec ("Bluetooth: Use LE buffers for LE traffic") Suggested-by: Luiz Augusto von Dentz <luiz.dentz@gmail.com> Signed-off-by: Sungwoo Kim <iam@sung-woo.kim> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

amery closed this as completed May 22, 2012

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cp: cannot stat `drivers/net/wireless/bcm4330/firmware/mw269v3_fw.bin': No such file or directory #20

cp: cannot stat `drivers/net/wireless/bcm4330/firmware/mw269v3_fw.bin': No such file or directory #20

huaixiaoz commented May 19, 2012

amery commented May 19, 2012

cp: cannot stat `drivers/net/wireless/bcm4330/firmware/mw269v3_fw.bin': No such file or directory #20

cp: cannot stat `drivers/net/wireless/bcm4330/firmware/mw269v3_fw.bin': No such file or directory #20

Comments

huaixiaoz commented May 19, 2012

amery commented May 19, 2012