forked from msm8953-mainline/linux
-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sync with msm8953-mainline. #1
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This is just a stub for now. Later this would be used for features like call recording.
q6voice combines the 3 q6voice-related services (q6mvm, q6cvp, q6cvs) and allows to start/end voice calls at the moment.
The Q6 Voice DAI driver exposes the Q6 Voice subsystem to ASoC. At the moment it provides a single CS-Voice DAI with a hostless FE. Eventually usage should be simplified using a codec2codec link.
Kiciuk
pushed a commit
that referenced
this pull request
Sep 27, 2020
Unable to handle kernel write to read-only memory at virtual address ffff8000111240e8 Mem abort info: ESR = 0x9600004e EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x0000004e CM = 0, WnR = 1 swapper pgtable: 4k pages, 48-bit VAs, pgdp=000000004125b000 [ffff8000111240e8] pgd=00000000fbfff003, p4d=00000000fbfff003, pud=00000000fbffe003, pmd=0060000041000791 Internal error: Oops: 9600004e [#1] SMP Modules linked in: sm5708_fuelgauge(+) sg scsi_mod g_ether CPU: 0 PID: 251 Comm: kworker/0:3 Not tainted 5.8.0-rc7-ARCH_QCOM+ #1636 Hardware name: Samsung A6-Plus LTE Rev.4 (DT) Workqueue: events deferred_probe_work_func pstate: 80000005 (Nzcv daif -PAN -UAO BTYPE=--) pc : __memcpy+0x94/0x180 lr : vsnprintf+0x294/0x730 sp : ffff800011f1b970 x29: ffff800011f1b970 x28: ffff800011123fb8 x27: ffff8000111240e8 x26: 0000000000000020 x25: ffff800011123fbd x24: 0000000000000012 x23: 00000000ffffffd8 x22: ffff800010d92f50 x21: ffff800011f1baa0 x20: ffff8000111240e8 x19: ffff8000111240fa x18: 0000000000000020 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000004 x14: ffff000052154810 x13: 0000000000000000 x12: 0000000000000020 x11: 0000000000000030 x10: 0101010101010101 x9 : ffff8000109bf950 x8 : ffff0000b13c4f80 x7 : 0000000000000000 x6 : ffff8000111240e8 x5 : 00000000ffffffd8 x4 : ffff0000b6cf3e80 x3 : 0000000065657073 x2 : 0000000000000005 x1 : ffff800011123fbc x0 : ffff8000111240e8 Call trace: __memcpy+0x94/0x180 snprintf+0x58/0x7c qcom_cpufreq_generic_name_version+0x88/0x104 qcom_cpufreq_probe+0xd0/0x490 platform_drv_probe+0x58/0xac really_probe+0xe4/0x490 driver_probe_device+0xd8/0xf0 __device_attach_driver+0x90/0x110 bus_for_each_drv+0x7c/0xcc __device_attach+0x104/0x19c device_initial_probe+0x18/0x20 bus_probe_device+0x98/0xa0 deferred_probe_work_func+0xa0/0xf0 process_one_work+0x1cc/0x464 worker_thread+0x178/0x3f0 kthread+0x114/0x120 ret_from_fork+0x10/0x18 Code: f8408423 f80084c3 36100062 b8404423 (b80044c3) ---[ end trace 8bb6277c7185430c ]---
Kiciuk
pushed a commit
that referenced
this pull request
Feb 15, 2021
crypto: Fix divide error in do_xor_speed() From: Kirill Tkhai <ktkhai@virtuozzo.com> Latest (but not only latest) linux-next panics with divide error on my QEMU setup. The patch at the bottom of this message fixes the problem. xor: measuring software checksum speed divide error: 0000 [#1] PREEMPT SMP KASAN PREEMPT SMP KASAN CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.10.0-next-20201223+ #2177 RIP: 0010:do_xor_speed+0xbb/0xf3 Code: 41 ff cc 75 b5 bf 01 00 00 00 e8 3d 23 8b fe 65 8b 05 f6 49 83 7d 85 c0 75 05 e8 84 70 81 fe b8 00 00 50 c3 31 d2 48 8d 7b 10 <f7> f5 41 89 c4 e8 58 07 a2 fe 44 89 63 10 48 8d 7b 08 e8 cb 07 a2 RSP: 0000:ffff888100137dc8 EFLAGS: 00010246 RAX: 00000000c3500000 RBX: ffffffff823f0160 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000808 RDI: ffffffff823f0170 RBP: 0000000000000000 R08: ffffffff8109c50f R09: ffffffff824bb6f7 R10: fffffbfff04976de R11: 0000000000000001 R12: 0000000000000000 R13: ffff888101997000 R14: ffff888101994000 R15: ffffffff823f0178 FS: 0000000000000000(0000) GS:ffff8881f7780000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000000220e000 CR4: 00000000000006a0 Call Trace: calibrate_xor_blocks+0x13c/0x1c4 ? do_xor_speed+0xf3/0xf3 do_one_initcall+0xc1/0x1b7 ? start_kernel+0x373/0x373 ? unpoison_range+0x3a/0x60 kernel_init_freeable+0x1dd/0x238 ? rest_init+0xc6/0xc6 kernel_init+0x8/0x10a ret_from_fork+0x1f/0x30 ---[ end trace 5bd3c1d0b77772da ]--- Fixes: c055e3e ("crypto: xor - use ktime for template benchmarking") Cc: <stable@vger.kernel.org> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Kiciuk
pushed a commit
that referenced
this pull request
Feb 15, 2021
Commit a351290 ("scsi: target: tcmu: Use priv pointer in se_cmd") modified tcmu_free_cmd() to set NULL to priv pointer in se_cmd. However, se_cmd can be already freed by work queue triggered in target_complete_cmd(). This caused BUG KASAN use-after-free [1]. To fix the bug, do not touch priv pointer in tcmu_free_cmd(). Instead, set NULL to priv pointer before target_complete_cmd() calls. Also, to avoid unnecessary priv pointer change in tcmu_queue_cmd(), modify priv pointer in the function only when tcmu_free_cmd() is not called. [1] BUG: KASAN: use-after-free in tcmu_handle_completions+0x1172/0x1770 [target_core_user] Write of size 8 at addr ffff88814cf79a40 by task cmdproc-uio0/14842 CPU: 2 PID: 14842 Comm: cmdproc-uio0 Not tainted 5.11.0-rc2 #1 Hardware name: Supermicro Super Server/X10SRL-F, BIOS 3.2 11/22/2019 Call Trace: dump_stack+0x9a/0xcc ? tcmu_handle_completions+0x1172/0x1770 [target_core_user] print_address_description.constprop.0+0x18/0x130 ? tcmu_handle_completions+0x1172/0x1770 [target_core_user] ? tcmu_handle_completions+0x1172/0x1770 [target_core_user] kasan_report.cold+0x7f/0x10e ? tcmu_handle_completions+0x1172/0x1770 [target_core_user] tcmu_handle_completions+0x1172/0x1770 [target_core_user] ? queue_tmr_ring+0x5d0/0x5d0 [target_core_user] tcmu_irqcontrol+0x28/0x60 [target_core_user] uio_write+0x155/0x230 ? uio_vma_fault+0x460/0x460 ? security_file_permission+0x4f/0x440 vfs_write+0x1ce/0x860 ksys_write+0xe9/0x1b0 ? __ia32_sys_read+0xb0/0xb0 ? syscall_enter_from_user_mode+0x27/0x70 ? trace_hardirqs_on+0x1c/0x110 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fcf8b61905f Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 b9 fc ff ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 0c fd ff ff 48 RSP: 002b:00007fcf7b3e6c30 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fcf8b61905f RDX: 0000000000000004 RSI: 00007fcf7b3e6c78 RDI: 000000000000000c RBP: 00007fcf7b3e6c80 R08: 0000000000000000 R09: 00007fcf7b3e6aa8 R10: 000000000b01c000 R11: 0000000000000293 R12: 00007ffe0c32a52e R13: 00007ffe0c32a52f R14: 0000000000000000 R15: 00007fcf7b3e7640 Allocated by task 383: kasan_save_stack+0x1b/0x40 ____kasan_kmalloc.constprop.0+0x84/0xa0 kmem_cache_alloc+0x142/0x330 tcm_loop_queuecommand+0x2a/0x4e0 [tcm_loop] scsi_queue_rq+0x12ec/0x2d20 blk_mq_dispatch_rq_list+0x30a/0x1db0 __blk_mq_do_dispatch_sched+0x326/0x830 __blk_mq_sched_dispatch_requests+0x2c8/0x3f0 blk_mq_sched_dispatch_requests+0xca/0x120 __blk_mq_run_hw_queue+0x93/0xe0 process_one_work+0x7b6/0x1290 worker_thread+0x590/0xf80 kthread+0x362/0x430 ret_from_fork+0x22/0x30 Freed by task 11655: kasan_save_stack+0x1b/0x40 kasan_set_track+0x1c/0x30 kasan_set_free_info+0x20/0x30 ____kasan_slab_free+0xec/0x120 slab_free_freelist_hook+0x53/0x160 kmem_cache_free+0xf4/0x5c0 target_release_cmd_kref+0x3ea/0x9e0 [target_core_mod] transport_generic_free_cmd+0x28b/0x2f0 [target_core_mod] target_complete_ok_work+0x250/0xac0 [target_core_mod] process_one_work+0x7b6/0x1290 worker_thread+0x590/0xf80 kthread+0x362/0x430 ret_from_fork+0x22/0x30 Last potentially related work creation: kasan_save_stack+0x1b/0x40 kasan_record_aux_stack+0xa3/0xb0 insert_work+0x48/0x2e0 __queue_work+0x4e8/0xdf0 queue_work_on+0x78/0x80 tcmu_handle_completions+0xad0/0x1770 [target_core_user] tcmu_irqcontrol+0x28/0x60 [target_core_user] uio_write+0x155/0x230 vfs_write+0x1ce/0x860 ksys_write+0xe9/0x1b0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Second to last potentially related work creation: kasan_save_stack+0x1b/0x40 kasan_record_aux_stack+0xa3/0xb0 insert_work+0x48/0x2e0 __queue_work+0x4e8/0xdf0 queue_work_on+0x78/0x80 tcm_loop_queuecommand+0x1c3/0x4e0 [tcm_loop] scsi_queue_rq+0x12ec/0x2d20 blk_mq_dispatch_rq_list+0x30a/0x1db0 __blk_mq_do_dispatch_sched+0x326/0x830 __blk_mq_sched_dispatch_requests+0x2c8/0x3f0 blk_mq_sched_dispatch_requests+0xca/0x120 __blk_mq_run_hw_queue+0x93/0xe0 process_one_work+0x7b6/0x1290 worker_thread+0x590/0xf80 kthread+0x362/0x430 ret_from_fork+0x22/0x30 The buggy address belongs to the object at ffff88814cf79800 which belongs to the cache tcm_loop_cmd_cache of size 896. Link: https://lore.kernel.org/r/20210113024508.1264992-1-shinichiro.kawasaki@wdc.com Fixes: a351290 ("scsi: target: tcmu: Use priv pointer in se_cmd") Cc: stable@vger.kernel.org # v5.9+ Acked-by: Bodo Stroesser <bostroesser@gmail.com> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Kiciuk
pushed a commit
that referenced
this pull request
Feb 15, 2021
While testing the error paths of relocation I hit the following lockdep splat: ====================================================== WARNING: possible circular locking dependency detected 5.10.0-rc6+ msm8916-mainline#217 Not tainted ------------------------------------------------------ mount/779 is trying to acquire lock: ffffa0e676945418 (&fs_info->balance_mutex){+.+.}-{3:3}, at: btrfs_recover_balance+0x2f0/0x340 but task is already holding lock: ffffa0e60ee31da8 (btrfs-root-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x27/0x100 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (btrfs-root-00){++++}-{3:3}: down_read_nested+0x43/0x130 __btrfs_tree_read_lock+0x27/0x100 btrfs_read_lock_root_node+0x31/0x40 btrfs_search_slot+0x462/0x8f0 btrfs_update_root+0x55/0x2b0 btrfs_drop_snapshot+0x398/0x750 clean_dirty_subvols+0xdf/0x120 btrfs_recover_relocation+0x534/0x5a0 btrfs_start_pre_rw_mount+0xcb/0x170 open_ctree+0x151f/0x1726 btrfs_mount_root.cold+0x12/0xea legacy_get_tree+0x30/0x50 vfs_get_tree+0x28/0xc0 vfs_kern_mount.part.0+0x71/0xb0 btrfs_mount+0x10d/0x380 legacy_get_tree+0x30/0x50 vfs_get_tree+0x28/0xc0 path_mount+0x433/0xc10 __x64_sys_mount+0xe3/0x120 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 -> #1 (sb_internal#2){.+.+}-{0:0}: start_transaction+0x444/0x700 insert_balance_item.isra.0+0x37/0x320 btrfs_balance+0x354/0xf40 btrfs_ioctl_balance+0x2cf/0x380 __x64_sys_ioctl+0x83/0xb0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 -> #0 (&fs_info->balance_mutex){+.+.}-{3:3}: __lock_acquire+0x1120/0x1e10 lock_acquire+0x116/0x370 __mutex_lock+0x7e/0x7b0 btrfs_recover_balance+0x2f0/0x340 open_ctree+0x1095/0x1726 btrfs_mount_root.cold+0x12/0xea legacy_get_tree+0x30/0x50 vfs_get_tree+0x28/0xc0 vfs_kern_mount.part.0+0x71/0xb0 btrfs_mount+0x10d/0x380 legacy_get_tree+0x30/0x50 vfs_get_tree+0x28/0xc0 path_mount+0x433/0xc10 __x64_sys_mount+0xe3/0x120 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 other info that might help us debug this: Chain exists of: &fs_info->balance_mutex --> sb_internal#2 --> btrfs-root-00 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(btrfs-root-00); lock(sb_internal#2); lock(btrfs-root-00); lock(&fs_info->balance_mutex); *** DEADLOCK *** 2 locks held by mount/779: #0: ffffa0e60dc040e0 (&type->s_umount_key#47/1){+.+.}-{3:3}, at: alloc_super+0xb5/0x380 #1: ffffa0e60ee31da8 (btrfs-root-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x27/0x100 stack backtrace: CPU: 0 PID: 779 Comm: mount Not tainted 5.10.0-rc6+ msm8916-mainline#217 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 Call Trace: dump_stack+0x8b/0xb0 check_noncircular+0xcf/0xf0 ? trace_call_bpf+0x139/0x260 __lock_acquire+0x1120/0x1e10 lock_acquire+0x116/0x370 ? btrfs_recover_balance+0x2f0/0x340 __mutex_lock+0x7e/0x7b0 ? btrfs_recover_balance+0x2f0/0x340 ? btrfs_recover_balance+0x2f0/0x340 ? rcu_read_lock_sched_held+0x3f/0x80 ? kmem_cache_alloc_trace+0x2c4/0x2f0 ? btrfs_get_64+0x5e/0x100 btrfs_recover_balance+0x2f0/0x340 open_ctree+0x1095/0x1726 btrfs_mount_root.cold+0x12/0xea ? rcu_read_lock_sched_held+0x3f/0x80 legacy_get_tree+0x30/0x50 vfs_get_tree+0x28/0xc0 vfs_kern_mount.part.0+0x71/0xb0 btrfs_mount+0x10d/0x380 ? __kmalloc_track_caller+0x2f2/0x320 legacy_get_tree+0x30/0x50 vfs_get_tree+0x28/0xc0 ? capable+0x3a/0x60 path_mount+0x433/0xc10 __x64_sys_mount+0xe3/0x120 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 This is straightforward to fix, simply release the path before we setup the balance_ctl. CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Kiciuk
pushed a commit
that referenced
this pull request
Feb 15, 2021
In Linux, if a driver does disable_irq() and later does enable_irq() on its interrupt, I believe it's expecting these properties: * If an interrupt was pending when the driver disabled then it will still be pending after the driver re-enables. * If an edge-triggered interrupt comes in while an interrupt is disabled it should assert when the interrupt is re-enabled. If you think that the above sounds a lot like the disable_irq() and enable_irq() are supposed to be masking/unmasking the interrupt instead of disabling/enabling it then you've made an astute observation. Specifically when talking about interrupts, "mask" usually means to stop posting interrupts but keep tracking them and "disable" means to fully shut off interrupt detection. It's unfortunate that this is so confusing, but presumably this is all the way it is for historical reasons. Perhaps more confusing than the above is that, even though clients of IRQs themselves don't have a way to request mask/unmask vs. disable/enable calls, IRQ chips themselves can implement both. ...and yet more confusing is that if an IRQ chip implements disable/enable then they will be called when a client driver calls disable_irq() / enable_irq(). It does feel like some of the above could be cleared up. However, without any other core interrupt changes it should be clear that when an IRQ chip gets a request to "disable" an IRQ that it has to treat it like a mask of that IRQ. In any case, after that long interlude you can see that the "unmask and clear" can break things. Maulik tried to fix it so that we no longer did "unmask and clear" in commit 71266d9 ("pinctrl: qcom: Move clearing pending IRQ to .irq_request_resources callback"), but it only handled the PDC case and it had problems (it caused sc7180-trogdor devices to fail to suspend). Let's fix. >From my understanding the source of the phantom interrupt in the were these two things: 1. One that could have been introduced in msm_gpio_irq_set_type() (only for the non-PDC case). 2. Edges could have been detected when a GPIO was muxed away. Fixing case #1 is easy. We can just add a clear in msm_gpio_irq_set_type(). Fixing case #2 is harder. Let's use a concrete example. In sc7180-trogdor.dtsi we configure the uart3 to have two pinctrl states, sleep and default, and mux between the two during runtime PM and system suspend (see geni_se_resources_{on,off}() for more details). The difference between the sleep and default state is that the RX pin is muxed to a GPIO during sleep and muxed to the UART otherwise. As per Qualcomm, when we mux the pin over to the UART function the PDC (or the non-PDC interrupt detection logic) is still watching it / latching edges. These edges don't cause interrupts because the current code masks the interrupt unless we're entering suspend. However, as soon as we enter suspend we unmask the interrupt and it's counted as a wakeup. Let's deal with the problem like this: * When we mux away, we'll mask our interrupt. This isn't necessary in the above case since the client already masked us, but it's a good idea in general. * When we mux back will clear any interrupts and unmask. Fixes: 4b7618f ("pinctrl: qcom: Add irq_enable callback for msm gpio") Fixes: 71266d9 ("pinctrl: qcom: Move clearing pending IRQ to .irq_request_resources callback") Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Maulik Shah <mkshah@codeaurora.org> Tested-by: Maulik Shah <mkshah@codeaurora.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Link: https://lore.kernel.org/r/20210114191601.v7.4.I7cf3019783720feb57b958c95c2b684940264cd1@changeid Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Kiciuk
pushed a commit
that referenced
this pull request
Feb 15, 2021
Wolfram reports that his R-Car H2-based Lager board can no longer be rebooted in v5.11-rc1, as it crashes with an imprecise external abort. The issue can be reproduced on other boards (e.g. Koelsch with R-Car M2-W) too, if CONFIG_IP_PNP is disabled, and the Ethernet interface is down at reboot time: Unhandled fault: imprecise external abort (0x1406) at 0x00000000 pgd = (ptrval) [00000000] *pgd=422b6835, *pte=00000000, *ppte=00000000 Internal error: : 1406 [#1] ARM Modules linked in: CPU: 0 PID: 1105 Comm: init Tainted: G W 5.10.0-rc1-00402-ge2f016cf7751 #1048 Hardware name: Generic R-Car Gen2 (Flattened Device Tree) PC is at sh_mdio_ctrl+0x44/0x60 LR is at sh_mmd_ctrl+0x20/0x24 ... Backtrace: [<c0451f30>] (sh_mdio_ctrl) from [<c0451fd4>] (sh_mmd_ctrl+0x20/0x24) r7:0000001f r6:00000020 r5:00000002 r4:c22a1dc4 [<c0451fb4>] (sh_mmd_ctrl) from [<c044fc18>] (mdiobb_cmd+0x38/0xa8) [<c044fbe0>] (mdiobb_cmd) from [<c044feb8>] (mdiobb_read+0x58/0xdc) r9:c229f844 r8:c0c329dc r7:c221e000 r6:00000001 r5:c22a1dc4 r4:00000001 [<c044fe60>] (mdiobb_read) from [<c044c854>] (__mdiobus_read+0x74/0xe0) r7:0000001f r6:00000001 r5:c221e000 r4:c221e000 [<c044c7e0>] (__mdiobus_read) from [<c044c9d8>] (mdiobus_read+0x40/0x54) r7:0000001f r6:00000001 r5:c221e000 r4:c221e458 [<c044c998>] (mdiobus_read) from [<c044d678>] (phy_read+0x1c/0x20) r7:ffffe000 r6:c221e470 r5:00000200 r4:c229f800 [<c044d65c>] (phy_read) from [<c044d94c>] (kszphy_config_intr+0x44/0x80) [<c044d908>] (kszphy_config_intr) from [<c044694c>] (phy_disable_interrupts+0x44/0x50) r5:c229f800 r4:c229f800 [<c0446908>] (phy_disable_interrupts) from [<c0449370>] (phy_shutdown+0x18/0x1c) r5:c229f800 r4:c229f804 [<c0449358>] (phy_shutdown) from [<c040066c>] (device_shutdown+0x168/0x1f8) [<c0400504>] (device_shutdown) from [<c013de44>] (kernel_restart_prepare+0x3c/0x48) r9:c22d2000 r8:c0100264 r7:c0b0d034 r6:00000000 r5:4321fedc r4:00000000 [<c013de08>] (kernel_restart_prepare) from [<c013dee0>] (kernel_restart+0x1c/0x60) [<c013dec4>] (kernel_restart) from [<c013e1d8>] (__do_sys_reboot+0x168/0x208) r5:4321fedc r4:01234567 [<c013e070>] (__do_sys_reboot) from [<c013e2e8>] (sys_reboot+0x18/0x1c) r7:00000058 r6:00000000 r5:00000000 r4:00000000 [<c013e2d0>] (sys_reboot) from [<c0100060>] (ret_fast_syscall+0x0/0x54) As of commit e2f016c ("net: phy: add a shutdown procedure"), system reboot calls phy_disable_interrupts() during shutdown. As this happens unconditionally, the PHY registers may be accessed while the device is suspended, causing undefined behavior, which may crash the system. Fix this by wrapping the PHY bitbang accessors in the sh_eth driver by wrappers that take care of Runtime PM, to resume the device when needed. Reported-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kiciuk
pushed a commit
that referenced
this pull request
Feb 15, 2021
might_sleep() is a debugging aid and triggers rescheduling only for certain kernel configurations. Replace with an explicit check and reschedule to work for all kernel configurations. Fixes the following trace: [ 572.945146] rcu: INFO: rcu_sched self-detected stall on CPU [ 572.949275] rcu: 0-....: (2099 ticks this GP) idle=572/1/0x40000002 softirq=7412/7412 fqs=974 [ 572.957964] (t=2100 jiffies g=10393 q=21) [ 572.962054] NMI backtrace for cpu 0 [ 572.965540] CPU: 0 PID: 165 Comm: xtest Not tainted 5.8.7 #1 [ 572.971188] Hardware name: STM32 (Device Tree Support) [ 572.976354] [<c011163c>] (unwind_backtrace) from [<c010b7f8>] (show_stack+0x10/0x14) [ 572.984080] [<c010b7f8>] (show_stack) from [<c0511e4c>] (dump_stack+0xc4/0xd8) [ 572.991300] [<c0511e4c>] (dump_stack) from [<c0519abc>] (nmi_cpu_backtrace+0x90/0xc4) [ 572.999130] [<c0519abc>] (nmi_cpu_backtrace) from [<c0519bdc>] (nmi_trigger_cpumask_backtrace+0xec/0x130) [ 573.008706] [<c0519bdc>] (nmi_trigger_cpumask_backtrace) from [<c01a5184>] (rcu_dump_cpu_stacks+0xe8/0x110) [ 573.018453] [<c01a5184>] (rcu_dump_cpu_stacks) from [<c01a4234>] (rcu_sched_clock_irq+0x7fc/0xa88) [ 573.027416] [<c01a4234>] (rcu_sched_clock_irq) from [<c01acdd0>] (update_process_times+0x30/0x8c) [ 573.036291] [<c01acdd0>] (update_process_times) from [<c01bfb90>] (tick_sched_timer+0x4c/0xa8) [ 573.044905] [<c01bfb90>] (tick_sched_timer) from [<c01adcc8>] (__hrtimer_run_queues+0x174/0x358) [ 573.053696] [<c01adcc8>] (__hrtimer_run_queues) from [<c01aea2c>] (hrtimer_interrupt+0x118/0x2bc) [ 573.062573] [<c01aea2c>] (hrtimer_interrupt) from [<c09ad664>] (arch_timer_handler_virt+0x28/0x30) [ 573.071536] [<c09ad664>] (arch_timer_handler_virt) from [<c0190f50>] (handle_percpu_devid_irq+0x8c/0x240) [ 573.081109] [<c0190f50>] (handle_percpu_devid_irq) from [<c018ab8c>] (generic_handle_irq+0x34/0x44) [ 573.090156] [<c018ab8c>] (generic_handle_irq) from [<c018b194>] (__handle_domain_irq+0x5c/0xb0) [ 573.098857] [<c018b194>] (__handle_domain_irq) from [<c052ac50>] (gic_handle_irq+0x4c/0x90) [ 573.107209] [<c052ac50>] (gic_handle_irq) from [<c0100b0c>] (__irq_svc+0x6c/0x90) [ 573.114682] Exception stack(0xd90dfcf8 to 0xd90dfd40) [ 573.119732] fce0: ffff0004 00000000 [ 573.127917] fd00: 00000000 00000000 00000000 00000000 00000000 00000000 d93493cc ffff0000 [ 573.136098] fd20: d2bc39c0 be926998 d90dfd58 d90dfd48 c09f3384 c01151f0 400d0013 ffffffff [ 573.144281] [<c0100b0c>] (__irq_svc) from [<c01151f0>] (__arm_smccc_smc+0x10/0x20) [ 573.151854] [<c01151f0>] (__arm_smccc_smc) from [<c09f3384>] (optee_smccc_smc+0x3c/0x44) [ 573.159948] [<c09f3384>] (optee_smccc_smc) from [<c09f4170>] (optee_do_call_with_arg+0xb8/0x154) [ 573.168735] [<c09f4170>] (optee_do_call_with_arg) from [<c09f4638>] (optee_invoke_func+0x110/0x190) [ 573.177786] [<c09f4638>] (optee_invoke_func) from [<c09f1ebc>] (tee_ioctl+0x10b8/0x11c0) [ 573.185879] [<c09f1ebc>] (tee_ioctl) from [<c029f62c>] (ksys_ioctl+0xe0/0xa4c) [ 573.193101] [<c029f62c>] (ksys_ioctl) from [<c0100060>] (ret_fast_syscall+0x0/0x54) [ 573.200750] Exception stack(0xd90dffa8 to 0xd90dfff0) [ 573.205803] ffa0: be926bf4 be926a78 00000003 8010a403 be926908 004e3cf8 [ 573.213987] ffc0: be926bf4 be926a78 00000000 00000036 be926908 be926918 be9269b0 bffdf0f8 [ 573.222162] ffe0: b6d76fb0 be9268fc b6d66621 b6c7e0d8 seen on STM32 DK2 with CONFIG_PREEMPT_NONE. Fixes: 9f02b8f ("tee: optee: add might_sleep for RPC requests") Signed-off-by: Rouven Czerwinski <r.czerwinski@pengutronix.de> Tested-by: Sumit Garg <sumit.garg@linaro.org> [jw: added fixes tag + small adjustments in the code] Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Kiciuk
pushed a commit
that referenced
this pull request
Feb 15, 2021
Suspending/resuming with an HDMI dongle attached leads to crashes from an audio regmap. Unable to handle kernel paging request at virtual address ffffffc018068000 Mem abort info: ESR = 0x96000047 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000047 CM = 0, WnR = 1 swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000081b12000 [ffffffc018068000] pgd=0000000275d14003, pud=0000000275d14003, pmd=000000026365d003, pte=0000000000000000 Internal error: Oops: 96000047 [#1] PREEMPT SMP Call trace: regmap_mmio_write32le+0x2c/0x40 regmap_mmio_write+0x48/0x6c _regmap_bus_reg_write+0x34/0x44 _regmap_write+0x100/0x150 regcache_default_sync+0xc0/0x138 regcache_sync+0x188/0x26c lpass_platform_pcmops_resume+0x48/0x54 [snd_soc_lpass_platform] snd_soc_component_resume+0x28/0x40 soc_resume_deferred+0x6c/0x178 process_one_work+0x208/0x3c8 worker_thread+0x23c/0x3e8 kthread+0x144/0x178 ret_from_fork+0x10/0x18 Code: d503201f d50332bf f94002a8 8b344108 (b9000113) I can reliably reproduce this problem by running 'tail' on the registers file in debugfs for the hdmi regmap. # tail /sys/kernel/debug/regmap/62d87000.lpass-lpass_hdmi/registers [ 84.658733] Unable to handle kernel paging request at virtual address ffffffd0128e800c This crash happens because we're trying to read registers from the regmap beyond the length of the mapping created by ioremap(). The number of hdmi_rdma_channels determines the size of the regmap via this code in sound/soc/qcom/lpass-cpu.c: lpass_hdmi_regmap_config.max_register = LPAIF_HDMI_RDMAPER_REG(variant, variant->hdmi_rdma_channels); According to debugfs the size of the regmap is 0x68010 but according to the DTS file posted in [1] the size is only 0x68000 (see the first reg property of the lpass_cpu node). Let's change the number of channels to be 3 instead of 4 so the math works out to have a max register of 0x67010, nicely fitting inside of the region size of 0x68000. Note: I tried to bump up the size of the register region to the next page to include the 0x68010 register but then the tail command caused SErrors with an async abort, implying that the register region doesn't exist or it isn't clocked because the bus is telling us that the register read failed. I reduce the number of channels and played audio through the HDMI channel and it kept working so I think this is correct. Fixes: 2ad63dc ("ASoC: qcom: sc7180: Add support for audio over DP") Link: https://lore.kernel.org/r/1601448168-18396-2-git-send-email-srivasam@codeaurora.org [1] Cc: V Sujith Kumar Reddy <vsujithk@codeaurora.org> Cc: Srinivasa Rao <srivasam@codeaurora.org> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Cc: Cheng-Yi Chiang <cychiang@chromium.org> Signed-off-by: Stephen Boyd <swboyd@chromium.org> Link: https://lore.kernel.org/r/20210115203329.846824-1-swboyd@chromium.org Signed-off-by: Mark Brown <broonie@kernel.org>
Kiciuk
pushed a commit
that referenced
this pull request
Feb 15, 2021
If dobj->control is not initialized we end up in an OOPs during skl_tplg_complete: [ 26.553358] BUG: kernel NULL pointer dereference, address: 0000000000000078 [ 26.561151] #PF: supervisor read access in kernel mode [ 26.566897] #PF: error_code(0x0000) - not-present page [ 26.572642] PGD 0 P4D 0 [ 26.575479] Oops: 0000 [#1] PREEMPT SMP PTI [ 26.580158] CPU: 2 PID: 2082 Comm: udevd Tainted: G C 5.4.81 #4 [ 26.588232] Hardware name: HP Soraka/Soraka, BIOS Google_Soraka.10431.106.0 12/03/2019 [ 26.597082] RIP: 0010:skl_tplg_complete+0x70/0x144 [snd_soc_skl] Fixes: 2d744ec ("ASoC: Intel: Skylake: Automatic DMIC format configuration according to information from NHL") Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com> Tested-by: Lukasz Majczak <lma@semihalf.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20210121171644.131059-1-ribalda@chromium.org Signed-off-by: Mark Brown <broonie@kernel.org>
Kiciuk
pushed a commit
that referenced
this pull request
Feb 15, 2021
I was hitting the below panic continuously when attaching kprobes to scheduler functions [ 159.045212] Unexpected kernel BRK exception at EL1 [ 159.053753] Internal error: BRK handler: f2000006 [#1] PREEMPT SMP [ 159.059954] Modules linked in: [ 159.063025] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.11.0-rc4-00008-g1e2a199f6ccd msm8953-mainline#56 [rt-app] <notice> [1] Exiting.[ 159.071166] Hardware name: ARM Juno development board (r2) (DT) [ 159.079689] pstate: 600003c5 (nZCv DAIF -PAN -UAO -TCO BTYPE=--) [ 159.085723] pc : 0xffff80001624501c [ 159.089377] lr : attach_entity_load_avg+0x2ac/0x350 [ 159.094271] sp : ffff80001622b640 [rt-app] <notice> [0] Exiting.[ 159.097591] x29: ffff80001622b640 x28: 0000000000000001 [ 159.105515] x27: 0000000000000049 x26: ffff000800b79980 [ 159.110847] x25: ffff00097ef37840 x24: 0000000000000000 [ 159.116331] x23: 00000024eacec1ec x22: ffff00097ef12b90 [ 159.121663] x21: ffff00097ef37700 x20: ffff800010119170 [rt-app] <notice> [11] Exiting.[ 159.126995] x19: ffff00097ef37840 x18: 000000000000000e [ 159.135003] x17: 0000000000000001 x16: 0000000000000019 [ 159.140335] x15: 0000000000000000 x14: 0000000000000000 [ 159.145666] x13: 0000000000000002 x12: 0000000000000002 [ 159.150996] x11: ffff80001592f9f0 x10: 0000000000000060 [ 159.156327] x9 : ffff8000100f6f9c x8 : be618290de0999a1 [ 159.161659] x7 : ffff80096a4b1000 x6 : 0000000000000000 [ 159.166990] x5 : ffff00097ef37840 x4 : 0000000000000000 [ 159.172321] x3 : ffff000800328948 x2 : 0000000000000000 [ 159.177652] x1 : 0000002507d52fec x0 : ffff00097ef12b90 [ 159.182983] Call trace: [ 159.185433] 0xffff80001624501c [ 159.188581] update_load_avg+0x2d0/0x778 [ 159.192516] enqueue_task_fair+0x134/0xe20 [ 159.196625] enqueue_task+0x4c/0x2c8 [ 159.200211] ttwu_do_activate+0x70/0x138 [ 159.204147] sched_ttwu_pending+0xbc/0x160 [ 159.208253] flush_smp_call_function_queue+0x16c/0x320 [ 159.213408] generic_smp_call_function_single_interrupt+0x1c/0x28 [ 159.219521] ipi_handler+0x1e8/0x3c8 [ 159.223106] handle_percpu_devid_irq+0xd8/0x460 [ 159.227650] generic_handle_irq+0x38/0x50 [ 159.231672] __handle_domain_irq+0x6c/0xc8 [ 159.235781] gic_handle_irq+0xcc/0xf0 [ 159.239452] el1_irq+0xb4/0x180 [ 159.242600] rcu_is_watching+0x28/0x70 [ 159.246359] rcu_read_lock_held_common+0x44/0x88 [ 159.250991] rcu_read_lock_any_held+0x30/0xc0 [ 159.255360] kretprobe_dispatcher+0xc4/0xf0 [ 159.259555] __kretprobe_trampoline_handler+0xc0/0x150 [ 159.264710] trampoline_probe_handler+0x38/0x58 [ 159.269255] kretprobe_trampoline+0x70/0xc4 [ 159.273450] run_rebalance_domains+0x54/0x80 [ 159.277734] __do_softirq+0x164/0x684 [ 159.281406] irq_exit+0x198/0x1b8 [ 159.284731] __handle_domain_irq+0x70/0xc8 [ 159.288840] gic_handle_irq+0xb0/0xf0 [ 159.292510] el1_irq+0xb4/0x180 [ 159.295658] arch_cpu_idle+0x18/0x28 [ 159.299245] default_idle_call+0x9c/0x3e8 [ 159.303265] do_idle+0x25c/0x2a8 [ 159.306502] cpu_startup_entry+0x2c/0x78 [ 159.310436] secondary_start_kernel+0x160/0x198 [ 159.314984] Code: d42000c0 aa1e03e9 d42000c0 aa1e03e9 (d42000c0) After a bit of head scratching and debugging it turned out that it is due to kprobe handler being interrupted by a tick that causes us to go into (I think another) kprobe handler. The culprit was kprobe_breakpoint_ss_handler() returning DBG_HOOK_ERROR which leads to the Unexpected kernel BRK exception. Reverting commit ba090f9 ("arm64: kprobes: Remove redundant kprobe_step_ctx") seemed to fix the problem for me. Further analysis showed that kcb->kprobe_status is set to KPROBE_REENTER when the error occurs. By teaching kprobe_breakpoint_ss_handler() to handle this status I can no longer reproduce the problem. Fixes: ba090f9 ("arm64: kprobes: Remove redundant kprobe_step_ctx") Signed-off-by: Qais Yousef <qais.yousef@arm.com> Acked-by: Will Deacon <will@kernel.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20210122110909.3324607-1-qais.yousef@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Kiciuk
pushed a commit
that referenced
this pull request
Feb 15, 2021
IORING_OP_CLOSE is special in terms of cancelation, since it has an intermediate state where we've removed the file descriptor but hasn't closed the file yet. For that reason, it's currently marked with IO_WQ_WORK_NO_CANCEL to prevent cancelation. This ensures that the op is always run even if canceled, to prevent leaving us with a live file but an fd that is gone. However, with SQPOLL, since a cancel request doesn't carry any resources on behalf of the request being canceled, if we cancel before any of the close op has been run, we can end up with io-wq not having the ->files assigned. This can result in the following oops reported by Joseph: BUG: kernel NULL pointer dereference, address: 00000000000000d8 PGD 800000010b76f067 P4D 800000010b76f067 PUD 10b462067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 1 PID: 1788 Comm: io_uring-sq Not tainted 5.11.0-rc4 #1 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:__lock_acquire+0x19d/0x18c0 Code: 00 00 8b 1d fd 56 dd 08 85 db 0f 85 43 05 00 00 48 c7 c6 98 7b 95 82 48 c7 c7 57 96 93 82 e8 9a bc f5 ff 0f 0b e9 2b 05 00 00 <48> 81 3f c0 ca 67 8a b8 00 00 00 00 41 0f 45 c0 89 04 24 e9 81 fe RSP: 0018:ffffc90001933828 EFLAGS: 00010002 RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000000d8 RBP: 0000000000000246 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: ffff888106e8a140 R15: 00000000000000d8 FS: 0000000000000000(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000d8 CR3: 0000000106efa004 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: lock_acquire+0x31a/0x440 ? close_fd_get_file+0x39/0x160 ? __lock_acquire+0x647/0x18c0 _raw_spin_lock+0x2c/0x40 ? close_fd_get_file+0x39/0x160 close_fd_get_file+0x39/0x160 io_issue_sqe+0x1334/0x14e0 ? lock_acquire+0x31a/0x440 ? __io_free_req+0xcf/0x2e0 ? __io_free_req+0x175/0x2e0 ? find_held_lock+0x28/0xb0 ? io_wq_submit_work+0x7f/0x240 io_wq_submit_work+0x7f/0x240 io_wq_cancel_cb+0x161/0x580 ? io_wqe_wake_worker+0x114/0x360 ? io_uring_get_socket+0x40/0x40 io_async_find_and_cancel+0x3b/0x140 io_issue_sqe+0xbe1/0x14e0 ? __lock_acquire+0x647/0x18c0 ? __io_queue_sqe+0x10b/0x5f0 __io_queue_sqe+0x10b/0x5f0 ? io_req_prep+0xdb/0x1150 ? mark_held_locks+0x6d/0xb0 ? mark_held_locks+0x6d/0xb0 ? io_queue_sqe+0x235/0x4b0 io_queue_sqe+0x235/0x4b0 io_submit_sqes+0xd7e/0x12a0 ? _raw_spin_unlock_irq+0x24/0x30 ? io_sq_thread+0x3ae/0x940 io_sq_thread+0x207/0x940 ? do_wait_intr_irq+0xc0/0xc0 ? __ia32_sys_io_uring_enter+0x650/0x650 kthread+0x134/0x180 ? kthread_create_worker_on_cpu+0x90/0x90 ret_from_fork+0x1f/0x30 Fix this by moving the IO_WQ_WORK_NO_CANCEL until _after_ we've modified the fdtable. Canceling before this point is totally fine, and running it in the io-wq context _after_ that point is also fine. For 5.12, we'll handle this internally and get rid of the no-cancel flag, as IORING_OP_CLOSE is the only user of it. Cc: stable@vger.kernel.org Fixes: b5dba59 ("io_uring: add support for IORING_OP_CLOSE") Reported-by: "Abaci <abaci@linux.alibaba.com>" Reviewed-and-tested-by: Joseph Qi <joseph.qi@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Kiciuk
pushed a commit
that referenced
this pull request
Feb 15, 2021
Fixes failure with 4096x1080 resolutions [ 284.315379] WARNING: CPU: 1 PID: 901 at drivers/gpu/drm/vc4/vc4_plane.c:981 vc4_plane_mode_set+0x1374/0x13c4 [ 284.315385] Modules linked in: ir_rc5_decoder rpivid_hevc(C) bcm2835_codec(C) bcm2835_isp(C) bcm2835_mmal_vchiq(C) bcm2835_gpiomem v4l2_mem2mem videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc cdc_acm xpad ir_rc6_decoder rc_rc6_mce gpio_ir_recv fuse [ 284.315509] CPU: 1 PID: 901 Comm: kodi.bin Tainted: G C 5.10.7 #1 [ 284.315514] Hardware name: BCM2711 [ 284.315518] Backtrace: [ 284.315533] [<c0cc5ca0>] (dump_backtrace) from [<c0cc6014>] (show_stack+0x20/0x24) [ 284.315540] r7:ffffffff r6:00000000 r5:68000013 r4:c18ecf1c [ 284.315549] [<c0cc5ff4>] (show_stack) from [<c0cca638>] (dump_stack+0xc4/0xf0) [ 284.315558] [<c0cca574>] (dump_stack) from [<c022314c>] (__warn+0xfc/0x158) [ 284.315564] r9:00000000 r8:00000009 r7:000003d5 r6:00000009 r5:c08cc7dc r4:c0fd09b8 [ 284.315572] [<c0223050>] (__warn) from [<c0cc67ec>] (warn_slowpath_fmt+0x74/0xe4) [ 284.315577] r7:c08cc7dc r6:000003d5 r5:c0fd09b8 r4:00000000 [ 284.315584] [<c0cc677c>] (warn_slowpath_fmt) from [<c08cc7dc>] (vc4_plane_mode_set+0x1374/0x13c4) [ 284.315589] r8:00000000 r7:00000000 r6:00001000 r5:c404c600 r4:c2e34600 [ 284.315596] [<c08cb468>] (vc4_plane_mode_set) from [<c08cc984>] (vc4_plane_atomic_check+0x40/0x1c0) [ 284.315601] r10:00000001 r9:c2e34600 r8:c0e67068 r7:c0fc44e0 r6:c2ce3640 r5:c3d636c0 [ 284.315605] r4:c2e34600 [ 284.315614] [<c08cc944>] (vc4_plane_atomic_check) from [<c0860504>] (drm_atomic_helper_check_planes+0xec/0x1ec) [ 284.315620] r9:c2e34600 r8:c0e67068 r7:c0fc44e0 r6:c2ce3640 r5:c3d636c0 r4:00000006 [ 284.315627] [<c0860418>] (drm_atomic_helper_check_planes) from [<c0860658>] (drm_atomic_helper_check+0x54/0x9c) [ 284.315633] r9:c2e35400 r8:00000006 r7:00000000 r6:c2ba7800 r5:c3d636c0 r4:00000000 [ 284.315641] [<c0860604>] (drm_atomic_helper_check) from [<c08b7ca8>] (vc4_atomic_check+0x25c/0x454) [ 284.315645] r7:00000000 r6:c2ba7800 r5:00000001 r4:c3d636c0 [ 284.315652] [<c08b7a4c>] (vc4_atomic_check) from [<c0881278>] (drm_atomic_check_only+0x5cc/0x7e0) [ 284.315658] r10:c404c6c8 r9:ffffffff r8:c472c480 r7:00000003 r6:c3d636c0 r5:00000000 [ 284.315662] r4:0000003c r3:c08b7a4c [ 284.315670] [<c0880cac>] (drm_atomic_check_only) from [<c089ba60>] (drm_mode_atomic_ioctl+0x758/0xa7c) [ 284.315675] r10:c3d46000 r9:c3d636c0 r8:c2ce8a70 r7:027e3a54 r6:00000043 r5:c1fbb800 [ 284.315679] r4:0281a858 [ 284.315688] [<c089b308>] (drm_mode_atomic_ioctl) from [<c086e9f8>] (drm_ioctl_kernel+0xc4/0x108) [ 284.315693] r10:c03864bc r9:c1fbb800 r8:c3d47e64 r7:c089b308 r6:00000002 r5:c2ba7800 [ 284.315697] r4:00000000 [ 284.315705] [<c086e934>] (drm_ioctl_kernel) from [<c086ee28>] (drm_ioctl+0x1e8/0x3a0) [ 284.315711] r9:c1fbb800 r8:000000bc r7:c3d47e64 r6:00000038 r5:c0e59570 r4:00000038 [ 284.315719] [<c086ec40>] (drm_ioctl) from [<c041f354>] (sys_ioctl+0x35c/0x914) [ 284.315724] r10:c2d08200 r9:00000000 r8:c36fa300 r7:befdd870 r6:c03864bc r5:c36fa301 [ 284.315728] r4:c03864bc [ 284.315735] [<c041eff8>] (sys_ioctl) from [<c0200040>] (ret_fast_syscall+0x0/0x28) [ 284.315739] Exception stack(0xc3d47fa8 to 0xc3d47ff0) [ 284.315745] 7fa0: 027eb750 befdd870 00000000 c03864bc befdd870 00000000 [ 284.315750] 7fc0: 027eb750 befdd870 c03864bc 00000036 027e3948 0281a64 0281a850 027e3a50 [ 284.315756] 7fe0: b4b64100 befdd844 b4b5ba2c b49c994c [ 284.315762] r10:00000036 r9:c3d46000 r8:c0200204 r7:00000036 r6:c03864bc r5:befdd870 [ 284.315765] r4:027eb750 Fixes: c54619b ("drm/vc4: Add support for the BCM2711 HVS5") Signed-off-by: Dom Cobley <popcornmix@gmail.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com> Tested-By: Lucas Nussbaum <lucas@debian.org> Tested-By: Ryutaroh Matsumoto <ryutaroh@ict.e.titech.ac.jp> Link: https://patchwork.freedesktop.org/patch/msgid/20210121105759.1262699-2-maxime@cerno.tech
Kiciuk
pushed a commit
that referenced
this pull request
Feb 15, 2021
When handling an auth_gss downcall, it's possible to get 0-length opaque object for the acceptor. In the case of a 0-length XDR object, make sure simple_get_netobj() fills in dest->data = NULL, and does not continue to kmemdup() which will set dest->data = ZERO_SIZE_PTR for the acceptor. The trace event code can handle NULL but not ZERO_SIZE_PTR for a string, and so without this patch the rpcgss_context trace event will crash the kernel as follows: [ 162.887992] BUG: kernel NULL pointer dereference, address: 0000000000000010 [ 162.898693] #PF: supervisor read access in kernel mode [ 162.900830] #PF: error_code(0x0000) - not-present page [ 162.902940] PGD 0 P4D 0 [ 162.904027] Oops: 0000 [#1] SMP PTI [ 162.905493] CPU: 4 PID: 4321 Comm: rpc.gssd Kdump: loaded Not tainted 5.10.0 msm8953-mainline#133 [ 162.908548] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 162.910978] RIP: 0010:strlen+0x0/0x20 [ 162.912505] Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11 48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31 [ 162.920101] RSP: 0018:ffffaec900c77d90 EFLAGS: 00010202 [ 162.922263] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000fffde697 [ 162.925158] RDX: 000000000000002f RSI: 0000000000000080 RDI: 0000000000000010 [ 162.928073] RBP: 0000000000000010 R08: 0000000000000e10 R09: 0000000000000000 [ 162.930976] R10: ffff8e698a590cb8 R11: 0000000000000001 R12: 0000000000000e10 [ 162.933883] R13: 00000000fffde697 R14: 000000010034d517 R15: 0000000000070028 [ 162.936777] FS: 00007f1e1eb93700(0000) GS:ffff8e6ab7d00000(0000) knlGS:0000000000000000 [ 162.940067] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 162.942417] CR2: 0000000000000010 CR3: 0000000104eba000 CR4: 00000000000406e0 [ 162.945300] Call Trace: [ 162.946428] trace_event_raw_event_rpcgss_context+0x84/0x140 [auth_rpcgss] [ 162.949308] ? __kmalloc_track_caller+0x35/0x5a0 [ 162.951224] ? gss_pipe_downcall+0x3a3/0x6a0 [auth_rpcgss] [ 162.953484] gss_pipe_downcall+0x585/0x6a0 [auth_rpcgss] [ 162.955953] rpc_pipe_write+0x58/0x70 [sunrpc] [ 162.957849] vfs_write+0xcb/0x2c0 [ 162.959264] ksys_write+0x68/0xe0 [ 162.960706] do_syscall_64+0x33/0x40 [ 162.962238] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 162.964346] RIP: 0033:0x7f1e1f1e57df Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Kiciuk
pushed a commit
that referenced
this pull request
Feb 15, 2021
Currently, if a neighbour isn't valid when offloading tunnel encap rules, we offload the original match and replace the original action with "goto slow path" action. For this we use a temporary flow attribute based on the original flow attribute and then change the action. Flow flags, which among those is the CT flag, are still shared for the slow path rule offload, so we end up parsing this flow as a CT + goto slow path rule. Besides being unnecessary, CT action offload saves extra information in the passed flow attribute, such as created ct_flow and mod_hdr, which is lost onces the temporary flow attribute is freed. When a neigh is updated and is valid, we offload the original CT rule with original CT action, which again creates a ct_flow and mod_hdr and saves it in the flow's original attribute. Then we delete the slow path rule with a temporary flow attribute based on original updated flow attribute, and we free the relevant ct_flow and mod_hdr. Then when tc deletes this flow, we try to free the ct_flow and mod_hdr on the flow's attribute again. To fix the issue, skip all furture proccesing (CT/Sample/Split rules) in offload/unoffload of slow path rules. Call trace: [ 758.850525] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000218 [ 758.952987] Internal error: Oops: 96000005 [#1] PREEMPT SMP [ 758.964170] Modules linked in: act_csum(E) act_pedit(E) act_tunnel_key(E) act_ct(E) nf_flow_table(E) xt_nat(E) ip6table_filter(E) ip6table_nat(E) xt_comment(E) ip6_tables(E) xt_conntrack(E) xt_MASQUERADE(E) nf_conntrack_netlink(E) xt_addrtype(E) iptable_filter(E) iptable_nat(E) bpfilter(E) br_netfilter(E) bridge(E) stp(E) llc(E) xfrm_user(E) overlay(E) act_mirred(E) act_skbedit(E) rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) esp6_offload(E) esp6(E) esp4_offload(E) esp4(E) xfrm_algo(E) mlx5_ib(OE) ib_uverbs(OE) geneve(E) ip6_udp_tunnel(E) udp_tunnel(E) nfnetlink_cttimeout(E) nfnetlink(E) mlx5_core(OE) act_gact(E) cls_flower(E) sch_ingress(E) openvswitch(E) nsh(E) nf_conncount(E) nf_nat(E) mlxfw(OE) psample(E) nf_conntrack(E) nf_defrag_ipv4(E) vfio_mdev(E) mdev(E) ib_core(OE) mlx_compat(OE) crct10dif_ce(E) uio_pdrv_genirq(E) uio(E) i2c_mlx(E) mlxbf_pmc(E) sbsa_gwdt(E) mlxbf_gige(E) gpio_mlxbf2(E) mlxbf_pka(E) mlx_trio(E) mlx_bootctl(E) bluefield_edac(E) knem(O) [ 758.964225] ip_tables(E) mlxbf_tmfifo(E) ipv6(E) crc_ccitt(E) nf_defrag_ipv6(E) [ 759.154186] CPU: 5 PID: 122 Comm: kworker/u16:1 Tainted: G OE 5.4.60-mlnx.52.gde81e85 #1 [ 759.172870] Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS BlueField:3.5.0-2-gc1b5d64 Jan 4 2021 [ 759.195466] Workqueue: mlx5e mlx5e_rep_neigh_update [mlx5_core] [ 759.207344] pstate: a0000005 (NzCv daif -PAN -UAO) [ 759.217003] pc : mlx5_del_flow_rules+0x5c/0x160 [mlx5_core] [ 759.228229] lr : mlx5_del_flow_rules+0x34/0x160 [mlx5_core] [ 759.405858] Call trace: [ 759.410804] mlx5_del_flow_rules+0x5c/0x160 [mlx5_core] [ 759.421337] __mlx5_eswitch_del_rule.isra.43+0x5c/0x1c8 [mlx5_core] [ 759.433963] mlx5_eswitch_del_offloaded_rule_ct+0x34/0x40 [mlx5_core] [ 759.446942] mlx5_tc_rule_delete_ct+0x68/0x74 [mlx5_core] [ 759.457821] mlx5_tc_ct_delete_flow+0x160/0x21c [mlx5_core] [ 759.469051] mlx5e_tc_unoffload_fdb_rules+0x158/0x168 [mlx5_core] [ 759.481325] mlx5e_tc_encap_flows_del+0x140/0x26c [mlx5_core] [ 759.492901] mlx5e_rep_update_flows+0x11c/0x1ec [mlx5_core] [ 759.504127] mlx5e_rep_neigh_update+0x160/0x200 [mlx5_core] [ 759.515314] process_one_work+0x178/0x400 [ 759.523350] worker_thread+0x58/0x3e8 [ 759.530685] kthread+0x100/0x12c [ 759.537152] ret_from_fork+0x10/0x18 [ 759.544320] Code: 97ffef55 51000673 3100067f 54ffff41 (b9421ab3) [ 759.556548] ---[ end trace fab818bb1085832d ]--- Fixes: 4c3844d ("net/mlx5e: CT: Introduce connection tracking") Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Kiciuk
pushed a commit
that referenced
this pull request
Feb 15, 2021
Abaci reported the follow warning: [ 27.073425] do not call blocking ops when !TASK_RUNNING; state=1 set at [] prepare_to_wait_exclusive+0x3a/0xc0 [ 27.075805] WARNING: CPU: 0 PID: 951 at kernel/sched/core.c:7853 __might_sleep+0x80/0xa0 [ 27.077604] Modules linked in: [ 27.078379] CPU: 0 PID: 951 Comm: a.out Not tainted 5.11.0-rc3+ #1 [ 27.079637] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 27.080852] RIP: 0010:__might_sleep+0x80/0xa0 [ 27.081835] Code: 65 48 8b 04 25 80 71 01 00 48 8b 90 c0 15 00 00 48 8b 70 18 48 c7 c7 08 39 95 82 c6 05 f9 5f de 08 01 48 89 d1 e8 00 c6 fa ff 0b eb bf 41 0f b6 f5 48 c7 c7 40 23 c9 82 e8 f3 48 ec 00 eb a7 [ 27.084521] RSP: 0018:ffffc90000fe3ce8 EFLAGS: 00010286 [ 27.085350] RAX: 0000000000000000 RBX: ffffffff82956083 RCX: 0000000000000000 [ 27.086348] RDX: ffff8881057a0000 RSI: ffffffff8118cc9e RDI: ffff88813bc28570 [ 27.087598] RBP: 00000000000003a7 R08: 0000000000000001 R09: 0000000000000001 [ 27.088819] R10: ffffc90000fe3e00 R11: 00000000fffef9f0 R12: 0000000000000000 [ 27.089819] R13: 0000000000000000 R14: ffff88810576eb80 R15: ffff88810576e800 [ 27.091058] FS: 00007f7b144cf740(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000 [ 27.092775] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 27.093796] CR2: 00000000022da7b8 CR3: 000000010b928002 CR4: 00000000003706f0 [ 27.094778] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 27.095780] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 27.097011] Call Trace: [ 27.097685] __mutex_lock+0x5d/0xa30 [ 27.098565] ? prepare_to_wait_exclusive+0x71/0xc0 [ 27.099412] ? io_cqring_overflow_flush.part.101+0x6d/0x70 [ 27.100441] ? lockdep_hardirqs_on_prepare+0xe9/0x1c0 [ 27.101537] ? _raw_spin_unlock_irqrestore+0x2d/0x40 [ 27.102656] ? trace_hardirqs_on+0x46/0x110 [ 27.103459] ? io_cqring_overflow_flush.part.101+0x6d/0x70 [ 27.104317] io_cqring_overflow_flush.part.101+0x6d/0x70 [ 27.105113] io_cqring_wait+0x36e/0x4d0 [ 27.105770] ? find_held_lock+0x28/0xb0 [ 27.106370] ? io_uring_remove_task_files+0xa0/0xa0 [ 27.107076] __x64_sys_io_uring_enter+0x4fb/0x640 [ 27.107801] ? rcu_read_lock_sched_held+0x59/0xa0 [ 27.108562] ? lockdep_hardirqs_on_prepare+0xe9/0x1c0 [ 27.109684] ? syscall_enter_from_user_mode+0x26/0x70 [ 27.110731] do_syscall_64+0x2d/0x40 [ 27.111296] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 27.112056] RIP: 0033:0x7f7b13dc8239 [ 27.112663] Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 73 01 c3 48 8b 0d 27 ec 2c 00 f7 d8 64 89 01 48 [ 27.115113] RSP: 002b:00007ffd6d7f5c88 EFLAGS: 00000286 ORIG_RAX: 00000000000001aa [ 27.116562] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7b13dc8239 [ 27.117961] RDX: 000000000000478e RSI: 0000000000000000 RDI: 0000000000000003 [ 27.118925] RBP: 00007ffd6d7f5cb0 R08: 0000000020000040 R09: 0000000000000008 [ 27.119773] R10: 0000000000000001 R11: 0000000000000286 R12: 0000000000400480 [ 27.120614] R13: 00007ffd6d7f5d90 R14: 0000000000000000 R15: 0000000000000000 [ 27.121490] irq event stamp: 5635 [ 27.121946] hardirqs last enabled at (5643): [] console_unlock+0x5c4/0x740 [ 27.123476] hardirqs last disabled at (5652): [] console_unlock+0x4e7/0x740 [ 27.125192] softirqs last enabled at (5272): [] __do_softirq+0x3c5/0x5aa [ 27.126430] softirqs last disabled at (5267): [] asm_call_irq_on_stack+0xf/0x20 [ 27.127634] ---[ end trace 289d7e28fa60f928 ]--- This is caused by calling io_cqring_overflow_flush() which may sleep after calling prepare_to_wait_exclusive() which set task state to TASK_INTERRUPTIBLE Reported-by: Abaci <abaci@linux.alibaba.com> Fixes: 6c50315 ("io_uring: patch up IOPOLL overflow_flush sync") Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Hao Xu <haoxu@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Kiciuk
pushed a commit
that referenced
this pull request
Feb 15, 2021
if dma_map_single() fails, kernel will give the below oops since task_struct has been destroyed and we are running into the memory corruption due to use-after-free in kthread_stop(): [ 48.095310] Unable to handle kernel paging request at virtual address 000000c473548040 [ 48.095736] Mem abort info: [ 48.095864] ESR = 0x96000004 [ 48.096025] EC = 0x25: DABT (current EL), IL = 32 bits [ 48.096268] SET = 0, FnV = 0 [ 48.096401] EA = 0, S1PTW = 0 [ 48.096538] Data abort info: [ 48.096659] ISV = 0, ISS = 0x00000004 [ 48.096820] CM = 0, WnR = 0 [ 48.097079] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000104639000 [ 48.098099] [000000c473548040] pgd=0000000000000000, p4d=0000000000000000 [ 48.098832] Internal error: Oops: 96000004 [#1] PREEMPT SMP [ 48.099232] Modules linked in: [ 48.099387] CPU: 0 PID: 2 Comm: kthreadd Tainted: G W [ 48.099887] Hardware name: linux,dummy-virt (DT) [ 48.100078] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--) [ 48.100516] pc : __kmalloc_node+0x214/0x368 [ 48.100944] lr : __kmalloc_node+0x1f4/0x368 [ 48.101458] sp : ffff800011f0bb80 [ 48.101843] x29: ffff800011f0bb80 x28: ffff0000c0098ec0 [ 48.102330] x27: 0000000000000000 x26: 00000000001d4600 [ 48.102648] x25: ffff0000c0098ec0 x24: ffff800011b6a000 [ 48.102988] x23: 00000000ffffffff x22: ffff0000c0098ec0 [ 48.103333] x21: ffff8000101d7a54 x20: 0000000000000dc0 [ 48.103657] x19: ffff0000c0001e00 x18: 0000000000000000 [ 48.104069] x17: 0000000000000000 x16: 0000000000000000 [ 48.105449] x15: 000001aa0304e7b9 x14: 00000000000003b1 [ 48.106401] x13: ffff8000122d5000 x12: ffff80001228d000 [ 48.107296] x11: ffff0000c0154340 x10: 0000000000000000 [ 48.107862] x9 : ffff80000fffffff x8 : ffff0000c473527f [ 48.108326] x7 : ffff800011e62f58 x6 : ffff0000c01c8ed8 [ 48.108778] x5 : ffff0000c0098ec0 x4 : 0000000000000000 [ 48.109223] x3 : 00000000001d4600 x2 : 0000000000000040 [ 48.109656] x1 : 0000000000000001 x0 : ff0000c473548000 [ 48.110104] Call trace: [ 48.110287] __kmalloc_node+0x214/0x368 [ 48.110493] __vmalloc_node_range+0xc4/0x298 [ 48.110805] copy_process+0x2c8/0x15c8 [ 48.111133] kernel_clone+0x5c/0x3c0 [ 48.111373] kernel_thread+0x64/0x90 [ 48.111604] kthreadd+0x158/0x368 [ 48.111810] ret_from_fork+0x10/0x30 [ 48.112336] Code: 17ffffe9 b9402a62 b94008a1 11000421 (f8626802) [ 48.112884] ---[ end trace d4890e21e75419d5 ]--- Signed-off-by: Barry Song <song.bao.hua@hisilicon.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Kiciuk
pushed a commit
that referenced
this pull request
Feb 15, 2021
We need to lock d_parent->d_lock before dget_dlock, or this may have d_lockref updated parallelly like calltrace below which will cause dentry->d_lockref leak and risk a crash. CPU 0 CPU 1 ovl_set_redirect lookup_fast ovl_get_redirect __d_lookup dget_dlock //no lock protection here spin_lock(&dentry->d_lock) dentry->d_lockref.count++ dentry->d_lockref.count++ [ 49.799059] PGD 800000061fed7067 P4D 800000061fed7067 PUD 61fec5067 PMD 0 [ 49.799689] Oops: 0002 [#1] SMP PTI [ 49.800019] CPU: 2 PID: 2332 Comm: node Not tainted 4.19.24-7.20.al7.x86_64 #1 [ 49.800678] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8a46cfe 04/01/2014 [ 49.801380] RIP: 0010:_raw_spin_lock+0xc/0x20 [ 49.803470] RSP: 0018:ffffac6fc5417e98 EFLAGS: 00010246 [ 49.803949] RAX: 0000000000000000 RBX: ffff93b8da3446c0 RCX: 0000000a00000000 [ 49.804600] RDX: 0000000000000001 RSI: 000000000000000a RDI: 0000000000000088 [ 49.805252] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff993cf040 [ 49.805898] R10: ffff93b92292e580 R11: ffffd27f188a4b80 R12: 0000000000000000 [ 49.806548] R13: 00000000ffffff9c R14: 00000000fffffffe R15: ffff93b8da3446c0 [ 49.807200] FS: 00007ffbedffb700(0000) GS:ffff93b927880000(0000) knlGS:0000000000000000 [ 49.807935] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 49.808461] CR2: 0000000000000088 CR3: 00000005e3f74006 CR4: 00000000003606a0 [ 49.809113] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 49.809758] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 49.810410] Call Trace: [ 49.810653] d_delete+0x2c/0xb0 [ 49.810951] vfs_rmdir+0xfd/0x120 [ 49.811264] do_rmdir+0x14f/0x1a0 [ 49.811573] do_syscall_64+0x5b/0x190 [ 49.811917] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 49.812385] RIP: 0033:0x7ffbf505ffd7 [ 49.814404] RSP: 002b:00007ffbedffada8 EFLAGS: 00000297 ORIG_RAX: 0000000000000054 [ 49.815098] RAX: ffffffffffffffda RBX: 00007ffbedffb640 RCX: 00007ffbf505ffd7 [ 49.815744] RDX: 0000000004449700 RSI: 0000000000000000 RDI: 0000000006c8cd50 [ 49.816394] RBP: 00007ffbedffaea0 R08: 0000000000000000 R09: 0000000000017d0b [ 49.817038] R10: 0000000000000000 R11: 0000000000000297 R12: 0000000000000012 [ 49.817687] R13: 00000000072823d8 R14: 00007ffbedffb700 R15: 00000000072823d8 [ 49.818338] Modules linked in: pvpanic cirrusfb button qemu_fw_cfg atkbd libps2 i8042 [ 49.819052] CR2: 0000000000000088 [ 49.819368] ---[ end trace 4e652b8aa299aa2d ]--- [ 49.819796] RIP: 0010:_raw_spin_lock+0xc/0x20 [ 49.821880] RSP: 0018:ffffac6fc5417e98 EFLAGS: 00010246 [ 49.822363] RAX: 0000000000000000 RBX: ffff93b8da3446c0 RCX: 0000000a00000000 [ 49.823008] RDX: 0000000000000001 RSI: 000000000000000a RDI: 0000000000000088 [ 49.823658] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff993cf040 [ 49.825404] R10: ffff93b92292e580 R11: ffffd27f188a4b80 R12: 0000000000000000 [ 49.827147] R13: 00000000ffffff9c R14: 00000000fffffffe R15: ffff93b8da3446c0 [ 49.828890] FS: 00007ffbedffb700(0000) GS:ffff93b927880000(0000) knlGS:0000000000000000 [ 49.830725] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 49.832359] CR2: 0000000000000088 CR3: 00000005e3f74006 CR4: 00000000003606a0 [ 49.834085] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 49.835792] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Cc: <stable@vger.kernel.org> Fixes: a6c6065 ("ovl: redirect on rename-dir") Signed-off-by: Liangyan <liangyan.peng@linux.alibaba.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Kiciuk
pushed a commit
that referenced
this pull request
Jul 28, 2023
Add a check to avoid null pointer dereference as below: [ 90.002283] general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN NOPTI [ 90.002292] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] [ 90.002346] ? exc_general_protection+0x159/0x240 [ 90.002352] ? asm_exc_general_protection+0x26/0x30 [ 90.002357] ? ttm_bo_evict_swapout_allowable+0x322/0x5e0 [ttm] [ 90.002365] ? ttm_bo_evict_swapout_allowable+0x42e/0x5e0 [ttm] [ 90.002373] ttm_bo_swapout+0x134/0x7f0 [ttm] [ 90.002383] ? __pfx_ttm_bo_swapout+0x10/0x10 [ttm] [ 90.002391] ? lock_acquire+0x44d/0x4f0 [ 90.002398] ? ttm_device_swapout+0xa5/0x260 [ttm] [ 90.002412] ? lock_acquired+0x355/0xa00 [ 90.002416] ? do_raw_spin_trylock+0xb6/0x190 [ 90.002421] ? __pfx_lock_acquired+0x10/0x10 [ 90.002426] ? ttm_global_swapout+0x25/0x210 [ttm] [ 90.002442] ttm_device_swapout+0x198/0x260 [ttm] [ 90.002456] ? __pfx_ttm_device_swapout+0x10/0x10 [ttm] [ 90.002472] ttm_global_swapout+0x75/0x210 [ttm] [ 90.002486] ttm_tt_populate+0x187/0x3f0 [ttm] [ 90.002501] ttm_bo_handle_move_mem+0x437/0x590 [ttm] [ 90.002517] ttm_bo_validate+0x275/0x430 [ttm] [ 90.002530] ? __pfx_ttm_bo_validate+0x10/0x10 [ttm] [ 90.002544] ? kasan_save_stack+0x33/0x60 [ 90.002550] ? kasan_set_track+0x25/0x30 [ 90.002554] ? __kasan_kmalloc+0x8f/0xa0 [ 90.002558] ? amdgpu_gtt_mgr_new+0x81/0x420 [amdgpu] [ 90.003023] ? ttm_resource_alloc+0xf6/0x220 [ttm] [ 90.003038] amdgpu_bo_pin_restricted+0x2dd/0x8b0 [amdgpu] [ 90.003210] ? __x64_sys_ioctl+0x131/0x1a0 [ 90.003210] ? do_syscall_64+0x60/0x90 Fixes: a2848d0 ("drm/ttm: never consider pinned BOs for eviction&swap") Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Cc: stable@vger.kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20230724024229.1118444-1-guchun.chen@amd.com Signed-off-by: Christian König <christian.koenig@amd.com>
Kiciuk
pushed a commit
that referenced
this pull request
Jul 28, 2023
Since commit bb1520d ("s390/mm: start kernel with DAT enabled") the kernel crashes early during boot when debug pagealloc is enabled: mem auto-init: stack:off, heap alloc:off, heap free:off addressing exception: 0005 ilc:2 [#1] SMP DEBUG_PAGEALLOC Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 6.5.0-rc3-09759-gc5666c912155 torvalds#630 [..] Krnl Code: 00000000001325f6: ec5600248064 cgrj %r5,%r6,8,000000000013263e 00000000001325fc: eb880002000c srlg %r8,%r8,2 #0000000000132602: b2210051 ipte %r5,%r1,%r0,0 >0000000000132606: b90400d1 lgr %r13,%r1 000000000013260a: 41605008 la %r6,8(%r5) 000000000013260e: a7db1000 aghi %r13,4096 0000000000132612: b221006d ipte %r6,%r13,%r0,0 0000000000132616: e3d0d0000171 lay %r13,4096(%r13) Call Trace: __kernel_map_pages+0x14e/0x320 __free_pages_ok+0x23a/0x5a8) free_low_memory_core_early+0x214/0x2c8 memblock_free_all+0x28/0x58 mem_init+0xb6/0x228 mm_core_init+0xb6/0x3b0 start_kernel+0x1d2/0x5a8 startup_continue+0x36/0x40 Kernel panic - not syncing: Fatal exception: panic_on_oops This is caused by using large mappings on machines with EDAT1/EDAT2. Add the code to split the mappings into 4k pages if debug pagealloc is enabled by CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT or the debug_pagealloc kernel command line option. Fixes: bb1520d ("s390/mm: start kernel with DAT enabled") Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Kiciuk
pushed a commit
that referenced
this pull request
Jul 28, 2023
Even though the test suite covers this it somehow became obscured that this wasn't working. The test iommufd_ioas.mock_domain.access_domain_destory would blow up rarely. end should be set to 1 because this just pushed an item, the carry, to the pfns list. Sometimes the test would blow up with: BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP CPU: 5 PID: 584 Comm: iommufd Not tainted 6.5.0-rc1-dirty #1236 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:batch_unpin+0xa2/0x100 [iommufd] Code: 17 48 81 fe ff ff 07 00 77 70 48 8b 15 b7 be 97 e2 48 85 d2 74 14 48 8b 14 fa 48 85 d2 74 0b 40 0f b6 f6 48 c1 e6 04 48 01 f2 <48> 8b 3a 48 c1 e0 06 89 ca 48 89 de 48 83 e7 f0 48 01 c7 e8 96 dc RSP: 0018:ffffc90001677a58 EFLAGS: 00010246 RAX: 00007f7e2646f000 RBX: 0000000000000000 RCX: 0000000000000001 RDX: 0000000000000000 RSI: 00000000fefc4c8d RDI: 0000000000fefc4c RBP: ffffc90001677a80 R08: 0000000000000048 R09: 0000000000000200 R10: 0000000000030b98 R11: ffffffff81f3bb40 R12: 0000000000000001 R13: ffff888101f75800 R14: ffffc90001677ad0 R15: 00000000000001fe FS: 00007f9323679740(0000) GS:ffff8881ba540000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000105ede003 CR4: 00000000003706a0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? show_regs+0x5c/0x70 ? __die+0x1f/0x60 ? page_fault_oops+0x15d/0x440 ? lock_release+0xbc/0x240 ? exc_page_fault+0x4a4/0x970 ? asm_exc_page_fault+0x27/0x30 ? batch_unpin+0xa2/0x100 [iommufd] ? batch_unpin+0xba/0x100 [iommufd] __iopt_area_unfill_domain+0x198/0x430 [iommufd] ? __mutex_lock+0x8c/0xb80 ? __mutex_lock+0x6aa/0xb80 ? xa_erase+0x28/0x30 ? iopt_table_remove_domain+0x162/0x320 [iommufd] ? lock_release+0xbc/0x240 iopt_area_unfill_domain+0xd/0x10 [iommufd] iopt_table_remove_domain+0x195/0x320 [iommufd] iommufd_hw_pagetable_destroy+0xb3/0x110 [iommufd] iommufd_object_destroy_user+0x8e/0xf0 [iommufd] iommufd_device_detach+0xc5/0x140 [iommufd] iommufd_selftest_destroy+0x1f/0x70 [iommufd] iommufd_object_destroy_user+0x8e/0xf0 [iommufd] iommufd_destroy+0x3a/0x50 [iommufd] iommufd_fops_ioctl+0xfb/0x170 [iommufd] __x64_sys_ioctl+0x40d/0x9a0 do_syscall_64+0x3c/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Link: https://lore.kernel.org/r/3-v1-85aacb2af554+bc-iommufd_syz3_jgg@nvidia.com Cc: <stable@vger.kernel.org> Fixes: f394576 ("iommufd: PFN handling for iopt_pages") Reviewed-by: Kevin Tian <kevin.tian@intel.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Reported-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Kiciuk
pushed a commit
that referenced
this pull request
Jul 28, 2023
…attrs() Running kunit test for 6.5-rc1 hits one bug: ok 10 damon_test_update_monitoring_result general protection fault, probably for non-canonical address 0x1bffa5c419cfb81: 0000 [#1] PREEMPT SMP NOPTI CPU: 1 PID: 110 Comm: kunit_try_catch Tainted: G N 6.5.0-rc2 msm8953-mainline#15 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 RIP: 0010:damon_set_attrs+0xb9/0x120 Code: f8 00 00 00 4c 8d 58 e0 48 39 c3 74 ba 41 ba 59 17 b7 d1 49 8b 43 10 4d 8d 4b 10 48 8d 70 e0 49 39 c1 74 50 49 8b 40 08 31 d2 <69> 4e 18 10 27 00 00 49 f7 30 31 d2 48 89 c5 89 c8 f7 f5 31 d2 89 RSP: 0000:ffffc900005bfd40 EFLAGS: 00010246 RAX: ffffffff81159fc0 RBX: ffffc900005bfeb8 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 01bffa5c419cfb69 RDI: ffffc900005bfd70 RBP: ffffc90000013c10 R08: ffffc900005bfdc0 R09: ffffffff81ff10ed R10: 00000000d1b71759 R11: ffffffff81ff10dd R12: ffffc90000013a78 R13: ffff88810eb78180 R14: ffffffff818297c0 R15: ffffc90000013c28 FS: 0000000000000000(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000002a1c001 CR4: 0000000000370ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> damon_test_set_attrs+0x63/0x1f0 kunit_generic_run_threadfn_adapter+0x17/0x30 kthread+0xfd/0x130 The problem seems to be related with the damon_ctx was used without being initialized. Fix it by adding the initialization. Link: https://lkml.kernel.org/r/20230718052811.1065173-1-feng.tang@intel.com Fixes: aa13779 ("mm/damon/core-test: add a test for damon_set_attrs()") Signed-off-by: Feng Tang <feng.tang@intel.com> Reviewed-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Kiciuk
pushed a commit
that referenced
this pull request
Nov 4, 2023
In certain types of chips, such as VEGA20, reading the amdgpu_regs_smc file could result in an abnormal null pointer access when the smc_rreg pointer is NULL. Below are the steps to reproduce this issue and the corresponding exception log: 1. Navigate to the directory: /sys/kernel/debug/dri/0 2. Execute command: cat amdgpu_regs_smc 3. Exception Log:: [4005007.702554] BUG: kernel NULL pointer dereference, address: 0000000000000000 [4005007.702562] #PF: supervisor instruction fetch in kernel mode [4005007.702567] #PF: error_code(0x0010) - not-present page [4005007.702570] PGD 0 P4D 0 [4005007.702576] Oops: 0010 [#1] SMP NOPTI [4005007.702581] CPU: 4 PID: 62563 Comm: cat Tainted: G OE 5.15.0-43-generic msm8953-mainline#46-Ubunt u [4005007.702590] RIP: 0010:0x0 [4005007.702598] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [4005007.702600] RSP: 0018:ffffa82b46d27da0 EFLAGS: 00010206 [4005007.702605] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa82b46d27e68 [4005007.702609] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9940656e0000 [4005007.702612] RBP: ffffa82b46d27dd8 R08: 0000000000000000 R09: ffff994060c07980 [4005007.702615] R10: 0000000000020000 R11: 0000000000000000 R12: 00007f5e06753000 [4005007.702618] R13: ffff9940656e0000 R14: ffffa82b46d27e68 R15: 00007f5e06753000 [4005007.702622] FS: 00007f5e0755b740(0000) GS:ffff99479d300000(0000) knlGS:0000000000000000 [4005007.702626] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [4005007.702629] CR2: ffffffffffffffd6 CR3: 00000003253fc000 CR4: 00000000003506e0 [4005007.702633] Call Trace: [4005007.702636] <TASK> [4005007.702640] amdgpu_debugfs_regs_smc_read+0xb0/0x120 [amdgpu] [4005007.703002] full_proxy_read+0x5c/0x80 [4005007.703011] vfs_read+0x9f/0x1a0 [4005007.703019] ksys_read+0x67/0xe0 [4005007.703023] __x64_sys_read+0x19/0x20 [4005007.703028] do_syscall_64+0x5c/0xc0 [4005007.703034] ? do_user_addr_fault+0x1e3/0x670 [4005007.703040] ? exit_to_user_mode_prepare+0x37/0xb0 [4005007.703047] ? irqentry_exit_to_user_mode+0x9/0x20 [4005007.703052] ? irqentry_exit+0x19/0x30 [4005007.703057] ? exc_page_fault+0x89/0x160 [4005007.703062] ? asm_exc_page_fault+0x8/0x30 [4005007.703068] entry_SYSCALL_64_after_hwframe+0x44/0xae [4005007.703075] RIP: 0033:0x7f5e07672992 [4005007.703079] Code: c0 e9 b2 fe ff ff 50 48 8d 3d fa b2 0c 00 e8 c5 1d 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 e c 28 48 89 54 24 [4005007.703083] RSP: 002b:00007ffe03097898 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [4005007.703088] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f5e07672992 [4005007.703091] RDX: 0000000000020000 RSI: 00007f5e06753000 RDI: 0000000000000003 [4005007.703094] RBP: 00007f5e06753000 R08: 00007f5e06752010 R09: 00007f5e06752010 [4005007.703096] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000022000 [4005007.703099] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000 [4005007.703105] </TASK> [4005007.703107] Modules linked in: nf_tables libcrc32c nfnetlink algif_hash af_alg binfmt_misc nls_ iso8859_1 ipmi_ssif ast intel_rapl_msr intel_rapl_common drm_vram_helper drm_ttm_helper amd64_edac t tm edac_mce_amd kvm_amd ccp mac_hid k10temp kvm acpi_ipmi ipmi_si rapl sch_fq_codel ipmi_devintf ipm i_msghandler msr parport_pc ppdev lp parport mtd pstore_blk efi_pstore ramoops pstore_zone reed_solo mon ip_tables x_tables autofs4 ib_uverbs ib_core amdgpu(OE) amddrm_ttm_helper(OE) amdttm(OE) iommu_v 2 amd_sched(OE) amdkcl(OE) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core drm igb ahci xhci_pci libahci i2c_piix4 i2c_algo_bit xhci_pci_renesas dca [4005007.703184] CR2: 0000000000000000 [4005007.703188] ---[ end trace ac65a538d240da39 ]--- [4005007.800865] RIP: 0010:0x0 [4005007.800871] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [4005007.800874] RSP: 0018:ffffa82b46d27da0 EFLAGS: 00010206 [4005007.800878] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa82b46d27e68 [4005007.800881] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9940656e0000 [4005007.800883] RBP: ffffa82b46d27dd8 R08: 0000000000000000 R09: ffff994060c07980 [4005007.800886] R10: 0000000000020000 R11: 0000000000000000 R12: 00007f5e06753000 [4005007.800888] R13: ffff9940656e0000 R14: ffffa82b46d27e68 R15: 00007f5e06753000 [4005007.800891] FS: 00007f5e0755b740(0000) GS:ffff99479d300000(0000) knlGS:0000000000000000 [4005007.800895] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [4005007.800898] CR2: ffffffffffffffd6 CR3: 00000003253fc000 CR4: 00000000003506e0 Signed-off-by: Qu Huang <qu.huang@linux.dev> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Kiciuk
pushed a commit
that referenced
this pull request
Nov 4, 2023
Ido Schimmel says: ==================== Add MDB get support This patchset adds MDB get support, allowing user space to request a single MDB entry to be retrieved instead of dumping the entire MDB. Support is added in both the bridge and VXLAN drivers. Patches #1-#6 are small preparations in both drivers. Patches #7-#8 add the required uAPI attributes for the new functionality and the MDB get net device operation (NDO), respectively. Patches #9-#10 implement the MDB get NDO in both drivers. Patch #11 registers a handler for RTM_GETMDB messages in rtnetlink core. The handler derives the net device from the ifindex specified in the ancillary header and invokes its MDB get NDO. Patches msm8953-mainline#12-msm8953-mainline#13 add selftests by converting tests that use MDB dump with grep to the new MDB get functionality. iproute2 changes can be found here [1]. v2: * Patch #7: Add a comment to describe attributes structure. * Patch #9: Add a comment above spin_lock_bh(). [1] https://github.com/idosch/iproute2/tree/submit/mdb_get_v1 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
Kiciuk
pushed a commit
that referenced
this pull request
Nov 4, 2023
With latest sync from net-next tree, bpf-next has a bpf selftest failure: [root@arch-fb-vm1 bpf]# ./test_progs -t setget_sockopt ... [ 76.194349] ============================================ [ 76.194682] WARNING: possible recursive locking detected [ 76.195039] 6.6.0-rc7-g37884503df08-dirty msm8953-mainline#67 Tainted: G W OE [ 76.195518] -------------------------------------------- [ 76.195852] new_name/154 is trying to acquire lock: [ 76.196159] ffff8c3e06ad8d30 (sk_lock-AF_INET){+.+.}-{0:0}, at: ip_sock_set_tos+0x19/0x30 [ 76.196669] [ 76.196669] but task is already holding lock: [ 76.197028] ffff8c3e06ad8d30 (sk_lock-AF_INET){+.+.}-{0:0}, at: inet_listen+0x21/0x70 [ 76.197517] [ 76.197517] other info that might help us debug this: [ 76.197919] Possible unsafe locking scenario: [ 76.197919] [ 76.198287] CPU0 [ 76.198444] ---- [ 76.198600] lock(sk_lock-AF_INET); [ 76.198831] lock(sk_lock-AF_INET); [ 76.199062] [ 76.199062] *** DEADLOCK *** [ 76.199062] [ 76.199420] May be due to missing lock nesting notation [ 76.199420] [ 76.199879] 2 locks held by new_name/154: [ 76.200131] #0: ffff8c3e06ad8d30 (sk_lock-AF_INET){+.+.}-{0:0}, at: inet_listen+0x21/0x70 [ 76.200644] #1: ffffffff90f96a40 (rcu_read_lock){....}-{1:2}, at: __cgroup_bpf_run_filter_sock_ops+0x55/0x290 [ 76.201268] [ 76.201268] stack backtrace: [ 76.201538] CPU: 4 PID: 154 Comm: new_name Tainted: G W OE 6.6.0-rc7-g37884503df08-dirty msm8953-mainline#67 [ 76.202134] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 [ 76.202699] Call Trace: [ 76.202858] <TASK> [ 76.203002] dump_stack_lvl+0x4b/0x80 [ 76.203239] __lock_acquire+0x740/0x1ec0 [ 76.203503] lock_acquire+0xc1/0x2a0 [ 76.203766] ? ip_sock_set_tos+0x19/0x30 [ 76.204050] ? sk_stream_write_space+0x12a/0x230 [ 76.204389] ? lock_release+0xbe/0x260 [ 76.204661] lock_sock_nested+0x32/0x80 [ 76.204942] ? ip_sock_set_tos+0x19/0x30 [ 76.205208] ip_sock_set_tos+0x19/0x30 [ 76.205452] do_ip_setsockopt+0x4b3/0x1580 [ 76.205719] __bpf_setsockopt+0x62/0xa0 [ 76.205963] bpf_sock_ops_setsockopt+0x11/0x20 [ 76.206247] bpf_prog_630217292049c96e_bpf_test_sockopt_int+0xbc/0x123 [ 76.206660] bpf_prog_493685a3bae00bbd_bpf_test_ip_sockopt+0x49/0x4b [ 76.207055] bpf_prog_b0bcd27f269aeea0_skops_sockopt+0x44c/0xec7 [ 76.207437] __cgroup_bpf_run_filter_sock_ops+0xda/0x290 [ 76.207829] __inet_listen_sk+0x108/0x1b0 [ 76.208122] inet_listen+0x48/0x70 [ 76.208373] __sys_listen+0x74/0xb0 [ 76.208630] __x64_sys_listen+0x16/0x20 [ 76.208911] do_syscall_64+0x3f/0x90 [ 76.209174] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 ... Both ip_sock_set_tos() and inet_listen() calls lock_sock(sk) which caused a dead lock. To fix the issue, use sockopt_lock_sock() in ip_sock_set_tos() instead. sockopt_lock_sock() will avoid lock_sock() if it is in bpf context. Fixes: 878d951 ("inet: lock the socket in ip_sock_set_tos()") Suggested-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20231027182424.1444845-1-yonghong.song@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kiciuk
pushed a commit
that referenced
this pull request
Nov 4, 2023
Skip SMB sessions that are being teared down (e.g. @ses->ses_status == SES_EXITING) in cifs_debug_data_proc_show() to avoid use-after-free in @SES. This fixes the following GPF when reading from /proc/fs/cifs/DebugData while mounting and umounting [ 816.251274] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6d81: 0000 [#1] PREEMPT SMP NOPTI ... [ 816.260138] Call Trace: [ 816.260329] <TASK> [ 816.260499] ? die_addr+0x36/0x90 [ 816.260762] ? exc_general_protection+0x1b3/0x410 [ 816.261126] ? asm_exc_general_protection+0x26/0x30 [ 816.261502] ? cifs_debug_tcon+0xbd/0x240 [cifs] [ 816.261878] ? cifs_debug_tcon+0xab/0x240 [cifs] [ 816.262249] cifs_debug_data_proc_show+0x516/0xdb0 [cifs] [ 816.262689] ? seq_read_iter+0x379/0x470 [ 816.262995] seq_read_iter+0x118/0x470 [ 816.263291] proc_reg_read_iter+0x53/0x90 [ 816.263596] ? srso_alias_return_thunk+0x5/0x7f [ 816.263945] vfs_read+0x201/0x350 [ 816.264211] ksys_read+0x75/0x100 [ 816.264472] do_syscall_64+0x3f/0x90 [ 816.264750] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [ 816.265135] RIP: 0033:0x7fd5e669d381 Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Kiciuk
pushed a commit
that referenced
this pull request
Nov 4, 2023
Fixes: bcachefs (e7fdc10e-54a3-49d9-bd0c-390370889d84): disk usage increased 4294967296 more than 2823707312 sectors reserved) transaction updates for __bchfs_fallocate journal seq 467859 update: btree=extents cached=0 bch2_trans_update+0x4e8/0x540 old u64s 5 type deleted 536925940:3559337304:4294967283 len 0 ver 0 new u64s 6 type reservation 536925940:3559337304:4294967283 len 3559337304 ver 0: generation 0 replicas 2 update: btree=inodes cached=1 bch2_extent_update_i_size_sectors+0x305/0x3b0 old u64s 19 type inode_v3 0:536925940:4294967283 len 0 ver 0: mode 100600 flags 15300000 journal_seq 467859 bi_size 0 bi_sectors 0 bi_version 0 bi_atime 40905301656446 bi_ctime 40905301656446 bi_mtime 40905301656446 bi_otime 40905301656446 bi_uid 0 bi_gid 0 bi_nlink 0 bi_generation 0 bi_dev 0 bi_data_checksum 0 bi_compression 0 bi_project 0 bi_background_compression 0 bi_data_replicas 0 bi_promote_target 0 bi_foreground_target 0 bi_background_target 0 bi_erasure_code 0 bi_fields_set 0 bi_dir 1879048193 bi_dir_offset 3384856038735393365 bi_subvol 0 bi_parent_subvol 0 bi_nocow 0 new u64s 19 type inode_v3 0:536925940:4294967283 len 0 ver 0: mode 100600 flags 15300000 journal_seq 467859 bi_size 0 bi_sectors 3559337304 bi_version 0 bi_atime 40905301656446 bi_ctime 40905301656446 bi_mtime 40905301656446 bi_otime 40905301656446 bi_uid 0 bi_gid 0 bi_nlink 0 bi_generation 0 bi_dev 0 bi_data_checksum 0 bi_compression 0 bi_project 0 bi_background_compression 0 bi_data_replicas 0 bi_promote_target 0 bi_foreground_target 0 bi_background_target 0 bi_erasure_code 0 bi_fields_set 0 bi_dir 1879048193 bi_dir_offset 3384856038735393365 bi_subvol 0 bi_parent_subvol 0 bi_nocow 0 Kernel panic - not syncing: bcachefs (e7fdc10e-54a3-49d9-bd0c-390370889d84): panic after error CPU: 4 PID: 5154 Comm: rsync Not tainted 6.5.9-gateway-gca1614174cc0-dirty #1 Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570 Phantom Gaming 4, BIOS P4.20 08/02/2021 Call Trace: <TASK> dump_stack_lvl+0x5a/0x90 panic+0x105/0x300 ? console_unlock+0xf1/0x130 ? bch2_printbuf_exit+0x16/0x30 ? srso_return_thunk+0x5/0x10 bch2_inconsistent_error+0x6f/0x80 bch2_trans_fs_usage_apply+0x279/0x3d0 __bch2_trans_commit+0x112a/0x1df0 ? bch2_extent_update+0x13a/0x1d0 bch2_extent_update+0x13a/0x1d0 bch2_extent_fallocate+0x58e/0x740 bch2_fallocate_dispatch+0xb7c/0x1030 ? do_filp_open+0xa0/0x140 vfs_fallocate+0x18e/0x1d0 __x64_sys_fallocate+0x46/0x70 do_syscall_64+0x48/0xa0 ? exit_to_user_mode_prepare+0x4d/0xa0 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 RIP: 0033:0x7fc85d91bbb3 Code: 64 89 02 b8 ff ff ff ff eb bd 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 80 3d 31 da 0d 00 00 49 89 ca 74 14 b8 1d 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 5d c3 0f 1f 40 00 48 83 ec 28 48 89 54 24 10 Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kiciuk
pushed a commit
that referenced
this pull request
Nov 4, 2023
The following can crash the kernel: # cd /sys/kernel/tracing # echo 'p:sched schedule' > kprobe_events # exec 5>>events/kprobes/sched/enable # > kprobe_events # exec 5>&- The above commands: 1. Change directory to the tracefs directory 2. Create a kprobe event (doesn't matter what one) 3. Open bash file descriptor 5 on the enable file of the kprobe event 4. Delete the kprobe event (removes the files too) 5. Close the bash file descriptor 5 The above causes a crash! BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 6 PID: 877 Comm: bash Not tainted 6.5.0-rc4-test-00008-g2c6b6b1029d4-dirty msm8953-mainline#186 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 RIP: 0010:tracing_release_file_tr+0xc/0x50 What happens here is that the kprobe event creates a trace_event_file "file" descriptor that represents the file in tracefs to the event. It maintains state of the event (is it enabled for the given instance?). Opening the "enable" file gets a reference to the event "file" descriptor via the open file descriptor. When the kprobe event is deleted, the file is also deleted from the tracefs system which also frees the event "file" descriptor. But as the tracefs file is still opened by user space, it will not be totally removed until the final dput() is called on it. But this is not true with the event "file" descriptor that is already freed. If the user does a write to or simply closes the file descriptor it will reference the event "file" descriptor that was just freed, causing a use-after-free bug. To solve this, add a ref count to the event "file" descriptor as well as a new flag called "FREED". The "file" will not be freed until the last reference is released. But the FREE flag will be set when the event is removed to prevent any more modifications to that event from happening, even if there's still a reference to the event "file" descriptor. Link: https://lore.kernel.org/linux-trace-kernel/20231031000031.1e705592@gandalf.local.home/ Link: https://lore.kernel.org/linux-trace-kernel/20231031122453.7a48b923@gandalf.local.home Cc: stable@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Fixes: f5ca233 ("tracing: Increase trace array ref count on enable and filter files") Reported-by: Beau Belgrave <beaub@linux.microsoft.com> Tested-by: Beau Belgrave <beaub@linux.microsoft.com> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Kiciuk
pushed a commit
that referenced
this pull request
Nov 4, 2023
Chuyi Zhou says: ==================== Relax allowlist for open-coded css_task iter Hi, The patchset aims to relax the allowlist for open-coded css_task iter suggested by Alexei[1]. Please see individual patches for more details. And comments are always welcome. Patch summary: * Patch #1: Relax the allowlist and let css_task iter can be used in bpf iters and any sleepable progs. * Patch #2: Add a test in cgroup_iters.c which demonstrates how css_task iters can be combined with cgroup iter. * Patch #3: Add a test to prove css_task iter can be used in normal * sleepable progs. link[1]:https://lore.kernel.org/lkml/CAADnVQKafk_junRyE=-FVAik4hjTRDtThymYGEL8hGTuYoOGpA@mail.gmail.com/ --- Changes in v2: * Fix the incorrect logic in check_css_task_iter_allowlist. Use expected_attach_type to check whether we are using bpf_iters. * Link to v1:https://lore.kernel.org/bpf/20231022154527.229117-1-zhouchuyi@bytedance.com/T/#m946f9cde86b44a13265d9a44c5738a711eb578fd Changes in v3: * Add a testcase to prove css_task can be used in fentry.s * Link to v2:https://lore.kernel.org/bpf/20231024024240.42790-1-zhouchuyi@bytedance.com/T/#m14a97041ff56c2df21bc0149449abd275b73f6a3 Changes in v4: * Add Yonghong's ack for patch #1 and patch #2. * Solve Yonghong's comments for patch #2 * Move prog 'iter_css_task_for_each_sleep' from iters_task_failure.c to iters_css_task.c. Use RUN_TESTS to prove we can load this prog. * Link to v3:https://lore.kernel.org/bpf/20231025075914.30979-1-zhouchuyi@bytedance.com/T/#m3200d8ad29af4ffab97588e297361d0a45d7585d --- ==================== Link: https://lore.kernel.org/r/20231031050438.93297-1-zhouchuyi@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Kiciuk
pushed a commit
that referenced
this pull request
Nov 4, 2023
When LAN9303 is MDIO-connected two callchains exist into mdio->bus->write(): 1. switch ports 1&2 ("physical" PHYs): virtual (switch-internal) MDIO bus (lan9303_switch_ops->phy_{read|write})-> lan9303_mdio_phy_{read|write} -> mdiobus_{read|write}_nested 2. LAN9303 virtual PHY: virtual MDIO bus (lan9303_phy_{read|write}) -> lan9303_virt_phy_reg_{read|write} -> regmap -> lan9303_mdio_{read|write} If the latter functions just take mutex_lock(&sw_dev->device->bus->mdio_lock) it triggers a LOCKDEP false-positive splat. It's false-positive because the first mdio_lock in the second callchain above belongs to virtual MDIO bus, the second mdio_lock belongs to physical MDIO bus. Consequent annotation in lan9303_mdio_{read|write} as nested lock (similar to lan9303_mdio_phy_{read|write}, it's the same physical MDIO bus) prevents the following splat: WARNING: possible circular locking dependency detected 5.15.71 #1 Not tainted ------------------------------------------------------ kworker/u4:3/609 is trying to acquire lock: ffff000011531c68 (lan9303_mdio:131:(&lan9303_mdio_regmap_config)->lock){+.+.}-{3:3}, at: regmap_lock_mutex but task is already holding lock: ffff0000114c44d8 (&bus->mdio_lock){+.+.}-{3:3}, at: mdiobus_read which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&bus->mdio_lock){+.+.}-{3:3}: lock_acquire __mutex_lock mutex_lock_nested lan9303_mdio_read _regmap_read regmap_read lan9303_probe lan9303_mdio_probe mdio_probe really_probe __driver_probe_device driver_probe_device __device_attach_driver bus_for_each_drv __device_attach device_initial_probe bus_probe_device deferred_probe_work_func process_one_work worker_thread kthread ret_from_fork -> #0 (lan9303_mdio:131:(&lan9303_mdio_regmap_config)->lock){+.+.}-{3:3}: __lock_acquire lock_acquire.part.0 lock_acquire __mutex_lock mutex_lock_nested regmap_lock_mutex regmap_read lan9303_phy_read dsa_slave_phy_read __mdiobus_read mdiobus_read get_phy_device mdiobus_scan __mdiobus_register dsa_register_switch lan9303_probe lan9303_mdio_probe mdio_probe really_probe __driver_probe_device driver_probe_device __device_attach_driver bus_for_each_drv __device_attach device_initial_probe bus_probe_device deferred_probe_work_func process_one_work worker_thread kthread ret_from_fork other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&bus->mdio_lock); lock(lan9303_mdio:131:(&lan9303_mdio_regmap_config)->lock); lock(&bus->mdio_lock); lock(lan9303_mdio:131:(&lan9303_mdio_regmap_config)->lock); *** DEADLOCK *** 5 locks held by kworker/u4:3/609: #0: ffff000002842938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work #1: ffff80000bacbd60 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work #2: ffff000007645178 (&dev->mutex){....}-{3:3}, at: __device_attach #3: ffff8000096e6e78 (dsa2_mutex){+.+.}-{3:3}, at: dsa_register_switch #4: ffff0000114c44d8 (&bus->mdio_lock){+.+.}-{3:3}, at: mdiobus_read stack backtrace: CPU: 1 PID: 609 Comm: kworker/u4:3 Not tainted 5.15.71 #1 Workqueue: events_unbound deferred_probe_work_func Call trace: dump_backtrace show_stack dump_stack_lvl dump_stack print_circular_bug check_noncircular __lock_acquire lock_acquire.part.0 lock_acquire __mutex_lock mutex_lock_nested regmap_lock_mutex regmap_read lan9303_phy_read dsa_slave_phy_read __mdiobus_read mdiobus_read get_phy_device mdiobus_scan __mdiobus_register dsa_register_switch lan9303_probe lan9303_mdio_probe ... Cc: stable@vger.kernel.org Fixes: dc70058 ("net: dsa: LAN9303: add MDIO managed mode support") Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20231027065741.534971-1-alexander.sverdlin@siemens.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Kiciuk
pushed a commit
that referenced
this pull request
Nov 4, 2023
The following UAF was triggered when running fstests generic/072 with KASAN enabled against Windows Server 2022 and mount options 'multichannel,max_channels=2,vers=3.1.1,mfsymlinks,noperm' BUG: KASAN: slab-use-after-free in smb2_query_info_compound+0x423/0x6d0 [cifs] Read of size 8 at addr ffff888014941048 by task xfs_io/27534 CPU: 0 PID: 27534 Comm: xfs_io Not tainted 6.6.0-rc7 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 Call Trace: dump_stack_lvl+0x4a/0x80 print_report+0xcf/0x650 ? srso_alias_return_thunk+0x5/0x7f ? srso_alias_return_thunk+0x5/0x7f ? __phys_addr+0x46/0x90 kasan_report+0xda/0x110 ? smb2_query_info_compound+0x423/0x6d0 [cifs] ? smb2_query_info_compound+0x423/0x6d0 [cifs] smb2_query_info_compound+0x423/0x6d0 [cifs] ? __pfx_smb2_query_info_compound+0x10/0x10 [cifs] ? srso_alias_return_thunk+0x5/0x7f ? __stack_depot_save+0x39/0x480 ? kasan_save_stack+0x33/0x60 ? kasan_set_track+0x25/0x30 ? ____kasan_slab_free+0x126/0x170 smb2_queryfs+0xc2/0x2c0 [cifs] ? __pfx_smb2_queryfs+0x10/0x10 [cifs] ? __pfx___lock_acquire+0x10/0x10 smb311_queryfs+0x210/0x220 [cifs] ? __pfx_smb311_queryfs+0x10/0x10 [cifs] ? srso_alias_return_thunk+0x5/0x7f ? __lock_acquire+0x480/0x26c0 ? lock_release+0x1ed/0x640 ? srso_alias_return_thunk+0x5/0x7f ? do_raw_spin_unlock+0x9b/0x100 cifs_statfs+0x18c/0x4b0 [cifs] statfs_by_dentry+0x9b/0xf0 fd_statfs+0x4e/0xb0 __do_sys_fstatfs+0x7f/0xe0 ? __pfx___do_sys_fstatfs+0x10/0x10 ? srso_alias_return_thunk+0x5/0x7f ? lockdep_hardirqs_on_prepare+0x136/0x200 ? srso_alias_return_thunk+0x5/0x7f do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Allocated by task 27534: kasan_save_stack+0x33/0x60 kasan_set_track+0x25/0x30 __kasan_kmalloc+0x8f/0xa0 open_cached_dir+0x71b/0x1240 [cifs] smb2_query_info_compound+0x5c3/0x6d0 [cifs] smb2_queryfs+0xc2/0x2c0 [cifs] smb311_queryfs+0x210/0x220 [cifs] cifs_statfs+0x18c/0x4b0 [cifs] statfs_by_dentry+0x9b/0xf0 fd_statfs+0x4e/0xb0 __do_sys_fstatfs+0x7f/0xe0 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Freed by task 27534: kasan_save_stack+0x33/0x60 kasan_set_track+0x25/0x30 kasan_save_free_info+0x2b/0x50 ____kasan_slab_free+0x126/0x170 slab_free_freelist_hook+0xd0/0x1e0 __kmem_cache_free+0x9d/0x1b0 open_cached_dir+0xff5/0x1240 [cifs] smb2_query_info_compound+0x5c3/0x6d0 [cifs] smb2_queryfs+0xc2/0x2c0 [cifs] This is a race between open_cached_dir() and cached_dir_lease_break() where the cache entry for the open directory handle receives a lease break while creating it. And before returning from open_cached_dir(), we put the last reference of the new @cfid because of !@cfid->has_lease. Besides the UAF, while running xfstests a lot of missed lease breaks have been noticed in tests that run several concurrent statfs(2) calls on those cached fids CIFS: VFS: \\w22-root1.gandalf.test No task to wake, unknown frame... CIFS: VFS: \\w22-root1.gandalf.test Cmd: 18 Err: 0x0 Flags: 0x1... CIFS: VFS: \\w22-root1.gandalf.test smb buf 00000000715bfe83 len 108 CIFS: VFS: Dump pending requests: CIFS: VFS: \\w22-root1.gandalf.test No task to wake, unknown frame... CIFS: VFS: \\w22-root1.gandalf.test Cmd: 18 Err: 0x0 Flags: 0x1... CIFS: VFS: \\w22-root1.gandalf.test smb buf 000000005aa7316e len 108 ... To fix both, in open_cached_dir() ensure that @cfid->has_lease is set right before sending out compounded request so that any potential lease break will be get processed by demultiplex thread while we're still caching @cfid. And, if open failed for some reason, re-check @cfid->has_lease to decide whether or not put lease reference. Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Kiciuk
pushed a commit
that referenced
this pull request
Nov 11, 2023
Fix kernel crash in AP bus code caused by very early invocation of the config change callback function via SCLP. After a fresh IML of the machine the crypto cards are still offline and will get switched online only with activation of any LPAR which has the card in it's configuration. A crypto card coming online is reported to the LPAR via SCLP and the AP bus offers a callback function to get this kind of information. However, it may happen that the callback is invoked before the AP bus init function is complete. As the callback triggers a synchronous AP bus scan, the scan may already run but some internal states are not initialized by the AP bus init function resulting in a crash like this: [ 11.635859] Unable to handle kernel pointer dereference in virtual kernel address space [ 11.635861] Failing address: 0000000000000000 TEID: 0000000000000887 [ 11.635862] Fault in home space mode while using kernel ASCE. [ 11.635864] AS:00000000894c4007 R3:00000001fece8007 S:00000001fece7800 P:000000000000013d [ 11.635879] Oops: 0004 ilc:1 [#1] SMP [ 11.635882] Modules linked in: [ 11.635884] CPU: 5 PID: 42 Comm: kworker/5:0 Not tainted 6.6.0-rc3-00003-g4dbf7cdc6b42 msm8953-mainline#12 [ 11.635886] Hardware name: IBM 3931 A01 751 (LPAR) [ 11.635887] Workqueue: events_long ap_scan_bus [ 11.635891] Krnl PSW : 0704c00180000000 0000000000000000 (0x0) [ 11.635895] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 [ 11.635897] Krnl GPRS: 0000000001000a00 0000000000000000 0000000000000006 0000000089591940 [ 11.635899] 0000000080000000 0000000000000a00 0000000000000000 0000000000000000 [ 11.635901] 0000000081870c00 0000000089591000 000000008834e4e2 0000000002625a00 [ 11.635903] 0000000081734200 0000038000913c18 000000008834c6d6 0000038000913ac8 [ 11.635906] Krnl Code:>0000000000000000: 0000 illegal [ 11.635906] 0000000000000002: 0000 illegal [ 11.635906] 0000000000000004: 0000 illegal [ 11.635906] 0000000000000006: 0000 illegal [ 11.635906] 0000000000000008: 0000 illegal [ 11.635906] 000000000000000a: 0000 illegal [ 11.635906] 000000000000000c: 0000 illegal [ 11.635906] 000000000000000e: 0000 illegal [ 11.635915] Call Trace: [ 11.635916] [<0000000000000000>] 0x0 [ 11.635918] [<000000008834e4e2>] ap_queue_init_state+0x82/0xb8 [ 11.635921] [<000000008834ba1c>] ap_scan_domains+0x6fc/0x740 [ 11.635923] [<000000008834c092>] ap_scan_adapter+0x632/0x8b0 [ 11.635925] [<000000008834c3e4>] ap_scan_bus+0xd4/0x288 [ 11.635927] [<00000000879a33ba>] process_one_work+0x19a/0x410 [ 11.635930] Discipline DIAG cannot be used without z/VM [ 11.635930] [<00000000879a3a2c>] worker_thread+0x3fc/0x560 [ 11.635933] [<00000000879aea60>] kthread+0x120/0x128 [ 11.635936] [<000000008792afa4>] __ret_from_fork+0x3c/0x58 [ 11.635938] [<00000000885ebe62>] ret_from_fork+0xa/0x30 [ 11.635942] Last Breaking-Event-Address: [ 11.635942] [<000000008834c6d4>] ap_wait+0xcc/0x148 This patch improves the ap_bus_force_rescan() function which is invoked by the config change callback by checking if a first initial AP bus scan has been done. If not, the force rescan request is simple ignored. Anyhow it does not make sense to trigger AP bus re-scans even before the very first bus scan is complete. Cc: stable@vger.kernel.org Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Kiciuk
pushed a commit
that referenced
this pull request
Nov 11, 2023
The cloned child of ahash that uses shash under the hood should use shash helpers (like crypto_shash_setkey()). The following panic may be observed on TCP-AO selftests: > ================================================================== > BUG: KASAN: wild-memory-access in crypto_mod_get+0x1b/0x60 > Write of size 4 at addr 5d5be0ff5c415e14 by task connect_ipv4/1397 > > CPU: 0 PID: 1397 Comm: connect_ipv4 Tainted: G W 6.6.0+ msm8953-mainline#47 > Call Trace: > <TASK> > dump_stack_lvl+0x46/0x70 > kasan_report+0xc3/0xf0 > kasan_check_range+0xec/0x190 > crypto_mod_get+0x1b/0x60 > crypto_spawn_alg+0x53/0x140 > crypto_spawn_tfm2+0x13/0x60 > hmac_init_tfm+0x25/0x60 > crypto_ahash_setkey+0x8b/0x100 > tcp_ao_add_cmd+0xe7a/0x1120 > do_tcp_setsockopt+0x5ed/0x12a0 > do_sock_setsockopt+0x82/0x100 > __sys_setsockopt+0xe9/0x160 > __x64_sys_setsockopt+0x60/0x70 > do_syscall_64+0x3c/0xe0 > entry_SYSCALL_64_after_hwframe+0x46/0x4e > ================================================================== > general protection fault, probably for non-canonical address 0x5d5be0ff5c415e14: 0000 [#1] PREEMPT SMP KASAN > CPU: 0 PID: 1397 Comm: connect_ipv4 Tainted: G B W 6.6.0+ msm8953-mainline#47 > Call Trace: > <TASK> > ? die_addr+0x3c/0xa0 > ? exc_general_protection+0x144/0x210 > ? asm_exc_general_protection+0x22/0x30 > ? add_taint+0x26/0x90 > ? crypto_mod_get+0x20/0x60 > ? crypto_mod_get+0x1b/0x60 > ? ahash_def_finup_done1+0x58/0x80 > crypto_spawn_alg+0x53/0x140 > crypto_spawn_tfm2+0x13/0x60 > hmac_init_tfm+0x25/0x60 > crypto_ahash_setkey+0x8b/0x100 > tcp_ao_add_cmd+0xe7a/0x1120 > do_tcp_setsockopt+0x5ed/0x12a0 > do_sock_setsockopt+0x82/0x100 > __sys_setsockopt+0xe9/0x160 > __x64_sys_setsockopt+0x60/0x70 > do_syscall_64+0x3c/0xe0 > entry_SYSCALL_64_after_hwframe+0x46/0x4e > </TASK> > RIP: 0010:crypto_mod_get+0x20/0x60 Make sure that the child/clone has using_shash set when parent is an shash user. Fixes: 2f1f34c ("crypto: ahash - optimize performance when wrapping shash") Cc: David Ahern <dsahern@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Francesco Ruggeri <fruggeri05@gmail.com> To: Herbert Xu <herbert@gondor.apana.org.au> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Salam Noureddine <noureddine@arista.com> Cc: netdev@vger.kernel.org Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Dmitry Safonov <dima@arista.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Kiciuk
pushed a commit
that referenced
this pull request
Nov 11, 2023
…pf_iter_reg' Chuyi Zhou says: ==================== The patchset aims to let the BPF verivier consider bpf_iter__cgroup->cgroup and bpf_iter__task->task is trusted suggested by Alexei[1]. Please see individual patches for more details. And comments are always welcome. Link[1]:https://lore.kernel.org/bpf/20231022154527.229117-1-zhouchuyi@bytedance.com/T/#mb57725edc8ccdd50a1b165765c7619b4d65ed1b0 v2->v1: * Patch #1: Add Yonghong's ack and add description of similar case in log. * Patch #2: Add Yonghong's ack ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Kiciuk
pushed a commit
that referenced
this pull request
Dec 3, 2023
If SetVariable at runtime is not supported by the firmware we never assign a callback for that function. At the same time mount the efivarfs as RO so no one can call that. However, we never check the permission flags when someone remounts the filesystem as RW. As a result this leads to a crash looking like this: $ mount -o remount,rw /sys/firmware/efi/efivars $ efi-updatevar -f PK.auth PK [ 303.279166] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 303.280482] Mem abort info: [ 303.280854] ESR = 0x0000000086000004 [ 303.281338] EC = 0x21: IABT (current EL), IL = 32 bits [ 303.282016] SET = 0, FnV = 0 [ 303.282414] EA = 0, S1PTW = 0 [ 303.282821] FSC = 0x04: level 0 translation fault [ 303.283771] user pgtable: 4k pages, 48-bit VAs, pgdp=000000004258c000 [ 303.284913] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 [ 303.286076] Internal error: Oops: 0000000086000004 [#1] PREEMPT SMP [ 303.286936] Modules linked in: qrtr tpm_tis tpm_tis_core crct10dif_ce arm_smccc_trng rng_core drm fuse ip_tables x_tables ipv6 [ 303.288586] CPU: 1 PID: 755 Comm: efi-updatevar Not tainted 6.3.0-rc1-00108-gc7d0c4695c68 #1 [ 303.289748] Hardware name: Unknown Unknown Product/Unknown Product, BIOS 2023.04-00627-g88336918701d 04/01/2023 [ 303.291150] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 303.292123] pc : 0x0 [ 303.292443] lr : efivar_set_variable_locked+0x74/0xec [ 303.293156] sp : ffff800008673c10 [ 303.293619] x29: ffff800008673c10 x28: ffff0000037e8000 x27: 0000000000000000 [ 303.294592] x26: 0000000000000800 x25: ffff000002467400 x24: 0000000000000027 [ 303.295572] x23: ffffd49ea9832000 x22: ffff0000020c9800 x21: ffff000002467000 [ 303.296566] x20: 0000000000000001 x19: 00000000000007fc x18: 0000000000000000 [ 303.297531] x17: 0000000000000000 x16: 0000000000000000 x15: 0000aaaac807ab54 [ 303.298495] x14: ed37489f673633c0 x13: 71c45c606de13f80 x12: 47464259e219acf4 [ 303.299453] x11: ffff000002af7b01 x10: 0000000000000003 x9 : 0000000000000002 [ 303.300431] x8 : 0000000000000010 x7 : ffffd49ea8973230 x6 : 0000000000a85201 [ 303.301412] x5 : 0000000000000000 x4 : ffff0000020c9800 x3 : 00000000000007fc [ 303.302370] x2 : 0000000000000027 x1 : ffff000002467400 x0 : ffff000002467000 [ 303.303341] Call trace: [ 303.303679] 0x0 [ 303.303938] efivar_entry_set_get_size+0x98/0x16c [ 303.304585] efivarfs_file_write+0xd0/0x1a4 [ 303.305148] vfs_write+0xc4/0x2e4 [ 303.305601] ksys_write+0x70/0x104 [ 303.306073] __arm64_sys_write+0x1c/0x28 [ 303.306622] invoke_syscall+0x48/0x114 [ 303.307156] el0_svc_common.constprop.0+0x44/0xec [ 303.307803] do_el0_svc+0x38/0x98 [ 303.308268] el0_svc+0x2c/0x84 [ 303.308702] el0t_64_sync_handler+0xf4/0x120 [ 303.309293] el0t_64_sync+0x190/0x194 [ 303.309794] Code: ???????? ???????? ???????? ???????? (????????) [ 303.310612] ---[ end trace 0000000000000000 ]--- Fix this by adding a .reconfigure() function to the fs operations which we can use to check the requested flags and deny anything that's not RO if the firmware doesn't implement SetVariable at runtime. Fixes: f88814c ("efi/efivars: Expose RT service availability via efivars abstraction") Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Kiciuk
pushed a commit
that referenced
this pull request
Dec 3, 2023
The routine __vma_private_lock tests for the existence of a reserve map associated with a private hugetlb mapping. A pointer to the reserve map is in vma->vm_private_data. __vma_private_lock was checking the pointer for NULL. However, it is possible that the low bits of the pointer could be used as flags. In such instances, vm_private_data is not NULL and not a valid pointer. This results in the null-ptr-deref reported by syzbot: general protection fault, probably for non-canonical address 0xdffffc000000001d: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x00000000000000e8-0x00000000000000ef] CPU: 0 PID: 5048 Comm: syz-executor139 Not tainted 6.6.0-rc7-syzkaller-00142-g88 8cf78c29e2 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 1 0/09/2023 RIP: 0010:__lock_acquire+0x109/0x5de0 kernel/locking/lockdep.c:5004 ... Call Trace: <TASK> lock_acquire kernel/locking/lockdep.c:5753 [inline] lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5718 down_write+0x93/0x200 kernel/locking/rwsem.c:1573 hugetlb_vma_lock_write mm/hugetlb.c:300 [inline] hugetlb_vma_lock_write+0xae/0x100 mm/hugetlb.c:291 __hugetlb_zap_begin+0x1e9/0x2b0 mm/hugetlb.c:5447 hugetlb_zap_begin include/linux/hugetlb.h:258 [inline] unmap_vmas+0x2f4/0x470 mm/memory.c:1733 exit_mmap+0x1ad/0xa60 mm/mmap.c:3230 __mmput+0x12a/0x4d0 kernel/fork.c:1349 mmput+0x62/0x70 kernel/fork.c:1371 exit_mm kernel/exit.c:567 [inline] do_exit+0x9ad/0x2a20 kernel/exit.c:861 __do_sys_exit kernel/exit.c:991 [inline] __se_sys_exit kernel/exit.c:989 [inline] __x64_sys_exit+0x42/0x50 kernel/exit.c:989 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Mask off low bit flags before checking for NULL pointer. In addition, the reserve map only 'belongs' to the OWNER (parent in parent/child relationships) so also check for the OWNER flag. Link: https://lkml.kernel.org/r/20231114012033.259600-1-mike.kravetz@oracle.com Reported-by: syzbot+6ada951e7c0f7bc8a71e@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-mm/00000000000078d1e00608d7878b@google.com/ Fixes: bf49169 ("hugetlbfs: extend hugetlb_vma_lock to private VMAs") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Rik van Riel <riel@surriel.com> Cc: Edward Adam Davis <eadavis@qq.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Tom Rix <trix@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Kiciuk
pushed a commit
that referenced
this pull request
Dec 3, 2023
…oup() Erhard reported that the 6.7-rc1 kernel panics on boot if being built with clang-16. The problem was not reproducible with gcc. [ 5.975049] general protection fault, probably for non-canonical address 0xf555515555555557: 0000 [#1] SMP KASAN PTI [ 5.976422] KASAN: maybe wild-memory-access in range [0xaaaaaaaaaaaaaab8-0xaaaaaaaaaaaaaabf] [ 5.977475] CPU: 3 PID: 1 Comm: systemd Not tainted 6.7.0-rc1-Zen3 msm8953-mainline#77 [ 5.977860] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 [ 5.977860] RIP: 0010:obj_cgroup_charge_pages+0x27/0x2d5 [ 5.977860] Code: 90 90 90 55 41 57 41 56 41 55 41 54 53 89 d5 41 89 f6 49 89 ff 48 b8 00 00 00 00 00 fc ff df 49 83 c7 10 4d3 [ 5.977860] RSP: 0018:ffffc9000001fb18 EFLAGS: 00010a02 [ 5.977860] RAX: dffffc0000000000 RBX: aaaaaaaaaaaaaaaa RCX: ffff8883eb9a8b08 [ 5.977860] RDX: 0000000000000005 RSI: 0000000000400cc0 RDI: aaaaaaaaaaaaaaaa [ 5.977860] RBP: 0000000000000005 R08: 3333333333333333 R09: 0000000000000000 [ 5.977860] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8883eb9a8b18 [ 5.977860] R13: 1555555555555557 R14: 0000000000400cc0 R15: aaaaaaaaaaaaaaba [ 5.977860] FS: 00007f2976438b40(0000) GS:ffff8883eb980000(0000) knlGS:0000000000000000 [ 5.977860] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5.977860] CR2: 00007f29769e0060 CR3: 0000000107222003 CR4: 0000000000370eb0 [ 5.977860] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 5.977860] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 5.977860] Call Trace: [ 5.977860] <TASK> [ 5.977860] ? __die_body+0x16/0x75 [ 5.977860] ? die_addr+0x4a/0x70 [ 5.977860] ? exc_general_protection+0x1c9/0x2d0 [ 5.977860] ? cgroup_mkdir+0x455/0x9fb [ 5.977860] ? __x64_sys_mkdir+0x69/0x80 [ 5.977860] ? asm_exc_general_protection+0x26/0x30 [ 5.977860] ? obj_cgroup_charge_pages+0x27/0x2d5 [ 5.977860] obj_cgroup_charge+0x114/0x1ab [ 5.977860] pcpu_alloc+0x1a6/0xa65 [ 5.977860] ? mem_cgroup_css_alloc+0x1eb/0x1140 [ 5.977860] ? cgroup_apply_control_enable+0x26b/0x7c0 [ 5.977860] mem_cgroup_css_alloc+0x23f/0x1140 [ 5.977860] cgroup_apply_control_enable+0x26b/0x7c0 [ 5.977860] ? cgroup_kn_set_ugid+0x2d/0x1a0 [ 5.977860] cgroup_mkdir+0x455/0x9fb [ 5.977860] ? __cfi_cgroup_mkdir+0x10/0x10 [ 5.977860] kernfs_iop_mkdir+0x130/0x170 [ 5.977860] vfs_mkdir+0x405/0x530 [ 5.977860] do_mkdirat+0x188/0x1f0 [ 5.977860] __x64_sys_mkdir+0x69/0x80 [ 5.977860] do_syscall_64+0x7d/0x100 [ 5.977860] ? do_syscall_64+0x89/0x100 [ 5.977860] ? do_syscall_64+0x89/0x100 [ 5.977860] ? do_syscall_64+0x89/0x100 [ 5.977860] ? do_syscall_64+0x89/0x100 [ 5.977860] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 5.977860] RIP: 0033:0x7f297671defb [ 5.977860] Code: 8b 05 39 7f 0d 00 bb ff ff ff ff 64 c7 00 16 00 00 00 e9 61 ff ff ff e8 23 0c 02 00 0f 1f 00 f3 0f 1e fa b88 [ 5.977860] RSP: 002b:00007ffee6242bb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000053 [ 5.977860] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f297671defb [ 5.977860] RDX: 0000000000000000 RSI: 00000000000001ed RDI: 000055c6b449f0e0 [ 5.977860] RBP: 00007ffee6242bf0 R08: 000000000000000e R09: 0000000000000000 [ 5.977860] R10: 0000000000000000 R11: 0000000000000246 R12: 000055c6b445db80 [ 5.977860] R13: 00000000000003a0 R14: 00007f2976a68651 R15: 00000000000003a0 [ 5.977860] </TASK> [ 5.977860] Modules linked in: [ 6.014095] ---[ end trace 0000000000000000 ]--- [ 6.014701] RIP: 0010:obj_cgroup_charge_pages+0x27/0x2d5 [ 6.015348] Code: 90 90 90 55 41 57 41 56 41 55 41 54 53 89 d5 41 89 f6 49 89 ff 48 b8 00 00 00 00 00 fc ff df 49 83 c7 10 4d3 [ 6.017575] RSP: 0018:ffffc9000001fb18 EFLAGS: 00010a02 [ 6.018255] RAX: dffffc0000000000 RBX: aaaaaaaaaaaaaaaa RCX: ffff8883eb9a8b08 [ 6.019120] RDX: 0000000000000005 RSI: 0000000000400cc0 RDI: aaaaaaaaaaaaaaaa [ 6.019983] RBP: 0000000000000005 R08: 3333333333333333 R09: 0000000000000000 [ 6.020849] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8883eb9a8b18 [ 6.021747] R13: 1555555555555557 R14: 0000000000400cc0 R15: aaaaaaaaaaaaaaba [ 6.022609] FS: 00007f2976438b40(0000) GS:ffff8883eb980000(0000) knlGS:0000000000000000 [ 6.023593] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6.024296] CR2: 00007f29769e0060 CR3: 0000000107222003 CR4: 0000000000370eb0 [ 6.025279] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 6.026139] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 6.027000] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b Actually the problem is caused by uninitialized local variable in current_obj_cgroup(). If the root memory cgroup is set as an active memory cgroup for a charging scope (as in the trace, where systemd tries to create the first non-root cgroup, so the parent cgroup is the root cgroup), the "for" loop is skipped and uninitialized objcg is returned, causing a panic down the accounting stack. The fix is trivial: initialize the objcg variable to NULL unconditionally before the "for" loop. Link: https://lkml.kernel.org/r/20231116025109.3775055-1-roman.gushchin@linux.dev Fixes: e86828e ("mm: kmem: scoped objcg protection") Signed-off-by: Roman Gushchin (Cruise) <roman.gushchin@linux.dev> Reported-by: Erhard Furtner <erhard_f@mailbox.org> Closes: ClangBuiltLinux#1959 Tested-by: Erhard Furtner <erhard_f@mailbox.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Shakeel Butt <shakeelb@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Kiciuk
pushed a commit
that referenced
this pull request
Dec 3, 2023
Patch series "stackdepot: allow evicting stack traces", v4. Currently, the stack depot grows indefinitely until it reaches its capacity. Once that happens, the stack depot stops saving new stack traces. This creates a problem for using the stack depot for in-field testing and in production. For such uses, an ideal stack trace storage should: 1. Allow saving fresh stack traces on systems with a large uptime while limiting the amount of memory used to store the traces; 2. Have a low performance impact. Implementing #1 in the stack depot is impossible with the current keep-forever approach. This series targets to address that. Issue #2 is left to be addressed in a future series. This series changes the stack depot implementation to allow evicting unneeded stack traces from the stack depot. The users of the stack depot can do that via new stack_depot_save_flags(STACK_DEPOT_FLAG_GET) and stack_depot_put APIs. Internal changes to the stack depot code include: 1. Storing stack traces in fixed-frame-sized slots (vs precisely-sized slots in the current implementation); the slot size is controlled via CONFIG_STACKDEPOT_MAX_FRAMES (default: 64 frames); 2. Keeping available slots in a freelist (vs keeping an offset to the next free slot); 3. Using a read/write lock for synchronization (vs a lock-free approach combined with a spinlock). This series also integrates the eviction functionality into KASAN: the tag-based modes evict stack traces when the corresponding entry leaves the stack ring, and Generic KASAN evicts stack traces for objects once those leave the quarantine. With KASAN, despite wasting some space on rounding up the size of each stack record, the total memory consumed by stack depot gets saturated due to the eviction of irrelevant stack traces from the stack depot. With the tag-based KASAN modes, the average total amount of memory used for stack traces becomes ~0.5 MB (with the current default stack ring size of 32k entries and the default CONFIG_STACKDEPOT_MAX_FRAMES of 64). With Generic KASAN, the stack traces take up ~1 MB per 1 GB of RAM (as the quarantine's size depends on the amount of RAM). However, with KMSAN, the stack depot ends up using ~4x more memory per a stack trace than before. Thus, for KMSAN, the stack depot capacity is increased accordingly. KMSAN uses a lot of RAM for shadow memory anyway, so the increased stack depot memory usage will not make a significant difference. Other users of the stack depot do not save stack traces as often as KASAN and KMSAN. Thus, the increased memory usage is taken as an acceptable trade-off. In the future, these other users can take advantage of the eviction API to limit the memory waste. There is no measurable boot time performance impact of these changes for KASAN on x86-64. I haven't done any tests for arm64 modes (the stack depot without performance optimizations is not suitable for intended use of those anyway), but I expect a similar result. Obtaining and copying stack trace frames when saving them into stack depot is what takes the most time. This series does not yet provide a way to configure the maximum size of the stack depot externally (e.g. via a command-line parameter). This will be added in a separate series, possibly together with the performance improvement changes. This patch (of 22): Currently, if stack_depot_disable=off is passed to the kernel command-line after stack_depot_disable=on, stack depot prints a message that it is disabled, while it is actually enabled. Fix this by moving printing the disabled message to stack_depot_early_init. Place it before the __stack_depot_early_init_requested check, so that the message is printed even if early stack depot init has not been requested. Also drop the stack_table = NULL assignment from disable_stack_depot, as stack_table is NULL by default. Link: https://lkml.kernel.org/r/cover.1700502145.git.andreyknvl@google.com Link: https://lkml.kernel.org/r/73a25c5fff29f3357cd7a9330e85e09bc8da2cbe.1700502145.git.andreyknvl@google.com Fixes: e1fdc40 ("lib: stackdepot: add support to disable stack depot") Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Kiciuk
pushed a commit
that referenced
this pull request
Dec 3, 2023
Petr Machata says: ==================== mlxsw: Support CFF flood mode The registers to configure to initialize a flood table differ between the controlled and CFF flood modes. In therefore needs to be an op. Add it, hook up the current init to the existing families, and invoke the op. PGT is an in-HW table that maps addresses to sets of ports. Then when some HW process needs a set of ports as an argument, instead of embedding the actual set in the dynamic configuration, what gets configured is the address referencing the set. The HW then works with the appropriate PGT entry. Among other allocations, the PGT currently contains two large blocks for bridge flooding: one for 802.1q and one for 802.1d. Within each of these blocks are three tables, for unknown-unicast, multicast and broadcast flooding: . . . | 802.1q | 802.1d | . . . | UC | MC | BC | UC | MC | BC | \______ _____/ \_____ ______/ v v FID flood vectors Thus each FID (which corresponds to an 802.1d bridge or one VLAN in an 802.1q bridge) uses three flood vectors spread across a fairly large region of PGT. This way of organizing the flood table (called "controlled") is not very flexible. E.g. to decrease a bridge scale and store more IP MC vectors, one would need to completely rewrite the bridge PGT blocks, or resort to hacks such as storing individual MC flood vectors into unused part of the bridge table. In order to address these shortcomings, Spectrum-2 and above support what is called CFF flood mode, for Compressed FID Flooding. In CFF flood mode, each FID has a little table of its own, with three entries adjacent to each other, one for unknown-UC, one for MC, one for BC. This allows for a much more fine-grained approach to PGT management, where bits of it are allocated on demand. . . . | FID | FID | FID | FID | FID | . . . |U|M|B|U|M|B|U|M|B|U|M|B|U|M|B| \_____________ _____________/ v FID flood vectors Besides the FID table organization, the CFF flood mode also impacts Router Subport (RSP) table. This table contains flood vectors for rFIDs, which are FIDs that reference front panel ports or LAGs. The RSP table contains two entries per front panel port and LAG, one for unknown-UC traffic, and one for everything else. Currently, the FW allocates and manages the table in its own part of PGT. rFIDs are marked with flood_rsp bit and managed specially. In CFF mode, rFIDs are managed as all other FIDs. The driver therefore has to allocate and maintain the flood vectors. Like with bridge FIDs, this is more work, but increases flexibility of the system. The FW currently supports both the controlled and CFF flood modes. To shed complexity, in the future it should only support CFF flood mode. Hence this patchset, which adds CFF flood mode support to mlxsw. Since mlxsw needs to maintain both the controlled mode as well as CFF mode support, we will keep the layout as compatible as possible. The bridge tables will stay in the same overall shape, just their inner organization will change from flood mode -> FID to FID -> flood mode. Likewise will RSP be kept as a contiguous block of PGT memory, as was the case when the FW maintained it. - The way FIDs get configured under the CFF flood mode differs from the currently used controlled mode. The simple approach of having several globally visible arrays for spectrum.c to statically choose from no longer works. Patch #1 thus privatizes all FID initialization and finalization logic, and exposes it as ops instead. - Patch #2 renames the ops that are specific to the controlled mode, to make room in the namespace for the CFF variants. Patch #3 extracts a helper to compute flood table base out of mlxsw_sp_fid_flood_table_mid(). - The op fid_setup configured fid_offset, i.e. the number of this FID within its family. For rFIDs in CFF mode, to determine this number, the driver will need to do fallible queries. Thus in patch #4, make the FID setup operation fallible as well. - Flood mode initialization routine differs between the controlled and CFF flood modes. The controlled mode needs to configure flood table layout, which the CFF mode does not need to do. In patch #5, move mlxsw_sp_fid_flood_table_init() up so that the following patch can make use of it. In patch #6, add an op to be invoked per table (if defined). - The current way of determining PGT allocation size depends on the number of FIDs and number of flood tables. RFIDs however have PGT footprint depending not on number of FIDs, but on number of ports and LAGs, because which ports an rFID should flood to does not depend on the FID itself, but on the port or LAG that it references. Therefore in patch #7, add FID family ops for determining PGT allocation size. - As elaborated above, layout of PGT will differ between controlled and CFF flood modes. In CFF mode, it will further differ between rFIDs and other FIDs (as described at previous patch). The way to pack the SFMR register to configure a FID will likewise differ from controlled to CFF. Thus in patches #8 and #9 add FID family ops to determine PGT base address for a FID and to pack SFMR. - Patches #10 and #11 add more bits for RSP support. In patch #10, add a new traffic type enumerator, for non-UC traffic. This is a combination of BC and MC traffic, but the way that mlxsw maps these mnemonic names to actual traffic type configurations requires that we have a new name to describe this class of traffic. Patch #11 then adds hooks necessary for RSP table maintenance. As ports come and go, and join and leave LAGs, it is necessary to update flood vectors that the rFIDs use. These new hooks will make that possible. - Patches msm8953-mainline#12, msm8953-mainline#13 and msm8953-mainline#14 introduce flood profiles. These have been implicit so far, but the way that CFF flood mode works with profile IDs requires that we make them explicit. Thus in patch msm8953-mainline#12, introduce flood profile objects as a set of flood tables that FID families then refer to. The FID code currently only uses a single flood profile. In patch msm8953-mainline#13, add a flood profile ID to flood profile objects. In patch msm8953-mainline#14, when in CFF mode, configure SFFP according to the existing flood profiles (or the one that exists as of that point). - Patches msm8953-mainline#15 and msm8953-mainline#16 add code to implement, respectively, bridge FIDs and RSP FIDs in CFF mode. - In patch msm8953-mainline#17, toggle flood_mode_prefer_cff on Spectrum-2 and above, which makes the newly-added code live. ==================== Link: https://lore.kernel.org/r/cover.1701183891.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kiciuk
pushed a commit
that referenced
this pull request
Dec 3, 2023
syzbot was able to trigger a crash [1] in page_pool_unlist() page_pool_list() only inserts a page pool into a netdev page pool list if a netdev was set in params. Even if the kzalloc() call in page_pool_create happens to initialize pool->user.list, I chose to be more explicit in page_pool_list() adding one INIT_HLIST_NODE(). We could test in page_pool_unlist() if netdev was set, but since netdev can be changed to lo, it seems more robust to check if pool->user.list is hashed before calling hlist_del(). [1] Illegal XDP return value 4294946546 on prog (id 2) dev N/A, expect packet loss! general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 0 PID: 5064 Comm: syz-executor391 Not tainted 6.7.0-rc2-syzkaller-00533-ga379972973a8 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023 RIP: 0010:__hlist_del include/linux/list.h:988 [inline] RIP: 0010:hlist_del include/linux/list.h:1002 [inline] RIP: 0010:page_pool_unlist+0xd1/0x170 net/core/page_pool_user.c:342 Code: df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 90 00 00 00 4c 8b a3 f0 06 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 75 68 48 85 ed 49 89 2c 24 74 24 e8 1b ca 07 f9 48 8d RSP: 0018:ffffc900039ff768 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: ffff88814ae02000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff88814ae026f0 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff1d57fdc R10: ffffffff8eabfee3 R11: ffffffff8aa0008b R12: 0000000000000000 R13: ffff88814ae02000 R14: dffffc0000000000 R15: 0000000000000001 FS: 000055555717a380(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000002555398 CR3: 0000000025044000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __page_pool_destroy net/core/page_pool.c:851 [inline] page_pool_release+0x507/0x6b0 net/core/page_pool.c:891 page_pool_destroy+0x1ac/0x4c0 net/core/page_pool.c:956 xdp_test_run_teardown net/bpf/test_run.c:216 [inline] bpf_test_run_xdp_live+0x1578/0x1af0 net/bpf/test_run.c:388 bpf_prog_test_run_xdp+0x827/0x1530 net/bpf/test_run.c:1254 bpf_prog_test_run kernel/bpf/syscall.c:4041 [inline] __sys_bpf+0x11bf/0x4920 kernel/bpf/syscall.c:5402 __do_sys_bpf kernel/bpf/syscall.c:5488 [inline] __se_sys_bpf kernel/bpf/syscall.c:5486 [inline] __x64_sys_bpf+0x78/0xc0 kernel/bpf/syscall.c:5486 Fixes: 083772c ("net: page_pool: record pools per netdev") Reported-and-tested-by: syzbot+f9f8efb58a4db2ca98d0@syzkaller.appspotmail.com Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20231130092259.3797753-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kiciuk
pushed a commit
that referenced
this pull request
Mar 27, 2024
Recovery remote processor failed when wdg irq received: [ 0.842574] remoteproc remoteproc0: crash detected in cix-dsp-rproc: type watchdog [ 0.842750] remoteproc remoteproc0: handling crash #1 in cix-dsp-rproc [ 0.842824] remoteproc remoteproc0: recovering cix-dsp-rproc [ 0.843342] remoteproc remoteproc0: stopped remote processor cix-dsp-rproc [ 0.847901] rproc-virtio rproc-virtio.0.auto: Failed to associate buffer [ 0.847979] remoteproc remoteproc0: failed to probe subdevices for cix-dsp-rproc: -16 The reason is that dma coherent mem would not be released when recovering the remote processor, due to rproc_virtio_remove() would not be called, where the mem released. It will fail when it try to allocate and associate buffer again. Releasing reserved memory from rproc_virtio_dev_release(), instead of rproc_virtio_remove(). Fixes: 1d7b61c ("remoteproc: virtio: Create platform device for the remoteproc_virtio") Signed-off-by: Joakim Zhang <joakim.zhang@cixtech.com> Acked-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231217053659.3245745-1-joakim.zhang@cixtech.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Kiciuk
pushed a commit
that referenced
this pull request
Mar 27, 2024
As of commit d740251 ("arm64: smp: IPI_CPU_STOP and IPI_CPU_CRASH_STOP should try for NMI"), if we've got pseudo-NMI enabled then we'll use it to stop CPUs at panic time. This is nice, but it does mean that there's a pretty good chance that we'll end up stopping a CPU while it holds the port lock for the console UART. Specifically, I see a CPU get stopped while holding the port lock nearly 100% of the time on my sc7180-trogdor based Chromebook by enabling the "buddy" hardlockup detector and then doing: sysctl -w kernel.hardlockup_all_cpu_backtrace=1 sysctl -w kernel.hardlockup_panic=1 echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT UART drivers are _supposed_ to handle this case OK and this is why UART drivers check "oops_in_progress" and only do a "trylock" in that case. However, before we enabled pseudo-NMI to stop CPUs it wasn't a very well-tested situation. Now that we're testing the situation a lot, it can be seen that the Qualcomm GENI UART driver is pretty broken. Specifically, when I run my test case and look at the console output I just see a bunch of garbled output like: [ 201.069084] NMI backtrace[ 201.069084] NM[ 201.069087] CPU: 6 PID: 10296 Comm: dnsproxyd Not tainted 6.7.0-06265-gb13e8c0ede12 #1 01112b9f14923cbd0b[ 201.069090] Hardware name: Google Lazor ([ 201.069092] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DI[ 201.069095] pc : smp_call_function_man[ 201.069099] That's obviously not so great. This happens because each call to the console driver exits after the data has been written to the FIFO but before it's actually been flushed out of the serial port. When we have multiple calls into the console one after the other then (if we can't get the lock) each call tells the UART to throw away any data in the FIFO that hadn't been transferred yet. I've posted up a patch to change the arm64 core to avoid this situation most of the time [1] much like x86 seems to do, but even if that patch lands the GENI driver should still be fixed. >From testing, it appears that we can just delete the cancel/abort in the case where we weren't able to get the UART lock and the output looks good. It makes sense that we'd be able to do this since that means we'll just call into __qcom_geni_serial_console_write() and __qcom_geni_serial_console_write() looks much like qcom_geni_serial_poll_put_char() but with a loop. However, it seems safest to poll the FIFO and make sure it's empty before our transfer. This should reliably make sure that we're not interrupting/clobbering any existing transfers. As part of this change, we'll also avoid re-setting up a TX at the end of the console write function if we weren't able to get the lock, since accessing "port->tx_remaining" without the lock is not safe. This is only needed to re-start userspace initiated transfers. [1] https://lore.kernel.org/r/20231207170251.1.Id4817adef610302554b8aa42b090d57270dc119c@changeid Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20240112150307.2.Idb1553d1d22123c377f31eacb4486432f6c9ac8d@changeid Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kiciuk
pushed a commit
that referenced
this pull request
Mar 27, 2024
Every other architecture in Linux includes the line "Call trace:" before backtraces. In some cases ARM would print "Backtrace:", but this was only via 1 specific call path, and wasn't included in CPU Oops nor things like KASAN, UBSAN, etc that called dump_stack(). Regularize this line so CI systems and other things (like LKDTM) that depend on parsing "Call trace:" out of dmesg will see it for ARM. Before this patch: UBSAN: array-index-out-of-bounds in ../drivers/misc/lkdtm/bugs.c:376:16 index 8 is out of range for type 'char [8]' CPU: 0 PID: 1402 Comm: cat Not tainted 6.7.0-rc2 #1 Hardware name: Generic DT based system dump_backtrace from show_stack+0x20/0x24 r7:00000042 r6:00000000 r5:60070013 r4:80cf5d7c show_stack from dump_stack_lvl+0x88/0x98 dump_stack_lvl from dump_stack+0x18/0x1c r7:00000042 r6:00000008 r5:00000008 r4:80fab118 dump_stack from ubsan_epilogue+0x10/0x3c ubsan_epilogue from __ubsan_handle_out_of_bounds+0x80/0x84 ... After this patch: UBSAN: array-index-out-of-bounds in ../drivers/misc/lkdtm/bugs.c:376:16 index 8 is out of range for type 'char [8]' CPU: 0 PID: 1402 Comm: cat Not tainted 6.7.0-rc2 #1 Hardware name: Generic DT based system Call trace: dump_backtrace from show_stack+0x20/0x24 r7:00000042 r6:00000000 r5:60070013 r4:80cf5d7c show_stack from dump_stack_lvl+0x88/0x98 dump_stack_lvl from dump_stack+0x18/0x1c r7:00000042 r6:00000008 r5:00000008 r4:80fab118 dump_stack from ubsan_epilogue+0x10/0x3c ubsan_epilogue from __ubsan_handle_out_of_bounds+0x80/0x84 ... Link: https://lore.kernel.org/r/20240110215554.work.460-kees@kernel.org Reported-by: Mark Brown <broonie@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Vladimir Murzin <vladimir.murzin@arm.com> Cc: Zhen Lei <thunder.leizhen@huawei.com> Cc: Keith Packard <keithpac@amazon.com> Cc: Haibo Li <haibo.li@mediatek.com> Cc: <linux-arm-kernel@lists.infradead.org> Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Kiciuk
pushed a commit
that referenced
this pull request
Mar 27, 2024
PJ4 is a v7 core that incorporates a iWMMXt coprocessor. However, GCC does not support this combination (its iWMMXt configuration always implies v5te), and so there is no v6/v7 user space that actually makes use of this, beyond generic support for things like setjmp() that preserve/restore the iWMMXt register file using generic LDC/STC instructions emitted in assembler. As [0] appears to imply, this logic is triggered for the init process at boot, and so most user threads will have a iWMMXt register context associated with it, even though it is never used. At this point, it is highly unlikely that such GCC support will ever materialize (and Clang does not implement support for iWMMXt to begin with). This means that advertising iWMMXt support on these cores results in context switch overhead without any associated benefit, and so it is better to simply ignore the iWMMXt unit on these systems. So rip out the support. Doing so also fixes the issue reported in [0] related to UNDEF handling of co-processor #0/#1 instructions issued from user space running in Thumb2 mode. The PJ4 cores are used in four platforms: Armada 370/xp, Dove (Cubox, d2plug), MMP2 (xo-1.75) and Berlin (Google TV). Out of these, only the first is still widely used, but that one actually doesn't have iWMMXt but instead has only VFPV3-D16, and so it is not impacted by this change. Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218427 [0] Fixes: 8bcba70 ("ARM: entry: Disregard Thumb undef exception ...") Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Nicolas Pitre <nico@fluxnic.net> Reviewed-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Kiciuk
pushed a commit
that referenced
this pull request
Mar 27, 2024
…resses Since commit a4d5613 ("arm: extend pfn_valid to take into account freed memory map alignment") changes the semantics of pfn_valid() to check presence of the memory map for a PFN. A valid page for an address which is reserved but not mapped by the kernel[1], the system crashed during some uio test with the following memory layout: node 0: [mem 0x00000000c0a00000-0x00000000cc8fffff] node 0: [mem 0x00000000d0000000-0x00000000da1fffff] the uio layout is:0xc0900000, 0x100000 the crash backtrace like: Unable to handle kernel paging request at virtual address bff00000 [...] CPU: 1 PID: 465 Comm: startapp.bin Tainted: G O 5.10.0 #1 Hardware name: Generic DT based system PC is at b15_flush_kern_dcache_area+0x24/0x3c LR is at __sync_icache_dcache+0x6c/0x98 [...] (b15_flush_kern_dcache_area) from (__sync_icache_dcache+0x6c/0x98) (__sync_icache_dcache) from (set_pte_at+0x28/0x54) (set_pte_at) from (remap_pfn_range+0x1a0/0x274) (remap_pfn_range) from (uio_mmap+0x184/0x1b8 [uio]) (uio_mmap [uio]) from (__mmap_region+0x264/0x5f4) (__mmap_region) from (__do_mmap_mm+0x3ec/0x440) (__do_mmap_mm) from (do_mmap+0x50/0x58) (do_mmap) from (vm_mmap_pgoff+0xfc/0x188) (vm_mmap_pgoff) from (ksys_mmap_pgoff+0xac/0xc4) (ksys_mmap_pgoff) from (ret_fast_syscall+0x0/0x5c) Code: e0801001 e2423001 e1c00003 f57ff04f (ee070f3e) ---[ end trace 09cf0734c3805d52 ]--- Kernel panic - not syncing: Fatal exception So check if PG_reserved was set to solve this issue. [1]: https://lore.kernel.org/lkml/Zbtdue57RO0QScJM@linux.ibm.com/ Fixes: a4d5613 ("arm: extend pfn_valid to take into account freed memory map alignment") Suggested-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Kiciuk
pushed a commit
that referenced
this pull request
Mar 27, 2024
Use rbi->len instead of rcd->len for non-dataring packet. Found issue: XDP_WARN: xdp_update_frame_from_buff(line:278): Driver BUG: missing reserved tailroom WARNING: CPU: 0 PID: 0 at net/core/xdp.c:586 xdp_warn+0xf/0x20 CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W O 6.5.1 #1 RIP: 0010:xdp_warn+0xf/0x20 ... ? xdp_warn+0xf/0x20 xdp_do_redirect+0x15f/0x1c0 vmxnet3_run_xdp+0x17a/0x400 [vmxnet3] vmxnet3_process_xdp+0xe4/0x760 [vmxnet3] ? vmxnet3_tq_tx_complete.isra.0+0x21e/0x2c0 [vmxnet3] vmxnet3_rq_rx_complete+0x7ad/0x1120 [vmxnet3] vmxnet3_poll_rx_only+0x2d/0xa0 [vmxnet3] __napi_poll+0x20/0x180 net_rx_action+0x177/0x390 Reported-by: Martin Zaharinov <micron10@gmail.com> Tested-by: Martin Zaharinov <micron10@gmail.com> Link: https://lore.kernel.org/netdev/74BF3CC8-2A3A-44FF-98C2-1E20F110A92E@gmail.com/ Fixes: 54f00cc ("vmxnet3: Add XDP support.") Signed-off-by: William Tu <witu@nvidia.com> Link: https://lore.kernel.org/r/20240309183147.28222-1-witu@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Kiciuk
pushed a commit
that referenced
this pull request
Mar 27, 2024
Fix race condition leading to system crash during EEH error handling During EEH error recovery, the bnx2x driver's transmit timeout logic could cause a race condition when handling reset tasks. The bnx2x_tx_timeout() schedules reset tasks via bnx2x_sp_rtnl_task(), which ultimately leads to bnx2x_nic_unload(). In bnx2x_nic_unload() SGEs are freed using bnx2x_free_rx_sge_range(). However, this could overlap with the EEH driver's attempt to reset the device using bnx2x_io_slot_reset(), which also tries to free SGEs. This race condition can result in system crashes due to accessing freed memory locations in bnx2x_free_rx_sge() 799 static inline void bnx2x_free_rx_sge(struct bnx2x *bp, 800 struct bnx2x_fastpath *fp, u16 index) 801 { 802 struct sw_rx_page *sw_buf = &fp->rx_page_ring[index]; 803 struct page *page = sw_buf->page; .... where sw_buf was set to NULL after the call to dma_unmap_page() by the preceding thread. EEH: Beginning: 'slot_reset' PCI 0011:01:00.0#10000: EEH: Invoking bnx2x->slot_reset() bnx2x: [bnx2x_io_slot_reset:14228(eth1)]IO slot reset initializing... bnx2x 0011:01:00.0: enabling device (0140 -> 0142) bnx2x: [bnx2x_io_slot_reset:14244(eth1)]IO slot reset --> driver unload Kernel attempted to read user page (0) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x00000000 Faulting instruction address: 0xc0080000025065fc Oops: Kernel access of bad area, sig: 11 [#1] ..... Call Trace: [c000000003c67a20] [c00800000250658c] bnx2x_io_slot_reset+0x204/0x610 [bnx2x] (unreliable) [c000000003c67af0] [c0000000000518a8] eeh_report_reset+0xb8/0xf0 [c000000003c67b60] [c000000000052130] eeh_pe_report+0x180/0x550 [c000000003c67c70] [c00000000005318c] eeh_handle_normal_event+0x84c/0xa60 [c000000003c67d50] [c000000000053a84] eeh_event_handler+0xf4/0x170 [c000000003c67da0] [c000000000194c58] kthread+0x1c8/0x1d0 [c000000003c67e10] [c00000000000cf64] ret_from_kernel_thread+0x5c/0x64 To solve this issue, we need to verify page pool allocations before freeing. Fixes: 4cace67 ("bnx2x: Alloc 4k fragment for each rx ring buffer element") Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240315205535.1321-1-thinhtr@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kiciuk
pushed a commit
that referenced
this pull request
Mar 27, 2024
The boot sequence evaluates CPUID information twice: 1) During early boot 2) When finalizing the early setup right before mitigations are selected and alternatives are patched. In both cases the evaluation is stored in boot_cpu_data, but on UP the copying of boot_cpu_data to the per CPU info of the boot CPU happens between #1 and #2. So any update which happens in #2 is never propagated to the per CPU info instance. Consolidate the whole logic and copy boot_cpu_data right before applying alternatives as that's the point where boot_cpu_data is in it's final state and not supposed to change anymore. This also removes the voodoo mb() from smp_prepare_cpus_common() which had absolutely no purpose. Fixes: 71eb489 ("x86/percpu: Cure per CPU madness on UP") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20240322185305.127642785@linutronix.de
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Never did such stuff before.....