Enable KSU, Patch SUSFS, Minor fixes #1
Merged
Member
I know I am late, paimon/something else will be back soon
amritokun
pushed a commit
that referenced
this pull request
Oct 26, 2025
Due to a historical oversight, we emit a redundant static branch for each atomic/atomic64 operation when CONFIG_ARM64_LSE_ATOMICS is selected. We can safely remove this, making the kernel Image reasonably smaller.

When CONFIG_ARM64_LSE_ATOMICS is selected, every LSE atomic operation has two preceding static branches with the same target, e.g.

    b       f7c <kernel_init_freeable+0xa4>
    b       f7c <kernel_init_freeable+0xa4>
    mov     w0, #0x1                        // #1
    ldadd   w0, w0, [x19]

This is because the __lse_ll_sc_body() wrapper uses system_uses_lse_atomics(), which checks both `arm64_const_caps_ready` and `cpu_hwcap_keys[ARM64_HAS_LSE_ATOMICS]`, each of which emits a static branch. This has been the case since commit:

    addfc38 ("arm64: atomics: avoid out-of-line ll/sc atomics")

However, there was never a need to check `arm64_const_caps_ready`, which was itself introduced in commit:

    63a1e1c ("arm64/cpufeature: don't use mutex in bringup path")

... so that cpus_have_const_cap() could fall back to checking the `cpu_hwcaps` bitmap prior to the static keys for individual caps becoming enabled. As system_uses_lse_atomics() doesn't check `cpu_hwcaps`, and doesn't need to as we can safely use the LL/SC atomics prior to enabling the `ARM64_HAS_LSE_ATOMICS` static key, it doesn't need to check `arm64_const_caps_ready`.

This patch removes the `arm64_const_caps_ready` check from system_uses_lse_atomics(). As the arch_atomic_* routines are meant to be safely usable in noinstr code, I've also marked system_uses_lse_atomics() as __always_inline.

This results in one fewer static branch per atomic operation, with the prior example becoming:

    b       f78 <kernel_init_freeable+0xa0>
    mov     w0, #0x1                        // #1
    ldadd   w0, w0, [x19]

Each static branch consists of the branch itself and an associated __jump_table entry. Removing these has a reasonable impact on the Image size, with a GCC 11.1.0 defconfig v5.17-rc2 Image being reduced by 128KiB:

| [mark@lakrids:~/src/linux]% ls -al Image*
| -rw-r--r-- 1 mark mark 34619904 Feb  3 18:24 Image.baseline
| -rw-r--r-- 1 mark mark 34488832 Feb  3 18:33 Image.onebranch

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220204104439.270567-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
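The change to system_uses_lse_atomics() described above can be sketched in user space. This is only a model under stated assumptions: the struct `caps_model` and both function names are hypothetical, and plain booleans stand in for the kernel's static keys, so the sketch shows the shape of the check rather than the generated branches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Plain booleans standing in for the two static keys named in the commit. */
struct caps_model {
    bool const_caps_ready; /* models arm64_const_caps_ready */
    bool has_lse_atomics;  /* models cpu_hwcap_keys[ARM64_HAS_LSE_ATOMICS] */
};

/* Shape before the patch: two tests, hence two static branches per call site. */
static bool uses_lse_before(const struct caps_model *c)
{
    assert(c != NULL);
    return c->const_caps_ready && c->has_lse_atomics;
}

/* Shape after the patch: only the per-capability key is tested. */
static bool uses_lse_after(const struct caps_model *c)
{
    assert(c != NULL);
    return c->has_lse_atomics;
}
```

In this model the two versions agree whenever `const_caps_ready` is set, so only the extra test (and, in the kernel, its branch and __jump_table entry) goes away.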
amritokun
pushed a commit
that referenced
this pull request
Oct 26, 2025
Aneesh reported that:

    tlb_flush_mmu()
      tlb_flush_mmu_tlbonly()
        tlb_flush()                     <-- #1
      tlb_flush_mmu_free()
        tlb_table_flush()
          tlb_table_invalidate()
            tlb_flush_mmu_tlbonly()
              tlb_flush()               <-- #2

does two TLBIs when tlb->fullmm, because __tlb_reset_range() will not clear tlb->end in that case.

Observe that any caller to __tlb_adjust_range() also sets at least one of the tlb->freed_tables || tlb->cleared_p* bits, and those are unconditionally cleared by __tlb_reset_range().

Change the condition for actually issuing TLBI to having one of those bits set, as opposed to having tlb->end != 0.

Link: http://lkml.kernel.org/r/20200116064531.483522-4-aneesh.kumar@linux.ibm.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reported-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
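A small user-space model may help show why the old condition flushes twice for a fullmm gather. Everything here is an assumption for illustration: `gather_model` and the function names are hypothetical, only one cleared_p* bit is modelled, and the return value simply counts how many TLBIs would be issued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduced stand-in for struct mmu_gather; only the fields the commit discusses. */
struct gather_model {
    unsigned long start, end;
    unsigned int fullmm : 1;
    unsigned int freed_tables : 1;
    unsigned int cleared_ptes : 1;
};

/* Models __tlb_reset_range(): for fullmm the range is NOT emptied, the bits always are. */
static void reset_range_model(struct gather_model *tlb)
{
    if (tlb->fullmm) {
        tlb->start = tlb->end = ~0UL;
    } else {
        tlb->start = ~0UL;
        tlb->end = 0;
    }
    tlb->freed_tables = 0;
    tlb->cleared_ptes = 0;
}

/* Returns 1 if a TLBI would be issued, 0 otherwise. */
static int flush_tlbonly_model(struct gather_model *tlb, bool use_bits)
{
    assert(tlb != NULL);
    if (use_bits) {
        if (!(tlb->freed_tables || tlb->cleared_ptes))
            return 0;           /* new condition: nothing was touched */
    } else if (!tlb->end) {
        return 0;               /* old condition: empty range */
    }
    reset_range_model(tlb);
    return 1;
}

/* Two back-to-back calls, as in the call chain quoted in the commit message. */
static int tlbi_count_fullmm(bool use_bits)
{
    struct gather_model tlb = { .start = ~0UL, .end = ~0UL,
                                .fullmm = 1, .cleared_ptes = 1 };
    return flush_tlbonly_model(&tlb, use_bits) +
           flush_tlbonly_model(&tlb, use_bits);
}
```

With the old `end != 0` test the second call still sees a non-empty range; with the bit test it sees that the first call already cleared everything.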
amritokun
pushed a commit
that referenced
this pull request
Oct 26, 2025
storage-qa/generic/010 reported a RAMDUMP on reboot test.

[ 532.682030] c7 1 debug-reboot: Create reboot monitor timer now
[ 532.695337] c7 1 iommu: Removing device paintbox-ipu from group 54
[ 532.702153] c7 1 iommu: Removing device ipu-iommu from group 54
[ 532.711121] c5 1 sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 532.720128] c5 1 ufshcd-qcom 1d84000.ufshc: ufshcd_query_attr_retry: query attribute, idn 13, failed with error -19 after 3 retires
[ 532.732754] c5 1 ufshcd-qcom 1d84000.ufshc: ufshcd_disable_auto_bkops: failed to enable exception event -19

-> suspect this causes the below aborts.

[ 532.759830] c3 0 Synchronous External Abort: synchronous external abort (0x96000010) at 0xffffff800c80403c
[ 532.770231] c3 0 Internal error: : 96000010 [#1] PREEMPT SMP
[ 532.776517] c3 0 Modules linked in: ftm5(O) heatmap videobuf2_vmalloc videobuf2_memops lkdtm adsp_loader_dlkm stub_dlkm usf_dlkm native_dlkm machine_dlkm platform_dlkm wcd_cpe_dlkm wsa881x_dlkm wcd934x_dlkm wcd9360_dlkm mbhc_dlkm wcd9xxx_dlkm swr_ctrl_dlkm cs35l36_dlkm q6_dlkm swr_dlkm apr_dlkm q6_notifier_dlkm q6_pdr_dlkm wglink_dlkm wcd_spi_dlkm wcd_core_dlkm pinctrl_wcd_dlkm msm_11ad_proxy wlan(O)
[ 532.813577] c3 0 Process swapper/3 (pid: 0, stack limit = 0x00000000aafbbfba)
[ 532.821384] c3 0 CPU: 3 PID: 0 Comm: swapper/3 Tainted: G S O 4.14.180-36668-gf872280691f4_audio-gab11b12 #1

Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: Ia1a3212bc3038067027c981d82f95fe894a3eedf
Signed-off-by: Alexander Winkowski <dereference23@outlook.com>
Signed-off-by: TogoFire <togofire@mailfence.com>
amritokun
pushed a commit
that referenced
this pull request
Oct 26, 2025
When picolcd is switched into bootloader mode (for FW flashing) make sure not to try to dereference NULL-pointers of feature-devices during unplug/unbind.

This fixes following BUG:

BUG: unable to handle kernel NULL pointer dereference at 00000298
IP: [<f811f56b>] picolcd_exit_framebuffer+0x1b/0x80 [hid_picolcd]
*pde = 00000000
Oops: 0000 [#1]
Modules linked in: hid_picolcd syscopyarea sysfillrect sysimgblt fb_sys_fops
CPU: 0 PID: 15 Comm: khubd Not tainted 3.11.0-rc7-00002-g50d62d4 #2
EIP: 0060:[<f811f56b>] EFLAGS: 00010292 CPU: 0
EIP is at picolcd_exit_framebuffer+0x1b/0x80 [hid_picolcd]
Call Trace:
 [<f811d1ab>] picolcd_remove+0xcb/0x120 [hid_picolcd]
 [<c1469b09>] hid_device_remove+0x59/0xc0
 [<c13464ca>] __device_release_driver+0x5a/0xb0
 [<c134653f>] device_release_driver+0x1f/0x30
 [<c134603d>] bus_remove_device+0x9d/0xd0
 [<c13439a5>] device_del+0xd5/0x150
 [<c14696a4>] hid_destroy_device+0x24/0x60
 [<c1474cbb>] usbhid_disconnect+0x1b/0x40
 ...

Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org>
Change-Id: Ibdfce9edd1a3d57f1b45c2a776152adada0a68cb
Cc: stable@kernel.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
(cherry picked from commit 1cde501)
Signed-off-by: TogoFire <togofire@mailfence.com>
amritokun
pushed a commit
that referenced
this pull request
Nov 25, 2025
[ Upstream commit b0b4518c992eb5f316c6e40ff186cbb7a5009518 ]

Change the 'ret' variable in blk_stack_limits() from unsigned int to int, as it needs to store negative value -1.

Storing the negative error codes in unsigned type, or performing equality comparisons (e.g., ret == -1), doesn't cause an issue at runtime [1] but can be confusing. Additionally, assigning negative error codes to unsigned type may trigger a GCC warning when the -Wsign-conversion flag is enabled.

No effect on runtime.

Link: https://lore.kernel.org/all/x3wogjf6vgpkisdhg3abzrx7v7zktmdnfmqeih5kosszmagqfs@oh3qxrgzkikf/ #1
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Fixes: fe0b393 ("block: Correct handling of bottom device misaligment")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20250902130930.68317-1-rongqianfeng@vivo.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
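The claim that an unsigned 'ret' still compares equal to -1 at runtime follows from C's integer conversions, and can be illustrated with a few lines of standalone C. The helper names below are hypothetical and are not taken from blk_stack_limits(); the explicit casts only spell out the conversions that the compiler otherwise performs implicitly (and that -Wsign-conversion reports).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: keep -1 in an unsigned variable, as the old code did. */
static unsigned int store_minus_one_unsigned(void)
{
    /* A plain "= -1" does this same conversion implicitly. */
    unsigned int ret = (unsigned int)-1;

    assert(ret == UINT_MAX); /* -1 wraps to the largest unsigned value */
    return ret;
}

/* In "ret == -1" the -1 is converted to unsigned int first, so it still matches. */
static bool unsigned_ret_equals_minus_one(void)
{
    return store_minus_one_unsigned() == (unsigned int)-1;
}

/* With a signed type the value is simply negative, which is what a reader expects. */
static bool int_ret_is_negative(void)
{
    int ret = -1;

    return ret < 0;
}
```

Both variants behave the same for an equality test, which matches the commit's "no effect on runtime"; the signed type mainly removes the surprise for readers and for sign-related warnings.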
amritokun
pushed a commit
that referenced
this pull request
Nov 25, 2025
[ Upstream commit b0531cdba5029f897da5156815e3bdafe1e9b88d ]
Similar to previous commit 2a934fdb01db ("media: v4l2-dev: fix error
handling in __video_register_device()"), the release hook should be set
before device_register(). Otherwise, when device_register() returns an
error and put_device() tries to call the release function, the warning
below may be triggered.
------------[ cut here ]------------
WARNING: CPU: 1 PID: 4760 at drivers/base/core.c:2567 device_release+0x1bd/0x240 drivers/base/core.c:2567
Modules linked in:
CPU: 1 UID: 0 PID: 4760 Comm: syz.4.914 Not tainted 6.17.0-rc3+ #1 NONE
RIP: 0010:device_release+0x1bd/0x240 drivers/base/core.c:2567
Call Trace:
<TASK>
kobject_cleanup+0x136/0x410 lib/kobject.c:689
kobject_release lib/kobject.c:720 [inline]
kref_put include/linux/kref.h:65 [inline]
kobject_put+0xe9/0x130 lib/kobject.c:737
put_device+0x24/0x30 drivers/base/core.c:3797
pps_register_cdev+0x2da/0x370 drivers/pps/pps.c:402
pps_register_source+0x2f6/0x480 drivers/pps/kapi.c:108
pps_tty_open+0x190/0x310 drivers/pps/clients/pps-ldisc.c:57
tty_ldisc_open+0xa7/0x120 drivers/tty/tty_ldisc.c:432
tty_set_ldisc+0x333/0x780 drivers/tty/tty_ldisc.c:563
tiocsetd drivers/tty/tty_io.c:2429 [inline]
tty_ioctl+0x5d1/0x1700 drivers/tty/tty_io.c:2728
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:598 [inline]
__se_sys_ioctl fs/ioctl.c:584 [inline]
__x64_sys_ioctl+0x194/0x210 fs/ioctl.c:584
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x5f/0x2a0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x76/0x7e
</TASK>
Before commit c79a39dc8d06 ("pps: Fix a use-after-free"),
pps_register_cdev() called device_create() to create pps->dev, which
initialized dev->release to device_create_release(). The comment is now
outdated, so just remove it.
Thanks to Calvin Owens for the reminder: 'kfree_pps' should be removed
in pps_register_source() to avoid a double free in the failure case.
Link: https://lore.kernel.org/all/20250827065010.3208525-1-wangliang74@huawei.com/
Fixes: c79a39dc8d06 ("pps: Fix a use-after-free")
Signed-off-by: Wang Liang <wangliang74@huawei.com>
Reviewed-By: Calvin Owens <calvin@wbinvd.org>
Link: https://lore.kernel.org/r/20250830075023.3498174-1-wangliang74@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
amritokun
pushed a commit
that referenced
this pull request
Nov 25, 2025
[ Upstream commit 1703fe4f8ae50d1fb6449854e1fcaed1053e3a14 ]

During mpt3sas_transport_port_remove(), messages were logged with dev_printk() against &mpt3sas_port->port->dev. At this point the SAS transport device may already be partially unregistered or freed, leading to a crash when accessing its struct device. Use ioc_info() instead, which logs via the PCI device (ioc->pdev->dev), guaranteed to remain valid until driver removal.

[83428.295776] Oops: general protection fault, probably for non-canonical address 0x6f702f323a33312d: 0000 [#1] SMP NOPTI
[83428.295785] CPU: 145 UID: 0 PID: 113296 Comm: rmmod Kdump: loaded Tainted: G OE 6.16.0-rc1+ #1 PREEMPT(voluntary)
[83428.295792] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[83428.295795] Hardware name: Dell Inc. Precision 7875 Tower/, BIOS 89.1.67 02/23/2024
[83428.295799] RIP: 0010:__dev_printk+0x1f/0x70
[83428.295805] Code: 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 49 89 d1 48 85 f6 74 52 4c 8b 46 50 4d 85 c0 74 1f 48 8b 46 68 48 85 c0 74 22 <48> 8b 08 0f b6 7f 01 48 c7 c2 db e8 42 ad 83 ef 30 e9 7b f8 ff ff
[83428.295813] RSP: 0018:ff85aeafc3137bb0 EFLAGS: 00010206
[83428.295817] RAX: 6f702f323a33312d RBX: ff4290ee81292860 RCX: 5000cca25103be32
[83428.295820] RDX: ff85aeafc3137bb8 RSI: ff4290eeb1966c00 RDI: ffffffffc1560845
[83428.295823] RBP: ff85aeafc3137c18 R08: 74726f702f303a33 R09: ff85aeafc3137bb8
[83428.295826] R10: ff85aeafc3137b18 R11: ff4290f5bd60fe68 R12: ff4290ee81290000
[83428.295830] R13: ff4290ee6e345de0 R14: ff4290ee81290000 R15: ff4290ee6e345e30
[83428.295833] FS: 00007fd9472a6740(0000) GS:ff4290f5ce96b000(0000) knlGS:0000000000000000
[83428.295837] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[83428.295840] CR2: 00007f242b4db238 CR3: 00000002372b8006 CR4: 0000000000771ef0
[83428.295844] PKRU: 55555554
[83428.295846] Call Trace:
[83428.295848] <TASK>
[83428.295850] _dev_printk+0x5c/0x80
[83428.295857] ? srso_alias_return_thunk+0x5/0xfbef5
[83428.295863] mpt3sas_transport_port_remove+0x1c7/0x420 [mpt3sas]
[83428.295882] _scsih_remove_device+0x21b/0x280 [mpt3sas]
[83428.295894] ? _scsih_expander_node_remove+0x108/0x140 [mpt3sas]
[83428.295906] ? srso_alias_return_thunk+0x5/0xfbef5
[83428.295910] mpt3sas_device_remove_by_sas_address.part.0+0x8f/0x110 [mpt3sas]
[83428.295921] _scsih_expander_node_remove+0x129/0x140 [mpt3sas]
[83428.295933] _scsih_expander_node_remove+0x6a/0x140 [mpt3sas]
[83428.295944] scsih_remove+0x3f0/0x4a0 [mpt3sas]
[83428.295957] pci_device_remove+0x3b/0xb0
[83428.295962] device_release_driver_internal+0x193/0x200
[83428.295968] driver_detach+0x44/0x90
[83428.295971] bus_remove_driver+0x69/0xf0
[83428.295975] pci_unregister_driver+0x2a/0xb0
[83428.295979] _mpt3sas_exit+0x1f/0x300 [mpt3sas]
[83428.295991] __do_sys_delete_module.constprop.0+0x174/0x310
[83428.295997] ? srso_alias_return_thunk+0x5/0xfbef5
[83428.296000] ? __x64_sys_getdents64+0x9a/0x110
[83428.296005] ? srso_alias_return_thunk+0x5/0xfbef5
[83428.296009] ? syscall_trace_enter+0xf6/0x1b0
[83428.296014] do_syscall_64+0x7b/0x2c0
[83428.296019] ? srso_alias_return_thunk+0x5/0xfbef5
[83428.296023] entry_SYSCALL_64_after_hwframe+0x76/0x7e

Fixes: f92363d ("[SCSI] mpt3sas: add new driver supporting 12GB SAS")
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
amritokun
pushed a commit
that referenced
this pull request
Nov 25, 2025
commit 05703271c3cdcc0f2a8cf6ebdc45892b8ca83520 upstream.

Before disabling SR-IOV via config space accesses to the parent PF, sriov_disable() first removes the PCI devices representing the VFs.

Since commit 9d16947 ("PCI: Add global pci_lock_rescan_remove()") such removal operations are serialized against concurrent remove and rescan using the pci_rescan_remove_lock. No such locking was ever added in sriov_disable() however. In particular when commit 18f9e9d ("PCI/IOV: Factor out sriov_add_vfs()") factored out the PCI device removal into sriov_del_vfs() there was still no locking around the pci_iov_remove_virtfn() calls.

On s390 the lack of serialization in sriov_disable() may cause double remove and list corruption with the below (amended) trace being observed:

  PSW:  0704c00180000000 0000000c914e4b38 (klist_put+56)
  GPRS: 000003800313fb48 0000000000000000 0000000100000001 0000000000000001
        00000000f9b520a8 0000000000000000 0000000000002fbd 00000000f4cc9480
        0000000000000001 0000000000000000 0000000000000000 0000000180692828
        00000000818e8000 000003800313fe2c 000003800313fb20 000003800313fad8
  #0 [3800313fb20] device_del at c9158ad5c
  #1 [3800313fb88] pci_remove_bus_device at c915105ba
  #2 [3800313fbd0] pci_iov_remove_virtfn at c9152f198
  #3 [3800313fc28] zpci_iov_remove_virtfn at c90fb67c0
  #4 [3800313fc60] zpci_bus_remove_device at c90fb6104
  #5 [3800313fca0] __zpci_event_availability at c90fb3dca
  #6 [3800313fd08] chsc_process_sei_nt0 at c918fe4a2
  #7 [3800313fd60] crw_collect_info at c91905822
  #8 [3800313fe10] kthread at c90feb390
  #9 [3800313fe68] __ret_from_fork at c90f6aa64
  #10 [3800313fe98] ret_from_fork at c9194f3f2

This is because in addition to sriov_disable() removing the VFs, the platform also generates hot-unplug events for the VFs. This being the reverse operation to the hotplug events generated by sriov_enable() and handled via pdev->no_vf_scan. And while the event processing takes pci_rescan_remove_lock and checks whether the struct pci_dev still exists, the lack of synchronization makes this checking racy.

Other races may also be possible of course though given that this lack of locking persisted so long observable races seem very rare. Even on s390 the list corruption was only observed with certain devices since the platform events are only triggered by config accesses after the removal, so as long as the removal finished synchronously they would not race. Either way the locking is missing so fix this by adding it to the sriov_del_vfs() helper.

Just like PCI rescan-remove, locking is also missing in sriov_add_vfs() including for the error case where pci_stop_and_remove_bus_device() is called without the PCI rescan-remove lock being held. Even in the non-error case, adding new PCI devices and buses should be serialized via the PCI rescan-remove lock. Add the necessary locking.

Fixes: 18f9e9d ("PCI/IOV: Factor out sriov_add_vfs()")
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Reviewed-by: Julian Ruess <julianr@linux.ibm.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250826-pci_fix_sriov_disable-v1-1-2d0bc938f2a3@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
amritokun
pushed a commit
that referenced
this pull request
Nov 25, 2025
[ Upstream commit 674b56aa57f9379854cb6798c3bbcef7e7b51ab7 ]

Syzkaller reports a KASAN issue as below:

general protection fault, probably for non-canonical address 0xfbd59c0000000021: 0000 [#1] PREEMPT SMP KASAN NOPTI
KASAN: maybe wild-memory-access in range [0xdead000000000108-0xdead00000000010f]
CPU: 0 PID: 5083 Comm: syz-executor.2 Not tainted 6.1.134-syzkaller-00037-g855bd1d7d838 #0
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
RIP: 0010:__list_del include/linux/list.h:114 [inline]
RIP: 0010:__list_del_entry include/linux/list.h:137 [inline]
RIP: 0010:list_del include/linux/list.h:148 [inline]
RIP: 0010:p9_fd_cancelled+0xe9/0x200 net/9p/trans_fd.c:734
Call Trace:
 <TASK>
 p9_client_flush+0x351/0x440 net/9p/client.c:614
 p9_client_rpc+0xb6b/0xc70 net/9p/client.c:734
 p9_client_version net/9p/client.c:920 [inline]
 p9_client_create+0xb51/0x1240 net/9p/client.c:1027
 v9fs_session_init+0x1f0/0x18f0 fs/9p/v9fs.c:408
 v9fs_mount+0xba/0xcb0 fs/9p/vfs_super.c:126
 legacy_get_tree+0x108/0x220 fs/fs_context.c:632
 vfs_get_tree+0x8e/0x300 fs/super.c:1573
 do_new_mount fs/namespace.c:3056 [inline]
 path_mount+0x6a6/0x1e90 fs/namespace.c:3386
 do_mount fs/namespace.c:3399 [inline]
 __do_sys_mount fs/namespace.c:3607 [inline]
 __se_sys_mount fs/namespace.c:3584 [inline]
 __x64_sys_mount+0x283/0x300 fs/namespace.c:3584
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x35/0x80 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8

This happens because of a race condition between:

- The 9p client sending an invalid flush request and later cleaning it up;
- The 9p client in p9_read_work() canceled all pending requests.

      Thread 1                              Thread 2
    ...
    p9_client_create()
    ...
    p9_fd_create()
    ...
    p9_conn_create()
    ...
    // start Thread 2
    INIT_WORK(&m->rq, p9_read_work);
                                        p9_read_work()
    ...
    p9_client_rpc()
    ...
                                        ...
                                        p9_conn_cancel()
                                        ...
                                        spin_lock(&m->req_lock);
    ...
    p9_fd_cancelled()
    ...
                                        ...
                                        spin_unlock(&m->req_lock);
                                        // status rewrite
                                        p9_client_cb(m->client, req, REQ_STATUS_ERROR)
                                        // first remove
                                        list_del(&req->req_list);
                                        ...
    spin_lock(&m->req_lock)
    ...
    // second remove
    list_del(&req->req_list);
    spin_unlock(&m->req_lock)
    ...

Commit 74d6a5d56629 ("9p/trans_fd: Fix concurrency del of req_list in p9_fd_cancelled/p9_read_work") fixes a concurrency issue in the 9p filesystem client where the req_list could be deleted simultaneously by both p9_read_work and p9_fd_cancelled functions, but for the case where req->status equals REQ_STATUS_RCVD.

Update the check for req->status in p9_fd_cancelled to skip processing not just received requests, but anything that is not SENT, as whatever changed the state from SENT also removed the request from its list.

Found by Linux Verification Center (linuxtesting.org) with Syzkaller.

Fixes: afd8d65 ("9P: Add cancelled() to the transport functions.")
Cc: stable@vger.kernel.org
Signed-off-by: Nalivayko Sergey <Sergey.Nalivayko@kaspersky.com>
Message-ID: <20250715154815.3501030-1-Sergey.Nalivayko@kaspersky.com>
[updated the check from status == RECV || status == ERROR to status != SENT]
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
[ replaced m->req_lock with client->lock ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
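The difference between the earlier "skip only received requests" check and the new "skip anything that is not SENT" check can be sketched without the kernel. This is a single-threaded model under stated assumptions: the enum, struct, and function names are hypothetical, locking is left out, and list removal is reduced to a counter so a double list_del shows up as a count of 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the REQ_STATUS_* values named in the commit. */
enum req_status_model { REQ_SENT, REQ_RCVD, REQ_ERROR, REQ_FLSHD };

struct req_model {
    enum req_status_model status;
    int list_del_calls;      /* how often the request was removed from its list */
};

static void list_del_model(struct req_model *req)
{
    req->list_del_calls++;
}

/* Models the cancel path in Thread 2: status rewrite plus the first removal. */
static void conn_cancel_model(struct req_model *req)
{
    req->status = REQ_ERROR;
    list_del_model(req);
}

/* Models p9_fd_cancelled() with either the earlier or the updated check. */
static void fd_cancelled_model(struct req_model *req, bool only_skip_rcvd)
{
    assert(req != NULL);
    if (only_skip_rcvd ? req->status == REQ_RCVD : req->status != REQ_SENT)
        return;
    list_del_model(req);     /* a second removal if someone else already did it */
    req->status = REQ_FLSHD;
}

/* Replays the interleaving from the diagram and reports the number of removals. */
static int removals_after_cancel_race(bool only_skip_rcvd)
{
    struct req_model req = { REQ_SENT, 0 };

    conn_cancel_model(&req);
    fd_cancelled_model(&req, only_skip_rcvd);
    return req.list_del_calls;
}
```

The model only captures the status logic; in the real code both paths also take the lock shown in the diagram.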
amritokun
pushed a commit
that referenced
this pull request
Nov 25, 2025
[ Upstream commit 8d33a030c566e1f105cd5bf27f37940b6367f3be ]

There is a race condition between dm device suspend and table load that can lead to null pointer dereference. The issue occurs when suspend is invoked before table load completes:

BUG: kernel NULL pointer dereference, address: 0000000000000054
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 6 PID: 6798 Comm: dmsetup Not tainted 6.6.0-g7e52f5f0ca9b #62
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014
RIP: 0010:blk_mq_wait_quiesce_done+0x0/0x50
Call Trace:
 <TASK>
 blk_mq_quiesce_queue+0x2c/0x50
 dm_stop_queue+0xd/0x20
 __dm_suspend+0x130/0x330
 dm_suspend+0x11a/0x180
 dev_suspend+0x27e/0x560
 ctl_ioctl+0x4cf/0x850
 dm_ctl_ioctl+0xd/0x20
 vfs_ioctl+0x1d/0x50
 __se_sys_ioctl+0x9b/0xc0
 __x64_sys_ioctl+0x19/0x30
 x64_sys_call+0x2c4a/0x4620
 do_syscall_64+0x9e/0x1b0

The issue can be triggered as below:

T1                                      T2
dm_suspend                              table_load
__dm_suspend                            dm_setup_md_queue
                                        dm_mq_init_request_queue
                                        blk_mq_init_allocated_queue
                                        => q->mq_ops = set->ops; (1)
dm_stop_queue / dm_wait_for_completion
=> q->tag_set NULL pointer! (2)
                                        => q->tag_set = set; (3)

Fix this by checking if a valid table (map) exists before performing request-based suspend and waiting for target I/O. When map is NULL, skip these table-dependent suspend steps.

Even when map is NULL, no I/O can reach any target because there is no table loaded; I/O submitted in this state will fail early in the DM layer. Skipping the table-dependent suspend logic in this case is safe and avoids NULL pointer dereferences.

Fixes: c4576ae ("dm: fix request-based dm's use of dm_wait_for_completion")
Cc: stable@vger.kernel.org
Signed-off-by: Zheng Qixing <zhengqixing@huawei.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
[ omitted DMF_QUEUE_STOPPED flag setting and braces absent in 5.15 ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
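The guard described in the fix (skip the table-dependent suspend steps when no map is loaded) can be sketched as a user-space model. The struct and function names are hypothetical and are not the dm API; a return value of -1 stands in for the NULL pointer dereference at step (2) of the diagram.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced models of the request queue and the mapped device. */
struct queue_model {
    const void *tag_set;            /* still NULL until step (3) in the diagram */
};

struct mapped_device_model {
    const void *map;                /* NULL while no table has been loaded */
    struct queue_model queue;
};

/* Models stopping the queue: -1 marks the point where the kernel would oops. */
static int stop_queue_model(const struct queue_model *q)
{
    assert(q != NULL);
    return q->tag_set ? 0 : -1;
}

/* Suspend without the guard: always runs the table-dependent step. */
static int suspend_without_map_check(const struct mapped_device_model *md)
{
    return stop_queue_model(&md->queue);
}

/* Suspend with the guard: no table means nothing to stop or wait for. */
static int suspend_with_map_check(const struct mapped_device_model *md)
{
    if (!md->map)
        return 0;
    return stop_queue_model(&md->queue);
}
```

The sketch deliberately ignores the request-based versus bio-based distinction and the wait for in-flight I/O; it only shows where the map check sits relative to the step that touches the queue.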
amritokun
pushed a commit
that referenced
this pull request
Nov 25, 2025
zone_watermark_fast was introduced by commit 48ee5f3 ("mm, page_alloc: shortcut watermark checks for order-0 pages"). That commit simply checks whether free pages exceed the watermark, without additional calculation such as reducing the watermark.

It considered free CMA pages but did not consider the highatomic reserve. This may exhaust all free pages except the high-order atomic free pages.

Assume that the reserved_highatomic pageblock is bigger than watermark min, and there are only a few free pages apart from the high-order atomic free pages. Because zone_watermark_fast passes the allocation without considering high-order atomic free pages, normal reclaimable allocations like GFP_HIGHUSER will consume all the free pages. Eventually an order-0 atomic allocation may fail.

This means watermark min is not protected against non-atomic allocation. An order-0 atomic allocation with ALLOC_HARDER can fail unexpectedly. Additionally, a __GFP_MEMALLOC allocation with ALLOC_NO_WATERMARKS can also fail.

To avoid the problem, zone_watermark_fast should consider the highatomic reserve. If the actual size of high atomic free were counted accurately like cma free, we could use it. This patch just uses nr_reserved_highatomic. Additionally, introduce __zone_watermark_unusable_free to factor out common parts between zone_watermark_fast and __zone_watermark_ok.

This is an example of ALLOC_HARDER allocation failure using v4.19 based kernel.
Binder:9343_3: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null) Call trace: [<ffffff8008f40f8c>] dump_stack+0xb8/0xf0 [<ffffff8008223320>] warn_alloc+0xd8/0x12c [<ffffff80082245e4>] __alloc_pages_nodemask+0x120c/0x1250 [<ffffff800827f6e8>] new_slab+0x128/0x604 [<ffffff800827b0cc>] ___slab_alloc+0x508/0x670 [<ffffff800827ba00>] __kmalloc+0x2f8/0x310 [<ffffff80084ac3e0>] context_struct_to_string+0x104/0x1cc [<ffffff80084ad8fc>] security_sid_to_context_core+0x74/0x144 [<ffffff80084ad880>] security_sid_to_context+0x10/0x18 [<ffffff800849bd80>] selinux_secid_to_secctx+0x20/0x28 [<ffffff800849109c>] security_secid_to_secctx+0x3c/0x70 [<ffffff8008bfe118>] binder_transaction+0xe68/0x454c Mem-Info: active_anon:102061 inactive_anon:81551 isolated_anon:0 active_file:59102 inactive_file:68924 isolated_file:64 unevictable:611 dirty:63 writeback:0 unstable:0 slab_reclaimable:13324 slab_unreclaimable:44354 mapped:83015 shmem:4858 pagetables:26316 bounce:0 free:2727 free_pcp:1035 free_cma:178 Node 0 active_anon:408244kB inactive_anon:326204kB active_file:236408kB inactive_file:275696kB unevictable:2444kB isolated(anon):0kB isolated(file):256kB mapped:332060kB dirty:252kB writeback:0kB shmem:19432kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no Normal free:10908kB min:6192kB low:44388kB high:47060kB active_anon:409160kB inactive_anon:325924kB active_file:235820kB inactive_file:276628kB unevictable:2444kB writepending:252kB present:3076096kB managed:2673676kB mlocked:2444kB kernel_stack:62512kB pagetables:105264kB bounce:0kB free_pcp:4140kB local_pcp:40kB free_cma:712kB lowmem_reserve[]: 0 0 Normal: 505*4kB (H) 357*8kB (H) 201*16kB (H) 65*32kB (H) 1*64kB (H) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 10236kB 138826 total pagecache pages 5460 pages in swap cache Swap cache stats: add 8273090, delete 8267506, find 1004381/4060142 This is an example of ALLOC_NO_WATERMARKS allocation failure using v4.14 based kernel. 
kswapd0: page allocation failure: order:0, mode:0x140000a(GFP_NOIO|__GFP_HIGHMEM|__GFP_MOVABLE), nodemask=(null) kswapd0 cpuset=/ mems_allowed=0 CPU: 4 PID: 1221 Comm: kswapd0 Not tainted 4.14.113-18770262-userdebug #1 Call trace: [<0000000000000000>] dump_backtrace+0x0/0x248 [<0000000000000000>] show_stack+0x18/0x20 [<0000000000000000>] __dump_stack+0x20/0x28 [<0000000000000000>] dump_stack+0x68/0x90 [<0000000000000000>] warn_alloc+0x104/0x198 [<0000000000000000>] __alloc_pages_nodemask+0xdc0/0xdf0 [<0000000000000000>] zs_malloc+0x148/0x3d0 [<0000000000000000>] zram_bvec_rw+0x410/0x798 [<0000000000000000>] zram_rw_page+0x88/0xdc [<0000000000000000>] bdev_write_page+0x70/0xbc [<0000000000000000>] __swap_writepage+0x58/0x37c [<0000000000000000>] swap_writepage+0x40/0x4c [<0000000000000000>] shrink_page_list+0xc30/0xf48 [<0000000000000000>] shrink_inactive_list+0x2b0/0x61c [<0000000000000000>] shrink_node_memcg+0x23c/0x618 [<0000000000000000>] shrink_node+0x1c8/0x304 [<0000000000000000>] kswapd+0x680/0x7c4 [<0000000000000000>] kthread+0x110/0x120 [<0000000000000000>] ret_from_fork+0x10/0x18 Mem-Info: active_anon:111826 inactive_anon:65557 isolated_anon:0\x0a active_file:44260 inactive_file:83422 isolated_file:0\x0a unevictable:4158 dirty:117 writeback:0 unstable:0\x0a slab_reclaimable:13943 slab_unreclaimable:43315\x0a mapped:102511 shmem:3299 pagetables:19566 bounce:0\x0a free:3510 free_pcp:553 free_cma:0 Node 0 active_anon:447304kB inactive_anon:262228kB active_file:177040kB inactive_file:333688kB unevictable:16632kB isolated(anon):0kB isolated(file):0kB mapped:410044kB d irty:468kB writeback:0kB shmem:13196kB writeback_tmp:0kB unstable:0kB all_unreclaimable? 
no Normal free:14040kB min:7440kB low:94500kB high:98136kB reserved_highatomic:32768KB active_anon:447336kB inactive_anon:261668kB active_file:177572kB inactive_file:333768k B unevictable:16632kB writepending:480kB present:4081664kB managed:3637088kB mlocked:16632kB kernel_stack:47072kB pagetables:78264kB bounce:0kB free_pcp:2280kB local_pcp:720kB free_cma:0kB [ 4738.329607] lowmem_reserve[]: 0 0 Normal: 860*4kB (H) 453*8kB (H) 180*16kB (H) 26*32kB (H) 34*64kB (H) 6*128kB (H) 2*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 14232kB This is trace log which shows GFP_HIGHUSER consumes free pages right before ALLOC_NO_WATERMARKS. <...>-22275 [006] .... 889.213383: mm_page_alloc: page=00000000d2be5665 pfn=970744 order=0 migratetype=0 nr_free=3650 gfp_flags=GFP_HIGHUSER|__GFP_ZERO <...>-22275 [006] .... 889.213385: mm_page_alloc: page=000000004b2335c2 pfn=970745 order=0 migratetype=0 nr_free=3650 gfp_flags=GFP_HIGHUSER|__GFP_ZERO <...>-22275 [006] .... 889.213387: mm_page_alloc: page=00000000017272e1 pfn=970278 order=0 migratetype=0 nr_free=3650 gfp_flags=GFP_HIGHUSER|__GFP_ZERO <...>-22275 [006] .... 889.213389: mm_page_alloc: page=00000000c4be79fb pfn=970279 order=0 migratetype=0 nr_free=3650 gfp_flags=GFP_HIGHUSER|__GFP_ZERO <...>-22275 [006] .... 889.213391: mm_page_alloc: page=00000000f8a51d4f pfn=970260 order=0 migratetype=0 nr_free=3650 gfp_flags=GFP_HIGHUSER|__GFP_ZERO <...>-22275 [006] .... 889.213393: mm_page_alloc: page=000000006ba8f5ac pfn=970261 order=0 migratetype=0 nr_free=3650 gfp_flags=GFP_HIGHUSER|__GFP_ZERO <...>-22275 [006] .... 889.213395: mm_page_alloc: page=00000000819f1cd3 pfn=970196 order=0 migratetype=0 nr_free=3650 gfp_flags=GFP_HIGHUSER|__GFP_ZERO <...>-22275 [006] .... 
889.213396: mm_page_alloc: page=00000000f6b72a64 pfn=970197 order=0 migratetype=0 nr_free=3650 gfp_flags=GFP_HIGHUSER|__GFP_ZERO kswapd0-1207 [005] ...1 889.213398: mm_page_alloc: page= (null) pfn=0 order=0 migratetype=1 nr_free=3650 gfp_flags=GFP_NOWAIT|__GFP_HIGHMEM|__GFP_NOWARN|__GFP_MOVABLE [jaewon31.kim@samsung.com: remove redundant code for high-order] Link: http://lkml.kernel.org/r/20200623035242.27232-1-jaewon31.kim@samsung.com Reported-by: Yong-Taek Lee <ytk.lee@samsung.com> Suggested-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Baoquan He <bhe@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Yong-Taek Lee <ytk.lee@samsung.com> Cc: Michal Hocko <mhocko@kernel.org> Link: http://lkml.kernel.org/r/20200619235958.11283-1-jaewon31.kim@samsung.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit f27ce0e14088b23f8d54ae4a44f70307ec420e64) Change-Id: I2638d575f809e885272c3b2a4e5100f2d6b8934d Bug: 175184106
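The effect of subtracting the highatomic reserve in the order-0 fast path can be sketched with the numbers from the v4.14 log above, converted to 4 kB pages (free 14040kB = 3510 pages, min 7440kB = 1860 pages, reserved_highatomic 32768KB = 8192 pages). This is a simplified model, not the kernel code: the struct, the flag constants, and the function names are hypothetical, and lowmem_reserve is a single number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for the model only. */
#define ALLOC_HARDER_MODEL 0x1u
#define ALLOC_CMA_MODEL    0x2u

/* Reduced stand-in for the zone counters the commit talks about (in pages). */
struct zone_model {
    long free_pages;
    long free_cma_pages;
    long nr_reserved_highatomic;
    long lowmem_reserve;
};

/* Models the factored-out helper: pages this allocation must not count as free. */
static long unusable_free_model(const struct zone_model *z, unsigned int order,
                                unsigned int alloc_flags)
{
    long unusable = (1L << order) - 1;

    assert(z != NULL);
    if (!(alloc_flags & ALLOC_HARDER_MODEL))
        unusable += z->nr_reserved_highatomic;  /* reserve is off limits */
    if (!(alloc_flags & ALLOC_CMA_MODEL))
        unusable += z->free_cma_pages;          /* CMA pages are off limits */
    return unusable;
}

/* Order-0 fast check as described before the patch: only CMA is discounted. */
static bool fast_check_before(const struct zone_model *z, long mark,
                              unsigned int alloc_flags)
{
    long free = z->free_pages;

    if (!(alloc_flags & ALLOC_CMA_MODEL))
        free -= z->free_cma_pages;
    return free > mark + z->lowmem_reserve;
}

/* Order-0 fast check after the patch: the highatomic reserve is discounted too. */
static bool fast_check_after(const struct zone_model *z, long mark,
                             unsigned int alloc_flags)
{
    long free = z->free_pages - unusable_free_model(z, 0, alloc_flags);

    return free > mark + z->lowmem_reserve;
}
```

With those log values, a normal allocation passes the old fast check even though every remaining free page belongs to the highatomic reserve; the new check rejects it, while an ALLOC_HARDER caller may still dip into the reserve.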
amritokun
pushed a commit
that referenced
this pull request
Nov 25, 2025
This reverts commit 9dcdb6f5ee0ea133f2e0d669743fcb48362ee4c5.

The IRQ subsystem already blocks suspend on waiting for IRQ threads to finish running (in dpm_noirq_begin()). This PM wakeup does nothing but add latency to the IRQ handler for non-RT kernels, and it isn't RT-friendly either:

[ 42.466403] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:974
[ 42.466407] in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper/3
[ 42.466408] Preemption disabled at:
[ 42.466421] [<00000000100c9f7d>] secondary_start_kernel+0xa8/0x130
[ 42.466427] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G S W 4.14.212-rt102-Sultan #1
[ 42.466429] Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM sm8150 Coral (DT)
[ 42.466432] Call trace:
[ 42.466436] dump_backtrace+0x0/0x1ac
[ 42.466439] show_stack+0x14/0x1c
[ 42.466444] dump_stack+0x84/0xac
[ 42.466448] ___might_sleep+0x140/0x150
[ 42.466452] rt_spin_lock+0x3c/0x50
[ 42.466458] __pm_stay_awake+0x20/0x50
[ 42.466462] qcom_smp2p_isr+0x10/0x1c
[ 42.466467] __handle_irq_event_percpu+0x60/0xd4
[ 42.466469] handle_irq_event_percpu+0x58/0xb0
[ 42.466471] handle_irq_event+0x68/0xe0
[ 42.466474] handle_fasteoi_irq+0x140/0x1fc
[ 42.466476] generic_handle_irq+0x18/0x2c
[ 42.466478] __handle_domain_irq+0xf8/0xfc
[ 42.466481] gic_handle_irq+0xc8/0x164
[ 42.466483] el1_irq+0xb0/0x130
[ 42.466487] finish_task_switch+0xcc/0x1e4
[ 42.466491] __schedule+0x3f0/0x4e0
[ 42.466493] schedule_idle+0x28/0x44
[ 42.466497] do_idle+0x78/0x230
[ 42.466500] cpu_startup_entry+0x20/0x28
[ 42.466502] secondary_start_kernel+0x124/0x130

Remove it since it's useless.

Change-Id: Ib4a03d89bbaaf114980ee93105ce3d5b0b5127eb
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
amritokun
pushed a commit
that referenced
this pull request
Nov 25, 2025
…rules() (#2646)
When kernel is compiled with CONFIG_DEBUG_ATOMIC_SLEEP enabled, it prints the following splat in dmesg during post boot:
[ 6.739169] init: Opening SELinux policy
[ 6.751520] init: Loading SELinux policy
[ 6.894684] SELinux: policy capability network_peer_controls=1
[ 6.894688] SELinux: policy capability open_perms=1
[ 6.894690] SELinux: policy capability extended_socket_class=1
[ 6.894691] SELinux: policy capability always_check_network=0
[ 6.894693] SELinux: policy capability cgroup_seclabel=0
[ 6.894695] SELinux: policy capability nnp_nosuid_transition=1
[ 7.214323] selinux: SELinux: Loaded file context from:
[ 7.214332] selinux: /system/etc/selinux/plat_file_contexts
[ 7.214339] selinux: /system_ext/etc/selinux/system_ext_file_contexts
[ 7.214345] selinux: /product/etc/selinux/product_file_contexts
[ 7.214350] selinux: /vendor/etc/selinux/vendor_file_contexts
[ 7.214356] selinux: /odm/etc/selinux/odm_file_contexts
[ 7.216398] KernelSU: /system/bin/init argc: 2
[ 7.216401] KernelSU: /system/bin/init first arg: second_stage
[ 7.216403] KernelSU: /system/bin/init second_stage executed
[ 7.216506] BUG: sleeping function called from invalid context at security/selinux/ss/hashtab.c:47
[ 7.216512] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 1, name: init
[ 7.216516] preempt_count: 0, expected: 0
[ 7.216518] RCU nest depth: 1, expected: 0
[ 7.216524] CPU: 6 PID: 1 Comm: init Not tainted 5.4.289-Scarlet-v2.0-beta3 #1
[ 7.216526] Hardware name: redwood based Qualcomm Technologies, Inc. SM7325 (DT)
[ 7.216528] Call trace:
[ 7.216536] dump_backtrace+0x0/0x210
[ 7.216539] show_stack+0x14/0x20
[ 7.216544] dump_stack+0x9c/0xec
[ 7.216548] __might_resched+0x1f0/0x210
[ 7.216552] hashtab_insert+0x38/0x230
[ 7.216557] add_type+0xd4/0x2e0
[ 7.216559] ksu_type+0x24/0x60
[ 7.216562] apply_kernelsu_rules+0xa8/0x650
[ 7.216565] ksu_handle_execveat_ksud+0x2a8/0x460
[ 7.216568] ksu_handle_execveat+0x2c/0x60
[ 7.216571] __arm64_sys_execve+0xe8/0xf0
[ 7.216574] el0_svc_common+0xf4/0x1a0
[ 7.216577] do_el0_svc+0x2c/0x40
[ 7.216579] el0_sync_handler+0x18c/0x200
[ 7.216582] el0_sync+0x140/0x180
This is because apply_kernelsu_rules() uses rcu_read_lock() to protect SELinux policy modifications. However, cond_resched() from hashtab_insert() at security/selinux/ss/hashtab.c is internally called and it sleeps which is illegal under an RCU read-side critical section.
While replacing it with a spinlock would suppress the warning, this is fundamentally incorrect because sleeping is illegal while holding a spinlock and spinlock would turn off preemption which isn't an ideal solution since it intentionally turns off rescheduling, and can lead to deadlocks.
Instead, replace the RCU lock with a mutex lock. Mutex lock allows sleeping when necessary, which is appropriate here because apply_kernelsu_rules() runs in process context, not in atomic or interrupt context. As apply_kernelsu_rules() is invoked only once during post boot (SYSTEM_RUNNING), the mutex lock does not introduce any major runtime performance regression and provides correct synchronization.
Fixes: tiann/KernelSU#2637
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
amritokun
pushed a commit
that referenced
this pull request
Nov 25, 2025
Due to a historical oversight, we emit a redundant static branch for each atomic/atomic64 operation when CONFIG_ARM64_LSE_ATOMICS is selected. We can safely remove this, making the kernel Image reasonably smaller.
When CONFIG_ARM64_LSE_ATOMICS is selected, every LSE atomic operation has two preceding static branches with the same target, e.g.
b f7c <kernel_init_freeable+0xa4>
b f7c <kernel_init_freeable+0xa4>
mov w0, #0x1 // #1
ldadd w0, w0, [x19]
This is because the __lse_ll_sc_body() wrapper uses system_uses_lse_atomics(), which checks both `arm64_const_caps_ready` and `cpu_hwcap_keys[ARM64_HAS_LSE_ATOMICS]`, each of which emits a static branch. This has been the case since commit:
addfc38 ("arm64: atomics: avoid out-of-line ll/sc atomics")
However, there was never a need to check `arm64_const_caps_ready`, which was itself introduced in commit:
63a1e1c ("arm64/cpufeature: don't use mutex in bringup path")
... so that cpus_have_const_cap() could fall back to checking the `cpu_hwcaps` bitmap prior to the static keys for individual caps becoming enabled. As system_uses_lse_atomics() doesn't check `cpu_hwcaps`, and doesn't need to as we can safely use the LL/SC atomics prior to enabling the `ARM64_HAS_LSE_ATOMICS` static key, it doesn't need to check `arm64_const_caps_ready`.
This patch removes the `arm64_const_caps_ready` check from system_uses_lse_atomics(). As the arch_atomic_* routines are meant to be safely usable in noinstr code, I've also marked system_uses_lse_atomics() as __always_inline.
This results in one fewer static branch per atomic operation, with the prior example becoming:
b f78 <kernel_init_freeable+0xa0>
mov w0, #0x1 // #1
ldadd w0, w0, [x19]
Each static branch consists of the branch itself and an associated __jump_table entry. Removing these has a reasonable impact on the Image size, with a GCC 11.1.0 defconfig v5.17-rc2 Image being reduced by 128KiB:
| [mark@lakrids:~/src/linux]% ls -al Image*
| -rw-r--r-- 1 mark mark 34619904 Feb 3 18:24 Image.baseline
| -rw-r--r-- 1 mark mark 34488832 Feb 3 18:33 Image.onebranch
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220204104439.270567-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
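The 128KiB figure can be checked directly from the two file sizes in the `ls -al` output. A minimal sketch, assuming 1 KiB = 1024 bytes; the function name `size_reduction_kib` is made up for this illustration and is not part of the patch:

```python
# Sizes in bytes, copied from the `ls -al Image*` output in the commit message.
BASELINE_BYTES = 34619904   # Image.baseline
ONEBRANCH_BYTES = 34488832  # Image.onebranch


def size_reduction_kib(before: int, after: int) -> float:
    """Return how much smaller `after` is than `before`, in KiB (1024 bytes)."""
    return (before - after) / 1024


reduction = size_reduction_kib(BASELINE_BYTES, ONEBRANCH_BYTES)
print(f"Image shrank by {reduction:.0f} KiB")
```

The difference is 131072 bytes, which is exactly 128 KiB and matches the number quoted in the message.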
amritokun
pushed a commit
that referenced
this pull request
Nov 25, 2025
Aneesh reported that:
tlb_flush_mmu()
  tlb_flush_mmu_tlbonly()
    tlb_flush() <-- #1
  tlb_flush_mmu_free()
    tlb_table_flush()
      tlb_table_invalidate()
        tlb_flush_mmu_tlbonly()
          tlb_flush() <-- #2
does two TLBIs when tlb->fullmm, because __tlb_reset_range() will not clear tlb->end in that case.
Observe that any caller to __tlb_adjust_range() also sets at least one of the tlb->freed_tables || tlb->cleared_p* bits, and those are unconditionally cleared by __tlb_reset_range().
Change the condition for actually issuing TLBI to having one of those bits set, as opposed to having tlb->end != 0.
Link: http://lkml.kernel.org/r/20200116064531.483522-4-aneesh.kumar@linux.ibm.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reported-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
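The double flush described above can be reproduced with a small user-space model. This is only a sketch of the reasoning in the message, not kernel code: the `Gather` class, the `use_bits` switch, the single `cleared_ptes` bit standing in for all `cleared_p*` bits, and the all-ones `ALL` range are assumptions made for illustration.

```python
from dataclasses import dataclass

ALL = 2**64 - 1  # assumed "whole address space" end value used when fullmm is set


@dataclass
class Gather:
    """Toy stand-in for struct mmu_gather with only the fields the message names."""
    fullmm: bool
    end: int = 0
    freed_tables: bool = False
    cleared_ptes: bool = False  # represents all cleared_p* bits
    flushes: int = 0


def reset_range(tlb: Gather) -> None:
    # With fullmm the range keeps covering everything, so `end` stays non-zero.
    tlb.end = ALL if tlb.fullmm else 0
    # The bits are cleared unconditionally.
    tlb.freed_tables = False
    tlb.cleared_ptes = False


def adjust_range(tlb: Gather, end: int) -> None:
    # Any caller that adjusts the range also sets at least one bit.
    tlb.end = max(tlb.end, end)
    tlb.cleared_ptes = True


def flush_tlbonly(tlb: Gather, use_bits: bool) -> None:
    if use_bits:
        pending = tlb.freed_tables or tlb.cleared_ptes  # condition after the patch
    else:
        pending = tlb.end != 0                          # condition before the patch
    if not pending:
        return
    tlb.flushes += 1  # stands in for tlb_flush()
    reset_range(tlb)


def flush_mmu(tlb: Gather, use_bits: bool) -> None:
    flush_tlbonly(tlb, use_bits)  # flush #1
    flush_tlbonly(tlb, use_bits)  # reached again through the table-free path (#2)


def count_flushes(fullmm: bool, use_bits: bool) -> int:
    tlb = Gather(fullmm=fullmm)
    reset_range(tlb)              # initial state of a fresh gather
    adjust_range(tlb, 0x2000)     # some page was unmapped
    flush_mmu(tlb, use_bits)
    return tlb.flushes


print("fullmm, end != 0 check:", count_flushes(True, False))
print("fullmm, bits check:    ", count_flushes(True, True))
```

In this model the `end != 0` check flushes twice for a fullmm gather, while the bits check flushes once; for a non-fullmm gather both conditions flush once.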
amritokun
pushed a commit
that referenced
this pull request
Nov 25, 2025
storage-qa/generic/010 reported a RAMDUMP on reboot test.
[ 532.682030] c7 1 debug-reboot: Create reboot monitor timer now
[ 532.695337] c7 1 iommu: Removing device paintbox-ipu from group 54
[ 532.702153] c7 1 iommu: Removing device ipu-iommu from group 54
[ 532.711121] c5 1 sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 532.720128] c5 1 ufshcd-qcom 1d84000.ufshc: ufshcd_query_attr_retry: query attribute, idn 13, failed with error -19 after 3 retires
[ 532.732754] c5 1 ufshcd-qcom 1d84000.ufshc: ufshcd_disable_auto_bkops: failed to enable exception event -19
-> suspect this causes the below aborts.
[ 532.759830] c3 0 Synchronous External Abort: synchronous external abort (0x96000010) at 0xffffff800c80403c
[ 532.770231] c3 0 Internal error: : 96000010 [#1] PREEMPT SMP
[ 532.776517] c3 0 Modules linked in: ftm5(O) heatmap videobuf2_vmalloc videobuf2_memops lkdtm adsp_loader_dlkm stub_dlkm usf_dlkm native_dlkm machine_dlkm platform_dlkm wcd_cpe_dlkm wsa881x_dlkm wcd934x_dlkm wcd9360_dlkm mbhc_dlkm wcd9xxx_dlkm swr_ctrl_dlkm cs35l36_dlkm q6_dlkm swr_dlkm apr_dlkm q6_notifier_dlkm q6_pdr_dlkm wglink_dlkm wcd_spi_dlkm wcd_core_dlkm pinctrl_wcd_dlkm msm_11ad_proxy wlan(O)
[ 532.813577] c3 0 Process swapper/3 (pid: 0, stack limit = 0x00000000aafbbfba)
[ 532.821384] c3 0 CPU: 3 PID: 0 Comm: swapper/3 Tainted: G S O 4.14.180-36668-gf872280691f4_audio-gab11b12 #1
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: Ia1a3212bc3038067027c981d82f95fe894a3eedf
Signed-off-by: Alexander Winkowski <dereference23@outlook.com>
Signed-off-by: TogoFire <togofire@mailfence.com>
amritokun
pushed a commit
that referenced
this pull request
Nov 25, 2025
When picolcd is switched into bootloader mode (for FW flashing) make sure not to try to dereference NULL-pointers of feature-devices during unplug/unbind.
This fixes following BUG:
BUG: unable to handle kernel NULL pointer dereference at 00000298
IP: [<f811f56b>] picolcd_exit_framebuffer+0x1b/0x80 [hid_picolcd]
*pde = 00000000
Oops: 0000 [#1]
Modules linked in: hid_picolcd syscopyarea sysfillrect sysimgblt fb_sys_fops
CPU: 0 PID: 15 Comm: khubd Not tainted 3.11.0-rc7-00002-g50d62d4 #2
EIP: 0060:[<f811f56b>] EFLAGS: 00010292 CPU: 0
EIP is at picolcd_exit_framebuffer+0x1b/0x80 [hid_picolcd]
Call Trace:
[<f811d1ab>] picolcd_remove+0xcb/0x120 [hid_picolcd]
[<c1469b09>] hid_device_remove+0x59/0xc0
[<c13464ca>] __device_release_driver+0x5a/0xb0
[<c134653f>] device_release_driver+0x1f/0x30
[<c134603d>] bus_remove_device+0x9d/0xd0
[<c13439a5>] device_del+0xd5/0x150
[<c14696a4>] hid_destroy_device+0x24/0x60
[<c1474cbb>] usbhid_disconnect+0x1b/0x40
...
Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org>
Change-Id: Ibdfce9edd1a3d57f1b45c2a776152adada0a68cb
Cc: stable@kernel.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
(cherry picked from commit 1cde501)
Signed-off-by: TogoFire <togofire@mailfence.com>
amritokun
pushed a commit
that referenced
this pull request
Nov 25, 2025
zone_watermark_fast was introduced by commit 48ee5f3 ("mm, page_alloc: shortcut watermark checks for order-0 pages"). The commit simply checks if free pages is bigger than watermark without additional calculation such like reducing watermark.
It considered free cma pages but it did not consider highatomic reserved. This may incur exhaustion of free pages except high order atomic free pages.
Assume that reserved_highatomic pageblock is bigger than watermark min, and there are only few free pages except high order atomic free. Because zone_watermark_fast passes the allocation without considering high order atomic free, normal reclaimable allocation like GFP_HIGHUSER will consume all the free pages. Then finally order-0 atomic allocation may fail on allocation.
This means watermark min is not protected against non-atomic allocation. The order-0 atomic allocation with ALLOC_HARDER unwantedly can be failed. Additionally the __GFP_MEMALLOC allocation with ALLOC_NO_WATERMARKS also can be failed.
To avoid the problem, zone_watermark_fast should consider highatomic reserve. If the actual size of high atomic free is counted accurately like cma free, we may use it. On this patch just use nr_reserved_highatomic. Additionally introduce __zone_watermark_unusable_free to factor out common parts between zone_watermark_fast and __zone_watermark_ok.
This is an example of ALLOC_HARDER allocation failure using v4.19 based kernel.
Binder:9343_3: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null)
Call trace:
[<ffffff8008f40f8c>] dump_stack+0xb8/0xf0
[<ffffff8008223320>] warn_alloc+0xd8/0x12c
[<ffffff80082245e4>] __alloc_pages_nodemask+0x120c/0x1250
[<ffffff800827f6e8>] new_slab+0x128/0x604
[<ffffff800827b0cc>] ___slab_alloc+0x508/0x670
[<ffffff800827ba00>] __kmalloc+0x2f8/0x310
[<ffffff80084ac3e0>] context_struct_to_string+0x104/0x1cc
[<ffffff80084ad8fc>] security_sid_to_context_core+0x74/0x144
[<ffffff80084ad880>] security_sid_to_context+0x10/0x18
[<ffffff800849bd80>] selinux_secid_to_secctx+0x20/0x28
[<ffffff800849109c>] security_secid_to_secctx+0x3c/0x70
[<ffffff8008bfe118>] binder_transaction+0xe68/0x454c
Mem-Info:
active_anon:102061 inactive_anon:81551 isolated_anon:0
active_file:59102 inactive_file:68924 isolated_file:64
unevictable:611 dirty:63 writeback:0 unstable:0
slab_reclaimable:13324 slab_unreclaimable:44354
mapped:83015 shmem:4858 pagetables:26316 bounce:0
free:2727 free_pcp:1035 free_cma:178
Node 0 active_anon:408244kB inactive_anon:326204kB active_file:236408kB inactive_file:275696kB unevictable:2444kB isolated(anon):0kB isolated(file):256kB mapped:332060kB dirty:252kB writeback:0kB shmem:19432kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
Normal free:10908kB min:6192kB low:44388kB high:47060kB active_anon:409160kB inactive_anon:325924kB active_file:235820kB inactive_file:276628kB unevictable:2444kB writepending:252kB present:3076096kB managed:2673676kB mlocked:2444kB kernel_stack:62512kB pagetables:105264kB bounce:0kB free_pcp:4140kB local_pcp:40kB free_cma:712kB
lowmem_reserve[]: 0 0
Normal: 505*4kB (H) 357*8kB (H) 201*16kB (H) 65*32kB (H) 1*64kB (H) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 10236kB
138826 total pagecache pages
5460 pages in swap cache
Swap cache stats: add 8273090, delete 8267506, find 1004381/4060142
This is an example of ALLOC_NO_WATERMARKS allocation failure using v4.14 based kernel.
kswapd0: page allocation failure: order:0, mode:0x140000a(GFP_NOIO|__GFP_HIGHMEM|__GFP_MOVABLE), nodemask=(null)
kswapd0 cpuset=/ mems_allowed=0
CPU: 4 PID: 1221 Comm: kswapd0 Not tainted 4.14.113-18770262-userdebug #1
Call trace:
[<0000000000000000>] dump_backtrace+0x0/0x248
[<0000000000000000>] show_stack+0x18/0x20
[<0000000000000000>] __dump_stack+0x20/0x28
[<0000000000000000>] dump_stack+0x68/0x90
[<0000000000000000>] warn_alloc+0x104/0x198
[<0000000000000000>] __alloc_pages_nodemask+0xdc0/0xdf0
[<0000000000000000>] zs_malloc+0x148/0x3d0
[<0000000000000000>] zram_bvec_rw+0x410/0x798
[<0000000000000000>] zram_rw_page+0x88/0xdc
[<0000000000000000>] bdev_write_page+0x70/0xbc
[<0000000000000000>] __swap_writepage+0x58/0x37c
[<0000000000000000>] swap_writepage+0x40/0x4c
[<0000000000000000>] shrink_page_list+0xc30/0xf48
[<0000000000000000>] shrink_inactive_list+0x2b0/0x61c
[<0000000000000000>] shrink_node_memcg+0x23c/0x618
[<0000000000000000>] shrink_node+0x1c8/0x304
[<0000000000000000>] kswapd+0x680/0x7c4
[<0000000000000000>] kthread+0x110/0x120
[<0000000000000000>] ret_from_fork+0x10/0x18
Mem-Info:
active_anon:111826 inactive_anon:65557 isolated_anon:0\x0a active_file:44260 inactive_file:83422 isolated_file:0\x0a unevictable:4158 dirty:117 writeback:0 unstable:0\x0a slab_reclaimable:13943 slab_unreclaimable:43315\x0a mapped:102511 shmem:3299 pagetables:19566 bounce:0\x0a free:3510 free_pcp:553 free_cma:0
Node 0 active_anon:447304kB inactive_anon:262228kB active_file:177040kB inactive_file:333688kB unevictable:16632kB isolated(anon):0kB isolated(file):0kB mapped:410044kB dirty:468kB writeback:0kB shmem:13196kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
Normal free:14040kB min:7440kB low:94500kB high:98136kB reserved_highatomic:32768KB active_anon:447336kB inactive_anon:261668kB active_file:177572kB inactive_file:333768kB unevictable:16632kB writepending:480kB present:4081664kB managed:3637088kB mlocked:16632kB kernel_stack:47072kB pagetables:78264kB bounce:0kB free_pcp:2280kB local_pcp:720kB free_cma:0kB
[ 4738.329607] lowmem_reserve[]: 0 0
Normal: 860*4kB (H) 453*8kB (H) 180*16kB (H) 26*32kB (H) 34*64kB (H) 6*128kB (H) 2*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 14232kB
This is trace log which shows GFP_HIGHUSER consumes free pages right before ALLOC_NO_WATERMARKS.
<...>-22275 [006] .... 889.213383: mm_page_alloc: page=00000000d2be5665 pfn=970744 order=0 migratetype=0 nr_free=3650 gfp_flags=GFP_HIGHUSER|__GFP_ZERO
<...>-22275 [006] .... 889.213385: mm_page_alloc: page=000000004b2335c2 pfn=970745 order=0 migratetype=0 nr_free=3650 gfp_flags=GFP_HIGHUSER|__GFP_ZERO
<...>-22275 [006] .... 889.213387: mm_page_alloc: page=00000000017272e1 pfn=970278 order=0 migratetype=0 nr_free=3650 gfp_flags=GFP_HIGHUSER|__GFP_ZERO
<...>-22275 [006] .... 889.213389: mm_page_alloc: page=00000000c4be79fb pfn=970279 order=0 migratetype=0 nr_free=3650 gfp_flags=GFP_HIGHUSER|__GFP_ZERO
<...>-22275 [006] .... 889.213391: mm_page_alloc: page=00000000f8a51d4f pfn=970260 order=0 migratetype=0 nr_free=3650 gfp_flags=GFP_HIGHUSER|__GFP_ZERO
<...>-22275 [006] .... 889.213393: mm_page_alloc: page=000000006ba8f5ac pfn=970261 order=0 migratetype=0 nr_free=3650 gfp_flags=GFP_HIGHUSER|__GFP_ZERO
<...>-22275 [006] .... 889.213395: mm_page_alloc: page=00000000819f1cd3 pfn=970196 order=0 migratetype=0 nr_free=3650 gfp_flags=GFP_HIGHUSER|__GFP_ZERO
<...>-22275 [006] .... 889.213396: mm_page_alloc: page=00000000f6b72a64 pfn=970197 order=0 migratetype=0 nr_free=3650 gfp_flags=GFP_HIGHUSER|__GFP_ZERO
kswapd0-1207 [005] ...1 889.213398: mm_page_alloc: page= (null) pfn=0 order=0 migratetype=1 nr_free=3650 gfp_flags=GFP_NOWAIT|__GFP_HIGHMEM|__GFP_NOWARN|__GFP_MOVABLE
[jaewon31.kim@samsung.com: remove redundant code for high-order]
Link: http://lkml.kernel.org/r/20200623035242.27232-1-jaewon31.kim@samsung.com
Reported-by: Yong-Taek Lee <ytk.lee@samsung.com>
Suggested-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Baoquan He <bhe@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Yong-Taek Lee <ytk.lee@samsung.com>
Cc: Michal Hocko <mhocko@kernel.org>
Link: http://lkml.kernel.org/r/20200619235958.11283-1-jaewon31.kim@samsung.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit f27ce0e14088b23f8d54ae4a44f70307ec420e64)
Change-Id: I2638d575f809e885272c3b2a4e5100f2d6b8934d
Bug: 175184106
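The effect of counting the highatomic reserve can be illustrated with the numbers from the v4.14 log in this message (free:14040kB, min:7440kB, reserved_highatomic:32768KB, lowmem_reserve 0). The following is a simplified model, not the kernel implementation: the function names, the 4 KiB page size, and the omission of the CMA adjustment (free_cma is 0 in that log) are assumptions made for illustration.

```python
PAGE_KB = 4  # assumed page size; the buddy lists in the log start at 4kB


def kb_to_pages(kb: int) -> int:
    return kb // PAGE_KB


def fast_check_old(free_pages: int, mark: int, lowmem_reserve: int) -> bool:
    """Order-0 shortcut as the message describes it before the patch."""
    return free_pages > mark + lowmem_reserve


def fast_check_new(free_pages: int, mark: int, lowmem_reserve: int,
                   reserved_highatomic: int, alloc_harder: bool = False) -> bool:
    """Same shortcut, but pages reserved for high-order atomic use are
    treated as unusable unless the caller is allowed to dip into them."""
    unusable = 0 if alloc_harder else reserved_highatomic
    return free_pages - unusable > mark + lowmem_reserve


free = kb_to_pages(14040)       # Normal free:14040kB
wmark_min = kb_to_pages(7440)   # min:7440kB
reserved = kb_to_pages(32768)   # reserved_highatomic:32768KB

print("old fast check passes:", fast_check_old(free, wmark_min, 0))
print("new fast check passes:", fast_check_new(free, wmark_min, 0, reserved))
```

With these inputs the old check lets a normal order-0 allocation through because 3510 free pages exceed the 1860-page min watermark, even though every free page sits in the highatomic reserve. The new check subtracts the 8192 reserved pages first and therefore refuses, which is the protection of watermark min that the message asks for.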
amritokun pushed a commit that referenced this pull request Dec 12, 2025
… iterator
With latest `bpftool prog` command, we observed the following kernel
panic.
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD dfe894067 P4D dfe894067 PUD deb663067 PMD 0
Oops: 0010 [#1] SMP
CPU: 9 PID: 6023 ...
RIP: 0010:0x0
Code: Bad RIP value.
RSP: 0000:ffffc900002b8f18 EFLAGS: 00010286
RAX: ffff8883a405f400 RBX: ffff888e46a6bf00 RCX: 000000008020000c
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8883a405f400
RBP: ffff888e46a6bf50 R08: 0000000000000000 R09: ffffffff81129600
R10: ffff8883a405f300 R11: 0000160000000000 R12: 0000000000002710
R13: 000000e9494b690c R14: 0000000000000202 R15: 0000000000000009
FS: 00007fd9187fe700(0000) GS:ffff888e46a40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 0000000de5d33002 CR4: 0000000000360ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
rcu_core+0x1a4/0x440
__do_softirq+0xd3/0x2c8
irq_exit+0x9d/0xa0
smp_apic_timer_interrupt+0x68/0x120
apic_timer_interrupt+0xf/0x20
</IRQ>
RIP: 0033:0x47ce80
Code: Bad RIP value.
RSP: 002b:00007fd9187fba40 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000002 RBX: 00007fd931789160 RCX: 000000000000010c
RDX: 00007fd9308cdfb4 RSI: 00007fd9308cdfb4 RDI: 00007ffedd1ea0a8
RBP: 00007fd9187fbab0 R08: 000000000000000e R09: 000000000000002a
R10: 0000000000480210 R11: 00007fd9187fc570 R12: 00007fd9316cc400
R13: 0000000000000118 R14: 00007fd9308cdfb4 R15: 00007fd9317a9380
After further analysis, the bug is triggered by
Commit eaaacd23910f ("bpf: Add task and task/file iterator targets")
which introduced task_file bpf iterator, which traverses all open file
descriptors for all tasks in the current namespace.
The latest `bpftool prog` calls a task_file bpf program to traverse
all files in the system in order to associate processes with progs/maps, etc.
When traversing files for a given task, rcu read_lock is taken to
access all files in a file_struct. But it used get_file() to grab
a file, which is not right. It is possible file->f_count is 0 and
get_file() will unconditionally increase it.
Later put_file() may cause all kinds of issues, with the above
as one of the symptoms.
The failure can be reproduced with the following steps in a few seconds:
$ cat t.c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#define N 10000
int fd[N];
int main() {
    int i;
    /* open the same file N times, creating it if it does not exist */
    for (i = 0; i < N; i++) {
        fd[i] = open("./note.txt", O_RDWR | O_CREAT, 0644);
        if (fd[i] < 0) {
            fprintf(stderr, "failed\n");
            return -1;
        }
    }
    /* then drop every reference again */
    for (i = 0; i < N; i++)
        close(fd[i]);
    return 0;
}
$ gcc -O2 t.c
$ cat run.sh
#!/bin/bash
for i in {1..100}
do
    while true; do ./a.out; done &
done
$ ./run.sh
$ while true; do bpftool prog >& /dev/null; done
This patch uses get_file_rcu(), which only grabs a file if
file->f_count is not zero. This ensures the file pointer
is always valid. With the patch applied, the above reproducer
ran for more than 30 minutes without failing.
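The difference between an unconditional grab and a "grab only if the count is still non-zero" can be sketched in userspace with C11 atomics. This is only an illustration of the idea described above, not the kernel implementation: `struct toy_file`, `toy_get()`, `toy_get_unless_zero()` and `toy_demo()` are hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object; only the count matters. */
struct toy_file {
	atomic_long count;
};

/* Unconditional grab: bumps the count even if it already reached zero. */
static void toy_get(struct toy_file *f)
{
	atomic_fetch_add(&f->count, 1);
}

/* Conditional grab: succeeds only while the count is still non-zero. */
static bool toy_get_unless_zero(struct toy_file *f)
{
	long old = atomic_load(&f->count);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&f->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Returns the count left behind after trying to grab a dying object. */
static long toy_demo(bool conditional)
{
	struct toy_file f;

	atomic_init(&f.count, 0);	/* last reference already dropped */
	assert(atomic_load(&f.count) == 0);

	if (conditional)
		(void)toy_get_unless_zero(&f);	/* refuses, count stays 0 */
	else
		toy_get(&f);			/* "resurrects" the object */

	return atomic_load(&f.count);
}
```

In this model the unconditional variant leaves a count of 1 on an object that is already being torn down, which is the kind of state that a later release can turn into a use-after-free; the conditional variant leaves the count at 0 and reports failure to the caller.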
Fixes: eaaacd23910f ("bpf: Add task and task/file iterator targets")
Suggested-by: Josef Bacik <josef@toxicpanda.com>
Change-Id: Ie35ddc8ad94c60cca0154dc5a3e6e927c2df2673
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/bpf/20200817174214.252601-1-yhs@fb.com
amritokun pushed a commit that referenced this pull request Dec 12, 2025
In preempt kernel, BPF_PROG_TEST_RUN on raw_tp triggers:
[ 35.874974] BUG: using smp_processor_id() in preemptible [00000000] code: new_name/87
[ 35.893983] caller is bpf_prog_test_run_raw_tp+0xd4/0x1b0
[ 35.900124] CPU: 1 PID: 87 Comm: new_name Not tainted 5.9.0-rc6-g615bd02bf #1
[ 35.907358] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 35.916941] Call Trace:
[ 35.919660] dump_stack+0x77/0x9b
[ 35.923273] check_preemption_disabled+0xb4/0xc0
[ 35.928376] bpf_prog_test_run_raw_tp+0xd4/0x1b0
[ 35.933872] ? selinux_bpf+0xd/0x70
[ 35.937532] __do_sys_bpf+0x6bb/0x21e0
[ 35.941570] ? find_held_lock+0x2d/0x90
[ 35.945687] ? vfs_write+0x150/0x220
[ 35.949586] do_syscall_64+0x2d/0x40
[ 35.953443] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fix this by calling migrate_disable() before smp_processor_id().
Fixes: 1b4d60ec162f ("bpf: Enable BPF_PROG_TEST_RUN for raw_tracepoint")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Change-Id: Ic6e489279a6ff446cc133abbf9eab851188a9734
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
amritokun pushed a commit that referenced this pull request Dec 12, 2025
The 64-bit JEQ/JNE handling in reg_set_min_max() was clearing reg->id in either
the true or the false branch. If the 'if (reg->id)' check was done on the other
branch, the counterpart register would have reg->id == 0 when passed to
find_equal_scalars(). In that case the helper would incorrectly identify other
registers with id == 0 as equivalent and propagate the state incorrectly.
Fix it by preserving ID across reg_set_min_max().
In other words any kind of comparison operator on the scalar register
should preserve its ID to recognize:
r1 = r2
if (r1 == 20) {
#1 here both r1 and r2 == 20
} else if (r2 < 20) {
#2 here both r1 and r2 < 20
}
The patch addresses case #1. Case #2 was already working correctly.
Fixes: 75748837b7e5 ("bpf: Propagate scalar ranges through register assignments.")
Change-Id: Id5737c3392daa0d695a768206b3d3e4bda187aaf
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Tested-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20201014175608.1416-1-alexei.starovoitov@gmail.com
amritokun pushed a commit that referenced this pull request Dec 12, 2025
[ Upstream commit 69ca310f34168eae0ada434796bfc22fb4a0fa26 ]
On some systems, some variant of the following splat is
repeatedly seen. The common factor in all traces seems
to be the entry point to task_file_seq_next(). With the
patch, all warnings go away.
rcu: INFO: rcu_sched self-detected stall on CPU
rcu:     26-....: (20992 ticks this GP) idle=d7e/1/0x4000000000000002 softirq=81556231/81556231 fqs=4876
    (t=21033 jiffies g=159148529 q=223125)
NMI backtrace for cpu 26
CPU: 26 PID: 2015853 Comm: bpftool Kdump: loaded Not tainted 5.6.13-0_fbk4_3876_gd8d1f9bf80bb #1
Hardware name: Quanta Twin Lakes MP/Twin Lakes Passive MP, BIOS F09_3A12 10/08/2018
Call Trace:
<IRQ>
dump_stack+0x50/0x70
nmi_cpu_backtrace.cold.6+0x13/0x50
? lapic_can_unplug_cpu.cold.30+0x40/0x40
nmi_trigger_cpumask_backtrace+0xba/0xca
rcu_dump_cpu_stacks+0x99/0xc7
rcu_sched_clock_irq.cold.90+0x1b4/0x3aa
? tick_sched_do_timer+0x60/0x60
update_process_times+0x24/0x50
tick_sched_timer+0x37/0x70
__hrtimer_run_queues+0xfe/0x270
hrtimer_interrupt+0xf4/0x210
smp_apic_timer_interrupt+0x5e/0x120
apic_timer_interrupt+0xf/0x20
</IRQ>
RIP: 0010:get_pid_task+0x38/0x80
Code: 89 f6 48 8d 44 f7 08 48 8b 00 48 85 c0 74 2b 48 83 c6 55 48 c1 e6 04 48 29 f0 74 19 48 8d 78 20 ba 01 00 00 00 f0 0f c1 50 20 <85> d2 74 27 78 11 83 c2 01 78 0c 48 83 c4 08 c3 31 c0 48 83 c4 08
RSP: 0018:ffffc9000d293dc8 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
RAX: ffff888637c05600 RBX: ffffc9000d293e0c RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000550 RDI: ffff888637c05620
RBP: ffffffff8284eb80 R08: ffff88831341d300 R09: ffff88822ffd8248
R10: ffff88822ffd82d0 R11: 00000000003a93c0 R12: 0000000000000001
R13: 00000000ffffffff R14: ffff88831341d300 R15: 0000000000000000
? find_ge_pid+0x1b/0x20
task_seq_get_next+0x52/0xc0
task_file_seq_get_next+0x159/0x220
task_file_seq_next+0x4f/0xa0
bpf_seq_read+0x159/0x390
vfs_read+0x8a/0x140
ksys_read+0x59/0xd0
do_syscall_64+0x42/0x110
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f95ae73e76e
Code: Bad RIP value.
RSP: 002b:00007ffc02c1dbf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 000000000170faa0 RCX: 00007f95ae73e76e
RDX: 0000000000001000 RSI: 00007ffc02c1dc30 RDI: 0000000000000007
RBP: 00007ffc02c1ec70 R08: 0000000000000005 R09: 0000000000000006
R10: fffffffffffff20b R11: 0000000000000246 R12: 00000000019112a0
R13: 0000000000000000 R14: 0000000000000007 R15: 00000000004283c0
If unable to obtain the file structure for the current task,
proceed to the next task number after the one returned from
task_seq_get_next(), instead of the next task number from the
original iterator.
Also, save the stopping task number from task_seq_get_next()
on failure in case of restarts.
Fixes: eaaacd23910f ("bpf: Add task and task/file iterator targets")
Change-Id: I38f4c944895cb0497ddef69bbac342bc41250fda
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201218185032.2464558-2-jonathan.lemon@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
amritokun pushed a commit that referenced this pull request Dec 12, 2025
…_attach_type [ Upstream commit 5e21bb4e812566aef86fbb77c96a4ec0782286e4 ] These two types of XDP progs (BPF_XDP_DEVMAP, BPF_XDP_CPUMAP) will not be executed directly in the driver, therefore we should also not directly run them from here. To run in these two situations, there must be further preparations done, otherwise these may cause a kernel panic. For more details, see also dev_xdp_attach(). [ 46.982479] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 46.984295] #PF: supervisor read access in kernel mode [ 46.985777] #PF: error_code(0x0000) - not-present page [ 46.987227] PGD 800000010dca4067 P4D 800000010dca4067 PUD 10dca6067 PMD 0 [ 46.989201] Oops: 0000 [#1] SMP PTI [ 46.990304] CPU: 7 PID: 562 Comm: a.out Not tainted 5.13.0+ #44 [ 46.992001] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/24 [ 46.995113] RIP: 0010:___bpf_prog_run+0x17b/0x1710 [ 46.996586] Code: 49 03 14 cc e8 76 f6 fe ff e9 ad fe ff ff 0f b6 43 01 48 0f bf 4b 02 48 83 c3 08 89 c2 83 e0 0f c0 ea 04 02 [ 47.001562] RSP: 0018:ffffc900005afc58 EFLAGS: 00010246 [ 47.003115] RAX: 0000000000000000 RBX: ffffc9000023f068 RCX: 0000000000000000 [ 47.005163] RDX: 0000000000000000 RSI: 0000000000000079 RDI: ffffc900005afc98 [ 47.007135] RBP: 0000000000000000 R08: ffffc9000023f048 R09: c0000000ffffdfff [ 47.009171] R10: 0000000000000001 R11: ffffc900005afb40 R12: ffffc900005afc98 [ 47.011172] R13: 0000000000000001 R14: 0000000000000001 R15: ffffffff825258a8 [ 47.013244] FS: 00007f04a5207580(0000) GS:ffff88842fdc0000(0000) knlGS:0000000000000000 [ 47.015705] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 47.017475] CR2: 0000000000000000 CR3: 0000000100182005 CR4: 0000000000770ee0 [ 47.019558] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 47.021595] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 47.023574] PKRU: 55555554 [ 47.024571] Call Trace: [ 47.025424] 
__bpf_prog_run32+0x32/0x50 [ 47.026296] ? printk+0x53/0x6a [ 47.027066] ? ktime_get+0x39/0x90 [ 47.027895] bpf_test_run.cold.28+0x23/0x123 [ 47.028866] ? printk+0x53/0x6a [ 47.029630] bpf_prog_test_run_xdp+0x149/0x1d0 [ 47.030649] __sys_bpf+0x1305/0x23d0 [ 47.031482] __x64_sys_bpf+0x17/0x20 [ 47.032316] do_syscall_64+0x3a/0x80 [ 47.033165] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 47.034254] RIP: 0033:0x7f04a51364dd [ 47.035133] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 48 [ 47.038768] RSP: 002b:00007fff8f9fc518 EFLAGS: 00000213 ORIG_RAX: 0000000000000141 [ 47.040344] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f04a51364dd [ 47.041749] RDX: 0000000000000048 RSI: 0000000020002a80 RDI: 000000000000000a [ 47.043171] RBP: 00007fff8f9fc530 R08: 0000000002049300 R09: 0000000020000100 [ 47.044626] R10: 0000000000000004 R11: 0000000000000213 R12: 0000000000401070 [ 47.046088] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 47.047579] Modules linked in: [ 47.048318] CR2: 0000000000000000 [ 47.049120] ---[ end trace 7ad34443d5be719a ]--- [ 47.050273] RIP: 0010:___bpf_prog_run+0x17b/0x1710 [ 47.051343] Code: 49 03 14 cc e8 76 f6 fe ff e9 ad fe ff ff 0f b6 43 01 48 0f bf 4b 02 48 83 c3 08 89 c2 83 e0 0f c0 ea 04 02 [ 47.054943] RSP: 0018:ffffc900005afc58 EFLAGS: 00010246 [ 47.056068] RAX: 0000000000000000 RBX: ffffc9000023f068 RCX: 0000000000000000 [ 47.057522] RDX: 0000000000000000 RSI: 0000000000000079 RDI: ffffc900005afc98 [ 47.058961] RBP: 0000000000000000 R08: ffffc9000023f048 R09: c0000000ffffdfff [ 47.060390] R10: 0000000000000001 R11: ffffc900005afb40 R12: ffffc900005afc98 [ 47.061803] R13: 0000000000000001 R14: 0000000000000001 R15: ffffffff825258a8 [ 47.063249] FS: 00007f04a5207580(0000) GS:ffff88842fdc0000(0000) knlGS:0000000000000000 [ 47.065070] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 47.066307] CR2: 0000000000000000 CR3: 0000000100182005 CR4: 
0000000000770ee0 [ 47.067747] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 47.069217] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 47.070652] PKRU: 55555554 [ 47.071318] Kernel panic - not syncing: Fatal exception [ 47.072854] Kernel Offset: disabled [ 47.073683] ---[ end Kernel panic - not syncing: Fatal exception ]--- Fixes: 9216477449f3 ("bpf: cpumap: Add the possibility to attach an eBPF program to cpumap") Fixes: fbee97feed9b ("bpf: Add support to attach bpf program to a devmap entry") Reported-by: Abaci <abaci@linux.alibaba.com> Change-Id: I3585e8abeacc017043f6e8a14676121d554c7c6f Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Dust Li <dust.li@linux.alibaba.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: David Ahern <dsahern@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210708080409.73525-1-xuanzhuo@linux.alibaba.com Signed-off-by: Sasha Levin <sashal@kernel.org>
amritokun pushed a commit that referenced this pull request Dec 12, 2025
[ Upstream commit d6371c76e20d7d3f61b05fd67b596af4d14a8886 ]
We got the following UBSAN report on one of our testing machines:
================================================================================
UBSAN: array-index-out-of-bounds in kernel/bpf/syscall.c:2389:24
index 6 is out of range for type 'char *[6]'
CPU: 43 PID: 930921 Comm: systemd-coredum Tainted: G O 5.10.48-cloudflare-kasan-2021.7.0 #1
Hardware name: <snip>
Call Trace:
dump_stack+0x7d/0xa3
ubsan_epilogue+0x5/0x40
__ubsan_handle_out_of_bounds.cold+0x43/0x48
? seq_printf+0x17d/0x250
bpf_link_show_fdinfo+0x329/0x380
? bpf_map_value_size+0xe0/0xe0
? put_files_struct+0x20/0x2d0
? __kasan_kmalloc.constprop.0+0xc2/0xd0
seq_show+0x3f7/0x540
seq_read_iter+0x3f8/0x1040
seq_read+0x329/0x500
? seq_read_iter+0x1040/0x1040
? __fsnotify_parent+0x80/0x820
? __fsnotify_update_child_dentry_flags+0x380/0x380
vfs_read+0x123/0x460
ksys_read+0xed/0x1c0
? __x64_sys_pwrite64+0x1f0/0x1f0
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9
<snip>
================================================================================
================================================================================
UBSAN: object-size-mismatch in kernel/bpf/syscall.c:2384:2
From the report, we can infer that some array access in bpf_link_show_fdinfo at index 6
is out of bounds. The obvious candidate is bpf_link_type_strs[BPF_LINK_TYPE_XDP] with
BPF_LINK_TYPE_XDP == 6. It turns out that BPF_LINK_TYPE_XDP is missing from bpf_types.h
and therefore doesn't have an entry in bpf_link_type_strs:
pos: 0
flags: 02000000
mnt_id: 13
link_type: (null)
link_id: 4
prog_tag: bcf7977d3b93787c
prog_id: 4
ifindex: 1
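The failure mode described above (an enum value with no matching entry in a name table) can be illustrated with a small standalone sketch. The enum, table and helper below are hypothetical and only loosely modelled on the report; the upstream fix adds the missing entry to bpf_types.h, whereas this sketch shows a defensive lookup that tolerates a missing entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical link types; TOY_LINK_XDP plays the role of the new value. */
enum toy_link_type {
	TOY_LINK_UNSPEC,
	TOY_LINK_RAW_TRACEPOINT,
	TOY_LINK_TRACING,
	TOY_LINK_XDP,
	TOY_LINK_MAX
};

/*
 * Sizing the table by the enum means a forgotten entry is a NULL slot
 * instead of an index past the end of the array.
 */
static const char *const toy_link_strs[TOY_LINK_MAX] = {
	[TOY_LINK_UNSPEC]         = "<invalid>",
	[TOY_LINK_RAW_TRACEPOINT] = "raw_tracepoint",
	[TOY_LINK_TRACING]        = "tracing",
	/* TOY_LINK_XDP deliberately left out to mimic the missing entry */
};

/* Bounds-checked lookup that also copes with unfilled slots. */
static const char *toy_link_name(unsigned int type)
{
	if (type >= TOY_LINK_MAX || !toy_link_strs[type])
		return "<unknown>";
	return toy_link_strs[type];
}
```

With an unsized table, as in the report, the array only has as many slots as its highest initialised index, so a lookup with the newest enum value reads past the end. Sizing the table by the enum and checking for NULL turns that into a harmless "<unknown>".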
Fixes: aa8d3a716b59 ("bpf, xdp: Add bpf_link-based XDP attachment API")
Change-Id: I3778494269186262db490526483faca0ed56207d
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210719085134.43325-2-lmb@cloudflare.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
amritokun pushed a commit that referenced this pull request Dec 12, 2025
…descriptors commit a657182a5c5150cdfacb6640aad1d2712571a409 upstream. Hsin-Wei reported a KASAN splat triggered by their BPF runtime fuzzer which is based on a customized syzkaller: BUG: KASAN: slab-out-of-bounds in bpf_int_jit_compile+0x1257/0x13f0 Read of size 8 at addr ffff888004e90b58 by task syz-executor.0/1489 CPU: 1 PID: 1489 Comm: syz-executor.0 Not tainted 5.19.0 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x9c/0xc9 print_address_description.constprop.0+0x1f/0x1f0 ? bpf_int_jit_compile+0x1257/0x13f0 kasan_report.cold+0xeb/0x197 ? kvmalloc_node+0x170/0x200 ? bpf_int_jit_compile+0x1257/0x13f0 bpf_int_jit_compile+0x1257/0x13f0 ? arch_prepare_bpf_dispatcher+0xd0/0xd0 ? rcu_read_lock_sched_held+0x43/0x70 bpf_prog_select_runtime+0x3e8/0x640 ? bpf_obj_name_cpy+0x149/0x1b0 bpf_prog_load+0x102f/0x2220 ? __bpf_prog_put.constprop.0+0x220/0x220 ? find_held_lock+0x2c/0x110 ? __might_fault+0xd6/0x180 ? lock_downgrade+0x6e0/0x6e0 ? lock_is_held_type+0xa6/0x120 ? __might_fault+0x147/0x180 __sys_bpf+0x137b/0x6070 ? bpf_perf_link_attach+0x530/0x530 ? new_sync_read+0x600/0x600 ? __fget_files+0x255/0x450 ? lock_downgrade+0x6e0/0x6e0 ? fput+0x30/0x1a0 ? ksys_write+0x1a8/0x260 __x64_sys_bpf+0x7a/0xc0 ? syscall_enter_from_user_mode+0x21/0x70 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f917c4e2c2d The problem here is that a range of tnum_range(0, map->max_entries - 1) has limited ability to represent the concrete tight range with the tnum as the set of resulting states from value + mask can result in a superset of the actual intended range, and as such a tnum_in(range, reg->var_off) check may yield true when it shouldn't, for example tnum_range(0, 2) would result in 00XX -> v = 0000, m = 0011 such that the intended set of {0, 1, 2} is here represented by a less precise superset of {0, 1, 2, 3}. 
As the register is known const scalar, really just use the concrete reg->var_off.value for the upper index check. Fixes: d2e4c1e6c294 ("bpf: Constant map key tracking for prog array pokes") Reported-by: Hsin-Wei Hung <hsinweih@uci.edu> Change-Id: I03425b3777961f3ed6e48319d4acaea69e4ec014 Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Shung-Hsi Yu <shung-hsi.yu@suse.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/984b37f9fdf7ac36831d2137415a4a915744c1b6.1661462653.git.daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
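The tnum_range(0, 2) example from the message above can be reproduced in userspace. The sketch below re-implements the range and membership helpers as I understand them from kernel/bpf/tnum.c; treat the bodies as an approximation, and note that every `toy_` name is made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Tristate number: bits set in mask are unknown, the rest come from value. */
struct toy_tnum {
	uint64_t value;
	uint64_t mask;
};

/* Position of the highest set bit, counting from 1; 0 for x == 0. */
static unsigned int toy_fls64(uint64_t x)
{
	unsigned int bits = 0;

	while (x) {
		bits++;
		x >>= 1;
	}
	return bits;
}

static struct toy_tnum toy_tnum_const(uint64_t v)
{
	struct toy_tnum t = { .value = v, .mask = 0 };

	return t;
}

/* Smallest tnum covering [min, max]; may cover more than the range. */
static struct toy_tnum toy_tnum_range(uint64_t min, uint64_t max)
{
	uint64_t chi = min ^ max, delta;
	unsigned int bits = toy_fls64(chi);
	struct toy_tnum t = { .value = 0, .mask = UINT64_MAX };

	if (bits > 63)		/* 1ULL << 64 would be undefined */
		return t;
	delta = (1ULL << bits) - 1;
	t.value = min & ~delta;
	t.mask = delta;
	return t;
}

/* True if every concrete value of b is representable by a. */
static bool toy_tnum_in(struct toy_tnum a, struct toy_tnum b)
{
	if (b.mask & ~a.mask)
		return false;
	b.value &= ~a.mask;
	return a.value == b.value;
}

/* Imprecise bound check via the tnum superset. */
static bool toy_index_ok_tnum(uint64_t idx, uint64_t max_entries)
{
	return toy_tnum_in(toy_tnum_range(0, max_entries - 1),
			   toy_tnum_const(idx));
}

/* Exact bound check on the known constant. */
static bool toy_index_ok_exact(uint64_t idx, uint64_t max_entries)
{
	return idx < max_entries;
}
```

For a map with 3 entries, the tnum-based check accepts index 3 because the range 0..2 is widened to value 0, mask 0b11, while the exact comparison rejects it. That matches the reasoning in the message for checking the concrete constant instead.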
amritokun pushed a commit that referenced this pull request Dec 12, 2025
[ Upstream commit 7d6620f107bae6ed687ff07668e8e8f855487aa9 ]
Syzkaller reported a triggered kernel BUG as follows:
------------[ cut here ]------------
kernel BUG at kernel/bpf/cgroup.c:925!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 194 Comm: detach Not tainted 5.19.0-14184-g69dac8e431af #8
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:__cgroup_bpf_detach+0x1f2/0x2a0
Code: 00 e8 92 60 30 00 84 c0 75 d8 4c 89 e0 31 f6 85 f6 74 19 42 f6 84 28 48 05 00 00 02 75 0e 48 8b 80 c0 00 00 00 48 85 c0 75 e5 <0f> 0b 48 8b 0c5
RSP: 0018:ffffc9000055bdb0 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff888100ec0800 RCX: ffffc900000f1000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff888100ec4578
RBP: 0000000000000000 R08: ffff888100ec0800 R09: 0000000000000040
R10: 0000000000000000 R11: 0000000000000000 R12: ffff888100ec4000
R13: 000000000000000d R14: ffffc90000199000 R15: ffff888100effb00
FS: 00007f68213d2b80(0000) GS:ffff88813bc80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055f74a0e5850 CR3: 0000000102836000 CR4: 00000000000006e0
Call Trace:
<TASK>
cgroup_bpf_prog_detach+0xcc/0x100
__sys_bpf+0x2273/0x2a00
__x64_sys_bpf+0x17/0x20
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f68214dbcb9
Code: 08 44 89 e0 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff8
RSP: 002b:00007ffeb487db68 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 000000000000000b RCX: 00007f68214dbcb9
RDX: 0000000000000090 RSI: 00007ffeb487db70 RDI: 0000000000000009
RBP: 0000000000000003 R08: 0000000000000012 R09: 0000000b00000003
R10: 00007ffeb487db70 R11: 0000000000000246 R12: 00007ffeb487dc20
R13: 0000000000000004 R14: 0000000000000001 R15: 000055f74a1011b0
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
Reproduction steps:
For the following cgroup tree,
root
|
cg1
|
cg2
1. Attach prog2 to cg2, and then attach prog1 to cg1; the attach type of both bpf progs is NONE or OVERRIDE.
2. Write 1 to /proc/thread-self/fail-nth for failslab.
3. Detach prog1 from cg1, and the kernel BUG occurs.
Failslab injection causes kmalloc to fail, so the code falls back to purge_effective_progs. The problem is that cg2 has another prog attached, so when walking through the cg2 layer the iteration increments pos to 1, subsequent operations are skipped by the following condition, and cg ends up NULL.
`if (pos && !(cg->bpf.flags[atype] & BPF_F_ALLOW_MULTI))`
A NULL cg means that no link or prog matched. This is expected and is not a bug, so just skip the no-match case.
Fixes: 4c46091ee985 ("bpf: Fix KASAN use-after-free Read in compute_effective_progs")
Change-Id: Ica98c69bf3d58c46b1692625827485f7312960be
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220813134030.1972696-1-pulehui@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
amritokun pushed a commit that referenced this pull request Dec 12, 2025
This reverts commit 9dcdb6f5ee0ea133f2e0d669743fcb48362ee4c5. The IRQ subsystem already blocks suspend on waiting for IRQ threads to finish running (in dpm_noirq_begin()). This PM wakeup does nothing but add latency to the IRQ handler for non-RT kernels, and it isn't RT-friendly either: [ 42.466403] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:974 [ 42.466407] in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper/3 [ 42.466408] Preemption disabled at: [ 42.466421] [<00000000100c9f7d>] secondary_start_kernel+0xa8/0x130 [ 42.466427] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G S W 4.14.212-rt102-Sultan #1 [ 42.466429] Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM sm8150 Coral (DT) [ 42.466432] Call trace: [ 42.466436] dump_backtrace+0x0/0x1ac [ 42.466439] show_stack+0x14/0x1c [ 42.466444] dump_stack+0x84/0xac [ 42.466448] ___might_sleep+0x140/0x150 [ 42.466452] rt_spin_lock+0x3c/0x50 [ 42.466458] __pm_stay_awake+0x20/0x50 [ 42.466462] qcom_smp2p_isr+0x10/0x1c [ 42.466467] __handle_irq_event_percpu+0x60/0xd4 [ 42.466469] handle_irq_event_percpu+0x58/0xb0 [ 42.466471] handle_irq_event+0x68/0xe0 [ 42.466474] handle_fasteoi_irq+0x140/0x1fc [ 42.466476] generic_handle_irq+0x18/0x2c [ 42.466478] __handle_domain_irq+0xf8/0xfc [ 42.466481] gic_handle_irq+0xc8/0x164 [ 42.466483] el1_irq+0xb0/0x130 [ 42.466487] finish_task_switch+0xcc/0x1e4 [ 42.466491] __schedule+0x3f0/0x4e0 [ 42.466493] schedule_idle+0x28/0x44 [ 42.466497] do_idle+0x78/0x230 [ 42.466500] cpu_startup_entry+0x20/0x28 [ 42.466502] secondary_start_kernel+0x124/0x130 Remove it since it's useless. Change-Id: Ib4a03d89bbaaf114980ee93105ce3d5b0b5127eb Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
amritokun pushed a commit that referenced this pull request Dec 12, 2025
Due to a historical oversight, we emit a redundant static branch for each atomic/atomic64 operation when CONFIG_ARM64_LSE_ATOMICS is selected. We can safely remove this, making the kernel Image reasonably smaller. When CONFIG_ARM64_LSE_ATOMICS is selected, every LSE atomic operation has two preceding static branches with the same target, e.g. b f7c <kernel_init_freeable+0xa4> b f7c <kernel_init_freeable+0xa4> mov w0, #0x1 // #1 ldadd w0, w0, [x19] This is because the __lse_ll_sc_body() wrapper uses system_uses_lse_atomics(), which checks both `arm64_const_caps_ready` and `cpu_hwcap_keys[ARM64_HAS_LSE_ATOMICS]`, each of which emits a static branch. This has been the case since commit: addfc38 ("arm64: atomics: avoid out-of-line ll/sc atomics") However, there was never a need to check `arm64_const_caps_ready`, which was itself introduced in commit: 63a1e1c ("arm64/cpufeature: don't use mutex in bringup path") ... so that cpus_have_const_cap() could fall back to checking the `cpu_hwcaps` bitmap prior to the static keys for individual caps becoming enabled. As system_uses_lse_atomics() doesn't check `cpu_hwcaps`, and doesn't need to as we can safely use the LL/SC atomics prior to enabling the `ARM64_HAS_LSE_ATOMICS` static key, it doesn't need to check `arm64_const_caps_ready`. This patch removes the `arm64_const_caps_ready` check from system_uses_lse_atomics(). As the arch_atomic_* routines are meant to be safely usable in noinstr code, I've also marked system_uses_lse_atomics() as __always_inline. This results in one fewer static branch per atomic operation, with the prior example becoming: b f78 <kernel_init_freeable+0xa0> mov w0, #0x1 // #1 ldadd w0, w0, [x19] Each static branch consists of the branch itself and an associated __jump_table entry. 
Removing these has a reasonable impact on the Image size, with a GCC 11.1.0 defconfig v5.17-rc2 Image being reduced by 128KiB: | [mark@lakrids:~/src/linux]% ls -al Image* | -rw-r--r-- 1 mark mark 34619904 Feb 3 18:24 Image.baseline | -rw-r--r-- 1 mark mark 34488832 Feb 3 18:33 Image.onebranch Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Suzuki Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220204104439.270567-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
amritokun pushed a commit that referenced this pull request Dec 12, 2025
Aneesh reported that:
  tlb_flush_mmu()
    tlb_flush_mmu_tlbonly()
      tlb_flush()                     <-- #1
    tlb_flush_mmu_free()
      tlb_table_flush()
        tlb_table_invalidate()
          tlb_flush_mmu_tlbonly()
            tlb_flush()               <-- #2
does two TLBIs when tlb->fullmm, because __tlb_reset_range() will not clear tlb->end in that case.
Observe that any caller to __tlb_adjust_range() also sets at least one of the tlb->freed_tables || tlb->cleared_p* bits, and those are unconditionally cleared by __tlb_reset_range().
Change the condition for actually issuing TLBI to having one of those bits set, as opposed to having tlb->end != 0.
Link: http://lkml.kernel.org/r/20200116064531.483522-4-aneesh.kumar@linux.ibm.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reported-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
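The double flush described in this message can be modelled with a few lines of standalone C. This is a toy simulation under simplifying assumptions, not the mmu_gather code: `struct toy_gather` and the `toy_` helpers are hypothetical, a single `cleared_ptes` flag stands in for all the tlb->cleared_p* bits, and the `check_bits` parameter switches between the old condition (end != 0) and the new one (any bit set).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced model of the gather state. */
struct toy_gather {
	bool fullmm;
	unsigned long start, end;
	bool freed_tables;
	bool cleared_ptes;	/* stands in for all cleared_p* bits */
	int flushes;		/* how many TLB invalidations were issued */
};

/* Mirrors the described behaviour: 'end' is not zeroed when fullmm. */
static void toy_reset_range(struct toy_gather *tlb)
{
	if (tlb->fullmm) {
		tlb->start = tlb->end = ~0UL;
	} else {
		tlb->start = ~0UL;
		tlb->end = 0;
	}
	/* The bits are cleared unconditionally. */
	tlb->freed_tables = false;
	tlb->cleared_ptes = false;
}

static void toy_flush_tlbonly(struct toy_gather *tlb, bool check_bits)
{
	bool pending = check_bits
		? (tlb->freed_tables || tlb->cleared_ptes)	/* new test */
		: (tlb->end != 0);				/* old test */

	if (!pending)
		return;
	tlb->flushes++;
	toy_reset_range(tlb);
}

/* Runs the two-call sequence from the call tree and counts flushes. */
static int toy_count_flushes(bool fullmm, bool check_bits)
{
	struct toy_gather tlb = { .fullmm = fullmm };

	toy_reset_range(&tlb);

	/* Pretend one PTE was cleared, as a range adjustment would record. */
	tlb.cleared_ptes = true;
	if (!fullmm) {
		tlb.start = 0x1000;
		tlb.end = 0x2000;
	}

	toy_flush_tlbonly(&tlb, check_bits);	/* flush #1 */
	toy_flush_tlbonly(&tlb, check_bits);	/* would-be flush #2 */
	return tlb.flushes;
}
```

In this model the old condition yields two flushes for the fullmm case and one otherwise, while the bit-based condition yields one flush in both cases.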
amritokun pushed a commit that referenced this pull request Dec 21, 2025
[ Upstream commit a91c8096590bd7801a26454789f2992094fe36da ] The original code causes a circular locking dependency found by lockdep. ====================================================== WARNING: possible circular locking dependency detected 6.16.0-rc6-lgci-xe-xe-pw-151626v3+ #1 Tainted: G S U ------------------------------------------------------ xe_fault_inject/5091 is trying to acquire lock: ffff888156815688 ((work_completion)(&(&devcd->del_wk)->work)){+.+.}-{0:0}, at: __flush_work+0x25d/0x660 but task is already holding lock: ffff888156815620 (&devcd->mutex){+.+.}-{3:3}, at: dev_coredump_put+0x3f/0xa0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&devcd->mutex){+.+.}-{3:3}: mutex_lock_nested+0x4e/0xc0 devcd_data_write+0x27/0x90 sysfs_kf_bin_write+0x80/0xf0 kernfs_fop_write_iter+0x169/0x220 vfs_write+0x293/0x560 ksys_write+0x72/0xf0 __x64_sys_write+0x19/0x30 x64_sys_call+0x2bf/0x2660 do_syscall_64+0x93/0xb60 entry_SYSCALL_64_after_hwframe+0x76/0x7e -> #1 (kn->active#236){++++}-{0:0}: kernfs_drain+0x1e2/0x200 __kernfs_remove+0xae/0x400 kernfs_remove_by_name_ns+0x5d/0xc0 remove_files+0x54/0x70 sysfs_remove_group+0x3d/0xa0 sysfs_remove_groups+0x2e/0x60 device_remove_attrs+0xc7/0x100 device_del+0x15d/0x3b0 devcd_del+0x19/0x30 process_one_work+0x22b/0x6f0 worker_thread+0x1e8/0x3d0 kthread+0x11c/0x250 ret_from_fork+0x26c/0x2e0 ret_from_fork_asm+0x1a/0x30 -> #0 ((work_completion)(&(&devcd->del_wk)->work)){+.+.}-{0:0}: __lock_acquire+0x1661/0x2860 lock_acquire+0xc4/0x2f0 __flush_work+0x27a/0x660 flush_delayed_work+0x5d/0xa0 dev_coredump_put+0x63/0xa0 xe_driver_devcoredump_fini+0x12/0x20 [xe] devm_action_release+0x12/0x30 release_nodes+0x3a/0x120 devres_release_all+0x8a/0xd0 device_unbind_cleanup+0x12/0x80 device_release_driver_internal+0x23a/0x280 device_driver_detach+0x14/0x20 unbind_store+0xaf/0xc0 drv_attr_store+0x21/0x50 sysfs_kf_write+0x4a/0x80 kernfs_fop_write_iter+0x169/0x220 vfs_write+0x293/0x560 
ksys_write+0x72/0xf0 __x64_sys_write+0x19/0x30 x64_sys_call+0x2bf/0x2660 do_syscall_64+0x93/0xb60 entry_SYSCALL_64_after_hwframe+0x76/0x7e other info that might help us debug this: Chain exists of: (work_completion)(&(&devcd->del_wk)->work) --> kn->active#236 --> &devcd->mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&devcd->mutex); lock(kn->active#236); lock(&devcd->mutex); lock((work_completion)(&(&devcd->del_wk)->work)); *** DEADLOCK *** 5 locks held by xe_fault_inject/5091: #0: ffff8881129f9488 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x72/0xf0 #1: ffff88810c755078 (&of->mutex#2){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x123/0x220 #2: ffff8881054811a0 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0x55/0x280 #3: ffff888156815620 (&devcd->mutex){+.+.}-{3:3}, at: dev_coredump_put+0x3f/0xa0 #4: ffffffff8359e020 (rcu_read_lock){....}-{1:2}, at: __flush_work+0x72/0x660 stack backtrace: CPU: 14 UID: 0 PID: 5091 Comm: xe_fault_inject Tainted: G S U 6.16.0-rc6-lgci-xe-xe-pw-151626v3+ #1 PREEMPT_{RT,(lazy)} Tainted: [S]=CPU_OUT_OF_SPEC, [U]=USER Hardware name: Micro-Star International Co., Ltd. MS-7D25/PRO Z690-A DDR4(MS-7D25), BIOS 1.10 12/13/2021 Call Trace: <TASK> dump_stack_lvl+0x91/0xf0 dump_stack+0x10/0x20 print_circular_bug+0x285/0x360 check_noncircular+0x135/0x150 ? register_lock_class+0x48/0x4a0 __lock_acquire+0x1661/0x2860 lock_acquire+0xc4/0x2f0 ? __flush_work+0x25d/0x660 ? mark_held_locks+0x46/0x90 ? __flush_work+0x25d/0x660 __flush_work+0x27a/0x660 ? __flush_work+0x25d/0x660 ? trace_hardirqs_on+0x1e/0xd0 ? __pfx_wq_barrier_func+0x10/0x10 flush_delayed_work+0x5d/0xa0 dev_coredump_put+0x63/0xa0 xe_driver_devcoredump_fini+0x12/0x20 [xe] devm_action_release+0x12/0x30 release_nodes+0x3a/0x120 devres_release_all+0x8a/0xd0 device_unbind_cleanup+0x12/0x80 device_release_driver_internal+0x23a/0x280 ? 
bus_find_device+0xa8/0xe0 device_driver_detach+0x14/0x20 unbind_store+0xaf/0xc0 drv_attr_store+0x21/0x50 sysfs_kf_write+0x4a/0x80 kernfs_fop_write_iter+0x169/0x220 vfs_write+0x293/0x560 ksys_write+0x72/0xf0 __x64_sys_write+0x19/0x30 x64_sys_call+0x2bf/0x2660 do_syscall_64+0x93/0xb60 ? __f_unlock_pos+0x15/0x20 ? __x64_sys_getdents64+0x9b/0x130 ? __pfx_filldir64+0x10/0x10 ? do_syscall_64+0x1a2/0xb60 ? clear_bhb_loop+0x30/0x80 ? clear_bhb_loop+0x30/0x80 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x76e292edd574 Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d d5 ea 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89 RSP: 002b:00007fffe247a828 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000076e292edd574 RDX: 000000000000000c RSI: 00006267f6306063 RDI: 000000000000000b RBP: 000000000000000c R08: 000076e292fc4b20 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000202 R12: 00006267f6306063 R13: 000000000000000b R14: 00006267e6859c00 R15: 000076e29322a000 </TASK> xe 0000:03:00.0: [drm] Xe device coredump has been deleted. Fixes: 01daccf74832 ("devcoredump : Serialize devcd_del work") Cc: Mukesh Ojha <quic_mojha@quicinc.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Rafael J. 
Wysocki <rafael@kernel.org> Cc: Danilo Krummrich <dakr@kernel.org> Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org # v6.1+ Signed-off-by: Maarten Lankhorst <dev@lankhorst.se> Cc: Matthew Brost <matthew.brost@intel.com> Acked-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250723142416.1020423-1-dev@lankhorst.se Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [ replaced disable_delayed_work_sync() with cancel_delayed_work_sync() ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
amritokun pushed a commit that referenced this pull request on Dec 21, 2025
[ Upstream commit 38f50242bf0f237cdc262308d624d333286ec3c5 ]
With CONFIG_PROVE_RCU_LIST=y and by executing
$ netcat -l --sctp &
$ netcat --sctp localhost &
$ ss --sctp
one can trigger the following Lockdep-RCU splat(s):
WARNING: suspicious RCU usage
6.18.0-rc1-00093-g7f864458e9a6 #5 Not tainted
-----------------------------
net/sctp/diag.c:76 RCU-list traversed in non-reader section!!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
2 locks held by ss/215:
#0: ffff9c740828bec0 (nlk_cb_mutex-SOCK_DIAG){+.+.}-{4:4}, at: __netlink_dump_start+0x84/0x2b0
#1: ffff9c7401d72cd0 (sk_lock-AF_INET6){+.+.}-{0:0}, at: sctp_sock_dump+0x38/0x200
stack backtrace:
CPU: 0 UID: 0 PID: 215 Comm: ss Not tainted 6.18.0-rc1-00093-g7f864458e9a6 #5 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x5d/0x90
lockdep_rcu_suspicious.cold+0x4e/0xa3
inet_sctp_diag_fill.isra.0+0x4b1/0x5d0
sctp_sock_dump+0x131/0x200
sctp_transport_traverse_process+0x170/0x1b0
? __pfx_sctp_sock_filter+0x10/0x10
? __pfx_sctp_sock_dump+0x10/0x10
sctp_diag_dump+0x103/0x140
__inet_diag_dump+0x70/0xb0
netlink_dump+0x148/0x490
__netlink_dump_start+0x1f3/0x2b0
inet_diag_handler_cmd+0xcd/0x100
? __pfx_inet_diag_dump_start+0x10/0x10
? __pfx_inet_diag_dump+0x10/0x10
? __pfx_inet_diag_dump_done+0x10/0x10
sock_diag_rcv_msg+0x18e/0x320
? __pfx_sock_diag_rcv_msg+0x10/0x10
netlink_rcv_skb+0x4d/0x100
netlink_unicast+0x1d7/0x2b0
netlink_sendmsg+0x203/0x450
____sys_sendmsg+0x30c/0x340
___sys_sendmsg+0x94/0xf0
__sys_sendmsg+0x83/0xf0
do_syscall_64+0xbb/0x390
entry_SYSCALL_64_after_hwframe+0x77/0x7f
...
</TASK>
Fixes: 8f840e4 ("sctp: add the sctp_diag.c file")
Signed-off-by: Stefan Wiehler <stefan.wiehler@nokia.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20251028161506.3294376-2-stefan.wiehler@nokia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
amritokun pushed a commit that referenced this pull request on Dec 21, 2025
[ Upstream commit e120f46768d98151ece8756ebd688b0e43dc8b29 ]
Raw IP packets have no MAC header, leaving skb->mac_header uninitialized.
This can trigger kernel panics on ARM64 when xfrm or other subsystems
access the offset, due to strict alignment checks; one way to hit it is
running IPsec over the qmimux0 interface.
Initialize the MAC header to prevent such crashes.
Example trace:
Internal error: Oops: 000000009600004f [#1] SMP
CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.12.34-gbe78e49cb433 #1
Hardware name: LS1028A RDB Board (DT)
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : xfrm_input+0xde8/0x1318
lr : xfrm_input+0x61c/0x1318
sp : ffff800080003b20
Call trace:
xfrm_input+0xde8/0x1318
xfrm6_rcv+0x38/0x44
xfrm6_esp_rcv+0x48/0xa8
ip6_protocol_deliver_rcu+0x94/0x4b0
ip6_input_finish+0x44/0x70
ip6_input+0x44/0xc0
ipv6_rcv+0x6c/0x114
__netif_receive_skb_one_core+0x5c/0x8c
__netif_receive_skb+0x18/0x60
process_backlog+0x78/0x17c
__napi_poll+0x38/0x180
net_rx_action+0x168/0x2f0
Fixes: c6adf77 ("net: usb: qmi_wwan: add qmap mux protocol support")
Signed-off-by: Qendrim Maxhuni <qendrim.maxhuni@garderos.com>
Link: https://patch.msgid.link/20251029075744.105113-1-qendrim.maxhuni@garderos.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
amritokun pushed a commit that referenced this pull request on Dec 21, 2025
[ Upstream commit f04aad36a07cc17b7a5d5b9a2d386ce6fae63e93 ]
syzkaller discovered the following crash: (kernel BUG)
[ 44.607039] ------------[ cut here ]------------
[ 44.607422] kernel BUG at mm/userfaultfd.c:2067!
[ 44.608148] Oops: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
[ 44.608814] CPU: 1 UID: 0 PID: 2475 Comm: reproducer Not tainted 6.16.0-rc6 #1 PREEMPT(none)
[ 44.609635] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 44.610695] RIP: 0010:userfaultfd_release_all+0x3a8/0x460
<snip other registers, drop unreliable trace>
[ 44.617726] Call Trace:
[ 44.617926] <TASK>
[ 44.619284] userfaultfd_release+0xef/0x1b0
[ 44.620976] __fput+0x3f9/0xb60
[ 44.621240] fput_close_sync+0x110/0x210
[ 44.622222] __x64_sys_close+0x8f/0x120
[ 44.622530] do_syscall_64+0x5b/0x2f0
[ 44.622840] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 44.623244] RIP: 0033:0x7f365bb3f227
The kernel panics because it detects a UFFD inconsistency during userfaultfd_release_all(). Specifically, a VMA which has a valid pointer to vma->vm_userfaultfd_ctx, but no UFFD flags in vma->vm_flags.
The inconsistency is caused in ksm_madvise(): when the user calls madvise() with MADV_UNMERGEABLE on a VMA that is registered for UFFD in MINOR mode, it accidentally clears all flags stored in the upper 32 bits of vma->vm_flags.
Assuming an x86_64 kernel build, unsigned long is 64-bit and unsigned int and int are 32-bit wide. This setup causes the following mishap during the &= ~VM_MERGEABLE assignment.
VM_MERGEABLE is a 32-bit constant of type unsigned int, 0x8000'0000. After ~ is applied, it becomes 0x7fff'ffff unsigned int, which is then promoted to unsigned long before the & operation. This promotion fills the upper 32 bits with leading 0s, as we're doing an unsigned conversion (and even for a signed conversion, this wouldn't help as the leading bit is 0). The & operation thus ends up AND-ing vm_flags with 0x0000'0000'7fff'ffff instead of the intended 0xffff'ffff'7fff'ffff and hence accidentally clears the upper 32 bits of its value.
Fix it by changing the `VM_MERGEABLE` constant to unsigned long, using the BIT() macro.
Note: other VM_* flags are not affected: This only happens to the VM_MERGEABLE flag, as the other VM_* flags are all constants of type int and after the ~ operation, they end up with a leading 1 and are thus converted to unsigned long with leading 1s.
Note 2: After commit 31defc3b01d9 ("userfaultfd: remove (VM_)BUG_ON()s"), this is no longer a kernel BUG, but a WARNING at the same place:
[ 45.595973] WARNING: CPU: 1 PID: 2474 at mm/userfaultfd.c:2067
but the root cause (flag drop) remains the same.
[akpm@linux-foundation.org: rust bindgen wasn't able to handle BIT(), from Miguel]
Link: https://lore.kernel.org/oe-kbuild-all/202510030449.VfSaAjvd-lkp@intel.com/
Link: https://lkml.kernel.org/r/20251001090353.57523-2-acsjakub@amazon.de
Fixes: 7677f7fd8be7 ("userfaultfd: add minor fault registration mode")
Signed-off-by: Jakub Acs <acsjakub@amazon.de>
Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: SeongJae Park <sj@kernel.org>
Tested-by: Alice Ryhl <aliceryhl@google.com>
Tested-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Xu Xin <xu.xin16@zte.com.cn>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ acsjakub: drop rust-compatibility change (no rust in 5.4) ]
Signed-off-by: Jakub Acs <acsjakub@amazon.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
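The integer conversion behind the VM_MERGEABLE flag drop can be reproduced in user space. This is a minimal sketch, not the kernel code: uint64_t stands in for the 64-bit unsigned long of an x86_64 build, a 32-bit int is assumed, and the names FLAG_U32, FLAG_U64, FLAG_INT and the three helpers are hypothetical.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the flag definitions discussed in the commit. */
#define FLAG_U32 0x80000000u          /* unsigned int, like the old VM_MERGEABLE */
#define FLAG_U64 (UINT64_C(1) << 31)  /* 64-bit constant, mimicking BIT(31) */
#define FLAG_INT 0x40000000           /* plain int, like the other VM_* flags */

/* ~ is evaluated at 32 bits (0x7fffffff) and then zero-extended,
 * so the AND also wipes bits 32..63. */
uint64_t clear_u32_flag(uint64_t flags)
{
    return flags & ~FLAG_U32;
}

/* ~ is evaluated at 64 bits (0xffffffff7fffffff): only bit 31 is cleared. */
uint64_t clear_u64_flag(uint64_t flags)
{
    return flags & ~FLAG_U64;
}

/* ~ of a positive int is negative, so the conversion to 64 bits
 * sign-extends and the upper half survives. */
uint64_t clear_int_flag(uint64_t flags)
{
    return flags & ~FLAG_INT;
}
```

Called with all 64 bits set, clear_u32_flag() keeps only the low 31 bits, while the other two helpers clear exactly one bit each.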
amritokun pushed a commit that referenced this pull request on Dec 21, 2025
commit e6965188f84a7883e6a0d3448e86b0cf29b24dfc upstream.
If the allocation of tl_hba->sh fails in tcm_loop_driver_probe() and we attempt to dereference it in tcm_loop_tpg_address_show() we will get a segfault, see below for an example. So, check tl_hba->sh before dereferencing it.
Unable to allocate struct scsi_host
BUG: kernel NULL pointer dereference, address: 0000000000000194
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 8356 Comm: tokio-runtime-w Not tainted 6.6.104.2-4.azl3 #1
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 09/28/2024
RIP: 0010:tcm_loop_tpg_address_show+0x2e/0x50 [tcm_loop]
...
Call Trace:
<TASK>
configfs_read_iter+0x12d/0x1d0 [configfs]
vfs_read+0x1b5/0x300
ksys_read+0x6f/0xf0
...
Cc: stable@vger.kernel.org
Fixes: 2628b35 ("tcm_loop: Show address of tpg in configfs")
Signed-off-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Allen Pais <apais@linux.microsoft.com>
Link: https://patch.msgid.link/1762370746-6304-1-git-send-email-hamzamahfooz@linux.microsoft.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
amritokun pushed a commit that referenced this pull request on Dec 21, 2025
[ Upstream commit dfe28c4167a9259fc0c372d9f9473e1ac95cff67 ]
The validation of the set(nsh(...)) action is completely wrong. It runs through the nsh_key_put_from_nlattr() function that is the same function that validates NSH keys for the flow match and the push_nsh() action. However, the set(nsh(...)) has a very different memory layout. Nested attributes in there are doubled in size in case of the masked set(). That makes proper validation impossible.
There is also confusion in the code between the 'masked' flag, that says that the nested attributes are doubled in size containing both the value and the mask, and the 'is_mask' that says that the value we're parsing is the mask. This is causing kernel crash on trying to write into mask part of the match with SW_FLOW_KEY_PUT() during validation, while validate_nsh() doesn't allocate any memory for it:
BUG: kernel NULL pointer dereference, address: 0000000000000018
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 1c2383067 P4D 1c2383067 PUD 20b703067 PMD 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 8 UID: 0 Kdump: loaded Not tainted 6.17.0-rc4+ #107 PREEMPT(voluntary)
RIP: 0010:nsh_key_put_from_nlattr+0x19d/0x610 [openvswitch]
Call Trace:
<TASK>
validate_nsh+0x60/0x90 [openvswitch]
validate_set.constprop.0+0x270/0x3c0 [openvswitch]
__ovs_nla_copy_actions+0x477/0x860 [openvswitch]
ovs_nla_copy_actions+0x8d/0x100 [openvswitch]
ovs_packet_cmd_execute+0x1cc/0x310 [openvswitch]
genl_family_rcv_msg_doit+0xdb/0x130
genl_family_rcv_msg+0x14b/0x220
genl_rcv_msg+0x47/0xa0
netlink_rcv_skb+0x53/0x100
genl_rcv+0x24/0x40
netlink_unicast+0x280/0x3b0
netlink_sendmsg+0x1f7/0x430
____sys_sendmsg+0x36b/0x3a0
___sys_sendmsg+0x87/0xd0
__sys_sendmsg+0x6d/0xd0
do_syscall_64+0x7b/0x2c0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
The third issue with this process is that while trying to convert the non-masked set into masked one, validate_set() copies and doubles the size of the OVS_KEY_ATTR_NSH as if it didn't have any nested attributes. It should be copying each nested attribute and doubling them in size independently. And the process must be properly reversed during the conversion back from masked to a non-masked variant during the flow dump.
In the end, the only two outcomes of trying to use this action are either validation failure or a kernel crash. And if somehow someone manages to install a flow with such an action, it will most definitely not do what it is supposed to, since all the keys and the masks are mixed up.
Fixing all the issues is a complex task as it requires re-writing most of the validation code.
Given that and the fact that this functionality never worked since introduction, let's just remove it altogether. It's better to re-introduce it later with a proper implementation instead of trying to fix it in stable releases.
Fixes: b2d0f5d ("openvswitch: enable NSH support")
Reported-by: Junvy Yang <zhuque@tencent.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Link: https://patch.msgid.link/20251112112246.95064-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
amritokun pushed a commit that referenced this pull request on Dec 21, 2025
… NULL on error
[ Upstream commit 90a88306eb874fe4bbdd860e6c9787f5bbc588b5 ]
Make knav_dma_open_channel consistently return NULL on error instead of ERR_PTR. Currently the header include/linux/soc/ti/knav_dma.h returns NULL when the driver is disabled, but the driver implementation does not even return NULL or ERR_PTR on failure, causing inconsistency in the users. This results in a crash in netcp_free_navigator_resources as follows (trimmed):
Unhandled fault: alignment exception (0x221) at 0xfffffff2
[fffffff2] *pgd=80000800207003, *pmd=82ffda003, *pte=00000000
Internal error: : 221 [#1] SMP ARM
Modules linked in:
CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.17.0-rc7 #1 NONE
Hardware name: Keystone
PC is at knav_dma_close_channel+0x30/0x19c
LR is at netcp_free_navigator_resources+0x2c/0x28c
[... TRIM...]
Call trace:
knav_dma_close_channel from netcp_free_navigator_resources+0x2c/0x28c
netcp_free_navigator_resources from netcp_ndo_open+0x430/0x46c
netcp_ndo_open from __dev_open+0x114/0x29c
__dev_open from __dev_change_flags+0x190/0x208
__dev_change_flags from netif_change_flags+0x1c/0x58
netif_change_flags from dev_change_flags+0x38/0xa0
dev_change_flags from ip_auto_config+0x2c4/0x11f0
ip_auto_config from do_one_initcall+0x58/0x200
do_one_initcall from kernel_init_freeable+0x1cc/0x238
kernel_init_freeable from kernel_init+0x1c/0x12c
kernel_init from ret_from_fork+0x14/0x38
[... TRIM...]
Standardize the error handling by making the function return NULL on all error conditions. The API is used only in netcp_core.c, so the impact is limited.
Note: this change in effect reverts commit 5b6cb43 ("net: ethernet: ti: netcp_core: return error while dma channel open issue"), but provides a less error-prone implementation.
Suggested-by: Simon Horman <horms@kernel.org>
Suggested-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20251103162811.3730055-1-nm@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
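Why mixing the NULL and ERR_PTR conventions is error-prone can be shown with a small user-space sketch. It assumes the usual kernel encoding (a negative errno stored in the top 4095 pointer values); err_ptr_sketch(), is_err_sketch() and the two checker functions are hypothetical names, not the driver's code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno in a pointer, in the style of ERR_PTR(). */
void *err_ptr_sketch(long err)
{
    return (void *)(intptr_t)err;
}

/* True if the pointer lies in the top MAX_ERRNO_SKETCH addresses. */
bool is_err_sketch(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

/* A caller that only knows the "NULL on error" convention. */
bool null_check_rejects(const void *chan)
{
    return chan == NULL;
}

/* A caller that handles both conventions. */
bool err_or_null_check_rejects(const void *chan)
{
    return chan == NULL || is_err_sketch(chan);
}
```

An error-encoded pointer such as err_ptr_sketch(-EINVAL) passes the plain NULL check and would later be dereferenced; returning NULL on every failure path lets the simple check suffice.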
amritokun pushed a commit that referenced this pull request on Dec 21, 2025
… iterator
With latest `bpftool prog` command, we observed the following kernel
panic.
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD dfe894067 P4D dfe894067 PUD deb663067 PMD 0
Oops: 0010 [#1] SMP
CPU: 9 PID: 6023 ...
RIP: 0010:0x0
Code: Bad RIP value.
RSP: 0000:ffffc900002b8f18 EFLAGS: 00010286
RAX: ffff8883a405f400 RBX: ffff888e46a6bf00 RCX: 000000008020000c
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8883a405f400
RBP: ffff888e46a6bf50 R08: 0000000000000000 R09: ffffffff81129600
R10: ffff8883a405f300 R11: 0000160000000000 R12: 0000000000002710
R13: 000000e9494b690c R14: 0000000000000202 R15: 0000000000000009
FS: 00007fd9187fe700(0000) GS:ffff888e46a40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 0000000de5d33002 CR4: 0000000000360ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
rcu_core+0x1a4/0x440
__do_softirq+0xd3/0x2c8
irq_exit+0x9d/0xa0
smp_apic_timer_interrupt+0x68/0x120
apic_timer_interrupt+0xf/0x20
</IRQ>
RIP: 0033:0x47ce80
Code: Bad RIP value.
RSP: 002b:00007fd9187fba40 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000002 RBX: 00007fd931789160 RCX: 000000000000010c
RDX: 00007fd9308cdfb4 RSI: 00007fd9308cdfb4 RDI: 00007ffedd1ea0a8
RBP: 00007fd9187fbab0 R08: 000000000000000e R09: 000000000000002a
R10: 0000000000480210 R11: 00007fd9187fc570 R12: 00007fd9316cc400
R13: 0000000000000118 R14: 00007fd9308cdfb4 R15: 00007fd9317a9380
After further analysis, the bug is triggered by
Commit eaaacd23910f ("bpf: Add task and task/file iterator targets")
which introduced task_file bpf iterator, which traverses all open file
descriptors for all tasks in the current namespace.
The latest `bpftool prog` calls a task_file bpf program to traverse
all files in the system in order to associate processes with progs/maps, etc.
When traversing the files of a given task, the RCU read lock is taken to
access all files in a files_struct. But the code used get_file() to grab
a file, which is not right: file->f_count may already be 0, and
get_file() increments it unconditionally.
A later fput() may then cause all kinds of issues, the panic above
being one of the symptoms.
The failure can be reproduced with the following steps in a few seconds:
$ cat t.c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#define N 10000
int fd[N];
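int main() {
    int i;
    for (i = 0; i < N; i++) {
        /* 'r' is not a valid open(2) flag; its bit pattern happened to
         * include O_RDWR and O_CREAT, so spell those out explicitly. */
        fd[i] = open("./note.txt", O_RDWR | O_CREAT, 0644);
        if (fd[i] < 0) {
            fprintf(stderr, "failed\n");
            return -1;
        }
    }
    for (i = 0; i < N; i++)
        close(fd[i]);
    return 0;
}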
$ gcc -O2 t.c
$ cat run.sh
#!/bin/bash
for i in {1..100}
do
    while true; do ./a.out; done &
done
$ ./run.sh
$ while true; do bpftool prog >& /dev/null; done
This patch uses get_file_rcu(), which only grabs a file if
file->f_count is not zero. This ensures the file pointer
is always valid. With the patch, the above reproducer ran for more
than 30 minutes without failing.
Fixes: eaaacd23910f ("bpf: Add task and task/file iterator targets")
Suggested-by: Josef Bacik <josef@toxicpanda.com>
Change-Id: Ie35ddc8ad94c60cca0154dc5a3e6e927c2df2673
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/bpf/20200817174214.252601-1-yhs@fb.com
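The difference between the unconditional get_file() and the "only if non-zero" get_file_rcu() described in this commit can be sketched with C11 atomics. This is a user-space illustration, not the kernel implementation; struct file_sketch, get_unconditional() and get_unless_zero() are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object with a reference count. */
struct file_sketch {
    atomic_long f_count;
};

/* Unconditional grab: resurrects an object whose count already hit 0. */
void get_unconditional(struct file_sketch *f)
{
    atomic_fetch_add(&f->f_count, 1);
}

/* Grab a reference only while the count is still non-zero. */
bool get_unless_zero(struct file_sketch *f)
{
    long old = atomic_load(&f->f_count);

    while (old != 0) {
        /* On failure, old is reloaded with the current count. */
        if (atomic_compare_exchange_weak(&f->f_count, &old, old + 1))
            return true;
    }
    return false; /* object is already being torn down */
}
```

A reader racing with the final release either gets a real reference or is told to skip the object; it never bumps a count that has already reached zero.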
amritokun pushed a commit that referenced this pull request on Dec 21, 2025
In preempt kernel, BPF_PROG_TEST_RUN on raw_tp triggers:
[ 35.874974] BUG: using smp_processor_id() in preemptible [00000000] code: new_name/87
[ 35.893983] caller is bpf_prog_test_run_raw_tp+0xd4/0x1b0
[ 35.900124] CPU: 1 PID: 87 Comm: new_name Not tainted 5.9.0-rc6-g615bd02bf #1
[ 35.907358] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 35.916941] Call Trace:
[ 35.919660] dump_stack+0x77/0x9b
[ 35.923273] check_preemption_disabled+0xb4/0xc0
[ 35.928376] bpf_prog_test_run_raw_tp+0xd4/0x1b0
[ 35.933872] ? selinux_bpf+0xd/0x70
[ 35.937532] __do_sys_bpf+0x6bb/0x21e0
[ 35.941570] ? find_held_lock+0x2d/0x90
[ 35.945687] ? vfs_write+0x150/0x220
[ 35.949586] do_syscall_64+0x2d/0x40
[ 35.953443] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fix this by calling migrate_disable() before smp_processor_id().
Fixes: 1b4d60ec162f ("bpf: Enable BPF_PROG_TEST_RUN for raw_tracepoint")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Change-Id: Ic6e489279a6ff446cc133abbf9eab851188a9734
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
amritokun pushed a commit that referenced this pull request on Dec 21, 2025
The 64-bit JEQ/JNE handling in reg_set_min_max() was clearing reg->id in either
the true or the false branch. When the 'if (reg->id)' check was done on the other
branch, the counterpart register had reg->id == 0 by the time
find_equal_scalars() was called. In that case the helper would incorrectly identify other
registers with id == 0 as equivalent and propagate the state incorrectly.
Fix it by preserving the ID across reg_set_min_max().
In other words any kind of comparison operator on the scalar register
should preserve its ID to recognize:
r1 = r2
if (r1 == 20) {
#1 here both r1 and r2 == 20
} else if (r2 < 20) {
#2 here both r1 and r2 < 20
}
The patch is addressing #1 case. The #2 was working correctly already.
Fixes: 75748837b7e5 ("bpf: Propagate scalar ranges through register assignments.")
Change-Id: Id5737c3392daa0d695a768206b3d3e4bda187aaf
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Tested-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20201014175608.1416-1-alexei.starovoitov@gmail.com
amritokun pushed a commit that referenced this pull request on Dec 21, 2025
[ Upstream commit 69ca310f34168eae0ada434796bfc22fb4a0fa26 ]
On some systems, some variant of the following splat is
repeatedly seen. The common factor in all traces seems
to be the entry point to task_file_seq_next(). With the
patch, all warnings go away.
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: \x0926-....: (20992 ticks this GP) idle=d7e/1/0x4000000000000002 softirq=81556231/81556231 fqs=4876
\x09(t=21033 jiffies g=159148529 q=223125)
NMI backtrace for cpu 26
CPU: 26 PID: 2015853 Comm: bpftool Kdump: loaded Not tainted 5.6.13-0_fbk4_3876_gd8d1f9bf80bb #1
Hardware name: Quanta Twin Lakes MP/Twin Lakes Passive MP, BIOS F09_3A12 10/08/2018
Call Trace:
<IRQ>
dump_stack+0x50/0x70
nmi_cpu_backtrace.cold.6+0x13/0x50
? lapic_can_unplug_cpu.cold.30+0x40/0x40
nmi_trigger_cpumask_backtrace+0xba/0xca
rcu_dump_cpu_stacks+0x99/0xc7
rcu_sched_clock_irq.cold.90+0x1b4/0x3aa
? tick_sched_do_timer+0x60/0x60
update_process_times+0x24/0x50
tick_sched_timer+0x37/0x70
__hrtimer_run_queues+0xfe/0x270
hrtimer_interrupt+0xf4/0x210
smp_apic_timer_interrupt+0x5e/0x120
apic_timer_interrupt+0xf/0x20
</IRQ>
RIP: 0010:get_pid_task+0x38/0x80
Code: 89 f6 48 8d 44 f7 08 48 8b 00 48 85 c0 74 2b 48 83 c6 55 48 c1 e6 04 48 29 f0 74 19 48 8d 78 20 ba 01 00 00 00 f0 0f c1 50 20 <85> d2 74 27 78 11 83 c2 01 78 0c 48 83 c4 08 c3 31 c0 48 83 c4 08
RSP: 0018:ffffc9000d293dc8 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
RAX: ffff888637c05600 RBX: ffffc9000d293e0c RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000550 RDI: ffff888637c05620
RBP: ffffffff8284eb80 R08: ffff88831341d300 R09: ffff88822ffd8248
R10: ffff88822ffd82d0 R11: 00000000003a93c0 R12: 0000000000000001
R13: 00000000ffffffff R14: ffff88831341d300 R15: 0000000000000000
? find_ge_pid+0x1b/0x20
task_seq_get_next+0x52/0xc0
task_file_seq_get_next+0x159/0x220
task_file_seq_next+0x4f/0xa0
bpf_seq_read+0x159/0x390
vfs_read+0x8a/0x140
ksys_read+0x59/0xd0
do_syscall_64+0x42/0x110
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f95ae73e76e
Code: Bad RIP value.
RSP: 002b:00007ffc02c1dbf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 000000000170faa0 RCX: 00007f95ae73e76e
RDX: 0000000000001000 RSI: 00007ffc02c1dc30 RDI: 0000000000000007
RBP: 00007ffc02c1ec70 R08: 0000000000000005 R09: 0000000000000006
R10: fffffffffffff20b R11: 0000000000000246 R12: 00000000019112a0
R13: 0000000000000000 R14: 0000000000000007 R15: 00000000004283c0
If unable to obtain the file structure for the current task,
proceed to the next task number after the one returned from
task_seq_get_next(), instead of the next task number from the
original iterator.
Also, save the stopping task number from task_seq_get_next()
on failure in case of restarts.
Fixes: eaaacd23910f ("bpf: Add task and task/file iterator targets")
Change-Id: I38f4c944895cb0497ddef69bbac342bc41250fda
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201218185032.2464558-2-jonathan.lemon@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
amritokun pushed a commit that referenced this pull request on Dec 21, 2025
…_attach_type [ Upstream commit 5e21bb4e812566aef86fbb77c96a4ec0782286e4 ] These two types of XDP progs (BPF_XDP_DEVMAP, BPF_XDP_CPUMAP) will not be executed directly in the driver, therefore we should also not directly run them from here. To run in these two situations, there must be further preparations done, otherwise these may cause a kernel panic. For more details, see also dev_xdp_attach(). [ 46.982479] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 46.984295] #PF: supervisor read access in kernel mode [ 46.985777] #PF: error_code(0x0000) - not-present page [ 46.987227] PGD 800000010dca4067 P4D 800000010dca4067 PUD 10dca6067 PMD 0 [ 46.989201] Oops: 0000 [#1] SMP PTI [ 46.990304] CPU: 7 PID: 562 Comm: a.out Not tainted 5.13.0+ #44 [ 46.992001] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/24 [ 46.995113] RIP: 0010:___bpf_prog_run+0x17b/0x1710 [ 46.996586] Code: 49 03 14 cc e8 76 f6 fe ff e9 ad fe ff ff 0f b6 43 01 48 0f bf 4b 02 48 83 c3 08 89 c2 83 e0 0f c0 ea 04 02 [ 47.001562] RSP: 0018:ffffc900005afc58 EFLAGS: 00010246 [ 47.003115] RAX: 0000000000000000 RBX: ffffc9000023f068 RCX: 0000000000000000 [ 47.005163] RDX: 0000000000000000 RSI: 0000000000000079 RDI: ffffc900005afc98 [ 47.007135] RBP: 0000000000000000 R08: ffffc9000023f048 R09: c0000000ffffdfff [ 47.009171] R10: 0000000000000001 R11: ffffc900005afb40 R12: ffffc900005afc98 [ 47.011172] R13: 0000000000000001 R14: 0000000000000001 R15: ffffffff825258a8 [ 47.013244] FS: 00007f04a5207580(0000) GS:ffff88842fdc0000(0000) knlGS:0000000000000000 [ 47.015705] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 47.017475] CR2: 0000000000000000 CR3: 0000000100182005 CR4: 0000000000770ee0 [ 47.019558] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 47.021595] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 47.023574] PKRU: 55555554 [ 47.024571] Call Trace: [ 47.025424] 
__bpf_prog_run32+0x32/0x50 [ 47.026296] ? printk+0x53/0x6a [ 47.027066] ? ktime_get+0x39/0x90 [ 47.027895] bpf_test_run.cold.28+0x23/0x123 [ 47.028866] ? printk+0x53/0x6a [ 47.029630] bpf_prog_test_run_xdp+0x149/0x1d0 [ 47.030649] __sys_bpf+0x1305/0x23d0 [ 47.031482] __x64_sys_bpf+0x17/0x20 [ 47.032316] do_syscall_64+0x3a/0x80 [ 47.033165] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 47.034254] RIP: 0033:0x7f04a51364dd [ 47.035133] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 48 [ 47.038768] RSP: 002b:00007fff8f9fc518 EFLAGS: 00000213 ORIG_RAX: 0000000000000141 [ 47.040344] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f04a51364dd [ 47.041749] RDX: 0000000000000048 RSI: 0000000020002a80 RDI: 000000000000000a [ 47.043171] RBP: 00007fff8f9fc530 R08: 0000000002049300 R09: 0000000020000100 [ 47.044626] R10: 0000000000000004 R11: 0000000000000213 R12: 0000000000401070 [ 47.046088] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 47.047579] Modules linked in: [ 47.048318] CR2: 0000000000000000 [ 47.049120] ---[ end trace 7ad34443d5be719a ]--- [ 47.050273] RIP: 0010:___bpf_prog_run+0x17b/0x1710 [ 47.051343] Code: 49 03 14 cc e8 76 f6 fe ff e9 ad fe ff ff 0f b6 43 01 48 0f bf 4b 02 48 83 c3 08 89 c2 83 e0 0f c0 ea 04 02 [ 47.054943] RSP: 0018:ffffc900005afc58 EFLAGS: 00010246 [ 47.056068] RAX: 0000000000000000 RBX: ffffc9000023f068 RCX: 0000000000000000 [ 47.057522] RDX: 0000000000000000 RSI: 0000000000000079 RDI: ffffc900005afc98 [ 47.058961] RBP: 0000000000000000 R08: ffffc9000023f048 R09: c0000000ffffdfff [ 47.060390] R10: 0000000000000001 R11: ffffc900005afb40 R12: ffffc900005afc98 [ 47.061803] R13: 0000000000000001 R14: 0000000000000001 R15: ffffffff825258a8 [ 47.063249] FS: 00007f04a5207580(0000) GS:ffff88842fdc0000(0000) knlGS:0000000000000000 [ 47.065070] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 47.066307] CR2: 0000000000000000 CR3: 0000000100182005 CR4: 
0000000000770ee0 [ 47.067747] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 47.069217] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 47.070652] PKRU: 55555554 [ 47.071318] Kernel panic - not syncing: Fatal exception [ 47.072854] Kernel Offset: disabled [ 47.073683] ---[ end Kernel panic - not syncing: Fatal exception ]--- Fixes: 9216477449f3 ("bpf: cpumap: Add the possibility to attach an eBPF program to cpumap") Fixes: fbee97feed9b ("bpf: Add support to attach bpf program to a devmap entry") Reported-by: Abaci <abaci@linux.alibaba.com> Change-Id: I3585e8abeacc017043f6e8a14676121d554c7c6f Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Dust Li <dust.li@linux.alibaba.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: David Ahern <dsahern@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210708080409.73525-1-xuanzhuo@linux.alibaba.com Signed-off-by: Sasha Levin <sashal@kernel.org>
amritokun pushed a commit that referenced this pull request on Dec 21, 2025
[ Upstream commit d6371c76e20d7d3f61b05fd67b596af4d14a8886 ]
We got the following UBSAN report on one of our testing machines:
================================================================================
UBSAN: array-index-out-of-bounds in kernel/bpf/syscall.c:2389:24
index 6 is out of range for type 'char *[6]'
CPU: 43 PID: 930921 Comm: systemd-coredum Tainted: G O 5.10.48-cloudflare-kasan-2021.7.0 #1
Hardware name: <snip>
Call Trace:
dump_stack+0x7d/0xa3
ubsan_epilogue+0x5/0x40
__ubsan_handle_out_of_bounds.cold+0x43/0x48
? seq_printf+0x17d/0x250
bpf_link_show_fdinfo+0x329/0x380
? bpf_map_value_size+0xe0/0xe0
? put_files_struct+0x20/0x2d0
? __kasan_kmalloc.constprop.0+0xc2/0xd0
seq_show+0x3f7/0x540
seq_read_iter+0x3f8/0x1040
seq_read+0x329/0x500
? seq_read_iter+0x1040/0x1040
? __fsnotify_parent+0x80/0x820
? __fsnotify_update_child_dentry_flags+0x380/0x380
vfs_read+0x123/0x460
ksys_read+0xed/0x1c0
? __x64_sys_pwrite64+0x1f0/0x1f0
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9
<snip>
================================================================================
================================================================================
UBSAN: object-size-mismatch in kernel/bpf/syscall.c:2384:2
From the report, we can infer that some array access in bpf_link_show_fdinfo at index 6
is out of bounds. The obvious candidate is bpf_link_type_strs[BPF_LINK_TYPE_XDP] with
BPF_LINK_TYPE_XDP == 6. It turns out that BPF_LINK_TYPE_XDP is missing from bpf_types.h
and therefore doesn't have an entry in bpf_link_type_strs:
pos: 0
flags: 02000000
mnt_id: 13
link_type: (null)
link_id: 4
prog_tag: bcf7977d3b93787c
prog_id: 4
ifindex: 1
Fixes: aa8d3a716b59 ("bpf, xdp: Add bpf_link-based XDP attachment API")
Change-Id: I3778494269186262db490526483faca0ed56207d
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210719085134.43325-2-lmb@cloudflare.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
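The UBSAN report in this commit comes from indexing a string table with an enum value that has no entry. The sketch below shows a defensive lookup for such a table; this is a different technique from the one the report points to (adding the missing entry), and every name in it (link_type_sketch, link_type_strs_sketch, link_type_name) is hypothetical.

```c
#include <stddef.h>

/* Hypothetical enum: LT_XDP has value 6 but no string below. */
enum link_type_sketch {
    LT_UNSPEC,
    LT_RAW_TRACEPOINT,
    LT_TRACING,
    LT_CGROUP,
    LT_ITER,
    LT_NETNS,
    LT_XDP,
};

/* Six entries, so the array type is 'const char *const [6]'. */
static const char *const link_type_strs_sketch[] = {
    [LT_UNSPEC]         = "<invalid>",
    [LT_RAW_TRACEPOINT] = "raw_tracepoint",
    [LT_TRACING]        = "tracing",
    [LT_CGROUP]         = "cgroup",
    [LT_ITER]           = "iter",
    [LT_NETNS]          = "netns",
};

/* Bounds-checked lookup: never reads past the end of the table. */
const char *link_type_name(unsigned int type)
{
    size_t n = sizeof(link_type_strs_sketch) / sizeof(link_type_strs_sketch[0]);

    if (type >= n || link_type_strs_sketch[type] == NULL)
        return "<unknown>";
    return link_type_strs_sketch[type];
}
```

With this guard, a lookup for LT_XDP returns a placeholder string instead of reading index 6 of a six-element array.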
amritokun pushed a commit that referenced this pull request on Dec 21, 2025
…descriptors

commit a657182a5c5150cdfacb6640aad1d2712571a409 upstream.

Hsin-Wei reported a KASAN splat triggered by their BPF runtime fuzzer which is based on a customized syzkaller:

  BUG: KASAN: slab-out-of-bounds in bpf_int_jit_compile+0x1257/0x13f0
  Read of size 8 at addr ffff888004e90b58 by task syz-executor.0/1489
  CPU: 1 PID: 1489 Comm: syz-executor.0 Not tainted 5.19.0 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x9c/0xc9
   print_address_description.constprop.0+0x1f/0x1f0
   ? bpf_int_jit_compile+0x1257/0x13f0
   kasan_report.cold+0xeb/0x197
   ? kvmalloc_node+0x170/0x200
   ? bpf_int_jit_compile+0x1257/0x13f0
   bpf_int_jit_compile+0x1257/0x13f0
   ? arch_prepare_bpf_dispatcher+0xd0/0xd0
   ? rcu_read_lock_sched_held+0x43/0x70
   bpf_prog_select_runtime+0x3e8/0x640
   ? bpf_obj_name_cpy+0x149/0x1b0
   bpf_prog_load+0x102f/0x2220
   ? __bpf_prog_put.constprop.0+0x220/0x220
   ? find_held_lock+0x2c/0x110
   ? __might_fault+0xd6/0x180
   ? lock_downgrade+0x6e0/0x6e0
   ? lock_is_held_type+0xa6/0x120
   ? __might_fault+0x147/0x180
   __sys_bpf+0x137b/0x6070
   ? bpf_perf_link_attach+0x530/0x530
   ? new_sync_read+0x600/0x600
   ? __fget_files+0x255/0x450
   ? lock_downgrade+0x6e0/0x6e0
   ? fput+0x30/0x1a0
   ? ksys_write+0x1a8/0x260
   __x64_sys_bpf+0x7a/0xc0
   ? syscall_enter_from_user_mode+0x21/0x70
   do_syscall_64+0x3b/0x90
   entry_SYSCALL_64_after_hwframe+0x63/0xcd
  RIP: 0033:0x7f917c4e2c2d

The problem here is that a range of tnum_range(0, map->max_entries - 1) has limited ability to represent the concrete tight range with the tnum as the set of resulting states from value + mask can result in a superset of the actual intended range, and as such a tnum_in(range, reg->var_off) check may yield true when it shouldn't, for example tnum_range(0, 2) would result in 00XX -> v = 0000, m = 0011 such that the intended set of {0, 1, 2} is here represented by a less precise superset of {0, 1, 2, 3}.

As the register is known const scalar, really just use the concrete reg->var_off.value for the upper index check.

Fixes: d2e4c1e6c294 ("bpf: Constant map key tracking for prog array pokes")
Reported-by: Hsin-Wei Hung <hsinweih@uci.edu>
Change-Id: I03425b3777961f3ed6e48319d4acaea69e4ec014
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/984b37f9fdf7ac36831d2137415a4a915744c1b6.1661462653.git.daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
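The imprecision described in the commit message above can be reproduced in user space. The following is a minimal sketch, not the kernel code itself: the `*_sketch` functions are simplified stand-ins modeled on the kernel's tnum helpers, and `index_ok_tnum` / `index_ok_concrete` are hypothetical names for the range check before and after the fix.

```c
#include <stdbool.h>
#include <stdint.h>

/* Tristate number: bits set in mask are unknown, all others come from value. */
struct tnum {
    uint64_t value;
    uint64_t mask;
};

/* 1-based position of the highest set bit; 0 when x == 0. */
static unsigned int fls64_sketch(uint64_t x)
{
    unsigned int bits = 0;

    while (x) {
        bits++;
        x >>= 1;
    }
    return bits;
}

/* Smallest tnum covering [min, max]; it may also cover values above max. */
struct tnum tnum_range_sketch(uint64_t min, uint64_t max)
{
    uint64_t chi = min ^ max;
    unsigned int bits = fls64_sketch(chi);
    uint64_t delta;

    if (bits > 63) /* 1ULL << 64 would be undefined */
        return (struct tnum){ 0, UINT64_MAX };
    delta = (1ULL << bits) - 1;
    return (struct tnum){ min & ~delta, delta };
}

/* True when every concrete value described by b is also described by a. */
bool tnum_in_sketch(struct tnum a, struct tnum b)
{
    if (b.mask & ~a.mask)
        return false;
    b.value &= ~a.mask;
    return a.value == b.value;
}

/* Hypothetical pre-fix check: constant key tested against a tnum range. */
bool index_ok_tnum(uint64_t key, uint32_t max_entries)
{
    struct tnum range = tnum_range_sketch(0, (uint64_t)max_entries - 1);
    struct tnum key_tnum = { key, 0 }; /* constant: no unknown bits */

    return tnum_in_sketch(range, key_tnum);
}

/* Hypothetical post-fix check: compare the concrete constant directly. */
bool index_ok_concrete(uint64_t key, uint32_t max_entries)
{
    return key < max_entries;
}
```

With `max_entries = 3`, `tnum_range_sketch(0, 2)` yields value 0 and mask 3, so `index_ok_tnum(3, 3)` accepts the out-of-bounds key 3 while `index_ok_concrete(3, 3)` rejects it.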
amritokun
pushed a commit
that referenced
this pull request
Dec 21, 2025
[ Upstream commit 7d6620f107bae6ed687ff07668e8e8f855487aa9 ]

Syzkaller reported a triggered kernel BUG as follows:

  ------------[ cut here ]------------
  kernel BUG at kernel/bpf/cgroup.c:925!
  invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 1 PID: 194 Comm: detach Not tainted 5.19.0-14184-g69dac8e431af #8
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
  RIP: 0010:__cgroup_bpf_detach+0x1f2/0x2a0
  Code: 00 e8 92 60 30 00 84 c0 75 d8 4c 89 e0 31 f6 85 f6 74 19 42 f6 84 28 48 05 00 00 02 75 0e 48 8b 80 c0 00 00 00 48 85 c0 75 e5 <0f> 0b 48 8b 0c5
  RSP: 0018:ffffc9000055bdb0 EFLAGS: 00000246
  RAX: 0000000000000000 RBX: ffff888100ec0800 RCX: ffffc900000f1000
  RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff888100ec4578
  RBP: 0000000000000000 R08: ffff888100ec0800 R09: 0000000000000040
  R10: 0000000000000000 R11: 0000000000000000 R12: ffff888100ec4000
  R13: 000000000000000d R14: ffffc90000199000 R15: ffff888100effb00
  FS: 00007f68213d2b80(0000) GS:ffff88813bc80000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 000055f74a0e5850 CR3: 0000000102836000 CR4: 00000000000006e0
  Call Trace:
   <TASK>
   cgroup_bpf_prog_detach+0xcc/0x100
   __sys_bpf+0x2273/0x2a00
   __x64_sys_bpf+0x17/0x20
   do_syscall_64+0x3b/0x90
   entry_SYSCALL_64_after_hwframe+0x63/0xcd
  RIP: 0033:0x7f68214dbcb9
  Code: 08 44 89 e0 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff8
  RSP: 002b:00007ffeb487db68 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
  RAX: ffffffffffffffda RBX: 000000000000000b RCX: 00007f68214dbcb9
  RDX: 0000000000000090 RSI: 00007ffeb487db70 RDI: 0000000000000009
  RBP: 0000000000000003 R08: 0000000000000012 R09: 0000000b00000003
  R10: 00007ffeb487db70 R11: 0000000000000246 R12: 00007ffeb487dc20
  R13: 0000000000000004 R14: 0000000000000001 R15: 000055f74a1011b0
   </TASK>
  Modules linked in:
  ---[ end trace 0000000000000000 ]---

Repetition steps:

For the following cgroup tree,

  root
   |
  cg1
   |
  cg2

  1. attach prog2 to cg2, and then attach prog1 to cg1, both bpf progs attach type is NONE or OVERRIDE.
  2. write 1 to /proc/thread-self/fail-nth for failslab.
  3. detach prog1 for cg1, and then kernel BUG occur.

Failslab injection will cause kmalloc fail and fall back to purge_effective_progs. The problem is that cg2 have attached another prog, so when go through cg2 layer, iteration will add pos to 1, and subsequent operations will be skipped by the following condition, and cg will meet NULL in the end.

  `if (pos && !(cg->bpf.flags[atype] & BPF_F_ALLOW_MULTI))`

The NULL cg means no link or prog match, this is as expected, and it's not a bug. So here just skip the no match situation.

Fixes: 4c46091ee985 ("bpf: Fix KASAN use-after-free Read in compute_effective_progs")
Change-Id: Ica98c69bf3d58c46b1692625827485f7312960be
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220813134030.1972696-1-pulehui@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
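The walk described in the commit message above can be modeled in user space. The following is a simplified sketch rather than the kernel's purge_effective_progs: `struct cg_model`, `find_effective_pos`, and `ALLOW_MULTI` are hypothetical stand-ins, and programs are reduced to integer ids. It assumes the fix amounts to treating "walked off the top of the tree" as "no match" instead of a fatal condition.

```c
#include <stddef.h>

#define ALLOW_MULTI 0x2u /* hypothetical stand-in for BPF_F_ALLOW_MULTI */
#define MAX_PROGS   4

/* Hypothetical, heavily reduced model of a cgroup and its attached programs. */
struct cg_model {
    struct cg_model *parent;
    unsigned int flags;
    int progs[MAX_PROGS]; /* ids of programs attached at this level */
    int nr_progs;
};

/*
 * Walk from desc up to the root looking for prog_id, mirroring the
 * position search in the commit message. Returns the position in the
 * effective array as seen from desc, or -1 when the walk ends with
 * cg == NULL, meaning this descendant has no matching entry to purge.
 */
int find_effective_pos(struct cg_model *desc, int prog_id)
{
    struct cg_model *cg;
    int pos = 0;

    for (cg = desc; cg; cg = cg->parent) {
        /* Once a lower level contributed a program, non-multi
         * ancestors are skipped entirely. */
        if (pos && !(cg->flags & ALLOW_MULTI))
            continue;

        for (int i = 0; i < cg->nr_progs; i++) {
            if (cg->progs[i] == prog_id)
                return pos;
            pos++;
        }
    }

    /* No match: the caller should skip this descendant, not BUG. */
    return -1;
}
```

For the tree in the reproduction steps (prog1 on cg1, prog2 on cg2, neither level using the multi flag), searching for prog1 from cg1 returns position 0, while searching from cg2 returns -1 because prog2 bumps `pos` to 1 and every ancestor is then skipped.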
After initial testing, everything runs well, including KSU modules such as SUSFS. More testing on your side would be highly appreciated.
Logs observed during build (including options selected):