PlaidCat

General Process:

Notes

There is no Jira ticket for this yet, as we are still working on early automation.

Checking Rebuild Commits for potentially Missing Commits:

resf_kernel-5.14.0-570.12.1.el9_6

commit 89b96ad56b0ec7c6028526a3c7ec74b80c22fe8e (tag: resf_kernel-5.14.0-570.12.1.el9_6, rocky9_6_rebuild_kernel-5.14.0-570.12.1.el9_6)
Author: Jonathan Maple <jmaple@ciq.com>
Date:   Tue May 20 13:12:43 2025 -0400

    Rebuild rocky9_6 with kernel-5.14.0-570.12.1.el9_6

    Rebuild_History BUILDABLE
    Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
    Number of commits in upstream range v5.14~1..kernel-mainline: 296506
    Number of commits in rpm: 1901
    Number of commits matched with upstream: 43 (2.26%)
    Number of commits in upstream but not in rpm: 296463
    Number of commits NOT found in upstream: 1858 (97.74%)

    Rebuilding Kernel on Branch rocky9_6_rebuild_kernel-5.14.0-570.12.1.el9_6 for kernel-5.14.0-570.12.1.el9_6
    Clean Cherry Picks: 38 (88.37%)
    Empty Cherry Picks: 2 (4.65%)
    _______________________________

    Full Details Located here:
    ciq/ciq_backports/kernel-5.14.0-570.12.1.el9_6/rebuild.details.txt

    Includes:
    * git commit header above
    * Empty Commits with upstream SHA
    * RPM ChangeLog Entries that could not be matched

    Individual Empty Commit failures contained in the same containing directory.
    The git message for empty commits will have the path for the failed commit.
    File names are the first 8 characters of the upstream SHA

The first tag reports a very large "Number of commits NOT found in upstream: 1858 (97.74%)". Of those, 1840 are entries for the kabi list (this may be a result of kernel-ark style updates), which leaves the 18 changes listed under CHANGES NOT IN UPSTREAM below:

[jmaple@devbox kernel-src-tree]$ cat ciq/ciq_backports/kernel-5.14.0-570.12.1.el9_6/rebuild.details.txt | grep kabi | wc -l
1840
[jmaple@devbox kernel-src-tree]$ cat ciq/ciq_backports/kernel-5.14.0-570.12.1.el9_6/rebuild.details.txt | grep -v kabi
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v5.14~1..kernel-mainline: 296506
Number of commits in rpm: 1901
Number of commits matched with upstream: 43 (2.26%)
Number of commits in upstream but not in rpm: 296463
Number of commits NOT found in upstream: 1858 (97.74%)

Rebuilding Kernel on Branch rocky9_6_rebuild_kernel-5.14.0-570.12.1.el9_6 for kernel-5.14.0-570.12.1.el9_6
Clean Cherry Picks: 38 (88.37%)
Empty Cherry Picks: 2 (4.65%)
_______________________________

__EMPTY COMMITS__________________________
f10593ad9bc36921f623361c9e3dd96bd52d85ee scsi: sg: Fix slab-use-after-free read in sg_release()
c79a39dc8d060b9e64e8b0fa9d245d44befeefbe pps: Fix a use-after-free

__CHANGES NOT IN UPSTREAM________________
Porting to Rocky Linux 9, debranding and Rocky branding'
Ensure aarch64 kernel is not compressed'
redhat: rebuild for prep-kerberos brew fix [RHEL-86037]'
redhat: drop Y issues from changelog
configs: enable FW_CACHE on centos-stream/rhel 9 for nouveau
configs: enable IVPU driver on RHEL
accel: add build system changes
accel: backport ivpu driver from v6.12
rhel_files: ensure all qdiscs are in modules-core
Revert "mm: add vma_has_recency()"
Revert "mm: support POSIX_FADV_NOREUSE"
CVE-2025-1272: security: Re-enable lockdown LSM in some setup_arch()
redhat/self-test: Remove --all from git query
redhat: fix selftest git command so it picks the right commit
gitlab-ci: add jobs for rhel9 automotive pipelines
gitlab-ci: clean up trigger job naming and template inheritance
redhat: change DIST to .el9_6
redhat: change to zstream version numbering for 9.6
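
The same split can be done in one pass. This is a minimal Python sketch that assumes only what the `grep` commands above assume: any line containing the substring `kabi` counts as a kabi entry. The function name and the two sample kabi lines are hypothetical, since the real kabi entries are not shown here.

```python
def split_kabi(lines):
    """Separate lines mentioning 'kabi' from the rest, like grep / grep -v."""
    kabi, other = [], []
    for line in lines:
        # Case-sensitive substring match, the same as plain `grep kabi`.
        (kabi if "kabi" in line else other).append(line)
    return kabi, other


# Hypothetical sample: two made-up kabi lines plus two real entries from above.
sample = [
    "redhat/kabi: example entry one (made up for illustration)",
    "redhat/kabi: example entry two (made up for illustration)",
    "redhat: change DIST to .el9_6",
    "accel: backport ivpu driver from v6.12",
]
kabi, other = split_kabi(sample)
print(len(kabi), "kabi lines;", len(other), "other lines")
```

On the real file this would be called as `split_kabi(open(path).read().splitlines())`, with `path` pointing at the `rebuild.details.txt` for the tag.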

resf_kernel-5.14.0-570.16.1.el9_6

commit 4a380692f7475b8e35e5d1be62f78bdeb7e94890 (tag: resf_kernel-5.14.0-570.16.1.el9_6, rocky9_6_rebuild_kernel-5.14.0-570.16.1.el9_6)
Author: Jonathan Maple <jmaple@ciq.com>
Date:   Tue May 20 13:13:37 2025 -0400

    Rebuild rocky9_6 with kernel-5.14.0-570.16.1.el9_6

    Rebuild_History BUILDABLE
    Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
    Number of commits in upstream range v5.14~1..kernel-mainline: 296506
    Number of commits in rpm: 63
    Number of commits matched with upstream: 60 (95.24%)
    Number of commits in upstream but not in rpm: 296446
    Number of commits NOT found in upstream: 3 (4.76%)

    Rebuilding Kernel on Branch rocky9_6_rebuild_kernel-5.14.0-570.16.1.el9_6 for kernel-5.14.0-570.16.1.el9_6
    Clean Cherry Picks: 51 (85.00%)
    Empty Cherry Picks: 9 (15.00%)
    _______________________________

    Full Details Located here:
    ciq/ciq_backports/kernel-5.14.0-570.16.1.el9_6/rebuild.details.txt

    Includes:
    * git commit header above
    * Empty Commits with upstream SHA
    * RPM ChangeLog Entries that could not be matched

    Individual Empty Commit failures contained in the same containing directory.
    The git message for empty commits will have the path for the failed commit.
    File names are the first 8 characters of the upstream SHA
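
The percentages in a rebuild summary can be cross-checked from the raw counts. The sketch below is an illustration only: the dictionary keys and function names are hypothetical, and the denominators (rpm commits for the matched and not-found lines, matched commits for the cherry-pick lines) are inferred from the printed numbers rather than taken from the rebuild tooling.

```python
import re

# Labels copied from the summary text; the dictionary keys are hypothetical names.
PATTERNS = {
    "rpm": r"Number of commits in rpm: (\d+)",
    "matched": r"Number of commits matched with upstream: (\d+)",
    "not_found": r"Number of commits NOT found in upstream: (\d+)",
    "clean": r"Clean Cherry Picks: (\d+)",
    "empty": r"Empty Cherry Picks: (\d+)",
}


def parse_summary(text):
    """Extract the integer counts from one rebuild summary block."""
    return {key: int(re.search(pat, text).group(1)) for key, pat in PATTERNS.items()}


def pct(part, whole):
    """Format part/whole as a percentage with two decimals."""
    return f"{100 * part / whole:.2f}%"


def recompute(counts):
    """Recompute the printed percentages (denominators are inferred)."""
    return {
        "matched": pct(counts["matched"], counts["rpm"]),
        "not_found": pct(counts["not_found"], counts["rpm"]),
        "clean": pct(counts["clean"], counts["matched"]),
        "empty": pct(counts["empty"], counts["matched"]),
    }


# Counts taken from the kernel-5.14.0-570.16.1.el9_6 summary above.
summary = """
Number of commits in rpm: 63
Number of commits matched with upstream: 60 (95.24%)
Number of commits NOT found in upstream: 3 (4.76%)
Clean Cherry Picks: 51 (85.00%)
Empty Cherry Picks: 9 (15.00%)
"""
print(recompute(parse_summary(summary)))
```

Comparing the recomputed strings against the printed ones is a cheap sanity check before digging into individual commits.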

resf_kernel-5.14.0-570.17.1.el9_6

commit 9e0e88ac545cc6d2db3400792e63b1c7cf39c7e7 (HEAD -> rocky9_6_rebuild, tag: resf_kernel-5.14.0-570.17.1.el9_6, origin/rocky9_6_rebuild, rocky9_6_rebuild_kernel-5.14.0-570.17.1.el9_6)
Author: Jonathan Maple <jmaple@ciq.com>
Date:   Tue May 20 13:14:20 2025 -0400

    Rebuild rocky9_6 with kernel-5.14.0-570.17.1.el9_6

    Rebuild_History BUILDABLE
    Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
    Number of commits in upstream range v5.14~1..kernel-mainline: 296506
    Number of commits in rpm: 40
    Number of commits matched with upstream: 37 (92.50%)
    Number of commits in upstream but not in rpm: 296469
    Number of commits NOT found in upstream: 3 (7.50%)

    Rebuilding Kernel on Branch rocky9_6_rebuild_kernel-5.14.0-570.17.1.el9_6 for kernel-5.14.0-570.17.1.el9_6
    Clean Cherry Picks: 35 (94.59%)
    Empty Cherry Picks: 2 (5.41%)
    _______________________________

    Full Details Located here:
    ciq/ciq_backports/kernel-5.14.0-570.17.1.el9_6/rebuild.details.txt

    Includes:
    * git commit header above
    * Empty Commits with upstream SHA
    * RPM ChangeLog Entries that could not be matched

    Individual Empty Commit failures contained in the same containing directory.
    The git message for empty commits will have the path for the failed commit.
    File names are the first 8 characters of the upstream SHA
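
Given the naming rule above, the location of a failed cherry-pick record can be derived from the kernel version and the full upstream SHA. The helper below is a hypothetical sketch, not part of the rebuild tooling; the directory layout and the `.failed` suffix follow the paths quoted in the empty-commit messages of this rebuild.

```python
from pathlib import PurePosixPath


def failed_pick_path(kernel_version, upstream_sha):
    """Build the path of the .failed record for an empty cherry-pick.

    The file name is the first 8 characters of the upstream SHA.
    """
    directory = PurePosixPath("ciq/ciq_backports") / f"kernel-{kernel_version}"
    return directory / f"{upstream_sha[:8]}.failed"


# SHA taken from the EMPTY COMMITS list of kernel-5.14.0-570.12.1.el9_6.
path = failed_pick_path(
    "5.14.0-570.12.1.el9_6",
    "f10593ad9bc36921f623361c9e3dd96bd52d85ee",
)
print(path)
```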

KselfTest

This is the first build of Rocky 9.6, and some of the kernel selftests have changed. Specifically, pidfd now hangs and never exits, which may need to be investigated. lkdtm still does not work; see https://github.com/ctrliq/kernel-src-tree/tree/main/kselftests/lkdtm for details. Both are skipped in the run below:

[maple@rocky9-rebuild kernel-src-tree]$ time make -C tools/testing/selftests SKIP_TARGETS="lkdtm pidfd" run_tests | tee ../kselftest.$(uname -r).log

real    19m53.835s
user    12m56.723s

[maple@rocky9-rebuild kernel-src-tree]$ grep '^ok ' ../kselftest.5.14.0-rocky9_6_rebuild-9e0e88ac545c.log | wc -l
317
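
The `grep '^ok '` count only shows passing lines. A slightly fuller tally can be sketched in Python, assuming the log uses TAP-style result lines (`ok N ...` and `not ok N ...`) and that skipped tests appear as `ok` lines carrying a `# SKIP` marker. The function name and sample lines are hypothetical and are not taken from the actual log.

```python
def tally(lines):
    """Count TAP-style result lines: passes, failures, and skipped passes."""
    counts = {"ok": 0, "not ok": 0, "skip": 0}
    for line in lines:
        if line.startswith("not ok "):
            counts["not ok"] += 1
        elif line.startswith("ok "):
            # Matches the same lines as grep '^ok '.
            counts["ok"] += 1
            if "# SKIP" in line:
                counts["skip"] += 1
    return counts


# Made-up sample lines for illustration only.
sample_log = [
    "ok 1 selftests: example: test_a",
    "not ok 2 selftests: example: test_b # exit=1",
    "ok 3 selftests: example: test_c # SKIP",
    "# some diagnostic output",
]
print(tally(sample_log))
```

Under these assumptions the `ok` total corresponds to the `grep` count above, and the `skip` figure shows how many of those lines were skips rather than real passes.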

PlaidCat added 30 commits May 21, 2025 16:45
We track the config inside the kernel tree so that there are not
multiple places to search for the correct config values.
jira NONE_AUTOMATION
cve CVE-2024-56631
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Suraj Sonawane <surajsonawane0215@gmail.com>
commit f10593a
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.12.1.el9_6/f10593ad.failed

Fix a use-after-free bug in sg_release(), detected by syzbot with KASAN:

BUG: KASAN: slab-use-after-free in lock_release+0x151/0xa30
kernel/locking/lockdep.c:5838
__mutex_unlock_slowpath+0xe2/0x750 kernel/locking/mutex.c:912
sg_release+0x1f4/0x2e0 drivers/scsi/sg.c:407

In sg_release(), the function kref_put(&sfp->f_ref, sg_remove_sfp) is
called before releasing the open_rel_lock mutex. The kref_put() call may
decrement the reference count of sfp to zero, triggering its cleanup
through sg_remove_sfp(). This cleanup includes scheduling deferred work
via sg_remove_sfp_usercontext(), which ultimately frees sfp.

After kref_put(), sg_release() continues to unlock open_rel_lock and may
reference sfp or sdp. If sfp has already been freed, this results in a
slab-use-after-free error.

Move the kref_put(&sfp->f_ref, sg_remove_sfp) call after unlocking the
open_rel_lock mutex. This ensures:

 - No references to sfp or sdp occur after the reference count is
   decremented.

 - Cleanup functions such as sg_remove_sfp() and
   sg_remove_sfp_usercontext() can safely execute without impacting the
   mutex handling in sg_release().

The fix has been tested and validated by syzbot. This patch closes the
bug reported at the following syzkaller link and ensures proper
sequencing of resource cleanup and mutex operations, eliminating the
risk of use-after-free errors in sg_release().

	Reported-by: syzbot+7efb5850a17ba6ce098b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7efb5850a17ba6ce098b
	Tested-by: syzbot+7efb5850a17ba6ce098b@syzkaller.appspotmail.com
Fixes: cc833ac ("sg: O_EXCL and other lock handling")
	Signed-off-by: Suraj Sonawane <surajsonawane0215@gmail.com>
Link: https://lore.kernel.org/r/20241120125944.88095-1-surajsonawane0215@gmail.com
	Reviewed-by: Bart Van Assche <bvanassche@acm.org>
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit f10593a)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	drivers/scsi/sg.c
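
Each rebuilt commit message in this series starts with a short block of tag lines (`jira`, `cve`, `Rebuild_History`, `commit-author`, `commit`, and optionally `Empty-Commit:`) before the first blank line. The sketch below shows one way to read that block. The key list is taken only from the messages in this PR, the function name is hypothetical, and treating untagged lines as continuations of the previous tag is an assumption.

```python
KNOWN_KEYS = ("jira", "cve", "Rebuild_History", "commit-author", "commit", "Empty-Commit:")


def parse_header(message):
    """Read the tag block at the top of a rebuilt commit message."""
    fields = {}
    last = None
    for line in message.splitlines():
        if not line.strip():
            break  # the tag block ends at the first blank line
        key, _, value = line.partition(" ")
        if key in KNOWN_KEYS:
            last = key.rstrip(":")
            fields[last] = value
        elif last is not None:
            # Assumed: untagged lines continue the previous tag's value.
            fields[last] += " " + line.strip()
    return fields


# Header copied from the sg_release() commit message above.
message = """jira NONE_AUTOMATION
cve CVE-2024-56631
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Suraj Sonawane <surajsonawane0215@gmail.com>
commit f10593a
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.12.1.el9_6/f10593ad.failed

Fix a use-after-free bug in sg_release(), detected by syzbot with KASAN:
"""
header = parse_header(message)
print(header["cve"], header["commit"])
```

Checking for the `Empty-Commit` key is then enough to list the commits whose cherry-pick conflicted and need manual review.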
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Kai Mäkisara <Kai.Makisara@kolumbus.fi>
commit 98b3788

Commit 9604eea ("scsi: st: Add third party poweron reset handling") in
v6.6 added new code to handle the Power On/Reset Unit Attention (POR UA)
sense data. This was in addition to the existing method. When this Unit
Attention is received, the driver blocks attempts to read, write and some
other operations because the reset may have rewinded the tape. Because of
the added code, also the initial POR UA resulted in blocking operations,
including those that are used to set the driver options after the device is
recognized. Also, reading and writing are refused, whereas they succeeded
before this commit.

Add code to not set pos_unknown to block operations if the POR UA is
received from the first test_ready() call after the st device has been
created. This restores the behavior before v6.6.

	Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi>
Link: https://lore.kernel.org/r/20241216113755.30415-1-Kai.Makisara@kolumbus.fi
Fixes: 9604eea ("scsi: st: Add third party poweron reset handling")
CC: stable@vger.kernel.org
Closes: https://lore.kernel.org/linux-scsi/2201CF73-4795-4D3B-9A79-6EE5215CF58D@kolumbus.fi/
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 98b3788)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit 52c11d3

On initial driver load, alloc_etherdev_mqs is called with whatever max
queue values are provided by the control plane. However, if the driver
is loaded on a system where num_online_cpus() returns less than the max
queues, the netdev will think there are more queues than are actually
available. Only num_online_cpus() will be allocated, but
skb_get_queue_mapping(skb) could possibly return an index beyond the
range of allocated queues. Consequently, the packet is silently dropped
and it appears as if TX is broken.

Set the real number of queues during open so the netdev knows how many
queues will be allocated.

Fixes: 1c325aa ("idpf: configure resources for TX queues")
	Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
	Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
	Tested-by: Samuel Salin <Samuel.salin@intel.com>
	Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
(cherry picked from commit 52c11d3)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
cve CVE-2023-52922
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author YueHaibing <yuehaibing@huawei.com>
commit 55c3b96

BUG: KASAN: slab-use-after-free in bcm_proc_show+0x969/0xa80
Read of size 8 at addr ffff888155846230 by task cat/7862

CPU: 1 PID: 7862 Comm: cat Not tainted 6.5.0-rc1-00153-gc8746099c197 #230
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0xd5/0x150
 print_report+0xc1/0x5e0
 kasan_report+0xba/0xf0
 bcm_proc_show+0x969/0xa80
 seq_read_iter+0x4f6/0x1260
 seq_read+0x165/0x210
 proc_reg_read+0x227/0x300
 vfs_read+0x1d5/0x8d0
 ksys_read+0x11e/0x240
 do_syscall_64+0x35/0xb0
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Allocated by task 7846:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 __kasan_kmalloc+0x9e/0xa0
 bcm_sendmsg+0x264b/0x44e0
 sock_sendmsg+0xda/0x180
 ____sys_sendmsg+0x735/0x920
 ___sys_sendmsg+0x11d/0x1b0
 __sys_sendmsg+0xfa/0x1d0
 do_syscall_64+0x35/0xb0
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 7846:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 kasan_save_free_info+0x27/0x40
 ____kasan_slab_free+0x161/0x1c0
 slab_free_freelist_hook+0x119/0x220
 __kmem_cache_free+0xb4/0x2e0
 rcu_core+0x809/0x1bd0

bcm_op is freed before procfs entry be removed in bcm_release(),
this lead to bcm_proc_show() may read the freed bcm_op.

Fixes: ffd980f ("[CAN]: Add broadcast manager (bcm) protocol")
	Signed-off-by: YueHaibing <yuehaibing@huawei.com>
	Reviewed-by: Oliver Hartkopp <socketcan@hartkopp.net>
	Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://lore.kernel.org/all/20230715092543.15548-1-yuehaibing@huawei.com
	Cc: stable@vger.kernel.org
	Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
(cherry picked from commit 55c3b96)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Calvin Owens <calvin@wbinvd.org>
commit c79a39d
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.12.1.el9_6/c79a39dc.failed

On a board running ntpd and gpsd, I'm seeing a consistent use-after-free
in sys_exit() from gpsd when rebooting:

    pps pps1: removed
    ------------[ cut here ]------------
    kobject: '(null)' (00000000db4bec24): is not initialized, yet kobject_put() is being called.
    WARNING: CPU: 2 PID: 440 at lib/kobject.c:734 kobject_put+0x120/0x150
    CPU: 2 UID: 299 PID: 440 Comm: gpsd Not tainted 6.11.0-rc6-00308-gb31c44928842 #1
    Hardware name: Raspberry Pi 4 Model B Rev 1.1 (DT)
    pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    pc : kobject_put+0x120/0x150
    lr : kobject_put+0x120/0x150
    sp : ffffffc0803d3ae0
    x29: ffffffc0803d3ae0 x28: ffffff8042dc9738 x27: 0000000000000001
    x26: 0000000000000000 x25: ffffff8042dc9040 x24: ffffff8042dc9440
    x23: ffffff80402a4620 x22: ffffff8042ef4bd0 x21: ffffff80405cb600
    x20: 000000000008001b x19: ffffff8040b3b6e0 x18: 0000000000000000
    x17: 0000000000000000 x16: 0000000000000000 x15: 696e6920746f6e20
    x14: 7369203a29343263 x13: 205d303434542020 x12: 0000000000000000
    x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
    x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
    x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
    x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
    Call trace:
     kobject_put+0x120/0x150
     cdev_put+0x20/0x3c
     __fput+0x2c4/0x2d8
     ____fput+0x1c/0x38
     task_work_run+0x70/0xfc
     do_exit+0x2a0/0x924
     do_group_exit+0x34/0x90
     get_signal+0x7fc/0x8c0
     do_signal+0x128/0x13b4
     do_notify_resume+0xdc/0x160
     el0_svc+0xd4/0xf8
     el0t_64_sync_handler+0x140/0x14c
     el0t_64_sync+0x190/0x194
    ---[ end trace 0000000000000000 ]---

...followed by more symptoms of corruption, with similar stacks:

    refcount_t: underflow; use-after-free.
    kernel BUG at lib/list_debug.c:62!
    Kernel panic - not syncing: Oops - BUG: Fatal exception

This happens because pps_device_destruct() frees the pps_device with the
embedded cdev immediately after calling cdev_del(), but, as the comment
above cdev_del() notes, fops for previously opened cdevs are still
callable even after cdev_del() returns. I think this bug has always
been there: I can't explain why it suddenly started happening every time
I reboot this particular board.

In commit d953e0e ("pps: Fix a use-after free bug when
unregistering a source."), George Spelvin suggested removing the
embedded cdev. That seems like the simplest way to fix this, so I've
implemented his suggestion, using __register_chrdev() with pps_idr
becoming the source of truth for which minor corresponds to which
device.

But now that pps_idr defines userspace visibility instead of cdev_add(),
we need to be sure the pps->dev refcount can't reach zero while
userspace can still find it again. So, the idr_remove() call moves to
pps_unregister_cdev(), and pps_idr now holds a reference to pps->dev.

    pps_core: source serial1 got cdev (251:1)
    <...>
    pps pps1: removed
    pps_core: unregistering pps1
    pps_core: deallocating pps1

Fixes: d953e0e ("pps: Fix a use-after free bug when unregistering a source.")
	Cc: stable@vger.kernel.org
	Signed-off-by: Calvin Owens <calvin@wbinvd.org>
	Reviewed-by: Michal Schmidt <mschmidt@redhat.com>
Link: https://lore.kernel.org/r/a17975fd5ae99385791929e563f72564edbcf28f.1731383727.git.calvin@wbinvd.org
	Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit c79a39d)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	drivers/ptp/ptp_ocp.c
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Anumula Murali Mohan Reddy <anumula@chelsio.com>
commit 356983f

t4_set_vf_mac_acl() uses pf to set mac addr, but t4vf_get_vf_mac_acl()
uses port number to get mac addr, this leads to error when an attempt
to set MAC address on VF's of PF2 and PF3.
This patch fixes the issue by using port number to set mac address.

Fixes: e0cdac6 ("cxgb4vf: configure ports accessible by the VF")
	Signed-off-by: Anumula Murali Mohan Reddy <anumula@chelsio.com>
	Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
	Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241206062014.49414-1-anumula@chelsio.com
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 356983f)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Sourabh Jain <sourabhjain@linux.ibm.com>
commit 0bdd7ff

Commit 683eab9 ("powerpc/fadump: setup additional parameters for
dump capture kernel") introduced the additional parameter feature in
fadump for HASH MMU with the understanding that GRUB does not use the
memory area between 640MB and 768MB for its operation.

However, the third patch ("powerpc: increase MIN RMA size for CAS
negotiation") in this series is changing the MIN RMA size to 768MB,
allowing GRUB to use memory up to 768MB. This makes the fadump
reservation for the additional parameter feature for HASH MMU
unreliable.

To address this, export the MIN_RMA so that the next patch
("powerpc/fadump: fix additional param memory reservation for HASH MMU")
can identify the correct memory range for the additional parameter
feature in fadump for HASH MMU.

	Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
	Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
	Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250123114254.200527-2-sourabhjain@linux.ibm.com

(cherry picked from commit 0bdd7ff)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Sourabh Jain <sourabhjain@linux.ibm.com>
commit b7bb460

Commit 683eab9 ("powerpc/fadump: setup additional parameters for
dump capture kernel") introduced the additional parameter feature in
fadump for HASH MMU with the understanding that GRUB does not use the
memory area between 640MB and 768MB for its operation.

However, the third patch in this series ("powerpc: increase MIN RMA
size for CAS negotiation") changes the MIN RMA size to 768MB, allowing
GRUB to use memory up to 768MB. This makes the fadump reservation for
the additional parameter feature for HASH MMU unreliable.

To address this, adjust the memory range for the additional parameter in
fadump for HASH MMU. This will ensure that GRUB does not overwrite the
memory reserved for fadump's additional parameter in HASH MMU.

The new policy for the memory range for the additional parameter in HASH
MMU is that the first memory block must be larger than the MIN_RMA size,
as the bootloader can use memory up to the MIN_RMA size. The range
should be between MIN_RMA and the RMA size (ppc64_rma_size), and it must
not overlap with the fadump reserved area.

	Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
	Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
	Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
	Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250123114254.200527-3-sourabhjain@linux.ibm.com

(cherry picked from commit b7bb460)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Avnish Chouhan <avnish@linux.ibm.com>
commit fdc4453

Change RMA size from 512 MB to 768 MB which will result
in more RMA at boot time for PowerPC. When PowerPC LPAR use/uses vTPM,
Secure Boot or FADump, the 512 MB RMA memory is not sufficient for
booting. With this 512 MB RMA, GRUB2 run out of memory and unable to
load the necessary. Sometimes even usage of CDROM which requires more
memory for installation along with the options mentioned above troubles
the boot memory and result in boot failures. Increasing the RMA size
will resolves multiple out of memory issues observed in PowerPC.

Failure details:

1. GRUB2

kern/ieee1275/init.c:550: mm requested region of size 8513000, flags 1
kern/ieee1275/init.c:563: Cannot satisfy allocation and retain minimum runtime
space
kern/ieee1275/init.c:550: mm requested region of size 8513000, flags 0
kern/ieee1275/init.c:563: Cannot satisfy allocation and retain minimum runtime
space
kern/file.c:215: Closing `/ppc/ppc64/initrd.img' ...
kern/disk.c:297: Closing
`ieee1275//vdevice/v-scsi
@30000067/disk8300000000000000'...
kern/disk.c:311: Closing
`ieee1275//vdevice/v-scsi
@30000067/disk8300000000000000' succeeded.
kern/file.c:225: Closing `/ppc/ppc64/initrd.img' failed with 3.
kern/file.c:148: Opening `/ppc/ppc64/initrd.img' succeeded.
error: ../../grub-core/kern/mm.c:552:out of memory.

2. Kernel

[    0.777633] List of all partitions:
[    0.777639] No filesystem could mount root, tried:
[    0.777640]
[    0.777649] Kernel panic - not syncing: VFS: Unable to mount root fs on "" or unknown-block(0,0)
[    0.777658] CPU: 17 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-0.rc4.20.el10.ppc64le #1
[    0.777669] Hardware name: IBM,9009-22A POWER9 (architected) 0x4e0202 0xf000005 of:IBM,FW950.B0 (VL950_149) hv:phyp pSeries
[    0.777678] Call Trace:
[    0.777682] [c000000003db7b60] [c000000001119714] dump_stack_lvl+0x88/0xc4 (unreliable)
[    0.777700] [c000000003db7b90] [c00000000016c274] panic+0x174/0x460
[    0.777711] [c000000003db7c30] [c00000000200631c] mount_root_generic+0x320/0x354
[    0.777724] [c000000003db7d00] [c0000000020066f8] prepare_namespace+0x27c/0x2f4
[    0.777735] [c000000003db7d90] [c000000002005824] kernel_init_freeable+0x254/0x294
[    0.777747] [c000000003db7df0] [c00000000001131c] kernel_init+0x30/0x1c4
[    0.777757] [c000000003db7e50] [c00000000000debc] ret_from_kernel_user_thread+0x14/0x1c
[    0.777768] --- interrupt: 0 at 0x0
[    0.784238] pstore: backend (nvram) writing error (-1)
[    0.790447] Rebooting in 10 seconds..

	Signed-off-by: Avnish Chouhan <avnish@linux.ibm.com>
	Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250123114254.200527-4-sourabhjain@linux.ibm.com

(cherry picked from commit fdc4453)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Sourabh Jain <sourabhjain@linux.ibm.com>
commit 61c403b

Update the fadump document to include details about the fadump
additional parameter feature.

The document includes the following:
- Significance of the feature
- How to use it
- Feature restrictions

No functional changes are introduced.

	Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
	Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
	Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250123114254.200527-5-sourabhjain@linux.ibm.com

(cherry picked from commit 61c403b)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Yang Shi <yang@os.amperecomputing.com>
commit 56a7087

Commit ba0fb44 ("dma-mapping: replace zone_dma_bits by
zone_dma_limit") and subsequent patches changed how zone_dma_limit is
calculated to allow a reduced ZONE_DMA even when RAM starts above 4GB.
Commit 122c234 ("arm64: mm: keep low RAM dma zone") further fixed
this to ensure ZONE_DMA remains below U32_MAX if RAM starts below 4GB,
especially on platforms that do not have IORT or DT description of the
device DMA ranges. While zone boundaries calculation was fixed by the
latter commit, zone_dma_limit, used to determine the GFP_DMA flag in the
core code, was not updated. This results in excessive use of GFP_DMA and
unnecessary ZONE_DMA allocations on some platforms.

Update zone_dma_limit to match the actual upper bound of ZONE_DMA.

Fixes: ba0fb44 ("dma-mapping: replace zone_dma_bits by zone_dma_limit")
	Cc: <stable@vger.kernel.org> # 6.12.x
	Reported-by: Yutang Jiang <jiangyutang@os.amperecomputing.com>
	Tested-by: Yutang Jiang <jiangyutang@os.amperecomputing.com>
	Signed-off-by: Yang Shi <yang@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20241125171650.77424-1-yang@os.amperecomputing.com
[catalin.marinas@arm.com: some tweaking of the commit log]
	Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 56a7087)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Yishai Hadas <yishaih@nvidia.com>
commit abc7b3f

Memory regions (MR) of type DM (device memory) do not have an associated
umem.

In the __mlx5_ib_dereg_mr() -> mlx5_free_priv_descs() flow, the code
incorrectly takes the wrong branch, attempting to call
dma_unmap_single() on a DMA address that is not mapped.

This results in a WARN [1], as shown below.

The issue is resolved by properly accounting for the DM type and
ensuring the correct branch is selected in mlx5_free_priv_descs().

[1]
WARNING: CPU: 12 PID: 1346 at drivers/iommu/dma-iommu.c:1230 iommu_dma_unmap_page+0x79/0x90
Modules linked in: ip6table_mangle ip6table_nat ip6table_filter ip6_tables iptable_mangle xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry ovelay rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core fuse mlx5_core
CPU: 12 UID: 0 PID: 1346 Comm: ibv_rc_pingpong Not tainted 6.12.0-rc7+ #1631
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:iommu_dma_unmap_page+0x79/0x90
Code: 2b 49 3b 29 72 26 49 3b 69 08 73 20 4d 89 f0 44 89 e9 4c 89 e2 48 89 ee 48 89 df 5b 5d 41 5c 41 5d 41 5e 41 5f e9 07 b8 88 ff <0f> 0b 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 66 0f 1f 44 00
RSP: 0018:ffffc90001913a10 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88810194b0a8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
RBP: ffff88810194b0a8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
FS:  00007f537abdd740(0000) GS:ffff88885fb00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f537aeb8000 CR3: 000000010c248001 CR4: 0000000000372eb0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? __warn+0x84/0x190
? iommu_dma_unmap_page+0x79/0x90
? report_bug+0xf8/0x1c0
? handle_bug+0x55/0x90
? exc_invalid_op+0x13/0x60
? asm_exc_invalid_op+0x16/0x20
? iommu_dma_unmap_page+0x79/0x90
dma_unmap_page_attrs+0xe6/0x290
mlx5_free_priv_descs+0xb0/0xe0 [mlx5_ib]
__mlx5_ib_dereg_mr+0x37e/0x520 [mlx5_ib]
? _raw_spin_unlock_irq+0x24/0x40
? wait_for_completion+0xfe/0x130
? rdma_restrack_put+0x63/0xe0 [ib_core]
ib_dereg_mr_user+0x5f/0x120 [ib_core]
? lock_release+0xc6/0x280
destroy_hw_idr_uobject+0x1d/0x60 [ib_uverbs]
uverbs_destroy_uobject+0x58/0x1d0 [ib_uverbs]
uobj_destroy+0x3f/0x70 [ib_uverbs]
ib_uverbs_cmd_verbs+0x3e4/0xbb0 [ib_uverbs]
? __pfx_uverbs_destroy_def_handler+0x10/0x10 [ib_uverbs]
? lock_acquire+0xc1/0x2f0
? ib_uverbs_ioctl+0xcb/0x170 [ib_uverbs]
? ib_uverbs_ioctl+0x116/0x170 [ib_uverbs]
? lock_release+0xc6/0x280
ib_uverbs_ioctl+0xe7/0x170 [ib_uverbs]
? ib_uverbs_ioctl+0xcb/0x170 [ib_uverbs]
__x64_sys_ioctl+0x1b0/0xa70
do_syscall_64+0x6b/0x140
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f537adaf17b
Code: 0f 1e fa 48 8b 05 1d ad 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed ac 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffff218f0b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ffff218f1d8 RCX: 00007f537adaf17b
RDX: 00007ffff218f1c0 RSI: 00000000c0181b01 RDI: 0000000000000003
RBP: 00007ffff218f1a0 R08: 00007f537aa8d010 R09: 0000561ee2e4f270
R10: 00007f537aace3a8 R11: 0000000000000246 R12: 00007ffff218f190
R13: 000000000000001c R14: 0000561ee2e4d7c0 R15: 00007ffff218f450
</TASK>

Fixes: f18ec42 ("RDMA/mlx5: Use a union inside mlx5_ib_mr")
	Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://patch.msgid.link/2039c22cfc3df02378747ba4d623a558b53fc263.1738587076.git.leon@kernel.org
	Signed-off-by: Leon Romanovsky <leon@kernel.org>
(cherry picked from commit abc7b3f)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
cve CVE-2025-21694
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Rik van Riel <riel@surriel.com>
commit cbc5dde

Since commit 5cbcb62 ("fs/proc: fix softlockup in __read_vmcore") the
number of softlockups in __read_vmcore at kdump time have gone down, but
they still happen sometimes.

In a memory constrained environment like the kdump image, a softlockup is
not just a harmless message, but it can interfere with things like RCU
freeing memory, causing the crashdump to get stuck.

The second loop in __read_vmcore has a lot more opportunities for natural
sleep points, like scheduling out while waiting for a data write to
happen, but apparently that is not always enough.

Add a cond_resched() to the second loop in __read_vmcore to (hopefully)
get rid of the softlockups.

Link: https://lkml.kernel.org/r/20250110102821.2a37581b@fangorn
Fixes: 5cbcb62 ("fs/proc: fix softlockup in __read_vmcore")
	Signed-off-by: Rik van Riel <riel@surriel.com>
	Reported-by: Breno Leitao <leitao@debian.org>
	Cc: Baoquan He <bhe@redhat.com>
	Cc: Dave Young <dyoung@redhat.com>
	Cc: Vivek Goyal <vgoyal@redhat.com>
	Cc: <stable@vger.kernel.org>
	Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit cbc5dde)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Mike Christie <michael.christie@oracle.com>
commit 8604f63

scsi_check_passthrough() is always called, but it doesn't check whether a
command completed successfully. As a result, if a command was successful and
the caller used SCMD_FAILURE_RESULT_ANY to indicate what failures it wanted
to retry, we will end up retrying the command. This will cause delays during
device discovery because of the command being sent multiple times. For some
USB devices it can also cause the wrong device size to be used.

This patch adds a check for whether the command was successful. If it was, we
return immediately instead of trying to match a failure.

Fixes: 994724e ("scsi: core: Allow passthrough to request midlayer retries")
	Reported-by: Kris Karas <bugs-a21@moonlit-rail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219652
	Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20250107010220.7215-1-michael.christie@oracle.com
	Reviewed-by: Bart Van Assche <bvanassche@acm.org>
	Reviewed-by: John Garry <john.g.garry@oracle.com>
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 8604f63)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
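
The early return described in the commit above can be sketched as follows. This is a simplified Python model, not the kernel code: `check_passthrough`, `FAILURE_RESULT_ANY`, the "0 means success" encoding, and the shape of the failure list are all assumptions made for illustration, standing in for scsi_check_passthrough() and SCMD_FAILURE_RESULT_ANY.

```python
# Simplified model of the retry decision; all names here are illustrative.
FAILURE_RESULT_ANY = object()  # stand-in for SCMD_FAILURE_RESULT_ANY


def check_passthrough(result, failures):
    """Return True if the command should be retried.

    result   -- 0 for success, non-zero for a failure code (assumed encoding)
    failures -- failure codes the caller asked to have retried
    """
    # The fix: a successful command never matches a failure entry.
    if result == 0:
        return False
    for wanted in failures:
        if wanted is FAILURE_RESULT_ANY or wanted == result:
            return True
    return False


print(check_passthrough(0, [FAILURE_RESULT_ANY]))  # success, no retry
print(check_passthrough(2, [FAILURE_RESULT_ANY]))  # failure, retry
```

Without the success check at the top, the first call would match the wildcard entry and request a retry for a command that had already succeeded.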
jira NONE_AUTOMATION
cve CVE-2024-57807
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Tomas Henzl <thenzl@redhat.com>
commit 50740f4

This fixes a 'possible circular locking dependency detected' warning
      CPU0                    CPU1
      ----                    ----
 lock(&instance->reset_mutex);
                              lock(&shost->scan_mutex);
                              lock(&instance->reset_mutex);
 lock(&shost->scan_mutex);

Fix this by temporarily releasing the reset_mutex.

	Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Link: https://lore.kernel.org/r/20240923174833.45345-1-thenzl@redhat.com
	Acked-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 50740f4)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
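
The lock-ordering problem and the "temporarily release" approach named in the commit above can be illustrated with a small Python model. This is only a sketch of the general technique: the two `threading.Lock` objects and the functions `reset_path` and `scan_path` are hypothetical stand-ins, and the real driver code paths are not shown in the commit message.

```python
import threading

reset_mutex = threading.Lock()  # stand-in for instance->reset_mutex
scan_mutex = threading.Lock()   # stand-in for shost->scan_mutex
log = []


def reset_path():
    # Starts out holding reset_mutex and later needs scan_mutex.
    with reset_mutex:
        # Drop reset_mutex first so this path never waits for scan_mutex
        # while still holding reset_mutex.
        reset_mutex.release()
        try:
            with scan_mutex:
                log.append("reset: scanned")
        finally:
            reset_mutex.acquire()
        log.append("reset: done")


def scan_path():
    # Takes the locks in scan -> reset order.
    with scan_mutex:
        with reset_mutex:
            log.append("scan: done")


threads = [threading.Thread(target=reset_path),
           threading.Thread(target=scan_path)]
for t in threads:
    t.start()
for t in threads:
    t.join(timeout=5)
```

Because `reset_path` holds nothing while it waits for `scan_mutex`, the two paths can no longer form the circular wait shown in the CPU0/CPU1 table.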
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Paulo Alcantara <pc@manguebit.com>
commit 654292a

When the user sets a file or directory as read-only (e.g. ~S_IWUGO),
the client will set the ATTR_READONLY attribute by sending an
SMB2_SET_INFO request to the server in cifs_setattr_{,nounix}(), but
cifsInodeInfo::cifsAttrs will be left unchanged as the client will
only update the new file attributes in the next call to
{smb311_posix,cifs}_get_inode_info() with the new metadata filled in
@data parameter.

Commit a18280e ("smb: cilent: set reparse mount points as
automounts") mistakenly removed the @data NULL check when calling
is_inode_cache_good(), which broke the above case as the new
ATTR_READONLY attribute would end up not being updated on files with a
read lease.

Fix this by updating the inode whenever we have cached metadata in
@data parameter.

	Reported-by: Horst Reiterer <horst.reiterer@fabasoft.com>
Closes: https://lore.kernel.org/r/85a16504e09147a195ac0aac1c801280@fabasoft.com
Fixes: a18280e ("smb: cilent: set reparse mount points as automounts")
	Cc: stable@vger.kernel.org
	Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
	Signed-off-by: Steve French <stfrench@microsoft.com>
(cherry picked from commit 654292a)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author BH Hsieh <bhsieh@nvidia.com>
commit 55f1a5f

Observed VBUS_OVERRIDE & ID_OVERRIDE might be programmed
with unexpected value prior to XUSB PADCTL driver, this
could also occur in virtualization scenario.

For example, UEFI firmware programs ID_OVERRIDE=GROUNDED to set
a type-c port to host mode and keeps the value to kernel.
If the type-c port is connected to a usb host, the errors below can be
observed right after usb host mode driver gets probed. The errors
would keep until usb role class driver detects the type-c port
as device mode and notifies usb device mode driver to set both
ID_OVERRIDE and VBUS_OVERRIDE to correct value by XUSB PADCTL
driver.

[  173.765814] usb usb3-port2: Cannot enable. Maybe the USB cable is bad?
[  173.765837] usb usb3-port2: config error

Taking virtualization into account, asserting XUSB PADCTL
reset would break XUSB functions used by other guest OS,
hence only reset VBUS & ID OVERRIDE of the port in
utmi_phy_init.

Fixes: bbf7116 ("phy: tegra: xusb: Add Tegra186 support")
	Cc: stable@vger.kernel.org
Change-Id: Ic63058d4d49b4a1f8f9ab313196e20ad131cc591
	Signed-off-by: BH Hsieh <bhsieh@nvidia.com>
	Signed-off-by: Henry Lin <henryl@nvidia.com>
Link: https://lore.kernel.org/r/20250122105943.8057-1-henryl@nvidia.com
	Signed-off-by: Vinod Koul <vkoul@kernel.org>
(cherry picked from commit 55f1a5f)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Jeff Layton <jlayton@kernel.org>
commit b9382e2

nfsd_file_dispose_list_delayed can be called from the filecache
laundrette, which is shut down after the nfsd threads are shut down and
the nfsd_serv pointer is cleared. If nn->nfsd_serv is NULL then there
are no threads to wake.

Ensure that the nn->nfsd_serv pointer is non-NULL before calling
svc_wake_up in nfsd_file_dispose_list_delayed. This is safe since the
svc_serv is not freed until after the filecache laundrette is cancelled.

	Reported-by: Salvatore Bonaccorso <carnil@debian.org>
Closes: https://bugs.debian.org/1093734
Fixes: ffb4025 ("nfsd: Don't leave work of closing files to a work queue")
	Cc: stable@vger.kernel.org
	Signed-off-by: Jeff Layton <jlayton@kernel.org>
	Reviewed-by: NeilBrown <neilb@suse.de>
	Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
(cherry picked from commit b9382e2)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
cve CVE-2024-56623
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Quinn Tran <qutran@marvell.com>
commit 07c903d

System crash is observed with stack trace warning of use after
free. There are 2 signals to tell dpc_thread to terminate (UNLOADING
flag and kthread_stop).

If dpc_thread happens to be running when the UNLOADING flag is set and
sees the flag, it exits and cleans up after itself. When kthread_stop
is then called for final cleanup, this causes a use after free.

Remove UNLOADING signal to terminate dpc_thread.  Use the kthread_stop
as the main signal to exit dpc_thread.

[596663.812935] kernel BUG at mm/slub.c:294!
[596663.812950] invalid opcode: 0000 [#1] SMP PTI
[596663.812957] CPU: 13 PID: 1475935 Comm: rmmod Kdump: loaded Tainted: G          IOE    --------- -  - 4.18.0-240.el8.x86_64 #1
[596663.812960] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 08/20/2012
[596663.812974] RIP: 0010:__slab_free+0x17d/0x360

...
[596663.813008] Call Trace:
[596663.813022]  ? __dentry_kill+0x121/0x170
[596663.813030]  ? _cond_resched+0x15/0x30
[596663.813034]  ? _cond_resched+0x15/0x30
[596663.813039]  ? wait_for_completion+0x35/0x190
[596663.813048]  ? try_to_wake_up+0x63/0x540
[596663.813055]  free_task+0x5a/0x60
[596663.813061]  kthread_stop+0xf3/0x100
[596663.813103]  qla2x00_remove_one+0x284/0x440 [qla2xxx]

	Cc: stable@vger.kernel.org
	Signed-off-by: Quinn Tran <qutran@marvell.com>
	Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20241115130313.46826-3-njavali@marvell.com
	Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 07c903d)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Andreas Gruenbacher <agruenba@redhat.com>
commit 03ff378

When function evict_should_delete() returns SHOULD_DEFER_EVICTION, gh is
never initialized, but that isn't obvious; if it did initialize gh and
then return SHOULD_DEFER_EVICTION, gfs2_evict_inode() would fail to
release it.  To clarify the code, change gfs2_evict_inode() to always
check if gh needs to be released, no matter what evict_should_delete()
returns.

	Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
(cherry picked from commit 03ff378)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Andreas Gruenbacher <agruenba@redhat.com>
commit 5788253

Add a number of glock flags that are currently not shown in the text
form of glock tracepoints.

	Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
(cherry picked from commit 5788253)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Andreas Gruenbacher <agruenba@redhat.com>
commit f83f897

Glocks are always actively acquired by processes, but as indicated by
the GL_NOPID holder flag, some of them are then associated with objects
like cached inodes rather than the process that acquired them.  As such,
for those glock holders, it makes little sense to dump which processes
originally acquired them.

Therefore, gfs2 is trying to hide the identity of the processes that
acquired those glocks.  The code for doing that is incorrect though, so
fix it.

	Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
(cherry picked from commit f83f897)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Andreas Gruenbacher <agruenba@redhat.com>
commit 8bbfde0

Introduce a new GLF_PENDING_REPLY flag to indicate that a reply from DLM
is expected.  Include that flag in glock dumps to show more clearly
what's going on.  (When the GLF_PENDING_REPLY flag is set, the GLF_LOCK
flag will also be set but the GLF_LOCK flag alone isn't sufficient to
tell that we are waiting for a DLM reply.)

	Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
(cherry picked from commit 8bbfde0)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Andreas Gruenbacher <agruenba@redhat.com>
commit 3774f53

Having this flag attached to the iopen glock instead of the inode is
much simpler; it eliminates a potential weird race in gfs2_try_evict().

	Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
(cherry picked from commit 3774f53)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Andreas Gruenbacher <agruenba@redhat.com>
commit 0b93bac

The last user of this flag was removed in commit b77b4a4 ("gfs2:
Rework freeze / thaw logic").

	Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
(cherry picked from commit 0b93bac)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Su Hui <suhui@nfschina.com>
commit bb25b97

clang static analyzer complains that value stored to 'gh' is never read.
The code of this line is useless after commit 0b93bac
("gfs2: Remove LM_FLAG_PRIORITY flag"). Remove this code to save space.

	Signed-off-by: Su Hui <suhui@nfschina.com>
	Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
(cherry picked from commit bb25b97)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Andreas Gruenbacher <agruenba@redhat.com>
commit 0360fac

Remove some more dead code in add_to_queue() that commit 0b93bac
("gfs2: Remove LM_FLAG_PRIORITY flag") has rendered obsolete.  This is a
continuation of commit 3302764610057 ("gfs2: remove dead code in
add_to_queue"); no functional change.

	Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
(cherry picked from commit 0360fac)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6
commit-author Andreas Gruenbacher <agruenba@redhat.com>
commit d838605

In run_queue(), check if the queue of pending requests is empty instead
of blindly assuming that it won't be.

	Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
(cherry picked from commit d838605)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
PlaidCat added 18 commits May 21, 2025 16:52
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6
commit-author Waiman Long <longman@redhat.com>
commit 43a17fc

Since isolated CPUs can be reserved at boot time via the "isolcpus"
boot command line option, these pre-isolated CPUs may interfere with
testing done by test_cpuset_prs.sh.

With the previous commit that incorporates those boot time isolated CPUs
into "cpuset.cpus.isolated", we can check for those before testing is
started to make sure that there will be no interference.  Otherwise,
this test will be skipped if an incorrect test failure could occur.

As "cpuset.cpus.isolated" is now available in a non cgroup_debug kernel,
we don't need to check for its existence anymore.

	Signed-off-by: Waiman Long <longman@redhat.com>
	Signed-off-by: Tejun Heo <tj@kernel.org>
(cherry picked from commit 43a17fc)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6
commit-author everestkc <everestkc@everestkc.com.np>
commit 95a616d

Corrected the spelling errors reported by codespell as follows:
	temparary ==> temporary
	Proprogate ==> Propagate
	constrainted ==> constrained

	Signed-off-by: Everest K.C. <everestkc@everestkc.com.np>
	Acked-by: Waiman Long <longman@redhat.com>
	Signed-off-by: Tejun Heo <tj@kernel.org>
(cherry picked from commit 95a616d)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
…pdate_cpumasks_hier()"

jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6
commit-author Waiman Long <longman@redhat.com>
commit bcd7012

Revert commit 3ae0b77 ("cgroup/cpuset: Allow suppression of sched
domain rebuild in update_cpumasks_hier()") to allow for an alternative
way to suppress unnecessary rebuild_sched_domains_locked() calls in
update_cpumasks_hier() and elsewhere in a following commit.

	Signed-off-by: Waiman Long <longman@redhat.com>
	Signed-off-by: Tejun Heo <tj@kernel.org>
(cherry picked from commit bcd7012)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
…l per operation

jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6
commit-author Waiman Long <longman@redhat.com>
commit a040c35

Since commit ff0ce72 ("cgroup/cpuset: Eliminate unncessary
sched domains rebuilds in hotplug"), there is only one
rebuild_sched_domains_locked() call per hotplug operation. However,
writing to the various cpuset control files may still cause more than
one rebuild_sched_domains_locked() call to happen in some cases.

Juri had found that two rebuild_sched_domains_locked() calls in
update_prstate(), one from update_cpumasks_hier() and another one from
update_partition_sd_lb() could cause cpuset partition to be created
with null total_bw for DL tasks. IOW, DL tasks may not be scheduled
correctly in such a partition.

A sample command sequence that can reproduce null total_bw is as
follows.

  # echo Y >/sys/kernel/debug/sched/verbose
  # echo +cpuset >/sys/fs/cgroup/cgroup.subtree_control
  # mkdir /sys/fs/cgroup/test
  # echo 0-7 > /sys/fs/cgroup/test/cpuset.cpus
  # echo 6-7 > /sys/fs/cgroup/test/cpuset.cpus.exclusive
  # echo root >/sys/fs/cgroup/test/cpuset.cpus.partition

Fix this double rebuild_sched_domains_locked() calls problem
by replacing existing calls with cpuset_force_rebuild() except
the rebuild_sched_domains_cpuslocked() call at the end of
cpuset_handle_hotplug(). Checking of the force_sd_rebuild flag is
now done at the end of cpuset_write_resmask() and update_prstate()
to determine if rebuild_sched_domains_locked() should be called or not.

The cpuset v1 code can still call rebuild_sched_domains_locked()
directly as double rebuild_sched_domains_locked() calls are not possible.

	Reported-by: Juri Lelli <juri.lelli@redhat.com>
Closes: https://lore.kernel.org/lkml/ZyuUcJDPBln1BK1Y@jlelli-thinkpadt14gen4.remote.csb/
	Signed-off-by: Waiman Long <longman@redhat.com>
	Tested-by: Juri Lelli <juri.lelli@redhat.com>
	Signed-off-by: Tejun Heo <tj@kernel.org>
(cherry picked from commit a040c35)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
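
The deferred-rebuild idea in the commit above (request a rebuild via cpuset_force_rebuild(), then check the force_sd_rebuild flag once at the end of the operation) can be sketched in Python. This is a toy model under stated assumptions: the `Cpuset` class, the `rebuilds` counter, and `rebuild_if_needed` are hypothetical, and only the names cpuset_force_rebuild, force_sd_rebuild, and update_prstate come from the commit text.

```python
class Cpuset:
    """Toy model: counts how many sched-domain rebuilds actually run."""

    def __init__(self):
        self.force_sd_rebuild = False
        self.rebuilds = 0

    def cpuset_force_rebuild(self):
        # Only record the request; do not rebuild yet.
        self.force_sd_rebuild = True

    def rebuild_if_needed(self):
        # Hypothetical helper: the single check at the end of an operation.
        if self.force_sd_rebuild:
            self.force_sd_rebuild = False
            self.rebuilds += 1


def update_prstate(cs):
    cs.cpuset_force_rebuild()  # request from the update_cpumasks_hier() step
    cs.cpuset_force_rebuild()  # request from the update_partition_sd_lb() step
    cs.rebuild_if_needed()     # one rebuild, however many requests were made


cs = Cpuset()
update_prstate(cs)
print(cs.rebuilds)
```

Two requests collapse into one rebuild because the flag is idempotent; calling the rebuild directly from both steps would have run it twice.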
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6
commit-author Waiman Long <longman@redhat.com>
commit 9b496a8

Isolated CPUs are not allowed to be used in a non-isolated partition.
The only exception is the top cpuset which is allowed to contain boot
time isolated CPUs.

Commit ccac8e8 ("cgroup/cpuset: Fix remote root partition creation
problem") introduces a simplified scheme of including only partition
roots in sched domain generation. However, it does not properly account
for this exception case. This can result in leakage of isolated CPUs
into a sched domain.

Fix it by making sure that isolated CPUs are excluded from the top
cpuset before generating sched domains.

Also update the way the boot time isolated CPUs are handled in
test_cpuset_prs.sh to make sure that those isolated CPUs are really
isolated instead of just skipping them in the tests.

Fixes: ccac8e8 ("cgroup/cpuset: Fix remote root partition creation problem")
	Signed-off-by: Waiman Long <longman@redhat.com>
	Signed-off-by: Tejun Heo <tj@kernel.org>
(cherry picked from commit 9b496a8)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6
commit-author Waiman Long <longman@redhat.com>
commit a22b3d5
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.17.1.el9_6/a22b3d54.failed

There is a possible race between removing a cgroup directory that is
a partition root and the creation of a new partition.  The partition
to be removed can be dying but still online; it does not currently
participate in checking for exclusive CPUs conflict, but the exclusive
CPUs are still there in subpartitions_cpus and isolated_cpus. These
two cpumasks are global states that affect the operation of cpuset
partitions. The exclusive CPUs in dying cpusets will only be removed
when cpuset_css_offline() function is called after an RCU delay.

As a result, it is possible that a new partition can be created with
exclusive CPUs that overlap with those of a dying one. When that dying
partition is finally offlined, it removes those overlapping exclusive
CPUs from subpartitions_cpus and maybe isolated_cpus resulting in an
incorrect CPU configuration.

This bug was found when a warning was triggered in
remote_partition_disable() during testing because the subpartitions_cpus
mask was empty.

One possible way to fix this is to iterate the dying cpusets as well and
avoid using the exclusive CPUs in those dying cpusets. However, this
can still cause random partition creation failures or other anomalies
due to racing. A better way to fix this race is to reset the partition
state at the moment when a cpuset is being killed.

Introduce a new css_killed() CSS function pointer and call it, if
defined, before setting CSS_DYING flag in kill_css(). Also update the
css_is_dying() helper to use the CSS_DYING flag introduced by commit
33c35aa ("cgroup: Prevent kill_css() from being called more than
once") for proper synchronization.

Add a new cpuset_css_killed() function to reset the partition state of
a valid partition root if it is being killed.

Fixes: ee8dde0 ("cpuset: Add new v2 cpuset.sched.partition flag")
	Signed-off-by: Waiman Long <longman@redhat.com>
	Signed-off-by: Tejun Heo <tj@kernel.org>
(cherry picked from commit a22b3d5)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	kernel/cgroup/cpuset.c
…fective_cpumask()

jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6
commit-author Waiman Long <longman@redhat.com>
commit 668e041

Before commit f0af1bf ("cgroup/cpuset: Relax constraints to
partition & cpus changes"), a cpuset partition cannot be enabled if not
all the requested CPUs can be granted from the parent cpuset. After
that commit, a cpuset partition can be created even if the requested
exclusive CPUs contain CPUs not allowed by its parent.  The delmask
containing exclusive CPUs to be removed from its parent wasn't
adjusted accordingly.

That is not a problem until the introduction of a new isolated_cpus
mask in commit 11e5f40 ("cgroup/cpuset: Keep track of CPUs in
isolated partitions") as the CPUs in the delmask may be added directly
into isolated_cpus.

As a result, isolated_cpus may incorrectly contain CPUs that are not
isolated leading to incorrect data reporting. Fix this by adjusting
the delmask to reflect the actual exclusive CPUs for the creation of
the partition.

Fixes: 11e5f40 ("cgroup/cpuset: Keep track of CPUs in isolated partitions")
	Signed-off-by: Waiman Long <longman@redhat.com>
	Signed-off-by: Tejun Heo <tj@kernel.org>
(cherry picked from commit 668e041)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6
commit-author Waiman Long <longman@redhat.com>
commit 8bf450f

When remote_partition_disable() is called to disable a remote partition,
it always sets the partition to an invalid partition state. It should
only do so if an error code (prs_err) has been set. Correct that and
add proper error code in places where remote_partition_disable() is
called due to error.

Fixes: 181c8e0 ("cgroup/cpuset: Introduce remote partition")
	Signed-off-by: Waiman Long <longman@redhat.com>
	Signed-off-by: Tejun Heo <tj@kernel.org>
(cherry picked from commit 8bf450f)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
…_hier() handle remote partition

jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6
commit-author Waiman Long <longman@redhat.com>
commit f62a5d3

Currently, changes in exclusive CPUs are being handled in
remote_partition_check() by disabling conflicting remote partitions.
However, that may lead to results unexpected by the users. Fix
this problem by removing remote_partition_check() and making
update_cpumasks_hier() handle changes in descendant remote partitions
properly.

The compute_effective_exclusive_cpumask() function is enhanced to check
the exclusive_cpus and effective_xcpus from siblings, exclude them
from its effective exclusive CPUs computation, and return a value to show
if there are any sibling conflicts.  This is somewhat like the cpu_exclusive
flag check in validate_change(). This is the initial step to enable us
to retire the use of cpu_exclusive flag in cgroup v2 in the future.

One of the tests in the TEST_MATRIX of the test_cpuset_prs.sh
script has to be updated due to changes in the way a child remote
partition root is being handled (updated instead of invalidation)
in update_cpumasks_hier().

	Signed-off-by: Waiman Long <longman@redhat.com>
	Signed-off-by: Tejun Heo <tj@kernel.org>
(cherry picked from commit f62a5d3)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6
commit-author Waiman Long <longman@redhat.com>
commit f0a0bd3

Rename partition_xcpus_newstate() to isolated_cpus_update(),
update_partition_exclusive() to update_partition_exclusive_flag() and
the new_xcpus_state variable to isolcpus_updated to make their meanings
more explicit. Also add some comments to further clarify the code.
No functional change is expected.

	Signed-off-by: Waiman Long <longman@redhat.com>
	Signed-off-by: Tejun Heo <tj@kernel.org>
(cherry picked from commit f0a0bd3)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
… and state separator

jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6
commit-author Waiman Long <longman@redhat.com>
commit 65046b5

Currently, ',' is used as the cgroup separator of the expected effective
CPUs and partition root states in the test matrix. However, ',' can be
part of the output of the cpuset.cpus*.effective and cpuset.cpus.isolated
files. Change the separator to '|' so that ',' can appear as part of
the expected values.

	Signed-off-by: Waiman Long <longman@redhat.com>
	Signed-off-by: Tejun Heo <tj@kernel.org>
(cherry picked from commit 65046b5)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
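
The ambiguity that motivates the separator change above is easy to reproduce. The sketch below is in Python rather than the shell of test_cpuset_prs.sh, and the `A1:...` / `A2:...` entry format is an assumed, simplified layout used only to illustrate the parsing problem.

```python
def split_expected(field, sep):
    """Split an expected-value string into per-cgroup entries."""
    return field.split(sep)


# A CPU list such as "0-1,4" legitimately contains a comma.
old_style = "A1:0-1,4,A2:2-3"  # ',' as the cgroup separator (ambiguous)
new_style = "A1:0-1,4|A2:2-3"  # '|' as the cgroup separator

print(split_expected(old_style, ","))  # three pieces: the CPU list is torn apart
print(split_expected(new_style, "|"))  # two pieces: one per cgroup
```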
jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6
commit-author Waiman Long <longman@redhat.com>
commit b2b2b4d

Clean up the test_cpuset_prs.sh script and restructure some of the
functions so that a new test matrix with a different cgroup directory
structure can be added in the next patch.

	Signed-off-by: Waiman Long <longman@redhat.com>
	Signed-off-by: Tejun Heo <tj@kernel.org>
(cherry picked from commit b2b2b4d)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
…t_prs.sh

jira NONE_AUTOMATION
Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6
commit-author Waiman Long <longman@redhat.com>
commit e8a457b

The current cgroup directory layout for running the partition state
transition tests is mainly suitable for testing local partitions as
well as with a mix of local and remote partitions. It is not that
suitable for doing extensive remote partition and nested remote/local
partition testing.

Add a new set of remote partition tests REMOTE_TEST_MATRIX with another
cgroup directory structure more tailored for remote partition testing
to provide better code coverage.

Also add a few new test cases and adjust existing ones for
the original TEST_MATRIX.

	Signed-off-by: Waiman Long <longman@redhat.com>
	Signed-off-by: Tejun Heo <tj@kernel.org>
(cherry picked from commit e8a457b)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
cve CVE-2025-37749
Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6
commit-author Arnaud Lecomte <contact@arnaud-lcm.com>
commit aabc659

Ensure we have enough data in linear buffer from skb before accessing
initial bytes. This prevents potential out-of-bounds accesses
when processing short packets.

When ppp_sync_txmung receives an incoming packet with an empty
payload:
(remote) gef➤  p *(struct pppoe_hdr *) (skb->head + skb->network_header)
$18 = {
	type = 0x1,
	ver = 0x1,
	code = 0x0,
	sid = 0x2,
	length = 0x0,
	tag = 0xffff8880371cdb96
}

from the skb struct (trimmed)
      tail = 0x16,
      end = 0x140,
      head = 0xffff88803346f400 "4",
      data = 0xffff88803346f416 ":\377",
      truesize = 0x380,
      len = 0x0,
      data_len = 0x0,
      mac_len = 0xe,
      hdr_len = 0x0,

it is not safe to access data[2].

	Reported-by: syzbot+29fc8991b0ecb186cf40@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=29fc8991b0ecb186cf40
	Tested-by: syzbot+29fc8991b0ecb186cf40@syzkaller.appspotmail.com
Fixes: 1da177e ("Linux-2.6.12-rc2")
	Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com>
Link: https://patch.msgid.link/20250408-bound-checking-ppp_txmung-v2-1-94bb6e1b92d0@arnaud-lcm.com
[pabeni@redhat.com: fixed subj typo]
	Signed-off-by: Paolo Abeni <pabeni@redhat.com>
(cherry picked from commit aabc659)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
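
The length check described in the commit above can be illustrated with a minimal Python sketch. It models the linear buffer as a `bytes` object; `safe_third_byte` is a hypothetical helper, not a kernel function, and it only demonstrates the "verify there are enough bytes before reading data[2]" pattern.

```python
def safe_third_byte(linear: bytes):
    """Return data[2] only if the linear buffer holds at least 3 bytes."""
    if len(linear) < 3:
        return None  # short packet: refuse to read past the available data
    return linear[2]


print(safe_third_byte(b""))              # empty payload, as in the dump above
print(safe_third_byte(b"\xff\x03\xc0"))  # long enough, index 2 is readable
```

In Python an unchecked `linear[2]` on an empty buffer raises IndexError; in C the same access reads whatever memory follows the buffer, which is why the check has to come first.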
jira NONE_AUTOMATION
cve CVE-2025-21756
Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6
commit-author Michal Luczaj <mhal@rbox.co>
commit 135ffc7

vsock defines a BPF callback to be invoked when close() is called. However,
this callback is never actually executed. As a result, a closed vsock
socket is not automatically removed from the sockmap/sockhash.

Introduce a dummy vsock_close() and make vsock_release() call proto::close.

Note: changes in __vsock_release() look messy, but it's only due to indent
level reduction and variables xmas tree reorder.

Fixes: 634f1a7 ("vsock: support sockmap")
	Signed-off-by: Michal Luczaj <mhal@rbox.co>
	Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
	Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
Link: https://lore.kernel.org/r/20241118-vsock-bpf-poll-close-v1-3-f1b9669cacdc@rbox.co
	Signed-off-by: Alexei Starovoitov <ast@kernel.org>
	Acked-by: John Fastabend <john.fastabend@gmail.com>
(cherry picked from commit 135ffc7)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
cve CVE-2025-21756
Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6
commit-author Michal Luczaj <mhal@rbox.co>
commit fcdd224

Preserve socket bindings; this includes both resulting from an explicit
bind() and those implicitly bound through autobind during connect().

Prevents socket unbinding during a transport reassignment, which fixes a
use-after-free:

    1. vsock_create() (refcnt=1) calls vsock_insert_unbound() (refcnt=2)
    2. transport->release() calls vsock_remove_bound() without checking if
       sk was bound and moved to bound list (refcnt=1)
    3. vsock_bind() assumes sk is in unbound list and before
       __vsock_insert_bound(vsock_bound_sockets()) calls
       __vsock_remove_bound() which does:
           list_del_init(&vsk->bound_table); // nop
           sock_put(&vsk->sk);               // refcnt=0

BUG: KASAN: slab-use-after-free in __vsock_bind+0x62e/0x730
Read of size 4 at addr ffff88816b46a74c by task a.out/2057
 dump_stack_lvl+0x68/0x90
 print_report+0x174/0x4f6
 kasan_report+0xb9/0x190
 __vsock_bind+0x62e/0x730
 vsock_bind+0x97/0xe0
 __sys_bind+0x154/0x1f0
 __x64_sys_bind+0x6e/0xb0
 do_syscall_64+0x93/0x1b0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Allocated by task 2057:
 kasan_save_stack+0x1e/0x40
 kasan_save_track+0x10/0x30
 __kasan_slab_alloc+0x85/0x90
 kmem_cache_alloc_noprof+0x131/0x450
 sk_prot_alloc+0x5b/0x220
 sk_alloc+0x2c/0x870
 __vsock_create.constprop.0+0x2e/0xb60
 vsock_create+0xe4/0x420
 __sock_create+0x241/0x650
 __sys_socket+0xf2/0x1a0
 __x64_sys_socket+0x6e/0xb0
 do_syscall_64+0x93/0x1b0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 2057:
 kasan_save_stack+0x1e/0x40
 kasan_save_track+0x10/0x30
 kasan_save_free_info+0x37/0x60
 __kasan_slab_free+0x4b/0x70
 kmem_cache_free+0x1a1/0x590
 __sk_destruct+0x388/0x5a0
 __vsock_bind+0x5e1/0x730
 vsock_bind+0x97/0xe0
 __sys_bind+0x154/0x1f0
 __x64_sys_bind+0x6e/0xb0
 do_syscall_64+0x93/0x1b0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

refcount_t: addition on 0; use-after-free.
WARNING: CPU: 7 PID: 2057 at lib/refcount.c:25 refcount_warn_saturate+0xce/0x150
RIP: 0010:refcount_warn_saturate+0xce/0x150
 __vsock_bind+0x66d/0x730
 vsock_bind+0x97/0xe0
 __sys_bind+0x154/0x1f0
 __x64_sys_bind+0x6e/0xb0
 do_syscall_64+0x93/0x1b0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

refcount_t: underflow; use-after-free.
WARNING: CPU: 7 PID: 2057 at lib/refcount.c:28 refcount_warn_saturate+0xee/0x150
RIP: 0010:refcount_warn_saturate+0xee/0x150
 vsock_remove_bound+0x187/0x1e0
 __vsock_release+0x383/0x4a0
 vsock_release+0x90/0x120
 __sock_release+0xa3/0x250
 sock_close+0x14/0x20
 __fput+0x359/0xa80
 task_work_run+0x107/0x1d0
 do_exit+0x847/0x2560
 do_group_exit+0xb8/0x250
 __x64_sys_exit_group+0x3a/0x50
 x64_sys_call+0xfec/0x14f0
 do_syscall_64+0x93/0x1b0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Fixes: c0cfa2d ("vsock: add multi-transports support")
	Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
	Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20250128-vsock-transport-vs-autobind-v3-1-1cf57065b770@rbox.co
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit fcdd224)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira NONE_AUTOMATION
cve CVE-2025-21756
Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6
commit-author Michal Luczaj <mhal@rbox.co>
commit 78dafe1

During socket release, sock_orphan() is called without considering that it
sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
null pointer dereference in virtio_transport_wait_close().

Orphan the socket only after transport release.

Partially reverts the 'Fixes:' commit.

KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
 lock_acquire+0x19e/0x500
 _raw_spin_lock_irqsave+0x47/0x70
 add_wait_queue+0x46/0x230
 virtio_transport_release+0x4e7/0x7f0
 __vsock_release+0xfd/0x490
 vsock_release+0x90/0x120
 __sock_release+0xa3/0x250
 sock_close+0x14/0x20
 __fput+0x35e/0xa90
 __x64_sys_close+0x78/0xd0
 do_syscall_64+0x93/0x1b0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

	Reported-by: syzbot+9d55b199192a4be7d02c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9d55b199192a4be7d02c
Fixes: fcdd224 ("vsock: Keep the binding until socket destruction")
	Tested-by: Luigi Leonardi <leonardi@redhat.com>
	Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
	Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20250210-vsock-linger-nullderef-v3-1-ef6244d02b54@rbox.co
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 78dafe1)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v5.14~1..kernel-mainline: 296506
Number of commits in rpm: 40
Number of commits matched with upstream: 37 (92.50%)
Number of commits in upstream but not in rpm: 296469
Number of commits NOT found in upstream: 3 (7.50%)

Rebuilding Kernel on Branch rocky9_6_rebuild_kernel-5.14.0-570.17.1.el9_6 for kernel-5.14.0-570.17.1.el9_6
Clean Cherry Picks: 35 (94.59%)
Empty Cherry Picks: 2 (5.41%)
_______________________________

Full Details Located here:
ciq/ciq_backports/kernel-5.14.0-570.17.1.el9_6/rebuild.details.txt

Includes:
* git commit header above
* Empty Commits with upstream SHA
* RPM ChangeLog Entries that could not be matched

Individual Empty Commit failures contained in the same containing directory.
The git message for empty commits will have the path for the failed commit.
File names are the first 8 characters of the upstream SHA
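
As a sanity check on the summary above, the percentages can be recomputed from the raw counts. This is a minimal sketch, not the rebuild tooling itself: the `pct` helper and the variable names are hypothetical, and the choice of denominators (rpm commits for the match rate, matched commits for the cherry-pick rates) is inferred from the reported numbers rather than taken from the tool's source.

```python
def pct(part: int, whole: int) -> str:
    """Format part/whole as a percentage with two decimals, e.g. '92.50%'."""
    return f"{100.0 * part / whole:.2f}%"


# Counts copied from the kernel-5.14.0-570.17.1.el9_6 rebuild summary.
rpm_commits = 40   # Number of commits in rpm
matched = 37       # Number of commits matched with upstream
clean_picks = 35   # Clean Cherry Picks
empty_picks = 2    # Empty Cherry Picks

# Commits in the rpm changelog that could not be matched upstream.
not_found = rpm_commits - matched

# Assumption: match rates are relative to the rpm commit count, while
# cherry-pick rates are relative to the matched commit count.
summary = {
    "matched": pct(matched, rpm_commits),
    "not_found": pct(not_found, rpm_commits),
    "clean": pct(clean_picks, matched),
    "empty": pct(empty_picks, matched),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

With these denominators, clean plus empty cherry picks account for every matched commit (35 + 2 = 37), which is a quick way to confirm that no matched commit was dropped during the rebuild.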
@PlaidCat PlaidCat self-assigned this May 21, 2025
@PlaidCat PlaidCat marked this pull request as ready for review May 22, 2025 13:44

@bmastbergen bmastbergen left a comment

🥌

@PlaidCat PlaidCat merged commit 838cd1e into rocky9_6 May 22, 2025
4 checks passed
@PlaidCat PlaidCat deleted the rocky9_6_rebuild branch May 22, 2025 19:21
github-actions bot pushed a commit that referenced this pull request Oct 1, 2025
[BUG DURING BS > PS TEST]
When running the following script on a btrfs whose block size is larger
than page size, e.g. 8K block size and 4K page size, it will trigger a
kernel BUG:

  # mkfs.btrfs -s 8k $dev
  # mount $dev $mnt
  # mkdir $mnt/dir
  # ln -s dir $mnt/link
  # ls $mnt/link

The call trace looks like this:

  BTRFS warning (device dm-2): support for block size 8192 with page size 4096 is experimental, some features may be missing
  BTRFS info (device dm-2): checking UUID tree
  BTRFS info (device dm-2): enabling ssd optimizations
  BTRFS info (device dm-2): enabling free space tree
  ------------[ cut here ]------------
  kernel BUG at /home/adam/linux/include/linux/highmem.h:275!
  Oops: invalid opcode: 0000 [#1] SMP
  CPU: 8 UID: 0 PID: 667 Comm: ls Tainted: G           OE       6.17.0-rc4-custom+ #283 PREEMPT(full)
  Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 02/02/2022
  RIP: 0010:zero_user_segments.constprop.0+0xdc/0xe0 [btrfs]
  Call Trace:
   <TASK>
   btrfs_get_extent.cold+0x85/0x101 [btrfs 7453c70c03e631c8d8bfdd4264fa62d3e238da6f]
   btrfs_do_readpage+0x244/0x750 [btrfs 7453c70c03e631c8d8bfdd4264fa62d3e238da6f]
   btrfs_read_folio+0x9c/0x100 [btrfs 7453c70c03e631c8d8bfdd4264fa62d3e238da6f]
   filemap_read_folio+0x37/0xe0
   do_read_cache_folio+0x94/0x3e0
   __page_get_link.isra.0+0x20/0x90
   page_get_link+0x16/0x40
   step_into+0x69b/0x830
   path_lookupat+0xa7/0x170
   filename_lookup+0xf7/0x200
   ? set_ptes.isra.0+0x36/0x70
   vfs_statx+0x7a/0x160
   do_statx+0x63/0xa0
   __x64_sys_statx+0x90/0xe0
   do_syscall_64+0x82/0xae0
   entry_SYSCALL_64_after_hwframe+0x4b/0x53
   </TASK>

Please note that bs > ps support is still under development and the
enablement patch is not even in the btrfs development branch yet.

[CAUSE]
Btrfs reuses its data folio read path to handle symbolic links, as the
symbolic link target is stored as an inline data extent.

But for newly created inodes, btrfs only sets the minimal order if the
target inode is a regular file.

Thus the newly created symbolic link above doesn't respect the minimal
folio order, which triggers the crash.

[FIX]
Call btrfs_set_inode_mapping_order() unconditionally inside
btrfs_create_new_inode().

For symbolic links this will fix the crash as now the folio will meet
the minimal order.

For regular files this brings no change.

Directory, bdev, char, and all other types of inodes don't go through
the data read path, so there is no effect on them either.

Fixes: cc38d17 ("btrfs: enable large data folio support under CONFIG_BTRFS_EXPERIMENTAL")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
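
To illustrate the "minimal folio order" mentioned in the commit message above, here is a small model in Python. It is only a sketch of the arithmetic, not the kernel implementation: `min_folio_order` and `folio_covers_block` are hypothetical helpers, and the assumption is that the minimal order is the base-2 logarithm of block size divided by page size when the block is larger than a page, and 0 otherwise.

```python
def _is_power_of_two(n: int) -> bool:
    return n > 0 and (n & (n - 1)) == 0


def min_folio_order(block_size: int, page_size: int) -> int:
    """Smallest folio order whose size (page_size << order) covers one block.

    Hypothetical helper modelling the arithmetic only; both sizes are
    assumed to be powers of two.
    """
    if not (_is_power_of_two(block_size) and _is_power_of_two(page_size)):
        raise ValueError("block_size and page_size must be powers of two")
    if block_size <= page_size:
        return 0
    # For powers of two, bit_length() - 1 is the exact log2 of the ratio.
    return (block_size // page_size).bit_length() - 1


def folio_covers_block(order: int, block_size: int, page_size: int) -> bool:
    """True if a folio of the given order is large enough to hold one block."""
    return (page_size << order) >= block_size


# The scenario from the report: 8K block size on a 4K page size system.
order = min_folio_order(8192, 4096)
print(order)
print(folio_covers_block(0, 8192, 4096))      # order-0 folio, as the symlink got
print(folio_covers_block(order, 8192, 4096))  # folio at the minimal order
```

In this model an order-0 folio (a single 4K page) cannot hold an 8K block, which matches the situation the commit describes for a symbolic link whose mapping never had the minimal order set.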