@pvts-mat pvts-mat commented Aug 11, 2025

[CBR 7.9]
CVE-2023-5717
VULN-7623

Problem

https://www.cve.org/CVERecord?id=CVE-2023-5717

A heap out-of-bounds write vulnerability in the Linux kernel's Linux Kernel Performance Events (perf) component can be exploited to achieve local privilege escalation. If perf_read_group() is called while an event's sibling_list is smaller than its child's sibling_list, it can increment or write to memory locations outside of the allocated buffer.
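The failure mode can be illustrated with a minimal user-space toy model (not the kernel code; all names here are made up): the output buffer is sized from the parent's view of the group, but the accumulation loop walks the child's sibling list, so a child list that grew larger than the parent's writes past the end of the buffer.

```c
#include <stddef.h>

/* Toy model of the pre-fix accumulation loop (NOT kernel code): buf is
 * sized from the parent's sibling count, but the loop is driven by the
 * child's sibling count. */
static int toy_read_group(unsigned long long *buf, size_t buf_len,
                          const unsigned long long *child_counts,
                          size_t child_nr_siblings)
{
	size_t n = 0;

	for (size_t i = 0; i < child_nr_siblings; i++) {
		if (n >= buf_len)
			return -1; /* the pre-fix kernel had no equivalent bound check */
		buf[n++] += child_counts[i];
	}
	return (int)n;
}
```

With a child list of 3 siblings and a buffer sized for 2, the guarded toy version bails out where the unguarded kernel loop would have written out of bounds.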

Applicability: yes

The perf component is gated by the CONFIG_PERF_EVENTS option, which is enabled in all ciqcbr7_9 configs:

$ grep 'CONFIG_PERF_EVENTS\b' configs/*.config

configs/kernel-3.10.0-x86_64-debug.config:CONFIG_PERF_EVENTS=y
configs/kernel-3.10.0-x86_64.config:CONFIG_PERF_EVENTS=y

The bug was introduced by commit fa8c269 ("perf/core: Invert perf_read_group() loops"), which flipped the order of child_list and sibling_list; it was backported to ciqcbr7_9 in 170ca9a. The fixing commit 32671e3 was never backported and is missing from the branch.

Solution

The mainline fix 32671e3 adds a new group_generation field to the perf_event struct, which breaks CBR 7.9 kABI. The field was kept, but moved to the end of the struct and wrapped in the RH_KABI_EXTEND macro. Unlike in the LTS 8.6 case (#475), no investigation of whether this is safe was needed, because the struct already ends with multiple RH_KABI_EXTEND(…) fields, which could not have been added there otherwise:

RH_KABI_EXTEND(struct list_head migrate_entry)
RH_KABI_EXTEND(struct list_head active_entry)
RH_KABI_EXTEND(void *pmu_private)
#if defined(CONFIG_FUNCTION_TRACER) && !defined(CONFIG_X86_64)
RH_KABI_EXTEND(struct ftrace_ops ftrace_ops)
#endif
/* address range filters */
RH_KABI_EXTEND(struct perf_addr_filters_head addr_filters)
/* vma address array for file-based filters */
RH_KABI_EXTEND(unsigned long *addr_filters_offs)
RH_KABI_EXTEND(unsigned long addr_filters_gen)
RH_KABI_EXTEND(struct list_head sb_list)
RH_KABI_EXTEND(u64 (*clock)(void))
/* The cumulative AND of all event_caps for events in this group. */
RH_KABI_EXTEND(int group_caps)
/*
* Node on the pinned or flexible tree located at the event context;
*/
RH_KABI_EXTEND(struct rb_node group_node)
RH_KABI_EXTEND(u64 group_index)
RH_KABI_EXTEND(struct list_head active_list)
#ifdef CONFIG_BPF_SYSCALL
RH_KABI_EXTEND(perf_overflow_handler_t orig_overflow_handler)
RH_KABI_EXTEND(struct bpf_prog *prog)
#endif
RH_KABI_EXTEND(unsigned long rcu_batches)
RH_KABI_EXTEND(int rcu_pending)
#endif /* CONFIG_PERF_EVENTS */
};
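The kABI reasoning can be checked in miniature with two hypothetical structs (not the real perf_event): appending a field at the end leaves the offset of every pre-existing member unchanged, which is why end-of-struct RH_KABI_EXTEND additions are tolerated while a mid-struct insertion would shift everything after it.

```c
#include <stddef.h>

/* Hypothetical "old" layout, as a module compiled against it saw it. */
struct old_event {
	int state;
	unsigned long long count;
};

/* Hypothetical "new" layout: the extension is appended at the end,
 * mimicking what RH_KABI_EXTEND does in the backport. */
struct new_event {
	int state;
	unsigned long long count;
	unsigned long long group_generation; /* appended field */
};
```

All pre-existing offsets agree between the two layouts; only the total size grows.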

Additionally, a follow-up fix for the mainline commit was committed in a71ef31; it is also included in this backport.
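The group_generation mechanism (described in the upstream commit message quoted below) can be sketched in user-space C with hypothetical names: the leader's generation is bumped on every attach and detach, so a reader can detect any intervening change in group composition even when the sibling count nets out to the same number.

```c
/* Toy sketch of the generation scheme (NOT the kernel implementation). */
struct toy_group {
	int nr_siblings;
	unsigned long long group_generation;
};

static void toy_attach(struct toy_group *g)
{
	g->nr_siblings++;
	g->group_generation++;
}

static void toy_detach(struct toy_group *g)
{
	g->nr_siblings--;
	g->group_generation++;
}

/* Mirrors the fix's idea: an equal sibling count is not enough, the
 * generations must match too; a mismatch maps to the -ECHILD error. */
static int toy_groups_match(const struct toy_group *parent,
                            const struct toy_group *child)
{
	return parent->nr_siblings == child->nr_siblings &&
	       parent->group_generation == child->group_generation;
}
```

Removing and re-adding an event leaves the count unchanged but advances the generation, so the mismatch is still caught.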

kABI check: passed

$ python /mnt/code/kernel-dist-git-el-7.9/SOURCES/check-kabi -k /mnt/code/kernel-dist-git-el-7.9/SOURCES/Module.kabi_x86_64 -s /mnt/build_files/kernel-src-tree-ciqcbr7_9-CVE-2023-5717/Module.symvers
$ echo $?
0

Boot test: passed

boot-test.log

Kselftests: passed (relative)

Reference

kselftests–ciqcbr7_9–run1.log
kselftests–ciqcbr7_9–run2.log
kselftests–ciqcbr7_9–run3.log

Patch

kselftests–ciqcbr7_9-CVE-2023-5717–run1.log
kselftests–ciqcbr7_9-CVE-2023-5717–run2.log
kselftests–ciqcbr7_9-CVE-2023-5717–run3.log

Comparison

The results were compared manually with Meld. No differences indicative of a problem introduced by the patch were found.

Specific tests: passed

While it does not strictly test the provided patch, a basic sanity check of the perf events subsystem was performed to confirm that it remains functional.

Reference

$ uname -r 
3.10.0-ciqcbr7_9
$ sudo perf stat -B dd if=/dev/zero of=/dev/null count=1000000

1000000+0 records in
1000000+0 records out
512000000 bytes (512 MB) copied, 1.36361 s, 375 MB/s

 Performance counter stats for 'dd if=/dev/zero of=/dev/null count=1000000':

          1,366.22 msec task-clock                #    0.995 CPUs utilized          
                 3      context-switches          #    0.002 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
               217      page-faults               #    0.159 K/sec                  
     5,856,326,009      cycles                    #    4.287 GHz                    
     2,520,750,706      instructions              #    0.43  insn per cycle         
       544,415,208      branches                  #  398.483 M/sec                  
        11,042,875      branch-misses             #    2.03% of all branches        

       1.372717065 seconds time elapsed

       0.602402000 seconds user
       0.770071000 seconds sys

Patch

$ uname -r 
3.10.0-ciqcbr7_9-CVE-2023-5717
$ sudo perf stat -B dd if=/dev/zero of=/dev/null count=1000000

1000000+0 records in
1000000+0 records out
512000000 bytes (512 MB) copied, 1.39469 s, 367 MB/s

 Performance counter stats for 'dd if=/dev/zero of=/dev/null count=1000000':

          1,396.45 msec task-clock                #    0.995 CPUs utilized          
                 5      context-switches          #    0.004 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
               218      page-faults               #    0.156 K/sec                  
     5,900,275,843      cycles                    #    4.225 GHz                    
     2,520,173,133      instructions              #    0.43  insn per cycle         
       544,174,329      branches                  #  389.684 M/sec                  
        11,045,660      branch-misses             #    2.03% of all branches        

       1.404119544 seconds time elapsed

       0.669277000 seconds user
       0.734597000 seconds sys

jira VULN-7623
cve CVE-2023-5717
commit-author Peter Zijlstra <peterz@infradead.org>
commit 32671e3
upstream-diff The mainline fix 32671e3
  adds a new `group_generation' field to the `perf_event' struct. This
  breaks CBR 7.9 kABI. The new field was preserved, but moved to the end
  of the struct and wrapped in the `RH_KABI_EXTEND' macro. It can be
  assumed the kABI in this particular case is preserved based on the fact
  that there are already plenty of `RH_KABI_EXTEND(...)' fields at the end
  which could not have been added if the premise was false.

Because group consistency is non-atomic between parent (filedesc) and children
(inherited) events, it is possible for PERF_FORMAT_GROUP read() to try and sum
non-matching counter groups -- with non-sensical results.

Add group_generation to distinguish the case where a parent group removes and
adds an event and thus has the same number, but a different configuration of
events as inherited groups.

This became a problem when commit fa8c269 ("perf/core: Invert
perf_read_group() loops") flipped the order of child_list and sibling_list.
Previously it would iterate the group (sibling_list) first, and for each
sibling traverse the child_list. In this order, only the group composition of
the parent is relevant. By flipping the order the group composition of the
child (inherited) events becomes an issue and the mis-match in group
composition becomes evident.

That said; even prior to this commit, while reading of a group that is not
equally inherited was not broken, it still made no sense.

(Ab)use ECHILD as error return to indicate issues with child process group
composition.

Fixes: fa8c269 ("perf/core: Invert perf_read_group() loops")
	Reported-by: Budimir Markovic <markovicbudimir@gmail.com>
	Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20231018115654.GK33217@noisy.programming.kicks-ass.net
(cherry picked from commit 32671e3)
	Signed-off-by: Marcin Wcisło <marcin.wcislo@conclusive.pl>
jira VULN-7623
cve-bf CVE-2023-5717
commit-author Peter Zijlstra <peterz@infradead.org>
commit a71ef31

Smatch is awesome.

Fixes: 32671e3 ("perf: Disallow mis-matched inherited group reads")
	Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
	Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
	Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit a71ef31)
	Signed-off-by: Marcin Wcisło <marcin.wcislo@conclusive.pl>
@bmastbergen bmastbergen left a comment

🥌


@thefossguy-ciq thefossguy-ciq left a comment

🚤

@PlaidCat PlaidCat merged commit 83b75d7 into ctrliq:ciqcbr7_9 Aug 13, 2025
1 of 2 checks passed
github-actions bot pushed a commit that referenced this pull request Sep 29, 2025
Migration may be raced with fallocating hole.  remove_inode_single_folio
will unmap the folio if the folio is still mapped.  However, it's called
without folio lock.  If the folio is migrated and the mapped pte has been
converted to migration entry, folio_mapped() returns false, and won't
unmap it.  Due to extra refcount held by remove_inode_single_folio,
migration fails, restores migration entry to normal pte, and the folio is
mapped again.  As a result, we triggered BUG in filemap_unaccount_folio.

The log is as follows:
 BUG: Bad page cache in process hugetlb  pfn:156c00
 page: refcount:515 mapcount:0 mapping:0000000099fef6e1 index:0x0 pfn:0x156c00
 head: order:9 mapcount:1 entire_mapcount:1 nr_pages_mapped:0 pincount:0
 aops:hugetlbfs_aops ino:dcc dentry name(?):"my_hugepage_file"
 flags: 0x17ffffc00000c1(locked|waiters|head|node=0|zone=2|lastcpupid=0x1fffff)
 page_type: f4(hugetlb)
 page dumped because: still mapped when deleted
 CPU: 1 UID: 0 PID: 395 Comm: hugetlb Not tainted 6.17.0-rc5-00044-g7aac71907bde-dirty #484 NONE
 Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
 Call Trace:
  <TASK>
  dump_stack_lvl+0x4f/0x70
  filemap_unaccount_folio+0xc4/0x1c0
  __filemap_remove_folio+0x38/0x1c0
  filemap_remove_folio+0x41/0xd0
  remove_inode_hugepages+0x142/0x250
  hugetlbfs_fallocate+0x471/0x5a0
  vfs_fallocate+0x149/0x380

Hold folio lock before checking if the folio is mapped to avold race with
migration.

Link: https://lkml.kernel.org/r/20250912074139.3575005-1-tujinjiang@huawei.com
Fixes: 4aae8d1 ("mm/hugetlbfs: unmap pages if page fault raced with hole punch")
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
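The race pattern fixed by the commit above (a check-then-act sequence performed without the lock that serializes changes to the checked state) reduces to a simple user-space analogue; this is a drastically simplified sketch with made-up names, not the hugetlbfs code.

```c
#include <pthread.h>

struct toy_folio {
	pthread_mutex_t lock;
	int mapped;
};

/* Post-fix ordering: take the lock *before* testing the mapped state,
 * so the check and the removal cannot interleave with a concurrent
 * "migration" that restores the mapping. */
static int toy_remove_if_unmapped(struct toy_folio *f)
{
	int removed = 0;

	pthread_mutex_lock(&f->lock);
	if (!f->mapped)
		removed = 1; /* nothing can remap while we hold the lock */
	pthread_mutex_unlock(&f->lock);
	return removed;
}
```

The pre-fix code performed the equivalent of the `!f->mapped` test before acquiring the lock, leaving a window in which the answer could change.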
github-actions bot pushed a commit that referenced this pull request Oct 2, 2025
commit 7b73876 upstream.

Migration may be raced with fallocating hole.  remove_inode_single_folio
will unmap the folio if the folio is still mapped.  However, it's called
without folio lock.  If the folio is migrated and the mapped pte has been
converted to migration entry, folio_mapped() returns false, and won't
unmap it.  Due to extra refcount held by remove_inode_single_folio,
migration fails, restores migration entry to normal pte, and the folio is
mapped again.  As a result, we triggered BUG in filemap_unaccount_folio.

The log is as follows:
 BUG: Bad page cache in process hugetlb  pfn:156c00
 page: refcount:515 mapcount:0 mapping:0000000099fef6e1 index:0x0 pfn:0x156c00
 head: order:9 mapcount:1 entire_mapcount:1 nr_pages_mapped:0 pincount:0
 aops:hugetlbfs_aops ino:dcc dentry name(?):"my_hugepage_file"
 flags: 0x17ffffc00000c1(locked|waiters|head|node=0|zone=2|lastcpupid=0x1fffff)
 page_type: f4(hugetlb)
 page dumped because: still mapped when deleted
 CPU: 1 UID: 0 PID: 395 Comm: hugetlb Not tainted 6.17.0-rc5-00044-g7aac71907bde-dirty #484 NONE
 Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
 Call Trace:
  <TASK>
  dump_stack_lvl+0x4f/0x70
  filemap_unaccount_folio+0xc4/0x1c0
  __filemap_remove_folio+0x38/0x1c0
  filemap_remove_folio+0x41/0xd0
  remove_inode_hugepages+0x142/0x250
  hugetlbfs_fallocate+0x471/0x5a0
  vfs_fallocate+0x149/0x380

Hold folio lock before checking if the folio is mapped to avold race with
migration.

Link: https://lkml.kernel.org/r/20250912074139.3575005-1-tujinjiang@huawei.com
Fixes: 4aae8d1 ("mm/hugetlbfs: unmap pages if page fault raced with hole punch")
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>