GitHub - soakley3/patch-list

# patch-list

[2025:]

    - https://issues.redhat.com/browse/RHEL-132449

JIRA: https://issues.redhat.com/browse/RHEL-132449
commit 06dac2f467fe9269a433aa5056dd2ee1d20475e9
Author: Charan Teja Reddy <charante@codeaurora.org>
Date:   Tue May 4 18:36:51 2021 -0700

mm: compaction: update the COMPACT[STALL|FAIL] events properly

By definition, COMPACT[STALL|FAIL] events need to be counted when
'at least in one zone compaction wasn't deferred or skipped from the
direct compaction'.  When compaction is skipped or deferred,
COMPACT_SKIPPED is returned, but these compaction events are still
updated, which is wrong: COMPACT[STALL|FAIL] is counted without
compaction even being tried.

Correct this by skipping the counting of these events when
COMPACT_SKIPPED is returned for compaction.  This also indirectly avoids
the unnecessary call into get_page_from_freelist() when compaction is
not even tried.

There is a corner case where compaction is skipped but the COMPACTSTALL
event is still counted: an IRQ came and freed the page, and that page
was captured in capture_control.

Link: https://lkml.kernel.org/r/1613151184-21213-1-git-send-email-charante@codeaurora.org
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


Signed-off-by: Lucas Oakley <soakley@redhat.com>
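
As a rough illustration of the accounting rule described in this commit
message, the following is a simplified userspace model, not the kernel
code. The counter struct, the function name account_direct_compaction()
and its boolean parameters are hypothetical; only the event names and the
COMPACT_* result names come from the kernel. The sketch assumes that a
page captured through capture_control is treated as a successful attempt
even when compaction itself was skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Subset of the kernel's compaction results, enough for this toy model. */
enum compact_result {
    COMPACT_SKIPPED,
    COMPACT_COMPLETE,
    COMPACT_SUCCESS,
};

/* Hypothetical stand-in for the vmstat event counters. */
struct vm_counters {
    unsigned long compactstall;
    unsigned long compactfail;
    unsigned long compactsuccess;
};

/*
 * Account one direct-compaction attempt.
 * Returns true when the caller ends up with a page.
 */
static bool account_direct_compaction(enum compact_result result,
                                      bool page_captured,
                                      bool freelist_has_page,
                                      struct vm_counters *c)
{
    assert(c != NULL);

    /* Corner case: a captured page counts as a real attempt (assumption). */
    if (page_captured)
        result = COMPACT_SUCCESS;

    /* Compaction was never tried: count nothing, skip the freelist retry. */
    if (result == COMPACT_SKIPPED)
        return false;

    /* At least one zone was actually compacted: this is a stall. */
    c->compactstall++;

    if (page_captured || freelist_has_page) {
        c->compactsuccess++;
        return true;
    }

    c->compactfail++;
    return false;
}
```

In this model a skipped attempt leaves all three counters untouched,
while a skipped attempt that still captured a page bumps COMPACTSTALL
and COMPACTSUCCESS.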
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 


    - https://issues.redhat.com/browse/RHEL-46006

JIRA: https://issues.redhat.com/browse/RHEL-46006


Upstream Status: RHEL only
Tested: s390x reproducer no longer panics with this patch. No other
        issues seen on x86_64, ppc64le, or aarch64

Downstream patch 653ae766 ("mm/uffd: always wr-protect pte in
pte|pmd_mkuffd_wp()") partially reverts 8e95bedaa1a ("mm: Fix
CVE-2022-2590 by reverting "mm/shmem: unconditionally set pte dirty in
mfill_atomic_install_pte"") by removing the following from the routine
mfill_atomic_install_pte():

-       if (writable || !page_in_cache)
-               _dst_pte = pte_mkdirty(_dst_pte);

However, 8e95bedaa1a also removed the call to pte_mkdirty() earlier
in mfill_atomic_install_pte():

        _dst_pte = mk_pte(page, dst_vma->vm_page_prot);
-       _dst_pte = pte_mkdirty(_dst_pte);

653ae766 did not restore the call to pte_mkdirty(), leading
to unexpected exceptions in s390x kvm guests, like the following:

[2835953.436969] Low-address protection: 0004 ilc:3 [#1] SMP
[2835953.436978] Modules linked in: ...
[2835953.437045] CPU: 2 PID: 1632 Comm: .... Not tainted 5.14.0-570.37.1.el9_6.s390x #1
[2835953.437048] Hardware name: IBM 3931 LA1 400 (KVM/Linux)
[2835953.437049] User PSW : 0705200180000000 000002aa39538b2a
[2835953.437051]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:2 PM:0 RI:0 EA:3
[2835953.437052] User GPRS: 00000000002b45f1 000002aa00000000 000002aa39fd0ff0 ffffffff00120123
[2835953.437054]            000003ff94802200 000003ff94802200 000002aa39fd2280 000003ff94801840
[2835953.437055]            ffffffff00000002 000002aa39fd0fd0 0000000000000000 0000000000000001
[2835953.437056]            000003ff963aef98 000003ffeb279d7f 000003ff947f63d0 000003ff947f62d8
[2835953.437063] User Code: 000002aa39538b1c: 55106004          cl      %r1,4(%r6)
                            000002aa39538b20: a7a40403          brc     10,000002aa39539326
                           #000002aa39538b24: ecb1000100d8      ahik    %r11,%r1,1
                           >000002aa39538b2a: eb1b62440114      csy     %r1,%r11,4676(%r6)
                            000002aa39538b30: a774fff3          brc     7,000002aa39538b16
                            000002aa39538b34: 58109008          l       %r1,8(%r9)
                            000002aa39538b38: eca137b70055      risbg   %r10,%r1,55,183,0
                            000002aa39538b3e: a774fe75          brc     7,000002aa39538828
[2835953.437071] Last Breaking-Event-Address:
[2835953.437071]  [<000002aa39538810>] ....[2aa39380000+2c2000]
[2835953.437081] Kernel panic - not syncing: Fatal exception: panic_on_oops

Signed-off-by: Lucas Oakley <soakley@redhat.com>
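
The ordering issue described above can be pictured with a toy model.
This is only a sketch: the flag bits, the toy_* names and the
toy_install_pte() helper are hypothetical and do not match any real
architecture's PTE layout or the RHEL patch itself. It assumes the fix
amounts to marking the entry dirty unconditionally right after it is
built, before any write permission is applied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits; real PTE layouts are architecture-specific. */
#define TOY_PTE_PRESENT (1u << 0)
#define TOY_PTE_WRITE   (1u << 1)
#define TOY_PTE_DIRTY   (1u << 2)

typedef uint32_t toy_pte_t;

/* Toy counterparts of mk_pte(), pte_mkdirty() and pte_mkwrite(). */
static toy_pte_t toy_mk_pte(void) { return TOY_PTE_PRESENT; }
static toy_pte_t toy_pte_mkdirty(toy_pte_t pte) { return pte | TOY_PTE_DIRTY; }
static toy_pte_t toy_pte_mkwrite(toy_pte_t pte) { return pte | TOY_PTE_WRITE; }

/*
 * Toy install step: build the entry, mark it dirty unconditionally,
 * then make it writable only when requested.
 */
static toy_pte_t toy_install_pte(bool writable)
{
    toy_pte_t pte = toy_mk_pte();

    /* Models the early pte_mkdirty() call that was not restored. */
    pte = toy_pte_mkdirty(pte);

    if (writable)
        pte = toy_pte_mkwrite(pte);

    /* Every installed entry is dirty, writable or not. */
    assert(pte & TOY_PTE_DIRTY);
    return pte;
}
```

With the unconditional call in place, a read-only entry in this model is
still dirty, which is the state the conditional pte_mkdirty() alone did
not guarantee.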
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 


https://issues.redhat.com/browse/RHEL-72726
https://lists.crash-utility.osci.io/archives/list/devel@lists.crash-utility.osci.io/thread/HCLRRZAS6WJALKWLCJ4H25NGQ7TJHGJ4/

This simplification fixes the total CPU count being reported
incorrectly on ppc64le and s390x systems when some CPUs have been
offlined, because the kt->cpus value is adjusted there.
It also adds the word "OFFLINE" to the 'sys' output for s390x
and ppc64le, as already exists for x86_64 and aarch64 when examining
systems with offlined CPUs.

Without patch:

  KERNEL: /debug/4.18.0-477.10.1.el8_8.s390x/vmlinux
DUMPFILE: /proc/kcore
    CPUS: 1

With patch:

  KERNEL: /debug/4.18.0-477.10.1.el8_8.s390x/vmlinux
DUMPFILE: /proc/kcore
    CPUS: 2 [OFFLINE: 1]

Signed-off-by: Lucas Oakley <soakley@redhat.com>
---
 kernel.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/kernel.c b/kernel.c
index 8c2e0ca..3e190f1 100644
--- a/kernel.c
+++ b/kernel.c
@@ -5816,15 +5816,13 @@ display_sys_stats(void)
 				pc->kvmdump_mapfile);
 	}
 	
-	if (machine_type("PPC64"))
-		fprintf(fp, "        CPUS: %d\n", get_cpus_to_display());
-	else {
-		fprintf(fp, "        CPUS: %d", kt->cpus);
-		if (kt->cpus - get_cpus_to_display())
-			fprintf(fp, " [OFFLINE: %d]", 
-				kt->cpus - get_cpus_to_display());
-		fprintf(fp, "\n");
-	}
+        int number_cpus_to_display = get_cpus_to_display();
+        int number_cpus_present = get_cpus_present();
+        fprintf(fp, "        CPUS: %d", number_cpus_present);
+        if (number_cpus_present != number_cpus_to_display)
+                fprintf(fp, " [OFFLINE: %d]",
+                    number_cpus_present - number_cpus_to_display);
+        fprintf(fp, "\n");
 
 	if (ACTIVE())
 		get_xtime(&kt->date);
-- 
2.47.1
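
The display logic in the patch above can be tried in isolation with a
small standalone sketch. The function name format_cpus_line() and its
buffer-based interface are assumptions made for illustration; the two
integer parameters stand in for the values returned by
get_cpus_present() and get_cpus_to_display() in crash.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format the CPUS line of the 'sys' output into buf.
 * Returns what snprintf() returns: the length of the formatted line,
 * not counting the terminating NUL.
 */
static int format_cpus_line(char *buf, size_t len, int present, int to_display)
{
    assert(buf != NULL && len > 0);

    /* Only mention offline CPUs when the two counts differ. */
    if (present != to_display)
        return snprintf(buf, len, "        CPUS: %d [OFFLINE: %d]\n",
                        present, present - to_display);

    return snprintf(buf, len, "        CPUS: %d\n", present);
}
```

Calling it with present = 2 and to_display = 1 produces the
"CPUS: 2 [OFFLINE: 1]" form shown in the "With patch" output.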

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 



[2024:]

https://issues.redhat.com/browse/RHEL-71349
https://lists.crash-utility.osci.io/archives/list/devel@lists.crash-utility.osci.io/thread/FTHEOSEHYHGBW2LU4RFTS7UBKD2ICMWS/

Change check_stack_overflow() to check whether the thread_info's cpu
member is smaller than the number of present CPUs, rather than the
kernel table's cpu count (kt->cpus). On some architectures kt->cpus
is changed to reflect the highest numbered online cpu + 1. This can
cause a false positive in check_stack_overflow() if the cpu member of
a parked task's thread_info structure, assigned to an offlined cpu, is
greater than or equal to kt->cpus but lower than the number of existing
logical cpus. An example of this is RHEL 7 on s390x or RHEL 8 on
ppc64le when the highest numbered CPU is offlined.

Signed-off-by: Lucas Oakley <soakley@redhat.com>
---
 task.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/task.c b/task.c
index 33de7da..93dab0e 100644
--- a/task.c
+++ b/task.c
@@ -11253,12 +11253,12 @@ check_stack_overflow(void)
 				cpu = 0;
 				break;
 			}
-			if (cpu >= kt->cpus) {
+			if (cpu >= get_cpus_present()) {
 				if (!overflow)
 					print_task_header(fp, tc, 0);
 				fprintf(fp, 
 				    "  possible stack overflow: thread_info.cpu: %d >= %d\n",
-					cpu, kt->cpus);
+					cpu, get_cpus_present());
 				overflow++; total++;
 			}
 		}
-- 
2.47.1
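
The false positive fixed by this patch can be reproduced with a small
standalone sketch. The helper names highest_online_plus_one() and
cpu_looks_corrupt() are hypothetical; the first models how kt->cpus ends
up as "highest numbered online cpu + 1" on the affected architectures,
the second models the comparison made in check_stack_overflow().

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the adjusted kt->cpus: index of the highest online CPU, plus one. */
static int highest_online_plus_one(const bool *online, int present)
{
    int limit = 0;

    for (int i = 0; i < present; i++)
        if (online[i])
            limit = i + 1;
    return limit;
}

/* True when thread_info.cpu would be reported as a possible stack overflow. */
static bool cpu_looks_corrupt(int cpu, int cpu_limit)
{
    assert(cpu_limit > 0);
    return cpu >= cpu_limit;
}
```

Take four present CPUs with CPU 3 offlined: the adjusted limit is 3, so
a parked task whose thread_info.cpu is 3 is flagged when compared
against that limit, but not when compared against the present count
of 4.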