clang 15 built kernel crashes w. "BUG: kernel NULL pointer dereference, address: 00000000", gcc 12 built kernel with same config boots fine (6.1-rc7, x86_32) #1766

Closed
ernsteiswuerfel opened this issue Nov 30, 2022 · 17 comments
Labels
[ARCH] x86 This bug impacts ARCH=i386 [BUG] llvm A bug that should be fixed in upstream LLVM [FIXED][LLVM] 15 This bug was fixed in LLVM 15.x [FIXED][LLVM] 16 This bug was fixed in LLVM 16.0

Comments

@ernsteiswuerfel

This is an interesting one!

I gave 6.1-rc7 a test ride on ye goode olde Pentium 4 box and noticed that, while the kernel boots just fine when built with the gcc 12 toolchain, it crashes at boot when built with the clang 15 toolchain, using the same kernel .config.

This is reproducible and happens every time at boot on this machine:

[...]
BUG: kernel NULL pointer dereference, address: 00000000
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
*pde = 00000000 
Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
CPU: 1 PID: 1 Comm: init Not tainted 6.1.0-rc7-P4 #3
Hardware name:  /FS51, BIOS 6.00 PG 12/02/2003
EIP: mast_split_data+0x198/0x260
Code: 84 e3 00 00 00 89 fa c7 45 ec 00 00 00 00 31 db 81 e2 00 ff ff ff 0f b6 f9 8b 4d ec 25 00 ff ff ff 8a 6d f3 09 da d3 e7 09 d7 <89> 38 8b 7e 10 8b 46 08 8b 50 0c 8b 7f 0c fe c5 8b 46 04 8b 70 0c
EAX: 00000000 EBX: 00000006 ECX: 00000003 EDX: c123bd06
ESI: c11ffbf0 EDI: c123bd06 EBP: c11ff94c ESP: c11ff934
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010286
CR0: 80050033 CR2: 00000000 CR3: 0276a000 CR4: 000006d0
Call Trace:
 mas_wr_modify+0xc76/0x18c0
 mas_wr_store_entry+0x235/0x2b0
 mas_store_prealloc+0xb8/0x100
 vma_mas_store+0x57/0xd0
 __vma_adjust+0x3f0/0x5b0
 ? rcu_read_lock_sched_held+0xa/0x70
 __split_vma+0xc3/0x120
 do_mas_align_munmap+0x1c8/0x460
 mmap_region+0x260/0x8a0
 ? rcu_read_lock_sched_held+0xa/0x70
 ? arch_get_unmapped_area_topdown+0x12/0x20
 do_mmap+0x33d/0x4b0
 ? prep_transhuge_page+0x20/0x20
 vm_mmap_pgoff+0x7f/0x100
 ksys_mmap_pgoff+0x129/0x170
 __ia32_sys_mmap_pgoff+0x1c/0x30
 do_int80_syscall_32+0x53/0x80
 entry_INT80_32+0xf0/0xf0
EIP: 0xb7f19fad
Code: 00 f7 d8 89 82 38 0a 00 00 b8 ff ff ff ff c3 66 90 66 90 66 90 66 90 66 90 66 90 66 90 53 57 55 8b 1f 8b 6f 08 8b 7f 04 cd 80 <5d> 5f 5b c3 e8 cf 01 00 00 81 c1 16 e0 00 00 8b 44 24 04 3d 85 00
EAX: ffffffda EBX: b787a000 ECX: 00004000 EDX: 00000005
ESI: 00000812 EDI: 00000003 EBP: 00000002 ESP: bfb43eb0
DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 007b EFLAGS: 00000202
Modules linked in:
CR2: 0000000000000000
---[ end trace 0000000000000000 ]---
EIP: mast_split_data+0x198/0x260
Code: 84 e3 00 00 00 89 fa c7 45 ec 00 00 00 00 31 db 81 e2 00 ff ff ff 0f b6 f9 8b 4d ec 25 00 ff ff ff 8a 6d f3 09 da d3 e7 09 d7 <89> 38 8b 7e 10 8b 46 08 8b 50 0c 8b 7f 0c fe c5 8b 46 04 8b 70 0c
EAX: 00000000 EBX: 00000006 ECX: 00000003 EDX: c123bd06
ESI: c11ffbf0 EDI: c123bd06 EBP: c11ff94c ESP: c11ff934
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010286
CR0: 80050033 CR2: 00000000 CR3: 0276a000 CR4: 000006d0
Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
Kernel Offset: disabled
Rebooting in 40 seconds..

Some data about the machine:

 # inxi -bZ
System:
  Kernel: 6.1.0-rc7-P4 arch: i686 bits: 32 Console: pty pts/0
    Distro: Gentoo Base System release 2.9
Machine:
  Type: Desktop Mobo: Shuttle model: FS51 serial: N/A BIOS: Phoenix
    v: 6.00 PG date: 12/02/2003
CPU:
  Info: single core Intel Pentium 4 [MT] speed (MHz): avg: 3063
Graphics:
  Device-1: AMD RV350 [Radeon 9550/9600/X1050 Series] driver: radeon
    v: kernel
  Display: x11 server: X.Org v: 21.1.1 driver: X: loaded: radeon
    unloaded: fbdev,modesetting gpu: radeon resolution: 1400x900~60Hz
  OpenGL: renderer: llvmpipe (LLVM 14.0.6 128 bits) v: 4.5 Mesa 22.1.7
Network:
  Device-1: Ralink RT2500 Wireless 802.11bg driver: rt2500pci
  Device-2: Realtek RTL-8100/8101L/8139 PCI Fast Ethernet Adapter
    driver: 8139too

If you think it would be a good idea, I could mail a bug report to linux-mm too.
dmesg_61-rc7_p4_clang.txt
dmesg_61-rc7_p4_gcc.txt
config_61-rc7_p4-clang.txt
config_61-rc7_p4-gcc.txt

@nathanchance
Member

Based on the stack trace, it seems like the maple tree is involved. Liam Howlett, the maple tree maintainer, has been pretty responsive to bug reports from what I can tell:

$ scripts/get_maintainer.pl lib/maple_tree.c
"Liam R. Howlett" <Liam.Howlett@oracle.com> (supporter:MAPLE TREE)
linux-mm@kvack.org (open list:MAPLE TREE)
linux-kernel@vger.kernel.org (open list)

It would be interesting to see if this is reproducible in a virtual machine, which would make debugging it simpler.

> while the kernel boots just fine when built with gcc 12

We have occasionally had issues that turned out to be kernel bugs due to UB or other subtleties that only show up with clang.

@nathanchance nathanchance added the [ARCH] x86 This bug impacts ARCH=i386 label Dec 1, 2022
@nickdesaulniers nickdesaulniers added the [BUG] Untriaged Something isn't working label Dec 1, 2022
@howlett

howlett commented Dec 8, 2022

I've spent the last few days recreating this bug and finally arrived at the conclusion that it is a clang-15 bug.

Some necessary background on the call stack from the crash: we are holding the mmap_lock, which means this write is the only operation occurring in this mm_struct. The debug output prints the mm_struct pointer to show that it is indeed the same task printing both messages to the console.

I made the following changes (among other debug outputs, so the lines won't match but the logic is sound):

@@ -3415,6 +3423,7 @@ static inline bool mas_push_data(struct ma_state *mas, int height,
        if (slot_total >= space)
                return false;
 
+       printk("%s\n", __func__);
        /* Get the data; Fill mast->bn */
        mast->bn->b_end++;
        if (left) {
@@ -3448,6 +3457,7 @@ static inline bool mas_push_data(struct ma_state *mas, int height,
        mast_split_data(mast, mas, split);
        mast_fill_bnode(mast, mas, 2);
        mas_split_final_node(mast, mas, height + 1);
+       printk("%p Return true\n", current->mm);
        return true;
 }
 
@@ -3522,9 +3532,14 @@ static int mas_split(struct ma_state *mas, struct maple_big_node *b_node)
                        break;
 
                /* Try to push right. */
-               if (mas_push_data(mas, height, &mast, false))
+               if (mas_push_data(mas, height, &mast, false)) {
+                       printk("%p Break\n", current->mm);
                        break;
+               }
 
+               printk("%p %d mas node %p type %u b_end %u\n", current->mm,
+                      __LINE__, mas_mn(mas), mte_node_type(mas->node),
+                      b_node->b_end);
                split = mab_calc_split(mas, b_node, &mid_split, prev_l_mas.min);
                mast_split_data(&mast, mas, split);
                /*

And received the following output:

[    1.969961] mas_push_data
[    1.970155] c1239a80 Return true
[    1.970370] c1239a80 3542 mas node c123f400 type 3 b_end 1

Followed by the reported crash.

So somehow the boolean return of 'true' is not treated as true, which results in a crash very similar to the one initially reported.

I also recreated the situation in my userspace test code and it seems to work there, so I'm not sure what else is at play to cause the logic failure.

Recreating it required clang-15, v6.1-rc7, and the provided config, although I did modify the config so that qemu/kvm reproduces the issue; I've attached the modified config here as well. Rebuilding with clang-14 or gcc allows the machine to boot smoothly.

It is worth noting that I used Debian for ease of testing.

config_61-rc7_p4-clang.txt

@howlett

howlett commented Dec 8, 2022

I attached the clang-14 config before. Here is the clang-15 config that causes the failure.

Take a look at the config option CONFIG_ZERO_CALL_USED_REGS, as that seems to be the one that matters.

It seems to enable -fzero-call-used-regs=used-gpr.

config_61-rc7-clang-15.txt

@nickdesaulniers
Member

nickdesaulniers commented Dec 8, 2022

Thanks for looking into this closer, @howlett.

> So somehow the boolean return of 'true' is not treated as true which results in a very similar crash as initially reported.

Makes sense that this could be CONFIG_ZERO_CALL_USED_REGS, since a return value of true/1 might be accidentally zeroed out (returning false/0).

This smells like llvm/llvm-project#57692.

I noticed in the attached config that:
CONFIG_DEBUG_INFO=y
is set. The fix for llvm/llvm-project#57692 mentioned that -fzero-call-used-regs=used-gpr with -g could cause bad codegen (llvm/llvm-project-release-prs@d4bada9). It looks like that fix shipped in the clang 15.0.6 release.

CONFIG_CC_VERSION_TEXT="Debian clang version 15.0.6" is set in the config though...so maybe there's more than one bug in -fzero-call-used-regs? (llvm/llvm-project#59242)

@howlett can you confirm whether your build of clang 15.0.6 contains d4bada99c069e2edbee2f4c815598476e7508f0b? Is it possible that the version of clang 15 was incremented to 15.0.6 before d4bada99c069e2edbee2f4c815598476e7508f0b landed?

cc @bwendling @isanbard

@nickdesaulniers nickdesaulniers added [BUG] llvm A bug that should be fixed in upstream LLVM and removed [BUG] Untriaged Something isn't working labels Dec 8, 2022
@howlett

howlett commented Dec 8, 2022

Confirmed, testing was done with 15.0.6

$ clang-15 --version
Debian clang version 15.0.6
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

Build command for the kernel:
KCONFIG_CONFIG=~/configs/config_61-rc7_p4-clang.txt make CC=clang-15 ARCH=i386 O=.build-i386 -j7 bzImage

@nickdesaulniers
Member

I can see the bug in the disassembly:

    d9c9: 83 c4 6c                      addl    $0x6c, %esp
    d9cc: 5e                            popl    %esi
    d9cd: 5f                            popl    %edi
    d9ce: 5b                            popl    %ebx
    d9cf: 5d                            popl    %ebp
    d9d0: 31 c0                         xorl    %eax, %eax
    d9d2: 31 c9                         xorl    %ecx, %ecx
    d9d4: 31 d2                         xorl    %edx, %edx
    d9d6: c3                            retl
...
    ddf5: b0 01                         movb    $0x1, %al
    ddf7: e9 cd fb ff ff                jmp     0xd9c9 <mas_push_data+0x2b9>

So ddf5 stores 1 in the low byte of %eax (%al), and ddf7 then jumps to the function epilogue, which zeroes the whole %eax register, so the caller sees false. (I wonder if there's a bug specific to the unconventional calling convention, -mregparm=3, that the kernel uses.)
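
To make the effect of that instruction sequence concrete, here is a minimal userspace sketch. It is a toy model only, not kernel or LLVM code: `struct regs`, `set_al`, and the `epilogue_*` and `return_true_*` helpers are hypothetical names invented for illustration, and the starting register values are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the three caller-saved GPRs on i386 (hypothetical type). */
struct regs {
    uint32_t eax, ecx, edx;
};

/* Models `movb $0x1, %al`: only the low byte of %eax is written. */
void set_al(struct regs *r, uint8_t v)
{
    r->eax = (r->eax & 0xffffff00u) | v;
}

/* Epilogue as seen in the disassembly: %eax is zeroed although it
 * carries the return value. The caller then tests %al. */
bool epilogue_buggy(struct regs *r)
{
    r->eax = 0;
    r->ecx = 0;
    r->edx = 0;
    return (r->eax & 0xffu) != 0;
}

/* Epilogue one would expect: scratch registers are zeroed, the
 * register holding the return value is left alone. */
bool epilogue_expected(struct regs *r)
{
    r->ecx = 0;
    r->edx = 0;
    return (r->eax & 0xffu) != 0;
}

/* "return true" routed through the buggy epilogue. */
bool return_true_buggy(void)
{
    struct regs r = { 0xdeadbe00u, 3, 5 };  /* arbitrary stale contents */
    set_al(&r, 1);
    return epilogue_buggy(&r);
}

/* "return true" routed through the expected epilogue. */
bool return_true_expected(void)
{
    struct regs r = { 0xdeadbe00u, 3, 5 };
    set_al(&r, 1);
    return epilogue_expected(&r);
}
```

In this model `return_true_buggy()` yields false and `return_true_expected()` yields true, which matches the symptom in the debug output above: mas_push_data() prints "Return true" but the caller never takes the "Break" branch.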

Simply having a function return 1 and testing that with -m32 -fzero-call-used-regs=used-gpr -mregparm=3 -g wasn't enough to reproduce the issue though. Let's see if we can creduce this...

@nickdesaulniers
Member

Just doing a command line reduction, I needed at least -m32 -fno-pic -march=i686 -O2 -fzero-call-used-regs=used-gpr to reproduce.

@nickdesaulniers
Member

Initial reduction:

// clang -m32 -fno-pic -march=i686 -O2 -fzero-call-used-regs=used-gpr -c maple_tree.i -S -o -
struct maple_arange_64 {
  long pivot[1]
};
enum maple_type { maple_arange_64 };
struct {
  struct {
    struct maple_arange_64 ma64
  };
} * mas_data_end___trans_tmp_2;
char mt_slots_0, mt_pivots_0, ma_meta_end_mn_0_0_0_0_0_0;
struct maple_big_node {
  char b_end
};
struct maple_subtree_state {
  struct maple_big_node *bn
} ma_is_leaf();
enum maple_type mas_data_end_type;
char mas_data_end() {
  char offset;
  if (mas_data_end_type)
    return ma_meta_end_mn_0_0_0_0_0_0;
  offset = mt_pivots_0 - 1;
  if (__builtin_expect(mas_data_end___trans_tmp_2->ma64.pivot[offset], 1))
    return offset;
  return mt_pivots_0;
}
void mas_mab_cp(char);
_Bool mas_prev_sibling();
_Bool mas_push_data(struct maple_subtree_state *mast) {
  unsigned char slot_total = mast->bn->b_end, end, space;
  if (mas_prev_sibling())
    end = mas_data_end();
  space = mt_slots_0;
  ma_is_leaf();
  if (slot_total >= space)
    return mast;
  mas_mab_cp(end);
}

Let me see if I can reduce this further, but the disassembly very clearly has:

	movb	$1, %al           <- store 1 to %eax
	addl	$24, %esp
	.cfi_def_cfa_offset 8
	popl	%ebx
	.cfi_def_cfa_offset 4
	xorl	%eax, %eax          <- store 0 to %eax
	xorl	%ecx, %ecx
	xorl	%edx, %edx
	retl

@bwendling

Yeah, that looks like grossness. I'll investigate.

@bwendling

I think I know what's going on. The code uses both %al and %ah. However, when identifying which registers we shouldn't clear, it catches %al, %ax, %eax, and %rax, but not %ah. (>_<) So it goes on to clear %ah, promoting it to 32 bits first, which wipes out all of %eax.
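
A rough sketch of that kind of mistake, written as a toy model in C. This is not LLVM's actual code: the `enum reg` names, the `widest` table, and the `may_zero_*` helpers are all hypothetical and only illustrate how a hand-written alias list can miss a high-byte sub-register, while a check based on the enclosing register cannot.

```c
#include <assert.h>
#include <stdbool.h>

/* A handful of x86 register names, including the high-byte ones. */
enum reg { AL, AH, AX, EAX, RAX, CL, CH, CX, ECX, RCX, NUM_REGS };

/* Widest register that each name is a part of. */
static const enum reg widest[NUM_REGS] = {
    [AL] = RAX, [AH] = RAX, [AX] = RAX, [EAX] = RAX, [RAX] = RAX,
    [CL] = RCX, [CH] = RCX, [CX] = RCX, [ECX] = RCX, [RCX] = RCX,
};

/* Buggy check: an explicit list of aliases that forgets %ah. */
bool aliases_return_reg_buggy(enum reg r)
{
    return r == AL || r == AX || r == EAX || r == RAX;
}

/* Safer check: two names alias if they share the widest register. */
bool aliases_return_reg_fixed(enum reg r)
{
    return widest[r] == widest[AL];
}

/* May the zeroing pass clear `used` while the return value lives in %al? */
bool may_zero_buggy(enum reg used)
{
    return !aliases_return_reg_buggy(used);
}

bool may_zero_fixed(enum reg used)
{
    return !aliases_return_reg_fixed(used);
}
```

With the buggy check, `may_zero_buggy(AH)` is true, so %ah would be widened and cleared, taking %al with it. `may_zero_fixed(AH)` is false, while unrelated registers such as %cl are still cleared.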

@bwendling

Should be fixed now.

@nickdesaulniers nickdesaulniers added the [FIXED][LLVM] 16 This bug was fixed in LLVM 16.0 label Dec 14, 2022
@nickdesaulniers
Member

nickdesaulniers commented Dec 14, 2022

This bug was bad enough that I think we should consider marking ZERO_CALL_USED_REGS broken with clang-15, unless we can get upstream LLVM to consider a clang 15.0.7 release for this.

llvm/llvm-project#59242 (comment)

@nathanchance
Member

We could preemptively do something like this, which accounts for 15.0.7 existing or not:

diff --git a/security/Kconfig.hardening b/security/Kconfig.hardening
index d766b7d0ffd1..ddf9b411a3dd 100644
--- a/security/Kconfig.hardening
+++ b/security/Kconfig.hardening
@@ -257,6 +257,8 @@ config INIT_ON_FREE_DEFAULT_ON

 config CC_HAS_ZERO_CALL_USED_REGS
        def_bool $(cc-option,-fzero-call-used-regs=used-gpr)
+       # https://github.com/ClangBuiltLinux/linux/issues/1766
+       depends on !CC_IS_CLANG || CLANG_VERSION > 150006

 config ZERO_CALL_USED_REGS
        bool "Enable register zeroing on function exit"
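
For readers unfamiliar with the version encoding in that condition: Kconfig's CLANG_VERSION packs major.minor.patch into one integer as major * 10000 + minor * 100 + patch, so 15.0.6 becomes 150006. The sketch below mirrors only the added `depends on` line in plain C. The function names are hypothetical, and the existing cc-option probe is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Same packing as Kconfig's CLANG_VERSION: 15.0.6 -> 150006. */
int clang_version(int major, int minor, int patch)
{
    return major * 10000 + minor * 100 + patch;
}

/* Mirrors `depends on !CC_IS_CLANG || CLANG_VERSION > 150006`.
 * Hypothetical helper; the real symbol also needs the cc-option test. */
bool zero_call_used_regs_allowed(bool cc_is_clang, int version)
{
    return !cc_is_clang || version > 150006;
}
```

Under this predicate clang 15.0.6 and older are excluded, while clang 15.0.7, clang 16, and any non-clang compiler pass. That is why the condition works whether or not a 15.0.7 release ends up existing.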

@nathanchance
Member

I have sent the above patch: https://lore.kernel.org/20221214232602.4118147-1-nathan@kernel.org/

@dileks dileks added the [FIXED][LLVM] 15 This bug was fixed in LLVM 15.x label Jan 21, 2023
@dileks
Collaborator

dileks commented Jan 22, 2023

Just for the record, and to confirm that this is FIXED with LLVM 15.0.7:

$ scripts/diffconfig ../configs/config-6.2.0-rc4-2-amd64-clang15-kcfi /boot/config-6.2.0-rc5-1-amd64-clang15-kcfi
 AS_VERSION 150003 -> 150007
 BUILD_SALT "6.2.0-rc4-2-amd64-clang15-kcfi" -> "6.2.0-rc5-1-amd64-clang15-kcfi"
 CC_VERSION_TEXT "dileks clang version 15.0.3 (https://github.com/samitolvanen/llvm-project.git c6b3afb1a1c0aa39676a043256ecc3639217227f)" -> "dileks clang version 15.0.7 (https://github.com/samitolvanen/llvm-project.git 2c432a3d6ea8004e6be7313411398bbc26289301)"
 CLANG_VERSION 150003 -> 150007
 INTEL_IDXD_PERFMON n -> y
 INTEL_SPEED_SELECT_INTERFACE n -> m
 LLD_VERSION 150003 -> 150007
+CC_HAS_ZERO_CALL_USED_REGS y
+ZERO_CALL_USED_REGS y

$ cat /proc/version 
Linux version 6.2.0-rc5-1-amd64-clang15-kcfi (sedat.dilek@gmail.com@iniza) (dileks clang version 15.0.7 (https://github.com/samitolvanen/llvm-project.git 2c432a3d6ea8004e6be7313411398bbc26289301), LLD 15.0.7) #1~unstable+dileks1 SMP PREEMPT_DYNAMIC 2023-01-22

NOTE: Toolchain: https://github.com/samitolvanen/llvm-project/commits/15.x/kcfi

config-6.2.0-rc5-1-amd64-clang15-kcfi.txt
