[SRSO] "missing return thunk" warning when booting on Intel or Zen1 machine #1911

Closed
nathanchance opened this issue Aug 9, 2023 · 36 comments
Labels
[BUG] linux: A bug that should be fixed in the mainline kernel.
[FIXED][LINUX] development cycle: This bug was only present and fixed in a -next or -rc cycle.
[TOOL] integrated-as: The issue is relevant to LLVM integrated assembler.

Comments

@nathanchance
Member

After commit fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation"), I see the following stack trace when booting on my two pieces of Intel hardware (either on bare metal or in a virtual machine):

$ make -skj"$(nproc)" ARCH=x86_64 CC=clang mrproper defconfig bzImage

$ boot-qemu.py -k .
...
[    0.086618] ------------[ cut here ]------------
[    0.086996] missing return thunk: __ret+0x5/0x7e-__ret+0x0/0x7e: e9 f6 ff ff ff
[    0.087005] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/alternative.c:753 apply_returns+0x2da/0x430
[    0.088328] Modules linked in:
[    0.088585] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.5.0-rc5-00056-gcacc6e22932f #1
[    0.089216] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.2-1-1 04/01/2014
[    0.089329] RIP: 0010:apply_returns+0x2da/0x430
[    0.089624] Code: ff ff 0f 0b e9 c8 fd ff ff c6 05 60 bd c2 01 01 48 c7 c7 ae 5a 68 bd 4c 89 ee 4c 89 e2 b9 05 00 00 00 4d 89 e8 e8 b6 4d 05 00 <0f> 0b e9 a0 fd ff ff 45 85 e4 0f 84 2e ff ff ff 48 c7 c7 6e 5a 68
[    0.090328] RSP: 0000:ffffffffbda03e20 EFLAGS: 00010246
[    0.090740] RAX: cb2b7f056bc62700 RBX: ffffffffbe319188 RCX: ffffffffbda53e80
[    0.091328] RDX: ffffffffbda03cd8 RSI: 00000000ffffdfff RDI: ffffffffbda84110
[    0.091891] RBP: ffffffffbda03ef8 R08: 0000000000001fff R09: ffffffffbda54110
[    0.092328] R10: 0000000000005ffd R11: 0000000000000004 R12: ffffffffbcf60040
[    0.093328] R13: ffffffffbcf60045 R14: ffffffffbe319180 R15: ffffffffbda03e38
[    0.093896] FS:  0000000000000000(0000) GS:ffff97db5ee00000(0000) knlGS:0000000000000000
[    0.094328] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.094775] CR2: ffff97db55001000 CR3: 000000001442a001 CR4: 0000000000770ef0
[    0.095329] PKRU: 55555554
[    0.095555] Call Trace:
[    0.095755]  <TASK>
[    0.095930]  ? __warn+0xc3/0x1c0
[    0.096328]  ? apply_returns+0x2da/0x430
[    0.096621]  ? report_bug+0x14e/0x1f0
[    0.096860]  ? handle_bug+0x3d/0x80
[    0.097087]  ? exc_invalid_op+0x1a/0x50
[    0.097328]  ? asm_exc_invalid_op+0x1a/0x20
[    0.097645]  ? __ret+0x5/0x7e
[    0.097847]  ? zen_untrain_ret+0x1/0x1
[    0.098329]  ? apply_returns+0x2da/0x430
[    0.098586]  ? __ret+0x5/0x7e
[    0.098781]  ? __ret+0x14/0x7e
[    0.098981]  ? __ret+0xa/0x7e
[    0.099175]  alternative_instructions+0x47/0x110
[    0.099329]  arch_cpu_finalize_init+0x2c/0x50
[    0.099613]  start_kernel+0x2e4/0x390
[    0.099853]  x86_64_start_reservations+0x24/0x30
[    0.100328]  x86_64_start_kernel+0xab/0xb0
[    0.100595]  secondary_startup_64_no_verify+0x17a/0x17b
[    0.100957]  </TASK>
[    0.101101] ---[ end trace 0000000000000000 ]---
...

I can reproduce this with just CC=clang, so it appears unrelated to the existing issues with ld.lld. This appears to be related to the integrated assembler, as this is not reproducible with CC=clang LLVM_IAS=0...
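The bytes printed in the warning can be decoded by hand. The following is a minimal sketch, assuming the message has the form "site-destination: bytes" and that the five bytes are a near `jmp rel32` (opcode 0xe9). The helper name `decode_jmp_rel32` is made up for illustration and is not part of the kernel.

```python
import struct


def decode_jmp_rel32(insn: bytes, offset: int) -> int:
    """Return the target offset of a 5-byte `jmp rel32` located at `offset`."""
    if len(insn) != 5 or insn[0] != 0xE9:
        raise ValueError("not a 5-byte jmp rel32 (opcode 0xe9)")
    # The displacement is a little-endian signed 32-bit value,
    # relative to the address of the *next* instruction.
    (disp,) = struct.unpack("<i", insn[1:])
    return offset + len(insn) + disp


# Site and bytes taken from the warning: __ret+0x5, "e9 f6 ff ff ff".
insn = bytes.fromhex("e9 f6 ff ff ff")
target = decode_jmp_rel32(insn, 0x5)
print(f"jmp at __ret+0x5 targets __ret+{target:#x}")
```

Under those assumptions the jump at __ret+0x5 goes back to __ret+0x0, which matches the two locations printed in the warning.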

@nathanchance added the [BUG] Untriaged and [TOOL] integrated-as labels on Aug 9, 2023
@nathanchance
Member Author

If I build with IAS and with GNU as in two separate build folders, then copy arch/x86/lib/retpoline.o from the GAS build folder into the IAS build folder and rebuild the IAS kernel, there is no warning.

The diff of llvm-objdump -dr between the two files:
diff --git a/tmp/.psub.spGrx5PsgE b/tmp/.psub.dgTNXp9xEE
index 8ad1e3b38958..a23be375283b 100644
--- a/tmp/.psub.spGrx5PsgE
+++ b/tmp/.psub.dgTNXp9xEE
@@ -1,5 +1,5 @@
 
-/home/nathan/Dev/tmp/build/linux/gas/arch/x86/lib/retpoline.o:	file format elf64-x86-64
+/home/nathan/Dev/tmp/build/linux/ias/arch/x86/lib/retpoline.o:	file format elf64-x86-64
 
 Disassembly of section .text.__x86.indirect_thunk:
 
@@ -9,8 +9,8 @@ Disassembly of section .text.__x86.indirect_thunk:
        6: 48 89 04 24                  	movq	%rax, (%rsp)
        a: e9 00 00 00 00               	jmp	0xf <__x86_indirect_thunk_rax+0xf>
 		000000000000000b:  R_X86_64_PLT32	__x86_return_thunk-0x4
-       f: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-      1a: 66 0f 1f 44 00 00            	nopw	(%rax,%rax)
+       f: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+      19: 0f 1f 80 00 00 00 00         	nopl	(%rax)
 
 0000000000000020 <__x86_indirect_thunk_rcx>:
       20: e8 01 00 00 00               	callq	0x26 <__x86_indirect_thunk_rcx+0x6>
@@ -18,8 +18,8 @@ Disassembly of section .text.__x86.indirect_thunk:
       26: 48 89 0c 24                  	movq	%rcx, (%rsp)
       2a: e9 00 00 00 00               	jmp	0x2f <__x86_indirect_thunk_rcx+0xf>
 		000000000000002b:  R_X86_64_PLT32	__x86_return_thunk-0x4
-      2f: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-      3a: 66 0f 1f 44 00 00            	nopw	(%rax,%rax)
+      2f: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+      39: 0f 1f 80 00 00 00 00         	nopl	(%rax)
 
 0000000000000040 <__x86_indirect_thunk_rdx>:
       40: e8 01 00 00 00               	callq	0x46 <__x86_indirect_thunk_rdx+0x6>
@@ -27,8 +27,8 @@ Disassembly of section .text.__x86.indirect_thunk:
       46: 48 89 14 24                  	movq	%rdx, (%rsp)
       4a: e9 00 00 00 00               	jmp	0x4f <__x86_indirect_thunk_rdx+0xf>
 		000000000000004b:  R_X86_64_PLT32	__x86_return_thunk-0x4
-      4f: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-      5a: 66 0f 1f 44 00 00            	nopw	(%rax,%rax)
+      4f: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+      59: 0f 1f 80 00 00 00 00         	nopl	(%rax)
 
 0000000000000060 <__x86_indirect_thunk_rbx>:
       60: e8 01 00 00 00               	callq	0x66 <__x86_indirect_thunk_rbx+0x6>
@@ -36,8 +36,8 @@ Disassembly of section .text.__x86.indirect_thunk:
       66: 48 89 1c 24                  	movq	%rbx, (%rsp)
       6a: e9 00 00 00 00               	jmp	0x6f <__x86_indirect_thunk_rbx+0xf>
 		000000000000006b:  R_X86_64_PLT32	__x86_return_thunk-0x4
-      6f: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-      7a: 66 0f 1f 44 00 00            	nopw	(%rax,%rax)
+      6f: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+      79: 0f 1f 80 00 00 00 00         	nopl	(%rax)
 
 0000000000000080 <__x86_indirect_thunk_rsp>:
       80: e8 01 00 00 00               	callq	0x86 <__x86_indirect_thunk_rsp+0x6>
@@ -45,8 +45,8 @@ Disassembly of section .text.__x86.indirect_thunk:
       86: 48 89 24 24                  	movq	%rsp, (%rsp)
       8a: e9 00 00 00 00               	jmp	0x8f <__x86_indirect_thunk_rsp+0xf>
 		000000000000008b:  R_X86_64_PLT32	__x86_return_thunk-0x4
-      8f: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-      9a: 66 0f 1f 44 00 00            	nopw	(%rax,%rax)
+      8f: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+      99: 0f 1f 80 00 00 00 00         	nopl	(%rax)
 
 00000000000000a0 <__x86_indirect_thunk_rbp>:
       a0: e8 01 00 00 00               	callq	0xa6 <__x86_indirect_thunk_rbp+0x6>
@@ -54,8 +54,8 @@ Disassembly of section .text.__x86.indirect_thunk:
       a6: 48 89 2c 24                  	movq	%rbp, (%rsp)
       aa: e9 00 00 00 00               	jmp	0xaf <__x86_indirect_thunk_rbp+0xf>
 		00000000000000ab:  R_X86_64_PLT32	__x86_return_thunk-0x4
-      af: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-      ba: 66 0f 1f 44 00 00            	nopw	(%rax,%rax)
+      af: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+      b9: 0f 1f 80 00 00 00 00         	nopl	(%rax)
 
 00000000000000c0 <__x86_indirect_thunk_rsi>:
       c0: e8 01 00 00 00               	callq	0xc6 <__x86_indirect_thunk_rsi+0x6>
@@ -63,8 +63,8 @@ Disassembly of section .text.__x86.indirect_thunk:
       c6: 48 89 34 24                  	movq	%rsi, (%rsp)
       ca: e9 00 00 00 00               	jmp	0xcf <__x86_indirect_thunk_rsi+0xf>
 		00000000000000cb:  R_X86_64_PLT32	__x86_return_thunk-0x4
-      cf: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-      da: 66 0f 1f 44 00 00            	nopw	(%rax,%rax)
+      cf: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+      d9: 0f 1f 80 00 00 00 00         	nopl	(%rax)
 
 00000000000000e0 <__x86_indirect_thunk_rdi>:
       e0: e8 01 00 00 00               	callq	0xe6 <__x86_indirect_thunk_rdi+0x6>
@@ -72,8 +72,8 @@ Disassembly of section .text.__x86.indirect_thunk:
       e6: 48 89 3c 24                  	movq	%rdi, (%rsp)
       ea: e9 00 00 00 00               	jmp	0xef <__x86_indirect_thunk_rdi+0xf>
 		00000000000000eb:  R_X86_64_PLT32	__x86_return_thunk-0x4
-      ef: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-      fa: 66 0f 1f 44 00 00            	nopw	(%rax,%rax)
+      ef: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+      f9: 0f 1f 80 00 00 00 00         	nopl	(%rax)
 
 0000000000000100 <__x86_indirect_thunk_r8>:
      100: e8 01 00 00 00               	callq	0x106 <__x86_indirect_thunk_r8+0x6>
@@ -81,8 +81,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      106: 4c 89 04 24                  	movq	%r8, (%rsp)
      10a: e9 00 00 00 00               	jmp	0x10f <__x86_indirect_thunk_r8+0xf>
 		000000000000010b:  R_X86_64_PLT32	__x86_return_thunk-0x4
-     10f: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     11a: 66 0f 1f 44 00 00            	nopw	(%rax,%rax)
+     10f: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     119: 0f 1f 80 00 00 00 00         	nopl	(%rax)
 
 0000000000000120 <__x86_indirect_thunk_r9>:
      120: e8 01 00 00 00               	callq	0x126 <__x86_indirect_thunk_r9+0x6>
@@ -90,8 +90,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      126: 4c 89 0c 24                  	movq	%r9, (%rsp)
      12a: e9 00 00 00 00               	jmp	0x12f <__x86_indirect_thunk_r9+0xf>
 		000000000000012b:  R_X86_64_PLT32	__x86_return_thunk-0x4
-     12f: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     13a: 66 0f 1f 44 00 00            	nopw	(%rax,%rax)
+     12f: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     139: 0f 1f 80 00 00 00 00         	nopl	(%rax)
 
 0000000000000140 <__x86_indirect_thunk_r10>:
      140: e8 01 00 00 00               	callq	0x146 <__x86_indirect_thunk_r10+0x6>
@@ -99,8 +99,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      146: 4c 89 14 24                  	movq	%r10, (%rsp)
      14a: e9 00 00 00 00               	jmp	0x14f <__x86_indirect_thunk_r10+0xf>
 		000000000000014b:  R_X86_64_PLT32	__x86_return_thunk-0x4
-     14f: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     15a: 66 0f 1f 44 00 00            	nopw	(%rax,%rax)
+     14f: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     159: 0f 1f 80 00 00 00 00         	nopl	(%rax)
 
 0000000000000160 <__x86_indirect_thunk_r11>:
      160: e8 01 00 00 00               	callq	0x166 <__x86_indirect_thunk_r11+0x6>
@@ -108,8 +108,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      166: 4c 89 1c 24                  	movq	%r11, (%rsp)
      16a: e9 00 00 00 00               	jmp	0x16f <__x86_indirect_thunk_r11+0xf>
 		000000000000016b:  R_X86_64_PLT32	__x86_return_thunk-0x4
-     16f: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     17a: 66 0f 1f 44 00 00            	nopw	(%rax,%rax)
+     16f: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     179: 0f 1f 80 00 00 00 00         	nopl	(%rax)
 
 0000000000000180 <__x86_indirect_thunk_r12>:
      180: e8 01 00 00 00               	callq	0x186 <__x86_indirect_thunk_r12+0x6>
@@ -117,8 +117,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      186: 4c 89 24 24                  	movq	%r12, (%rsp)
      18a: e9 00 00 00 00               	jmp	0x18f <__x86_indirect_thunk_r12+0xf>
 		000000000000018b:  R_X86_64_PLT32	__x86_return_thunk-0x4
-     18f: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     19a: 66 0f 1f 44 00 00            	nopw	(%rax,%rax)
+     18f: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     199: 0f 1f 80 00 00 00 00         	nopl	(%rax)
 
 00000000000001a0 <__x86_indirect_thunk_r13>:
      1a0: e8 01 00 00 00               	callq	0x1a6 <__x86_indirect_thunk_r13+0x6>
@@ -126,8 +126,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      1a6: 4c 89 2c 24                  	movq	%r13, (%rsp)
      1aa: e9 00 00 00 00               	jmp	0x1af <__x86_indirect_thunk_r13+0xf>
 		00000000000001ab:  R_X86_64_PLT32	__x86_return_thunk-0x4
-     1af: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     1ba: 66 0f 1f 44 00 00            	nopw	(%rax,%rax)
+     1af: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     1b9: 0f 1f 80 00 00 00 00         	nopl	(%rax)
 
 00000000000001c0 <__x86_indirect_thunk_r14>:
      1c0: e8 01 00 00 00               	callq	0x1c6 <__x86_indirect_thunk_r14+0x6>
@@ -135,8 +135,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      1c6: 4c 89 34 24                  	movq	%r14, (%rsp)
      1ca: e9 00 00 00 00               	jmp	0x1cf <__x86_indirect_thunk_r14+0xf>
 		00000000000001cb:  R_X86_64_PLT32	__x86_return_thunk-0x4
-     1cf: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     1da: 66 0f 1f 44 00 00            	nopw	(%rax,%rax)
+     1cf: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     1d9: 0f 1f 80 00 00 00 00         	nopl	(%rax)
 
 00000000000001e0 <__x86_indirect_thunk_r15>:
      1e0: e8 01 00 00 00               	callq	0x1e6 <__x86_indirect_thunk_r15+0x6>
@@ -144,8 +144,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      1e6: 4c 89 3c 24                  	movq	%r15, (%rsp)
      1ea: e9 00 00 00 00               	jmp	0x1ef <__x86_indirect_thunk_r15+0xf>
 		00000000000001eb:  R_X86_64_PLT32	__x86_return_thunk-0x4
-     1ef: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     1fa: 66 0f 1f 44 00 00            	nopw	(%rax,%rax)
+     1ef: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     1f9: 0f 1f 80 00 00 00 00         	nopl	(%rax)
 
 0000000000000200 <__x86_indirect_call_thunk_rax>:
      200: 90                           	nop
@@ -441,8 +441,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      406: 48 89 04 24                  	movq	%rax, (%rsp)
      40a: c3                           	retq
      40b: cc                           	int3
-     40c: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     417: 66 0f 1f 84 00 00 00 00 00   	nopw	(%rax,%rax)
+     40c: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     416: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
 
 0000000000000420 <__x86_indirect_jump_thunk_rcx>:
      420: e8 01 00 00 00               	callq	0x426 <__x86_indirect_jump_thunk_rcx+0x6>
@@ -450,8 +450,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      426: 48 89 0c 24                  	movq	%rcx, (%rsp)
      42a: c3                           	retq
      42b: cc                           	int3
-     42c: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     437: 66 0f 1f 84 00 00 00 00 00   	nopw	(%rax,%rax)
+     42c: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     436: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
 
 0000000000000440 <__x86_indirect_jump_thunk_rdx>:
      440: e8 01 00 00 00               	callq	0x446 <__x86_indirect_jump_thunk_rdx+0x6>
@@ -459,8 +459,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      446: 48 89 14 24                  	movq	%rdx, (%rsp)
      44a: c3                           	retq
      44b: cc                           	int3
-     44c: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     457: 66 0f 1f 84 00 00 00 00 00   	nopw	(%rax,%rax)
+     44c: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     456: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
 
 0000000000000460 <__x86_indirect_jump_thunk_rbx>:
      460: e8 01 00 00 00               	callq	0x466 <__x86_indirect_jump_thunk_rbx+0x6>
@@ -468,8 +468,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      466: 48 89 1c 24                  	movq	%rbx, (%rsp)
      46a: c3                           	retq
      46b: cc                           	int3
-     46c: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     477: 66 0f 1f 84 00 00 00 00 00   	nopw	(%rax,%rax)
+     46c: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     476: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
 
 0000000000000480 <__x86_indirect_jump_thunk_rsp>:
      480: e8 01 00 00 00               	callq	0x486 <__x86_indirect_jump_thunk_rsp+0x6>
@@ -477,8 +477,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      486: 48 89 24 24                  	movq	%rsp, (%rsp)
      48a: c3                           	retq
      48b: cc                           	int3
-     48c: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     497: 66 0f 1f 84 00 00 00 00 00   	nopw	(%rax,%rax)
+     48c: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     496: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
 
 00000000000004a0 <__x86_indirect_jump_thunk_rbp>:
      4a0: e8 01 00 00 00               	callq	0x4a6 <__x86_indirect_jump_thunk_rbp+0x6>
@@ -486,8 +486,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      4a6: 48 89 2c 24                  	movq	%rbp, (%rsp)
      4aa: c3                           	retq
      4ab: cc                           	int3
-     4ac: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     4b7: 66 0f 1f 84 00 00 00 00 00   	nopw	(%rax,%rax)
+     4ac: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     4b6: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
 
 00000000000004c0 <__x86_indirect_jump_thunk_rsi>:
      4c0: e8 01 00 00 00               	callq	0x4c6 <__x86_indirect_jump_thunk_rsi+0x6>
@@ -495,8 +495,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      4c6: 48 89 34 24                  	movq	%rsi, (%rsp)
      4ca: c3                           	retq
      4cb: cc                           	int3
-     4cc: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     4d7: 66 0f 1f 84 00 00 00 00 00   	nopw	(%rax,%rax)
+     4cc: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     4d6: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
 
 00000000000004e0 <__x86_indirect_jump_thunk_rdi>:
      4e0: e8 01 00 00 00               	callq	0x4e6 <__x86_indirect_jump_thunk_rdi+0x6>
@@ -504,8 +504,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      4e6: 48 89 3c 24                  	movq	%rdi, (%rsp)
      4ea: c3                           	retq
      4eb: cc                           	int3
-     4ec: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     4f7: 66 0f 1f 84 00 00 00 00 00   	nopw	(%rax,%rax)
+     4ec: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     4f6: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
 
 0000000000000500 <__x86_indirect_jump_thunk_r8>:
      500: e8 01 00 00 00               	callq	0x506 <__x86_indirect_jump_thunk_r8+0x6>
@@ -513,8 +513,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      506: 4c 89 04 24                  	movq	%r8, (%rsp)
      50a: c3                           	retq
      50b: cc                           	int3
-     50c: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     517: 66 0f 1f 84 00 00 00 00 00   	nopw	(%rax,%rax)
+     50c: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     516: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
 
 0000000000000520 <__x86_indirect_jump_thunk_r9>:
      520: e8 01 00 00 00               	callq	0x526 <__x86_indirect_jump_thunk_r9+0x6>
@@ -522,8 +522,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      526: 4c 89 0c 24                  	movq	%r9, (%rsp)
      52a: c3                           	retq
      52b: cc                           	int3
-     52c: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     537: 66 0f 1f 84 00 00 00 00 00   	nopw	(%rax,%rax)
+     52c: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     536: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
 
 0000000000000540 <__x86_indirect_jump_thunk_r10>:
      540: e8 01 00 00 00               	callq	0x546 <__x86_indirect_jump_thunk_r10+0x6>
@@ -531,8 +531,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      546: 4c 89 14 24                  	movq	%r10, (%rsp)
      54a: c3                           	retq
      54b: cc                           	int3
-     54c: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     557: 66 0f 1f 84 00 00 00 00 00   	nopw	(%rax,%rax)
+     54c: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     556: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
 
 0000000000000560 <__x86_indirect_jump_thunk_r11>:
      560: e8 01 00 00 00               	callq	0x566 <__x86_indirect_jump_thunk_r11+0x6>
@@ -540,8 +540,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      566: 4c 89 1c 24                  	movq	%r11, (%rsp)
      56a: c3                           	retq
      56b: cc                           	int3
-     56c: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     577: 66 0f 1f 84 00 00 00 00 00   	nopw	(%rax,%rax)
+     56c: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     576: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
 
 0000000000000580 <__x86_indirect_jump_thunk_r12>:
      580: e8 01 00 00 00               	callq	0x586 <__x86_indirect_jump_thunk_r12+0x6>
@@ -549,8 +549,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      586: 4c 89 24 24                  	movq	%r12, (%rsp)
      58a: c3                           	retq
      58b: cc                           	int3
-     58c: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     597: 66 0f 1f 84 00 00 00 00 00   	nopw	(%rax,%rax)
+     58c: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     596: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
 
 00000000000005a0 <__x86_indirect_jump_thunk_r13>:
      5a0: e8 01 00 00 00               	callq	0x5a6 <__x86_indirect_jump_thunk_r13+0x6>
@@ -558,8 +558,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      5a6: 4c 89 2c 24                  	movq	%r13, (%rsp)
      5aa: c3                           	retq
      5ab: cc                           	int3
-     5ac: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     5b7: 66 0f 1f 84 00 00 00 00 00   	nopw	(%rax,%rax)
+     5ac: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     5b6: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
 
 00000000000005c0 <__x86_indirect_jump_thunk_r14>:
      5c0: e8 01 00 00 00               	callq	0x5c6 <__x86_indirect_jump_thunk_r14+0x6>
@@ -567,8 +567,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      5c6: 4c 89 34 24                  	movq	%r14, (%rsp)
      5ca: c3                           	retq
      5cb: cc                           	int3
-     5cc: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     5d7: 66 0f 1f 84 00 00 00 00 00   	nopw	(%rax,%rax)
+     5cc: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     5d6: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
 
 00000000000005e0 <__x86_indirect_jump_thunk_r15>:
      5e0: e8 01 00 00 00               	callq	0x5e6 <__x86_indirect_jump_thunk_r15+0x6>
@@ -576,8 +576,8 @@ Disassembly of section .text.__x86.indirect_thunk:
      5e6: 4c 89 3c 24                  	movq	%r15, (%rsp)
      5ea: c3                           	retq
      5eb: cc                           	int3
-     5ec: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-     5f7: 66 0f 1f 84 00 00 00 00 00   	nopw	(%rax,%rax)
+     5ec: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+     5f6: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
 
 Disassembly of section .altinstr_replacement:
 
@@ -772,14 +772,15 @@ Disassembly of section .text.__x86.return_thunk:
       40: c3                           	retq
       41: cc                           	int3
       42: 0f ae e8                     	lfence
-      45: eb f9                        	jmp	0x40 <__ret>
-      47: cc                           	int3
-      48: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-      53: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-      5e: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-      69: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-      74: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-      7f: 90                           	nop
+      45: e9 00 00 00 00               	jmp	0x4a <__ret+0xa>
+		0000000000000046:  R_X86_64_PLT32	__ret-0x4
+      4a: cc                           	int3
+      4b: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+      55: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+      5f: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+      69: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+      73: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+      7d: 0f 1f 00                     	nopl	(%rax)
       80: cc                           	int3
       81: cc                           	int3
       82: cc                           	int3
@@ -856,8 +857,8 @@ Disassembly of section .text.__x86.return_thunk:
       cb: e8 00 00 00 00               	callq	0xd0 <srso_safe_ret+0x10>
 		00000000000000cc:  R_X86_64_PLT32	srso_safe_ret-0x4
       d0: cc                           	int3
-      d1: 66 66 2e 0f 1f 84 00 00 00 00 00     	nopw	%cs:(%rax,%rax)
-      dc: 0f 1f 40 00                  	nopl	(%rax)
+      d1: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
+      db: 0f 1f 44 00 00               	nopl	(%rax,%rax)
       e0: 90                           	nop
       e1: 90                           	nop
       e2: 90                           	nop
@@ -877,7 +878,8 @@ Disassembly of section .text.__x86.return_thunk:
 
 00000000000000f0 <__x86_return_thunk>:
       f0: f3 0f 1e fa                  	endbr64
-      f4: e9 47 ff ff ff               	jmp	0x40 <__ret>
+      f4: e9 00 00 00 00               	jmp	0xf9 <__x86_return_thunk+0x9>
+		00000000000000f5:  R_X86_64_PLT32	__ret-0x4
       f9: cc                           	int3
       fa: 66 0f 1f 44 00 00            	nopw	(%rax,%rax)
      100: 90                           	nop

Most of that is whitespace differences but there appear to be a couple more R_X86_64_PLT32 relocations with the integrated assembler?

@nathanchance
Member Author

Actually, it is not a whitespace difference; the machine code itself is slightly different, which my fancy diff viewer helped visualize.
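To connect the diff to the boot warning, here is a rough sketch of the arithmetic for the jump at offset 0x45 of .text.__x86.return_thunk. It assumes the linker resolves R_X86_64_PLT32 against a locally defined symbol as a plain PC-relative value, S + A - P, with no PLT entry involved. The function names are illustrative only.

```python
def short_jmp_target(offset: int, insn: bytes) -> int:
    """Target of a 2-byte `jmp rel8` (opcode 0xeb) located at `offset`."""
    disp = int.from_bytes(insn[1:2], "little", signed=True)
    return offset + 2 + disp


def resolve_pc_relative(symbol: int, addend: int, place: int) -> int:
    """Compute S + A - P and wrap it into an unsigned 32-bit field."""
    return (symbol + addend - place) & 0xFFFFFFFF


RET = 0x40       # offset of __ret in the section, from the diff
JMP_SITE = 0x45  # offset of the jmp in both builds

# GAS build: "eb f9" is already resolved at assembly time.
gas_target = short_jmp_target(JMP_SITE, bytes.fromhex("eb f9"))

# IAS build: "e9 00 00 00 00" plus R_X86_64_PLT32 __ret-0x4 at offset 0x46.
field = resolve_pc_relative(RET, -4, JMP_SITE + 1)
ias_bytes = bytes([0xE9]) + field.to_bytes(4, "little")

print(f"GAS short jmp target: {gas_target:#x}")
print(f"IAS jmp after relocation: {ias_bytes.hex(' ')}")
```

With these inputs the relocated IAS instruction comes out as e9 f6 ff ff ff, the same five bytes reported in the "missing return thunk" warning, while the GAS build uses a 2-byte short jump to the same target.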

@torvic9

torvic9 commented Aug 10, 2023

I can confirm this with clang-16 + Linux 6.4.9 + Nick's and Petr's patches. Kaby Lake laptop.

@autogris

I can confirm this with clang-16 + Linux 6.4.9 + Nick's and Petr's patches. Kaby Lake laptop.

Same with clang-16 + kernel 6.1.44 + both patches, Intel Kaby Lake. The warning disappears when LLVM_IAS is disabled.

@nickdesaulniers changed the title from: "missing return thunk" warning when booting on Intel machine
to: [SRSO] "missing return thunk" warning when booting on Intel machine
on Aug 10, 2023
@nickdesaulniers
Member

nickdesaulniers commented Aug 10, 2023

This might be Intel specific.

EDIT: d'oh, it was in the title. Though @misotolar's data point below shows otherwise.

FWIW, I cannot reproduce on

cpu family	: 23
model		: 49
model name	: AMD Ryzen Threadripper PRO 3995WX 64-Cores
stepping	: 0
microcode	: 0x830107a

(Zen 2)

@nathanchance
Member Author

This might be Intel specific.

It appears to be, as I don't see it on either of my AMD boxes.

@misotolar

AMD Ryzen 5 3500U, Full LTO:

ago 10 18:27:14 trinity kernel: ------------[ cut here ]------------
ago 10 18:27:14 trinity kernel: missing return thunk: __ret+0x5/0x7e-__ret+0x0/0x7e: e9 f6 ff ff ff
ago 10 18:27:14 trinity kernel: WARNING: CPU: 0 PID: 0 at arch/x86/kernel/alternative.c:630 apply_returns+0x266/0x490
ago 10 18:27:14 trinity kernel: Modules linked in:
ago 10 18:27:14 trinity kernel: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.9-1-trinity #1 7c99839211746a4997d0b8eb55293daef2e1f334
ago 10 18:27:14 trinity kernel: Hardware name: LENOVO 81W1/LNVNB161216, BIOS E8CN39WW 04/10/2023
ago 10 18:27:14 trinity kernel: RIP: 0010:apply_returns+0x266/0x490
ago 10 18:27:14 trinity kernel: Code: 89 f8 e8 2d f4 13 00 eb 8a 48 c7 c7 96 b7 53 8d 4c 89 ee 4c 89 f2 b9 05 00 00 00 4d 89 e8 c6 05 c5 89 ac 03 01 e8 7a b1 09 00 <0f> 0b e9 49 ff ff ff f3 0f 1e fa b8 01 00 00>
ago 10 18:27:14 trinity kernel: RSP: 0000:ffffffff8e603e00 EFLAGS: 00010246
ago 10 18:27:14 trinity kernel: RAX: 0bd0d6091880cd00 RBX: ffffffff8f323210 RCX: ffffffff8e719de0
ago 10 18:27:14 trinity kernel: RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000004
ago 10 18:27:14 trinity kernel: RBP: ffffffff8e603ee0 R08: 0000000000000000 R09: ffffffff8e66b450
ago 10 18:27:14 trinity kernel: R10: 00000000ffffffff R11: 0000000000000fff R12: 0000000000000005
ago 10 18:27:14 trinity kernel: R13: ffffffff8c8e2d85 R14: ffffffff8c8e2d80 R15: ffffffff8c8e2d80
ago 10 18:27:14 trinity kernel: FS:  0000000000000000(0000) GS:ffff9932f4a00000(0000) knlGS:0000000000000000
ago 10 18:27:14 trinity kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
ago 10 18:27:14 trinity kernel: CR2: ffff9931e0201000 CR3: 000000019ee10000 CR4: 00000000003506f0
ago 10 18:27:14 trinity kernel: Call Trace:
ago 10 18:27:14 trinity kernel:  <TASK>
ago 10 18:27:14 trinity kernel:  ? __warn+0x151/0x250
ago 10 18:27:14 trinity kernel:  ? apply_returns+0x266/0x490
ago 10 18:27:14 trinity kernel:  ? report_bug+0x162/0x200
ago 10 18:27:14 trinity kernel:  ? handle_bug+0x3d/0x80
ago 10 18:27:14 trinity kernel:  ? exc_invalid_op+0x1a/0x50
ago 10 18:27:14 trinity kernel:  ? asm_exc_invalid_op+0x1a/0x20
ago 10 18:27:14 trinity kernel:  ? zen_untrain_ret+0x1/0x1
ago 10 18:27:14 trinity kernel:  ? zen_untrain_ret+0x1/0x1
ago 10 18:27:14 trinity kernel:  ? __ret+0x5/0x7e
ago 10 18:27:14 trinity kernel:  ? apply_returns+0x266/0x490
ago 10 18:27:14 trinity kernel:  ? __ret+0x5/0x7e
ago 10 18:27:14 trinity kernel:  ? __ret+0x14/0x7e
ago 10 18:27:14 trinity kernel:  ? __ret+0xa/0x7e
ago 10 18:27:14 trinity kernel:  ? atomic_notifier_chain_unregister+0x7b/0xd0
ago 10 18:27:14 trinity kernel:  alternative_instructions+0x3a/0x110
ago 10 18:27:14 trinity kernel:  arch_cpu_finalize_init+0x4c/0xb0
ago 10 18:27:14 trinity kernel:  start_kernel+0x360/0x420
ago 10 18:27:14 trinity kernel:  x86_64_start_reservations+0x24/0x30
ago 10 18:27:14 trinity kernel:  x86_64_start_kernel+0x77/0x80
ago 10 18:27:14 trinity kernel:  secondary_startup_64_no_verify+0x10c/0x11b
ago 10 18:27:14 trinity kernel:  </TASK>
ago 10 18:27:14 trinity kernel: ---[ end trace 0000000000000000 ]---

@nathanchance
Member Author

3500U

Interesting data point, as that is Zen+, whereas Nick and I only have Zen 2 hardware.

@nickdesaulniers changed the title from: [SRSO] "missing return thunk" warning when booting on Intel machine
to: [SRSO] "missing return thunk" warning when booting on Intel or Zen1 machine
on Aug 10, 2023
@nickdesaulniers
Member

Most of that is whitespace differences but there appear to be a couple more R_X86_64_PLT32 relocations with the integrated assembler?

I think the padding has different encodings between functions, but is a red herring because it is of the same length.

The only other difference that I spot is llvm/llvm-project#64603.
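For readers unfamiliar with these relocations, here is a minimal sketch of the arithmetic a linker performs for a PC-relative relocation such as R_X86_64_PC32, or R_X86_64_PLT32 when the symbol resolves locally and no PLT entry is needed: the value stored is S + A - P. The addresses below are hypothetical; only the 5-byte distance between the jump and its target is taken from the warning in this issue. The sketch only suggests that a relocated `jmp __ret` and one resolved by the assembler end up with the same bytes, so the observable difference between assemblers would be whether the relocation entry exists in the object file.

```python
import struct


def pc_relative_value(symbol_addr: int, addend: int, field_addr: int) -> int:
    """Compute S + A - P, the value stored for a PC-relative relocation."""
    return symbol_addr + addend - field_addr


def encode_jmp_rel32(insn_addr: int, target_addr: int) -> bytes:
    """Encode `jmp target` as opcode 0xe9 followed by a little-endian rel32."""
    # The rel32 field starts one byte after the opcode. The addend of -4
    # compensates for the CPU measuring the displacement from the end of
    # the 4-byte field rather than from its start.
    disp = pc_relative_value(target_addr, -4, insn_addr + 1)
    return b"\xe9" + struct.pack("<i", disp)


# Hypothetical addresses: 0x1000 stands in for __ret, and the jump sits
# five bytes after it, as in "__ret+0x5" from the warning.
ret_addr = 0x1000
jmp_addr = ret_addr + 0x5

print(encode_jmp_rel32(jmp_addr, ret_addr).hex(" "))
```

The printed bytes can be compared with the ones at the end of the "missing return thunk" line.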

@nickdesaulniers
Member

Ok, I need folks that can reproduce this crash (@nathanchance or @misotolar ) to test this for me.

$ make LLVM=1 LLVM_IAS=0 KAFLAGS=-Wa,-mshared -j$(nproc)

That should invoke GNU as rather than clang's integrated assembler, but make gas behave like clang wrt. the relocations for x86.

If that continues to fail to boot with the above trace, then the issue is something other than the relocations.

if that boots without issue, then the relocations themselves are the problem.

@nathanchance
Member Author

I should note that there is no boot failure (at least for me), just the warning in dmesg (it is a WARN_ON, not BUG).

If that continues to fail to boot with the above trace, then the issue is something other than the relocations.

if that boots without issue, then the relocations themselves are the problem.

Is this backwards (i.e., if -Wa,-mshared has the issue, it is something with the relocations)?

Regardless:

$ make -skj"$(nproc)" ARCH=x86_64 CC=clang LLVM_IAS=0 mrproper defconfig bzImage

$ boot-qemu.py -k .
# No warning

$ make -skj"$(nproc)" ARCH=x86_64 CC=clang KAFLAGS=-Wa,-mshared LLVM_IAS=0 mrproper defconfig bzImage

$ boot-qemu.py -k .
...
[    0.087342] ------------[ cut here ]------------
[    0.087717] missing return thunk: __ret+0x5/0x7e-__ret+0x0/0x7e: e9 f6 ff ff ff
[    0.087725] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/alternative.c:753 apply_returns+0x2da/0x430
[    0.088699] Modules linked in:
[    0.088949] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.5.0-rc5-00056-gcacc6e22932f #1
[    0.089700] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.2-1-1 04/01/2014
[    0.090699] RIP: 0010:apply_returns+0x2da/0x430
[    0.091024] Code: ff ff 0f 0b e9 c8 fd ff ff c6 05 60 bd c2 01 01 48 c7 c7 ae 5a 48 8a 4c 89 ee 4c 89 e2 b9 05 00 00 00 4d 89 e8 e8 b6 4d 05 00 <0f> 0b e9 a0 fd ff ff 45 85 e4 0f 84 2e ff ff ff 48 c7 c7 6e 5a 48
[    0.091699] RSP: 0000:ffffffff8a803e20 EFLAGS: 00010246
[    0.092699] RAX: 95832abb756ec300 RBX: ffffffff8b119188 RCX: ffffffff8a853e80
[    0.093187] RDX: ffffffff8a803cd8 RSI: 00000000ffffdfff RDI: ffffffff8a884110
[    0.093699] RBP: ffffffff8a803ef8 R08: 0000000000001fff R09: ffffffff8a854110
[    0.094154] R10: 0000000000005ffd R11: 0000000000000004 R12: ffffffff89d60040
[    0.094699] R13: ffffffff89d60045 R14: ffffffff8b119180 R15: ffffffff8a803e38
[    0.095157] FS:  0000000000000000(0000) GS:ffff8c60dee00000(0000) knlGS:0000000000000000
[    0.095699] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.096067] CR2: ffff8c60db201000 CR3: 000000001a62a001 CR4: 0000000000770ef0
[    0.096528] PKRU: 55555554
[    0.096699] Call Trace:
[    0.096859]  <TASK>
[    0.096996]  ? __warn+0xc3/0x1c0
[    0.097204]  ? apply_returns+0x2da/0x430
[    0.097458]  ? report_bug+0x14e/0x1f0
[    0.097699]  ? handle_bug+0x3d/0x80
[    0.097925]  ? exc_invalid_op+0x1a/0x50
[    0.098171]  ? asm_exc_invalid_op+0x1a/0x20
[    0.098441]  ? __ret+0x5/0x7e
[    0.098635]  ? zen_untrain_ret+0x1/0x1
[    0.098699]  ? apply_returns+0x2da/0x430
[    0.098950]  ? __ret+0x5/0x7e
[    0.099142]  ? __ret+0x14/0x7e
[    0.099340]  ? __ret+0xa/0x7e
[    0.099699]  alternative_instructions+0x47/0x110
[    0.099999]  arch_cpu_finalize_init+0x2c/0x50
[    0.100279]  start_kernel+0x2e4/0x390
[    0.100520]  x86_64_start_reservations+0x24/0x30
[    0.100700]  x86_64_start_kernel+0xab/0xb0
[    0.100963]  secondary_startup_64_no_verify+0x179/0x17b
[    0.101297]  </TASK>
[    0.101440] ---[ end trace 0000000000000000 ]---
...
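As a reading aid for the "missing return thunk" line, the sketch below decodes the five bytes printed at its end. It assumes that the message shows the patched site, then the computed destination, then the instruction bytes at the site; that reading is an inference from the log format, not something stated in this thread. The function name is made up for illustration.

```python
import struct


def decode_jmp_rel32(code: bytes, insn_offset: int) -> int:
    """Return the target offset of a 5-byte `jmp rel32` located at insn_offset."""
    if len(code) != 5 or code[0] != 0xE9:
        raise ValueError("not a 5-byte jmp rel32")
    # rel32 is a signed little-endian displacement measured from the end
    # of the instruction.
    (disp,) = struct.unpack("<i", code[1:])
    return insn_offset + len(code) + disp


# Bytes and site offset (__ret+0x5) taken from the warning above.
code = bytes.fromhex("e9 f6 ff ff ff")
target = decode_jmp_rel32(code, insn_offset=0x5)
print(hex(target))
```

Under that assumption, the jump at `__ret+0x5` lands on `__ret+0x0`, which matches the second symbol in the message: the flagged site jumps to `__ret` rather than to `__x86_return_thunk`.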

@nickdesaulniers
Member

Is this backwards (i.e., if -Wa,-mshared has the issue, it is something with the relocations)?

Yes, sorry.

Based on @MaskRay 's comment, binutils-2_24 and older may also be broken in the same way then.

@nickdesaulniers
Member

nickdesaulniers commented Aug 11, 2023

Brothers, please test:

From 45bd5cc6edf3dd974ca030a1f969fcec1391acac Mon Sep 17 00:00:00 2001
From: Nick Desaulniers <ndesaulniers@google.com>
Date: Fri, 11 Aug 2023 08:42:07 -0700
Subject: [PATCH] x86/srso: fix "missing return thunk" on non -mno-shared
 assemblers

A few users have reported observing the following splat from a
WARN_ONCE:

[    0.086618] ------------[ cut here ]------------
[    0.086996] missing return thunk: __ret+0x5/0x7e-__ret+0x0/0x7e: e9 f6 ff ff ff
[    0.087005] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/alternative.c:753 apply_returns+0x2da/0x430
[    0.088328] Modules linked in:
[    0.088585] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.5.0-rc5-00056-gcacc6e22932f #1
[    0.089216] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.2-1-1 04/01/2014
[    0.089329] RIP: 0010:apply_returns+0x2da/0x430
[    0.089624] Code: ff ff 0f 0b e9 c8 fd ff ff c6 05 60 bd c2 01 01 48 c7 c7 ae 5a 68 bd 4c 89 ee 4c 89 e2 b9 05 00 00 00 4d 89 e8 e8 b6 4d 05 00 <0f> 0b e9 a0 fd ff ff 45 85 e4 0f 84 2e ff ff ff 48 c7 c7 6e 5a 68
[    0.090328] RSP: 0000:ffffffffbda03e20 EFLAGS: 00010246
[    0.090740] RAX: cb2b7f056bc62700 RBX: ffffffffbe319188 RCX: ffffffffbda53e80
[    0.091328] RDX: ffffffffbda03cd8 RSI: 00000000ffffdfff RDI: ffffffffbda84110
[    0.091891] RBP: ffffffffbda03ef8 R08: 0000000000001fff R09: ffffffffbda54110
[    0.092328] R10: 0000000000005ffd R11: 0000000000000004 R12: ffffffffbcf60040
[    0.093328] R13: ffffffffbcf60045 R14: ffffffffbe319180 R15: ffffffffbda03e38
[    0.093896] FS:  0000000000000000(0000) GS:ffff97db5ee00000(0000) knlGS:0000000000000000
[    0.094328] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.094775] CR2: ffff97db55001000 CR3: 000000001442a001 CR4: 0000000000770ef0
[    0.095329] PKRU: 55555554
[    0.095555] Call Trace:
[    0.095755]  <TASK>
[    0.095930]  ? __warn+0xc3/0x1c0
[    0.096328]  ? apply_returns+0x2da/0x430
[    0.096621]  ? report_bug+0x14e/0x1f0
[    0.096860]  ? handle_bug+0x3d/0x80
[    0.097087]  ? exc_invalid_op+0x1a/0x50
[    0.097328]  ? asm_exc_invalid_op+0x1a/0x20
[    0.097645]  ? __ret+0x5/0x7e
[    0.097847]  ? zen_untrain_ret+0x1/0x1
[    0.098329]  ? apply_returns+0x2da/0x430
[    0.098586]  ? __ret+0x5/0x7e
[    0.098781]  ? __ret+0x14/0x7e
[    0.098981]  ? __ret+0xa/0x7e
[    0.099175]  alternative_instructions+0x47/0x110
[    0.099329]  arch_cpu_finalize_init+0x2c/0x50
[    0.099613]  start_kernel+0x2e4/0x390
[    0.099853]  x86_64_start_reservations+0x24/0x30
[    0.100328]  x86_64_start_kernel+0xab/0xb0
[    0.100595]  secondary_startup_64_no_verify+0x17a/0x17b
[    0.100957]  </TASK>
[    0.101101] ---[ end trace 0000000000000000 ]---

It seems that the presence (or absence) of relocations in
arch/x86/lib/retpoline.o is triggering this.  I'm not certain,
but I suspect that this code may be checking the return thunk BEFORE
relocations have been applied.

GNU as ("GAS") has a command line flag pair -mshared/-mno-shared that
controls this behavior. In binutils 2.25, the implicit default value for
this flag was changed from -mshared to -mno-shared, but only for x86.[0]
Building with KAFLAGS=-Wa,-mshared can reproduce the above splat.

While Documentation/process/changes.rst currently lists binutils 2.25 as
the minimum supported version, the SRSO patches were backported to
stable's linux-5.4.y where binutils 2.21 is still supported. We could
add -Wa,-mno-shared to KBUILD_AFLAGS, but Clang's integrated assembler
doesn't support this flag, and defaults to -mshared for all
architectures. [1]

Instead, we can simply add a local label that aliases the global label
__ret, and refer to that within arch/x86/lib/retpoline.S to avoid any
relocations being generated for any assembler regardless of its implicit
default behavior with respect to -mshared/-mno-shared.

Cc: stable@vger.kernel.org
Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://github.com/ClangBuiltLinux/linux/issues/1911
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=b084df0b8d1262fb1e969c74bcc5c61e262a6199 [0]
Link: https://github.com/llvm/llvm-project/issues/64603 [1]
---
 arch/x86/lib/retpoline.S | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 5c43684ec982..5acb78da5488 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -184,7 +184,7 @@ SYM_FUNC_END(srso_safe_ret_alias)
  *    from re-poisioning the BTB prediction.
  */
        .align 64
-       .skip 64 - (__ret - zen_untrain_ret), 0xcc
+       .skip 64 - (.L__ret - zen_untrain_ret), 0xcc
 SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
        ANNOTATE_NOENDBR
        /*
@@ -217,6 +217,7 @@ SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
         * which will be contained safely by the INT3.
         */
 SYM_INNER_LABEL(__ret, SYM_L_GLOBAL)
+SYM_INNER_LABEL(.L__ret, SYM_L_LOCAL)
        ret
        int3
 SYM_CODE_END(__ret)
@@ -230,7 +231,7 @@ SYM_CODE_END(__ret)
         * Jump back and execute the RET in the middle of the TEST instruction.
         * INT3 is for SLS protection.
         */
-       jmp __ret
+       jmp .L__ret
        int3
 SYM_FUNC_END(zen_untrain_ret)
 __EXPORT_THUNK(zen_untrain_ret)
@@ -265,7 +266,7 @@ SYM_FUNC_END(srso_untrain_ret)
 __EXPORT_THUNK(srso_untrain_ret)
 
 SYM_FUNC_START(__x86_return_thunk)
-       ALTERNATIVE_2 "jmp __ret", "call srso_safe_ret", X86_FEATURE_SRSO, \
+       ALTERNATIVE_2 "jmp .L__ret", "call srso_safe_ret", X86_FEATURE_SRSO, \
                        "call srso_safe_ret_alias", X86_FEATURE_SRSO_ALIAS
        int3
 SYM_CODE_END(__x86_return_thunk)
-- 
2.41.0.694.ge786442a9b-goog

@torvic9 @misotolar @autogris if you respond with:

Tested-by: Your Name <your email address> (name without square brackets, email address inside angle brackets), then I can put that in the commit message. I'd like to at least credit you all with Reported-by tags, but I don't know your email addresses.

@nickdesaulniers nickdesaulniers added the [PATCH] Exists There is a patch that fixes this issue label Aug 11, 2023
@nickdesaulniers nickdesaulniers self-assigned this Aug 11, 2023
@nathanchance
Member Author

That diff appears to work for me in QEMU. I will do a boot test on bare metal shortly.

@nathanchance
Member Author

nathanchance commented Aug 11, 2023

Somewhat interesting... I see the warning when booting linux-6.4.y kernels (without this patch) on my Zen 2 machine but not mainline...

[    0.172483] ------------[ cut here ]------------
[    0.173397] missing return thunk: __ret+0x5/0x7e-__ret+0x0/0x7e: e9 f6 ff ff ff
[    0.173409] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/alternative.c:630 apply_returns+0x2c5/0x410
[    0.175397] Modules linked in:
[    0.176398] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.9-cbl-00014-g3825c7764f4d #1
[    0.177397] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
[    0.178397] RIP: 0010:apply_returns+0x2c5/0x410
[    0.179398] Code: ff ff 0f 0b e9 b7 fd ff ff c6 05 0d a6 c2 01 01 48 c7 c7 1d 75 cd ab 4c 89 ee 4c 89 e2 b9 05 00 00 00 4d 89 e8 e8 fb 3c 05 00 <0f> 0b e9 8f fd ff ff 4d 85 e4 0f 84 2d ff ff ff 48 c7 c7 52 1e c8
[    0.180397] RSP: 0000:ffffffffac003e20 EFLAGS: 00010246
[    0.181397] RAX: d60454cc894cfb00 RBX: ffffffffac8fa678 RCX: ffffffffac054610
[    0.182397] RDX: ffffffffac003cd8 RSI: 00000000ffffbfff RDI: ffffa2acdf040000
[    0.183397] RBP: ffffffffac003ef8 R08: 0000000000003fff R09: ffffa2acdf1a0000
[    0.184397] R10: 000000000000bffd R11: 0000000000000004 R12: ffffffffab533bc0
[    0.185397] R13: ffffffffab533bc5 R14: ffffffffac8fa670 R15: ffffffffac003e38
[    0.186399] FS:  0000000000000000(0000) GS:ffffa2acdb600000(0000) knlGS:0000000000000000
[    0.187397] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.188397] CR2: ffffa2acdec01000 CR3: 000000001e02a000 CR4: 0000000000350ef0
[    0.189399] Call Trace:
[    0.190398]  <TASK>
[    0.191398]  ? __warn+0xc3/0x1c0
[    0.192397]  ? apply_returns+0x2c5/0x410
[    0.193398]  ? report_bug+0x14e/0x1f0
[    0.194398]  ? handle_bug+0x3d/0x80
[    0.195398]  ? exc_invalid_op+0x1a/0x50
[    0.196398]  ? asm_exc_invalid_op+0x1a/0x20
[    0.197398]  ? __ret+0x5/0x7e
[    0.198256]  ? zen_untrain_ret+0x1/0x1
[    0.198398]  ? apply_returns+0x2c5/0x410
[    0.199398]  ? __ret+0x5/0x7e
[    0.200397]  ? __ret+0x14/0x7e
[    0.201398]  ? __ret+0xa/0x7e
[    0.202265]  alternative_instructions+0x47/0x110
[    0.202398]  arch_cpu_finalize_init+0x2c/0x50
[    0.203398]  start_kernel+0x2e4/0x390
[    0.204398]  x86_64_start_reservations+0x24/0x30
[    0.205398]  x86_64_start_kernel+0xab/0xb0
[    0.206398]  secondary_startup_64_no_verify+0x107/0x10b
[    0.207398]  </TASK>
[    0.208397] ---[ end trace 0000000000000000 ]---

Will test that patch to make sure every location that I see the warning in is fixed.

@nickdesaulniers
Member

Somewhat interesting... I see the warning when booting linux-6.4.y kernels (without this patch) on my Zen 2 machine but not mainline...

Then there may be some kind of race between the static call patching of this and relocations.

@nathanchance
Member Author

That diff resolves the warning for me in all three places that I could reproduce it.

@torvic9

torvic9 commented Aug 11, 2023

Not tried the diff yet, but without it, I get the same error message on a Zen2 machine using clang 16.0.6+ThinLTO and the two accepted patches.

kernel: missing return thunk: __ret+0x5/0x7e-__ret+0x0/0x7e: e9 f6 ff ff ff

and

AMD Ryzen 9 3900X 12-Core Processor
[...]
$ zgrep -i srso /proc/config.gz
CONFIG_CPU_SRSO=y

@misotolar

My Zen1 laptop boots 6.4.9 without any warning with the patch.

@misotolar

Tested-by: Michal Sotolar <michal@sotolar.com>

@phoepsilonix

phoepsilonix commented Aug 12, 2023

https://github.com/ClangBuiltLinux/linux/commit/150c42407f87463c27a2459e06845965291d9973.patch
https://github.com/ClangBuiltLinux/linux/commit/8a9b1f65817e94281ed5922ccee877ab99add71e.patch

Combined with the two patches above, the kernel builds with Clang (full LTO) without any problems.
There are no more errors.

Formatting only.

From 45bd5cc6edf3dd974ca030a1f969fcec1391acac Mon Sep 17 00:00:00 2001
From: Nick Desaulniers <ndesaulniers@google.com>
Date: Fri, 11 Aug 2023 08:42:07 -0700
Subject: [PATCH] x86/srso: fix "missing return thunk" on non -mno-shared
 assemblers

A few users have reported observing the following splat from a
WARN_ONCE:

[    0.086618] ------------[ cut here ]------------
[    0.086996] missing return thunk: __ret+0x5/0x7e-__ret+0x0/0x7e: e9 f6 ff ff ff
[    0.087005] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/alternative.c:753 apply_returns+0x2da/0x430
[    0.088328] Modules linked in:
[    0.088585] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.5.0-rc5-00056-gcacc6e22932f #1
[    0.089216] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.2-1-1 04/01/2014
[    0.089329] RIP: 0010:apply_returns+0x2da/0x430
[    0.089624] Code: ff ff 0f 0b e9 c8 fd ff ff c6 05 60 bd c2 01 01 48 c7 c7 ae 5a 68 bd 4c 89 ee 4c 89 e2 b9 05 00 00 00 4d 89 e8 e8 b6 4d 05 00 <0f> 0b e9 a0 fd ff ff 45 85 e4 0f 84 2e ff ff ff 48 c7 c7 6e 5a 68
[    0.090328] RSP: 0000:ffffffffbda03e20 EFLAGS: 00010246
[    0.090740] RAX: cb2b7f056bc62700 RBX: ffffffffbe319188 RCX: ffffffffbda53e80
[    0.091328] RDX: ffffffffbda03cd8 RSI: 00000000ffffdfff RDI: ffffffffbda84110
[    0.091891] RBP: ffffffffbda03ef8 R08: 0000000000001fff R09: ffffffffbda54110
[    0.092328] R10: 0000000000005ffd R11: 0000000000000004 R12: ffffffffbcf60040
[    0.093328] R13: ffffffffbcf60045 R14: ffffffffbe319180 R15: ffffffffbda03e38
[    0.093896] FS:  0000000000000000(0000) GS:ffff97db5ee00000(0000) knlGS:0000000000000000
[    0.094328] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.094775] CR2: ffff97db55001000 CR3: 000000001442a001 CR4: 0000000000770ef0
[    0.095329] PKRU: 55555554
[    0.095555] Call Trace:
[    0.095755]  <TASK>
[    0.095930]  ? __warn+0xc3/0x1c0
[    0.096328]  ? apply_returns+0x2da/0x430
[    0.096621]  ? report_bug+0x14e/0x1f0
[    0.096860]  ? handle_bug+0x3d/0x80
[    0.097087]  ? exc_invalid_op+0x1a/0x50
[    0.097328]  ? asm_exc_invalid_op+0x1a/0x20
[    0.097645]  ? __ret+0x5/0x7e
[    0.097847]  ? zen_untrain_ret+0x1/0x1
[    0.098329]  ? apply_returns+0x2da/0x430
[    0.098586]  ? __ret+0x5/0x7e
[    0.098781]  ? __ret+0x14/0x7e
[    0.098981]  ? __ret+0xa/0x7e
[    0.099175]  alternative_instructions+0x47/0x110
[    0.099329]  arch_cpu_finalize_init+0x2c/0x50
[    0.099613]  start_kernel+0x2e4/0x390
[    0.099853]  x86_64_start_reservations+0x24/0x30
[    0.100328]  x86_64_start_kernel+0xab/0xb0
[    0.100595]  secondary_startup_64_no_verify+0x17a/0x17b
[    0.100957]  </TASK>
[    0.101101] ---[ end trace 0000000000000000 ]---

It seems that the presence (or absence) of relocations in
arch/x86/lib/retpoline.o is triggering this.  I'm not certain,
but I suspect that this code may be checking the return thunk BEFORE
relocations have been applied.

GNU as ("GAS") has a command line flag pair -mshared/-mno-shared that
controls this behavior. In binutils 2.25, the implicit default value for
this flag was changed from -mshared to -mno-shared, but only for x86.[0]
Building with KAFLAGS=-Wa,-mshared can reproduce the above splat.

While Documentation/process/changes.rst currently lists binutils 2.25 as
the minimum supported version, the SRSO patches were backported to
stable's linux-5.4.y where binutils 2.21 is still supported. We could
add -Wa,-mno-shared to KBUILD_AFLAGS, but Clang's integrated assembler
doesn't support this flag, and defaults to -mshared for all
architectures. [1]

Instead, we can simply add a local label that aliases the global label
__ret, and refer to that within arch/x86/lib/retpoline.S to avoid any
relocations being generated for any assembler regardless of its implicit
default behavior with respect to -mshared/-mno-shared.

Cc: stable@vger.kernel.org
Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://github.com/ClangBuiltLinux/linux/issues/1911
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=b084df0b8d1262fb1e969c74bcc5c61e262a6199 [0]
Link: https://github.com/llvm/llvm-project/issues/64603 [1]
---
 arch/x86/lib/retpoline.S | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 5c43684ec982..5acb78da5488 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -184,7 +184,7 @@ SYM_FUNC_END(srso_safe_ret_alias)
  *    from re-poisioning the BTB prediction.
  */
 	.align 64
-	.skip 64 - (__ret - zen_untrain_ret), 0xcc
+       .skip 64 - (.L__ret - zen_untrain_ret), 0xcc
 SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	ANNOTATE_NOENDBR
 	/*
@@ -217,6 +217,7 @@ SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	 * which will be contained safely by the INT3.
 	 */
 SYM_INNER_LABEL(__ret, SYM_L_GLOBAL)
+SYM_INNER_LABEL(.L__ret, SYM_L_LOCAL)
 	ret
 	int3
 SYM_CODE_END(__ret)
@@ -230,7 +231,7 @@ SYM_CODE_END(__ret)
 	 * Jump back and execute the RET in the middle of the TEST instruction.
 	 * INT3 is for SLS protection.
 	 */
-	jmp __ret
+       jmp .L__ret
 	int3
 SYM_FUNC_END(zen_untrain_ret)
 __EXPORT_THUNK(zen_untrain_ret)
@@ -265,7 +266,7 @@ SYM_FUNC_END(srso_untrain_ret)
 __EXPORT_THUNK(srso_untrain_ret)
 
 SYM_FUNC_START(__x86_return_thunk)
-	ALTERNATIVE_2 "jmp __ret", "call srso_safe_ret", X86_FEATURE_SRSO, \
+       ALTERNATIVE_2 "jmp .L__ret", "call srso_safe_ret", X86_FEATURE_SRSO, \
 			"call srso_safe_ret_alias", X86_FEATURE_SRSO_ALIAS
 	int3
 SYM_CODE_END(__x86_return_thunk)

@autogris

autogris commented Aug 12, 2023

Brothers, please test:

Building kernel 6.1.45 fails with this additional patch, giving the following error:

clang -cc1as: fatal error: error in backend: Size expression must be absolute.
make[2]: *** [scripts/Makefile.build:382: arch/x86/lib/retpoline.o] Error 1
make[1]: *** [scripts/Makefile.build:500: arch/x86/lib] Error 2

LLVM_IAS was set to 1 again, and the 2 previous working patches (https://lore.kernel.org/lkml/20230809-gds-v1-1-eaac90b0cbcc@google.com/ and 150c424) were also applied.

@arachsys

arachsys commented Aug 12, 2023

Brothers, please test:

With this patch on top of the lld fix from 8a9b1f6 applied to either git master or the 6.4.10 stable kernel, I'm seeing the 'Size expression must be absolute' build error too.

# AS      arch/x86/lib/retpoline.o
  clang -Wp,-MMD,arch/x86/lib/.retpoline.o.d -nostdinc -I./arch/x86/include -I./arch/x86/include/generated  -I./include -I./arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/compiler-version.h -include ./include/linux/kconfig.h -D__KERNEL__ --target=x86_64-linux-gnu -fintegrated-as -Werror=unknown-warning-option -Werror=ignored-optimization-argument -Werror=option-ignored -Werror=unused-command-line-argument -fmacro-prefix-map=./= -D__ASSEMBLY__ -fno-PIE -m64    -c -o arch/x86/lib/retpoline.o arch/x86/lib/retpoline.S
clang -cc1as: fatal error: error in backend: Size expression must be absolute.
make[3]: *** [scripts/Makefile.build:360: arch/x86/lib/retpoline.o] Error 1

This is with llvm-as, clang and lld from the most recent 16.0.6 release, so it may be llvm version dependent?

I'm not using LTO, and initially didn't test with the LTO fix from 150c424 but applying this in addition has no effect on the error.

Without the retpoline.S patch, the kernel builds fine but emits exactly the "missing return thunk" warning described above when booting a test Skylake host.

@nathanchance
Member Author

@arachsys @autogris can you post your .config? I can build defconfig with all major versions of LLVM that the kernel supports (11 through 18) in both 6.4.10 and 6.1.45 with the existing ld.lld fixes and that diff, so it seems like it is something with your configuration that tickles this.

@arachsys

arachsys commented Aug 12, 2023

Sorry, it turns out I am an idiot. I fixed it, then re-read the patch you posted more carefully and realised that I had accidentally hand-applied

-SYM_INNER_LABEL(__ret, SYM_L_GLOBAL)
+SYM_INNER_LABEL(.L__ret, SYM_L_LOCAL)

instead of

 SYM_INNER_LABEL(__ret, SYM_L_GLOBAL)
+SYM_INNER_LABEL(.L__ret, SYM_L_LOCAL)

Your patch didn't have the problem in the first place. Apologies for the noise!

When correctly applied (!), your patch does indeed build fine and fixes the warning for both my Intel Skylake box and a Zen2 4800U box that also had the same warning when I tested it. If it's helpful,

Tested-by: Chris Webb <chris@arachsys.com>
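The difference between the two hunks above comes down to the first character of one line: a leading space marks context that is kept, while a leading `-` removes the line. A minimal sketch of that rule, using a hypothetical `apply_hunk` helper and a four-line excerpt modelled on the patch (not the real file contents), might look like this:

```python
def apply_hunk(source: list[str], hunk: list[str]) -> list[str]:
    """Apply one unified-diff hunk that starts at the first line of `source`."""
    result, pos = [], 0
    for line in hunk:
        marker, text = line[0], line[1:]
        if marker not in " -+":
            raise ValueError(f"unknown hunk marker: {marker!r}")
        if marker == "+":
            result.append(text)  # added line: nothing consumed from source
            continue
        if source[pos] != text:
            raise ValueError(f"hunk does not match source at line {pos + 1}")
        if marker == " ":
            result.append(text)  # context line: kept unchanged
        pos += 1                 # "-" line: consumed and dropped
    return result + source[pos:]


# Illustrative excerpt only, shaped like the hunk in the patch.
source = [
    "SYM_INNER_LABEL(__ret, SYM_L_GLOBAL)",
    "\tret",
    "\tint3",
    "SYM_CODE_END(__ret)",
]
tail = [" \tret", " \tint3", " SYM_CODE_END(__ret)"]
intended = [" SYM_INNER_LABEL(__ret, SYM_L_GLOBAL)",
            "+SYM_INNER_LABEL(.L__ret, SYM_L_LOCAL)"] + tail
mistaken = ["-SYM_INNER_LABEL(__ret, SYM_L_GLOBAL)",
            "+SYM_INNER_LABEL(.L__ret, SYM_L_LOCAL)"] + tail

good = apply_hunk(source, intended)
bad = apply_hunk(source, mistaken)
print(len(good), len(bad))
```

In the mistaken result the global `__ret` label is gone while `SYM_CODE_END(__ret)` still refers to it. That would plausibly explain the "Size expression must be absolute" error, though the thread does not confirm the mechanism.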

@autogris

@arachsys @autogris can you post your .config? I can build defconfig with all major versions of LLVM that the kernel supports (11 through 18) in both 6.4.10 and 6.1.45 with the existing ld.lld fixes and that diff, so it seems like it is something with your configuration that tickles this.

Sure, here: config.txt

@autogris

Sorry, it turns out I am an idiot. I fixed it, then re-read the patch you posted more carefully and realised that I had accidentally hand-applied

-SYM_INNER_LABEL(__ret, SYM_L_GLOBAL)
+SYM_INNER_LABEL(.L__ret, SYM_L_LOCAL)

instead of

 SYM_INNER_LABEL(__ret, SYM_L_GLOBAL)
+SYM_INNER_LABEL(.L__ret, SYM_L_LOCAL)

Your patch didn't have the problem in the first place. Apologies for the noise!

When correctly applied (!), your patch does indeed build fine and fixes the warning for both my Intel Skylake box and a Zen2 4800U box that also had the same warning when I tested it. If it's helpful,

Tested-by: Chris Webb <chris@arachsys.com>

Oh my, I made the exact same mistake adjusting it manually to 6.1.45. Sorry for the trouble.

@torvic9

torvic9 commented Aug 12, 2023

With this patch applied, no more warning on my Zen2 box running 6.4.10.

For Zen2:
Tested-by: Tor Vic <torvic9@mailbox.org>

@nathanchance
Member Author

Boris points out that the second patch of Peter's clean up series (which depends on the first) should resolve this, and I can confirm that for at least mainline and 6.4. 6.1 and earlier seem a lot harder, since that was before call depth tracking was added.

@nickdesaulniers
Member

Peter mentions on IRC that 770ae1b should be applied to 6.1.y first. I confirmed that it applies cleanly, and @nathanchance then confirmed that the next two patches apply and fix the issue for that stable branch.

@nathanchance
Member Author

Peter's patch is now in -tip: https://git.kernel.org/tip/d43490d0ab824023e11d0b57d0aeec17a6e0ca13

@nathanchance nathanchance added [BUG] linux A bug that should be fixed in the mainline kernel. [PATCH] Accepted A submitted patch has been accepted upstream and removed [BUG] Untriaged Something isn't working [PATCH] Exists There is a patch that fixes this issue labels Aug 17, 2023
@nickdesaulniers nickdesaulniers removed their assignment Aug 17, 2023
@nathanchance
Member Author

This is now in mainline: https://git.kernel.org/linus/d43490d0ab824023e11d0b57d0aeec17a6e0ca13

It has been backported to 6.4 so far, but it should also go back to 6.1 and 5.15. It appears that the conflicts/issues with all the SRSO fixes have delayed the backports beyond 6.4 for now.

@nathanchance nathanchance added [FIXED][LINUX] development cycle This bug was only present and fixed in a -next or -rc cycle and removed [PATCH] Accepted A submitted patch has been accepted upstream labels Aug 23, 2023
@nickdesaulniers
Member

Do we want to keep this open to track backports?

@nathanchance
Member Author

Sure.

@nathanchance nathanchance reopened this Aug 23, 2023
@nathanchance nathanchance added the Needs Backport Should be backported to either linux-stable tree or latest llvm release branch. label Aug 23, 2023
@nathanchance nathanchance removed the Needs Backport Should be backported to either linux-stable tree or latest llvm release branch. label Aug 28, 2023