[objtool warnings] pairs of undefined stack state followed by stack state mismatch #319
I expect this to be fully asm goto related, given that with https://github.com/nathanchance/linux/commits/no-asm-goto and a build of Clang @ llvm/llvm-project@e50d9cb, I do not see these warnings. |
defconfig still has these warnings (different pairing):
and
and
|
Simpler reproducer:
$ make CC=clang defconfig
$ make CC=clang -j46 arch/x86/mm/fault.o
$ ./tools/objtool/objtool check arch/x86/mm/fault.o --no-fp |
creduce coughed this up: when grepping for |
(tangent; it's weird that clang accepts |
It's actually perfectly valid C code. :-) (Albeit non-ANSI C.) |
So what I think is happening here is that the clobber list is somehow messed up (for function
Attached is the disassembly from:
$ clang -O2 -no-integrated-as -c final.i.c
$ objdump -d final.i.o > final.txt
$ ~/linux/tools/objtool/objtool check final.i.o --no-fp
final.i.o: warning: objtool: t()+0x0: return with modified stack frame
final.i.o: warning: objtool: t()+0x0: stack state mismatch: cfa1=7+64 cfa2=7+8
Specifically, it seems that |
ctopper points out that the |
For further objtool warnings:
$ grep objtool: log.txt | cut -d ' ' -f 5- | sort | uniq
can't find switch jump table
return with modified stack frame
special: can't find new instruction
stack state mismatch: cfa1=6+16 cfa2=7+8
stack state mismatch: cfa1=7+64 cfa2=7+8
undefined stack state |
Is |
I guess you could say that since |
Which means disabling |
My objtool warnings:
|
Cool, so Josh (author of objtool) got back to me. Looks like the problem is seen in the relocations:
|
@topperc points out on IRC that the reproducer bug is visible before we even assemble to an object file. Replacing ...
# %bb.7: # %if.then14
#APP
1:.byte 15,31,68,0x00,0x00
.pushsection __jump_table, "aw"
.long 1b - ., .Ltmp0 - .
.quad 0 + .Ltmp0 - .
.popsection
#NO_APP
movl 4(%rsp), %eax # 4-byte Reload
.LBB0_8: # %cleanup
addq $8, %rsp
.cfi_def_cfa_offset 56
popq %rbx
.cfi_def_cfa_offset 48
popq %r12
.cfi_def_cfa_offset 40
popq %r13
.cfi_def_cfa_offset 32
popq %r14
.cfi_def_cfa_offset 24
popq %r15
.cfi_def_cfa_offset 16
popq %rbp
.cfi_def_cfa_offset 8
retq
.LBB0_9: # %if.else
.cfi_def_cfa_offset 64
ud2
.Ltmp0: # Address of block that was removed by CodeGen
.Lfunc_end0:
...
So the inline asm goto jumps to .Ltmp0, which sits past the epilogue (the pops and the return instruction). |
@topperc points out that
$ clang -mllvm -debug -mllvm -print-after-all -O2 -no-integrated-as 2>&1 final.edit.c
...
# *** IR Dump After Prologue/Epilogue Insertion & Frame Finalization ***:
bb.7.if.then14:
; predecessors: %bb.6
successors: %bb.10(0x40000000), %bb.8(0x40000000); %bb.10(50.00%), %bb.8(50.00%)
liveins: $r15
INLINEASM ...
...
# *** IR Dump After Control Flow Optimizer ***:
bb.7.if.then14:
; predecessors: %bb.6
successors: %bb.8(0x80000000); %bb.8(100.00%)
liveins: $r15
INLINEASM
... |
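For context on the two dumps above: an asm goto statement lists C labels that the assembly may branch to, so the block containing it has every listed label as a possible successor in addition to the fall-through path. A minimal sketch, assuming GCC or Clang 9 and later; the function name `never_taken` is made up for illustration and does not come from the reproducer:

```c
/* Hypothetical example, not taken from final.i.c. */
static int never_taken(void)
{
    /* The template is empty, so execution always falls through.
     * The compiler still has to keep `taken` as a possible
     * successor of this statement, because it cannot see inside
     * the assembly. */
    __asm__ goto("" : : : : taken);
    return 0;
taken:
    return 1;
}
```

Calling `never_taken()` returns 0 at run time. The point is only that both paths out of the asm statement must survive optimization, which is the kind of two-successor edge list the first dump shows.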
Diff 183623 of https://reviews.llvm.org/D53765 has cut down the vast majority (4593/4607 == 99.7%) of the objtool warnings. (The one we debugged above came from
➜ kernel-all git:(master) ✗ grep objtool: log.txt | cut -d ' ' -f 5- | sort | uniq
can't find switch jump table
return with modified stack frame
special: can't find new instruction
stack state mismatch: cfa1=6+16 cfa2=7+8
stack state mismatch: cfa1=7+64 cfa2=7+8
undefined stack state
➜ kernel-all git:(master) ✗ grep objtool: log.txt | cut -d ' ' -f 5- | wc -l
4607
➜ kernel-all git:(master) ✗ grep objtool: log2.txt | cut -d ' ' -f 5- | sort | uniq
return with modified stack frame
stack state mismatch: cfa1=6+16 cfa2=7+8
stack state mismatch: cfa1=7+64 cfa2=7+8
undefined stack state
➜ kernel-all git:(master) ✗ grep objtool: log2.txt | cut -d ' ' -f 5- | wc -l
14
Let's see if we can pull out another reproducer and whether these are asm goto related or not.
EDIT: the warning from arch/x86/mm/fault.o is not gone. It's just that we get different sets of objtool warnings for defconfig (4 + 1KI) vs allyesconfig (14). |
If you can get even a non-creduced reproducer, I might be able to make some progress with it. I'm not convinced I fixed all the possible issues with BranchFolding.cpp. |
Shouldn't these exist for the second asm goto as well?
|
@gwelymernans I don't think so...labels in C are function scoped, unless using GNU C extension "local labels." So the first pair of |
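The scoping rule mentioned above can be sketched in C. This is an illustrative example only, assuming GCC or Clang with GNU extensions enabled; `function_scoped` and `FIRST_POSITIVE` are hypothetical names, not code from the kernel or from the reproducer:

```c
/* An ordinary label is visible throughout the whole function,
 * so a goto inside a nested block can reach it. */
static int function_scoped(int x)
{
    {
        if (x > 0)
            goto out;   /* `out` is declared later, outside this block */
    }
    return 0;
out:
    return 1;
}

/* GNU C local label: __label__ limits `done` to the enclosing
 * statement expression, so the macro can be expanded more than
 * once in one function without a duplicate-label error. */
#define FIRST_POSITIVE(arr, n) ({              \
    __label__ done;                            \
    int found_ = -1;                           \
    for (int i_ = 0; i_ < (n); i_++) {         \
        if ((arr)[i_] > 0) {                   \
            found_ = (arr)[i_];                \
            goto done;                         \
        }                                      \
    }                                          \
done: ;                                        \
    found_;                                    \
})
```

Without the `__label__` declaration, expanding `FIRST_POSITIVE` twice in the same function would define `done` twice and fail to compile, because plain labels share one function-wide namespace.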
Is the hanging a new problem? |
yes (we can track in a new issue, then close this one out once that is resolved) |
Can you get me a reproducer? Even just one of the hanging sources preprocessed with a command line. |
Moved to #330. With D53765 diff 183738: attaching a non-creduced reproducer. Please let me know if you'd like it creduced. I don't see the typical
drivers/spi/spi-rockchip.o: warning: objtool: rockchip_spi_max_transfer_size()+0x13: undefined stack state
drivers/spi/spi-rockchip.o: warning: objtool: rockchip_spi_max_transfer_size()+0x0: stack state mismatch: cfa1=6+16 cfa2=7+8
so it's not clear to me what's wrong in this case:
...
00000000000022e0 <rockchip_spi_max_transfer_size>:
22e0: e8 00 00 00 00 callq 22e5 <rockchip_spi_max_transfer_size+0x5>
22e1: R_X86_64_PC32 __fentry__-0x4
22e5: 55 push %rbp
22e6: 48 89 e5 mov %rsp,%rbp
22e9: e8 00 00 00 00 callq 22ee <rockchip_spi_max_transfer_size+0xe>
22ea: R_X86_64_PC32 __sanitizer_cov_trace_pc-0x4
22ee: b8 ff ff 00 00 mov $0xffff,%eax
22f3: 5d pop %rbp
22f4: c3 retq
22f5: 90 nop
22f6: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
22fd: 00 00 00
...
And I don't see the jump table issues Josh previously pointed out. (There is no |
Comparing |
It doesn't look like there are any asm gotos in this file, right? When you say "another" special case, is that referencing some previous issue that wasn't asm goto related? |
Correct. Don't worry about it; I'll stop posting about it (it will be implied). |
The second asm goto is also in an expression statement, so I'm not sure I understand ... |
The labels the second asm goto jumps to are just normal labels and thus have function-level scope; scoped to function |
huh? see export3.zip (has many). |
When I compiled to .ll there were no callbr instructions in the final IR. It looks like a lot of them are in functions in header files. Those functions must not be used by the .c file and so they got dead code eliminated? |
oh, indeed, no callbr's there. Ok, sorry for the false negative. Did some comparisons between w/ and w/o asm goto support: https://gist.github.com/nickdesaulniers/d2181780669db43affa05f60a4b059dd TL;DR looks like there's just 1 warning unique to asm goto from objtool left (and 5 pre-existing unrelated objtool warnings (4 that we weren't aware of; filed #331 #332 #333 #334 )).
What's bizarre is that I can only reproduce when building the entire kernel, and not the standalone object file.
shakes laptop; why are you haunted?!? Oh, seems like at least everything under
Anyways, looks like the .ll does have callbr's. Also, the warning is coming from the kernel's ORC unwinder instrumenting the object file after the build (vs the previous issues where it was checking the object files). From the assembly, I see the smoking gun
Here's the reproducer. Replacing
@topperc I'm not sure if this reproducer is enough to point clearly to a problem or not. I also suspect the lack of inlining of static local functions, even if they contain meaningless asm goto's like the reproducer. |
New diff 183780 is up in phab. .LBB1_1 being unreachable was important to finding the issue. |
Trying now with localmodconfig (a kernel config based on the currently running kernel, a safe configuration if you want to boot on metal); seeing 4 different objtool warnings and some compiler hangs (4 jobs were still running after an hour). Will spend more time bisecting and comparing before+after asm goto patches are applied/checking for the presence of
Edit: filed #335 and #336, but not reassigning until I've done more due diligence on them. |
The issues I was observing with allyesconfig look fixed now as of Diff 183921. Closing this bug, but let's follow up in #336. |
Forked from #6
Lots and lots of objtool warnings of the form:
Note that they come in pairs. For an x86_64 allyesconfig, I see
2215
of these warnings (bad). I assume this is asm goto related.