-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
4.4 x86_64 defconfig panic at boot #1085
Comments
In case anyone needs to reproduce. |
I'm having trouble reproducing.
I also tried removing Though I'm doing |
My immediate thought is that we forgot a situation where code is being moved past an INLINEASM_BR. |
Let's compare disassembly of object files before and after that suspected llvm commit.
Checking the ir:
There's only one callbr: define internal fastcc i32 @klist_dec_and_del(%struct.klist_node* %0) unnamed_addr #1 {
%2 = getelementptr inbounds %struct.klist_node, %struct.klist_node* %0, i64 0, i32 2
%3 = getelementptr inbounds %struct.kref, %struct.kref* %2, i64 0, i32 0, i32 0
callbr void asm sideeffect ".pushsection .smp_locks,\22a\22\0A.balign 4\0A.long 671f - .\0A.popsection\0A671:\0A\09lock; subl $1, $0; je ${2:l}", "*m,er,X,~{memory},~{dirflag},~{fpsr},~{flags}"(i32* %3, i32 1, i8* blockaddress(@klist_dec_and_del, %5)) #6
738 to label %4 [label %5], !srcloc !13 so that's kind of our smoking gun, since ".pushsection .smp_locks,\22a\22\0A.balign 4\0A.long 671f - .\0A.popsection\0A671:\0A\09lock; subl $1, $0; je ${2:l}" might be a hint at which |
call chain looks something like: |
I'm extracting the IR for |
Seems there's no difference between IR? $ make CC=clang LD=ld.lld O=../tot_llvm -j71 lib/klist.ll
$ llvm-extract --func=klist_dec_and_del lib/klist.ll -o=klist_dec_and_del.bc
$ llvm-dis klist_dec_and_del.bc
<same for 78c69a00a4cff786e0ef13c895d0db309d6b3f42>
$ diff -u <(cat ../78c69a0_llvm/klist_dec_and_del.ll) <(cat ../tot_llvm/klist_dec_and_del.ll)
<no difference> does that imply a difference in lowering? ie. |
Yes. I think this is a code-gen bug, not an optimization bug. |
I looked at klist.s before and after the change and there aren't any significant differences.
|
@gwelymernans , I think you're holding it wrong. 🔥 It looks like you might be comparing before/after Fangrui's change involving local labels? I'm comparing 78c69a00a4cff786e0ef13c895d0db309d6b3f42 vs 158feabde4cb98021469ed4126682d8ee57456eb (which, admittedly, could be narrower). $ make CC=clang LD=ld.lld O=../78c69a0_llvm -j71 lib/klist.s
<same for ToT LLVM>
$ diff -u <(cat ../78c69a0_llvm/lib/klist.s ) <(cat ../tot_llvm/lib/klist.s ) @@ -615,7 +615,8 @@
671:
lock; subl $1, 24(%rdi); je .Ltmp0
#NO_APP
-.LBB12_1:
+ leaq 24(%rdi), %rbx
+# %bb.1:
#APP
#NO_APP
xorl %eax, %eax
@@ -629,12 +630,10 @@
retq
.Ltmp0: # Block address taken
.LBB12_2:
- movq %rdi, %rbx
- addq $24, %rbx
testb $1, -24(%rbx)
je .LBB12_3
.LBB12_4: So before, the inline asm from Ok, now let me dump the output from |
$ llc -O2 lib/klist.ll -print-after-all -o /dev/null &> klist.txt
<same for both versions of LLVM>
$ diff -u <(cat ../78c69a0_llvm/klist.txt ) <(cat ../tot_llvm/klist.txt)
...
<first hunk in the diff is (ignoring many srcloc differences)> @@ -74602,6 +74602,7 @@
successors: %bb.1(0x80000000), %bb.2(0x00000000); %bb.1(100.00%), %bb.2(0.00%)
liveins: $rdi
%7:gr64 = COPY $rdi
+ %0:gr64 = nuw ADD64ri8 %7:gr64(tied-def 0), 24, implicit-def dead $eflags
INLINEASM_BR &".pushsection .smp_locks,\22a\22\0A.balign 4\0A.long 671f - .\0A.popsection\0A671:\0A\09lock; subl $1, $0; je ${2:l}" [sideeffect] [mayload] [maystore] [attdialect], $0:[mem:m], %7:gr64, 1, $noreg, 24, $noreg, $1:[imm], 1, $2:[imm], blockaddress(@klist_dec_and_del, %ir-block.5), $3:[clobber], implicit-def early-clobber $df, $4:[clobber], implicit-def early-clobber $fpsw, $5:[clobber], implicit-def early-clobber $eflags, !13
JMP_1 %bb.1 so looking around L74602 in either file for the pass (
So looks like "Machine code sinking" is the first source of divergence. FWIW, in 78c69a0, this code looks like:
|
Let me rerun this with a tighter range of LLVM SHA's (just before and after @jyknight 's change). I worry that I might be conflating other changes to LLVM. Sorry. |
Using 4b0aa5724fea (just @jyknight 's change) instead of ToT LLVM, I get the same results. Ok, now let's see if I can reduce the IR in order to get us a test case that we can commit to LLVM with a fix. |
Very possible! :-) Nice reduction. It looks like INLINEASM_BR needs to be a barrier to hoisting/sinking in more places. |
Should we mark INLINEASM_BR as "isBarrier" in |
I tried diff --git a/llvm/include/llvm/Target/Target.td b/llvm/include/llvm/Target/Target.td
index aab5376db453..7ec6db9fc4f3 100644
--- a/llvm/include/llvm/Target/Target.td
+++ b/llvm/include/llvm/Target/Target.td
@@ -1021,6 +1021,7 @@ def INLINEASM_BR : StandardPseudoInstruction {
let hasSideEffects = 1;
// Despite potentially branching, this instruction is intentionally _not_
// marked as a terminator or a branch.
+ let isBarrier = 1;
}
def CFI_INSTRUCTION : StandardPseudoInstruction {
let OutOperandList = (outs); with a quick boot test (not looking at the output), but it failed in the same way. 😞 |
ToT #!/usr/bin/env sh
OUT=$(llc -O2 $1 -print-after=machine-sink -stop-after=machine-sink -o /dev/null 2>&1)
echo "$OUT" | grep -FB1 'INLINEASM_BR &".pushsection .smp_locks,\22a\22\0A.balign 4\0A.long 671f - .\0A.popsection\0A671:\0A\09lock; subl $1, $0; je ${2:l}" [sideeffect] [mayload] [maystore] [attdialect], $0:[mem:m], %7:gr64, 1, $noreg, 24, $noreg, $1:[imm], 1, $2:[imm], blockaddress(@klist_dec_and_del, %ir-block.5), $3:[clobber], implicit-def early-clobber $df, $4:[clobber], implicit-def early-clobber $fpsw, $5:[clobber], implicit-def early-clobber $eflags, !13' | \
not grep -F '%0:gr64 = nuw ADD64ri8 %7:gr64(tied-def 0), 24, implicit-def dead $eflags'
# interesting.sh so "grep for the lone $ bugpoint -compile-custom -compile-command=./interesting.sh lib/klist.ll
$ llvm-dis bugpoint-reduced-simplified.bc
$ wc -l lib/klist.ll bugpoint-reduced-simplified.ll
954 lib/klist.ll
354 bugpoint-reduced-simplified.ll I've run that repeatedly through |
Making copies of #!/usr/bin/env bash
GOOD=$(llc-good -O2 $1 -print-after=machine-sink -stop-after=machine-sink -o /dev/null 2>&1)
BAD=$(llc-bad -O2 $1 -print-after=machine-sink -stop-after=machine-sink -o /dev/null 2>&1)
diff -u <( echo "$GOOD" ) <( echo "$BAD" ) | \
grep -B1 "INLINEASM_BR" | \
not grep "ADD64ri8" $ bugpoint -compile-custom -compile-command=./interesting.sh lib/klist.ll
$ llvm-dis bugpoint-reduced-simplified.bc -o bugpoint-reduced-simplified.ll
$ llvm-reduce --test=interesting.sh bugpoint-reduced-simplified.ll -o reduced.ll then I manually cleaned up the types and output a little. $ wc -l reduced.ll
25 reduced.ll I've verified my test case passes for 78c69a00a4cf (before latest asm goto change) and fails after. I'll see if I can just fix machine sink, tomorrow. |
If I remove the diff added in |
@jyknight made a further fix that I think is preferable: https://reviews.llvm.org/D83708. |
This function has a bug which will incorrectly reschedule instructions after an INLINEASM_BR (which can branch). (The bug may also allow scheduling past a throwing-CALL, I'm not certain.) I could fix that bug, but, as the removed FIXME notes, it's better to attempt rescheduling before converting to 3-addr form, as that may remove the need to convert in the first place. In fact, the code to do such reordering was added to this pass only a few months later, in 2011, via the addition of the function rescheduleMIBelowKill. That code does not contain the same bug. The removal of the sink3AddrInstruction function is not a no-op: in some cases it would move an instruction post-conversion, when rescheduleMIBelowKill would not move the instruction pre-converison. However, this does not appear to be important: the machine instruction scheduler can reorder the after-conversion instructions, in any case. This patch fixes a kernel panic 4.4 LTS x86_64 Linux kernels, when built with clang after 4b0aa57. Link: ClangBuiltLinux/linux#1085 Differential Revision: https://reviews.llvm.org/D83708
Good to pick this llvm/llvm-project@60433c6 up for |
This function has a bug which will incorrectly reschedule instructions after an INLINEASM_BR (which can branch). (The bug may also allow scheduling past a throwing-CALL, I'm not certain.) I could fix that bug, but, as the removed FIXME notes, it's better to attempt rescheduling before converting to 3-addr form, as that may remove the need to convert in the first place. In fact, the code to do such reordering was added to this pass only a few months later, in 2011, via the addition of the function rescheduleMIBelowKill. That code does not contain the same bug. The removal of the sink3AddrInstruction function is not a no-op: in some cases it would move an instruction post-conversion, when rescheduleMIBelowKill would not move the instruction pre-converison. However, this does not appear to be important: the machine instruction scheduler can reorder the after-conversion instructions, in any case. This patch fixes a kernel panic 4.4 LTS x86_64 Linux kernels, when built with clang after 4b0aa57. Link: ClangBuiltLinux/linux#1085 Differential Revision: https://reviews.llvm.org/D83708 (cherry picked from commit 60433c6)
Just noting that this also appears to fix a regression reporting internally for ppc64 in mm/memcontrol.c. |
Great news, it's in |
This function has a bug which will incorrectly reschedule instructions after an INLINEASM_BR (which can branch). (The bug may also allow scheduling past a throwing-CALL, I'm not certain.) I could fix that bug, but, as the removed FIXME notes, it's better to attempt rescheduling before converting to 3-addr form, as that may remove the need to convert in the first place. In fact, the code to do such reordering was added to this pass only a few months later, in 2011, via the addition of the function rescheduleMIBelowKill. That code does not contain the same bug. The removal of the sink3AddrInstruction function is not a no-op: in some cases it would move an instruction post-conversion, when rescheduleMIBelowKill would not move the instruction pre-converison. However, this does not appear to be important: the machine instruction scheduler can reorder the after-conversion instructions, in any case. This patch fixes a kernel panic 4.4 LTS x86_64 Linux kernels, when built with clang after 4b0aa57. Link: ClangBuiltLinux/linux#1085 Differential Revision: https://reviews.llvm.org/D83708
This function has a bug which will incorrectly reschedule instructions after an INLINEASM_BR (which can branch). (The bug may also allow scheduling past a throwing-CALL, I'm not certain.) I could fix that bug, but, as the removed FIXME notes, it's better to attempt rescheduling before converting to 3-addr form, as that may remove the need to convert in the first place. In fact, the code to do such reordering was added to this pass only a few months later, in 2011, via the addition of the function rescheduleMIBelowKill. That code does not contain the same bug. The removal of the sink3AddrInstruction function is not a no-op: in some cases it would move an instruction post-conversion, when rescheduleMIBelowKill would not move the instruction pre-converison. However, this does not appear to be important: the machine instruction scheduler can reorder the after-conversion instructions, in any case. This patch fixes a kernel panic 4.4 LTS x86_64 Linux kernels, when built with clang after d80e429. Link: ClangBuiltLinux/linux#1085 Differential Revision: https://reviews.llvm.org/D83708 (cherry picked from commit 3eec260)
This function has a bug which will incorrectly reschedule instructions after an INLINEASM_BR (which can branch). (The bug may also allow scheduling past a throwing-CALL, I'm not certain.) I could fix that bug, but, as the removed FIXME notes, it's better to attempt rescheduling before converting to 3-addr form, as that may remove the need to convert in the first place. In fact, the code to do such reordering was added to this pass only a few months later, in 2011, via the addition of the function rescheduleMIBelowKill. That code does not contain the same bug. The removal of the sink3AddrInstruction function is not a no-op: in some cases it would move an instruction post-conversion, when rescheduleMIBelowKill would not move the instruction pre-converison. However, this does not appear to be important: the machine instruction scheduler can reorder the after-conversion instructions, in any case. This patch fixes a kernel panic 4.4 LTS x86_64 Linux kernels, when built with clang after 4b0aa5724feaa89a9538dcab97e018110b0e4bc3. Link: ClangBuiltLinux/linux#1085 Differential Revision: https://reviews.llvm.org/D83708
According to bisect, this is due to the asm goto rework that happened in llvm/llvm-project@4b0aa57.
Panic can be viewed here: https://travis-ci.com/github/ClangBuiltLinux/continuous-integration/jobs/358738630
I have not done any triage beyond the bisect.
The text was updated successfully, but these errors were encountered: