-
Notifications
You must be signed in to change notification settings - Fork 11.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
"ran out of registers" while building i386 linux kernel #41914
Comments
This looks similar to https://reviews.llvm.org/rL163819 |
This isn't really a register allocator issue; there aren't enough registers available in the register class in question. |
Minimized: target datalayout = "e-m:e-p:32:32-p270:32:32-p271:32:32-p272:64:64-f64:32:64-f80:32-n8:16:32-S128"
target triple = "i386-unknown-linux-gnu"
%bigstruct = type { void (i32, i32)*, i32, i32, i32, i32, i32, i32 }
@bigconst = constant [1 x %bigstruct] zeroinitializer
@zero = global i32 0
define void @main() {
%1 = load i32, i32* @zero
%2 = getelementptr [1 x %bigstruct], [1 x %bigstruct]* @bigconst, i32 %1, i32 0, i32 0
%3 = load void (i32, i32)*, void (i32, i32)** %2
tail call void %3(i32 inreg 0, i32 inreg 1)
ret void
} |
There's some special handling of tail calls with |
I came across another instance of the output: clang-14 --target=x86_64-linux -m32 -O2 -fno-omit-frame-pointer -c nf_synproxy_core.i
nf_synproxy_core.i:22:7: error: inline assembly requires more registers than available
asm("" : "=&r"(d) : "r"(a), "r"(b), "r"(0), "r"(c), "0"(d));
^ based one the input below (reduced with cvise):int a, b, d, j, l, e, g;
char c;
struct e {
char f;
};
struct g {
struct e h;
};
struct i {
} m(void), f, h;
enum { k } n(struct i *, struct e *);
struct {
short ac;
} * o, *q;
int (*p)();
_Bool r;
short s, t;
unsigned char *u;
void v(void);
static void w(int *x, struct i *z) {
struct g y;
asm("" : "=&r"(d) : "r"(a), "r"(b), "r"(0), "r"(c), "0"(d));
q->ac = 0;
t = (unsigned char *)q - u;
s = y.h.f = k;
n(z, &(&y)->h);
if (o)
p(x, j, l, r);
v();
m();
}
void aa() { w(&e, &f); }
void ab() { w(&g, &h); } The above happens with clang-12 through at least clang-14, but not with clang-11. And there is one more in 64-bit mode that I have not reduced but could if that helps:
|
Perhaps relevant to |
Perhaps a clang-12 regression? |
Bisection converged on https://reviews.llvm.org/D83996. (cc @topperc, @phoebewang ); doesn't look like a regression per se. But it definitely looks like having a scheduling model or not affects this, since we were also seeing issues for |
There are two copies from EAX and EDX into virtual regsiters at the start of the function immediately after isel. Those got re-scheduled after the inline assembly which oversubscribed the register allocator for the inline assembly. |
Adding
to the SandyBridge scheduler model makes it go away. |
Uploaded a test case to pre-commit/demonstrate a diff once we agree on a fix: https://reviews.llvm.org/D122348. From discord:
|
https://reviews.llvm.org/D122350 is a tentative patch, but is touching the schedule model really the right fix? Shouldn't we modify instruction scheduling to account for (or straight up avoid) extending live ranges of physregs across INLINEASM statements? |
diff --git a/llvm/lib/CodeGen/MachineScheduler.cpp b/llvm/lib/CodeGen/MachineScheduler.cpp
index 40cfaebcf55f..d22b5b2a6f96 100644
--- a/llvm/lib/CodeGen/MachineScheduler.cpp
+++ b/llvm/lib/CodeGen/MachineScheduler.cpp
@@ -463,7 +463,13 @@ static bool isSchedBoundary(MachineBasicBlock::iterator MI,
MachineBasicBlock *MBB,
MachineFunction *MF,
const TargetInstrInfo *TII) {
- return MI->isCall() || TII->isSchedulingBoundary(*MI, MBB, *MF);
+ if (MI->isCall() || TII->isSchedulingBoundary(*MI, MBB, *MF))
+ return true;
+ if (MI->isInlineAsm() && !MBB->livein_empty())
+ for (MachineOperand &MO : MI->operands())
+ if (MO.isReg())
+ return true;
+ return false;
}
/// A region of an MBB for scheduling. |
I think this may have big impact to current use of inline asm, given out of registers cases are rare.
|
To match the cost of other scheduling models. This is expected to schedule mov instructions around INLINEASM less frequently for the default machineschedule (pre-RA scheduling). Suggested by Craig Topper. Link: #41914 Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D122350
I just landed a patch that will ship in clang-15. It makes it less likely that we schedule That said, the MachineScheduler is still not hardened from this re-occuring. Ideally, the scheduler could check if reordering a mov across an inlineasm would produce unsatisfiable constraints on the register allocator. IIRC, MachineScheduler just uses a priority queue insertion to order instructions that don't have data dependencies between them. So this might not be the last time we see this issue, but my patch should help a bit. Closing for now, but let's open a new bug report with more test cases if we can still reproduce this with clang-15. (My patch allows both test cases by @arndb to compile without issue) |
As is tradition, I spoke too soon. This fixes one of our instances for no |
@nickdesaulniers Is this on 32-bit atom builds? We always use ILP on Atom CPU targets which is going to waste registers that 32-bit can't really afford.
|
No. 64b ATOM. 926afd7 looks like it moved 32b atom from hybrid to ilp. If I change atom to hybrid, I can still reproduce the register exhaustion compilation failure. Same for Maybe I should be looking at the I've posted a smaller reproducer here. |
Extended Description
One file in the linux kernel produces an internal error from the register allocator. I reduced the test case to this:
See https://godbolt.org/z/aQ96HC
The text was updated successfully, but these errors were encountered: