[RISC-V] Fix crash with late stack realignment requirement #83496
In order to perform a stack re-alignment, the RISC-V back end requires the frame pointer, and the frame pointer in turn requires a register to be reserved. The decision on which registers to reserve is made prior to any spilling introduced by RA. If RA has to spill a register of a class that requires greater alignment than the stack alignment, this triggers a requirement to re-align the stack, but the FP register hasn't been reserved. This patch adds a check for any use of a register that could cause stack re-alignment if it were to be spilled. Currently, the only situation that triggers this issue is when we have Zdinx with the E extension (for which we align the stack to 4 bytes).
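The ordering problem can be illustrated with a small standalone model. This is only a sketch and does not use LLVM's APIs: `ToyRegClass`, `ToyFunction`, and `mustReserveFP` are hypothetical names, and the toy `maxPossibleSpillAlign` merely mirrors the idea of the patch. It scans the register classes a function uses, takes the largest spill alignment, and reports whether the frame pointer would have to be reserved up front because a later spill could exceed the stack alignment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a register class: only its spill alignment
// (in bytes) matters for this model.
struct ToyRegClass {
  uint64_t SpillAlignBytes;
};

// Hypothetical stand-in for a function: the register classes of the
// register operands it uses.
struct ToyFunction {
  std::vector<ToyRegClass> UsedClasses;
};

// Largest spill alignment that any used register class could require.
// Returns 1 when the function uses no registers at all.
uint64_t maxPossibleSpillAlign(const ToyFunction &F) {
  uint64_t MaxAlign = 1;
  for (const ToyRegClass &RC : F.UsedClasses) {
    assert(RC.SpillAlignBytes != 0 && "alignment must be non-zero");
    MaxAlign = std::max(MaxAlign, RC.SpillAlignBytes);
  }
  return MaxAlign;
}

// Decide, before any spill exists, whether the frame pointer must be
// reserved: a spill slot more aligned than the stack forces re-alignment.
bool mustReserveFP(const ToyFunction &F, uint64_t StackAlignBytes) {
  return maxPossibleSpillAlign(F) > StackAlignBytes;
}
```

With the values discussed in this thread (a 4-byte stack alignment and a pair class whose spill slot is aligned to 8 bytes), a function that touches the pair class makes `mustReserveFP` return true, while a function that only uses 4-byte classes does not.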
@llvm/pr-subscribers-backend-risc-v
Author: Nemanja Ivanovic (nemanjai)
Full diff: https://github.com/llvm/llvm-project/pull/83496.diff
3 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
index 8bac41372b5a83..26cf6a24d15cca 100644
--- a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
@@ -306,9 +306,12 @@ static Register getMaxPushPopReg(const MachineFunction &MF,
// variable sized allocas, or if the frame address is taken.
bool RISCVFrameLowering::hasFP(const MachineFunction &MF) const {
const TargetRegisterInfo *RegInfo = MF.getSubtarget().getRegisterInfo();
-
+ bool HasExtE =
+ MF.getSubtarget<RISCVSubtarget>().getTargetABI() == RISCVABI::ABI_ILP32E;
const MachineFrameInfo &MFI = MF.getFrameInfo();
+
return MF.getTarget().Options.DisableFramePointerElim(MF) ||
+ (HasExtE && maxPossibleSpillAlign(MF) > Align(4)) ||
RegInfo->hasStackRealignment(MF) || MFI.hasVarSizedObjects() ||
MFI.isFrameAddressTaken();
}
@@ -1175,6 +1178,35 @@ static unsigned estimateFunctionSizeInBytes(const MachineFunction &MF,
return FnSize;
}
+Align RISCVFrameLowering::maxPossibleSpillAlign(
+ const MachineFunction &MF) const {
+ if (MaxSpillAlign.contains(&MF))
+ return MaxSpillAlign.at(&MF);
+
+ const MachineRegisterInfo &MRI = MF.getRegInfo();
+ const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
+ Align CurrMaxAlign = Align(1);
+ // The maximum spill alignment has not yet been computed. Compute it.
+ for (const MachineBasicBlock &MBB : MF) {
+ for (const MachineInstr &MI : MBB) {
+ for (const MachineOperand &MO : MI.operands()) {
+ if (!MO.isReg())
+ continue;
+ Register Reg = MO.getReg();
+ const TargetRegisterClass *RC;
+ if (Reg.isPhysical())
+ RC = TRI->getMinimalPhysRegClass(Reg);
+ else
+ RC = MRI.getRegClassOrNull(MO.getReg());
+ if (RC && TRI->getSpillAlign(*RC) > CurrMaxAlign)
+ CurrMaxAlign = TRI->getSpillAlign(*RC);
+ }
+ }
+ }
+ MaxSpillAlign[&MF] = CurrMaxAlign;
+ return CurrMaxAlign;
+}
+
void RISCVFrameLowering::processFunctionBeforeFrameFinalized(
MachineFunction &MF, RegScavenger *RS) const {
const RISCVRegisterInfo *RegInfo =
diff --git a/llvm/lib/Target/RISCV/RISCVFrameLowering.h b/llvm/lib/Target/RISCV/RISCVFrameLowering.h
index 210f8c1064724a..cb5dbf82dc1b11 100644
--- a/llvm/lib/Target/RISCV/RISCVFrameLowering.h
+++ b/llvm/lib/Target/RISCV/RISCVFrameLowering.h
@@ -13,6 +13,7 @@
#ifndef LLVM_LIB_TARGET_RISCV_RISCVFRAMELOWERING_H
#define LLVM_LIB_TARGET_RISCV_RISCVFRAMELOWERING_H
+#include "llvm/ADT/DenseMap.h"
#include "llvm/CodeGen/TargetFrameLowering.h"
#include "llvm/Support/TypeSize.h"
@@ -37,6 +38,7 @@ class RISCVFrameLowering : public TargetFrameLowering {
void processFunctionBeforeFrameFinalized(MachineFunction &MF,
RegScavenger *RS) const override;
+ Align maxPossibleSpillAlign(const MachineFunction &MF) const;
bool hasFP(const MachineFunction &MF) const override;
bool hasBP(const MachineFunction &MF) const;
@@ -82,6 +84,7 @@ class RISCVFrameLowering : public TargetFrameLowering {
protected:
const RISCVSubtarget &STI;
+ mutable llvm::DenseMap<const MachineFunction *, Align> MaxSpillAlign;
private:
void determineFrameLayout(MachineFunction &MF) const;
diff --git a/llvm/test/CodeGen/RISCV/zdinx-realignment-with-e.ll b/llvm/test/CodeGen/RISCV/zdinx-realignment-with-e.ll
new file mode 100644
index 00000000000000..1b0e6d1967ff92
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/zdinx-realignment-with-e.ll
@@ -0,0 +1,64 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 2
+; RUN: llc --mtriple=riscv32 --mattr=+e,+zdinx --target-abi ilp32e \
+; RUN: --verify-machineinstrs < %s | FileCheck %s
+define void @func(ptr %__first, ptr %__start, ptr %add.ptr23) #0 {
+; CHECK-LABEL: func:
+; CHECK: # %bb.0: # %entry
+; CHECK-NEXT: addi sp, sp, -24
+; CHECK-NEXT: .cfi_def_cfa_offset 24
+; CHECK-NEXT: sw ra, 20(sp) # 4-byte Folded Spill
+; CHECK-NEXT: sw s0, 16(sp) # 4-byte Folded Spill
+; CHECK-NEXT: .cfi_offset ra, -4
+; CHECK-NEXT: .cfi_offset s0, -8
+; CHECK-NEXT: addi s0, sp, 24
+; CHECK-NEXT: .cfi_def_cfa s0, 0
+; CHECK-NEXT: andi sp, sp, -8
+; CHECK-NEXT: lui a1, %hi(.LCPI0_0)
+; CHECK-NEXT: lw a4, %lo(.LCPI0_0)(a1)
+; CHECK-NEXT: lw a5, %lo(.LCPI0_0+4)(a1)
+; CHECK-NEXT: sw a4, 8(sp)
+; CHECK-NEXT: sw a5, 12(sp)
+; CHECK-NEXT: lui a1, %hi(.LCPI0_1)
+; CHECK-NEXT: lw a4, %lo(.LCPI0_1)(a1)
+; CHECK-NEXT: lw a5, %lo(.LCPI0_1+4)(a1)
+; CHECK-NEXT: sw a4, 0(sp)
+; CHECK-NEXT: sw a5, 4(sp)
+; CHECK-NEXT: .LBB0_1: # %do.body
+; CHECK-NEXT: # =>This Inner Loop Header: Depth=1
+; CHECK-NEXT: lw a4, 0(zero)
+; CHECK-NEXT: lw a5, 4(zero)
+; CHECK-NEXT: fcvt.d.w t1, zero
+; CHECK-NEXT: sw t1, 0(a0)
+; CHECK-NEXT: sw t2, 4(a0)
+; CHECK-NEXT: lw t1, 8(sp)
+; CHECK-NEXT: lw t2, 12(sp)
+; CHECK-NEXT: flt.d a1, t1, a4
+; CHECK-NEXT: neg a1, a1
+; CHECK-NEXT: and a1, a1, a2
+; CHECK-NEXT: lw a4, 0(a1)
+; CHECK-NEXT: lw a5, 4(a1)
+; CHECK-NEXT: lw t1, 0(sp)
+; CHECK-NEXT: lw t2, 4(sp)
+; CHECK-NEXT: flt.d a1, a4, t1
+; CHECK-NEXT: beqz a1, .LBB0_1
+; CHECK-NEXT: # %bb.2: # %do.end
+; CHECK-NEXT: addi sp, s0, -24
+; CHECK-NEXT: lw ra, 20(sp) # 4-byte Folded Reload
+; CHECK-NEXT: lw s0, 16(sp) # 4-byte Folded Reload
+; CHECK-NEXT: addi sp, sp, 24
+; CHECK-NEXT: ret
+entry:
+ br label %do.body
+
+do.body: ; preds = %do.body, %entry
+ store double 0.000000e+00, ptr %__first, align 8
+ %0 = load double, ptr null, align 8
+ %cmp.i51 = fcmp olt double 0x7FF8000000000000, %0
+ %__child_i.2 = select i1 %cmp.i51, ptr %add.ptr23, ptr null
+ %1 = load double, ptr %__child_i.2, align 8
+ %cmp.i52 = fcmp olt double %1, 1.000000e+00
+ br i1 %cmp.i52, label %do.end, label %do.body
+
+do.end: ; preds = %do.body
+ ret void
+}
Another possible solution would be to change the alignment requirement for GPRPair to 32. Perhaps this would need to only be done when the E extension is enabled, so maybe it would require a separate register class.
In any case, thought I'd propose these suggestions and we can converge on some solution to this problem.
continue;
Register Reg = MO.getReg();
const TargetRegisterClass *RC;
if (Reg.isPhysical())
Strictly speaking, this is not needed to resolve the observed problem as we won't insert a spill of a physical register. However, I implemented it this way to catch any use that might require a realignment - I guess a paranoid position.
Decided to remove it after all. It fails on implicit defs that don't have a register class anyway.
It should have an alignment of 32, but then we have a potential issue spilling a GPRPair. We need to be able to calculate the address of the second half of the spill slot without needing a temporary register. By having the entire spill slot aligned to 8, we guarantee that we can fold +4 into the load/store address in
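One way to read the +4 folding argument is through the 12-bit signed immediate that RISC-V loads and stores accept (offsets from -2048 to 2047). The sketch below is a standalone check, not LLVM code: `fitsImm12`, `canFoldSecondHalf`, and `allOffsetsFold` are hypothetical helpers, and it assumes the slot offset from the base register is itself a multiple of the slot alignment. Under that assumption, every 8-aligned offset that fits the immediate leaves room for +4, whereas a 4-aligned offset such as 2044 does not.

```cpp
#include <cassert>
#include <cstdint>

// RISC-V loads and stores encode a 12-bit signed immediate offset.
constexpr int64_t MinImm12 = -2048;
constexpr int64_t MaxImm12 = 2047;

constexpr bool fitsImm12(int64_t Offset) {
  return Offset >= MinImm12 && Offset <= MaxImm12;
}

// True if both 4-byte halves of a pair slot at `Offset` can be addressed
// from the same base register, i.e. without a temporary register.
constexpr bool canFoldSecondHalf(int64_t Offset) {
  return fitsImm12(Offset) && fitsImm12(Offset + 4);
}

// Hypothetical helper: checks every offset in the immediate range that is
// a multiple of `AlignBytes` and reports whether +4 always still fits.
bool allOffsetsFold(int64_t AlignBytes) {
  assert(AlignBytes > 0 && "alignment must be positive");
  for (int64_t Off = MinImm12; Off <= MaxImm12; ++Off) {
    if (Off % AlignBytes != 0)
      continue; // not a possible slot offset for this alignment
    if (!canFoldSecondHalf(Off))
      return false;
  }
  return true;
}
```

The largest 8-aligned offset in range is 2040, so the second half lands at 2044 and still fits. With 4-byte alignment the slot could sit at 2044, which would put the second half at 2048, outside the immediate range.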
Thanks for fixing this problem! I know there will be a lot of problems when ILP32E meets with latest ratified extensions…
@@ -306,9 +306,12 @@ static Register getMaxPushPopReg(const MachineFunction &MF,
// variable sized allocas, or if the frame address is taken.
bool RISCVFrameLowering::hasFP(const MachineFunction &MF) const {
const TargetRegisterInfo *RegInfo = MF.getSubtarget().getRegisterInfo();

bool HasExtE =
HasExtE -> HasEABI?
return MaxSpillAlign.at(&MF);

const MachineRegisterInfo &MRI = MF.getRegInfo();
const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
- const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
+ const TargetRegisterInfo *TRI = STI.getRegisterInfo();
I suppose we could solve it very pessimistically for the Global/CPI case by adding a lui/addi pair into the register that gets freed with the first load/store (and addi for the immediate case), but that seems worse than the potential stack re-alignment for the E ABI, especially since we would presumably have to restore the address if the register isn't dead.
The Global/CPI case should be independent of the stack align. Those should only come from user loads/stores. I wonder if we should be splitting 64-bit loads/stores during isel instead of generating pseudos.
...
I'm not sure I fully follow how this will help resolve this problem. Do you mean that we would set the alignment of the paired registers to 4, expand the loads/stores of
Globals/CPI are unrelated to the stack realignment issue; let's forget about them. When I mentioned I wonder if we could change the alignment of the pair to 4, and scavenge a register in the rare case that Or change the spill alignment to 4 and teach