[RISCV] Move RISCVDeadRegisterDefinitions to post vector regalloc #90636

lukel97 · 2024-04-30T17:55:23Z

Currently RISCVDeadRegisterDefinitions runs after vsetvli insertion, but in #70549 vsetvli insertion runs after vector regalloc, and as a result we no longer convert some vsetvli a0, a0 instructions to vsetvli x0, a0. This patch moves the pass to after vector regalloc but before scalar regalloc, so we still get the benefit of reduced register pressure.

There are no differences on llvm-test-suite or SPEC CPU 2017 with -O3 and -march=rv64gv, just one case in the in-tree tests where a vsetvli output VL is now marked as dead after phi elimination.

The original patch mentions that it may reduce register pressure and improve register allocation, but I believe the register allocator already accounts for dead defs, which is why this is almost an NFC. The motivation for this is to slightly simplify how we handle vleff with post (vector) regalloc in llvm#70549, by keeping GPR defs virtual where possible.
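To make the rewrite concrete, here is a minimal toy model of what a dead-register-definition rewrite does. It is a sketch, not the pass itself: ToyInst, isReadLater and rewriteDeadDefs are hypothetical names that do not exist in LLVM, and it assumes straight-line code with no GPR live out of the sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction (hypothetical, not an LLVM type).
struct ToyInst {
  std::string Opcode;
  std::string GprDef;               // GPR written, or "" if none
  std::vector<std::string> GprUses; // GPRs read
};

// True if Reg is read after instruction Idx and before being overwritten.
// Assumes straight-line code with no GPR live out of the sequence.
static bool isReadLater(const std::vector<ToyInst> &Insts, std::size_t Idx,
                        const std::string &Reg) {
  assert(Idx < Insts.size() && "index out of range");
  for (std::size_t J = Idx + 1; J < Insts.size(); ++J) {
    for (const std::string &Use : Insts[J].GprUses)
      if (Use == Reg)
        return true;
    if (Insts[J].GprDef == Reg)
      return false; // overwritten before any read
  }
  return false;
}

// Redirects every dead GPR def to x0; returns how many defs were rewritten.
static unsigned rewriteDeadDefs(std::vector<ToyInst> &Insts) {
  unsigned NumRewritten = 0;
  for (std::size_t I = 0; I < Insts.size(); ++I) {
    ToyInst &MI = Insts[I];
    if (MI.GprDef.empty() || MI.GprDef == "x0")
      continue; // nothing to rewrite
    if (isReadLater(Insts, I, MI.GprDef))
      continue; // the def is live, keep it
    MI.GprDef = "x0";
    ++NumRewritten;
  }
  return NumRewritten;
}
```

In this model, a vsetvli whose output VL is never read has its def redirected to x0, which corresponds to the vsetvli a0, a0 to vsetvli x0, a0 conversion described above. A def that is read by a later instruction is left alone.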

llvmbot · 2024-04-30T17:55:55Z

@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

Changes

The original patch mentions that it may reduce register pressure and improve register allocation, but I believe the register allocator already accounts for dead defs, which is why this is almost an NFC.

There are no differences on llvm-test-suite or SPEC CPU 2017 with -O3 and -march=rv64gv, just one case in the in-tree tests where a vsetvli output VL is now marked as dead after phi elimination.

The motivation for this is to slightly simplify how we handle vleff with post (vector) regalloc in #70549, by keeping GPR defs virtual where possible.

Full diff: https://github.com/llvm/llvm-project/pull/90636.diff

4 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVDeadRegisterDefinitions.cpp (+1-3)
(modified) llvm/lib/Target/RISCV/RISCVTargetMachine.cpp (+3-3)
(modified) llvm/test/CodeGen/RISCV/O3-pipeline.ll (+1-1)
(modified) llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll (+2-2)

diff --git a/llvm/lib/Target/RISCV/RISCVDeadRegisterDefinitions.cpp b/llvm/lib/Target/RISCV/RISCVDeadRegisterDefinitions.cpp
index df607236f7d56f..fd6708c44e7720 100644
--- a/llvm/lib/Target/RISCV/RISCVDeadRegisterDefinitions.cpp
+++ b/llvm/lib/Target/RISCV/RISCVDeadRegisterDefinitions.cpp
@@ -77,10 +77,8 @@ bool RISCVDeadRegisterDefinitions::runOnMachineFunction(MachineFunction &MF) {
           LLVM_DEBUG(dbgs() << "    Ignoring, def is tied operand.\n");
           continue;
         }
-        // We should not have any relevant physreg defs that are replacable by
-        // zero before register allocation. So we just check for dead vreg defs.
         Register Reg = MO.getReg();
-        if (!Reg.isVirtual() || (!MO.isDead() && !MRI->use_nodbg_empty(Reg)))
+        if (!MO.isDead() && !MRI->use_nodbg_empty(Reg))
           continue;
         LLVM_DEBUG(dbgs() << "    Dead def operand #" << I << " in:\n      ";
                    MI.print(dbgs()));
diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index 0876f46728a10c..d4978abf205f53 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -536,9 +536,6 @@ void RISCVPassConfig::addPreRegAlloc() {
   if (TM->getOptLevel() != CodeGenOptLevel::None)
     addPass(createRISCVMergeBaseOffsetOptPass());
   addPass(createRISCVInsertVSETVLIPass());
-  if (TM->getOptLevel() != CodeGenOptLevel::None &&
-      EnableRISCVDeadRegisterElimination)
-    addPass(createRISCVDeadRegisterDefinitionsPass());
   addPass(createRISCVInsertReadWriteCSRPass());
   addPass(createRISCVInsertWriteVXRMPass());
 }
@@ -553,6 +550,9 @@ void RISCVPassConfig::addPostRegAlloc() {
   if (TM->getOptLevel() != CodeGenOptLevel::None &&
       EnableRedundantCopyElimination)
     addPass(createRISCVRedundantCopyEliminationPass());
+  if (TM->getOptLevel() != CodeGenOptLevel::None &&
+      EnableRISCVDeadRegisterElimination)
+    addPass(createRISCVDeadRegisterDefinitionsPass());
 }
 
 yaml::MachineFunctionInfo *
diff --git a/llvm/test/CodeGen/RISCV/O3-pipeline.ll b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
index 4121d111091117..174996b9c10b00 100644
--- a/llvm/test/CodeGen/RISCV/O3-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
@@ -116,7 +116,6 @@
 ; CHECK-NEXT:       RISC-V Pre-RA pseudo instruction expansion pass
 ; CHECK-NEXT:       RISC-V Merge Base Offset
 ; CHECK-NEXT:       RISC-V Insert VSETVLI pass
-; CHECK-NEXT:       RISC-V Dead register definitions
 ; CHECK-NEXT:       RISC-V Insert Read/Write CSR Pass
 ; CHECK-NEXT:       RISC-V Insert Write VXRM Pass
 ; CHECK-NEXT:       Detect Dead Lanes
@@ -153,6 +152,7 @@
 ; CHECK-NEXT:       Machine Copy Propagation Pass
 ; CHECK-NEXT:       Machine Loop Invariant Code Motion
 ; CHECK-NEXT:       RISC-V Redundant Copy Elimination
+; CHECK-NEXT:       RISC-V Dead register definitions
 ; CHECK-NEXT:       Remove Redundant DEBUG_VALUE analysis
 ; CHECK-NEXT:       Fixup Statepoint Caller Saved
 ; CHECK-NEXT:       PostRA Machine Sink
diff --git a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
index 4ff2fc7a5fff5d..088d121564bc96 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
@@ -88,13 +88,13 @@ define <vscale x 1 x double> @test3(i64 %avl, i8 zeroext %cond, <vscale x 1 x do
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    beqz a1, .LBB2_2
 ; CHECK-NEXT:  # %bb.1: # %if.then
-; CHECK-NEXT:    vsetvli a0, a0, e64, m1, ta, ma
+; CHECK-NEXT:    vsetvli zero, a0, e64, m1, ta, ma
 ; CHECK-NEXT:    vfadd.vv v9, v8, v9
 ; CHECK-NEXT:    vfmul.vv v8, v9, v8
 ; CHECK-NEXT:    # implicit-def: $x10
 ; CHECK-NEXT:    ret
 ; CHECK-NEXT:  .LBB2_2: # %if.else
-; CHECK-NEXT:    vsetvli a0, a0, e64, m1, ta, ma
+; CHECK-NEXT:    vsetvli zero, a0, e64, m1, ta, ma
 ; CHECK-NEXT:    vfsub.vv v9, v8, v9
 ; CHECK-NEXT:    vfmul.vv v8, v9, v8
 ; CHECK-NEXT:    # implicit-def: $x10

dtcxzyw

LGTM.

topperc · 2024-04-30T19:02:36Z

Other than vsetvli, the original patch only affected atomics and non-volatile loads. I'm not sure those are well tested by SPEC or llvm-test-suite, especially in high register pressure situations.

> believe the register allocator already accounts for dead defs hence why this almost an NFC.

If all the non-reserved registers are in use, the register allocator won't pick X0. It would spill in order to pick a non-reserved register.
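The point about X0 being reserved can be illustrated with a deliberately simplified pressure count. This is only a sketch under stated assumptions: regsNeededAt and mustSpillAt are hypothetical helpers, not LLVM APIs, and the model ignores everything a real allocator does beyond counting registers at a single program point.

```cpp
#include <cassert>

// Registers needed at one instruction in this toy model: one per value that
// is live across the instruction, plus one for the instruction's own def
// unless that def already targets the zero register.
static unsigned regsNeededAt(unsigned NumLiveAcross, bool DefIsZeroReg) {
  return NumLiveAcross + (DefIsZeroReg ? 0u : 1u);
}

// True if a spill is unavoidable at this point given NumAllocatable
// non-reserved registers. x0 is reserved, so it is never part of that pool
// and the allocator cannot hand it out to a dead def on its own.
static bool mustSpillAt(unsigned NumLiveAcross, bool DefIsZeroReg,
                        unsigned NumAllocatable) {
  assert(NumAllocatable > 0 && "need at least one allocatable register");
  return regsNeededAt(NumLiveAcross, DefIsZeroReg) > NumAllocatable;
}
```

With four allocatable registers and four values live across the instruction, a dead def that still names an allocatable register forces a spill in this model, while the same def rewritten to x0 before allocation does not.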

lukel97 · 2024-05-01T03:35:33Z

> If all the non-reserved registers are in use, the register allocation won't pick X0. It would spill to pick a non-reserved register.

That makes more sense, I forgot that X0 was reserved.

@BeMg I also found that we have some regressions in #70549, since this no longer runs after vsetvli insertion. It seems to affect cases where two vsetvlis were coalesced, e.g.:

 define <vscale x 1 x double> @test1(i64 %avl, <vscale x 1 x double> %a, <vscale x 1 x double> %b) nounwind {
 ; CHECK-LABEL: test1:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    vsetvli zero, a0, e64, m1, ta, ma
+; CHECK-NEXT:    vsetvli a0, a0, e64, m1, ta, ma
 ; CHECK-NEXT:    vfadd.vv v8, v8, v9
 ; CHECK-NEXT:    ret
entry:
  %0 = tail call i64 @llvm.riscv.vsetvli(i64 %avl, i64 2, i64 7)
  %1 = tail call <vscale x 1 x double> @llvm.riscv.vfadd.nxv1f64.nxv1f64(
    <vscale x 1 x double> undef,
    <vscale x 1 x double> %a,
    <vscale x 1 x double> %b,
    i64 7, i64 %0)
  ret <vscale x 1 x double> %1
}

We could potentially run this twice, before regalloc and after regalloc. Or I think we might be able to also make coalescing smarter and have it preserve the x0 def.

lukel97 · 2024-05-02T08:28:51Z

I've updated this to run after vector regalloc but before scalar regalloc, so we should have the best of both worlds now.

BeMg

LGTM with one nit

llvm/lib/Target/RISCV/RISCVDeadRegisterDefinitions.cpp

lukel97 requested review from BeMg, dtcxzyw, preames, topperc and wangpc-pp April 30, 2024 17:55

llvmbot added the backend:RISC-V label Apr 30, 2024

lukel97 mentioned this pull request Apr 30, 2024

[RISCV] Support postRA vsetvl insertion pass #70549

Merged

dtcxzyw approved these changes Apr 30, 2024


Remove MRI since it's no longer used

f5b0027

Run after vector regalloc, before scalar regalloc

bbf1031

lukel97 changed the title ~~[RISCV] Move RISCVDeadRegisterDefinitions to post regalloc~~ [RISCV] Move RISCVDeadRegisterDefinitions to post vector regalloc May 2, 2024

lukel97 requested a review from dtcxzyw May 2, 2024 08:29

BeMg approved these changes May 6, 2024


Assert LiveInterval exists

a6fdbb5

lukel97 merged commit 52187b9 into llvm:main May 6, 2024
3 of 4 checks passed
