[RISCV] Add an experimental pseudoinstruction to represent a rematerializable constant materialization sequence. #69983

Merged on Oct 26, 2023 (4 commits)

@topperc topperc commented Oct 23, 2023

Rematerialization during register allocation is currently limited to a single instruction with no inputs.

This patch introduces a pseudoinstruction that represents the materialization of a constant. I've started with a sequence of 2 instructions for now, which covers at least the common LUI+ADDI(W) case. This instruction will be expanded into real instructions immediately after register allocation using a new pass. This gives the post-RA scheduler a chance to separate the 2 instructions to improve ILP.
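
As a rough illustration of the LUI+ADDI(W) case (a sketch, not code from the patch): a 32-bit constant splits into a 20-bit LUI operand and a sign-extended 12-bit ADDI operand. The `LuiAddi` struct and `splitImm32` helper are hypothetical names used only here; the backend itself derives the sequence through `RISCVMatInt::generateInstSeq`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of the patch: the two immediates of an
// LUI+ADDI pair that materializes a 32-bit constant.
struct LuiAddi {
  uint32_t Hi20; // LUI operand (upper 20 bits, after rounding)
  int32_t Lo12;  // ADDI/ADDIW operand, a sign-extended 12-bit value
};

inline LuiAddi splitImm32(int32_t Imm) {
  uint32_t Bits = static_cast<uint32_t>(Imm);
  // Sign-extend the low 12 bits, because ADDI sign-extends its immediate.
  int32_t Lo12 = static_cast<int32_t>((Bits & 0xFFFu) ^ 0x800u) - 0x800;
  // Subtract the (possibly negative) low part so that (Hi20 << 12) + Lo12
  // reproduces Imm modulo 2^32.
  uint32_t Hi20 = ((Bits - static_cast<uint32_t>(Lo12)) >> 12) & 0xFFFFFu;
  LuiAddi Parts{Hi20, Lo12};
  // Sanity check: the two halves recombine into the original bit pattern.
  assert((Parts.Hi20 << 12) + static_cast<uint32_t>(Parts.Lo12) == Bits);
  return Parts;
}
```

For 1735928559 this yields 423811 and -1297, which matches the `lui a0, 423811` / `addi a0, a0, -1297` check lines for `pos_i32` in imm.ll.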

I believe this matches the approach used by AArch64.

Unfortunately, this loses some CSE opportunities when an LUI value is used by multiple constants with different LSBs.
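
To make the trade-off concrete, here is a small sketch (not from the patch) that counts LUIs for a set of constants. It assumes every constant needs an LUI+ADDI pair; `hi20`, `LuiCounts`, and `countLuis` are hypothetical names for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Upper-20-bit LUI operand for a 32-bit constant (hypothetical helper).
inline uint32_t hi20(int32_t Imm) {
  // Adding 0x800 rounds up when the sign-extended low 12 bits are negative.
  return ((static_cast<uint32_t>(Imm) + 0x800u) >> 12) & 0xFFFFFu;
}

struct LuiCounts {
  std::size_t Shared;    // LUIs emitted if equal LUI values are CSE'd
  std::size_t PerPseudo; // LUIs emitted if each constant expands on its own
};

inline LuiCounts countLuis(const std::vector<int32_t> &Imms) {
  std::set<uint32_t> Unique;
  for (int32_t Imm : Imms)
    Unique.insert(hi20(Imm));
  // Sharing can never need more LUIs than one per constant.
  assert(Unique.size() <= Imms.size());
  return {Unique.size(), Imms.size()};
}
```

Three constants such as 0x12345001, 0x12345002, and 0x12345003 share one LUI value, so CSE would emit a single LUI, while three independent pseudos expand to three.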

This feature is off by default and a new backend command line option is added to enable it for testing.

This avoids the spill and reloads reported in #69586.

llvmbot commented Oct 23, 2023

@llvm/pr-subscribers-backend-risc-v

Author: Craig Topper (topperc)

Changes

Patch is 154.59 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/69983.diff

10 Files Affected:

  • (modified) llvm/lib/Target/RISCV/CMakeLists.txt (+1)
  • (modified) llvm/lib/Target/RISCV/RISCV.h (+2)
  • (modified) llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp (+11)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.td (+6)
  • (added) llvm/lib/Target/RISCV/RISCVPostRAExpandPseudoInsts.cpp (+157)
  • (modified) llvm/lib/Target/RISCV/RISCVTargetMachine.cpp (+3)
  • (modified) llvm/test/CodeGen/RISCV/O0-pipeline.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/O3-pipeline.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/imm.ll (+927)
  • (added) llvm/test/CodeGen/RISCV/pr69586.ll (+1980)
diff --git a/llvm/lib/Target/RISCV/CMakeLists.txt b/llvm/lib/Target/RISCV/CMakeLists.txt
index fd5a5244486ab18..4d5fa79389ea68b 100644
--- a/llvm/lib/Target/RISCV/CMakeLists.txt
+++ b/llvm/lib/Target/RISCV/CMakeLists.txt
@@ -44,6 +44,7 @@ add_llvm_target(RISCVCodeGen
   RISCVMacroFusion.cpp
   RISCVMergeBaseOffset.cpp
   RISCVOptWInstrs.cpp
+  RISCVPostRAExpandPseudoInsts.cpp
   RISCVRedundantCopyElimination.cpp
   RISCVMoveMerger.cpp
   RISCVPushPopOptimizer.cpp
diff --git a/llvm/lib/Target/RISCV/RISCV.h b/llvm/lib/Target/RISCV/RISCV.h
index 0efc915ea52c550..3d8e33dc716ea44 100644
--- a/llvm/lib/Target/RISCV/RISCV.h
+++ b/llvm/lib/Target/RISCV/RISCV.h
@@ -63,6 +63,8 @@ void initializeRISCVExpandAtomicPseudoPass(PassRegistry &);
 FunctionPass *createRISCVInsertVSETVLIPass();
 void initializeRISCVInsertVSETVLIPass(PassRegistry &);
 
+FunctionPass *createRISCVPostRAExpandPseudoPass();
+void initializeRISCVPostRAExpandPseudoPass(PassRegistry &);
 FunctionPass *createRISCVInsertReadWriteCSRPass();
 void initializeRISCVInsertReadWriteCSRPass(PassRegistry &);
 
diff --git a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
index 81a1304cf1f405e..42da92da43499ae 100644
--- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
@@ -29,6 +29,12 @@ using namespace llvm;
 #define DEBUG_TYPE "riscv-isel"
 #define PASS_NAME "RISC-V DAG->DAG Pattern Instruction Selection"
 
+static cl::opt<bool> UsePseudoMovImm(
+    "riscv-use-rematerializable-movimm", cl::Hidden,
+    cl::desc("Use a rematerializable pseudoinstruction for 2 instruction "
+             "constant materialization"),
+    cl::init(false));
+
 namespace llvm::RISCV {
 #define GET_RISCVVSSEGTable_IMPL
 #define GET_RISCVVLSEGTable_IMPL
@@ -195,6 +201,11 @@ static SDValue selectImm(SelectionDAG *CurDAG, const SDLoc &DL, const MVT VT,
   RISCVMatInt::InstSeq Seq =
       RISCVMatInt::generateInstSeq(Imm, Subtarget.getFeatureBits());
 
+  // Use a rematerializable pseudo instruction for short sequences if enabled.
+  if (Seq.size() == 2 && UsePseudoMovImm)
+    return SDValue(CurDAG->getMachineNode(RISCV::PseudoMovImm, DL, VT,
+                                          CurDAG->getTargetConstant(Imm, DL, VT)), 0);
+
   // See if we can create this constant as (ADD (SLLI X, C), X) where X is at
   // worst an LUI+ADDIW. This will require an extra register, but avoids a
   // constant pool.
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.td b/llvm/lib/Target/RISCV/RISCVInstrInfo.td
index 94de559b1e6e037..770b215b4cae01d 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.td
@@ -1670,6 +1670,12 @@ def PseudoJump : Pseudo<(outs GPR:$rd), (ins pseudo_jump_symbol:$target), [],
                         "jump", "$target, $rd">,
                  Sched<[WriteIALU, WriteJalr, ReadJalr]>;
 
+// Pseudo for LI of simm32.
+let hasSideEffects = 0, mayLoad = 0, mayStore = 0, Size = 8, isCodeGenOnly = 1,
+    isPseudo = 1, isReMaterializable = 1, IsSignExtendingOpW = 1 in
+def PseudoMovImm : Pseudo<(outs GPR:$dst), (ins i32imm:$imm), []>,
+                   Sched<[WriteIALU]>;
+
 let hasSideEffects = 0, mayLoad = 0, mayStore = 0, Size = 8, isCodeGenOnly = 0,
     isAsmParserOnly = 1 in
 def PseudoLLA : Pseudo<(outs GPR:$dst), (ins bare_symbol:$src), [],
diff --git a/llvm/lib/Target/RISCV/RISCVPostRAExpandPseudoInsts.cpp b/llvm/lib/Target/RISCV/RISCVPostRAExpandPseudoInsts.cpp
new file mode 100644
index 000000000000000..e7749d070b0f08b
--- /dev/null
+++ b/llvm/lib/Target/RISCV/RISCVPostRAExpandPseudoInsts.cpp
@@ -0,0 +1,157 @@
+//===-- RISCVPostRAExpandPseudoInsts.cpp - Expand pseudo instrs ----===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file contains a pass that expands the pseudo instruction pseudolisimm32
+// into target instructions. This pass should be run during the post-regalloc
+// passes, before assembly emission. It is used when the TunePseudoLISimm32
+// subfeature is on.
+//
+//===----------------------------------------------------------------------===//
+
+#include "RISCV.h"
+#include "RISCVInstrInfo.h"
+#include "RISCVTargetMachine.h"
+#include "MCTargetDesc/RISCVMatInt.h"
+#include "llvm/CodeGen/MachineFunctionPass.h"
+#include "llvm/CodeGen/MachineInstrBuilder.h"
+
+using namespace llvm;
+
+#define RISCV_POST_RA_EXPAND_PSEUDO_NAME                                       \
+  "RISC-V post-regalloc pseudo instruction expansion pass"
+
+namespace {
+
+class RISCVPostRAExpandPseudo : public MachineFunctionPass {
+public:
+  const RISCVInstrInfo *TII;
+  static char ID;
+
+  RISCVPostRAExpandPseudo() : MachineFunctionPass(ID) {
+    initializeRISCVPostRAExpandPseudoPass(*PassRegistry::getPassRegistry());
+  }
+
+  bool runOnMachineFunction(MachineFunction &MF) override;
+
+  StringRef getPassName() const override {
+    return RISCV_POST_RA_EXPAND_PSEUDO_NAME;
+  }
+
+private:
+  bool expandMBB(MachineBasicBlock &MBB);
+  bool expandMI(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
+                MachineBasicBlock::iterator &NextMBBI);
+  bool expandMovImm(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI);
+};
+
+char RISCVPostRAExpandPseudo::ID = 0;
+
+bool RISCVPostRAExpandPseudo::runOnMachineFunction(MachineFunction &MF) {
+  TII = static_cast<const RISCVInstrInfo *>(MF.getSubtarget().getInstrInfo());
+  bool Modified = false;
+  for (auto &MBB : MF)
+    Modified |= expandMBB(MBB);
+  return Modified;
+}
+
+bool RISCVPostRAExpandPseudo::expandMBB(MachineBasicBlock &MBB) {
+  bool Modified = false;
+
+  MachineBasicBlock::iterator MBBI = MBB.begin(), E = MBB.end();
+  while (MBBI != E) {
+    MachineBasicBlock::iterator NMBBI = std::next(MBBI);
+    Modified |= expandMI(MBB, MBBI, NMBBI);
+    MBBI = NMBBI;
+  }
+
+  return Modified;
+}
+
+bool RISCVPostRAExpandPseudo::expandMI(MachineBasicBlock &MBB,
+                                       MachineBasicBlock::iterator MBBI,
+                                       MachineBasicBlock::iterator &NextMBBI) {
+  switch (MBBI->getOpcode()) {
+  case RISCV::PseudoMovImm:
+    return expandMovImm(MBB, MBBI);
+  default:
+    return false;
+  }
+}
+
+bool RISCVPostRAExpandPseudo::expandMovImm(MachineBasicBlock &MBB,
+                                           MachineBasicBlock::iterator MBBI) {
+  DebugLoc DL = MBBI->getDebugLoc();
+
+  int64_t Val = MBBI->getOperand(1).getImm();
+
+  RISCVMatInt::InstSeq Seq = RISCVMatInt::generateInstSeq(
+      Val, MBB.getParent()->getSubtarget().getFeatureBits());
+  assert(!Seq.empty());
+
+  Register SrcReg = RISCV::X0;
+  Register DstReg = MBBI->getOperand(0).getReg();
+  bool DstIsDead = MBBI->getOperand(0).isDead();
+  bool Renamable = MBBI->getOperand(0).isRenamable();
+  bool SrcRenamable = false;
+  unsigned Num = 0;
+
+  for (RISCVMatInt::Inst &Inst : Seq) {
+    bool LastItem = ++Num == Seq.size();
+    switch (Inst.getOpndKind()) {
+    case RISCVMatInt::Imm:
+      BuildMI(MBB, MBBI, DL, TII->get(Inst.getOpcode()))
+          .addReg(DstReg, RegState::Define |
+                              getDeadRegState(DstIsDead && LastItem) |
+                              getRenamableRegState(Renamable))
+          .addImm(Inst.getImm());
+      break;
+    case RISCVMatInt::RegX0:
+      BuildMI(MBB, MBBI, DL, TII->get(Inst.getOpcode()))
+          .addReg(DstReg, RegState::Define |
+                              getDeadRegState(DstIsDead && LastItem) |
+                              getRenamableRegState(Renamable))
+          .addReg(SrcReg, RegState::Kill | getRenamableRegState(SrcRenamable))
+          .addReg(RISCV::X0);
+      break;
+    case RISCVMatInt::RegReg:
+      BuildMI(MBB, MBBI, DL, TII->get(Inst.getOpcode()))
+          .addReg(DstReg, RegState::Define |
+                              getDeadRegState(DstIsDead && LastItem) |
+                              getRenamableRegState(Renamable))
+          .addReg(SrcReg, RegState::Kill | getRenamableRegState(SrcRenamable))
+          .addReg(SrcReg, RegState::Kill | getRenamableRegState(SrcRenamable));
+      break;
+    case RISCVMatInt::RegImm:
+      BuildMI(MBB, MBBI, DL, TII->get(Inst.getOpcode()))
+          .addReg(DstReg, RegState::Define |
+                              getDeadRegState(DstIsDead && LastItem) |
+                              getRenamableRegState(Renamable))
+          .addReg(SrcReg, RegState::Kill | getRenamableRegState(SrcRenamable))
+          .addImm(Inst.getImm());
+      break;
+    }
+    // Only the first instruction has X0 as its source.
+    SrcReg = DstReg;
+    SrcRenamable = Renamable;
+  }
+  MBBI->eraseFromParent();
+  return true;
+}
+
+} // end of anonymous namespace
+
+INITIALIZE_PASS(RISCVPostRAExpandPseudo, "riscv-expand-pseudolisimm32",
+                RISCV_POST_RA_EXPAND_PSEUDO_NAME, false, false)
+namespace llvm {
+
+FunctionPass *createRISCVPostRAExpandPseudoPass() {
+  return new RISCVPostRAExpandPseudo();
+}
+
+} // end of namespace llvm
+
diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index 651d24bae57263d..953ac097b915044 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -96,6 +96,7 @@ extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeRISCVTarget() {
   initializeRISCVMakeCompressibleOptPass(*PR);
   initializeRISCVGatherScatterLoweringPass(*PR);
   initializeRISCVCodeGenPreparePass(*PR);
+  initializeRISCVPostRAExpandPseudoPass(*PR);
   initializeRISCVMergeBaseOffsetOptPass(*PR);
   initializeRISCVOptWInstrsPass(*PR);
   initializeRISCVPreRAExpandPseudoPass(*PR);
@@ -372,6 +373,8 @@ bool RISCVPassConfig::addGlobalInstructionSelect() {
 }
 
 void RISCVPassConfig::addPreSched2() {
+  addPass(createRISCVPostRAExpandPseudoPass());
+
   // Emit KCFI checks for indirect calls.
   addPass(createKCFIPass());
 }
diff --git a/llvm/test/CodeGen/RISCV/O0-pipeline.ll b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
index 01c7613201854a6..1d9af9df2f718f0 100644
--- a/llvm/test/CodeGen/RISCV/O0-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
@@ -52,6 +52,7 @@
 ; CHECK-NEXT:       Machine Optimization Remark Emitter
 ; CHECK-NEXT:       Prologue/Epilogue Insertion & Frame Finalization
 ; CHECK-NEXT:       Post-RA pseudo instruction expansion pass
+; CHECK-NEXT:       RISC-V post-regalloc pseudo instruction expansion pass
 ; CHECK-NEXT:       Insert KCFI indirect call checks
 ; CHECK-NEXT:       Analyze Machine Code For Garbage Collection
 ; CHECK-NEXT:       Insert fentry calls
diff --git a/llvm/test/CodeGen/RISCV/O3-pipeline.ll b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
index 30b6e1e541394d0..cf0826096bd41f8 100644
--- a/llvm/test/CodeGen/RISCV/O3-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
@@ -156,6 +156,7 @@
 ; CHECK-NEXT:       Tail Duplication
 ; CHECK-NEXT:       Machine Copy Propagation Pass
 ; CHECK-NEXT:       Post-RA pseudo instruction expansion pass
+; CHECK-NEXT:       RISC-V post-regalloc pseudo instruction expansion pass
 ; CHECK-NEXT:       Insert KCFI indirect call checks
 ; CHECK-NEXT:       MachineDominator Tree Construction
 ; CHECK-NEXT:       Machine Natural Loop Construction
diff --git a/llvm/test/CodeGen/RISCV/imm.ll b/llvm/test/CodeGen/RISCV/imm.ll
index e191933b42338aa..cafcf72c022ff4a 100644
--- a/llvm/test/CodeGen/RISCV/imm.ll
+++ b/llvm/test/CodeGen/RISCV/imm.ll
@@ -14,6 +14,11 @@
 ; RUN: llc -mtriple=riscv64 -riscv-disable-using-constant-pool-for-large-ints -mattr=+xtheadbb \
 ; RUN:   -verify-machineinstrs < %s | FileCheck %s -check-prefix=RV64IXTHEADBB
 
+; RUN: llc -mtriple=riscv32 -riscv-disable-using-constant-pool-for-large-ints -verify-machineinstrs < %s \
+; RUN:   -riscv-use-rematerializable-movimm | FileCheck %s -check-prefix=RV32-REMAT
+; RUN: llc -mtriple=riscv64 -riscv-disable-using-constant-pool-for-large-ints -verify-machineinstrs < %s \
+; RUN:   -riscv-use-rematerializable-movimm | FileCheck %s -check-prefix=RV64-REMAT
+
 ; Materializing constants
 
 ; TODO: It would be preferable if anyext constant returns were sign rather
@@ -50,6 +55,16 @@ define signext i32 @zero() nounwind {
 ; RV64IXTHEADBB:       # %bb.0:
 ; RV64IXTHEADBB-NEXT:    li a0, 0
 ; RV64IXTHEADBB-NEXT:    ret
+;
+; RV32-REMAT-LABEL: zero:
+; RV32-REMAT:       # %bb.0:
+; RV32-REMAT-NEXT:    li a0, 0
+; RV32-REMAT-NEXT:    ret
+;
+; RV64-REMAT-LABEL: zero:
+; RV64-REMAT:       # %bb.0:
+; RV64-REMAT-NEXT:    li a0, 0
+; RV64-REMAT-NEXT:    ret
   ret i32 0
 }
 
@@ -83,6 +98,16 @@ define signext i32 @pos_small() nounwind {
 ; RV64IXTHEADBB:       # %bb.0:
 ; RV64IXTHEADBB-NEXT:    li a0, 2047
 ; RV64IXTHEADBB-NEXT:    ret
+;
+; RV32-REMAT-LABEL: pos_small:
+; RV32-REMAT:       # %bb.0:
+; RV32-REMAT-NEXT:    li a0, 2047
+; RV32-REMAT-NEXT:    ret
+;
+; RV64-REMAT-LABEL: pos_small:
+; RV64-REMAT:       # %bb.0:
+; RV64-REMAT-NEXT:    li a0, 2047
+; RV64-REMAT-NEXT:    ret
   ret i32 2047
 }
 
@@ -116,6 +141,16 @@ define signext i32 @neg_small() nounwind {
 ; RV64IXTHEADBB:       # %bb.0:
 ; RV64IXTHEADBB-NEXT:    li a0, -2048
 ; RV64IXTHEADBB-NEXT:    ret
+;
+; RV32-REMAT-LABEL: neg_small:
+; RV32-REMAT:       # %bb.0:
+; RV32-REMAT-NEXT:    li a0, -2048
+; RV32-REMAT-NEXT:    ret
+;
+; RV64-REMAT-LABEL: neg_small:
+; RV64-REMAT:       # %bb.0:
+; RV64-REMAT-NEXT:    li a0, -2048
+; RV64-REMAT-NEXT:    ret
   ret i32 -2048
 }
 
@@ -155,6 +190,18 @@ define signext i32 @pos_i32() nounwind {
 ; RV64IXTHEADBB-NEXT:    lui a0, 423811
 ; RV64IXTHEADBB-NEXT:    addiw a0, a0, -1297
 ; RV64IXTHEADBB-NEXT:    ret
+;
+; RV32-REMAT-LABEL: pos_i32:
+; RV32-REMAT:       # %bb.0:
+; RV32-REMAT-NEXT:    lui a0, 423811
+; RV32-REMAT-NEXT:    addi a0, a0, -1297
+; RV32-REMAT-NEXT:    ret
+;
+; RV64-REMAT-LABEL: pos_i32:
+; RV64-REMAT:       # %bb.0:
+; RV64-REMAT-NEXT:    lui a0, 423811
+; RV64-REMAT-NEXT:    addiw a0, a0, -1297
+; RV64-REMAT-NEXT:    ret
   ret i32 1735928559
 }
 
@@ -194,6 +241,18 @@ define signext i32 @neg_i32() nounwind {
 ; RV64IXTHEADBB-NEXT:    lui a0, 912092
 ; RV64IXTHEADBB-NEXT:    addiw a0, a0, -273
 ; RV64IXTHEADBB-NEXT:    ret
+;
+; RV32-REMAT-LABEL: neg_i32:
+; RV32-REMAT:       # %bb.0:
+; RV32-REMAT-NEXT:    lui a0, 912092
+; RV32-REMAT-NEXT:    addi a0, a0, -273
+; RV32-REMAT-NEXT:    ret
+;
+; RV64-REMAT-LABEL: neg_i32:
+; RV64-REMAT:       # %bb.0:
+; RV64-REMAT-NEXT:    lui a0, 912092
+; RV64-REMAT-NEXT:    addiw a0, a0, -273
+; RV64-REMAT-NEXT:    ret
   ret i32 -559038737
 }
 
@@ -227,6 +286,16 @@ define signext i32 @pos_i32_hi20_only() nounwind {
 ; RV64IXTHEADBB:       # %bb.0:
 ; RV64IXTHEADBB-NEXT:    lui a0, 16
 ; RV64IXTHEADBB-NEXT:    ret
+;
+; RV32-REMAT-LABEL: pos_i32_hi20_only:
+; RV32-REMAT:       # %bb.0:
+; RV32-REMAT-NEXT:    lui a0, 16
+; RV32-REMAT-NEXT:    ret
+;
+; RV64-REMAT-LABEL: pos_i32_hi20_only:
+; RV64-REMAT:       # %bb.0:
+; RV64-REMAT-NEXT:    lui a0, 16
+; RV64-REMAT-NEXT:    ret
   ret i32 65536 ; 0x10000
 }
 
@@ -260,6 +329,16 @@ define signext i32 @neg_i32_hi20_only() nounwind {
 ; RV64IXTHEADBB:       # %bb.0:
 ; RV64IXTHEADBB-NEXT:    lui a0, 1048560
 ; RV64IXTHEADBB-NEXT:    ret
+;
+; RV32-REMAT-LABEL: neg_i32_hi20_only:
+; RV32-REMAT:       # %bb.0:
+; RV32-REMAT-NEXT:    lui a0, 1048560
+; RV32-REMAT-NEXT:    ret
+;
+; RV64-REMAT-LABEL: neg_i32_hi20_only:
+; RV64-REMAT:       # %bb.0:
+; RV64-REMAT-NEXT:    lui a0, 1048560
+; RV64-REMAT-NEXT:    ret
   ret i32 -65536 ; -0x10000
 }
 
@@ -301,6 +380,18 @@ define signext i32 @imm_left_shifted_addi() nounwind {
 ; RV64IXTHEADBB-NEXT:    lui a0, 32
 ; RV64IXTHEADBB-NEXT:    addiw a0, a0, -64
 ; RV64IXTHEADBB-NEXT:    ret
+;
+; RV32-REMAT-LABEL: imm_left_shifted_addi:
+; RV32-REMAT:       # %bb.0:
+; RV32-REMAT-NEXT:    lui a0, 32
+; RV32-REMAT-NEXT:    addi a0, a0, -64
+; RV32-REMAT-NEXT:    ret
+;
+; RV64-REMAT-LABEL: imm_left_shifted_addi:
+; RV64-REMAT:       # %bb.0:
+; RV64-REMAT-NEXT:    lui a0, 32
+; RV64-REMAT-NEXT:    addiw a0, a0, -64
+; RV64-REMAT-NEXT:    ret
   ret i32 131008 ; 0x1FFC0
 }
 
@@ -342,6 +433,18 @@ define signext i32 @imm_right_shifted_addi() nounwind {
 ; RV64IXTHEADBB-NEXT:    lui a0, 524288
 ; RV64IXTHEADBB-NEXT:    addiw a0, a0, -1
 ; RV64IXTHEADBB-NEXT:    ret
+;
+; RV32-REMAT-LABEL: imm_right_shifted_addi:
+; RV32-REMAT:       # %bb.0:
+; RV32-REMAT-NEXT:    lui a0, 524288
+; RV32-REMAT-NEXT:    addi a0, a0, -1
+; RV32-REMAT-NEXT:    ret
+;
+; RV64-REMAT-LABEL: imm_right_shifted_addi:
+; RV64-REMAT:       # %bb.0:
+; RV64-REMAT-NEXT:    lui a0, 524288
+; RV64-REMAT-NEXT:    addiw a0, a0, -1
+; RV64-REMAT-NEXT:    ret
   ret i32 2147483647 ; 0x7FFFFFFF
 }
 
@@ -383,6 +486,18 @@ define signext i32 @imm_right_shifted_lui() nounwind {
 ; RV64IXTHEADBB-NEXT:    lui a0, 56
 ; RV64IXTHEADBB-NEXT:    addiw a0, a0, 580
 ; RV64IXTHEADBB-NEXT:    ret
+;
+; RV32-REMAT-LABEL: imm_right_shifted_lui:
+; RV32-REMAT:       # %bb.0:
+; RV32-REMAT-NEXT:    lui a0, 56
+; RV32-REMAT-NEXT:    addi a0, a0, 580
+; RV32-REMAT-NEXT:    ret
+;
+; RV64-REMAT-LABEL: imm_right_shifted_lui:
+; RV64-REMAT:       # %bb.0:
+; RV64-REMAT-NEXT:    lui a0, 56
+; RV64-REMAT-NEXT:    addiw a0, a0, 580
+; RV64-REMAT-NEXT:    ret
   ret i32 229956 ; 0x38244
 }
 
@@ -421,6 +536,18 @@ define i64 @imm64_1() nounwind {
 ; RV64IXTHEADBB-NEXT:    li a0, 1
 ; RV64IXTHEADBB-NEXT:    slli a0, a0, 31
 ; RV64IXTHEADBB-NEXT:    ret
+;
+; RV32-REMAT-LABEL: imm64_1:
+; RV32-REMAT:       # %bb.0:
+; RV32-REMAT-NEXT:    lui a0, 524288
+; RV32-REMAT-NEXT:    li a1, 0
+; RV32-REMAT-NEXT:    ret
+;
+; RV64-REMAT-LABEL: imm64_1:
+; RV64-REMAT:       # %bb.0:
+; RV64-REMAT-NEXT:    li a0, 1
+; RV64-REMAT-NEXT:    slli a0, a0, 31
+; RV64-REMAT-NEXT:    ret
   ret i64 2147483648 ; 0x8000_0000
 }
 
@@ -460,6 +587,18 @@ define i64 @imm64_2() nounwind {
 ; RV64IXTHEADBB-NEXT:    li a0, -1
 ; RV64IXTHEADBB-NEXT:    srli a0, a0, 32
 ; RV64IXTHEADBB-NEXT:    ret
+;
+; RV32-REMAT-LABEL: imm64_2:
+; RV32-REMAT:       # %bb.0:
+; RV32-REMAT-NEXT:    li a0, -1
+; RV32-REMAT-NEXT:    li a1, 0
+; RV32-REMAT-NEXT:    ret
+;
+; RV64-REMAT-LABEL: imm64_2:
+; RV64-REMAT:       # %bb.0:
+; RV64-REMAT-NEXT:    li a0, -1
+; RV64-REMAT-NEXT:    srli a0, a0, 32
+; RV64-REMAT-NEXT:    ret
   ret i64 4294967295 ; 0xFFFF_FFFF
 }
 
@@ -498,6 +637,18 @@ define i64 @imm64_3() nounwind {
 ; RV64IXTHEADBB-NEXT:    li a0, 1
 ; RV64IXTHEADBB-NEXT:    slli a0, a0, 32
 ; RV64IXTHEADBB-NEXT:    ret
+;
+; RV32-REMAT-LABEL: imm64_3:
+; RV32-REMAT:       # %bb.0:
+; RV32-REMAT-NEXT:    li a1, 1
+; RV32-REMAT-NEXT:    li a0, 0
+; RV32-REMAT-NEXT:    ret
+;
+; RV64-REMAT-LABEL: imm64_3:
+; RV64-REMAT:       # %bb.0:
+; RV64-REMAT-NEXT:    li a0, 1
+; RV64-REMAT-NEXT:    slli a0, a0, 32
+; RV64-REMAT-NEXT:    ret
   ret i64 4294967296 ; 0x1_0000_0000
 }
 
@@ -536,6 +687,18 @@ define i64 @imm64_4() nounwind {
 ; RV64IXTHEADBB-NEXT:    li a0, -1
 ; RV64IXTHEADBB-NEXT:    slli a0, a0, 63
 ; RV64IXTHEADBB-NEXT:    ret
+;
+; RV32-REMAT-LABEL: imm64_4:
+; RV32-REMAT:       # %bb.0:
+; RV32-REMAT-NEXT:    lui a1, 524288
+; RV32-REMAT-NEXT:    li a0, 0
+; RV32-REMAT-NEXT:    ret
+;
+; RV64-REMAT-LABEL: imm64_4:
+; RV64-REMAT:       # %bb.0:
+; RV64-REMAT-NEXT:    li a0, -1
+; RV64-REMAT-NEXT:    slli a0, a0, 63
+; RV64-REMAT-NEXT:    ret
   ret i64 9223372036854775808 ; 0x8000_0000_0000_0000
 }
 
@@ -574,6 +737,18 @@ define i64 @imm64_5() nounwind {
 ; RV64IXTHEADBB-NEXT:    li a0, -1
 ; RV64IXTHEADBB-NEXT:    slli a0, a0, 63
 ; RV64IXTHEADBB-NEXT:    ret
+;
+; RV32-REMAT-LABEL: imm64_5:
+; RV32-REMAT:       # %bb.0:
+; RV32-REMAT-NEXT:    lui a1, 524288
+; RV32-REMAT-NEXT:    li a0, 0
+; RV32-REMAT-NEXT:    ret
+;
+; RV64-REMAT-LABEL: imm64_5:
+; RV64-REMAT:       # %bb.0:
+; RV64-REMAT-NEXT:    li a0, -1
+; RV64-REMAT-NEXT:    slli a0, a0, 63
+; RV64-REMAT-NEXT:    ret
   ret i64 -9223372036854775808 ; 0x8000_0000_0000_0000
 }
 
@@ -619,6 +794,20 @@ define i64 @imm64_6() nounwind {
 ; RV64IXTHEADBB-NEXT:    addi a0, a0, -1329
 ; RV64IXTHEADBB-NEXT:    slli a0, a0, 35
 ; RV64IXTHEADBB-NEXT:    ret
+;
+; RV32-REMAT-LABEL: imm64_6:
+; RV32-REMAT:       # %bb.0:
+; RV32-REMAT-NEXT:    lui a1, 74565
+; RV32-REMAT-NEXT:    addi a1, a1, 1656
+; RV32-REMAT-NEXT:    li a0, 0
+; RV32-REMAT-NEXT:  ...
[truncated]
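
The register threading in `expandMovImm` above can be modeled outside LLVM. The following is a simplified sketch under stated assumptions: `ToyInst`, `OpndKind`, and `expandToy` are hypothetical stand-ins for `RISCVMatInt::Inst` and the `BuildMI` calls, only the `Imm` and `RegImm` operand kinds are modeled, and the dead/renamable flags are omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-ins for RISCVMatInt::Inst and its operand kinds; the real types
// carry opcodes rather than mnemonic strings.
enum class OpndKind { Imm, RegImm };

struct ToyInst {
  std::string Mnemonic;
  OpndKind Kind;
  int64_t Imm;
};

// Emits assembly-like text for a materialization sequence into DstReg.
inline std::vector<std::string> expandToy(const std::string &DstReg,
                                          const std::vector<ToyInst> &Seq) {
  assert(!Seq.empty() && "a constant needs at least one instruction");
  std::vector<std::string> Out;
  std::string SrcReg = "zero"; // X0 feeds only the first instruction
  for (const ToyInst &I : Seq) {
    std::string Text = I.Mnemonic + " " + DstReg;
    if (I.Kind == OpndKind::RegImm)
      Text += ", " + SrcReg;
    Text += ", " + std::to_string(I.Imm);
    Out.push_back(Text);
    SrcReg = DstReg; // later instructions read the partial result
  }
  return Out;
}
```

Because every instruction after the first reads and writes the same destination register, the expanded sequence needs no scratch register, which is what lets the pseudo be expanded after register allocation.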

github-actions bot commented Oct 23, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

; REMAT-NEXT: add a0, a1, a0
; REMAT-NEXT: sf.vc.v.i 2, 0, v8, 0
; REMAT-NEXT: vse32.v v10, (a0)
; REMAT-NEXT: lui a0, 9

Would there be a way of reducing the number of redundant instructions? We generate a bunch of identical lui instructions, and each one has to be followed by an addiw.

@preames preames left a comment

LGTM w/optional suggestions

p.s. Forgot to say, my approval is for the experimental feature. Whether this is the direction we want to move in overall is going to depend on more work (i.e. the CSE problem), and on data collected once we've addressed the obvious problems, to see if this should be enabled by default.

@@ -1670,6 +1670,12 @@ def PseudoJump : Pseudo<(outs GPR:$rd), (ins pseudo_jump_symbol:$target), [],
"jump", "$target, $rd">,
Sched<[WriteIALU, WriteJalr, ReadJalr]>;

// Pseudo for LI of simm32.

Can you add something to the comment to indicate this is currently purely for experimentation purposes?

Val, MBB.getParent()->getSubtarget().getFeatureBits());
assert(!Seq.empty());

Register SrcReg = RISCV::X0;

This looks very close to RISCVInstrInfo::movImm; any chance we can share the code?

@topperc topperc merged commit 109aa58 into llvm:main Oct 26, 2023
2 of 3 checks passed
@topperc topperc deleted the pr/remat-pseudo branch October 26, 2023 00:20
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Oct 26, 2023