[RISCV] Fold `addi` into load / store even if they are in different BBs. #67024

mgudim · 2023-09-21T14:44:41Z

Currently, since ISel only looks at one basic block at a time we miss some opportunities to combine load / store with addi. Such opportunities may occur when GEP and the use of GEP are in different basic blocks.

In this PR we combine addi with memory access in RISCVISelLowering:finalizeLowering.

According to my measurements, this patch is about 1% in dynamic instruciton count on mcf.

llvmbot · 2023-09-21T14:45:55Z

@llvm/pr-subscribers-backend-risc-v

Changes

Currently, since ISel only looks at one basic block at a time we miss some opportunities to combine load / store with addi. Such opportunities may occur when GEP and the use of GEP are in different basic blocks.

In this PR we combine addi with memory access in RISCVISelLowering:finalizeLowering.

Full diff: https://github.com/llvm/llvm-project/pull/67024.diff

2 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+79)
(modified) llvm/lib/Target/RISCV/RISCVISelLowering.h (+2)

diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index d176fcaf54c2db0..492714ae2a78f8e 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -49,6 +49,7 @@ using namespace llvm;
 #define DEBUG_TYPE "riscv-lower"
 
 STATISTIC(NumTailCalls, "Number of tail calls");
+STATISTIC(NumADDIsMerged, "Number of ADDIs merged.");
 
 static cl::opt<unsigned> ExtensionMaxWebSize(
     DEBUG_TYPE "-ext-max-web-size", cl::Hidden,
@@ -2278,6 +2279,84 @@ bool RISCVTargetLowering::isLegalElementTypeForRVV(EVT ScalarTy) const {
   }
 }
 
+static bool tryToFoldInstIntoUse(MachineInstr &UseMI, MachineInstr &MI) {
+
+  if (MI.getOpcode() != RISCV::ADDI)
+    return false;
+  if (!(MI.getOperand(0).isReg() && MI.getOperand(1).isReg()))
+    return false;
+
+  switch (UseMI.getOpcode()) {
+  default:
+    return false;
+  case RISCV::LB:
+  case RISCV::LH:
+  case RISCV::LW:
+  case RISCV::LD:
+  case RISCV::LBU:
+  case RISCV::LHU:
+  case RISCV::SB:
+  case RISCV::SH:
+  case RISCV::SW:
+  case RISCV::SD:
+    break;
+  }
+  MachineOperand &OriginalBaseMO = UseMI.getOperand(1);
+  if (!OriginalBaseMO.isReg())
+    return false;
+  if (OriginalBaseMO.getReg() != MI.getOperand(0).getReg())
+    return false;
+
+  MachineOperand &OriginalOffsetMO = UseMI.getOperand(2);
+  MachineOperand &ADDIOffsetMO = MI.getOperand(2);
+  if (!(OriginalOffsetMO.isImm() && ADDIOffsetMO.isImm()))
+    return false;
+
+  int64_t OriginalOffset = OriginalOffsetMO.getImm();
+  int64_t ADDIOffset = ADDIOffsetMO.getImm();
+  int64_t TotalOffset = OriginalOffset + ADDIOffset;
+  if (!isInt<12>(TotalOffset))
+    return false;
+
+  OriginalOffsetMO.setImm(TotalOffset);
+  OriginalBaseMO.setReg(MI.getOperand(1).getReg());
+  NumADDIsMerged++;
+  return true;
+}
+
+void RISCVTargetLowering::finalizeLowering(MachineFunction &MF) const {
+  TargetLoweringBase::finalizeLowering(MF);
+  MachineRegisterInfo &MRI = MF.getRegInfo();
+
+  SmallVector<MachineInstr *, 8> ToErase;
+  for (MachineFunction::iterator I = MF.begin(), E = MF.end(); I != E; ++I) {
+    MachineBasicBlock *MBB = &*I;
+    for (MachineBasicBlock::iterator MBBI = MBB->begin(), MBBE = MBB->end();
+         MBBI != MBBE;) {
+      MachineInstr &MI = *MBBI++;
+      if (MI.getOpcode() != RISCV::ADDI)
+        continue;
+      if (!MI.getOperand(0).isReg())
+        continue;
+      SmallVector<MachineInstr *, 4> Users;
+      for (MachineInstr &UseMI :
+           MRI.use_instructions(MI.getOperand(0).getReg())) {
+        Users.push_back(&UseMI);
+      }
+      bool AllUsesWereFolded = true;
+      for (MachineInstr *UseMI : Users) {
+        AllUsesWereFolded &= tryToFoldInstIntoUse(*UseMI, MI);
+      }
+      if (AllUsesWereFolded)
+        ToErase.push_back(&MI);
+    }
+  }
+  for (MachineInstr *MI : ToErase) {
+    MI->eraseFromParent();
+  }
+
+  return;
+}
 
 unsigned RISCVTargetLowering::combineRepeatedFPDivisors() const {
   return NumRepeatedDivisors;
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.h b/llvm/lib/Target/RISCV/RISCVISelLowering.h
index 815b9be47f56026..1f0416231a2f4cb 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.h
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.h
@@ -951,6 +951,8 @@ class RISCVTargetLowering : public TargetLowering {
     return false;
   };
 
+  void finalizeLowering(MachineFunction &MF) const override;
+
   /// For available scheduling models FDIV + two independent FMULs are much
   /// faster than two FDIVs.
   unsigned combineRepeatedFPDivisors() const override;

Currently, since ISel only looks at one basic block at a time we miss some opportunities to combine load / store with `addi`. Such opportunities may occur when GEP and the use of GEP are in different basic blocks. In this PR we combine `addi` with memory access in `RISCVISelLowering:finalizeLowering`.

topperc · 2023-09-21T16:37:29Z

Tests?

Isn't CodeGenPrepare::optimizeMemoryInst supposed to move GEPs to their uses?

Does this increase register pressure?

wangpc-pp · 2023-09-22T02:49:16Z

Why can't this be done in RISCVMergeBaseOffset?

michaelmaitland · 2023-09-22T03:18:56Z

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

+
+  MachineOperand &OriginalOffsetMO = UseMI.getOperand(2);
+  MachineOperand &ADDIOffsetMO = MI.getOperand(2);
+  if (!(OriginalOffsetMO.isImm() && ADDIOffsetMO.isImm()))


Should this be an assert since we know this instruction is an ADDI?

Could it be a symbol? Like something that only gets resolved by linker?

Yes it can be a symbol. We do need to check isImm().

michaelmaitland · 2023-09-22T03:20:11Z

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

+    for (MachineBasicBlock::iterator MBBI = MBB->begin(), MBBE = MBB->end();
+         MBBI != MBBE;) {
+      MachineInstr &MI = *MBBI++;
+      if (MI.getOpcode() != RISCV::ADDI)


I think you already check Opcode == ADDI and Operand0.isReg() in tryToFoldInstIntoUse

Right, thanks!

mgudim · 2023-09-22T14:42:55Z

Tests?
I wanted to commit a test first:
#67022

This way the diff for this PR can show what's changed (https://llvm.org/docs/TestingGuide.html#precommit-workflow-for-tests)

Isn't CodeGenPrepare::optimizeMemoryInst supposed to move GEPs to their uses?
Yes, I've looked at that at first too. The problem is that the gep lowers into more than one instruction, but only one of them (addi) can be sinked into the loop to be absorbed into offset of ld.

Does this increase register pressure?

I hope not. Without this transformation, the addi inside the loop would use a register until the ld. Let me collect the number of spills before / after this pathch and I'll post that soon.

mgudim · 2023-09-22T15:09:56Z

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

+  case RISCV::SB:
+  case RISCV::SH:
+  case RISCV::SW:
+  case RISCV::SD:


I just realized I should also include FP loads and stores.

you're also missing LWU

mgudim · 2023-09-22T15:24:58Z

Why can't this be done in RISCVMergeBaseOffset?

I haven't seen this pass when I wrote this. I agree, good idea.

I can't understand why is RISCVMergeBaseOffset so late in the pipeline? If we move it right after ISel, we can do the optmizations earlier, which should be generally better, no? Also, some many things are easier to implement while we are still in SSA. What is the latest point at which opportunities for MergeBaseOffset are created?

I see two questions here:

(1) Should we combine this with MergeBaseOffset (I vote "yes").
(2) Where in pipeline should this be? (Move MergeBaseOffset to finalizeLowering?)
@wangpc-pp @topperc @michaelmaitland @asb What do you think?

Update: I see that there is ExpandPseudoInst pass which needs MergeBaseOffsets to be run after it. This raises another question: can we move ExpandPseudoInst earlier?

topperc · 2023-09-22T18:34:05Z

Also, some many things are easier to implement while we are still in SSA.

We're still in SSA form where MergeBaseOffset is now. So I don't understand this statement.

topperc · 2023-09-22T18:38:03Z

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

+      bool AllUsesWereFolded = true;
+      for (MachineInstr *UseMI : Users)
+        AllUsesWereFolded &= tryToFoldInstIntoUse(*UseMI, MI);
+      if (AllUsesWereFolded)


If all uses aren't folded, then this would increase register pressure wouldn't it?

topperc · 2023-09-25T18:48:02Z

Could we use https://reviews.llvm.org/D152828 for this?

mgudim · 2023-09-25T19:27:53Z

Could we use https://reviews.llvm.org/D152828 for this?

Good idea, I'll take a look but not this week.

mgudim requested review from asb, michaelmaitland, topperc and asi-sc September 21, 2023 14:44

llvmbot added the backend:RISC-V label Sep 21, 2023

mgudim force-pushed the riscv-finalize-lowering branch from 80e35d2 to daa4a90 Compare September 21, 2023 14:50

michaelmaitland reviewed Sep 22, 2023

View reviewed changes

mgudim commented Sep 22, 2023

View reviewed changes

topperc reviewed Sep 22, 2023

View reviewed changes

topperc mentioned this pull request Sep 27, 2023

[RISCV] Add sink-and-fold support for RISC-V. #67602

Merged

mgudim closed this Oct 10, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RISCV] Fold `addi` into load / store even if they are in different BBs. #67024

[RISCV] Fold `addi` into load / store even if they are in different BBs. #67024

mgudim commented Sep 21, 2023 •

edited

Loading

llvmbot commented Sep 21, 2023

topperc commented Sep 21, 2023

wangpc-pp commented Sep 22, 2023

michaelmaitland Sep 22, 2023

mgudim Sep 22, 2023 •

edited

Loading

topperc Sep 22, 2023

michaelmaitland Sep 22, 2023

mgudim Sep 22, 2023

mgudim commented Sep 22, 2023

mgudim Sep 22, 2023

topperc Sep 22, 2023

mgudim commented Sep 22, 2023 •

edited

Loading

topperc commented Sep 22, 2023

topperc Sep 22, 2023

topperc commented Sep 25, 2023

mgudim commented Sep 25, 2023

[RISCV] Fold addi into load / store even if they are in different BBs. #67024

[RISCV] Fold addi into load / store even if they are in different BBs. #67024

Conversation

mgudim commented Sep 21, 2023 • edited Loading

llvmbot commented Sep 21, 2023

topperc commented Sep 21, 2023

wangpc-pp commented Sep 22, 2023

michaelmaitland Sep 22, 2023

Choose a reason for hiding this comment

mgudim Sep 22, 2023 • edited Loading

Choose a reason for hiding this comment

topperc Sep 22, 2023

Choose a reason for hiding this comment

michaelmaitland Sep 22, 2023

Choose a reason for hiding this comment

mgudim Sep 22, 2023

Choose a reason for hiding this comment

mgudim commented Sep 22, 2023

mgudim Sep 22, 2023

Choose a reason for hiding this comment

topperc Sep 22, 2023

Choose a reason for hiding this comment

mgudim commented Sep 22, 2023 • edited Loading

topperc commented Sep 22, 2023

topperc Sep 22, 2023

Choose a reason for hiding this comment

topperc commented Sep 25, 2023

mgudim commented Sep 25, 2023

[RISCV] Fold `addi` into load / store even if they are in different BBs. #67024

[RISCV] Fold `addi` into load / store even if they are in different BBs. #67024

mgudim commented Sep 21, 2023 •

edited

Loading

mgudim Sep 22, 2023 •

edited

Loading

mgudim commented Sep 22, 2023 •

edited

Loading