Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISCV] Remove CalleeSavedInfo for Zcmp/save-restore-libcalls registers #79535

Merged
merged 1 commit into from
Feb 7, 2024

Conversation

francisvm
Copy link
Collaborator

Registers that are pushed/popped by Zcmp or libcalls have pre-defined frame indices that are never allocated in MachineFrameInfo. They're being used throughout PEI, but the rest of codegen doesn't work that way and expects each frame index to be a valid index in MFI.

This patch keeps it local to PEI and removes them from the CalleeSavedInfo list at the end of the pass.

Before this pass, any MIR testing post-PEI is broken and asserts (see issue #79491).

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 26, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Visoiu Mistrih Francis (francisvm)

Changes

Registers that are pushed/popped by Zcmp or libcalls have pre-defined frame indices that are never allocated in MachineFrameInfo. They're being used throughout PEI, but the rest of codegen doesn't work that way and expects each frame index to be a valid index in MFI.

This patch keeps it local to PEI and removes them from the CalleeSavedInfo list at the end of the pass.

Before this pass, any MIR testing post-PEI is broken and asserts (see issue #79491).


Full diff: https://github.com/llvm/llvm-project/pull/79535.diff

3 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVFrameLowering.cpp (+24)
  • (modified) llvm/lib/Target/RISCV/RISCVFrameLowering.h (+4)
  • (added) llvm/test/CodeGen/RISCV/zcmp-cm-push-pop.mir (+216)
diff --git a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
index d793c0b7377baf0..a017ec0f43d30e7 100644
--- a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
@@ -1252,6 +1252,30 @@ void RISCVFrameLowering::processFunctionBeforeFrameFinalized(
   RVFI->setCalleeSavedStackSize(Size);
 }
 
+void RISCVFrameLowering::processFunctionBeforeFrameIndicesReplaced(
+    MachineFunction &MF, RegScavenger *RS) const {
+  // Remove CalleeSavedInfo for registers saved by Zcmp or save/restore
+  // libcalls.
+  MachineFrameInfo &MFI = MF.getFrameInfo();
+  const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
+  const auto *RVFI = MF.getInfo<RISCVMachineFunctionInfo>();
+  if (!RVFI->isPushable(MF) && !RVFI->useSaveRestoreLibCalls(MF))
+    return;
+  const std::vector<CalleeSavedInfo> &CSIs = MFI.getCalleeSavedInfo();
+  std::vector<CalleeSavedInfo> NewCSIs;
+  for (const auto &CSI : CSIs) {
+    // Skip CSRs that have fake a frame index.
+    [[maybe_unused]] int ReservedFI = 0;
+    if (TRI->hasReservedSpillSlot(MF, CSI.getReg(), ReservedFI)) {
+      assert(CSI.getFrameIdx() == ReservedFI &&
+             "Reserved CSR spill slot frame index mismatch in CSI");
+      continue;
+    }
+    NewCSIs.push_back(CSI);
+  }
+  MFI.setCalleeSavedInfo(std::move(NewCSIs));
+}
+
 // Not preserve stack space within prologue for outgoing variables when the
 // function contains variable size objects or there are vector objects accessed
 // by the frame pointer.
diff --git a/llvm/lib/Target/RISCV/RISCVFrameLowering.h b/llvm/lib/Target/RISCV/RISCVFrameLowering.h
index 5c1c7317d24bc6a..a784479f111b115 100644
--- a/llvm/lib/Target/RISCV/RISCVFrameLowering.h
+++ b/llvm/lib/Target/RISCV/RISCVFrameLowering.h
@@ -37,6 +37,10 @@ class RISCVFrameLowering : public TargetFrameLowering {
   void processFunctionBeforeFrameFinalized(MachineFunction &MF,
                                            RegScavenger *RS) const override;
 
+  void
+  processFunctionBeforeFrameIndicesReplaced(MachineFunction &MF,
+                                            RegScavenger *RS) const override;
+
   bool hasFP(const MachineFunction &MF) const override;
 
   bool hasBP(const MachineFunction &MF) const;
diff --git a/llvm/test/CodeGen/RISCV/zcmp-cm-push-pop.mir b/llvm/test/CodeGen/RISCV/zcmp-cm-push-pop.mir
new file mode 100644
index 000000000000000..cf14b9b841c2344
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/zcmp-cm-push-pop.mir
@@ -0,0 +1,216 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 4
+# RUN: llc -mtriple=riscv32 -mattr=+zcmp -x mir -run-pass=prologepilog -verify-machineinstrs -o - %s \
+# RUN: | FileCheck -check-prefixes=CHECK-ZCMP32 %s
+# RUN: llc -mtriple=riscv64 -mattr=+zcmp -x mir -run-pass=prologepilog -verify-machineinstrs -o - %s \
+# RUN: | FileCheck -check-prefixes=CHECK-ZCMP64 %s
+# RUN: llc -mtriple=riscv32 -x mir -run-pass=prologepilog -verify-machineinstrs -o - %s \
+# RUN: | FileCheck -check-prefixes=CHECK-NO-ZCMP32 %s
+# RUN: llc -mtriple=riscv64 -x mir -run-pass=prologepilog -verify-machineinstrs -o - %s \
+# RUN: | FileCheck -check-prefixes=CHECK-NO-ZCMP64 %s
+---
+name: push_rvlist15
+tracksRegLiveness: true
+body:                   |
+  bb.0:
+    ; CHECK-ZCMP32-LABEL: name: push_rvlist15
+    ; CHECK-ZCMP32: liveins: $x1, $x8, $x9, $x18, $x19, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27
+    ; CHECK-ZCMP32-NEXT: {{  $}}
+    ; CHECK-ZCMP32-NEXT: frame-setup CM_PUSH 15, 0, implicit $x1, implicit $x8, implicit $x9, implicit $x18, implicit $x19, implicit $x20, implicit $x21, implicit $x22, implicit $x23, implicit $x24, implicit $x25, implicit $x26, implicit $x27
+    ; CHECK-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION def_cfa_offset 64
+    ; CHECK-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x1, -52
+    ; CHECK-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x8, -48
+    ; CHECK-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x9, -44
+    ; CHECK-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x18, -40
+    ; CHECK-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x19, -36
+    ; CHECK-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x20, -32
+    ; CHECK-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x21, -28
+    ; CHECK-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x22, -24
+    ; CHECK-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x23, -20
+    ; CHECK-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x24, -16
+    ; CHECK-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x25, -12
+    ; CHECK-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x26, -8
+    ; CHECK-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x27, -4
+    ; CHECK-ZCMP32-NEXT: $x1 = IMPLICIT_DEF
+    ; CHECK-ZCMP32-NEXT: $x8 = IMPLICIT_DEF
+    ; CHECK-ZCMP32-NEXT: $x9 = IMPLICIT_DEF
+    ; CHECK-ZCMP32-NEXT: $x18 = IMPLICIT_DEF
+    ; CHECK-ZCMP32-NEXT: $x19 = IMPLICIT_DEF
+    ; CHECK-ZCMP32-NEXT: $x20 = IMPLICIT_DEF
+    ; CHECK-ZCMP32-NEXT: $x21 = IMPLICIT_DEF
+    ; CHECK-ZCMP32-NEXT: $x22 = IMPLICIT_DEF
+    ; CHECK-ZCMP32-NEXT: $x23 = IMPLICIT_DEF
+    ; CHECK-ZCMP32-NEXT: $x24 = IMPLICIT_DEF
+    ; CHECK-ZCMP32-NEXT: $x25 = IMPLICIT_DEF
+    ; CHECK-ZCMP32-NEXT: $x26 = IMPLICIT_DEF
+    ; CHECK-ZCMP32-NEXT: $x27 = IMPLICIT_DEF
+    ; CHECK-ZCMP32-NEXT: frame-destroy CM_POP 15, 0, implicit-def $x1, implicit-def $x8, implicit-def $x9, implicit-def $x18, implicit-def $x19, implicit-def $x20, implicit-def $x21, implicit-def $x22, implicit-def $x23, implicit-def $x24, implicit-def $x25, implicit-def $x26, implicit-def $x27
+    ; CHECK-ZCMP32-NEXT: PseudoRET
+    ;
+    ; CHECK-ZCMP64-LABEL: name: push_rvlist15
+    ; CHECK-ZCMP64: liveins: $x1, $x8, $x9, $x18, $x19, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27
+    ; CHECK-ZCMP64-NEXT: {{  $}}
+    ; CHECK-ZCMP64-NEXT: frame-setup CM_PUSH 15, 0, implicit $x1, implicit $x8, implicit $x9, implicit $x18, implicit $x19, implicit $x20, implicit $x21, implicit $x22, implicit $x23, implicit $x24, implicit $x25, implicit $x26, implicit $x27
+    ; CHECK-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION def_cfa_offset 112
+    ; CHECK-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x1, -104
+    ; CHECK-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x8, -96
+    ; CHECK-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x9, -88
+    ; CHECK-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x18, -80
+    ; CHECK-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x19, -72
+    ; CHECK-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x20, -64
+    ; CHECK-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x21, -56
+    ; CHECK-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x22, -48
+    ; CHECK-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x23, -40
+    ; CHECK-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x24, -32
+    ; CHECK-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x25, -24
+    ; CHECK-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x26, -16
+    ; CHECK-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x27, -8
+    ; CHECK-ZCMP64-NEXT: $x1 = IMPLICIT_DEF
+    ; CHECK-ZCMP64-NEXT: $x8 = IMPLICIT_DEF
+    ; CHECK-ZCMP64-NEXT: $x9 = IMPLICIT_DEF
+    ; CHECK-ZCMP64-NEXT: $x18 = IMPLICIT_DEF
+    ; CHECK-ZCMP64-NEXT: $x19 = IMPLICIT_DEF
+    ; CHECK-ZCMP64-NEXT: $x20 = IMPLICIT_DEF
+    ; CHECK-ZCMP64-NEXT: $x21 = IMPLICIT_DEF
+    ; CHECK-ZCMP64-NEXT: $x22 = IMPLICIT_DEF
+    ; CHECK-ZCMP64-NEXT: $x23 = IMPLICIT_DEF
+    ; CHECK-ZCMP64-NEXT: $x24 = IMPLICIT_DEF
+    ; CHECK-ZCMP64-NEXT: $x25 = IMPLICIT_DEF
+    ; CHECK-ZCMP64-NEXT: $x26 = IMPLICIT_DEF
+    ; CHECK-ZCMP64-NEXT: $x27 = IMPLICIT_DEF
+    ; CHECK-ZCMP64-NEXT: frame-destroy CM_POP 15, 0, implicit-def $x1, implicit-def $x8, implicit-def $x9, implicit-def $x18, implicit-def $x19, implicit-def $x20, implicit-def $x21, implicit-def $x22, implicit-def $x23, implicit-def $x24, implicit-def $x25, implicit-def $x26, implicit-def $x27
+    ; CHECK-ZCMP64-NEXT: PseudoRET
+    ;
+    ; CHECK-NO-ZCMP32-LABEL: name: push_rvlist15
+    ; CHECK-NO-ZCMP32: liveins: $x1, $x8, $x9, $x18, $x19, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27
+    ; CHECK-NO-ZCMP32-NEXT: {{  $}}
+    ; CHECK-NO-ZCMP32-NEXT: $x2 = frame-setup ADDI $x2, -64
+    ; CHECK-NO-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION def_cfa_offset 64
+    ; CHECK-NO-ZCMP32-NEXT: SW killed $x1, $x2, 60 :: (store (s32) into %stack.0)
+    ; CHECK-NO-ZCMP32-NEXT: SW killed $x8, $x2, 56 :: (store (s32) into %stack.1)
+    ; CHECK-NO-ZCMP32-NEXT: SW killed $x9, $x2, 52 :: (store (s32) into %stack.2)
+    ; CHECK-NO-ZCMP32-NEXT: SW killed $x18, $x2, 48 :: (store (s32) into %stack.3)
+    ; CHECK-NO-ZCMP32-NEXT: SW killed $x19, $x2, 44 :: (store (s32) into %stack.4)
+    ; CHECK-NO-ZCMP32-NEXT: SW killed $x20, $x2, 40 :: (store (s32) into %stack.5)
+    ; CHECK-NO-ZCMP32-NEXT: SW killed $x21, $x2, 36 :: (store (s32) into %stack.6)
+    ; CHECK-NO-ZCMP32-NEXT: SW killed $x22, $x2, 32 :: (store (s32) into %stack.7)
+    ; CHECK-NO-ZCMP32-NEXT: SW killed $x23, $x2, 28 :: (store (s32) into %stack.8)
+    ; CHECK-NO-ZCMP32-NEXT: SW killed $x24, $x2, 24 :: (store (s32) into %stack.9)
+    ; CHECK-NO-ZCMP32-NEXT: SW killed $x25, $x2, 20 :: (store (s32) into %stack.10)
+    ; CHECK-NO-ZCMP32-NEXT: SW killed $x26, $x2, 16 :: (store (s32) into %stack.11)
+    ; CHECK-NO-ZCMP32-NEXT: SW killed $x27, $x2, 12 :: (store (s32) into %stack.12)
+    ; CHECK-NO-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x1, -4
+    ; CHECK-NO-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x8, -8
+    ; CHECK-NO-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x9, -12
+    ; CHECK-NO-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x18, -16
+    ; CHECK-NO-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x19, -20
+    ; CHECK-NO-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x20, -24
+    ; CHECK-NO-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x21, -28
+    ; CHECK-NO-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x22, -32
+    ; CHECK-NO-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x23, -36
+    ; CHECK-NO-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x24, -40
+    ; CHECK-NO-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x25, -44
+    ; CHECK-NO-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x26, -48
+    ; CHECK-NO-ZCMP32-NEXT: frame-setup CFI_INSTRUCTION offset $x27, -52
+    ; CHECK-NO-ZCMP32-NEXT: $x1 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP32-NEXT: $x8 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP32-NEXT: $x9 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP32-NEXT: $x18 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP32-NEXT: $x19 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP32-NEXT: $x20 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP32-NEXT: $x21 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP32-NEXT: $x22 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP32-NEXT: $x23 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP32-NEXT: $x24 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP32-NEXT: $x25 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP32-NEXT: $x26 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP32-NEXT: $x27 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP32-NEXT: $x1 = LW $x2, 60 :: (load (s32) from %stack.0)
+    ; CHECK-NO-ZCMP32-NEXT: $x8 = LW $x2, 56 :: (load (s32) from %stack.1)
+    ; CHECK-NO-ZCMP32-NEXT: $x9 = LW $x2, 52 :: (load (s32) from %stack.2)
+    ; CHECK-NO-ZCMP32-NEXT: $x18 = LW $x2, 48 :: (load (s32) from %stack.3)
+    ; CHECK-NO-ZCMP32-NEXT: $x19 = LW $x2, 44 :: (load (s32) from %stack.4)
+    ; CHECK-NO-ZCMP32-NEXT: $x20 = LW $x2, 40 :: (load (s32) from %stack.5)
+    ; CHECK-NO-ZCMP32-NEXT: $x21 = LW $x2, 36 :: (load (s32) from %stack.6)
+    ; CHECK-NO-ZCMP32-NEXT: $x22 = LW $x2, 32 :: (load (s32) from %stack.7)
+    ; CHECK-NO-ZCMP32-NEXT: $x23 = LW $x2, 28 :: (load (s32) from %stack.8)
+    ; CHECK-NO-ZCMP32-NEXT: $x24 = LW $x2, 24 :: (load (s32) from %stack.9)
+    ; CHECK-NO-ZCMP32-NEXT: $x25 = LW $x2, 20 :: (load (s32) from %stack.10)
+    ; CHECK-NO-ZCMP32-NEXT: $x26 = LW $x2, 16 :: (load (s32) from %stack.11)
+    ; CHECK-NO-ZCMP32-NEXT: $x27 = LW $x2, 12 :: (load (s32) from %stack.12)
+    ; CHECK-NO-ZCMP32-NEXT: $x2 = frame-destroy ADDI $x2, 64
+    ; CHECK-NO-ZCMP32-NEXT: PseudoRET
+    ;
+    ; CHECK-NO-ZCMP64-LABEL: name: push_rvlist15
+    ; CHECK-NO-ZCMP64: liveins: $x1, $x8, $x9, $x18, $x19, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27
+    ; CHECK-NO-ZCMP64-NEXT: {{  $}}
+    ; CHECK-NO-ZCMP64-NEXT: $x2 = frame-setup ADDI $x2, -112
+    ; CHECK-NO-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION def_cfa_offset 112
+    ; CHECK-NO-ZCMP64-NEXT: SD killed $x1, $x2, 104 :: (store (s64) into %stack.0)
+    ; CHECK-NO-ZCMP64-NEXT: SD killed $x8, $x2, 96 :: (store (s64) into %stack.1)
+    ; CHECK-NO-ZCMP64-NEXT: SD killed $x9, $x2, 88 :: (store (s64) into %stack.2)
+    ; CHECK-NO-ZCMP64-NEXT: SD killed $x18, $x2, 80 :: (store (s64) into %stack.3)
+    ; CHECK-NO-ZCMP64-NEXT: SD killed $x19, $x2, 72 :: (store (s64) into %stack.4)
+    ; CHECK-NO-ZCMP64-NEXT: SD killed $x20, $x2, 64 :: (store (s64) into %stack.5)
+    ; CHECK-NO-ZCMP64-NEXT: SD killed $x21, $x2, 56 :: (store (s64) into %stack.6)
+    ; CHECK-NO-ZCMP64-NEXT: SD killed $x22, $x2, 48 :: (store (s64) into %stack.7)
+    ; CHECK-NO-ZCMP64-NEXT: SD killed $x23, $x2, 40 :: (store (s64) into %stack.8)
+    ; CHECK-NO-ZCMP64-NEXT: SD killed $x24, $x2, 32 :: (store (s64) into %stack.9)
+    ; CHECK-NO-ZCMP64-NEXT: SD killed $x25, $x2, 24 :: (store (s64) into %stack.10)
+    ; CHECK-NO-ZCMP64-NEXT: SD killed $x26, $x2, 16 :: (store (s64) into %stack.11)
+    ; CHECK-NO-ZCMP64-NEXT: SD killed $x27, $x2, 8 :: (store (s64) into %stack.12)
+    ; CHECK-NO-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x1, -8
+    ; CHECK-NO-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x8, -16
+    ; CHECK-NO-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x9, -24
+    ; CHECK-NO-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x18, -32
+    ; CHECK-NO-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x19, -40
+    ; CHECK-NO-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x20, -48
+    ; CHECK-NO-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x21, -56
+    ; CHECK-NO-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x22, -64
+    ; CHECK-NO-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x23, -72
+    ; CHECK-NO-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x24, -80
+    ; CHECK-NO-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x25, -88
+    ; CHECK-NO-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x26, -96
+    ; CHECK-NO-ZCMP64-NEXT: frame-setup CFI_INSTRUCTION offset $x27, -104
+    ; CHECK-NO-ZCMP64-NEXT: $x1 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP64-NEXT: $x8 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP64-NEXT: $x9 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP64-NEXT: $x18 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP64-NEXT: $x19 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP64-NEXT: $x20 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP64-NEXT: $x21 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP64-NEXT: $x22 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP64-NEXT: $x23 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP64-NEXT: $x24 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP64-NEXT: $x25 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP64-NEXT: $x26 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP64-NEXT: $x27 = IMPLICIT_DEF
+    ; CHECK-NO-ZCMP64-NEXT: $x1 = LD $x2, 104 :: (load (s64) from %stack.0)
+    ; CHECK-NO-ZCMP64-NEXT: $x8 = LD $x2, 96 :: (load (s64) from %stack.1)
+    ; CHECK-NO-ZCMP64-NEXT: $x9 = LD $x2, 88 :: (load (s64) from %stack.2)
+    ; CHECK-NO-ZCMP64-NEXT: $x18 = LD $x2, 80 :: (load (s64) from %stack.3)
+    ; CHECK-NO-ZCMP64-NEXT: $x19 = LD $x2, 72 :: (load (s64) from %stack.4)
+    ; CHECK-NO-ZCMP64-NEXT: $x20 = LD $x2, 64 :: (load (s64) from %stack.5)
+    ; CHECK-NO-ZCMP64-NEXT: $x21 = LD $x2, 56 :: (load (s64) from %stack.6)
+    ; CHECK-NO-ZCMP64-NEXT: $x22 = LD $x2, 48 :: (load (s64) from %stack.7)
+    ; CHECK-NO-ZCMP64-NEXT: $x23 = LD $x2, 40 :: (load (s64) from %stack.8)
+    ; CHECK-NO-ZCMP64-NEXT: $x24 = LD $x2, 32 :: (load (s64) from %stack.9)
+    ; CHECK-NO-ZCMP64-NEXT: $x25 = LD $x2, 24 :: (load (s64) from %stack.10)
+    ; CHECK-NO-ZCMP64-NEXT: $x26 = LD $x2, 16 :: (load (s64) from %stack.11)
+    ; CHECK-NO-ZCMP64-NEXT: $x27 = LD $x2, 8 :: (load (s64) from %stack.12)
+    ; CHECK-NO-ZCMP64-NEXT: $x2 = frame-destroy ADDI $x2, 112
+    ; CHECK-NO-ZCMP64-NEXT: PseudoRET
+    $x1 = IMPLICIT_DEF
+    $x8 = IMPLICIT_DEF
+    $x9 = IMPLICIT_DEF
+    $x18 = IMPLICIT_DEF
+    $x19 = IMPLICIT_DEF
+    $x20 = IMPLICIT_DEF
+    $x21 = IMPLICIT_DEF
+    $x22 = IMPLICIT_DEF
+    $x23 = IMPLICIT_DEF
+    $x24 = IMPLICIT_DEF
+    $x25 = IMPLICIT_DEF
+    $x26 = IMPLICIT_DEF
+    $x27 = IMPLICIT_DEF
+    PseudoRET
+...

@wangpc-pp wangpc-pp requested review from asb and topperc January 26, 2024 03:34
@wangpc-pp
Copy link
Contributor

Thanks! I have a question:
Does ARM have the same issue as it has PUSH/POP? How does it handle these CSIs?

@francisvm
Copy link
Collaborator Author

Thanks! I have a question: Does ARM have the same issue as it has PUSH/POP? How does it handle these CSIs?

No, it doesn't have this issue. The issue here is that in MFI, there are no real stack objects created for these CSR spills, which is what a frame index is supposed to point to. In the other backends, they get real stack objects with frame indices that are not pre-defined, which is what we should probably do here too. I did try to fix it properly but there are too many assumptions made around the current mode that makes it a little harder to fix for me.

Copy link
Contributor

@wangpc-pp wangpc-pp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks fine, but I am not familiar with this part, so please wait for another approval.

llvm/lib/Target/RISCV/RISCVFrameLowering.cpp Show resolved Hide resolved
Registers that are pushed/popped by Zcmp or libcalls have pre-defined
frame indices that are never allocated in MachineFrameInfo. They're
being used throughout PEI, but the rest of codegen doesn't work that
way and expects each frame index to be a valid index in MFI.

This patch keeps it local to PEI and removes them from the
CalleeSavedInfo list at the end of the pass.

Before this pass, any MIR testing post-PEI is broken and asserts (see
issue llvm#79491).
@francisvm
Copy link
Collaborator Author

Last push should fix the test. Let me know if anyone has any more concerns, I'd like to put up other patches that need MIR testing.

@francisvm
Copy link
Collaborator Author

ping?

Copy link
Contributor

@yetingk yetingk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@francisvm francisvm merged commit 69a661c into llvm:main Feb 7, 2024
4 checks passed
@francisvm francisvm deleted the riscv/zcmp-libcall-csi branch February 7, 2024 02:19
@francisvm francisvm linked an issue Feb 8, 2024 that may be closed by this pull request
@topperc
Copy link
Collaborator

topperc commented Feb 8, 2024

Could this patch affect Inter Procedural Register Allocation? I'm seeing a regression and I'm guess this could be related.

@francisvm
Copy link
Collaborator Author

Could this patch affect Inter Procedural Register Allocation? I'm seeing a regression and I'm guess this could be related.

hmm, PEI runs post-ra, right? I don't know how IPRA works and what is the state of each function when it runs, but this should only be apparent post-PEI.

It does seem like LiveRegUnits uses MFI.getCalleeSavedInfo for pristine registers, so maybe there is something related there?

llvm/lib/CodeGen/LiveRegUnits.cpp
102:    const auto &CSI = MFI.getCalleeSavedInfo();
122:    for (const CalleeSavedInfo &Info : MFI.getCalleeSavedInfo())
133:  for (const CalleeSavedInfo &Info : MFI.getCalleeSavedInfo())

@topperc
Copy link
Collaborator

topperc commented Feb 8, 2024

Could this patch affect Inter Procedural Register Allocation? I'm seeing a regression and I'm guess this could be related.

hmm, PEI runs post-ra, right? I don't know how IPRA works and what is the state of each function when it runs, but this should only be apparent post-PEI.

It does seem like LiveRegUnits uses MFI.getCalleeSavedInfo for pristine registers, so maybe there is something related there?

llvm/lib/CodeGen/LiveRegUnits.cpp
102:    const auto &CSI = MFI.getCalleeSavedInfo();
122:    for (const CalleeSavedInfo &Info : MFI.getCalleeSavedInfo())
133:  for (const CalleeSavedInfo &Info : MFI.getCalleeSavedInfo())

IPRA has a pass that collects which registers a function uses. That is used to update the regmask for call sites in its callers. I think that collection pass runs after PEI.

@topperc
Copy link
Collaborator

topperc commented Feb 8, 2024

I'm going to attempt to register the slots with MFI.

@francisvm
Copy link
Collaborator Author

I'm going to attempt to register the slots with MFI.

that would be great, feel free to add me as a reviewer

topperc added a commit to topperc/llvm-project that referenced this pull request Feb 11, 2024
…ave-restore/Zcmp.

PEI previously used fake frame indices for these callee saved registers.
These fake frame indices are not register with MachineFrameInfo.
This required them to be deleted form CalleeSavedInfo after PEI to
avoid breaking later passes. See llvm#79535

Unfortunately, removing the registers from CalleeSavedInfo pessimizes
Interprocedural Register Allocation. The RegUsageInfoCollector pass
runs after PEI and uses CalleeSavedInfo.

This patch replaces llvm#79535 by properly creating fixed stack objects
through MachineFrameInfo. This changes the stack size and offsets
returned by MachineFrameInfo which requires changes to how
RISCVFrameLowering uses that information.

In addition to the individual object for each register, I've also
create a single large fixed object that covers the entire stack
area covered by cm.push or the libcalls. cm.push must always push
a multiple of 16 bytes and the save restore libcall pushes a multiple
of stack align. I think this leaves holes in the stack where we could
spill other registers, but it matches what we did previously. Maybe
we can optimize this in the future.

The only test changes are due to stack alignment handling after
the callee save registers. Since we now have the fixed objects, on
the stack the offset is non-zero when an aligned object is processed
so the offset gets rounded up, increasing the stack size.
topperc added a commit that referenced this pull request Feb 13, 2024
…ave-restore/Zcmp (#81392)

PEI previously used fake frame indices for these callee saved registers.
These fake frame indices are not register with MachineFrameInfo. This
required them to be deleted form CalleeSavedInfo after PEI to avoid
breaking later passes. See #79535

Unfortunately, removing the registers from CalleeSavedInfo pessimizes
Interprocedural Register Allocation. The RegUsageInfoCollector pass runs
after PEI and uses CalleeSavedInfo.

This patch replaces #79535 by properly creating fixed stack objects
through MachineFrameInfo. This changes the stack size and offsets
returned by MachineFrameInfo which requires changes to how
RISCVFrameLowering uses that information.

In addition to the individual object for each register, I've also create
a single large fixed object that covers the entire stack area covered by
cm.push or the libcalls. cm.push must always push a multiple of 16 bytes
and the save restore libcall pushes a multiple of stack align. I think
this leaves holes in the stack where we could spill other registers, but
it matches what we did previously. Maybe we can optimize this in the
future.

The only test changes are due to stack alignment handling after the
callee save registers. Since we now have the fixed objects, on the stack
the offset is non-zero when an aligned object is processed so the offset
gets rounded up, increasing the stack size.

I suspect we might need some more updates for RVV related code. There is
very little or maybe even no testing of RVV mixed with Zcmp and
save-restore.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RISCV] Post-PEI MIR testing is broken with Zcmp/libcalls
5 participants