[AMDGPU] Add live-through register set printing to GCNRegPressurePrinter pass. #71096

Merged: 1 commit into llvm:main on Nov 20, 2023


vpykhtin (Contributor) commented Nov 2, 2023

Add printing of the live-through register set. A register is treated as live-through if it is in both the live-in and live-out sets and has no redefinitions in the block; it may still have uses in the block.
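
A minimal sketch of that definition, using plain C++ stand-ins rather than LLVM's classes. `Slot`, `Segment`, `SubRange`, `liveThroughMask` and `computeLiveThrough` are hypothetical names chosen for illustration; only the overall shape (intersect the live-in and live-out lane masks, then keep the subranges for which both block boundaries fall in one segment) follows the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-ins for SlotIndex and LaneBitmask.
using Slot = unsigned;
using LaneMask = std::uint64_t;

struct Segment {
  Slot Start; // inclusive
  Slot End;   // exclusive
  bool contains(Slot S) const { return Start <= S && S < End; }
};

// Lanes of one register plus the segments in which those lanes are live.
// A register without subranges is modeled as one SubRange covering all lanes.
struct SubRange {
  LaneMask Lanes;
  std::vector<Segment> Segments; // assumed sorted and non-overlapping
};

using LiveRegSet = std::map<unsigned, LaneMask>; // register -> live lanes

// True if Begin and End fall inside the same segment, so the value is not
// redefined between them.
bool isInOneSegment(const std::vector<Segment> &Segs, Slot Begin, Slot End) {
  assert(Begin <= End && "block boundaries out of order");
  for (const Segment &S : Segs)
    if (S.contains(Begin))
      return S.contains(End);
  return false;
}

// Lanes that are live-through [Begin, End] and fully covered by Mask.
LaneMask liveThroughMask(const std::vector<SubRange> &SubRanges, Slot Begin,
                         Slot End, LaneMask Mask) {
  LaneMask Result = 0;
  for (const SubRange &SR : SubRanges)
    if ((SR.Lanes & Mask) == SR.Lanes &&
        isInOneSegment(SR.Segments, Begin, End))
      Result |= SR.Lanes;
  return Result;
}

// Block-level loop: only lanes present in both live-in and live-out qualify.
LiveRegSet
computeLiveThrough(const LiveRegSet &LiveIn, const LiveRegSet &LiveOut,
                   const std::map<unsigned, std::vector<SubRange>> &Intervals,
                   Slot Begin, Slot End) {
  LiveRegSet LiveThrough;
  for (const auto &Entry : LiveIn) {
    auto OutIt = LiveOut.find(Entry.first);
    auto IntIt = Intervals.find(Entry.first);
    if (OutIt == LiveOut.end() || IntIt == Intervals.end())
      continue;
    LaneMask Common = Entry.second & OutIt->second;
    if (Common == 0)
      continue;
    LaneMask LT = liveThroughMask(IntIt->second, Begin, End, Common);
    if (LT != 0)
      LiveThrough[Entry.first] = LT;
  }
  return LiveThrough;
}
```

For a register whose lanes 0x03 are redefined inside the block while lanes 0xC0 are not, live-in and live-out can both report 0xC3 and `computeLiveThrough` keeps only 0xC0. These numbers are invented for the sketch.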

llvmbot (Collaborator) commented Nov 2, 2023

@llvm/pr-subscribers-backend-amdgpu

Author: Valery Pykhtin (vpykhtin)

Changes

Add printing of the live-through register set. A register is treated as live-through if it is in both the live-in and live-out sets and has no redefinitions in the block; it may still have uses in the block.


Full diff: https://github.com/llvm/llvm-project/pull/71096.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/GCNRegPressure.cpp (+40)
  • (modified) llvm/test/CodeGen/AMDGPU/regpressure_printer.mir (+23)
diff --git a/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp b/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
index a04c470b7b9762f..6bb2b8fccfb938b 100644
--- a/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+++ b/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
@@ -496,6 +496,34 @@ char &llvm::GCNRegPressurePrinterID = GCNRegPressurePrinter::ID;
 
 INITIALIZE_PASS(GCNRegPressurePrinter, "amdgpu-print-rp", "", true, true)
 
+// Return lanemask of Reg's subregs that are live-through at [Begin, End] and
+// are fully covered by Mask.
+static LaneBitmask
+getRegLiveThroughMask(const MachineRegisterInfo &MRI, const LiveIntervals &LIS,
+                      Register Reg, SlotIndex Begin, SlotIndex End,
+                      LaneBitmask Mask = LaneBitmask::getAll()) {
+
+  auto IsInOneSegment = [Begin, End](const LiveRange &LR) -> bool {
+    auto *Segment = LR.getSegmentContaining(Begin);
+    return Segment && Segment->contains(End);
+  };
+
+  LaneBitmask LiveThroughMask;
+  const LiveInterval &LI = LIS.getInterval(Reg);
+  if (LI.hasSubRanges()) {
+    for (auto &SR : LI.subranges()) {
+      if ((SR.LaneMask & Mask) == SR.LaneMask && IsInOneSegment(SR))
+        LiveThroughMask |= SR.LaneMask;
+    }
+  } else {
+    LaneBitmask RegMask = MRI.getMaxLaneMaskForVReg(Reg);
+    if ((RegMask & Mask) == RegMask && IsInOneSegment(LI))
+      LiveThroughMask = RegMask;
+  }
+
+  return LiveThroughMask;
+}
+
 bool GCNRegPressurePrinter::runOnMachineFunction(MachineFunction &MF) {
   const MachineRegisterInfo &MRI = MF.getRegInfo();
   const TargetRegisterInfo *TRI = MRI.getTargetRegisterInfo();
@@ -597,6 +625,18 @@ bool GCNRegPressurePrinter::runOnMachineFunction(MachineFunction &MF) {
     OS << PFX "  Live-out:" << llvm::print(LiveOut, MRI);
     if (UseDownwardTracker)
       ReportLISMismatchIfAny(LiveOut, getLiveRegs(MBBEndSlot, LIS, MRI));
+
+    GCNRPTracker::LiveRegSet LiveThrough;
+    for (auto [Reg, Mask] : LiveIn) {
+      LaneBitmask MaskIntersection = Mask & LiveOut.lookup(Reg);
+      if (MaskIntersection.any()) {
+        LaneBitmask LTMask = getRegLiveThroughMask(
+            MRI, LIS, Reg, MBBStartSlot, MBBEndSlot, MaskIntersection);
+        if (LTMask.any())
+          LiveThrough[Reg] = LTMask;
+      }
+    }
+    OS << PFX "  Live-thr:" << llvm::print(LiveThrough, MRI);
   }
   OS << "...\n";
   return false;
diff --git a/llvm/test/CodeGen/AMDGPU/regpressure_printer.mir b/llvm/test/CodeGen/AMDGPU/regpressure_printer.mir
index d53050167e98bef..d4e2e75f383d32b 100644
--- a/llvm/test/CodeGen/AMDGPU/regpressure_printer.mir
+++ b/llvm/test/CodeGen/AMDGPU/regpressure_printer.mir
@@ -17,11 +17,13 @@ body:             |
   ; RP-NEXT:   2     1      %1:sgpr_64 = IMPLICIT_DEF
   ; RP-NEXT:   2     1
   ; RP-NEXT:   Live-out: %0:0000000000000003 %1:000000000000000F
+  ; RP-NEXT:   Live-thr:
   ; RP-NEXT: bb.1:
   ; RP-NEXT:   Live-in:  %0:0000000000000003 %1:000000000000000F
   ; RP-NEXT:   SGPR  VGPR
   ; RP-NEXT:   2     1
   ; RP-NEXT:   Live-out: %0:0000000000000003 %1:000000000000000F
+  ; RP-NEXT:   Live-thr: %0:0000000000000003 %1:000000000000000F
   ; RP-NEXT: bb.2:
   ; RP-NEXT:   Live-in:  %0:0000000000000003 %1:000000000000000F
   ; RP-NEXT:   SGPR  VGPR
@@ -29,6 +31,7 @@ body:             |
   ; RP-NEXT:   2     1      S_NOP 0, implicit %0:vgpr_32, implicit %1:sgpr_64
   ; RP-NEXT:   0     0
   ; RP-NEXT:   Live-out:
+  ; RP-NEXT:   Live-thr:
   bb.0:
     %0:vgpr_32 = V_MOV_B32_e32 42, implicit $exec
     %1:sgpr_64 = IMPLICIT_DEF
@@ -49,6 +52,7 @@ body:             |
   ; RPU-NEXT:   3     0      %0:sgpr_128 = IMPLICIT_DEF
   ; RPU-NEXT:   3     0
   ; RPU-NEXT:   Live-out: %0:00000000000000F3
+  ; RPU-NEXT:   Live-thr:
   ; RPU-NEXT: bb.1:
   ; RPU-NEXT:   Live-in:  %0:00000000000000F3
   ; RPU-NEXT:   SGPR  VGPR
@@ -68,6 +72,7 @@ body:             |
   ; RPU-NEXT:   2     0      S_NOP 0, implicit %0.sub3:sgpr_128
   ; RPU-NEXT:   2     0
   ; RPU-NEXT:   Live-out: %0:00000000000000C3
+  ; RPU-NEXT:   Live-thr: %0:00000000000000C0
   ; RPU-NEXT: bb.2:
   ; RPU-NEXT:   Live-in:  %0:00000000000000C3
   ; RPU-NEXT:   SGPR  VGPR
@@ -75,6 +80,7 @@ body:             |
   ; RPU-NEXT:   2     0      S_NOP 0, implicit %0.sub3:sgpr_128, implicit %0.sub0:sgpr_128
   ; RPU-NEXT:   0     0
   ; RPU-NEXT:   Live-out:
+  ; RPU-NEXT:   Live-thr:
   ;
   ; RPD-LABEL: name: live_through_test
   ; RPD: bb.0:
@@ -84,6 +90,7 @@ body:             |
   ; RPD-NEXT:   4     0      %0:sgpr_128 = IMPLICIT_DEF
   ; RPD-NEXT:   3     0
   ; RPD-NEXT:   Live-out: %0:00000000000000F3
+  ; RPD-NEXT:   Live-thr:
   ; RPD-NEXT: bb.1:
   ; RPD-NEXT:   Live-in:  %0:00000000000000F3
   ; RPD-NEXT:   SGPR  VGPR
@@ -103,6 +110,7 @@ body:             |
   ; RPD-NEXT:   2     0      S_NOP 0, implicit %0.sub3:sgpr_128
   ; RPD-NEXT:   2     0
   ; RPD-NEXT:   Live-out: %0:00000000000000C3
+  ; RPD-NEXT:   Live-thr: %0:00000000000000C0
   ; RPD-NEXT: bb.2:
   ; RPD-NEXT:   Live-in:  %0:00000000000000C3
   ; RPD-NEXT:   SGPR  VGPR
@@ -110,6 +118,7 @@ body:             |
   ; RPD-NEXT:   2     0      S_NOP 0, implicit %0.sub3:sgpr_128, implicit %0.sub0:sgpr_128
   ; RPD-NEXT:   0     0
   ; RPD-NEXT:   Live-out:
+  ; RPD-NEXT:   Live-thr:
   bb.0:
     %0:sgpr_128 = IMPLICIT_DEF
   bb.1:
@@ -146,11 +155,13 @@ body:             |
   ; RPU-NEXT:   0     2      undef %1.sub1:vreg_64 = V_MOV_B32_e32 33, implicit $exec
   ; RPU-NEXT:   0     2
   ; RPU-NEXT:   Live-out: %0:0000000000000003 %1:000000000000000C
+  ; RPU-NEXT:   Live-thr:
   ; RPU-NEXT: bb.1:
   ; RPU-NEXT:   Live-in:  %0:0000000000000003 %1:000000000000000C
   ; RPU-NEXT:   SGPR  VGPR
   ; RPU-NEXT:   0     2
   ; RPU-NEXT:   Live-out: %0:0000000000000003 %1:000000000000000C
+  ; RPU-NEXT:   Live-thr: %0:0000000000000003 %1:000000000000000C
   ; RPU-NEXT: bb.2:
   ; RPU-NEXT:   Live-in:  %0:000000000000000F %1:000000000000000F
   ; RPU-NEXT:   mis LIS:  %0:0000000000000003 %1:000000000000000C
@@ -161,6 +172,7 @@ body:             |
   ; RPU-NEXT:   0     4      S_NOP 0, implicit %0:vreg_64, implicit %1:vreg_64
   ; RPU-NEXT:   0     0
   ; RPU-NEXT:   Live-out:
+  ; RPU-NEXT:   Live-thr:
   ;
   ; RPD-LABEL: name: upward_problem_lis_subregs_mismatch
   ; RPD: bb.0:
@@ -172,11 +184,13 @@ body:             |
   ; RPD-NEXT:   0     2      undef %1.sub1:vreg_64 = V_MOV_B32_e32 33, implicit $exec
   ; RPD-NEXT:   0     2
   ; RPD-NEXT:   Live-out: %0:0000000000000003 %1:000000000000000C
+  ; RPD-NEXT:   Live-thr:
   ; RPD-NEXT: bb.1:
   ; RPD-NEXT:   Live-in:  %0:0000000000000003 %1:000000000000000C
   ; RPD-NEXT:   SGPR  VGPR
   ; RPD-NEXT:   0     2
   ; RPD-NEXT:   Live-out: %0:0000000000000003 %1:000000000000000C
+  ; RPD-NEXT:   Live-thr: %0:0000000000000003 %1:000000000000000C
   ; RPD-NEXT: bb.2:
   ; RPD-NEXT:   Live-in:  %0:0000000000000003 %1:000000000000000C
   ; RPD-NEXT:   SGPR  VGPR
@@ -184,6 +198,7 @@ body:             |
   ; RPD-NEXT:   0     2      S_NOP 0, implicit %0:vreg_64, implicit %1:vreg_64
   ; RPD-NEXT:   0     0
   ; RPD-NEXT:   Live-out:
+  ; RPD-NEXT:   Live-thr:
   bb.0:
     undef %0.sub0:vreg_64 = V_MOV_B32_e32 42, implicit $exec
     undef %1.sub1:vreg_64 = V_MOV_B32_e32 33, implicit $exec
@@ -273,6 +288,7 @@ body:             |
   ; RPU-NEXT:   0     5      GLOBAL_STORE_DWORD %15:vreg_64, %18:vgpr_32, 0, 0, implicit $exec
   ; RPU-NEXT:   0     2
   ; RPU-NEXT:   Live-out: %0:0000000000000003 %16:0000000000000003
+  ; RPU-NEXT:   Live-thr:
   ; RPU-NEXT: bb.1:
   ; RPU-NEXT:   Live-in:  %0:0000000000000003 %16:0000000000000003
   ; RPU-NEXT:   SGPR  VGPR
@@ -286,11 +302,13 @@ body:             |
   ; RPU-NEXT:                DBG_VALUE
   ; RPU-NEXT:   0     2
   ; RPU-NEXT:   Live-out: %0:0000000000000003 %16:0000000000000003
+  ; RPU-NEXT:   Live-thr: %0:0000000000000003 %16:0000000000000003
   ; RPU-NEXT: bb.2:
   ; RPU-NEXT:   Live-in:  %0:0000000000000003 %16:0000000000000003
   ; RPU-NEXT:   SGPR  VGPR
   ; RPU-NEXT:   0     2
   ; RPU-NEXT:   Live-out: %0:0000000000000003 %16:0000000000000003
+  ; RPU-NEXT:   Live-thr: %0:0000000000000003 %16:0000000000000003
   ; RPU-NEXT: bb.3:
   ; RPU-NEXT:   Live-in:  %0:0000000000000003 %16:0000000000000003
   ; RPU-NEXT:   SGPR  VGPR
@@ -302,6 +320,7 @@ body:             |
   ; RPU-NEXT:   0     0      S_ENDPGM 0
   ; RPU-NEXT:   0     0
   ; RPU-NEXT:   Live-out:
+  ; RPU-NEXT:   Live-thr:
   ;
   ; RPD-LABEL: name: only_dbg_value_sched_region
   ; RPD: bb.0:
@@ -376,6 +395,7 @@ body:             |
   ; RPD-NEXT:   0     5      GLOBAL_STORE_DWORD %15:vreg_64, %18:vgpr_32, 0, 0, implicit $exec
   ; RPD-NEXT:   0     2
   ; RPD-NEXT:   Live-out: %0:0000000000000003 %16:0000000000000003
+  ; RPD-NEXT:   Live-thr:
   ; RPD-NEXT: bb.1:
   ; RPD-NEXT:   Live-in:  %0:0000000000000003 %16:0000000000000003
   ; RPD-NEXT:   SGPR  VGPR
@@ -389,11 +409,13 @@ body:             |
   ; RPD-NEXT:                DBG_VALUE
   ; RPD-NEXT:   0     2
   ; RPD-NEXT:   Live-out: %0:0000000000000003 %16:0000000000000003
+  ; RPD-NEXT:   Live-thr: %0:0000000000000003 %16:0000000000000003
   ; RPD-NEXT: bb.2:
   ; RPD-NEXT:   Live-in:  %0:0000000000000003 %16:0000000000000003
   ; RPD-NEXT:   SGPR  VGPR
   ; RPD-NEXT:   0     2
   ; RPD-NEXT:   Live-out: %0:0000000000000003 %16:0000000000000003
+  ; RPD-NEXT:   Live-thr: %0:0000000000000003 %16:0000000000000003
   ; RPD-NEXT: bb.3:
   ; RPD-NEXT:   Live-in:  %0:0000000000000003 %16:0000000000000003
   ; RPD-NEXT:   SGPR  VGPR
@@ -405,6 +427,7 @@ body:             |
   ; RPD-NEXT:   0     0      S_ENDPGM 0
   ; RPD-NEXT:   0     0
   ; RPD-NEXT:   Live-out:
+  ; RPD-NEXT:   Live-thr:
   bb.0:
     liveins: $vgpr0
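
The expectations above print each live register as `%N`, a colon, and a 16-digit uppercase hexadecimal lane mask. A rough sketch of producing that text format outside LLVM, assuming a hypothetical `formatLiveSet` helper and a plain `std::map` in place of `GCNRPTracker::LiveRegSet`; this is not the printer the pass uses:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <map>
#include <string>

// Hypothetical helper: renders " %<reg>:<16 hex digits>" for every entry,
// in ascending register order.
std::string formatLiveSet(const std::map<unsigned, std::uint64_t> &Regs) {
  std::string Out;
  for (const auto &Entry : Regs) {
    char Buf[48];
    int Written =
        std::snprintf(Buf, sizeof(Buf), " %%%u:%016llX", Entry.first,
                      static_cast<unsigned long long>(Entry.second));
    // The buffer is large enough for any 32-bit register number.
    assert(Written > 0 && Written < static_cast<int>(sizeof(Buf)));
    (void)Written;
    Out += Buf;
  }
  return Out;
}
```

An empty set yields an empty string, which corresponds to the bare `Live-thr:` lines in the test.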
 

rampitec (Collaborator) left a comment

I would add a number of registers for each type.

rampitec (Collaborator) left a comment

BTW, what about AGPR pressure?

Register Reg, SlotIndex Begin, SlotIndex End,
LaneBitmask Mask = LaneBitmask::getAll()) {

auto IsInOneSegment = [Begin, End](const LiveRange &LR) -> bool {
Contributor


Is this the same as LiveInterval::isLocal?

Contributor Author


This function checks whether the start and end slot indexes of a basic block lie within a single segment of a LiveRange, and therefore carry a single value. isLocal seems to do something different.

The comment on isLocal seems misleading to me:

    /// True iff this segment is a single segment that lies between the
    /// specified boundaries, exclusively. Vregs live across a backedge are not
    /// considered local. The boundaries are expected to lie within an extended
    /// basic block, so vregs that are not live out should contain no holes.
    bool isLocal(SlotIndex Start, SlotIndex End) const {
      return beginIndex() > Start.getBaseIndex() &&
        endIndex() < End.getBoundaryIndex();
    }

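isLocal is a member function of the LiveRange class, and a LiveRange may contain several segments, so beginIndex() and endIndex() describe the whole range rather than a single segment: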

    /// beginIndex - Return the lowest numbered slot covered.
    SlotIndex beginIndex() const {
      assert(!empty() && "Call to beginIndex() on empty range.");
      return segments.front().start;
    }

    /// endNumber - return the maximum point of the range of the whole,
    /// exclusive.
    SlotIndex endIndex() const {
      assert(!empty() && "Call to endIndex() on empty range.");
      return segments.back().end;
    }
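
To make the difference concrete, here is a small sketch that models a live range as a sorted list of segments and implements both predicates. `Slot`, `Segment` and `Range` are hypothetical stand-ins, and this `isLocal` deliberately ignores the `getBaseIndex()`/`getBoundaryIndex()` adjustments of the real one, so treat it as an approximation.

```cpp
#include <cassert>
#include <vector>

using Slot = unsigned; // hypothetical stand-in for SlotIndex

struct Segment {
  Slot Start; // inclusive
  Slot End;   // exclusive
  bool contains(Slot S) const { return Start <= S && S < End; }
};

struct Range {
  std::vector<Segment> Segments; // sorted, non-overlapping, non-empty

  Slot beginIndex() const {
    assert(!Segments.empty() && "beginIndex() on empty range");
    return Segments.front().Start;
  }
  Slot endIndex() const {
    assert(!Segments.empty() && "endIndex() on empty range");
    return Segments.back().End;
  }

  // Simplified isLocal: the whole range lies strictly between Start and End.
  // Holes between segments do not matter here.
  bool isLocal(Slot Start, Slot End) const {
    return beginIndex() > Start && endIndex() < End;
  }

  // Same shape as the patch's IsInOneSegment lambda: one segment must
  // contain both Begin and End.
  bool isInOneSegment(Slot Begin, Slot End) const {
    for (const Segment &S : Segments)
      if (S.contains(Begin))
        return S.contains(End);
    return false;
  }
};
```

For a block spanning slots 10 to 20, a range with the single segment [0, 40) passes `isInOneSegment` but not `isLocal`, while a range made of [12, 14) and [16, 18) passes `isLocal` but not `isInOneSegment`. The two predicates answer opposite questions in this model.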

vpykhtin (Contributor, Author) commented Nov 3, 2023

> BTW, what about AGPR pressure?

Should be added as a third column, but I suggest doing it in a separate PR.

vpykhtin (Contributor, Author) commented Nov 3, 2023

> I would add a number of registers for each type.

You mean like live-through pressure?

rampitec (Collaborator) commented Nov 3, 2023

>> I would add a number of registers for each type.

> You mean like live-through pressure?

You already have columns for SGPR and VGPR; you could just add these numbers below the live-through list.

vpykhtin (Contributor, Author) commented Nov 7, 2023

> You already have columns for SGPR and VGPR; you could just add these numbers below the live-through list.

Added with bb5b639.

@vpykhtin vpykhtin merged commit 57a11b7 into llvm:main Nov 20, 2023
3 checks passed
@vpykhtin vpykhtin deleted the add_livethrough_rp_printing branch November 20, 2023 12:35
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
…ter pass. (llvm#71096)

Add live-through register set printing, assuming live-through register
is in live-in and live-out sets, has no redefinitions but may have uses
in the block.

4 participants