[CodeGen] Introduce MI flag for Live Range split instructions #117543

Open: 1 commit to merge into base: main

cdevadas

For some targets, it is necessary to identify the COPY instructions that
correspond to RA-inserted live range splits. This patch adds the new
flag `MachineInstr::LRSplit` to serve that purpose.

cdevadas (Collaborator, Author) commented Nov 25, 2024

llvmbot (Member) commented Nov 25, 2024

@llvm/pr-subscribers-llvm-regalloc

Author: Christudasan Devadasan (cdevadas)

Changes

For some targets, it is necessary to identify the COPY instructions that
correspond to RA-inserted live range splits. This patch adds the new
flag `MachineInstr::LRSplit` to serve that purpose.


Full diff: https://github.com/llvm/llvm-project/pull/117543.diff

2 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/MachineInstr.h (+2-1)
  • (modified) llvm/lib/CodeGen/SplitKit.cpp (+2)
diff --git a/llvm/include/llvm/CodeGen/MachineInstr.h b/llvm/include/llvm/CodeGen/MachineInstr.h
index ead6bbe1d5f641..4545b205d07466 100644
--- a/llvm/include/llvm/CodeGen/MachineInstr.h
+++ b/llvm/include/llvm/CodeGen/MachineInstr.h
@@ -119,7 +119,8 @@ class MachineInstr
     Disjoint = 1 << 19,      // Each bit is zero in at least one of the inputs.
     NoUSWrap = 1 << 20,      // Instruction supports geps
                              // no unsigned signed wrap.
-    SameSign = 1 << 21       // Both operands have the same sign.
+    SameSign = 1 << 21,      // Both operands have the same sign.
+    LRSplit = 1 << 22        // Instruction for live range split.
   };
 
 private:
diff --git a/llvm/lib/CodeGen/SplitKit.cpp b/llvm/lib/CodeGen/SplitKit.cpp
index eb33b93c197d7c..5042f074c26c45 100644
--- a/llvm/lib/CodeGen/SplitKit.cpp
+++ b/llvm/lib/CodeGen/SplitKit.cpp
@@ -533,6 +533,7 @@ SlotIndex SplitEditor::buildSingleSubRegCopy(
               | getInternalReadRegState(!FirstCopy), SubIdx)
       .addReg(FromReg, 0, SubIdx);
 
+  CopyMI->setFlag(MachineInstr::LRSplit);
   SlotIndexes &Indexes = *LIS.getSlotIndexes();
   if (FirstCopy) {
     Def = Indexes.insertMachineInstrInMaps(*CopyMI, Late).getRegSlot();
@@ -552,6 +553,7 @@ SlotIndex SplitEditor::buildCopy(Register FromReg, Register ToReg,
     // The full vreg is copied.
     MachineInstr *CopyMI =
         BuildMI(MBB, InsertBefore, DebugLoc(), Desc, ToReg).addReg(FromReg);
+    CopyMI->setFlag(MachineInstr::LRSplit);
     return Indexes.insertMachineInstrInMaps(*CopyMI, Late).getRegSlot();
   }
 

qcolombet (Collaborator) left a comment

Why do you need this information?

At the end of the day this is just a regular copy.

@cdevadas (Collaborator, Author)

> Why do you need this information?
>
> At the end of the day this is just a regular copy.

Can you see the other PR in the stack? #117544
It is indeed just another copy, and that is the real problem: the LR split instructions cannot be told apart from the other COPY instructions. AMDGPU has multiple regalloc pipelines (one per regclass). We depend on the BBProlog concept (`isBasicBlockPrologue`) while the spills/copies are inserted during RA. This is primarily needed to push the VGPR spills/copies in certain blocks down to the right point, after the exec mask values are manipulated for divergent execution.
I could have used COPY directly to identify the split instruction if the target hook `isBasicBlockPrologue` were used only during RA. But it is integrated into the helper functions `SkipPHIsAndLabels` and `SkipPHIsLabelsAndDebug`, which skip certain pseudo/meta instructions at the top of the BB. These functions are also used during PHI elimination, MI Sink, etc., and that causes trouble.
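
To illustrate the distinction being described, here is a minimal, self-contained sketch. It is a toy model and does not use the LLVM headers: `ToyInstr`, `Opcode`, and `isSplitCopy` are hypothetical names invented for the example. Only the bit positions of `SameSign` and `LRSplit` are taken from the diff in this PR.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for an instruction opcode; not the LLVM opcode enum.
enum class Opcode { Copy, Spill, Add };

// Toy stand-in for MachineInstr (hypothetical, for illustration only).
struct ToyInstr {
  // Bit positions mirror the diff: SameSign is bit 21, LRSplit is bit 22.
  enum Flag : std::uint32_t { SameSign = 1u << 21, LRSplit = 1u << 22 };

  Opcode Op;
  std::uint32_t Flags;

  explicit ToyInstr(Opcode O) : Op(O), Flags(0) {}

  void setFlag(Flag F) { Flags |= F; }
  bool getFlag(Flag F) const { return (Flags & F) != 0; }
};

// Hypothetical target-side check: a COPY counts as a live range split
// copy only when the register allocator has tagged it with LRSplit.
bool isSplitCopy(const ToyInstr &MI) {
  return MI.Op == Opcode::Copy && MI.getFlag(ToyInstr::LRSplit);
}
```

In this model, a COPY created by ordinary codegen and a COPY created for a live range split have the same opcode. Only the flag tells them apart, which is the property the patch relies on.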

@cdevadas (Collaborator, Author)

Ping

@qcolombet (Collaborator)

> > Why do you need this information?
> > At the end of the day this is just a regular copy.
>
> Can you see the other PR in the stack? #117544 It is indeed just another copy. That's the real problem when identifying the LR split instructions from the other COPY instructions.

My point is why do you have to distinguish them to begin with?
Can't you apply your "push down" transformation on all the COPYs that you can?

cdevadas (Collaborator, Author) commented Dec 2, 2024

> My point is why do you have to distinguish them to begin with? Can't you apply your "push down" transformation on all the COPYs that you can?

For AMDGPU, the BB prolog instructions are the RA-inserted spills and LR split copies. Including any other instruction in the BBProlog would produce a wrong insertion point and therefore buggy CodeGen. The PHI elimination pass, for instance, uses the same hook to find the insertion point when inserting copies in the predecessor blocks. Any COPY at the beginning of a block that came from regular CodeGen would then be included in the prolog, leading to incorrect insertion points.
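
The insertion-point problem can be sketched with a small toy model. This is not LLVM code: `Kind`, `skipProlog`, and the two predicates are hypothetical names made up for the example, and a block is modeled simply as a vector of instruction kinds. The sketch compares a prologue predicate that accepts every COPY with one that accepts only RA-inserted instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy classification of instructions at the top of a block (hypothetical).
enum class Kind { Spill, SplitCopy, PlainCopy, Other };

// Predicate A: treats every COPY and spill as block prologue.
bool anyCopyIsProlog(Kind K) { return K != Kind::Other; }

// Predicate B: treats only RA-inserted spills and split copies as prologue.
bool onlyRAInsertedIsProlog(Kind K) {
  return K == Kind::Spill || K == Kind::SplitCopy;
}

// Returns the index of the first instruction that is not part of the
// prologue, i.e. the insertion point a pass would pick after skipping it.
template <typename Pred>
std::size_t skipProlog(const std::vector<Kind> &Block, Pred IsProlog) {
  std::size_t I = 0;
  while (I < Block.size() && IsProlog(Block[I]))
    ++I;
  return I;
}
```

For a pre-RA block modeled as `{PlainCopy, Other}`, predicate A yields insertion index 1 (past the regular COPY), while predicate B yields 0. For a post-RA block modeled as `{Spill, SplitCopy, Other}`, both yield 2. The two predicates therefore disagree only where a regular COPY sits at the top of the block, which is the case described above.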

cdevadas (Collaborator, Author) commented Dec 9, 2024

Ping

@cdevadas (Collaborator, Author)

Ping @qcolombet.

@cdevadas (Collaborator, Author)

Ping.

cdevadas (Collaborator, Author) commented Feb 4, 2025

Ping. @qcolombet this patch addresses a critical bug in the AMDGPU codegen. Please take a look.

3 participants