Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[llvm-mca] Teach MCA constant registers do not create dependencies #89387

Merged
merged 5 commits into from
May 3, 2024

Conversation

Rin18
Copy link
Contributor

@Rin18 Rin18 commented Apr 19, 2024

Constant registers like the zero registers XZR and WZR are treated as any other register by LLVM-MCA. This can create non existent dependency chains.
Currently there is no method in MCA to query if a register is constant. This patch demonstrates a fix to the issue by adding a bool Constant variable to MCRegisterDesc that is true for constant registers. Since constant registers do not create dependencies, it makes sense to add this check to MCA.

Keeping track of constant registers in MCRegisterDesc adds an extra boolean to the description of each register in a given target. This will increase the size of generated code, which is an issue. Since registers are defined with an isConstant boolean attached, it makes sense for MCRegisterDesc to also be aware of this information. There doesn't seem to be a way to feed this information back to MCRegisterInfo without adding a getter to the class.

@llvmbot
Copy link
Collaborator

llvmbot commented Apr 19, 2024

@llvm/pr-subscribers-tools-llvm-mca

Author: Rin Dobrescu (Rin18)

Changes

Constant registers like the zero registers XZR and WZR are treated as any other register by LLVM-MCA. This can create non existent dependency chains.
Currently there is no method in MCA to query if a register is constant. This patch demonstrates a fix to the issue by adding a bool Constant variable to MCRegisterDesc that is true for constant registers. Since constant registers do not create dependencies, it makes sense to add this check to MCA.

Keeping track of constant registers in MCRegisterDesc adds an extra boolean to the description of each register in a given target. This will increase the size of generated code, which is an issue. Since registers are defined with an isConstant boolean attached, it makes sense for MCRegisterDesc to also be aware of this information. There doesn't seem to be a way to feed this information back to MCRegisterInfo without adding a getter to the class.


Full diff: https://github.com/llvm/llvm-project/pull/89387.diff

4 Files Affected:

  • (modified) llvm/include/llvm/MC/MCRegisterInfo.h (+7)
  • (modified) llvm/lib/MCA/InstrBuilder.cpp (+14-3)
  • (added) llvm/test/tools/llvm-mca/AArch64/Neoverse/V1-zero-dependency.s (+76)
  • (modified) llvm/utils/TableGen/RegisterInfoEmitter.cpp (+2-1)
diff --git a/llvm/include/llvm/MC/MCRegisterInfo.h b/llvm/include/llvm/MC/MCRegisterInfo.h
index c648ef20fa84c6..4dbd4435a8c719 100644
--- a/llvm/include/llvm/MC/MCRegisterInfo.h
+++ b/llvm/include/llvm/MC/MCRegisterInfo.h
@@ -126,6 +126,9 @@ struct MCRegisterDesc {
   /// Index into list with lane mask sequences. The sequence contains a lanemask
   /// for every register unit.
   uint16_t RegUnitLaneMasks;
+
+  // Returns true for constant registers
+  bool Constant;
 };
 
 /// MCRegisterInfo base class - We assume that the target defines a static
@@ -382,6 +385,10 @@ class MCRegisterInfo {
     return RegStrings + get(RegNo).Name;
   }
 
+  bool isConstant(MCRegister RegNo) const {
+    return get(RegNo).Constant && MCRegister::isPhysicalRegister(RegNo.id());
+  }
+
   /// Return the number of registers this target has (useful for
   /// sizing arrays holding per register information)
   unsigned getNumRegs() const {
diff --git a/llvm/lib/MCA/InstrBuilder.cpp b/llvm/lib/MCA/InstrBuilder.cpp
index 1a82e45763a267..ad558dd7cf858a 100644
--- a/llvm/lib/MCA/InstrBuilder.cpp
+++ b/llvm/lib/MCA/InstrBuilder.cpp
@@ -320,9 +320,9 @@ void InstrBuilder::populateWrites(InstrDesc &ID, const MCInst &MCI,
 
   unsigned NumVariadicOps = MCI.getNumOperands() - MCDesc.getNumOperands();
   ID.Writes.resize(TotalDefs + NumVariadicOps);
-  // Iterate over the operands list, and skip non-register operands.
-  // The first NumExplicitDefs register operands are expected to be register
-  // definitions.
+  // Iterate over the operands list, and skip non-register or constant register
+  // operands. The first NumExplicitDefs register operands are expected to be
+  // register definitions.
   unsigned CurrentDef = 0;
   unsigned OptionalDefIdx = MCDesc.getNumOperands() - 1;
   unsigned i = 0;
@@ -335,6 +335,10 @@ void InstrBuilder::populateWrites(InstrDesc &ID, const MCInst &MCI,
       OptionalDefIdx = CurrentDef++;
       continue;
     }
+    if (MRI.isConstant(Op.getReg())) {
+      CurrentDef++;
+      continue;
+    }
 
     WriteDescriptor &Write = ID.Writes[CurrentDef];
     Write.OpIndex = i;
@@ -413,6 +417,8 @@ void InstrBuilder::populateWrites(InstrDesc &ID, const MCInst &MCI,
     const MCOperand &Op = MCI.getOperand(OpIndex);
     if (!Op.isReg())
       continue;
+    if (MRI.isConstant(Op.getReg()))
+      continue;
 
     WriteDescriptor &Write = ID.Writes[CurrentDef];
     Write.OpIndex = OpIndex;
@@ -449,6 +455,9 @@ void InstrBuilder::populateReads(InstrDesc &ID, const MCInst &MCI,
     if (!Op.isReg())
       continue;
 
+    if (MRI.isConstant(Op.getReg()))
+      continue;
+
     ReadDescriptor &Read = ID.Reads[CurrentUse];
     Read.OpIndex = OpIndex;
     Read.UseIndex = I;
@@ -465,6 +474,8 @@ void InstrBuilder::populateReads(InstrDesc &ID, const MCInst &MCI,
     Read.OpIndex = ~I;
     Read.UseIndex = NumExplicitUses + I;
     Read.RegisterID = MCDesc.implicit_uses()[I];
+    if (MRI.isConstant(Read.RegisterID))
+      continue;
     Read.SchedClassID = SchedClassID;
     LLVM_DEBUG(dbgs() << "\t\t[Use][I] OpIdx=" << ~Read.OpIndex
                       << ", UseIndex=" << Read.UseIndex << ", RegisterID="
diff --git a/llvm/test/tools/llvm-mca/AArch64/Neoverse/V1-zero-dependency.s b/llvm/test/tools/llvm-mca/AArch64/Neoverse/V1-zero-dependency.s
new file mode 100644
index 00000000000000..071329fd00cdd2
--- /dev/null
+++ b/llvm/test/tools/llvm-mca/AArch64/Neoverse/V1-zero-dependency.s
@@ -0,0 +1,76 @@
+# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
+# RUN: llvm-mca -mtriple=aarch64 -mcpu=neoverse-v1 --timeline --timeline-max-iterations=4 < %s | FileCheck %s
+
+mov x0, x1
+cmp x0, #4
+
+# CHECK:      Iterations:        100
+# CHECK-NEXT: Instructions:      200
+# CHECK-NEXT: Total Cycles:      54
+# CHECK-NEXT: Total uOps:        200
+
+# CHECK:      Dispatch Width:    15
+# CHECK-NEXT: uOps Per Cycle:    3.70
+# CHECK-NEXT: IPC:               3.70
+# CHECK-NEXT: Block RThroughput: 0.3
+
+# CHECK:      Instruction Info:
+# CHECK-NEXT: [1]: #uOps
+# CHECK-NEXT: [2]: Latency
+# CHECK-NEXT: [3]: RThroughput
+# CHECK-NEXT: [4]: MayLoad
+# CHECK-NEXT: [5]: MayStore
+# CHECK-NEXT: [6]: HasSideEffects (U)
+
+# CHECK:      [1]    [2]    [3]    [4]    [5]    [6]    Instructions:
+# CHECK-NEXT:  1      1     0.25                        mov	x0, x1
+# CHECK-NEXT:  1      1     0.33                        cmp	x0, #4
+
+# CHECK:      Resources:
+# CHECK-NEXT: [0.0] - V1UnitB
+# CHECK-NEXT: [0.1] - V1UnitB
+# CHECK-NEXT: [1.0] - V1UnitD
+# CHECK-NEXT: [1.1] - V1UnitD
+# CHECK-NEXT: [2]   - V1UnitL2
+# CHECK-NEXT: [3.0] - V1UnitL01
+# CHECK-NEXT: [3.1] - V1UnitL01
+# CHECK-NEXT: [4]   - V1UnitM0
+# CHECK-NEXT: [5]   - V1UnitM1
+# CHECK-NEXT: [6.0] - V1UnitS
+# CHECK-NEXT: [6.1] - V1UnitS
+# CHECK-NEXT: [7]   - V1UnitV0
+# CHECK-NEXT: [8]   - V1UnitV1
+# CHECK-NEXT: [9]   - V1UnitV2
+# CHECK-NEXT: [10]  - V1UnitV3
+
+# CHECK:      Resource pressure per iteration:
+# CHECK-NEXT: [0.0]  [0.1]  [1.0]  [1.1]  [2]    [3.0]  [3.1]  [4]    [5]    [6.0]  [6.1]  [7]    [8]    [9]    [10]
+# CHECK-NEXT:  -      -      -      -      -      -      -     0.50   0.50   0.50   0.50    -      -      -      -
+
+# CHECK:      Resource pressure by instruction:
+# CHECK-NEXT: [0.0]  [0.1]  [1.0]  [1.1]  [2]    [3.0]  [3.1]  [4]    [5]    [6.0]  [6.1]  [7]    [8]    [9]    [10]   Instructions:
+# CHECK-NEXT:  -      -      -      -      -      -      -     0.48   0.50   0.01   0.01    -      -      -      -     mov	x0, x1
+# CHECK-NEXT:  -      -      -      -      -      -      -     0.02    -     0.49   0.49    -      -      -      -     cmp	x0, #4
+
+# CHECK:      Timeline view:
+# CHECK-NEXT: Index     012345
+
+# CHECK:      [0,0]     DeER .   mov	x0, x1
+# CHECK-NEXT: [0,1]     D=eER.   cmp	x0, #4
+# CHECK-NEXT: [1,0]     DeE-R.   mov	x0, x1
+# CHECK-NEXT: [1,1]     D=eER.   cmp	x0, #4
+# CHECK-NEXT: [2,0]     DeE-R.   mov	x0, x1
+# CHECK-NEXT: [2,1]     D=eER.   cmp	x0, #4
+# CHECK-NEXT: [3,0]     DeE-R.   mov	x0, x1
+# CHECK-NEXT: [3,1]     D==eER   cmp	x0, #4
+
+# CHECK:      Average Wait times (based on the timeline view):
+# CHECK-NEXT: [0]: Executions
+# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
+# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
+# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
+
+# CHECK:            [0]    [1]    [2]    [3]
+# CHECK-NEXT: 0.     4     1.0    1.0    0.8       mov	x0, x1
+# CHECK-NEXT: 1.     4     2.3    0.3    0.0       cmp	x0, #4
+# CHECK-NEXT:        4     1.6    0.6    0.4       <total>
diff --git a/llvm/utils/TableGen/RegisterInfoEmitter.cpp b/llvm/utils/TableGen/RegisterInfoEmitter.cpp
index ee8830edeedbbe..f6f93154c454d3 100644
--- a/llvm/utils/TableGen/RegisterInfoEmitter.cpp
+++ b/llvm/utils/TableGen/RegisterInfoEmitter.cpp
@@ -977,7 +977,8 @@ void RegisterInfoEmitter::runMCDesc(raw_ostream &OS, CodeGenTarget &Target,
        << DiffSeqs.get(SubRegLists[i]) << ", " << DiffSeqs.get(SuperRegLists[i])
        << ", " << SubRegIdxSeqs.get(SubRegIdxLists[i]) << ", "
        << (Offset << RegUnitBits | FirstRU) << ", "
-       << LaneMaskSeqs.get(RegUnitLaneMasks[i]) << " },\n";
+       << LaneMaskSeqs.get(RegUnitLaneMasks[i]) << ", " << Reg.Constant
+       << " },\n";
     ++i;
   }
   OS << "};\n\n"; // End of register descriptors...

Copy link
Contributor

@michaelmaitland michaelmaitland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good addition to llvm-mca. Thanks for the patch!

llvm/utils/TableGen/RegisterInfoEmitter.cpp Show resolved Hide resolved
llvm/include/llvm/MC/MCRegisterInfo.h Outdated Show resolved Hide resolved
llvm/include/llvm/MC/MCRegisterInfo.h Outdated Show resolved Hide resolved
@@ -977,7 +977,8 @@ void RegisterInfoEmitter::runMCDesc(raw_ostream &OS, CodeGenTarget &Target,
<< DiffSeqs.get(SubRegLists[i]) << ", " << DiffSeqs.get(SuperRegLists[i])
<< ", " << SubRegIdxSeqs.get(SubRegIdxLists[i]) << ", "
<< (Offset << RegUnitBits | FirstRU) << ", "
<< LaneMaskSeqs.get(RegUnitLaneMasks[i]) << " },\n";
<< LaneMaskSeqs.get(RegUnitLaneMasks[i]) << ", " << Reg.Constant
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have some tests checking against the content of TableGen-ed register info under test/TableGen, you might want to double check if you need to update any of them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out. I took a look at the TableGen tests relating to RegisterInfoEmitter.cpp. The tests I found were only related to register cost values, so this change shouldn't affect them.

@topperc
Copy link
Collaborator

topperc commented Apr 19, 2024

This will increase the size of generated code, which is an issue.

MCRegisterDesc is 22 bytes before this change but requires 4 byte alignment so I think the bool sneaks into the 2 byte padding.

@Rin18
Copy link
Contributor Author

Rin18 commented Apr 22, 2024

This will increase the size of generated code, which is an issue.

MCRegisterDesc is 22 bytes before this change but requires 4 byte alignment so I think the bool sneaks into the 2 byte padding.

I was not aware of that, this is reassuring.

@Rin18 Rin18 changed the title [WIP][RFC] Teach MCA constant registers do not create dependencies [llvm-mca] Teach MCA constant registers do not create dependencies Apr 23, 2024
@Rin18
Copy link
Contributor Author

Rin18 commented Apr 29, 2024

I'm really thankful for the comments I've received so far, I think I've addressed them all. This patch should be in a good state now. Further reviews would be appreciated to move this closer to being merged.

Copy link
Contributor

@michaelmaitland michaelmaitland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with minor comment.

llvm/lib/MCA/InstrBuilder.cpp Outdated Show resolved Hide resolved
@Rin18 Rin18 merged commit 385f59f into llvm:main May 3, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants