Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TableGen][GlobalISel] Specialize more MatchTable Opcodes #89736

Merged
merged 2 commits into from
Apr 24, 2024

Conversation

Pierre-vh
Copy link
Contributor

The vast majority of the following (very common) opcodes were always called with identical arguments:

  • GIM_CheckType for the root
  • GIM_CheckRegBankForClass for the root
  • GIR_Copy between the old and new root
  • GIR_ConstrainSelectedInstOperands on the new root
  • GIR_BuildMI to create the new root

I added overloaded version of each opcode specialized for the root instructions. It always saves between 1 and 2 bytes per instance depending on the number of arguments specialized into the opcode. Some of these opcodes had between 5 and 15k occurences in the AArch64 GlobalISel Match Table.

Additionally, the following opcodes are almost always used in the same sequence:

  • GIR_EraseFromParent 0 + GIR_Done
    • GIR_EraseRootFromParent_Done has been created to do both. Saves 2 bytes per occurence.
  • GIR_IsSafeToFold was always called for each InsnID except 0.
    • Changed the opcode to take the number of instructions to check after MI[0]

The savings from these are pretty neat. For AArch64GenGlobalISel.inc:

  • AArch64InstructionSelector.cpp.o goes down from 772kb to 704kb (-10% code size)
  • Self-reported MatchTable size goes from 420380 bytes to 352426 bytes (~ -17%)

A smaller match table means a faster match table because we spend less time iterating and decoding.
I don't have a solid measurement methodology for GlobalISel performance so I don't have precise numbers but I saw a few % of improvements in a simple testcase.

@llvmbot
Copy link
Collaborator

llvmbot commented Apr 23, 2024

@llvm/pr-subscribers-llvm-globalisel

Author: Pierre van Houtryve (Pierre-vh)

Changes

The vast majority of the following (very common) opcodes were always called with identical arguments:

  • GIM_CheckType for the root
  • GIM_CheckRegBankForClass for the root
  • GIR_Copy between the old and new root
  • GIR_ConstrainSelectedInstOperands on the new root
  • GIR_BuildMI to create the new root

I added overloaded version of each opcode specialized for the root instructions. It always saves between 1 and 2 bytes per instance depending on the number of arguments specialized into the opcode. Some of these opcodes had between 5 and 15k occurences in the AArch64 GlobalISel Match Table.

Additionally, the following opcodes are almost always used in the same sequence:

  • GIR_EraseFromParent 0 + GIR_Done
    • GIR_EraseRootFromParent_Done has been created to do both. Saves 2 bytes per occurence.
  • GIR_IsSafeToFold was always called for each InsnID except 0.
    • Changed the opcode to take the number of instructions to check after MI[0]

The savings from these are pretty neat. For AArch64GenGlobalISel.inc:

  • AArch64InstructionSelector.cpp.o goes down from 772kb to 704kb (-10% code size)
  • Self-reported MatchTable size goes from 420380 bytes to 352426 bytes (~ -17%)

A smaller match table means a faster match table because we spend less time iterating and decoding.
I don't have a solid measurement methodology for GlobalISel performance so I don't have precise numbers but I saw a few % of improvements in a simple testcase.


Patch is 269.39 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/89736.diff

43 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h (+17-3)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h (+45-23)
  • (modified) llvm/test/TableGen/ContextlessPredicates.td (+16-16)
  • (modified) llvm/test/TableGen/DefaultOpsGlobalISel.td (+85-94)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/builtins/match-table-eraseroot.td (+3-4)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/builtins/match-table-replacerreg.td (+6-8)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-imms.td (+11-14)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td (+6-8)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-miflags.td (+6-7)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-operand-types.td (+10-11)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-patfrag-root.td (+10-13)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-permutations.td (+51-95)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-temp-defs.td (+10-10)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-typeof.td (+5-6)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table.td (+13-16)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-atomic_store.td (+1-1)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-immAllZeroOne.td (+6-6)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-immarg-literal-pattern.td (+2-2)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-input-discard.td (+10-9)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-multiple-output-discard.td (+13-12)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-multiple-output.td (+41-38)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-nested-subregs.td (+7-6)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-notype-output-pattern.td (+4-4)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-output-discard.td (+11-10)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-zero-reg.td (+9-8)
  • (modified) llvm/test/TableGen/GlobalISelEmitter.td (+257-279)
  • (modified) llvm/test/TableGen/GlobalISelEmitterCustomPredicate.td (+43-43)
  • (modified) llvm/test/TableGen/GlobalISelEmitterFlags.td (+3-3)
  • (modified) llvm/test/TableGen/GlobalISelEmitterHwModes.td (+16-16)
  • (modified) llvm/test/TableGen/GlobalISelEmitterMatchTableOptimizer.td (+13-14)
  • (modified) llvm/test/TableGen/GlobalISelEmitterMatchTableOptimizerSameOperand-invalid.td (+77-82)
  • (modified) llvm/test/TableGen/GlobalISelEmitterMatchTableOptimizerSameOperand.td (+1-1)
  • (modified) llvm/test/TableGen/GlobalISelEmitterOverloadedPtr.td (+4-4)
  • (modified) llvm/test/TableGen/GlobalISelEmitterRegSequence.td (+9-8)
  • (modified) llvm/test/TableGen/GlobalISelEmitterSubreg.td (+40-33)
  • (modified) llvm/test/TableGen/GlobalISelEmitterVariadic.td (+12-12)
  • (modified) llvm/test/TableGen/HasNoUse.td (+8-7)
  • (modified) llvm/test/TableGen/address-space-patfrags.td (+1-1)
  • (modified) llvm/test/TableGen/gisel-physreg-input.td (+24-22)
  • (modified) llvm/test/TableGen/immarg-predicated.td (+2-2)
  • (modified) llvm/test/TableGen/immarg.td (+2-2)
  • (modified) llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.cpp (+138-84)
  • (modified) llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.h (+25-5)
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
index 29a46f04fd5de5..8eddc6a6a531b4 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
@@ -217,6 +217,8 @@ enum {
   /// - OpIdx(ULEB128) - Operand index
   /// - Ty(1) - Expected type
   GIM_CheckType,
+  /// GIM_CheckType but InsnID is omitted and defaults to zero.
+  GIM_RootCheckType,
 
   /// Check the type of a pointer to any address space.
   /// - InsnID(ULEB128) - Instruction ID
@@ -229,6 +231,8 @@ enum {
   /// - OpIdx(ULEB128) - Operand index
   /// - RC(2) - Expected register bank (specified as a register class)
   GIM_CheckRegBankForClass,
+  /// GIM_CheckRegBankForClass but InsnID is omitted and defaults to zero.
+  GIM_RootCheckRegBankForClass,
 
   /// Check the operand matches a complex predicate
   /// - InsnID(ULEB128) - Instruction ID
@@ -278,9 +282,9 @@ enum {
   /// - OpIdx(ULEB128) - Operand index
   GIM_CheckIsImm,
 
-  /// Check if the specified operand is safe to fold into the current
-  /// instruction.
-  /// - InsnID(ULEB128) - Instruction ID
+  /// Checks if the matched instructions numbered [1, 1+N) can
+  /// be folded into the root (inst 0).
+  /// - Num(1)
   GIM_CheckIsSafeToFold,
 
   /// Check the specified operands are identical.
@@ -338,6 +342,8 @@ enum {
   /// - InsnID(ULEB128) - Instruction ID to define
   /// - Opcode(2) - The new opcode to use
   GIR_BuildMI,
+  /// GIR_BuildMI but InsnID is omitted and defaults to zero.
+  GIR_BuildRootMI,
 
   /// Builds a constant and stores its result in a TempReg.
   /// - TempRegID(ULEB128) - Temp Register to define.
@@ -349,6 +355,8 @@ enum {
   /// - OldInsnID(ULEB128) - Instruction ID to copy from
   /// - OpIdx(ULEB128) - The operand to copy
   GIR_Copy,
+  /// GIR_Copy but with both New/OldInsnIDs omitted and defaulting to zero.
+  GIR_RootToRootCopy,
 
   /// Copy an operand to the specified instruction or add a zero register if the
   /// operand is a zero immediate.
@@ -506,6 +514,9 @@ enum {
   /// description.
   /// - InsnID(ULEB128) - Instruction ID to modify
   GIR_ConstrainSelectedInstOperands,
+  /// GIR_ConstrainSelectedInstOperands but InsnID is omitted and defaults to
+  /// zero.
+  GIR_RootConstrainSelectedInstOperands,
 
   /// Merge all memory operands into instruction.
   /// - InsnID(ULEB128) - Instruction ID to modify
@@ -518,6 +529,9 @@ enum {
   /// - InsnID(ULEB128) - Instruction ID to erase
   GIR_EraseFromParent,
 
+  /// Combines both a GIR_EraseFromParent 0 + GIR_Done
+  GIR_EraseRootFromParent_Done,
+
   /// Create a new temporary register that's not constrained.
   /// - TempRegID(ULEB128) - The temporary register ID to initialize.
   /// - Ty(1) - Expected type
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
index c73ac2c9f55b7b..dec2d97bb1fa7e 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
@@ -131,6 +131,16 @@ bool GIMatchTableExecutor::executeMatchTable(
     return V;
   };
 
+  const auto eraseImpl = [&](MachineInstr *MI) {
+    // If we're erasing the insertion point, ensure we don't leave a dangling
+    // pointer in the builder.
+    if (Builder.getInsertPt() == MI)
+      Builder.setInsertPt(*MI->getParent(), ++MI->getIterator());
+    if (Observer)
+      Observer->erasingInstr(*MI);
+    MI->eraseFromParent();
+  };
+
   while (true) {
     assert(CurrentIdx != ~0u && "Invalid MatchTable index");
     uint8_t MatcherOpcode = MatchTable[CurrentIdx++];
@@ -661,8 +671,9 @@ bool GIMatchTableExecutor::executeMatchTable(
 
       break;
     }
+    case GIM_RootCheckType:
     case GIM_CheckType: {
-      uint64_t InsnID = readULEB();
+      uint64_t InsnID = (MatcherOpcode == GIM_RootCheckType) ? 0 : readULEB();
       uint64_t OpIdx = readULEB();
       int TypeID = readS8();
       DEBUG_WITH_TYPE(TgtExecutor::getName(),
@@ -741,8 +752,11 @@ bool GIMatchTableExecutor::executeMatchTable(
       State.RecordedTypes[TypeIdx] = MRI.getType(Op.getReg());
       break;
     }
+
+    case GIM_RootCheckRegBankForClass:
     case GIM_CheckRegBankForClass: {
-      uint64_t InsnID = readULEB();
+      uint64_t InsnID =
+          (MatcherOpcode == GIM_RootCheckRegBankForClass) ? 0 : readULEB();
       uint64_t OpIdx = readULEB();
       uint16_t RCEnum = readU16();
       DEBUG_WITH_TYPE(TgtExecutor::getName(),
@@ -898,14 +912,16 @@ bool GIMatchTableExecutor::executeMatchTable(
       break;
     }
     case GIM_CheckIsSafeToFold: {
-      uint64_t InsnID = readULEB();
+      uint64_t NumInsn = MatchTable[CurrentIdx++];
       DEBUG_WITH_TYPE(TgtExecutor::getName(),
-                      dbgs() << CurrentIdx << ": GIM_CheckIsSafeToFold(MIs["
-                             << InsnID << "])\n");
-      assert(State.MIs[InsnID] != nullptr && "Used insn before defined");
-      if (!isObviouslySafeToFold(*State.MIs[InsnID], *State.MIs[0])) {
-        if (handleReject() == RejectAndGiveUp)
-          return false;
+                      dbgs() << CurrentIdx << ": GIM_CheckIsSafeToFold(N = "
+                             << NumInsn << ")\n");
+      MachineInstr &Root = *State.MIs[0];
+      for (unsigned K = 1, E = NumInsn + 1; K < E; ++K) {
+        if (!isObviouslySafeToFold(*State.MIs[K], Root)) {
+          if (handleReject() == RejectAndGiveUp)
+            return false;
+        }
       }
       break;
     }
@@ -1011,8 +1027,9 @@ bool GIMatchTableExecutor::executeMatchTable(
       break;
     }
 
+    case GIR_BuildRootMI:
     case GIR_BuildMI: {
-      uint64_t NewInsnID = readULEB();
+      uint64_t NewInsnID = (MatcherOpcode == GIR_BuildRootMI) ? 0 : readULEB();
       uint16_t Opcode = readU16();
       if (NewInsnID >= OutMIs.size())
         OutMIs.resize(NewInsnID + 1);
@@ -1034,9 +1051,12 @@ bool GIMatchTableExecutor::executeMatchTable(
       break;
     }
 
+    case GIR_RootToRootCopy:
     case GIR_Copy: {
-      uint64_t NewInsnID = readULEB();
-      uint64_t OldInsnID = readULEB();
+      uint64_t NewInsnID =
+          (MatcherOpcode == GIR_RootToRootCopy) ? 0 : readULEB();
+      uint64_t OldInsnID =
+          (MatcherOpcode == GIR_RootToRootCopy) ? 0 : readULEB();
       uint64_t OpIdx = readULEB();
       assert(OutMIs[NewInsnID] && "Attempted to add to undefined instruction");
       OutMIs[NewInsnID].add(State.MIs[OldInsnID]->getOperand(OpIdx));
@@ -1361,8 +1381,11 @@ bool GIMatchTableExecutor::executeMatchTable(
       break;
     }
 
+    case GIR_RootConstrainSelectedInstOperands:
     case GIR_ConstrainSelectedInstOperands: {
-      uint64_t InsnID = readULEB();
+      uint64_t InsnID = (MatcherOpcode == GIR_RootConstrainSelectedInstOperands)
+                            ? 0
+                            : readULEB();
       assert(OutMIs[InsnID] && "Attempted to add to undefined instruction");
       constrainSelectedInstRegOperands(*OutMIs[InsnID].getInstr(), TII, TRI,
                                        RBI);
@@ -1372,7 +1395,6 @@ bool GIMatchTableExecutor::executeMatchTable(
                              << InsnID << "])\n");
       break;
     }
-
     case GIR_MergeMemOperands: {
       uint64_t InsnID = readULEB();
       uint64_t NumInsn = MatchTable[CurrentIdx++];
@@ -1391,7 +1413,6 @@ bool GIMatchTableExecutor::executeMatchTable(
       DEBUG_WITH_TYPE(TgtExecutor::getName(), dbgs() << ")\n");
       break;
     }
-
     case GIR_EraseFromParent: {
       uint64_t InsnID = readULEB();
       MachineInstr *MI = State.MIs[InsnID];
@@ -1399,16 +1420,17 @@ bool GIMatchTableExecutor::executeMatchTable(
       DEBUG_WITH_TYPE(TgtExecutor::getName(),
                       dbgs() << CurrentIdx << ": GIR_EraseFromParent(MIs["
                              << InsnID << "])\n");
-      // If we're erasing the insertion point, ensure we don't leave a dangling
-      // pointer in the builder.
-      if (Builder.getInsertPt() == MI)
-        Builder.setInsertPt(*MI->getParent(), ++MI->getIterator());
-      if (Observer)
-        Observer->erasingInstr(*MI);
-      MI->eraseFromParent();
+      eraseImpl(MI);
       break;
     }
-
+    case GIR_EraseRootFromParent_Done: {
+      DEBUG_WITH_TYPE(TgtExecutor::getName(),
+                      dbgs()
+                          << CurrentIdx << ": GIR_EraseRootFromParent_Done\n");
+      eraseImpl(State.MIs[0]);
+      propagateFlags();
+      return true;
+    }
     case GIR_MakeTempReg: {
       uint64_t TempRegID = readULEB();
       int TypeID = readS8();
diff --git a/llvm/test/TableGen/ContextlessPredicates.td b/llvm/test/TableGen/ContextlessPredicates.td
index 5e4e69069c3e32..eead9655111e68 100644
--- a/llvm/test/TableGen/ContextlessPredicates.td
+++ b/llvm/test/TableGen/ContextlessPredicates.td
@@ -22,26 +22,26 @@ def : Pat<(test_atomic_op_frag GPR32:$ptr, GPR32:$val) ,
 
 // CHECK_NOPT-LABEL: const uint8_t *MyTargetInstructionSelector::getMatchTable() const {
 // CHECK_NOPT-NEXT:    constexpr static uint8_t MatchTable0[] = {
-// CHECK_NOPT-NEXT:      GIM_Try, /*On fail goto*//*Label 0*/ GIMT_Encode4(58), // Rule ID 0 //
+// CHECK_NOPT-NEXT:      GIM_Try, /*On fail goto*//*Label 0*/ GIMT_Encode4(52), // Rule ID 0 //
 // CHECK_NOPT-NEXT:        GIM_CheckNumOperands, /*MI*/0, /*Expected*/3,
 // CHECK_NOPT-NEXT:        GIM_CheckOpcode, /*MI*/0, GIMT_Encode2(TargetOpcode::G_ATOMICRMW_XCHG),
 // CHECK_NOPT-NEXT:        GIM_CheckMemorySizeEqualTo, /*MI*/0, /*MMO*/0, /*Size*/GIMT_Encode4(4),
 // CHECK_NOPT-NEXT:        // MIs[0] DstI[dst]
-// CHECK_NOPT-NEXT:        GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
-// CHECK_NOPT-NEXT:        GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID),
+// CHECK_NOPT-NEXT:        GIM_RootCheckType, /*Op*/0, /*Type*/GILLT_s32,
+// CHECK_NOPT-NEXT:        GIM_RootCheckRegBankForClass, /*Op*/0, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID),
 // CHECK_NOPT-NEXT:        // MIs[0] ptr
 // CHECK_NOPT-NEXT:        GIM_CheckPointerToAny, /*MI*/0, /*Op*/1, /*SizeInBits*/32,
-// CHECK_NOPT-NEXT:        GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID),
+// CHECK_NOPT-NEXT:        GIM_RootCheckRegBankForClass, /*Op*/1, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID),
 // CHECK_NOPT-NEXT:        // MIs[0] val
-// CHECK_NOPT-NEXT:        GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
-// CHECK_NOPT-NEXT:        GIM_CheckRegBankForClass, /*MI*/0, /*Op*/2, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID),
+// CHECK_NOPT-NEXT:        GIM_RootCheckType, /*Op*/2, /*Type*/GILLT_s32,
+// CHECK_NOPT-NEXT:        GIM_RootCheckRegBankForClass, /*Op*/2, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID),
 // CHECK_NOPT-NEXT:        GIM_CheckCxxInsnPredicate, /*MI*/0, /*FnId*/GIMT_Encode2(GICXXPred_MI_Predicate_test_atomic_op_frag),
 // CHECK_NOPT-NEXT:        // (atomic_swap:{ *:[i32] } GPR32:{ *:[i32] }:$ptr, GPR32:{ *:[i32] }:$val)<<P:Predicate_test_atomic_op_frag>>  =>  (INSN:{ *:[i32] } GPR32:{ *:[i32] }:$ptr, GPR32:{ *:[i32] }:$val)
 // CHECK_NOPT-NEXT:        GIR_MutateOpcode, /*InsnID*/0, /*RecycleInsnID*/0, /*Opcode*/GIMT_Encode2(MyTarget::INSN),
-// CHECK_NOPT-NEXT:        GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
+// CHECK_NOPT-NEXT:        GIR_RootConstrainSelectedInstOperands,
 // CHECK_NOPT-NEXT:        // GIR_Coverage, 0,
 // CHECK_NOPT-NEXT:        GIR_Done,
-// CHECK_NOPT-NEXT:      // Label 0: @58
+// CHECK_NOPT-NEXT:      // Label 0: @52
 // CHECK_NOPT-NEXT:      GIM_Reject,
 // CHECK_NOPT-NEXT:      };
 // CHECK_NOPT-NEXT:    return MatchTable0;
@@ -49,23 +49,23 @@ def : Pat<(test_atomic_op_frag GPR32:$ptr, GPR32:$val) ,
 
 // CHECK_OPT-LABEL: const uint8_t *MyTargetInstructionSelector::getMatchTable() const {
 // CHECK_OPT-NEXT:   constexpr static uint8_t MatchTable0[] = {
-// CHECK_OPT-NEXT:     GIM_Try, /*On fail goto*//*Label 0*/ GIMT_Encode4(55), // Rule ID 0 //
+// CHECK_OPT-NEXT:     GIM_Try, /*On fail goto*//*Label 0*/ GIMT_Encode4(49), // Rule ID 0 //
 // CHECK_OPT-NEXT:       GIM_CheckOpcode, /*MI*/0, GIMT_Encode2(TargetOpcode::G_ATOMICRMW_XCHG),
-// CHECK_OPT-NEXT:       GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
-// CHECK_OPT-NEXT:       GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
+// CHECK_OPT-NEXT:       GIM_RootCheckType, /*Op*/0, /*Type*/GILLT_s32,
+// CHECK_OPT-NEXT:       GIM_RootCheckType, /*Op*/2, /*Type*/GILLT_s32,
 // CHECK_OPT-NEXT:       GIM_CheckMemorySizeEqualTo, /*MI*/0, /*MMO*/0, /*Size*/GIMT_Encode4(4),
-// CHECK_OPT-NEXT:       GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID),
+// CHECK_OPT-NEXT:       GIM_RootCheckRegBankForClass, /*Op*/0, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID),
 // CHECK_OPT-NEXT:       // MIs[0] ptr
 // CHECK_OPT-NEXT:       GIM_CheckPointerToAny, /*MI*/0, /*Op*/1, /*SizeInBits*/32,
-// CHECK_OPT-NEXT:       GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID),
-// CHECK_OPT-NEXT:       GIM_CheckRegBankForClass, /*MI*/0, /*Op*/2, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID),
+// CHECK_OPT-NEXT:       GIM_RootCheckRegBankForClass, /*Op*/1, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID),
+// CHECK_OPT-NEXT:       GIM_RootCheckRegBankForClass, /*Op*/2, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID),
 // CHECK_OPT-NEXT:       GIM_CheckCxxInsnPredicate, /*MI*/0, /*FnId*/GIMT_Encode2(GICXXPred_MI_Predicate_test_atomic_op_frag),
 // CHECK_OPT-NEXT:       // (atomic_swap:{ *:[i32] } GPR32:{ *:[i32] }:$ptr, GPR32:{ *:[i32] }:$val)<<P:Predicate_test_atomic_op_frag>>  =>  (INSN:{ *:[i32] } GPR32:{ *:[i32] }:$ptr, GPR32:{ *:[i32] }:$val)
 // CHECK_OPT-NEXT:       GIR_MutateOpcode, /*InsnID*/0, /*RecycleInsnID*/0, /*Opcode*/GIMT_Encode2(MyTarget::INSN),
-// CHECK_OPT-NEXT:       GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
+// CHECK_OPT-NEXT:       GIR_RootConstrainSelectedInstOperands,
 // CHECK_OPT-NEXT:       // GIR_Coverage, 0,
 // CHECK_OPT-NEXT:       GIR_Done,
-// CHECK_OPT-NEXT:     // Label 0: @55
+// CHECK_OPT-NEXT:     // Label 0: @49
 // CHECK_OPT-NEXT:     GIM_Reject,
 // CHECK_OPT-NEXT:     };
 // CHECK_OPT-NEXT:   return MatchTable0;
diff --git a/llvm/test/TableGen/DefaultOpsGlobalISel.td b/llvm/test/TableGen/DefaultOpsGlobalISel.td
index 0c5aa0b912f549..8f4176a2aa730b 100644
--- a/llvm/test/TableGen/DefaultOpsGlobalISel.td
+++ b/llvm/test/TableGen/DefaultOpsGlobalISel.td
@@ -33,101 +33,97 @@ def clamp : OperandWithDefaultOps <i1, (ops (i1 0))>;
 
 // CHECK:      const uint8_t *MyTargetInstructionSelector::getMatchTable() const {
 // CHECK-NEXT:   constexpr static uint8_t MatchTable0[] = {
-// CHECK-NEXT:     GIM_Try, /*On fail goto*//*Label 0*/ GIMT_Encode4(79), // Rule ID 3 //
+// CHECK-NEXT:     GIM_Try, /*On fail goto*//*Label 0*/ GIMT_Encode4(69), // Rule ID 3 //
 // CHECK-NEXT:       GIM_CheckNumOperands, /*MI*/0, /*Expected*/3,
 // CHECK-NEXT:       GIM_CheckOpcode, /*MI*/0, GIMT_Encode2(TargetOpcode::G_FMAXNUM),
 // CHECK-NEXT:       // MIs[0] DstI[dst]
-// CHECK-NEXT:       GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
-// CHECK-NEXT:       GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/GIMT_Encode2(MyTarget::FPR32RegClassID),
+// CHECK-NEXT:       GIM_RootCheckType, /*Op*/0, /*Type*/GILLT_s32,
+// CHECK-NEXT:       GIM_RootCheckRegBankForClass, /*Op*/0, /*RC*/GIMT_Encode2(MyTarget::FPR32RegClassID),
 // CHECK-NEXT:       // MIs[0] SelectSrcMods:src0:mods0
-// CHECK-NEXT:       GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
+// CHECK-NEXT:       GIM_RootCheckType, /*Op*/1, /*Type*/GILLT_s32,
 // CHECK-NEXT:       GIM_CheckComplexPattern, /*MI*/0, /*Op*/1, /*Renderer*/GIMT_Encode2(0), GIMT_Encode2(GICP_gi_SelectSrcMods),
 // CHECK-NEXT:       // MIs[0] SelectSrcMods:src1:mods1
-// CHECK-NEXT:       GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
+// CHECK-NEXT:       GIM_RootCheckType, /*Op*/2, /*Type*/GILLT_s32,
 // CHECK-NEXT:       GIM_CheckComplexPattern, /*MI*/0, /*Op*/2, /*Renderer*/GIMT_Encode2(1), GIMT_Encode2(GICP_gi_SelectSrcMods),
 // CHECK-NEXT:       // (fmaxnum:{ *:[f32] } (SelectSrcMods:{ *:[f32] } f32:{ *:[f32] }:$src0, src_mods:{ *:[i32] }:$mods0), (SelectSrcMods:{ *:[f32] } f32:{ *:[f32] }:$src1, src_mods:{ *:[i32] }:$mods1))  =>  (FMAX:{ *:[f32] } src_mods:{ *:[i32] }:$mods0, f32:{ *:[f32] }:$src0, src_mods:{ *:[i32] }:$mods1, f32:{ *:[f32] }:$src1)
-// CHECK-NEXT:       GIR_BuildMI, /*InsnID*/0, /*Opcode*/GIMT_Encode2(MyTarget::FMAX),
-// CHECK-NEXT:       GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // DstI[dst]
+// CHECK-NEXT:       GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(MyTarget::FMAX),
+// CHECK-NEXT:       GIR_RootToRootCopy, /*OpIdx*/0, // DstI[dst]
 // CHECK-NEXT:       GIR_ComplexSubOperandRenderer, /*InsnID*/0, /*RendererID*/GIMT_Encode2(0), /*SubOperand*/1, // mods0
 // CHECK-NEXT:       GIR_ComplexSubOperandRenderer, /*InsnID*/0, /*RendererID*/GIMT_Encode2(0), /*SubOperand*/0, // src0
 // CHECK-NEXT:       GIR_ComplexSubOperandRenderer, /*InsnID*/0, /*RendererID*/GIMT_Encode2(1), /*SubOperand*/1, // mods1
 // CHECK-NEXT:       GIR_ComplexSubOperandRenderer, /*InsnID*/0, /*RendererID*/GIMT_Encode2(1), /*SubOperand*/0, // src1
 // CHECK-NEXT:       GIR_AddImm8, /*InsnID*/0, /*Imm*/0,
-// CHECK-NEXT:       GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
-// CHECK-NEXT:       GIR_EraseFromParent, /*InsnID*/0,
+// CHECK-NEXT:       GIR_RootConstrainSelectedInstOperands,
 // CHECK-NEXT:       // GIR_Coverage, 3,
-// CHECK-NEXT:       GIR_Done,
-// CHECK-NEXT:     // Label 0: @79
-// CHECK-NEXT:     GIM_Try, /*On fail goto*//*Label 1*/ GIMT_Encode4(139), // Rule ID 2 //
+// CHECK-NEXT:       GIR_EraseRootFromParent_Done,
+// CHECK-NEXT:     // Label 0: @69
+// CHECK-NEXT:     GIM_Try, /*On fail goto*//*Label 1*/ GIMT_Encode4(120), // Rule ID 2 //
 // CHECK-NEXT:       GIM_CheckNumOperands, /*MI*/0, /*Expected*/2,
 // CHECK-NEXT:       GIM_CheckOpcode, /*MI*/0, GIMT_Encode2(TargetOpcode::G_FFLOOR),
 // CHECK-NEXT:       // MIs[0] DstI[dst]
-// CHECK-NEXT:       GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
-// CHECK-NEXT:       GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/GIMT_Encode2(MyTarget::FPR32RegClassID),
+// CHECK-NEXT:       GIM_RootCheckType, /*Op*/0, /*Type*/GILLT_s32,
+// CHECK-NEXT:       GIM_RootCheckRegBankForClass, /*Op*/0, /*RC*/GIMT_Encode2(MyTarget::FPR32RegClassID),
 // CHECK-NEXT:       // MIs[0] SelectClampOMod:src0:omod:clamp
-// CHECK-NEXT:       GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32,
+// CHECK-NEXT:       GIM_RootCheckType, /*Op*/1, /*Type*/GILLT_s32,
 // CHECK-NEXT:       GIM_CheckComplexPattern, /*MI*/0, /*Op*/1, /*Renderer*/GIMT_Encode2(0), GIMT_Encode2(GICP_gi_SelectClampOMod),
 // CHECK-NEXT:       // (ffloor:{ *:[f32] } (SelectClampOMod:{ *:[f32] } f32:{ *:[f32] }:$src0, omod:{ *:[i32] }:$omod, i1:{ *:[i1] }:$clamp))  =>  (FLOMP:{ *:[f32] } f32:{ *:[f32] }:$src0, i1:{ *:[i1] }:$clamp, omod:{ *:[i32] }:$omod)
-// CHECK-NEXT:       GIR_BuildMI, /*InsnID*/0, /*Opcode*/GIMT_Encode2(MyTarget::FLOMP),
-// CHECK-NEXT:       GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // DstI[dst]
+// CHECK-NEXT:       GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(MyTarget::FLOMP),
+// CHECK-NEXT:       GIR_RootToRootCopy, /*OpIdx*/0, // DstI[dst]
 // CHECK-NEXT:       GIR_ComplexSubOperandRenderer, /*InsnID*/0, /*RendererID*/GIMT_Encode2(0), /*SubOperand*/0, // src0
 // CHECK-NEXT:       GIR_ComplexSubOperandRenderer, /*InsnID*/0, /*RendererID*/GIMT_Encode2(0), /*SubOperand*/2, // clamp
 // CHECK-NEXT:       GIR_ComplexSubOperandRenderer, /*InsnID*/0, /*RendererID*/GIMT_Encode2(0), /*SubOperand*/1, // omod
-// CHECK-NEXT:       GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
-// CHECK-NEXT:       GIR_EraseFromParent, /*InsnID*/0,
+// CHECK-NEXT:       GIR_RootConstrainSelectedInstOperands,
 // CHECK-NEXT:       // GIR_Coverage, 2,
-// CHECK-NEXT:       GIR_Done,
-// CHECK-NEXT:     // Label 1: @139
-// CHECK-NEXT:     GIM_Try, /*On fail goto*//*Label 2*/ GIMT_Encode4(207), // Rule ID 8 //
+// CHECK-NEXT:       GIR_EraseRootFromParent_Done,
+// CHECK-NEXT:     // Label 1: @120
+// CHECK-NEXT:     GIM_Try, /*On fail goto*//*Label 2*/ GIMT_Encode4(179), // Rule ID 8 //
 // CHECK-NEXT:       GIM_CheckNumOperands, /*MI*/0,...
[truncated]

Copy link

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 8a631d789859d09ba3a11a1206c30e595f4b6428 c063e482a69a955c6580dfee1d69a3bad5d1b766 -- llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.cpp llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.h
View the diff from clang-format here.
diff --git a/llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.cpp b/llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.cpp
index 663c04c521..8af219f34e 100644
--- a/llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.cpp
+++ b/llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.cpp
@@ -730,7 +730,7 @@ void RuleMatcher::optimize() {
   auto EraseRootIt = Actions.end();
   auto It = Actions.begin();
   while (It != Actions.end()) {
-    if(const auto* EI = dyn_cast<EraseInstAction>(It->get())) {
+    if (const auto *EI = dyn_cast<EraseInstAction>(It->get())) {
       unsigned InstID = EI->getInsnID();
       if (!AlreadySeenEraseInsts.insert(InstID).second) {
         It = Actions.erase(It);

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the table still emitting everything as uint64_t? Is it time to fix that?

@Pierre-vh
Copy link
Contributor Author

Is the table still emitting everything as uint64_t? Is it time to fix that?

No, it's a byte table now with most things using a flexible encoding (ULEB) or fixed-size encoding between 2 and 8 bytes if we can't do ULEB (e.g. value not known)

@aemerson
Copy link
Contributor

Very nice! I can try to measure the compile time benefit of this when I have a bit of time, so we can understand how much these types of savings are worth for future.

@Pierre-vh Pierre-vh merged commit 9375962 into llvm:main Apr 24, 2024
3 of 4 checks passed
@Pierre-vh Pierre-vh deleted the specialize-more-opcodes-matchtable branch April 24, 2024 07:19
@aemerson
Copy link
Contributor

Here's some compile time measurements for CTMark building with -Os 10 times, it seems ClamAV failed to build, but overall seems there's a neutral/slight improvement:

Program                                       compile_time
                                              lhs          rhs    diff
consumer-typeset/consumer-typeset               3.58         3.58  0.0%
kimwitu++/kc                                    6.67         6.67  0.0%
lencod/lencod                                   4.65         4.65 -0.0%
7zip/7zip-benchmark                            14.78        14.77 -0.0%
SPASS/SPASS                                     4.56         4.56 -0.0%
mafft/pairlocalalign                            2.66         2.66 -0.0%
sqlite3/sqlite3                                 2.49         2.49 -0.0%
tramp3d-v4/tramp3d-v4                           5.69         5.68 -0.1%
Bullet/bullet                                  10.99        10.99 -0.1%
                           Geomean difference                     -0.0%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants