[TableGen][SubtargetEmitter] Add the ability for processor models to …

…describe dependency breaking instructions. This patch adds the ability for processor models to describe dependency breaking instructions. Different processors may specify a different set of dependency-breaking instructions. That means, we cannot assume that all processors of the same target would use the same rules to classify dependency breaking instructions. The main goal of this patch is to provide the means to describe dependency breaking instructions directly via tablegen, and have the following TargetSubtargetInfo hooks redefined in overrides by tabegen'd XXXGenSubtargetInfo classes (here, XXX is a Target name). ``` virtual bool isZeroIdiom(const MachineInstr *MI, APInt &Mask) const { return false; } virtual bool isDependencyBreaking(const MachineInstr *MI, APInt &Mask) const { return isZeroIdiom(MI); } ``` An instruction MI is a dependency-breaking instruction if a call to method isDependencyBreaking(MI) on the STI (TargetSubtargetInfo object) evaluates to true. Similarly, an instruction MI is a special case of zero-idiom dependency breaking instruction if a call to STI.isZeroIdiom(MI) returns true. The extra APInt is used for those targets that may want to select which machine operands have their dependency broken (see comments in code). Note that by default, subtargets don't know about the existence of dependency-breaking. In the absence of external information, those method calls would always return false. A new tablegen class named STIPredicate has been added by this patch to let processor models classify instructions that have properties in common. The idea is that, a MCInstrPredicate definition can be used to "generate" an instruction equivalence class, with the idea that instructions of a same class all have a property in common. STIPredicate definitions are essentially a collection of instruction equivalence classes. Also, different processor models can specify a different variant of the same STIPredicate with different rules (i.e. predicates) to classify instructions. Tablegen backends (in this particular case, the SubtargetEmitter) will be able to process STIPredicate definitions, and automatically generate functions in XXXGenSubtargetInfo. This patch introduces two special kind of STIPredicate classes named IsZeroIdiomFunction and IsDepBreakingFunction in tablegen. It also adds a definition for those in the BtVer2 scheduling model only. This patch supersedes the one committed at r338372 (phabricator review: D49310). The main advantages are: - We can describe subtarget predicates via tablegen using STIPredicates. - We can describe zero-idioms / dep-breaking instructions directly via tablegen in the scheduling models. In future, the STIPredicates framework can be used for solving other problems. Examples of future developments are: - Teach how to identify optimizable register-register moves - Teach how to identify slow LEA instructions (each subtarget defining its own concept of "slow" LEA). - Teach how to identify instructions that have undocumented false dependencies on the output registers on some processors only. It is also (in my opinion) an elegant way to expose knowledge to both external tools like llvm-mca, and codegen passes. For example, machine schedulers in LLVM could reuse that information when internally constructing the data dependency graph for a code region. This new design feature is also an "opt-in" feature. Processor models don't have to use the new STIPredicates. It has all been designed to be as unintrusive as possible. Differential Revision: https://reviews.llvm.org/D52174 llvm-svn: 342555
llvm · Sep 19, 2018 · 8b6c314 · 8b6c314
1 parent 4fd2e2a
commit 8b6c314
Show file tree

Hide file tree

Showing 13 changed files with 1,164 additions and 92 deletions.
diff --git a/llvm/include/llvm/CodeGen/TargetSubtargetInfo.h b/llvm/include/llvm/CodeGen/TargetSubtargetInfo.h
@@ -14,6 +14,7 @@
 #ifndef LLVM_CODEGEN_TARGETSUBTARGETINFO_H
 #define LLVM_CODEGEN_TARGETSUBTARGETINFO_H
 
+#include "llvm/ADT/APInt.h"
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -144,6 +145,31 @@ class TargetSubtargetInfo : public MCSubtargetInfo {
     return 0;
   }
 
+  /// Returns true if \param MI is a dependency breaking zero-idiom instruction
+  /// for the subtarget.
+  ///
+  /// This function also sets bits in \param Mask related to input operands that
+  /// are not in a data dependency relationship.  There is one bit for each
+  /// machine operand; implicit operands follow explicit operands in the bit
+  /// representation used for \param Mask.  An empty \param Mask (i.e. a mask
+  /// with all bits cleared) means: data dependencies are "broken" for all the
+  /// explicit input machine operands of \param MI.
+  virtual bool isZeroIdiom(const MachineInstr *MI, APInt &Mask) const {
+    return false;
+  }
+
+  /// Returns true if \param MI is a dependency breaking instruction for the
+  /// subtarget.
+  ///
+  /// Similar in behavior to `isZeroIdiom`. However, it knows how to identify
+  /// all dependency breaking instructions (i.e. not just zero-idioms).
+  /// 
+  /// As for `isZeroIdiom`, this method returns a mask of "broken" dependencies.
+  /// (See method `isZeroIdiom` for a detailed description of \param Mask).
+  virtual bool isDependencyBreaking(const MachineInstr *MI, APInt &Mask) const {
+    return isZeroIdiom(MI, Mask);
+  }
+
   /// True if the subtarget should run MachineScheduler after aggressive
   /// coalescing.
   ///

diff --git a/llvm/include/llvm/MC/MCInstrAnalysis.h b/llvm/include/llvm/MC/MCInstrAnalysis.h
@@ -88,18 +88,53 @@ class MCInstrAnalysis {
                                     const MCInst &Inst,
                                     APInt &Writes) const;
 
-  /// Returns true if \param Inst is a dependency breaking instruction for the
+  /// Returns true if \param MI is a dependency breaking zero-idiom for the
   /// given subtarget.
   ///
+  /// \param Mask is used to identify input operands that have their dependency
+  /// broken. Each bit of the mask is associated with a specific input operand.
+  /// Bits associated with explicit input operands are laid out first in the
+  /// mask; implicit operands come after explicit operands.
+  /// 
+  /// Dependencies are broken only for operands that have their corresponding bit
+  /// set. Operands that have their bit cleared, or that don't have a
+  /// corresponding bit in the mask don't have their dependency broken.
+  /// Note that \param Mask may not be big enough to describe all operands.
+  /// The assumption for operands that don't have a correspondent bit in the
+  /// mask is that those are still data dependent.
+  /// 
+  /// The only exception to the rule is for when \param Mask has all zeroes.
+  /// A zero mask means: dependencies are broken for all explicit register
+  /// operands.
+  virtual bool isZeroIdiom(const MCInst &MI, APInt &Mask,
+                           unsigned CPUID) const {
+    return false;
+  }
+
+  /// Returns true if \param MI is a dependency breaking instruction for the
+  /// subtarget associated with \param CPUID.
+  ///
   /// The value computed by a dependency breaking instruction is not dependent
   /// on the inputs. An example of dependency breaking instruction on X86 is
   /// `XOR %eax, %eax`.
-  /// TODO: In future, we could implement an alternative approach where this
-  /// method returns `true` if the input instruction is not dependent on
-  /// some/all of its input operands. An APInt mask could then be used to
-  /// identify independent operands.
-  virtual bool isDependencyBreaking(const MCSubtargetInfo &STI,
-                                    const MCInst &Inst) const;
+  ///
+  /// If \param MI is a dependency breaking instruction for subtarget \param
+  /// CPUID, then \param Mask can be inspected to identify independent operands.
+  ///
+  /// Essentially, each bit of the mask corresponds to an input operand.
+  /// Explicit operands are laid out first in the mask; implicit operands follow
+  /// explicit operands. Bits are set for operands that are independent.
+  ///
+  /// Note that the number of bits in Mask may not be equivalent to the sum of
+  /// explicit and implicit operands in \param MI. Operands that don't have a
+  /// corresponding bit in Mask are assumed "not independente".
+  ///
+  /// The only exception is for when \param Mask is all zeroes. That means:
+  /// explicit input operands of \param MI are independent.
+  virtual bool isDependencyBreaking(const MCInst &MI, APInt &Mask,
+                                    unsigned CPUID) const {
+    return isZeroIdiom(MI, Mask, CPUID);
+  }
 
   /// Given a branch instruction try to get the address the branch
   /// targets. Return true on success, and the address in Target.

diff --git a/llvm/include/llvm/Target/TargetInstrPredicate.td b/llvm/include/llvm/Target/TargetInstrPredicate.td
@@ -68,6 +68,7 @@
 
 // Forward declarations.
 class Instruction;
+class SchedMachineModel;
 
 // A generic machine instruction predicate.
 class MCInstPredicate;
@@ -230,3 +231,100 @@ class CheckFunctionPredicate<string MCInstFn, string MachineInstrFn> : MCInstPre
   string MCInstFnName = MCInstFn;
   string MachineInstrFnName = MachineInstrFn;
 }
+
+// Used to classify machine instructions based on a machine instruction
+// predicate.
+//
+// Let IC be an InstructionEquivalenceClass definition, and MI a machine
+// instruction.  We say that MI belongs to the equivalence class described by IC
+// if and only if the following two conditions are met:
+//  a) MI's opcode is in the `opcodes` set, and
+//  b) `Predicate` evaluates to true when applied to MI.
+//
+// Instances of this class can be used by processor scheduling models to
+// describe instructions that have a property in common.  For example,
+// InstructionEquivalenceClass definitions can be used to identify the set of
+// dependency breaking instructions for a processor model.
+//
+// An (optional) list of operand indices can be used to further describe
+// properties that apply to instruction operands. For example, it can be used to
+// identify register uses of a dependency breaking instructions that are not in
+// a RAW dependency.
+class InstructionEquivalenceClass<list<Instruction> opcodes,
+                                  MCInstPredicate pred,
+                                  list<int> operands = []> {
+  list<Instruction> Opcodes = opcodes;
+  MCInstPredicate Predicate = pred;
+  list<int> OperandIndices = operands;
+}
+
+// Used by processor models to describe dependency breaking instructions.
+//
+// This is mainly an alias for InstructionEquivalenceClass.  Input operand
+// `BrokenDeps` identifies the set of "broken dependencies". There is one bit
+// per each implicit and explicit input operand.  An empty set of broken
+// dependencies means: "explicit input register operands are independent."
+class DepBreakingClass<list<Instruction> opcodes, MCInstPredicate pred,
+                       list<int> BrokenDeps = []>
+    : InstructionEquivalenceClass<opcodes, pred, BrokenDeps>;
+
+// A function descriptor used to describe the signature of a predicate methods
+// which will be expanded by the STIPredicateExpander into a tablegen'd
+// XXXGenSubtargetInfo class member definition (here, XXX is a target name).
+//
+// It describes the signature of a TargetSubtarget hook, as well as a few extra
+// properties. Examples of extra properties are:
+//  - The default return value for the auto-generate function hook.
+//  - A list of subtarget hooks (Delegates) that are called from this function.
+//
+class STIPredicateDecl<string name, MCInstPredicate default = FalsePred,
+                       bit overrides = 1, bit expandForMC = 1,
+                       bit updatesOpcodeMask = 0,
+                       list<STIPredicateDecl> delegates = []> {
+  string Name = name;
+
+  MCInstPredicate DefaultReturnValue = default;
+
+  // True if this method is declared as virtual in class TargetSubtargetInfo.
+  bit OverridesBaseClassMember = overrides;
+
+  // True if we need an equivalent predicate function in the MC layer.
+  bit ExpandForMC = expandForMC;
+
+  // True if the autogenerated method has a extra in/out APInt param used as a
+  // mask of operands.
+  bit UpdatesOpcodeMask = updatesOpcodeMask;
+
+  // A list of STIPredicates used by this definition to delegate part of the
+  // computation. For example, STIPredicateFunction `isDependencyBreaking()`
+  // delegates to `isZeroIdiom()` part of its computation.
+  list<STIPredicateDecl> Delegates = delegates;
+}
+
+// A predicate function definition member of class `XXXGenSubtargetInfo`.
+//
+// If `Declaration.ExpandForMC` is true, then SubtargetEmitter
+// will also expand another definition of this method that accepts a MCInst.
+class STIPredicate<STIPredicateDecl declaration,
+                   list<InstructionEquivalenceClass> classes> {
+  STIPredicateDecl Declaration = declaration;
+  list<InstructionEquivalenceClass> Classes = classes;
+  SchedMachineModel SchedModel = ?;
+}
+
+// Convenience classes and definitions used by processor scheduling models to
+// describe dependency breaking instructions.
+let UpdatesOpcodeMask = 1 in {
+
+def IsZeroIdiomDecl : STIPredicateDecl<"isZeroIdiom">;
+
+let Delegates = [IsZeroIdiomDecl] in
+def IsDepBreakingDecl : STIPredicateDecl<"isDependencyBreaking">;
+
+} // UpdatesOpcodeMask
+
+class IsZeroIdiomFunction<list<DepBreakingClass> classes>
+    : STIPredicate<IsZeroIdiomDecl, classes>;
+
+class IsDepBreakingFunction<list<DepBreakingClass> classes>
+    : STIPredicate<IsDepBreakingDecl, classes>;
diff --git a/llvm/lib/MC/MCInstrAnalysis.cpp b/llvm/lib/MC/MCInstrAnalysis.cpp
@@ -24,11 +24,6 @@ bool MCInstrAnalysis::clearsSuperRegisters(const MCRegisterInfo &MRI,
   return false;
 }
 
-bool MCInstrAnalysis::isDependencyBreaking(const MCSubtargetInfo &STI,
-                                           const MCInst &Inst) const {
-  return false;
-}
-
 bool MCInstrAnalysis::evaluateBranch(const MCInst &Inst, uint64_t Addr,
                                      uint64_t Size, uint64_t &Target) const {
   if (Inst.getNumOperands() == 0 ||

diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
@@ -380,8 +380,9 @@ class X86MCInstrAnalysis : public MCInstrAnalysis {
 public:
   X86MCInstrAnalysis(const MCInstrInfo *MCII) : MCInstrAnalysis(MCII) {}
 
-  bool isDependencyBreaking(const MCSubtargetInfo &STI,
-                            const MCInst &Inst) const override;
+#define GET_STIPREDICATE_DECLS_FOR_MC_ANALYSIS
+#include "X86GenSubtargetInfo.inc"
+
   bool clearsSuperRegisters(const MCRegisterInfo &MRI, const MCInst &Inst,
                             APInt &Mask) const override;
   std::vector<std::pair<uint64_t, uint64_t>>
@@ -390,77 +391,8 @@ class X86MCInstrAnalysis : public MCInstrAnalysis {
                  const Triple &TargetTriple) const override;
 };
 
-bool X86MCInstrAnalysis::isDependencyBreaking(const MCSubtargetInfo &STI,
-                                              const MCInst &Inst) const {
-  if (STI.getCPU() == "btver2") {
-    // Reference: Agner Fog's microarchitecture.pdf - Section 20 "AMD Bobcat and
-    // Jaguar pipeline", subsection 8 "Dependency-breaking instructions".
-    switch (Inst.getOpcode()) {
-    default:
-      return false;
-    case X86::SUB32rr:
-    case X86::SUB64rr:
-    case X86::SBB32rr:
-    case X86::SBB64rr:
-    case X86::XOR32rr:
-    case X86::XOR64rr:
-    case X86::XORPSrr:
-    case X86::XORPDrr:
-    case X86::VXORPSrr:
-    case X86::VXORPDrr:
-    case X86::ANDNPSrr:
-    case X86::VANDNPSrr:
-    case X86::ANDNPDrr:
-    case X86::VANDNPDrr:
-    case X86::PXORrr:
-    case X86::VPXORrr:
-    case X86::PANDNrr:
-    case X86::VPANDNrr:
-    case X86::PSUBBrr:
-    case X86::PSUBWrr:
-    case X86::PSUBDrr:
-    case X86::PSUBQrr:
-    case X86::VPSUBBrr:
-    case X86::VPSUBWrr:
-    case X86::VPSUBDrr:
-    case X86::VPSUBQrr:
-    case X86::PCMPEQBrr:
-    case X86::PCMPEQWrr:
-    case X86::PCMPEQDrr:
-    case X86::PCMPEQQrr:
-    case X86::VPCMPEQBrr:
-    case X86::VPCMPEQWrr:
-    case X86::VPCMPEQDrr:
-    case X86::VPCMPEQQrr:
-    case X86::PCMPGTBrr:
-    case X86::PCMPGTWrr:
-    case X86::PCMPGTDrr:
-    case X86::PCMPGTQrr:
-    case X86::VPCMPGTBrr:
-    case X86::VPCMPGTWrr:
-    case X86::VPCMPGTDrr:
-    case X86::VPCMPGTQrr:
-    case X86::MMX_PXORirr:
-    case X86::MMX_PANDNirr:
-    case X86::MMX_PSUBBirr:
-    case X86::MMX_PSUBDirr:
-    case X86::MMX_PSUBQirr:
-    case X86::MMX_PSUBWirr:
-    case X86::MMX_PCMPGTBirr:
-    case X86::MMX_PCMPGTDirr:
-    case X86::MMX_PCMPGTWirr:
-    case X86::MMX_PCMPEQBirr:
-    case X86::MMX_PCMPEQDirr:
-    case X86::MMX_PCMPEQWirr:
-      return Inst.getOperand(1).getReg() == Inst.getOperand(2).getReg();
-    case X86::CMP32rr:
-    case X86::CMP64rr:
-      return Inst.getOperand(0).getReg() == Inst.getOperand(1).getReg();
-    }
-  }
-
-  return false;
-}
+#define GET_STIPREDICATE_DEFS_FOR_MC_ANALYSIS
+#include "X86GenSubtargetInfo.inc"
 
 bool X86MCInstrAnalysis::clearsSuperRegisters(const MCRegisterInfo &MRI,
                                               const MCInst &Inst,

diff --git a/llvm/lib/Target/X86/X86ScheduleBtVer2.td b/llvm/lib/Target/X86/X86ScheduleBtVer2.td
@@ -687,4 +687,66 @@ def JSlowLEA16r : SchedWriteRes<[JALU01]> {
 
 def : InstRW<[JSlowLEA16r], (instrs LEA16r)>;
 
+///////////////////////////////////////////////////////////////////////////////
+// Dependency breaking instructions.
+///////////////////////////////////////////////////////////////////////////////
+
+def : IsZeroIdiomFunction<[
+  // GPR Zero-idioms.
+  DepBreakingClass<[ SUB32rr, SUB64rr, XOR32rr, XOR64rr ], ZeroIdiomPredicate>,
+
+  // MMX Zero-idioms.
+  DepBreakingClass<[
+    MMX_PXORirr, MMX_PANDNirr, MMX_PSUBBirr,
+    MMX_PSUBDirr, MMX_PSUBQirr, MMX_PSUBWirr,
+    MMX_PCMPGTBirr, MMX_PCMPGTDirr, MMX_PCMPGTWirr
+  ], ZeroIdiomPredicate>,
+
+  // SSE Zero-idioms.
+  DepBreakingClass<[
+    // fp variants.
+    XORPSrr, XORPDrr, ANDNPSrr, ANDNPDrr,
+
+    // int variants.
+    PXORrr, PANDNrr,
+    PSUBBrr, PSUBWrr, PSUBDrr, PSUBQrr,
+    PCMPGTBrr, PCMPGTDrr, PCMPGTQrr, PCMPGTWrr
+  ], ZeroIdiomPredicate>,
+
+  // AVX Zero-idioms.
+  DepBreakingClass<[
+    // xmm fp variants.
+    VXORPSrr, VXORPDrr, VANDNPSrr, VANDNPDrr,
+
+    // xmm int variants.
+    VPXORrr, VPANDNrr,
+    VPSUBBrr, VPSUBWrr, VPSUBDrr, VPSUBQrr,
+    VPCMPGTBrr, VPCMPGTWrr, VPCMPGTDrr, VPCMPGTQrr,
+
+    // ymm variants.
+    VXORPSYrr, VXORPDYrr, VANDNPSYrr, VANDNPDYrr
+  ], ZeroIdiomPredicate>
+]>;
+
+def : IsDepBreakingFunction<[
+  // GPR
+  DepBreakingClass<[ SBB32rr, SBB64rr ], ZeroIdiomPredicate>,
+  DepBreakingClass<[ CMP32rr, CMP64rr ], CheckSameRegOperand<0, 1> >,
+
+  // MMX
+  DepBreakingClass<[
+    MMX_PCMPEQBirr, MMX_PCMPEQDirr, MMX_PCMPEQWirr
+  ], ZeroIdiomPredicate>,
+
+  // SSE
+  DepBreakingClass<[ 
+    PCMPEQBrr, PCMPEQWrr, PCMPEQDrr, PCMPEQQrr
+  ], ZeroIdiomPredicate>,
+
+  // AVX
+  DepBreakingClass<[
+    VPCMPEQBrr, VPCMPEQWrr, VPCMPEQDrr, VPCMPEQQrr
+  ], ZeroIdiomPredicate>
+]>;
+
 } // SchedModel