[X86] Add some missing dependency-breaking zero idiom patterns to scheduler models

Many of the x86 scheduler models are not accounting for their microarch's ability to handle dependency-breaking zero idioms (pxor xmm0,xmm0 etc.), which is causing some notable differences when comparing llvm-mca reports to iaca, uops.info etc.

These are based on the Intel AoMs and Agner's docs which list the instructions handled on each cpu model - there may be more, although tbh the xor/pxor/xorps/xorpd are by far the most commonly encountered.

Once this is in place we also need to review missing support for 'allones' idioms and reg-reg move elimination, but this needs fixing first.

@lebedev.ri The Barcelona test changes are due to the cpu still being tagged as using the SandyBridge model, if/when you get back to D63628 these will need to be addressed.

Based on an original patch by @andreadb (Andrea Di Biagio)

Differential Revision: https://reviews.llvm.org/D117497
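
To illustrate why unmodelled zero idioms skew llvm-mca reports, here is a minimal Python sketch of a register dependency chain. It is a toy model and not llvm-mca's implementation: the `Inst` type, the opcode set, the latencies, and the choice to treat a zero idiom's result as available at cycle 0 are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical opcode set for illustration; the real per-CPU lists are
# the DepBreakingClass entries in the scheduler model .td files.
ZERO_IDIOM_OPCODES = {"xor", "sub", "pxor", "xorps", "xorpd"}


@dataclass(frozen=True)
class Inst:
    opcode: str
    dst: str
    srcs: tuple
    latency: int = 1  # illustrative latency, not taken from any model


def is_zero_idiom(inst):
    """True when the opcode is listed and both sources are the same register."""
    return (
        inst.opcode in ZERO_IDIOM_OPCODES
        and len(inst.srcs) == 2
        and inst.srcs[0] == inst.srcs[1]
    )


def critical_path(insts, model_zero_idioms):
    """Length in cycles of the longest register dependency chain."""
    ready = {}  # register -> cycle at which its value is available
    longest = 0
    for inst in insts:
        if model_zero_idioms and is_zero_idiom(inst):
            # The result does not depend on the old register value, so
            # this toy model makes it available immediately.
            done = 0
        else:
            start = max((ready.get(r, 0) for r in inst.srcs), default=0)
            done = start + inst.latency
        ready[inst.dst] = done
        longest = max(longest, done)
    return longest


chain = [
    Inst("mulps", "xmm0", ("xmm0", "xmm1"), latency=4),
    Inst("pxor", "xmm0", ("xmm0", "xmm0")),
    Inst("addps", "xmm0", ("xmm0", "xmm2"), latency=4),
]

without_idioms = critical_path(chain, model_zero_idioms=False)  # 9
with_idioms = critical_path(chain, model_zero_idioms=True)      # 4
```

In this sketch the `pxor` serialises behind the `mulps` unless the model knows it breaks the dependency, which is the kind of gap that shows up when comparing against iaca or uops.info.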
RKSimon committed Jan 19, 2022
1 parent 42a6821 commit 6eb8fc9
Showing 25 changed files with 1,150 additions and 749 deletions.
36 changes: 36 additions & 0 deletions llvm/lib/Target/X86/X86SchedBroadwell.td
@@ -1743,4 +1743,40 @@ def BWSETA_SETBErm : SchedWriteVariant<[
def : InstRW<[BWSETA_SETBErr], (instrs SETCCr)>;
def : InstRW<[BWSETA_SETBErm], (instrs SETCCm)>;

///////////////////////////////////////////////////////////////////////////////
// Dependency breaking instructions.
///////////////////////////////////////////////////////////////////////////////

def : IsZeroIdiomFunction<[
// GPR Zero-idioms.
DepBreakingClass<[ SUB32rr, SUB64rr, XOR32rr, XOR64rr ], ZeroIdiomPredicate>,

// SSE Zero-idioms.
DepBreakingClass<[
// fp variants.
XORPSrr, XORPDrr,

// int variants.
PXORrr,
PSUBBrr, PSUBWrr, PSUBDrr, PSUBQrr,
PCMPGTBrr, PCMPGTDrr, PCMPGTQrr, PCMPGTWrr
], ZeroIdiomPredicate>,

// AVX Zero-idioms.
DepBreakingClass<[
// xmm fp variants.
VXORPSrr, VXORPDrr,

// xmm int variants.
VPXORrr,
VPSUBBrr, VPSUBWrr, VPSUBDrr, VPSUBQrr,
VPCMPGTBrr, VPCMPGTWrr, VPCMPGTDrr, VPCMPGTQrr,

// ymm variants.
VXORPSYrr, VXORPDYrr, VPXORYrr,
VPSUBBYrr, VPSUBWYrr, VPSUBDYrr, VPSUBQYrr,
VPCMPGTBYrr, VPCMPGTWYrr, VPCMPGTDYrr, VPCMPGTQYrr
], ZeroIdiomPredicate>,
]>;

} // SchedModel
36 changes: 36 additions & 0 deletions llvm/lib/Target/X86/X86SchedHaswell.td
@@ -2032,4 +2032,40 @@ def HWSETA_SETBErm : SchedWriteVariant<[
def : InstRW<[HWSETA_SETBErr], (instrs SETCCr)>;
def : InstRW<[HWSETA_SETBErm], (instrs SETCCm)>;

///////////////////////////////////////////////////////////////////////////////
// Dependency breaking instructions.
///////////////////////////////////////////////////////////////////////////////

def : IsZeroIdiomFunction<[
// GPR Zero-idioms.
DepBreakingClass<[ SUB32rr, SUB64rr, XOR32rr, XOR64rr ], ZeroIdiomPredicate>,

// SSE Zero-idioms.
DepBreakingClass<[
// fp variants.
XORPSrr, XORPDrr,

// int variants.
PXORrr,
PSUBBrr, PSUBWrr, PSUBDrr, PSUBQrr,
PCMPGTBrr, PCMPGTDrr, PCMPGTQrr, PCMPGTWrr
], ZeroIdiomPredicate>,

// AVX Zero-idioms.
DepBreakingClass<[
// xmm fp variants.
VXORPSrr, VXORPDrr,

// xmm int variants.
VPXORrr,
VPSUBBrr, VPSUBWrr, VPSUBDrr, VPSUBQrr,
VPCMPGTBrr, VPCMPGTWrr, VPCMPGTDrr, VPCMPGTQrr,

// ymm variants.
VXORPSYrr, VXORPDYrr, VPXORYrr,
VPSUBBYrr, VPSUBWYrr, VPSUBDYrr, VPSUBQYrr,
VPCMPGTBYrr, VPCMPGTWYrr, VPCMPGTDYrr
], ZeroIdiomPredicate>,
]>;

} // SchedModel
44 changes: 44 additions & 0 deletions llvm/lib/Target/X86/X86SchedIceLake.td
@@ -2632,4 +2632,48 @@ def ICXSETA_SETBErm : SchedWriteVariant<[
def : InstRW<[ICXSETA_SETBErr], (instrs SETCCr)>;
def : InstRW<[ICXSETA_SETBErm], (instrs SETCCm)>;

///////////////////////////////////////////////////////////////////////////////
// Dependency breaking instructions.
///////////////////////////////////////////////////////////////////////////////

def : IsZeroIdiomFunction<[
// GPR Zero-idioms.
DepBreakingClass<[ SUB32rr, SUB64rr, XOR32rr, XOR64rr ], ZeroIdiomPredicate>,

// SSE Zero-idioms.
DepBreakingClass<[
// fp variants.
XORPSrr, XORPDrr,

// int variants.
PXORrr,
PSUBBrr, PSUBWrr, PSUBDrr, PSUBQrr,
PCMPGTBrr, PCMPGTDrr, PCMPGTQrr, PCMPGTWrr
], ZeroIdiomPredicate>,

// AVX Zero-idioms.
DepBreakingClass<[
// xmm fp variants.
VXORPSrr, VXORPDrr,

// xmm int variants.
VPXORrr,
VPSUBBrr, VPSUBWrr, VPSUBDrr, VPSUBQrr,
VPCMPGTBrr, VPCMPGTWrr, VPCMPGTDrr, VPCMPGTQrr,

// ymm variants.
VXORPSYrr, VXORPDYrr, VPXORYrr,
VPSUBBYrr, VPSUBWYrr, VPSUBDYrr, VPSUBQYrr,
VPCMPGTBYrr, VPCMPGTWYrr, VPCMPGTDYrr, VPCMPGTQYrr,

// zmm variants.
VXORPSZrr, VXORPDZrr, VPXORDZrr, VPXORQZrr,
VXORPSZ128rr, VXORPDZ128rr, VPXORDZ128rr, VPXORQZ128rr,
VXORPSZ256rr, VXORPDZ256rr, VPXORDZ256rr, VPXORQZ256rr,
VPSUBBZrr, VPSUBWZrr, VPSUBDZrr, VPSUBQZrr,
VPSUBBZ128rr, VPSUBWZ128rr, VPSUBDZ128rr, VPSUBQZ128rr,
VPSUBBZ256rr, VPSUBWZ256rr, VPSUBDZ256rr, VPSUBQZ256rr,
], ZeroIdiomPredicate>,
]>;

} // SchedModel
31 changes: 31 additions & 0 deletions llvm/lib/Target/X86/X86SchedSandyBridge.td
@@ -1232,4 +1232,35 @@ def SBSETA_SETBErm : SchedWriteVariant<[
def : InstRW<[SBSETA_SETBErr], (instrs SETCCr)>;
def : InstRW<[SBSETA_SETBErm], (instrs SETCCm)>;

///////////////////////////////////////////////////////////////////////////////
// Dependency breaking instructions.
///////////////////////////////////////////////////////////////////////////////

def : IsZeroIdiomFunction<[
// GPR Zero-idioms.
DepBreakingClass<[ SUB32rr, SUB64rr, XOR32rr, XOR64rr ], ZeroIdiomPredicate>,

// SSE Zero-idioms.
DepBreakingClass<[
// fp variants.
XORPSrr, XORPDrr,

// int variants.
PXORrr,
PSUBBrr, PSUBWrr, PSUBDrr, PSUBQrr,
PCMPGTBrr, PCMPGTDrr, PCMPGTQrr, PCMPGTWrr
], ZeroIdiomPredicate>,

// AVX Zero-idioms.
DepBreakingClass<[
// xmm fp variants.
VXORPSrr, VXORPDrr,

// xmm int variants.
VPXORrr,
VPSUBBrr, VPSUBWrr, VPSUBDrr, VPSUBQrr,
VPCMPGTBrr, VPCMPGTWrr, VPCMPGTDrr, VPCMPGTQrr,
], ZeroIdiomPredicate>,
]>;

} // SchedModel
36 changes: 36 additions & 0 deletions llvm/lib/Target/X86/X86SchedSkylakeClient.td
@@ -1903,4 +1903,40 @@ def SKLSETA_SETBErm : SchedWriteVariant<[
def : InstRW<[SKLSETA_SETBErr], (instrs SETCCr)>;
def : InstRW<[SKLSETA_SETBErm], (instrs SETCCm)>;

///////////////////////////////////////////////////////////////////////////////
// Dependency breaking instructions.
///////////////////////////////////////////////////////////////////////////////

def : IsZeroIdiomFunction<[
// GPR Zero-idioms.
DepBreakingClass<[ SUB32rr, SUB64rr, XOR32rr, XOR64rr ], ZeroIdiomPredicate>,

// SSE Zero-idioms.
DepBreakingClass<[
// fp variants.
XORPSrr, XORPDrr,

// int variants.
PXORrr,
PSUBBrr, PSUBWrr, PSUBDrr, PSUBQrr,
PCMPGTBrr, PCMPGTDrr, PCMPGTQrr, PCMPGTWrr
], ZeroIdiomPredicate>,

// AVX Zero-idioms.
DepBreakingClass<[
// xmm fp variants.
VXORPSrr, VXORPDrr,

// xmm int variants.
VPXORrr,
VPSUBBrr, VPSUBWrr, VPSUBDrr, VPSUBQrr,
VPCMPGTBrr, VPCMPGTWrr, VPCMPGTDrr, VPCMPGTQrr,

// ymm variants.
VXORPSYrr, VXORPDYrr, VPXORYrr,
VPSUBBYrr, VPSUBWYrr, VPSUBDYrr, VPSUBQYrr,
VPCMPGTBYrr, VPCMPGTWYrr, VPCMPGTDYrr, VPCMPGTQYrr
], ZeroIdiomPredicate>,
]>;

} // SchedModel
44 changes: 44 additions & 0 deletions llvm/lib/Target/X86/X86SchedSkylakeServer.td
@@ -2615,4 +2615,48 @@ def SKXSETA_SETBErm : SchedWriteVariant<[
def : InstRW<[SKXSETA_SETBErr], (instrs SETCCr)>;
def : InstRW<[SKXSETA_SETBErm], (instrs SETCCm)>;

///////////////////////////////////////////////////////////////////////////////
// Dependency breaking instructions.
///////////////////////////////////////////////////////////////////////////////

def : IsZeroIdiomFunction<[
// GPR Zero-idioms.
DepBreakingClass<[ SUB32rr, SUB64rr, XOR32rr, XOR64rr ], ZeroIdiomPredicate>,

// SSE Zero-idioms.
DepBreakingClass<[
// fp variants.
XORPSrr, XORPDrr,

// int variants.
PXORrr,
PSUBBrr, PSUBWrr, PSUBDrr, PSUBQrr,
PCMPGTBrr, PCMPGTDrr, PCMPGTQrr, PCMPGTWrr
], ZeroIdiomPredicate>,

// AVX Zero-idioms.
DepBreakingClass<[
// xmm fp variants.
VXORPSrr, VXORPDrr,

// xmm int variants.
VPXORrr,
VPSUBBrr, VPSUBWrr, VPSUBDrr, VPSUBQrr,
VPCMPGTBrr, VPCMPGTWrr, VPCMPGTDrr, VPCMPGTQrr,

// ymm variants.
VXORPSYrr, VXORPDYrr, VPXORYrr,
VPSUBBYrr, VPSUBWYrr, VPSUBDYrr, VPSUBQYrr,
VPCMPGTBYrr, VPCMPGTWYrr, VPCMPGTDYrr, VPCMPGTQYrr,

// zmm variants.
VXORPSZrr, VXORPDZrr, VPXORDZrr, VPXORQZrr,
VXORPSZ128rr, VXORPDZ128rr, VPXORDZ128rr, VPXORQZ128rr,
VXORPSZ256rr, VXORPDZ256rr, VPXORDZ256rr, VPXORQZ256rr,
VPSUBBZrr, VPSUBWZrr, VPSUBDZrr, VPSUBQZrr,
VPSUBBZ128rr, VPSUBWZ128rr, VPSUBDZ128rr, VPSUBQZ128rr,
VPSUBBZ256rr, VPSUBWZ256rr, VPSUBDZ256rr, VPSUBQZ256rr,
], ZeroIdiomPredicate>,
]>;

} // SchedModel
18 changes: 18 additions & 0 deletions llvm/lib/Target/X86/X86ScheduleSLM.td
@@ -482,4 +482,22 @@ def: InstRW<[SLMWriteResGroup1rm], (instrs MMX_PADDQrm, PADDQrm,
MMX_PSUBQrm, PSUBQrm,
PCMPEQQrm)>;

///////////////////////////////////////////////////////////////////////////////
// Dependency breaking instructions.
///////////////////////////////////////////////////////////////////////////////

def : IsZeroIdiomFunction<[
// GPR Zero-idioms.
DepBreakingClass<[ XOR32rr ], ZeroIdiomPredicate>,

// SSE Zero-idioms.
DepBreakingClass<[
// fp variants.
XORPSrr, XORPDrr,

// int variants.
PXORrr,
], ZeroIdiomPredicate>,
]>;

} // SchedModel
79 changes: 79 additions & 0 deletions llvm/lib/Target/X86/X86ScheduleZnver1.td
@@ -1543,4 +1543,83 @@ def : InstRW<[WriteMicrocoded], (instrs VZEROUPPER)>;
// VZEROALL.
def : InstRW<[WriteMicrocoded], (instrs VZEROALL)>;

///////////////////////////////////////////////////////////////////////////////
// Dependency breaking instructions.
///////////////////////////////////////////////////////////////////////////////

def : IsZeroIdiomFunction<[
// GPR Zero-idioms.
DepBreakingClass<[
SUB32rr, SUB64rr,
XOR32rr, XOR64rr
], ZeroIdiomPredicate>,

// MMX Zero-idioms.
DepBreakingClass<[
MMX_PXORrr, MMX_PANDNrr, MMX_PSUBBrr,
MMX_PSUBDrr, MMX_PSUBQrr, MMX_PSUBWrr,
MMX_PSUBSBrr, MMX_PSUBSWrr, MMX_PSUBUSBrr, MMX_PSUBUSWrr,
MMX_PCMPGTBrr, MMX_PCMPGTDrr, MMX_PCMPGTWrr
], ZeroIdiomPredicate>,

// SSE Zero-idioms.
DepBreakingClass<[
// fp variants.
XORPSrr, XORPDrr, ANDNPSrr, ANDNPDrr,

// int variants.
PXORrr, PANDNrr,
PSUBBrr, PSUBWrr, PSUBDrr, PSUBQrr,
PCMPGTBrr, PCMPGTDrr, PCMPGTQrr, PCMPGTWrr
], ZeroIdiomPredicate>,

// AVX XMM Zero-idioms.
DepBreakingClass<[
// fp variants.
VXORPSrr, VXORPDrr, VANDNPSrr, VANDNPDrr,

// int variants.
VPXORrr, VPANDNrr,
VPSUBBrr, VPSUBWrr, VPSUBDrr, VPSUBQrr,
VPCMPGTBrr, VPCMPGTWrr, VPCMPGTDrr, VPCMPGTQrr
], ZeroIdiomPredicate>,

// AVX YMM Zero-idioms.
DepBreakingClass<[
// fp variants
VXORPSYrr, VXORPDYrr, VANDNPSYrr, VANDNPDYrr,

// int variants
VPXORYrr, VPANDNYrr,
VPSUBBYrr, VPSUBWYrr, VPSUBDYrr, VPSUBQYrr,
VPCMPGTBYrr, VPCMPGTWYrr, VPCMPGTDYrr, VPCMPGTQYrr
], ZeroIdiomPredicate>
]>;

def : IsDepBreakingFunction<[
// GPR
DepBreakingClass<[ SBB32rr, SBB64rr ], ZeroIdiomPredicate>,
DepBreakingClass<[ CMP32rr, CMP64rr ], CheckSameRegOperand<0, 1> >,

// MMX
DepBreakingClass<[
MMX_PCMPEQBrr, MMX_PCMPEQWrr, MMX_PCMPEQDrr
], ZeroIdiomPredicate>,

// SSE
DepBreakingClass<[
PCMPEQBrr, PCMPEQWrr, PCMPEQDrr, PCMPEQQrr
], ZeroIdiomPredicate>,

// AVX XMM
DepBreakingClass<[
VPCMPEQBrr, VPCMPEQWrr, VPCMPEQDrr, VPCMPEQQrr
], ZeroIdiomPredicate>,

// AVX YMM
DepBreakingClass<[
VPCMPEQBYrr, VPCMPEQWYrr, VPCMPEQDYrr, VPCMPEQQYrr
], ZeroIdiomPredicate>,
]>;

} // SchedModel
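
The Znver1 hunk above separates zero idioms from other dependency-breaking instructions and uses `CheckSameRegOperand<0, 1>` for CMP. A rough Python sketch of that distinction follows. The operand layouts are assumptions: two-address instructions are modelled as (dst, src1, src2), CMP as (src1, src2) because it writes only flags, and `ZeroIdiomPredicate` is assumed to compare operands 1 and 2. The `RULES` table and `classify` helper are hypothetical names, not LLVM APIs.

```python
def same_reg_operand(operands, i, j):
    """Toy analogue of CheckSameRegOperand<i, j>."""
    return operands[i] == operands[j]


# opcode -> (operand indices to compare, whether the result is all zeros)
# Assumed layouts: XOR32rr / PCMPEQBrr are (dst, src1, src2); CMP32rr is
# (src1, src2) since it has no register destination.
RULES = {
    "XOR32rr": ((1, 2), True),
    "PCMPEQBrr": ((1, 2), False),  # result is all ones, not zero
    "CMP32rr": ((0, 1), False),    # result is flags only
}


def classify(opcode, operands):
    """Return 'zero-idiom', 'dep-breaking', or 'normal'."""
    rule = RULES.get(opcode)
    if rule is None:
        return "normal"
    (i, j), is_zero = rule
    if not same_reg_operand(operands, i, j):
        return "normal"
    return "zero-idiom" if is_zero else "dep-breaking"


xor_kind = classify("XOR32rr", ("eax", "eax", "eax"))
cmp_kind = classify("CMP32rr", ("eax", "eax"))
```

Under these assumptions `xor eax, eax` is a zero idiom, while `cmp eax, eax` only breaks the dependency on its inputs: its outcome is known without reading the register, but it does not zero anything.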
