Skip to content

Commit

Permalink
[SelectionDAG][x86] turn insertelement into undef with variable index…
Browse files Browse the repository at this point in the history
… into splat

I noticed this along with the patterns in D51125, but when the index is variable, 
we don't convert insertelement into a build_vector.

For x86, that means these get expanded at legalization time into the loading/spilling 
code that we see in the tests. I think it's always better to avoid going to memory on 
these, and we get the optimal 'broadcast' if it's available.

I suspect other targets may want to look at enabling the hook. AArch64 and AMDGPU have 
regression tests that would be affected (although I did not check what would happen in 
those cases). In the most basic cases shown here, AArch64 would probably do much 
better with a splat.

Differential Revision: https://reviews.llvm.org/D51186

llvm-svn: 340705
  • Loading branch information
rotateright committed Aug 26, 2018
1 parent 5fdb817 commit 113cac3
Show file tree
Hide file tree
Showing 5 changed files with 242 additions and 283 deletions.
6 changes: 6 additions & 0 deletions llvm/include/llvm/CodeGen/TargetLowering.h
Original file line number Diff line number Diff line change
Expand Up @@ -545,6 +545,12 @@ class TargetLoweringBase {
return false;
}

/// Return true if inserting a scalar into a variable element of an undef
/// vector is more efficiently handled by splatting the scalar instead.
virtual bool shouldSplatInsEltVarIndex(EVT) const {
return false;
}

/// Return true if target supports floating point exceptions.
bool hasFloatingPointExceptions() const {
return HasFloatingPointExceptions;
Expand Down
13 changes: 10 additions & 3 deletions llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -15045,12 +15045,19 @@ SDValue DAGCombiner::visitINSERT_VECTOR_ELT(SDNode *N) {
InVec == InVal.getOperand(0) && EltNo == InVal.getOperand(1))
return InVec;

// We must know which element is being inserted for folds below here.
auto *IndexC = dyn_cast<ConstantSDNode>(EltNo);
if (!IndexC)
if (!IndexC) {
// If this is variable insert to undef vector, it might be better to splat:
// inselt undef, InVal, EltNo --> build_vector < InVal, InVal, ... >
if (InVec.isUndef() && TLI.shouldSplatInsEltVarIndex(VT)) {
SmallVector<SDValue, 8> Ops(VT.getVectorNumElements(), InVal);
return DAG.getBuildVector(VT, DL, Ops);
}
return SDValue();
unsigned Elt = IndexC->getZExtValue();
}

// We must know which element is being inserted for folds below here.
unsigned Elt = IndexC->getZExtValue();
if (SDValue Shuf = combineInsertEltToShuffle(N, Elt))
return Shuf;

Expand Down
6 changes: 6 additions & 0 deletions llvm/lib/Target/X86/X86ISelLowering.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -4801,6 +4801,12 @@ bool X86TargetLowering::preferShiftsToClearExtremeBits(SDValue Y) const {
return true;
}

bool X86TargetLowering::shouldSplatInsEltVarIndex(EVT VT) const {
// Any legal vector type can be splatted more efficiently than
// loading/spilling from memory.
return isTypeLegal(VT);
}

MVT X86TargetLowering::hasFastEqualityCompare(unsigned NumBits) const {
MVT VT = MVT::getIntegerVT(NumBits);
if (isTypeLegal(VT))
Expand Down
2 changes: 2 additions & 0 deletions llvm/lib/Target/X86/X86ISelLowering.h
Original file line number Diff line number Diff line change
Expand Up @@ -833,6 +833,8 @@ namespace llvm {
return VTIsOk(XVT) && VTIsOk(KeptBitsVT);
}

bool shouldSplatInsEltVarIndex(EVT VT) const override;

bool convertSetCCLogicToBitwiseLogic(EVT VT) const override {
return VT.isScalarInteger();
}
Expand Down

0 comments on commit 113cac3

Please sign in to comment.