Move several vector intrinsics out of experimental namespace (#88748)
This patch moves the following intrinsics out of the experimental namespace:
* vector.interleave2/deinterleave2
* vector.reverse
* vector.splice

All of these intrinsics have existed in LLVM for more than a year and are
widely used, so they should no longer be considered experimental.
mgabka committed Apr 29, 2024
1 parent 16bd10a commit bfc0317
Showing 102 changed files with 2,642 additions and 2,544 deletions.
2 changes: 1 addition & 1 deletion clang/lib/CodeGen/CGExprScalar.cpp
@@ -2330,7 +2330,7 @@ Value *ScalarExprEmitter::VisitCastExpr(CastExpr *CE) {
}

// Perform VLAT <-> VLST bitcast through memory.
// TODO: since the llvm.experimental.vector.{insert,extract} intrinsics
// TODO: since the llvm.vector.{insert,extract} intrinsics
// require the element types of the vectors to be the same, we
// need to keep this around for bitcasts between VLAT <-> VLST where
// the element types of the vectors are not the same, until we figure
53 changes: 27 additions & 26 deletions llvm/docs/LangRef.rst
@@ -18805,7 +18805,7 @@ runtime, then the result vector is a :ref:`poison value <poisonvalues>`. The
``idx`` parameter must be a vector index constant type (for most targets this
will be an integer pointer type).

'``llvm.experimental.vector.reverse``' Intrinsic
'``llvm.vector.reverse``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
@@ -18814,25 +18814,26 @@ This is an overloaded intrinsic.

::

declare <2 x i8> @llvm.experimental.vector.reverse.v2i8(<2 x i8> %a)
declare <vscale x 4 x i32> @llvm.experimental.vector.reverse.nxv4i32(<vscale x 4 x i32> %a)
declare <2 x i8> @llvm.vector.reverse.v2i8(<2 x i8> %a)
declare <vscale x 4 x i32> @llvm.vector.reverse.nxv4i32(<vscale x 4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reverse.*``' intrinsics reverse a vector.
The '``llvm.vector.reverse.*``' intrinsics reverse a vector.
The intrinsic takes a single vector and returns a vector of matching type but
with the original lane order reversed. These intrinsics work for both fixed
and scalable vectors. While this intrinsic is marked as experimental the
recommended way to express reverse operations for fixed-width vectors is still
to use a shufflevector, as that may allow for more optimization opportunities.
and scalable vectors. While this intrinsic supports all vector types
the recommended way to express this operation for fixed-width vectors is
still to use a shufflevector, as that may allow for more optimization
opportunities.

Arguments:
""""""""""

The argument to this intrinsic must be a vector.

'``llvm.experimental.vector.deinterleave2``' Intrinsic
'``llvm.vector.deinterleave2``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
@@ -18841,13 +18842,13 @@ This is an overloaded intrinsic.

::

declare {<2 x double>, <2 x double>} @llvm.experimental.vector.deinterleave2.v4f64(<4 x double> %vec1)
declare {<vscale x 4 x i32>, <vscale x 4 x i32>} @llvm.experimental.vector.deinterleave2.nxv8i32(<vscale x 8 x i32> %vec1)
declare {<2 x double>, <2 x double>} @llvm.vector.deinterleave2.v4f64(<4 x double> %vec1)
declare {<vscale x 4 x i32>, <vscale x 4 x i32>} @llvm.vector.deinterleave2.nxv8i32(<vscale x 8 x i32> %vec1)

Overview:
"""""""""

The '``llvm.experimental.vector.deinterleave2``' intrinsic constructs two
The '``llvm.vector.deinterleave2``' intrinsic constructs two
vectors by deinterleaving the even and odd lanes of the input vector.

This intrinsic works for both fixed and scalable vectors. While this intrinsic
@@ -18859,15 +18860,15 @@ For example:

.. code-block:: text

{<2 x i64>, <2 x i64>} llvm.experimental.vector.deinterleave2.v4i64(<4 x i64> <i64 0, i64 1, i64 2, i64 3>); ==> {<2 x i64> <i64 0, i64 2>, <2 x i64> <i64 1, i64 3>}
{<2 x i64>, <2 x i64>} llvm.vector.deinterleave2.v4i64(<4 x i64> <i64 0, i64 1, i64 2, i64 3>); ==> {<2 x i64> <i64 0, i64 2>, <2 x i64> <i64 1, i64 3>}
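As a reading aid for the example above, the lane movement can be modelled on plain `std::vector` values. This is only a sketch of the documented semantics, not LLVM code: the function name `deinterleave2` and the use of `std::vector` are assumptions made for illustration, and scalable vectors are not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Reference model (hypothetical helper, not an LLVM API):
// even lanes go to the first result, odd lanes to the second.
template <typename T>
std::pair<std::vector<T>, std::vector<T>>
deinterleave2(const std::vector<T> &Vec) {
  assert(Vec.size() % 2 == 0 && "input must have an even number of lanes");
  std::vector<T> Even, Odd;
  Even.reserve(Vec.size() / 2);
  Odd.reserve(Vec.size() / 2);
  for (std::size_t I = 0; I < Vec.size(); I += 2) {
    Even.push_back(Vec[I]);     // lanes 0, 2, 4, ...
    Odd.push_back(Vec[I + 1]);  // lanes 1, 3, 5, ...
  }
  return {Even, Odd};
}
```

Calling the model on the input `<0, 1, 2, 3>` from the LangRef example yields the pair `<0, 2>` and `<1, 3>`.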

Arguments:
""""""""""

The argument is a vector whose type corresponds to the logical concatenation of
the two result types.

'``llvm.experimental.vector.interleave2``' Intrinsic
'``llvm.vector.interleave2``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
@@ -18876,13 +18877,13 @@ This is an overloaded intrinsic.

::

declare <4 x double> @llvm.experimental.vector.interleave2.v4f64(<2 x double> %vec1, <2 x double> %vec2)
declare <vscale x 8 x i32> @llvm.experimental.vector.interleave2.nxv8i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2)
declare <4 x double> @llvm.vector.interleave2.v4f64(<2 x double> %vec1, <2 x double> %vec2)
declare <vscale x 8 x i32> @llvm.vector.interleave2.nxv8i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2)

Overview:
"""""""""

The '``llvm.experimental.vector.interleave2``' intrinsic constructs a vector
The '``llvm.vector.interleave2``' intrinsic constructs a vector
by interleaving two input vectors.

This intrinsic works for both fixed and scalable vectors. While this intrinsic
@@ -18894,7 +18895,7 @@ For example:

.. code-block:: text

<4 x i64> llvm.experimental.vector.interleave2.v4i64(<2 x i64> <i64 0, i64 2>, <2 x i64> <i64 1, i64 3>); ==> <4 x i64> <i64 0, i64 1, i64 2, i64 3>
<4 x i64> llvm.vector.interleave2.v4i64(<2 x i64> <i64 0, i64 2>, <2 x i64> <i64 1, i64 3>); ==> <4 x i64> <i64 0, i64 1, i64 2, i64 3>
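The interleaving in the example above can be sketched the same way on `std::vector` values. The helper name `interleave2` and the container type are assumptions for illustration only; this models the documented lane order and says nothing about how LLVM lowers the intrinsic.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference model (hypothetical helper, not an LLVM API):
// result lane 2*i comes from A[i], result lane 2*i+1 from B[i].
template <typename T>
std::vector<T> interleave2(const std::vector<T> &A, const std::vector<T> &B) {
  assert(A.size() == B.size() && "both inputs must have the same lane count");
  std::vector<T> Out;
  Out.reserve(A.size() * 2);
  for (std::size_t I = 0; I < A.size(); ++I) {
    Out.push_back(A[I]);
    Out.push_back(B[I]);
  }
  return Out;
}
```

With the inputs `<0, 2>` and `<1, 3>` from the LangRef example, the model returns `<0, 1, 2, 3>`, which is twice as long as either operand.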

Arguments:
""""""""""
@@ -18940,7 +18941,7 @@ The '``llvm.experimental.cttz.elts``' intrinsic counts the trailing (least
significant) zero elements in a vector. If ``src == 0`` the result is the
number of elements in the input vector.

'``llvm.experimental.vector.splice``' Intrinsic
'``llvm.vector.splice``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
@@ -18949,13 +18950,13 @@ This is an overloaded intrinsic.

::

declare <2 x double> @llvm.experimental.vector.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm)
declare <vscale x 4 x i32> @llvm.experimental.vector.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm)
declare <2 x double> @llvm.vector.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm)
declare <vscale x 4 x i32> @llvm.vector.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm)

Overview:
"""""""""

The '``llvm.experimental.vector.splice.*``' intrinsics construct a vector by
The '``llvm.vector.splice.*``' intrinsics construct a vector by
concatenating elements from the first input vector with elements of the second
input vector, returning a vector of the same type as the input vectors. The
signed immediate, modulo the number of elements in the vector, is the index
@@ -18966,16 +18967,16 @@ immediate, it extracts ``-imm`` trailing elements from the first vector, and
the remaining elements from ``%vec2``.

These intrinsics work for both fixed and scalable vectors. While this intrinsic
is marked as experimental, the recommended way to express this operation for
supports all vector types the recommended way to express this operation for
fixed-width vectors is still to use a shufflevector, as that may allow for more
optimization opportunities.

For example:

.. code-block:: text

llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1); ==> <B, C, D, E> index
llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, -3); ==> <B, C, D, E> trailing elements
llvm.vector.splice(<A,B,C,D>, <E,F,G,H>, 1); ==> <B, C, D, E> index
llvm.vector.splice(<A,B,C,D>, <E,F,G,H>, -3); ==> <B, C, D, E> trailing elements
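The two splice examples above give the same result because a negative immediate counts from the end of the first vector. A minimal sketch of that indexing on fixed-width data follows. The helper name `splice`, the `std::vector` representation, and the asserted immediate range `-N <= Imm < N` are assumptions of this sketch rather than text taken from the diff.

```cpp
#include <cassert>
#include <vector>

// Reference model (hypothetical helper, not an LLVM API):
// take N consecutive lanes from the concatenation V1 ++ V2.
template <typename T>
std::vector<T> splice(const std::vector<T> &V1, const std::vector<T> &V2,
                      int Imm) {
  assert(V1.size() == V2.size() && "inputs must have the same lane count");
  const int N = static_cast<int>(V1.size());
  assert(Imm >= -N && Imm < N && "immediate range assumed by this sketch");

  std::vector<T> Concat(V1);
  Concat.insert(Concat.end(), V2.begin(), V2.end());

  // Non-negative Imm is the start index in V1; negative Imm keeps the
  // last -Imm lanes of V1, so the window starts at N + Imm.
  const int Start = Imm >= 0 ? Imm : N + Imm;
  return std::vector<T>(Concat.begin() + Start, Concat.begin() + Start + N);
}
```

For `<A,B,C,D>` and `<E,F,G,H>`, an immediate of 1 starts the window at index 1 and an immediate of -3 starts it at index 4 - 3 = 1, so both calls return `<B, C, D, E>`.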


Arguments:
@@ -22198,7 +22199,7 @@ Overview:
"""""""""

The '``llvm.experimental.vp.splice.*``' intrinsic is the vector length
predicated version of the '``llvm.experimental.vector.splice.*``' intrinsic.
predicated version of the '``llvm.vector.splice.*``' intrinsic.

Arguments:
""""""""""
@@ -22257,7 +22258,7 @@ Overview:
"""""""""

The '``llvm.experimental.vp.reverse.*``' intrinsic is the vector length
predicated version of the '``llvm.experimental.vector.reverse.*``' intrinsic.
predicated version of the '``llvm.vector.reverse.*``' intrinsic.

Arguments:
""""""""""
6 changes: 5 additions & 1 deletion llvm/docs/ReleaseNotes.rst
@@ -50,7 +50,11 @@ Update on required toolchains to build LLVM
Changes to the LLVM IR
----------------------

- Added Memory Model Relaxation Annotations (MMRAs).
* Added Memory Model Relaxation Annotations (MMRAs).
* Renamed ``llvm.experimental.vector.reverse`` intrinsic to ``llvm.vector.reverse``.
* Renamed ``llvm.experimental.vector.splice`` intrinsic to ``llvm.vector.splice``.
* Renamed ``llvm.experimental.vector.interleave2`` intrinsic to ``llvm.vector.interleave2``.
* Renamed ``llvm.experimental.vector.deinterleave2`` intrinsic to ``llvm.vector.deinterleave2``.
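For out-of-tree tooling that handles intrinsic names as strings, the four renames listed above amount to dropping the `experimental.` component. The sketch below shows one way to express that mapping. The function `renameVectorIntrinsic` is hypothetical and is not part of LLVM, and the sketch makes no claim about how LLVM itself handles old IR.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper, not an LLVM API: rewrites the pre-rename spelling of
// the four intrinsics named in the release note and leaves every other name
// (including other llvm.experimental.* intrinsics) unchanged.
std::string renameVectorIntrinsic(std::string_view Name) {
  constexpr std::string_view OldPrefix = "llvm.experimental.vector.";
  constexpr std::string_view NewPrefix = "llvm.vector.";
  constexpr std::array<std::string_view, 4> Renamed = {
      "reverse", "splice", "interleave2", "deinterleave2"};

  if (Name.substr(0, OldPrefix.size()) != OldPrefix)
    return std::string(Name);

  std::string_view Rest = Name.substr(OldPrefix.size());
  for (std::string_view Op : Renamed) {
    // Accept the bare operation name or the name followed by a type suffix.
    bool Matches = Rest.substr(0, Op.size()) == Op &&
                   (Rest.size() == Op.size() || Rest[Op.size()] == '.');
    if (Matches)
      return std::string(NewPrefix) + std::string(Rest);
  }
  return std::string(Name);
}
```

Restricting the rewrite to an explicit list keeps names such as `llvm.experimental.vp.splice`, which this patch does not rename, untouched.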

Changes to LLVM infrastructure
------------------------------
4 changes: 2 additions & 2 deletions llvm/include/llvm/CodeGen/BasicTTIImpl.h
@@ -1662,12 +1662,12 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
TTI::SK_InsertSubvector, cast<VectorType>(Args[0]->getType()),
std::nullopt, CostKind, Index, cast<VectorType>(Args[1]->getType()));
}
case Intrinsic::experimental_vector_reverse: {
case Intrinsic::vector_reverse: {
return thisT()->getShuffleCost(
TTI::SK_Reverse, cast<VectorType>(Args[0]->getType()), std::nullopt,
CostKind, 0, cast<VectorType>(RetTy));
}
case Intrinsic::experimental_vector_splice: {
case Intrinsic::vector_splice: {
unsigned Index = cast<ConstantInt>(Args[2])->getZExtValue();
return thisT()->getShuffleCost(
TTI::SK_Splice, cast<VectorType>(Args[0]->getType()), std::nullopt,
4 changes: 2 additions & 2 deletions llvm/include/llvm/CodeGen/GlobalISel/IRTranslator.h
@@ -247,8 +247,8 @@ class IRTranslator : public MachineFunctionPass {
bool translateTrap(const CallInst &U, MachineIRBuilder &MIRBuilder,
unsigned Opcode);

// Translate @llvm.experimental.vector.interleave2 and
// @llvm.experimental.vector.deinterleave2 intrinsics for fixed-width vector
// Translate @llvm.vector.interleave2 and
// @llvm.vector.deinterleave2 intrinsics for fixed-width vector
// types into vector shuffles.
bool translateVectorInterleave2Intrinsic(const CallInst &CI,
MachineIRBuilder &MIRBuilder);
4 changes: 2 additions & 2 deletions llvm/include/llvm/CodeGen/TargetLowering.h
@@ -3146,7 +3146,7 @@ class TargetLoweringBase {

/// Lower a deinterleave intrinsic to a target specific load intrinsic.
/// Return true on success. Currently only supports
/// llvm.experimental.vector.deinterleave2
/// llvm.vector.deinterleave2
///
/// \p DI is the deinterleave intrinsic.
/// \p LI is the accompanying load instruction
@@ -3157,7 +3157,7 @@ class TargetLoweringBase {

/// Lower an interleave intrinsic to a target specific store intrinsic.
/// Return true on success. Currently only supports
/// llvm.experimental.vector.interleave2
/// llvm.vector.interleave2
///
/// \p II is the interleave intrinsic.
/// \p SI is the accompanying store instruction
32 changes: 16 additions & 16 deletions llvm/include/llvm/IR/Intrinsics.td
@@ -2577,15 +2577,15 @@ def int_preserve_static_offset : DefaultAttrsIntrinsic<[llvm_ptr_ty],

//===------------ Intrinsics to perform common vector shuffles ------------===//

def int_experimental_vector_reverse : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
[LLVMMatchType<0>],
[IntrNoMem]>;
def int_vector_reverse : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
[LLVMMatchType<0>],
[IntrNoMem]>;

def int_experimental_vector_splice : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
[LLVMMatchType<0>,
LLVMMatchType<0>,
llvm_i32_ty],
[IntrNoMem, ImmArg<ArgIndex<2>>]>;
def int_vector_splice : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
[LLVMMatchType<0>,
LLVMMatchType<0>,
llvm_i32_ty],
[IntrNoMem, ImmArg<ArgIndex<2>>]>;

//===---------- Intrinsics to query properties of scalable vectors --------===//
def int_vscale : DefaultAttrsIntrinsic<[llvm_anyint_ty], [], [IntrNoMem]>;
@@ -2600,15 +2600,15 @@ def int_vector_extract : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
[IntrNoMem, IntrSpeculatable, ImmArg<ArgIndex<1>>]>;


def int_experimental_vector_interleave2 : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
[LLVMHalfElementsVectorType<0>,
LLVMHalfElementsVectorType<0>],
[IntrNoMem]>;
def int_vector_interleave2 : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
[LLVMHalfElementsVectorType<0>,
LLVMHalfElementsVectorType<0>],
[IntrNoMem]>;

def int_experimental_vector_deinterleave2 : DefaultAttrsIntrinsic<[LLVMHalfElementsVectorType<0>,
LLVMHalfElementsVectorType<0>],
[llvm_anyvector_ty],
[IntrNoMem]>;
def int_vector_deinterleave2 : DefaultAttrsIntrinsic<[LLVMHalfElementsVectorType<0>,
LLVMHalfElementsVectorType<0>],
[llvm_anyvector_ty],
[IntrNoMem]>;

//===----------------- Pointer Authentication Intrinsics ------------------===//
//
2 changes: 1 addition & 1 deletion llvm/include/llvm/IR/PatternMatch.h
@@ -2513,7 +2513,7 @@ inline typename m_Intrinsic_Ty<Opnd0, Opnd1>::Ty m_CopySign(const Opnd0 &Op0,

template <typename Opnd0>
inline typename m_Intrinsic_Ty<Opnd0>::Ty m_VecReverse(const Opnd0 &Op0) {
return m_Intrinsic<Intrinsic::experimental_vector_reverse>(Op0);
return m_Intrinsic<Intrinsic::vector_reverse>(Op0);
}

//===----------------------------------------------------------------------===//
6 changes: 3 additions & 3 deletions llvm/lib/Analysis/InstructionSimplify.cpp
@@ -6281,11 +6281,11 @@ static Value *simplifyUnaryIntrinsic(Function *F, Value *Op0,
m_Intrinsic<Intrinsic::pow>(m_SpecificFP(10.0), m_Value(X)))))
return X;
break;
case Intrinsic::experimental_vector_reverse:
// experimental.vector.reverse(experimental.vector.reverse(x)) -> x
case Intrinsic::vector_reverse:
// vector.reverse(vector.reverse(x)) -> x
if (match(Op0, m_VecReverse(m_Value(X))))
return X;
// experimental.vector.reverse(splat(X)) -> splat(X)
// vector.reverse(splat(X)) -> splat(X)
if (isSplatValue(Op0))
return Op0;
break;
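The two folds in the hunk above rest on simple lane-level identities: reversing twice restores the original order, and reversing a splat changes nothing. A small model on `std::vector` values can be used to check both. The helpers `reverseLanes` and `isSplat` are hypothetical stand-ins for illustration and are not the LLVM functions `m_VecReverse` or `isSplatValue`.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Reference model (hypothetical helper): returns the lanes in reverse order.
template <typename T>
std::vector<T> reverseLanes(const std::vector<T> &V) {
  return std::vector<T>(V.rbegin(), V.rend());
}

// Reference model (hypothetical helper): true when every lane holds the
// same value, i.e. no two adjacent lanes differ.
template <typename T>
bool isSplat(const std::vector<T> &V) {
  return std::adjacent_find(V.begin(), V.end(), std::not_equal_to<T>()) ==
         V.end();
}
```

In this model `reverseLanes(reverseLanes(X))` compares equal to `X` for any `X`, and `reverseLanes(S)` compares equal to `S` whenever `isSplat(S)` holds, which mirrors the two returns in the simplification.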
30 changes: 13 additions & 17 deletions llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp
@@ -1639,8 +1639,7 @@ bool ComplexDeinterleavingGraph::checkNodes() {
ComplexDeinterleavingGraph::NodePtr
ComplexDeinterleavingGraph::identifyRoot(Instruction *RootI) {
if (auto *Intrinsic = dyn_cast<IntrinsicInst>(RootI)) {
if (Intrinsic->getIntrinsicID() !=
Intrinsic::experimental_vector_interleave2)
if (Intrinsic->getIntrinsicID() != Intrinsic::vector_interleave2)
return nullptr;

auto *Real = dyn_cast<Instruction>(Intrinsic->getOperand(0));
@@ -1675,7 +1674,7 @@ ComplexDeinterleavingGraph::identifyDeinterleave(Instruction *Real,
Value *FinalValue = nullptr;
if (match(Real, m_ExtractValue<0>(m_Instruction(I))) &&
match(Imag, m_ExtractValue<1>(m_Specific(I))) &&
match(I, m_Intrinsic<Intrinsic::experimental_vector_deinterleave2>(
match(I, m_Intrinsic<Intrinsic::vector_deinterleave2>(
m_Value(FinalValue)))) {
NodePtr PlaceholderNode = prepareCompositeNode(
llvm::ComplexDeinterleavingOperation::Deinterleave, Real, Imag);
@@ -1960,13 +1959,11 @@ Value *ComplexDeinterleavingGraph::replaceNode(IRBuilderBase &Builder,
// Splats that are not constant are interleaved where they are located
Instruction *InsertPoint = (I->comesBefore(R) ? R : I)->getNextNode();
IRBuilder<> IRB(InsertPoint);
ReplacementNode =
IRB.CreateIntrinsic(Intrinsic::experimental_vector_interleave2, NewTy,
{Node->Real, Node->Imag});
ReplacementNode = IRB.CreateIntrinsic(Intrinsic::vector_interleave2,
NewTy, {Node->Real, Node->Imag});
} else {
ReplacementNode =
Builder.CreateIntrinsic(Intrinsic::experimental_vector_interleave2,
NewTy, {Node->Real, Node->Imag});
ReplacementNode = Builder.CreateIntrinsic(
Intrinsic::vector_interleave2, NewTy, {Node->Real, Node->Imag});
}
break;
}
@@ -1991,9 +1988,8 @@ Value *ComplexDeinterleavingGraph::replaceNode(IRBuilderBase &Builder,
auto *B = replaceNode(Builder, Node->Operands[1]);
auto *NewMaskTy = VectorType::getDoubleElementsVectorType(
cast<VectorType>(MaskReal->getType()));
auto *NewMask =
Builder.CreateIntrinsic(Intrinsic::experimental_vector_interleave2,
NewMaskTy, {MaskReal, MaskImag});
auto *NewMask = Builder.CreateIntrinsic(Intrinsic::vector_interleave2,
NewMaskTy, {MaskReal, MaskImag});
ReplacementNode = Builder.CreateSelect(NewMask, A, B);
break;
}
@@ -2021,8 +2017,8 @@ void ComplexDeinterleavingGraph::processReductionOperation(
Value *InitImag = OldPHIImag->getIncomingValueForBlock(Incoming);

IRBuilder<> Builder(Incoming->getTerminator());
auto *NewInit = Builder.CreateIntrinsic(
Intrinsic::experimental_vector_interleave2, NewVTy, {InitReal, InitImag});
auto *NewInit = Builder.CreateIntrinsic(Intrinsic::vector_interleave2, NewVTy,
{InitReal, InitImag});

NewPHI->addIncoming(NewInit, Incoming);
NewPHI->addIncoming(OperationReplacement, BackEdge);
@@ -2034,9 +2030,9 @@ void ComplexDeinterleavingGraph::processReductionOperation(

Builder.SetInsertPoint(
&*FinalReductionReal->getParent()->getFirstInsertionPt());
auto *Deinterleave = Builder.CreateIntrinsic(
Intrinsic::experimental_vector_deinterleave2,
OperationReplacement->getType(), OperationReplacement);
auto *Deinterleave = Builder.CreateIntrinsic(Intrinsic::vector_deinterleave2,
OperationReplacement->getType(),
OperationReplacement);

auto *NewReal = Builder.CreateExtractValue(Deinterleave, (uint64_t)0);
FinalReductionReal->replaceUsesOfWith(Real, NewReal);