[VP][RISCV] Add vp.smax/smin/umax/umin intrinsics
Differential Revision: https://reviews.llvm.org/D135418
topperc committed Oct 8, 2022
1 parent aa8ab5b commit 9f67047
Showing 15 changed files with 10,851 additions and 10 deletions.
200 changes: 200 additions & 0 deletions llvm/docs/LangRef.rst
@@ -13641,6 +13641,8 @@ then the result is also ``INT_MIN`` if ``is_int_min_poison == 0`` and
``poison`` otherwise.


.. _int_smax:

'``llvm.smax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

@@ -13670,6 +13672,8 @@ integer element type. The argument types must match each other, and the return
type must match the argument type.


.. _int_smin:

'``llvm.smin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

@@ -13699,6 +13703,8 @@ integer element type. The argument types must match each other, and the return
type must match the argument type.


.. _int_umax:

'``llvm.umax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

@@ -13728,6 +13734,8 @@ integer element type. The argument types must match each other, and the return
type must match the argument type.


.. _int_umin:

'``llvm.umin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

@@ -18871,6 +18879,198 @@ Examples:
%also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_smax:

'``llvm.vp.smax.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

declare <16 x i32> @llvm.vp.smax.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
declare <vscale x 4 x i32> @llvm.vp.smax.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
declare <256 x i64> @llvm.vp.smax.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer signed maximum of two vectors of integers.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.smax``' intrinsic performs integer signed maximum (:ref:`smax <int_smax>`)
of the first and second vector operand on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

%r = call <4 x i32> @llvm.vp.smax.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

%t = call <4 x i32> @llvm.smax.v4i32(<4 x i32> %a, <4 x i32> %b)
%also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
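The lane-wise equivalence in the example above can be modeled with a small scalar C++ sketch. This is only an illustration under stated assumptions, not LLVM code: the function name ``vp_smax_model`` and the ``Lane`` alias are hypothetical, ``std::nullopt`` stands in for a poison lane, and the model does not cover an explicit vector length larger than the number of lanes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model type: std::nullopt stands in for a poison lane.
using Lane = std::optional<int32_t>;

// Scalar sketch of llvm.vp.smax on i32 lanes. A lane is "enabled" when its
// index is below evl and its mask bit is set; every other lane stays poison.
std::vector<Lane> vp_smax_model(const std::vector<int32_t>& a,
                                const std::vector<int32_t>& b,
                                const std::vector<bool>& mask,
                                uint32_t evl) {
  assert(a.size() == b.size() && a.size() == mask.size());
  std::vector<Lane> r(a.size());  // every lane starts out as poison
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (i < evl && mask[i])
      r[i] = std::max(a[i], b[i]);  // signed maximum on enabled lanes
  }
  return r;
}
```

With operands ``{1, -5, 7, 0}`` and ``{-2, 3, 9, 4}``, mask ``{1, 0, 1, 1}`` and an explicit vector length of 3, the model yields ``1`` and ``9`` in lanes 0 and 2 and leaves lanes 1 and 3 as poison.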


.. _int_vp_smin:

'``llvm.vp.smin.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

declare <16 x i32> @llvm.vp.smin.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
declare <vscale x 4 x i32> @llvm.vp.smin.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
declare <256 x i64> @llvm.vp.smin.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer signed minimum of two vectors of integers.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.smin``' intrinsic performs integer signed minimum (:ref:`smin <int_smin>`)
of the first and second vector operand on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

%r = call <4 x i32> @llvm.vp.smin.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

%t = call <4 x i32> @llvm.smin.v4i32(<4 x i32> %a, <4 x i32> %b)
%also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_umax:

'``llvm.vp.umax.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

declare <16 x i32> @llvm.vp.umax.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
declare <vscale x 4 x i32> @llvm.vp.umax.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
declare <256 x i64> @llvm.vp.umax.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer unsigned maximum of two vectors of integers.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.umax``' intrinsic performs integer unsigned maximum (:ref:`umax <int_umax>`)
of the first and second vector operand on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

%r = call <4 x i32> @llvm.vp.umax.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

%t = call <4 x i32> @llvm.umax.v4i32(<4 x i32> %a, <4 x i32> %b)
%also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
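The signed and unsigned variants differ only in how each lane's bit pattern is ordered. The following standalone C++ sketch illustrates this for a single 32-bit lane; the helper names ``smax32`` and ``umax32_bits`` are hypothetical and are not part of LLVM.

```cpp
#include <algorithm>
#include <cassert>  // only needed for the checks described in the usage note
#include <cstdint>

// Signed maximum of two 32-bit lanes (models llvm.smax on i32).
int32_t smax32(int32_t a, int32_t b) { return std::max(a, b); }

// Unsigned maximum of the same two bit patterns (models llvm.umax on i32).
// Converting int32_t to uint32_t is well-defined (modulo 2^32) and keeps
// the bit pattern unchanged.
uint32_t umax32_bits(int32_t a, int32_t b) {
  return std::max(static_cast<uint32_t>(a), static_cast<uint32_t>(b));
}
```

For the inputs ``-1`` and ``1``, ``smax32`` selects ``1``, whereas ``umax32_bits`` selects the all-ones pattern ``0xFFFFFFFF``, because ``-1`` is the largest value when its bits are read as unsigned.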


.. _int_vp_umin:

'``llvm.vp.umin.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

declare <16 x i32> @llvm.vp.umin.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
declare <vscale x 4 x i32> @llvm.vp.umin.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
declare <256 x i64> @llvm.vp.umin.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer unsigned minimum of two vectors of integers.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.umin``' intrinsic performs integer unsigned minimum (:ref:`umin <int_umin>`)
of the first and second vector operand on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`.

Examples:
"""""""""

.. code-block:: llvm

%r = call <4 x i32> @llvm.vp.umin.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

%t = call <4 x i32> @llvm.umin.v4i32(<4 x i32> %a, <4 x i32> %b)
%also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_copysign:

'``llvm.vp.copysign.*``' Intrinsics
20 changes: 20 additions & 0 deletions llvm/include/llvm/IR/Intrinsics.td
@@ -1530,6 +1530,26 @@ let IntrProperties = [IntrNoMem, IntrNoSync, IntrWillReturn] in {
LLVMMatchType<0>,
LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
llvm_i32_ty]>;
def int_vp_smin : DefaultAttrsIntrinsic<[ llvm_anyvector_ty ],
[ LLVMMatchType<0>,
LLVMMatchType<0>,
LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
llvm_i32_ty]>;
def int_vp_smax : DefaultAttrsIntrinsic<[ llvm_anyvector_ty ],
[ LLVMMatchType<0>,
LLVMMatchType<0>,
LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
llvm_i32_ty]>;
def int_vp_umin : DefaultAttrsIntrinsic<[ llvm_anyvector_ty ],
[ LLVMMatchType<0>,
LLVMMatchType<0>,
LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
llvm_i32_ty]>;
def int_vp_umax : DefaultAttrsIntrinsic<[ llvm_anyvector_ty ],
[ LLVMMatchType<0>,
LLVMMatchType<0>,
LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
llvm_i32_ty]>;

// Floating-point arithmetic
def int_vp_fadd : DefaultAttrsIntrinsic<[ llvm_anyvector_ty ],
15 changes: 15 additions & 0 deletions llvm/include/llvm/IR/VPIntrinsics.def
@@ -196,6 +196,21 @@ HELPER_REGISTER_BINARY_INT_VP(vp_xor, VP_XOR, Xor)

#undef HELPER_REGISTER_BINARY_INT_VP

// llvm.vp.smin(x,y,mask,vlen)
BEGIN_REGISTER_VP(vp_smin, 2, 3, VP_SMIN, -1)
END_REGISTER_VP(vp_smin, VP_SMIN)

// llvm.vp.smax(x,y,mask,vlen)
BEGIN_REGISTER_VP(vp_smax, 2, 3, VP_SMAX, -1)
END_REGISTER_VP(vp_smax, VP_SMAX)

// llvm.vp.umin(x,y,mask,vlen)
BEGIN_REGISTER_VP(vp_umin, 2, 3, VP_UMIN, -1)
END_REGISTER_VP(vp_umin, VP_UMIN)

// llvm.vp.umax(x,y,mask,vlen)
BEGIN_REGISTER_VP(vp_umax, 2, 3, VP_UMAX, -1)
END_REGISTER_VP(vp_umax, VP_UMAX)
///// } Integer Arithmetic

///// Floating-Point Arithmetic {
4 changes: 4 additions & 0 deletions llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
@@ -165,11 +165,15 @@ void DAGTypeLegalizer::PromoteIntegerResult(SDNode *N, unsigned ResNo) {
case ISD::VP_SUB:
case ISD::VP_MUL: Res = PromoteIntRes_SimpleIntBinOp(N); break;

case ISD::VP_SMIN:
case ISD::VP_SMAX:
case ISD::SDIV:
case ISD::SREM:
case ISD::VP_SDIV:
case ISD::VP_SREM: Res = PromoteIntRes_SExtIntBinOp(N); break;

case ISD::VP_UMIN:
case ISD::VP_UMAX:
case ISD::UDIV:
case ISD::UREM:
case ISD::VP_UDIV:
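
In the hunk above, ``VP_SMIN`` and ``VP_SMAX`` join the cases handled by ``PromoteIntRes_SExtIntBinOp``. A plausible reading is that a signed comparison only stays correct in a wider type if the operands are sign-extended first; the unsigned group, whose handler is cut off in this view, would presumably need zero-extension for the same reason. The standalone C++ sketch below illustrates the signed case for an 8-bit lane computed in 32 bits. The helper names are hypothetical and this is not LLVM code.

```cpp
#include <algorithm>
#include <cassert>  // only needed for the checks described in the usage note
#include <cstdint>

// Hypothetical model: compute an i8 signed max in a wider i32 register
// after sign-extending both operands.
int8_t smax_i8_via_sext(int8_t a, int8_t b) {
  int32_t wa = a;  // sign-extension preserves the signed order
  int32_t wb = b;
  int32_t wide = std::max(wa, wb);
  return static_cast<int8_t>(wide);  // result is always in the i8 range
}

// Same computation after zero-extending the operands instead. Negative
// inputs become large positive values, so the signed order is lost.
int8_t smax_i8_via_zext(int8_t a, int8_t b) {
  int32_t wa = static_cast<uint8_t>(a);
  int32_t wb = static_cast<uint8_t>(b);
  int32_t wide = std::max(wa, wb);
  // Truncate back to 8 bits; wraps to the two's complement value on
  // mainstream compilers (guaranteed since C++20).
  return static_cast<int8_t>(static_cast<uint8_t>(wide));
}
```

For the inputs ``-1`` and ``1``, the sign-extending version returns ``1``, while the zero-extending version returns ``-1``, which is the wrong signed maximum.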
16 changes: 8 additions & 8 deletions llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
@@ -1093,10 +1093,10 @@ void DAGTypeLegalizer::SplitVectorResult(SDNode *N, unsigned ResNo) {
case ISD::UREM: case ISD::VP_UREM:
case ISD::SREM: case ISD::VP_SREM:
case ISD::FREM: case ISD::VP_FREM:
-case ISD::SMIN:
-case ISD::SMAX:
-case ISD::UMIN:
-case ISD::UMAX:
+case ISD::SMIN: case ISD::VP_SMIN:
+case ISD::SMAX: case ISD::VP_SMAX:
+case ISD::UMIN: case ISD::VP_UMIN:
+case ISD::UMAX: case ISD::VP_UMAX:
case ISD::SADDSAT:
case ISD::UADDSAT:
case ISD::SSUBSAT:
@@ -3934,10 +3934,10 @@ void DAGTypeLegalizer::WidenVectorResult(SDNode *N, unsigned ResNo) {
case ISD::FMAXNUM: case ISD::VP_FMAXNUM:
case ISD::FMINIMUM:
case ISD::FMAXIMUM:
-case ISD::SMIN:
-case ISD::SMAX:
-case ISD::UMIN:
-case ISD::UMAX:
+case ISD::SMIN: case ISD::VP_SMIN:
+case ISD::SMAX: case ISD::VP_SMAX:
+case ISD::UMIN: case ISD::VP_UMIN:
+case ISD::UMAX: case ISD::VP_UMAX:
case ISD::UADDSAT:
case ISD::SADDSAT:
case ISD::USUBSAT:
11 changes: 10 additions & 1 deletion llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -454,7 +454,8 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
ISD::VP_REDUCE_SMIN, ISD::VP_REDUCE_UMAX, ISD::VP_REDUCE_UMIN,
ISD::VP_MERGE, ISD::VP_SELECT, ISD::VP_FP_TO_SINT,
ISD::VP_FP_TO_UINT, ISD::VP_SETCC, ISD::VP_SIGN_EXTEND,
-ISD::VP_ZERO_EXTEND, ISD::VP_TRUNCATE};
+ISD::VP_ZERO_EXTEND, ISD::VP_TRUNCATE, ISD::VP_SMIN,
+ISD::VP_SMAX, ISD::VP_UMIN, ISD::VP_UMAX};

static const unsigned FloatingPointVPOps[] = {
ISD::VP_FADD, ISD::VP_FSUB, ISD::VP_FMUL,
@@ -3979,6 +3980,14 @@ SDValue RISCVTargetLowering::LowerOperation(SDValue Op,
if (Op.getOperand(0).getSimpleValueType().getVectorElementType() == MVT::i1)
return lowerVPSetCCMaskOp(Op, DAG);
return lowerVPOp(Op, DAG, RISCVISD::SETCC_VL, /*HasMergeOp*/ true);
case ISD::VP_SMIN:
return lowerVPOp(Op, DAG, RISCVISD::SMIN_VL, /*HasMergeOp*/ true);
case ISD::VP_SMAX:
return lowerVPOp(Op, DAG, RISCVISD::SMAX_VL, /*HasMergeOp*/ true);
case ISD::VP_UMIN:
return lowerVPOp(Op, DAG, RISCVISD::UMIN_VL, /*HasMergeOp*/ true);
case ISD::VP_UMAX:
return lowerVPOp(Op, DAG, RISCVISD::UMAX_VL, /*HasMergeOp*/ true);
case ISD::EXPERIMENTAL_VP_STRIDED_LOAD:
return lowerVPStridedLoad(Op, DAG);
case ISD::EXPERIMENTAL_VP_STRIDED_STORE:
