[VP][RISCV] Add vp.maxnum and vp.minnum intrinsics and RISC-V support.
Add vp.maxnum and vp.minnum, which are the vector-predicated counterparts of
llvm.maxnum and llvm.minnum.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D134639
yetingk committed Sep 27, 2022
1 parent 20a80d6 commit 04e1301
Showing 10 changed files with 1,687 additions and 6 deletions.
101 changes: 101 additions & 0 deletions llvm/docs/LangRef.rst
@@ -14512,6 +14512,8 @@ Semantics:
This function returns the same values as the libm ``fabs`` functions
would, and handles error conditions in the same way.

.. _i_minnum:

'``llvm.minnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@@ -14562,6 +14564,7 @@ NaN, the intrinsic lowering is responsible for quieting the inputs to
correctly return the non-NaN input (e.g. by using the equivalent of
``llvm.canonicalize``).

.. _i_maxnum:

'``llvm.maxnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -18850,6 +18853,104 @@ Examples:
%also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison


.. _int_vp_minnum:

'``llvm.vp.minnum.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float> @llvm.vp.minnum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.minnum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double> @llvm.vp.minnum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point IEEE-754 minNum of two vectors of floating-point values.


Arguments:
""""""""""

The first two operands and the result have the same floating-point vector type.
The third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.minnum``' intrinsic performs a floating-point minimum (:ref:`minnum <i_minnum>`)
of the first and second vector operands on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
performed in the default floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

   %r = call <4 x float> @llvm.vp.minnum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
   ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

   %t = call <4 x float> @llvm.minnum.v4f32(<4 x float> %a, <4 x float> %b)
   %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison


.. _int_vp_maxnum:

'``llvm.vp.maxnum.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float> @llvm.vp.maxnum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.maxnum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double> @llvm.vp.maxnum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point IEEE-754 maxNum of two vectors of floating-point values.


Arguments:
""""""""""

The first two operands and the result have the same floating-point vector type.
The third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.maxnum``' intrinsic performs a floating-point maximum (:ref:`maxnum <i_maxnum>`)
of the first and second vector operands on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
performed in the default floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

   %r = call <4 x float> @llvm.vp.maxnum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
   ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

   %t = call <4 x float> @llvm.maxnum.v4f32(<4 x float> %a, <4 x float> %b)
   %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison


.. _int_vp_fadd:

'``llvm.vp.fadd.*``' Intrinsics
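
The semantics documented in the LangRef.rst hunk above can be sketched as a scalar reference model in C++. This is an illustration only, not LLVM code: `Lane`, `vpMinMaxNum`, `vpMinNum`, and `vpMaxNum` are hypothetical names, `std::nullopt` stands in for a poison lane, and `std::fmin`/`std::fmax` are assumed to approximate minNum/maxNum for quiet NaN inputs (they return the non-NaN operand when exactly one operand is NaN). The model also assumes `evl` does not exceed the lane count.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model type: an empty optional represents a poison lane.
using Lane = std::optional<float>;

// Shared helper: lane i is enabled only if mask[i] is set and i < evl.
static std::vector<Lane> vpMinMaxNum(const std::vector<float> &a,
                                     const std::vector<float> &b,
                                     const std::vector<bool> &mask,
                                     std::uint32_t evl, bool isMin) {
  assert(a.size() == b.size() && a.size() == mask.size());
  assert(evl <= a.size()); // assumption: evl is within the lane count
  std::vector<Lane> r(a.size()); // every lane starts out as poison
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (mask[i] && i < evl) {
      // fmin/fmax return the non-NaN operand if exactly one input is NaN.
      r[i] = isMin ? std::fmin(a[i], b[i]) : std::fmax(a[i], b[i]);
    }
  }
  return r;
}

// Models llvm.vp.minnum on a fixed-length float vector.
std::vector<Lane> vpMinNum(const std::vector<float> &a,
                           const std::vector<float> &b,
                           const std::vector<bool> &mask, std::uint32_t evl) {
  return vpMinMaxNum(a, b, mask, evl, /*isMin=*/true);
}

// Models llvm.vp.maxnum on a fixed-length float vector.
std::vector<Lane> vpMaxNum(const std::vector<float> &a,
                           const std::vector<float> &b,
                           const std::vector<bool> &mask, std::uint32_t evl) {
  return vpMinMaxNum(a, b, mask, evl, /*isMin=*/false);
}
```

Calling `vpMinNum` with a mask bit cleared, or with `evl` smaller than the lane count, leaves the affected lanes empty, which mirrors the `select ... poison` expansion in the LangRef examples.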
10 changes: 10 additions & 0 deletions llvm/include/llvm/IR/Intrinsics.td
@@ -1575,6 +1575,16 @@ let IntrProperties = [IntrNoMem, IntrNoSync, IntrWillReturn] in {
LLVMMatchType<0>,
LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
llvm_i32_ty]>;
def int_vp_minnum : DefaultAttrsIntrinsic<[ llvm_anyvector_ty ],
[ LLVMMatchType<0>,
LLVMMatchType<0>,
LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
llvm_i32_ty]>;
def int_vp_maxnum : DefaultAttrsIntrinsic<[ llvm_anyvector_ty ],
[ LLVMMatchType<0>,
LLVMMatchType<0>,
LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
llvm_i32_ty]>;

// Casts
def int_vp_trunc : DefaultAttrsIntrinsic<[ llvm_anyvector_ty ],
7 changes: 7 additions & 0 deletions llvm/include/llvm/IR/VPIntrinsics.def
@@ -248,6 +248,13 @@ BEGIN_REGISTER_VP(vp_fma, 3, 4, VP_FMA, -1)
VP_PROPERTY_CONSTRAINEDFP(1, 1, experimental_constrained_fma)
END_REGISTER_VP(vp_fma, VP_FMA)

// llvm.vp.minnum(x, y, mask,vlen)
BEGIN_REGISTER_VP(vp_minnum, 2, 3, VP_FMINNUM, -1)
END_REGISTER_VP(vp_minnum, VP_FMINNUM)

// llvm.vp.maxnum(x, y, mask,vlen)
BEGIN_REGISTER_VP(vp_maxnum, 2, 3, VP_FMAXNUM, -1)
END_REGISTER_VP(vp_maxnum, VP_FMAXNUM)
///// } Floating-Point Arithmetic

///// Type Casts {
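
The registrations above follow an X-macro pattern: a list of entries is expanded under different macro definitions to generate tables. The sketch below shows the general idea in self-contained C++. It is not LLVM's actual machinery: `SKETCH_VP_OPS`, `SKETCH_ROW`, `VPOpInfo`, and `lookupVPOp` are hypothetical names, and the sketch assumes the second and third arguments of `BEGIN_REGISTER_VP` are the operand positions of the mask and the explicit vector length, which matches the `(x, y, mask, vlen)` comments in the hunk.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a .def file. Each entry lists
// (name, mask operand position, EVL operand position).
#define SKETCH_VP_OPS(X) \
  X(vp_minnum, 2, 3)     \
  X(vp_maxnum, 2, 3)     \
  X(vp_fma, 3, 4)

struct VPOpInfo {
  const char *Name;
  int MaskPos;
  int EVLPos;
};

// Expand every list entry into one table row.
#define SKETCH_ROW(NAME, MASKPOS, EVLPOS) {#NAME, MASKPOS, EVLPOS},
static const VPOpInfo VPOpTable[] = {SKETCH_VP_OPS(SKETCH_ROW)};
#undef SKETCH_ROW

// Linear lookup by name; returns nullptr for unregistered operations.
const VPOpInfo *lookupVPOp(const char *Name) {
  assert(Name != nullptr);
  for (const VPOpInfo &Info : VPOpTable)
    if (std::strcmp(Info.Name, Name) == 0)
      return &Info;
  return nullptr;
}
```

With this table, `lookupVPOp("vp_minnum")` reports mask position 2 and EVL position 3, the same values the new registrations pass for the two-operand min/max intrinsics.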
8 changes: 4 additions & 4 deletions llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
@@ -1071,8 +1071,8 @@ void DAGTypeLegalizer::SplitVectorResult(SDNode *N, unsigned ResNo) {
case ISD::FADD: case ISD::VP_FADD:
case ISD::FSUB: case ISD::VP_FSUB:
case ISD::FMUL: case ISD::VP_FMUL:
- case ISD::FMINNUM:
- case ISD::FMAXNUM:
+ case ISD::FMINNUM: case ISD::VP_FMINNUM:
+ case ISD::FMAXNUM: case ISD::VP_FMAXNUM:
case ISD::FMINIMUM:
case ISD::FMAXIMUM:
case ISD::SDIV: case ISD::VP_SDIV:
@@ -3924,8 +3924,8 @@ void DAGTypeLegalizer::WidenVectorResult(SDNode *N, unsigned ResNo) {
case ISD::SHL: case ISD::VP_SHL:
case ISD::SRA: case ISD::VP_ASHR:
case ISD::SRL: case ISD::VP_LSHR:
- case ISD::FMINNUM:
- case ISD::FMAXNUM:
+ case ISD::FMINNUM: case ISD::VP_FMINNUM:
+ case ISD::FMAXNUM: case ISD::VP_FMAXNUM:
case ISD::FMINIMUM:
case ISD::FMAXIMUM:
case ISD::SMIN:
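
Routing the VP nodes through the split path means the explicit vector length has to be divided between the two halves as well as the data operands. The sketch below shows one way that arithmetic can work, assuming the low half takes the unsigned minimum of the EVL and the half width and the high half takes whatever remains. `splitEVL` and `SplitEVLResult` are hypothetical names for illustration, not the legalizer's actual API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical result type: the EVL to use for each half of a split vector.
struct SplitEVLResult {
  std::uint32_t Lo;
  std::uint32_t Hi;
};

// Divide an explicit vector length across two halves of HalfNumElts lanes.
// Lo = umin(EVL, HalfNumElts); Hi = saturating EVL - HalfNumElts.
SplitEVLResult splitEVL(std::uint32_t EVL, std::uint32_t HalfNumElts) {
  assert(EVL <= 2 * HalfNumElts); // assumption: EVL fits the full vector
  std::uint32_t Lo = std::min(EVL, HalfNumElts);
  std::uint32_t Hi = EVL - Lo; // cannot underflow because Lo <= EVL
  return {Lo, Hi};
}
```

For an 8-lane operation split into two 4-lane halves, an EVL of 5 enables four lanes in the low half and one lane in the high half, so the two halves together cover exactly the lanes the original operation would have processed.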
6 changes: 5 additions & 1 deletion llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -450,7 +450,7 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
ISD::VP_REDUCE_FMIN, ISD::VP_REDUCE_FMAX, ISD::VP_MERGE,
ISD::VP_SELECT, ISD::VP_SINT_TO_FP, ISD::VP_UINT_TO_FP,
ISD::VP_SETCC, ISD::VP_FP_ROUND, ISD::VP_FP_EXTEND,
- ISD::VP_SQRT};
+ ISD::VP_SQRT, ISD::VP_FMINNUM, ISD::VP_FMAXNUM};

static const unsigned IntegerVecReduceOps[] = {
ISD::VECREDUCE_ADD, ISD::VECREDUCE_AND, ISD::VECREDUCE_OR,
@@ -3856,6 +3856,10 @@ SDValue RISCVTargetLowering::LowerOperation(SDValue Op,
return lowerVPOp(Op, DAG, RISCVISD::FSQRT_VL);
case ISD::VP_FMA:
return lowerVPOp(Op, DAG, RISCVISD::VFMADD_VL);
case ISD::VP_FMINNUM:
return lowerVPOp(Op, DAG, RISCVISD::FMINNUM_VL, /*HasMergeOp*/ true);
case ISD::VP_FMAXNUM:
return lowerVPOp(Op, DAG, RISCVISD::FMAXNUM_VL, /*HasMergeOp*/ true);
case ISD::VP_SIGN_EXTEND:
case ISD::VP_ZERO_EXTEND:
if (Op.getOperand(0).getSimpleValueType().getVectorElementType() == MVT::i1)