[Intrinsic] Add the llvm.umul.fix.sat intrinsic
Summary:
Add an intrinsic that takes 2 unsigned integers with
the scale of them provided as the third argument and
performs fixed point multiplication on them. The
result is saturated and clamped between the largest and
smallest representable values of the first 2 operands.

This is a part of implementing fixed point arithmetic
in clang where some of the more complex operations
will be implemented as intrinsics.

Patch by: leonardchan, bjope

Reviewers: RKSimon, craig.topper, bevinh, leonardchan, lebedev.ri, spatel

Reviewed By: leonardchan

Subscribers: ychen, wuzish, nemanjai, MaskRay, jsji, jdoerfert, Ka-Ka, hiraditya, rjmccall, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57836

llvm-svn: 371308
bjope committed Sep 7, 2019
1 parent 314893c commit 5e331e4
Showing 22 changed files with 919 additions and 50 deletions.
67 changes: 67 additions & 0 deletions llvm/docs/LangRef.rst
@@ -13764,6 +13764,73 @@ Examples
%res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 1) ; %res = 4 (1 x 2 = 2)


'``llvm.umul.fix.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.umul.fix.sat``
on any integer bit width or vectors of integers.

::

declare i16 @llvm.umul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
declare i32 @llvm.umul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
declare i64 @llvm.umul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
declare <4 x i32> @llvm.umul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.umul.fix.sat``' family of intrinsic functions perform unsigned
fixed point saturation multiplication on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo unsigned fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point multiplication on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

The maximum value this operation can clamp to is the largest unsigned value
representable by the bit width of the first 2 arguments. The minimum value is the
smallest unsigned value representable by this bit width (zero).


Examples
"""""""""

.. code-block:: llvm

%res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 0) ; %res = 6 (2 x 3 = 6)
%res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 1) ; %res = 3 (1.5 x 1 = 1.5)

; The result in the following could be rounded down to 2 or up to 2.5
%res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 3, i32 1) ; %res = 4 (or 5) (1.5 x 1.5 = 2.25)

; Saturation
%res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 2, i32 0) ; %res = 15 (8 x 2 -> clamped to 15)
%res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 8, i32 2) ; %res = 15 (2 x 2 -> clamped to 3.75)

; Scale can affect the saturation result
%res = call i4 @llvm.umul.fix.sat.i4(i4 2, i4 4, i32 0) ; %res = 7 (2 x 4 -> clamped to 7)
%res = call i4 @llvm.umul.fix.sat.i4(i4 2, i4 4, i32 1) ; %res = 4 (1 x 2 = 2)


Specialised Arithmetic Intrinsics
---------------------------------
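The semantics documented in the LangRef hunk above can be summarised by a small reference model. The sketch below is not LLVM code: `umulFixSatModel` and its `Bits` parameter are hypothetical names introduced for illustration, it handles widths up to 32 bits only, and it always rounds toward zero, which is one of the two rounding directions the text allows.

```cpp
#include <cassert>
#include <cstdint>

// Reference model (an illustration, not LLVM's implementation): unsigned
// saturating fixed-point multiply for a `Bits`-wide integer type with
// `Scale` fractional bits. Assumes Bits <= 32 and rounds toward zero.
uint32_t umulFixSatModel(uint32_t A, uint32_t B, unsigned Bits, unsigned Scale) {
  assert(Bits >= 1 && Bits <= 32 && Scale <= Bits);
  const uint64_t Max = (uint64_t{1} << Bits) - 1; // largest unsigned value
  assert(A <= Max && B <= Max);
  // The full-precision product carries 2 * Scale fractional bits; dropping
  // Scale of them returns the result to the operands' scale.
  const uint64_t Product = (uint64_t{A} * uint64_t{B}) >> Scale;
  // Unsigned multiplication can only overflow upwards, so clamp to Max.
  return static_cast<uint32_t>(Product > Max ? Max : Product);
}
```

Under this model the documented i4 examples give 6, 3, 4, 15, 15 and 4. For `(i4 2, i4 4, i32 0)` the model returns 8 rather than the 7 shown in the example comment, because 8 fits in an unsigned i4. 7 is the signed i4 maximum, so that comment appears to have been carried over from the `llvm.smul.fix.sat` examples.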

2 changes: 1 addition & 1 deletion llvm/include/llvm/CodeGen/ISDOpcodes.h
@@ -281,7 +281,7 @@ namespace ISD {
/// Same as the corresponding unsaturated fixed point instructions, but the
/// result is clamped between the min and max values representable by the
/// bits of the first 2 operands.
SMULFIXSAT,
SMULFIXSAT, UMULFIXSAT,

/// Simple binary floating point operators.
FADD, FSUB, FMUL, FDIV, FREM,
5 changes: 3 additions & 2 deletions llvm/include/llvm/CodeGen/TargetLowering.h
@@ -923,6 +923,7 @@ class TargetLoweringBase {
case ISD::SMULFIX:
case ISD::SMULFIXSAT:
case ISD::UMULFIX:
case ISD::UMULFIXSAT:
Supported = isSupportedFixedPointOperation(Op, VT, Scale);
break;
}
@@ -4097,8 +4098,8 @@ class TargetLowering : public TargetLoweringBase {
/// method accepts integers as its arguments.
SDValue expandAddSubSat(SDNode *Node, SelectionDAG &DAG) const;

/// Method for building the DAG expansion of ISD::SMULFIX. This method accepts
/// integers as its arguments.
/// Method for building the DAG expansion of ISD::[U|S]MULFIX[SAT]. This
/// method accepts integers as its arguments.
SDValue expandFixedPointMul(SDNode *Node, SelectionDAG &DAG) const;

/// Method for building the DAG expansion of ISD::U(ADD|SUB)O. Expansion
3 changes: 3 additions & 0 deletions llvm/include/llvm/IR/Intrinsics.td
@@ -895,6 +895,9 @@ def int_umul_fix : Intrinsic<[llvm_anyint_ty],
def int_smul_fix_sat : Intrinsic<[llvm_anyint_ty],
[LLVMMatchType<0>, LLVMMatchType<0>, llvm_i32_ty],
[IntrNoMem, IntrSpeculatable, IntrWillReturn, Commutative, ImmArg<2>]>;
def int_umul_fix_sat : Intrinsic<[llvm_anyint_ty],
[LLVMMatchType<0>, LLVMMatchType<0>, llvm_i32_ty],
[IntrNoMem, IntrSpeculatable, IntrWillReturn, Commutative, ImmArg<2>]>;

//===------------------------- Memory Use Markers -------------------------===//
//
1 change: 1 addition & 0 deletions llvm/include/llvm/Target/TargetSelectionDAG.td
@@ -396,6 +396,7 @@ def usubsat : SDNode<"ISD::USUBSAT" , SDTIntBinOp>;
def smulfix : SDNode<"ISD::SMULFIX" , SDTIntScaledBinOp, [SDNPCommutative]>;
def smulfixsat : SDNode<"ISD::SMULFIXSAT", SDTIntScaledBinOp, [SDNPCommutative]>;
def umulfix : SDNode<"ISD::UMULFIX" , SDTIntScaledBinOp, [SDNPCommutative]>;
def umulfixsat : SDNode<"ISD::UMULFIXSAT", SDTIntScaledBinOp, [SDNPCommutative]>;

def sext_inreg : SDNode<"ISD::SIGN_EXTEND_INREG", SDTExtInreg>;
def sext_invec : SDNode<"ISD::SIGN_EXTEND_VECTOR_INREG", SDTExtInvec>;
2 changes: 2 additions & 0 deletions llvm/lib/Analysis/VectorUtils.cpp
@@ -56,6 +56,7 @@ bool llvm::isTriviallyVectorizable(Intrinsic::ID ID) {
case Intrinsic::smul_fix:
case Intrinsic::smul_fix_sat:
case Intrinsic::umul_fix:
case Intrinsic::umul_fix_sat:
case Intrinsic::sqrt: // Begin floating-point.
case Intrinsic::sin:
case Intrinsic::cos:
@@ -98,6 +99,7 @@ bool llvm::hasVectorInstrinsicScalarOpd(Intrinsic::ID ID,
case Intrinsic::smul_fix:
case Intrinsic::smul_fix_sat:
case Intrinsic::umul_fix:
case Intrinsic::umul_fix_sat:
return (ScalarOpdIdx == 2);
default:
return false;
6 changes: 4 additions & 2 deletions llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -1749,7 +1749,8 @@ SDValue DAGCombiner::visit(SDNode *N) {
case ISD::SUBCARRY: return visitSUBCARRY(N);
case ISD::SMULFIX:
case ISD::SMULFIXSAT:
case ISD::UMULFIX: return visitMULFIX(N);
case ISD::UMULFIX:
case ISD::UMULFIXSAT: return visitMULFIX(N);
case ISD::MUL: return visitMUL(N);
case ISD::SDIV: return visitSDIV(N);
case ISD::UDIV: return visitUDIV(N);
@@ -3519,7 +3520,8 @@ SDValue DAGCombiner::visitSUBCARRY(SDNode *N) {
return SDValue();
}

// Notice that "mulfix" can be any of SMULFIX, SMULFIXSAT and UMULFIX here.
// Notice that "mulfix" can be any of SMULFIX, SMULFIXSAT, UMULFIX and
// UMULFIXSAT here.
SDValue DAGCombiner::visitMULFIX(SDNode *N) {
SDValue N0 = N->getOperand(0);
SDValue N1 = N->getOperand(1);
4 changes: 3 additions & 1 deletion llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
@@ -1115,7 +1115,8 @@ void SelectionDAGLegalize::LegalizeOp(SDNode *Node) {
}
case ISD::SMULFIX:
case ISD::SMULFIXSAT:
case ISD::UMULFIX: {
case ISD::UMULFIX:
case ISD::UMULFIXSAT: {
unsigned Scale = Node->getConstantOperandVal(2);
Action = TLI.getFixedPointOperationAction(Node->getOpcode(),
Node->getValueType(0), Scale);
@@ -3353,6 +3354,7 @@ bool SelectionDAGLegalize::ExpandNode(SDNode *Node) {
case ISD::SMULFIX:
case ISD::SMULFIXSAT:
case ISD::UMULFIX:
case ISD::UMULFIXSAT:
Results.push_back(TLI.expandFixedPointMul(Node, DAG));
break;
case ISD::ADDCARRY:
84 changes: 66 additions & 18 deletions llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
@@ -150,9 +150,12 @@ void DAGTypeLegalizer::PromoteIntegerResult(SDNode *N, unsigned ResNo) {
case ISD::UADDSAT:
case ISD::SSUBSAT:
case ISD::USUBSAT: Res = PromoteIntRes_ADDSUBSAT(N); break;

case ISD::SMULFIX:
case ISD::SMULFIXSAT:
case ISD::UMULFIX: Res = PromoteIntRes_MULFIX(N); break;
case ISD::UMULFIX:
case ISD::UMULFIXSAT: Res = PromoteIntRes_MULFIX(N); break;

case ISD::ABS: Res = PromoteIntRes_ABS(N); break;

case ISD::ATOMIC_LOAD:
@@ -689,6 +692,8 @@ SDValue DAGTypeLegalizer::PromoteIntRes_MULFIX(SDNode *N) {
SDValue Op1Promoted, Op2Promoted;
bool Signed =
N->getOpcode() == ISD::SMULFIX || N->getOpcode() == ISD::SMULFIXSAT;
bool Saturating =
N->getOpcode() == ISD::SMULFIXSAT || N->getOpcode() == ISD::UMULFIXSAT;
if (Signed) {
Op1Promoted = SExtPromotedInteger(N->getOperand(0));
Op2Promoted = SExtPromotedInteger(N->getOperand(1));
@@ -701,7 +706,6 @@ SDValue DAGTypeLegalizer::PromoteIntRes_MULFIX(SDNode *N) {
unsigned DiffSize =
PromotedType.getScalarSizeInBits() - OldType.getScalarSizeInBits();

bool Saturating = N->getOpcode() == ISD::SMULFIXSAT;
if (Saturating) {
// Promoting the operand and result values changes the saturation width,
// which is extends the values that we clamp to on saturation. This could be
@@ -1164,7 +1168,8 @@ bool DAGTypeLegalizer::PromoteIntegerOperand(SDNode *N, unsigned OpNo) {

case ISD::SMULFIX:
case ISD::SMULFIXSAT:
case ISD::UMULFIX: Res = PromoteIntOp_MULFIX(N); break;
case ISD::UMULFIX:
case ISD::UMULFIXSAT: Res = PromoteIntOp_MULFIX(N); break;

case ISD::FPOWI: Res = PromoteIntOp_FPOWI(N); break;

@@ -1739,7 +1744,8 @@ void DAGTypeLegalizer::ExpandIntegerResult(SDNode *N, unsigned ResNo) {

case ISD::SMULFIX:
case ISD::SMULFIXSAT:
case ISD::UMULFIX: ExpandIntRes_MULFIX(N, Lo, Hi); break;
case ISD::UMULFIX:
case ISD::UMULFIXSAT: ExpandIntRes_MULFIX(N, Lo, Hi); break;

case ISD::VECREDUCE_ADD:
case ISD::VECREDUCE_MUL:
@@ -2810,7 +2816,8 @@ void DAGTypeLegalizer::ExpandIntRes_MULFIX(SDNode *N, SDValue &Lo,
SDValue LHS = N->getOperand(0);
SDValue RHS = N->getOperand(1);
uint64_t Scale = N->getConstantOperandVal(2);
bool Saturating = N->getOpcode() == ISD::SMULFIXSAT;
bool Saturating = (N->getOpcode() == ISD::SMULFIXSAT ||
N->getOpcode() == ISD::UMULFIXSAT);
bool Signed = (N->getOpcode() == ISD::SMULFIX ||
N->getOpcode() == ISD::SMULFIXSAT);

@@ -2821,23 +2828,35 @@ void DAGTypeLegalizer::ExpandIntRes_MULFIX(SDNode *N, SDValue &Lo,
Result = DAG.getNode(ISD::MUL, dl, VT, LHS, RHS);
} else {
EVT BoolVT = getSetCCResultType(VT);
Result = DAG.getNode(ISD::SMULO, dl, DAG.getVTList(VT, BoolVT), LHS, RHS);
unsigned MulOp = Signed ? ISD::SMULO : ISD::UMULO;
Result = DAG.getNode(MulOp, dl, DAG.getVTList(VT, BoolVT), LHS, RHS);
SDValue Product = Result.getValue(0);
SDValue Overflow = Result.getValue(1);
assert(Signed && "Unsigned saturation not supported (yet).");
APInt MinVal = APInt::getSignedMinValue(VTSize);
APInt MaxVal = APInt::getSignedMaxValue(VTSize);
SDValue SatMin = DAG.getConstant(MinVal, dl, VT);
SDValue SatMax = DAG.getConstant(MaxVal, dl, VT);
SDValue Zero = DAG.getConstant(0, dl, VT);
SDValue ProdNeg = DAG.getSetCC(dl, BoolVT, Product, Zero, ISD::SETLT);
Result = DAG.getSelect(dl, VT, ProdNeg, SatMax, SatMin);
Result = DAG.getSelect(dl, VT, Overflow, Result, Product);
if (Signed) {
APInt MinVal = APInt::getSignedMinValue(VTSize);
APInt MaxVal = APInt::getSignedMaxValue(VTSize);
SDValue SatMin = DAG.getConstant(MinVal, dl, VT);
SDValue SatMax = DAG.getConstant(MaxVal, dl, VT);
SDValue Zero = DAG.getConstant(0, dl, VT);
SDValue ProdNeg = DAG.getSetCC(dl, BoolVT, Product, Zero, ISD::SETLT);
Result = DAG.getSelect(dl, VT, ProdNeg, SatMax, SatMin);
Result = DAG.getSelect(dl, VT, Overflow, Result, Product);
} else {
// For unsigned multiplication, we only need to check the max since we
// can't really overflow towards zero.
APInt MaxVal = APInt::getMaxValue(VTSize);
SDValue SatMax = DAG.getConstant(MaxVal, dl, VT);
Result = DAG.getSelect(dl, VT, Overflow, SatMax, Product);
}
}
SplitInteger(Result, Lo, Hi);
return;
}

// For SMULFIX[SAT] we only expect to find Scale<VTSize, but this assert will
// cover for unhandled cases below, while still being valid for UMULFIX[SAT].
assert(Scale <= VTSize && "Scale can't be larger than the value type size.");

EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), VT);
SDValue LL, LH, RL, RH;
GetExpandedInteger(LHS, LL, LH);
@@ -2892,13 +2911,20 @@ void DAGTypeLegalizer::ExpandIntRes_MULFIX(SDNode *N, SDValue &Lo,
if (!Saturating)
return;

// Can not overflow when there is no integer part.
if (Scale == VTSize)
return;

// To handle saturation we must check for overflow in the multiplication.
//
// Unsigned overflow happened if the upper (VTSize - Scale) bits (of Result)
// aren't all zeroes.
//
// Signed overflow happened if the upper (VTSize - Scale + 1) bits (of Result)
// aren't all ones or all zeroes.
//
// We cannot overflow past HH when multiplying 2 ints of size VTSize, so the
// highest bit of HH determines saturation direction in the event of
// highest bit of HH determines saturation direction in the event of signed
// saturation.

SDValue ResultHL = Result[2];
@@ -2909,8 +2935,30 @@ void DAGTypeLegalizer::ExpandIntRes_MULFIX(SDNode *N, SDValue &Lo,
SDValue NVTNeg1 = DAG.getConstant(-1, dl, NVT);
EVT BoolNVT = getSetCCResultType(NVT);

if (!Signed)
llvm_unreachable("Unsigned saturation not supported (yet).");
if (!Signed) {
if (Scale < NVTSize) {
// Overflow happened if ((HH | (HL >> Scale)) != 0).
SDValue HLAdjusted = DAG.getNode(ISD::SRL, dl, NVT, ResultHL,
DAG.getConstant(Scale, dl, ShiftTy));
SDValue Tmp = DAG.getNode(ISD::OR, dl, NVT, HLAdjusted, ResultHH);
SatMax = DAG.getSetCC(dl, BoolNVT, Tmp, NVTZero, ISD::SETNE);
} else if (Scale == NVTSize) {
// Overflow happened if (HH != 0).
SatMax = DAG.getSetCC(dl, BoolNVT, ResultHH, NVTZero, ISD::SETNE);
} else if (Scale < VTSize) {
// Overflow happened if ((HH >> (Scale - NVTSize)) != 0).
SDValue HLAdjusted = DAG.getNode(ISD::SRL, dl, NVT, ResultHL,
DAG.getConstant(Scale - NVTSize, dl,
ShiftTy));
SatMax = DAG.getSetCC(dl, BoolNVT, HLAdjusted, NVTZero, ISD::SETNE);
} else
llvm_unreachable("Scale must be less or equal to VTSize for UMULFIXSAT"
"(and saturation can't happen with Scale==VTSize).");

Hi = DAG.getSelect(dl, NVT, SatMax, NVTNeg1, Hi);
Lo = DAG.getSelect(dl, NVT, SatMax, NVTNeg1, Lo);
return;
}

if (Scale < NVTSize) {
// The number of overflow bits we can check are VTSize - Scale + 1 (we
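The unsigned overflow test described in the comments of the hunk above (overflow happened if the upper VTSize - Scale bits of the double-wide product are not all zeroes) can be modelled on ordinary integers. This is only a sketch under stated assumptions: the 16-bit VT split into 8-bit NVT halves, and the names `umulFixOverflows` and `umulFixOverflowsDirect`, are hypothetical choices for illustration and not part of the patch. For the third case the sketch follows the hunk's comment and shifts HH, whereas the code as displayed passes `ResultHL` to that shift.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model sizes: VT is 16 bits wide and is split into two 8-bit
// NVT halves, so the double-wide product has four parts: LL, LH, HL, HH.
constexpr unsigned NVTSize = 8;
constexpr unsigned VTSize = 16;

// Returns true if (A * B) >> Scale does not fit in VTSize bits, inspecting
// only the two upper NVT-sized parts (HL and HH) of the product.
bool umulFixOverflows(uint16_t A, uint16_t B, unsigned Scale) {
  assert(Scale <= VTSize);
  const uint32_t Wide = uint32_t{A} * uint32_t{B};
  const uint8_t HL = static_cast<uint8_t>(Wide >> 16); // product bits 16..23
  const uint8_t HH = static_cast<uint8_t>(Wide >> 24); // product bits 24..31
  if (Scale < NVTSize)
    return (HH | (HL >> Scale)) != 0;      // bits above 16 + Scale
  if (Scale == NVTSize)
    return HH != 0;                        // exactly the HH part
  if (Scale < VTSize)
    return (HH >> (Scale - NVTSize)) != 0; // upper bits of HH only
  return false; // Scale == VTSize: no integer part, so no overflow
}

// Straightforward check used to cross-validate the case split above.
bool umulFixOverflowsDirect(uint16_t A, uint16_t B, unsigned Scale) {
  const uint64_t Wide = uint64_t{A} * uint64_t{B};
  return (Wide >> Scale) > 0xFFFFu;
}
```

When overflow is detected, the patch selects all-ones for both `Hi` and `Lo`, which together form the unsigned maximum of the original type.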
13 changes: 7 additions & 6 deletions llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
@@ -452,7 +452,8 @@ SDValue VectorLegalizer::LegalizeOp(SDValue Op) {
break;
case ISD::SMULFIX:
case ISD::SMULFIXSAT:
case ISD::UMULFIX: {
case ISD::UMULFIX:
case ISD::UMULFIXSAT: {
unsigned Scale = Node->getConstantOperandVal(2);
Action = TLI.getFixedPointOperationAction(Node->getOpcode(),
Node->getValueType(0), Scale);
@@ -834,11 +835,11 @@ SDValue VectorLegalizer::Expand(SDValue Op) {
case ISD::UMULFIX:
return ExpandFixedPointMul(Op);
case ISD::SMULFIXSAT:
// FIXME: We do not expand SMULFIXSAT here yet, not sure why. Maybe it
// results in worse codegen compared to the default unroll? This should
// probably be investigated. And if we still prefer to unroll an explanation
// could be helpful, otherwise it just looks like something that hasn't been
// "implemented" yet.
case ISD::UMULFIXSAT:
// FIXME: We do not expand SMULFIXSAT/UMULFIXSAT here yet, not sure exactly
// why. Maybe it results in worse codegen compared to the unroll for some
// targets? This should probably be investigated. And if we still prefer to
// unroll an explanation could be helpful.
return DAG.UnrollVectorOp(Op.getNode());
case ISD::STRICT_FADD:
case ISD::STRICT_FSUB:
3 changes: 3 additions & 0 deletions llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
@@ -187,6 +187,7 @@ void DAGTypeLegalizer::ScalarizeVectorResult(SDNode *N, unsigned ResNo) {
case ISD::SMULFIX:
case ISD::SMULFIXSAT:
case ISD::UMULFIX:
case ISD::UMULFIXSAT:
R = ScalarizeVecRes_MULFIX(N);
break;
}
@@ -1002,6 +1003,7 @@ void DAGTypeLegalizer::SplitVectorResult(SDNode *N, unsigned ResNo) {
case ISD::SMULFIX:
case ISD::SMULFIXSAT:
case ISD::UMULFIX:
case ISD::UMULFIXSAT:
SplitVecRes_MULFIX(N, Lo, Hi);
break;
}
@@ -2765,6 +2767,7 @@ void DAGTypeLegalizer::WidenVectorResult(SDNode *N, unsigned ResNo) {
case ISD::SMULFIX:
case ISD::SMULFIXSAT:
case ISD::UMULFIX:
case ISD::UMULFIXSAT:
// These are binary operations, but with an extra operand that shouldn't
// be widened (the scale).
Res = WidenVecRes_BinaryWithExtraScalarOp(N);