From d9c4c312d81b4a2059d0ca2bb454c3452e52042e Mon Sep 17 00:00:00 2001
From: Zahira Ammarguellat
Date: Wed, 20 Mar 2024 12:14:09 -0700
Subject: [PATCH] [CLANG] Full support of complex multiplication and division.
 (#81514)

In clang there are two options, `-f[no]cx-limited-range` and
`-f[no]cx-fortran-rules`, that control the range of complex multiplication
and division. However, it is unclear how these options interact with one
another. For instance, what should happen when the user compiles with
`-fcx-fortran-rules -fno-cx-limited-range`, or with
`-fcx-limited-range -fno-cx-fortran-rules`?

This patch introduces a new option that resolves the ambiguity and gives the
user greater flexibility to control how the compiler performs multiplication
and division of complex floating-point values:

`-fcomplex-arithmetic=[full|improved|promoted|basic]`

`full`: Implementation of complex division and multiplication using a call to
runtime library functions (generally the case, but the BE might sometimes
replace the library call if it knows enough about the potential range of the
inputs). Overflow and non-finite values are handled by the library
implementation. For multiplication, overflow will occur in accordance with
normal floating-point rules. This is the default value.

`improved`: Implementation of complex division using Smith's algorithm at
source precision (see Smith, R. L., "Algorithm 116: Complex division",
Commun. ACM 5, 8 (1962)). This value offers improved handling of overflow in
intermediate calculations, but overflow may still occur. NaN and infinite
values are not handled in some cases.

`promoted`: Implementation of complex division using algebraic formulas at
higher precision. Overflow is handled. Non-finite values are handled in some
cases. If the target does not have native support for a higher precision data
type, the implementation falls back to Smith's algorithm for the complex
operation. Overflow may still occur in some cases. NaN and infinite values
are not handled.

`basic`: Implementation of complex division and multiplication using
algebraic formulas at source precision. No special handling to avoid
overflow. NaN and infinite values are not handled.

`-fcx-limited-range` will alias `-fcomplex-arithmetic=basic`.
`-fcx-fortran-rules` will alias `-fcomplex-arithmetic=improved`.
`-fno-cx-limited-range` and `-fno-cx-fortran-rules` will alias
`-fcomplex-arithmetic=full`.

Complex division and multiplication will be implemented as follows, depending
on the option used (a sketch of the `basic` and `improved` strategies follows
the table):

-fcomplex-arithmetic | div | mul
-- | -- | --
basic | algebraic form | algebraic form
improved | Smith's algorithm | algebraic form
full | libcall | libcall + NaN processing
promoted | algebraic form + higher precision | algebraic form
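As referenced above, here is a rough, hand-written C sketch of what the
`basic` and `improved` division strategies compute. It is an illustration
under the assumption of IEEE double components, not the code this patch
emits, and the helper names are hypothetical:

```c
#include <math.h>

/* `basic`: textbook algebraic form at source precision, no overflow guards.
   (a+bi)/(c+di) = ((ac+bd) + (bc-ad)i) / (c*c + d*d) */
static void div_basic(double a, double b, double c, double d,
                      double *re, double *im) {
  double denom = c * c + d * d; /* may overflow or underflow */
  *re = (a * c + b * d) / denom;
  *im = (b * c - a * d) / denom;
}

/* `improved`: Smith's Algorithm 116. Scaling by the ratio of the larger
   denominator component keeps the intermediate products in range. */
static void div_smith(double a, double b, double c, double d,
                      double *re, double *im) {
  if (fabs(c) >= fabs(d)) {
    double r = d / c, t = 1.0 / (c + d * r);
    *re = (a + b * r) * t;
    *im = (b - a * r) * t;
  } else {
    double r = c / d, t = 1.0 / (c * r + d);
    *re = (a * r + b) * t;
    *im = (b * r - a) * t;
  }
}
```

Neither variant inspects the results for NaN; that recovery step is what
distinguishes the `full` lowering.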
---
 clang/docs/UsersManual.rst                   |   45 +-
 .../clang/Basic/DiagnosticCommonKinds.td     |    5 +
 clang/include/clang/Basic/LangOptions.h      |   33 +-
 clang/include/clang/Driver/Options.td        |   37 +-
 clang/lib/CodeGen/CGExprComplex.cpp          |   83 +-
 clang/lib/Driver/ToolChains/Clang.cpp        |  173 +-
 clang/lib/Parse/ParsePragma.cpp              |   11 +-
 clang/test/CodeGen/X86/cx-complex-range.c    | 1425 +++++++
 clang/test/CodeGen/complex-math.c            |    2 +-
 clang/test/CodeGen/cx-complex-range.c        | 3519 ++++++++++++++++-
 clang/test/CodeGen/pragma-cx-limited-range.c |  224 +-
 clang/test/CodeGen/smiths-complex-div.c      |  104 +-
 clang/test/Driver/range.c                    |  169 +-
 13 files changed, 5518 insertions(+), 312 deletions(-)
 create mode 100644 clang/test/CodeGen/X86/cx-complex-range.c

diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index 7a63d720241a7a..129e75fc9a785a 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -1847,19 +1847,50 @@ floating point semantic models: precise (the default), strict, and fast.
   * ``16`` - Forces ``_Float16`` operations to be emitted without using excess
     precision arithmetic.
 
+.. option:: -fcomplex-arithmetic=<value>:
+
+   This option specifies the implementation for complex multiplication and
+   division.
+
+   Valid values are: ``basic``, ``improved``, ``full`` and ``promoted``.
+
+   * ``basic`` Implementation of complex division and multiplication using
+     algebraic formulas at source precision. No special handling to avoid
+     overflow. NaN and infinite values are not handled.
+   * ``improved`` Implementation of complex division using Smith's algorithm
+     at source precision (see Smith, R. L., "Algorithm 116: Complex division",
+     Commun. ACM 5, 8 (1962)). This value offers improved handling of overflow
+     in intermediate calculations, but overflow may occur. NaN and infinite
+     values are not handled in some cases.
+   * ``full`` Implementation of complex division and multiplication using a
+     call to runtime library functions (generally the case, but the BE might
+     sometimes replace the library call if it knows enough about the potential
+     range of the inputs). Overflow and non-finite values are handled by the
+     library implementation. For multiplication, overflow will occur in
+     accordance with normal floating-point rules. This is the default value.
+   * ``promoted`` Implementation of complex division using algebraic formulas
+     at higher precision. Overflow is handled. Non-finite values are handled
+     in some cases. If the target does not have native support for a higher
+     precision data type, the implementation falls back to Smith's algorithm
+     for the complex operation. Overflow may still occur in some cases. NaN
+     and infinite values are not handled.
+
 .. option:: -fcx-limited-range:
 
-   This option enables the naive mathematical formulas for complex division and
-   multiplication with no NaN checking of results. The default is
-   ``-fno-cx-limited-range``, but this option is enabled by the ``-ffast-math``
+   This option is an alias for ``-fcomplex-arithmetic=basic``. It enables the
+   naive mathematical formulas for complex division and multiplication with no
+   NaN checking of results. The default is ``-fno-cx-limited-range``, an alias
+   for ``-fcomplex-arithmetic=full``. This option is enabled by the
+   ``-ffast-math`` option.
 
 .. option:: -fcx-fortran-rules:
 
-   This option enables the naive mathematical formulas for complex
-   multiplication and enables application of Smith's algorithm for complex
-   division. See SMITH, R. L. Algorithm 116: Complex division. Commun.
-   ACM 5, 8 (1962). The default is ``-fno-cx-fortran-rules``.
+   This option is an alias for ``-fcomplex-arithmetic=improved``. It enables
+   the naive mathematical formulas for complex multiplication and the
+   application of Smith's algorithm for complex division (see Smith, R. L.,
+   "Algorithm 116: Complex division", Commun. ACM 5, 8 (1962)).
+   The default is ``-fno-cx-fortran-rules``, an alias for
+   ``-fcomplex-arithmetic=full``.
 
 .. _floating-point-environment:
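To make the ``promoted`` strategy documented above concrete, here is a
hand-written C sketch for a ``float _Complex`` quotient (illustrative only,
not the emitted IR; the helper name is hypothetical). The operands are
widened one precision level, the naive formula is evaluated there, and the
result is truncated back:

```c
/* `promoted` for float: (FLT_MAX)^2 * 2 is far below DBL_MAX, so the naive
   formula cannot overflow once evaluated in double. */
static void div_promoted_float(float a, float b, float c, float d,
                               float *re, float *im) {
  double ad = a, bd = b, cd = c, dd = d;      /* promote the operands */
  double denom = cd * cd + dd * dd;
  *re = (float)((ad * cd + bd * dd) / denom); /* truncate on the way out */
  *im = (float)((bd * cd - ad * dd) / denom);
}
```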
diff --git a/clang/include/clang/Basic/DiagnosticCommonKinds.td b/clang/include/clang/Basic/DiagnosticCommonKinds.td
index 43e132e5665850..a52bf62e24202c 100644
--- a/clang/include/clang/Basic/DiagnosticCommonKinds.td
+++ b/clang/include/clang/Basic/DiagnosticCommonKinds.td
@@ -45,6 +45,11 @@ def note_using : Note<"using">;
 def note_possibility : Note<"one possibility">;
 def note_also_found : Note<"also found">;
 
+def warn_next_larger_fp_type_same_size_than_fp : Warning<
+  "higher precision floating-point type has the same size as the "
+  "floating-point type">,
+  InGroup>;
+
 // Parse && Lex
 let CategoryName = "Lexical or Preprocessor Issue" in {
 
diff --git a/clang/include/clang/Basic/LangOptions.h b/clang/include/clang/Basic/LangOptions.h
index 08fc706e3cbf74..24b109e32cdd3e 100644
--- a/clang/include/clang/Basic/LangOptions.h
+++ b/clang/include/clang/Basic/LangOptions.h
@@ -396,7 +396,38 @@ class LangOptionsBase {
     IncompleteOnly = 3,
   };
 
-  enum ComplexRangeKind { CX_Full, CX_Limited, CX_Fortran, CX_None };
+  /// Controls the various implementations for complex multiplication and
+  /// division.
+  enum ComplexRangeKind {
+    /// Implementation of complex division and multiplication using a call to
+    /// runtime library functions (generally the case, but the BE might
+    /// sometimes replace the library call if it knows enough about the
+    /// potential range of the inputs). Overflow and non-finite values are
+    /// handled by the library implementation. This is the default value.
+    CX_Full,
+
+    /// Implementation of complex division offering improved handling of
+    /// overflow in intermediate calculations, with no special handling for
+    /// NaN and infinite values.
+    CX_Improved,
+
+    /// Implementation of complex division using algebraic formulas at
+    /// higher precision. Overflow is handled. Non-finite values are handled
+    /// in some cases. If the target hardware does not have native support
+    /// for a higher precision data type, an implementation with improved
+    /// guards against intermediate overflow is used instead, but overflow
+    /// and underflow may still occur in some cases. NaN and infinite values
+    /// are not handled.
+    CX_Promoted,
+
+    /// Implementation of complex division and multiplication using
+    /// algebraic formulas at source precision. No special handling to avoid
+    /// overflow. NaN and infinite values are not handled.
+    CX_Basic,
+
+    /// No range rule is enabled.
+    CX_None
+  };
 
   // Define simple language options (with no accessors).
 #define LANGOPT(Name, Bits, Default, Description) unsigned Name : Bits;
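The feasibility question behind ``CX_Promoted`` above (is the next wider type
actually wide enough?) reduces to an exponent comparison. The following
self-contained C program works through the same arithmetic the patch performs
with ``llvm::APFloat`` semantics; it is an illustration, not patch code, and
note that C's ``*_MAX_EXP`` macros are one larger than the IEEE maximum
exponent:

```c
#include <float.h>
#include <stdio.h>

/* Squaring the largest finite value doubles its exponent, and summing two
   such products adds one more bit: promotion is safe when
   2 * maxExp(small) + 1 <= maxExp(large). */
int main(void) {
  int fexp = FLT_MAX_EXP - 1;  /* 127 for IEEE binary32 */
  int dexp = DBL_MAX_EXP - 1;  /* 1023 for IEEE binary64 */
  int lexp = LDBL_MAX_EXP - 1; /* 16383 for x87 80-bit; 1023 where long
                                  double is just double (e.g. MSVC) */

  /* float -> double: 2*127 + 1 = 255 <= 1023, so promotion is safe. */
  printf("float  -> double:      %s\n", 2 * fexp + 1 <= dexp ? "ok" : "warn");
  /* double -> long double: 2*1023 + 1 = 2047; safe only where long double
     is genuinely wider, otherwise the warning fires and codegen falls back
     to the unpromoted division. */
  printf("double -> long double: %s\n", 2 * dexp + 1 <= lexp ? "ok" : "warn");
  return 0;
}
```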
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 9b5125ecfed8b3..4a954258ce40b6 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1046,30 +1046,29 @@ defm offload_uniform_block : BoolFOption<"offload-uniform-block",
   NegFlag, BothFlags<[], [ClangOption], " that kernels are launched with uniform block sizes (default true for CUDA/HIP and false otherwise)">>;
 
-def fcx_limited_range : Joined<["-"], "fcx-limited-range">,
-  Group<f_Group>, Visibility<[ClangOption, CC1Option]>,
-  HelpText<"Basic algebraic expansions of complex arithmetic operations "
-           "involving are enabled.">;
-
-def fno_cx_limited_range : Joined<["-"], "fno-cx-limited-range">,
-  Group<f_Group>, Visibility<[ClangOption, CC1Option]>,
-  HelpText<"Basic algebraic expansions of complex arithmetic operations "
-           "involving are disabled.">;
-
-def fcx_fortran_rules : Joined<["-"], "fcx-fortran-rules">,
-  Group<f_Group>, Visibility<[ClangOption, CC1Option]>,
-  HelpText<"Range reduction is enabled for complex arithmetic operations.">;
-
-def fno_cx_fortran_rules : Joined<["-"], "fno-cx-fortran-rules">,
-  Group<f_Group>, Visibility<[ClangOption, CC1Option]>,
-  HelpText<"Range reduction is disabled for complex arithmetic operations.">;
+def fcomplex_arithmetic_EQ : Joined<["-"], "fcomplex-arithmetic=">, Group<f_Group>,
+  Visibility<[ClangOption, CC1Option]>,
+  Values<"full,improved,promoted,basic">, NormalizedValuesScope<"LangOptions">,
+  NormalizedValues<["CX_Full", "CX_Improved", "CX_Promoted", "CX_Basic"]>;
 
 def complex_range_EQ : Joined<["-"], "complex-range=">, Group<f_Group>,
   Visibility<[CC1Option]>,
-  Values<"full,limited,fortran">, NormalizedValuesScope<"LangOptions">,
-  NormalizedValues<["CX_Full", "CX_Limited", "CX_Fortran"]>,
+  Values<"full,improved,promoted,basic">, NormalizedValuesScope<"LangOptions">,
+  NormalizedValues<["CX_Full", "CX_Improved", "CX_Promoted", "CX_Basic"]>,
   MarshallingInfoEnum<LangOpts<"ComplexRange">, "CX_Full">;
 
+defm cx_limited_range : BoolOptionWithoutMarshalling<"f", "cx-limited-range",
+  PosFlag,
+  NegFlag>;
+
+defm cx_fortran_rules : BoolOptionWithoutMarshalling<"f", "cx-fortran-rules",
+  PosFlag,
+  NegFlag>;
+
 // OpenCL-only Options
 def cl_opt_disable : Flag<["-"], "cl-opt-disable">, Group<opencl_Group>,
   Visibility<[ClangOption, CC1Option]>,
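The CGExprComplex.cpp diff below routes each ComplexRangeKind to a lowering
strategy. For orientation, the ``full`` multiplication path it preserves
(algebraic form first, library call only if both components come back NaN)
behaves roughly like this hand-written C sketch. ``__mulsc3`` is the
compiler-rt helper for ``float _Complex`` multiplication; the sketch is an
approximation of the control flow, not the emitted IR:

```c
#include <complex.h>
#include <math.h>

float _Complex __mulsc3(float a, float b, float c, float d); /* compiler-rt */

static float _Complex mul_full(float a, float b, float c, float d) {
  float re = a * c - b * d;
  float im = a * d + b * c;
  /* Only when both components are NaN is the slow library path taken; it
     recovers results involving infinities per C Annex G. */
  if (isnan(re) && isnan(im))
    return __mulsc3(a, b, c, d);
  return CMPLXF(re, im); /* C11; assumes <complex.h> provides CMPLXF */
}
```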
diff --git a/clang/lib/CodeGen/CGExprComplex.cpp b/clang/lib/CodeGen/CGExprComplex.cpp
index 0266ba934da62c..27ddaacc28f522 100644
--- a/clang/lib/CodeGen/CGExprComplex.cpp
+++ b/clang/lib/CodeGen/CGExprComplex.cpp
@@ -51,11 +51,12 @@ class ComplexExprEmitter
   CGBuilderTy &Builder;
   bool IgnoreReal;
   bool IgnoreImag;
-public:
-  ComplexExprEmitter(CodeGenFunction &cgf, bool ir=false, bool ii=false)
-    : CGF(cgf), Builder(CGF.Builder), IgnoreReal(ir), IgnoreImag(ii) {
-  }
+  bool FPHasBeenPromoted;
 
+public:
+  ComplexExprEmitter(CodeGenFunction &cgf, bool ir = false, bool ii = false)
+      : CGF(cgf), Builder(CGF.Builder), IgnoreReal(ir), IgnoreImag(ii),
+        FPHasBeenPromoted(false) {}
 
   //===--------------------------------------------------------------------===//
   //                               Utilities
   //===--------------------------------------------------------------------===//
@@ -287,9 +288,54 @@ class ComplexExprEmitter
   ComplexPairTy EmitComplexBinOpLibCall(StringRef LibCallName,
                                         const BinOpInfo &Op);
 
-  QualType getPromotionType(QualType Ty) {
+  QualType GetHigherPrecisionFPType(QualType ElementType) {
+    const auto *CurrentBT = dyn_cast<BuiltinType>(ElementType);
+    switch (CurrentBT->getKind()) {
+    case BuiltinType::Kind::Float16:
+      return CGF.getContext().FloatTy;
+    case BuiltinType::Kind::Float:
+    case BuiltinType::Kind::BFloat16:
+      return CGF.getContext().DoubleTy;
+    case BuiltinType::Kind::Double:
+      return CGF.getContext().LongDoubleTy;
+    default:
+      return ElementType;
+    }
+  }
+
+  QualType HigherPrecisionTypeForComplexArithmetic(QualType ElementType,
+                                                   bool IsDivOpCode) {
+    QualType HigherElementType = GetHigherPrecisionFPType(ElementType);
+    const llvm::fltSemantics &ElementTypeSemantics =
+        CGF.getContext().getFloatTypeSemantics(ElementType);
+    const llvm::fltSemantics &HigherElementTypeSemantics =
+        CGF.getContext().getFloatTypeSemantics(HigherElementType);
+    // Check that the promoted type can handle the intermediate values without
+    // overflowing. This can be interpreted as:
+    // (SmallerType.LargestFiniteVal * SmallerType.LargestFiniteVal) * 2 <=
+    //     LargerType.LargestFiniteVal.
+    // In terms of exponents: squaring SmallerType.LargestFiniteVal doubles
+    // its exponent and the factor of 2 adds one more, so the promotion is
+    // safe when 2 * SmallerType.MaxExponent + 1 <= LargerType.MaxExponent.
+    if (llvm::APFloat::semanticsMaxExponent(ElementTypeSemantics) * 2 + 1 <=
+        llvm::APFloat::semanticsMaxExponent(HigherElementTypeSemantics)) {
+      return CGF.getContext().getComplexType(HigherElementType);
+    } else {
+      FPHasBeenPromoted = true;
+      DiagnosticsEngine &Diags = CGF.CGM.getDiags();
+      Diags.Report(diag::warn_next_larger_fp_type_same_size_than_fp);
+      return CGF.getContext().getComplexType(ElementType);
+    }
+  }
+
+  QualType getPromotionType(QualType Ty, bool IsDivOpCode = false) {
     if (auto *CT = Ty->getAs<ComplexType>()) {
       QualType ElementType = CT->getElementType();
+      if (IsDivOpCode && ElementType->isFloatingType() &&
+          CGF.getLangOpts().getComplexRange() ==
+              LangOptions::ComplexRangeKind::CX_Promoted)
+        return HigherPrecisionTypeForComplexArithmetic(ElementType,
+                                                       IsDivOpCode);
       if (ElementType.UseExcessPrecision(CGF.getContext()))
         return CGF.getContext().getComplexType(CGF.getContext().FloatTy);
     }
@@ -300,11 +346,12 @@ class ComplexExprEmitter
 
 #define HANDLEBINOP(OP)                                                        \
   ComplexPairTy VisitBin##OP(const BinaryOperator *E) {                        \
-    QualType promotionTy = getPromotionType(E->getType());                     \
+    QualType promotionTy = getPromotionType(                                   \
+        E->getType(),                                                          \
+        (E->getOpcode() == BinaryOperatorKind::BO_Div) ? true : false);        \
     ComplexPairTy result = EmitBin##OP(EmitBinOps(E, promotionTy));            \
     if (!promotionTy.isNull())                                                 \
-      result =                                                                 \
-          CGF.EmitUnPromotedValue(result, E->getType());                       \
+      result = CGF.EmitUnPromotedValue(result, E->getType());                  \
     return result;                                                             \
   }
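Before the EmitBinMul/EmitBinDiv hunks that follow, a small self-contained C
program (illustrative, not part of the patch) shows why the naive quotient
formula needs these alternatives at all. Dividing z = 1e30 + 1e30i by itself
should give exactly 1 + 0i; with IEEE binary32 the naive denominator
overflows to infinity and the result degenerates to NaN, while Smith's scaled
form stays in range:

```c
#include <stdio.h>

int main(void) {
  float a = 1e30f, b = 1e30f, c = 1e30f, d = 1e30f;

  /* Naive (`basic`): c*c + d*d overflows float, giving inf/inf = NaN. */
  float denom = c * c + d * d;
  printf("basic:    %g + %gi\n", (a * c + b * d) / denom,
         (b * c - a * d) / denom);

  /* Smith (`improved`), |c| >= |d| branch: intermediates stay finite,
     so the exact answer 1 + 0i comes out. */
  float r = d / c, t = 1.0f / (c + d * r);
  printf("improved: %g + %gi\n", (a + b * r) * t, (b - a * r) * t);
  return 0;
}
```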
true : false); \ ComplexPairTy result = EmitBin##OP(EmitBinOps(E, promotionTy)); \ if (!promotionTy.isNull()) \ - result = \ - CGF.EmitUnPromotedValue(result, E->getType()); \ + result = CGF.EmitUnPromotedValue(result, E->getType()); \ return result; \ } @@ -794,8 +841,9 @@ ComplexPairTy ComplexExprEmitter::EmitBinMul(const BinOpInfo &Op) { ResR = Builder.CreateFSub(AC, BD, "mul_r"); ResI = Builder.CreateFAdd(AD, BC, "mul_i"); - if (Op.FPFeatures.getComplexRange() == LangOptions::CX_Limited || - Op.FPFeatures.getComplexRange() == LangOptions::CX_Fortran) + if (Op.FPFeatures.getComplexRange() == LangOptions::CX_Basic || + Op.FPFeatures.getComplexRange() == LangOptions::CX_Improved || + Op.FPFeatures.getComplexRange() == LangOptions::CX_Promoted) return ComplexPairTy(ResR, ResI); // Emit the test for the real part becoming NaN and create a branch to @@ -986,14 +1034,17 @@ ComplexPairTy ComplexExprEmitter::EmitBinDiv(const BinOpInfo &Op) { llvm::Value *OrigLHSi = LHSi; if (!LHSi) LHSi = llvm::Constant::getNullValue(RHSi->getType()); - if (Op.FPFeatures.getComplexRange() == LangOptions::CX_Fortran) + QualType ComplexElementTy = Op.Ty->castAs()->getElementType(); + if (Op.FPFeatures.getComplexRange() == LangOptions::CX_Improved || + (Op.FPFeatures.getComplexRange() == LangOptions::CX_Promoted && + FPHasBeenPromoted)) return EmitRangeReductionDiv(LHSr, LHSi, RHSr, RHSi); - else if (Op.FPFeatures.getComplexRange() == LangOptions::CX_Limited) + else if (Op.FPFeatures.getComplexRange() == LangOptions::CX_Basic || + Op.FPFeatures.getComplexRange() == LangOptions::CX_Promoted) return EmitAlgebraicDiv(LHSr, LHSi, RHSr, RHSi); - else if (!CGF.getLangOpts().FastMath || - // '-ffast-math' is used in the command line but followed by an - // '-fno-cx-limited-range'. - Op.FPFeatures.getComplexRange() == LangOptions::CX_Full) { + // '-ffast-math' is used in the command line but followed by an + // '-fno-cx-limited-range' or '-fcomplex-arithmetic=full'. 
+ else if (Op.FPFeatures.getComplexRange() == LangOptions::CX_Full) { LHSi = OrigLHSi; // If we have a complex operand on the RHS and FastMath is not allowed, we // delegate to a libcall to handle all of the complexities and minimize diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index bcdf2737bc7ae0..bc9cc8ce6cf5a5 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -2687,60 +2687,43 @@ static void CollectArgsForIntegratedAssembler(Compilation &C, } } -static StringRef EnumComplexRangeToStr(LangOptions::ComplexRangeKind Range, - StringRef Option) { +static std::string ComplexRangeKindToStr(LangOptions::ComplexRangeKind Range) { switch (Range) { - case LangOptions::ComplexRangeKind::CX_Limited: - return "-fcx-limited-range"; + case LangOptions::ComplexRangeKind::CX_Full: + return "full"; break; - case LangOptions::ComplexRangeKind::CX_Fortran: - return "-fcx-fortran-rules"; + case LangOptions::ComplexRangeKind::CX_Basic: + return "basic"; break; - default: - return Option; + case LangOptions::ComplexRangeKind::CX_Improved: + return "improved"; + break; + case LangOptions::ComplexRangeKind::CX_Promoted: + return "promoted"; break; + default: + return ""; } } -static void EmitComplexRangeDiag(const Driver &D, - LangOptions::ComplexRangeKind Range1, - LangOptions::ComplexRangeKind Range2, - StringRef Option = StringRef()) { - if (Range1 != Range2 && Range1 != LangOptions::ComplexRangeKind::CX_None) { - bool NegateFortranOption = false; - bool NegateLimitedOption = false; - if (!Option.empty()) { - NegateFortranOption = - Range1 == LangOptions::ComplexRangeKind::CX_Fortran && - Option == "-fno-cx-fortran-rules"; - NegateLimitedOption = - Range1 == LangOptions::ComplexRangeKind::CX_Limited && - Option == "-fno-cx-limited-range"; - } - if (Option.empty() || - (!Option.empty() && !NegateFortranOption && !NegateLimitedOption)) - D.Diag(clang::diag::warn_drv_overriding_option) - << EnumComplexRangeToStr(Range1, Option) - << EnumComplexRangeToStr(Range2, Option); +static std::string ComplexArithmeticStr(LangOptions::ComplexRangeKind Range) { + return (Range == LangOptions::ComplexRangeKind::CX_None) + ? "" + : "-fcomplex-arithmetic=" + ComplexRangeKindToStr(Range); +} + +static void EmitComplexRangeDiag(const Driver &D, std::string str1, + std::string str2) { + if ((str1.compare(str2) != 0) && !str2.empty() && !str1.empty()) { + D.Diag(clang::diag::warn_drv_overriding_option) << str1 << str2; } } static std::string RenderComplexRangeOption(LangOptions::ComplexRangeKind Range) { - std::string ComplexRangeStr = "-complex-range="; - switch (Range) { - case LangOptions::ComplexRangeKind::CX_Full: - ComplexRangeStr += "full"; - break; - case LangOptions::ComplexRangeKind::CX_Limited: - ComplexRangeStr += "limited"; - break; - case LangOptions::ComplexRangeKind::CX_Fortran: - ComplexRangeStr += "fortran"; - break; - default: - assert(0 && "Unexpected range option"); - } + std::string ComplexRangeStr = ComplexRangeKindToStr(Range); + if (!ComplexRangeStr.empty()) + return "-complex-range=" + ComplexRangeStr; return ComplexRangeStr; } @@ -2792,6 +2775,7 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, StringRef BFloat16ExcessPrecision = ""; LangOptions::ComplexRangeKind Range = LangOptions::ComplexRangeKind::CX_None; std::string ComplexRangeStr = ""; + std::string GccRangeComplexOption = ""; // Lambda to set fast-math options. 
This is also used by -ffp-model=fast auto applyFastMath = [&]() { @@ -2807,9 +2791,19 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, FPExceptionBehavior = ""; // If fast-math is set then set the fp-contract mode to fast. FPContract = "fast"; - // ffast-math enables limited range rules for complex multiplication and + // ffast-math enables basic range rules for complex multiplication and // division. - Range = LangOptions::ComplexRangeKind::CX_Limited; + // Warn if user expects to perform full implementation of complex + // multiplication or division in the presence of nan or ninf flags. + if (Range == LangOptions::ComplexRangeKind::CX_Full || + Range == LangOptions::ComplexRangeKind::CX_Improved || + Range == LangOptions::ComplexRangeKind::CX_Promoted) + EmitComplexRangeDiag( + D, ComplexArithmeticStr(Range), + !GccRangeComplexOption.empty() + ? GccRangeComplexOption + : ComplexArithmeticStr(LangOptions::ComplexRangeKind::CX_Basic)); + Range = LangOptions::ComplexRangeKind::CX_Basic; SeenUnsafeMathModeOption = true; }; @@ -2824,26 +2818,87 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, switch (optID) { default: break; - case options::OPT_fcx_limited_range: { - EmitComplexRangeDiag(D, Range, LangOptions::ComplexRangeKind::CX_Limited); - Range = LangOptions::ComplexRangeKind::CX_Limited; + case options::OPT_fcx_limited_range: + if (GccRangeComplexOption.empty()) { + if (Range != LangOptions::ComplexRangeKind::CX_Basic) + EmitComplexRangeDiag(D, RenderComplexRangeOption(Range), + "-fcx-limited-range"); + } else { + if (GccRangeComplexOption != "-fno-cx-limited-range") + EmitComplexRangeDiag(D, GccRangeComplexOption, "-fcx-limited-range"); + } + GccRangeComplexOption = "-fcx-limited-range"; + Range = LangOptions::ComplexRangeKind::CX_Basic; break; - } case options::OPT_fno_cx_limited_range: - EmitComplexRangeDiag(D, Range, LangOptions::ComplexRangeKind::CX_Full, - "-fno-cx-limited-range"); + if (GccRangeComplexOption.empty()) { + EmitComplexRangeDiag(D, RenderComplexRangeOption(Range), + "-fno-cx-limited-range"); + } else { + if (GccRangeComplexOption.compare("-fcx-limited-range") != 0 && + GccRangeComplexOption.compare("-fno-cx-fortran-rules") != 0) + EmitComplexRangeDiag(D, GccRangeComplexOption, + "-fno-cx-limited-range"); + } + GccRangeComplexOption = "-fno-cx-limited-range"; Range = LangOptions::ComplexRangeKind::CX_Full; break; - case options::OPT_fcx_fortran_rules: { - EmitComplexRangeDiag(D, Range, LangOptions::ComplexRangeKind::CX_Fortran); - Range = LangOptions::ComplexRangeKind::CX_Fortran; + case options::OPT_fcx_fortran_rules: + if (GccRangeComplexOption.empty()) + EmitComplexRangeDiag(D, RenderComplexRangeOption(Range), + "-fcx-fortran-rules"); + else + EmitComplexRangeDiag(D, GccRangeComplexOption, "-fcx-fortran-rules"); + GccRangeComplexOption = "-fcx-fortran-rules"; + Range = LangOptions::ComplexRangeKind::CX_Improved; break; - } case options::OPT_fno_cx_fortran_rules: - EmitComplexRangeDiag(D, Range, LangOptions::ComplexRangeKind::CX_Full, - "-fno-cx-fortran-rules"); + if (GccRangeComplexOption.empty()) { + EmitComplexRangeDiag(D, RenderComplexRangeOption(Range), + "-fno-cx-fortran-rules"); + } else { + if (GccRangeComplexOption != "-fno-cx-limited-range") + EmitComplexRangeDiag(D, GccRangeComplexOption, + "-fno-cx-fortran-rules"); + } + GccRangeComplexOption = "-fno-cx-fortran-rules"; Range = LangOptions::ComplexRangeKind::CX_Full; break; + case options::OPT_fcomplex_arithmetic_EQ: { + 
LangOptions::ComplexRangeKind RangeVal; + StringRef Val = A->getValue(); + if (Val.equals("full")) + RangeVal = LangOptions::ComplexRangeKind::CX_Full; + else if (Val.equals("improved")) + RangeVal = LangOptions::ComplexRangeKind::CX_Improved; + else if (Val.equals("promoted")) + RangeVal = LangOptions::ComplexRangeKind::CX_Promoted; + else if (Val.equals("basic")) + RangeVal = LangOptions::ComplexRangeKind::CX_Basic; + else { + D.Diag(diag::err_drv_unsupported_option_argument) + << A->getSpelling() << Val; + break; + } + if (!GccRangeComplexOption.empty()) { + if (GccRangeComplexOption.compare("-fcx-limited-range") != 0) { + if (GccRangeComplexOption.compare("-fcx-fortran-rules") != 0) { + if (RangeVal != LangOptions::ComplexRangeKind::CX_Improved) + EmitComplexRangeDiag(D, GccRangeComplexOption, + ComplexArithmeticStr(RangeVal)); + } else { + EmitComplexRangeDiag(D, GccRangeComplexOption, + ComplexArithmeticStr(RangeVal)); + } + } else { + if (RangeVal != LangOptions::ComplexRangeKind::CX_Basic) + EmitComplexRangeDiag(D, GccRangeComplexOption, + ComplexArithmeticStr(RangeVal)); + } + } + Range = RangeVal; + break; + } case options::OPT_ffp_model_EQ: { // If -ffp-model= is seen, reset to fno-fast-math HonorINFs = true; @@ -3256,8 +3311,12 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, if (Range != LangOptions::ComplexRangeKind::CX_None) ComplexRangeStr = RenderComplexRangeOption(Range); - if (!ComplexRangeStr.empty()) + if (!ComplexRangeStr.empty()) { CmdArgs.push_back(Args.MakeArgString(ComplexRangeStr)); + if (Args.hasArg(options::OPT_fcomplex_arithmetic_EQ)) + CmdArgs.push_back(Args.MakeArgString("-fcomplex-arithmetic=" + + ComplexRangeKindToStr(Range))); + } if (Args.hasArg(options::OPT_fcx_limited_range)) CmdArgs.push_back("-fcx-limited-range"); if (Args.hasArg(options::OPT_fcx_fortran_rules)) diff --git a/clang/lib/Parse/ParsePragma.cpp b/clang/lib/Parse/ParsePragma.cpp index 730ac1a0fee5cc..0f692e2146a490 100644 --- a/clang/lib/Parse/ParsePragma.cpp +++ b/clang/lib/Parse/ParsePragma.cpp @@ -844,6 +844,11 @@ void Parser::HandlePragmaFPContract() { FPC = LangOptions::FPM_Off; break; case tok::OOS_DEFAULT: + // According to ISO C99 standard chapter 7.3.4, the default value + // for the pragma is ``off'. '-fcomplex-arithmetic=basic', + // '-fcx-limited-range', '-fcx-fortran-rules' and + // '-fcomplex-arithmetic=improved' control the default value of these + // pragmas. FPC = getLangOpts().getDefaultFPContractMode(); break; } @@ -909,15 +914,15 @@ void Parser::HandlePragmaCXLimitedRange() { LangOptions::ComplexRangeKind Range; switch (OOS) { case tok::OOS_ON: - Range = LangOptions::CX_Limited; + Range = LangOptions::CX_Basic; break; case tok::OOS_OFF: Range = LangOptions::CX_Full; break; case tok::OOS_DEFAULT: // According to ISO C99 standard chapter 7.3.4, the default value - // for the pragma is ``off'. -fcx-limited-range and -fcx-fortran-rules - // control the default value of these pragmas. + // for the pragma is ``off'. -fcomplex-arithmetic controls the default value + // of these pragmas. 
Range = getLangOpts().getComplexRange(); break; } diff --git a/clang/test/CodeGen/X86/cx-complex-range.c b/clang/test/CodeGen/X86/cx-complex-range.c new file mode 100644 index 00000000000000..fa46576266a205 --- /dev/null +++ b/clang/test/CodeGen/X86/cx-complex-range.c @@ -0,0 +1,1425 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 4 +// RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \ +// RUN: -o - | FileCheck %s --check-prefix=FULL + +// RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \ +// RUN: -complex-range=basic -o - | FileCheck %s --check-prefix=BASIC + +// RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \ +// RUN: -fno-cx-limited-range -o - | FileCheck %s --check-prefix=FULL + +// RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \ +// RUN: -complex-range=improved -o - | FileCheck %s --check-prefix=IMPRVD + +// RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \ +// RUN: -complex-range=promoted -o - | FileCheck %s --check-prefix=PRMTD + +// RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \ +// RUN: -complex-range=full -o - | FileCheck %s --check-prefix=FULL + +// RUN: %clang_cc1 -triple x86_64-windows-pc -complex-range=promoted \ +// RUN: -emit-llvm -o - %s | FileCheck %s --check-prefix=X86WINPRMTD + +// Fast math +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu \ +// RUN: -ffast-math -complex-range=basic -emit-llvm -o - %s \ +// RUN: | FileCheck %s --check-prefix=BASIC_FAST + +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu \ +// RUN: -ffast-math -complex-range=full -emit-llvm -o - %s \ +// RUN: | FileCheck %s --check-prefix=FULL_FAST + +// RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \ +// RUN: -fno-cx-fortran-rules -o - | FileCheck %s --check-prefix=FULL + +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu \ +// RUN: -ffast-math -complex-range=improved -emit-llvm -o - %s \ +// RUN: | FileCheck %s --check-prefix=IMPRVD_FAST + +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu \ +// RUN: -ffast-math -complex-range=promoted -emit-llvm -o - %s \ +// RUN: | FileCheck %s --check-prefix=PRMTD_FAST + +// FULL-LABEL: define dso_local <2 x half> @divf16( +// FULL-SAME: <2 x half> noundef [[A_COERCE:%.*]], <2 x half> noundef [[B_COERCE:%.*]]) #[[ATTR0:[0-9]+]] { +// FULL-NEXT: entry: +// FULL-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2 +// FULL-NEXT: [[A:%.*]] = alloca { half, half }, align 2 +// FULL-NEXT: [[B:%.*]] = alloca { half, half }, align 2 +// FULL-NEXT: [[COERCE:%.*]] = alloca { float, float }, align 4 +// FULL-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2 +// FULL-NEXT: store <2 x half> [[B_COERCE]], ptr [[B]], align 2 +// FULL-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0 +// FULL-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2 +// FULL-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1 +// FULL-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2 +// FULL-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float +// FULL-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float +// FULL-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 0 +// FULL-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2 +// FULL-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 1 +// FULL-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], 
align 2 +// FULL-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float +// FULL-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float +// FULL-NEXT: [[CALL:%.*]] = call <2 x float> @__divsc3(float noundef [[EXT]], float noundef [[EXT1]], float noundef [[EXT2]], float noundef [[EXT3]]) #[[ATTR1:[0-9]+]] +// FULL-NEXT: store <2 x float> [[CALL]], ptr [[COERCE]], align 4 +// FULL-NEXT: [[COERCE_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 0 +// FULL-NEXT: [[COERCE_REAL:%.*]] = load float, ptr [[COERCE_REALP]], align 4 +// FULL-NEXT: [[COERCE_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 1 +// FULL-NEXT: [[COERCE_IMAG:%.*]] = load float, ptr [[COERCE_IMAGP]], align 4 +// FULL-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[COERCE_REAL]] to half +// FULL-NEXT: [[UNPROMOTION4:%.*]] = fptrunc float [[COERCE_IMAG]] to half +// FULL-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0 +// FULL-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1 +// FULL-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2 +// FULL-NEXT: store half [[UNPROMOTION4]], ptr [[RETVAL_IMAGP]], align 2 +// FULL-NEXT: [[TMP0:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2 +// FULL-NEXT: ret <2 x half> [[TMP0]] +// +// BASIC-LABEL: define dso_local <2 x half> @divf16( +// BASIC-SAME: <2 x half> noundef [[A_COERCE:%.*]], <2 x half> noundef [[B_COERCE:%.*]]) #[[ATTR0:[0-9]+]] { +// BASIC-NEXT: entry: +// BASIC-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2 +// BASIC-NEXT: [[A:%.*]] = alloca { half, half }, align 2 +// BASIC-NEXT: [[B:%.*]] = alloca { half, half }, align 2 +// BASIC-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2 +// BASIC-NEXT: store <2 x half> [[B_COERCE]], ptr [[B]], align 2 +// BASIC-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0 +// BASIC-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2 +// BASIC-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1 +// BASIC-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2 +// BASIC-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float +// BASIC-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float +// BASIC-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 0 +// BASIC-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2 +// BASIC-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 1 +// BASIC-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2 +// BASIC-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float +// BASIC-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float +// BASIC-NEXT: [[TMP0:%.*]] = fmul float [[EXT]], [[EXT2]] +// BASIC-NEXT: [[TMP1:%.*]] = fmul float [[EXT1]], [[EXT3]] +// BASIC-NEXT: [[TMP2:%.*]] = fadd float [[TMP0]], [[TMP1]] +// BASIC-NEXT: [[TMP3:%.*]] = fmul float [[EXT2]], [[EXT2]] +// BASIC-NEXT: [[TMP4:%.*]] = fmul float [[EXT3]], [[EXT3]] +// BASIC-NEXT: [[TMP5:%.*]] = fadd float [[TMP3]], [[TMP4]] +// BASIC-NEXT: [[TMP6:%.*]] = fmul float [[EXT1]], [[EXT2]] +// BASIC-NEXT: [[TMP7:%.*]] = fmul float [[EXT]], [[EXT3]] +// BASIC-NEXT: [[TMP8:%.*]] = fsub float [[TMP6]], [[TMP7]] +// BASIC-NEXT: [[TMP9:%.*]] = fdiv float [[TMP2]], [[TMP5]] +// BASIC-NEXT: [[TMP10:%.*]] = fdiv float [[TMP8]], [[TMP5]] +// BASIC-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[TMP9]] to half +// BASIC-NEXT: [[UNPROMOTION4:%.*]] = 
fptrunc float [[TMP10]] to half +// BASIC-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0 +// BASIC-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1 +// BASIC-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2 +// BASIC-NEXT: store half [[UNPROMOTION4]], ptr [[RETVAL_IMAGP]], align 2 +// BASIC-NEXT: [[TMP11:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2 +// BASIC-NEXT: ret <2 x half> [[TMP11]] +// +// IMPRVD-LABEL: define dso_local <2 x half> @divf16( +// IMPRVD-SAME: <2 x half> noundef [[A_COERCE:%.*]], <2 x half> noundef [[B_COERCE:%.*]]) #[[ATTR0:[0-9]+]] { +// IMPRVD-NEXT: entry: +// IMPRVD-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2 +// IMPRVD-NEXT: [[A:%.*]] = alloca { half, half }, align 2 +// IMPRVD-NEXT: [[B:%.*]] = alloca { half, half }, align 2 +// IMPRVD-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2 +// IMPRVD-NEXT: store <2 x half> [[B_COERCE]], ptr [[B]], align 2 +// IMPRVD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0 +// IMPRVD-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2 +// IMPRVD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1 +// IMPRVD-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2 +// IMPRVD-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float +// IMPRVD-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float +// IMPRVD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 0 +// IMPRVD-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2 +// IMPRVD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 1 +// IMPRVD-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2 +// IMPRVD-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float +// IMPRVD-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float +// IMPRVD-NEXT: [[TMP0:%.*]] = call float @llvm.fabs.f32(float [[EXT2]]) +// IMPRVD-NEXT: [[TMP1:%.*]] = call float @llvm.fabs.f32(float [[EXT3]]) +// IMPRVD-NEXT: [[ABS_CMP:%.*]] = fcmp ugt float [[TMP0]], [[TMP1]] +// IMPRVD-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// IMPRVD: abs_rhsr_greater_or_equal_abs_rhsi: +// IMPRVD-NEXT: [[TMP2:%.*]] = fdiv float [[EXT3]], [[EXT2]] +// IMPRVD-NEXT: [[TMP3:%.*]] = fmul float [[TMP2]], [[EXT3]] +// IMPRVD-NEXT: [[TMP4:%.*]] = fadd float [[EXT2]], [[TMP3]] +// IMPRVD-NEXT: [[TMP5:%.*]] = fmul float [[EXT1]], [[TMP2]] +// IMPRVD-NEXT: [[TMP6:%.*]] = fadd float [[EXT]], [[TMP5]] +// IMPRVD-NEXT: [[TMP7:%.*]] = fdiv float [[TMP6]], [[TMP4]] +// IMPRVD-NEXT: [[TMP8:%.*]] = fmul float [[EXT]], [[TMP2]] +// IMPRVD-NEXT: [[TMP9:%.*]] = fsub float [[EXT1]], [[TMP8]] +// IMPRVD-NEXT: [[TMP10:%.*]] = fdiv float [[TMP9]], [[TMP4]] +// IMPRVD-NEXT: br label [[COMPLEX_DIV:%.*]] +// IMPRVD: abs_rhsr_less_than_abs_rhsi: +// IMPRVD-NEXT: [[TMP11:%.*]] = fdiv float [[EXT2]], [[EXT3]] +// IMPRVD-NEXT: [[TMP12:%.*]] = fmul float [[TMP11]], [[EXT2]] +// IMPRVD-NEXT: [[TMP13:%.*]] = fadd float [[EXT3]], [[TMP12]] +// IMPRVD-NEXT: [[TMP14:%.*]] = fmul float [[EXT]], [[TMP11]] +// IMPRVD-NEXT: [[TMP15:%.*]] = fadd float [[TMP14]], [[EXT1]] +// IMPRVD-NEXT: [[TMP16:%.*]] = fdiv float [[TMP15]], [[TMP13]] +// IMPRVD-NEXT: [[TMP17:%.*]] = fmul float [[EXT1]], [[TMP11]] +// IMPRVD-NEXT: [[TMP18:%.*]] = fsub float [[TMP17]], [[EXT]] +// IMPRVD-NEXT: [[TMP19:%.*]] = fdiv float [[TMP18]], 
[[TMP13]] +// IMPRVD-NEXT: br label [[COMPLEX_DIV]] +// IMPRVD: complex_div: +// IMPRVD-NEXT: [[TMP20:%.*]] = phi float [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD-NEXT: [[TMP21:%.*]] = phi float [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[TMP20]] to half +// IMPRVD-NEXT: [[UNPROMOTION4:%.*]] = fptrunc float [[TMP21]] to half +// IMPRVD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0 +// IMPRVD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1 +// IMPRVD-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2 +// IMPRVD-NEXT: store half [[UNPROMOTION4]], ptr [[RETVAL_IMAGP]], align 2 +// IMPRVD-NEXT: [[TMP22:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2 +// IMPRVD-NEXT: ret <2 x half> [[TMP22]] +// +// PRMTD-LABEL: define dso_local <2 x half> @divf16( +// PRMTD-SAME: <2 x half> noundef [[A_COERCE:%.*]], <2 x half> noundef [[B_COERCE:%.*]]) #[[ATTR0:[0-9]+]] { +// PRMTD-NEXT: entry: +// PRMTD-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2 +// PRMTD-NEXT: [[A:%.*]] = alloca { half, half }, align 2 +// PRMTD-NEXT: [[B:%.*]] = alloca { half, half }, align 2 +// PRMTD-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2 +// PRMTD-NEXT: store <2 x half> [[B_COERCE]], ptr [[B]], align 2 +// PRMTD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0 +// PRMTD-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2 +// PRMTD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1 +// PRMTD-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2 +// PRMTD-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float +// PRMTD-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float +// PRMTD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 0 +// PRMTD-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2 +// PRMTD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 1 +// PRMTD-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2 +// PRMTD-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float +// PRMTD-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float +// PRMTD-NEXT: [[TMP0:%.*]] = fmul float [[EXT]], [[EXT2]] +// PRMTD-NEXT: [[TMP1:%.*]] = fmul float [[EXT1]], [[EXT3]] +// PRMTD-NEXT: [[TMP2:%.*]] = fadd float [[TMP0]], [[TMP1]] +// PRMTD-NEXT: [[TMP3:%.*]] = fmul float [[EXT2]], [[EXT2]] +// PRMTD-NEXT: [[TMP4:%.*]] = fmul float [[EXT3]], [[EXT3]] +// PRMTD-NEXT: [[TMP5:%.*]] = fadd float [[TMP3]], [[TMP4]] +// PRMTD-NEXT: [[TMP6:%.*]] = fmul float [[EXT1]], [[EXT2]] +// PRMTD-NEXT: [[TMP7:%.*]] = fmul float [[EXT]], [[EXT3]] +// PRMTD-NEXT: [[TMP8:%.*]] = fsub float [[TMP6]], [[TMP7]] +// PRMTD-NEXT: [[TMP9:%.*]] = fdiv float [[TMP2]], [[TMP5]] +// PRMTD-NEXT: [[TMP10:%.*]] = fdiv float [[TMP8]], [[TMP5]] +// PRMTD-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[TMP9]] to half +// PRMTD-NEXT: [[UNPROMOTION4:%.*]] = fptrunc float [[TMP10]] to half +// PRMTD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0 +// PRMTD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1 +// PRMTD-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2 +// PRMTD-NEXT: store half [[UNPROMOTION4]], ptr 
[[RETVAL_IMAGP]], align 2 +// PRMTD-NEXT: [[TMP11:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2 +// PRMTD-NEXT: ret <2 x half> [[TMP11]] +// +// X86WINPRMTD-LABEL: define dso_local i32 @divf16( +// X86WINPRMTD-SAME: i32 noundef [[A_COERCE:%.*]], i32 noundef [[B_COERCE:%.*]]) #[[ATTR0:[0-9]+]] { +// X86WINPRMTD-NEXT: entry: +// X86WINPRMTD-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2 +// X86WINPRMTD-NEXT: [[A:%.*]] = alloca { half, half }, align 2 +// X86WINPRMTD-NEXT: [[B:%.*]] = alloca { half, half }, align 2 +// X86WINPRMTD-NEXT: store i32 [[A_COERCE]], ptr [[A]], align 2 +// X86WINPRMTD-NEXT: store i32 [[B_COERCE]], ptr [[B]], align 2 +// X86WINPRMTD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2 +// X86WINPRMTD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2 +// X86WINPRMTD-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float +// X86WINPRMTD-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float +// X86WINPRMTD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2 +// X86WINPRMTD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2 +// X86WINPRMTD-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float +// X86WINPRMTD-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float +// X86WINPRMTD-NEXT: [[TMP0:%.*]] = fmul float [[EXT]], [[EXT2]] +// X86WINPRMTD-NEXT: [[TMP1:%.*]] = fmul float [[EXT1]], [[EXT3]] +// X86WINPRMTD-NEXT: [[TMP2:%.*]] = fadd float [[TMP0]], [[TMP1]] +// X86WINPRMTD-NEXT: [[TMP3:%.*]] = fmul float [[EXT2]], [[EXT2]] +// X86WINPRMTD-NEXT: [[TMP4:%.*]] = fmul float [[EXT3]], [[EXT3]] +// X86WINPRMTD-NEXT: [[TMP5:%.*]] = fadd float [[TMP3]], [[TMP4]] +// X86WINPRMTD-NEXT: [[TMP6:%.*]] = fmul float [[EXT1]], [[EXT2]] +// X86WINPRMTD-NEXT: [[TMP7:%.*]] = fmul float [[EXT]], [[EXT3]] +// X86WINPRMTD-NEXT: [[TMP8:%.*]] = fsub float [[TMP6]], [[TMP7]] +// X86WINPRMTD-NEXT: [[TMP9:%.*]] = fdiv float [[TMP2]], [[TMP5]] +// X86WINPRMTD-NEXT: [[TMP10:%.*]] = fdiv float [[TMP8]], [[TMP5]] +// X86WINPRMTD-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[TMP9]] to half +// X86WINPRMTD-NEXT: [[UNPROMOTION4:%.*]] = fptrunc float [[TMP10]] to half +// X86WINPRMTD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1 +// X86WINPRMTD-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2 +// X86WINPRMTD-NEXT: store half [[UNPROMOTION4]], ptr [[RETVAL_IMAGP]], align 2 +// X86WINPRMTD-NEXT: [[TMP11:%.*]] = load i32, ptr [[RETVAL]], align 2 +// X86WINPRMTD-NEXT: ret i32 [[TMP11]] +// +// BASIC_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x half> @divf16( +// BASIC_FAST-SAME: <2 x half> noundef nofpclass(nan inf) [[A_COERCE:%.*]], <2 x half> noundef nofpclass(nan inf) [[B_COERCE:%.*]]) #[[ATTR0:[0-9]+]] { +// BASIC_FAST-NEXT: entry: +// BASIC_FAST-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2 +// BASIC_FAST-NEXT: [[A:%.*]] = alloca { half, half }, align 2 +// BASIC_FAST-NEXT: [[B:%.*]] = alloca { half, half }, align 2 +// BASIC_FAST-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], 
align 2 +// BASIC_FAST-NEXT: store <2 x half> [[B_COERCE]], ptr [[B]], align 2 +// BASIC_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2 +// BASIC_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1 +// BASIC_FAST-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2 +// BASIC_FAST-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float +// BASIC_FAST-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float +// BASIC_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2 +// BASIC_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 1 +// BASIC_FAST-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2 +// BASIC_FAST-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float +// BASIC_FAST-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float +// BASIC_FAST-NEXT: [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT2]] +// BASIC_FAST-NEXT: [[TMP1:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT3]] +// BASIC_FAST-NEXT: [[TMP2:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[TMP0]], [[TMP1]] +// BASIC_FAST-NEXT: [[TMP3:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT2]], [[EXT2]] +// BASIC_FAST-NEXT: [[TMP4:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT3]], [[EXT3]] +// BASIC_FAST-NEXT: [[TMP5:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[TMP3]], [[TMP4]] +// BASIC_FAST-NEXT: [[TMP6:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT2]] +// BASIC_FAST-NEXT: [[TMP7:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT3]] +// BASIC_FAST-NEXT: [[TMP8:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[TMP6]], [[TMP7]] +// BASIC_FAST-NEXT: [[TMP9:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP2]], [[TMP5]] +// BASIC_FAST-NEXT: [[TMP10:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP8]], [[TMP5]] +// BASIC_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[TMP9]] to half +// BASIC_FAST-NEXT: [[UNPROMOTION4:%.*]] = fptrunc float [[TMP10]] to half +// BASIC_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1 +// BASIC_FAST-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2 +// BASIC_FAST-NEXT: store half [[UNPROMOTION4]], ptr [[RETVAL_IMAGP]], align 2 +// BASIC_FAST-NEXT: [[TMP11:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2 +// BASIC_FAST-NEXT: ret <2 x half> [[TMP11]] +// +// FULL_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x half> @divf16( +// FULL_FAST-SAME: <2 x half> noundef nofpclass(nan inf) [[A_COERCE:%.*]], <2 x half> noundef nofpclass(nan inf) [[B_COERCE:%.*]]) #[[ATTR0:[0-9]+]] { +// FULL_FAST-NEXT: entry: +// FULL_FAST-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2 +// FULL_FAST-NEXT: [[A:%.*]] = alloca { half, half }, align 2 +// FULL_FAST-NEXT: [[B:%.*]] = alloca { half, half }, align 2 +// FULL_FAST-NEXT: [[COERCE:%.*]] = alloca { float, float }, align 4 +// FULL_FAST-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2 +// FULL_FAST-NEXT: store <2 x half> [[B_COERCE]], ptr [[B]], align 2 +// FULL_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0 +// 
FULL_FAST-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2 +// FULL_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1 +// FULL_FAST-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2 +// FULL_FAST-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float +// FULL_FAST-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float +// FULL_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 0 +// FULL_FAST-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2 +// FULL_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 1 +// FULL_FAST-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2 +// FULL_FAST-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float +// FULL_FAST-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float +// FULL_FAST-NEXT: [[CALL:%.*]] = call reassoc nnan ninf nsz arcp afn nofpclass(nan inf) <2 x float> @__divsc3(float noundef nofpclass(nan inf) [[EXT]], float noundef nofpclass(nan inf) [[EXT1]], float noundef nofpclass(nan inf) [[EXT2]], float noundef nofpclass(nan inf) [[EXT3]]) #[[ATTR1:[0-9]+]] +// FULL_FAST-NEXT: store <2 x float> [[CALL]], ptr [[COERCE]], align 4 +// FULL_FAST-NEXT: [[COERCE_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 0 +// FULL_FAST-NEXT: [[COERCE_REAL:%.*]] = load float, ptr [[COERCE_REALP]], align 4 +// FULL_FAST-NEXT: [[COERCE_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 1 +// FULL_FAST-NEXT: [[COERCE_IMAG:%.*]] = load float, ptr [[COERCE_IMAGP]], align 4 +// FULL_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[COERCE_REAL]] to half +// FULL_FAST-NEXT: [[UNPROMOTION4:%.*]] = fptrunc float [[COERCE_IMAG]] to half +// FULL_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0 +// FULL_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1 +// FULL_FAST-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2 +// FULL_FAST-NEXT: store half [[UNPROMOTION4]], ptr [[RETVAL_IMAGP]], align 2 +// FULL_FAST-NEXT: [[TMP0:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2 +// FULL_FAST-NEXT: ret <2 x half> [[TMP0]] +// +// IMPRVD_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x half> @divf16( +// IMPRVD_FAST-SAME: <2 x half> noundef nofpclass(nan inf) [[A_COERCE:%.*]], <2 x half> noundef nofpclass(nan inf) [[B_COERCE:%.*]]) #[[ATTR0:[0-9]+]] { +// IMPRVD_FAST-NEXT: entry: +// IMPRVD_FAST-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2 +// IMPRVD_FAST-NEXT: [[A:%.*]] = alloca { half, half }, align 2 +// IMPRVD_FAST-NEXT: [[B:%.*]] = alloca { half, half }, align 2 +// IMPRVD_FAST-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2 +// IMPRVD_FAST-NEXT: store <2 x half> [[B_COERCE]], ptr [[B]], align 2 +// IMPRVD_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2 +// IMPRVD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2 +// IMPRVD_FAST-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float +// IMPRVD_FAST-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float +// IMPRVD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], 
align 2 +// IMPRVD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2 +// IMPRVD_FAST-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float +// IMPRVD_FAST-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float +// IMPRVD_FAST-NEXT: [[TMP0:%.*]] = call reassoc nnan ninf nsz arcp afn float @llvm.fabs.f32(float [[EXT2]]) +// IMPRVD_FAST-NEXT: [[TMP1:%.*]] = call reassoc nnan ninf nsz arcp afn float @llvm.fabs.f32(float [[EXT3]]) +// IMPRVD_FAST-NEXT: [[ABS_CMP:%.*]] = fcmp reassoc nnan ninf nsz arcp afn ugt float [[TMP0]], [[TMP1]] +// IMPRVD_FAST-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// IMPRVD_FAST: abs_rhsr_greater_or_equal_abs_rhsi: +// IMPRVD_FAST-NEXT: [[TMP2:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[EXT3]], [[EXT2]] +// IMPRVD_FAST-NEXT: [[TMP3:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[TMP2]], [[EXT3]] +// IMPRVD_FAST-NEXT: [[TMP4:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[EXT2]], [[TMP3]] +// IMPRVD_FAST-NEXT: [[TMP5:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[TMP2]] +// IMPRVD_FAST-NEXT: [[TMP6:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[EXT]], [[TMP5]] +// IMPRVD_FAST-NEXT: [[TMP7:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP6]], [[TMP4]] +// IMPRVD_FAST-NEXT: [[TMP8:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[TMP2]] +// IMPRVD_FAST-NEXT: [[TMP9:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[EXT1]], [[TMP8]] +// IMPRVD_FAST-NEXT: [[TMP10:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP9]], [[TMP4]] +// IMPRVD_FAST-NEXT: br label [[COMPLEX_DIV:%.*]] +// IMPRVD_FAST: abs_rhsr_less_than_abs_rhsi: +// IMPRVD_FAST-NEXT: [[TMP11:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[EXT2]], [[EXT3]] +// IMPRVD_FAST-NEXT: [[TMP12:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[TMP11]], [[EXT2]] +// IMPRVD_FAST-NEXT: [[TMP13:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[EXT3]], [[TMP12]] +// IMPRVD_FAST-NEXT: [[TMP14:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[TMP11]] +// IMPRVD_FAST-NEXT: [[TMP15:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[TMP14]], [[EXT1]] +// IMPRVD_FAST-NEXT: [[TMP16:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP15]], [[TMP13]] +// IMPRVD_FAST-NEXT: [[TMP17:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[TMP11]] +// IMPRVD_FAST-NEXT: [[TMP18:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[TMP17]], [[EXT]] +// IMPRVD_FAST-NEXT: [[TMP19:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP18]], [[TMP13]] +// IMPRVD_FAST-NEXT: br label [[COMPLEX_DIV]] +// IMPRVD_FAST: complex_div: +// IMPRVD_FAST-NEXT: [[TMP20:%.*]] = phi reassoc nnan ninf nsz arcp afn float [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD_FAST-NEXT: [[TMP21:%.*]] = phi reassoc nnan ninf nsz arcp afn float [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[TMP20]] to half +// IMPRVD_FAST-NEXT: [[UNPROMOTION4:%.*]] = fptrunc float [[TMP21]] to half +// IMPRVD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: store 
half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2 +// IMPRVD_FAST-NEXT: store half [[UNPROMOTION4]], ptr [[RETVAL_IMAGP]], align 2 +// IMPRVD_FAST-NEXT: [[TMP22:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2 +// IMPRVD_FAST-NEXT: ret <2 x half> [[TMP22]] +// +// PRMTD_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x half> @divf16( +// PRMTD_FAST-SAME: <2 x half> noundef nofpclass(nan inf) [[A_COERCE:%.*]], <2 x half> noundef nofpclass(nan inf) [[B_COERCE:%.*]]) #[[ATTR0:[0-9]+]] { +// PRMTD_FAST-NEXT: entry: +// PRMTD_FAST-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2 +// PRMTD_FAST-NEXT: [[A:%.*]] = alloca { half, half }, align 2 +// PRMTD_FAST-NEXT: [[B:%.*]] = alloca { half, half }, align 2 +// PRMTD_FAST-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2 +// PRMTD_FAST-NEXT: store <2 x half> [[B_COERCE]], ptr [[B]], align 2 +// PRMTD_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0 +// PRMTD_FAST-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2 +// PRMTD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1 +// PRMTD_FAST-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2 +// PRMTD_FAST-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float +// PRMTD_FAST-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float +// PRMTD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 0 +// PRMTD_FAST-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2 +// PRMTD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 1 +// PRMTD_FAST-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2 +// PRMTD_FAST-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float +// PRMTD_FAST-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float +// PRMTD_FAST-NEXT: [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT2]] +// PRMTD_FAST-NEXT: [[TMP1:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT3]] +// PRMTD_FAST-NEXT: [[TMP2:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[TMP0]], [[TMP1]] +// PRMTD_FAST-NEXT: [[TMP3:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT2]], [[EXT2]] +// PRMTD_FAST-NEXT: [[TMP4:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT3]], [[EXT3]] +// PRMTD_FAST-NEXT: [[TMP5:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[TMP3]], [[TMP4]] +// PRMTD_FAST-NEXT: [[TMP6:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT2]] +// PRMTD_FAST-NEXT: [[TMP7:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT3]] +// PRMTD_FAST-NEXT: [[TMP8:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[TMP6]], [[TMP7]] +// PRMTD_FAST-NEXT: [[TMP9:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP2]], [[TMP5]] +// PRMTD_FAST-NEXT: [[TMP10:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP8]], [[TMP5]] +// PRMTD_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[TMP9]] to half +// PRMTD_FAST-NEXT: [[UNPROMOTION4:%.*]] = fptrunc float [[TMP10]] to half +// PRMTD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0 +// PRMTD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1 +// PRMTD_FAST-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2 +// PRMTD_FAST-NEXT: store half [[UNPROMOTION4]], ptr [[RETVAL_IMAGP]], align 2 +// PRMTD_FAST-NEXT: [[TMP11:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2 +// PRMTD_FAST-NEXT: ret <2 x half> [[TMP11]] 
+//
+_Complex _Float16 divf16(_Complex _Float16 a, _Complex _Float16 b) {
+ return a / b;
+}
+
+// FULL-LABEL: define dso_local <2 x half> @mulf16(
+// FULL-SAME: <2 x half> noundef [[A_COERCE:%.*]], <2 x half> noundef [[B_COERCE:%.*]]) #[[ATTR0]] {
+// FULL-NEXT: entry:
+// FULL-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2
+// FULL-NEXT: [[A:%.*]] = alloca { half, half }, align 2
+// FULL-NEXT: [[B:%.*]] = alloca { half, half }, align 2
+// FULL-NEXT: [[COERCE:%.*]] = alloca { float, float }, align 4
+// FULL-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2
+// FULL-NEXT: store <2 x half> [[B_COERCE]], ptr [[B]], align 2
+// FULL-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0
+// FULL-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
+// FULL-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1
+// FULL-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
+// FULL-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float
+// FULL-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// FULL-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 0
+// FULL-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
+// FULL-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 1
+// FULL-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
+// FULL-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float
+// FULL-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// FULL-NEXT: [[MUL_AC:%.*]] = fmul float [[EXT]], [[EXT2]]
+// FULL-NEXT: [[MUL_BD:%.*]] = fmul float [[EXT1]], [[EXT3]]
+// FULL-NEXT: [[MUL_AD:%.*]] = fmul float [[EXT]], [[EXT3]]
+// FULL-NEXT: [[MUL_BC:%.*]] = fmul float [[EXT1]], [[EXT2]]
+// FULL-NEXT: [[MUL_R:%.*]] = fsub float [[MUL_AC]], [[MUL_BD]]
+// FULL-NEXT: [[MUL_I:%.*]] = fadd float [[MUL_AD]], [[MUL_BC]]
+// FULL-NEXT: [[ISNAN_CMP:%.*]] = fcmp uno float [[MUL_R]], [[MUL_R]]
+// FULL-NEXT: br i1 [[ISNAN_CMP]], label [[COMPLEX_MUL_IMAG_NAN:%.*]], label [[COMPLEX_MUL_CONT:%.*]], !prof [[PROF2:![0-9]+]]
+// FULL: complex_mul_imag_nan:
+// FULL-NEXT: [[ISNAN_CMP4:%.*]] = fcmp uno float [[MUL_I]], [[MUL_I]]
+// FULL-NEXT: br i1 [[ISNAN_CMP4]], label [[COMPLEX_MUL_LIBCALL:%.*]], label [[COMPLEX_MUL_CONT]], !prof [[PROF2]]
+// FULL: complex_mul_libcall:
+// FULL-NEXT: [[CALL:%.*]] = call <2 x float> @__mulsc3(float noundef [[EXT]], float noundef [[EXT1]], float noundef [[EXT2]], float noundef [[EXT3]]) #[[ATTR1]]
+// FULL-NEXT: store <2 x float> [[CALL]], ptr [[COERCE]], align 4
+// FULL-NEXT: [[COERCE_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 0
+// FULL-NEXT: [[COERCE_REAL:%.*]] = load float, ptr [[COERCE_REALP]], align 4
+// FULL-NEXT: [[COERCE_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 1
+// FULL-NEXT: [[COERCE_IMAG:%.*]] = load float, ptr [[COERCE_IMAGP]], align 4
+// FULL-NEXT: br label [[COMPLEX_MUL_CONT]]
+// FULL: complex_mul_cont:
+// FULL-NEXT: [[REAL_MUL_PHI:%.*]] = phi float [ [[MUL_R]], [[ENTRY:%.*]] ], [ [[MUL_R]], [[COMPLEX_MUL_IMAG_NAN]] ], [ [[COERCE_REAL]], [[COMPLEX_MUL_LIBCALL]] ]
+// FULL-NEXT: [[IMAG_MUL_PHI:%.*]] = phi float [ [[MUL_I]], [[ENTRY]] ], [ [[MUL_I]], [[COMPLEX_MUL_IMAG_NAN]] ], [ [[COERCE_IMAG]], [[COMPLEX_MUL_LIBCALL]] ]
+// FULL-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[REAL_MUL_PHI]] to half
+// FULL-NEXT: [[UNPROMOTION5:%.*]] = fptrunc float [[IMAG_MUL_PHI]] to half
+// FULL-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0
+// FULL-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1
+// FULL-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
+// FULL-NEXT: store half [[UNPROMOTION5]], ptr [[RETVAL_IMAGP]], align 2
+// FULL-NEXT: [[TMP0:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2
+// FULL-NEXT: ret <2 x half> [[TMP0]]
+//
+// BASIC-LABEL: define dso_local <2 x half> @mulf16(
+// BASIC-SAME: <2 x half> noundef [[A_COERCE:%.*]], <2 x half> noundef [[B_COERCE:%.*]]) #[[ATTR0]] {
+// BASIC-NEXT: entry:
+// BASIC-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2
+// BASIC-NEXT: [[A:%.*]] = alloca { half, half }, align 2
+// BASIC-NEXT: [[B:%.*]] = alloca { half, half }, align 2
+// BASIC-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2
+// BASIC-NEXT: store <2 x half> [[B_COERCE]], ptr [[B]], align 2
+// BASIC-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0
+// BASIC-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
+// BASIC-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1
+// BASIC-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
+// BASIC-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float
+// BASIC-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// BASIC-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 0
+// BASIC-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
+// BASIC-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 1
+// BASIC-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
+// BASIC-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float
+// BASIC-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// BASIC-NEXT: [[MUL_AC:%.*]] = fmul float [[EXT]], [[EXT2]]
+// BASIC-NEXT: [[MUL_BD:%.*]] = fmul float [[EXT1]], [[EXT3]]
+// BASIC-NEXT: [[MUL_AD:%.*]] = fmul float [[EXT]], [[EXT3]]
+// BASIC-NEXT: [[MUL_BC:%.*]] = fmul float [[EXT1]], [[EXT2]]
+// BASIC-NEXT: [[MUL_R:%.*]] = fsub float [[MUL_AC]], [[MUL_BD]]
+// BASIC-NEXT: [[MUL_I:%.*]] = fadd float [[MUL_AD]], [[MUL_BC]]
+// BASIC-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[MUL_R]] to half
+// BASIC-NEXT: [[UNPROMOTION4:%.*]] = fptrunc float [[MUL_I]] to half
+// BASIC-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0
+// BASIC-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1
+// BASIC-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
+// BASIC-NEXT: store half [[UNPROMOTION4]], ptr [[RETVAL_IMAGP]], align 2
+// BASIC-NEXT: [[TMP0:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2
+// BASIC-NEXT: ret <2 x half> [[TMP0]]
+//
+// IMPRVD-LABEL: define dso_local <2 x half> @mulf16(
+// IMPRVD-SAME: <2 x half> noundef [[A_COERCE:%.*]], <2 x half> noundef [[B_COERCE:%.*]]) #[[ATTR0]] {
+// IMPRVD-NEXT: entry:
+// IMPRVD-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2
+// IMPRVD-NEXT: [[A:%.*]] = alloca { half, half }, align 2
+// IMPRVD-NEXT: [[B:%.*]] = alloca { half, half }, align 2
+// IMPRVD-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2
+// IMPRVD-NEXT: store <2 x half> [[B_COERCE]], ptr [[B]], align 2
+// IMPRVD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0
+// IMPRVD-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
+// IMPRVD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1
+// IMPRVD-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
+// IMPRVD-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float
+// IMPRVD-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// IMPRVD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 0
+// IMPRVD-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
+// IMPRVD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 1
+// IMPRVD-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
+// IMPRVD-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float
+// IMPRVD-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// IMPRVD-NEXT: [[MUL_AC:%.*]] = fmul float [[EXT]], [[EXT2]]
+// IMPRVD-NEXT: [[MUL_BD:%.*]] = fmul float [[EXT1]], [[EXT3]]
+// IMPRVD-NEXT: [[MUL_AD:%.*]] = fmul float [[EXT]], [[EXT3]]
+// IMPRVD-NEXT: [[MUL_BC:%.*]] = fmul float [[EXT1]], [[EXT2]]
+// IMPRVD-NEXT: [[MUL_R:%.*]] = fsub float [[MUL_AC]], [[MUL_BD]]
+// IMPRVD-NEXT: [[MUL_I:%.*]] = fadd float [[MUL_AD]], [[MUL_BC]]
+// IMPRVD-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[MUL_R]] to half
+// IMPRVD-NEXT: [[UNPROMOTION4:%.*]] = fptrunc float [[MUL_I]] to half
+// IMPRVD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0
+// IMPRVD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1
+// IMPRVD-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
+// IMPRVD-NEXT: store half [[UNPROMOTION4]], ptr [[RETVAL_IMAGP]], align 2
+// IMPRVD-NEXT: [[TMP0:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2
+// IMPRVD-NEXT: ret <2 x half> [[TMP0]]
+//
+// PRMTD-LABEL: define dso_local <2 x half> @mulf16(
+// PRMTD-SAME: <2 x half> noundef [[A_COERCE:%.*]], <2 x half> noundef [[B_COERCE:%.*]]) #[[ATTR0]] {
+// PRMTD-NEXT: entry:
+// PRMTD-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2
+// PRMTD-NEXT: [[A:%.*]] = alloca { half, half }, align 2
+// PRMTD-NEXT: [[B:%.*]] = alloca { half, half }, align 2
+// PRMTD-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2
+// PRMTD-NEXT: store <2 x half> [[B_COERCE]], ptr [[B]], align 2
+// PRMTD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0
+// PRMTD-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
+// PRMTD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1
+// PRMTD-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
+// PRMTD-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float
+// PRMTD-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// PRMTD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 0
+// PRMTD-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
+// PRMTD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 1
+// PRMTD-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
+// PRMTD-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float
+// PRMTD-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// PRMTD-NEXT: [[MUL_AC:%.*]] = fmul float [[EXT]], [[EXT2]]
+// PRMTD-NEXT: [[MUL_BD:%.*]] = fmul float [[EXT1]], [[EXT3]]
+// PRMTD-NEXT: [[MUL_AD:%.*]] = fmul float [[EXT]], [[EXT3]]
+// PRMTD-NEXT: [[MUL_BC:%.*]] = fmul float [[EXT1]], [[EXT2]]
+// PRMTD-NEXT: [[MUL_R:%.*]] = fsub float [[MUL_AC]], [[MUL_BD]]
+// PRMTD-NEXT: [[MUL_I:%.*]] = fadd float [[MUL_AD]], [[MUL_BC]]
+// PRMTD-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[MUL_R]] to half
+// PRMTD-NEXT: [[UNPROMOTION4:%.*]] = fptrunc float [[MUL_I]] to half
+// PRMTD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0
+// PRMTD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1
+// PRMTD-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
+// PRMTD-NEXT: store half [[UNPROMOTION4]], ptr [[RETVAL_IMAGP]], align 2
+// PRMTD-NEXT: [[TMP0:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2
+// PRMTD-NEXT: ret <2 x half> [[TMP0]]
+//
+// X86WINPRMTD-LABEL: define dso_local i32 @mulf16(
+// X86WINPRMTD-SAME: i32 noundef [[A_COERCE:%.*]], i32 noundef [[B_COERCE:%.*]]) #[[ATTR0]] {
+// X86WINPRMTD-NEXT: entry:
+// X86WINPRMTD-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2
+// X86WINPRMTD-NEXT: [[A:%.*]] = alloca { half, half }, align 2
+// X86WINPRMTD-NEXT: [[B:%.*]] = alloca { half, half }, align 2
+// X86WINPRMTD-NEXT: store i32 [[A_COERCE]], ptr [[A]], align 2
+// X86WINPRMTD-NEXT: store i32 [[B_COERCE]], ptr [[B]], align 2
+// X86WINPRMTD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0
+// X86WINPRMTD-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
+// X86WINPRMTD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1
+// X86WINPRMTD-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
+// X86WINPRMTD-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float
+// X86WINPRMTD-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// X86WINPRMTD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 0
+// X86WINPRMTD-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
+// X86WINPRMTD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 1
+// X86WINPRMTD-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
+// X86WINPRMTD-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float
+// X86WINPRMTD-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// X86WINPRMTD-NEXT: [[MUL_AC:%.*]] = fmul float [[EXT]], [[EXT2]]
+// X86WINPRMTD-NEXT: [[MUL_BD:%.*]] = fmul float [[EXT1]], [[EXT3]]
+// X86WINPRMTD-NEXT: [[MUL_AD:%.*]] = fmul float [[EXT]], [[EXT3]]
+// X86WINPRMTD-NEXT: [[MUL_BC:%.*]] = fmul float [[EXT1]], [[EXT2]]
+// X86WINPRMTD-NEXT: [[MUL_R:%.*]] = fsub float [[MUL_AC]], [[MUL_BD]]
+// X86WINPRMTD-NEXT: [[MUL_I:%.*]] = fadd float [[MUL_AD]], [[MUL_BC]]
+// X86WINPRMTD-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[MUL_R]] to half
+// X86WINPRMTD-NEXT: [[UNPROMOTION4:%.*]] = fptrunc float [[MUL_I]] to half
+// X86WINPRMTD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0
+// X86WINPRMTD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1
+// X86WINPRMTD-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
+// X86WINPRMTD-NEXT: store half [[UNPROMOTION4]], ptr [[RETVAL_IMAGP]], align 2
+// X86WINPRMTD-NEXT: [[TMP0:%.*]] = load i32, ptr [[RETVAL]], align 2
+// X86WINPRMTD-NEXT: ret i32 [[TMP0]]
+//
+// BASIC_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x half> @mulf16(
+// BASIC_FAST-SAME: <2 x half> noundef nofpclass(nan inf) [[A_COERCE:%.*]], <2 x half> noundef nofpclass(nan inf) [[B_COERCE:%.*]]) #[[ATTR0]] {
+// BASIC_FAST-NEXT: entry:
+// BASIC_FAST-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2
+// BASIC_FAST-NEXT: [[A:%.*]] = alloca { half, half }, align 2
+// BASIC_FAST-NEXT: [[B:%.*]] = alloca { half, half }, align 2
+// BASIC_FAST-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2
+// BASIC_FAST-NEXT: store <2 x half> [[B_COERCE]], ptr [[B]], align 2
+// BASIC_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0
+// BASIC_FAST-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
+// BASIC_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1
+// BASIC_FAST-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
+// BASIC_FAST-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float
+// BASIC_FAST-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// BASIC_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 0
+// BASIC_FAST-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
+// BASIC_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 1
+// BASIC_FAST-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
+// BASIC_FAST-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float
+// BASIC_FAST-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// BASIC_FAST-NEXT: [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT2]]
+// BASIC_FAST-NEXT: [[MUL_BD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT3]]
+// BASIC_FAST-NEXT: [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT3]]
+// BASIC_FAST-NEXT: [[MUL_BC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT2]]
+// BASIC_FAST-NEXT: [[MUL_R:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[MUL_AC]], [[MUL_BD]]
+// BASIC_FAST-NEXT: [[MUL_I:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[MUL_AD]], [[MUL_BC]]
+// BASIC_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[MUL_R]] to half
+// BASIC_FAST-NEXT: [[UNPROMOTION4:%.*]] = fptrunc float [[MUL_I]] to half
+// BASIC_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0
+// BASIC_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1
+// BASIC_FAST-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
+// BASIC_FAST-NEXT: store half [[UNPROMOTION4]], ptr [[RETVAL_IMAGP]], align 2
+// BASIC_FAST-NEXT: [[TMP0:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2
+// BASIC_FAST-NEXT: ret <2 x half> [[TMP0]]
+//
+// FULL_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x half> @mulf16(
+// FULL_FAST-SAME: <2 x half> noundef nofpclass(nan inf) [[A_COERCE:%.*]], <2 x half> noundef nofpclass(nan inf) [[B_COERCE:%.*]]) #[[ATTR0]] {
+// FULL_FAST-NEXT: entry:
+// FULL_FAST-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2
+// FULL_FAST-NEXT: [[A:%.*]] = alloca { half, half }, align 2
+// FULL_FAST-NEXT: [[B:%.*]] = alloca { half, half }, align 2
+// FULL_FAST-NEXT: [[COERCE:%.*]] = alloca { float, float }, align 4
+// FULL_FAST-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2
+// FULL_FAST-NEXT: store <2 x half> [[B_COERCE]], ptr [[B]], align 2
+// FULL_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0
+// FULL_FAST-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
+// FULL_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1
+// FULL_FAST-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
+// FULL_FAST-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float
+// FULL_FAST-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// FULL_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 0
+// FULL_FAST-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
+// FULL_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 1
+// FULL_FAST-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
+// FULL_FAST-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float
+// FULL_FAST-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// FULL_FAST-NEXT: [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT2]]
+// FULL_FAST-NEXT: [[MUL_BD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT3]]
+// FULL_FAST-NEXT: [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT3]]
+// FULL_FAST-NEXT: [[MUL_BC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT2]]
+// FULL_FAST-NEXT: [[MUL_R:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[MUL_AC]], [[MUL_BD]]
+// FULL_FAST-NEXT: [[MUL_I:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[MUL_AD]], [[MUL_BC]]
+// FULL_FAST-NEXT: [[ISNAN_CMP:%.*]] = fcmp reassoc nnan ninf nsz arcp afn uno float [[MUL_R]], [[MUL_R]]
+// FULL_FAST-NEXT: br i1 [[ISNAN_CMP]], label [[COMPLEX_MUL_IMAG_NAN:%.*]], label [[COMPLEX_MUL_CONT:%.*]], !prof [[PROF2:![0-9]+]]
+// FULL_FAST: complex_mul_imag_nan:
+// FULL_FAST-NEXT: [[ISNAN_CMP4:%.*]] = fcmp reassoc nnan ninf nsz arcp afn uno float [[MUL_I]], [[MUL_I]]
+// FULL_FAST-NEXT: br i1 [[ISNAN_CMP4]], label [[COMPLEX_MUL_LIBCALL:%.*]], label [[COMPLEX_MUL_CONT]], !prof [[PROF2]]
+// FULL_FAST: complex_mul_libcall:
+// FULL_FAST-NEXT: [[CALL:%.*]] = call reassoc nnan ninf nsz arcp afn nofpclass(nan inf) <2 x float> @__mulsc3(float noundef nofpclass(nan inf) [[EXT]], float noundef nofpclass(nan inf) [[EXT1]], float noundef nofpclass(nan inf) [[EXT2]], float noundef nofpclass(nan inf) [[EXT3]]) #[[ATTR1]]
+// FULL_FAST-NEXT: store <2 x float> [[CALL]], ptr [[COERCE]], align 4
+// FULL_FAST-NEXT: [[COERCE_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 0
+// FULL_FAST-NEXT: [[COERCE_REAL:%.*]] = load float, ptr [[COERCE_REALP]], align 4
+// FULL_FAST-NEXT: [[COERCE_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 1
+// FULL_FAST-NEXT: [[COERCE_IMAG:%.*]] = load float, ptr [[COERCE_IMAGP]], align 4
+// FULL_FAST-NEXT: br label [[COMPLEX_MUL_CONT]]
+// FULL_FAST: complex_mul_cont:
+// FULL_FAST-NEXT: [[REAL_MUL_PHI:%.*]] = phi reassoc nnan ninf nsz arcp afn float [ [[MUL_R]], [[ENTRY:%.*]] ], [ [[MUL_R]], [[COMPLEX_MUL_IMAG_NAN]] ], [ [[COERCE_REAL]], [[COMPLEX_MUL_LIBCALL]] ]
+// FULL_FAST-NEXT: [[IMAG_MUL_PHI:%.*]] = phi reassoc nnan ninf nsz arcp afn float [ [[MUL_I]], [[ENTRY]] ], [ [[MUL_I]], [[COMPLEX_MUL_IMAG_NAN]] ], [ [[COERCE_IMAG]], [[COMPLEX_MUL_LIBCALL]] ]
+// FULL_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[REAL_MUL_PHI]] to half
+// FULL_FAST-NEXT: [[UNPROMOTION5:%.*]] = fptrunc float [[IMAG_MUL_PHI]] to half
+// FULL_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0
+// FULL_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1
+// FULL_FAST-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
+// FULL_FAST-NEXT: store half [[UNPROMOTION5]], ptr [[RETVAL_IMAGP]], align 2
+// FULL_FAST-NEXT: [[TMP0:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2
+// FULL_FAST-NEXT: ret <2 x half> [[TMP0]]
+//
+// IMPRVD_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x half> @mulf16(
+// IMPRVD_FAST-SAME: <2 x half> noundef nofpclass(nan inf) [[A_COERCE:%.*]], <2 x half> noundef nofpclass(nan inf) [[B_COERCE:%.*]]) #[[ATTR0]] {
+// IMPRVD_FAST-NEXT: entry:
+// IMPRVD_FAST-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2
+// IMPRVD_FAST-NEXT: [[A:%.*]] = alloca { half, half }, align 2
+// IMPRVD_FAST-NEXT: [[B:%.*]] = alloca { half, half }, align 2
+// IMPRVD_FAST-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2
+// IMPRVD_FAST-NEXT: store <2 x half> [[B_COERCE]], ptr [[B]], align 2
+// IMPRVD_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0
+// IMPRVD_FAST-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
+// IMPRVD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1
+// IMPRVD_FAST-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
+// IMPRVD_FAST-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float
+// IMPRVD_FAST-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// IMPRVD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 0
+// IMPRVD_FAST-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
+// IMPRVD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 1
+// IMPRVD_FAST-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
+// IMPRVD_FAST-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float
+// IMPRVD_FAST-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// IMPRVD_FAST-NEXT: [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT2]]
+// IMPRVD_FAST-NEXT: [[MUL_BD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT3]]
+// IMPRVD_FAST-NEXT: [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT3]]
+// IMPRVD_FAST-NEXT: [[MUL_BC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT2]]
+// IMPRVD_FAST-NEXT: [[MUL_R:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[MUL_AC]], [[MUL_BD]]
+// IMPRVD_FAST-NEXT: [[MUL_I:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[MUL_AD]], [[MUL_BC]]
+// IMPRVD_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[MUL_R]] to half
+// IMPRVD_FAST-NEXT: [[UNPROMOTION4:%.*]] = fptrunc float [[MUL_I]] to half
+// IMPRVD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0
+// IMPRVD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1
+// IMPRVD_FAST-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
+// IMPRVD_FAST-NEXT: store half [[UNPROMOTION4]], ptr [[RETVAL_IMAGP]], align 2
+// IMPRVD_FAST-NEXT: [[TMP0:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2
+// IMPRVD_FAST-NEXT: ret <2 x half> [[TMP0]]
+//
+// PRMTD_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x half> @mulf16(
+// PRMTD_FAST-SAME: <2 x half> noundef nofpclass(nan inf) [[A_COERCE:%.*]], <2 x half> noundef nofpclass(nan inf) [[B_COERCE:%.*]]) #[[ATTR0]] {
+// PRMTD_FAST-NEXT: entry:
+// PRMTD_FAST-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2
+// PRMTD_FAST-NEXT: [[A:%.*]] = alloca { half, half }, align 2
+// PRMTD_FAST-NEXT: [[B:%.*]] = alloca { half, half }, align 2
+// PRMTD_FAST-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2
+// PRMTD_FAST-NEXT: store <2 x half> [[B_COERCE]], ptr [[B]], align 2
+// PRMTD_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0
+// PRMTD_FAST-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
+// PRMTD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1
+// PRMTD_FAST-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
+// PRMTD_FAST-NEXT: [[EXT:%.*]] = fpext half [[A_REAL]] to float
+// PRMTD_FAST-NEXT: [[EXT1:%.*]] = fpext half [[A_IMAG]] to float
+// PRMTD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 0
+// PRMTD_FAST-NEXT: [[B_REAL:%.*]] = load half, ptr [[B_REALP]], align 2
+// PRMTD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[B]], i32 0, i32 1
+// PRMTD_FAST-NEXT: [[B_IMAG:%.*]] = load half, ptr [[B_IMAGP]], align 2
+// PRMTD_FAST-NEXT: [[EXT2:%.*]] = fpext half [[B_REAL]] to float
+// PRMTD_FAST-NEXT: [[EXT3:%.*]] = fpext half [[B_IMAG]] to float
+// PRMTD_FAST-NEXT: [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT2]]
+// PRMTD_FAST-NEXT: [[MUL_BD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT3]]
+// PRMTD_FAST-NEXT: [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT3]]
+// PRMTD_FAST-NEXT: [[MUL_BC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT1]], [[EXT2]]
+// PRMTD_FAST-NEXT: [[MUL_R:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[MUL_AC]], [[MUL_BD]]
+// PRMTD_FAST-NEXT: [[MUL_I:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[MUL_AD]], [[MUL_BC]]
+// PRMTD_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[MUL_R]] to half
+// PRMTD_FAST-NEXT: [[UNPROMOTION4:%.*]] = fptrunc float [[MUL_I]] to half
+// PRMTD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0
+// PRMTD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1
+// PRMTD_FAST-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
+// PRMTD_FAST-NEXT: store half [[UNPROMOTION4]], ptr [[RETVAL_IMAGP]], align 2
+// PRMTD_FAST-NEXT: [[TMP0:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2
+// PRMTD_FAST-NEXT: ret <2 x half> [[TMP0]]
+//
+_Complex _Float16 mulf16(_Complex _Float16 a, _Complex _Float16 b) {
+ return a * b;
+}
+
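(Editorial aside: `mulf16` shows the multiplication side of the same matrix. Every mode promotes `_Float16` to `float` and emits the four-product formula inline; only FULL/FULL_FAST append the NaN fallback to `__mulsc3`, and the `!prof` branch weights mark that path as cold. A rough C sketch of that shape follows; `__mulsc3` is the compiler-rt routine named in the checks, while the wrapper `mul_full` and its exact control flow are illustrative only.)

```c
#include <complex.h>
#include <math.h>

float _Complex __mulsc3(float a, float b, float c, float d); /* compiler-rt */

/* Sketch of the `full` lowering of (a+bi)*(c+di): try the inline
   four-product formula first; only when both result parts are NaN,
   recover inf/NaN semantics through the runtime library, mirroring
   the complex_mul_imag_nan / complex_mul_libcall blocks above. */
static float _Complex mul_full(float a, float b, float c, float d) {
  float re = a * c - b * d;
  float im = a * d + b * c;
  if (isnan(re) && isnan(im))          /* cold path, cf. !prof metadata */
    return __mulsc3(a, b, c, d);
  return CMPLXF(re, im);               /* C11 macro from <complex.h> */
}
```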
+// FULL-LABEL: define dso_local <2 x half> @f1(
+// FULL-SAME: <2 x half> noundef [[A_COERCE:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]], <2 x half> noundef [[C_COERCE:%.*]]) #[[ATTR0]] {
+// FULL-NEXT: entry:
+// FULL-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2
+// FULL-NEXT: [[A:%.*]] = alloca { half, half }, align 2
+// FULL-NEXT: [[C:%.*]] = alloca { half, half }, align 2
+// FULL-NEXT: [[COERCE:%.*]] = alloca { float, float }, align 4
+// FULL-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2
+// FULL-NEXT: store <2 x half> [[C_COERCE]], ptr [[C]], align 2
+// FULL-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0
+// FULL-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16
+// FULL-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1
+// FULL-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16
+// FULL-NEXT: [[C_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[C]], i32 0, i32 0
+// FULL-NEXT: [[C_REAL:%.*]] = load half, ptr [[C_REALP]], align 2
+// FULL-NEXT: [[C_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[C]], i32 0, i32 1
+// FULL-NEXT: [[C_IMAG:%.*]] = load half, ptr [[C_IMAGP]], align 2
+// FULL-NEXT: [[CONV:%.*]] = fpext half [[C_REAL]] to x86_fp80
+// FULL-NEXT: [[CONV1:%.*]] = fpext half [[C_IMAG]] to x86_fp80
+// FULL-NEXT: [[CALL:%.*]] = call { x86_fp80, x86_fp80 } @__divxc3(x86_fp80 noundef [[B_REAL]], x86_fp80 noundef [[B_IMAG]], x86_fp80 noundef [[CONV]], x86_fp80 noundef [[CONV1]]) #[[ATTR1]]
+// FULL-NEXT: [[TMP0:%.*]] = extractvalue { x86_fp80, x86_fp80 } [[CALL]], 0
+// FULL-NEXT: [[TMP1:%.*]] = extractvalue { x86_fp80, x86_fp80 } [[CALL]], 1
+// FULL-NEXT: [[CONV2:%.*]] = fptrunc x86_fp80 [[TMP0]] to half
+// FULL-NEXT: [[CONV3:%.*]] = fptrunc x86_fp80 [[TMP1]] to half
+// FULL-NEXT: [[EXT:%.*]] = fpext half [[CONV2]] to float
+// FULL-NEXT: [[EXT4:%.*]] = fpext half [[CONV3]] to float
+// FULL-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0
+// FULL-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2
+// FULL-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1
+// FULL-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2
+// FULL-NEXT: [[EXT5:%.*]] = fpext half [[A_REAL]] to float
+// FULL-NEXT: [[EXT6:%.*]] = fpext half [[A_IMAG]] to float
+// FULL-NEXT: [[CALL7:%.*]] = call <2 x float> @__divsc3(float noundef [[EXT]], float noundef [[EXT4]], float noundef [[EXT5]], float noundef [[EXT6]]) #[[ATTR1]]
+// FULL-NEXT: store <2 x float> [[CALL7]], ptr [[COERCE]], align 4
+// FULL-NEXT: [[COERCE_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 0
+// FULL-NEXT: [[COERCE_REAL:%.*]] = load float, ptr [[COERCE_REALP]], align 4
+// FULL-NEXT: [[COERCE_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 1
+// FULL-NEXT: [[COERCE_IMAG:%.*]] = load float, ptr [[COERCE_IMAGP]], align 4
+// FULL-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[COERCE_REAL]] to half
+// FULL-NEXT: [[UNPROMOTION8:%.*]] = fptrunc float [[COERCE_IMAG]] to half
+// FULL-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0
+// FULL-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1
+// FULL-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2
+// FULL-NEXT: store half [[UNPROMOTION8]], ptr [[RETVAL_IMAGP]], align 2
+// FULL-NEXT: [[TMP2:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2
+// FULL-NEXT: ret <2 x half> [[TMP2]]
+//
+// BASIC-LABEL: define dso_local <2 x half> @f1(
+// BASIC-SAME: <2 x half> noundef [[A_COERCE:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]], <2 x half> noundef [[C_COERCE:%.*]]) #[[ATTR0]] {
+// BASIC-NEXT: entry:
+// BASIC-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2
+// BASIC-NEXT: [[A:%.*]] = alloca { half, half }, align 2
+// BASIC-NEXT: [[C:%.*]] = alloca { half, half }, align 2
+// BASIC-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2
+// BASIC-NEXT: store <2 x half> [[C_COERCE]], ptr [[C]], align 2
+// BASIC-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0
+// BASIC-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16
+// BASIC-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1
+// BASIC-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16
+// BASIC-NEXT: [[C_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[C]], i32 0, i32 0
+// BASIC-NEXT: [[C_REAL:%.*]] = load half, ptr [[C_REALP]], align 2
+// BASIC-NEXT: [[C_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[C]], i32 0, i32 1
+// BASIC-NEXT: 
[[C_IMAG:%.*]] = load half, ptr [[C_IMAGP]], align 2 +// BASIC-NEXT: [[CONV:%.*]] = fpext half [[C_REAL]] to x86_fp80 +// BASIC-NEXT: [[CONV1:%.*]] = fpext half [[C_IMAG]] to x86_fp80 +// BASIC-NEXT: [[TMP0:%.*]] = fmul x86_fp80 [[B_REAL]], [[CONV]] +// BASIC-NEXT: [[TMP1:%.*]] = fmul x86_fp80 [[B_IMAG]], [[CONV1]] +// BASIC-NEXT: [[TMP2:%.*]] = fadd x86_fp80 [[TMP0]], [[TMP1]] +// BASIC-NEXT: [[TMP3:%.*]] = fmul x86_fp80 [[CONV]], [[CONV]] +// BASIC-NEXT: [[TMP4:%.*]] = fmul x86_fp80 [[CONV1]], [[CONV1]] +// BASIC-NEXT: [[TMP5:%.*]] = fadd x86_fp80 [[TMP3]], [[TMP4]] +// BASIC-NEXT: [[TMP6:%.*]] = fmul x86_fp80 [[B_IMAG]], [[CONV]] +// BASIC-NEXT: [[TMP7:%.*]] = fmul x86_fp80 [[B_REAL]], [[CONV1]] +// BASIC-NEXT: [[TMP8:%.*]] = fsub x86_fp80 [[TMP6]], [[TMP7]] +// BASIC-NEXT: [[TMP9:%.*]] = fdiv x86_fp80 [[TMP2]], [[TMP5]] +// BASIC-NEXT: [[TMP10:%.*]] = fdiv x86_fp80 [[TMP8]], [[TMP5]] +// BASIC-NEXT: [[CONV2:%.*]] = fptrunc x86_fp80 [[TMP9]] to half +// BASIC-NEXT: [[CONV3:%.*]] = fptrunc x86_fp80 [[TMP10]] to half +// BASIC-NEXT: [[EXT:%.*]] = fpext half [[CONV2]] to float +// BASIC-NEXT: [[EXT4:%.*]] = fpext half [[CONV3]] to float +// BASIC-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0 +// BASIC-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2 +// BASIC-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1 +// BASIC-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2 +// BASIC-NEXT: [[EXT5:%.*]] = fpext half [[A_REAL]] to float +// BASIC-NEXT: [[EXT6:%.*]] = fpext half [[A_IMAG]] to float +// BASIC-NEXT: [[TMP11:%.*]] = fmul float [[EXT]], [[EXT5]] +// BASIC-NEXT: [[TMP12:%.*]] = fmul float [[EXT4]], [[EXT6]] +// BASIC-NEXT: [[TMP13:%.*]] = fadd float [[TMP11]], [[TMP12]] +// BASIC-NEXT: [[TMP14:%.*]] = fmul float [[EXT5]], [[EXT5]] +// BASIC-NEXT: [[TMP15:%.*]] = fmul float [[EXT6]], [[EXT6]] +// BASIC-NEXT: [[TMP16:%.*]] = fadd float [[TMP14]], [[TMP15]] +// BASIC-NEXT: [[TMP17:%.*]] = fmul float [[EXT4]], [[EXT5]] +// BASIC-NEXT: [[TMP18:%.*]] = fmul float [[EXT]], [[EXT6]] +// BASIC-NEXT: [[TMP19:%.*]] = fsub float [[TMP17]], [[TMP18]] +// BASIC-NEXT: [[TMP20:%.*]] = fdiv float [[TMP13]], [[TMP16]] +// BASIC-NEXT: [[TMP21:%.*]] = fdiv float [[TMP19]], [[TMP16]] +// BASIC-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[TMP20]] to half +// BASIC-NEXT: [[UNPROMOTION7:%.*]] = fptrunc float [[TMP21]] to half +// BASIC-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0 +// BASIC-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1 +// BASIC-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2 +// BASIC-NEXT: store half [[UNPROMOTION7]], ptr [[RETVAL_IMAGP]], align 2 +// BASIC-NEXT: [[TMP22:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2 +// BASIC-NEXT: ret <2 x half> [[TMP22]] +// +// IMPRVD-LABEL: define dso_local <2 x half> @f1( +// IMPRVD-SAME: <2 x half> noundef [[A_COERCE:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]], <2 x half> noundef [[C_COERCE:%.*]]) #[[ATTR0]] { +// IMPRVD-NEXT: entry: +// IMPRVD-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2 +// IMPRVD-NEXT: [[A:%.*]] = alloca { half, half }, align 2 +// IMPRVD-NEXT: [[C:%.*]] = alloca { half, half }, align 2 +// IMPRVD-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2 +// IMPRVD-NEXT: store <2 x half> [[C_COERCE]], ptr [[C]], align 2 +// IMPRVD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, 
x86_fp80 }, ptr [[B]], i32 0, i32 0 +// IMPRVD-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// IMPRVD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// IMPRVD-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// IMPRVD-NEXT: [[C_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[C]], i32 0, i32 0 +// IMPRVD-NEXT: [[C_REAL:%.*]] = load half, ptr [[C_REALP]], align 2 +// IMPRVD-NEXT: [[C_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[C]], i32 0, i32 1 +// IMPRVD-NEXT: [[C_IMAG:%.*]] = load half, ptr [[C_IMAGP]], align 2 +// IMPRVD-NEXT: [[CONV:%.*]] = fpext half [[C_REAL]] to x86_fp80 +// IMPRVD-NEXT: [[CONV1:%.*]] = fpext half [[C_IMAG]] to x86_fp80 +// IMPRVD-NEXT: [[TMP0:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV]]) +// IMPRVD-NEXT: [[TMP1:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV1]]) +// IMPRVD-NEXT: [[ABS_CMP:%.*]] = fcmp ugt x86_fp80 [[TMP0]], [[TMP1]] +// IMPRVD-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// IMPRVD: abs_rhsr_greater_or_equal_abs_rhsi: +// IMPRVD-NEXT: [[TMP2:%.*]] = fdiv x86_fp80 [[CONV1]], [[CONV]] +// IMPRVD-NEXT: [[TMP3:%.*]] = fmul x86_fp80 [[TMP2]], [[CONV1]] +// IMPRVD-NEXT: [[TMP4:%.*]] = fadd x86_fp80 [[CONV]], [[TMP3]] +// IMPRVD-NEXT: [[TMP5:%.*]] = fmul x86_fp80 [[B_IMAG]], [[TMP2]] +// IMPRVD-NEXT: [[TMP6:%.*]] = fadd x86_fp80 [[B_REAL]], [[TMP5]] +// IMPRVD-NEXT: [[TMP7:%.*]] = fdiv x86_fp80 [[TMP6]], [[TMP4]] +// IMPRVD-NEXT: [[TMP8:%.*]] = fmul x86_fp80 [[B_REAL]], [[TMP2]] +// IMPRVD-NEXT: [[TMP9:%.*]] = fsub x86_fp80 [[B_IMAG]], [[TMP8]] +// IMPRVD-NEXT: [[TMP10:%.*]] = fdiv x86_fp80 [[TMP9]], [[TMP4]] +// IMPRVD-NEXT: br label [[COMPLEX_DIV:%.*]] +// IMPRVD: abs_rhsr_less_than_abs_rhsi: +// IMPRVD-NEXT: [[TMP11:%.*]] = fdiv x86_fp80 [[CONV]], [[CONV1]] +// IMPRVD-NEXT: [[TMP12:%.*]] = fmul x86_fp80 [[TMP11]], [[CONV]] +// IMPRVD-NEXT: [[TMP13:%.*]] = fadd x86_fp80 [[CONV1]], [[TMP12]] +// IMPRVD-NEXT: [[TMP14:%.*]] = fmul x86_fp80 [[B_REAL]], [[TMP11]] +// IMPRVD-NEXT: [[TMP15:%.*]] = fadd x86_fp80 [[TMP14]], [[B_IMAG]] +// IMPRVD-NEXT: [[TMP16:%.*]] = fdiv x86_fp80 [[TMP15]], [[TMP13]] +// IMPRVD-NEXT: [[TMP17:%.*]] = fmul x86_fp80 [[B_IMAG]], [[TMP11]] +// IMPRVD-NEXT: [[TMP18:%.*]] = fsub x86_fp80 [[TMP17]], [[B_REAL]] +// IMPRVD-NEXT: [[TMP19:%.*]] = fdiv x86_fp80 [[TMP18]], [[TMP13]] +// IMPRVD-NEXT: br label [[COMPLEX_DIV]] +// IMPRVD: complex_div: +// IMPRVD-NEXT: [[TMP20:%.*]] = phi x86_fp80 [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD-NEXT: [[TMP21:%.*]] = phi x86_fp80 [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD-NEXT: [[CONV2:%.*]] = fptrunc x86_fp80 [[TMP20]] to half +// IMPRVD-NEXT: [[CONV3:%.*]] = fptrunc x86_fp80 [[TMP21]] to half +// IMPRVD-NEXT: [[EXT:%.*]] = fpext half [[CONV2]] to float +// IMPRVD-NEXT: [[EXT4:%.*]] = fpext half [[CONV3]] to float +// IMPRVD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0 +// IMPRVD-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2 +// IMPRVD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1 +// IMPRVD-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2 +// IMPRVD-NEXT: [[EXT5:%.*]] = fpext half [[A_REAL]] to float +// IMPRVD-NEXT: [[EXT6:%.*]] = fpext half [[A_IMAG]] to float +// 
IMPRVD-NEXT: [[TMP22:%.*]] = call float @llvm.fabs.f32(float [[EXT5]]) +// IMPRVD-NEXT: [[TMP23:%.*]] = call float @llvm.fabs.f32(float [[EXT6]]) +// IMPRVD-NEXT: [[ABS_CMP7:%.*]] = fcmp ugt float [[TMP22]], [[TMP23]] +// IMPRVD-NEXT: br i1 [[ABS_CMP7]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI8:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI9:%.*]] +// IMPRVD: abs_rhsr_greater_or_equal_abs_rhsi8: +// IMPRVD-NEXT: [[TMP24:%.*]] = fdiv float [[EXT6]], [[EXT5]] +// IMPRVD-NEXT: [[TMP25:%.*]] = fmul float [[TMP24]], [[EXT6]] +// IMPRVD-NEXT: [[TMP26:%.*]] = fadd float [[EXT5]], [[TMP25]] +// IMPRVD-NEXT: [[TMP27:%.*]] = fmul float [[EXT4]], [[TMP24]] +// IMPRVD-NEXT: [[TMP28:%.*]] = fadd float [[EXT]], [[TMP27]] +// IMPRVD-NEXT: [[TMP29:%.*]] = fdiv float [[TMP28]], [[TMP26]] +// IMPRVD-NEXT: [[TMP30:%.*]] = fmul float [[EXT]], [[TMP24]] +// IMPRVD-NEXT: [[TMP31:%.*]] = fsub float [[EXT4]], [[TMP30]] +// IMPRVD-NEXT: [[TMP32:%.*]] = fdiv float [[TMP31]], [[TMP26]] +// IMPRVD-NEXT: br label [[COMPLEX_DIV10:%.*]] +// IMPRVD: abs_rhsr_less_than_abs_rhsi9: +// IMPRVD-NEXT: [[TMP33:%.*]] = fdiv float [[EXT5]], [[EXT6]] +// IMPRVD-NEXT: [[TMP34:%.*]] = fmul float [[TMP33]], [[EXT5]] +// IMPRVD-NEXT: [[TMP35:%.*]] = fadd float [[EXT6]], [[TMP34]] +// IMPRVD-NEXT: [[TMP36:%.*]] = fmul float [[EXT]], [[TMP33]] +// IMPRVD-NEXT: [[TMP37:%.*]] = fadd float [[TMP36]], [[EXT4]] +// IMPRVD-NEXT: [[TMP38:%.*]] = fdiv float [[TMP37]], [[TMP35]] +// IMPRVD-NEXT: [[TMP39:%.*]] = fmul float [[EXT4]], [[TMP33]] +// IMPRVD-NEXT: [[TMP40:%.*]] = fsub float [[TMP39]], [[EXT]] +// IMPRVD-NEXT: [[TMP41:%.*]] = fdiv float [[TMP40]], [[TMP35]] +// IMPRVD-NEXT: br label [[COMPLEX_DIV10]] +// IMPRVD: complex_div10: +// IMPRVD-NEXT: [[TMP42:%.*]] = phi float [ [[TMP29]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI8]] ], [ [[TMP38]], [[ABS_RHSR_LESS_THAN_ABS_RHSI9]] ] +// IMPRVD-NEXT: [[TMP43:%.*]] = phi float [ [[TMP32]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI8]] ], [ [[TMP41]], [[ABS_RHSR_LESS_THAN_ABS_RHSI9]] ] +// IMPRVD-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[TMP42]] to half +// IMPRVD-NEXT: [[UNPROMOTION11:%.*]] = fptrunc float [[TMP43]] to half +// IMPRVD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0 +// IMPRVD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1 +// IMPRVD-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2 +// IMPRVD-NEXT: store half [[UNPROMOTION11]], ptr [[RETVAL_IMAGP]], align 2 +// IMPRVD-NEXT: [[TMP44:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2 +// IMPRVD-NEXT: ret <2 x half> [[TMP44]] +// +// PRMTD-LABEL: define dso_local <2 x half> @f1( +// PRMTD-SAME: <2 x half> noundef [[A_COERCE:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]], <2 x half> noundef [[C_COERCE:%.*]]) #[[ATTR0]] { +// PRMTD-NEXT: entry: +// PRMTD-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2 +// PRMTD-NEXT: [[A:%.*]] = alloca { half, half }, align 2 +// PRMTD-NEXT: [[C:%.*]] = alloca { half, half }, align 2 +// PRMTD-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2 +// PRMTD-NEXT: store <2 x half> [[C_COERCE]], ptr [[C]], align 2 +// PRMTD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// PRMTD-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// PRMTD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// PRMTD-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// 
PRMTD-NEXT: [[C_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[C]], i32 0, i32 0 +// PRMTD-NEXT: [[C_REAL:%.*]] = load half, ptr [[C_REALP]], align 2 +// PRMTD-NEXT: [[C_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[C]], i32 0, i32 1 +// PRMTD-NEXT: [[C_IMAG:%.*]] = load half, ptr [[C_IMAGP]], align 2 +// PRMTD-NEXT: [[CONV:%.*]] = fpext half [[C_REAL]] to x86_fp80 +// PRMTD-NEXT: [[CONV1:%.*]] = fpext half [[C_IMAG]] to x86_fp80 +// PRMTD-NEXT: [[TMP0:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV]]) +// PRMTD-NEXT: [[TMP1:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV1]]) +// PRMTD-NEXT: [[ABS_CMP:%.*]] = fcmp ugt x86_fp80 [[TMP0]], [[TMP1]] +// PRMTD-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// PRMTD: abs_rhsr_greater_or_equal_abs_rhsi: +// PRMTD-NEXT: [[TMP2:%.*]] = fdiv x86_fp80 [[CONV1]], [[CONV]] +// PRMTD-NEXT: [[TMP3:%.*]] = fmul x86_fp80 [[TMP2]], [[CONV1]] +// PRMTD-NEXT: [[TMP4:%.*]] = fadd x86_fp80 [[CONV]], [[TMP3]] +// PRMTD-NEXT: [[TMP5:%.*]] = fmul x86_fp80 [[B_IMAG]], [[TMP2]] +// PRMTD-NEXT: [[TMP6:%.*]] = fadd x86_fp80 [[B_REAL]], [[TMP5]] +// PRMTD-NEXT: [[TMP7:%.*]] = fdiv x86_fp80 [[TMP6]], [[TMP4]] +// PRMTD-NEXT: [[TMP8:%.*]] = fmul x86_fp80 [[B_REAL]], [[TMP2]] +// PRMTD-NEXT: [[TMP9:%.*]] = fsub x86_fp80 [[B_IMAG]], [[TMP8]] +// PRMTD-NEXT: [[TMP10:%.*]] = fdiv x86_fp80 [[TMP9]], [[TMP4]] +// PRMTD-NEXT: br label [[COMPLEX_DIV:%.*]] +// PRMTD: abs_rhsr_less_than_abs_rhsi: +// PRMTD-NEXT: [[TMP11:%.*]] = fdiv x86_fp80 [[CONV]], [[CONV1]] +// PRMTD-NEXT: [[TMP12:%.*]] = fmul x86_fp80 [[TMP11]], [[CONV]] +// PRMTD-NEXT: [[TMP13:%.*]] = fadd x86_fp80 [[CONV1]], [[TMP12]] +// PRMTD-NEXT: [[TMP14:%.*]] = fmul x86_fp80 [[B_REAL]], [[TMP11]] +// PRMTD-NEXT: [[TMP15:%.*]] = fadd x86_fp80 [[TMP14]], [[B_IMAG]] +// PRMTD-NEXT: [[TMP16:%.*]] = fdiv x86_fp80 [[TMP15]], [[TMP13]] +// PRMTD-NEXT: [[TMP17:%.*]] = fmul x86_fp80 [[B_IMAG]], [[TMP11]] +// PRMTD-NEXT: [[TMP18:%.*]] = fsub x86_fp80 [[TMP17]], [[B_REAL]] +// PRMTD-NEXT: [[TMP19:%.*]] = fdiv x86_fp80 [[TMP18]], [[TMP13]] +// PRMTD-NEXT: br label [[COMPLEX_DIV]] +// PRMTD: complex_div: +// PRMTD-NEXT: [[TMP20:%.*]] = phi x86_fp80 [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// PRMTD-NEXT: [[TMP21:%.*]] = phi x86_fp80 [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// PRMTD-NEXT: [[CONV2:%.*]] = fptrunc x86_fp80 [[TMP20]] to half +// PRMTD-NEXT: [[CONV3:%.*]] = fptrunc x86_fp80 [[TMP21]] to half +// PRMTD-NEXT: [[EXT:%.*]] = fpext half [[CONV2]] to float +// PRMTD-NEXT: [[EXT4:%.*]] = fpext half [[CONV3]] to float +// PRMTD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0 +// PRMTD-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2 +// PRMTD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1 +// PRMTD-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2 +// PRMTD-NEXT: [[EXT5:%.*]] = fpext half [[A_REAL]] to float +// PRMTD-NEXT: [[EXT6:%.*]] = fpext half [[A_IMAG]] to float +// PRMTD-NEXT: [[TMP22:%.*]] = fmul float [[EXT]], [[EXT5]] +// PRMTD-NEXT: [[TMP23:%.*]] = fmul float [[EXT4]], [[EXT6]] +// PRMTD-NEXT: [[TMP24:%.*]] = fadd float [[TMP22]], [[TMP23]] +// PRMTD-NEXT: [[TMP25:%.*]] = fmul float [[EXT5]], [[EXT5]] +// PRMTD-NEXT: [[TMP26:%.*]] = fmul float [[EXT6]], [[EXT6]] +// PRMTD-NEXT: [[TMP27:%.*]] = 
fadd float [[TMP25]], [[TMP26]] +// PRMTD-NEXT: [[TMP28:%.*]] = fmul float [[EXT4]], [[EXT5]] +// PRMTD-NEXT: [[TMP29:%.*]] = fmul float [[EXT]], [[EXT6]] +// PRMTD-NEXT: [[TMP30:%.*]] = fsub float [[TMP28]], [[TMP29]] +// PRMTD-NEXT: [[TMP31:%.*]] = fdiv float [[TMP24]], [[TMP27]] +// PRMTD-NEXT: [[TMP32:%.*]] = fdiv float [[TMP30]], [[TMP27]] +// PRMTD-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[TMP31]] to half +// PRMTD-NEXT: [[UNPROMOTION7:%.*]] = fptrunc float [[TMP32]] to half +// PRMTD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0 +// PRMTD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1 +// PRMTD-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2 +// PRMTD-NEXT: store half [[UNPROMOTION7]], ptr [[RETVAL_IMAGP]], align 2 +// PRMTD-NEXT: [[TMP33:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2 +// PRMTD-NEXT: ret <2 x half> [[TMP33]] +// +// X86WINPRMTD-LABEL: define dso_local i32 @f1( +// X86WINPRMTD-SAME: i32 noundef [[A_COERCE:%.*]], ptr noundef [[B:%.*]], i32 noundef [[C_COERCE:%.*]]) #[[ATTR0]] { +// X86WINPRMTD-NEXT: entry: +// X86WINPRMTD-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2 +// X86WINPRMTD-NEXT: [[A:%.*]] = alloca { half, half }, align 2 +// X86WINPRMTD-NEXT: [[C:%.*]] = alloca { half, half }, align 2 +// X86WINPRMTD-NEXT: [[B_INDIRECT_ADDR:%.*]] = alloca ptr, align 8 +// X86WINPRMTD-NEXT: store i32 [[A_COERCE]], ptr [[A]], align 2 +// X86WINPRMTD-NEXT: store i32 [[C_COERCE]], ptr [[C]], align 2 +// X86WINPRMTD-NEXT: store ptr [[B]], ptr [[B_INDIRECT_ADDR]], align 8 +// X86WINPRMTD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// X86WINPRMTD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// X86WINPRMTD-NEXT: [[C_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[C]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[C_REAL:%.*]] = load half, ptr [[C_REALP]], align 2 +// X86WINPRMTD-NEXT: [[C_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[C]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[C_IMAG:%.*]] = load half, ptr [[C_IMAGP]], align 2 +// X86WINPRMTD-NEXT: [[CONV:%.*]] = fpext half [[C_REAL]] to double +// X86WINPRMTD-NEXT: [[CONV1:%.*]] = fpext half [[C_IMAG]] to double +// X86WINPRMTD-NEXT: [[TMP0:%.*]] = call double @llvm.fabs.f64(double [[CONV]]) +// X86WINPRMTD-NEXT: [[TMP1:%.*]] = call double @llvm.fabs.f64(double [[CONV1]]) +// X86WINPRMTD-NEXT: [[ABS_CMP:%.*]] = fcmp ugt double [[TMP0]], [[TMP1]] +// X86WINPRMTD-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// X86WINPRMTD: abs_rhsr_greater_or_equal_abs_rhsi: +// X86WINPRMTD-NEXT: [[TMP2:%.*]] = fdiv double [[CONV1]], [[CONV]] +// X86WINPRMTD-NEXT: [[TMP3:%.*]] = fmul double [[TMP2]], [[CONV1]] +// X86WINPRMTD-NEXT: [[TMP4:%.*]] = fadd double [[CONV]], [[TMP3]] +// X86WINPRMTD-NEXT: [[TMP5:%.*]] = fmul double [[B_IMAG]], [[TMP2]] +// X86WINPRMTD-NEXT: [[TMP6:%.*]] = fadd double [[B_REAL]], [[TMP5]] +// X86WINPRMTD-NEXT: [[TMP7:%.*]] = fdiv double [[TMP6]], [[TMP4]] +// X86WINPRMTD-NEXT: [[TMP8:%.*]] = fmul double [[B_REAL]], [[TMP2]] +// X86WINPRMTD-NEXT: [[TMP9:%.*]] = fsub double [[B_IMAG]], [[TMP8]] +// X86WINPRMTD-NEXT: [[TMP10:%.*]] = fdiv double [[TMP9]], [[TMP4]] +// 
X86WINPRMTD-NEXT: br label [[COMPLEX_DIV:%.*]] +// X86WINPRMTD: abs_rhsr_less_than_abs_rhsi: +// X86WINPRMTD-NEXT: [[TMP11:%.*]] = fdiv double [[CONV]], [[CONV1]] +// X86WINPRMTD-NEXT: [[TMP12:%.*]] = fmul double [[TMP11]], [[CONV]] +// X86WINPRMTD-NEXT: [[TMP13:%.*]] = fadd double [[CONV1]], [[TMP12]] +// X86WINPRMTD-NEXT: [[TMP14:%.*]] = fmul double [[B_REAL]], [[TMP11]] +// X86WINPRMTD-NEXT: [[TMP15:%.*]] = fadd double [[TMP14]], [[B_IMAG]] +// X86WINPRMTD-NEXT: [[TMP16:%.*]] = fdiv double [[TMP15]], [[TMP13]] +// X86WINPRMTD-NEXT: [[TMP17:%.*]] = fmul double [[B_IMAG]], [[TMP11]] +// X86WINPRMTD-NEXT: [[TMP18:%.*]] = fsub double [[TMP17]], [[B_REAL]] +// X86WINPRMTD-NEXT: [[TMP19:%.*]] = fdiv double [[TMP18]], [[TMP13]] +// X86WINPRMTD-NEXT: br label [[COMPLEX_DIV]] +// X86WINPRMTD: complex_div: +// X86WINPRMTD-NEXT: [[TMP20:%.*]] = phi double [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// X86WINPRMTD-NEXT: [[TMP21:%.*]] = phi double [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// X86WINPRMTD-NEXT: [[CONV2:%.*]] = fptrunc double [[TMP20]] to half +// X86WINPRMTD-NEXT: [[CONV3:%.*]] = fptrunc double [[TMP21]] to half +// X86WINPRMTD-NEXT: [[EXT:%.*]] = fpext half [[CONV2]] to float +// X86WINPRMTD-NEXT: [[EXT4:%.*]] = fpext half [[CONV3]] to float +// X86WINPRMTD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2 +// X86WINPRMTD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2 +// X86WINPRMTD-NEXT: [[EXT5:%.*]] = fpext half [[A_REAL]] to float +// X86WINPRMTD-NEXT: [[EXT6:%.*]] = fpext half [[A_IMAG]] to float +// X86WINPRMTD-NEXT: [[TMP22:%.*]] = fmul float [[EXT]], [[EXT5]] +// X86WINPRMTD-NEXT: [[TMP23:%.*]] = fmul float [[EXT4]], [[EXT6]] +// X86WINPRMTD-NEXT: [[TMP24:%.*]] = fadd float [[TMP22]], [[TMP23]] +// X86WINPRMTD-NEXT: [[TMP25:%.*]] = fmul float [[EXT5]], [[EXT5]] +// X86WINPRMTD-NEXT: [[TMP26:%.*]] = fmul float [[EXT6]], [[EXT6]] +// X86WINPRMTD-NEXT: [[TMP27:%.*]] = fadd float [[TMP25]], [[TMP26]] +// X86WINPRMTD-NEXT: [[TMP28:%.*]] = fmul float [[EXT4]], [[EXT5]] +// X86WINPRMTD-NEXT: [[TMP29:%.*]] = fmul float [[EXT]], [[EXT6]] +// X86WINPRMTD-NEXT: [[TMP30:%.*]] = fsub float [[TMP28]], [[TMP29]] +// X86WINPRMTD-NEXT: [[TMP31:%.*]] = fdiv float [[TMP24]], [[TMP27]] +// X86WINPRMTD-NEXT: [[TMP32:%.*]] = fdiv float [[TMP30]], [[TMP27]] +// X86WINPRMTD-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[TMP31]] to half +// X86WINPRMTD-NEXT: [[UNPROMOTION7:%.*]] = fptrunc float [[TMP32]] to half +// X86WINPRMTD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1 +// X86WINPRMTD-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2 +// X86WINPRMTD-NEXT: store half [[UNPROMOTION7]], ptr [[RETVAL_IMAGP]], align 2 +// X86WINPRMTD-NEXT: [[TMP33:%.*]] = load i32, ptr [[RETVAL]], align 2 +// X86WINPRMTD-NEXT: ret i32 [[TMP33]] +// +// BASIC_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x half> @f1( +// BASIC_FAST-SAME: <2 x half> noundef nofpclass(nan inf) [[A_COERCE:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]], <2 x half> noundef nofpclass(nan inf) 
[[C_COERCE:%.*]]) #[[ATTR0]] { +// BASIC_FAST-NEXT: entry: +// BASIC_FAST-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2 +// BASIC_FAST-NEXT: [[A:%.*]] = alloca { half, half }, align 2 +// BASIC_FAST-NEXT: [[C:%.*]] = alloca { half, half }, align 2 +// BASIC_FAST-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2 +// BASIC_FAST-NEXT: store <2 x half> [[C_COERCE]], ptr [[C]], align 2 +// BASIC_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// BASIC_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// BASIC_FAST-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// BASIC_FAST-NEXT: [[C_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[C]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[C_REAL:%.*]] = load half, ptr [[C_REALP]], align 2 +// BASIC_FAST-NEXT: [[C_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[C]], i32 0, i32 1 +// BASIC_FAST-NEXT: [[C_IMAG:%.*]] = load half, ptr [[C_IMAGP]], align 2 +// BASIC_FAST-NEXT: [[CONV:%.*]] = fpext half [[C_REAL]] to x86_fp80 +// BASIC_FAST-NEXT: [[CONV1:%.*]] = fpext half [[C_IMAG]] to x86_fp80 +// BASIC_FAST-NEXT: [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_REAL]], [[CONV]] +// BASIC_FAST-NEXT: [[TMP1:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_IMAG]], [[CONV1]] +// BASIC_FAST-NEXT: [[TMP2:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP0]], [[TMP1]] +// BASIC_FAST-NEXT: [[TMP3:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[CONV]], [[CONV]] +// BASIC_FAST-NEXT: [[TMP4:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[CONV1]], [[CONV1]] +// BASIC_FAST-NEXT: [[TMP5:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP3]], [[TMP4]] +// BASIC_FAST-NEXT: [[TMP6:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_IMAG]], [[CONV]] +// BASIC_FAST-NEXT: [[TMP7:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_REAL]], [[CONV1]] +// BASIC_FAST-NEXT: [[TMP8:%.*]] = fsub reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP6]], [[TMP7]] +// BASIC_FAST-NEXT: [[TMP9:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP2]], [[TMP5]] +// BASIC_FAST-NEXT: [[TMP10:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP8]], [[TMP5]] +// BASIC_FAST-NEXT: [[CONV2:%.*]] = fptrunc x86_fp80 [[TMP9]] to half +// BASIC_FAST-NEXT: [[CONV3:%.*]] = fptrunc x86_fp80 [[TMP10]] to half +// BASIC_FAST-NEXT: [[EXT:%.*]] = fpext half [[CONV2]] to float +// BASIC_FAST-NEXT: [[EXT4:%.*]] = fpext half [[CONV3]] to float +// BASIC_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2 +// BASIC_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1 +// BASIC_FAST-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2 +// BASIC_FAST-NEXT: [[EXT5:%.*]] = fpext half [[A_REAL]] to float +// BASIC_FAST-NEXT: [[EXT6:%.*]] = fpext half [[A_IMAG]] to float +// BASIC_FAST-NEXT: [[TMP11:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT5]] +// BASIC_FAST-NEXT: [[TMP12:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT4]], [[EXT6]] +// BASIC_FAST-NEXT: [[TMP13:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[TMP11]], [[TMP12]] +// BASIC_FAST-NEXT: [[TMP14:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT5]], [[EXT5]] +// 
BASIC_FAST-NEXT: [[TMP15:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT6]], [[EXT6]] +// BASIC_FAST-NEXT: [[TMP16:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[TMP14]], [[TMP15]] +// BASIC_FAST-NEXT: [[TMP17:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT4]], [[EXT5]] +// BASIC_FAST-NEXT: [[TMP18:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT6]] +// BASIC_FAST-NEXT: [[TMP19:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[TMP17]], [[TMP18]] +// BASIC_FAST-NEXT: [[TMP20:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP13]], [[TMP16]] +// BASIC_FAST-NEXT: [[TMP21:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP19]], [[TMP16]] +// BASIC_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[TMP20]] to half +// BASIC_FAST-NEXT: [[UNPROMOTION7:%.*]] = fptrunc float [[TMP21]] to half +// BASIC_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1 +// BASIC_FAST-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2 +// BASIC_FAST-NEXT: store half [[UNPROMOTION7]], ptr [[RETVAL_IMAGP]], align 2 +// BASIC_FAST-NEXT: [[TMP22:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2 +// BASIC_FAST-NEXT: ret <2 x half> [[TMP22]] +// +// FULL_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x half> @f1( +// FULL_FAST-SAME: <2 x half> noundef nofpclass(nan inf) [[A_COERCE:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]], <2 x half> noundef nofpclass(nan inf) [[C_COERCE:%.*]]) #[[ATTR0]] { +// FULL_FAST-NEXT: entry: +// FULL_FAST-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2 +// FULL_FAST-NEXT: [[A:%.*]] = alloca { half, half }, align 2 +// FULL_FAST-NEXT: [[C:%.*]] = alloca { half, half }, align 2 +// FULL_FAST-NEXT: [[COERCE:%.*]] = alloca { float, float }, align 4 +// FULL_FAST-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2 +// FULL_FAST-NEXT: store <2 x half> [[C_COERCE]], ptr [[C]], align 2 +// FULL_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// FULL_FAST-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// FULL_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// FULL_FAST-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// FULL_FAST-NEXT: [[C_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[C]], i32 0, i32 0 +// FULL_FAST-NEXT: [[C_REAL:%.*]] = load half, ptr [[C_REALP]], align 2 +// FULL_FAST-NEXT: [[C_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[C]], i32 0, i32 1 +// FULL_FAST-NEXT: [[C_IMAG:%.*]] = load half, ptr [[C_IMAGP]], align 2 +// FULL_FAST-NEXT: [[CONV:%.*]] = fpext half [[C_REAL]] to x86_fp80 +// FULL_FAST-NEXT: [[CONV1:%.*]] = fpext half [[C_IMAG]] to x86_fp80 +// FULL_FAST-NEXT: [[CALL:%.*]] = call { x86_fp80, x86_fp80 } @__divxc3(x86_fp80 noundef nofpclass(nan inf) [[B_REAL]], x86_fp80 noundef nofpclass(nan inf) [[B_IMAG]], x86_fp80 noundef nofpclass(nan inf) [[CONV]], x86_fp80 noundef nofpclass(nan inf) [[CONV1]]) #[[ATTR1]] +// FULL_FAST-NEXT: [[TMP0:%.*]] = extractvalue { x86_fp80, x86_fp80 } [[CALL]], 0 +// FULL_FAST-NEXT: [[TMP1:%.*]] = extractvalue { x86_fp80, x86_fp80 } [[CALL]], 1 +// FULL_FAST-NEXT: [[CONV2:%.*]] = fptrunc x86_fp80 [[TMP0]] to half +// FULL_FAST-NEXT: [[CONV3:%.*]] = fptrunc x86_fp80 [[TMP1]] to half +// FULL_FAST-NEXT: [[EXT:%.*]] = fpext half 
[[CONV2]] to float +// FULL_FAST-NEXT: [[EXT4:%.*]] = fpext half [[CONV3]] to float +// FULL_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0 +// FULL_FAST-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2 +// FULL_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1 +// FULL_FAST-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2 +// FULL_FAST-NEXT: [[EXT5:%.*]] = fpext half [[A_REAL]] to float +// FULL_FAST-NEXT: [[EXT6:%.*]] = fpext half [[A_IMAG]] to float +// FULL_FAST-NEXT: [[CALL7:%.*]] = call reassoc nnan ninf nsz arcp afn nofpclass(nan inf) <2 x float> @__divsc3(float noundef nofpclass(nan inf) [[EXT]], float noundef nofpclass(nan inf) [[EXT4]], float noundef nofpclass(nan inf) [[EXT5]], float noundef nofpclass(nan inf) [[EXT6]]) #[[ATTR1]] +// FULL_FAST-NEXT: store <2 x float> [[CALL7]], ptr [[COERCE]], align 4 +// FULL_FAST-NEXT: [[COERCE_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 0 +// FULL_FAST-NEXT: [[COERCE_REAL:%.*]] = load float, ptr [[COERCE_REALP]], align 4 +// FULL_FAST-NEXT: [[COERCE_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 1 +// FULL_FAST-NEXT: [[COERCE_IMAG:%.*]] = load float, ptr [[COERCE_IMAGP]], align 4 +// FULL_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[COERCE_REAL]] to half +// FULL_FAST-NEXT: [[UNPROMOTION8:%.*]] = fptrunc float [[COERCE_IMAG]] to half +// FULL_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0 +// FULL_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1 +// FULL_FAST-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2 +// FULL_FAST-NEXT: store half [[UNPROMOTION8]], ptr [[RETVAL_IMAGP]], align 2 +// FULL_FAST-NEXT: [[TMP2:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2 +// FULL_FAST-NEXT: ret <2 x half> [[TMP2]] +// +// IMPRVD_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x half> @f1( +// IMPRVD_FAST-SAME: <2 x half> noundef nofpclass(nan inf) [[A_COERCE:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]], <2 x half> noundef nofpclass(nan inf) [[C_COERCE:%.*]]) #[[ATTR0]] { +// IMPRVD_FAST-NEXT: entry: +// IMPRVD_FAST-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2 +// IMPRVD_FAST-NEXT: [[A:%.*]] = alloca { half, half }, align 2 +// IMPRVD_FAST-NEXT: [[C:%.*]] = alloca { half, half }, align 2 +// IMPRVD_FAST-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2 +// IMPRVD_FAST-NEXT: store <2 x half> [[C_COERCE]], ptr [[C]], align 2 +// IMPRVD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// IMPRVD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// IMPRVD_FAST-NEXT: [[C_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[C]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[C_REAL:%.*]] = load half, ptr [[C_REALP]], align 2 +// IMPRVD_FAST-NEXT: [[C_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[C]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: [[C_IMAG:%.*]] = load half, ptr [[C_IMAGP]], align 2 +// IMPRVD_FAST-NEXT: [[CONV:%.*]] = fpext half [[C_REAL]] to x86_fp80 +// IMPRVD_FAST-NEXT: [[CONV1:%.*]] = fpext half [[C_IMAG]] to x86_fp80 +// IMPRVD_FAST-NEXT: 
[[TMP0:%.*]] = call reassoc nnan ninf nsz arcp afn x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV]]) +// IMPRVD_FAST-NEXT: [[TMP1:%.*]] = call reassoc nnan ninf nsz arcp afn x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV1]]) +// IMPRVD_FAST-NEXT: [[ABS_CMP:%.*]] = fcmp reassoc nnan ninf nsz arcp afn ugt x86_fp80 [[TMP0]], [[TMP1]] +// IMPRVD_FAST-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// IMPRVD_FAST: abs_rhsr_greater_or_equal_abs_rhsi: +// IMPRVD_FAST-NEXT: [[TMP2:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[CONV1]], [[CONV]] +// IMPRVD_FAST-NEXT: [[TMP3:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP2]], [[CONV1]] +// IMPRVD_FAST-NEXT: [[TMP4:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[CONV]], [[TMP3]] +// IMPRVD_FAST-NEXT: [[TMP5:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_IMAG]], [[TMP2]] +// IMPRVD_FAST-NEXT: [[TMP6:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[B_REAL]], [[TMP5]] +// IMPRVD_FAST-NEXT: [[TMP7:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP6]], [[TMP4]] +// IMPRVD_FAST-NEXT: [[TMP8:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_REAL]], [[TMP2]] +// IMPRVD_FAST-NEXT: [[TMP9:%.*]] = fsub reassoc nnan ninf nsz arcp afn x86_fp80 [[B_IMAG]], [[TMP8]] +// IMPRVD_FAST-NEXT: [[TMP10:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP9]], [[TMP4]] +// IMPRVD_FAST-NEXT: br label [[COMPLEX_DIV:%.*]] +// IMPRVD_FAST: abs_rhsr_less_than_abs_rhsi: +// IMPRVD_FAST-NEXT: [[TMP11:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[CONV]], [[CONV1]] +// IMPRVD_FAST-NEXT: [[TMP12:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP11]], [[CONV]] +// IMPRVD_FAST-NEXT: [[TMP13:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[CONV1]], [[TMP12]] +// IMPRVD_FAST-NEXT: [[TMP14:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_REAL]], [[TMP11]] +// IMPRVD_FAST-NEXT: [[TMP15:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP14]], [[B_IMAG]] +// IMPRVD_FAST-NEXT: [[TMP16:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP15]], [[TMP13]] +// IMPRVD_FAST-NEXT: [[TMP17:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_IMAG]], [[TMP11]] +// IMPRVD_FAST-NEXT: [[TMP18:%.*]] = fsub reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP17]], [[B_REAL]] +// IMPRVD_FAST-NEXT: [[TMP19:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP18]], [[TMP13]] +// IMPRVD_FAST-NEXT: br label [[COMPLEX_DIV]] +// IMPRVD_FAST: complex_div: +// IMPRVD_FAST-NEXT: [[TMP20:%.*]] = phi reassoc nnan ninf nsz arcp afn x86_fp80 [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD_FAST-NEXT: [[TMP21:%.*]] = phi reassoc nnan ninf nsz arcp afn x86_fp80 [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD_FAST-NEXT: [[CONV2:%.*]] = fptrunc x86_fp80 [[TMP20]] to half +// IMPRVD_FAST-NEXT: [[CONV3:%.*]] = fptrunc x86_fp80 [[TMP21]] to half +// IMPRVD_FAST-NEXT: [[EXT:%.*]] = fpext half [[CONV2]] to float +// IMPRVD_FAST-NEXT: [[EXT4:%.*]] = fpext half [[CONV3]] to float +// IMPRVD_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2 +// IMPRVD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2 +// 
IMPRVD_FAST-NEXT: [[EXT5:%.*]] = fpext half [[A_REAL]] to float +// IMPRVD_FAST-NEXT: [[EXT6:%.*]] = fpext half [[A_IMAG]] to float +// IMPRVD_FAST-NEXT: [[TMP22:%.*]] = call reassoc nnan ninf nsz arcp afn float @llvm.fabs.f32(float [[EXT5]]) +// IMPRVD_FAST-NEXT: [[TMP23:%.*]] = call reassoc nnan ninf nsz arcp afn float @llvm.fabs.f32(float [[EXT6]]) +// IMPRVD_FAST-NEXT: [[ABS_CMP7:%.*]] = fcmp reassoc nnan ninf nsz arcp afn ugt float [[TMP22]], [[TMP23]] +// IMPRVD_FAST-NEXT: br i1 [[ABS_CMP7]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI8:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI9:%.*]] +// IMPRVD_FAST: abs_rhsr_greater_or_equal_abs_rhsi8: +// IMPRVD_FAST-NEXT: [[TMP24:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[EXT6]], [[EXT5]] +// IMPRVD_FAST-NEXT: [[TMP25:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[TMP24]], [[EXT6]] +// IMPRVD_FAST-NEXT: [[TMP26:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[EXT5]], [[TMP25]] +// IMPRVD_FAST-NEXT: [[TMP27:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT4]], [[TMP24]] +// IMPRVD_FAST-NEXT: [[TMP28:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[EXT]], [[TMP27]] +// IMPRVD_FAST-NEXT: [[TMP29:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP28]], [[TMP26]] +// IMPRVD_FAST-NEXT: [[TMP30:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[TMP24]] +// IMPRVD_FAST-NEXT: [[TMP31:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[EXT4]], [[TMP30]] +// IMPRVD_FAST-NEXT: [[TMP32:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP31]], [[TMP26]] +// IMPRVD_FAST-NEXT: br label [[COMPLEX_DIV10:%.*]] +// IMPRVD_FAST: abs_rhsr_less_than_abs_rhsi9: +// IMPRVD_FAST-NEXT: [[TMP33:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[EXT5]], [[EXT6]] +// IMPRVD_FAST-NEXT: [[TMP34:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[TMP33]], [[EXT5]] +// IMPRVD_FAST-NEXT: [[TMP35:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[EXT6]], [[TMP34]] +// IMPRVD_FAST-NEXT: [[TMP36:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[TMP33]] +// IMPRVD_FAST-NEXT: [[TMP37:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[TMP36]], [[EXT4]] +// IMPRVD_FAST-NEXT: [[TMP38:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP37]], [[TMP35]] +// IMPRVD_FAST-NEXT: [[TMP39:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT4]], [[TMP33]] +// IMPRVD_FAST-NEXT: [[TMP40:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[TMP39]], [[EXT]] +// IMPRVD_FAST-NEXT: [[TMP41:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP40]], [[TMP35]] +// IMPRVD_FAST-NEXT: br label [[COMPLEX_DIV10]] +// IMPRVD_FAST: complex_div10: +// IMPRVD_FAST-NEXT: [[TMP42:%.*]] = phi reassoc nnan ninf nsz arcp afn float [ [[TMP29]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI8]] ], [ [[TMP38]], [[ABS_RHSR_LESS_THAN_ABS_RHSI9]] ] +// IMPRVD_FAST-NEXT: [[TMP43:%.*]] = phi reassoc nnan ninf nsz arcp afn float [ [[TMP32]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI8]] ], [ [[TMP41]], [[ABS_RHSR_LESS_THAN_ABS_RHSI9]] ] +// IMPRVD_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[TMP42]] to half +// IMPRVD_FAST-NEXT: [[UNPROMOTION11:%.*]] = fptrunc float [[TMP43]] to half +// IMPRVD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2 +// IMPRVD_FAST-NEXT: store half [[UNPROMOTION11]], ptr [[RETVAL_IMAGP]], align 2 +// 
IMPRVD_FAST-NEXT: [[TMP44:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2 +// IMPRVD_FAST-NEXT: ret <2 x half> [[TMP44]] +// +// PRMTD_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x half> @f1( +// PRMTD_FAST-SAME: <2 x half> noundef nofpclass(nan inf) [[A_COERCE:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]], <2 x half> noundef nofpclass(nan inf) [[C_COERCE:%.*]]) #[[ATTR0]] { +// PRMTD_FAST-NEXT: entry: +// PRMTD_FAST-NEXT: [[RETVAL:%.*]] = alloca { half, half }, align 2 +// PRMTD_FAST-NEXT: [[A:%.*]] = alloca { half, half }, align 2 +// PRMTD_FAST-NEXT: [[C:%.*]] = alloca { half, half }, align 2 +// PRMTD_FAST-NEXT: store <2 x half> [[A_COERCE]], ptr [[A]], align 2 +// PRMTD_FAST-NEXT: store <2 x half> [[C_COERCE]], ptr [[C]], align 2 +// PRMTD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// PRMTD_FAST-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// PRMTD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// PRMTD_FAST-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// PRMTD_FAST-NEXT: [[C_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[C]], i32 0, i32 0 +// PRMTD_FAST-NEXT: [[C_REAL:%.*]] = load half, ptr [[C_REALP]], align 2 +// PRMTD_FAST-NEXT: [[C_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[C]], i32 0, i32 1 +// PRMTD_FAST-NEXT: [[C_IMAG:%.*]] = load half, ptr [[C_IMAGP]], align 2 +// PRMTD_FAST-NEXT: [[CONV:%.*]] = fpext half [[C_REAL]] to x86_fp80 +// PRMTD_FAST-NEXT: [[CONV1:%.*]] = fpext half [[C_IMAG]] to x86_fp80 +// PRMTD_FAST-NEXT: [[TMP0:%.*]] = call reassoc nnan ninf nsz arcp afn x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV]]) +// PRMTD_FAST-NEXT: [[TMP1:%.*]] = call reassoc nnan ninf nsz arcp afn x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV1]]) +// PRMTD_FAST-NEXT: [[ABS_CMP:%.*]] = fcmp reassoc nnan ninf nsz arcp afn ugt x86_fp80 [[TMP0]], [[TMP1]] +// PRMTD_FAST-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// PRMTD_FAST: abs_rhsr_greater_or_equal_abs_rhsi: +// PRMTD_FAST-NEXT: [[TMP2:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[CONV1]], [[CONV]] +// PRMTD_FAST-NEXT: [[TMP3:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP2]], [[CONV1]] +// PRMTD_FAST-NEXT: [[TMP4:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[CONV]], [[TMP3]] +// PRMTD_FAST-NEXT: [[TMP5:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_IMAG]], [[TMP2]] +// PRMTD_FAST-NEXT: [[TMP6:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[B_REAL]], [[TMP5]] +// PRMTD_FAST-NEXT: [[TMP7:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP6]], [[TMP4]] +// PRMTD_FAST-NEXT: [[TMP8:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_REAL]], [[TMP2]] +// PRMTD_FAST-NEXT: [[TMP9:%.*]] = fsub reassoc nnan ninf nsz arcp afn x86_fp80 [[B_IMAG]], [[TMP8]] +// PRMTD_FAST-NEXT: [[TMP10:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP9]], [[TMP4]] +// PRMTD_FAST-NEXT: br label [[COMPLEX_DIV:%.*]] +// PRMTD_FAST: abs_rhsr_less_than_abs_rhsi: +// PRMTD_FAST-NEXT: [[TMP11:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[CONV]], [[CONV1]] +// PRMTD_FAST-NEXT: [[TMP12:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP11]], [[CONV]] +// PRMTD_FAST-NEXT: [[TMP13:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[CONV1]], [[TMP12]] +// PRMTD_FAST-NEXT: [[TMP14:%.*]] = fmul reassoc nnan ninf nsz arcp afn 
x86_fp80 [[B_REAL]], [[TMP11]] +// PRMTD_FAST-NEXT: [[TMP15:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP14]], [[B_IMAG]] +// PRMTD_FAST-NEXT: [[TMP16:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP15]], [[TMP13]] +// PRMTD_FAST-NEXT: [[TMP17:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_IMAG]], [[TMP11]] +// PRMTD_FAST-NEXT: [[TMP18:%.*]] = fsub reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP17]], [[B_REAL]] +// PRMTD_FAST-NEXT: [[TMP19:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP18]], [[TMP13]] +// PRMTD_FAST-NEXT: br label [[COMPLEX_DIV]] +// PRMTD_FAST: complex_div: +// PRMTD_FAST-NEXT: [[TMP20:%.*]] = phi reassoc nnan ninf nsz arcp afn x86_fp80 [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// PRMTD_FAST-NEXT: [[TMP21:%.*]] = phi reassoc nnan ninf nsz arcp afn x86_fp80 [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// PRMTD_FAST-NEXT: [[CONV2:%.*]] = fptrunc x86_fp80 [[TMP20]] to half +// PRMTD_FAST-NEXT: [[CONV3:%.*]] = fptrunc x86_fp80 [[TMP21]] to half +// PRMTD_FAST-NEXT: [[EXT:%.*]] = fpext half [[CONV2]] to float +// PRMTD_FAST-NEXT: [[EXT4:%.*]] = fpext half [[CONV3]] to float +// PRMTD_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 0 +// PRMTD_FAST-NEXT: [[A_REAL:%.*]] = load half, ptr [[A_REALP]], align 2 +// PRMTD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[A]], i32 0, i32 1 +// PRMTD_FAST-NEXT: [[A_IMAG:%.*]] = load half, ptr [[A_IMAGP]], align 2 +// PRMTD_FAST-NEXT: [[EXT5:%.*]] = fpext half [[A_REAL]] to float +// PRMTD_FAST-NEXT: [[EXT6:%.*]] = fpext half [[A_IMAG]] to float +// PRMTD_FAST-NEXT: [[TMP22:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT5]] +// PRMTD_FAST-NEXT: [[TMP23:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT4]], [[EXT6]] +// PRMTD_FAST-NEXT: [[TMP24:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[TMP22]], [[TMP23]] +// PRMTD_FAST-NEXT: [[TMP25:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT5]], [[EXT5]] +// PRMTD_FAST-NEXT: [[TMP26:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT6]], [[EXT6]] +// PRMTD_FAST-NEXT: [[TMP27:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[TMP25]], [[TMP26]] +// PRMTD_FAST-NEXT: [[TMP28:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT4]], [[EXT5]] +// PRMTD_FAST-NEXT: [[TMP29:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[EXT]], [[EXT6]] +// PRMTD_FAST-NEXT: [[TMP30:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[TMP28]], [[TMP29]] +// PRMTD_FAST-NEXT: [[TMP31:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP24]], [[TMP27]] +// PRMTD_FAST-NEXT: [[TMP32:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP30]], [[TMP27]] +// PRMTD_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[TMP31]] to half +// PRMTD_FAST-NEXT: [[UNPROMOTION7:%.*]] = fptrunc float [[TMP32]] to half +// PRMTD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 0 +// PRMTD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { half, half }, ptr [[RETVAL]], i32 0, i32 1 +// PRMTD_FAST-NEXT: store half [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 2 +// PRMTD_FAST-NEXT: store half [[UNPROMOTION7]], ptr [[RETVAL_IMAGP]], align 2 +// PRMTD_FAST-NEXT: [[TMP33:%.*]] = load <2 x half>, ptr [[RETVAL]], align 2 +// PRMTD_FAST-NEXT: ret <2 x half> [[TMP33]] +// +_Complex _Float16 f1(_Complex _Float16 a, _Complex long 
double b, _Complex _Float16 c) {
+  return (_Complex _Float16)(b / c) / a;
+}
diff --git a/clang/test/CodeGen/complex-math.c b/clang/test/CodeGen/complex-math.c
index a44aa0014a6587..ba00b9cbecd2fe 100644
--- a/clang/test/CodeGen/complex-math.c
+++ b/clang/test/CodeGen/complex-math.c
@@ -5,7 +5,7 @@
 // RUN: %clang_cc1 %s -O0 -emit-llvm -triple armv7-none-linux-gnueabi -o - | FileCheck %s --check-prefix=ARM
 // RUN: %clang_cc1 %s -O0 -emit-llvm -triple armv7-none-linux-gnueabihf -o - | FileCheck %s --check-prefix=ARMHF
 // RUN: %clang_cc1 %s -O0 -emit-llvm -triple thumbv7k-apple-watchos2.0 -o - -target-abi aapcs16 | FileCheck %s --check-prefix=ARM7K
-// RUN: %clang_cc1 %s -O0 -emit-llvm -triple aarch64-unknown-unknown -ffast-math -ffp-contract=fast -complex-range=fortran -o - | FileCheck %s --check-prefix=AARCH64-FASTMATH
+// RUN: %clang_cc1 %s -O0 -emit-llvm -triple aarch64-unknown-unknown -ffast-math -ffp-contract=fast -complex-range=improved -o - | FileCheck %s --check-prefix=AARCH64-FASTMATH
 // RUN: %clang_cc1 %s -O0 -emit-llvm -triple spir -o - | FileCheck %s --check-prefix=SPIR
 
 float _Complex add_float_rr(float a, float b) {
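[Editorial note, not part of the patch: the cx-complex-range.c diff below renames the old LMTD/FRTRN check prefixes to BASIC/IMPRVD, adds PRMTD runs plus X86WINPRMTD, AVRFP32, and AVRFP64 target variants, and gives the fast-math runs their own BASIC_FAST/FULL_FAST/IMPRVD_FAST/PRMTD_FAST prefixes. Below is a rough C sketch of the three inline division lowerings the regenerated checks encode; the helper names (div_basic, div_smith, div_promoted) are invented for illustration and appear nowhere in the patch.

#include <complex.h>
#include <math.h>

/* basic: the textbook formula at source precision; the denominator
   c*c + d*d can overflow or underflow. */
static _Complex float div_basic(_Complex float a, _Complex float b) {
  float x = crealf(a), y = cimagf(a), c = crealf(b), d = cimagf(b);
  float denom = c * c + d * d;
  return CMPLXF((x * c + y * d) / denom, (y * c - x * d) / denom);
}

/* improved: Smith's algorithm (Algorithm 116, CACM 1962); scale by the
   divisor component of larger magnitude, which is roughly what the
   abs_rhsr_* blocks in the IMPRVD checks do. */
static _Complex float div_smith(_Complex float a, _Complex float b) {
  float x = crealf(a), y = cimagf(a), c = crealf(b), d = cimagf(b);
  if (fabsf(c) >= fabsf(d)) {
    float r = d / c, t = c + d * r;
    return CMPLXF((x + y * r) / t, (y - x * r) / t);
  }
  float r = c / d, t = d + c * r;
  return CMPLXF((x * r + y) / t, (y * r - x) / t);
}

/* promoted: the textbook formula evaluated at the next wider type and
   truncated back, matching the PRMTD fpext/fdiv/fptrunc sequences. */
static _Complex float div_promoted(_Complex float a, _Complex float b) {
  double x = crealf(a), y = cimagf(a), c = crealf(b), d = cimagf(b);
  double denom = c * c + d * d;
  return CMPLXF((float)((x * c + y * d) / denom),
                (float)((y * c - x * d) / denom));
}

The FULL runs instead check for an outlined call to the compiler-rt routines (__divsc3 for float, __divxc3 for x86_fp80), which handle non-finite values at run time.]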
diff --git a/clang/test/CodeGen/cx-complex-range.c b/clang/test/CodeGen/cx-complex-range.c
index 2d8507c710f202..9ec80252085b87 100644
--- a/clang/test/CodeGen/cx-complex-range.c
+++ b/clang/test/CodeGen/cx-complex-range.c
@@ -1,124 +1,3451 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 4
 // RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \
 // RUN: -o - | FileCheck %s --check-prefix=FULL
 // RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \
-// RUN: -complex-range=limited -o - | FileCheck %s --check-prefix=LMTD
+// RUN: -complex-range=basic -o - | FileCheck %s --check-prefix=BASIC
 // RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \
 // RUN: -fno-cx-limited-range -o - | FileCheck %s --check-prefix=FULL
 // RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \
-// RUN: -complex-range=fortran -o - | FileCheck %s --check-prefix=FRTRN
+// RUN: -complex-range=improved -o - | FileCheck %s --check-prefix=IMPRVD
+
+// RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \
+// RUN: -complex-range=promoted -o - | FileCheck %s --check-prefix=PRMTD
+
+// RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \
+// RUN: -complex-range=full -o - | FileCheck %s --check-prefix=FULL
+
+// RUN: %clang_cc1 -triple x86_64-windows-pc -complex-range=promoted \
+// RUN: -emit-llvm -o - %s | FileCheck %s --check-prefix=X86WINPRMTD
+
+// RUN: %clang_cc1 -triple=avr-unknown-unknown -mdouble=32 \
+// RUN: -complex-range=promoted -emit-llvm -o - %s \
+// RUN: | FileCheck %s --check-prefix=AVRFP32
+
+// RUN: %clang_cc1 -triple=avr-unknown-unknown -mdouble=64 \
+// RUN: -complex-range=promoted -emit-llvm -o - %s \
+// RUN: | FileCheck %s --check-prefix=AVRFP64
 
 // Fast math
 // RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu \
-// RUN: -ffast-math -complex-range=limited -emit-llvm -o - %s \
-// RUN: | FileCheck %s --check-prefix=LMTD-FAST
+// RUN: -ffast-math -complex-range=basic -emit-llvm -o - %s \
+// RUN: | FileCheck %s --check-prefix=BASIC_FAST
 // RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu \
 // RUN: -ffast-math -complex-range=full -emit-llvm -o - %s \
-// RUN: | FileCheck %s --check-prefix=FULL
+// RUN: | FileCheck %s --check-prefix=FULL_FAST
 // RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \
 // RUN: -fno-cx-fortran-rules -o - | FileCheck %s --check-prefix=FULL
-// RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \
-// RUN: -fcx-limited-range -fno-cx-limited-range -o - \
-// RUN: | FileCheck %s --check-prefix=FULL
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu \
+// RUN: -ffast-math -complex-range=improved -emit-llvm -o - %s \
+// RUN: | FileCheck %s --check-prefix=IMPRVD_FAST
-// RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \
-// RUN: -fno-cx-limited-range -fcx-limited-range -o - \
-// RUN: | FileCheck %s --check-prefix=FULL
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu \
+// RUN: -ffast-math -complex-range=promoted -emit-llvm -o - %s \
+// RUN: | FileCheck %s --check-prefix=PRMTD_FAST
-// RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \
-// RUN: -fno-cx-fortran-rules -fcx-fortran-rules -o - \
-// RUN: | FileCheck %s --check-prefix=FULL
-
-_Complex float div(_Complex float a, _Complex float b) {
-  // LABEL: define {{.*}} @div(
-  // FULL: call {{.*}} @__divsc3
-
-  // LMTD: fmul float
-  // LMTD-NEXT: fmul float
-  // LMTD-NEXT: fadd float
-  // LMTD-NEXT: fmul float
-  // LMTD-NEXT: fmul float
-  // LMTD-NEXT: fadd float
-  // LMTD-NEXT: fmul float
-  // LMTD-NEXT: fmul float
-  // LMTD-NEXT: fsub float
-  // LMTD-NEXT: fdiv float
-  // LMTD-NEXT: fdiv float
-
-  // FRTRN: call {{.*}}float @llvm.fabs.f32(float {{.*}})
-  // FRTRN-NEXT: call {{.*}}float @llvm.fabs.f32(float {{.*}})
-  // FRTRN-NEXT: fcmp {{.*}}ugt float
-  // FRTRN-NEXT: br i1 {{.*}}, label
-  // FRTRN: abs_rhsr_greater_or_equal_abs_rhsi:
-  // FRTRN-NEXT: fdiv {{.*}}float
-  // FRTRN-NEXT: fmul {{.*}}float
-  // FRTRN-NEXT: fadd {{.*}}float
-  // FRTRN-NEXT: fmul {{.*}}float
-  // FRTRN-NEXT: fadd {{.*}}float
-  // FRTRN-NEXT: fdiv {{.*}}float
-  // FRTRN-NEXT: fmul {{.*}}float
-  // FRTRN-NEXT: fsub {{.*}}float
-  // FRTRN-NEXT: fdiv {{.*}}float
-  // FRTRN-NEXT: br label
-  // FRTRN: abs_rhsr_less_than_abs_rhsi:
-  // FRTRN-NEXT: fdiv {{.*}}float
-  // FRTRN-NEXT: fmul {{.*}}float
-  // FRTRN-NEXT: fadd {{.*}}float
-  // FRTRN-NEXT: fmul {{.*}}float
-  // FRTRN-NEXT: fadd {{.*}}float
-  // FRTRN-NEXT: fdiv {{.*}}float
-  // FRTRN-NEXT: fmul {{.*}}float
-  // FRTRN-NEXT: fsub {{.*}}float
-  // FRTRN-NEXT: fdiv {{.*}}float
-  // FRTRN-NEXT: br label
-  // FRTRN: complex_div:
-  // FRTRN-NEXT: phi {{.*}}float
-  // FRTRN-NEXT: phi {{.*}}float
-
-  // LMTD-FAST: fmul {{.*}} float
-  // LMTD-FAST-NEXT: fmul {{.*}} float
-  // LMTD-FAST-NEXT: fadd {{.*}} float
-  // LMTD-FAST-NEXT: fmul {{.*}} float
-  // LMTD-FAST-NEXT: fmul {{.*}} float
-  // LMTD-FAST-NEXT: fadd {{.*}} float
-  // LMTD-FAST-NEXT: fmul {{.*}} float
-  // LMTD-FAST-NEXT: fmul {{.*}} float
-  // LMTD-FAST-NEXT: fsub {{.*}} float
-  // LMTD-FAST-NEXT: fdiv {{.*}} float
-  // LMTD-FAST-NEXT: fdiv {{.*}} float
+// FULL-LABEL: define dso_local <2 x float> @divf(
+// FULL-SAME: <2 x float> noundef [[A_COERCE:%.*]], <2 x float> noundef [[B_COERCE:%.*]]) #[[ATTR0:[0-9]+]] {
+// FULL-NEXT: entry:
+// FULL-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4
+// FULL-NEXT: [[A:%.*]] = alloca { float, float }, align 4
+// FULL-NEXT: [[B:%.*]] = alloca { float, float }, align 4
+// FULL-NEXT: [[COERCE:%.*]] = alloca { float, float }, align 4
+// FULL-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4
+// FULL-NEXT: store <2 x float> [[B_COERCE]], ptr [[B]], align 4
+// FULL-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0
+// FULL-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4
+// FULL-NEXT: [[A_IMAGP:%.*]] = 
getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// FULL-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// FULL-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// FULL-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4 +// FULL-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// FULL-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4 +// FULL-NEXT: [[CALL:%.*]] = call <2 x float> @__divsc3(float noundef [[A_REAL]], float noundef [[A_IMAG]], float noundef [[B_REAL]], float noundef [[B_IMAG]]) #[[ATTR2:[0-9]+]] +// FULL-NEXT: store <2 x float> [[CALL]], ptr [[COERCE]], align 4 +// FULL-NEXT: [[COERCE_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 0 +// FULL-NEXT: [[COERCE_REAL:%.*]] = load float, ptr [[COERCE_REALP]], align 4 +// FULL-NEXT: [[COERCE_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 1 +// FULL-NEXT: [[COERCE_IMAG:%.*]] = load float, ptr [[COERCE_IMAGP]], align 4 +// FULL-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// FULL-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// FULL-NEXT: store float [[COERCE_REAL]], ptr [[RETVAL_REALP]], align 4 +// FULL-NEXT: store float [[COERCE_IMAG]], ptr [[RETVAL_IMAGP]], align 4 +// FULL-NEXT: [[TMP0:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4 +// FULL-NEXT: ret <2 x float> [[TMP0]] +// +// BASIC-LABEL: define dso_local <2 x float> @divf( +// BASIC-SAME: <2 x float> noundef [[A_COERCE:%.*]], <2 x float> noundef [[B_COERCE:%.*]]) #[[ATTR0:[0-9]+]] { +// BASIC-NEXT: entry: +// BASIC-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// BASIC-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// BASIC-NEXT: [[B:%.*]] = alloca { float, float }, align 4 +// BASIC-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4 +// BASIC-NEXT: store <2 x float> [[B_COERCE]], ptr [[B]], align 4 +// BASIC-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// BASIC-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// BASIC-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// BASIC-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// BASIC-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// BASIC-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4 +// BASIC-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// BASIC-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4 +// BASIC-NEXT: [[TMP0:%.*]] = fmul float [[A_REAL]], [[B_REAL]] +// BASIC-NEXT: [[TMP1:%.*]] = fmul float [[A_IMAG]], [[B_IMAG]] +// BASIC-NEXT: [[TMP2:%.*]] = fadd float [[TMP0]], [[TMP1]] +// BASIC-NEXT: [[TMP3:%.*]] = fmul float [[B_REAL]], [[B_REAL]] +// BASIC-NEXT: [[TMP4:%.*]] = fmul float [[B_IMAG]], [[B_IMAG]] +// BASIC-NEXT: [[TMP5:%.*]] = fadd float [[TMP3]], [[TMP4]] +// BASIC-NEXT: [[TMP6:%.*]] = fmul float [[A_IMAG]], [[B_REAL]] +// BASIC-NEXT: [[TMP7:%.*]] = fmul float [[A_REAL]], [[B_IMAG]] +// BASIC-NEXT: [[TMP8:%.*]] = fsub float [[TMP6]], [[TMP7]] +// BASIC-NEXT: [[TMP9:%.*]] = fdiv float [[TMP2]], [[TMP5]] +// BASIC-NEXT: [[TMP10:%.*]] = fdiv float [[TMP8]], [[TMP5]] +// BASIC-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, 
i32 0 +// BASIC-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// BASIC-NEXT: store float [[TMP9]], ptr [[RETVAL_REALP]], align 4 +// BASIC-NEXT: store float [[TMP10]], ptr [[RETVAL_IMAGP]], align 4 +// BASIC-NEXT: [[TMP11:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4 +// BASIC-NEXT: ret <2 x float> [[TMP11]] +// +// IMPRVD-LABEL: define dso_local <2 x float> @divf( +// IMPRVD-SAME: <2 x float> noundef [[A_COERCE:%.*]], <2 x float> noundef [[B_COERCE:%.*]]) #[[ATTR0:[0-9]+]] { +// IMPRVD-NEXT: entry: +// IMPRVD-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// IMPRVD-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// IMPRVD-NEXT: [[B:%.*]] = alloca { float, float }, align 4 +// IMPRVD-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4 +// IMPRVD-NEXT: store <2 x float> [[B_COERCE]], ptr [[B]], align 4 +// IMPRVD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// IMPRVD-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// IMPRVD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// IMPRVD-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// IMPRVD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// IMPRVD-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4 +// IMPRVD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// IMPRVD-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4 +// IMPRVD-NEXT: [[TMP0:%.*]] = call float @llvm.fabs.f32(float [[B_REAL]]) +// IMPRVD-NEXT: [[TMP1:%.*]] = call float @llvm.fabs.f32(float [[B_IMAG]]) +// IMPRVD-NEXT: [[ABS_CMP:%.*]] = fcmp ugt float [[TMP0]], [[TMP1]] +// IMPRVD-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// IMPRVD: abs_rhsr_greater_or_equal_abs_rhsi: +// IMPRVD-NEXT: [[TMP2:%.*]] = fdiv float [[B_IMAG]], [[B_REAL]] +// IMPRVD-NEXT: [[TMP3:%.*]] = fmul float [[TMP2]], [[B_IMAG]] +// IMPRVD-NEXT: [[TMP4:%.*]] = fadd float [[B_REAL]], [[TMP3]] +// IMPRVD-NEXT: [[TMP5:%.*]] = fmul float [[A_IMAG]], [[TMP2]] +// IMPRVD-NEXT: [[TMP6:%.*]] = fadd float [[A_REAL]], [[TMP5]] +// IMPRVD-NEXT: [[TMP7:%.*]] = fdiv float [[TMP6]], [[TMP4]] +// IMPRVD-NEXT: [[TMP8:%.*]] = fmul float [[A_REAL]], [[TMP2]] +// IMPRVD-NEXT: [[TMP9:%.*]] = fsub float [[A_IMAG]], [[TMP8]] +// IMPRVD-NEXT: [[TMP10:%.*]] = fdiv float [[TMP9]], [[TMP4]] +// IMPRVD-NEXT: br label [[COMPLEX_DIV:%.*]] +// IMPRVD: abs_rhsr_less_than_abs_rhsi: +// IMPRVD-NEXT: [[TMP11:%.*]] = fdiv float [[B_REAL]], [[B_IMAG]] +// IMPRVD-NEXT: [[TMP12:%.*]] = fmul float [[TMP11]], [[B_REAL]] +// IMPRVD-NEXT: [[TMP13:%.*]] = fadd float [[B_IMAG]], [[TMP12]] +// IMPRVD-NEXT: [[TMP14:%.*]] = fmul float [[A_REAL]], [[TMP11]] +// IMPRVD-NEXT: [[TMP15:%.*]] = fadd float [[TMP14]], [[A_IMAG]] +// IMPRVD-NEXT: [[TMP16:%.*]] = fdiv float [[TMP15]], [[TMP13]] +// IMPRVD-NEXT: [[TMP17:%.*]] = fmul float [[A_IMAG]], [[TMP11]] +// IMPRVD-NEXT: [[TMP18:%.*]] = fsub float [[TMP17]], [[A_REAL]] +// IMPRVD-NEXT: [[TMP19:%.*]] = fdiv float [[TMP18]], [[TMP13]] +// IMPRVD-NEXT: br label [[COMPLEX_DIV]] +// IMPRVD: complex_div: +// IMPRVD-NEXT: [[TMP20:%.*]] = phi float [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD-NEXT: [[TMP21:%.*]] = phi float [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], 
[[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// IMPRVD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// IMPRVD-NEXT: store float [[TMP20]], ptr [[RETVAL_REALP]], align 4 +// IMPRVD-NEXT: store float [[TMP21]], ptr [[RETVAL_IMAGP]], align 4 +// IMPRVD-NEXT: [[TMP22:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4 +// IMPRVD-NEXT: ret <2 x float> [[TMP22]] +// +// PRMTD-LABEL: define dso_local <2 x float> @divf( +// PRMTD-SAME: <2 x float> noundef [[A_COERCE:%.*]], <2 x float> noundef [[B_COERCE:%.*]]) #[[ATTR0:[0-9]+]] { +// PRMTD-NEXT: entry: +// PRMTD-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// PRMTD-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// PRMTD-NEXT: [[B:%.*]] = alloca { float, float }, align 4 +// PRMTD-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4 +// PRMTD-NEXT: store <2 x float> [[B_COERCE]], ptr [[B]], align 4 +// PRMTD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// PRMTD-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// PRMTD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// PRMTD-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// PRMTD-NEXT: [[EXT:%.*]] = fpext float [[A_REAL]] to double +// PRMTD-NEXT: [[EXT1:%.*]] = fpext float [[A_IMAG]] to double +// PRMTD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// PRMTD-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4 +// PRMTD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// PRMTD-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4 +// PRMTD-NEXT: [[EXT2:%.*]] = fpext float [[B_REAL]] to double +// PRMTD-NEXT: [[EXT3:%.*]] = fpext float [[B_IMAG]] to double +// PRMTD-NEXT: [[TMP0:%.*]] = fmul double [[EXT]], [[EXT2]] +// PRMTD-NEXT: [[TMP1:%.*]] = fmul double [[EXT1]], [[EXT3]] +// PRMTD-NEXT: [[TMP2:%.*]] = fadd double [[TMP0]], [[TMP1]] +// PRMTD-NEXT: [[TMP3:%.*]] = fmul double [[EXT2]], [[EXT2]] +// PRMTD-NEXT: [[TMP4:%.*]] = fmul double [[EXT3]], [[EXT3]] +// PRMTD-NEXT: [[TMP5:%.*]] = fadd double [[TMP3]], [[TMP4]] +// PRMTD-NEXT: [[TMP6:%.*]] = fmul double [[EXT1]], [[EXT2]] +// PRMTD-NEXT: [[TMP7:%.*]] = fmul double [[EXT]], [[EXT3]] +// PRMTD-NEXT: [[TMP8:%.*]] = fsub double [[TMP6]], [[TMP7]] +// PRMTD-NEXT: [[TMP9:%.*]] = fdiv double [[TMP2]], [[TMP5]] +// PRMTD-NEXT: [[TMP10:%.*]] = fdiv double [[TMP8]], [[TMP5]] +// PRMTD-NEXT: [[UNPROMOTION:%.*]] = fptrunc double [[TMP9]] to float +// PRMTD-NEXT: [[UNPROMOTION4:%.*]] = fptrunc double [[TMP10]] to float +// PRMTD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// PRMTD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// PRMTD-NEXT: store float [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 4 +// PRMTD-NEXT: store float [[UNPROMOTION4]], ptr [[RETVAL_IMAGP]], align 4 +// PRMTD-NEXT: [[TMP11:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4 +// PRMTD-NEXT: ret <2 x float> [[TMP11]] +// +// X86WINPRMTD-LABEL: define dso_local i64 @divf( +// X86WINPRMTD-SAME: i64 noundef [[A_COERCE:%.*]], i64 noundef [[B_COERCE:%.*]]) #[[ATTR0:[0-9]+]] { +// X86WINPRMTD-NEXT: entry: +// X86WINPRMTD-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// X86WINPRMTD-NEXT: 
[[A:%.*]] = alloca { float, float }, align 4 +// X86WINPRMTD-NEXT: [[B:%.*]] = alloca { float, float }, align 4 +// X86WINPRMTD-NEXT: store i64 [[A_COERCE]], ptr [[A]], align 4 +// X86WINPRMTD-NEXT: store i64 [[B_COERCE]], ptr [[B]], align 4 +// X86WINPRMTD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// X86WINPRMTD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// X86WINPRMTD-NEXT: [[EXT:%.*]] = fpext float [[A_REAL]] to double +// X86WINPRMTD-NEXT: [[EXT1:%.*]] = fpext float [[A_IMAG]] to double +// X86WINPRMTD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4 +// X86WINPRMTD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4 +// X86WINPRMTD-NEXT: [[EXT2:%.*]] = fpext float [[B_REAL]] to double +// X86WINPRMTD-NEXT: [[EXT3:%.*]] = fpext float [[B_IMAG]] to double +// X86WINPRMTD-NEXT: [[TMP0:%.*]] = fmul double [[EXT]], [[EXT2]] +// X86WINPRMTD-NEXT: [[TMP1:%.*]] = fmul double [[EXT1]], [[EXT3]] +// X86WINPRMTD-NEXT: [[TMP2:%.*]] = fadd double [[TMP0]], [[TMP1]] +// X86WINPRMTD-NEXT: [[TMP3:%.*]] = fmul double [[EXT2]], [[EXT2]] +// X86WINPRMTD-NEXT: [[TMP4:%.*]] = fmul double [[EXT3]], [[EXT3]] +// X86WINPRMTD-NEXT: [[TMP5:%.*]] = fadd double [[TMP3]], [[TMP4]] +// X86WINPRMTD-NEXT: [[TMP6:%.*]] = fmul double [[EXT1]], [[EXT2]] +// X86WINPRMTD-NEXT: [[TMP7:%.*]] = fmul double [[EXT]], [[EXT3]] +// X86WINPRMTD-NEXT: [[TMP8:%.*]] = fsub double [[TMP6]], [[TMP7]] +// X86WINPRMTD-NEXT: [[TMP9:%.*]] = fdiv double [[TMP2]], [[TMP5]] +// X86WINPRMTD-NEXT: [[TMP10:%.*]] = fdiv double [[TMP8]], [[TMP5]] +// X86WINPRMTD-NEXT: [[UNPROMOTION:%.*]] = fptrunc double [[TMP9]] to float +// X86WINPRMTD-NEXT: [[UNPROMOTION4:%.*]] = fptrunc double [[TMP10]] to float +// X86WINPRMTD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// X86WINPRMTD-NEXT: store float [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 4 +// X86WINPRMTD-NEXT: store float [[UNPROMOTION4]], ptr [[RETVAL_IMAGP]], align 4 +// X86WINPRMTD-NEXT: [[TMP11:%.*]] = load i64, ptr [[RETVAL]], align 4 +// X86WINPRMTD-NEXT: ret i64 [[TMP11]] +// +// AVRFP32-LABEL: define dso_local { float, float } @divf( +// AVRFP32-SAME: float noundef [[A_COERCE0:%.*]], float noundef [[A_COERCE1:%.*]], float noundef [[B_COERCE0:%.*]], float noundef [[B_COERCE1:%.*]]) addrspace(1) #[[ATTR0:[0-9]+]] { +// AVRFP32-NEXT: entry: +// AVRFP32-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 1 +// AVRFP32-NEXT: [[A:%.*]] = alloca { float, float }, align 1 +// AVRFP32-NEXT: [[B:%.*]] = alloca { float, float }, align 1 +// AVRFP32-NEXT: [[TMP0:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// AVRFP32-NEXT: store float [[A_COERCE0]], ptr [[TMP0]], align 1 +// AVRFP32-NEXT: [[TMP1:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[A_COERCE1]], ptr [[TMP1]], align 1 +// AVRFP32-NEXT: [[TMP2:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// 
AVRFP32-NEXT: store float [[B_COERCE0]], ptr [[TMP2]], align 1 +// AVRFP32-NEXT: [[TMP3:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[B_COERCE1]], ptr [[TMP3]], align 1 +// AVRFP32-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// AVRFP32-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 1 +// AVRFP32-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// AVRFP32-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 1 +// AVRFP32-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// AVRFP32-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 1 +// AVRFP32-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// AVRFP32-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 1 +// AVRFP32-NEXT: [[TMP4:%.*]] = call addrspace(1) float @llvm.fabs.f32(float [[B_REAL]]) +// AVRFP32-NEXT: [[TMP5:%.*]] = call addrspace(1) float @llvm.fabs.f32(float [[B_IMAG]]) +// AVRFP32-NEXT: [[ABS_CMP:%.*]] = fcmp ugt float [[TMP4]], [[TMP5]] +// AVRFP32-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// AVRFP32: abs_rhsr_greater_or_equal_abs_rhsi: +// AVRFP32-NEXT: [[TMP6:%.*]] = fdiv float [[B_IMAG]], [[B_REAL]] +// AVRFP32-NEXT: [[TMP7:%.*]] = fmul float [[TMP6]], [[B_IMAG]] +// AVRFP32-NEXT: [[TMP8:%.*]] = fadd float [[B_REAL]], [[TMP7]] +// AVRFP32-NEXT: [[TMP9:%.*]] = fmul float [[A_IMAG]], [[TMP6]] +// AVRFP32-NEXT: [[TMP10:%.*]] = fadd float [[A_REAL]], [[TMP9]] +// AVRFP32-NEXT: [[TMP11:%.*]] = fdiv float [[TMP10]], [[TMP8]] +// AVRFP32-NEXT: [[TMP12:%.*]] = fmul float [[A_REAL]], [[TMP6]] +// AVRFP32-NEXT: [[TMP13:%.*]] = fsub float [[A_IMAG]], [[TMP12]] +// AVRFP32-NEXT: [[TMP14:%.*]] = fdiv float [[TMP13]], [[TMP8]] +// AVRFP32-NEXT: br label [[COMPLEX_DIV:%.*]] +// AVRFP32: abs_rhsr_less_than_abs_rhsi: +// AVRFP32-NEXT: [[TMP15:%.*]] = fdiv float [[B_REAL]], [[B_IMAG]] +// AVRFP32-NEXT: [[TMP16:%.*]] = fmul float [[TMP15]], [[B_REAL]] +// AVRFP32-NEXT: [[TMP17:%.*]] = fadd float [[B_IMAG]], [[TMP16]] +// AVRFP32-NEXT: [[TMP18:%.*]] = fmul float [[A_REAL]], [[TMP15]] +// AVRFP32-NEXT: [[TMP19:%.*]] = fadd float [[TMP18]], [[A_IMAG]] +// AVRFP32-NEXT: [[TMP20:%.*]] = fdiv float [[TMP19]], [[TMP17]] +// AVRFP32-NEXT: [[TMP21:%.*]] = fmul float [[A_IMAG]], [[TMP15]] +// AVRFP32-NEXT: [[TMP22:%.*]] = fsub float [[TMP21]], [[A_REAL]] +// AVRFP32-NEXT: [[TMP23:%.*]] = fdiv float [[TMP22]], [[TMP17]] +// AVRFP32-NEXT: br label [[COMPLEX_DIV]] +// AVRFP32: complex_div: +// AVRFP32-NEXT: [[TMP24:%.*]] = phi float [ [[TMP11]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP20]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// AVRFP32-NEXT: [[TMP25:%.*]] = phi float [ [[TMP14]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP23]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// AVRFP32-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// AVRFP32-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[TMP24]], ptr [[RETVAL_REALP]], align 1 +// AVRFP32-NEXT: store float [[TMP25]], ptr [[RETVAL_IMAGP]], align 1 +// AVRFP32-NEXT: [[TMP26:%.*]] = load { float, float }, ptr [[RETVAL]], align 1 +// AVRFP32-NEXT: ret { float, float } [[TMP26]] +// +// AVRFP64-LABEL: define dso_local { float, float } @divf( +// 
AVRFP64-SAME: float noundef [[A_COERCE0:%.*]], float noundef [[A_COERCE1:%.*]], float noundef [[B_COERCE0:%.*]], float noundef [[B_COERCE1:%.*]]) addrspace(1) #[[ATTR0:[0-9]+]] { +// AVRFP64-NEXT: entry: +// AVRFP64-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 1 +// AVRFP64-NEXT: [[A:%.*]] = alloca { float, float }, align 1 +// AVRFP64-NEXT: [[B:%.*]] = alloca { float, float }, align 1 +// AVRFP64-NEXT: [[TMP0:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// AVRFP64-NEXT: store float [[A_COERCE0]], ptr [[TMP0]], align 1 +// AVRFP64-NEXT: [[TMP1:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// AVRFP64-NEXT: store float [[A_COERCE1]], ptr [[TMP1]], align 1 +// AVRFP64-NEXT: [[TMP2:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// AVRFP64-NEXT: store float [[B_COERCE0]], ptr [[TMP2]], align 1 +// AVRFP64-NEXT: [[TMP3:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// AVRFP64-NEXT: store float [[B_COERCE1]], ptr [[TMP3]], align 1 +// AVRFP64-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// AVRFP64-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 1 +// AVRFP64-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// AVRFP64-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 1 +// AVRFP64-NEXT: [[EXT:%.*]] = fpext float [[A_REAL]] to double +// AVRFP64-NEXT: [[EXT1:%.*]] = fpext float [[A_IMAG]] to double +// AVRFP64-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// AVRFP64-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 1 +// AVRFP64-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// AVRFP64-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 1 +// AVRFP64-NEXT: [[EXT2:%.*]] = fpext float [[B_REAL]] to double +// AVRFP64-NEXT: [[EXT3:%.*]] = fpext float [[B_IMAG]] to double +// AVRFP64-NEXT: [[TMP4:%.*]] = fmul double [[EXT]], [[EXT2]] +// AVRFP64-NEXT: [[TMP5:%.*]] = fmul double [[EXT1]], [[EXT3]] +// AVRFP64-NEXT: [[TMP6:%.*]] = fadd double [[TMP4]], [[TMP5]] +// AVRFP64-NEXT: [[TMP7:%.*]] = fmul double [[EXT2]], [[EXT2]] +// AVRFP64-NEXT: [[TMP8:%.*]] = fmul double [[EXT3]], [[EXT3]] +// AVRFP64-NEXT: [[TMP9:%.*]] = fadd double [[TMP7]], [[TMP8]] +// AVRFP64-NEXT: [[TMP10:%.*]] = fmul double [[EXT1]], [[EXT2]] +// AVRFP64-NEXT: [[TMP11:%.*]] = fmul double [[EXT]], [[EXT3]] +// AVRFP64-NEXT: [[TMP12:%.*]] = fsub double [[TMP10]], [[TMP11]] +// AVRFP64-NEXT: [[TMP13:%.*]] = fdiv double [[TMP6]], [[TMP9]] +// AVRFP64-NEXT: [[TMP14:%.*]] = fdiv double [[TMP12]], [[TMP9]] +// AVRFP64-NEXT: [[UNPROMOTION:%.*]] = fptrunc double [[TMP13]] to float +// AVRFP64-NEXT: [[UNPROMOTION4:%.*]] = fptrunc double [[TMP14]] to float +// AVRFP64-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// AVRFP64-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// AVRFP64-NEXT: store float [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 1 +// AVRFP64-NEXT: store float [[UNPROMOTION4]], ptr [[RETVAL_IMAGP]], align 1 +// AVRFP64-NEXT: [[TMP15:%.*]] = load { float, float }, ptr [[RETVAL]], align 1 +// AVRFP64-NEXT: ret { float, float } [[TMP15]] +// +// BASIC_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x float> @divf( +// BASIC_FAST-SAME: <2 x float> noundef nofpclass(nan inf) [[A_COERCE:%.*]], <2 x float> 
noundef nofpclass(nan inf) [[B_COERCE:%.*]]) #[[ATTR0:[0-9]+]] { +// BASIC_FAST-NEXT: entry: +// BASIC_FAST-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// BASIC_FAST-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// BASIC_FAST-NEXT: [[B:%.*]] = alloca { float, float }, align 4 +// BASIC_FAST-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4 +// BASIC_FAST-NEXT: store <2 x float> [[B_COERCE]], ptr [[B]], align 4 +// BASIC_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// BASIC_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// BASIC_FAST-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// BASIC_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4 +// BASIC_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// BASIC_FAST-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4 +// BASIC_FAST-NEXT: [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_REAL]], [[B_REAL]] +// BASIC_FAST-NEXT: [[TMP1:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_IMAG]], [[B_IMAG]] +// BASIC_FAST-NEXT: [[TMP2:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[TMP0]], [[TMP1]] +// BASIC_FAST-NEXT: [[TMP3:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[B_REAL]], [[B_REAL]] +// BASIC_FAST-NEXT: [[TMP4:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[B_IMAG]], [[B_IMAG]] +// BASIC_FAST-NEXT: [[TMP5:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[TMP3]], [[TMP4]] +// BASIC_FAST-NEXT: [[TMP6:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_IMAG]], [[B_REAL]] +// BASIC_FAST-NEXT: [[TMP7:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_REAL]], [[B_IMAG]] +// BASIC_FAST-NEXT: [[TMP8:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[TMP6]], [[TMP7]] +// BASIC_FAST-NEXT: [[TMP9:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP2]], [[TMP5]] +// BASIC_FAST-NEXT: [[TMP10:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP8]], [[TMP5]] +// BASIC_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// BASIC_FAST-NEXT: store float [[TMP9]], ptr [[RETVAL_REALP]], align 4 +// BASIC_FAST-NEXT: store float [[TMP10]], ptr [[RETVAL_IMAGP]], align 4 +// BASIC_FAST-NEXT: [[TMP11:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4 +// BASIC_FAST-NEXT: ret <2 x float> [[TMP11]] +// +// FULL_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x float> @divf( +// FULL_FAST-SAME: <2 x float> noundef nofpclass(nan inf) [[A_COERCE:%.*]], <2 x float> noundef nofpclass(nan inf) [[B_COERCE:%.*]]) #[[ATTR0:[0-9]+]] { +// FULL_FAST-NEXT: entry: +// FULL_FAST-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// FULL_FAST-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// FULL_FAST-NEXT: [[B:%.*]] = alloca { float, float }, align 4 +// FULL_FAST-NEXT: [[COERCE:%.*]] = alloca { float, float }, align 4 +// FULL_FAST-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4 +// FULL_FAST-NEXT: store <2 x float> [[B_COERCE]], ptr [[B]], align 4 +// FULL_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// FULL_FAST-NEXT: 
[[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// FULL_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// FULL_FAST-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// FULL_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// FULL_FAST-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4 +// FULL_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// FULL_FAST-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4 +// FULL_FAST-NEXT: [[CALL:%.*]] = call reassoc nnan ninf nsz arcp afn nofpclass(nan inf) <2 x float> @__divsc3(float noundef nofpclass(nan inf) [[A_REAL]], float noundef nofpclass(nan inf) [[A_IMAG]], float noundef nofpclass(nan inf) [[B_REAL]], float noundef nofpclass(nan inf) [[B_IMAG]]) #[[ATTR2:[0-9]+]] +// FULL_FAST-NEXT: store <2 x float> [[CALL]], ptr [[COERCE]], align 4 +// FULL_FAST-NEXT: [[COERCE_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 0 +// FULL_FAST-NEXT: [[COERCE_REAL:%.*]] = load float, ptr [[COERCE_REALP]], align 4 +// FULL_FAST-NEXT: [[COERCE_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 1 +// FULL_FAST-NEXT: [[COERCE_IMAG:%.*]] = load float, ptr [[COERCE_IMAGP]], align 4 +// FULL_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// FULL_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// FULL_FAST-NEXT: store float [[COERCE_REAL]], ptr [[RETVAL_REALP]], align 4 +// FULL_FAST-NEXT: store float [[COERCE_IMAG]], ptr [[RETVAL_IMAGP]], align 4 +// FULL_FAST-NEXT: [[TMP0:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4 +// FULL_FAST-NEXT: ret <2 x float> [[TMP0]] +// +// IMPRVD_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x float> @divf( +// IMPRVD_FAST-SAME: <2 x float> noundef nofpclass(nan inf) [[A_COERCE:%.*]], <2 x float> noundef nofpclass(nan inf) [[B_COERCE:%.*]]) #[[ATTR0:[0-9]+]] { +// IMPRVD_FAST-NEXT: entry: +// IMPRVD_FAST-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// IMPRVD_FAST-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// IMPRVD_FAST-NEXT: [[B:%.*]] = alloca { float, float }, align 4 +// IMPRVD_FAST-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4 +// IMPRVD_FAST-NEXT: store <2 x float> [[B_COERCE]], ptr [[B]], align 4 +// IMPRVD_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// IMPRVD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// IMPRVD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4 +// IMPRVD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4 +// IMPRVD_FAST-NEXT: [[TMP0:%.*]] = call reassoc nnan ninf nsz arcp afn float @llvm.fabs.f32(float [[B_REAL]]) +// IMPRVD_FAST-NEXT: [[TMP1:%.*]] = call reassoc nnan ninf nsz arcp afn float @llvm.fabs.f32(float [[B_IMAG]]) +// IMPRVD_FAST-NEXT: [[ABS_CMP:%.*]] = fcmp reassoc nnan ninf nsz arcp afn ugt float [[TMP0]], [[TMP1]] +// 
IMPRVD_FAST-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// IMPRVD_FAST: abs_rhsr_greater_or_equal_abs_rhsi: +// IMPRVD_FAST-NEXT: [[TMP2:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[B_IMAG]], [[B_REAL]] +// IMPRVD_FAST-NEXT: [[TMP3:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[TMP2]], [[B_IMAG]] +// IMPRVD_FAST-NEXT: [[TMP4:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[B_REAL]], [[TMP3]] +// IMPRVD_FAST-NEXT: [[TMP5:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_IMAG]], [[TMP2]] +// IMPRVD_FAST-NEXT: [[TMP6:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[A_REAL]], [[TMP5]] +// IMPRVD_FAST-NEXT: [[TMP7:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP6]], [[TMP4]] +// IMPRVD_FAST-NEXT: [[TMP8:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_REAL]], [[TMP2]] +// IMPRVD_FAST-NEXT: [[TMP9:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[A_IMAG]], [[TMP8]] +// IMPRVD_FAST-NEXT: [[TMP10:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP9]], [[TMP4]] +// IMPRVD_FAST-NEXT: br label [[COMPLEX_DIV:%.*]] +// IMPRVD_FAST: abs_rhsr_less_than_abs_rhsi: +// IMPRVD_FAST-NEXT: [[TMP11:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[B_REAL]], [[B_IMAG]] +// IMPRVD_FAST-NEXT: [[TMP12:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[TMP11]], [[B_REAL]] +// IMPRVD_FAST-NEXT: [[TMP13:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[B_IMAG]], [[TMP12]] +// IMPRVD_FAST-NEXT: [[TMP14:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_REAL]], [[TMP11]] +// IMPRVD_FAST-NEXT: [[TMP15:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[TMP14]], [[A_IMAG]] +// IMPRVD_FAST-NEXT: [[TMP16:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP15]], [[TMP13]] +// IMPRVD_FAST-NEXT: [[TMP17:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_IMAG]], [[TMP11]] +// IMPRVD_FAST-NEXT: [[TMP18:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[TMP17]], [[A_REAL]] +// IMPRVD_FAST-NEXT: [[TMP19:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP18]], [[TMP13]] +// IMPRVD_FAST-NEXT: br label [[COMPLEX_DIV]] +// IMPRVD_FAST: complex_div: +// IMPRVD_FAST-NEXT: [[TMP20:%.*]] = phi reassoc nnan ninf nsz arcp afn float [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD_FAST-NEXT: [[TMP21:%.*]] = phi reassoc nnan ninf nsz arcp afn float [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: store float [[TMP20]], ptr [[RETVAL_REALP]], align 4 +// IMPRVD_FAST-NEXT: store float [[TMP21]], ptr [[RETVAL_IMAGP]], align 4 +// IMPRVD_FAST-NEXT: [[TMP22:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4 +// IMPRVD_FAST-NEXT: ret <2 x float> [[TMP22]] +// +// PRMTD_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x float> @divf( +// PRMTD_FAST-SAME: <2 x float> noundef nofpclass(nan inf) [[A_COERCE:%.*]], <2 x float> noundef nofpclass(nan inf) [[B_COERCE:%.*]]) #[[ATTR0:[0-9]+]] { +// PRMTD_FAST-NEXT: entry: +// PRMTD_FAST-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// PRMTD_FAST-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// PRMTD_FAST-NEXT: [[B:%.*]] = alloca { float, float }, align 4 +// PRMTD_FAST-NEXT: store <2 x float> 
[[A_COERCE]], ptr [[A]], align 4
+// PRMTD_FAST-NEXT: store <2 x float> [[B_COERCE]], ptr [[B]], align 4
+// PRMTD_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0
+// PRMTD_FAST-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4
+// PRMTD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1
+// PRMTD_FAST-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4
+// PRMTD_FAST-NEXT: [[EXT:%.*]] = fpext float [[A_REAL]] to double
+// PRMTD_FAST-NEXT: [[EXT1:%.*]] = fpext float [[A_IMAG]] to double
+// PRMTD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0
+// PRMTD_FAST-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4
+// PRMTD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1
+// PRMTD_FAST-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4
+// PRMTD_FAST-NEXT: [[EXT2:%.*]] = fpext float [[B_REAL]] to double
+// PRMTD_FAST-NEXT: [[EXT3:%.*]] = fpext float [[B_IMAG]] to double
+// PRMTD_FAST-NEXT: [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[EXT]], [[EXT2]]
+// PRMTD_FAST-NEXT: [[TMP1:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[EXT1]], [[EXT3]]
+// PRMTD_FAST-NEXT: [[TMP2:%.*]] = fadd reassoc nnan ninf nsz arcp afn double [[TMP0]], [[TMP1]]
+// PRMTD_FAST-NEXT: [[TMP3:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[EXT2]], [[EXT2]]
+// PRMTD_FAST-NEXT: [[TMP4:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[EXT3]], [[EXT3]]
+// PRMTD_FAST-NEXT: [[TMP5:%.*]] = fadd reassoc nnan ninf nsz arcp afn double [[TMP3]], [[TMP4]]
+// PRMTD_FAST-NEXT: [[TMP6:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[EXT1]], [[EXT2]]
+// PRMTD_FAST-NEXT: [[TMP7:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[EXT]], [[EXT3]]
+// PRMTD_FAST-NEXT: [[TMP8:%.*]] = fsub reassoc nnan ninf nsz arcp afn double [[TMP6]], [[TMP7]]
+// PRMTD_FAST-NEXT: [[TMP9:%.*]] = fdiv reassoc nnan ninf nsz arcp afn double [[TMP2]], [[TMP5]]
+// PRMTD_FAST-NEXT: [[TMP10:%.*]] = fdiv reassoc nnan ninf nsz arcp afn double [[TMP8]], [[TMP5]]
+// PRMTD_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc double [[TMP9]] to float
+// PRMTD_FAST-NEXT: [[UNPROMOTION4:%.*]] = fptrunc double [[TMP10]] to float
+// PRMTD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0
+// PRMTD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1
+// PRMTD_FAST-NEXT: store float [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 4
+// PRMTD_FAST-NEXT: store float [[UNPROMOTION4]], ptr [[RETVAL_IMAGP]], align 4
+// PRMTD_FAST-NEXT: [[TMP11:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4
+// PRMTD_FAST-NEXT: ret <2 x float> [[TMP11]]
+//
+_Complex float divf(_Complex float a, _Complex float b) {
+  return a / b;
+}
+
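[Editorial note, not part of the patch: the mulf checks that follow show that only the FULL lowering treats multiplication specially. It emits the four-multiply algebraic form and branches to the __mulsc3 libcall only when both the real and the imaginary products come back NaN (the complex_mul_imag_nan and complex_mul_libcall blocks). A hedged C sketch of that shape, with mul_full_sketch an invented name:

#include <complex.h>

_Complex float __mulsc3(float a, float b, float c, float d);

static _Complex float mul_full_sketch(_Complex float a, _Complex float b) {
  float x = crealf(a), y = cimagf(a), c = crealf(b), d = cimagf(b);
  float re = x * c - y * d;   /* MUL_AC - MUL_BD in the checks */
  float im = x * d + y * c;   /* MUL_AD + MUL_BC in the checks */
  if (re != re && im != im)   /* both parts NaN: redo the job in the libcall */
    return __mulsc3(x, y, c, d);
  return CMPLXF(re, im);
}

The BASIC and IMPRVD runs accept the plain algebraic form with no NaN test, which is why their mulf bodies below are identical.]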
float }, ptr [[A]], i32 0, i32 0 +// FULL-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// FULL-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// FULL-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// FULL-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// FULL-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4 +// FULL-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// FULL-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4 +// FULL-NEXT: [[MUL_AC:%.*]] = fmul float [[A_REAL]], [[B_REAL]] +// FULL-NEXT: [[MUL_BD:%.*]] = fmul float [[A_IMAG]], [[B_IMAG]] +// FULL-NEXT: [[MUL_AD:%.*]] = fmul float [[A_REAL]], [[B_IMAG]] +// FULL-NEXT: [[MUL_BC:%.*]] = fmul float [[A_IMAG]], [[B_REAL]] +// FULL-NEXT: [[MUL_R:%.*]] = fsub float [[MUL_AC]], [[MUL_BD]] +// FULL-NEXT: [[MUL_I:%.*]] = fadd float [[MUL_AD]], [[MUL_BC]] +// FULL-NEXT: [[ISNAN_CMP:%.*]] = fcmp uno float [[MUL_R]], [[MUL_R]] +// FULL-NEXT: br i1 [[ISNAN_CMP]], label [[COMPLEX_MUL_IMAG_NAN:%.*]], label [[COMPLEX_MUL_CONT:%.*]], !prof [[PROF2:![0-9]+]] +// FULL: complex_mul_imag_nan: +// FULL-NEXT: [[ISNAN_CMP1:%.*]] = fcmp uno float [[MUL_I]], [[MUL_I]] +// FULL-NEXT: br i1 [[ISNAN_CMP1]], label [[COMPLEX_MUL_LIBCALL:%.*]], label [[COMPLEX_MUL_CONT]], !prof [[PROF2]] +// FULL: complex_mul_libcall: +// FULL-NEXT: [[CALL:%.*]] = call <2 x float> @__mulsc3(float noundef [[A_REAL]], float noundef [[A_IMAG]], float noundef [[B_REAL]], float noundef [[B_IMAG]]) #[[ATTR2]] +// FULL-NEXT: store <2 x float> [[CALL]], ptr [[COERCE]], align 4 +// FULL-NEXT: [[COERCE_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 0 +// FULL-NEXT: [[COERCE_REAL:%.*]] = load float, ptr [[COERCE_REALP]], align 4 +// FULL-NEXT: [[COERCE_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 1 +// FULL-NEXT: [[COERCE_IMAG:%.*]] = load float, ptr [[COERCE_IMAGP]], align 4 +// FULL-NEXT: br label [[COMPLEX_MUL_CONT]] +// FULL: complex_mul_cont: +// FULL-NEXT: [[REAL_MUL_PHI:%.*]] = phi float [ [[MUL_R]], [[ENTRY:%.*]] ], [ [[MUL_R]], [[COMPLEX_MUL_IMAG_NAN]] ], [ [[COERCE_REAL]], [[COMPLEX_MUL_LIBCALL]] ] +// FULL-NEXT: [[IMAG_MUL_PHI:%.*]] = phi float [ [[MUL_I]], [[ENTRY]] ], [ [[MUL_I]], [[COMPLEX_MUL_IMAG_NAN]] ], [ [[COERCE_IMAG]], [[COMPLEX_MUL_LIBCALL]] ] +// FULL-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// FULL-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// FULL-NEXT: store float [[REAL_MUL_PHI]], ptr [[RETVAL_REALP]], align 4 +// FULL-NEXT: store float [[IMAG_MUL_PHI]], ptr [[RETVAL_IMAGP]], align 4 +// FULL-NEXT: [[TMP0:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4 +// FULL-NEXT: ret <2 x float> [[TMP0]] +// +// BASIC-LABEL: define dso_local <2 x float> @mulf( +// BASIC-SAME: <2 x float> noundef [[A_COERCE:%.*]], <2 x float> noundef [[B_COERCE:%.*]]) #[[ATTR0]] { +// BASIC-NEXT: entry: +// BASIC-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// BASIC-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// BASIC-NEXT: [[B:%.*]] = alloca { float, float }, align 4 +// BASIC-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4 +// BASIC-NEXT: store <2 x float> [[B_COERCE]], ptr [[B]], align 4 +// BASIC-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// 
BASIC-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// BASIC-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// BASIC-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// BASIC-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// BASIC-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4 +// BASIC-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// BASIC-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4 +// BASIC-NEXT: [[MUL_AC:%.*]] = fmul float [[A_REAL]], [[B_REAL]] +// BASIC-NEXT: [[MUL_BD:%.*]] = fmul float [[A_IMAG]], [[B_IMAG]] +// BASIC-NEXT: [[MUL_AD:%.*]] = fmul float [[A_REAL]], [[B_IMAG]] +// BASIC-NEXT: [[MUL_BC:%.*]] = fmul float [[A_IMAG]], [[B_REAL]] +// BASIC-NEXT: [[MUL_R:%.*]] = fsub float [[MUL_AC]], [[MUL_BD]] +// BASIC-NEXT: [[MUL_I:%.*]] = fadd float [[MUL_AD]], [[MUL_BC]] +// BASIC-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// BASIC-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// BASIC-NEXT: store float [[MUL_R]], ptr [[RETVAL_REALP]], align 4 +// BASIC-NEXT: store float [[MUL_I]], ptr [[RETVAL_IMAGP]], align 4 +// BASIC-NEXT: [[TMP0:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4 +// BASIC-NEXT: ret <2 x float> [[TMP0]] +// +// IMPRVD-LABEL: define dso_local <2 x float> @mulf( +// IMPRVD-SAME: <2 x float> noundef [[A_COERCE:%.*]], <2 x float> noundef [[B_COERCE:%.*]]) #[[ATTR0]] { +// IMPRVD-NEXT: entry: +// IMPRVD-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// IMPRVD-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// IMPRVD-NEXT: [[B:%.*]] = alloca { float, float }, align 4 +// IMPRVD-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4 +// IMPRVD-NEXT: store <2 x float> [[B_COERCE]], ptr [[B]], align 4 +// IMPRVD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// IMPRVD-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// IMPRVD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// IMPRVD-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// IMPRVD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// IMPRVD-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4 +// IMPRVD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// IMPRVD-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4 +// IMPRVD-NEXT: [[MUL_AC:%.*]] = fmul float [[A_REAL]], [[B_REAL]] +// IMPRVD-NEXT: [[MUL_BD:%.*]] = fmul float [[A_IMAG]], [[B_IMAG]] +// IMPRVD-NEXT: [[MUL_AD:%.*]] = fmul float [[A_REAL]], [[B_IMAG]] +// IMPRVD-NEXT: [[MUL_BC:%.*]] = fmul float [[A_IMAG]], [[B_REAL]] +// IMPRVD-NEXT: [[MUL_R:%.*]] = fsub float [[MUL_AC]], [[MUL_BD]] +// IMPRVD-NEXT: [[MUL_I:%.*]] = fadd float [[MUL_AD]], [[MUL_BC]] +// IMPRVD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// IMPRVD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// IMPRVD-NEXT: store float [[MUL_R]], ptr [[RETVAL_REALP]], align 4 +// IMPRVD-NEXT: store float [[MUL_I]], ptr [[RETVAL_IMAGP]], align 4 +// IMPRVD-NEXT: [[TMP0:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4 +// IMPRVD-NEXT: ret <2 x float> [[TMP0]] +// +// PRMTD-LABEL: define 
dso_local <2 x float> @mulf( +// PRMTD-SAME: <2 x float> noundef [[A_COERCE:%.*]], <2 x float> noundef [[B_COERCE:%.*]]) #[[ATTR0]] { +// PRMTD-NEXT: entry: +// PRMTD-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// PRMTD-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// PRMTD-NEXT: [[B:%.*]] = alloca { float, float }, align 4 +// PRMTD-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4 +// PRMTD-NEXT: store <2 x float> [[B_COERCE]], ptr [[B]], align 4 +// PRMTD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// PRMTD-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// PRMTD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// PRMTD-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// PRMTD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// PRMTD-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4 +// PRMTD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// PRMTD-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4 +// PRMTD-NEXT: [[MUL_AC:%.*]] = fmul float [[A_REAL]], [[B_REAL]] +// PRMTD-NEXT: [[MUL_BD:%.*]] = fmul float [[A_IMAG]], [[B_IMAG]] +// PRMTD-NEXT: [[MUL_AD:%.*]] = fmul float [[A_REAL]], [[B_IMAG]] +// PRMTD-NEXT: [[MUL_BC:%.*]] = fmul float [[A_IMAG]], [[B_REAL]] +// PRMTD-NEXT: [[MUL_R:%.*]] = fsub float [[MUL_AC]], [[MUL_BD]] +// PRMTD-NEXT: [[MUL_I:%.*]] = fadd float [[MUL_AD]], [[MUL_BC]] +// PRMTD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// PRMTD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// PRMTD-NEXT: store float [[MUL_R]], ptr [[RETVAL_REALP]], align 4 +// PRMTD-NEXT: store float [[MUL_I]], ptr [[RETVAL_IMAGP]], align 4 +// PRMTD-NEXT: [[TMP0:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4 +// PRMTD-NEXT: ret <2 x float> [[TMP0]] +// +// X86WINPRMTD-LABEL: define dso_local i64 @mulf( +// X86WINPRMTD-SAME: i64 noundef [[A_COERCE:%.*]], i64 noundef [[B_COERCE:%.*]]) #[[ATTR0]] { +// X86WINPRMTD-NEXT: entry: +// X86WINPRMTD-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// X86WINPRMTD-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// X86WINPRMTD-NEXT: [[B:%.*]] = alloca { float, float }, align 4 +// X86WINPRMTD-NEXT: store i64 [[A_COERCE]], ptr [[A]], align 4 +// X86WINPRMTD-NEXT: store i64 [[B_COERCE]], ptr [[B]], align 4 +// X86WINPRMTD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// X86WINPRMTD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// X86WINPRMTD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4 +// X86WINPRMTD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4 +// X86WINPRMTD-NEXT: [[MUL_AC:%.*]] = fmul float [[A_REAL]], [[B_REAL]] +// X86WINPRMTD-NEXT: [[MUL_BD:%.*]] = fmul float [[A_IMAG]], [[B_IMAG]] +// X86WINPRMTD-NEXT: [[MUL_AD:%.*]] = fmul float [[A_REAL]], [[B_IMAG]] +// X86WINPRMTD-NEXT: [[MUL_BC:%.*]] = fmul float [[A_IMAG]], 
[[B_REAL]] +// X86WINPRMTD-NEXT: [[MUL_R:%.*]] = fsub float [[MUL_AC]], [[MUL_BD]] +// X86WINPRMTD-NEXT: [[MUL_I:%.*]] = fadd float [[MUL_AD]], [[MUL_BC]] +// X86WINPRMTD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// X86WINPRMTD-NEXT: store float [[MUL_R]], ptr [[RETVAL_REALP]], align 4 +// X86WINPRMTD-NEXT: store float [[MUL_I]], ptr [[RETVAL_IMAGP]], align 4 +// X86WINPRMTD-NEXT: [[TMP0:%.*]] = load i64, ptr [[RETVAL]], align 4 +// X86WINPRMTD-NEXT: ret i64 [[TMP0]] +// +// AVRFP32-LABEL: define dso_local { float, float } @mulf( +// AVRFP32-SAME: float noundef [[A_COERCE0:%.*]], float noundef [[A_COERCE1:%.*]], float noundef [[B_COERCE0:%.*]], float noundef [[B_COERCE1:%.*]]) addrspace(1) #[[ATTR0]] { +// AVRFP32-NEXT: entry: +// AVRFP32-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 1 +// AVRFP32-NEXT: [[A:%.*]] = alloca { float, float }, align 1 +// AVRFP32-NEXT: [[B:%.*]] = alloca { float, float }, align 1 +// AVRFP32-NEXT: [[TMP0:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// AVRFP32-NEXT: store float [[A_COERCE0]], ptr [[TMP0]], align 1 +// AVRFP32-NEXT: [[TMP1:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[A_COERCE1]], ptr [[TMP1]], align 1 +// AVRFP32-NEXT: [[TMP2:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// AVRFP32-NEXT: store float [[B_COERCE0]], ptr [[TMP2]], align 1 +// AVRFP32-NEXT: [[TMP3:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[B_COERCE1]], ptr [[TMP3]], align 1 +// AVRFP32-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// AVRFP32-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 1 +// AVRFP32-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// AVRFP32-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 1 +// AVRFP32-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// AVRFP32-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 1 +// AVRFP32-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// AVRFP32-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 1 +// AVRFP32-NEXT: [[MUL_AC:%.*]] = fmul float [[A_REAL]], [[B_REAL]] +// AVRFP32-NEXT: [[MUL_BD:%.*]] = fmul float [[A_IMAG]], [[B_IMAG]] +// AVRFP32-NEXT: [[MUL_AD:%.*]] = fmul float [[A_REAL]], [[B_IMAG]] +// AVRFP32-NEXT: [[MUL_BC:%.*]] = fmul float [[A_IMAG]], [[B_REAL]] +// AVRFP32-NEXT: [[MUL_R:%.*]] = fsub float [[MUL_AC]], [[MUL_BD]] +// AVRFP32-NEXT: [[MUL_I:%.*]] = fadd float [[MUL_AD]], [[MUL_BC]] +// AVRFP32-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// AVRFP32-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[MUL_R]], ptr [[RETVAL_REALP]], align 1 +// AVRFP32-NEXT: store float [[MUL_I]], ptr [[RETVAL_IMAGP]], align 1 +// AVRFP32-NEXT: [[TMP4:%.*]] = load { float, float }, ptr [[RETVAL]], align 1 +// AVRFP32-NEXT: ret { float, float } [[TMP4]] +// +// AVRFP64-LABEL: define dso_local { float, float } @mulf( +// AVRFP64-SAME: float noundef [[A_COERCE0:%.*]], float noundef [[A_COERCE1:%.*]], float noundef [[B_COERCE0:%.*]], float noundef 
[[B_COERCE1:%.*]]) addrspace(1) #[[ATTR0]] { +// AVRFP64-NEXT: entry: +// AVRFP64-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 1 +// AVRFP64-NEXT: [[A:%.*]] = alloca { float, float }, align 1 +// AVRFP64-NEXT: [[B:%.*]] = alloca { float, float }, align 1 +// AVRFP64-NEXT: [[TMP0:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// AVRFP64-NEXT: store float [[A_COERCE0]], ptr [[TMP0]], align 1 +// AVRFP64-NEXT: [[TMP1:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// AVRFP64-NEXT: store float [[A_COERCE1]], ptr [[TMP1]], align 1 +// AVRFP64-NEXT: [[TMP2:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// AVRFP64-NEXT: store float [[B_COERCE0]], ptr [[TMP2]], align 1 +// AVRFP64-NEXT: [[TMP3:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// AVRFP64-NEXT: store float [[B_COERCE1]], ptr [[TMP3]], align 1 +// AVRFP64-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// AVRFP64-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 1 +// AVRFP64-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// AVRFP64-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 1 +// AVRFP64-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// AVRFP64-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 1 +// AVRFP64-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// AVRFP64-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 1 +// AVRFP64-NEXT: [[MUL_AC:%.*]] = fmul float [[A_REAL]], [[B_REAL]] +// AVRFP64-NEXT: [[MUL_BD:%.*]] = fmul float [[A_IMAG]], [[B_IMAG]] +// AVRFP64-NEXT: [[MUL_AD:%.*]] = fmul float [[A_REAL]], [[B_IMAG]] +// AVRFP64-NEXT: [[MUL_BC:%.*]] = fmul float [[A_IMAG]], [[B_REAL]] +// AVRFP64-NEXT: [[MUL_R:%.*]] = fsub float [[MUL_AC]], [[MUL_BD]] +// AVRFP64-NEXT: [[MUL_I:%.*]] = fadd float [[MUL_AD]], [[MUL_BC]] +// AVRFP64-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// AVRFP64-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// AVRFP64-NEXT: store float [[MUL_R]], ptr [[RETVAL_REALP]], align 1 +// AVRFP64-NEXT: store float [[MUL_I]], ptr [[RETVAL_IMAGP]], align 1 +// AVRFP64-NEXT: [[TMP4:%.*]] = load { float, float }, ptr [[RETVAL]], align 1 +// AVRFP64-NEXT: ret { float, float } [[TMP4]] +// +// BASIC_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x float> @mulf( +// BASIC_FAST-SAME: <2 x float> noundef nofpclass(nan inf) [[A_COERCE:%.*]], <2 x float> noundef nofpclass(nan inf) [[B_COERCE:%.*]]) #[[ATTR0]] { +// BASIC_FAST-NEXT: entry: +// BASIC_FAST-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// BASIC_FAST-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// BASIC_FAST-NEXT: [[B:%.*]] = alloca { float, float }, align 4 +// BASIC_FAST-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4 +// BASIC_FAST-NEXT: store <2 x float> [[B_COERCE]], ptr [[B]], align 4 +// BASIC_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// BASIC_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// BASIC_FAST-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// BASIC_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, 
float }, ptr [[B]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4 +// BASIC_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// BASIC_FAST-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4 +// BASIC_FAST-NEXT: [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_REAL]], [[B_REAL]] +// BASIC_FAST-NEXT: [[MUL_BD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_IMAG]], [[B_IMAG]] +// BASIC_FAST-NEXT: [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_REAL]], [[B_IMAG]] +// BASIC_FAST-NEXT: [[MUL_BC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_IMAG]], [[B_REAL]] +// BASIC_FAST-NEXT: [[MUL_R:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[MUL_AC]], [[MUL_BD]] +// BASIC_FAST-NEXT: [[MUL_I:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[MUL_AD]], [[MUL_BC]] +// BASIC_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// BASIC_FAST-NEXT: store float [[MUL_R]], ptr [[RETVAL_REALP]], align 4 +// BASIC_FAST-NEXT: store float [[MUL_I]], ptr [[RETVAL_IMAGP]], align 4 +// BASIC_FAST-NEXT: [[TMP0:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4 +// BASIC_FAST-NEXT: ret <2 x float> [[TMP0]] +// +// FULL_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x float> @mulf( +// FULL_FAST-SAME: <2 x float> noundef nofpclass(nan inf) [[A_COERCE:%.*]], <2 x float> noundef nofpclass(nan inf) [[B_COERCE:%.*]]) #[[ATTR0]] { +// FULL_FAST-NEXT: entry: +// FULL_FAST-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// FULL_FAST-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// FULL_FAST-NEXT: [[B:%.*]] = alloca { float, float }, align 4 +// FULL_FAST-NEXT: [[COERCE:%.*]] = alloca { float, float }, align 4 +// FULL_FAST-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4 +// FULL_FAST-NEXT: store <2 x float> [[B_COERCE]], ptr [[B]], align 4 +// FULL_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// FULL_FAST-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// FULL_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// FULL_FAST-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// FULL_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// FULL_FAST-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4 +// FULL_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// FULL_FAST-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4 +// FULL_FAST-NEXT: [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_REAL]], [[B_REAL]] +// FULL_FAST-NEXT: [[MUL_BD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_IMAG]], [[B_IMAG]] +// FULL_FAST-NEXT: [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_REAL]], [[B_IMAG]] +// FULL_FAST-NEXT: [[MUL_BC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_IMAG]], [[B_REAL]] +// FULL_FAST-NEXT: [[MUL_R:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[MUL_AC]], [[MUL_BD]] +// FULL_FAST-NEXT: [[MUL_I:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[MUL_AD]], [[MUL_BC]] +// FULL_FAST-NEXT: [[ISNAN_CMP:%.*]] = fcmp reassoc nnan ninf nsz arcp afn uno float [[MUL_R]], [[MUL_R]] +// FULL_FAST-NEXT: br i1 [[ISNAN_CMP]], label 
[[COMPLEX_MUL_IMAG_NAN:%.*]], label [[COMPLEX_MUL_CONT:%.*]], !prof [[PROF2:![0-9]+]] +// FULL_FAST: complex_mul_imag_nan: +// FULL_FAST-NEXT: [[ISNAN_CMP1:%.*]] = fcmp reassoc nnan ninf nsz arcp afn uno float [[MUL_I]], [[MUL_I]] +// FULL_FAST-NEXT: br i1 [[ISNAN_CMP1]], label [[COMPLEX_MUL_LIBCALL:%.*]], label [[COMPLEX_MUL_CONT]], !prof [[PROF2]] +// FULL_FAST: complex_mul_libcall: +// FULL_FAST-NEXT: [[CALL:%.*]] = call reassoc nnan ninf nsz arcp afn nofpclass(nan inf) <2 x float> @__mulsc3(float noundef nofpclass(nan inf) [[A_REAL]], float noundef nofpclass(nan inf) [[A_IMAG]], float noundef nofpclass(nan inf) [[B_REAL]], float noundef nofpclass(nan inf) [[B_IMAG]]) #[[ATTR2]] +// FULL_FAST-NEXT: store <2 x float> [[CALL]], ptr [[COERCE]], align 4 +// FULL_FAST-NEXT: [[COERCE_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 0 +// FULL_FAST-NEXT: [[COERCE_REAL:%.*]] = load float, ptr [[COERCE_REALP]], align 4 +// FULL_FAST-NEXT: [[COERCE_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 1 +// FULL_FAST-NEXT: [[COERCE_IMAG:%.*]] = load float, ptr [[COERCE_IMAGP]], align 4 +// FULL_FAST-NEXT: br label [[COMPLEX_MUL_CONT]] +// FULL_FAST: complex_mul_cont: +// FULL_FAST-NEXT: [[REAL_MUL_PHI:%.*]] = phi reassoc nnan ninf nsz arcp afn float [ [[MUL_R]], [[ENTRY:%.*]] ], [ [[MUL_R]], [[COMPLEX_MUL_IMAG_NAN]] ], [ [[COERCE_REAL]], [[COMPLEX_MUL_LIBCALL]] ] +// FULL_FAST-NEXT: [[IMAG_MUL_PHI:%.*]] = phi reassoc nnan ninf nsz arcp afn float [ [[MUL_I]], [[ENTRY]] ], [ [[MUL_I]], [[COMPLEX_MUL_IMAG_NAN]] ], [ [[COERCE_IMAG]], [[COMPLEX_MUL_LIBCALL]] ] +// FULL_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// FULL_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// FULL_FAST-NEXT: store float [[REAL_MUL_PHI]], ptr [[RETVAL_REALP]], align 4 +// FULL_FAST-NEXT: store float [[IMAG_MUL_PHI]], ptr [[RETVAL_IMAGP]], align 4 +// FULL_FAST-NEXT: [[TMP0:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4 +// FULL_FAST-NEXT: ret <2 x float> [[TMP0]] +// +// IMPRVD_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x float> @mulf( +// IMPRVD_FAST-SAME: <2 x float> noundef nofpclass(nan inf) [[A_COERCE:%.*]], <2 x float> noundef nofpclass(nan inf) [[B_COERCE:%.*]]) #[[ATTR0]] { +// IMPRVD_FAST-NEXT: entry: +// IMPRVD_FAST-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// IMPRVD_FAST-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// IMPRVD_FAST-NEXT: [[B:%.*]] = alloca { float, float }, align 4 +// IMPRVD_FAST-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4 +// IMPRVD_FAST-NEXT: store <2 x float> [[B_COERCE]], ptr [[B]], align 4 +// IMPRVD_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// IMPRVD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// IMPRVD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4 +// IMPRVD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4 +// IMPRVD_FAST-NEXT: [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn 
float [[A_REAL]], [[B_REAL]] +// IMPRVD_FAST-NEXT: [[MUL_BD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_IMAG]], [[B_IMAG]] +// IMPRVD_FAST-NEXT: [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_REAL]], [[B_IMAG]] +// IMPRVD_FAST-NEXT: [[MUL_BC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_IMAG]], [[B_REAL]] +// IMPRVD_FAST-NEXT: [[MUL_R:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[MUL_AC]], [[MUL_BD]] +// IMPRVD_FAST-NEXT: [[MUL_I:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[MUL_AD]], [[MUL_BC]] +// IMPRVD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: store float [[MUL_R]], ptr [[RETVAL_REALP]], align 4 +// IMPRVD_FAST-NEXT: store float [[MUL_I]], ptr [[RETVAL_IMAGP]], align 4 +// IMPRVD_FAST-NEXT: [[TMP0:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4 +// IMPRVD_FAST-NEXT: ret <2 x float> [[TMP0]] +// +// PRMTD_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x float> @mulf( +// PRMTD_FAST-SAME: <2 x float> noundef nofpclass(nan inf) [[A_COERCE:%.*]], <2 x float> noundef nofpclass(nan inf) [[B_COERCE:%.*]]) #[[ATTR0]] { +// PRMTD_FAST-NEXT: entry: +// PRMTD_FAST-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// PRMTD_FAST-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// PRMTD_FAST-NEXT: [[B:%.*]] = alloca { float, float }, align 4 +// PRMTD_FAST-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4 +// PRMTD_FAST-NEXT: store <2 x float> [[B_COERCE]], ptr [[B]], align 4 +// PRMTD_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// PRMTD_FAST-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// PRMTD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// PRMTD_FAST-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// PRMTD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// PRMTD_FAST-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4 +// PRMTD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// PRMTD_FAST-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4 +// PRMTD_FAST-NEXT: [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_REAL]], [[B_REAL]] +// PRMTD_FAST-NEXT: [[MUL_BD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_IMAG]], [[B_IMAG]] +// PRMTD_FAST-NEXT: [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_REAL]], [[B_IMAG]] +// PRMTD_FAST-NEXT: [[MUL_BC:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_IMAG]], [[B_REAL]] +// PRMTD_FAST-NEXT: [[MUL_R:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[MUL_AC]], [[MUL_BD]] +// PRMTD_FAST-NEXT: [[MUL_I:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[MUL_AD]], [[MUL_BC]] +// PRMTD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// PRMTD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// PRMTD_FAST-NEXT: store float [[MUL_R]], ptr [[RETVAL_REALP]], align 4 +// PRMTD_FAST-NEXT: store float [[MUL_I]], ptr [[RETVAL_IMAGP]], align 4 +// PRMTD_FAST-NEXT: [[TMP0:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4 +// PRMTD_FAST-NEXT: ret <2 x float> [[TMP0]] +// +_Complex float mulf(_Complex float a, _Complex float b) { + 
return a * b; +} +// FULL-LABEL: define dso_local { double, double } @divd( +// FULL-SAME: double noundef [[A_COERCE0:%.*]], double noundef [[A_COERCE1:%.*]], double noundef [[B_COERCE0:%.*]], double noundef [[B_COERCE1:%.*]]) #[[ATTR1:[0-9]+]] { +// FULL-NEXT: entry: +// FULL-NEXT: [[RETVAL:%.*]] = alloca { double, double }, align 8 +// FULL-NEXT: [[A:%.*]] = alloca { double, double }, align 8 +// FULL-NEXT: [[B:%.*]] = alloca { double, double }, align 8 +// FULL-NEXT: [[TMP0:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// FULL-NEXT: store double [[A_COERCE0]], ptr [[TMP0]], align 8 +// FULL-NEXT: [[TMP1:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// FULL-NEXT: store double [[A_COERCE1]], ptr [[TMP1]], align 8 +// FULL-NEXT: [[TMP2:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// FULL-NEXT: store double [[B_COERCE0]], ptr [[TMP2]], align 8 +// FULL-NEXT: [[TMP3:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// FULL-NEXT: store double [[B_COERCE1]], ptr [[TMP3]], align 8 +// FULL-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// FULL-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 8 +// FULL-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// FULL-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8 +// FULL-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// FULL-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// FULL-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// FULL-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// FULL-NEXT: [[CALL:%.*]] = call { double, double } @__divdc3(double noundef [[A_REAL]], double noundef [[A_IMAG]], double noundef [[B_REAL]], double noundef [[B_IMAG]]) #[[ATTR2]] +// FULL-NEXT: [[TMP4:%.*]] = extractvalue { double, double } [[CALL]], 0 +// FULL-NEXT: [[TMP5:%.*]] = extractvalue { double, double } [[CALL]], 1 +// FULL-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 0 +// FULL-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 1 +// FULL-NEXT: store double [[TMP4]], ptr [[RETVAL_REALP]], align 8 +// FULL-NEXT: store double [[TMP5]], ptr [[RETVAL_IMAGP]], align 8 +// FULL-NEXT: [[TMP6:%.*]] = load { double, double }, ptr [[RETVAL]], align 8 +// FULL-NEXT: ret { double, double } [[TMP6]] +// +// BASIC-LABEL: define dso_local { double, double } @divd( +// BASIC-SAME: double noundef [[A_COERCE0:%.*]], double noundef [[A_COERCE1:%.*]], double noundef [[B_COERCE0:%.*]], double noundef [[B_COERCE1:%.*]]) #[[ATTR1:[0-9]+]] { +// BASIC-NEXT: entry: +// BASIC-NEXT: [[RETVAL:%.*]] = alloca { double, double }, align 8 +// BASIC-NEXT: [[A:%.*]] = alloca { double, double }, align 8 +// BASIC-NEXT: [[B:%.*]] = alloca { double, double }, align 8 +// BASIC-NEXT: [[TMP0:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// BASIC-NEXT: store double [[A_COERCE0]], ptr [[TMP0]], align 8 +// BASIC-NEXT: [[TMP1:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// BASIC-NEXT: store double [[A_COERCE1]], ptr [[TMP1]], align 8 +// BASIC-NEXT: [[TMP2:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// BASIC-NEXT: store double [[B_COERCE0]], ptr [[TMP2]], align 8 +// 
BASIC-NEXT: [[TMP3:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// BASIC-NEXT: store double [[B_COERCE1]], ptr [[TMP3]], align 8 +// BASIC-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// BASIC-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 8 +// BASIC-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// BASIC-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8 +// BASIC-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// BASIC-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// BASIC-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// BASIC-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// BASIC-NEXT: [[TMP4:%.*]] = fmul double [[A_REAL]], [[B_REAL]] +// BASIC-NEXT: [[TMP5:%.*]] = fmul double [[A_IMAG]], [[B_IMAG]] +// BASIC-NEXT: [[TMP6:%.*]] = fadd double [[TMP4]], [[TMP5]] +// BASIC-NEXT: [[TMP7:%.*]] = fmul double [[B_REAL]], [[B_REAL]] +// BASIC-NEXT: [[TMP8:%.*]] = fmul double [[B_IMAG]], [[B_IMAG]] +// BASIC-NEXT: [[TMP9:%.*]] = fadd double [[TMP7]], [[TMP8]] +// BASIC-NEXT: [[TMP10:%.*]] = fmul double [[A_IMAG]], [[B_REAL]] +// BASIC-NEXT: [[TMP11:%.*]] = fmul double [[A_REAL]], [[B_IMAG]] +// BASIC-NEXT: [[TMP12:%.*]] = fsub double [[TMP10]], [[TMP11]] +// BASIC-NEXT: [[TMP13:%.*]] = fdiv double [[TMP6]], [[TMP9]] +// BASIC-NEXT: [[TMP14:%.*]] = fdiv double [[TMP12]], [[TMP9]] +// BASIC-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 0 +// BASIC-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 1 +// BASIC-NEXT: store double [[TMP13]], ptr [[RETVAL_REALP]], align 8 +// BASIC-NEXT: store double [[TMP14]], ptr [[RETVAL_IMAGP]], align 8 +// BASIC-NEXT: [[TMP15:%.*]] = load { double, double }, ptr [[RETVAL]], align 8 +// BASIC-NEXT: ret { double, double } [[TMP15]] +// +// IMPRVD-LABEL: define dso_local { double, double } @divd( +// IMPRVD-SAME: double noundef [[A_COERCE0:%.*]], double noundef [[A_COERCE1:%.*]], double noundef [[B_COERCE0:%.*]], double noundef [[B_COERCE1:%.*]]) #[[ATTR2:[0-9]+]] { +// IMPRVD-NEXT: entry: +// IMPRVD-NEXT: [[RETVAL:%.*]] = alloca { double, double }, align 8 +// IMPRVD-NEXT: [[A:%.*]] = alloca { double, double }, align 8 +// IMPRVD-NEXT: [[B:%.*]] = alloca { double, double }, align 8 +// IMPRVD-NEXT: [[TMP0:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// IMPRVD-NEXT: store double [[A_COERCE0]], ptr [[TMP0]], align 8 +// IMPRVD-NEXT: [[TMP1:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// IMPRVD-NEXT: store double [[A_COERCE1]], ptr [[TMP1]], align 8 +// IMPRVD-NEXT: [[TMP2:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// IMPRVD-NEXT: store double [[B_COERCE0]], ptr [[TMP2]], align 8 +// IMPRVD-NEXT: [[TMP3:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// IMPRVD-NEXT: store double [[B_COERCE1]], ptr [[TMP3]], align 8 +// IMPRVD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// IMPRVD-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 8 +// IMPRVD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// IMPRVD-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8 +// IMPRVD-NEXT: 
[[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// IMPRVD-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// IMPRVD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// IMPRVD-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// IMPRVD-NEXT: [[TMP4:%.*]] = call double @llvm.fabs.f64(double [[B_REAL]]) +// IMPRVD-NEXT: [[TMP5:%.*]] = call double @llvm.fabs.f64(double [[B_IMAG]]) +// IMPRVD-NEXT: [[ABS_CMP:%.*]] = fcmp ugt double [[TMP4]], [[TMP5]] +// IMPRVD-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// IMPRVD: abs_rhsr_greater_or_equal_abs_rhsi: +// IMPRVD-NEXT: [[TMP6:%.*]] = fdiv double [[B_IMAG]], [[B_REAL]] +// IMPRVD-NEXT: [[TMP7:%.*]] = fmul double [[TMP6]], [[B_IMAG]] +// IMPRVD-NEXT: [[TMP8:%.*]] = fadd double [[B_REAL]], [[TMP7]] +// IMPRVD-NEXT: [[TMP9:%.*]] = fmul double [[A_IMAG]], [[TMP6]] +// IMPRVD-NEXT: [[TMP10:%.*]] = fadd double [[A_REAL]], [[TMP9]] +// IMPRVD-NEXT: [[TMP11:%.*]] = fdiv double [[TMP10]], [[TMP8]] +// IMPRVD-NEXT: [[TMP12:%.*]] = fmul double [[A_REAL]], [[TMP6]] +// IMPRVD-NEXT: [[TMP13:%.*]] = fsub double [[A_IMAG]], [[TMP12]] +// IMPRVD-NEXT: [[TMP14:%.*]] = fdiv double [[TMP13]], [[TMP8]] +// IMPRVD-NEXT: br label [[COMPLEX_DIV:%.*]] +// IMPRVD: abs_rhsr_less_than_abs_rhsi: +// IMPRVD-NEXT: [[TMP15:%.*]] = fdiv double [[B_REAL]], [[B_IMAG]] +// IMPRVD-NEXT: [[TMP16:%.*]] = fmul double [[TMP15]], [[B_REAL]] +// IMPRVD-NEXT: [[TMP17:%.*]] = fadd double [[B_IMAG]], [[TMP16]] +// IMPRVD-NEXT: [[TMP18:%.*]] = fmul double [[A_REAL]], [[TMP15]] +// IMPRVD-NEXT: [[TMP19:%.*]] = fadd double [[TMP18]], [[A_IMAG]] +// IMPRVD-NEXT: [[TMP20:%.*]] = fdiv double [[TMP19]], [[TMP17]] +// IMPRVD-NEXT: [[TMP21:%.*]] = fmul double [[A_IMAG]], [[TMP15]] +// IMPRVD-NEXT: [[TMP22:%.*]] = fsub double [[TMP21]], [[A_REAL]] +// IMPRVD-NEXT: [[TMP23:%.*]] = fdiv double [[TMP22]], [[TMP17]] +// IMPRVD-NEXT: br label [[COMPLEX_DIV]] +// IMPRVD: complex_div: +// IMPRVD-NEXT: [[TMP24:%.*]] = phi double [ [[TMP11]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP20]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD-NEXT: [[TMP25:%.*]] = phi double [ [[TMP14]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP23]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 0 +// IMPRVD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 1 +// IMPRVD-NEXT: store double [[TMP24]], ptr [[RETVAL_REALP]], align 8 +// IMPRVD-NEXT: store double [[TMP25]], ptr [[RETVAL_IMAGP]], align 8 +// IMPRVD-NEXT: [[TMP26:%.*]] = load { double, double }, ptr [[RETVAL]], align 8 +// IMPRVD-NEXT: ret { double, double } [[TMP26]] +// +// PRMTD-LABEL: define dso_local { double, double } @divd( +// PRMTD-SAME: double noundef [[A_COERCE0:%.*]], double noundef [[A_COERCE1:%.*]], double noundef [[B_COERCE0:%.*]], double noundef [[B_COERCE1:%.*]]) #[[ATTR1:[0-9]+]] { +// PRMTD-NEXT: entry: +// PRMTD-NEXT: [[RETVAL:%.*]] = alloca { double, double }, align 8 +// PRMTD-NEXT: [[A:%.*]] = alloca { double, double }, align 8 +// PRMTD-NEXT: [[B:%.*]] = alloca { double, double }, align 8 +// PRMTD-NEXT: [[TMP0:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// PRMTD-NEXT: store double [[A_COERCE0]], ptr [[TMP0]], align 8 +// PRMTD-NEXT: [[TMP1:%.*]] = getelementptr inbounds { double, 
double }, ptr [[A]], i32 0, i32 1 +// PRMTD-NEXT: store double [[A_COERCE1]], ptr [[TMP1]], align 8 +// PRMTD-NEXT: [[TMP2:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// PRMTD-NEXT: store double [[B_COERCE0]], ptr [[TMP2]], align 8 +// PRMTD-NEXT: [[TMP3:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// PRMTD-NEXT: store double [[B_COERCE1]], ptr [[TMP3]], align 8 +// PRMTD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// PRMTD-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 8 +// PRMTD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// PRMTD-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8 +// PRMTD-NEXT: [[EXT:%.*]] = fpext double [[A_REAL]] to x86_fp80 +// PRMTD-NEXT: [[EXT1:%.*]] = fpext double [[A_IMAG]] to x86_fp80 +// PRMTD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// PRMTD-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// PRMTD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// PRMTD-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// PRMTD-NEXT: [[EXT2:%.*]] = fpext double [[B_REAL]] to x86_fp80 +// PRMTD-NEXT: [[EXT3:%.*]] = fpext double [[B_IMAG]] to x86_fp80 +// PRMTD-NEXT: [[TMP4:%.*]] = fmul x86_fp80 [[EXT]], [[EXT2]] +// PRMTD-NEXT: [[TMP5:%.*]] = fmul x86_fp80 [[EXT1]], [[EXT3]] +// PRMTD-NEXT: [[TMP6:%.*]] = fadd x86_fp80 [[TMP4]], [[TMP5]] +// PRMTD-NEXT: [[TMP7:%.*]] = fmul x86_fp80 [[EXT2]], [[EXT2]] +// PRMTD-NEXT: [[TMP8:%.*]] = fmul x86_fp80 [[EXT3]], [[EXT3]] +// PRMTD-NEXT: [[TMP9:%.*]] = fadd x86_fp80 [[TMP7]], [[TMP8]] +// PRMTD-NEXT: [[TMP10:%.*]] = fmul x86_fp80 [[EXT1]], [[EXT2]] +// PRMTD-NEXT: [[TMP11:%.*]] = fmul x86_fp80 [[EXT]], [[EXT3]] +// PRMTD-NEXT: [[TMP12:%.*]] = fsub x86_fp80 [[TMP10]], [[TMP11]] +// PRMTD-NEXT: [[TMP13:%.*]] = fdiv x86_fp80 [[TMP6]], [[TMP9]] +// PRMTD-NEXT: [[TMP14:%.*]] = fdiv x86_fp80 [[TMP12]], [[TMP9]] +// PRMTD-NEXT: [[UNPROMOTION:%.*]] = fptrunc x86_fp80 [[TMP13]] to double +// PRMTD-NEXT: [[UNPROMOTION4:%.*]] = fptrunc x86_fp80 [[TMP14]] to double +// PRMTD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 0 +// PRMTD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 1 +// PRMTD-NEXT: store double [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 8 +// PRMTD-NEXT: store double [[UNPROMOTION4]], ptr [[RETVAL_IMAGP]], align 8 +// PRMTD-NEXT: [[TMP15:%.*]] = load { double, double }, ptr [[RETVAL]], align 8 +// PRMTD-NEXT: ret { double, double } [[TMP15]] +// +// X86WINPRMTD-LABEL: define dso_local void @divd( +// X86WINPRMTD-SAME: ptr dead_on_unwind noalias writable sret({ double, double }) align 8 [[AGG_RESULT:%.*]], ptr noundef [[A:%.*]], ptr noundef [[B:%.*]]) #[[ATTR0]] { +// X86WINPRMTD-NEXT: entry: +// X86WINPRMTD-NEXT: [[RESULT_PTR:%.*]] = alloca ptr, align 8 +// X86WINPRMTD-NEXT: [[B_INDIRECT_ADDR:%.*]] = alloca ptr, align 8 +// X86WINPRMTD-NEXT: [[A_INDIRECT_ADDR:%.*]] = alloca ptr, align 8 +// X86WINPRMTD-NEXT: store ptr [[AGG_RESULT]], ptr [[RESULT_PTR]], align 8 +// X86WINPRMTD-NEXT: store ptr [[B]], ptr [[B_INDIRECT_ADDR]], align 8 +// X86WINPRMTD-NEXT: store ptr [[A]], ptr [[A_INDIRECT_ADDR]], align 8 +// X86WINPRMTD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// X86WINPRMTD-NEXT: 
[[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 8 +// X86WINPRMTD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8 +// X86WINPRMTD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// X86WINPRMTD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// X86WINPRMTD-NEXT: [[TMP0:%.*]] = call double @llvm.fabs.f64(double [[B_REAL]]) +// X86WINPRMTD-NEXT: [[TMP1:%.*]] = call double @llvm.fabs.f64(double [[B_IMAG]]) +// X86WINPRMTD-NEXT: [[ABS_CMP:%.*]] = fcmp ugt double [[TMP0]], [[TMP1]] +// X86WINPRMTD-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// X86WINPRMTD: abs_rhsr_greater_or_equal_abs_rhsi: +// X86WINPRMTD-NEXT: [[TMP2:%.*]] = fdiv double [[B_IMAG]], [[B_REAL]] +// X86WINPRMTD-NEXT: [[TMP3:%.*]] = fmul double [[TMP2]], [[B_IMAG]] +// X86WINPRMTD-NEXT: [[TMP4:%.*]] = fadd double [[B_REAL]], [[TMP3]] +// X86WINPRMTD-NEXT: [[TMP5:%.*]] = fmul double [[A_IMAG]], [[TMP2]] +// X86WINPRMTD-NEXT: [[TMP6:%.*]] = fadd double [[A_REAL]], [[TMP5]] +// X86WINPRMTD-NEXT: [[TMP7:%.*]] = fdiv double [[TMP6]], [[TMP4]] +// X86WINPRMTD-NEXT: [[TMP8:%.*]] = fmul double [[A_REAL]], [[TMP2]] +// X86WINPRMTD-NEXT: [[TMP9:%.*]] = fsub double [[A_IMAG]], [[TMP8]] +// X86WINPRMTD-NEXT: [[TMP10:%.*]] = fdiv double [[TMP9]], [[TMP4]] +// X86WINPRMTD-NEXT: br label [[COMPLEX_DIV:%.*]] +// X86WINPRMTD: abs_rhsr_less_than_abs_rhsi: +// X86WINPRMTD-NEXT: [[TMP11:%.*]] = fdiv double [[B_REAL]], [[B_IMAG]] +// X86WINPRMTD-NEXT: [[TMP12:%.*]] = fmul double [[TMP11]], [[B_REAL]] +// X86WINPRMTD-NEXT: [[TMP13:%.*]] = fadd double [[B_IMAG]], [[TMP12]] +// X86WINPRMTD-NEXT: [[TMP14:%.*]] = fmul double [[A_REAL]], [[TMP11]] +// X86WINPRMTD-NEXT: [[TMP15:%.*]] = fadd double [[TMP14]], [[A_IMAG]] +// X86WINPRMTD-NEXT: [[TMP16:%.*]] = fdiv double [[TMP15]], [[TMP13]] +// X86WINPRMTD-NEXT: [[TMP17:%.*]] = fmul double [[A_IMAG]], [[TMP11]] +// X86WINPRMTD-NEXT: [[TMP18:%.*]] = fsub double [[TMP17]], [[A_REAL]] +// X86WINPRMTD-NEXT: [[TMP19:%.*]] = fdiv double [[TMP18]], [[TMP13]] +// X86WINPRMTD-NEXT: br label [[COMPLEX_DIV]] +// X86WINPRMTD: complex_div: +// X86WINPRMTD-NEXT: [[TMP20:%.*]] = phi double [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// X86WINPRMTD-NEXT: [[TMP21:%.*]] = phi double [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// X86WINPRMTD-NEXT: [[AGG_RESULT_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[AGG_RESULT_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// X86WINPRMTD-NEXT: store double [[TMP20]], ptr [[AGG_RESULT_REALP]], align 8 +// X86WINPRMTD-NEXT: store double [[TMP21]], ptr [[AGG_RESULT_IMAGP]], align 8 +// X86WINPRMTD-NEXT: [[AGG_RESULT_REALP1:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[AGG_RESULT_REAL:%.*]] = load double, ptr [[AGG_RESULT_REALP1]], align 8 +// X86WINPRMTD-NEXT: [[AGG_RESULT_IMAGP2:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// 
X86WINPRMTD-NEXT: [[AGG_RESULT_IMAG:%.*]] = load double, ptr [[AGG_RESULT_IMAGP2]], align 8 +// X86WINPRMTD-NEXT: [[AGG_RESULT_REALP3:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[AGG_RESULT_IMAGP4:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// X86WINPRMTD-NEXT: store double [[AGG_RESULT_REAL]], ptr [[AGG_RESULT_REALP3]], align 8 +// X86WINPRMTD-NEXT: store double [[AGG_RESULT_IMAG]], ptr [[AGG_RESULT_IMAGP4]], align 8 +// X86WINPRMTD-NEXT: ret void +// +// AVRFP32-LABEL: define dso_local { float, float } @divd( +// AVRFP32-SAME: float noundef [[A_COERCE0:%.*]], float noundef [[A_COERCE1:%.*]], float noundef [[B_COERCE0:%.*]], float noundef [[B_COERCE1:%.*]]) addrspace(1) #[[ATTR0]] { +// AVRFP32-NEXT: entry: +// AVRFP32-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 1 +// AVRFP32-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// AVRFP32-NEXT: [[B:%.*]] = alloca { float, float }, align 4 +// AVRFP32-NEXT: [[TMP0:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// AVRFP32-NEXT: store float [[A_COERCE0]], ptr [[TMP0]], align 4 +// AVRFP32-NEXT: [[TMP1:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[A_COERCE1]], ptr [[TMP1]], align 4 +// AVRFP32-NEXT: [[TMP2:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// AVRFP32-NEXT: store float [[B_COERCE0]], ptr [[TMP2]], align 4 +// AVRFP32-NEXT: [[TMP3:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[B_COERCE1]], ptr [[TMP3]], align 4 +// AVRFP32-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// AVRFP32-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// AVRFP32-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// AVRFP32-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// AVRFP32-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// AVRFP32-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4 +// AVRFP32-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// AVRFP32-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4 +// AVRFP32-NEXT: [[TMP4:%.*]] = call addrspace(1) float @llvm.fabs.f32(float [[B_REAL]]) +// AVRFP32-NEXT: [[TMP5:%.*]] = call addrspace(1) float @llvm.fabs.f32(float [[B_IMAG]]) +// AVRFP32-NEXT: [[ABS_CMP:%.*]] = fcmp ugt float [[TMP4]], [[TMP5]] +// AVRFP32-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// AVRFP32: abs_rhsr_greater_or_equal_abs_rhsi: +// AVRFP32-NEXT: [[TMP6:%.*]] = fdiv float [[B_IMAG]], [[B_REAL]] +// AVRFP32-NEXT: [[TMP7:%.*]] = fmul float [[TMP6]], [[B_IMAG]] +// AVRFP32-NEXT: [[TMP8:%.*]] = fadd float [[B_REAL]], [[TMP7]] +// AVRFP32-NEXT: [[TMP9:%.*]] = fmul float [[A_IMAG]], [[TMP6]] +// AVRFP32-NEXT: [[TMP10:%.*]] = fadd float [[A_REAL]], [[TMP9]] +// AVRFP32-NEXT: [[TMP11:%.*]] = fdiv float [[TMP10]], [[TMP8]] +// AVRFP32-NEXT: [[TMP12:%.*]] = fmul float [[A_REAL]], [[TMP6]] +// AVRFP32-NEXT: [[TMP13:%.*]] = fsub float [[A_IMAG]], [[TMP12]] +// AVRFP32-NEXT: [[TMP14:%.*]] = fdiv float [[TMP13]], [[TMP8]] +// AVRFP32-NEXT: br label [[COMPLEX_DIV:%.*]] +// AVRFP32: abs_rhsr_less_than_abs_rhsi: +// AVRFP32-NEXT: [[TMP15:%.*]] = fdiv float [[B_REAL]], [[B_IMAG]] +// 
AVRFP32-NEXT: [[TMP16:%.*]] = fmul float [[TMP15]], [[B_REAL]] +// AVRFP32-NEXT: [[TMP17:%.*]] = fadd float [[B_IMAG]], [[TMP16]] +// AVRFP32-NEXT: [[TMP18:%.*]] = fmul float [[A_REAL]], [[TMP15]] +// AVRFP32-NEXT: [[TMP19:%.*]] = fadd float [[TMP18]], [[A_IMAG]] +// AVRFP32-NEXT: [[TMP20:%.*]] = fdiv float [[TMP19]], [[TMP17]] +// AVRFP32-NEXT: [[TMP21:%.*]] = fmul float [[A_IMAG]], [[TMP15]] +// AVRFP32-NEXT: [[TMP22:%.*]] = fsub float [[TMP21]], [[A_REAL]] +// AVRFP32-NEXT: [[TMP23:%.*]] = fdiv float [[TMP22]], [[TMP17]] +// AVRFP32-NEXT: br label [[COMPLEX_DIV]] +// AVRFP32: complex_div: +// AVRFP32-NEXT: [[TMP24:%.*]] = phi float [ [[TMP11]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP20]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// AVRFP32-NEXT: [[TMP25:%.*]] = phi float [ [[TMP14]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP23]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// AVRFP32-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// AVRFP32-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[TMP24]], ptr [[RETVAL_REALP]], align 1 +// AVRFP32-NEXT: store float [[TMP25]], ptr [[RETVAL_IMAGP]], align 1 +// AVRFP32-NEXT: [[TMP26:%.*]] = load { float, float }, ptr [[RETVAL]], align 1 +// AVRFP32-NEXT: ret { float, float } [[TMP26]] +// +// AVRFP64-LABEL: define dso_local void @divd( +// AVRFP64-SAME: ptr dead_on_unwind noalias writable sret({ double, double }) align 1 [[AGG_RESULT:%.*]], double noundef [[A_COERCE0:%.*]], double noundef [[A_COERCE1:%.*]], double noundef [[B_COERCE0:%.*]], double noundef [[B_COERCE1:%.*]]) addrspace(1) #[[ATTR0]] { +// AVRFP64-NEXT: entry: +// AVRFP64-NEXT: [[A:%.*]] = alloca { double, double }, align 8 +// AVRFP64-NEXT: [[B:%.*]] = alloca { double, double }, align 8 +// AVRFP64-NEXT: [[TMP0:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// AVRFP64-NEXT: store double [[A_COERCE0]], ptr [[TMP0]], align 8 +// AVRFP64-NEXT: [[TMP1:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// AVRFP64-NEXT: store double [[A_COERCE1]], ptr [[TMP1]], align 8 +// AVRFP64-NEXT: [[TMP2:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// AVRFP64-NEXT: store double [[B_COERCE0]], ptr [[TMP2]], align 8 +// AVRFP64-NEXT: [[TMP3:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// AVRFP64-NEXT: store double [[B_COERCE1]], ptr [[TMP3]], align 8 +// AVRFP64-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// AVRFP64-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 8 +// AVRFP64-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// AVRFP64-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8 +// AVRFP64-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// AVRFP64-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// AVRFP64-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// AVRFP64-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// AVRFP64-NEXT: [[TMP4:%.*]] = call addrspace(1) double @llvm.fabs.f64(double [[B_REAL]]) +// AVRFP64-NEXT: [[TMP5:%.*]] = call addrspace(1) double @llvm.fabs.f64(double [[B_IMAG]]) +// AVRFP64-NEXT: [[ABS_CMP:%.*]] = fcmp ugt double [[TMP4]], [[TMP5]] +// AVRFP64-NEXT: br i1 [[ABS_CMP]], label 
[[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// AVRFP64: abs_rhsr_greater_or_equal_abs_rhsi: +// AVRFP64-NEXT: [[TMP6:%.*]] = fdiv double [[B_IMAG]], [[B_REAL]] +// AVRFP64-NEXT: [[TMP7:%.*]] = fmul double [[TMP6]], [[B_IMAG]] +// AVRFP64-NEXT: [[TMP8:%.*]] = fadd double [[B_REAL]], [[TMP7]] +// AVRFP64-NEXT: [[TMP9:%.*]] = fmul double [[A_IMAG]], [[TMP6]] +// AVRFP64-NEXT: [[TMP10:%.*]] = fadd double [[A_REAL]], [[TMP9]] +// AVRFP64-NEXT: [[TMP11:%.*]] = fdiv double [[TMP10]], [[TMP8]] +// AVRFP64-NEXT: [[TMP12:%.*]] = fmul double [[A_REAL]], [[TMP6]] +// AVRFP64-NEXT: [[TMP13:%.*]] = fsub double [[A_IMAG]], [[TMP12]] +// AVRFP64-NEXT: [[TMP14:%.*]] = fdiv double [[TMP13]], [[TMP8]] +// AVRFP64-NEXT: br label [[COMPLEX_DIV:%.*]] +// AVRFP64: abs_rhsr_less_than_abs_rhsi: +// AVRFP64-NEXT: [[TMP15:%.*]] = fdiv double [[B_REAL]], [[B_IMAG]] +// AVRFP64-NEXT: [[TMP16:%.*]] = fmul double [[TMP15]], [[B_REAL]] +// AVRFP64-NEXT: [[TMP17:%.*]] = fadd double [[B_IMAG]], [[TMP16]] +// AVRFP64-NEXT: [[TMP18:%.*]] = fmul double [[A_REAL]], [[TMP15]] +// AVRFP64-NEXT: [[TMP19:%.*]] = fadd double [[TMP18]], [[A_IMAG]] +// AVRFP64-NEXT: [[TMP20:%.*]] = fdiv double [[TMP19]], [[TMP17]] +// AVRFP64-NEXT: [[TMP21:%.*]] = fmul double [[A_IMAG]], [[TMP15]] +// AVRFP64-NEXT: [[TMP22:%.*]] = fsub double [[TMP21]], [[A_REAL]] +// AVRFP64-NEXT: [[TMP23:%.*]] = fdiv double [[TMP22]], [[TMP17]] +// AVRFP64-NEXT: br label [[COMPLEX_DIV]] +// AVRFP64: complex_div: +// AVRFP64-NEXT: [[TMP24:%.*]] = phi double [ [[TMP11]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP20]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// AVRFP64-NEXT: [[TMP25:%.*]] = phi double [ [[TMP14]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP23]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// AVRFP64-NEXT: [[AGG_RESULT_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// AVRFP64-NEXT: [[AGG_RESULT_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// AVRFP64-NEXT: store double [[TMP24]], ptr [[AGG_RESULT_REALP]], align 1 +// AVRFP64-NEXT: store double [[TMP25]], ptr [[AGG_RESULT_IMAGP]], align 1 +// AVRFP64-NEXT: [[AGG_RESULT_REALP1:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// AVRFP64-NEXT: [[AGG_RESULT_REAL:%.*]] = load double, ptr [[AGG_RESULT_REALP1]], align 1 +// AVRFP64-NEXT: [[AGG_RESULT_IMAGP2:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// AVRFP64-NEXT: [[AGG_RESULT_IMAG:%.*]] = load double, ptr [[AGG_RESULT_IMAGP2]], align 1 +// AVRFP64-NEXT: [[AGG_RESULT_REALP3:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// AVRFP64-NEXT: [[AGG_RESULT_IMAGP4:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// AVRFP64-NEXT: store double [[AGG_RESULT_REAL]], ptr [[AGG_RESULT_REALP3]], align 1 +// AVRFP64-NEXT: store double [[AGG_RESULT_IMAG]], ptr [[AGG_RESULT_IMAGP4]], align 1 +// AVRFP64-NEXT: ret void +// +// BASIC_FAST-LABEL: define dso_local { double, double } @divd( +// BASIC_FAST-SAME: double noundef nofpclass(nan inf) [[A_COERCE0:%.*]], double noundef nofpclass(nan inf) [[A_COERCE1:%.*]], double noundef nofpclass(nan inf) [[B_COERCE0:%.*]], double noundef nofpclass(nan inf) [[B_COERCE1:%.*]]) #[[ATTR1:[0-9]+]] { +// BASIC_FAST-NEXT: entry: +// BASIC_FAST-NEXT: [[RETVAL:%.*]] = alloca { double, double }, align 8 +// BASIC_FAST-NEXT: [[A:%.*]] = alloca { double, double }, align 
8 +// BASIC_FAST-NEXT: [[B:%.*]] = alloca { double, double }, align 8 +// BASIC_FAST-NEXT: [[TMP0:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// BASIC_FAST-NEXT: store double [[A_COERCE0]], ptr [[TMP0]], align 8 +// BASIC_FAST-NEXT: [[TMP1:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// BASIC_FAST-NEXT: store double [[A_COERCE1]], ptr [[TMP1]], align 8 +// BASIC_FAST-NEXT: [[TMP2:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// BASIC_FAST-NEXT: store double [[B_COERCE0]], ptr [[TMP2]], align 8 +// BASIC_FAST-NEXT: [[TMP3:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// BASIC_FAST-NEXT: store double [[B_COERCE1]], ptr [[TMP3]], align 8 +// BASIC_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 8 +// BASIC_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// BASIC_FAST-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8 +// BASIC_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// BASIC_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// BASIC_FAST-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// BASIC_FAST-NEXT: [[TMP4:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_REAL]], [[B_REAL]] +// BASIC_FAST-NEXT: [[TMP5:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_IMAG]], [[B_IMAG]] +// BASIC_FAST-NEXT: [[TMP6:%.*]] = fadd reassoc nnan ninf nsz arcp afn double [[TMP4]], [[TMP5]] +// BASIC_FAST-NEXT: [[TMP7:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[B_REAL]], [[B_REAL]] +// BASIC_FAST-NEXT: [[TMP8:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[B_IMAG]], [[B_IMAG]] +// BASIC_FAST-NEXT: [[TMP9:%.*]] = fadd reassoc nnan ninf nsz arcp afn double [[TMP7]], [[TMP8]] +// BASIC_FAST-NEXT: [[TMP10:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_IMAG]], [[B_REAL]] +// BASIC_FAST-NEXT: [[TMP11:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_REAL]], [[B_IMAG]] +// BASIC_FAST-NEXT: [[TMP12:%.*]] = fsub reassoc nnan ninf nsz arcp afn double [[TMP10]], [[TMP11]] +// BASIC_FAST-NEXT: [[TMP13:%.*]] = fdiv reassoc nnan ninf nsz arcp afn double [[TMP6]], [[TMP9]] +// BASIC_FAST-NEXT: [[TMP14:%.*]] = fdiv reassoc nnan ninf nsz arcp afn double [[TMP12]], [[TMP9]] +// BASIC_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 1 +// BASIC_FAST-NEXT: store double [[TMP13]], ptr [[RETVAL_REALP]], align 8 +// BASIC_FAST-NEXT: store double [[TMP14]], ptr [[RETVAL_IMAGP]], align 8 +// BASIC_FAST-NEXT: [[TMP15:%.*]] = load { double, double }, ptr [[RETVAL]], align 8 +// BASIC_FAST-NEXT: ret { double, double } [[TMP15]] +// +// FULL_FAST-LABEL: define dso_local { double, double } @divd( +// FULL_FAST-SAME: double noundef nofpclass(nan inf) [[A_COERCE0:%.*]], double noundef nofpclass(nan inf) [[A_COERCE1:%.*]], double noundef nofpclass(nan inf) [[B_COERCE0:%.*]], double noundef nofpclass(nan inf) [[B_COERCE1:%.*]]) #[[ATTR1:[0-9]+]] { +// FULL_FAST-NEXT: entry: +// FULL_FAST-NEXT: [[RETVAL:%.*]] = alloca { double, double }, align 8 +// 
FULL_FAST-NEXT: [[A:%.*]] = alloca { double, double }, align 8 +// FULL_FAST-NEXT: [[B:%.*]] = alloca { double, double }, align 8 +// FULL_FAST-NEXT: [[TMP0:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// FULL_FAST-NEXT: store double [[A_COERCE0]], ptr [[TMP0]], align 8 +// FULL_FAST-NEXT: [[TMP1:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// FULL_FAST-NEXT: store double [[A_COERCE1]], ptr [[TMP1]], align 8 +// FULL_FAST-NEXT: [[TMP2:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// FULL_FAST-NEXT: store double [[B_COERCE0]], ptr [[TMP2]], align 8 +// FULL_FAST-NEXT: [[TMP3:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// FULL_FAST-NEXT: store double [[B_COERCE1]], ptr [[TMP3]], align 8 +// FULL_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// FULL_FAST-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 8 +// FULL_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// FULL_FAST-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8 +// FULL_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// FULL_FAST-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// FULL_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// FULL_FAST-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// FULL_FAST-NEXT: [[CALL:%.*]] = call { double, double } @__divdc3(double noundef nofpclass(nan inf) [[A_REAL]], double noundef nofpclass(nan inf) [[A_IMAG]], double noundef nofpclass(nan inf) [[B_REAL]], double noundef nofpclass(nan inf) [[B_IMAG]]) #[[ATTR2]] +// FULL_FAST-NEXT: [[TMP4:%.*]] = extractvalue { double, double } [[CALL]], 0 +// FULL_FAST-NEXT: [[TMP5:%.*]] = extractvalue { double, double } [[CALL]], 1 +// FULL_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 0 +// FULL_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 1 +// FULL_FAST-NEXT: store double [[TMP4]], ptr [[RETVAL_REALP]], align 8 +// FULL_FAST-NEXT: store double [[TMP5]], ptr [[RETVAL_IMAGP]], align 8 +// FULL_FAST-NEXT: [[TMP6:%.*]] = load { double, double }, ptr [[RETVAL]], align 8 +// FULL_FAST-NEXT: ret { double, double } [[TMP6]] +// +// IMPRVD_FAST-LABEL: define dso_local { double, double } @divd( +// IMPRVD_FAST-SAME: double noundef nofpclass(nan inf) [[A_COERCE0:%.*]], double noundef nofpclass(nan inf) [[A_COERCE1:%.*]], double noundef nofpclass(nan inf) [[B_COERCE0:%.*]], double noundef nofpclass(nan inf) [[B_COERCE1:%.*]]) #[[ATTR2:[0-9]+]] { +// IMPRVD_FAST-NEXT: entry: +// IMPRVD_FAST-NEXT: [[RETVAL:%.*]] = alloca { double, double }, align 8 +// IMPRVD_FAST-NEXT: [[A:%.*]] = alloca { double, double }, align 8 +// IMPRVD_FAST-NEXT: [[B:%.*]] = alloca { double, double }, align 8 +// IMPRVD_FAST-NEXT: [[TMP0:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: store double [[A_COERCE0]], ptr [[TMP0]], align 8 +// IMPRVD_FAST-NEXT: [[TMP1:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: store double [[A_COERCE1]], ptr [[TMP1]], align 8 +// IMPRVD_FAST-NEXT: [[TMP2:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: store double [[B_COERCE0]], ptr 
[[TMP2]], align 8 +// IMPRVD_FAST-NEXT: [[TMP3:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: store double [[B_COERCE1]], ptr [[TMP3]], align 8 +// IMPRVD_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 8 +// IMPRVD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8 +// IMPRVD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// IMPRVD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// IMPRVD_FAST-NEXT: [[TMP4:%.*]] = call reassoc nnan ninf nsz arcp afn double @llvm.fabs.f64(double [[B_REAL]]) +// IMPRVD_FAST-NEXT: [[TMP5:%.*]] = call reassoc nnan ninf nsz arcp afn double @llvm.fabs.f64(double [[B_IMAG]]) +// IMPRVD_FAST-NEXT: [[ABS_CMP:%.*]] = fcmp reassoc nnan ninf nsz arcp afn ugt double [[TMP4]], [[TMP5]] +// IMPRVD_FAST-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// IMPRVD_FAST: abs_rhsr_greater_or_equal_abs_rhsi: +// IMPRVD_FAST-NEXT: [[TMP6:%.*]] = fdiv reassoc nnan ninf nsz arcp afn double [[B_IMAG]], [[B_REAL]] +// IMPRVD_FAST-NEXT: [[TMP7:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[TMP6]], [[B_IMAG]] +// IMPRVD_FAST-NEXT: [[TMP8:%.*]] = fadd reassoc nnan ninf nsz arcp afn double [[B_REAL]], [[TMP7]] +// IMPRVD_FAST-NEXT: [[TMP9:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_IMAG]], [[TMP6]] +// IMPRVD_FAST-NEXT: [[TMP10:%.*]] = fadd reassoc nnan ninf nsz arcp afn double [[A_REAL]], [[TMP9]] +// IMPRVD_FAST-NEXT: [[TMP11:%.*]] = fdiv reassoc nnan ninf nsz arcp afn double [[TMP10]], [[TMP8]] +// IMPRVD_FAST-NEXT: [[TMP12:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_REAL]], [[TMP6]] +// IMPRVD_FAST-NEXT: [[TMP13:%.*]] = fsub reassoc nnan ninf nsz arcp afn double [[A_IMAG]], [[TMP12]] +// IMPRVD_FAST-NEXT: [[TMP14:%.*]] = fdiv reassoc nnan ninf nsz arcp afn double [[TMP13]], [[TMP8]] +// IMPRVD_FAST-NEXT: br label [[COMPLEX_DIV:%.*]] +// IMPRVD_FAST: abs_rhsr_less_than_abs_rhsi: +// IMPRVD_FAST-NEXT: [[TMP15:%.*]] = fdiv reassoc nnan ninf nsz arcp afn double [[B_REAL]], [[B_IMAG]] +// IMPRVD_FAST-NEXT: [[TMP16:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[TMP15]], [[B_REAL]] +// IMPRVD_FAST-NEXT: [[TMP17:%.*]] = fadd reassoc nnan ninf nsz arcp afn double [[B_IMAG]], [[TMP16]] +// IMPRVD_FAST-NEXT: [[TMP18:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_REAL]], [[TMP15]] +// IMPRVD_FAST-NEXT: [[TMP19:%.*]] = fadd reassoc nnan ninf nsz arcp afn double [[TMP18]], [[A_IMAG]] +// IMPRVD_FAST-NEXT: [[TMP20:%.*]] = fdiv reassoc nnan ninf nsz arcp afn double [[TMP19]], [[TMP17]] +// IMPRVD_FAST-NEXT: [[TMP21:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_IMAG]], [[TMP15]] +// IMPRVD_FAST-NEXT: [[TMP22:%.*]] = fsub reassoc nnan ninf nsz arcp afn double [[TMP21]], [[A_REAL]] +// IMPRVD_FAST-NEXT: [[TMP23:%.*]] = fdiv reassoc nnan ninf nsz arcp afn double [[TMP22]], [[TMP17]] +// IMPRVD_FAST-NEXT: br label [[COMPLEX_DIV]] +// IMPRVD_FAST: complex_div: +// IMPRVD_FAST-NEXT: [[TMP24:%.*]] = phi reassoc nnan ninf nsz arcp afn double [ 
[[TMP11]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP20]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD_FAST-NEXT: [[TMP25:%.*]] = phi reassoc nnan ninf nsz arcp afn double [ [[TMP14]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP23]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: store double [[TMP24]], ptr [[RETVAL_REALP]], align 8 +// IMPRVD_FAST-NEXT: store double [[TMP25]], ptr [[RETVAL_IMAGP]], align 8 +// IMPRVD_FAST-NEXT: [[TMP26:%.*]] = load { double, double }, ptr [[RETVAL]], align 8 +// IMPRVD_FAST-NEXT: ret { double, double } [[TMP26]] +// +// PRMTD_FAST-LABEL: define dso_local { double, double } @divd( +// PRMTD_FAST-SAME: double noundef nofpclass(nan inf) [[A_COERCE0:%.*]], double noundef nofpclass(nan inf) [[A_COERCE1:%.*]], double noundef nofpclass(nan inf) [[B_COERCE0:%.*]], double noundef nofpclass(nan inf) [[B_COERCE1:%.*]]) #[[ATTR1:[0-9]+]] { +// PRMTD_FAST-NEXT: entry: +// PRMTD_FAST-NEXT: [[RETVAL:%.*]] = alloca { double, double }, align 8 +// PRMTD_FAST-NEXT: [[A:%.*]] = alloca { double, double }, align 8 +// PRMTD_FAST-NEXT: [[B:%.*]] = alloca { double, double }, align 8 +// PRMTD_FAST-NEXT: [[TMP0:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// PRMTD_FAST-NEXT: store double [[A_COERCE0]], ptr [[TMP0]], align 8 +// PRMTD_FAST-NEXT: [[TMP1:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// PRMTD_FAST-NEXT: store double [[A_COERCE1]], ptr [[TMP1]], align 8 +// PRMTD_FAST-NEXT: [[TMP2:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// PRMTD_FAST-NEXT: store double [[B_COERCE0]], ptr [[TMP2]], align 8 +// PRMTD_FAST-NEXT: [[TMP3:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// PRMTD_FAST-NEXT: store double [[B_COERCE1]], ptr [[TMP3]], align 8 +// PRMTD_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// PRMTD_FAST-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 8 +// PRMTD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// PRMTD_FAST-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8 +// PRMTD_FAST-NEXT: [[EXT:%.*]] = fpext double [[A_REAL]] to x86_fp80 +// PRMTD_FAST-NEXT: [[EXT1:%.*]] = fpext double [[A_IMAG]] to x86_fp80 +// PRMTD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// PRMTD_FAST-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// PRMTD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// PRMTD_FAST-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// PRMTD_FAST-NEXT: [[EXT2:%.*]] = fpext double [[B_REAL]] to x86_fp80 +// PRMTD_FAST-NEXT: [[EXT3:%.*]] = fpext double [[B_IMAG]] to x86_fp80 +// PRMTD_FAST-NEXT: [[TMP4:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[EXT]], [[EXT2]] +// PRMTD_FAST-NEXT: [[TMP5:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[EXT1]], [[EXT3]] +// PRMTD_FAST-NEXT: [[TMP6:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP4]], [[TMP5]] +// PRMTD_FAST-NEXT: [[TMP7:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[EXT2]], [[EXT2]] +// PRMTD_FAST-NEXT: [[TMP8:%.*]] = fmul reassoc nnan ninf nsz arcp afn 
x86_fp80 [[EXT3]], [[EXT3]]
+// PRMTD_FAST-NEXT: [[TMP9:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP7]], [[TMP8]]
+// PRMTD_FAST-NEXT: [[TMP10:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[EXT1]], [[EXT2]]
+// PRMTD_FAST-NEXT: [[TMP11:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[EXT]], [[EXT3]]
+// PRMTD_FAST-NEXT: [[TMP12:%.*]] = fsub reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP10]], [[TMP11]]
+// PRMTD_FAST-NEXT: [[TMP13:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP6]], [[TMP9]]
+// PRMTD_FAST-NEXT: [[TMP14:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP12]], [[TMP9]]
+// PRMTD_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc x86_fp80 [[TMP13]] to double
+// PRMTD_FAST-NEXT: [[UNPROMOTION4:%.*]] = fptrunc x86_fp80 [[TMP14]] to double
+// PRMTD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 0
+// PRMTD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 1
+// PRMTD_FAST-NEXT: store double [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 8
+// PRMTD_FAST-NEXT: store double [[UNPROMOTION4]], ptr [[RETVAL_IMAGP]], align 8
+// PRMTD_FAST-NEXT: [[TMP15:%.*]] = load { double, double }, ptr [[RETVAL]], align 8
+// PRMTD_FAST-NEXT: ret { double, double } [[TMP15]]
+//
+_Complex double divd(_Complex double a, _Complex double b) { return a / b; }
-_Complex float mul(_Complex float a, _Complex float b) {
-  // LABEL: define {{.*}} @mul(
-  // FULL: call {{.*}} @__mulsc3
-
-  // LMTD: fmul float
-  // LMTD-NEXT: fmul float
-  // LMTD-NEXT: fmul float
-  // LMTD-NEXT: fmul float
-  // LMTD-NEXT: fsub float
-  // LMTD-NEXT: fadd float
-
-  // FRTRN: fmul {{.*}}float
-  // FRTRN-NEXT: fmul {{.*}}float
-  // FRTRN-NEXT: fmul {{.*}}float
-  // FRTRN-NEXT: fmul {{.*}}float
-  // FRTRN-NEXT: fsub {{.*}}float
-  // FRTRN-NEXT: fadd {{.*}}float
-
-  // LMTD-FAST: fmul {{.*}} float
-  // LMTD-FAST-NEXT: fmul {{.*}} float
-  // LMTD-FAST-NEXT: fmul {{.*}} float
-  // LMTD-FAST-NEXT: fmul {{.*}} float
-  // LMTD-FAST-NEXT: fsub {{.*}} float
-  // LMTD-FAST-NEXT: fadd {{.*}} float
+// FULL-LABEL: define dso_local { double, double } @muld(
+// FULL-SAME: double noundef [[A_COERCE0:%.*]], double noundef [[A_COERCE1:%.*]], double noundef [[B_COERCE0:%.*]], double noundef [[B_COERCE1:%.*]]) #[[ATTR1]] {
+// FULL-NEXT: entry:
+// FULL-NEXT: [[RETVAL:%.*]] = alloca { double, double }, align 8
+// FULL-NEXT: [[A:%.*]] = alloca { double, double }, align 8
+// FULL-NEXT: [[B:%.*]] = alloca { double, double }, align 8
+// FULL-NEXT: [[TMP0:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0
+// FULL-NEXT: store double [[A_COERCE0]], ptr [[TMP0]], align 8
+// FULL-NEXT: [[TMP1:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1
+// FULL-NEXT: store double [[A_COERCE1]], ptr [[TMP1]], align 8
+// FULL-NEXT: [[TMP2:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0
+// FULL-NEXT: store double [[B_COERCE0]], ptr [[TMP2]], align 8
+// FULL-NEXT: [[TMP3:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1
+// FULL-NEXT: store double [[B_COERCE1]], ptr [[TMP3]], align 8
+// FULL-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0
+// FULL-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 8
+// FULL-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1
+// FULL-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8
+//
FULL-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// FULL-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// FULL-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// FULL-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// FULL-NEXT: [[MUL_AC:%.*]] = fmul double [[A_REAL]], [[B_REAL]] +// FULL-NEXT: [[MUL_BD:%.*]] = fmul double [[A_IMAG]], [[B_IMAG]] +// FULL-NEXT: [[MUL_AD:%.*]] = fmul double [[A_REAL]], [[B_IMAG]] +// FULL-NEXT: [[MUL_BC:%.*]] = fmul double [[A_IMAG]], [[B_REAL]] +// FULL-NEXT: [[MUL_R:%.*]] = fsub double [[MUL_AC]], [[MUL_BD]] +// FULL-NEXT: [[MUL_I:%.*]] = fadd double [[MUL_AD]], [[MUL_BC]] +// FULL-NEXT: [[ISNAN_CMP:%.*]] = fcmp uno double [[MUL_R]], [[MUL_R]] +// FULL-NEXT: br i1 [[ISNAN_CMP]], label [[COMPLEX_MUL_IMAG_NAN:%.*]], label [[COMPLEX_MUL_CONT:%.*]], !prof [[PROF2]] +// FULL: complex_mul_imag_nan: +// FULL-NEXT: [[ISNAN_CMP1:%.*]] = fcmp uno double [[MUL_I]], [[MUL_I]] +// FULL-NEXT: br i1 [[ISNAN_CMP1]], label [[COMPLEX_MUL_LIBCALL:%.*]], label [[COMPLEX_MUL_CONT]], !prof [[PROF2]] +// FULL: complex_mul_libcall: +// FULL-NEXT: [[CALL:%.*]] = call { double, double } @__muldc3(double noundef [[A_REAL]], double noundef [[A_IMAG]], double noundef [[B_REAL]], double noundef [[B_IMAG]]) #[[ATTR2]] +// FULL-NEXT: [[TMP4:%.*]] = extractvalue { double, double } [[CALL]], 0 +// FULL-NEXT: [[TMP5:%.*]] = extractvalue { double, double } [[CALL]], 1 +// FULL-NEXT: br label [[COMPLEX_MUL_CONT]] +// FULL: complex_mul_cont: +// FULL-NEXT: [[REAL_MUL_PHI:%.*]] = phi double [ [[MUL_R]], [[ENTRY:%.*]] ], [ [[MUL_R]], [[COMPLEX_MUL_IMAG_NAN]] ], [ [[TMP4]], [[COMPLEX_MUL_LIBCALL]] ] +// FULL-NEXT: [[IMAG_MUL_PHI:%.*]] = phi double [ [[MUL_I]], [[ENTRY]] ], [ [[MUL_I]], [[COMPLEX_MUL_IMAG_NAN]] ], [ [[TMP5]], [[COMPLEX_MUL_LIBCALL]] ] +// FULL-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 0 +// FULL-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 1 +// FULL-NEXT: store double [[REAL_MUL_PHI]], ptr [[RETVAL_REALP]], align 8 +// FULL-NEXT: store double [[IMAG_MUL_PHI]], ptr [[RETVAL_IMAGP]], align 8 +// FULL-NEXT: [[TMP6:%.*]] = load { double, double }, ptr [[RETVAL]], align 8 +// FULL-NEXT: ret { double, double } [[TMP6]] +// +// BASIC-LABEL: define dso_local { double, double } @muld( +// BASIC-SAME: double noundef [[A_COERCE0:%.*]], double noundef [[A_COERCE1:%.*]], double noundef [[B_COERCE0:%.*]], double noundef [[B_COERCE1:%.*]]) #[[ATTR1]] { +// BASIC-NEXT: entry: +// BASIC-NEXT: [[RETVAL:%.*]] = alloca { double, double }, align 8 +// BASIC-NEXT: [[A:%.*]] = alloca { double, double }, align 8 +// BASIC-NEXT: [[B:%.*]] = alloca { double, double }, align 8 +// BASIC-NEXT: [[TMP0:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// BASIC-NEXT: store double [[A_COERCE0]], ptr [[TMP0]], align 8 +// BASIC-NEXT: [[TMP1:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// BASIC-NEXT: store double [[A_COERCE1]], ptr [[TMP1]], align 8 +// BASIC-NEXT: [[TMP2:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// BASIC-NEXT: store double [[B_COERCE0]], ptr [[TMP2]], align 8 +// BASIC-NEXT: [[TMP3:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// BASIC-NEXT: store double [[B_COERCE1]], ptr [[TMP3]], align 8 +// BASIC-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { 
double, double }, ptr [[A]], i32 0, i32 0 +// BASIC-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 8 +// BASIC-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// BASIC-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8 +// BASIC-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// BASIC-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// BASIC-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// BASIC-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// BASIC-NEXT: [[MUL_AC:%.*]] = fmul double [[A_REAL]], [[B_REAL]] +// BASIC-NEXT: [[MUL_BD:%.*]] = fmul double [[A_IMAG]], [[B_IMAG]] +// BASIC-NEXT: [[MUL_AD:%.*]] = fmul double [[A_REAL]], [[B_IMAG]] +// BASIC-NEXT: [[MUL_BC:%.*]] = fmul double [[A_IMAG]], [[B_REAL]] +// BASIC-NEXT: [[MUL_R:%.*]] = fsub double [[MUL_AC]], [[MUL_BD]] +// BASIC-NEXT: [[MUL_I:%.*]] = fadd double [[MUL_AD]], [[MUL_BC]] +// BASIC-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 0 +// BASIC-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 1 +// BASIC-NEXT: store double [[MUL_R]], ptr [[RETVAL_REALP]], align 8 +// BASIC-NEXT: store double [[MUL_I]], ptr [[RETVAL_IMAGP]], align 8 +// BASIC-NEXT: [[TMP4:%.*]] = load { double, double }, ptr [[RETVAL]], align 8 +// BASIC-NEXT: ret { double, double } [[TMP4]] +// +// IMPRVD-LABEL: define dso_local { double, double } @muld( +// IMPRVD-SAME: double noundef [[A_COERCE0:%.*]], double noundef [[A_COERCE1:%.*]], double noundef [[B_COERCE0:%.*]], double noundef [[B_COERCE1:%.*]]) #[[ATTR2]] { +// IMPRVD-NEXT: entry: +// IMPRVD-NEXT: [[RETVAL:%.*]] = alloca { double, double }, align 8 +// IMPRVD-NEXT: [[A:%.*]] = alloca { double, double }, align 8 +// IMPRVD-NEXT: [[B:%.*]] = alloca { double, double }, align 8 +// IMPRVD-NEXT: [[TMP0:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// IMPRVD-NEXT: store double [[A_COERCE0]], ptr [[TMP0]], align 8 +// IMPRVD-NEXT: [[TMP1:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// IMPRVD-NEXT: store double [[A_COERCE1]], ptr [[TMP1]], align 8 +// IMPRVD-NEXT: [[TMP2:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// IMPRVD-NEXT: store double [[B_COERCE0]], ptr [[TMP2]], align 8 +// IMPRVD-NEXT: [[TMP3:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// IMPRVD-NEXT: store double [[B_COERCE1]], ptr [[TMP3]], align 8 +// IMPRVD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// IMPRVD-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 8 +// IMPRVD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// IMPRVD-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8 +// IMPRVD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// IMPRVD-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// IMPRVD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// IMPRVD-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// IMPRVD-NEXT: [[MUL_AC:%.*]] = fmul double [[A_REAL]], [[B_REAL]] +// IMPRVD-NEXT: [[MUL_BD:%.*]] = fmul double [[A_IMAG]], [[B_IMAG]] +// IMPRVD-NEXT: [[MUL_AD:%.*]] = fmul double [[A_REAL]], [[B_IMAG]] +// 
IMPRVD-NEXT: [[MUL_BC:%.*]] = fmul double [[A_IMAG]], [[B_REAL]] +// IMPRVD-NEXT: [[MUL_R:%.*]] = fsub double [[MUL_AC]], [[MUL_BD]] +// IMPRVD-NEXT: [[MUL_I:%.*]] = fadd double [[MUL_AD]], [[MUL_BC]] +// IMPRVD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 0 +// IMPRVD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 1 +// IMPRVD-NEXT: store double [[MUL_R]], ptr [[RETVAL_REALP]], align 8 +// IMPRVD-NEXT: store double [[MUL_I]], ptr [[RETVAL_IMAGP]], align 8 +// IMPRVD-NEXT: [[TMP4:%.*]] = load { double, double }, ptr [[RETVAL]], align 8 +// IMPRVD-NEXT: ret { double, double } [[TMP4]] +// +// PRMTD-LABEL: define dso_local { double, double } @muld( +// PRMTD-SAME: double noundef [[A_COERCE0:%.*]], double noundef [[A_COERCE1:%.*]], double noundef [[B_COERCE0:%.*]], double noundef [[B_COERCE1:%.*]]) #[[ATTR1]] { +// PRMTD-NEXT: entry: +// PRMTD-NEXT: [[RETVAL:%.*]] = alloca { double, double }, align 8 +// PRMTD-NEXT: [[A:%.*]] = alloca { double, double }, align 8 +// PRMTD-NEXT: [[B:%.*]] = alloca { double, double }, align 8 +// PRMTD-NEXT: [[TMP0:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// PRMTD-NEXT: store double [[A_COERCE0]], ptr [[TMP0]], align 8 +// PRMTD-NEXT: [[TMP1:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// PRMTD-NEXT: store double [[A_COERCE1]], ptr [[TMP1]], align 8 +// PRMTD-NEXT: [[TMP2:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// PRMTD-NEXT: store double [[B_COERCE0]], ptr [[TMP2]], align 8 +// PRMTD-NEXT: [[TMP3:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// PRMTD-NEXT: store double [[B_COERCE1]], ptr [[TMP3]], align 8 +// PRMTD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// PRMTD-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 8 +// PRMTD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// PRMTD-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8 +// PRMTD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// PRMTD-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// PRMTD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// PRMTD-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// PRMTD-NEXT: [[MUL_AC:%.*]] = fmul double [[A_REAL]], [[B_REAL]] +// PRMTD-NEXT: [[MUL_BD:%.*]] = fmul double [[A_IMAG]], [[B_IMAG]] +// PRMTD-NEXT: [[MUL_AD:%.*]] = fmul double [[A_REAL]], [[B_IMAG]] +// PRMTD-NEXT: [[MUL_BC:%.*]] = fmul double [[A_IMAG]], [[B_REAL]] +// PRMTD-NEXT: [[MUL_R:%.*]] = fsub double [[MUL_AC]], [[MUL_BD]] +// PRMTD-NEXT: [[MUL_I:%.*]] = fadd double [[MUL_AD]], [[MUL_BC]] +// PRMTD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 0 +// PRMTD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 1 +// PRMTD-NEXT: store double [[MUL_R]], ptr [[RETVAL_REALP]], align 8 +// PRMTD-NEXT: store double [[MUL_I]], ptr [[RETVAL_IMAGP]], align 8 +// PRMTD-NEXT: [[TMP4:%.*]] = load { double, double }, ptr [[RETVAL]], align 8 +// PRMTD-NEXT: ret { double, double } [[TMP4]] +// +// X86WINPRMTD-LABEL: define dso_local void @muld( +// X86WINPRMTD-SAME: ptr dead_on_unwind noalias writable sret({ double, double }) align 8 [[AGG_RESULT:%.*]], ptr 
noundef [[A:%.*]], ptr noundef [[B:%.*]]) #[[ATTR0]] { +// X86WINPRMTD-NEXT: entry: +// X86WINPRMTD-NEXT: [[RESULT_PTR:%.*]] = alloca ptr, align 8 +// X86WINPRMTD-NEXT: [[B_INDIRECT_ADDR:%.*]] = alloca ptr, align 8 +// X86WINPRMTD-NEXT: [[A_INDIRECT_ADDR:%.*]] = alloca ptr, align 8 +// X86WINPRMTD-NEXT: store ptr [[AGG_RESULT]], ptr [[RESULT_PTR]], align 8 +// X86WINPRMTD-NEXT: store ptr [[B]], ptr [[B_INDIRECT_ADDR]], align 8 +// X86WINPRMTD-NEXT: store ptr [[A]], ptr [[A_INDIRECT_ADDR]], align 8 +// X86WINPRMTD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 8 +// X86WINPRMTD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8 +// X86WINPRMTD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// X86WINPRMTD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// X86WINPRMTD-NEXT: [[MUL_AC:%.*]] = fmul double [[A_REAL]], [[B_REAL]] +// X86WINPRMTD-NEXT: [[MUL_BD:%.*]] = fmul double [[A_IMAG]], [[B_IMAG]] +// X86WINPRMTD-NEXT: [[MUL_AD:%.*]] = fmul double [[A_REAL]], [[B_IMAG]] +// X86WINPRMTD-NEXT: [[MUL_BC:%.*]] = fmul double [[A_IMAG]], [[B_REAL]] +// X86WINPRMTD-NEXT: [[MUL_R:%.*]] = fsub double [[MUL_AC]], [[MUL_BD]] +// X86WINPRMTD-NEXT: [[MUL_I:%.*]] = fadd double [[MUL_AD]], [[MUL_BC]] +// X86WINPRMTD-NEXT: [[AGG_RESULT_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[AGG_RESULT_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// X86WINPRMTD-NEXT: store double [[MUL_R]], ptr [[AGG_RESULT_REALP]], align 8 +// X86WINPRMTD-NEXT: store double [[MUL_I]], ptr [[AGG_RESULT_IMAGP]], align 8 +// X86WINPRMTD-NEXT: [[AGG_RESULT_REALP1:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[AGG_RESULT_REAL:%.*]] = load double, ptr [[AGG_RESULT_REALP1]], align 8 +// X86WINPRMTD-NEXT: [[AGG_RESULT_IMAGP2:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[AGG_RESULT_IMAG:%.*]] = load double, ptr [[AGG_RESULT_IMAGP2]], align 8 +// X86WINPRMTD-NEXT: [[AGG_RESULT_REALP3:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[AGG_RESULT_IMAGP4:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// X86WINPRMTD-NEXT: store double [[AGG_RESULT_REAL]], ptr [[AGG_RESULT_REALP3]], align 8 +// X86WINPRMTD-NEXT: store double [[AGG_RESULT_IMAG]], ptr [[AGG_RESULT_IMAGP4]], align 8 +// X86WINPRMTD-NEXT: ret void +// +// AVRFP32-LABEL: define dso_local { float, float } @muld( +// AVRFP32-SAME: float noundef [[A_COERCE0:%.*]], float noundef [[A_COERCE1:%.*]], float noundef [[B_COERCE0:%.*]], float noundef [[B_COERCE1:%.*]]) addrspace(1) #[[ATTR0]] { +// AVRFP32-NEXT: entry: +// AVRFP32-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 1 +// AVRFP32-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// AVRFP32-NEXT: [[B:%.*]] = alloca { float, float }, align 4 +// AVRFP32-NEXT: [[TMP0:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 
0 +// AVRFP32-NEXT: store float [[A_COERCE0]], ptr [[TMP0]], align 4 +// AVRFP32-NEXT: [[TMP1:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[A_COERCE1]], ptr [[TMP1]], align 4 +// AVRFP32-NEXT: [[TMP2:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// AVRFP32-NEXT: store float [[B_COERCE0]], ptr [[TMP2]], align 4 +// AVRFP32-NEXT: [[TMP3:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[B_COERCE1]], ptr [[TMP3]], align 4 +// AVRFP32-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// AVRFP32-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// AVRFP32-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// AVRFP32-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// AVRFP32-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// AVRFP32-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4 +// AVRFP32-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// AVRFP32-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4 +// AVRFP32-NEXT: [[MUL_AC:%.*]] = fmul float [[A_REAL]], [[B_REAL]] +// AVRFP32-NEXT: [[MUL_BD:%.*]] = fmul float [[A_IMAG]], [[B_IMAG]] +// AVRFP32-NEXT: [[MUL_AD:%.*]] = fmul float [[A_REAL]], [[B_IMAG]] +// AVRFP32-NEXT: [[MUL_BC:%.*]] = fmul float [[A_IMAG]], [[B_REAL]] +// AVRFP32-NEXT: [[MUL_R:%.*]] = fsub float [[MUL_AC]], [[MUL_BD]] +// AVRFP32-NEXT: [[MUL_I:%.*]] = fadd float [[MUL_AD]], [[MUL_BC]] +// AVRFP32-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// AVRFP32-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[MUL_R]], ptr [[RETVAL_REALP]], align 1 +// AVRFP32-NEXT: store float [[MUL_I]], ptr [[RETVAL_IMAGP]], align 1 +// AVRFP32-NEXT: [[TMP4:%.*]] = load { float, float }, ptr [[RETVAL]], align 1 +// AVRFP32-NEXT: ret { float, float } [[TMP4]] +// +// AVRFP64-LABEL: define dso_local void @muld( +// AVRFP64-SAME: ptr dead_on_unwind noalias writable sret({ double, double }) align 1 [[AGG_RESULT:%.*]], double noundef [[A_COERCE0:%.*]], double noundef [[A_COERCE1:%.*]], double noundef [[B_COERCE0:%.*]], double noundef [[B_COERCE1:%.*]]) addrspace(1) #[[ATTR0]] { +// AVRFP64-NEXT: entry: +// AVRFP64-NEXT: [[A:%.*]] = alloca { double, double }, align 8 +// AVRFP64-NEXT: [[B:%.*]] = alloca { double, double }, align 8 +// AVRFP64-NEXT: [[TMP0:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// AVRFP64-NEXT: store double [[A_COERCE0]], ptr [[TMP0]], align 8 +// AVRFP64-NEXT: [[TMP1:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// AVRFP64-NEXT: store double [[A_COERCE1]], ptr [[TMP1]], align 8 +// AVRFP64-NEXT: [[TMP2:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// AVRFP64-NEXT: store double [[B_COERCE0]], ptr [[TMP2]], align 8 +// AVRFP64-NEXT: [[TMP3:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// AVRFP64-NEXT: store double [[B_COERCE1]], ptr [[TMP3]], align 8 +// AVRFP64-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// AVRFP64-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 8 +// AVRFP64-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, 
double }, ptr [[A]], i32 0, i32 1 +// AVRFP64-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8 +// AVRFP64-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// AVRFP64-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// AVRFP64-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// AVRFP64-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// AVRFP64-NEXT: [[MUL_AC:%.*]] = fmul double [[A_REAL]], [[B_REAL]] +// AVRFP64-NEXT: [[MUL_BD:%.*]] = fmul double [[A_IMAG]], [[B_IMAG]] +// AVRFP64-NEXT: [[MUL_AD:%.*]] = fmul double [[A_REAL]], [[B_IMAG]] +// AVRFP64-NEXT: [[MUL_BC:%.*]] = fmul double [[A_IMAG]], [[B_REAL]] +// AVRFP64-NEXT: [[MUL_R:%.*]] = fsub double [[MUL_AC]], [[MUL_BD]] +// AVRFP64-NEXT: [[MUL_I:%.*]] = fadd double [[MUL_AD]], [[MUL_BC]] +// AVRFP64-NEXT: [[AGG_RESULT_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// AVRFP64-NEXT: [[AGG_RESULT_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// AVRFP64-NEXT: store double [[MUL_R]], ptr [[AGG_RESULT_REALP]], align 1 +// AVRFP64-NEXT: store double [[MUL_I]], ptr [[AGG_RESULT_IMAGP]], align 1 +// AVRFP64-NEXT: [[AGG_RESULT_REALP1:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// AVRFP64-NEXT: [[AGG_RESULT_REAL:%.*]] = load double, ptr [[AGG_RESULT_REALP1]], align 1 +// AVRFP64-NEXT: [[AGG_RESULT_IMAGP2:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// AVRFP64-NEXT: [[AGG_RESULT_IMAG:%.*]] = load double, ptr [[AGG_RESULT_IMAGP2]], align 1 +// AVRFP64-NEXT: [[AGG_RESULT_REALP3:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// AVRFP64-NEXT: [[AGG_RESULT_IMAGP4:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// AVRFP64-NEXT: store double [[AGG_RESULT_REAL]], ptr [[AGG_RESULT_REALP3]], align 1 +// AVRFP64-NEXT: store double [[AGG_RESULT_IMAG]], ptr [[AGG_RESULT_IMAGP4]], align 1 +// AVRFP64-NEXT: ret void +// +// BASIC_FAST-LABEL: define dso_local { double, double } @muld( +// BASIC_FAST-SAME: double noundef nofpclass(nan inf) [[A_COERCE0:%.*]], double noundef nofpclass(nan inf) [[A_COERCE1:%.*]], double noundef nofpclass(nan inf) [[B_COERCE0:%.*]], double noundef nofpclass(nan inf) [[B_COERCE1:%.*]]) #[[ATTR1]] { +// BASIC_FAST-NEXT: entry: +// BASIC_FAST-NEXT: [[RETVAL:%.*]] = alloca { double, double }, align 8 +// BASIC_FAST-NEXT: [[A:%.*]] = alloca { double, double }, align 8 +// BASIC_FAST-NEXT: [[B:%.*]] = alloca { double, double }, align 8 +// BASIC_FAST-NEXT: [[TMP0:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// BASIC_FAST-NEXT: store double [[A_COERCE0]], ptr [[TMP0]], align 8 +// BASIC_FAST-NEXT: [[TMP1:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// BASIC_FAST-NEXT: store double [[A_COERCE1]], ptr [[TMP1]], align 8 +// BASIC_FAST-NEXT: [[TMP2:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// BASIC_FAST-NEXT: store double [[B_COERCE0]], ptr [[TMP2]], align 8 +// BASIC_FAST-NEXT: [[TMP3:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// BASIC_FAST-NEXT: store double [[B_COERCE1]], ptr [[TMP3]], align 8 +// BASIC_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[A_REAL:%.*]] = load double, ptr 
[[A_REALP]], align 8 +// BASIC_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// BASIC_FAST-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8 +// BASIC_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// BASIC_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// BASIC_FAST-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// BASIC_FAST-NEXT: [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_REAL]], [[B_REAL]] +// BASIC_FAST-NEXT: [[MUL_BD:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_IMAG]], [[B_IMAG]] +// BASIC_FAST-NEXT: [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_REAL]], [[B_IMAG]] +// BASIC_FAST-NEXT: [[MUL_BC:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_IMAG]], [[B_REAL]] +// BASIC_FAST-NEXT: [[MUL_R:%.*]] = fsub reassoc nnan ninf nsz arcp afn double [[MUL_AC]], [[MUL_BD]] +// BASIC_FAST-NEXT: [[MUL_I:%.*]] = fadd reassoc nnan ninf nsz arcp afn double [[MUL_AD]], [[MUL_BC]] +// BASIC_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 1 +// BASIC_FAST-NEXT: store double [[MUL_R]], ptr [[RETVAL_REALP]], align 8 +// BASIC_FAST-NEXT: store double [[MUL_I]], ptr [[RETVAL_IMAGP]], align 8 +// BASIC_FAST-NEXT: [[TMP4:%.*]] = load { double, double }, ptr [[RETVAL]], align 8 +// BASIC_FAST-NEXT: ret { double, double } [[TMP4]] +// +// FULL_FAST-LABEL: define dso_local { double, double } @muld( +// FULL_FAST-SAME: double noundef nofpclass(nan inf) [[A_COERCE0:%.*]], double noundef nofpclass(nan inf) [[A_COERCE1:%.*]], double noundef nofpclass(nan inf) [[B_COERCE0:%.*]], double noundef nofpclass(nan inf) [[B_COERCE1:%.*]]) #[[ATTR1]] { +// FULL_FAST-NEXT: entry: +// FULL_FAST-NEXT: [[RETVAL:%.*]] = alloca { double, double }, align 8 +// FULL_FAST-NEXT: [[A:%.*]] = alloca { double, double }, align 8 +// FULL_FAST-NEXT: [[B:%.*]] = alloca { double, double }, align 8 +// FULL_FAST-NEXT: [[TMP0:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// FULL_FAST-NEXT: store double [[A_COERCE0]], ptr [[TMP0]], align 8 +// FULL_FAST-NEXT: [[TMP1:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// FULL_FAST-NEXT: store double [[A_COERCE1]], ptr [[TMP1]], align 8 +// FULL_FAST-NEXT: [[TMP2:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// FULL_FAST-NEXT: store double [[B_COERCE0]], ptr [[TMP2]], align 8 +// FULL_FAST-NEXT: [[TMP3:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// FULL_FAST-NEXT: store double [[B_COERCE1]], ptr [[TMP3]], align 8 +// FULL_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// FULL_FAST-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 8 +// FULL_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// FULL_FAST-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8 +// FULL_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// FULL_FAST-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// FULL_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, 
double }, ptr [[B]], i32 0, i32 1 +// FULL_FAST-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// FULL_FAST-NEXT: [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_REAL]], [[B_REAL]] +// FULL_FAST-NEXT: [[MUL_BD:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_IMAG]], [[B_IMAG]] +// FULL_FAST-NEXT: [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_REAL]], [[B_IMAG]] +// FULL_FAST-NEXT: [[MUL_BC:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_IMAG]], [[B_REAL]] +// FULL_FAST-NEXT: [[MUL_R:%.*]] = fsub reassoc nnan ninf nsz arcp afn double [[MUL_AC]], [[MUL_BD]] +// FULL_FAST-NEXT: [[MUL_I:%.*]] = fadd reassoc nnan ninf nsz arcp afn double [[MUL_AD]], [[MUL_BC]] +// FULL_FAST-NEXT: [[ISNAN_CMP:%.*]] = fcmp reassoc nnan ninf nsz arcp afn uno double [[MUL_R]], [[MUL_R]] +// FULL_FAST-NEXT: br i1 [[ISNAN_CMP]], label [[COMPLEX_MUL_IMAG_NAN:%.*]], label [[COMPLEX_MUL_CONT:%.*]], !prof [[PROF2]] +// FULL_FAST: complex_mul_imag_nan: +// FULL_FAST-NEXT: [[ISNAN_CMP1:%.*]] = fcmp reassoc nnan ninf nsz arcp afn uno double [[MUL_I]], [[MUL_I]] +// FULL_FAST-NEXT: br i1 [[ISNAN_CMP1]], label [[COMPLEX_MUL_LIBCALL:%.*]], label [[COMPLEX_MUL_CONT]], !prof [[PROF2]] +// FULL_FAST: complex_mul_libcall: +// FULL_FAST-NEXT: [[CALL:%.*]] = call { double, double } @__muldc3(double noundef nofpclass(nan inf) [[A_REAL]], double noundef nofpclass(nan inf) [[A_IMAG]], double noundef nofpclass(nan inf) [[B_REAL]], double noundef nofpclass(nan inf) [[B_IMAG]]) #[[ATTR2]] +// FULL_FAST-NEXT: [[TMP4:%.*]] = extractvalue { double, double } [[CALL]], 0 +// FULL_FAST-NEXT: [[TMP5:%.*]] = extractvalue { double, double } [[CALL]], 1 +// FULL_FAST-NEXT: br label [[COMPLEX_MUL_CONT]] +// FULL_FAST: complex_mul_cont: +// FULL_FAST-NEXT: [[REAL_MUL_PHI:%.*]] = phi reassoc nnan ninf nsz arcp afn double [ [[MUL_R]], [[ENTRY:%.*]] ], [ [[MUL_R]], [[COMPLEX_MUL_IMAG_NAN]] ], [ [[TMP4]], [[COMPLEX_MUL_LIBCALL]] ] +// FULL_FAST-NEXT: [[IMAG_MUL_PHI:%.*]] = phi reassoc nnan ninf nsz arcp afn double [ [[MUL_I]], [[ENTRY]] ], [ [[MUL_I]], [[COMPLEX_MUL_IMAG_NAN]] ], [ [[TMP5]], [[COMPLEX_MUL_LIBCALL]] ] +// FULL_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 0 +// FULL_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 1 +// FULL_FAST-NEXT: store double [[REAL_MUL_PHI]], ptr [[RETVAL_REALP]], align 8 +// FULL_FAST-NEXT: store double [[IMAG_MUL_PHI]], ptr [[RETVAL_IMAGP]], align 8 +// FULL_FAST-NEXT: [[TMP6:%.*]] = load { double, double }, ptr [[RETVAL]], align 8 +// FULL_FAST-NEXT: ret { double, double } [[TMP6]] +// +// IMPRVD_FAST-LABEL: define dso_local { double, double } @muld( +// IMPRVD_FAST-SAME: double noundef nofpclass(nan inf) [[A_COERCE0:%.*]], double noundef nofpclass(nan inf) [[A_COERCE1:%.*]], double noundef nofpclass(nan inf) [[B_COERCE0:%.*]], double noundef nofpclass(nan inf) [[B_COERCE1:%.*]]) #[[ATTR2]] { +// IMPRVD_FAST-NEXT: entry: +// IMPRVD_FAST-NEXT: [[RETVAL:%.*]] = alloca { double, double }, align 8 +// IMPRVD_FAST-NEXT: [[A:%.*]] = alloca { double, double }, align 8 +// IMPRVD_FAST-NEXT: [[B:%.*]] = alloca { double, double }, align 8 +// IMPRVD_FAST-NEXT: [[TMP0:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: store double [[A_COERCE0]], ptr [[TMP0]], align 8 +// IMPRVD_FAST-NEXT: [[TMP1:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: store double 
[[A_COERCE1]], ptr [[TMP1]], align 8 +// IMPRVD_FAST-NEXT: [[TMP2:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: store double [[B_COERCE0]], ptr [[TMP2]], align 8 +// IMPRVD_FAST-NEXT: [[TMP3:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: store double [[B_COERCE1]], ptr [[TMP3]], align 8 +// IMPRVD_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 8 +// IMPRVD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8 +// IMPRVD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// IMPRVD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// IMPRVD_FAST-NEXT: [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_REAL]], [[B_REAL]] +// IMPRVD_FAST-NEXT: [[MUL_BD:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_IMAG]], [[B_IMAG]] +// IMPRVD_FAST-NEXT: [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_REAL]], [[B_IMAG]] +// IMPRVD_FAST-NEXT: [[MUL_BC:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_IMAG]], [[B_REAL]] +// IMPRVD_FAST-NEXT: [[MUL_R:%.*]] = fsub reassoc nnan ninf nsz arcp afn double [[MUL_AC]], [[MUL_BD]] +// IMPRVD_FAST-NEXT: [[MUL_I:%.*]] = fadd reassoc nnan ninf nsz arcp afn double [[MUL_AD]], [[MUL_BC]] +// IMPRVD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: store double [[MUL_R]], ptr [[RETVAL_REALP]], align 8 +// IMPRVD_FAST-NEXT: store double [[MUL_I]], ptr [[RETVAL_IMAGP]], align 8 +// IMPRVD_FAST-NEXT: [[TMP4:%.*]] = load { double, double }, ptr [[RETVAL]], align 8 +// IMPRVD_FAST-NEXT: ret { double, double } [[TMP4]] +// +// PRMTD_FAST-LABEL: define dso_local { double, double } @muld( +// PRMTD_FAST-SAME: double noundef nofpclass(nan inf) [[A_COERCE0:%.*]], double noundef nofpclass(nan inf) [[A_COERCE1:%.*]], double noundef nofpclass(nan inf) [[B_COERCE0:%.*]], double noundef nofpclass(nan inf) [[B_COERCE1:%.*]]) #[[ATTR1]] { +// PRMTD_FAST-NEXT: entry: +// PRMTD_FAST-NEXT: [[RETVAL:%.*]] = alloca { double, double }, align 8 +// PRMTD_FAST-NEXT: [[A:%.*]] = alloca { double, double }, align 8 +// PRMTD_FAST-NEXT: [[B:%.*]] = alloca { double, double }, align 8 +// PRMTD_FAST-NEXT: [[TMP0:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// PRMTD_FAST-NEXT: store double [[A_COERCE0]], ptr [[TMP0]], align 8 +// PRMTD_FAST-NEXT: [[TMP1:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// PRMTD_FAST-NEXT: store double [[A_COERCE1]], ptr [[TMP1]], align 8 +// PRMTD_FAST-NEXT: [[TMP2:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// PRMTD_FAST-NEXT: store double [[B_COERCE0]], ptr [[TMP2]], align 8 +// PRMTD_FAST-NEXT: [[TMP3:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// PRMTD_FAST-NEXT: store double [[B_COERCE1]], ptr [[TMP3]], align 8 +// PRMTD_FAST-NEXT: 
[[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0
+// PRMTD_FAST-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 8
+// PRMTD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1
+// PRMTD_FAST-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8
+// PRMTD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0
+// PRMTD_FAST-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8
+// PRMTD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1
+// PRMTD_FAST-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8
+// PRMTD_FAST-NEXT: [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_REAL]], [[B_REAL]]
+// PRMTD_FAST-NEXT: [[MUL_BD:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_IMAG]], [[B_IMAG]]
+// PRMTD_FAST-NEXT: [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_REAL]], [[B_IMAG]]
+// PRMTD_FAST-NEXT: [[MUL_BC:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[A_IMAG]], [[B_REAL]]
+// PRMTD_FAST-NEXT: [[MUL_R:%.*]] = fsub reassoc nnan ninf nsz arcp afn double [[MUL_AC]], [[MUL_BD]]
+// PRMTD_FAST-NEXT: [[MUL_I:%.*]] = fadd reassoc nnan ninf nsz arcp afn double [[MUL_AD]], [[MUL_BC]]
+// PRMTD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 0
+// PRMTD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[RETVAL]], i32 0, i32 1
+// PRMTD_FAST-NEXT: store double [[MUL_R]], ptr [[RETVAL_REALP]], align 8
+// PRMTD_FAST-NEXT: store double [[MUL_I]], ptr [[RETVAL_IMAGP]], align 8
+// PRMTD_FAST-NEXT: [[TMP4:%.*]] = load { double, double }, ptr [[RETVAL]], align 8
+// PRMTD_FAST-NEXT: ret { double, double } [[TMP4]]
+//
+_Complex double muld(_Complex double a, _Complex double b) {
+  return a * b;
+}
+
+// FULL-LABEL: define dso_local { x86_fp80, x86_fp80 } @divld(
+// FULL-SAME: ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[A:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]]) #[[ATTR1]] {
+// FULL-NEXT: entry:
+// FULL-NEXT: [[RETVAL:%.*]] = alloca { x86_fp80, x86_fp80 }, align 16
+// FULL-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 0
+// FULL-NEXT: [[A_REAL:%.*]] = load x86_fp80, ptr [[A_REALP]], align 16
+// FULL-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 1
+// FULL-NEXT: [[A_IMAG:%.*]] = load x86_fp80, ptr [[A_IMAGP]], align 16
+// FULL-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0
+// FULL-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16
+// FULL-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1
+// FULL-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16
+// FULL-NEXT: [[CALL:%.*]] = call { x86_fp80, x86_fp80 } @__divxc3(x86_fp80 noundef [[A_REAL]], x86_fp80 noundef [[A_IMAG]], x86_fp80 noundef [[B_REAL]], x86_fp80 noundef [[B_IMAG]]) #[[ATTR2]]
+// FULL-NEXT: [[TMP0:%.*]] = extractvalue { x86_fp80, x86_fp80 } [[CALL]], 0
+// FULL-NEXT: [[TMP1:%.*]] = extractvalue { x86_fp80, x86_fp80 } [[CALL]], 1
+// FULL-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 0
+// FULL-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 1
+// FULL-NEXT: store
x86_fp80 [[TMP0]], ptr [[RETVAL_REALP]], align 16 +// FULL-NEXT: store x86_fp80 [[TMP1]], ptr [[RETVAL_IMAGP]], align 16 +// FULL-NEXT: [[TMP2:%.*]] = load { x86_fp80, x86_fp80 }, ptr [[RETVAL]], align 16 +// FULL-NEXT: ret { x86_fp80, x86_fp80 } [[TMP2]] +// +// BASIC-LABEL: define dso_local { x86_fp80, x86_fp80 } @divld( +// BASIC-SAME: ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[A:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]]) #[[ATTR1]] { +// BASIC-NEXT: entry: +// BASIC-NEXT: [[RETVAL:%.*]] = alloca { x86_fp80, x86_fp80 }, align 16 +// BASIC-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 0 +// BASIC-NEXT: [[A_REAL:%.*]] = load x86_fp80, ptr [[A_REALP]], align 16 +// BASIC-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 1 +// BASIC-NEXT: [[A_IMAG:%.*]] = load x86_fp80, ptr [[A_IMAGP]], align 16 +// BASIC-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// BASIC-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// BASIC-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// BASIC-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// BASIC-NEXT: [[TMP0:%.*]] = fmul x86_fp80 [[A_REAL]], [[B_REAL]] +// BASIC-NEXT: [[TMP1:%.*]] = fmul x86_fp80 [[A_IMAG]], [[B_IMAG]] +// BASIC-NEXT: [[TMP2:%.*]] = fadd x86_fp80 [[TMP0]], [[TMP1]] +// BASIC-NEXT: [[TMP3:%.*]] = fmul x86_fp80 [[B_REAL]], [[B_REAL]] +// BASIC-NEXT: [[TMP4:%.*]] = fmul x86_fp80 [[B_IMAG]], [[B_IMAG]] +// BASIC-NEXT: [[TMP5:%.*]] = fadd x86_fp80 [[TMP3]], [[TMP4]] +// BASIC-NEXT: [[TMP6:%.*]] = fmul x86_fp80 [[A_IMAG]], [[B_REAL]] +// BASIC-NEXT: [[TMP7:%.*]] = fmul x86_fp80 [[A_REAL]], [[B_IMAG]] +// BASIC-NEXT: [[TMP8:%.*]] = fsub x86_fp80 [[TMP6]], [[TMP7]] +// BASIC-NEXT: [[TMP9:%.*]] = fdiv x86_fp80 [[TMP2]], [[TMP5]] +// BASIC-NEXT: [[TMP10:%.*]] = fdiv x86_fp80 [[TMP8]], [[TMP5]] +// BASIC-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 0 +// BASIC-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 1 +// BASIC-NEXT: store x86_fp80 [[TMP9]], ptr [[RETVAL_REALP]], align 16 +// BASIC-NEXT: store x86_fp80 [[TMP10]], ptr [[RETVAL_IMAGP]], align 16 +// BASIC-NEXT: [[TMP11:%.*]] = load { x86_fp80, x86_fp80 }, ptr [[RETVAL]], align 16 +// BASIC-NEXT: ret { x86_fp80, x86_fp80 } [[TMP11]] +// +// IMPRVD-LABEL: define dso_local { x86_fp80, x86_fp80 } @divld( +// IMPRVD-SAME: ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[A:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]]) #[[ATTR2]] { +// IMPRVD-NEXT: entry: +// IMPRVD-NEXT: [[RETVAL:%.*]] = alloca { x86_fp80, x86_fp80 }, align 16 +// IMPRVD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 0 +// IMPRVD-NEXT: [[A_REAL:%.*]] = load x86_fp80, ptr [[A_REALP]], align 16 +// IMPRVD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 1 +// IMPRVD-NEXT: [[A_IMAG:%.*]] = load x86_fp80, ptr [[A_IMAGP]], align 16 +// IMPRVD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// IMPRVD-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// IMPRVD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// IMPRVD-NEXT: [[B_IMAG:%.*]] = load x86_fp80, 
ptr [[B_IMAGP]], align 16 +// IMPRVD-NEXT: [[TMP0:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[B_REAL]]) +// IMPRVD-NEXT: [[TMP1:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[B_IMAG]]) +// IMPRVD-NEXT: [[ABS_CMP:%.*]] = fcmp ugt x86_fp80 [[TMP0]], [[TMP1]] +// IMPRVD-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// IMPRVD: abs_rhsr_greater_or_equal_abs_rhsi: +// IMPRVD-NEXT: [[TMP2:%.*]] = fdiv x86_fp80 [[B_IMAG]], [[B_REAL]] +// IMPRVD-NEXT: [[TMP3:%.*]] = fmul x86_fp80 [[TMP2]], [[B_IMAG]] +// IMPRVD-NEXT: [[TMP4:%.*]] = fadd x86_fp80 [[B_REAL]], [[TMP3]] +// IMPRVD-NEXT: [[TMP5:%.*]] = fmul x86_fp80 [[A_IMAG]], [[TMP2]] +// IMPRVD-NEXT: [[TMP6:%.*]] = fadd x86_fp80 [[A_REAL]], [[TMP5]] +// IMPRVD-NEXT: [[TMP7:%.*]] = fdiv x86_fp80 [[TMP6]], [[TMP4]] +// IMPRVD-NEXT: [[TMP8:%.*]] = fmul x86_fp80 [[A_REAL]], [[TMP2]] +// IMPRVD-NEXT: [[TMP9:%.*]] = fsub x86_fp80 [[A_IMAG]], [[TMP8]] +// IMPRVD-NEXT: [[TMP10:%.*]] = fdiv x86_fp80 [[TMP9]], [[TMP4]] +// IMPRVD-NEXT: br label [[COMPLEX_DIV:%.*]] +// IMPRVD: abs_rhsr_less_than_abs_rhsi: +// IMPRVD-NEXT: [[TMP11:%.*]] = fdiv x86_fp80 [[B_REAL]], [[B_IMAG]] +// IMPRVD-NEXT: [[TMP12:%.*]] = fmul x86_fp80 [[TMP11]], [[B_REAL]] +// IMPRVD-NEXT: [[TMP13:%.*]] = fadd x86_fp80 [[B_IMAG]], [[TMP12]] +// IMPRVD-NEXT: [[TMP14:%.*]] = fmul x86_fp80 [[A_REAL]], [[TMP11]] +// IMPRVD-NEXT: [[TMP15:%.*]] = fadd x86_fp80 [[TMP14]], [[A_IMAG]] +// IMPRVD-NEXT: [[TMP16:%.*]] = fdiv x86_fp80 [[TMP15]], [[TMP13]] +// IMPRVD-NEXT: [[TMP17:%.*]] = fmul x86_fp80 [[A_IMAG]], [[TMP11]] +// IMPRVD-NEXT: [[TMP18:%.*]] = fsub x86_fp80 [[TMP17]], [[A_REAL]] +// IMPRVD-NEXT: [[TMP19:%.*]] = fdiv x86_fp80 [[TMP18]], [[TMP13]] +// IMPRVD-NEXT: br label [[COMPLEX_DIV]] +// IMPRVD: complex_div: +// IMPRVD-NEXT: [[TMP20:%.*]] = phi x86_fp80 [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD-NEXT: [[TMP21:%.*]] = phi x86_fp80 [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 0 +// IMPRVD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 1 +// IMPRVD-NEXT: store x86_fp80 [[TMP20]], ptr [[RETVAL_REALP]], align 16 +// IMPRVD-NEXT: store x86_fp80 [[TMP21]], ptr [[RETVAL_IMAGP]], align 16 +// IMPRVD-NEXT: [[TMP22:%.*]] = load { x86_fp80, x86_fp80 }, ptr [[RETVAL]], align 16 +// IMPRVD-NEXT: ret { x86_fp80, x86_fp80 } [[TMP22]] +// +// PRMTD-LABEL: define dso_local { x86_fp80, x86_fp80 } @divld( +// PRMTD-SAME: ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[A:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]]) #[[ATTR1]] { +// PRMTD-NEXT: entry: +// PRMTD-NEXT: [[RETVAL:%.*]] = alloca { x86_fp80, x86_fp80 }, align 16 +// PRMTD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 0 +// PRMTD-NEXT: [[A_REAL:%.*]] = load x86_fp80, ptr [[A_REALP]], align 16 +// PRMTD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 1 +// PRMTD-NEXT: [[A_IMAG:%.*]] = load x86_fp80, ptr [[A_IMAGP]], align 16 +// PRMTD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// PRMTD-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// PRMTD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, 
x86_fp80 }, ptr [[B]], i32 0, i32 1 +// PRMTD-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// PRMTD-NEXT: [[TMP0:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[B_REAL]]) +// PRMTD-NEXT: [[TMP1:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[B_IMAG]]) +// PRMTD-NEXT: [[ABS_CMP:%.*]] = fcmp ugt x86_fp80 [[TMP0]], [[TMP1]] +// PRMTD-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// PRMTD: abs_rhsr_greater_or_equal_abs_rhsi: +// PRMTD-NEXT: [[TMP2:%.*]] = fdiv x86_fp80 [[B_IMAG]], [[B_REAL]] +// PRMTD-NEXT: [[TMP3:%.*]] = fmul x86_fp80 [[TMP2]], [[B_IMAG]] +// PRMTD-NEXT: [[TMP4:%.*]] = fadd x86_fp80 [[B_REAL]], [[TMP3]] +// PRMTD-NEXT: [[TMP5:%.*]] = fmul x86_fp80 [[A_IMAG]], [[TMP2]] +// PRMTD-NEXT: [[TMP6:%.*]] = fadd x86_fp80 [[A_REAL]], [[TMP5]] +// PRMTD-NEXT: [[TMP7:%.*]] = fdiv x86_fp80 [[TMP6]], [[TMP4]] +// PRMTD-NEXT: [[TMP8:%.*]] = fmul x86_fp80 [[A_REAL]], [[TMP2]] +// PRMTD-NEXT: [[TMP9:%.*]] = fsub x86_fp80 [[A_IMAG]], [[TMP8]] +// PRMTD-NEXT: [[TMP10:%.*]] = fdiv x86_fp80 [[TMP9]], [[TMP4]] +// PRMTD-NEXT: br label [[COMPLEX_DIV:%.*]] +// PRMTD: abs_rhsr_less_than_abs_rhsi: +// PRMTD-NEXT: [[TMP11:%.*]] = fdiv x86_fp80 [[B_REAL]], [[B_IMAG]] +// PRMTD-NEXT: [[TMP12:%.*]] = fmul x86_fp80 [[TMP11]], [[B_REAL]] +// PRMTD-NEXT: [[TMP13:%.*]] = fadd x86_fp80 [[B_IMAG]], [[TMP12]] +// PRMTD-NEXT: [[TMP14:%.*]] = fmul x86_fp80 [[A_REAL]], [[TMP11]] +// PRMTD-NEXT: [[TMP15:%.*]] = fadd x86_fp80 [[TMP14]], [[A_IMAG]] +// PRMTD-NEXT: [[TMP16:%.*]] = fdiv x86_fp80 [[TMP15]], [[TMP13]] +// PRMTD-NEXT: [[TMP17:%.*]] = fmul x86_fp80 [[A_IMAG]], [[TMP11]] +// PRMTD-NEXT: [[TMP18:%.*]] = fsub x86_fp80 [[TMP17]], [[A_REAL]] +// PRMTD-NEXT: [[TMP19:%.*]] = fdiv x86_fp80 [[TMP18]], [[TMP13]] +// PRMTD-NEXT: br label [[COMPLEX_DIV]] +// PRMTD: complex_div: +// PRMTD-NEXT: [[TMP20:%.*]] = phi x86_fp80 [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// PRMTD-NEXT: [[TMP21:%.*]] = phi x86_fp80 [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// PRMTD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 0 +// PRMTD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 1 +// PRMTD-NEXT: store x86_fp80 [[TMP20]], ptr [[RETVAL_REALP]], align 16 +// PRMTD-NEXT: store x86_fp80 [[TMP21]], ptr [[RETVAL_IMAGP]], align 16 +// PRMTD-NEXT: [[TMP22:%.*]] = load { x86_fp80, x86_fp80 }, ptr [[RETVAL]], align 16 +// PRMTD-NEXT: ret { x86_fp80, x86_fp80 } [[TMP22]] +// +// X86WINPRMTD-LABEL: define dso_local void @divld( +// X86WINPRMTD-SAME: ptr dead_on_unwind noalias writable sret({ double, double }) align 8 [[AGG_RESULT:%.*]], ptr noundef [[A:%.*]], ptr noundef [[B:%.*]]) #[[ATTR0]] { +// X86WINPRMTD-NEXT: entry: +// X86WINPRMTD-NEXT: [[RESULT_PTR:%.*]] = alloca ptr, align 8 +// X86WINPRMTD-NEXT: [[B_INDIRECT_ADDR:%.*]] = alloca ptr, align 8 +// X86WINPRMTD-NEXT: [[A_INDIRECT_ADDR:%.*]] = alloca ptr, align 8 +// X86WINPRMTD-NEXT: store ptr [[AGG_RESULT]], ptr [[RESULT_PTR]], align 8 +// X86WINPRMTD-NEXT: store ptr [[B]], ptr [[B_INDIRECT_ADDR]], align 8 +// X86WINPRMTD-NEXT: store ptr [[A]], ptr [[A_INDIRECT_ADDR]], align 8 +// X86WINPRMTD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 8 +// 
X86WINPRMTD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8 +// X86WINPRMTD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// X86WINPRMTD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// X86WINPRMTD-NEXT: [[TMP0:%.*]] = call double @llvm.fabs.f64(double [[B_REAL]]) +// X86WINPRMTD-NEXT: [[TMP1:%.*]] = call double @llvm.fabs.f64(double [[B_IMAG]]) +// X86WINPRMTD-NEXT: [[ABS_CMP:%.*]] = fcmp ugt double [[TMP0]], [[TMP1]] +// X86WINPRMTD-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// X86WINPRMTD: abs_rhsr_greater_or_equal_abs_rhsi: +// X86WINPRMTD-NEXT: [[TMP2:%.*]] = fdiv double [[B_IMAG]], [[B_REAL]] +// X86WINPRMTD-NEXT: [[TMP3:%.*]] = fmul double [[TMP2]], [[B_IMAG]] +// X86WINPRMTD-NEXT: [[TMP4:%.*]] = fadd double [[B_REAL]], [[TMP3]] +// X86WINPRMTD-NEXT: [[TMP5:%.*]] = fmul double [[A_IMAG]], [[TMP2]] +// X86WINPRMTD-NEXT: [[TMP6:%.*]] = fadd double [[A_REAL]], [[TMP5]] +// X86WINPRMTD-NEXT: [[TMP7:%.*]] = fdiv double [[TMP6]], [[TMP4]] +// X86WINPRMTD-NEXT: [[TMP8:%.*]] = fmul double [[A_REAL]], [[TMP2]] +// X86WINPRMTD-NEXT: [[TMP9:%.*]] = fsub double [[A_IMAG]], [[TMP8]] +// X86WINPRMTD-NEXT: [[TMP10:%.*]] = fdiv double [[TMP9]], [[TMP4]] +// X86WINPRMTD-NEXT: br label [[COMPLEX_DIV:%.*]] +// X86WINPRMTD: abs_rhsr_less_than_abs_rhsi: +// X86WINPRMTD-NEXT: [[TMP11:%.*]] = fdiv double [[B_REAL]], [[B_IMAG]] +// X86WINPRMTD-NEXT: [[TMP12:%.*]] = fmul double [[TMP11]], [[B_REAL]] +// X86WINPRMTD-NEXT: [[TMP13:%.*]] = fadd double [[B_IMAG]], [[TMP12]] +// X86WINPRMTD-NEXT: [[TMP14:%.*]] = fmul double [[A_REAL]], [[TMP11]] +// X86WINPRMTD-NEXT: [[TMP15:%.*]] = fadd double [[TMP14]], [[A_IMAG]] +// X86WINPRMTD-NEXT: [[TMP16:%.*]] = fdiv double [[TMP15]], [[TMP13]] +// X86WINPRMTD-NEXT: [[TMP17:%.*]] = fmul double [[A_IMAG]], [[TMP11]] +// X86WINPRMTD-NEXT: [[TMP18:%.*]] = fsub double [[TMP17]], [[A_REAL]] +// X86WINPRMTD-NEXT: [[TMP19:%.*]] = fdiv double [[TMP18]], [[TMP13]] +// X86WINPRMTD-NEXT: br label [[COMPLEX_DIV]] +// X86WINPRMTD: complex_div: +// X86WINPRMTD-NEXT: [[TMP20:%.*]] = phi double [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// X86WINPRMTD-NEXT: [[TMP21:%.*]] = phi double [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// X86WINPRMTD-NEXT: [[AGG_RESULT_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[AGG_RESULT_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// X86WINPRMTD-NEXT: store double [[TMP20]], ptr [[AGG_RESULT_REALP]], align 8 +// X86WINPRMTD-NEXT: store double [[TMP21]], ptr [[AGG_RESULT_IMAGP]], align 8 +// X86WINPRMTD-NEXT: [[AGG_RESULT_REALP1:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[AGG_RESULT_REAL:%.*]] = load double, ptr [[AGG_RESULT_REALP1]], align 8 +// X86WINPRMTD-NEXT: [[AGG_RESULT_IMAGP2:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[AGG_RESULT_IMAG:%.*]] = load double, ptr 
[[AGG_RESULT_IMAGP2]], align 8 +// X86WINPRMTD-NEXT: [[AGG_RESULT_REALP3:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[AGG_RESULT_IMAGP4:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// X86WINPRMTD-NEXT: store double [[AGG_RESULT_REAL]], ptr [[AGG_RESULT_REALP3]], align 8 +// X86WINPRMTD-NEXT: store double [[AGG_RESULT_IMAG]], ptr [[AGG_RESULT_IMAGP4]], align 8 +// X86WINPRMTD-NEXT: ret void +// +// AVRFP32-LABEL: define dso_local { float, float } @divld( +// AVRFP32-SAME: float noundef [[A_COERCE0:%.*]], float noundef [[A_COERCE1:%.*]], float noundef [[B_COERCE0:%.*]], float noundef [[B_COERCE1:%.*]]) addrspace(1) #[[ATTR0]] { +// AVRFP32-NEXT: entry: +// AVRFP32-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 1 +// AVRFP32-NEXT: [[A:%.*]] = alloca { float, float }, align 1 +// AVRFP32-NEXT: [[B:%.*]] = alloca { float, float }, align 1 +// AVRFP32-NEXT: [[TMP0:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// AVRFP32-NEXT: store float [[A_COERCE0]], ptr [[TMP0]], align 1 +// AVRFP32-NEXT: [[TMP1:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[A_COERCE1]], ptr [[TMP1]], align 1 +// AVRFP32-NEXT: [[TMP2:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// AVRFP32-NEXT: store float [[B_COERCE0]], ptr [[TMP2]], align 1 +// AVRFP32-NEXT: [[TMP3:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[B_COERCE1]], ptr [[TMP3]], align 1 +// AVRFP32-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// AVRFP32-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 1 +// AVRFP32-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// AVRFP32-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 1 +// AVRFP32-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// AVRFP32-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 1 +// AVRFP32-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// AVRFP32-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 1 +// AVRFP32-NEXT: [[TMP4:%.*]] = call addrspace(1) float @llvm.fabs.f32(float [[B_REAL]]) +// AVRFP32-NEXT: [[TMP5:%.*]] = call addrspace(1) float @llvm.fabs.f32(float [[B_IMAG]]) +// AVRFP32-NEXT: [[ABS_CMP:%.*]] = fcmp ugt float [[TMP4]], [[TMP5]] +// AVRFP32-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// AVRFP32: abs_rhsr_greater_or_equal_abs_rhsi: +// AVRFP32-NEXT: [[TMP6:%.*]] = fdiv float [[B_IMAG]], [[B_REAL]] +// AVRFP32-NEXT: [[TMP7:%.*]] = fmul float [[TMP6]], [[B_IMAG]] +// AVRFP32-NEXT: [[TMP8:%.*]] = fadd float [[B_REAL]], [[TMP7]] +// AVRFP32-NEXT: [[TMP9:%.*]] = fmul float [[A_IMAG]], [[TMP6]] +// AVRFP32-NEXT: [[TMP10:%.*]] = fadd float [[A_REAL]], [[TMP9]] +// AVRFP32-NEXT: [[TMP11:%.*]] = fdiv float [[TMP10]], [[TMP8]] +// AVRFP32-NEXT: [[TMP12:%.*]] = fmul float [[A_REAL]], [[TMP6]] +// AVRFP32-NEXT: [[TMP13:%.*]] = fsub float [[A_IMAG]], [[TMP12]] +// AVRFP32-NEXT: [[TMP14:%.*]] = fdiv float [[TMP13]], [[TMP8]] +// AVRFP32-NEXT: br label [[COMPLEX_DIV:%.*]] +// AVRFP32: abs_rhsr_less_than_abs_rhsi: +// AVRFP32-NEXT: [[TMP15:%.*]] = fdiv float [[B_REAL]], [[B_IMAG]] +// AVRFP32-NEXT: [[TMP16:%.*]] = fmul float [[TMP15]], [[B_REAL]] +// 
AVRFP32-NEXT: [[TMP17:%.*]] = fadd float [[B_IMAG]], [[TMP16]] +// AVRFP32-NEXT: [[TMP18:%.*]] = fmul float [[A_REAL]], [[TMP15]] +// AVRFP32-NEXT: [[TMP19:%.*]] = fadd float [[TMP18]], [[A_IMAG]] +// AVRFP32-NEXT: [[TMP20:%.*]] = fdiv float [[TMP19]], [[TMP17]] +// AVRFP32-NEXT: [[TMP21:%.*]] = fmul float [[A_IMAG]], [[TMP15]] +// AVRFP32-NEXT: [[TMP22:%.*]] = fsub float [[TMP21]], [[A_REAL]] +// AVRFP32-NEXT: [[TMP23:%.*]] = fdiv float [[TMP22]], [[TMP17]] +// AVRFP32-NEXT: br label [[COMPLEX_DIV]] +// AVRFP32: complex_div: +// AVRFP32-NEXT: [[TMP24:%.*]] = phi float [ [[TMP11]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP20]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// AVRFP32-NEXT: [[TMP25:%.*]] = phi float [ [[TMP14]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP23]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// AVRFP32-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// AVRFP32-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[TMP24]], ptr [[RETVAL_REALP]], align 1 +// AVRFP32-NEXT: store float [[TMP25]], ptr [[RETVAL_IMAGP]], align 1 +// AVRFP32-NEXT: [[TMP26:%.*]] = load { float, float }, ptr [[RETVAL]], align 1 +// AVRFP32-NEXT: ret { float, float } [[TMP26]] +// +// AVRFP64-LABEL: define dso_local void @divld( +// AVRFP64-SAME: ptr dead_on_unwind noalias writable sret({ double, double }) align 1 [[AGG_RESULT:%.*]], double noundef [[A_COERCE0:%.*]], double noundef [[A_COERCE1:%.*]], double noundef [[B_COERCE0:%.*]], double noundef [[B_COERCE1:%.*]]) addrspace(1) #[[ATTR0]] { +// AVRFP64-NEXT: entry: +// AVRFP64-NEXT: [[A:%.*]] = alloca { double, double }, align 1 +// AVRFP64-NEXT: [[B:%.*]] = alloca { double, double }, align 1 +// AVRFP64-NEXT: [[TMP0:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// AVRFP64-NEXT: store double [[A_COERCE0]], ptr [[TMP0]], align 1 +// AVRFP64-NEXT: [[TMP1:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// AVRFP64-NEXT: store double [[A_COERCE1]], ptr [[TMP1]], align 1 +// AVRFP64-NEXT: [[TMP2:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// AVRFP64-NEXT: store double [[B_COERCE0]], ptr [[TMP2]], align 1 +// AVRFP64-NEXT: [[TMP3:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// AVRFP64-NEXT: store double [[B_COERCE1]], ptr [[TMP3]], align 1 +// AVRFP64-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// AVRFP64-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 1 +// AVRFP64-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// AVRFP64-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 1 +// AVRFP64-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// AVRFP64-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 1 +// AVRFP64-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// AVRFP64-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 1 +// AVRFP64-NEXT: [[TMP4:%.*]] = call addrspace(1) double @llvm.fabs.f64(double [[B_REAL]]) +// AVRFP64-NEXT: [[TMP5:%.*]] = call addrspace(1) double @llvm.fabs.f64(double [[B_IMAG]]) +// AVRFP64-NEXT: [[ABS_CMP:%.*]] = fcmp ugt double [[TMP4]], [[TMP5]] +// AVRFP64-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] 
+// AVRFP64: abs_rhsr_greater_or_equal_abs_rhsi: +// AVRFP64-NEXT: [[TMP6:%.*]] = fdiv double [[B_IMAG]], [[B_REAL]] +// AVRFP64-NEXT: [[TMP7:%.*]] = fmul double [[TMP6]], [[B_IMAG]] +// AVRFP64-NEXT: [[TMP8:%.*]] = fadd double [[B_REAL]], [[TMP7]] +// AVRFP64-NEXT: [[TMP9:%.*]] = fmul double [[A_IMAG]], [[TMP6]] +// AVRFP64-NEXT: [[TMP10:%.*]] = fadd double [[A_REAL]], [[TMP9]] +// AVRFP64-NEXT: [[TMP11:%.*]] = fdiv double [[TMP10]], [[TMP8]] +// AVRFP64-NEXT: [[TMP12:%.*]] = fmul double [[A_REAL]], [[TMP6]] +// AVRFP64-NEXT: [[TMP13:%.*]] = fsub double [[A_IMAG]], [[TMP12]] +// AVRFP64-NEXT: [[TMP14:%.*]] = fdiv double [[TMP13]], [[TMP8]] +// AVRFP64-NEXT: br label [[COMPLEX_DIV:%.*]] +// AVRFP64: abs_rhsr_less_than_abs_rhsi: +// AVRFP64-NEXT: [[TMP15:%.*]] = fdiv double [[B_REAL]], [[B_IMAG]] +// AVRFP64-NEXT: [[TMP16:%.*]] = fmul double [[TMP15]], [[B_REAL]] +// AVRFP64-NEXT: [[TMP17:%.*]] = fadd double [[B_IMAG]], [[TMP16]] +// AVRFP64-NEXT: [[TMP18:%.*]] = fmul double [[A_REAL]], [[TMP15]] +// AVRFP64-NEXT: [[TMP19:%.*]] = fadd double [[TMP18]], [[A_IMAG]] +// AVRFP64-NEXT: [[TMP20:%.*]] = fdiv double [[TMP19]], [[TMP17]] +// AVRFP64-NEXT: [[TMP21:%.*]] = fmul double [[A_IMAG]], [[TMP15]] +// AVRFP64-NEXT: [[TMP22:%.*]] = fsub double [[TMP21]], [[A_REAL]] +// AVRFP64-NEXT: [[TMP23:%.*]] = fdiv double [[TMP22]], [[TMP17]] +// AVRFP64-NEXT: br label [[COMPLEX_DIV]] +// AVRFP64: complex_div: +// AVRFP64-NEXT: [[TMP24:%.*]] = phi double [ [[TMP11]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP20]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// AVRFP64-NEXT: [[TMP25:%.*]] = phi double [ [[TMP14]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP23]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// AVRFP64-NEXT: [[AGG_RESULT_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// AVRFP64-NEXT: [[AGG_RESULT_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// AVRFP64-NEXT: store double [[TMP24]], ptr [[AGG_RESULT_REALP]], align 1 +// AVRFP64-NEXT: store double [[TMP25]], ptr [[AGG_RESULT_IMAGP]], align 1 +// AVRFP64-NEXT: [[AGG_RESULT_REALP1:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// AVRFP64-NEXT: [[AGG_RESULT_REAL:%.*]] = load double, ptr [[AGG_RESULT_REALP1]], align 1 +// AVRFP64-NEXT: [[AGG_RESULT_IMAGP2:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// AVRFP64-NEXT: [[AGG_RESULT_IMAG:%.*]] = load double, ptr [[AGG_RESULT_IMAGP2]], align 1 +// AVRFP64-NEXT: [[AGG_RESULT_REALP3:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// AVRFP64-NEXT: [[AGG_RESULT_IMAGP4:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// AVRFP64-NEXT: store double [[AGG_RESULT_REAL]], ptr [[AGG_RESULT_REALP3]], align 1 +// AVRFP64-NEXT: store double [[AGG_RESULT_IMAG]], ptr [[AGG_RESULT_IMAGP4]], align 1 +// AVRFP64-NEXT: ret void +// +// BASIC_FAST-LABEL: define dso_local { x86_fp80, x86_fp80 } @divld( +// BASIC_FAST-SAME: ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[A:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]]) #[[ATTR1]] { +// BASIC_FAST-NEXT: entry: +// BASIC_FAST-NEXT: [[RETVAL:%.*]] = alloca { x86_fp80, x86_fp80 }, align 16 +// BASIC_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[A_REAL:%.*]] = load x86_fp80, ptr [[A_REALP]], align 16 +// BASIC_FAST-NEXT: [[A_IMAGP:%.*]] = 
getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 1 +// BASIC_FAST-NEXT: [[A_IMAG:%.*]] = load x86_fp80, ptr [[A_IMAGP]], align 16 +// BASIC_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// BASIC_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// BASIC_FAST-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// BASIC_FAST-NEXT: [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_REAL]], [[B_REAL]] +// BASIC_FAST-NEXT: [[TMP1:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_IMAG]], [[B_IMAG]] +// BASIC_FAST-NEXT: [[TMP2:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP0]], [[TMP1]] +// BASIC_FAST-NEXT: [[TMP3:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_REAL]], [[B_REAL]] +// BASIC_FAST-NEXT: [[TMP4:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_IMAG]], [[B_IMAG]] +// BASIC_FAST-NEXT: [[TMP5:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP3]], [[TMP4]] +// BASIC_FAST-NEXT: [[TMP6:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_IMAG]], [[B_REAL]] +// BASIC_FAST-NEXT: [[TMP7:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_REAL]], [[B_IMAG]] +// BASIC_FAST-NEXT: [[TMP8:%.*]] = fsub reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP6]], [[TMP7]] +// BASIC_FAST-NEXT: [[TMP9:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP2]], [[TMP5]] +// BASIC_FAST-NEXT: [[TMP10:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP8]], [[TMP5]] +// BASIC_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 1 +// BASIC_FAST-NEXT: store x86_fp80 [[TMP9]], ptr [[RETVAL_REALP]], align 16 +// BASIC_FAST-NEXT: store x86_fp80 [[TMP10]], ptr [[RETVAL_IMAGP]], align 16 +// BASIC_FAST-NEXT: [[TMP11:%.*]] = load { x86_fp80, x86_fp80 }, ptr [[RETVAL]], align 16 +// BASIC_FAST-NEXT: ret { x86_fp80, x86_fp80 } [[TMP11]] +// +// FULL_FAST-LABEL: define dso_local { x86_fp80, x86_fp80 } @divld( +// FULL_FAST-SAME: ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[A:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]]) #[[ATTR1]] { +// FULL_FAST-NEXT: entry: +// FULL_FAST-NEXT: [[RETVAL:%.*]] = alloca { x86_fp80, x86_fp80 }, align 16 +// FULL_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 0 +// FULL_FAST-NEXT: [[A_REAL:%.*]] = load x86_fp80, ptr [[A_REALP]], align 16 +// FULL_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 1 +// FULL_FAST-NEXT: [[A_IMAG:%.*]] = load x86_fp80, ptr [[A_IMAGP]], align 16 +// FULL_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// FULL_FAST-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// FULL_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// FULL_FAST-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// FULL_FAST-NEXT: [[CALL:%.*]] = call { x86_fp80, x86_fp80 } @__divxc3(x86_fp80 noundef nofpclass(nan inf) [[A_REAL]], x86_fp80 noundef nofpclass(nan inf) [[A_IMAG]], x86_fp80 noundef nofpclass(nan inf) [[B_REAL]], x86_fp80 noundef nofpclass(nan inf) [[B_IMAG]]) #[[ATTR2]] 
+// FULL_FAST-NEXT: [[TMP0:%.*]] = extractvalue { x86_fp80, x86_fp80 } [[CALL]], 0 +// FULL_FAST-NEXT: [[TMP1:%.*]] = extractvalue { x86_fp80, x86_fp80 } [[CALL]], 1 +// FULL_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 0 +// FULL_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 1 +// FULL_FAST-NEXT: store x86_fp80 [[TMP0]], ptr [[RETVAL_REALP]], align 16 +// FULL_FAST-NEXT: store x86_fp80 [[TMP1]], ptr [[RETVAL_IMAGP]], align 16 +// FULL_FAST-NEXT: [[TMP2:%.*]] = load { x86_fp80, x86_fp80 }, ptr [[RETVAL]], align 16 +// FULL_FAST-NEXT: ret { x86_fp80, x86_fp80 } [[TMP2]] +// +// IMPRVD_FAST-LABEL: define dso_local { x86_fp80, x86_fp80 } @divld( +// IMPRVD_FAST-SAME: ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[A:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]]) #[[ATTR2]] { +// IMPRVD_FAST-NEXT: entry: +// IMPRVD_FAST-NEXT: [[RETVAL:%.*]] = alloca { x86_fp80, x86_fp80 }, align 16 +// IMPRVD_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[A_REAL:%.*]] = load x86_fp80, ptr [[A_REALP]], align 16 +// IMPRVD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: [[A_IMAG:%.*]] = load x86_fp80, ptr [[A_IMAGP]], align 16 +// IMPRVD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// IMPRVD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// IMPRVD_FAST-NEXT: [[TMP0:%.*]] = call reassoc nnan ninf nsz arcp afn x86_fp80 @llvm.fabs.f80(x86_fp80 [[B_REAL]]) +// IMPRVD_FAST-NEXT: [[TMP1:%.*]] = call reassoc nnan ninf nsz arcp afn x86_fp80 @llvm.fabs.f80(x86_fp80 [[B_IMAG]]) +// IMPRVD_FAST-NEXT: [[ABS_CMP:%.*]] = fcmp reassoc nnan ninf nsz arcp afn ugt x86_fp80 [[TMP0]], [[TMP1]] +// IMPRVD_FAST-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// IMPRVD_FAST: abs_rhsr_greater_or_equal_abs_rhsi: +// IMPRVD_FAST-NEXT: [[TMP2:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[B_IMAG]], [[B_REAL]] +// IMPRVD_FAST-NEXT: [[TMP3:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP2]], [[B_IMAG]] +// IMPRVD_FAST-NEXT: [[TMP4:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[B_REAL]], [[TMP3]] +// IMPRVD_FAST-NEXT: [[TMP5:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_IMAG]], [[TMP2]] +// IMPRVD_FAST-NEXT: [[TMP6:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[A_REAL]], [[TMP5]] +// IMPRVD_FAST-NEXT: [[TMP7:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP6]], [[TMP4]] +// IMPRVD_FAST-NEXT: [[TMP8:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_REAL]], [[TMP2]] +// IMPRVD_FAST-NEXT: [[TMP9:%.*]] = fsub reassoc nnan ninf nsz arcp afn x86_fp80 [[A_IMAG]], [[TMP8]] +// IMPRVD_FAST-NEXT: [[TMP10:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP9]], [[TMP4]] +// IMPRVD_FAST-NEXT: br label [[COMPLEX_DIV:%.*]] +// IMPRVD_FAST: abs_rhsr_less_than_abs_rhsi: +// IMPRVD_FAST-NEXT: [[TMP11:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[B_REAL]], [[B_IMAG]] +// IMPRVD_FAST-NEXT: [[TMP12:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 
[[TMP11]], [[B_REAL]] +// IMPRVD_FAST-NEXT: [[TMP13:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[B_IMAG]], [[TMP12]] +// IMPRVD_FAST-NEXT: [[TMP14:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_REAL]], [[TMP11]] +// IMPRVD_FAST-NEXT: [[TMP15:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP14]], [[A_IMAG]] +// IMPRVD_FAST-NEXT: [[TMP16:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP15]], [[TMP13]] +// IMPRVD_FAST-NEXT: [[TMP17:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_IMAG]], [[TMP11]] +// IMPRVD_FAST-NEXT: [[TMP18:%.*]] = fsub reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP17]], [[A_REAL]] +// IMPRVD_FAST-NEXT: [[TMP19:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP18]], [[TMP13]] +// IMPRVD_FAST-NEXT: br label [[COMPLEX_DIV]] +// IMPRVD_FAST: complex_div: +// IMPRVD_FAST-NEXT: [[TMP20:%.*]] = phi reassoc nnan ninf nsz arcp afn x86_fp80 [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD_FAST-NEXT: [[TMP21:%.*]] = phi reassoc nnan ninf nsz arcp afn x86_fp80 [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: store x86_fp80 [[TMP20]], ptr [[RETVAL_REALP]], align 16 +// IMPRVD_FAST-NEXT: store x86_fp80 [[TMP21]], ptr [[RETVAL_IMAGP]], align 16 +// IMPRVD_FAST-NEXT: [[TMP22:%.*]] = load { x86_fp80, x86_fp80 }, ptr [[RETVAL]], align 16 +// IMPRVD_FAST-NEXT: ret { x86_fp80, x86_fp80 } [[TMP22]] +// +// PRMTD_FAST-LABEL: define dso_local { x86_fp80, x86_fp80 } @divld( +// PRMTD_FAST-SAME: ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[A:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]]) #[[ATTR1]] { +// PRMTD_FAST-NEXT: entry: +// PRMTD_FAST-NEXT: [[RETVAL:%.*]] = alloca { x86_fp80, x86_fp80 }, align 16 +// PRMTD_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 0 +// PRMTD_FAST-NEXT: [[A_REAL:%.*]] = load x86_fp80, ptr [[A_REALP]], align 16 +// PRMTD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 1 +// PRMTD_FAST-NEXT: [[A_IMAG:%.*]] = load x86_fp80, ptr [[A_IMAGP]], align 16 +// PRMTD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// PRMTD_FAST-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// PRMTD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// PRMTD_FAST-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// PRMTD_FAST-NEXT: [[TMP0:%.*]] = call reassoc nnan ninf nsz arcp afn x86_fp80 @llvm.fabs.f80(x86_fp80 [[B_REAL]]) +// PRMTD_FAST-NEXT: [[TMP1:%.*]] = call reassoc nnan ninf nsz arcp afn x86_fp80 @llvm.fabs.f80(x86_fp80 [[B_IMAG]]) +// PRMTD_FAST-NEXT: [[ABS_CMP:%.*]] = fcmp reassoc nnan ninf nsz arcp afn ugt x86_fp80 [[TMP0]], [[TMP1]] +// PRMTD_FAST-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// PRMTD_FAST: abs_rhsr_greater_or_equal_abs_rhsi: +// PRMTD_FAST-NEXT: [[TMP2:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[B_IMAG]], [[B_REAL]] +// PRMTD_FAST-NEXT: [[TMP3:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 
[[TMP2]], [[B_IMAG]]
+// PRMTD_FAST-NEXT: [[TMP4:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[B_REAL]], [[TMP3]]
+// PRMTD_FAST-NEXT: [[TMP5:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_IMAG]], [[TMP2]]
+// PRMTD_FAST-NEXT: [[TMP6:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[A_REAL]], [[TMP5]]
+// PRMTD_FAST-NEXT: [[TMP7:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP6]], [[TMP4]]
+// PRMTD_FAST-NEXT: [[TMP8:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_REAL]], [[TMP2]]
+// PRMTD_FAST-NEXT: [[TMP9:%.*]] = fsub reassoc nnan ninf nsz arcp afn x86_fp80 [[A_IMAG]], [[TMP8]]
+// PRMTD_FAST-NEXT: [[TMP10:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP9]], [[TMP4]]
+// PRMTD_FAST-NEXT: br label [[COMPLEX_DIV:%.*]]
+// PRMTD_FAST: abs_rhsr_less_than_abs_rhsi:
+// PRMTD_FAST-NEXT: [[TMP11:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[B_REAL]], [[B_IMAG]]
+// PRMTD_FAST-NEXT: [[TMP12:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP11]], [[B_REAL]]
+// PRMTD_FAST-NEXT: [[TMP13:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[B_IMAG]], [[TMP12]]
+// PRMTD_FAST-NEXT: [[TMP14:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_REAL]], [[TMP11]]
+// PRMTD_FAST-NEXT: [[TMP15:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP14]], [[A_IMAG]]
+// PRMTD_FAST-NEXT: [[TMP16:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP15]], [[TMP13]]
+// PRMTD_FAST-NEXT: [[TMP17:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_IMAG]], [[TMP11]]
+// PRMTD_FAST-NEXT: [[TMP18:%.*]] = fsub reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP17]], [[A_REAL]]
+// PRMTD_FAST-NEXT: [[TMP19:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP18]], [[TMP13]]
+// PRMTD_FAST-NEXT: br label [[COMPLEX_DIV]]
+// PRMTD_FAST: complex_div:
+// PRMTD_FAST-NEXT: [[TMP20:%.*]] = phi reassoc nnan ninf nsz arcp afn x86_fp80 [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ]
+// PRMTD_FAST-NEXT: [[TMP21:%.*]] = phi reassoc nnan ninf nsz arcp afn x86_fp80 [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ]
+// PRMTD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 0
+// PRMTD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 1
+// PRMTD_FAST-NEXT: store x86_fp80 [[TMP20]], ptr [[RETVAL_REALP]], align 16
+// PRMTD_FAST-NEXT: store x86_fp80 [[TMP21]], ptr [[RETVAL_IMAGP]], align 16
+// PRMTD_FAST-NEXT: [[TMP22:%.*]] = load { x86_fp80, x86_fp80 }, ptr [[RETVAL]], align 16
+// PRMTD_FAST-NEXT: ret { x86_fp80, x86_fp80 } [[TMP22]]
+//
+_Complex long double divld(_Complex long double a, _Complex long double b) {
+  return a / b;
+}
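For readers skimming the generated checks: the IMPRVD and PRMTD branches of divld above encode the same Smith sequence, since x86_fp80 has no wider native type to promote into. A minimal C sketch of what the two abs_rhsr_* blocks and the complex_div phis compute follows; smith_divld, ratio, and denom are illustrative names introduced here, not identifiers from the patch.

#include <complex.h>
#include <math.h>

/* Sketch of the Smith (1962) division encoded by the check lines above. */
static _Complex long double smith_divld(long double a_re, long double a_im,
                                        long double b_re, long double b_im) {
  long double re, im;
  if (fabsl(b_re) >= fabsl(b_im)) {   /* the fcmp ugt on the two fabs calls */
    long double ratio = b_im / b_re;  /* TMP2 */
    long double denom = b_re + ratio * b_im;  /* TMP4 */
    re = (a_re + a_im * ratio) / denom;       /* TMP7 */
    im = (a_im - a_re * ratio) / denom;       /* TMP10 */
  } else {
    long double ratio = b_re / b_im;  /* TMP11 */
    long double denom = b_im + ratio * b_re;  /* TMP13 */
    re = (a_re * ratio + a_im) / denom;       /* TMP16 */
    im = (a_im * ratio - a_re) / denom;       /* TMP19 */
  }
  return CMPLXL(re, im);  /* the two phi nodes in complex_div */
}

Scaling by the ratio of the smaller to the larger divisor component keeps the intermediate denominator in range, which is why this form tolerates overflow in intermediates better than the plain algebraic expansion.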
+// FULL-LABEL: define dso_local { x86_fp80, x86_fp80 } @mulld(
+// FULL-SAME: ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[A:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]]) #[[ATTR1]] {
+// FULL-NEXT: entry:
+// FULL-NEXT: [[RETVAL:%.*]] = alloca { x86_fp80, x86_fp80 }, align 16
+// FULL-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 0
+// FULL-NEXT: [[A_REAL:%.*]] = load x86_fp80, ptr [[A_REALP]], align 16
+// FULL-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 1
+// FULL-NEXT: [[A_IMAG:%.*]] = load x86_fp80, ptr [[A_IMAGP]], align 16
+// FULL-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0
+// FULL-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16
+// FULL-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1
+// FULL-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16
+// FULL-NEXT: [[MUL_AC:%.*]] = fmul x86_fp80 [[A_REAL]], [[B_REAL]]
+// FULL-NEXT: [[MUL_BD:%.*]] = fmul x86_fp80 [[A_IMAG]], [[B_IMAG]]
+// FULL-NEXT: [[MUL_AD:%.*]] = fmul x86_fp80 [[A_REAL]], [[B_IMAG]]
+// FULL-NEXT: [[MUL_BC:%.*]] = fmul x86_fp80 [[A_IMAG]], [[B_REAL]]
+// FULL-NEXT: [[MUL_R:%.*]] = fsub x86_fp80 [[MUL_AC]], [[MUL_BD]]
+// FULL-NEXT: [[MUL_I:%.*]] = fadd x86_fp80 [[MUL_AD]], [[MUL_BC]]
+// FULL-NEXT: [[ISNAN_CMP:%.*]] = fcmp uno x86_fp80 [[MUL_R]], [[MUL_R]]
+// FULL-NEXT: br i1 [[ISNAN_CMP]], label [[COMPLEX_MUL_IMAG_NAN:%.*]], label [[COMPLEX_MUL_CONT:%.*]], !prof [[PROF2]]
+// FULL: complex_mul_imag_nan:
+// FULL-NEXT: [[ISNAN_CMP1:%.*]] = fcmp uno x86_fp80 [[MUL_I]], [[MUL_I]]
+// FULL-NEXT: br i1 [[ISNAN_CMP1]], label [[COMPLEX_MUL_LIBCALL:%.*]], label [[COMPLEX_MUL_CONT]], !prof [[PROF2]]
+// FULL: complex_mul_libcall:
+// FULL-NEXT: [[CALL:%.*]] = call { x86_fp80, x86_fp80 } @__mulxc3(x86_fp80 noundef [[A_REAL]], x86_fp80 noundef [[A_IMAG]], x86_fp80 noundef [[B_REAL]], x86_fp80 noundef [[B_IMAG]]) #[[ATTR2]]
+// FULL-NEXT: [[TMP0:%.*]] = extractvalue { x86_fp80, x86_fp80 } [[CALL]], 0
+// FULL-NEXT: [[TMP1:%.*]] = extractvalue { x86_fp80, x86_fp80 } [[CALL]], 1
+// FULL-NEXT: br label [[COMPLEX_MUL_CONT]]
+// FULL: complex_mul_cont:
+// FULL-NEXT: [[REAL_MUL_PHI:%.*]] = phi x86_fp80 [ [[MUL_R]], [[ENTRY:%.*]] ], [ [[MUL_R]], [[COMPLEX_MUL_IMAG_NAN]] ], [ [[TMP0]], [[COMPLEX_MUL_LIBCALL]] ]
+// FULL-NEXT: [[IMAG_MUL_PHI:%.*]] = phi x86_fp80 [ [[MUL_I]], [[ENTRY]] ], [ [[MUL_I]], [[COMPLEX_MUL_IMAG_NAN]] ], [ [[TMP1]], [[COMPLEX_MUL_LIBCALL]] ]
+// FULL-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 0
+// FULL-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 1
+// FULL-NEXT: store x86_fp80 [[REAL_MUL_PHI]], ptr [[RETVAL_REALP]], align 16
+// FULL-NEXT: store x86_fp80 [[IMAG_MUL_PHI]], ptr [[RETVAL_IMAGP]], align 16
+// FULL-NEXT: [[TMP2:%.*]] = load { x86_fp80, x86_fp80 }, ptr [[RETVAL]], align 16
+// FULL-NEXT: ret { x86_fp80, x86_fp80 } [[TMP2]]
+//
+// BASIC-LABEL: define dso_local { x86_fp80, x86_fp80 } @mulld(
+// BASIC-SAME: ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[A:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]]) #[[ATTR1]] {
+// BASIC-NEXT: entry:
+// BASIC-NEXT: [[RETVAL:%.*]] = alloca { x86_fp80, x86_fp80 }, align 16
+// BASIC-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 0
+// BASIC-NEXT: [[A_REAL:%.*]] = load x86_fp80, ptr [[A_REALP]], align 16
+// BASIC-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 1
+// BASIC-NEXT: [[A_IMAG:%.*]] = load x86_fp80, ptr [[A_IMAGP]], align 16
+// BASIC-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0
+// BASIC-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16
+// BASIC-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1
+// BASIC-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16
+// BASIC-NEXT: [[MUL_AC:%.*]] = fmul x86_fp80 [[A_REAL]], [[B_REAL]]
+//
BASIC-NEXT: [[MUL_BD:%.*]] = fmul x86_fp80 [[A_IMAG]], [[B_IMAG]] +// BASIC-NEXT: [[MUL_AD:%.*]] = fmul x86_fp80 [[A_REAL]], [[B_IMAG]] +// BASIC-NEXT: [[MUL_BC:%.*]] = fmul x86_fp80 [[A_IMAG]], [[B_REAL]] +// BASIC-NEXT: [[MUL_R:%.*]] = fsub x86_fp80 [[MUL_AC]], [[MUL_BD]] +// BASIC-NEXT: [[MUL_I:%.*]] = fadd x86_fp80 [[MUL_AD]], [[MUL_BC]] +// BASIC-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 0 +// BASIC-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 1 +// BASIC-NEXT: store x86_fp80 [[MUL_R]], ptr [[RETVAL_REALP]], align 16 +// BASIC-NEXT: store x86_fp80 [[MUL_I]], ptr [[RETVAL_IMAGP]], align 16 +// BASIC-NEXT: [[TMP0:%.*]] = load { x86_fp80, x86_fp80 }, ptr [[RETVAL]], align 16 +// BASIC-NEXT: ret { x86_fp80, x86_fp80 } [[TMP0]] +// +// IMPRVD-LABEL: define dso_local { x86_fp80, x86_fp80 } @mulld( +// IMPRVD-SAME: ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[A:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]]) #[[ATTR2]] { +// IMPRVD-NEXT: entry: +// IMPRVD-NEXT: [[RETVAL:%.*]] = alloca { x86_fp80, x86_fp80 }, align 16 +// IMPRVD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 0 +// IMPRVD-NEXT: [[A_REAL:%.*]] = load x86_fp80, ptr [[A_REALP]], align 16 +// IMPRVD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 1 +// IMPRVD-NEXT: [[A_IMAG:%.*]] = load x86_fp80, ptr [[A_IMAGP]], align 16 +// IMPRVD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// IMPRVD-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// IMPRVD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// IMPRVD-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// IMPRVD-NEXT: [[MUL_AC:%.*]] = fmul x86_fp80 [[A_REAL]], [[B_REAL]] +// IMPRVD-NEXT: [[MUL_BD:%.*]] = fmul x86_fp80 [[A_IMAG]], [[B_IMAG]] +// IMPRVD-NEXT: [[MUL_AD:%.*]] = fmul x86_fp80 [[A_REAL]], [[B_IMAG]] +// IMPRVD-NEXT: [[MUL_BC:%.*]] = fmul x86_fp80 [[A_IMAG]], [[B_REAL]] +// IMPRVD-NEXT: [[MUL_R:%.*]] = fsub x86_fp80 [[MUL_AC]], [[MUL_BD]] +// IMPRVD-NEXT: [[MUL_I:%.*]] = fadd x86_fp80 [[MUL_AD]], [[MUL_BC]] +// IMPRVD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 0 +// IMPRVD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 1 +// IMPRVD-NEXT: store x86_fp80 [[MUL_R]], ptr [[RETVAL_REALP]], align 16 +// IMPRVD-NEXT: store x86_fp80 [[MUL_I]], ptr [[RETVAL_IMAGP]], align 16 +// IMPRVD-NEXT: [[TMP0:%.*]] = load { x86_fp80, x86_fp80 }, ptr [[RETVAL]], align 16 +// IMPRVD-NEXT: ret { x86_fp80, x86_fp80 } [[TMP0]] +// +// PRMTD-LABEL: define dso_local { x86_fp80, x86_fp80 } @mulld( +// PRMTD-SAME: ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[A:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]]) #[[ATTR1]] { +// PRMTD-NEXT: entry: +// PRMTD-NEXT: [[RETVAL:%.*]] = alloca { x86_fp80, x86_fp80 }, align 16 +// PRMTD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 0 +// PRMTD-NEXT: [[A_REAL:%.*]] = load x86_fp80, ptr [[A_REALP]], align 16 +// PRMTD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 1 +// PRMTD-NEXT: [[A_IMAG:%.*]] = load x86_fp80, ptr [[A_IMAGP]], align 16 +// PRMTD-NEXT: [[B_REALP:%.*]] 
= getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// PRMTD-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// PRMTD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// PRMTD-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// PRMTD-NEXT: [[MUL_AC:%.*]] = fmul x86_fp80 [[A_REAL]], [[B_REAL]] +// PRMTD-NEXT: [[MUL_BD:%.*]] = fmul x86_fp80 [[A_IMAG]], [[B_IMAG]] +// PRMTD-NEXT: [[MUL_AD:%.*]] = fmul x86_fp80 [[A_REAL]], [[B_IMAG]] +// PRMTD-NEXT: [[MUL_BC:%.*]] = fmul x86_fp80 [[A_IMAG]], [[B_REAL]] +// PRMTD-NEXT: [[MUL_R:%.*]] = fsub x86_fp80 [[MUL_AC]], [[MUL_BD]] +// PRMTD-NEXT: [[MUL_I:%.*]] = fadd x86_fp80 [[MUL_AD]], [[MUL_BC]] +// PRMTD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 0 +// PRMTD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 1 +// PRMTD-NEXT: store x86_fp80 [[MUL_R]], ptr [[RETVAL_REALP]], align 16 +// PRMTD-NEXT: store x86_fp80 [[MUL_I]], ptr [[RETVAL_IMAGP]], align 16 +// PRMTD-NEXT: [[TMP0:%.*]] = load { x86_fp80, x86_fp80 }, ptr [[RETVAL]], align 16 +// PRMTD-NEXT: ret { x86_fp80, x86_fp80 } [[TMP0]] +// +// X86WINPRMTD-LABEL: define dso_local void @mulld( +// X86WINPRMTD-SAME: ptr dead_on_unwind noalias writable sret({ double, double }) align 8 [[AGG_RESULT:%.*]], ptr noundef [[A:%.*]], ptr noundef [[B:%.*]]) #[[ATTR0]] { +// X86WINPRMTD-NEXT: entry: +// X86WINPRMTD-NEXT: [[RESULT_PTR:%.*]] = alloca ptr, align 8 +// X86WINPRMTD-NEXT: [[B_INDIRECT_ADDR:%.*]] = alloca ptr, align 8 +// X86WINPRMTD-NEXT: [[A_INDIRECT_ADDR:%.*]] = alloca ptr, align 8 +// X86WINPRMTD-NEXT: store ptr [[AGG_RESULT]], ptr [[RESULT_PTR]], align 8 +// X86WINPRMTD-NEXT: store ptr [[B]], ptr [[B_INDIRECT_ADDR]], align 8 +// X86WINPRMTD-NEXT: store ptr [[A]], ptr [[A_INDIRECT_ADDR]], align 8 +// X86WINPRMTD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 8 +// X86WINPRMTD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 8 +// X86WINPRMTD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// X86WINPRMTD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// X86WINPRMTD-NEXT: [[MUL_AC:%.*]] = fmul double [[A_REAL]], [[B_REAL]] +// X86WINPRMTD-NEXT: [[MUL_BD:%.*]] = fmul double [[A_IMAG]], [[B_IMAG]] +// X86WINPRMTD-NEXT: [[MUL_AD:%.*]] = fmul double [[A_REAL]], [[B_IMAG]] +// X86WINPRMTD-NEXT: [[MUL_BC:%.*]] = fmul double [[A_IMAG]], [[B_REAL]] +// X86WINPRMTD-NEXT: [[MUL_R:%.*]] = fsub double [[MUL_AC]], [[MUL_BD]] +// X86WINPRMTD-NEXT: [[MUL_I:%.*]] = fadd double [[MUL_AD]], [[MUL_BC]] +// X86WINPRMTD-NEXT: [[AGG_RESULT_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[AGG_RESULT_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// X86WINPRMTD-NEXT: store double [[MUL_R]], ptr [[AGG_RESULT_REALP]], align 8 +// X86WINPRMTD-NEXT: store double [[MUL_I]], ptr [[AGG_RESULT_IMAGP]], align 8 +// X86WINPRMTD-NEXT: 
[[AGG_RESULT_REALP1:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[AGG_RESULT_REAL:%.*]] = load double, ptr [[AGG_RESULT_REALP1]], align 8 +// X86WINPRMTD-NEXT: [[AGG_RESULT_IMAGP2:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[AGG_RESULT_IMAG:%.*]] = load double, ptr [[AGG_RESULT_IMAGP2]], align 8 +// X86WINPRMTD-NEXT: [[AGG_RESULT_REALP3:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[AGG_RESULT_IMAGP4:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// X86WINPRMTD-NEXT: store double [[AGG_RESULT_REAL]], ptr [[AGG_RESULT_REALP3]], align 8 +// X86WINPRMTD-NEXT: store double [[AGG_RESULT_IMAG]], ptr [[AGG_RESULT_IMAGP4]], align 8 +// X86WINPRMTD-NEXT: ret void +// +// AVRFP32-LABEL: define dso_local { float, float } @mulld( +// AVRFP32-SAME: float noundef [[A_COERCE0:%.*]], float noundef [[A_COERCE1:%.*]], float noundef [[B_COERCE0:%.*]], float noundef [[B_COERCE1:%.*]]) addrspace(1) #[[ATTR0]] { +// AVRFP32-NEXT: entry: +// AVRFP32-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 1 +// AVRFP32-NEXT: [[A:%.*]] = alloca { float, float }, align 1 +// AVRFP32-NEXT: [[B:%.*]] = alloca { float, float }, align 1 +// AVRFP32-NEXT: [[TMP0:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// AVRFP32-NEXT: store float [[A_COERCE0]], ptr [[TMP0]], align 1 +// AVRFP32-NEXT: [[TMP1:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[A_COERCE1]], ptr [[TMP1]], align 1 +// AVRFP32-NEXT: [[TMP2:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// AVRFP32-NEXT: store float [[B_COERCE0]], ptr [[TMP2]], align 1 +// AVRFP32-NEXT: [[TMP3:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[B_COERCE1]], ptr [[TMP3]], align 1 +// AVRFP32-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// AVRFP32-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 1 +// AVRFP32-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// AVRFP32-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 1 +// AVRFP32-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// AVRFP32-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 1 +// AVRFP32-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// AVRFP32-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 1 +// AVRFP32-NEXT: [[MUL_AC:%.*]] = fmul float [[A_REAL]], [[B_REAL]] +// AVRFP32-NEXT: [[MUL_BD:%.*]] = fmul float [[A_IMAG]], [[B_IMAG]] +// AVRFP32-NEXT: [[MUL_AD:%.*]] = fmul float [[A_REAL]], [[B_IMAG]] +// AVRFP32-NEXT: [[MUL_BC:%.*]] = fmul float [[A_IMAG]], [[B_REAL]] +// AVRFP32-NEXT: [[MUL_R:%.*]] = fsub float [[MUL_AC]], [[MUL_BD]] +// AVRFP32-NEXT: [[MUL_I:%.*]] = fadd float [[MUL_AD]], [[MUL_BC]] +// AVRFP32-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// AVRFP32-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[MUL_R]], ptr [[RETVAL_REALP]], align 1 +// AVRFP32-NEXT: store float [[MUL_I]], ptr [[RETVAL_IMAGP]], align 1 +// AVRFP32-NEXT: [[TMP4:%.*]] = load { float, float }, ptr [[RETVAL]], align 1 
+// AVRFP32-NEXT: ret { float, float } [[TMP4]] +// +// AVRFP64-LABEL: define dso_local void @mulld( +// AVRFP64-SAME: ptr dead_on_unwind noalias writable sret({ double, double }) align 1 [[AGG_RESULT:%.*]], double noundef [[A_COERCE0:%.*]], double noundef [[A_COERCE1:%.*]], double noundef [[B_COERCE0:%.*]], double noundef [[B_COERCE1:%.*]]) addrspace(1) #[[ATTR0]] { +// AVRFP64-NEXT: entry: +// AVRFP64-NEXT: [[A:%.*]] = alloca { double, double }, align 1 +// AVRFP64-NEXT: [[B:%.*]] = alloca { double, double }, align 1 +// AVRFP64-NEXT: [[TMP0:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// AVRFP64-NEXT: store double [[A_COERCE0]], ptr [[TMP0]], align 1 +// AVRFP64-NEXT: [[TMP1:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// AVRFP64-NEXT: store double [[A_COERCE1]], ptr [[TMP1]], align 1 +// AVRFP64-NEXT: [[TMP2:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// AVRFP64-NEXT: store double [[B_COERCE0]], ptr [[TMP2]], align 1 +// AVRFP64-NEXT: [[TMP3:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// AVRFP64-NEXT: store double [[B_COERCE1]], ptr [[TMP3]], align 1 +// AVRFP64-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 0 +// AVRFP64-NEXT: [[A_REAL:%.*]] = load double, ptr [[A_REALP]], align 1 +// AVRFP64-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[A]], i32 0, i32 1 +// AVRFP64-NEXT: [[A_IMAG:%.*]] = load double, ptr [[A_IMAGP]], align 1 +// AVRFP64-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// AVRFP64-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 1 +// AVRFP64-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// AVRFP64-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 1 +// AVRFP64-NEXT: [[MUL_AC:%.*]] = fmul double [[A_REAL]], [[B_REAL]] +// AVRFP64-NEXT: [[MUL_BD:%.*]] = fmul double [[A_IMAG]], [[B_IMAG]] +// AVRFP64-NEXT: [[MUL_AD:%.*]] = fmul double [[A_REAL]], [[B_IMAG]] +// AVRFP64-NEXT: [[MUL_BC:%.*]] = fmul double [[A_IMAG]], [[B_REAL]] +// AVRFP64-NEXT: [[MUL_R:%.*]] = fsub double [[MUL_AC]], [[MUL_BD]] +// AVRFP64-NEXT: [[MUL_I:%.*]] = fadd double [[MUL_AD]], [[MUL_BC]] +// AVRFP64-NEXT: [[AGG_RESULT_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// AVRFP64-NEXT: [[AGG_RESULT_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// AVRFP64-NEXT: store double [[MUL_R]], ptr [[AGG_RESULT_REALP]], align 1 +// AVRFP64-NEXT: store double [[MUL_I]], ptr [[AGG_RESULT_IMAGP]], align 1 +// AVRFP64-NEXT: [[AGG_RESULT_REALP1:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// AVRFP64-NEXT: [[AGG_RESULT_REAL:%.*]] = load double, ptr [[AGG_RESULT_REALP1]], align 1 +// AVRFP64-NEXT: [[AGG_RESULT_IMAGP2:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// AVRFP64-NEXT: [[AGG_RESULT_IMAG:%.*]] = load double, ptr [[AGG_RESULT_IMAGP2]], align 1 +// AVRFP64-NEXT: [[AGG_RESULT_REALP3:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 0 +// AVRFP64-NEXT: [[AGG_RESULT_IMAGP4:%.*]] = getelementptr inbounds { double, double }, ptr [[AGG_RESULT]], i32 0, i32 1 +// AVRFP64-NEXT: store double [[AGG_RESULT_REAL]], ptr [[AGG_RESULT_REALP3]], align 1 +// AVRFP64-NEXT: store double [[AGG_RESULT_IMAG]], ptr [[AGG_RESULT_IMAGP4]], 
align 1 +// AVRFP64-NEXT: ret void +// +// BASIC_FAST-LABEL: define dso_local { x86_fp80, x86_fp80 } @mulld( +// BASIC_FAST-SAME: ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[A:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]]) #[[ATTR1]] { +// BASIC_FAST-NEXT: entry: +// BASIC_FAST-NEXT: [[RETVAL:%.*]] = alloca { x86_fp80, x86_fp80 }, align 16 +// BASIC_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[A_REAL:%.*]] = load x86_fp80, ptr [[A_REALP]], align 16 +// BASIC_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 1 +// BASIC_FAST-NEXT: [[A_IMAG:%.*]] = load x86_fp80, ptr [[A_IMAGP]], align 16 +// BASIC_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// BASIC_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// BASIC_FAST-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// BASIC_FAST-NEXT: [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_REAL]], [[B_REAL]] +// BASIC_FAST-NEXT: [[MUL_BD:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_IMAG]], [[B_IMAG]] +// BASIC_FAST-NEXT: [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_REAL]], [[B_IMAG]] +// BASIC_FAST-NEXT: [[MUL_BC:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_IMAG]], [[B_REAL]] +// BASIC_FAST-NEXT: [[MUL_R:%.*]] = fsub reassoc nnan ninf nsz arcp afn x86_fp80 [[MUL_AC]], [[MUL_BD]] +// BASIC_FAST-NEXT: [[MUL_I:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[MUL_AD]], [[MUL_BC]] +// BASIC_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 1 +// BASIC_FAST-NEXT: store x86_fp80 [[MUL_R]], ptr [[RETVAL_REALP]], align 16 +// BASIC_FAST-NEXT: store x86_fp80 [[MUL_I]], ptr [[RETVAL_IMAGP]], align 16 +// BASIC_FAST-NEXT: [[TMP0:%.*]] = load { x86_fp80, x86_fp80 }, ptr [[RETVAL]], align 16 +// BASIC_FAST-NEXT: ret { x86_fp80, x86_fp80 } [[TMP0]] +// +// FULL_FAST-LABEL: define dso_local { x86_fp80, x86_fp80 } @mulld( +// FULL_FAST-SAME: ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[A:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]]) #[[ATTR1]] { +// FULL_FAST-NEXT: entry: +// FULL_FAST-NEXT: [[RETVAL:%.*]] = alloca { x86_fp80, x86_fp80 }, align 16 +// FULL_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 0 +// FULL_FAST-NEXT: [[A_REAL:%.*]] = load x86_fp80, ptr [[A_REALP]], align 16 +// FULL_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 1 +// FULL_FAST-NEXT: [[A_IMAG:%.*]] = load x86_fp80, ptr [[A_IMAGP]], align 16 +// FULL_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// FULL_FAST-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// FULL_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// FULL_FAST-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// FULL_FAST-NEXT: [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_REAL]], [[B_REAL]] +// FULL_FAST-NEXT: [[MUL_BD:%.*]] = fmul reassoc nnan ninf 
nsz arcp afn x86_fp80 [[A_IMAG]], [[B_IMAG]] +// FULL_FAST-NEXT: [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_REAL]], [[B_IMAG]] +// FULL_FAST-NEXT: [[MUL_BC:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_IMAG]], [[B_REAL]] +// FULL_FAST-NEXT: [[MUL_R:%.*]] = fsub reassoc nnan ninf nsz arcp afn x86_fp80 [[MUL_AC]], [[MUL_BD]] +// FULL_FAST-NEXT: [[MUL_I:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[MUL_AD]], [[MUL_BC]] +// FULL_FAST-NEXT: [[ISNAN_CMP:%.*]] = fcmp reassoc nnan ninf nsz arcp afn uno x86_fp80 [[MUL_R]], [[MUL_R]] +// FULL_FAST-NEXT: br i1 [[ISNAN_CMP]], label [[COMPLEX_MUL_IMAG_NAN:%.*]], label [[COMPLEX_MUL_CONT:%.*]], !prof [[PROF2]] +// FULL_FAST: complex_mul_imag_nan: +// FULL_FAST-NEXT: [[ISNAN_CMP1:%.*]] = fcmp reassoc nnan ninf nsz arcp afn uno x86_fp80 [[MUL_I]], [[MUL_I]] +// FULL_FAST-NEXT: br i1 [[ISNAN_CMP1]], label [[COMPLEX_MUL_LIBCALL:%.*]], label [[COMPLEX_MUL_CONT]], !prof [[PROF2]] +// FULL_FAST: complex_mul_libcall: +// FULL_FAST-NEXT: [[CALL:%.*]] = call { x86_fp80, x86_fp80 } @__mulxc3(x86_fp80 noundef nofpclass(nan inf) [[A_REAL]], x86_fp80 noundef nofpclass(nan inf) [[A_IMAG]], x86_fp80 noundef nofpclass(nan inf) [[B_REAL]], x86_fp80 noundef nofpclass(nan inf) [[B_IMAG]]) #[[ATTR2]] +// FULL_FAST-NEXT: [[TMP0:%.*]] = extractvalue { x86_fp80, x86_fp80 } [[CALL]], 0 +// FULL_FAST-NEXT: [[TMP1:%.*]] = extractvalue { x86_fp80, x86_fp80 } [[CALL]], 1 +// FULL_FAST-NEXT: br label [[COMPLEX_MUL_CONT]] +// FULL_FAST: complex_mul_cont: +// FULL_FAST-NEXT: [[REAL_MUL_PHI:%.*]] = phi reassoc nnan ninf nsz arcp afn x86_fp80 [ [[MUL_R]], [[ENTRY:%.*]] ], [ [[MUL_R]], [[COMPLEX_MUL_IMAG_NAN]] ], [ [[TMP0]], [[COMPLEX_MUL_LIBCALL]] ] +// FULL_FAST-NEXT: [[IMAG_MUL_PHI:%.*]] = phi reassoc nnan ninf nsz arcp afn x86_fp80 [ [[MUL_I]], [[ENTRY]] ], [ [[MUL_I]], [[COMPLEX_MUL_IMAG_NAN]] ], [ [[TMP1]], [[COMPLEX_MUL_LIBCALL]] ] +// FULL_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 0 +// FULL_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 1 +// FULL_FAST-NEXT: store x86_fp80 [[REAL_MUL_PHI]], ptr [[RETVAL_REALP]], align 16 +// FULL_FAST-NEXT: store x86_fp80 [[IMAG_MUL_PHI]], ptr [[RETVAL_IMAGP]], align 16 +// FULL_FAST-NEXT: [[TMP2:%.*]] = load { x86_fp80, x86_fp80 }, ptr [[RETVAL]], align 16 +// FULL_FAST-NEXT: ret { x86_fp80, x86_fp80 } [[TMP2]] +// +// IMPRVD_FAST-LABEL: define dso_local { x86_fp80, x86_fp80 } @mulld( +// IMPRVD_FAST-SAME: ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[A:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]]) #[[ATTR2]] { +// IMPRVD_FAST-NEXT: entry: +// IMPRVD_FAST-NEXT: [[RETVAL:%.*]] = alloca { x86_fp80, x86_fp80 }, align 16 +// IMPRVD_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[A_REAL:%.*]] = load x86_fp80, ptr [[A_REALP]], align 16 +// IMPRVD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: [[A_IMAG:%.*]] = load x86_fp80, ptr [[A_IMAGP]], align 16 +// IMPRVD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// IMPRVD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr 
[[B_IMAGP]], align 16
+// IMPRVD_FAST-NEXT: [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_REAL]], [[B_REAL]]
+// IMPRVD_FAST-NEXT: [[MUL_BD:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_IMAG]], [[B_IMAG]]
+// IMPRVD_FAST-NEXT: [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_REAL]], [[B_IMAG]]
+// IMPRVD_FAST-NEXT: [[MUL_BC:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_IMAG]], [[B_REAL]]
+// IMPRVD_FAST-NEXT: [[MUL_R:%.*]] = fsub reassoc nnan ninf nsz arcp afn x86_fp80 [[MUL_AC]], [[MUL_BD]]
+// IMPRVD_FAST-NEXT: [[MUL_I:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[MUL_AD]], [[MUL_BC]]
+// IMPRVD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 0
+// IMPRVD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 1
+// IMPRVD_FAST-NEXT: store x86_fp80 [[MUL_R]], ptr [[RETVAL_REALP]], align 16
+// IMPRVD_FAST-NEXT: store x86_fp80 [[MUL_I]], ptr [[RETVAL_IMAGP]], align 16
+// IMPRVD_FAST-NEXT: [[TMP0:%.*]] = load { x86_fp80, x86_fp80 }, ptr [[RETVAL]], align 16
+// IMPRVD_FAST-NEXT: ret { x86_fp80, x86_fp80 } [[TMP0]]
+//
+// PRMTD_FAST-LABEL: define dso_local { x86_fp80, x86_fp80 } @mulld(
+// PRMTD_FAST-SAME: ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[A:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]]) #[[ATTR1]] {
+// PRMTD_FAST-NEXT: entry:
+// PRMTD_FAST-NEXT: [[RETVAL:%.*]] = alloca { x86_fp80, x86_fp80 }, align 16
+// PRMTD_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 0
+// PRMTD_FAST-NEXT: [[A_REAL:%.*]] = load x86_fp80, ptr [[A_REALP]], align 16
+// PRMTD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[A]], i32 0, i32 1
+// PRMTD_FAST-NEXT: [[A_IMAG:%.*]] = load x86_fp80, ptr [[A_IMAGP]], align 16
+// PRMTD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0
+// PRMTD_FAST-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16
+// PRMTD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1
+// PRMTD_FAST-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16
+// PRMTD_FAST-NEXT: [[MUL_AC:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_REAL]], [[B_REAL]]
+// PRMTD_FAST-NEXT: [[MUL_BD:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_IMAG]], [[B_IMAG]]
+// PRMTD_FAST-NEXT: [[MUL_AD:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_REAL]], [[B_IMAG]]
+// PRMTD_FAST-NEXT: [[MUL_BC:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[A_IMAG]], [[B_REAL]]
+// PRMTD_FAST-NEXT: [[MUL_R:%.*]] = fsub reassoc nnan ninf nsz arcp afn x86_fp80 [[MUL_AC]], [[MUL_BD]]
+// PRMTD_FAST-NEXT: [[MUL_I:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[MUL_AD]], [[MUL_BC]]
+// PRMTD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 0
+// PRMTD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[RETVAL]], i32 0, i32 1
+// PRMTD_FAST-NEXT: store x86_fp80 [[MUL_R]], ptr [[RETVAL_REALP]], align 16
+// PRMTD_FAST-NEXT: store x86_fp80 [[MUL_I]], ptr [[RETVAL_IMAGP]], align 16
+// PRMTD_FAST-NEXT: [[TMP0:%.*]] = load { x86_fp80, x86_fp80 }, ptr [[RETVAL]], align 16
+// PRMTD_FAST-NEXT: ret { x86_fp80, x86_fp80 } [[TMP0]]
+//
+_Complex long double mulld(_Complex long double a, _Complex long double b) { return a * b; }
+
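The FULL-mode checks for mulld above show the multiplication done inline first, with the @__mulxc3 call taken only when the fast path produces NaN in both components. A hedged C sketch of that control flow, under the assumption that the compiler-rt helper is declared as below; full_mulld is an illustrative name, not from the patch.

#include <complex.h>
#include <math.h>

/* NaN/infinity-aware complex multiply from compiler-rt/libgcc;
   declared here only for the sketch. */
_Complex long double __mulxc3(long double, long double,
                              long double, long double);

static _Complex long double full_mulld(long double a_re, long double a_im,
                                       long double b_re, long double b_im) {
  long double r = a_re * b_re - a_im * b_im;  /* MUL_AC - MUL_BD */
  long double i = a_re * b_im + a_im * b_re;  /* MUL_AD + MUL_BC */
  /* The two fcmp uno branches: only when both parts compare unordered
     with themselves (i.e. are NaN) does the slow library path run. */
  if (isnan(r) && isnan(i))
    return __mulxc3(a_re, a_im, b_re, b_im);
  return CMPLXL(r, i);
}

Doing the cheap multiply unconditionally and branching to the runtime only on the (profiled-unlikely) NaN path is what lets FULL mode keep the common case at the cost of a couple of compares.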
+// FULL-LABEL: define dso_local <2 x float> @f1(
+// FULL-SAME: <2 x float> noundef [[A_COERCE:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]], <2 x float> noundef [[C_COERCE:%.*]]) #[[ATTR0]] {
+// FULL-NEXT: entry:
+// FULL-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4
+// FULL-NEXT: [[A:%.*]] = alloca { float, float }, align 4
+// FULL-NEXT: [[C:%.*]] = alloca { float, float }, align 4
+// FULL-NEXT: [[COERCE:%.*]] = alloca { float, float }, align 4
+// FULL-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4
+// FULL-NEXT: store <2 x float> [[C_COERCE]], ptr [[C]], align 4
+// FULL-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0
+// FULL-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16
+// FULL-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1
+// FULL-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16
+// FULL-NEXT: [[C_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 0
+// FULL-NEXT: [[C_REAL:%.*]] = load float, ptr [[C_REALP]], align 4
+// FULL-NEXT: [[C_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 1
+// FULL-NEXT: [[C_IMAG:%.*]] = load float, ptr [[C_IMAGP]], align 4
+// FULL-NEXT: [[CONV:%.*]] = fpext float [[C_REAL]] to x86_fp80
+// FULL-NEXT: [[CONV1:%.*]] = fpext float [[C_IMAG]] to x86_fp80
+// FULL-NEXT: [[CALL:%.*]] = call { x86_fp80, x86_fp80 } @__divxc3(x86_fp80 noundef [[B_REAL]], x86_fp80 noundef [[B_IMAG]], x86_fp80 noundef [[CONV]], x86_fp80 noundef [[CONV1]]) #[[ATTR2]]
+// FULL-NEXT: [[TMP0:%.*]] = extractvalue { x86_fp80, x86_fp80 } [[CALL]], 0
+// FULL-NEXT: [[TMP1:%.*]] = extractvalue { x86_fp80, x86_fp80 } [[CALL]], 1
+// FULL-NEXT: [[CONV2:%.*]] = fptrunc x86_fp80 [[TMP0]] to float
+// FULL-NEXT: [[CONV3:%.*]] = fptrunc x86_fp80 [[TMP1]] to float
+// FULL-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0
+// FULL-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4
+// FULL-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1
+// FULL-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4
+// FULL-NEXT: [[CALL4:%.*]] = call <2 x float> @__divsc3(float noundef [[CONV2]], float noundef [[CONV3]], float noundef [[A_REAL]], float noundef [[A_IMAG]]) #[[ATTR2]]
+// FULL-NEXT: store <2 x float> [[CALL4]], ptr [[COERCE]], align 4
+// FULL-NEXT: [[COERCE_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 0
+// FULL-NEXT: [[COERCE_REAL:%.*]] = load float, ptr [[COERCE_REALP]], align 4
+// FULL-NEXT: [[COERCE_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 1
+// FULL-NEXT: [[COERCE_IMAG:%.*]] = load float, ptr [[COERCE_IMAGP]], align 4
+// FULL-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0
+// FULL-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1
+// FULL-NEXT: store float [[COERCE_REAL]], ptr [[RETVAL_REALP]], align 4
+// FULL-NEXT: store float [[COERCE_IMAG]], ptr [[RETVAL_IMAGP]], align 4
+// FULL-NEXT: [[TMP2:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4
+// FULL-NEXT: ret <2 x float> [[TMP2]]
+//
+// BASIC-LABEL: define dso_local <2 x float> @f1(
+// BASIC-SAME: <2 x float> noundef [[A_COERCE:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]], <2 x float>
noundef [[C_COERCE:%.*]]) #[[ATTR0]] { +// BASIC-NEXT: entry: +// BASIC-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// BASIC-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// BASIC-NEXT: [[C:%.*]] = alloca { float, float }, align 4 +// BASIC-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4 +// BASIC-NEXT: store <2 x float> [[C_COERCE]], ptr [[C]], align 4 +// BASIC-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// BASIC-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// BASIC-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// BASIC-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// BASIC-NEXT: [[C_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 0 +// BASIC-NEXT: [[C_REAL:%.*]] = load float, ptr [[C_REALP]], align 4 +// BASIC-NEXT: [[C_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 1 +// BASIC-NEXT: [[C_IMAG:%.*]] = load float, ptr [[C_IMAGP]], align 4 +// BASIC-NEXT: [[CONV:%.*]] = fpext float [[C_REAL]] to x86_fp80 +// BASIC-NEXT: [[CONV1:%.*]] = fpext float [[C_IMAG]] to x86_fp80 +// BASIC-NEXT: [[TMP0:%.*]] = fmul x86_fp80 [[B_REAL]], [[CONV]] +// BASIC-NEXT: [[TMP1:%.*]] = fmul x86_fp80 [[B_IMAG]], [[CONV1]] +// BASIC-NEXT: [[TMP2:%.*]] = fadd x86_fp80 [[TMP0]], [[TMP1]] +// BASIC-NEXT: [[TMP3:%.*]] = fmul x86_fp80 [[CONV]], [[CONV]] +// BASIC-NEXT: [[TMP4:%.*]] = fmul x86_fp80 [[CONV1]], [[CONV1]] +// BASIC-NEXT: [[TMP5:%.*]] = fadd x86_fp80 [[TMP3]], [[TMP4]] +// BASIC-NEXT: [[TMP6:%.*]] = fmul x86_fp80 [[B_IMAG]], [[CONV]] +// BASIC-NEXT: [[TMP7:%.*]] = fmul x86_fp80 [[B_REAL]], [[CONV1]] +// BASIC-NEXT: [[TMP8:%.*]] = fsub x86_fp80 [[TMP6]], [[TMP7]] +// BASIC-NEXT: [[TMP9:%.*]] = fdiv x86_fp80 [[TMP2]], [[TMP5]] +// BASIC-NEXT: [[TMP10:%.*]] = fdiv x86_fp80 [[TMP8]], [[TMP5]] +// BASIC-NEXT: [[CONV2:%.*]] = fptrunc x86_fp80 [[TMP9]] to float +// BASIC-NEXT: [[CONV3:%.*]] = fptrunc x86_fp80 [[TMP10]] to float +// BASIC-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// BASIC-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// BASIC-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// BASIC-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// BASIC-NEXT: [[TMP11:%.*]] = fmul float [[CONV2]], [[A_REAL]] +// BASIC-NEXT: [[TMP12:%.*]] = fmul float [[CONV3]], [[A_IMAG]] +// BASIC-NEXT: [[TMP13:%.*]] = fadd float [[TMP11]], [[TMP12]] +// BASIC-NEXT: [[TMP14:%.*]] = fmul float [[A_REAL]], [[A_REAL]] +// BASIC-NEXT: [[TMP15:%.*]] = fmul float [[A_IMAG]], [[A_IMAG]] +// BASIC-NEXT: [[TMP16:%.*]] = fadd float [[TMP14]], [[TMP15]] +// BASIC-NEXT: [[TMP17:%.*]] = fmul float [[CONV3]], [[A_REAL]] +// BASIC-NEXT: [[TMP18:%.*]] = fmul float [[CONV2]], [[A_IMAG]] +// BASIC-NEXT: [[TMP19:%.*]] = fsub float [[TMP17]], [[TMP18]] +// BASIC-NEXT: [[TMP20:%.*]] = fdiv float [[TMP13]], [[TMP16]] +// BASIC-NEXT: [[TMP21:%.*]] = fdiv float [[TMP19]], [[TMP16]] +// BASIC-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// BASIC-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// BASIC-NEXT: store float [[TMP20]], ptr [[RETVAL_REALP]], align 4 +// BASIC-NEXT: store float [[TMP21]], ptr [[RETVAL_IMAGP]], align 4 +// BASIC-NEXT: [[TMP22:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4 +// 
BASIC-NEXT: ret <2 x float> [[TMP22]] +// +// IMPRVD-LABEL: define dso_local <2 x float> @f1( +// IMPRVD-SAME: <2 x float> noundef [[A_COERCE:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]], <2 x float> noundef [[C_COERCE:%.*]]) #[[ATTR0]] { +// IMPRVD-NEXT: entry: +// IMPRVD-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// IMPRVD-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// IMPRVD-NEXT: [[C:%.*]] = alloca { float, float }, align 4 +// IMPRVD-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4 +// IMPRVD-NEXT: store <2 x float> [[C_COERCE]], ptr [[C]], align 4 +// IMPRVD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// IMPRVD-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// IMPRVD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// IMPRVD-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// IMPRVD-NEXT: [[C_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 0 +// IMPRVD-NEXT: [[C_REAL:%.*]] = load float, ptr [[C_REALP]], align 4 +// IMPRVD-NEXT: [[C_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 1 +// IMPRVD-NEXT: [[C_IMAG:%.*]] = load float, ptr [[C_IMAGP]], align 4 +// IMPRVD-NEXT: [[CONV:%.*]] = fpext float [[C_REAL]] to x86_fp80 +// IMPRVD-NEXT: [[CONV1:%.*]] = fpext float [[C_IMAG]] to x86_fp80 +// IMPRVD-NEXT: [[TMP0:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV]]) +// IMPRVD-NEXT: [[TMP1:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV1]]) +// IMPRVD-NEXT: [[ABS_CMP:%.*]] = fcmp ugt x86_fp80 [[TMP0]], [[TMP1]] +// IMPRVD-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// IMPRVD: abs_rhsr_greater_or_equal_abs_rhsi: +// IMPRVD-NEXT: [[TMP2:%.*]] = fdiv x86_fp80 [[CONV1]], [[CONV]] +// IMPRVD-NEXT: [[TMP3:%.*]] = fmul x86_fp80 [[TMP2]], [[CONV1]] +// IMPRVD-NEXT: [[TMP4:%.*]] = fadd x86_fp80 [[CONV]], [[TMP3]] +// IMPRVD-NEXT: [[TMP5:%.*]] = fmul x86_fp80 [[B_IMAG]], [[TMP2]] +// IMPRVD-NEXT: [[TMP6:%.*]] = fadd x86_fp80 [[B_REAL]], [[TMP5]] +// IMPRVD-NEXT: [[TMP7:%.*]] = fdiv x86_fp80 [[TMP6]], [[TMP4]] +// IMPRVD-NEXT: [[TMP8:%.*]] = fmul x86_fp80 [[B_REAL]], [[TMP2]] +// IMPRVD-NEXT: [[TMP9:%.*]] = fsub x86_fp80 [[B_IMAG]], [[TMP8]] +// IMPRVD-NEXT: [[TMP10:%.*]] = fdiv x86_fp80 [[TMP9]], [[TMP4]] +// IMPRVD-NEXT: br label [[COMPLEX_DIV:%.*]] +// IMPRVD: abs_rhsr_less_than_abs_rhsi: +// IMPRVD-NEXT: [[TMP11:%.*]] = fdiv x86_fp80 [[CONV]], [[CONV1]] +// IMPRVD-NEXT: [[TMP12:%.*]] = fmul x86_fp80 [[TMP11]], [[CONV]] +// IMPRVD-NEXT: [[TMP13:%.*]] = fadd x86_fp80 [[CONV1]], [[TMP12]] +// IMPRVD-NEXT: [[TMP14:%.*]] = fmul x86_fp80 [[B_REAL]], [[TMP11]] +// IMPRVD-NEXT: [[TMP15:%.*]] = fadd x86_fp80 [[TMP14]], [[B_IMAG]] +// IMPRVD-NEXT: [[TMP16:%.*]] = fdiv x86_fp80 [[TMP15]], [[TMP13]] +// IMPRVD-NEXT: [[TMP17:%.*]] = fmul x86_fp80 [[B_IMAG]], [[TMP11]] +// IMPRVD-NEXT: [[TMP18:%.*]] = fsub x86_fp80 [[TMP17]], [[B_REAL]] +// IMPRVD-NEXT: [[TMP19:%.*]] = fdiv x86_fp80 [[TMP18]], [[TMP13]] +// IMPRVD-NEXT: br label [[COMPLEX_DIV]] +// IMPRVD: complex_div: +// IMPRVD-NEXT: [[TMP20:%.*]] = phi x86_fp80 [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD-NEXT: [[TMP21:%.*]] = phi x86_fp80 [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD-NEXT: [[CONV2:%.*]] = 
fptrunc x86_fp80 [[TMP20]] to float +// IMPRVD-NEXT: [[CONV3:%.*]] = fptrunc x86_fp80 [[TMP21]] to float +// IMPRVD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// IMPRVD-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// IMPRVD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// IMPRVD-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// IMPRVD-NEXT: [[TMP22:%.*]] = call float @llvm.fabs.f32(float [[A_REAL]]) +// IMPRVD-NEXT: [[TMP23:%.*]] = call float @llvm.fabs.f32(float [[A_IMAG]]) +// IMPRVD-NEXT: [[ABS_CMP4:%.*]] = fcmp ugt float [[TMP22]], [[TMP23]] +// IMPRVD-NEXT: br i1 [[ABS_CMP4]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI5:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI6:%.*]] +// IMPRVD: abs_rhsr_greater_or_equal_abs_rhsi5: +// IMPRVD-NEXT: [[TMP24:%.*]] = fdiv float [[A_IMAG]], [[A_REAL]] +// IMPRVD-NEXT: [[TMP25:%.*]] = fmul float [[TMP24]], [[A_IMAG]] +// IMPRVD-NEXT: [[TMP26:%.*]] = fadd float [[A_REAL]], [[TMP25]] +// IMPRVD-NEXT: [[TMP27:%.*]] = fmul float [[CONV3]], [[TMP24]] +// IMPRVD-NEXT: [[TMP28:%.*]] = fadd float [[CONV2]], [[TMP27]] +// IMPRVD-NEXT: [[TMP29:%.*]] = fdiv float [[TMP28]], [[TMP26]] +// IMPRVD-NEXT: [[TMP30:%.*]] = fmul float [[CONV2]], [[TMP24]] +// IMPRVD-NEXT: [[TMP31:%.*]] = fsub float [[CONV3]], [[TMP30]] +// IMPRVD-NEXT: [[TMP32:%.*]] = fdiv float [[TMP31]], [[TMP26]] +// IMPRVD-NEXT: br label [[COMPLEX_DIV7:%.*]] +// IMPRVD: abs_rhsr_less_than_abs_rhsi6: +// IMPRVD-NEXT: [[TMP33:%.*]] = fdiv float [[A_REAL]], [[A_IMAG]] +// IMPRVD-NEXT: [[TMP34:%.*]] = fmul float [[TMP33]], [[A_REAL]] +// IMPRVD-NEXT: [[TMP35:%.*]] = fadd float [[A_IMAG]], [[TMP34]] +// IMPRVD-NEXT: [[TMP36:%.*]] = fmul float [[CONV2]], [[TMP33]] +// IMPRVD-NEXT: [[TMP37:%.*]] = fadd float [[TMP36]], [[CONV3]] +// IMPRVD-NEXT: [[TMP38:%.*]] = fdiv float [[TMP37]], [[TMP35]] +// IMPRVD-NEXT: [[TMP39:%.*]] = fmul float [[CONV3]], [[TMP33]] +// IMPRVD-NEXT: [[TMP40:%.*]] = fsub float [[TMP39]], [[CONV2]] +// IMPRVD-NEXT: [[TMP41:%.*]] = fdiv float [[TMP40]], [[TMP35]] +// IMPRVD-NEXT: br label [[COMPLEX_DIV7]] +// IMPRVD: complex_div7: +// IMPRVD-NEXT: [[TMP42:%.*]] = phi float [ [[TMP29]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI5]] ], [ [[TMP38]], [[ABS_RHSR_LESS_THAN_ABS_RHSI6]] ] +// IMPRVD-NEXT: [[TMP43:%.*]] = phi float [ [[TMP32]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI5]] ], [ [[TMP41]], [[ABS_RHSR_LESS_THAN_ABS_RHSI6]] ] +// IMPRVD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// IMPRVD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// IMPRVD-NEXT: store float [[TMP42]], ptr [[RETVAL_REALP]], align 4 +// IMPRVD-NEXT: store float [[TMP43]], ptr [[RETVAL_IMAGP]], align 4 +// IMPRVD-NEXT: [[TMP44:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4 +// IMPRVD-NEXT: ret <2 x float> [[TMP44]] +// +// PRMTD-LABEL: define dso_local <2 x float> @f1( +// PRMTD-SAME: <2 x float> noundef [[A_COERCE:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]], <2 x float> noundef [[C_COERCE:%.*]]) #[[ATTR0]] { +// PRMTD-NEXT: entry: +// PRMTD-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// PRMTD-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// PRMTD-NEXT: [[C:%.*]] = alloca { float, float }, align 4 +// PRMTD-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4 +// PRMTD-NEXT: store <2 x float> [[C_COERCE]], ptr [[C]], align 4 +// PRMTD-NEXT: 
[[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// PRMTD-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// PRMTD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// PRMTD-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// PRMTD-NEXT: [[C_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 0 +// PRMTD-NEXT: [[C_REAL:%.*]] = load float, ptr [[C_REALP]], align 4 +// PRMTD-NEXT: [[C_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 1 +// PRMTD-NEXT: [[C_IMAG:%.*]] = load float, ptr [[C_IMAGP]], align 4 +// PRMTD-NEXT: [[CONV:%.*]] = fpext float [[C_REAL]] to x86_fp80 +// PRMTD-NEXT: [[CONV1:%.*]] = fpext float [[C_IMAG]] to x86_fp80 +// PRMTD-NEXT: [[TMP0:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV]]) +// PRMTD-NEXT: [[TMP1:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV1]]) +// PRMTD-NEXT: [[ABS_CMP:%.*]] = fcmp ugt x86_fp80 [[TMP0]], [[TMP1]] +// PRMTD-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// PRMTD: abs_rhsr_greater_or_equal_abs_rhsi: +// PRMTD-NEXT: [[TMP2:%.*]] = fdiv x86_fp80 [[CONV1]], [[CONV]] +// PRMTD-NEXT: [[TMP3:%.*]] = fmul x86_fp80 [[TMP2]], [[CONV1]] +// PRMTD-NEXT: [[TMP4:%.*]] = fadd x86_fp80 [[CONV]], [[TMP3]] +// PRMTD-NEXT: [[TMP5:%.*]] = fmul x86_fp80 [[B_IMAG]], [[TMP2]] +// PRMTD-NEXT: [[TMP6:%.*]] = fadd x86_fp80 [[B_REAL]], [[TMP5]] +// PRMTD-NEXT: [[TMP7:%.*]] = fdiv x86_fp80 [[TMP6]], [[TMP4]] +// PRMTD-NEXT: [[TMP8:%.*]] = fmul x86_fp80 [[B_REAL]], [[TMP2]] +// PRMTD-NEXT: [[TMP9:%.*]] = fsub x86_fp80 [[B_IMAG]], [[TMP8]] +// PRMTD-NEXT: [[TMP10:%.*]] = fdiv x86_fp80 [[TMP9]], [[TMP4]] +// PRMTD-NEXT: br label [[COMPLEX_DIV:%.*]] +// PRMTD: abs_rhsr_less_than_abs_rhsi: +// PRMTD-NEXT: [[TMP11:%.*]] = fdiv x86_fp80 [[CONV]], [[CONV1]] +// PRMTD-NEXT: [[TMP12:%.*]] = fmul x86_fp80 [[TMP11]], [[CONV]] +// PRMTD-NEXT: [[TMP13:%.*]] = fadd x86_fp80 [[CONV1]], [[TMP12]] +// PRMTD-NEXT: [[TMP14:%.*]] = fmul x86_fp80 [[B_REAL]], [[TMP11]] +// PRMTD-NEXT: [[TMP15:%.*]] = fadd x86_fp80 [[TMP14]], [[B_IMAG]] +// PRMTD-NEXT: [[TMP16:%.*]] = fdiv x86_fp80 [[TMP15]], [[TMP13]] +// PRMTD-NEXT: [[TMP17:%.*]] = fmul x86_fp80 [[B_IMAG]], [[TMP11]] +// PRMTD-NEXT: [[TMP18:%.*]] = fsub x86_fp80 [[TMP17]], [[B_REAL]] +// PRMTD-NEXT: [[TMP19:%.*]] = fdiv x86_fp80 [[TMP18]], [[TMP13]] +// PRMTD-NEXT: br label [[COMPLEX_DIV]] +// PRMTD: complex_div: +// PRMTD-NEXT: [[TMP20:%.*]] = phi x86_fp80 [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// PRMTD-NEXT: [[TMP21:%.*]] = phi x86_fp80 [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// PRMTD-NEXT: [[CONV2:%.*]] = fptrunc x86_fp80 [[TMP20]] to float +// PRMTD-NEXT: [[CONV3:%.*]] = fptrunc x86_fp80 [[TMP21]] to float +// PRMTD-NEXT: [[EXT:%.*]] = fpext float [[CONV2]] to double +// PRMTD-NEXT: [[EXT4:%.*]] = fpext float [[CONV3]] to double +// PRMTD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// PRMTD-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// PRMTD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// PRMTD-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// PRMTD-NEXT: [[EXT5:%.*]] = fpext float [[A_REAL]] to double +// PRMTD-NEXT: [[EXT6:%.*]] = fpext 
float [[A_IMAG]] to double +// PRMTD-NEXT: [[TMP22:%.*]] = fmul double [[EXT]], [[EXT5]] +// PRMTD-NEXT: [[TMP23:%.*]] = fmul double [[EXT4]], [[EXT6]] +// PRMTD-NEXT: [[TMP24:%.*]] = fadd double [[TMP22]], [[TMP23]] +// PRMTD-NEXT: [[TMP25:%.*]] = fmul double [[EXT5]], [[EXT5]] +// PRMTD-NEXT: [[TMP26:%.*]] = fmul double [[EXT6]], [[EXT6]] +// PRMTD-NEXT: [[TMP27:%.*]] = fadd double [[TMP25]], [[TMP26]] +// PRMTD-NEXT: [[TMP28:%.*]] = fmul double [[EXT4]], [[EXT5]] +// PRMTD-NEXT: [[TMP29:%.*]] = fmul double [[EXT]], [[EXT6]] +// PRMTD-NEXT: [[TMP30:%.*]] = fsub double [[TMP28]], [[TMP29]] +// PRMTD-NEXT: [[TMP31:%.*]] = fdiv double [[TMP24]], [[TMP27]] +// PRMTD-NEXT: [[TMP32:%.*]] = fdiv double [[TMP30]], [[TMP27]] +// PRMTD-NEXT: [[UNPROMOTION:%.*]] = fptrunc double [[TMP31]] to float +// PRMTD-NEXT: [[UNPROMOTION7:%.*]] = fptrunc double [[TMP32]] to float +// PRMTD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// PRMTD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// PRMTD-NEXT: store float [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 4 +// PRMTD-NEXT: store float [[UNPROMOTION7]], ptr [[RETVAL_IMAGP]], align 4 +// PRMTD-NEXT: [[TMP33:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4 +// PRMTD-NEXT: ret <2 x float> [[TMP33]] +// +// X86WINPRMTD-LABEL: define dso_local i64 @f1( +// X86WINPRMTD-SAME: i64 noundef [[A_COERCE:%.*]], ptr noundef [[B:%.*]], i64 noundef [[C_COERCE:%.*]]) #[[ATTR0]] { +// X86WINPRMTD-NEXT: entry: +// X86WINPRMTD-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// X86WINPRMTD-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// X86WINPRMTD-NEXT: [[C:%.*]] = alloca { float, float }, align 4 +// X86WINPRMTD-NEXT: [[B_INDIRECT_ADDR:%.*]] = alloca ptr, align 8 +// X86WINPRMTD-NEXT: store i64 [[A_COERCE]], ptr [[A]], align 4 +// X86WINPRMTD-NEXT: store i64 [[C_COERCE]], ptr [[C]], align 4 +// X86WINPRMTD-NEXT: store ptr [[B]], ptr [[B_INDIRECT_ADDR]], align 8 +// X86WINPRMTD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8 +// X86WINPRMTD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8 +// X86WINPRMTD-NEXT: [[C_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[C_REAL:%.*]] = load float, ptr [[C_REALP]], align 4 +// X86WINPRMTD-NEXT: [[C_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[C_IMAG:%.*]] = load float, ptr [[C_IMAGP]], align 4 +// X86WINPRMTD-NEXT: [[CONV:%.*]] = fpext float [[C_REAL]] to double +// X86WINPRMTD-NEXT: [[CONV1:%.*]] = fpext float [[C_IMAG]] to double +// X86WINPRMTD-NEXT: [[TMP0:%.*]] = call double @llvm.fabs.f64(double [[CONV]]) +// X86WINPRMTD-NEXT: [[TMP1:%.*]] = call double @llvm.fabs.f64(double [[CONV1]]) +// X86WINPRMTD-NEXT: [[ABS_CMP:%.*]] = fcmp ugt double [[TMP0]], [[TMP1]] +// X86WINPRMTD-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// X86WINPRMTD: abs_rhsr_greater_or_equal_abs_rhsi: +// X86WINPRMTD-NEXT: [[TMP2:%.*]] = fdiv double [[CONV1]], [[CONV]] +// X86WINPRMTD-NEXT: [[TMP3:%.*]] = fmul double [[TMP2]], [[CONV1]] +// X86WINPRMTD-NEXT: [[TMP4:%.*]] = fadd double [[CONV]], [[TMP3]] +// 
X86WINPRMTD-NEXT: [[TMP5:%.*]] = fmul double [[B_IMAG]], [[TMP2]] +// X86WINPRMTD-NEXT: [[TMP6:%.*]] = fadd double [[B_REAL]], [[TMP5]] +// X86WINPRMTD-NEXT: [[TMP7:%.*]] = fdiv double [[TMP6]], [[TMP4]] +// X86WINPRMTD-NEXT: [[TMP8:%.*]] = fmul double [[B_REAL]], [[TMP2]] +// X86WINPRMTD-NEXT: [[TMP9:%.*]] = fsub double [[B_IMAG]], [[TMP8]] +// X86WINPRMTD-NEXT: [[TMP10:%.*]] = fdiv double [[TMP9]], [[TMP4]] +// X86WINPRMTD-NEXT: br label [[COMPLEX_DIV:%.*]] +// X86WINPRMTD: abs_rhsr_less_than_abs_rhsi: +// X86WINPRMTD-NEXT: [[TMP11:%.*]] = fdiv double [[CONV]], [[CONV1]] +// X86WINPRMTD-NEXT: [[TMP12:%.*]] = fmul double [[TMP11]], [[CONV]] +// X86WINPRMTD-NEXT: [[TMP13:%.*]] = fadd double [[CONV1]], [[TMP12]] +// X86WINPRMTD-NEXT: [[TMP14:%.*]] = fmul double [[B_REAL]], [[TMP11]] +// X86WINPRMTD-NEXT: [[TMP15:%.*]] = fadd double [[TMP14]], [[B_IMAG]] +// X86WINPRMTD-NEXT: [[TMP16:%.*]] = fdiv double [[TMP15]], [[TMP13]] +// X86WINPRMTD-NEXT: [[TMP17:%.*]] = fmul double [[B_IMAG]], [[TMP11]] +// X86WINPRMTD-NEXT: [[TMP18:%.*]] = fsub double [[TMP17]], [[B_REAL]] +// X86WINPRMTD-NEXT: [[TMP19:%.*]] = fdiv double [[TMP18]], [[TMP13]] +// X86WINPRMTD-NEXT: br label [[COMPLEX_DIV]] +// X86WINPRMTD: complex_div: +// X86WINPRMTD-NEXT: [[TMP20:%.*]] = phi double [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// X86WINPRMTD-NEXT: [[TMP21:%.*]] = phi double [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// X86WINPRMTD-NEXT: [[CONV2:%.*]] = fptrunc double [[TMP20]] to float +// X86WINPRMTD-NEXT: [[CONV3:%.*]] = fptrunc double [[TMP21]] to float +// X86WINPRMTD-NEXT: [[EXT:%.*]] = fpext float [[CONV2]] to double +// X86WINPRMTD-NEXT: [[EXT4:%.*]] = fpext float [[CONV3]] to double +// X86WINPRMTD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// X86WINPRMTD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// X86WINPRMTD-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// X86WINPRMTD-NEXT: [[EXT5:%.*]] = fpext float [[A_REAL]] to double +// X86WINPRMTD-NEXT: [[EXT6:%.*]] = fpext float [[A_IMAG]] to double +// X86WINPRMTD-NEXT: [[TMP22:%.*]] = fmul double [[EXT]], [[EXT5]] +// X86WINPRMTD-NEXT: [[TMP23:%.*]] = fmul double [[EXT4]], [[EXT6]] +// X86WINPRMTD-NEXT: [[TMP24:%.*]] = fadd double [[TMP22]], [[TMP23]] +// X86WINPRMTD-NEXT: [[TMP25:%.*]] = fmul double [[EXT5]], [[EXT5]] +// X86WINPRMTD-NEXT: [[TMP26:%.*]] = fmul double [[EXT6]], [[EXT6]] +// X86WINPRMTD-NEXT: [[TMP27:%.*]] = fadd double [[TMP25]], [[TMP26]] +// X86WINPRMTD-NEXT: [[TMP28:%.*]] = fmul double [[EXT4]], [[EXT5]] +// X86WINPRMTD-NEXT: [[TMP29:%.*]] = fmul double [[EXT]], [[EXT6]] +// X86WINPRMTD-NEXT: [[TMP30:%.*]] = fsub double [[TMP28]], [[TMP29]] +// X86WINPRMTD-NEXT: [[TMP31:%.*]] = fdiv double [[TMP24]], [[TMP27]] +// X86WINPRMTD-NEXT: [[TMP32:%.*]] = fdiv double [[TMP30]], [[TMP27]] +// X86WINPRMTD-NEXT: [[UNPROMOTION:%.*]] = fptrunc double [[TMP31]] to float +// X86WINPRMTD-NEXT: [[UNPROMOTION7:%.*]] = fptrunc double [[TMP32]] to float +// X86WINPRMTD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// X86WINPRMTD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// X86WINPRMTD-NEXT: store float [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 4 +// 
X86WINPRMTD-NEXT: store float [[UNPROMOTION7]], ptr [[RETVAL_IMAGP]], align 4 +// X86WINPRMTD-NEXT: [[TMP33:%.*]] = load i64, ptr [[RETVAL]], align 4 +// X86WINPRMTD-NEXT: ret i64 [[TMP33]] +// +// AVRFP32-LABEL: define dso_local { float, float } @f1( +// AVRFP32-SAME: float noundef [[A_COERCE0:%.*]], float noundef [[A_COERCE1:%.*]], float noundef [[B_COERCE0:%.*]], float noundef [[B_COERCE1:%.*]], float noundef [[C_COERCE0:%.*]], float noundef [[C_COERCE1:%.*]]) addrspace(1) #[[ATTR0]] { +// AVRFP32-NEXT: entry: +// AVRFP32-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 1 +// AVRFP32-NEXT: [[A:%.*]] = alloca { float, float }, align 1 +// AVRFP32-NEXT: [[B:%.*]] = alloca { float, float }, align 1 +// AVRFP32-NEXT: [[C:%.*]] = alloca { float, float }, align 1 +// AVRFP32-NEXT: [[TMP0:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// AVRFP32-NEXT: store float [[A_COERCE0]], ptr [[TMP0]], align 1 +// AVRFP32-NEXT: [[TMP1:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[A_COERCE1]], ptr [[TMP1]], align 1 +// AVRFP32-NEXT: [[TMP2:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// AVRFP32-NEXT: store float [[B_COERCE0]], ptr [[TMP2]], align 1 +// AVRFP32-NEXT: [[TMP3:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[B_COERCE1]], ptr [[TMP3]], align 1 +// AVRFP32-NEXT: [[TMP4:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 0 +// AVRFP32-NEXT: store float [[C_COERCE0]], ptr [[TMP4]], align 1 +// AVRFP32-NEXT: [[TMP5:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[C_COERCE1]], ptr [[TMP5]], align 1 +// AVRFP32-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0 +// AVRFP32-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 1 +// AVRFP32-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1 +// AVRFP32-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 1 +// AVRFP32-NEXT: [[C_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 0 +// AVRFP32-NEXT: [[C_REAL:%.*]] = load float, ptr [[C_REALP]], align 1 +// AVRFP32-NEXT: [[C_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 1 +// AVRFP32-NEXT: [[C_IMAG:%.*]] = load float, ptr [[C_IMAGP]], align 1 +// AVRFP32-NEXT: [[TMP6:%.*]] = call addrspace(1) float @llvm.fabs.f32(float [[C_REAL]]) +// AVRFP32-NEXT: [[TMP7:%.*]] = call addrspace(1) float @llvm.fabs.f32(float [[C_IMAG]]) +// AVRFP32-NEXT: [[ABS_CMP:%.*]] = fcmp ugt float [[TMP6]], [[TMP7]] +// AVRFP32-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// AVRFP32: abs_rhsr_greater_or_equal_abs_rhsi: +// AVRFP32-NEXT: [[TMP8:%.*]] = fdiv float [[C_IMAG]], [[C_REAL]] +// AVRFP32-NEXT: [[TMP9:%.*]] = fmul float [[TMP8]], [[C_IMAG]] +// AVRFP32-NEXT: [[TMP10:%.*]] = fadd float [[C_REAL]], [[TMP9]] +// AVRFP32-NEXT: [[TMP11:%.*]] = fmul float [[B_IMAG]], [[TMP8]] +// AVRFP32-NEXT: [[TMP12:%.*]] = fadd float [[B_REAL]], [[TMP11]] +// AVRFP32-NEXT: [[TMP13:%.*]] = fdiv float [[TMP12]], [[TMP10]] +// AVRFP32-NEXT: [[TMP14:%.*]] = fmul float [[B_REAL]], [[TMP8]] +// AVRFP32-NEXT: [[TMP15:%.*]] = fsub float [[B_IMAG]], [[TMP14]] +// AVRFP32-NEXT: [[TMP16:%.*]] = fdiv float [[TMP15]], [[TMP10]] +// AVRFP32-NEXT: br label [[COMPLEX_DIV:%.*]] +// AVRFP32: 
abs_rhsr_less_than_abs_rhsi: +// AVRFP32-NEXT: [[TMP17:%.*]] = fdiv float [[C_REAL]], [[C_IMAG]] +// AVRFP32-NEXT: [[TMP18:%.*]] = fmul float [[TMP17]], [[C_REAL]] +// AVRFP32-NEXT: [[TMP19:%.*]] = fadd float [[C_IMAG]], [[TMP18]] +// AVRFP32-NEXT: [[TMP20:%.*]] = fmul float [[B_REAL]], [[TMP17]] +// AVRFP32-NEXT: [[TMP21:%.*]] = fadd float [[TMP20]], [[B_IMAG]] +// AVRFP32-NEXT: [[TMP22:%.*]] = fdiv float [[TMP21]], [[TMP19]] +// AVRFP32-NEXT: [[TMP23:%.*]] = fmul float [[B_IMAG]], [[TMP17]] +// AVRFP32-NEXT: [[TMP24:%.*]] = fsub float [[TMP23]], [[B_REAL]] +// AVRFP32-NEXT: [[TMP25:%.*]] = fdiv float [[TMP24]], [[TMP19]] +// AVRFP32-NEXT: br label [[COMPLEX_DIV]] +// AVRFP32: complex_div: +// AVRFP32-NEXT: [[TMP26:%.*]] = phi float [ [[TMP13]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP22]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// AVRFP32-NEXT: [[TMP27:%.*]] = phi float [ [[TMP16]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP25]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// AVRFP32-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// AVRFP32-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 1 +// AVRFP32-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// AVRFP32-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 1 +// AVRFP32-NEXT: [[TMP28:%.*]] = call addrspace(1) float @llvm.fabs.f32(float [[A_REAL]]) +// AVRFP32-NEXT: [[TMP29:%.*]] = call addrspace(1) float @llvm.fabs.f32(float [[A_IMAG]]) +// AVRFP32-NEXT: [[ABS_CMP1:%.*]] = fcmp ugt float [[TMP28]], [[TMP29]] +// AVRFP32-NEXT: br i1 [[ABS_CMP1]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI2:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI3:%.*]] +// AVRFP32: abs_rhsr_greater_or_equal_abs_rhsi2: +// AVRFP32-NEXT: [[TMP30:%.*]] = fdiv float [[A_IMAG]], [[A_REAL]] +// AVRFP32-NEXT: [[TMP31:%.*]] = fmul float [[TMP30]], [[A_IMAG]] +// AVRFP32-NEXT: [[TMP32:%.*]] = fadd float [[A_REAL]], [[TMP31]] +// AVRFP32-NEXT: [[TMP33:%.*]] = fmul float [[TMP27]], [[TMP30]] +// AVRFP32-NEXT: [[TMP34:%.*]] = fadd float [[TMP26]], [[TMP33]] +// AVRFP32-NEXT: [[TMP35:%.*]] = fdiv float [[TMP34]], [[TMP32]] +// AVRFP32-NEXT: [[TMP36:%.*]] = fmul float [[TMP26]], [[TMP30]] +// AVRFP32-NEXT: [[TMP37:%.*]] = fsub float [[TMP27]], [[TMP36]] +// AVRFP32-NEXT: [[TMP38:%.*]] = fdiv float [[TMP37]], [[TMP32]] +// AVRFP32-NEXT: br label [[COMPLEX_DIV4:%.*]] +// AVRFP32: abs_rhsr_less_than_abs_rhsi3: +// AVRFP32-NEXT: [[TMP39:%.*]] = fdiv float [[A_REAL]], [[A_IMAG]] +// AVRFP32-NEXT: [[TMP40:%.*]] = fmul float [[TMP39]], [[A_REAL]] +// AVRFP32-NEXT: [[TMP41:%.*]] = fadd float [[A_IMAG]], [[TMP40]] +// AVRFP32-NEXT: [[TMP42:%.*]] = fmul float [[TMP26]], [[TMP39]] +// AVRFP32-NEXT: [[TMP43:%.*]] = fadd float [[TMP42]], [[TMP27]] +// AVRFP32-NEXT: [[TMP44:%.*]] = fdiv float [[TMP43]], [[TMP41]] +// AVRFP32-NEXT: [[TMP45:%.*]] = fmul float [[TMP27]], [[TMP39]] +// AVRFP32-NEXT: [[TMP46:%.*]] = fsub float [[TMP45]], [[TMP26]] +// AVRFP32-NEXT: [[TMP47:%.*]] = fdiv float [[TMP46]], [[TMP41]] +// AVRFP32-NEXT: br label [[COMPLEX_DIV4]] +// AVRFP32: complex_div4: +// AVRFP32-NEXT: [[TMP48:%.*]] = phi float [ [[TMP35]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI2]] ], [ [[TMP44]], [[ABS_RHSR_LESS_THAN_ABS_RHSI3]] ] +// AVRFP32-NEXT: [[TMP49:%.*]] = phi float [ [[TMP38]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI2]] ], [ [[TMP47]], [[ABS_RHSR_LESS_THAN_ABS_RHSI3]] ] +// AVRFP32-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// 
AVRFP32-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// AVRFP32-NEXT: store float [[TMP48]], ptr [[RETVAL_REALP]], align 1 +// AVRFP32-NEXT: store float [[TMP49]], ptr [[RETVAL_IMAGP]], align 1 +// AVRFP32-NEXT: [[TMP50:%.*]] = load { float, float }, ptr [[RETVAL]], align 1 +// AVRFP32-NEXT: ret { float, float } [[TMP50]] +// +// AVRFP64-LABEL: define dso_local { float, float } @f1( +// AVRFP64-SAME: float noundef [[A_COERCE0:%.*]], float noundef [[A_COERCE1:%.*]], double noundef [[B_COERCE0:%.*]], double noundef [[B_COERCE1:%.*]], float noundef [[C_COERCE0:%.*]], float noundef [[C_COERCE1:%.*]]) addrspace(1) #[[ATTR0]] { +// AVRFP64-NEXT: entry: +// AVRFP64-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 1 +// AVRFP64-NEXT: [[A:%.*]] = alloca { float, float }, align 1 +// AVRFP64-NEXT: [[B:%.*]] = alloca { double, double }, align 1 +// AVRFP64-NEXT: [[C:%.*]] = alloca { float, float }, align 1 +// AVRFP64-NEXT: [[TMP0:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// AVRFP64-NEXT: store float [[A_COERCE0]], ptr [[TMP0]], align 1 +// AVRFP64-NEXT: [[TMP1:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// AVRFP64-NEXT: store float [[A_COERCE1]], ptr [[TMP1]], align 1 +// AVRFP64-NEXT: [[TMP2:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// AVRFP64-NEXT: store double [[B_COERCE0]], ptr [[TMP2]], align 1 +// AVRFP64-NEXT: [[TMP3:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// AVRFP64-NEXT: store double [[B_COERCE1]], ptr [[TMP3]], align 1 +// AVRFP64-NEXT: [[TMP4:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 0 +// AVRFP64-NEXT: store float [[C_COERCE0]], ptr [[TMP4]], align 1 +// AVRFP64-NEXT: [[TMP5:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 1 +// AVRFP64-NEXT: store float [[C_COERCE1]], ptr [[TMP5]], align 1 +// AVRFP64-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 0 +// AVRFP64-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 1 +// AVRFP64-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { double, double }, ptr [[B]], i32 0, i32 1 +// AVRFP64-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 1 +// AVRFP64-NEXT: [[C_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 0 +// AVRFP64-NEXT: [[C_REAL:%.*]] = load float, ptr [[C_REALP]], align 1 +// AVRFP64-NEXT: [[C_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 1 +// AVRFP64-NEXT: [[C_IMAG:%.*]] = load float, ptr [[C_IMAGP]], align 1 +// AVRFP64-NEXT: [[CONV:%.*]] = fpext float [[C_REAL]] to double +// AVRFP64-NEXT: [[CONV1:%.*]] = fpext float [[C_IMAG]] to double +// AVRFP64-NEXT: [[TMP6:%.*]] = call addrspace(1) double @llvm.fabs.f64(double [[CONV]]) +// AVRFP64-NEXT: [[TMP7:%.*]] = call addrspace(1) double @llvm.fabs.f64(double [[CONV1]]) +// AVRFP64-NEXT: [[ABS_CMP:%.*]] = fcmp ugt double [[TMP6]], [[TMP7]] +// AVRFP64-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// AVRFP64: abs_rhsr_greater_or_equal_abs_rhsi: +// AVRFP64-NEXT: [[TMP8:%.*]] = fdiv double [[CONV1]], [[CONV]] +// AVRFP64-NEXT: [[TMP9:%.*]] = fmul double [[TMP8]], [[CONV1]] +// AVRFP64-NEXT: [[TMP10:%.*]] = fadd double [[CONV]], [[TMP9]] +// AVRFP64-NEXT: [[TMP11:%.*]] = fmul double [[B_IMAG]], [[TMP8]] +// AVRFP64-NEXT: [[TMP12:%.*]] = fadd double [[B_REAL]], 
[[TMP11]] +// AVRFP64-NEXT: [[TMP13:%.*]] = fdiv double [[TMP12]], [[TMP10]] +// AVRFP64-NEXT: [[TMP14:%.*]] = fmul double [[B_REAL]], [[TMP8]] +// AVRFP64-NEXT: [[TMP15:%.*]] = fsub double [[B_IMAG]], [[TMP14]] +// AVRFP64-NEXT: [[TMP16:%.*]] = fdiv double [[TMP15]], [[TMP10]] +// AVRFP64-NEXT: br label [[COMPLEX_DIV:%.*]] +// AVRFP64: abs_rhsr_less_than_abs_rhsi: +// AVRFP64-NEXT: [[TMP17:%.*]] = fdiv double [[CONV]], [[CONV1]] +// AVRFP64-NEXT: [[TMP18:%.*]] = fmul double [[TMP17]], [[CONV]] +// AVRFP64-NEXT: [[TMP19:%.*]] = fadd double [[CONV1]], [[TMP18]] +// AVRFP64-NEXT: [[TMP20:%.*]] = fmul double [[B_REAL]], [[TMP17]] +// AVRFP64-NEXT: [[TMP21:%.*]] = fadd double [[TMP20]], [[B_IMAG]] +// AVRFP64-NEXT: [[TMP22:%.*]] = fdiv double [[TMP21]], [[TMP19]] +// AVRFP64-NEXT: [[TMP23:%.*]] = fmul double [[B_IMAG]], [[TMP17]] +// AVRFP64-NEXT: [[TMP24:%.*]] = fsub double [[TMP23]], [[B_REAL]] +// AVRFP64-NEXT: [[TMP25:%.*]] = fdiv double [[TMP24]], [[TMP19]] +// AVRFP64-NEXT: br label [[COMPLEX_DIV]] +// AVRFP64: complex_div: +// AVRFP64-NEXT: [[TMP26:%.*]] = phi double [ [[TMP13]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP22]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// AVRFP64-NEXT: [[TMP27:%.*]] = phi double [ [[TMP16]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP25]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// AVRFP64-NEXT: [[CONV2:%.*]] = fptrunc double [[TMP26]] to float +// AVRFP64-NEXT: [[CONV3:%.*]] = fptrunc double [[TMP27]] to float +// AVRFP64-NEXT: [[EXT:%.*]] = fpext float [[CONV2]] to double +// AVRFP64-NEXT: [[EXT4:%.*]] = fpext float [[CONV3]] to double +// AVRFP64-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// AVRFP64-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 1 +// AVRFP64-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// AVRFP64-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 1 +// AVRFP64-NEXT: [[EXT5:%.*]] = fpext float [[A_REAL]] to double +// AVRFP64-NEXT: [[EXT6:%.*]] = fpext float [[A_IMAG]] to double +// AVRFP64-NEXT: [[TMP28:%.*]] = fmul double [[EXT]], [[EXT5]] +// AVRFP64-NEXT: [[TMP29:%.*]] = fmul double [[EXT4]], [[EXT6]] +// AVRFP64-NEXT: [[TMP30:%.*]] = fadd double [[TMP28]], [[TMP29]] +// AVRFP64-NEXT: [[TMP31:%.*]] = fmul double [[EXT5]], [[EXT5]] +// AVRFP64-NEXT: [[TMP32:%.*]] = fmul double [[EXT6]], [[EXT6]] +// AVRFP64-NEXT: [[TMP33:%.*]] = fadd double [[TMP31]], [[TMP32]] +// AVRFP64-NEXT: [[TMP34:%.*]] = fmul double [[EXT4]], [[EXT5]] +// AVRFP64-NEXT: [[TMP35:%.*]] = fmul double [[EXT]], [[EXT6]] +// AVRFP64-NEXT: [[TMP36:%.*]] = fsub double [[TMP34]], [[TMP35]] +// AVRFP64-NEXT: [[TMP37:%.*]] = fdiv double [[TMP30]], [[TMP33]] +// AVRFP64-NEXT: [[TMP38:%.*]] = fdiv double [[TMP36]], [[TMP33]] +// AVRFP64-NEXT: [[UNPROMOTION:%.*]] = fptrunc double [[TMP37]] to float +// AVRFP64-NEXT: [[UNPROMOTION7:%.*]] = fptrunc double [[TMP38]] to float +// AVRFP64-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// AVRFP64-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// AVRFP64-NEXT: store float [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 1 +// AVRFP64-NEXT: store float [[UNPROMOTION7]], ptr [[RETVAL_IMAGP]], align 1 +// AVRFP64-NEXT: [[TMP39:%.*]] = load { float, float }, ptr [[RETVAL]], align 1 +// AVRFP64-NEXT: ret { float, float } [[TMP39]] +// +// BASIC_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x float> @f1( +// 
BASIC_FAST-SAME: <2 x float> noundef nofpclass(nan inf) [[A_COERCE:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]], <2 x float> noundef nofpclass(nan inf) [[C_COERCE:%.*]]) #[[ATTR0]] { +// BASIC_FAST-NEXT: entry: +// BASIC_FAST-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// BASIC_FAST-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// BASIC_FAST-NEXT: [[C:%.*]] = alloca { float, float }, align 4 +// BASIC_FAST-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4 +// BASIC_FAST-NEXT: store <2 x float> [[C_COERCE]], ptr [[C]], align 4 +// BASIC_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// BASIC_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// BASIC_FAST-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// BASIC_FAST-NEXT: [[C_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[C_REAL:%.*]] = load float, ptr [[C_REALP]], align 4 +// BASIC_FAST-NEXT: [[C_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 1 +// BASIC_FAST-NEXT: [[C_IMAG:%.*]] = load float, ptr [[C_IMAGP]], align 4 +// BASIC_FAST-NEXT: [[CONV:%.*]] = fpext float [[C_REAL]] to x86_fp80 +// BASIC_FAST-NEXT: [[CONV1:%.*]] = fpext float [[C_IMAG]] to x86_fp80 +// BASIC_FAST-NEXT: [[TMP0:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_REAL]], [[CONV]] +// BASIC_FAST-NEXT: [[TMP1:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_IMAG]], [[CONV1]] +// BASIC_FAST-NEXT: [[TMP2:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP0]], [[TMP1]] +// BASIC_FAST-NEXT: [[TMP3:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[CONV]], [[CONV]] +// BASIC_FAST-NEXT: [[TMP4:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[CONV1]], [[CONV1]] +// BASIC_FAST-NEXT: [[TMP5:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP3]], [[TMP4]] +// BASIC_FAST-NEXT: [[TMP6:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_IMAG]], [[CONV]] +// BASIC_FAST-NEXT: [[TMP7:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_REAL]], [[CONV1]] +// BASIC_FAST-NEXT: [[TMP8:%.*]] = fsub reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP6]], [[TMP7]] +// BASIC_FAST-NEXT: [[TMP9:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP2]], [[TMP5]] +// BASIC_FAST-NEXT: [[TMP10:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP8]], [[TMP5]] +// BASIC_FAST-NEXT: [[CONV2:%.*]] = fptrunc x86_fp80 [[TMP9]] to float +// BASIC_FAST-NEXT: [[CONV3:%.*]] = fptrunc x86_fp80 [[TMP10]] to float +// BASIC_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// BASIC_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// BASIC_FAST-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// BASIC_FAST-NEXT: [[TMP11:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[CONV2]], [[A_REAL]] +// BASIC_FAST-NEXT: [[TMP12:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[CONV3]], [[A_IMAG]] +// BASIC_FAST-NEXT: [[TMP13:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[TMP11]], [[TMP12]] +// BASIC_FAST-NEXT: [[TMP14:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[A_REAL]], [[A_REAL]] +// BASIC_FAST-NEXT: [[TMP15:%.*]] = fmul reassoc nnan ninf nsz arcp afn 
float [[A_IMAG]], [[A_IMAG]] +// BASIC_FAST-NEXT: [[TMP16:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[TMP14]], [[TMP15]] +// BASIC_FAST-NEXT: [[TMP17:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[CONV3]], [[A_REAL]] +// BASIC_FAST-NEXT: [[TMP18:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[CONV2]], [[A_IMAG]] +// BASIC_FAST-NEXT: [[TMP19:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[TMP17]], [[TMP18]] +// BASIC_FAST-NEXT: [[TMP20:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP13]], [[TMP16]] +// BASIC_FAST-NEXT: [[TMP21:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP19]], [[TMP16]] +// BASIC_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// BASIC_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// BASIC_FAST-NEXT: store float [[TMP20]], ptr [[RETVAL_REALP]], align 4 +// BASIC_FAST-NEXT: store float [[TMP21]], ptr [[RETVAL_IMAGP]], align 4 +// BASIC_FAST-NEXT: [[TMP22:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4 +// BASIC_FAST-NEXT: ret <2 x float> [[TMP22]] +// +// FULL_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x float> @f1( +// FULL_FAST-SAME: <2 x float> noundef nofpclass(nan inf) [[A_COERCE:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]], <2 x float> noundef nofpclass(nan inf) [[C_COERCE:%.*]]) #[[ATTR0]] { +// FULL_FAST-NEXT: entry: +// FULL_FAST-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// FULL_FAST-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// FULL_FAST-NEXT: [[C:%.*]] = alloca { float, float }, align 4 +// FULL_FAST-NEXT: [[COERCE:%.*]] = alloca { float, float }, align 4 +// FULL_FAST-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4 +// FULL_FAST-NEXT: store <2 x float> [[C_COERCE]], ptr [[C]], align 4 +// FULL_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// FULL_FAST-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// FULL_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// FULL_FAST-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// FULL_FAST-NEXT: [[C_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 0 +// FULL_FAST-NEXT: [[C_REAL:%.*]] = load float, ptr [[C_REALP]], align 4 +// FULL_FAST-NEXT: [[C_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 1 +// FULL_FAST-NEXT: [[C_IMAG:%.*]] = load float, ptr [[C_IMAGP]], align 4 +// FULL_FAST-NEXT: [[CONV:%.*]] = fpext float [[C_REAL]] to x86_fp80 +// FULL_FAST-NEXT: [[CONV1:%.*]] = fpext float [[C_IMAG]] to x86_fp80 +// FULL_FAST-NEXT: [[CALL:%.*]] = call { x86_fp80, x86_fp80 } @__divxc3(x86_fp80 noundef nofpclass(nan inf) [[B_REAL]], x86_fp80 noundef nofpclass(nan inf) [[B_IMAG]], x86_fp80 noundef nofpclass(nan inf) [[CONV]], x86_fp80 noundef nofpclass(nan inf) [[CONV1]]) #[[ATTR2]] +// FULL_FAST-NEXT: [[TMP0:%.*]] = extractvalue { x86_fp80, x86_fp80 } [[CALL]], 0 +// FULL_FAST-NEXT: [[TMP1:%.*]] = extractvalue { x86_fp80, x86_fp80 } [[CALL]], 1 +// FULL_FAST-NEXT: [[CONV2:%.*]] = fptrunc x86_fp80 [[TMP0]] to float +// FULL_FAST-NEXT: [[CONV3:%.*]] = fptrunc x86_fp80 [[TMP1]] to float +// FULL_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// FULL_FAST-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// FULL_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { 
float, float }, ptr [[A]], i32 0, i32 1 +// FULL_FAST-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// FULL_FAST-NEXT: [[CALL4:%.*]] = call reassoc nnan ninf nsz arcp afn nofpclass(nan inf) <2 x float> @__divsc3(float noundef nofpclass(nan inf) [[CONV2]], float noundef nofpclass(nan inf) [[CONV3]], float noundef nofpclass(nan inf) [[A_REAL]], float noundef nofpclass(nan inf) [[A_IMAG]]) #[[ATTR2]] +// FULL_FAST-NEXT: store <2 x float> [[CALL4]], ptr [[COERCE]], align 4 +// FULL_FAST-NEXT: [[COERCE_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 0 +// FULL_FAST-NEXT: [[COERCE_REAL:%.*]] = load float, ptr [[COERCE_REALP]], align 4 +// FULL_FAST-NEXT: [[COERCE_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[COERCE]], i32 0, i32 1 +// FULL_FAST-NEXT: [[COERCE_IMAG:%.*]] = load float, ptr [[COERCE_IMAGP]], align 4 +// FULL_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// FULL_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// FULL_FAST-NEXT: store float [[COERCE_REAL]], ptr [[RETVAL_REALP]], align 4 +// FULL_FAST-NEXT: store float [[COERCE_IMAG]], ptr [[RETVAL_IMAGP]], align 4 +// FULL_FAST-NEXT: [[TMP2:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4 +// FULL_FAST-NEXT: ret <2 x float> [[TMP2]] +// +// IMPRVD_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x float> @f1( +// IMPRVD_FAST-SAME: <2 x float> noundef nofpclass(nan inf) [[A_COERCE:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]], <2 x float> noundef nofpclass(nan inf) [[C_COERCE:%.*]]) #[[ATTR0]] { +// IMPRVD_FAST-NEXT: entry: +// IMPRVD_FAST-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// IMPRVD_FAST-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// IMPRVD_FAST-NEXT: [[C:%.*]] = alloca { float, float }, align 4 +// IMPRVD_FAST-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4 +// IMPRVD_FAST-NEXT: store <2 x float> [[C_COERCE]], ptr [[C]], align 4 +// IMPRVD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// IMPRVD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// IMPRVD_FAST-NEXT: [[C_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[C_REAL:%.*]] = load float, ptr [[C_REALP]], align 4 +// IMPRVD_FAST-NEXT: [[C_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: [[C_IMAG:%.*]] = load float, ptr [[C_IMAGP]], align 4 +// IMPRVD_FAST-NEXT: [[CONV:%.*]] = fpext float [[C_REAL]] to x86_fp80 +// IMPRVD_FAST-NEXT: [[CONV1:%.*]] = fpext float [[C_IMAG]] to x86_fp80 +// IMPRVD_FAST-NEXT: [[TMP0:%.*]] = call reassoc nnan ninf nsz arcp afn x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV]]) +// IMPRVD_FAST-NEXT: [[TMP1:%.*]] = call reassoc nnan ninf nsz arcp afn x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV1]]) +// IMPRVD_FAST-NEXT: [[ABS_CMP:%.*]] = fcmp reassoc nnan ninf nsz arcp afn ugt x86_fp80 [[TMP0]], [[TMP1]] +// IMPRVD_FAST-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// IMPRVD_FAST: abs_rhsr_greater_or_equal_abs_rhsi: +// IMPRVD_FAST-NEXT: [[TMP2:%.*]] = fdiv reassoc nnan ninf nsz arcp afn 
x86_fp80 [[CONV1]], [[CONV]] +// IMPRVD_FAST-NEXT: [[TMP3:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP2]], [[CONV1]] +// IMPRVD_FAST-NEXT: [[TMP4:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[CONV]], [[TMP3]] +// IMPRVD_FAST-NEXT: [[TMP5:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_IMAG]], [[TMP2]] +// IMPRVD_FAST-NEXT: [[TMP6:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[B_REAL]], [[TMP5]] +// IMPRVD_FAST-NEXT: [[TMP7:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP6]], [[TMP4]] +// IMPRVD_FAST-NEXT: [[TMP8:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_REAL]], [[TMP2]] +// IMPRVD_FAST-NEXT: [[TMP9:%.*]] = fsub reassoc nnan ninf nsz arcp afn x86_fp80 [[B_IMAG]], [[TMP8]] +// IMPRVD_FAST-NEXT: [[TMP10:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP9]], [[TMP4]] +// IMPRVD_FAST-NEXT: br label [[COMPLEX_DIV:%.*]] +// IMPRVD_FAST: abs_rhsr_less_than_abs_rhsi: +// IMPRVD_FAST-NEXT: [[TMP11:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[CONV]], [[CONV1]] +// IMPRVD_FAST-NEXT: [[TMP12:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP11]], [[CONV]] +// IMPRVD_FAST-NEXT: [[TMP13:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[CONV1]], [[TMP12]] +// IMPRVD_FAST-NEXT: [[TMP14:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_REAL]], [[TMP11]] +// IMPRVD_FAST-NEXT: [[TMP15:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP14]], [[B_IMAG]] +// IMPRVD_FAST-NEXT: [[TMP16:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP15]], [[TMP13]] +// IMPRVD_FAST-NEXT: [[TMP17:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_IMAG]], [[TMP11]] +// IMPRVD_FAST-NEXT: [[TMP18:%.*]] = fsub reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP17]], [[B_REAL]] +// IMPRVD_FAST-NEXT: [[TMP19:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP18]], [[TMP13]] +// IMPRVD_FAST-NEXT: br label [[COMPLEX_DIV]] +// IMPRVD_FAST: complex_div: +// IMPRVD_FAST-NEXT: [[TMP20:%.*]] = phi reassoc nnan ninf nsz arcp afn x86_fp80 [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD_FAST-NEXT: [[TMP21:%.*]] = phi reassoc nnan ninf nsz arcp afn x86_fp80 [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// IMPRVD_FAST-NEXT: [[CONV2:%.*]] = fptrunc x86_fp80 [[TMP20]] to float +// IMPRVD_FAST-NEXT: [[CONV3:%.*]] = fptrunc x86_fp80 [[TMP21]] to float +// IMPRVD_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4 +// IMPRVD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4 +// IMPRVD_FAST-NEXT: [[TMP22:%.*]] = call reassoc nnan ninf nsz arcp afn float @llvm.fabs.f32(float [[A_REAL]]) +// IMPRVD_FAST-NEXT: [[TMP23:%.*]] = call reassoc nnan ninf nsz arcp afn float @llvm.fabs.f32(float [[A_IMAG]]) +// IMPRVD_FAST-NEXT: [[ABS_CMP4:%.*]] = fcmp reassoc nnan ninf nsz arcp afn ugt float [[TMP22]], [[TMP23]] +// IMPRVD_FAST-NEXT: br i1 [[ABS_CMP4]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI5:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI6:%.*]] +// IMPRVD_FAST: abs_rhsr_greater_or_equal_abs_rhsi5: +// IMPRVD_FAST-NEXT: [[TMP24:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[A_IMAG]], [[A_REAL]] +// IMPRVD_FAST-NEXT: [[TMP25:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[TMP24]], 
[[A_IMAG]] +// IMPRVD_FAST-NEXT: [[TMP26:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[A_REAL]], [[TMP25]] +// IMPRVD_FAST-NEXT: [[TMP27:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[CONV3]], [[TMP24]] +// IMPRVD_FAST-NEXT: [[TMP28:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[CONV2]], [[TMP27]] +// IMPRVD_FAST-NEXT: [[TMP29:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP28]], [[TMP26]] +// IMPRVD_FAST-NEXT: [[TMP30:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[CONV2]], [[TMP24]] +// IMPRVD_FAST-NEXT: [[TMP31:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[CONV3]], [[TMP30]] +// IMPRVD_FAST-NEXT: [[TMP32:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP31]], [[TMP26]] +// IMPRVD_FAST-NEXT: br label [[COMPLEX_DIV7:%.*]] +// IMPRVD_FAST: abs_rhsr_less_than_abs_rhsi6: +// IMPRVD_FAST-NEXT: [[TMP33:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[A_REAL]], [[A_IMAG]] +// IMPRVD_FAST-NEXT: [[TMP34:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[TMP33]], [[A_REAL]] +// IMPRVD_FAST-NEXT: [[TMP35:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[A_IMAG]], [[TMP34]] +// IMPRVD_FAST-NEXT: [[TMP36:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[CONV2]], [[TMP33]] +// IMPRVD_FAST-NEXT: [[TMP37:%.*]] = fadd reassoc nnan ninf nsz arcp afn float [[TMP36]], [[CONV3]] +// IMPRVD_FAST-NEXT: [[TMP38:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP37]], [[TMP35]] +// IMPRVD_FAST-NEXT: [[TMP39:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[CONV3]], [[TMP33]] +// IMPRVD_FAST-NEXT: [[TMP40:%.*]] = fsub reassoc nnan ninf nsz arcp afn float [[TMP39]], [[CONV2]] +// IMPRVD_FAST-NEXT: [[TMP41:%.*]] = fdiv reassoc nnan ninf nsz arcp afn float [[TMP40]], [[TMP35]] +// IMPRVD_FAST-NEXT: br label [[COMPLEX_DIV7]] +// IMPRVD_FAST: complex_div7: +// IMPRVD_FAST-NEXT: [[TMP42:%.*]] = phi reassoc nnan ninf nsz arcp afn float [ [[TMP29]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI5]] ], [ [[TMP38]], [[ABS_RHSR_LESS_THAN_ABS_RHSI6]] ] +// IMPRVD_FAST-NEXT: [[TMP43:%.*]] = phi reassoc nnan ninf nsz arcp afn float [ [[TMP32]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI5]] ], [ [[TMP41]], [[ABS_RHSR_LESS_THAN_ABS_RHSI6]] ] +// IMPRVD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0 +// IMPRVD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1 +// IMPRVD_FAST-NEXT: store float [[TMP42]], ptr [[RETVAL_REALP]], align 4 +// IMPRVD_FAST-NEXT: store float [[TMP43]], ptr [[RETVAL_IMAGP]], align 4 +// IMPRVD_FAST-NEXT: [[TMP44:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4 +// IMPRVD_FAST-NEXT: ret <2 x float> [[TMP44]] +// +// PRMTD_FAST-LABEL: define dso_local nofpclass(nan inf) <2 x float> @f1( +// PRMTD_FAST-SAME: <2 x float> noundef nofpclass(nan inf) [[A_COERCE:%.*]], ptr noundef byval({ x86_fp80, x86_fp80 }) align 16 [[B:%.*]], <2 x float> noundef nofpclass(nan inf) [[C_COERCE:%.*]]) #[[ATTR0]] { +// PRMTD_FAST-NEXT: entry: +// PRMTD_FAST-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4 +// PRMTD_FAST-NEXT: [[A:%.*]] = alloca { float, float }, align 4 +// PRMTD_FAST-NEXT: [[C:%.*]] = alloca { float, float }, align 4 +// PRMTD_FAST-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4 +// PRMTD_FAST-NEXT: store <2 x float> [[C_COERCE]], ptr [[C]], align 4 +// PRMTD_FAST-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 0 +// PRMTD_FAST-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16 +// 
PRMTD_FAST-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1 +// PRMTD_FAST-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16 +// PRMTD_FAST-NEXT: [[C_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 0 +// PRMTD_FAST-NEXT: [[C_REAL:%.*]] = load float, ptr [[C_REALP]], align 4 +// PRMTD_FAST-NEXT: [[C_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[C]], i32 0, i32 1 +// PRMTD_FAST-NEXT: [[C_IMAG:%.*]] = load float, ptr [[C_IMAGP]], align 4 +// PRMTD_FAST-NEXT: [[CONV:%.*]] = fpext float [[C_REAL]] to x86_fp80 +// PRMTD_FAST-NEXT: [[CONV1:%.*]] = fpext float [[C_IMAG]] to x86_fp80 +// PRMTD_FAST-NEXT: [[TMP0:%.*]] = call reassoc nnan ninf nsz arcp afn x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV]]) +// PRMTD_FAST-NEXT: [[TMP1:%.*]] = call reassoc nnan ninf nsz arcp afn x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV1]]) +// PRMTD_FAST-NEXT: [[ABS_CMP:%.*]] = fcmp reassoc nnan ninf nsz arcp afn ugt x86_fp80 [[TMP0]], [[TMP1]] +// PRMTD_FAST-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]] +// PRMTD_FAST: abs_rhsr_greater_or_equal_abs_rhsi: +// PRMTD_FAST-NEXT: [[TMP2:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[CONV1]], [[CONV]] +// PRMTD_FAST-NEXT: [[TMP3:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP2]], [[CONV1]] +// PRMTD_FAST-NEXT: [[TMP4:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[CONV]], [[TMP3]] +// PRMTD_FAST-NEXT: [[TMP5:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_IMAG]], [[TMP2]] +// PRMTD_FAST-NEXT: [[TMP6:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[B_REAL]], [[TMP5]] +// PRMTD_FAST-NEXT: [[TMP7:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP6]], [[TMP4]] +// PRMTD_FAST-NEXT: [[TMP8:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_REAL]], [[TMP2]] +// PRMTD_FAST-NEXT: [[TMP9:%.*]] = fsub reassoc nnan ninf nsz arcp afn x86_fp80 [[B_IMAG]], [[TMP8]] +// PRMTD_FAST-NEXT: [[TMP10:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP9]], [[TMP4]] +// PRMTD_FAST-NEXT: br label [[COMPLEX_DIV:%.*]] +// PRMTD_FAST: abs_rhsr_less_than_abs_rhsi: +// PRMTD_FAST-NEXT: [[TMP11:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[CONV]], [[CONV1]] +// PRMTD_FAST-NEXT: [[TMP12:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP11]], [[CONV]] +// PRMTD_FAST-NEXT: [[TMP13:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[CONV1]], [[TMP12]] +// PRMTD_FAST-NEXT: [[TMP14:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_REAL]], [[TMP11]] +// PRMTD_FAST-NEXT: [[TMP15:%.*]] = fadd reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP14]], [[B_IMAG]] +// PRMTD_FAST-NEXT: [[TMP16:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP15]], [[TMP13]] +// PRMTD_FAST-NEXT: [[TMP17:%.*]] = fmul reassoc nnan ninf nsz arcp afn x86_fp80 [[B_IMAG]], [[TMP11]] +// PRMTD_FAST-NEXT: [[TMP18:%.*]] = fsub reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP17]], [[B_REAL]] +// PRMTD_FAST-NEXT: [[TMP19:%.*]] = fdiv reassoc nnan ninf nsz arcp afn x86_fp80 [[TMP18]], [[TMP13]] +// PRMTD_FAST-NEXT: br label [[COMPLEX_DIV]] +// PRMTD_FAST: complex_div: +// PRMTD_FAST-NEXT: [[TMP20:%.*]] = phi reassoc nnan ninf nsz arcp afn x86_fp80 [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ] +// PRMTD_FAST-NEXT: [[TMP21:%.*]] = phi reassoc nnan ninf nsz arcp afn x86_fp80 [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], 
[[ABS_RHSR_LESS_THAN_ABS_RHSI]] ]
+// PRMTD_FAST-NEXT: [[CONV2:%.*]] = fptrunc x86_fp80 [[TMP20]] to float
+// PRMTD_FAST-NEXT: [[CONV3:%.*]] = fptrunc x86_fp80 [[TMP21]] to float
+// PRMTD_FAST-NEXT: [[EXT:%.*]] = fpext float [[CONV2]] to double
+// PRMTD_FAST-NEXT: [[EXT4:%.*]] = fpext float [[CONV3]] to double
+// PRMTD_FAST-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0
+// PRMTD_FAST-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4
+// PRMTD_FAST-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1
+// PRMTD_FAST-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4
+// PRMTD_FAST-NEXT: [[EXT5:%.*]] = fpext float [[A_REAL]] to double
+// PRMTD_FAST-NEXT: [[EXT6:%.*]] = fpext float [[A_IMAG]] to double
+// PRMTD_FAST-NEXT: [[TMP22:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[EXT]], [[EXT5]]
+// PRMTD_FAST-NEXT: [[TMP23:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[EXT4]], [[EXT6]]
+// PRMTD_FAST-NEXT: [[TMP24:%.*]] = fadd reassoc nnan ninf nsz arcp afn double [[TMP22]], [[TMP23]]
+// PRMTD_FAST-NEXT: [[TMP25:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[EXT5]], [[EXT5]]
+// PRMTD_FAST-NEXT: [[TMP26:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[EXT6]], [[EXT6]]
+// PRMTD_FAST-NEXT: [[TMP27:%.*]] = fadd reassoc nnan ninf nsz arcp afn double [[TMP25]], [[TMP26]]
+// PRMTD_FAST-NEXT: [[TMP28:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[EXT4]], [[EXT5]]
+// PRMTD_FAST-NEXT: [[TMP29:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[EXT]], [[EXT6]]
+// PRMTD_FAST-NEXT: [[TMP30:%.*]] = fsub reassoc nnan ninf nsz arcp afn double [[TMP28]], [[TMP29]]
+// PRMTD_FAST-NEXT: [[TMP31:%.*]] = fdiv reassoc nnan ninf nsz arcp afn double [[TMP24]], [[TMP27]]
+// PRMTD_FAST-NEXT: [[TMP32:%.*]] = fdiv reassoc nnan ninf nsz arcp afn double [[TMP30]], [[TMP27]]
+// PRMTD_FAST-NEXT: [[UNPROMOTION:%.*]] = fptrunc double [[TMP31]] to float
+// PRMTD_FAST-NEXT: [[UNPROMOTION7:%.*]] = fptrunc double [[TMP32]] to float
+// PRMTD_FAST-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0
+// PRMTD_FAST-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1
+// PRMTD_FAST-NEXT: store float [[UNPROMOTION]], ptr [[RETVAL_REALP]], align 4
+// PRMTD_FAST-NEXT: store float [[UNPROMOTION7]], ptr [[RETVAL_IMAGP]], align 4
+// PRMTD_FAST-NEXT: [[TMP33:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4
+// PRMTD_FAST-NEXT: ret <2 x float> [[TMP33]]
+//
+_Complex float f1(_Complex float a, _Complex long double b, _Complex float c) {
+  return (_Complex float)(b / c) / a;
+}
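The f1 test above chains a long-double division by c with a float division by a, so a single return statement exercises both the extended-precision and the source-precision code paths of the selected mode. For readers tracing the IMPRVD and PRMTD branches, the two abs_rhsr_* blocks in the checked IR correspond to Smith's scaled division; below is a minimal C sketch of that scaling, illustrative only and not part of the patch (the cfloat and smith_div names are invented here):

#include <math.h>

typedef struct { float re, im; } cfloat; /* hypothetical helper type */

/* Sketch of Smith's division lhs/rhs: divide through by the larger
   component of rhs so intermediate products stay in range. Overflow can
   still occur, and NaN/Inf inputs get no special handling, matching the
   "improved" mode's documented limitations. */
static cfloat smith_div(cfloat lhs, cfloat rhs) {
  cfloat q;
  if (fabsf(rhs.re) >= fabsf(rhs.im)) { /* abs_rhsr_greater_or_equal_abs_rhsi */
    float r = rhs.im / rhs.re;
    float d = rhs.re + r * rhs.im;
    q.re = (lhs.re + lhs.im * r) / d;
    q.im = (lhs.im - lhs.re * r) / d;
  } else {                              /* abs_rhsr_less_than_abs_rhsi */
    float r = rhs.re / rhs.im;
    float d = rhs.im + r * rhs.re;
    q.re = (lhs.re * r + lhs.im) / d;
    q.im = (lhs.im * r - lhs.re) / d;
  }
  return q;
}

Note in the PRMTD output above that the x86_fp80 division still goes through these Smith branches, since no wider native type is available to promote to, while the float division is carried out in double with the plain algebraic formulas, which is why the checks show fpext/fptrunc pairs around straight-line fmul/fdiv there.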
| FileCheck --check-prefix=IMPRVD %s // RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \ -// RUN: -fno-cx-fortran-rules -o - | FileCheck --check-prefix=FULL %s +// RUN: -complex-range=promoted -o - | FileCheck --check-prefix=PRMTD %s + +// RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \ +// RUN: -complex-range=full -o - | FileCheck --check-prefix=FULL %s _Complex float pragma_on_mul(_Complex float a, _Complex float b) { #pragma STDC CX_LIMITED_RANGE ON // LABEL: define {{.*}} @pragma_on_mul( + // FULL: fmul float // FULL-NEXT: fmul float // FULL-NEXT: fmul float @@ -23,19 +27,26 @@ _Complex float pragma_on_mul(_Complex float a, _Complex float b) { // FULL-NEXT: fsub float // FULL-NEXT: fadd float - // LMTD: fmul float - // LMTD-NEXT: fmul float - // LMTD-NEXT: fmul float - // LMTD-NEXT: fmul float - // LMTD-NEXT: fsub float - // LMTD-NEXT: fadd float - - // FRTRN: fmul float - // FRTRN-NEXT: fmul float - // FRTRN-NEXT: fmul float - // FRTRN-NEXT: fmul float - // FRTRN-NEXT: fsub float - // FRTRN-NEXT: fadd float + // BASIC: fmul float + // BASIC-NEXT: fmul float + // BASIC-NEXT: fmul float + // BASIC-NEXT: fmul float + // BASIC-NEXT: fsub float + // BASIC-NEXT: fadd float + + // IMPRVD: fmul float + // IMPRVD-NEXT: fmul float + // IMPRVD-NEXT: fmul float + // IMPRVD-NEXT: fmul float + // IMPRVD-NEXT: fsub float + // IMPRVD-NEXT: fadd float + + // PRMTD: fmul float + // PRMTD-NEXT: fmul float + // PRMTD-NEXT: fmul float + // PRMTD-NEXT: fmul float + // PRMTD-NEXT: fsub float + // PRMTD-NEXT: fadd float return a * b; } @@ -43,11 +54,14 @@ _Complex float pragma_on_mul(_Complex float a, _Complex float b) { _Complex float pragma_off_mul(_Complex float a, _Complex float b) { #pragma STDC CX_LIMITED_RANGE OFF // LABEL: define {{.*}} @pragma_off_mul( + // FULL: call {{.*}} @__mulsc3 - // LMTD: call {{.*}} @__mulsc3 + // BASIC: call {{.*}} @__mulsc3 + + // IMPRVD: call {{.*}} @__mulsc3 - // FRTRN: call {{.*}} @__mulsc3 + // PRMTD: call {{.*}} @__mulsc3 return a * b; } @@ -55,6 +69,7 @@ _Complex float pragma_off_mul(_Complex float a, _Complex float b) { _Complex float pragma_on_div(_Complex float a, _Complex float b) { #pragma STDC CX_LIMITED_RANGE ON // LABEL: define {{.*}} @pragma_on_div( + // FULL: fmul float // FULL-NEXT: fmul float // FULL-NEXT: fadd float @@ -67,29 +82,45 @@ _Complex float pragma_on_div(_Complex float a, _Complex float b) { // FULL-NEXT: fdiv float // FULL: fdiv float - // LMTD: fmul float - // LMTD-NEXT: fmul float - // LMTD-NEXT: fadd float - // LMTD-NEXT: fmul float - // LMTD-NEXT: fmul float - // LMTD-NEXT: fadd float - // LMTD-NEXT: fmul float - // LMTD-NEXT: fmul float - // LMTD-NEXT: fsub float - // LMTD-NEXT: fdiv float - // LMTD-NEXT: fdiv float - - // FRTRN: fmul float - // FRTRN-NEXT: fmul float - // FRTRN-NEXT: fadd float - // FRTRN-NEXT: fmul float - // FRTRN-NEXT: fmul float - // FRTRN-NEXT: fadd float - // FRTRN-NEXT: fmul float - // FRTRN-NEXT: fmul float - // FRTRN-NEXT: fsub float - // FRTRN-NEXT: fdiv float - // FRTRN-NEXT: fdiv float + // BASIC: fmul float + // BASIC-NEXT: fmul float + // BASIC-NEXT: fadd float + // BASIC-NEXT: fmul float + // BASIC-NEXT: fmul float + // BASIC-NEXT: fadd float + // BASIC-NEXT: fmul float + // BASIC-NEXT: fmul float + // BASIC-NEXT: fsub float + // BASIC-NEXT: fdiv float + // BASIC-NEXT: fdiv float + + // IMPRVD: fmul float + // IMPRVD-NEXT: fmul float + // IMPRVD-NEXT: fadd float + // IMPRVD-NEXT: fmul float + // IMPRVD-NEXT: fmul float + // IMPRVD-NEXT: fadd float + // IMPRVD-NEXT: fmul 
float + // IMPRVD-NEXT: fmul float + // IMPRVD-NEXT: fsub float + // IMPRVD-NEXT: fdiv float + // IMPRVD-NEXT: fdiv float + + // PRMTD: fpext float {{.*}} to double + // PRMTD: fpext float {{.*}} to double + // PRMTD: fmul double + // PRMTD: fmul double + // PRMTD: fadd double + // PRMTD: fmul double + // PRMTD: fmul double + // PRMTD: fadd double + // PRMTD: fmul double + // PRMTD: fmul double + // PRMTD: fsub double + // PRMTD: fdiv double + // PRMTD: fdiv double + // PRMTD: fptrunc double + // PRMTD: fptrunc double return a / b; } @@ -97,11 +128,118 @@ _Complex float pragma_on_div(_Complex float a, _Complex float b) { _Complex float pragma_off_div(_Complex float a, _Complex float b) { #pragma STDC CX_LIMITED_RANGE OFF // LABEL: define {{.*}} @pragma_off_div( + // FULL: call {{.*}} @__divsc3 - // LMTD: call {{.*}} @__divsc3 + // BASIC: call {{.*}} @__divsc3 + + // IMPRVD: call {{.*}} @__divsc3 + + // PRMTD: call {{.*}} @__divdc3 + + return a / b; +} + +_Complex float pragma_default_mul(_Complex float a, _Complex float b) { +#pragma STDC CX_LIMITED_RANGE DEFAULT + // LABEL: define {{.*}} @pragma_on_mul( + + // FULL: fmul float + // FULL-NEXT: fmul float + // FULL-NEXT: fmul float + // FULL-NEXT: fmul float + // FULL-NEXT: fsub float + // FULL-NEXT: fadd float + + // BASIC: fmul float + // BASIC-NEXT: fmul float + // BASIC-NEXT: fmul float + // BASIC-NEXT: fmul float + // BASIC-NEXT: fsub float + // BASIC-NEXT: fadd float + + // IMPRVD: fmul float + // IMPRVD-NEXT: fmul float + // IMPRVD-NEXT: fmul float + // IMPRVD-NEXT: fmul float + // IMPRVD-NEXT: fsub float + // IMPRVD-NEXT: fadd float + + // PRMTD: fmul float + // PRMTD-NEXT: fmul float + // PRMTD-NEXT: fmul float + // PRMTD-NEXT: fmul float + // PRMTD-NEXT: fsub float + // PRMTD-NEXT: fadd float + + return a * b; +} +_Complex float pragma_default_div(_Complex float a, _Complex float b) { +#pragma STDC CX_LIMITED_RANGE DEFAULT + // LABEL: define {{.*}} @pragma_on_divx( + + // FULL: call {{.*}} @__divsc3 - // FRTRN: call {{.*}} @__divsc3 + // BASIC: fmul float + // BASIC-NEXT: fmul float + // BASIC-NEXT: fadd float + // BASIC-NEXT: fmul float + // BASIC-NEXT: fmul float + // BASIC-NEXT: fadd float + // BASIC-NEXT: fmul float + // BASIC-NEXT: fmul float + // BASIC-NEXT: fsub float + // BASIC-NEXT: fdiv float + // BASIC-NEXT: fdiv float + + // IMPRVD: call{{.*}}float @llvm.fabs.f32(float {{.*}}) + // IMPRVD-NEXT: call{{.*}}float @llvm.fabs.f32(float {{.*}}) + // IMPRVD-NEXT: fcmp{{.*}}ugt float {{.*}}, {{.*}} + // IMPRVD-NEXT: br i1 {{.*}}, label + // IMPRVD: abs_rhsr_greater_or_equal_abs_rhsi: + // IMPRVD-NEXT: fdiv float + // IMPRVD-NEXT: fmul float + // IMPRVD-NEXT: fadd float + // IMPRVD-NEXT: fmul float + // IMPRVD-NEXT: fadd float + // IMPRVD-NEXT: fdiv float + // IMPRVD-NEXT: fmul float + // IMPRVD-NEXT: fsub float + // IMPRVD-NEXT: fdiv float + // IMPRVD-NEXT: br label + // IMPRVD: abs_rhsr_less_than_abs_rhsi: + // IMPRVD-NEXT: fdiv float + // IMPRVD-NEXT: fmul float + // IMPRVD-NEXT: fadd float + // IMPRVD-NEXT: fmul float + // IMPRVD-NEXT: fadd float + // IMPRVD-NEXT: fdiv float + // IMPRVD-NEXT: fmul float + // IMPRVD-NEXT: fsub float + // IMPRVD-NEXT: fdiv float + + // PRMTD: load float, ptr {{.*}} + // PRMTD: fpext float {{.*}} to double + // PRMTD-NEXT: fpext float {{.*}} to double + // PRMTD-NEXT: getelementptr inbounds { float, float }, ptr {{.*}}, i32 0, i32 0 + // PRMTD-NEXT: load float, ptr {{.*}} + // PRMTD-NEXT: getelementptr inbounds { float, float }, ptr {{.*}}, i32 0, i32 1 + // PRMTD-NEXT: load float, ptr {{.*}} 
+ // PRMTD-NEXT: fpext float {{.*}} to double
+ // PRMTD-NEXT: fpext float {{.*}} to double
+ // PRMTD-NEXT: fmul double
+ // PRMTD-NEXT: fmul double
+ // PRMTD-NEXT: fadd double
+ // PRMTD-NEXT: fmul double
+ // PRMTD-NEXT: fmul double
+ // PRMTD-NEXT: fadd double
+ // PRMTD-NEXT: fmul double
+ // PRMTD-NEXT: fmul double
+ // PRMTD-NEXT: fsub double
+ // PRMTD-NEXT: fdiv double
+ // PRMTD-NEXT: fdiv double
+ // PRMTD-NEXT: fptrunc double {{.*}} to float
+ // PRMTD-NEXT: fptrunc double {{.*}} to float

return a / b;
}
diff --git a/clang/test/CodeGen/smiths-complex-div.c b/clang/test/CodeGen/smiths-complex-div.c
index 75775675c92381..5882f8b3545f9f 100644
--- a/clang/test/CodeGen/smiths-complex-div.c
+++ b/clang/test/CodeGen/smiths-complex-div.c
@@ -1,58 +1,58 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 4
// RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \
-// RUN: -complex-range=fortran -o - | FileCheck %s --check-prefix=FRTRN
+// RUN: -complex-range=improved -o - | FileCheck %s --check-prefix=IMPRVD

-// FRTRN-LABEL: define dso_local <2 x float> @div(
-// FRTRN-SAME: <2 x float> noundef [[A_COERCE:%.*]], <2 x float> noundef [[B_COERCE:%.*]]) #[[ATTR0:[0-9]+]] {
-// FRTRN-NEXT: entry:
-// FRTRN-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4
-// FRTRN-NEXT: [[A:%.*]] = alloca { float, float }, align 4
-// FRTRN-NEXT: [[B:%.*]] = alloca { float, float }, align 4
-// FRTRN-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4
-// FRTRN-NEXT: store <2 x float> [[B_COERCE]], ptr [[B]], align 4
-// FRTRN-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0
-// FRTRN-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4
-// FRTRN-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1
-// FRTRN-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4
-// FRTRN-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0
-// FRTRN-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4
-// FRTRN-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1
-// FRTRN-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4
-// FRTRN-NEXT: [[TMP0:%.*]] = call float @llvm.fabs.f32(float [[B_REAL]])
-// FRTRN-NEXT: [[TMP1:%.*]] = call float @llvm.fabs.f32(float [[B_IMAG]])
-// FRTRN-NEXT: [[ABS_CMP:%.*]] = fcmp ugt float [[TMP0]], [[TMP1]]
-// FRTRN-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]]
-// FRTRN: abs_rhsr_greater_or_equal_abs_rhsi:
-// FRTRN-NEXT: [[TMP2:%.*]] = fdiv float [[B_IMAG]], [[B_REAL]]
-// FRTRN-NEXT: [[TMP3:%.*]] = fmul float [[TMP2]], [[B_IMAG]]
-// FRTRN-NEXT: [[TMP4:%.*]] = fadd float [[B_REAL]], [[TMP3]]
-// FRTRN-NEXT: [[TMP5:%.*]] = fmul float [[A_IMAG]], [[TMP2]]
-// FRTRN-NEXT: [[TMP6:%.*]] = fadd float [[A_REAL]], [[TMP5]]
-// FRTRN-NEXT: [[TMP7:%.*]] = fdiv float [[TMP6]], [[TMP4]]
-// FRTRN-NEXT: [[TMP8:%.*]] = fmul float [[A_REAL]], [[TMP2]]
-// FRTRN-NEXT: [[TMP9:%.*]] = fsub float [[A_IMAG]], [[TMP8]]
-// FRTRN-NEXT: [[TMP10:%.*]] = fdiv float [[TMP9]], [[TMP4]]
-// FRTRN-NEXT: br label [[COMPLEX_DIV:%.*]]
-// FRTRN: abs_rhsr_less_than_abs_rhsi:
-// FRTRN-NEXT: [[TMP11:%.*]] = fdiv float [[B_REAL]], [[B_IMAG]]
-// FRTRN-NEXT: [[TMP12:%.*]] = fmul float [[TMP11]], [[B_REAL]]
-// FRTRN-NEXT: [[TMP13:%.*]] = fadd float [[B_IMAG]], [[TMP12]]
-// FRTRN-NEXT: [[TMP14:%.*]] = fmul float [[A_REAL]], [[TMP11]]
-// FRTRN-NEXT: [[TMP15:%.*]] = fadd float [[TMP14]], [[A_IMAG]]
-// FRTRN-NEXT: [[TMP16:%.*]] = fdiv float [[TMP15]], [[TMP13]]
-// FRTRN-NEXT: [[TMP17:%.*]] = fmul float [[A_IMAG]], [[TMP11]]
-// FRTRN-NEXT: [[TMP18:%.*]] = fsub float [[TMP17]], [[A_REAL]]
-// FRTRN-NEXT: [[TMP19:%.*]] = fdiv float [[TMP18]], [[TMP13]]
-// FRTRN-NEXT: br label [[COMPLEX_DIV]]
-// FRTRN: complex_div:
-// FRTRN-NEXT: [[TMP20:%.*]] = phi float [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ]
-// FRTRN-NEXT: [[TMP21:%.*]] = phi float [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ]
-// FRTRN-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0
-// FRTRN-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1
-// FRTRN-NEXT: store float [[TMP20]], ptr [[RETVAL_REALP]], align 4
-// FRTRN-NEXT: store float [[TMP21]], ptr [[RETVAL_IMAGP]], align 4
-// FRTRN-NEXT: [[TMP22:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4
-// FRTRN-NEXT: ret <2 x float> [[TMP22]]
+// IMPRVD-LABEL: define dso_local <2 x float> @div(
+// IMPRVD-SAME: <2 x float> noundef [[A_COERCE:%.*]], <2 x float> noundef [[B_COERCE:%.*]]) #[[ATTR0:[0-9]+]] {
+// IMPRVD-NEXT: entry:
+// IMPRVD-NEXT: [[RETVAL:%.*]] = alloca { float, float }, align 4
+// IMPRVD-NEXT: [[A:%.*]] = alloca { float, float }, align 4
+// IMPRVD-NEXT: [[B:%.*]] = alloca { float, float }, align 4
+// IMPRVD-NEXT: store <2 x float> [[A_COERCE]], ptr [[A]], align 4
+// IMPRVD-NEXT: store <2 x float> [[B_COERCE]], ptr [[B]], align 4
+// IMPRVD-NEXT: [[A_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 0
+// IMPRVD-NEXT: [[A_REAL:%.*]] = load float, ptr [[A_REALP]], align 4
+// IMPRVD-NEXT: [[A_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[A]], i32 0, i32 1
+// IMPRVD-NEXT: [[A_IMAG:%.*]] = load float, ptr [[A_IMAGP]], align 4
+// IMPRVD-NEXT: [[B_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 0
+// IMPRVD-NEXT: [[B_REAL:%.*]] = load float, ptr [[B_REALP]], align 4
+// IMPRVD-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[B]], i32 0, i32 1
+// IMPRVD-NEXT: [[B_IMAG:%.*]] = load float, ptr [[B_IMAGP]], align 4
+// IMPRVD-NEXT: [[TMP0:%.*]] = call float @llvm.fabs.f32(float [[B_REAL]])
+// IMPRVD-NEXT: [[TMP1:%.*]] = call float @llvm.fabs.f32(float [[B_IMAG]])
+// IMPRVD-NEXT: [[ABS_CMP:%.*]] = fcmp ugt float [[TMP0]], [[TMP1]]
+// IMPRVD-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]]
+// IMPRVD: abs_rhsr_greater_or_equal_abs_rhsi:
+// IMPRVD-NEXT: [[TMP2:%.*]] = fdiv float [[B_IMAG]], [[B_REAL]]
+// IMPRVD-NEXT: [[TMP3:%.*]] = fmul float [[TMP2]], [[B_IMAG]]
+// IMPRVD-NEXT: [[TMP4:%.*]] = fadd float [[B_REAL]], [[TMP3]]
+// IMPRVD-NEXT: [[TMP5:%.*]] = fmul float [[A_IMAG]], [[TMP2]]
+// IMPRVD-NEXT: [[TMP6:%.*]] = fadd float [[A_REAL]], [[TMP5]]
+// IMPRVD-NEXT: [[TMP7:%.*]] = fdiv float [[TMP6]], [[TMP4]]
+// IMPRVD-NEXT: [[TMP8:%.*]] = fmul float [[A_REAL]], [[TMP2]]
+// IMPRVD-NEXT: [[TMP9:%.*]] = fsub float [[A_IMAG]], [[TMP8]]
+// IMPRVD-NEXT: [[TMP10:%.*]] = fdiv float [[TMP9]], [[TMP4]]
+// IMPRVD-NEXT: br label [[COMPLEX_DIV:%.*]]
+// IMPRVD: abs_rhsr_less_than_abs_rhsi:
+// IMPRVD-NEXT: [[TMP11:%.*]] = fdiv float [[B_REAL]], [[B_IMAG]]
+// IMPRVD-NEXT: [[TMP12:%.*]] = fmul float [[TMP11]], [[B_REAL]]
+// IMPRVD-NEXT: [[TMP13:%.*]] = fadd float [[B_IMAG]], [[TMP12]]
+// IMPRVD-NEXT: [[TMP14:%.*]] = fmul float [[A_REAL]], [[TMP11]]
+// IMPRVD-NEXT: [[TMP15:%.*]] = fadd float [[TMP14]], [[A_IMAG]]
+// IMPRVD-NEXT: [[TMP16:%.*]] = fdiv float [[TMP15]], [[TMP13]]
+// IMPRVD-NEXT: [[TMP17:%.*]] = fmul float [[A_IMAG]], [[TMP11]]
+// IMPRVD-NEXT: [[TMP18:%.*]] = fsub float [[TMP17]], [[A_REAL]]
+// IMPRVD-NEXT: [[TMP19:%.*]] = fdiv float [[TMP18]], [[TMP13]]
+// IMPRVD-NEXT: br label [[COMPLEX_DIV]]
+// IMPRVD: complex_div:
+// IMPRVD-NEXT: [[TMP20:%.*]] = phi float [ [[TMP7]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP16]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ]
+// IMPRVD-NEXT: [[TMP21:%.*]] = phi float [ [[TMP10]], [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI]] ], [ [[TMP19]], [[ABS_RHSR_LESS_THAN_ABS_RHSI]] ]
+// IMPRVD-NEXT: [[RETVAL_REALP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 0
+// IMPRVD-NEXT: [[RETVAL_IMAGP:%.*]] = getelementptr inbounds { float, float }, ptr [[RETVAL]], i32 0, i32 1
+// IMPRVD-NEXT: store float [[TMP20]], ptr [[RETVAL_REALP]], align 4
+// IMPRVD-NEXT: store float [[TMP21]], ptr [[RETVAL_IMAGP]], align 4
+// IMPRVD-NEXT: [[TMP22:%.*]] = load <2 x float>, ptr [[RETVAL]], align 4
+// IMPRVD-NEXT: ret <2 x float> [[TMP22]]
//
_Complex float div(_Complex float a, _Complex float b) {
return a / b;
diff --git a/clang/test/Driver/range.c b/clang/test/Driver/range.c
index 2d1fd7f9f1a9d5..da5748d7c723c1 100644
--- a/clang/test/Driver/range.c
+++ b/clang/test/Driver/range.c
@@ -1,16 +1,37 @@
// Test range options for complex multiplication and division.

// RUN: %clang -### -target x86_64 -fcx-limited-range -c %s 2>&1 \
-// RUN: | FileCheck --check-prefix=LMTD %s
+// RUN: | FileCheck --check-prefix=BASIC %s

// RUN: %clang -### -target x86_64 -fno-cx-limited-range -c %s 2>&1 \
-// RUN: | FileCheck %s
+// RUN: | FileCheck --check-prefix=FULL %s
+
+// RUN: %clang -### -target x86_64 -fcx-limited-range -fcx-fortran-rules \
+// RUN: -c %s 2>&1 | FileCheck --check-prefix=WARN1 %s
+
+// RUN: %clang -### -target x86_64 -fno-cx-limited-range -fcx-fortran-rules \
+// RUN: -c %s 2>&1 | FileCheck --check-prefix=WARN2 %s

// RUN: %clang -### -target x86_64 -fcx-limited-range -fno-cx-limited-range \
// RUN: -c %s 2>&1 | FileCheck --check-prefix=FULL %s

+// RUN: %clang -### -target x86_64 -fno-cx-limited-range -fcx-limited-range \
+// RUN: -c %s 2>&1 | FileCheck --check-prefix=BASIC %s
+
+// RUN: %clang -### -target x86_64 -fno-cx-limited-range -fno-cx-fortran-rules \
+// RUN: -c %s 2>&1 | FileCheck --check-prefix=FULL %s
+
+// RUN: %clang -### -target x86_64 -fno-cx-fortran-rules -fno-cx-limited-range \
+// RUN: -c %s 2>&1 | FileCheck --check-prefix=FULL %s
+
+// RUN: %clang -### -target x86_64 -fcx-limited-range -fno-cx-fortran-rules \
+// RUN: -c %s 2>&1 | FileCheck --check-prefix=WARN4 %s
+
// RUN: %clang -### -target x86_64 -fcx-fortran-rules -c %s 2>&1 \
-// RUN: | FileCheck --check-prefix=FRTRN %s
+// RUN: | FileCheck --check-prefix=IMPRVD %s
+
+// RUN: %clang -### -target x86_64 -fno-cx-fortran-rules -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=FULL %s

// RUN: %clang -### -target x86_64 -fcx-fortran-rules -c %s 2>&1 \
// RUN: -fno-cx-fortran-rules | FileCheck --check-prefix=FULL %s
@@ -32,34 +53,148 @@
// RUN: %clang -### -target x86_64 -fcx-fortran-rules \
// RUN: -fcx-limited-range -c %s 2>&1 \
-// RUN: | FileCheck --check-prefix=WARN2 %s
+// RUN: | FileCheck --check-prefix=WARN20 %s

// RUN: %clang -### -target x86_64 -ffast-math -c %s 2>&1 \
-// RUN: | FileCheck --check-prefix=LMTD %s
+// RUN: | FileCheck --check-prefix=BASIC %s

// RUN: %clang -### -target x86_64 -ffast-math -fcx-limited-range -c %s 2>&1 \
-// RUN: | FileCheck --check-prefix=LMTD %s
+// RUN: | FileCheck --check-prefix=BASIC %s

// RUN: %clang -### -target x86_64 -fcx-limited-range -ffast-math -c %s 2>&1 \
-// RUN: | FileCheck --check-prefix=LMTD %s
+// RUN: | FileCheck --check-prefix=BASIC %s

// RUN: %clang -### -target x86_64 -ffast-math -fno-cx-limited-range \
// RUN: -c %s 2>&1 | FileCheck --check-prefix=FULL %s

+// RUN: not %clang -### -target x86_64 -fcomplex-arithmetic=foo -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=ERR %s
+
+// RUN: %clang -### -target x86_64 -fcomplex-arithmetic=basic -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=BASIC %s
+
+// RUN: %clang -### -target x86_64 -fcomplex-arithmetic=improved -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=IMPRVD %s
+
+// RUN: %clang -### -target x86_64 -fcomplex-arithmetic=promoted -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=PRMTD %s
+
+// RUN: %clang -### -target x86_64 -fcomplex-arithmetic=full -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=FULL %s
+
+// RUN: %clang -### -target x86_64 -fcomplex-arithmetic=basic \
+// RUN: -fcx-limited-range -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=BASIC %s
+
+// RUN: %clang -### -target x86_64 -fcomplex-arithmetic=basic \
+// RUN: -fcomplex-arithmetic=improved -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=IMPRVD %s
+
+// RUN: %clang -### -target x86_64 -fcx-limited-range \
+// RUN: -fcomplex-arithmetic=improved -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=WARN6 %s
+
+// RUN: %clang -### -target x86_64 -fcx-fortran-rules \
+// RUN: -fcomplex-arithmetic=basic -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=WARN7 %s
+
+// RUN: %clang -### -target x86_64 -fcomplex-arithmetic=basic \
+// RUN: -fcomplex-arithmetic=full -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=FULL %s
+
+// RUN: %clang -### -target x86_64 -fcomplex-arithmetic=basic \
+// RUN: -fcomplex-arithmetic=promoted -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=PRMTD %s
+
+// RUN: %clang -### -target x86_64 -fcomplex-arithmetic=improved \
+// RUN: -fcomplex-arithmetic=basic -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=BASIC %s
+
+// RUN: %clang -### -target x86_64 -fcomplex-arithmetic=improved \
+// RUN: -fcomplex-arithmetic=full -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=FULL %s
+
+// RUN: %clang -### -target x86_64 -fcomplex-arithmetic=improved \
+// RUN: -fcomplex-arithmetic=promoted -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=PRMTD %s
+
+
+// RUN: %clang -### -target x86_64 -fcomplex-arithmetic=promoted \
+// RUN: -fcomplex-arithmetic=basic -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=BASIC %s
+
+// RUN: %clang -### -target x86_64 -fcomplex-arithmetic=promoted \
+// RUN: -fcx-limited-range -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=WARN14 %s
+
+// RUN: %clang -### -target x86_64 -fcomplex-arithmetic=promoted \
+// RUN: -fcomplex-arithmetic=improved -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=IMPRVD %s
+
+// RUN: %clang -### -target x86_64 -fcomplex-arithmetic=promoted \
+// RUN: -fcomplex-arithmetic=full -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=FULL %s
+
+// RUN: %clang -### -target x86_64 -fcomplex-arithmetic=full \
+// RUN: -fcomplex-arithmetic=basic -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=BASIC %s
+
+// RUN: %clang -### -target x86_64 -fcomplex-arithmetic=full \
+// RUN: -ffast-math -c %s 2>&1 | FileCheck --check-prefix=WARN17 %s
+
+// RUN: %clang -### -target x86_64 -fcomplex-arithmetic=full \
+// RUN: -fcomplex-arithmetic=improved -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=IMPRVD %s
+
+// RUN: %clang -### -target x86_64 -fcomplex-arithmetic=full \
+// RUN: -fcomplex-arithmetic=promoted -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=PRMTD %s
+
+// RUN: %clang -### -target x86_64 -ffast-math -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=BASIC %s
+
+// RUN: %clang -### -target x86_64 -ffast-math -fcx-limited-range -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=BASIC %s
+
+// RUN: %clang -### -target x86_64 -fcx-limited-range -ffast-math -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=BASIC %s
+
+// RUN: %clang -### -target x86_64 -ffast-math -fno-cx-limited-range -c %s \
+// RUN: 2>&1 | FileCheck --check-prefix=FULL %s
+
+// RUN: %clang -### -target x86_64 -ffast-math -fcomplex-arithmetic=basic -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=BASIC %s
+
+// RUN: %clang -### -target x86_64 -fcomplex-arithmetic=basic -ffast-math -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=BASIC %s
+
// RUN: %clang -### -Werror -target x86_64 -fcx-limited-range -c %s 2>&1 \
-// RUN: | FileCheck --check-prefix=LMTD %s
+// RUN: | FileCheck --check-prefix=BASIC %s
+
+// RUN: %clang -### -target x86_64 -ffast-math -fcomplex-arithmetic=full -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=FULL %s

-// RUN: %clang -### -Werror -target x86_64 -fcx-fortran-rules -c %s 2>&1 \
-// RUN: | FileCheck --check-prefix=FRTRN %s
+// RUN: %clang -### -target x86_64 -ffast-math -fcomplex-arithmetic=basic -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=BASIC %s

-// LMTD: -complex-range=limited
+// BASIC: -complex-range=basic
// FULL: -complex-range=full
-// LMTD-NOT: -complex-range=fortran
-// CHECK-NOT: -complex-range=limited
-// FRTRN: -complex-range=fortran
-// FRTRN-NOT: -complex-range=limited
-// CHECK-NOT: -complex-range=fortran
+// PRMTD: -complex-range=promoted
+// BASIC-NOT: -complex-range=improved
+// CHECK-NOT: -complex-range=basic
+// IMPRVD: -complex-range=improved
+// IMPRVD-NOT: -complex-range=basic
+// CHECK-NOT: -complex-range=improved
+
// WARN1: warning: overriding '-fcx-limited-range' option with '-fcx-fortran-rules' [-Woverriding-option]
-// WARN2: warning: overriding '-fcx-fortran-rules' option with '-fcx-limited-range' [-Woverriding-option]
+// WARN2: warning: overriding '-fno-cx-limited-range' option with '-fcx-fortran-rules' [-Woverriding-option]
// WARN3: warning: overriding '-fcx-fortran-rules' option with '-fno-cx-limited-range' [-Woverriding-option]
// WARN4: warning: overriding '-fcx-limited-range' option with '-fno-cx-fortran-rules' [-Woverriding-option]
+// WARN5: warning: overriding '-fcomplex-arithmetic=basic' option with '-fcomplex-arithmetic=improved' [-Woverriding-option]
+// WARN6: warning: overriding '-fcx-limited-range' option with '-fcomplex-arithmetic=improved' [-Woverriding-option]
+// WARN7: warning: overriding '-fcx-fortran-rules' option with '-fcomplex-arithmetic=basic' [-Woverriding-option]
+// WARN14: overriding '-complex-range=promoted' option with '-fcx-limited-range' [-Woverriding-option]
+// WARN17: warning: overriding '-fcomplex-arithmetic=full' option with '-fcomplex-arithmetic=basic' [-Woverriding-option]
+// WARN20: warning: overriding '-fcx-fortran-rules' option with '-fcx-limited-range' [-Woverriding-option]
+
+// ERR: error: unsupported argument 'foo' to option '-fcomplex-arithmetic='
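
As background for the IMPRVD checks above: they verify the lowering of Smith's algorithm (SMITH, R. L. Algorithm 116, Commun. ACM 5, 8, 1962), which rescales the division by the larger-magnitude component of the divisor to reduce intermediate overflow. The following C sketch is illustrative only; the names `cfloat` and `smith_div` are hypothetical and appear nowhere in the patch, which emits the equivalent LLVM IR directly under `-fcomplex-arithmetic=improved` (cc1: `-complex-range=improved`).

/* Illustrative sketch (not part of the patch): Smith's algorithm for
 * (a + b*i) / (c + d*i). The bracketed comments name the temporaries
 * checked in clang/test/CodeGen/smiths-complex-div.c above. */
#include <math.h>

typedef struct { float re, im; } cfloat;   /* hypothetical helper type */

static cfloat smith_div(cfloat lhs, cfloat rhs) {
  float a = lhs.re, b = lhs.im, c = rhs.re, d = rhs.im;
  cfloat q;
  if (fabsf(c) >= fabsf(d)) {   /* abs_rhsr_greater_or_equal_abs_rhsi */
    float r = d / c;            /* [[TMP2]]  */
    float den = c + r * d;      /* [[TMP4]]  */
    q.re = (a + b * r) / den;   /* [[TMP7]]  */
    q.im = (b - a * r) / den;   /* [[TMP10]] */
  } else {                      /* abs_rhsr_less_than_abs_rhsi */
    float r = c / d;            /* [[TMP11]] */
    float den = d + r * c;      /* [[TMP13]] */
    q.re = (a * r + b) / den;   /* [[TMP16]] */
    q.im = (b * r - a) / den;   /* [[TMP19]] */
  }
  return q;
}

Under `-fcomplex-arithmetic=promoted`, by contrast, the same `_Complex float` division is performed with the naive algebraic formula at double precision and truncated back to float, which is exactly the fpext/fdiv/fptrunc sequence the PRMTD checks in pragma-cx-limited-range.c verify.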