[AArch64][VecLib] Add libmvec support for AArch64 targets #143696
Change-Id: I07cd07932c0cb94782de8cf0d25c4729a48e695b
Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In that case, you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.
@llvm/pr-subscribers-llvm-analysis
@llvm/pr-subscribers-backend-aarch64

Author: Mary Kassayova (marykass-arm)

Changes

This patch adds support for the libmvec vector library on AArch64 targets. Previously, the driver accepted -fveclib=libmvec only for x86 and x86_64 targets.

Patch is 219.60 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/143696.diff

11 Files Affected:
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index b5e6cf088a4b1..11c23064ab604 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -357,6 +357,8 @@ Modified Compiler Flags
 - The ``-fchar8_t`` flag is no longer considered in non-C++ languages modes. (#GH55373)
+- The ``-fveclib=libmvec`` option now supports AArch64 targets (requires GLIBC 2.40 or newer).
+
Removed Compiler Flags
-------------------------
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 152df89118a6a..b886b75fa4fa9 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3473,8 +3473,9 @@ def fveclib : Joined<["-"], "fveclib=">, Group<f_Group>,
Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>,
HelpText<"Use the given vector functions library">,
HelpTextForVariants<[ClangOption, CC1Option],
- "Use the given vector functions library. "
- "Note: -fveclib={ArmPL,SLEEF} implies -fno-math-errno">,
+ "Use the given vector functions library.\n"
+ " Note: -fveclib={ArmPL,SLEEF,libmvec} implies -fno-math-errno.\n"
+ " Note: -fveclib=libmvec on AArch64 requires GLIBC 2.40 or newer.">,
Values<"Accelerate,libmvec,MASSV,SVML,SLEEF,Darwin_libsystem_m,ArmPL,AMDLIBM,none">,
NormalizedValuesScope<"llvm::driver::VectorLibrary">,
NormalizedValues<["Accelerate", "LIBMVEC", "MASSV", "SVML", "SLEEF",
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index a74fa81f3cf5b..fdc023d193aa9 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -5683,11 +5683,18 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
Triple.getArch() != llvm::Triple::x86_64)
D.Diag(diag::err_drv_unsupported_opt_for_target)
<< Name << Triple.getArchName();
- } else if (Name == "libmvec" || Name == "AMDLIBM") {
+ } else if (Name == "AMDLIBM") {
if (Triple.getArch() != llvm::Triple::x86 &&
Triple.getArch() != llvm::Triple::x86_64)
D.Diag(diag::err_drv_unsupported_opt_for_target)
<< Name << Triple.getArchName();
+ } else if (Name == "libmvec") {
+ if (Triple.getArch() != llvm::Triple::x86 &&
+ Triple.getArch() != llvm::Triple::x86_64 &&
+ Triple.getArch() != llvm::Triple::aarch64 &&
+ Triple.getArch() != llvm::Triple::aarch64_be)
+ D.Diag(diag::err_drv_unsupported_opt_for_target)
+ << Name << Triple.getArchName();
} else if (Name == "SLEEF" || Name == "ArmPL") {
if (Triple.getArch() != llvm::Triple::aarch64 &&
Triple.getArch() != llvm::Triple::aarch64_be &&
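Review note: the effect of the Clang.cpp hunk above can be summarised as a pair of predicates. The following is only a minimal sketch, assuming a hypothetical `Arch` enum in place of `llvm::Triple::ArchType` and hypothetical helper names; it is not the driver code itself.

```cpp
// Hypothetical stand-in for llvm::Triple::ArchType; only the values
// needed for this illustration are listed.
enum class Arch { x86, x86_64, aarch64, aarch64_be, riscv64 };

// Mirrors the new libmvec condition in the hunk above: the option is
// accepted on x86, x86_64, aarch64 and aarch64_be, and rejected elsewhere.
constexpr bool isLibmvecSupported(Arch A) {
  return A == Arch::x86 || A == Arch::x86_64 || A == Arch::aarch64 ||
         A == Arch::aarch64_be;
}

// AMDLIBM keeps the x86-only rule it previously shared with libmvec.
constexpr bool isAMDLIBMSupported(Arch A) {
  return A == Arch::x86 || A == Arch::x86_64;
}
```

Splitting the old combined branch this way keeps AMDLIBM's behaviour unchanged while widening only the libmvec case.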
diff --git a/clang/test/Driver/fveclib.c b/clang/test/Driver/fveclib.c
index 5420555c36a2a..c57e9aa7a3cc2 100644
--- a/clang/test/Driver/fveclib.c
+++ b/clang/test/Driver/fveclib.c
@@ -1,6 +1,7 @@
// RUN: %clang -### -c -fveclib=none %s 2>&1 | FileCheck --check-prefix=CHECK-NOLIB %s
// RUN: %clang -### -c -fveclib=Accelerate %s 2>&1 | FileCheck --check-prefix=CHECK-ACCELERATE %s
// RUN: %clang -### -c --target=x86_64-unknown-linux-gnu -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-libmvec %s
+// RUN: %clang -### -c --target=aarch64-linux-gnu -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-LIBMVEC-AARCH64 %s
// RUN: %clang -### -c --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM %s 2>&1 | FileCheck --check-prefix=CHECK-AMDLIBM %s
// RUN: %clang -### -c -fveclib=MASSV %s 2>&1 | FileCheck --check-prefix=CHECK-MASSV %s
// RUN: %clang -### -c -fveclib=Darwin_libsystem_m %s 2>&1 | FileCheck --check-prefix=CHECK-DARWIN_LIBSYSTEM_M %s
@@ -12,6 +13,7 @@
// CHECK-NOLIB: "-fveclib=none"
// CHECK-ACCELERATE: "-fveclib=Accelerate"
// CHECK-libmvec: "-fveclib=libmvec"
+// CHECK-LIBMVEC-AARCH64: "-fveclib=libmvec"
// CHECK-AMDLIBM: "-fveclib=AMDLIBM"
// CHECK-MASSV: "-fveclib=MASSV"
// CHECK-DARWIN_LIBSYSTEM_M: "-fveclib=Darwin_libsystem_m"
@@ -23,7 +25,6 @@
// RUN: not %clang --target=x86 -c -fveclib=SLEEF %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s
// RUN: not %clang --target=x86 -c -fveclib=ArmPL %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s
-// RUN: not %clang --target=aarch64 -c -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s
// RUN: not %clang --target=aarch64 -c -fveclib=SVML %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s
// RUN: not %clang --target=aarch64 -c -fveclib=AMDLIBM %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s
// CHECK-ERROR: unsupported option {{.*}} for target
@@ -43,6 +44,9 @@
// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=libmvec -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s
// CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC"
+// RUN: %clang -### --target=aarch64-linux-gnu -fveclib=libmvec -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC-AARCH64 %s
+// CHECK-LTO-LIBMVEC-AARCH64: "-plugin-opt=-vector-library=LIBMVEC"
+
// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-AMDLIBM %s
// CHECK-LTO-AMDLIBM: "-plugin-opt=-vector-library=AMDLIBM"
@@ -68,6 +72,10 @@
// CHECK-ERRNO-LIBMVEC: "-fveclib=libmvec"
// CHECK-ERRNO-LIBMVEC-SAME: "-fmath-errno"
+// RUN: %clang -### --target=aarch64-linux-gnu -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-ERRNO-LIBMVEC-AARCH64 %s
+// CHECK-ERRNO-LIBMVEC-AARCH64: "-fveclib=libmvec"
+// CHECK-ERRNO-LIBMVEC-AARCH64-SAME: "-fmath-errno"
+
// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM %s 2>&1 | FileCheck --check-prefix=CHECK-ERRNO-AMDLIBM %s
// CHECK-ERRNO-AMDLIBM: "-fveclib=AMDLIBM"
// CHECK-ERRNO-AMDLIBM-SAME: "-fmath-errno"
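Review note: for context, the kind of source loop this option targets might look like the sketch below. The function name `cosArray` is hypothetical. Built with something like `clang++ -O2 --target=aarch64-linux-gnu -fveclib=libmvec`, the loop vectorizer may replace the scalar call with a vector variant from the table in VecFuncs.def; whether it actually does depends on the cost model and the math-errno setting, so treat this as an illustration and not a guarantee.

```cpp
#include <cmath>
#include <cstddef>

// Element-wise cosine over a float array. The scalar call in the loop
// body is the candidate for replacement by a vector library function.
void cosArray(float *Out, const float *In, std::size_t N) {
  for (std::size_t I = 0; I < N; ++I)
    Out[I] = std::cos(In[I]); // float overload of std::cos
}
```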
diff --git a/llvm/include/llvm/Analysis/VecFuncs.def b/llvm/include/llvm/Analysis/VecFuncs.def
index 68753a2497db2..cb8e6755a486b 100644
--- a/llvm/include/llvm/Analysis/VecFuncs.def
+++ b/llvm/include/llvm/Analysis/VecFuncs.def
@@ -237,6 +237,305 @@ TLI_DEFINE_VECFUNC("llvm.log.f64", "_ZGVdN4v_log", FIXED(4), "_ZGV_LLVM_N4v")
TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVbN4v_logf", FIXED(4), "_ZGV_LLVM_N4v")
TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVdN8v_logf", FIXED(8), "_ZGV_LLVM_N8v")
+#elif defined(TLI_DEFINE_LIBMVEC_AARCH64_VF2_VECFUNCS)
+
+TLI_DEFINE_VECFUNC("acos", "_ZGVnN2v_acos", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("acosf", "_ZGVnN2v_acosf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.acos.f64", "_ZGVnN2v_acos", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.acos.f32", "_ZGVnN2v_acosf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("acosh", "_ZGVnN2v_acosh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("acoshf", "_ZGVnN2v_acoshf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("asin", "_ZGVnN2v_asin", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("asinf", "_ZGVnN2v_asinf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.asin.f64", "_ZGVnN2v_asin", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.asin.f32", "_ZGVnN2v_asinf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("asinh", "_ZGVnN2v_asinh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("asinhf", "_ZGVnN2v_asinhf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("atan", "_ZGVnN2v_atan", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("atanf", "_ZGVnN2v_atanf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.atan.f64", "_ZGVnN2v_atan", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.atan.f32", "_ZGVnN2v_atanf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("atan2", "_ZGVnN2vv_atan2", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("atan2f", "_ZGVnN2vv_atan2f", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f64", "_ZGVnN2vv_atan2", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f32", "_ZGVnN2vv_atan2f", "_ZGV_LLVM_N2vv")
+
+TLI_DEFINE_VECFUNC("atanh", "_ZGVnN2v_atanh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("atanhf", "_ZGVnN2v_atanhf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("cbrt", "_ZGVnN2v_cbrt", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("cbrtf", "_ZGVnN2v_cbrtf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("cos", "_ZGVnN2v_cos", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("cosf", "_ZGVnN2v_cosf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVnN2v_cos", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVnN2v_cosf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("cosh", "_ZGVnN2v_cosh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("coshf", "_ZGVnN2v_coshf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.cosh.f64", "_ZGVnN2v_cosh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.cosh.f32", "_ZGVnN2v_coshf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("erf", "_ZGVnN2v_erf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("erff", "_ZGVnN2v_erff", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("erfc", "_ZGVnN2v_erfc", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("erfcf", "_ZGVnN2v_erfcf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("exp", "_ZGVnN2v_exp", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("expf", "_ZGVnN2v_expf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp.f64", "_ZGVnN2v_exp", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp.f32", "_ZGVnN2v_expf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("exp10", "_ZGVnN2v_exp10", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("exp10f", "_ZGVnN2v_exp10f", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp10.f64", "_ZGVnN2v_exp10", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp10.f32", "_ZGVnN2v_exp10f", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("exp2", "_ZGVnN2v_exp2", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("exp2f", "_ZGVnN2v_exp2f", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp2.f64", "_ZGVnN2v_exp2", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp2.f32", "_ZGVnN2v_exp2f", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("expm1", "_ZGVnN2v_expm1", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("expm1f", "_ZGVnN2v_expm1f", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("hypot", "_ZGVnN2vv_hypot", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("hypotf", "_ZGVnN2vv_hypotf", "_ZGV_LLVM_N2vv")
+
+TLI_DEFINE_VECFUNC("log", "_ZGVnN2v_log", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("logf", "_ZGVnN2v_logf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log.f64", "_ZGVnN2v_log", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVnN2v_logf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("log10", "_ZGVnN2v_log10", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("log10f", "_ZGVnN2v_log10f", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log10.f64", "_ZGVnN2v_log10", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log10.f32", "_ZGVnN2v_log10f", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("log1p", "_ZGVnN2v_log1p", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("log1pf", "_ZGVnN2v_log1pf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("log2", "_ZGVnN2v_log2", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("log2f", "_ZGVnN2v_log2f", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log2.f64", "_ZGVnN2v_log2", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log2.f32", "_ZGVnN2v_log2f", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("pow", "_ZGVnN2vv_pow", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("powf", "_ZGVnN2vv_powf", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.pow.f64", "_ZGVnN2vv_pow", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.pow.f32", "_ZGVnN2vv_powf", "_ZGV_LLVM_N2vv")
+
+TLI_DEFINE_VECFUNC("sin", "_ZGVnN2v_sin", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("sinf", "_ZGVnN2v_sinf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.sin.f64", "_ZGVnN2v_sin", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.sin.f32", "_ZGVnN2v_sinf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("sinh", "_ZGVnN2v_sinh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("sinhf", "_ZGVnN2v_sinhf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.sinh.f64", "_ZGVnN2v_sinh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.sinh.f32", "_ZGVnN2v_sinhf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("tan", "_ZGVnN2v_tan", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("tanf", "_ZGVnN2v_tanf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.tan.f64", "_ZGVnN2v_tan", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.tan.f32", "_ZGVnN2v_tanf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("tanh", "_ZGVnN2v_tanh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("tanhf", "_ZGVnN2v_tanhf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.tanh.f64", "_ZGVnN2v_tanh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.tanh.f32", "_ZGVnN2v_tanhf", "_ZGV_LLVM_N2v")
+
+#elif defined(TLI_DEFINE_LIBMVEC_AARCH64_VF4_VECFUNCS)
+
+TLI_DEFINE_VECFUNC("acosf", "_ZGVnN4v_acosf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.acos.f32", "_ZGVnN4v_acosf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("acoshf", "_ZGVnN4v_acoshf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("asinf", "_ZGVnN4v_asinf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.asin.f32", "_ZGVnN4v_asinf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("asinhf", "_ZGVnN4v_asinhf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("atanf", "_ZGVnN4v_atanf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.atan.f32", "_ZGVnN4v_atanf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("atan2f", "_ZGVnN4vv_atan2f", "_ZGV_LLVM_N4vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f32", "_ZGVnN4vv_atan2f", "_ZGV_LLVM_N4vv")
+
+TLI_DEFINE_VECFUNC("atanhf", "_ZGVnN4v_atanhf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("cbrtf", "_ZGVnN4v_cbrtf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("cosf", "_ZGVnN4v_cosf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVnN4v_cosf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("coshf", "_ZGVnN4v_coshf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.cosh.f32", "_ZGVnN4v_coshf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("erff", "_ZGVnN4v_erff", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("erfcf", "_ZGVnN4v_erfcf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("expf", "_ZGVnN4v_expf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.exp.f32", "_ZGVnN4v_expf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("exp10f", "_ZGVnN4v_exp10f", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.exp10.f32", "_ZGVnN4v_exp10f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("exp2f", "_ZGVnN4v_exp2f", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.exp2.f32", "_ZGVnN4v_exp2f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("expm1f", "_ZGVnN4v_expm1f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("hypotf", "_ZGVnN4vv_hypotf", "_ZGV_LLVM_N4vv")
+
+TLI_DEFINE_VECFUNC("logf", "_ZGVnN4v_logf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVnN4v_logf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("log10f", "_ZGVnN4v_log10f", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.log10.f32", "_ZGVnN4v_log10f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("log1pf", "_ZGVnN4v_log1pf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("log2f", "_ZGVnN4v_log2f", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.log2.f32", "_ZGVnN4v_log2f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("powf", "_ZGVnN4vv_powf", "_ZGV_LLVM_N4vv")
+TLI_DEFINE_VECFUNC("llvm.pow.f32", "_ZGVnN4vv_powf", "_ZGV_LLVM_N4vv")
+
+TLI_DEFINE_VECFUNC("sinf", "_ZGVnN4v_sinf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.sin.f32", "_ZGVnN4v_sinf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("sinhf", "_ZGVnN4v_sinhf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.sinh.f32", "_ZGVnN4v_sinhf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("tanf", "_ZGVnN4v_tanf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.tan.f32", "_ZGVnN4v_tanf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("tanhf", "_ZGVnN4v_tanhf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.tanh.f32", "_ZGVnN4v_tanhf", "_ZGV_LLVM_N4v")
+
+#elif defined(TLI_DEFINE_LIBMVEC_AARCH64_SCALABLE_VECFUNCS)
+
+TLI_DEFINE_VECFUNC("acos", "_ZGVsMxv_acos", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("acosf", "_ZGVsMxv_acosf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.acos.f64", "_ZGVsMxv_acos", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.acos.f32", "_ZGVsMxv_acosf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("acosh", "_ZGVsMxv_acosh", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("acoshf", "_ZGVsMxv_acoshf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("asin", "_ZGVsMxv_asin", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("asinf", "_ZGVsMxv_asinf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.asin.f64", "_ZGVsMxv_asin", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.asin.f32", "_ZGVsMxv_asinf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("asinh", "_ZGVsMxv_asinh", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("asinhf", "_ZGVsMxv_asinhf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("atan", "_ZGVsMxv_atan", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("atanf", "_ZGVsMxv_atanf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.atan.f64", "_ZGVsMxv_atan", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.atan.f32", "_ZGVsMxv_atanf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("atan2", "_ZGVsMxvv_atan2", SCALABLE(2), MASKED, "_ZGVsMxvv")
+TLI_DEFINE_VECFUNC("atan2f", "_ZGVsMxvv_atan2f", SCALABLE(4), MASKED, "_ZGVsMxvv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f64", "_ZGVsMxvv_atan2", SCALABLE(2), MASKED, "_ZGVsMxvv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f32", "_ZGVsMxvv_atan2f", SCALABLE(4), MASKED, "_ZGVsMxvv")
+
+TLI_DEFINE_VECFUNC("atanh", "_ZGVsMxv_atanh", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("atanhf", "_ZGVsMxv_atanhf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("cbrt", "_ZGVsMxv_cbrt", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("cbrtf", "_ZGVsMxv_cbrtf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("cos", "_ZGVsMxv_cos", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("cosf", "_ZGVsMxv_cosf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVsMxv_cos", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVsMxv_cosf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("cosh", "_ZGVsMxv_cosh", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("coshf", "_ZGVsMxv_coshf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cosh.f64", "_ZGVsMxv_cosh", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cosh.f32", "_ZGVsMxv_coshf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("erf", "_ZGVsMxv_erf", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("erff", "_ZGVsMxv_erff", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("erfc", "_ZGVsMxv_erfc", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("erfcf", "_ZGVsMxv_erfcf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("exp", "_ZGVsMxv_exp", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("expf", "_ZGVsMxv_expf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp.f64", "_ZGVsMxv_exp", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp.f32", "_ZGVsMxv_expf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("exp10", "_ZGVsMxv_exp10", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("exp10f", "_ZGVsMxv_exp10f", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp10.f64", "_ZGVsMxv_exp10", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp10.f32", "_ZGVsMxv_exp10f", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("exp2", "_ZGVsMxv_exp2", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("exp2f", "_ZGVsMxv_exp2f", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp2.f64", "_ZGVsMxv_exp2", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp2.f32", "_ZGVsMxv_exp2f", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("expm1", "_ZGVsMxv_expm1", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("expm1f", "_ZGVsMxv_expm1f", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("hypot", "_ZGVsMxvv_hypot", SCALABLE(2), MASKED, "_ZGVsMxvv")
+TLI_DEFINE_VECFUNC("hypotf", "_ZGVsMxvv_hypotf", SCALABLE(4), MASKED, "_ZGVsMxvv")
+
+TLI_DEFINE_VECFUNC("log", "_ZGVsMxv_log", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("logf", "_ZGVsMxv_logf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.log.f64", "_ZGVsMxv_log", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVsMxv_logf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("log10", "_ZGVsMxv_log10", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("log10f", "_ZGVsMxv_log10f", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.log10.f64", "_ZGVsMxv_log10", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.log10.f32", "_ZGVsMxv_log10f", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("log1p", "_ZGVsMxv_log1p", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("log1pf", "_ZGVsMxv_log1pf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("log2", "_ZGVsMxv_log2", SCALABLE(2), MASKED, "_ZGVsMxv")...
[truncated]
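Review note: the vector names in the table follow the vector function ABI pattern `_ZGV<isa><mask><vlen><params>_<scalar>`. A minimal sketch of that pattern is given below, assuming a hypothetical helper `mangleVectorName`; it covers only the pieces that appear in this diff ('n' for Advanced SIMD, 's' for SVE, 'N' or 'M' for unmasked or masked, a lane count or 'x' for scalable, one 'v' per vector parameter). It can be handy for checking that a scalar name and its vector mapping agree, for example that a float scalar such as `cosf` pairs with a name ending in `_cosf`.

```cpp
#include <string>

// Builds a vector-function ABI name of the form
//   _ZGV<isa><mask><vlen><params>_<scalar>
// Isa:     'n' = Advanced SIMD, 's' = SVE (the two AArch64 cases used here).
// Masked:  'M' if the function takes a mask/predicate, 'N' otherwise.
// Vlen:    lane count as text, or "x" for a scalable vector length.
// NumArgs: number of vector parameters; each contributes one 'v'.
std::string mangleVectorName(char Isa, bool Masked, const std::string &Vlen,
                             unsigned NumArgs, const std::string &Scalar) {
  std::string Name = "_ZGV";
  Name += Isa;
  Name += Masked ? 'M' : 'N';
  Name += Vlen;
  Name.append(NumArgs, 'v');
  Name += '_';
  Name += Scalar;
  return Name;
}
```

For instance, `mangleVectorName('n', false, "2", 1, "cosf")` yields `_ZGVnN2v_cosf`, and `mangleVectorName('s', true, "x", 2, "atan2f")` yields `_ZGVsMxvv_atan2f`.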
+
+#elif defined(TLI_DEFINE_LIBMVEC_AARCH64_VF4_VECFUNCS)
+
+TLI_DEFINE_VECFUNC("acosf", "_ZGVnN4v_acosf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.acos.f32", "_ZGVnN4v_acosf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("acoshf", "_ZGVnN4v_acoshf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("asinf", "_ZGVnN4v_asinf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.asin.f32", "_ZGVnN4v_asinf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("asinhf", "_ZGVnN4v_asinhf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("atanf", "_ZGVnN4v_atanf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.atan.f32", "_ZGVnN4v_atanf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("atan2f", "_ZGVnN4vv_atan2f", "_ZGV_LLVM_N4vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f32", "_ZGVnN4vv_atan2f", "_ZGV_LLVM_N4vv")
+
+TLI_DEFINE_VECFUNC("atanhf", "_ZGVnN4v_atanhf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("cbrtf", "_ZGVnN4v_cbrtf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("cosf", "_ZGVnN4v_cosf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVnN4v_cosf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("coshf", "_ZGVnN4v_coshf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.cosh.f32", "_ZGVnN4v_coshf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("erff", "_ZGVnN4v_erff", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("erfcf", "_ZGVnN4v_erfcf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("expf", "_ZGVnN4v_expf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.exp.f32", "_ZGVnN4v_expf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("exp10f", "_ZGVnN4v_exp10f", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.exp10.f32", "_ZGVnN4v_exp10f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("exp2f", "_ZGVnN4v_exp2f", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.exp2.f32", "_ZGVnN4v_exp2f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("expm1f", "_ZGVnN4v_expm1f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("hypotf", "_ZGVnN4vv_hypotf", "_ZGV_LLVM_N4vv")
+
+TLI_DEFINE_VECFUNC("logf", "_ZGVnN4v_logf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVnN4v_logf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("log10f", "_ZGVnN4v_log10f", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.log10.f32", "_ZGVnN4v_log10f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("log1pf", "_ZGVnN4v_log1pf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("log2f", "_ZGVnN4v_log2f", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.log2.f32", "_ZGVnN4v_log2f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("powf", "_ZGVnN4vv_powf", "_ZGV_LLVM_N4vv")
+TLI_DEFINE_VECFUNC("llvm.pow.f32", "_ZGVnN4vv_powf", "_ZGV_LLVM_N4vv")
+
+TLI_DEFINE_VECFUNC("sinf", "_ZGVnN4v_sinf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.sin.f32", "_ZGVnN4v_sinf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("sinhf", "_ZGVnN4v_sinhf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.sinh.f32", "_ZGVnN4v_sinhf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("tanf", "_ZGVnN4v_tanf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.tan.f32", "_ZGVnN4v_tanf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("tanhf", "_ZGVnN4v_tanhf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.tanh.f32", "_ZGVnN4v_tanhf", "_ZGV_LLVM_N4v")
+
+#elif defined(TLI_DEFINE_LIBMVEC_AARCH64_SCALABLE_VECFUNCS)
+
+TLI_DEFINE_VECFUNC("acos", "_ZGVsMxv_acos", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("acosf", "_ZGVsMxv_acosf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.acos.f64", "_ZGVsMxv_acos", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.acos.f32", "_ZGVsMxv_acosf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("acosh", "_ZGVsMxv_acosh", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("acoshf", "_ZGVsMxv_acoshf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("asin", "_ZGVsMxv_asin", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("asinf", "_ZGVsMxv_asinf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.asin.f64", "_ZGVsMxv_asin", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.asin.f32", "_ZGVsMxv_asinf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("asinh", "_ZGVsMxv_asinh", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("asinhf", "_ZGVsMxv_asinhf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("atan", "_ZGVsMxv_atan", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("atanf", "_ZGVsMxv_atanf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.atan.f64", "_ZGVsMxv_atan", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.atan.f32", "_ZGVsMxv_atanf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("atan2", "_ZGVsMxvv_atan2", SCALABLE(2), MASKED, "_ZGVsMxvv")
+TLI_DEFINE_VECFUNC("atan2f", "_ZGVsMxvv_atan2f", SCALABLE(4), MASKED, "_ZGVsMxvv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f64", "_ZGVsMxvv_atan2", SCALABLE(2), MASKED, "_ZGVsMxvv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f32", "_ZGVsMxvv_atan2f", SCALABLE(4), MASKED, "_ZGVsMxvv")
+
+TLI_DEFINE_VECFUNC("atanh", "_ZGVsMxv_atanh", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("atanhf", "_ZGVsMxv_atanhf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("cbrt", "_ZGVsMxv_cbrt", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("cbrtf", "_ZGVsMxv_cbrtf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("cos", "_ZGVsMxv_cos", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("cosf", "_ZGVsMxv_cosf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVsMxv_cos", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVsMxv_cosf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("cosh", "_ZGVsMxv_cosh", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("coshf", "_ZGVsMxv_coshf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cosh.f64", "_ZGVsMxv_cosh", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cosh.f32", "_ZGVsMxv_coshf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("erf", "_ZGVsMxv_erf", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("erff", "_ZGVsMxv_erff", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("erfc", "_ZGVsMxv_erfc", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("erfcf", "_ZGVsMxv_erfcf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("exp", "_ZGVsMxv_exp", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("expf", "_ZGVsMxv_expf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp.f64", "_ZGVsMxv_exp", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp.f32", "_ZGVsMxv_expf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("exp10", "_ZGVsMxv_exp10", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("exp10f", "_ZGVsMxv_exp10f", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp10.f64", "_ZGVsMxv_exp10", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp10.f32", "_ZGVsMxv_exp10f", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("exp2", "_ZGVsMxv_exp2", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("exp2f", "_ZGVsMxv_exp2f", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp2.f64", "_ZGVsMxv_exp2", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp2.f32", "_ZGVsMxv_exp2f", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("expm1", "_ZGVsMxv_expm1", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("expm1f", "_ZGVsMxv_expm1f", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("hypot", "_ZGVsMxvv_hypot", SCALABLE(2), MASKED, "_ZGVsMxvv")
+TLI_DEFINE_VECFUNC("hypotf", "_ZGVsMxvv_hypotf", SCALABLE(4), MASKED, "_ZGVsMxvv")
+
+TLI_DEFINE_VECFUNC("log", "_ZGVsMxv_log", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("logf", "_ZGVsMxv_logf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.log.f64", "_ZGVsMxv_log", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVsMxv_logf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("log10", "_ZGVsMxv_log10", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("log10f", "_ZGVsMxv_log10f", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.log10.f64", "_ZGVsMxv_log10", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.log10.f32", "_ZGVsMxv_log10f", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("log1p", "_ZGVsMxv_log1p", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("log1pf", "_ZGVsMxv_log1pf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("log2", "_ZGVsMxv_log2", SCALABLE(2), MASKED, "_ZGVsMxv")...
[truncated]
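For readers unfamiliar with the mangled names in the table above, here is a minimal sketch of how they break down, assuming the `_ZGV<isa><mask><len><params>_<scalar>` layout of the AArch64 vector function ABI. The `VecName` struct and the `decode` helper are hypothetical and exist only for illustration. This is not LLVM's own parser, and it handles only the simple forms that appear in this patch.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper type, for illustration only.
struct VecName {
  char isa;           // 'n' = Advanced SIMD, 's' = SVE
  bool masked;        // 'M' = masked, 'N' = unmasked
  bool scalable;      // true when the length field is 'x'
  unsigned lanes;     // fixed lane count; 0 when scalable
  std::string params; // one 'v' per vector parameter
  std::string scalar; // scalar routine name, e.g. "cosf"
};

// Splits names of the form _ZGV<isa><mask><len><params>_<scalar>.
// Returns false for anything else (including the _ZGV_LLVM_ prefixes).
bool decode(const std::string &name, VecName &out) {
  const std::string prefix = "_ZGV";
  if (name.size() < prefix.size() + 3 ||
      name.compare(0, prefix.size(), prefix) != 0)
    return false;
  std::size_t i = prefix.size();
  out.isa = name[i++];
  if (out.isa != 'n' && out.isa != 's')
    return false;
  const char mask = name[i++];
  if (mask != 'M' && mask != 'N')
    return false;
  out.masked = (mask == 'M');
  out.scalable = false;
  out.lanes = 0;
  if (name[i] == 'x') {
    out.scalable = true;
    ++i;
  } else {
    if (!std::isdigit(static_cast<unsigned char>(name[i])))
      return false;
    while (i < name.size() &&
           std::isdigit(static_cast<unsigned char>(name[i])))
      out.lanes = out.lanes * 10 + static_cast<unsigned>(name[i++] - '0');
  }
  assert(!out.scalable || out.lanes == 0);
  // Everything up to the next underscore is the parameter list.
  const std::size_t sep = name.find('_', i);
  if (sep == std::string::npos || sep == i || sep + 1 == name.size())
    return false;
  out.params = name.substr(i, sep - i);
  out.scalar = name.substr(sep + 1);
  return true;
}
```

Under these assumptions, decoding `_ZGVsMxvv_atan2f` yields an SVE, masked, scalable entry with two vector parameters and the scalar name `atan2f`, while `_ZGVnN2v_cos` yields an unmasked Advanced SIMD entry with two lanes.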
static const VecDesc VecFuncs_LIBMVEC_AARCH64_VF2[] = {
#define TLI_DEFINE_LIBMVEC_AARCH64_VF2_VECFUNCS
#define TLI_DEFINE_VECFUNC(SCAL, VEC, VABI_PREFIX) \
  {SCAL, VEC, /* VF = */ FIXED(2), /* MASK = */ false, VABI_PREFIX, \
   /* CC = */ CallingConv::AArch64_VectorCall},
#include "llvm/Analysis/VecFuncs.def"
#undef TLI_DEFINE_LIBMVEC_AARCH64_VF2_VECFUNCS
};
static const VecDesc VecFuncs_LIBMVEC_AARCH64_VF4[] = {
#define TLI_DEFINE_LIBMVEC_AARCH64_VF4_VECFUNCS
#define TLI_DEFINE_VECFUNC(SCAL, VEC, VABI_PREFIX) \
  {SCAL, VEC, /* VF = */ FIXED(4), /* MASK = */ false, VABI_PREFIX, \
   /* CC = */ CallingConv::AArch64_VectorCall},
#include "llvm/Analysis/VecFuncs.def"
#undef TLI_DEFINE_LIBMVEC_AARCH64_VF4_VECFUNCS
};
I'm not sure I see much value in those being defined by separate macros, can those be merged?
I don't know, I was thinking of copying this idiom to remove much of the redundancy from how we currently define the ArmPL routines. That said, I'm happy either way.
The reason I asked this is that when all (e.g. `sin`) function-mappings are defined together, it is easier to see which variants are missing, whereas now it's artificially split between two macros.
Fair enough, in which case I suggest following the idiom we use for ArmPL. I'm happy to circle back to see about fulfilling both ideals. @marykass-arm sorry about the code churn.
@paulwalker-arm @sdesmalen-arm Should I merge all three macros (following ArmPL) or just merge fixed and keep scalable separated?
Based on Sander's rationale I'd merge all three macros. This means we're switching from the SLEEF style to the ArmPL style, which I prefer over creating a new one.
Yes that would be even better, thanks!
#define TLI_DEFINE_VECFUNC(SCAL, VEC, VF, MASK, VABI_PREFIX) \
  {SCAL, VEC, VF, MASK, VABI_PREFIX, /* CC = */ std::nullopt},
All the SVE functions are masked so you could pass in `/* MASK = */ true`.
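The merged, ArmPL-style layout discussed in this thread can be sketched in isolation. This is a minimal, self-contained illustration and not the code from this PR: `LaneCount`, `VecMapping`, `LOG_MAPPINGS`, `MAKE_ENTRY`, and the two helper functions are hypothetical names, and the fixed-width NEON symbol names and `_ZGV_LLVM_N*` prefixes are assumed for illustration (only the `_ZGVsMxv` entries are visible in the diff excerpt).

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-ins for LLVM's ElementCount and VecDesc types.
struct LaneCount {
  unsigned MinLanes; // known minimum number of lanes
  bool Scalable;     // true for SVE-style scalable vectors
};

struct VecMapping {
  std::string_view ScalarName;
  std::string_view VectorName;
  LaneCount VF;
  bool Masked;
  std::string_view VABIPrefix;
};

#define FIXED(N) LaneCount{N, false}
#define SCALABLE(N) LaneCount{N, true}
#define NOMASK false
#define MASKED true

// One list holds every variant of a function, so a missing variant is easy
// to spot. The NEON names and prefixes below are assumptions.
#define LOG_MAPPINGS(X)                                                        \
  X("log", "_ZGVnN2v_log", FIXED(2), NOMASK, "_ZGV_LLVM_N2v")                  \
  X("logf", "_ZGVnN2v_logf", FIXED(2), NOMASK, "_ZGV_LLVM_N2v")                \
  X("logf", "_ZGVnN4v_logf", FIXED(4), NOMASK, "_ZGV_LLVM_N4v")                \
  X("log", "_ZGVsMxv_log", SCALABLE(2), MASKED, "_ZGVsMxv")                    \
  X("logf", "_ZGVsMxv_logf", SCALABLE(4), MASKED, "_ZGVsMxv")

// A single five-argument macro turns each list entry into a table row.
#define MAKE_ENTRY(SCAL, VEC, VF, MASK, VABI_PREFIX)                           \
  VecMapping{SCAL, VEC, VF, MASK, VABI_PREFIX},

static constexpr VecMapping LogTable[] = {LOG_MAPPINGS(MAKE_ENTRY)};

// Counts how many vector variants are registered for one scalar function.
inline std::size_t countVariants(std::string_view Scalar) {
  std::size_t N = 0;
  for (const VecMapping &M : LogTable)
    if (M.ScalarName == Scalar)
      ++N;
  return N;
}

// Asserts the property noted in review: every scalable entry is masked.
inline void checkScalableEntriesAreMasked() {
  for (const VecMapping &M : LogTable)
    assert((!M.VF.Scalable || M.Masked) && "scalable entries must be masked");
}
```

With a single table, a query such as `countVariants("logf")` makes gaps visible at a glance, which is the readability argument made for merging the macros.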
✅ With the latest revision this PR passed the C/C++ code formatter.
Change-Id: Ib750e05d6daeca404a02b214272727827af13b2d
I double-checked the functions added cover up to 2.40 -- I think they do 👍 Generally, LGTM -- but I'll leave final approval to Sander/Paul.
I could not spot any tests for the `<2 x float>` vector functions? Hopefully it's as easy as adding extra RUN lines to veclib-function-calls.ll and veclib-intrinsic-calls.ll. I'm thinking variants of LIBMVEC-NEON that use `-force-vector-width=2`. If this works then it's worth changing the existing LIBMVEC-NEON RUN lines to use `-force-vector-width=4` so that both variants are locked in.
FYI: I've crossed out my final suggestion because that would also force the need for `<4 x double>` vector functions, which we don't have.
…ping Change-Id: Ia863b7dee5585d782cd049d99758b7aba04b5c2c
e4d49c4 to 8264797
The documentation build failure has been verified locally as an existing upstream issue and not the fault of this PR.
@marykass-arm Congratulations on having your first Pull Request (PR) merged into the LLVM Project! Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR. Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues. How to do this, and the rest of the post-merge process, is covered in detail here. If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again. If you don't get any reports, no action is required from you. Your changes are working as expected, well done!
This patch adds support for the `libmvec` vector library on AArch64 targets. Currently, all `libmvec` functions in GLIBC version 2.40 are supported. The full list of math functions enabled can be found [here](https://github.com/bminor/glibc/blob/96abd59bf2a11ddd4e7ccaac840ec13c0b62d3ba/sysdeps/aarch64/fpu/Versions) (up to GLIBC 2.40). Previously, `libmvec` was only supported on x86_64 targets. Attempts to use it on AArch64 resulted in the following error from Clang: `unsupported option 'libmvec' for target 'aarch64'`.
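As a rough illustration of the kind of source this option targets, the sketch below is a plain element-wise loop over `std::log`. The function names `logAll` and `checkLogAll` are hypothetical, and the compile line in the comment is an assumed invocation: whether the loop is actually rewritten to a `libmvec` call depends on the optimization level, the math-errno setting, and the selected target features.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Assumed invocation for inspecting the generated assembly (not taken from
// the PR):
//   clang++ -O2 -fno-math-errno --target=aarch64-linux-gnu \
//           -fveclib=libmvec -S example.cpp

// Element-wise natural logarithm; the loop body is a single math call,
// which is the pattern a vector math library mapping applies to.
std::vector<float> logAll(const std::vector<float> &In) {
  std::vector<float> Out(In.size());
  for (std::size_t I = 0; I < In.size(); ++I)
    Out[I] = std::log(In[I]);
  return Out;
}

// Small sanity check: results must match the scalar definition regardless
// of whether the loop was vectorized.
inline void checkLogAll() {
  const std::vector<float> R = logAll({1.0f, 8.0f});
  assert(R.size() == 2);
  assert(std::fabs(R[0]) < 1e-6f);
  assert(std::fabs(R[1] - 3.0f * std::log(2.0f)) < 1e-5f);
}
```

The program's results should be the same with or without the option; the option only changes which library routines the vectorizer is allowed to call.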