[AArch64][VecLib] Add libmvec support for AArch64 targets #143696

Merged: 3 commits, Jun 17, 2025

marykass-arm
Contributor

This patch adds support for the libmvec vector library on AArch64 targets. All libmvec functions available in GLIBC 2.40 are supported. The full list of math functions enabled can be found here (up to GLIBC 2.40).

Previously, libmvec was only supported on x86 and x86_64 targets. Attempts to use it on AArch64 resulted in the following error from Clang: unsupported option 'libmvec' for target 'aarch64'.

Change-Id: I07cd07932c0cb94782de8cf0d25c4729a48e695b
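
For illustration, a loop of the kind this option targets might look like the sketch below. The function name `apply_sin` and the build line in the comment are assumptions made for this example and are not part of the patch. Whether the calls are actually replaced depends on the optimization level, the errno settings, and the target features.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical example function (not part of the patch).
// Assumed build line:
//   clang++ --target=aarch64-linux-gnu -O2 -fno-math-errno -fveclib=libmvec -c example.cpp
// With libmvec enabled, the loop vectorizer may replace the scalar sinf calls
// with a vector variant such as _ZGVnN4v_sinf; this is not guaranteed.
void apply_sin(const float *in, float *out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    out[i] = std::sin(in[i]); // float overload, lowers to sinf
}
```

Linking the result is expected to need a GLIBC that ships the AArch64 libmvec symbols, which is why the release note mentions GLIBC 2.40 or newer.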

@llvmbot added the clang, backend:AArch64, clang:driver, llvm:analysis, and llvm:transforms labels on Jun 11, 2025
@llvmbot
Member

llvmbot commented Jun 11, 2025

@llvm/pr-subscribers-llvm-analysis
@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-backend-aarch64

Author: Mary Kassayova (marykass-arm)

Changes

Patch is 219.60 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/143696.diff

11 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+2)
  • (modified) clang/include/clang/Driver/Options.td (+3-2)
  • (modified) clang/lib/Driver/ToolChains/Clang.cpp (+8-1)
  • (modified) clang/test/Driver/fveclib.c (+9-1)
  • (modified) llvm/include/llvm/Analysis/VecFuncs.def (+299)
  • (modified) llvm/lib/Analysis/TargetLibraryInfo.cpp (+30)
  • (added) llvm/test/CodeGen/AArch64/replace-with-veclib-libmvec-scalable.ll (+579)
  • (added) llvm/test/CodeGen/AArch64/replace-with-veclib-libmvec.ll (+577)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/veclib-function-calls.ll (+690)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/veclib-intrinsic-calls.ll (+502)
  • (modified) llvm/test/Transforms/Util/add-TLI-mappings.ll (+23-5)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index b5e6cf088a4b1..11c23064ab604 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -357,6 +357,8 @@ Modified Compiler Flags
 
 - The ``-fchar8_t`` flag is no longer considered in non-C++ languages modes. (#GH55373)
 
+- The ``-fveclib=libmvec`` option now supports AArch64 targets (requires GLIBC 2.40 or newer).
+
 Removed Compiler Flags
 -------------------------
 
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 152df89118a6a..b886b75fa4fa9 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3473,8 +3473,9 @@ def fveclib : Joined<["-"], "fveclib=">, Group<f_Group>,
   Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>,
     HelpText<"Use the given vector functions library">,
     HelpTextForVariants<[ClangOption, CC1Option],
-      "Use the given vector functions library. "
-      "Note: -fveclib={ArmPL,SLEEF} implies -fno-math-errno">,
+      "Use the given vector functions library.\n"
+      "  Note: -fveclib={ArmPL,SLEEF,libmvec} implies -fno-math-errno.\n"
+      "  Note: -fveclib=libmvec on AArch64 requires GLIBC 2.40 or newer.">,
     Values<"Accelerate,libmvec,MASSV,SVML,SLEEF,Darwin_libsystem_m,ArmPL,AMDLIBM,none">,
     NormalizedValuesScope<"llvm::driver::VectorLibrary">,
     NormalizedValues<["Accelerate", "LIBMVEC", "MASSV", "SVML", "SLEEF",
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index a74fa81f3cf5b..fdc023d193aa9 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -5683,11 +5683,18 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
           Triple.getArch() != llvm::Triple::x86_64)
         D.Diag(diag::err_drv_unsupported_opt_for_target)
             << Name << Triple.getArchName();
-    } else if (Name == "libmvec" || Name == "AMDLIBM") {
+    } else if (Name == "AMDLIBM") {
       if (Triple.getArch() != llvm::Triple::x86 &&
           Triple.getArch() != llvm::Triple::x86_64)
         D.Diag(diag::err_drv_unsupported_opt_for_target)
             << Name << Triple.getArchName();
+    } else if (Name == "libmvec") {
+      if (Triple.getArch() != llvm::Triple::x86 &&
+          Triple.getArch() != llvm::Triple::x86_64 &&
+          Triple.getArch() != llvm::Triple::aarch64 &&
+          Triple.getArch() != llvm::Triple::aarch64_be)
+        D.Diag(diag::err_drv_unsupported_opt_for_target)
+            << Name << Triple.getArchName();
     } else if (Name == "SLEEF" || Name == "ArmPL") {
       if (Triple.getArch() != llvm::Triple::aarch64 &&
           Triple.getArch() != llvm::Triple::aarch64_be &&
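
The new `libmvec` branch in the hunk above amounts to an allow-list of architectures. A standalone sketch of that logic follows. The `Arch` enum and the helpers `isLibmvecSupported` and `checkLibmvec` are hypothetical stand-ins for `llvm::Triple` and the driver diagnostics, written only for illustration; they are not Clang APIs.

```cpp
#include <string>

// Hypothetical stand-in for llvm::Triple::ArchType; only a few values.
enum class Arch { x86, x86_64, aarch64, aarch64_be, riscv64 };

// Mirrors the condition in the diff: libmvec is accepted on x86, x86_64,
// aarch64 and aarch64_be; every other architecture gets the diagnostic.
bool isLibmvecSupported(Arch A) {
  return A == Arch::x86 || A == Arch::x86_64 || A == Arch::aarch64 ||
         A == Arch::aarch64_be;
}

// Returns an empty string when the option is accepted, otherwise a message
// shaped like the error quoted in the PR description.
std::string checkLibmvec(Arch A, const std::string &ArchName) {
  if (isLibmvecSupported(A))
    return "";
  return "unsupported option 'libmvec' for target '" + ArchName + "'";
}
```

Splitting `libmvec` out of the shared `AMDLIBM` branch keeps AMDLIBM restricted to x86 while widening only the libmvec check.
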
diff --git a/clang/test/Driver/fveclib.c b/clang/test/Driver/fveclib.c
index 5420555c36a2a..c57e9aa7a3cc2 100644
--- a/clang/test/Driver/fveclib.c
+++ b/clang/test/Driver/fveclib.c
@@ -1,6 +1,7 @@
 // RUN: %clang -### -c -fveclib=none %s 2>&1 | FileCheck --check-prefix=CHECK-NOLIB %s
 // RUN: %clang -### -c -fveclib=Accelerate %s 2>&1 | FileCheck --check-prefix=CHECK-ACCELERATE %s
 // RUN: %clang -### -c --target=x86_64-unknown-linux-gnu -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-libmvec %s
+// RUN: %clang -### -c --target=aarch64-linux-gnu -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-LIBMVEC-AARCH64 %s
 // RUN: %clang -### -c --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM %s 2>&1 | FileCheck --check-prefix=CHECK-AMDLIBM %s
 // RUN: %clang -### -c -fveclib=MASSV %s 2>&1 | FileCheck --check-prefix=CHECK-MASSV %s
 // RUN: %clang -### -c -fveclib=Darwin_libsystem_m %s 2>&1 | FileCheck --check-prefix=CHECK-DARWIN_LIBSYSTEM_M %s
@@ -12,6 +13,7 @@
 // CHECK-NOLIB: "-fveclib=none"
 // CHECK-ACCELERATE: "-fveclib=Accelerate"
 // CHECK-libmvec: "-fveclib=libmvec"
+// CHECK-LIBMVEC-AARCH64: "-fveclib=libmvec"
 // CHECK-AMDLIBM: "-fveclib=AMDLIBM"
 // CHECK-MASSV: "-fveclib=MASSV"
 // CHECK-DARWIN_LIBSYSTEM_M: "-fveclib=Darwin_libsystem_m"
@@ -23,7 +25,6 @@
 
 // RUN: not %clang --target=x86 -c -fveclib=SLEEF %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s
 // RUN: not %clang --target=x86 -c -fveclib=ArmPL %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s
-// RUN: not %clang --target=aarch64 -c -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s
 // RUN: not %clang --target=aarch64 -c -fveclib=SVML %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s
 // RUN: not %clang --target=aarch64 -c -fveclib=AMDLIBM %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s
 // CHECK-ERROR: unsupported option {{.*}} for target
@@ -43,6 +44,9 @@
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=libmvec -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s
 // CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC"
 
+// RUN: %clang -### --target=aarch64-linux-gnu -fveclib=libmvec -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC-AARCH64 %s
+// CHECK-LTO-LIBMVEC-AARCH64: "-plugin-opt=-vector-library=LIBMVEC"
+
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-AMDLIBM %s
 // CHECK-LTO-AMDLIBM: "-plugin-opt=-vector-library=AMDLIBM"
 
@@ -68,6 +72,10 @@
 // CHECK-ERRNO-LIBMVEC: "-fveclib=libmvec"
 // CHECK-ERRNO-LIBMVEC-SAME: "-fmath-errno"
 
+// RUN: %clang -### --target=aarch64-linux-gnu -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-ERRNO-LIBMVEC-AARCH64 %s
+// CHECK-ERRNO-LIBMVEC-AARCH64: "-fveclib=libmvec"
+// CHECK-ERRNO-LIBMVEC-AARCH64-SAME: "-fmath-errno"
+
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM %s 2>&1 | FileCheck --check-prefix=CHECK-ERRNO-AMDLIBM %s
 // CHECK-ERRNO-AMDLIBM: "-fveclib=AMDLIBM"
 // CHECK-ERRNO-AMDLIBM-SAME: "-fmath-errno"
diff --git a/llvm/include/llvm/Analysis/VecFuncs.def b/llvm/include/llvm/Analysis/VecFuncs.def
index 68753a2497db2..cb8e6755a486b 100644
--- a/llvm/include/llvm/Analysis/VecFuncs.def
+++ b/llvm/include/llvm/Analysis/VecFuncs.def
@@ -237,6 +237,305 @@ TLI_DEFINE_VECFUNC("llvm.log.f64", "_ZGVdN4v_log", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVbN4v_logf", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVdN8v_logf", FIXED(8), "_ZGV_LLVM_N8v")
 
+#elif defined(TLI_DEFINE_LIBMVEC_AARCH64_VF2_VECFUNCS)
+
+TLI_DEFINE_VECFUNC("acos", "_ZGVnN2v_acos", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("acosf", "_ZGVnN2v_acosf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.acos.f64", "_ZGVnN2v_acos", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.acos.f32", "_ZGVnN2v_acosf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("acosh", "_ZGVnN2v_acosh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("acoshf", "_ZGVnN2v_acoshf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("asin", "_ZGVnN2v_asin", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("asinf", "_ZGVnN2v_asinf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.asin.f64", "_ZGVnN2v_asin", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.asin.f32", "_ZGVnN2v_asinf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("asinh", "_ZGVnN2v_asinh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("asinhf", "_ZGVnN2v_asinhf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("atan", "_ZGVnN2v_atan", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("atanf", "_ZGVnN2v_atanf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.atan.f64", "_ZGVnN2v_atan", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.atan.f32", "_ZGVnN2v_atanf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("atan2", "_ZGVnN2vv_atan2", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("atan2f", "_ZGVnN2vv_atan2f", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f64", "_ZGVnN2vv_atan2", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f32", "_ZGVnN2vv_atan2f", "_ZGV_LLVM_N2vv")
+
+TLI_DEFINE_VECFUNC("atanh", "_ZGVnN2v_atanh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("atanhf", "_ZGVnN2v_atanhf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("cbrt", "_ZGVnN2v_cbrt", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("cbrtf", "_ZGVnN2v_cbrtf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("cos", "_ZGVnN2v_cos", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("cosf", "_ZGVnN2v_cosf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVnN2v_cos", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVnN2v_cosf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("cosh", "_ZGVnN2v_cosh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("coshf", "_ZGVnN2v_coshf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.cosh.f64", "_ZGVnN2v_cosh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.cosh.f32", "_ZGVnN2v_coshf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("erf", "_ZGVnN2v_erf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("erff", "_ZGVnN2v_erff", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("erfc", "_ZGVnN2v_erfc", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("erfcf", "_ZGVnN2v_erfcf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("exp", "_ZGVnN2v_exp", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("expf", "_ZGVnN2v_expf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp.f64", "_ZGVnN2v_exp", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp.f32", "_ZGVnN2v_expf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("exp10", "_ZGVnN2v_exp10", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("exp10f", "_ZGVnN2v_exp10f", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp10.f64", "_ZGVnN2v_exp10", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp10.f32", "_ZGVnN2v_exp10f", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("exp2", "_ZGVnN2v_exp2", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("exp2f", "_ZGVnN2v_exp2f", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp2.f64", "_ZGVnN2v_exp2", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp2.f32", "_ZGVnN2v_exp2f", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("expm1", "_ZGVnN2v_expm1", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("expm1f", "_ZGVnN2v_expm1f", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("hypot", "_ZGVnN2vv_hypot", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("hypotf", "_ZGVnN2vv_hypotf", "_ZGV_LLVM_N2vv")
+
+TLI_DEFINE_VECFUNC("log", "_ZGVnN2v_log", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("logf", "_ZGVnN2v_logf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log.f64", "_ZGVnN2v_log", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVnN2v_logf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("log10", "_ZGVnN2v_log10", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("log10f", "_ZGVnN2v_log10f", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log10.f64", "_ZGVnN2v_log10", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log10.f32", "_ZGVnN2v_log10f", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("log1p", "_ZGVnN2v_log1p", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("log1pf", "_ZGVnN2v_log1pf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("log2", "_ZGVnN2v_log2", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("log2f", "_ZGVnN2v_log2f", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log2.f64", "_ZGVnN2v_log2", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log2.f32", "_ZGVnN2v_log2f", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("pow", "_ZGVnN2vv_pow", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("powf", "_ZGVnN2vv_powf", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.pow.f64", "_ZGVnN2vv_pow", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.pow.f32", "_ZGVnN2vv_powf", "_ZGV_LLVM_N2vv")
+
+TLI_DEFINE_VECFUNC("sin", "_ZGVnN2v_sin", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("sinf", "_ZGVnN2v_sinf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.sin.f64", "_ZGVnN2v_sin", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.sin.f32", "_ZGVnN2v_sinf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("sinh", "_ZGVnN2v_sinh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("sinhf", "_ZGVnN2v_sinhf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.sinh.f64", "_ZGVnN2v_sinh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.sinh.f32", "_ZGVnN2v_sinhf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("tan", "_ZGVnN2v_tan", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("tanf", "_ZGVnN2v_tanf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.tan.f64", "_ZGVnN2v_tan", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.tan.f32", "_ZGVnN2v_tanf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("tanh", "_ZGVnN2v_tanh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("tanhf", "_ZGVnN2v_tanhf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.tanh.f64", "_ZGVnN2v_tanh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.tanh.f32", "_ZGVnN2v_tanhf", "_ZGV_LLVM_N2v")
+
+#elif defined(TLI_DEFINE_LIBMVEC_AARCH64_VF4_VECFUNCS)
+
+TLI_DEFINE_VECFUNC("acosf", "_ZGVnN4v_acosf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.acos.f32", "_ZGVnN4v_acosf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("acoshf", "_ZGVnN4v_acoshf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("asinf", "_ZGVnN4v_asinf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.asin.f32", "_ZGVnN4v_asinf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("asinhf", "_ZGVnN4v_asinhf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("atanf", "_ZGVnN4v_atanf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.atan.f32", "_ZGVnN4v_atanf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("atan2f", "_ZGVnN4vv_atan2f", "_ZGV_LLVM_N4vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f32", "_ZGVnN4vv_atan2f", "_ZGV_LLVM_N4vv")
+
+TLI_DEFINE_VECFUNC("atanhf", "_ZGVnN4v_atanhf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("cbrtf", "_ZGVnN4v_cbrtf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("cosf", "_ZGVnN4v_cosf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVnN4v_cosf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("coshf", "_ZGVnN4v_coshf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.cosh.f32", "_ZGVnN4v_coshf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("erff", "_ZGVnN4v_erff", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("erfcf", "_ZGVnN4v_erfcf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("expf", "_ZGVnN4v_expf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.exp.f32", "_ZGVnN4v_expf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("exp10f", "_ZGVnN4v_exp10f", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.exp10.f32", "_ZGVnN4v_exp10f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("exp2f", "_ZGVnN4v_exp2f", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.exp2.f32", "_ZGVnN4v_exp2f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("expm1f", "_ZGVnN4v_expm1f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("hypotf", "_ZGVnN4vv_hypotf", "_ZGV_LLVM_N4vv")
+
+TLI_DEFINE_VECFUNC("logf", "_ZGVnN4v_logf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVnN4v_logf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("log10f", "_ZGVnN4v_log10f", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.log10.f32", "_ZGVnN4v_log10f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("log1pf", "_ZGVnN4v_log1pf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("log2f", "_ZGVnN4v_log2f", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.log2.f32", "_ZGVnN4v_log2f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("powf", "_ZGVnN4vv_powf", "_ZGV_LLVM_N4vv")
+TLI_DEFINE_VECFUNC("llvm.pow.f32", "_ZGVnN4vv_powf", "_ZGV_LLVM_N4vv")
+
+TLI_DEFINE_VECFUNC("sinf", "_ZGVnN4v_sinf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.sin.f32", "_ZGVnN4v_sinf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("sinhf", "_ZGVnN4v_sinhf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.sinh.f32", "_ZGVnN4v_sinhf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("tanf", "_ZGVnN4v_tanf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.tan.f32", "_ZGVnN4v_tanf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("tanhf", "_ZGVnN4v_tanhf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.tanh.f32", "_ZGVnN4v_tanhf", "_ZGV_LLVM_N4v")
+
+#elif defined(TLI_DEFINE_LIBMVEC_AARCH64_SCALABLE_VECFUNCS)
+
+TLI_DEFINE_VECFUNC("acos", "_ZGVsMxv_acos", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("acosf", "_ZGVsMxv_acosf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.acos.f64", "_ZGVsMxv_acos", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.acos.f32", "_ZGVsMxv_acosf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("acosh", "_ZGVsMxv_acosh",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("acoshf", "_ZGVsMxv_acoshf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("asin", "_ZGVsMxv_asin", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("asinf", "_ZGVsMxv_asinf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.asin.f64", "_ZGVsMxv_asin", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.asin.f32", "_ZGVsMxv_asinf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("asinh", "_ZGVsMxv_asinh",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("asinhf", "_ZGVsMxv_asinhf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("atan", "_ZGVsMxv_atan", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("atanf", "_ZGVsMxv_atanf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.atan.f64", "_ZGVsMxv_atan", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.atan.f32", "_ZGVsMxv_atanf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("atan2", "_ZGVsMxvv_atan2", SCALABLE(2), MASKED, "_ZGVsMxvv")
+TLI_DEFINE_VECFUNC("atan2f", "_ZGVsMxvv_atan2f", SCALABLE(4), MASKED, "_ZGVsMxvv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f64", "_ZGVsMxvv_atan2", SCALABLE(2), MASKED, "_ZGVsMxvv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f32", "_ZGVsMxvv_atan2f", SCALABLE(4), MASKED, "_ZGVsMxvv")
+
+TLI_DEFINE_VECFUNC("atanh", "_ZGVsMxv_atanh",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("atanhf", "_ZGVsMxv_atanhf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("cbrt", "_ZGVsMxv_cbrt",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("cbrtf", "_ZGVsMxv_cbrtf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("cos", "_ZGVsMxv_cos",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("cosf", "_ZGVsMxv_cosf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVsMxv_cos", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVsMxv_cosf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("cosh", "_ZGVsMxv_cosh",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("coshf", "_ZGVsMxv_coshf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cosh.f64", "_ZGVsMxv_cosh", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cosh.f32", "_ZGVsMxv_coshf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("erf", "_ZGVsMxv_erf",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("erff", "_ZGVsMxv_erff", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("erfc", "_ZGVsMxv_erfc",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("erfcf", "_ZGVsMxv_erfcf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("exp", "_ZGVsMxv_exp",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("expf", "_ZGVsMxv_expf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp.f64", "_ZGVsMxv_exp", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp.f32", "_ZGVsMxv_expf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("exp10", "_ZGVsMxv_exp10",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("exp10f", "_ZGVsMxv_exp10f", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp10.f64", "_ZGVsMxv_exp10", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp10.f32", "_ZGVsMxv_exp10f", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("exp2", "_ZGVsMxv_exp2",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("exp2f", "_ZGVsMxv_exp2f", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp2.f64", "_ZGVsMxv_exp2", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp2.f32", "_ZGVsMxv_exp2f", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("expm1", "_ZGVsMxv_expm1",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("expm1f", "_ZGVsMxv_expm1f", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("hypot", "_ZGVsMxvv_hypot", SCALABLE(2), MASKED, "_ZGVsMxvv")
+TLI_DEFINE_VECFUNC("hypotf", "_ZGVsMxvv_hypotf", SCALABLE(4), MASKED, "_ZGVsMxvv")
+
+TLI_DEFINE_VECFUNC("log", "_ZGVsMxv_log",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("logf", "_ZGVsMxv_logf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.log.f64", "_ZGVsMxv_log", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVsMxv_logf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("log10", "_ZGVsMxv_log10",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("log10f", "_ZGVsMxv_log10f", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.log10.f64", "_ZGVsMxv_log10", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.log10.f32", "_ZGVsMxv_log10f", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("log1p", "_ZGVsMxv_log1p",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("log1pf", "_ZGVsMxv_log1pf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("log2", "_ZGVsMxv_log2",  SCALABLE(2), MASKED, "_ZGVsMxv")...
[truncated]
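
The vector function names in the table above follow a regular pattern. Assuming the naming scheme of the AArch64 vector function ABI, each name is `_ZGV`, then an ISA letter (`n` for Advanced SIMD, `s` for SVE), a mask letter (`N` unmasked, `M` masked), the vector length (`2`, `4`, or `x` for scalable), one `v` per vector parameter, an underscore, and the scalar name. The helper `mangleVectorName` below is hypothetical and not an LLVM API; it is a sketch for checking table entries by hand.

```cpp
#include <string>

// Hypothetical helper (not an LLVM API): compose a vector-variant name from
// its parts, following the pattern visible in the VecFuncs.def entries.
std::string mangleVectorName(char Isa, bool Masked, const std::string &VF,
                             unsigned NumVectorArgs,
                             const std::string &Scalar) {
  std::string Name = "_ZGV";
  Name += Isa;                     // 'n' = Advanced SIMD, 's' = SVE
  Name += Masked ? 'M' : 'N';      // masked or unmasked variant
  Name += VF;                      // "2", "4", or "x" for scalable
  Name.append(NumVectorArgs, 'v'); // one 'v' per vector parameter
  Name += '_';
  Name += Scalar;                  // scalar function name, e.g. "acosf"
  return Name;
}
```

For example, `mangleVectorName('n', false, "4", 2, "atan2f")` yields `_ZGVnN4vv_atan2f`, which matches the VF4 entry for `atan2f`. A single-precision scalar such as `asinf` should therefore map to a name ending in `asinf`, not `asin`.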

@llvmbot
Copy link
Member

llvmbot commented Jun 11, 2025

@llvm/pr-subscribers-clang-driver

Author: Mary Kassayova (marykass-arm)

Changes

This patch adds support for the libmvec vector library on AArch64 targets. Currently, all libmvec functions in GLIBC version 2.40 are supported. The full list of math functions enabled can be found here (up to GLIBC 2.40).

Previously, libmvec was only supported on x86_64 targets. Attempts to use it on AArch64 resulted in the following error from Clang: unsupported option 'libmvec' for target 'aarch64'.


Patch is 219.60 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/143696.diff

11 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+2)
  • (modified) clang/include/clang/Driver/Options.td (+3-2)
  • (modified) clang/lib/Driver/ToolChains/Clang.cpp (+8-1)
  • (modified) clang/test/Driver/fveclib.c (+9-1)
  • (modified) llvm/include/llvm/Analysis/VecFuncs.def (+299)
  • (modified) llvm/lib/Analysis/TargetLibraryInfo.cpp (+30)
  • (added) llvm/test/CodeGen/AArch64/replace-with-veclib-libmvec-scalable.ll (+579)
  • (added) llvm/test/CodeGen/AArch64/replace-with-veclib-libmvec.ll (+577)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/veclib-function-calls.ll (+690)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/veclib-intrinsic-calls.ll (+502)
  • (modified) llvm/test/Transforms/Util/add-TLI-mappings.ll (+23-5)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index b5e6cf088a4b1..11c23064ab604 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -357,6 +357,8 @@ Modified Compiler Flags
 
 - The ``-fchar8_t`` flag is no longer considered in non-C++ languages modes. (#GH55373)
 
+- The ``-fveclib=libmvec`` option now supports AArch64 targets (requires GLIBC 2.40 or newer).
+
 Removed Compiler Flags
 -------------------------
 
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 152df89118a6a..b886b75fa4fa9 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3473,8 +3473,9 @@ def fveclib : Joined<["-"], "fveclib=">, Group<f_Group>,
   Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>,
     HelpText<"Use the given vector functions library">,
     HelpTextForVariants<[ClangOption, CC1Option],
-      "Use the given vector functions library. "
-      "Note: -fveclib={ArmPL,SLEEF} implies -fno-math-errno">,
+      "Use the given vector functions library.\n"
+      "  Note: -fveclib={ArmPL,SLEEF,libmvec} implies -fno-math-errno.\n"
+      "  Note: -fveclib=libmvec on AArch64 requires GLIBC 2.40 or newer.">,
     Values<"Accelerate,libmvec,MASSV,SVML,SLEEF,Darwin_libsystem_m,ArmPL,AMDLIBM,none">,
     NormalizedValuesScope<"llvm::driver::VectorLibrary">,
     NormalizedValues<["Accelerate", "LIBMVEC", "MASSV", "SVML", "SLEEF",
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index a74fa81f3cf5b..fdc023d193aa9 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -5683,11 +5683,18 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
           Triple.getArch() != llvm::Triple::x86_64)
         D.Diag(diag::err_drv_unsupported_opt_for_target)
             << Name << Triple.getArchName();
-    } else if (Name == "libmvec" || Name == "AMDLIBM") {
+    } else if (Name == "AMDLIBM") {
       if (Triple.getArch() != llvm::Triple::x86 &&
           Triple.getArch() != llvm::Triple::x86_64)
         D.Diag(diag::err_drv_unsupported_opt_for_target)
             << Name << Triple.getArchName();
+    } else if (Name == "libmvec") {
+      if (Triple.getArch() != llvm::Triple::x86 &&
+          Triple.getArch() != llvm::Triple::x86_64 &&
+          Triple.getArch() != llvm::Triple::aarch64 &&
+          Triple.getArch() != llvm::Triple::aarch64_be)
+        D.Diag(diag::err_drv_unsupported_opt_for_target)
+            << Name << Triple.getArchName();
     } else if (Name == "SLEEF" || Name == "ArmPL") {
       if (Triple.getArch() != llvm::Triple::aarch64 &&
           Triple.getArch() != llvm::Triple::aarch64_be &&
diff --git a/clang/test/Driver/fveclib.c b/clang/test/Driver/fveclib.c
index 5420555c36a2a..c57e9aa7a3cc2 100644
--- a/clang/test/Driver/fveclib.c
+++ b/clang/test/Driver/fveclib.c
@@ -1,6 +1,7 @@
 // RUN: %clang -### -c -fveclib=none %s 2>&1 | FileCheck --check-prefix=CHECK-NOLIB %s
 // RUN: %clang -### -c -fveclib=Accelerate %s 2>&1 | FileCheck --check-prefix=CHECK-ACCELERATE %s
 // RUN: %clang -### -c --target=x86_64-unknown-linux-gnu -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-libmvec %s
+// RUN: %clang -### -c --target=aarch64-linux-gnu -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-LIBMVEC-AARCH64 %s
 // RUN: %clang -### -c --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM %s 2>&1 | FileCheck --check-prefix=CHECK-AMDLIBM %s
 // RUN: %clang -### -c -fveclib=MASSV %s 2>&1 | FileCheck --check-prefix=CHECK-MASSV %s
 // RUN: %clang -### -c -fveclib=Darwin_libsystem_m %s 2>&1 | FileCheck --check-prefix=CHECK-DARWIN_LIBSYSTEM_M %s
@@ -12,6 +13,7 @@
 // CHECK-NOLIB: "-fveclib=none"
 // CHECK-ACCELERATE: "-fveclib=Accelerate"
 // CHECK-libmvec: "-fveclib=libmvec"
+// CHECK-LIBMVEC-AARCH64: "-fveclib=libmvec"
 // CHECK-AMDLIBM: "-fveclib=AMDLIBM"
 // CHECK-MASSV: "-fveclib=MASSV"
 // CHECK-DARWIN_LIBSYSTEM_M: "-fveclib=Darwin_libsystem_m"
@@ -23,7 +25,6 @@
 
 // RUN: not %clang --target=x86 -c -fveclib=SLEEF %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s
 // RUN: not %clang --target=x86 -c -fveclib=ArmPL %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s
-// RUN: not %clang --target=aarch64 -c -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s
 // RUN: not %clang --target=aarch64 -c -fveclib=SVML %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s
 // RUN: not %clang --target=aarch64 -c -fveclib=AMDLIBM %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s
 // CHECK-ERROR: unsupported option {{.*}} for target
@@ -43,6 +44,9 @@
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=libmvec -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s
 // CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC"
 
+// RUN: %clang -### --target=aarch64-linux-gnu -fveclib=libmvec -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC-AARCH64 %s
+// CHECK-LTO-LIBMVEC-AARCH64: "-plugin-opt=-vector-library=LIBMVEC"
+
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-AMDLIBM %s
 // CHECK-LTO-AMDLIBM: "-plugin-opt=-vector-library=AMDLIBM"
 
@@ -68,6 +72,10 @@
 // CHECK-ERRNO-LIBMVEC: "-fveclib=libmvec"
 // CHECK-ERRNO-LIBMVEC-SAME: "-fmath-errno"
 
+// RUN: %clang -### --target=aarch64-linux-gnu -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-ERRNO-LIBMVEC-AARCH64 %s
+// CHECK-ERRNO-LIBMVEC-AARCH64: "-fveclib=libmvec"
+// CHECK-ERRNO-LIBMVEC-AARCH64-SAME: "-fmath-errno"
+
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM %s 2>&1 | FileCheck --check-prefix=CHECK-ERRNO-AMDLIBM %s
 // CHECK-ERRNO-AMDLIBM: "-fveclib=AMDLIBM"
 // CHECK-ERRNO-AMDLIBM-SAME: "-fmath-errno"
diff --git a/llvm/include/llvm/Analysis/VecFuncs.def b/llvm/include/llvm/Analysis/VecFuncs.def
index 68753a2497db2..cb8e6755a486b 100644
--- a/llvm/include/llvm/Analysis/VecFuncs.def
+++ b/llvm/include/llvm/Analysis/VecFuncs.def
@@ -237,6 +237,305 @@ TLI_DEFINE_VECFUNC("llvm.log.f64", "_ZGVdN4v_log", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVbN4v_logf", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVdN8v_logf", FIXED(8), "_ZGV_LLVM_N8v")
 
+#elif defined(TLI_DEFINE_LIBMVEC_AARCH64_VF2_VECFUNCS)
+
+TLI_DEFINE_VECFUNC("acos", "_ZGVnN2v_acos", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("acosf", "_ZGVnN2v_acosf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.acos.f64", "_ZGVnN2v_acos", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.acos.f32", "_ZGVnN2v_acosf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("acosh", "_ZGVnN2v_acosh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("acoshf", "_ZGVnN2v_acoshf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("asin", "_ZGVnN2v_asin", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("asinf", "_ZGVnN2v_asinf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.asin.f64", "_ZGVnN2v_asin", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.asin.f32", "_ZGVnN2v_asinf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("asinh", "_ZGVnN2v_asinh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("asinhf", "_ZGVnN2v_asinhf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("atan", "_ZGVnN2v_atan", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("atanf", "_ZGVnN2v_atanf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.atan.f64", "_ZGVnN2v_atan", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.atan.f32", "_ZGVnN2v_atanf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("atan2", "_ZGVnN2vv_atan2", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("atan2f", "_ZGVnN2vv_atan2f", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f64", "_ZGVnN2vv_atan2", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f32", "_ZGVnN2vv_atan2f", "_ZGV_LLVM_N2vv")
+
+TLI_DEFINE_VECFUNC("atanh", "_ZGVnN2v_atanh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("atanhf", "_ZGVnN2v_atanhf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("cbrt", "_ZGVnN2v_cbrt", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("cbrtf", "_ZGVnN2v_cbrtf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("cos", "_ZGVnN2v_cos", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("cosf", "_ZGVnN2v_cosf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVnN2v_cos", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVnN2v_cosf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("cosh", "_ZGVnN2v_cosh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("coshf", "_ZGVnN2v_coshf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.cosh.f64", "_ZGVnN2v_cosh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.cosh.f32", "_ZGVnN2v_coshf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("erf", "_ZGVnN2v_erf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("erff", "_ZGVnN2v_erff", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("erfc", "_ZGVnN2v_erfc", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("erfcf", "_ZGVnN2v_erfcf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("exp", "_ZGVnN2v_exp", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("expf", "_ZGVnN2v_expf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp.f64", "_ZGVnN2v_exp", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp.f32", "_ZGVnN2v_expf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("exp10", "_ZGVnN2v_exp10", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("exp10f", "_ZGVnN2v_exp10f", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp10.f64", "_ZGVnN2v_exp10", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp10.f32", "_ZGVnN2v_exp10f", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("exp2", "_ZGVnN2v_exp2", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("exp2f", "_ZGVnN2v_exp2f", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp2.f64", "_ZGVnN2v_exp2", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp2.f32", "_ZGVnN2v_exp2f", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("expm1", "_ZGVnN2v_expm1", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("expm1f", "_ZGVnN2v_expm1f", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("hypot", "_ZGVnN2vv_hypot", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("hypotf", "_ZGVnN2vv_hypotf", "_ZGV_LLVM_N2vv")
+
+TLI_DEFINE_VECFUNC("log", "_ZGVnN2v_log", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("logf", "_ZGVnN2v_logf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log.f64", "_ZGVnN2v_log", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVnN2v_logf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("log10", "_ZGVnN2v_log10", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("log10f", "_ZGVnN2v_log10f", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log10.f64", "_ZGVnN2v_log10", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log10.f32", "_ZGVnN2v_log10f", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("log1p", "_ZGVnN2v_log1p", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("log1pf", "_ZGVnN2v_log1pf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("log2", "_ZGVnN2v_log2", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("log2f", "_ZGVnN2v_log2f", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log2.f64", "_ZGVnN2v_log2", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log2.f32", "_ZGVnN2v_log2f", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("pow", "_ZGVnN2vv_pow", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("powf", "_ZGVnN2vv_powf", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.pow.f64", "_ZGVnN2vv_pow", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.pow.f32", "_ZGVnN2vv_powf", "_ZGV_LLVM_N2vv")
+
+TLI_DEFINE_VECFUNC("sin", "_ZGVnN2v_sin", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("sinf", "_ZGVnN2v_sinf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.sin.f64", "_ZGVnN2v_sin", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.sin.f32", "_ZGVnN2v_sinf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("sinh", "_ZGVnN2v_sinh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("sinhf", "_ZGVnN2v_sinhf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.sinh.f64", "_ZGVnN2v_sinh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.sinh.f32", "_ZGVnN2v_sinhf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("tan", "_ZGVnN2v_tan", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("tanf", "_ZGVnN2v_tanf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.tan.f64", "_ZGVnN2v_tan", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.tan.f32", "_ZGVnN2v_tanf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("tanh", "_ZGVnN2v_tanh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("tanhf", "_ZGVnN2v_tanhf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.tanh.f64", "_ZGVnN2v_tanh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.tanh.f32", "_ZGVnN2v_tanhf", "_ZGV_LLVM_N2v")
+
+#elif defined(TLI_DEFINE_LIBMVEC_AARCH64_VF4_VECFUNCS)
+
+TLI_DEFINE_VECFUNC("acosf", "_ZGVnN4v_acosf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.acos.f32", "_ZGVnN4v_acosf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("acoshf", "_ZGVnN4v_acoshf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("asinf", "_ZGVnN4v_asinf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.asin.f32", "_ZGVnN4v_asinf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("asinhf", "_ZGVnN4v_asinhf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("atanf", "_ZGVnN4v_atanf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.atan.f32", "_ZGVnN4v_atanf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("atan2f", "_ZGVnN4vv_atan2f", "_ZGV_LLVM_N4vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f32", "_ZGVnN4vv_atan2f", "_ZGV_LLVM_N4vv")
+
+TLI_DEFINE_VECFUNC("atanhf", "_ZGVnN4v_atanhf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("cbrtf", "_ZGVnN4v_cbrtf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("cosf", "_ZGVnN4v_cosf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVnN4v_cosf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("coshf", "_ZGVnN4v_coshf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.cosh.f32", "_ZGVnN4v_coshf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("erff", "_ZGVnN4v_erff", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("erfcf", "_ZGVnN4v_erfcf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("expf", "_ZGVnN4v_expf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.exp.f32", "_ZGVnN4v_expf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("exp10f", "_ZGVnN4v_exp10f", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.exp10.f32", "_ZGVnN4v_exp10f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("exp2f", "_ZGVnN4v_exp2f", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.exp2.f32", "_ZGVnN4v_exp2f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("expm1f", "_ZGVnN4v_expm1f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("hypotf", "_ZGVnN4vv_hypotf", "_ZGV_LLVM_N4vv")
+
+TLI_DEFINE_VECFUNC("logf", "_ZGVnN4v_logf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVnN4v_logf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("log10f", "_ZGVnN4v_log10f", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.log10.f32", "_ZGVnN4v_log10f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("log1pf", "_ZGVnN4v_log1pf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("log2f", "_ZGVnN4v_log2f", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.log2.f32", "_ZGVnN4v_log2f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("powf", "_ZGVnN4vv_powf", "_ZGV_LLVM_N4vv")
+TLI_DEFINE_VECFUNC("llvm.pow.f32", "_ZGVnN4vv_powf", "_ZGV_LLVM_N4vv")
+
+TLI_DEFINE_VECFUNC("sinf", "_ZGVnN4v_sinf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.sin.f32", "_ZGVnN4v_sinf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("sinhf", "_ZGVnN4v_sinhf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.sinh.f32", "_ZGVnN4v_sinhf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("tanf", "_ZGVnN4v_tanf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.tan.f32", "_ZGVnN4v_tanf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("tanhf", "_ZGVnN4v_tanhf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.tanh.f32", "_ZGVnN4v_tanhf", "_ZGV_LLVM_N4v")
+
+#elif defined(TLI_DEFINE_LIBMVEC_AARCH64_SCALABLE_VECFUNCS)
+
+TLI_DEFINE_VECFUNC("acos", "_ZGVsMxv_acos", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("acosf", "_ZGVsMxv_acosf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.acos.f64", "_ZGVsMxv_acos", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.acos.f32", "_ZGVsMxv_acosf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("acosh", "_ZGVsMxv_acosh",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("acoshf", "_ZGVsMxv_acoshf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("asin", "_ZGVsMxv_asin", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("asinf", "_ZGVsMxv_asinf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.asin.f64", "_ZGVsMxv_asin", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.asin.f32", "_ZGVsMxv_asinf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("asinh", "_ZGVsMxv_asinh",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("asinhf", "_ZGVsMxv_asinhf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("atan", "_ZGVsMxv_atan", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("atanf", "_ZGVsMxv_atanf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.atan.f64", "_ZGVsMxv_atan", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.atan.f32", "_ZGVsMxv_atanf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("atan2", "_ZGVsMxvv_atan2", SCALABLE(2), MASKED, "_ZGVsMxvv")
+TLI_DEFINE_VECFUNC("atan2f", "_ZGVsMxvv_atan2f", SCALABLE(4), MASKED, "_ZGVsMxvv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f64", "_ZGVsMxvv_atan2", SCALABLE(2), MASKED, "_ZGVsMxvv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f32", "_ZGVsMxvv_atan2f", SCALABLE(4), MASKED, "_ZGVsMxvv")
+
+TLI_DEFINE_VECFUNC("atanh", "_ZGVsMxv_atanh",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("atanhf", "_ZGVsMxv_atanhf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("cbrt", "_ZGVsMxv_cbrt",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("cbrtf", "_ZGVsMxv_cbrtf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("cos", "_ZGVsMxv_cos",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("cosf", "_ZGVsMxv_cosf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVsMxv_cos", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVsMxv_cosf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("cosh", "_ZGVsMxv_cosh",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("coshf", "_ZGVsMxv_coshf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cosh.f64", "_ZGVsMxv_cosh", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cosh.f32", "_ZGVsMxv_coshf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("erf", "_ZGVsMxv_erf",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("erff", "_ZGVsMxv_erff", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("erfc", "_ZGVsMxv_erfc",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("erfcf", "_ZGVsMxv_erfcf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("exp", "_ZGVsMxv_exp",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("expf", "_ZGVsMxv_expf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp.f64", "_ZGVsMxv_exp", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp.f32", "_ZGVsMxv_expf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("exp10", "_ZGVsMxv_exp10",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("exp10f", "_ZGVsMxv_exp10f", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp10.f64", "_ZGVsMxv_exp10", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp10.f32", "_ZGVsMxv_exp10f", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("exp2", "_ZGVsMxv_exp2",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("exp2f", "_ZGVsMxv_exp2f", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp2.f64", "_ZGVsMxv_exp2", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp2.f32", "_ZGVsMxv_exp2f", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("expm1", "_ZGVsMxv_expm1",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("expm1f", "_ZGVsMxv_expm1f", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("hypot", "_ZGVsMxvv_hypot", SCALABLE(2), MASKED, "_ZGVsMxvv")
+TLI_DEFINE_VECFUNC("hypotf", "_ZGVsMxvv_hypotf", SCALABLE(4), MASKED, "_ZGVsMxvv")
+
+TLI_DEFINE_VECFUNC("log", "_ZGVsMxv_log",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("logf", "_ZGVsMxv_logf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.log.f64", "_ZGVsMxv_log", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVsMxv_logf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("log10", "_ZGVsMxv_log10",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("log10f", "_ZGVsMxv_log10f", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.log10.f64", "_ZGVsMxv_log10", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.log10.f32", "_ZGVsMxv_log10f", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("log1p", "_ZGVsMxv_log1p",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("log1pf", "_ZGVsMxv_log1pf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("log2", "_ZGVsMxv_log2",  SCALABLE(2), MASKED, "_ZGVsMxv")...
[truncated]
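
The vector names in the table above follow the vector function ABI pattern `_ZGV<isa><mask><len><params>_<scalar>`: `n` selects Advanced SIMD and `s` selects SVE, `N`/`M` mean unmasked/masked, the length is a lane count or `x` for scalable, and each `v` is one vector parameter. As a rough illustration only, a decoder for that pattern could look like the sketch below. `VecName` and `decodeVecName` are hypothetical names invented for this example; they are not part of LLVM, and the sketch only handles the shapes that appear in this patch.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper type for illustration; not part of LLVM.
struct VecName {
  char isa;            // 'n' = Advanced SIMD, 's' = SVE
  bool masked;         // 'M' = masked, 'N' = unmasked
  bool scalable;       // true when the length field is 'x'
  unsigned lanes;      // fixed lane count; 0 when scalable
  std::string params;  // one letter per parameter, e.g. "vv"
  std::string scalar;  // scalar function name, e.g. "atan2f"
};

// Splits names of the form _ZGV<isa><mask><len><params>_<scalar>.
// Returns std::nullopt for anything that does not match that shape.
std::optional<VecName> decodeVecName(std::string_view s) {
  constexpr std::string_view prefix = "_ZGV";
  if (s.substr(0, prefix.size()) != prefix) return std::nullopt;
  s.remove_prefix(prefix.size());
  if (s.size() < 2) return std::nullopt;

  VecName out{};
  out.isa = s[0];
  if (s[1] != 'M' && s[1] != 'N') return std::nullopt;
  out.masked = s[1] == 'M';
  s.remove_prefix(2);

  if (!s.empty() && s[0] == 'x') {
    // Scalable length, as used by the SVE entries.
    out.scalable = true;
    s.remove_prefix(1);
  } else {
    // Fixed length: one or more decimal digits.
    std::size_t i = 0;
    while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i]))) {
      out.lanes = out.lanes * 10 + static_cast<unsigned>(s[i] - '0');
      ++i;
    }
    if (i == 0) return std::nullopt;
    s.remove_prefix(i);
  }

  // The first underscore separates the parameter letters from the scalar name.
  const std::size_t us = s.find('_');
  if (us == std::string_view::npos || us == 0 || us + 1 == s.size())
    return std::nullopt;
  out.params = std::string(s.substr(0, us));
  out.scalar = std::string(s.substr(us + 1));
  return out;
}
```

For example, `decodeVecName("_ZGVsMxv_acosf")` reports an SVE, masked, scalable variant of `acosf` with one vector parameter, while `decodeVecName("_ZGVnN2vv_atan2f")` reports an unmasked two-lane Advanced SIMD variant of `atan2f` with two vector parameters.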

llvmbot commented Jun 11, 2025

@llvm/pr-subscribers-clang

Author: Mary Kassayova (marykass-arm)

Changes

This patch adds support for the libmvec vector library on AArch64 targets. Currently, all libmvec functions in GLIBC version 2.40 are supported. The full list of math functions enabled can be found here (up to GLIBC 2.40).
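
As a rough illustration of the kind of code this affects, the loop below calls a scalar math function once per element. `sinAll` is a hypothetical function written for this example. Assuming a toolchain that includes this patch, it might be compiled with something like `clang++ -O2 --target=aarch64-linux-gnu -fveclib=libmvec -c`, in which case the loop vectorizer may replace the scalar call with a libmvec entry point such as `_ZGVnN2v_sin`. Whether it actually does depends on the optimization level, the cost model, and the math-errno setting, so treat this as a sketch rather than a guaranteed outcome.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical example function: element-wise sine over a buffer.
// The loop body is a plain scalar libm call, which is the pattern the
// vector-library mappings are meant to match.
void sinAll(const std::vector<double>& in, std::vector<double>& out) {
  out.resize(in.size());
  for (std::size_t i = 0; i < in.size(); ++i)
    out[i] = std::sin(in[i]);
}
```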

Previously, libmvec was only supported on x86_64 targets. Attempts to use it on AArch64 resulted in the following error from Clang: unsupported option 'libmvec' for target 'aarch64'.
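
The driver-side effect of the patch can be modelled in isolation. The sketch below is a simplified standalone model of the architecture checks in the `Clang.cpp` change, not the driver code itself. `Arch`, `acceptsLibmvec`, and `acceptsAMDLIBM` are hypothetical names, and `Arch` is a stand-in for the handful of `llvm::Triple` architectures that the checks mention.

```cpp
// Hypothetical stand-in for the architectures named in the driver checks.
enum class Arch { x86, x86_64, aarch64, aarch64_be, other };

constexpr bool isX86(Arch a) { return a == Arch::x86 || a == Arch::x86_64; }

constexpr bool isAArch64(Arch a) {
  return a == Arch::aarch64 || a == Arch::aarch64_be;
}

// After the patch, -fveclib=libmvec is accepted on x86 and on AArch64.
constexpr bool acceptsLibmvec(Arch a) { return isX86(a) || isAArch64(a); }

// -fveclib=AMDLIBM keeps its x86-only restriction.
constexpr bool acceptsAMDLIBM(Arch a) { return isX86(a); }
```

In this model a rejected combination corresponds to the driver emitting the `unsupported option ... for target` diagnostic quoted above. Splitting `libmvec` and `AMDLIBM` into separate predicates mirrors how the patch separates the previously shared branch.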


Patch is 219.60 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/143696.diff

11 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+2)
  • (modified) clang/include/clang/Driver/Options.td (+3-2)
  • (modified) clang/lib/Driver/ToolChains/Clang.cpp (+8-1)
  • (modified) clang/test/Driver/fveclib.c (+9-1)
  • (modified) llvm/include/llvm/Analysis/VecFuncs.def (+299)
  • (modified) llvm/lib/Analysis/TargetLibraryInfo.cpp (+30)
  • (added) llvm/test/CodeGen/AArch64/replace-with-veclib-libmvec-scalable.ll (+579)
  • (added) llvm/test/CodeGen/AArch64/replace-with-veclib-libmvec.ll (+577)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/veclib-function-calls.ll (+690)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/veclib-intrinsic-calls.ll (+502)
  • (modified) llvm/test/Transforms/Util/add-TLI-mappings.ll (+23-5)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index b5e6cf088a4b1..11c23064ab604 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -357,6 +357,8 @@ Modified Compiler Flags
 
 - The ``-fchar8_t`` flag is no longer considered in non-C++ languages modes. (#GH55373)
 
+- The ``-fveclib=libmvec`` option now supports AArch64 targets (requires GLIBC 2.40 or newer).
+
 Removed Compiler Flags
 -------------------------
 
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 152df89118a6a..b886b75fa4fa9 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3473,8 +3473,9 @@ def fveclib : Joined<["-"], "fveclib=">, Group<f_Group>,
   Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>,
     HelpText<"Use the given vector functions library">,
     HelpTextForVariants<[ClangOption, CC1Option],
-      "Use the given vector functions library. "
-      "Note: -fveclib={ArmPL,SLEEF} implies -fno-math-errno">,
+      "Use the given vector functions library.\n"
+      "  Note: -fveclib={ArmPL,SLEEF,libmvec} implies -fno-math-errno.\n"
+      "  Note: -fveclib=libmvec on AArch64 requires GLIBC 2.40 or newer.">,
     Values<"Accelerate,libmvec,MASSV,SVML,SLEEF,Darwin_libsystem_m,ArmPL,AMDLIBM,none">,
     NormalizedValuesScope<"llvm::driver::VectorLibrary">,
     NormalizedValues<["Accelerate", "LIBMVEC", "MASSV", "SVML", "SLEEF",
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index a74fa81f3cf5b..fdc023d193aa9 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -5683,11 +5683,18 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
           Triple.getArch() != llvm::Triple::x86_64)
         D.Diag(diag::err_drv_unsupported_opt_for_target)
             << Name << Triple.getArchName();
-    } else if (Name == "libmvec" || Name == "AMDLIBM") {
+    } else if (Name == "AMDLIBM") {
       if (Triple.getArch() != llvm::Triple::x86 &&
           Triple.getArch() != llvm::Triple::x86_64)
         D.Diag(diag::err_drv_unsupported_opt_for_target)
             << Name << Triple.getArchName();
+    } else if (Name == "libmvec") {
+      if (Triple.getArch() != llvm::Triple::x86 &&
+          Triple.getArch() != llvm::Triple::x86_64 &&
+          Triple.getArch() != llvm::Triple::aarch64 &&
+          Triple.getArch() != llvm::Triple::aarch64_be)
+        D.Diag(diag::err_drv_unsupported_opt_for_target)
+            << Name << Triple.getArchName();
     } else if (Name == "SLEEF" || Name == "ArmPL") {
       if (Triple.getArch() != llvm::Triple::aarch64 &&
           Triple.getArch() != llvm::Triple::aarch64_be &&
diff --git a/clang/test/Driver/fveclib.c b/clang/test/Driver/fveclib.c
index 5420555c36a2a..c57e9aa7a3cc2 100644
--- a/clang/test/Driver/fveclib.c
+++ b/clang/test/Driver/fveclib.c
@@ -1,6 +1,7 @@
 // RUN: %clang -### -c -fveclib=none %s 2>&1 | FileCheck --check-prefix=CHECK-NOLIB %s
 // RUN: %clang -### -c -fveclib=Accelerate %s 2>&1 | FileCheck --check-prefix=CHECK-ACCELERATE %s
 // RUN: %clang -### -c --target=x86_64-unknown-linux-gnu -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-libmvec %s
+// RUN: %clang -### -c --target=aarch64-linux-gnu -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-LIBMVEC-AARCH64 %s
 // RUN: %clang -### -c --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM %s 2>&1 | FileCheck --check-prefix=CHECK-AMDLIBM %s
 // RUN: %clang -### -c -fveclib=MASSV %s 2>&1 | FileCheck --check-prefix=CHECK-MASSV %s
 // RUN: %clang -### -c -fveclib=Darwin_libsystem_m %s 2>&1 | FileCheck --check-prefix=CHECK-DARWIN_LIBSYSTEM_M %s
@@ -12,6 +13,7 @@
 // CHECK-NOLIB: "-fveclib=none"
 // CHECK-ACCELERATE: "-fveclib=Accelerate"
 // CHECK-libmvec: "-fveclib=libmvec"
+// CHECK-LIBMVEC-AARCH64: "-fveclib=libmvec"
 // CHECK-AMDLIBM: "-fveclib=AMDLIBM"
 // CHECK-MASSV: "-fveclib=MASSV"
 // CHECK-DARWIN_LIBSYSTEM_M: "-fveclib=Darwin_libsystem_m"
@@ -23,7 +25,6 @@
 
 // RUN: not %clang --target=x86 -c -fveclib=SLEEF %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s
 // RUN: not %clang --target=x86 -c -fveclib=ArmPL %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s
-// RUN: not %clang --target=aarch64 -c -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s
 // RUN: not %clang --target=aarch64 -c -fveclib=SVML %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s
 // RUN: not %clang --target=aarch64 -c -fveclib=AMDLIBM %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s
 // CHECK-ERROR: unsupported option {{.*}} for target
@@ -43,6 +44,9 @@
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=libmvec -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s
 // CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC"
 
+// RUN: %clang -### --target=aarch64-linux-gnu -fveclib=libmvec -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC-AARCH64 %s
+// CHECK-LTO-LIBMVEC-AARCH64: "-plugin-opt=-vector-library=LIBMVEC"
+
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-AMDLIBM %s
 // CHECK-LTO-AMDLIBM: "-plugin-opt=-vector-library=AMDLIBM"
 
@@ -68,6 +72,10 @@
 // CHECK-ERRNO-LIBMVEC: "-fveclib=libmvec"
 // CHECK-ERRNO-LIBMVEC-SAME: "-fmath-errno"
 
+// RUN: %clang -### --target=aarch64-linux-gnu -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-ERRNO-LIBMVEC-AARCH64 %s
+// CHECK-ERRNO-LIBMVEC-AARCH64: "-fveclib=libmvec"
+// CHECK-ERRNO-LIBMVEC-AARCH64-SAME: "-fmath-errno"
+
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM %s 2>&1 | FileCheck --check-prefix=CHECK-ERRNO-AMDLIBM %s
 // CHECK-ERRNO-AMDLIBM: "-fveclib=AMDLIBM"
 // CHECK-ERRNO-AMDLIBM-SAME: "-fmath-errno"
diff --git a/llvm/include/llvm/Analysis/VecFuncs.def b/llvm/include/llvm/Analysis/VecFuncs.def
index 68753a2497db2..cb8e6755a486b 100644
--- a/llvm/include/llvm/Analysis/VecFuncs.def
+++ b/llvm/include/llvm/Analysis/VecFuncs.def
@@ -237,6 +237,305 @@ TLI_DEFINE_VECFUNC("llvm.log.f64", "_ZGVdN4v_log", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVbN4v_logf", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVdN8v_logf", FIXED(8), "_ZGV_LLVM_N8v")
 
+#elif defined(TLI_DEFINE_LIBMVEC_AARCH64_VF2_VECFUNCS)
+
+TLI_DEFINE_VECFUNC("acos", "_ZGVnN2v_acos", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("acosf", "_ZGVnN2v_acosf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.acos.f64", "_ZGVnN2v_acos", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.acos.f32", "_ZGVnN2v_acosf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("acosh", "_ZGVnN2v_acosh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("acoshf", "_ZGVnN2v_acoshf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("asin", "_ZGVnN2v_asin", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("asinf", "_ZGVnN2v_asinf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.asin.f64", "_ZGVnN2v_asin", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.asin.f32", "_ZGVnN2v_asinf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("asinh", "_ZGVnN2v_asinh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("asinhf", "_ZGVnN2v_asinhf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("atan", "_ZGVnN2v_atan", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("atanf", "_ZGVnN2v_atanf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.atan.f64", "_ZGVnN2v_atan", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.atan.f32", "_ZGVnN2v_atanf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("atan2", "_ZGVnN2vv_atan2", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("atan2f", "_ZGVnN2vv_atan2f", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f64", "_ZGVnN2vv_atan2", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f32", "_ZGVnN2vv_atan2f", "_ZGV_LLVM_N2vv")
+
+TLI_DEFINE_VECFUNC("atanh", "_ZGVnN2v_atanh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("atanhf", "_ZGVnN2v_atanhf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("cbrt", "_ZGVnN2v_cbrt", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("cbrtf", "_ZGVnN2v_cbrtf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("cos", "_ZGVnN2v_cos", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("cosf", "_ZGVnN2v_cosf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVnN2v_cos", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVnN2v_cosf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("cosh", "_ZGVnN2v_cosh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("coshf", "_ZGVnN2v_coshf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.cosh.f64", "_ZGVnN2v_cosh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.cosh.f32", "_ZGVnN2v_coshf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("erf", "_ZGVnN2v_erf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("erff", "_ZGVnN2v_erff", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("erfc", "_ZGVnN2v_erfc", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("erfcf", "_ZGVnN2v_erfcf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("exp", "_ZGVnN2v_exp", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("expf", "_ZGVnN2v_expf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp.f64", "_ZGVnN2v_exp", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp.f32", "_ZGVnN2v_expf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("exp10", "_ZGVnN2v_exp10", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("exp10f", "_ZGVnN2v_exp10f", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp10.f64", "_ZGVnN2v_exp10", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp10.f32", "_ZGVnN2v_exp10f", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("exp2", "_ZGVnN2v_exp2", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("exp2f", "_ZGVnN2v_exp2f", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp2.f64", "_ZGVnN2v_exp2", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.exp2.f32", "_ZGVnN2v_exp2f", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("expm1", "_ZGVnN2v_expm1", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("expm1f", "_ZGVnN2v_expm1f", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("hypot", "_ZGVnN2vv_hypot", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("hypotf", "_ZGVnN2vv_hypotf", "_ZGV_LLVM_N2vv")
+
+TLI_DEFINE_VECFUNC("log", "_ZGVnN2v_log", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("logf", "_ZGVnN2v_logf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log.f64", "_ZGVnN2v_log", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVnN2v_logf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("log10", "_ZGVnN2v_log10", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("log10f", "_ZGVnN2v_log10f", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log10.f64", "_ZGVnN2v_log10", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log10.f32", "_ZGVnN2v_log10f", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("log1p", "_ZGVnN2v_log1p", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("log1pf", "_ZGVnN2v_log1pf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("log2", "_ZGVnN2v_log2", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("log2f", "_ZGVnN2v_log2f", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log2.f64", "_ZGVnN2v_log2", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.log2.f32", "_ZGVnN2v_log2f", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("pow", "_ZGVnN2vv_pow", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("powf", "_ZGVnN2vv_powf", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.pow.f64", "_ZGVnN2vv_pow", "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.pow.f32", "_ZGVnN2vv_powf", "_ZGV_LLVM_N2vv")
+
+TLI_DEFINE_VECFUNC("sin", "_ZGVnN2v_sin", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("sinf", "_ZGVnN2v_sinf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.sin.f64", "_ZGVnN2v_sin", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.sin.f32", "_ZGVnN2v_sinf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("sinh", "_ZGVnN2v_sinh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("sinhf", "_ZGVnN2v_sinhf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.sinh.f64", "_ZGVnN2v_sinh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.sinh.f32", "_ZGVnN2v_sinhf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("tan", "_ZGVnN2v_tan", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("tanf", "_ZGVnN2v_tanf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.tan.f64", "_ZGVnN2v_tan", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.tan.f32", "_ZGVnN2v_tanf", "_ZGV_LLVM_N2v")
+
+TLI_DEFINE_VECFUNC("tanh", "_ZGVnN2v_tanh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("tanhf", "_ZGVnN2v_tanhf", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.tanh.f64", "_ZGVnN2v_tanh", "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.tanh.f32", "_ZGVnN2v_tanhf", "_ZGV_LLVM_N2v")
+
+#elif defined(TLI_DEFINE_LIBMVEC_AARCH64_VF4_VECFUNCS)
+
+TLI_DEFINE_VECFUNC("acosf", "_ZGVnN4v_acosf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.acos.f32", "_ZGVnN4v_acosf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("acoshf", "_ZGVnN4v_acoshf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("asinf", "_ZGVnN4v_asinf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.asin.f32", "_ZGVnN4v_asinf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("asinhf", "_ZGVnN4v_asinhf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("atanf", "_ZGVnN4v_atanf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.atan.f32", "_ZGVnN4v_atanf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("atan2f", "_ZGVnN4vv_atan2f", "_ZGV_LLVM_N4vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f32", "_ZGVnN4vv_atan2f", "_ZGV_LLVM_N4vv")
+
+TLI_DEFINE_VECFUNC("atanhf", "_ZGVnN4v_atanhf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("cbrtf", "_ZGVnN4v_cbrtf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("cosf", "_ZGVnN4v_cosf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVnN4v_cosf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("coshf", "_ZGVnN4v_coshf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.cosh.f32", "_ZGVnN4v_coshf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("erff", "_ZGVnN4v_erff", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("erfcf", "_ZGVnN4v_erfcf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("expf", "_ZGVnN4v_expf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.exp.f32", "_ZGVnN4v_expf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("exp10f", "_ZGVnN4v_exp10f", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.exp10.f32", "_ZGVnN4v_exp10f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("exp2f", "_ZGVnN4v_exp2f", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.exp2.f32", "_ZGVnN4v_exp2f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("expm1f", "_ZGVnN4v_expm1f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("hypotf", "_ZGVnN4vv_hypotf", "_ZGV_LLVM_N4vv")
+
+TLI_DEFINE_VECFUNC("logf", "_ZGVnN4v_logf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVnN4v_logf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("log10f", "_ZGVnN4v_log10f", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.log10.f32", "_ZGVnN4v_log10f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("log1pf", "_ZGVnN4v_log1pf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("log2f", "_ZGVnN4v_log2f", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.log2.f32", "_ZGVnN4v_log2f", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("powf", "_ZGVnN4vv_powf", "_ZGV_LLVM_N4vv")
+TLI_DEFINE_VECFUNC("llvm.pow.f32", "_ZGVnN4vv_powf", "_ZGV_LLVM_N4vv")
+
+TLI_DEFINE_VECFUNC("sinf", "_ZGVnN4v_sinf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.sin.f32", "_ZGVnN4v_sinf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("sinhf", "_ZGVnN4v_sinhf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.sinh.f32", "_ZGVnN4v_sinhf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("tanf", "_ZGVnN4v_tanf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.tan.f32", "_ZGVnN4v_tanf", "_ZGV_LLVM_N4v")
+
+TLI_DEFINE_VECFUNC("tanhf", "_ZGVnN4v_tanhf", "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.tanh.f32", "_ZGVnN4v_tanhf", "_ZGV_LLVM_N4v")
+
+#elif defined(TLI_DEFINE_LIBMVEC_AARCH64_SCALABLE_VECFUNCS)
+
+TLI_DEFINE_VECFUNC("acos", "_ZGVsMxv_acos", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("acosf", "_ZGVsMxv_acosf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.acos.f64", "_ZGVsMxv_acos", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.acos.f32", "_ZGVsMxv_acosf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("acosh", "_ZGVsMxv_acosh",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("acoshf", "_ZGVsMxv_acoshf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("asin", "_ZGVsMxv_asin", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("asinf", "_ZGVsMxv_asinf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.asin.f64", "_ZGVsMxv_asin", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.asin.f32", "_ZGVsMxv_asinf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("asinh", "_ZGVsMxv_asinh",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("asinhf", "_ZGVsMxv_asinhf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("atan", "_ZGVsMxv_atan", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("atanf", "_ZGVsMxv_atanf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.atan.f64", "_ZGVsMxv_atan", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.atan.f32", "_ZGVsMxv_atanf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("atan2", "_ZGVsMxvv_atan2", SCALABLE(2), MASKED, "_ZGVsMxvv")
+TLI_DEFINE_VECFUNC("atan2f", "_ZGVsMxvv_atan2f", SCALABLE(4), MASKED, "_ZGVsMxvv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f64", "_ZGVsMxvv_atan2", SCALABLE(2), MASKED, "_ZGVsMxvv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f32", "_ZGVsMxvv_atan2f", SCALABLE(4), MASKED, "_ZGVsMxvv")
+
+TLI_DEFINE_VECFUNC("atanh", "_ZGVsMxv_atanh",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("atanhf", "_ZGVsMxv_atanhf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("cbrt", "_ZGVsMxv_cbrt",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("cbrtf", "_ZGVsMxv_cbrtf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("cos", "_ZGVsMxv_cos",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("cosf", "_ZGVsMxv_cosf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVsMxv_cos", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVsMxv_cosf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("cosh", "_ZGVsMxv_cosh",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("coshf", "_ZGVsMxv_coshf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cosh.f64", "_ZGVsMxv_cosh", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cosh.f32", "_ZGVsMxv_coshf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("erf", "_ZGVsMxv_erf",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("erff", "_ZGVsMxv_erff", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("erfc", "_ZGVsMxv_erfc",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("erfcf", "_ZGVsMxv_erfcf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("exp", "_ZGVsMxv_exp",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("expf", "_ZGVsMxv_expf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp.f64", "_ZGVsMxv_exp", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp.f32", "_ZGVsMxv_expf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("exp10", "_ZGVsMxv_exp10",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("exp10f", "_ZGVsMxv_exp10f", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp10.f64", "_ZGVsMxv_exp10", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp10.f32", "_ZGVsMxv_exp10f", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("exp2", "_ZGVsMxv_exp2",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("exp2f", "_ZGVsMxv_exp2f", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp2.f64", "_ZGVsMxv_exp2", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.exp2.f32", "_ZGVsMxv_exp2f", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("expm1", "_ZGVsMxv_expm1",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("expm1f", "_ZGVsMxv_expm1f", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("hypot", "_ZGVsMxvv_hypot", SCALABLE(2), MASKED, "_ZGVsMxvv")
+TLI_DEFINE_VECFUNC("hypotf", "_ZGVsMxvv_hypotf", SCALABLE(4), MASKED, "_ZGVsMxvv")
+
+TLI_DEFINE_VECFUNC("log", "_ZGVsMxv_log",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("logf", "_ZGVsMxv_logf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.log.f64", "_ZGVsMxv_log", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.log.f32", "_ZGVsMxv_logf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("log10", "_ZGVsMxv_log10",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("log10f", "_ZGVsMxv_log10f", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.log10.f64", "_ZGVsMxv_log10", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.log10.f32", "_ZGVsMxv_log10f", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("log1p", "_ZGVsMxv_log1p",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("log1pf", "_ZGVsMxv_log1pf", SCALABLE(4), MASKED, "_ZGVsMxv")
+
+TLI_DEFINE_VECFUNC("log2", "_ZGVsMxv_log2",  SCALABLE(2), MASKED, "_ZGVsMxv")...
[truncated]
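
As a reading aid for the names in the table above, the entries appear to follow the vector function ABI naming scheme: `_ZGV`, an ISA letter (`n` for Advanced SIMD, `s` for SVE), a mask letter (`N` unmasked, `M` masked), the lane count (`x` for scalable), one `v` per vector parameter, then `_` and the scalar name. The helper below is a hypothetical sketch that rebuilds such names; `vectorVariantName` is not an LLVM function.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of LLVM: assembles a vector-variant name of
// the shape used in the table above.
//   isa:    'n' for Advanced SIMD (NEON), 's' for SVE
//   masked: true -> 'M', false -> 'N'
//   vf:     lane count, or 0 to request the scalable marker 'x'
//   params: number of vector parameters, each encoded as 'v'
std::string vectorVariantName(char isa, bool masked, unsigned vf,
                              unsigned params, const std::string &scalar) {
  std::string name = "_ZGV";
  name += isa;
  name += masked ? 'M' : 'N';
  name += (vf == 0) ? std::string("x") : std::to_string(vf);
  name.append(params, 'v');
  return name + "_" + scalar;
}
```

For example, `vectorVariantName('n', false, 4, 1, "sinf")` yields `_ZGVnN4v_sinf`, and `vectorVariantName('s', true, 0, 2, "atan2")` yields `_ZGVsMxvv_atan2`, matching the corresponding rows in the diff.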

@marykass-arm
Contributor Author

marykass-arm commented Jun 11, 2025

Comment on lines 1302 to 1317
static const VecDesc VecFuncs_LIBMVEC_AARCH64_VF2[] = {
#define TLI_DEFINE_LIBMVEC_AARCH64_VF2_VECFUNCS
#define TLI_DEFINE_VECFUNC(SCAL, VEC, VABI_PREFIX) \
{SCAL, VEC, /* VF = */ FIXED(2), /* MASK = */ false, VABI_PREFIX, \
/* CC = */ CallingConv::AArch64_VectorCall},
#include "llvm/Analysis/VecFuncs.def"
#undef TLI_DEFINE_LIBMVEC_AARCH64_VF2_VECFUNCS
};
static const VecDesc VecFuncs_LIBMVEC_AARCH64_VF4[] = {
#define TLI_DEFINE_LIBMVEC_AARCH64_VF4_VECFUNCS
#define TLI_DEFINE_VECFUNC(SCAL, VEC, VABI_PREFIX) \
{SCAL, VEC, /* VF = */ FIXED(4), /* MASK = */ false, VABI_PREFIX, \
/* CC = */ CallingConv::AArch64_VectorCall},
#include "llvm/Analysis/VecFuncs.def"
#undef TLI_DEFINE_LIBMVEC_AARCH64_VF4_VECFUNCS
};
Collaborator

I'm not sure I see much value in those being defined by separate macros. Can they be merged?

Collaborator

@paulwalker-arm paulwalker-arm Jun 11, 2025

I don't know, I was thinking of copying this idiom to remove much of the redundancy from how we currently define the ArmPL routines. That said, I'm happy either way.

Collaborator

The reason I asked is that when all mappings for a given function (e.g. sin) are defined together, it is easier to see which variants are missing, whereas now they are artificially split between two macros.

Collaborator

Fair enough, in which case I suggest following the idiom we use for ArmPL. I'm happy to circle back to see about fulfilling both ideals. @marykass-arm sorry about the code churn.

Contributor Author

@paulwalker-arm @sdesmalen-arm Should I merge all three macros (following ArmPL) or just merge fixed and keep scalable separated?

Collaborator

Based on Sander's rationale I'd merge all three macros. This means we're switching from the SLEEF style to the ArmPL style, which I prefer over creating a new one.

Collaborator

Yes that would be even better, thanks!
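
The merged layout agreed on above can be sketched roughly as follows: one row macro carries the lane count and mask flag explicitly, so every variant of a scalar function sits next to its siblings in a single table. This is a simplified illustration, not the code from the patch. `Lanes`, `VecDescSketch`, `SKETCH_LIBMVEC_AARCH64_VECFUNCS` and `countVariants` are hypothetical stand-ins, the list macro replaces the `#include "llvm/Analysis/VecFuncs.def"` step, and the SVE `sinf` row is extrapolated from the naming pattern of the other SVE rows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Simplified stand-ins for llvm::ElementCount and llvm::VecDesc.
struct Lanes {
  unsigned Min;   // lane count, or minimum lane count when Scalable
  bool Scalable;
};
struct VecDescSketch {
  const char *Scalar;
  const char *Vector;
  Lanes VF;
  bool Masked;
  const char *VABIPrefix;
};

#define FIXED(NL) Lanes{NL, false}
#define SCALABLE(NL) Lanes{NL, true}
#define NOMASK false
#define MASKED true

// Stand-in for the .def file: fixed-width and scalable variants of the same
// scalar function are listed together rather than in per-width sections.
#define SKETCH_LIBMVEC_AARCH64_VECFUNCS(ROW)                                   \
  ROW("cosf", "_ZGVsMxv_cosf", SCALABLE(4), MASKED, "_ZGVsMxv")                \
  ROW("sinf", "_ZGVnN4v_sinf", FIXED(4), NOMASK, "_ZGV_LLVM_N4v")              \
  ROW("sinf", "_ZGVsMxv_sinf", SCALABLE(4), MASKED, "_ZGVsMxv")

// A single row macro builds every entry, whatever its width or mask flag.
#define TLI_DEFINE_VECFUNC(SCAL, VEC, VF, MASK, VABI_PREFIX)                   \
  {SCAL, VEC, VF, MASK, VABI_PREFIX},

static const VecDescSketch VecFuncsSketch[] = {
    SKETCH_LIBMVEC_AARCH64_VECFUNCS(TLI_DEFINE_VECFUNC)};
#undef TLI_DEFINE_VECFUNC

// Counts how many vector variants the table holds for one scalar name.
std::size_t countVariants(const char *Scalar) {
  std::size_t N = 0;
  for (const VecDescSketch &D : VecFuncsSketch)
    if (std::strcmp(D.Scalar, Scalar) == 0)
      ++N;
  return N;
}
```

With the rows grouped like this, a missing variant shows up as a gap next to its siblings, which is the readability argument made in the thread.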

Comment on lines 1320 to 1321
#define TLI_DEFINE_VECFUNC(SCAL, VEC, VF, MASK, VABI_PREFIX) \
{SCAL, VEC, VF, MASK, VABI_PREFIX, /* CC = */ std::nullopt},
Collaborator

@paulwalker-arm paulwalker-arm Jun 11, 2025

All the SVE functions are masked so you could pass in /* MASK = */ true.
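
One way to read this suggestion: a table that only ever holds masked entries does not need a `MASK` parameter, because the row macro can set the flag itself. The sketch below assumes simplified stand-in types. `SveVecDescSketch`, `CallConvSketch`, `SKETCH_SVE_VECFUNC` and `allMasked` are hypothetical names, and `CallConvSketch::Default` merely stands in for the `std::nullopt` calling convention in the quoted lines.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for the optional calling-convention field of the real descriptor.
enum class CallConvSketch { Default, AArch64VectorCall };

struct SveVecDescSketch {
  const char *Scalar;
  const char *Vector;
  unsigned MinLanes; // minimum lane count of the scalable vector
  bool Masked;
  const char *VABIPrefix;
  CallConvSketch CC;
};

// No MASK parameter: the flag is fixed where the rows are built, so an
// individual row cannot accidentally claim to be unmasked.
#define SKETCH_SVE_VECFUNC(SCAL, VEC, MIN_LANES, VABI_PREFIX)                  \
  {SCAL, VEC, MIN_LANES, /* MASK = */ true, VABI_PREFIX,                       \
   /* CC = */ CallConvSketch::Default},

static const SveVecDescSketch SveVecFuncsSketch[] = {
    SKETCH_SVE_VECFUNC("acos", "_ZGVsMxv_acos", 2, "_ZGVsMxv")
    SKETCH_SVE_VECFUNC("acosf", "_ZGVsMxv_acosf", 4, "_ZGVsMxv")};
#undef SKETCH_SVE_VECFUNC

// Returns true when every row in the sketch table is marked as masked.
bool allMasked() {
  for (const SveVecDescSketch &D : SveVecFuncsSketch)
    if (!D.Masked)
      return false;
  return true;
}
```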

@MacDue MacDue self-requested a review June 12, 2025 10:06

github-actions bot commented Jun 12, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Change-Id: Ib750e05d6daeca404a02b214272727827af13b2d
Member

@MacDue MacDue left a comment

I double-checked the functions added cover up to 2.40 -- I think they do 👍 Generally, LGTM -- but I'll leave final approval to Sander/Paul.

Collaborator

@paulwalker-arm paulwalker-arm left a comment

I could not spot any tests for the <2 x float> vector functions? Hopefully it's as easy as adding extra RUN lines to veclib-function-calls.ll and veclib-intrinsic-calls.ll. I'm thinking variants of LIBMVEC-NEON that use -force-vector-width=2. ~~If this works then it's worth changing the existing LIBMVEC-NEON RUN lines to use -force-vector-width=4 so that both variants are locked in.~~

FYI: I've crossed out my final suggestion because that would also force the need for <4 x double> vector functions, which we don't have.

…ping

Change-Id: Ia863b7dee5585d782cd049d99758b7aba04b5c2c
@marykass-arm marykass-arm force-pushed the libmvec branch 2 times, most recently from e4d49c4 to 8264797 Compare June 16, 2025 15:01
Collaborator

@paulwalker-arm paulwalker-arm left a comment

The documentation build failure has been verified locally as an existing upstream issue and not the fault of this PR.

@sdesmalen-arm sdesmalen-arm merged commit c377ce1 into llvm:main Jun 17, 2025
12 of 15 checks passed

@marykass-arm Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

ajaden-codes pushed a commit to Jaddyen/llvm-project that referenced this pull request Jun 17, 2025
This patch adds support for the `libmvec` vector library on AArch64
targets. Currently, all `libmvec` functions in GLIBC version 2.40 are
supported. The full list of math functions enabled can be found
[here](https://github.com/bminor/glibc/blob/96abd59bf2a11ddd4e7ccaac840ec13c0b62d3ba/sysdeps/aarch64/fpu/Versions)
(up to GLIBC 2.40).

Previously, `libmvec` was only supported on x86_64 targets. Attempts to
use it on AArch64 resulted in the following error from Clang:
`unsupported option 'libmvec' for target 'aarch64'`.
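
For illustration, the kind of source loop this option targets might look like the sketch below. The function name `sinAll` is hypothetical. The suggestion that a command along the lines of `clang++ -O3 -fno-math-errno -fveclib=libmvec` on an AArch64 Linux target could turn the scalar calls into calls such as `_ZGVnN4v_sinf` is an assumption, not something shown in this PR; whether that happens depends on the compiler version, the flags, and the cost model.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Element-wise sine over a float array. Each iteration is independent, so a
// vectorizer with a vector math library available may replace the scalar
// call with a vector variant; without one, this stays a plain scalar loop.
void sinAll(const float *in, float *out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    out[i] = std::sin(in[i]); // float overload, i.e. the scalar sinf
}
```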