Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix mechanism propagating mangled names for TLI function mappings #66656

Conversation

JolantaJensen
Copy link
Contributor

Currently the mappings from TLI are used to generate the list of available "scalar to vector" mappings attached to scalar calls as "vector-function-abi-variant" LLVM IR attribute. Function names from TLI are wrapped in mangled name following the pattern:
ZGV<scalar_name>[(<vector_redirection>)] The problem is the mangled name uses LLVM as the ISA name which prevents the compiler to compute vectorization factor for scalable vectors as it cannot make any decision based on the LLVM ISA. If we use "s" as the ISA name, the compiler can make decisions based on VFABI specification where SVE spacific rules are described.

This patch is only a refactoring stage where there is no change to the compiler's behaviour.

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 18, 2023

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Changes

Currently the mappings from TLI are used to generate the list of available "scalar to vector" mappings attached to scalar calls as "vector-function-abi-variant" LLVM IR attribute. Function names from TLI are wrapped in mangled name following the pattern:
ZGV<isa><mask><vlen><parameters><scalar_name>[(<vector_redirection>)] The problem is the mangled name uses LLVM as the ISA name which prevents the compiler to compute vectorization factor for scalable vectors as it cannot make any decision based on the LLVM ISA. If we use "s" as the ISA name, the compiler can make decisions based on VFABI specification where SVE spacific rules are described.

This patch is only a refactoring stage where there is no change to the compiler's behaviour.


Patch is 117.37 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/66656.diff

9 Files Affected:

  • (modified) llvm/include/llvm/Analysis/TargetLibraryInfo.h (+23-8)
  • (modified) llvm/include/llvm/Analysis/VecFuncs.def (+667-667)
  • (modified) llvm/include/llvm/Analysis/VectorUtils.h (-21)
  • (modified) llvm/lib/Analysis/TargetLibraryInfo.cpp (+13-10)
  • (modified) llvm/lib/Analysis/VectorUtils.cpp (-16)
  • (modified) llvm/lib/CodeGen/ReplaceWithVeclib.cpp (+1-2)
  • (modified) llvm/lib/Transforms/Utils/InjectTLIMappings.cpp (+7-7)
  • (modified) llvm/test/Transforms/Util/add-TLI-mappings.ll (+13-3)
  • (modified) llvm/unittests/Analysis/VectorFunctionABITest.cpp (-13)
diff --git a/llvm/include/llvm/Analysis/TargetLibraryInfo.h b/llvm/include/llvm/Analysis/TargetLibraryInfo.h
index 5d62e837c1f3d5f..b15f20f43ab29a2 100644
--- a/llvm/include/llvm/Analysis/TargetLibraryInfo.h
+++ b/llvm/include/llvm/Analysis/TargetLibraryInfo.h
@@ -27,11 +27,25 @@ class Triple;
 /// Describes a possible vectorization of a function.
 /// Function 'VectorFnName' is equivalent to 'ScalarFnName' vectorized
 /// by a factor 'VectorizationFactor'.
+/// The MangledName string holds scalar-to-vector mapping:
+///    _ZGV<isa><mask><vlen><vparams>_<scalarname>(<vectorname>)
+///
+/// where:
+///
+/// <isa> = "_LLVM_"
+/// <mask> = "M" if masked, "N" if no mask.
+/// <vlen> = Number of concurrent lanes, stored in the `VectorizationFactor`
+///          field of the `VecDesc` struct. If the number of lanes is scalable
+///          then 'x' is printed instead.
+/// <vparams> = "v", as many as are the numArgs.
+/// <scalarname> = the name of the scalar function.
+/// <vectorname> = the name of the vector function.
 struct VecDesc {
   StringRef ScalarFnName;
   StringRef VectorFnName;
   ElementCount VectorizationFactor;
   bool Masked;
+  StringRef MangledName;
 };
 
   enum LibFunc : unsigned {
@@ -163,18 +177,18 @@ class TargetLibraryInfoImpl {
   /// Return true if the function F has a vector equivalent with vectorization
   /// factor VF.
   bool isFunctionVectorizable(StringRef F, const ElementCount &VF) const {
-    return !(getVectorizedFunction(F, VF, false).empty() &&
-             getVectorizedFunction(F, VF, true).empty());
+    return !(getVectorizedFunction(F, VF, false).first.empty() &&
+             getVectorizedFunction(F, VF, true).first.empty());
   }
 
   /// Return true if the function F has a vector equivalent with any
   /// vectorization factor.
   bool isFunctionVectorizable(StringRef F) const;
 
-  /// Return the name of the equivalent of F, vectorized with factor VF. If no
-  /// such mapping exists, return the empty string.
-  StringRef getVectorizedFunction(StringRef F, const ElementCount &VF,
-                                  bool Masked) const;
+  /// Return the name of the equivalent of F, vectorized with factor VF and it's
+  /// mangled name. If no such mapping exists, return empty strings.
+  std::pair<StringRef, StringRef>
+  getVectorizedFunction(StringRef F, const ElementCount &VF, bool Masked) const;
 
   /// Set to true iff i32 parameters to library functions should have signext
   /// or zeroext attributes if they correspond to C-level int or unsigned int,
@@ -350,8 +364,9 @@ class TargetLibraryInfo {
   bool isFunctionVectorizable(StringRef F) const {
     return Impl->isFunctionVectorizable(F);
   }
-  StringRef getVectorizedFunction(StringRef F, const ElementCount &VF,
-                                  bool Masked = false) const {
+  std::pair<StringRef, StringRef>
+  getVectorizedFunction(StringRef F, const ElementCount &VF,
+                        bool Masked = false) const {
     return Impl->getVectorizedFunction(F, VF, Masked);
   }
 
diff --git a/llvm/include/llvm/Analysis/VecFuncs.def b/llvm/include/llvm/Analysis/VecFuncs.def
index 98bcfe3843669f7..77cc458823ebc8b 100644
--- a/llvm/include/llvm/Analysis/VecFuncs.def
+++ b/llvm/include/llvm/Analysis/VecFuncs.def
@@ -14,7 +14,7 @@
 
 #if defined(TLI_DEFINE_MASSV_VECFUNCS_NAMES)
 #define TLI_DEFINE_MASSV_VECFUNCS
-#define TLI_DEFINE_VECFUNC(SCAL, VEC, VF) VEC,
+#define TLI_DEFINE_VECFUNC(SCAL, VEC, VF, MANGLN) VEC,
 #endif
 
 #define FIXED(NL) ElementCount::getFixed(NL)
@@ -23,860 +23,860 @@
 #define MASKED true
 
 #if !(defined(TLI_DEFINE_VECFUNC))
-#define TLI_DEFINE_VECFUNC(SCAL, VEC, VF) {SCAL, VEC, VF, NOMASK},
+#define TLI_DEFINE_VECFUNC(SCAL, VEC, VF, MANGLN) {SCAL, VEC, VF, NOMASK, MANGLN},
 #endif
 
 #if defined(TLI_DEFINE_ACCELERATE_VECFUNCS)
 // Accelerate framework's Vector Functions
 
 // Floating-Point Arithmetic and Auxiliary Functions
-TLI_DEFINE_VECFUNC("ceilf", "vceilf", FIXED(4))
-TLI_DEFINE_VECFUNC("fabsf", "vfabsf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.fabs.f32", "vfabsf", FIXED(4))
-TLI_DEFINE_VECFUNC("floorf", "vfloorf", FIXED(4))
-TLI_DEFINE_VECFUNC("sqrtf", "vsqrtf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.sqrt.f32", "vsqrtf", FIXED(4))
+TLI_DEFINE_VECFUNC("ceilf", "vceilf", FIXED(4), "_ZGV_LLVM_N4v_ceilf(vceilf)")
+TLI_DEFINE_VECFUNC("fabsf", "vfabsf", FIXED(4), "_ZGV_LLVM_N4v_fabsf(vfabsf)")
+TLI_DEFINE_VECFUNC("llvm.fabs.f32", "vfabsf", FIXED(4), "_ZGV_LLVM_N4v_llvm.fabs.f32(vfabsf)")
+TLI_DEFINE_VECFUNC("floorf", "vfloorf", FIXED(4), "_ZGV_LLVM_N4v_floorf(vfloorf)")
+TLI_DEFINE_VECFUNC("sqrtf", "vsqrtf", FIXED(4), "_ZGV_LLVM_N4v_sqrtf(vsqrtf)")
+TLI_DEFINE_VECFUNC("llvm.sqrt.f32", "vsqrtf", FIXED(4), "_ZGV_LLVM_N4v_llvm.sqrt.f32(vsqrtf)")
 
 // Exponential and Logarithmic Functions
-TLI_DEFINE_VECFUNC("expf", "vexpf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.exp.f32", "vexpf", FIXED(4))
-TLI_DEFINE_VECFUNC("expm1f", "vexpm1f", FIXED(4))
-TLI_DEFINE_VECFUNC("logf", "vlogf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.log.f32", "vlogf", FIXED(4))
-TLI_DEFINE_VECFUNC("log1pf", "vlog1pf", FIXED(4))
-TLI_DEFINE_VECFUNC("log10f", "vlog10f", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.log10.f32", "vlog10f", FIXED(4))
-TLI_DEFINE_VECFUNC("logbf", "vlogbf", FIXED(4))
+TLI_DEFINE_VECFUNC("expf", "vexpf", FIXED(4), "_ZGV_LLVM_N4v_expf(vexpf)")
+TLI_DEFINE_VECFUNC("llvm.exp.f32", "vexpf", FIXED(4), "_ZGV_LLVM_N4v_llvm.exp.f32(vexpf)")
+TLI_DEFINE_VECFUNC("expm1f", "vexpm1f", FIXED(4), "_ZGV_LLVM_N4v_expm1f(vexpm1f)")
+TLI_DEFINE_VECFUNC("logf", "vlogf", FIXED(4), "_ZGV_LLVM_N4v_logf(vlogf)")
+TLI_DEFINE_VECFUNC("llvm.log.f32", "vlogf", FIXED(4), "_ZGV_LLVM_N4v_llvm.log.f32(vlogf)")
+TLI_DEFINE_VECFUNC("log1pf", "vlog1pf", FIXED(4), "_ZGV_LLVM_N4v_log1pf(vlog1pf)")
+TLI_DEFINE_VECFUNC("log10f", "vlog10f", FIXED(4), "_ZGV_LLVM_N4v_log10f(vlog10f)")
+TLI_DEFINE_VECFUNC("llvm.log10.f32", "vlog10f", FIXED(4), "_ZGV_LLVM_N4v_llvm.log10.f32(vlog10f)")
+TLI_DEFINE_VECFUNC("logbf", "vlogbf", FIXED(4), "_ZGV_LLVM_N4v_logbf(vlogbf)")
 
 // Trigonometric Functions
-TLI_DEFINE_VECFUNC("sinf", "vsinf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.sin.f32", "vsinf", FIXED(4))
-TLI_DEFINE_VECFUNC("cosf", "vcosf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.cos.f32", "vcosf", FIXED(4))
-TLI_DEFINE_VECFUNC("tanf", "vtanf", FIXED(4))
-TLI_DEFINE_VECFUNC("asinf", "vasinf", FIXED(4))
-TLI_DEFINE_VECFUNC("acosf", "vacosf", FIXED(4))
-TLI_DEFINE_VECFUNC("atanf", "vatanf", FIXED(4))
+TLI_DEFINE_VECFUNC("sinf", "vsinf", FIXED(4), "_ZGV_LLVM_N4v_sinf(vsinf)")
+TLI_DEFINE_VECFUNC("llvm.sin.f32", "vsinf", FIXED(4), "_ZGV_LLVM_N4v_llvm.sin.f32(vsinf)")
+TLI_DEFINE_VECFUNC("cosf", "vcosf", FIXED(4), "_ZGV_LLVM_N4v_cosf(vcosf)")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "vcosf", FIXED(4), "_ZGV_LLVM_N4v_llvm.cos.f32(vcosf)")
+TLI_DEFINE_VECFUNC("tanf", "vtanf", FIXED(4), "_ZGV_LLVM_N4v_tanf(vtanf)")
+TLI_DEFINE_VECFUNC("asinf", "vasinf", FIXED(4), "_ZGV_LLVM_N4v_asinf(vasinf)")
+TLI_DEFINE_VECFUNC("acosf", "vacosf", FIXED(4), "_ZGV_LLVM_N4v_acosf(vacosf)")
+TLI_DEFINE_VECFUNC("atanf", "vatanf", FIXED(4), "_ZGV_LLVM_N4v_atanf(vatanf)")
 
 // Hyperbolic Functions
-TLI_DEFINE_VECFUNC("sinhf", "vsinhf", FIXED(4))
-TLI_DEFINE_VECFUNC("coshf", "vcoshf", FIXED(4))
-TLI_DEFINE_VECFUNC("tanhf", "vtanhf", FIXED(4))
-TLI_DEFINE_VECFUNC("asinhf", "vasinhf", FIXED(4))
-TLI_DEFINE_VECFUNC("acoshf", "vacoshf", FIXED(4))
-TLI_DEFINE_VECFUNC("atanhf", "vatanhf", FIXED(4))
+TLI_DEFINE_VECFUNC("sinhf", "vsinhf", FIXED(4), "_ZGV_LLVM_N4v_sinhf(vsinhf)")
+TLI_DEFINE_VECFUNC("coshf", "vcoshf", FIXED(4), "_ZGV_LLVM_N4v_coshf(vcoshf)")
+TLI_DEFINE_VECFUNC("tanhf", "vtanhf", FIXED(4), "_ZGV_LLVM_N4v_tanhf(vtanhf)")
+TLI_DEFINE_VECFUNC("asinhf", "vasinhf", FIXED(4), "_ZGV_LLVM_N4v_asinhf(vasinhf)")
+TLI_DEFINE_VECFUNC("acoshf", "vacoshf", FIXED(4), "_ZGV_LLVM_N4v_acoshf(vacoshf)")
+TLI_DEFINE_VECFUNC("atanhf", "vatanhf", FIXED(4), "_ZGV_LLVM_N4v_atanhf(vatanhf)")
 
 #elif defined(TLI_DEFINE_DARWIN_LIBSYSTEM_M_VECFUNCS)
 // Darwin libsystem_m vector functions.
 
 // Exponential and Logarithmic Functions
-TLI_DEFINE_VECFUNC("exp", "_simd_exp_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.exp.f64", "_simd_exp_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("expf", "_simd_exp_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.exp.f32", "_simd_exp_f4", FIXED(4))
+TLI_DEFINE_VECFUNC("exp", "_simd_exp_d2", FIXED(2), "_ZGV_LLVM_N2v_exp(_simd_exp_d2)")
+TLI_DEFINE_VECFUNC("llvm.exp.f64", "_simd_exp_d2", FIXED(2), "_ZGV_LLVM_N2v_llvm.exp.f64(_simd_exp_d2)")
+TLI_DEFINE_VECFUNC("expf", "_simd_exp_f4", FIXED(4), "_ZGV_LLVM_N4v_expf(_simd_exp_f4)")
+TLI_DEFINE_VECFUNC("llvm.exp.f32", "_simd_exp_f4", FIXED(4), "_ZGV_LLVM_N4v_llvm.exp.f32(_simd_exp_f4)")
 
 // Trigonometric Functions
-TLI_DEFINE_VECFUNC("acos", "_simd_acos_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("acosf", "_simd_acos_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("asin", "_simd_asin_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("asinf", "_simd_asin_f4", FIXED(4))
-
-TLI_DEFINE_VECFUNC("atan", "_simd_atan_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("atanf", "_simd_atan_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("atan2", "_simd_atan2_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("atan2f", "_simd_atan2_f4", FIXED(4))
-
-TLI_DEFINE_VECFUNC("cos", "_simd_cos_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.cos.f64", "_simd_cos_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("cosf", "_simd_cos_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.cos.f32", "_simd_cos_f4", FIXED(4))
-
-TLI_DEFINE_VECFUNC("sin", "_simd_sin_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.sin.f64", "_simd_sin_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("sinf", "_simd_sin_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.sin.f32", "_simd_sin_f4", FIXED(4))
+TLI_DEFINE_VECFUNC("acos", "_simd_acos_d2", FIXED(2), "_ZGV_LLVM_N2v_acos(_simd_acos_d2)")
+TLI_DEFINE_VECFUNC("acosf", "_simd_acos_f4", FIXED(4), "_ZGV_LLVM_N4v_acosf(_simd_acos_f4)")
+TLI_DEFINE_VECFUNC("asin", "_simd_asin_d2", FIXED(2), "_ZGV_LLVM_N2v_asin(_simd_asin_d2)")
+TLI_DEFINE_VECFUNC("asinf", "_simd_asin_f4", FIXED(4), "_ZGV_LLVM_N4v_asinf(_simd_asin_f4)")
+
+TLI_DEFINE_VECFUNC("atan", "_simd_atan_d2", FIXED(2), "_ZGV_LLVM_N2v_atan(_simd_atan_d2)")
+TLI_DEFINE_VECFUNC("atanf", "_simd_atan_f4", FIXED(4), "_ZGV_LLVM_N4v_atanf(_simd_atan_f4)")
+TLI_DEFINE_VECFUNC("atan2", "_simd_atan2_d2", FIXED(2), "_ZGV_LLVM_N2v_atan2(_simd_atan2_d2)")
+TLI_DEFINE_VECFUNC("atan2f", "_simd_atan2_f4", FIXED(4), "_ZGV_LLVM_N4v_atan2f(_simd_atan2_f4)")
+
+TLI_DEFINE_VECFUNC("cos", "_simd_cos_d2", FIXED(2), "_ZGV_LLVM_N2v_cos(_simd_cos_d2)")
+TLI_DEFINE_VECFUNC("llvm.cos.f64", "_simd_cos_d2", FIXED(2), "_ZGV_LLVM_N2v_llvm.cos.f64(_simd_cos_d2)")
+TLI_DEFINE_VECFUNC("cosf", "_simd_cos_f4", FIXED(4), "_ZGV_LLVM_N4v_cosf(_simd_cos_f4)")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_simd_cos_f4", FIXED(4), "_ZGV_LLVM_N4v_llvm.cos.f32(_simd_cos_f4)")
+
+TLI_DEFINE_VECFUNC("sin", "_simd_sin_d2", FIXED(2), "_ZGV_LLVM_N2v_sin(_simd_sin_d2)")
+TLI_DEFINE_VECFUNC("llvm.sin.f64", "_simd_sin_d2", FIXED(2), "_ZGV_LLVM_N2v_llvm.sin.f64(_simd_sin_d2)")
+TLI_DEFINE_VECFUNC("sinf", "_simd_sin_f4", FIXED(4), "_ZGV_LLVM_N4v_sinf(_simd_sin_f4)")
+TLI_DEFINE_VECFUNC("llvm.sin.f32", "_simd_sin_f4", FIXED(4), "_ZGV_LLVM_N4v_llvm.sin.f32(_simd_sin_f4)")
 
 // Floating-Point Arithmetic and Auxiliary Functions
-TLI_DEFINE_VECFUNC("cbrt", "_simd_cbrt_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("cbrtf", "_simd_cbrt_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("erf", "_simd_erf_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("erff", "_simd_erf_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("pow", "_simd_pow_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.pow.f64", "_simd_pow_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("powf", "_simd_pow_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.pow.f32", "_simd_pow_f4", FIXED(4))
+TLI_DEFINE_VECFUNC("cbrt", "_simd_cbrt_d2", FIXED(2), "_ZGV_LLVM_N2v_cbrt(_simd_cbrt_d2)")
+TLI_DEFINE_VECFUNC("cbrtf", "_simd_cbrt_f4", FIXED(4), "_ZGV_LLVM_N4v_cbrtf(_simd_cbrt_f4)")
+TLI_DEFINE_VECFUNC("erf", "_simd_erf_d2", FIXED(2), "_ZGV_LLVM_N2v_erf(_simd_erf_d2)")
+TLI_DEFINE_VECFUNC("erff", "_simd_erf_f4", FIXED(4), "_ZGV_LLVM_N4v_erff(_simd_erf_f4)")
+TLI_DEFINE_VECFUNC("pow", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v_pow(_simd_pow_d2)")
+TLI_DEFINE_VECFUNC("llvm.pow.f64", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v_llvm.pow.f64(_simd_pow_d2)")
+TLI_DEFINE_VECFUNC("powf", "_simd_pow_f4", FIXED(4), "_ZGV_LLVM_N4v_powf(_simd_pow_f4)")
+TLI_DEFINE_VECFUNC("llvm.pow.f32", "_simd_pow_f4", FIXED(4), "_ZGV_LLVM_N4v_llvm.pow.f32(_simd_pow_f4)")
 
 // Hyperbolic Functions
-TLI_DEFINE_VECFUNC("sinh", "_simd_sinh_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("sinhf", "_simd_sinh_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("cosh", "_simd_cosh_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("coshf", "_simd_cosh_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("tanh", "_simd_tanh_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("tanhf", "_simd_tanh_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("asinh", "_simd_asinh_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("asinhf", "_simd_asinh_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("acosh", "_simd_acosh_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("acoshf", "_simd_acosh_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("atanh", "_simd_atanh_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("atanhf", "_simd_atanh_f4", FIXED(4))
+TLI_DEFINE_VECFUNC("sinh", "_simd_sinh_d2", FIXED(2), "_ZGV_LLVM_N2v_sinh(_simd_sinh_d2)")
+TLI_DEFINE_VECFUNC("sinhf", "_simd_sinh_f4", FIXED(4), "_ZGV_LLVM_N4v_sinhf(_simd_sinh_f4)")
+TLI_DEFINE_VECFUNC("cosh", "_simd_cosh_d2", FIXED(2), "_ZGV_LLVM_N2v_cosh(_simd_cosh_d2)")
+TLI_DEFINE_VECFUNC("coshf", "_simd_cosh_f4", FIXED(4), "_ZGV_LLVM_N4v_coshf(_simd_cosh_f4)")
+TLI_DEFINE_VECFUNC("tanh", "_simd_tanh_d2", FIXED(2), "_ZGV_LLVM_N2v_tanh(_simd_tanh_d2)")
+TLI_DEFINE_VECFUNC("tanhf", "_simd_tanh_f4", FIXED(4), "_ZGV_LLVM_N4v_tanhf(_simd_tanh_f4)")
+TLI_DEFINE_VECFUNC("asinh", "_simd_asinh_d2", FIXED(2), "_ZGV_LLVM_N2v_asinh(_simd_asinh_d2)")
+TLI_DEFINE_VECFUNC("asinhf", "_simd_asinh_f4", FIXED(4), "_ZGV_LLVM_N4v_asinhf(_simd_asinh_f4)")
+TLI_DEFINE_VECFUNC("acosh", "_simd_acosh_d2", FIXED(2), "_ZGV_LLVM_N2v_acosh(_simd_acosh_d2)")
+TLI_DEFINE_VECFUNC("acoshf", "_simd_acosh_f4", FIXED(4), "_ZGV_LLVM_N4v_acoshf(_simd_acosh_f4)")
+TLI_DEFINE_VECFUNC("atanh", "_simd_atanh_d2", FIXED(2), "_ZGV_LLVM_N2v_atanh(_simd_atanh_d2)")
+TLI_DEFINE_VECFUNC("atanhf", "_simd_atanh_f4", FIXED(4), "_ZGV_LLVM_N4v_atanhf(_simd_atanh_f4)")
 
 #elif defined(TLI_DEFINE_LIBMVEC_X86_VECFUNCS)
 // GLIBC Vector math Functions
 
-TLI_DEFINE_VECFUNC("sin", "_ZGVbN2v_sin", FIXED(2))
-TLI_DEFINE_VECFUNC("sin", "_ZGVdN4v_sin", FIXED(4))
+TLI_DEFINE_VECFUNC("sin", "_ZGVbN2v_sin", FIXED(2), "_ZGV_LLVM_N2v_sin(_ZGVbN2v_sin)")
+TLI_DEFINE_VECFUNC("sin", "_ZGVdN4v_sin", FIXED(4), "_ZGV_LLVM_N4v_sin(_ZGVdN4v_sin)")
 
-TLI_DEFINE_VECFUNC("sinf", "_ZGVbN4v_sinf", FIXED(4))
-TLI_DEFINE_VECFUNC("sinf", "_ZGVdN8v_sinf", FIXED(8))
+TLI_DEFINE_VECFUNC("sinf", "_ZGVbN4v_sinf", FIXED(4), "_ZGV_LLVM_N4v_sinf(_ZGVbN4v_sinf)")
+TLI_DEFINE_VECFUNC("sinf", "_ZGVdN8v_sinf", FIXED(8), "_ZGV_LLVM_N8v_sinf(_ZGVdN8v_sinf)")
 
-TLI_DEFINE_VECFUNC("llvm.sin.f64", "_ZGVbN2v_sin", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.sin.f64", "_ZGVdN4v_sin", FIXED(4))
+TLI_DEFINE_VECFUNC("llvm.sin.f64", "_ZGVbN2v_sin", FIXED(2), "_ZGV_LLVM_N2v_llvm.sin.f64(_ZGVbN2v_sin)")
+TLI_DEFINE_VECFUNC("llvm.sin.f64", "_ZGVdN4v_sin", FIXED(4), "_ZGV_LLVM_N4v_llvm.sin.f64(_ZGVdN4v_sin)")
 
-TLI_DEFINE_VECFUNC("llvm.sin.f32", "_ZGVbN4v_sinf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.sin.f32", "_ZGVdN8v_sinf", FIXED(8))
+TLI_DEFINE_VECFUNC("llvm.sin.f32", "_ZGVbN4v_sinf", FIXED(4), "_ZGV_LLVM_N4v_llvm.sin.f32(_ZGVbN4v_sinf)")
+TLI_DEFINE_VECFUNC("llvm.sin.f32", "_ZGVdN8v_sinf", FIXED(8), "_ZGV_LLVM_N8v_llvm.sin.f32(_ZGVdN8v_sinf)")
 
-TLI_DEFINE_VECFUNC("cos", "_ZGVbN2v_cos", FIXED(2))
-TLI_DEFINE_VECFUNC("cos", "_ZGVdN4v_cos", FIXED(4))
+TLI_DEFINE_VECFUNC("cos", "_ZGVbN2v_cos", FIXED(2), "_ZGV_LLVM_N2v_cos(_ZGVbN2v_cos)")
+TLI_DEFINE_VECFUNC("cos", "_ZGVdN4v_cos", FIXED(4), "_ZGV_LLVM_N4v_cos(_ZGVdN4v_cos)")
 
-TLI_DEFINE_VECFUNC("cosf", "_ZGVbN4v_cosf", FIXED(4))
-TLI_DEFINE_VECFUNC("cosf", "_ZGVdN8v_cosf", FIXED(8))
+TLI_DEFINE_VECFUNC("cosf", "_ZGVbN4v_cosf", FIXED(4), "_ZGV_LLVM_N4v_cosf(_ZGVbN4v_cosf)")
+TLI_DEFINE_VECFUNC("cosf", "_ZGVdN8v_cosf", FIXED(8), "_ZGV_LLVM_N8v_cosf(_ZGVdN8v_cosf)")
 
-TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVbN2v_cos", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVdN4v_cos", FIXED(4))
+TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVbN2v_cos", FIXED(2), "_ZGV_LLVM_N2v_llvm.cos.f64(_ZGVbN2v_cos)")
+TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVdN4v_cos", FIXED(4), "_ZGV_LLVM_N4v_llvm.cos.f64(_ZGVdN4v_cos)")
 
-TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVbN4v_cosf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVdN8v_cosf", FIXED(8))
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVbN4v_cosf", FIXED(4), "_ZGV_LLVM_N4v_llvm.cos.f32(_ZGVbN4v_cosf)")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVdN8v_cosf", FIXED(8), "_ZGV_LLVM_N8v_llvm.cos.f32(_ZGVdN8v_cosf)")
 
-TLI_DEFINE_VECFUNC("pow", "_ZGVbN2vv_pow", FIXED(2))
-TLI_DEFINE_VECFUNC("pow", "_ZGVdN4vv_pow", FIXED(4))
+TLI_DEFINE_VECFUNC("pow", "_ZGVbN2vv_pow", FIXED(2), "_ZGV_LLVM_N2vv_pow(_ZGVbN2vv_pow)")
+TLI_DEFINE_VECFUNC("pow", "_ZGVdN4vv_pow", FIXED(4), "_ZGV_LLVM_N4vv_pow(_ZGVdN4vv_pow)")
 
-TLI_DEFINE_VECFUNC("powf", "_ZGVbN4vv_powf", FIXED(4))
-TLI_DEFINE_VECFUNC("powf", "_ZGVdN8vv_powf", FIXED(8))
+TLI_DEFINE_VECFUNC("powf", "_ZGVbN4vv_powf", FIXED(4), "_ZGV_LLVM_N4vv_powf(_ZGVbN4vv_powf)")
+TLI_DEFINE_VECFUNC("powf", "_ZGVdN8vv_powf", FIXED(8), "_ZGV_LLVM_N8vv_powf(_ZGVdN8vv_powf)")
 
-TLI_DEFINE_VECFUNC("__pow_finite", "_ZGVbN2vv___pow_finite", FIXED(2))
-TLI_DEFINE_VECFUNC("__pow_finite", "_ZGVdN4vv___pow_finite", FIXED(4))
+TLI_DEFINE_VECFUNC("__pow_finite", "_ZGVbN2vv___pow_finite", FIXED(2), "_ZGV_LLVM_N2vv___pow_finite(_ZGVbN2vv___pow_finite)")
+TLI_DEFINE_VECFUNC("__pow_finite", "_ZGVdN4vv___pow_finite", FIXED(4), "_ZGV_LLVM_N4vv___pow_finite(_ZGVdN4vv___pow_finite)")
 
-TLI_DEFINE_VECFUNC("__powf_finite", "_ZGVbN4vv___powf_finite", FIXED(4))
-TLI_DEFINE_VECFUNC("__powf_finite", "_ZGVdN8vv___powf_finite", FIXED(8))
+TLI_DEFINE_VECFUNC("__powf_finite", "_ZGVbN4vv___powf_finite", FIXED(4), "_ZGV_LLVM_N4vv___powf_finite(_ZGVbN4vv___powf_finite)")
+TLI_DEFINE_VECFUNC("__powf_finite", "_ZGVdN8vv___powf_finite", FIXED(8), "_ZGV_LLVM_N8vv___powf_finite(_ZGVdN8vv___powf_finite)")
 
-TLI_DEFINE_VECFUNC("llvm.pow.f64", "_ZGVbN2vv_pow", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.pow.f64", "_ZGVdN4vv_pow", FIXED(4))
+TLI_DEFINE_VECFUNC("llvm.pow.f64", "_ZGVbN2vv_pow", FIXED(2), "_ZGV_LLVM_N2vv_llvm.pow.f64(_ZGVbN2vv_pow)")
+TLI_DEFINE_VECFUNC("llvm.pow.f64", "_ZGVdN4vv_pow", FIXED(4), "_ZGV_LLVM_N4vv_llvm.pow.f64(_ZGVdN4vv_pow)")
 
-TLI_DEFINE_VECFUNC("llvm.pow.f32", "_ZGVbN4vv_powf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.pow.f32", "_ZGVdN8vv_powf", FIXED(8))
+TLI_DEFINE_VECFUNC("llvm.pow.f32", "_ZGVbN4vv_powf", FIXED(4), "_ZGV_LLVM_N4vv_llvm.pow.f32(_ZGVbN4vv_powf)")
+TLI_DEFINE_VECFUNC("llvm.pow.f32", "_ZGVdN8vv_powf", FIXED(8), "_ZGV_LLVM_N8vv_llvm.pow.f32(_ZGVdN8vv_powf)")
 
-TLI_DEFINE_VECFUNC("exp", "_ZGVbN2v_exp", FIXED(2))
-TLI_DEFINE_VECFUNC("exp", "_ZGVdN4v_exp", FIXED(4))
+TLI_DEFINE_VECFUNC("exp", "_ZGVbN2v_exp", FIXED(2), "_ZGV_LLVM_N2v_exp(_ZGVbN2v_exp)")
+TLI_DEFINE_VECFUNC("exp", "_ZGVdN4v_exp", FIXED(4), "_ZGV_LLVM_N2v_exp(_ZGVdN4v_exp)")
 
-TLI_DEFINE_VECFUNC("expf", "_ZGVbN4v_expf", FIXED(4))
-TLI_DEFINE_VECFUNC("expf", "_ZGVdN8v_expf", FIXED(8))
+TLI_DEFINE_VECFUNC("expf", "_ZGVbN4v_expf", FIXED(4), "_ZGV_LLVM_N4v_expf(_ZGVbN4v_expf)")
+TLI_DEFINE_VECFUNC("expf", "_ZGVdN8v_expf", FIXED(8), "_ZGV_LLVM_N8v_expf(_ZGVdN8v_expf)")
 
-TLI_DEFINE_VECFUNC("__exp_finite", "_ZGVbN2v___exp_finite", FIXED(2))
-TLI_DEFINE_VECFUNC("__exp_finite", "_ZGVdN4v___exp_finite", FIXED(4))
+TLI_DEFINE_VECFUNC("__exp_finite", "_ZGVbN2v___exp_finite", FIXED(2), "_ZGV_LLVM_N2v___exp_finite(_ZGVbN2v___exp_finite)")
+TLI_DEFINE_VECFUNC("__exp_finite", "_ZGVdN4v___exp_finite", FIXED(4), "_ZGV_LLVM_N4v___exp_finite(_ZGVdN4v___exp_finite)")
 
-TLI_DEFINE_VECFUNC("__expf_finite", "_ZGVbN4v___expf_finite", FIXED(4))
-TLI_DEFINE_VECFUNC("__expf_finite", "_ZGVdN8v___expf_finite", FIXED(8))
+TLI_DEFINE_VECFUNC("__expf_finite", "_ZGVbN4v___expf_finite", FIXED(4), "_ZGV_LLVM_N4v___expf_finite(_ZGVbN4v___expf_finite)")
+TLI_DEFINE_VECFUNC("__expf_finite", "_ZGVdN8v___expf_finite", FIXED(8), "_ZGV_LLVM_N8v___expf_finite(_ZGVdN8v___expf_finite)")
 
-TLI_DEFINE_VECFUNC("llvm.exp.f64", "_ZGVbN2v_exp", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.exp.f64", "_ZGVdN4v_...
[truncated]

TLI_DEFINE_VECFUNC("floorf", "vfloorf", FIXED(4))
TLI_DEFINE_VECFUNC("sqrtf", "vsqrtf", FIXED(4))
TLI_DEFINE_VECFUNC("llvm.sqrt.f32", "vsqrtf", FIXED(4))
TLI_DEFINE_VECFUNC("ceilf", "vceilf", FIXED(4), "_ZGV_LLVM_N4v_ceilf(vceilf)")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MANGLN shouldn't need to be a full vector-function-abi-variant string because that string has a specific use and is built up based on the fields within this table. We should try to keep things minimal and that means certainly dropping the (vector-name) part of the string.

I cannot quite decide whether the scalar name is useful to include or not but that likely comes down to the interface used to extract the data. You're current returning the mangling and the vector name so that's likely where the need comes from. However if you instead have queries that return a reference to the whole record then you might be able to restrict the new field to just mangling prefix because the scalar name will be accessible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid I cannot shorten the mangled name removing the optional part.
If I do I'll crash in VFABI::setVectorVariantNames() with
assert(VI && "Cannot add an invalid VFABI name.").
The culprit is the code below. VectorName will only be changed if there
is an optional part. For LLVM isa this causes a std::nullopt return
because the VectorName is unchanged and equals OriginalName.

std::optional VFABI::tryDemangleForVFABI(StringRef MangledName,
const Module &M) {
const StringRef OriginalName = MangledName;
StringRef VectorName = MangledName;
...
// Reduce MangledName to [()].
MangledName = MangledName.ltrim(ScalarName);
// Find the optional custom name redirection.
if (MangledName.consume_front("(")) {
if (!MangledName.consume_back(")"))
return std::nullopt;
// Update the vector variant with the one specified by the user.
VectorName = MangledName;
// If the vector name is missing, bail out.
if (VectorName.empty())
return std::nullopt;
}
// LLVM internal mapping via the TargetLibraryInfo (TLI) must be
// redirected to an existing name.
if (ISA == VFISAKind::LLVM && VectorName == OriginalName)
return std::nullopt;

Sorry for not formatting the above as code but this removes all line breaks for me.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think perhaps you've misunderstood the intent of this work and thus my request. I'm not expecting there to be any changes to the value of the vector-function-abi-variant attribute. I'm only expecting the way the text is generated to be changed so that for follow on patches we can stop tryDemangleForVFABI from incorrectly looking through the module in order to figure out the element count for scalable vector functions.

If you look at how mangleTLIVectorName is used you'll see that it's given data that has been extracted from TargetLibraryInfo (via its database of VecDesc objects). The only missing piece of information is the Vector ABI mangling prefix that it instead computes. The request is for this prefix to be store within TargetLibraryInfo as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully fixed now.

Comment on lines 190 to 191
std::pair<StringRef, StringRef>
getVectorizedFunction(StringRef F, const ElementCount &VF, bool Masked) const;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels clunky and I think the original interface is just wrong because it's not going to scale well with different variants for the same scalar functions. I think it's best to return the complete record (essentially turn VecDesc into a class) so that all the related information is kept together.

NOTE: I'm not saying you need to convert all the code paths, feel free to sit the old interface on top of new one and then just use the new one where you need it (i.e. when you need access to the mangle string).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

TLI_DEFINE_VECFUNC("erf", "_simd_erf_d2", FIXED(2), "_ZGV_LLVM_N2v_erf(_simd_erf_d2)")
TLI_DEFINE_VECFUNC("erff", "_simd_erf_f4", FIXED(4), "_ZGV_LLVM_N4v_erff(_simd_erf_f4)")
TLI_DEFINE_VECFUNC("pow", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v_pow(_simd_pow_d2)")
TLI_DEFINE_VECFUNC("llvm.pow.f64", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v_llvm.pow.f64(_simd_pow_d2)")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how many of these are wrong but pow takes two parameters and yet the vector ABI string only shows one v (i.e. one operand). I had thought there would be existing tests to ensure this couldn't go wrong. I'm guessing now there are no such existing tests?

It might be worth using a macro a bit like we have for the ElementCount fields to cover the most common usages (e.g. VABI_MANGLE_OneOp(#NoElts), VABI_MANGLE_TwoOp(#NoElts)).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, pow usually takes two arguments. This one seems to be an exception. From the veclib-calls-libsystem-darwin.ll test file:
declare double @pow(double) ... %call = tail call double @pow(double %lv)
All darwin tests are passing but I cannot guarantee the tests cover all mappings.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a bug with the original implementation. @fhahn does the use of pow/powf (i.e. only having a single operand) within veclib-calls-libsystem-darwin.ll look correct to you?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're moving towards only the prefix needing to be store. If true then that reenforces the idea of using a macros of some form to help reduce the chances of typos.

/// Return the name of the equivalent of F, vectorized with factor VF. If no
/// such mapping exists, return the empty string.
/// Return the name of the equivalent of F, vectorized with factor VF.
/// If no such mapping exists, return empty strings.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably this change can now be reverted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

/// Return a pointer to a VecDesc object holding all info for scalar to vector
/// mappings in TLI for the equivalent of F, vectorized with factor VF.
/// If no such mapping exists, return nullpointer.
const VecDesc *getMangledTLIVectorName(StringRef F, const ElementCount &VF,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like getVectorMappingInfo seems more representative of what the functions does?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -354,6 +389,10 @@ class TargetLibraryInfo {
bool Masked = false) const {
return Impl->getVectorizedFunction(F, VF, Masked);
}
const VecDesc *getMangledTLIVectorName(StringRef F, const ElementCount &VF,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above regarding the function name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

TLI_DEFINE_VECFUNC("floorf", "vfloorf", FIXED(4))
TLI_DEFINE_VECFUNC("sqrtf", "vsqrtf", FIXED(4))
TLI_DEFINE_VECFUNC("llvm.sqrt.f32", "vsqrtf", FIXED(4))
TLI_DEFINE_VECFUNC("ceilf", "vceilf", FIXED(4), "_ZGV_LLVM_N4v_ceilf(vceilf)")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think perhaps you've misunderstood the intent of this work and thus my request. I'm not expecting there to be any changes to the value of the vector-function-abi-variant attribute. I'm only expecting the way the text is generated to be changed so that for follow on patches we can stop tryDemangleForVFABI from incorrectly looking through the module in order to figure out the element count for scalable vector functions.

If you look at how mangleTLIVectorName is used you'll see that it's given data that has been extracted from TargetLibraryInfo (via its database of VecDesc objects). The only missing piece of information is the Vector ABI mangling prefix that it instead computes. The request is for this prefix to be store within TargetLibraryInfo as well.

StringRef getScalarFnName() const { return ScalarFnName; }
StringRef getVectorFnName() const { return VectorFnName; }
ElementCount getVectorizationFactor() const { return VectorizationFactor; }
bool getMasked() const { return Masked; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be isMasked?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

StringRef getVectorFnName() const { return VectorFnName; }
ElementCount getVectorizationFactor() const { return VectorizationFactor; }
bool getMasked() const { return Masked; }
StringRef getMangledName() const { return MangledName; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about adding a function like getVectorFunctionABIVariantString that uses all the fields to construct _ZGV<isa><mask><vlen><vparams>_<scalarname>(<vectorname>)?

Once this exists then perhaps getMangledName() becomes redundant and can be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. getMangledName() is removed.

/// <scalarname> = the name of the scalar function.
/// <vectorname> = the name of the vector function.
class VecDesc {
private:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The private: can be removed because by default class members are private.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

TLI_DEFINE_VECFUNC("erf", "_simd_erf_d2", FIXED(2), "_ZGV_LLVM_N2v_erf(_simd_erf_d2)")
TLI_DEFINE_VECFUNC("erff", "_simd_erf_f4", FIXED(4), "_ZGV_LLVM_N4v_erff(_simd_erf_f4)")
TLI_DEFINE_VECFUNC("pow", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v_pow(_simd_pow_d2)")
TLI_DEFINE_VECFUNC("llvm.pow.f64", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v_llvm.pow.f64(_simd_pow_d2)")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're moving towards only the prefix needing to be store. If true then that reenforces the idea of using a macros of some form to help reduce the chances of typos.

Copy link
Collaborator

@paulwalker-arm paulwalker-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: I'll take a proper look at VecFuncs.def once I'm happy with the rest of the patch.

Comment on lines 24 to 33
; SLEEFGNUABI-SAME: [4 x ptr] [
; SLEEFGNUABI-SAME: ptr @_ZGVnN2v_sin,
; SLEEFGNUABI-SAME: ptr @_ZGVsMxv_sin,
; SLEEFGNUABI_SAME; ptr @_ZGVnN4v_log10f,
; SLEEFGNUABI-SAME: ptr @_ZGVsMxv_log10f
; ARMPL-SAME: [4 x ptr] [
; ARMPL-SAME: ptr @armpl_vsinq_f64,
; ARMPL-SAME: ptr @armpl_svsin_f64_x,
; ARMPL-SAME: ptr @armpl_vlog10q_f32,
; ARMPL-SAME: ptr @armpl_svlog10_f32_x
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please can you rebase the pull request to remove the non-relevant changes. I know that might mess up review comment linking but that doesn't matter at this stage.

Comment on lines 1456 to 1459
std::string
VFABI::getVectorFunctionABIVariantString(const StringRef MangledNamePrefix,
const StringRef ScalarFnName,
const StringRef VectorFnName) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function would be better as a VecDesc member function.

@@ -14,7 +14,7 @@

#if defined(TLI_DEFINE_MASSV_VECFUNCS_NAMES)
#define TLI_DEFINE_MASSV_VECFUNCS
#define TLI_DEFINE_VECFUNC(SCAL, VEC, VF) VEC,
#define TLI_DEFINE_VECFUNC(SCAL, VEC, VF, MANGLN) VEC,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we've settled on just storing the prefix I suggest VABI_PREFIX as a more accurate name.

@JolantaJensen JolantaJensen force-pushed the TLI-NFC-Fix-mechanism-propagating-mangled-names-for-TLI-function-mappings branch from e466c0e to 477a0b3 Compare September 28, 2023 15:53
Currently the mappings from TLI are used to generate the list of available
"scalar to vector" mappings attached to scalar calls as "vector-function-abi-variant"
LLVM IR attribute. Function names from TLI are wrapped in mangled name following
the pattern:
_ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)]
The problem is the mangled name uses _LLVM_ as the ISA name which prevents
the compiler to compute vectorization factor for scalable vectors as it cannot
make any decision based on the _LLVM_ ISA. If we use "s" as the ISA name, the
compiler can make decisions based on VFABI specification where SVE spacific
rules are described.

This patch is only a refactoring stage where there is no change to the compiler's
behaviour.
renamed a macro field and rebased on top of the main branch.
@JolantaJensen JolantaJensen force-pushed the TLI-NFC-Fix-mechanism-propagating-mangled-names-for-TLI-function-mappings branch from 477a0b3 to 1ec6a90 Compare September 28, 2023 17:56
Copy link
Collaborator

@paulwalker-arm paulwalker-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we've not heard back about the pow issue but I think it's just a bug that we can fix in this patch.

TLI_DEFINE_VECFUNC("exp", "_ZGVbN2v_exp", FIXED(2))
TLI_DEFINE_VECFUNC("exp", "_ZGVdN4v_exp", FIXED(4))
TLI_DEFINE_VECFUNC("exp", "_ZGVbN2v_exp", FIXED(2), "_ZGV_LLVM_N2v")
TLI_DEFINE_VECFUNC("exp", "_ZGVdN4v_exp", FIXED(4), "_ZGV_LLVM_N2v")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be _ZGV_LLVM_N4v?

Comment on lines 104 to 107
TLI_DEFINE_VECFUNC("pow", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v")
TLI_DEFINE_VECFUNC("llvm.pow.f64", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v")
TLI_DEFINE_VECFUNC("powf", "_simd_pow_f4", FIXED(4), "_ZGV_LLVM_N4v")
TLI_DEFINE_VECFUNC("llvm.pow.f32", "_simd_pow_f4", FIXED(4), "_ZGV_LLVM_N4v")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These take two parameters and so show end vv.

Comment on lines 258 to 259
TLI_DEFINE_VECFUNC("atan2", "__atan2d2", FIXED(2), "_ZGV_LLVM_N2v")
TLI_DEFINE_VECFUNC("atan2f", "__atan2f4", FIXED(4), "_ZGV_LLVM_N4v")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These take two parameters and this should end vv.

Comment on lines 86 to 87
TLI_DEFINE_VECFUNC("atan2", "_simd_atan2_d2", FIXED(2), "_ZGV_LLVM_N2v")
TLI_DEFINE_VECFUNC("atan2f", "_simd_atan2_f4", FIXED(4), "_ZGV_LLVM_N4v")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These take two parameters and so show end vv.

@JolantaJensen JolantaJensen merged commit 01797da into llvm:main Oct 2, 2023
3 checks passed
@mikaelholmen
Copy link
Collaborator

mikaelholmen commented Oct 3, 2023

Hi @JolantaJensen and others,

I'm seeing problems with this patch for my out-of-tree target.

For this targert we have a bunch of target intrinsics that we say are vectorizable, but we don't provide a name of the vector version of the intrinsic. This means that before this patch, addMappingsFromTLI wouldn't do much since TLI.getVectorizedFunction(ScalarName, VF, Predicate) returned "" and AddVariantDecl wouldn't add anything.

Vectorization then worked as we wanted anyway.

With this patch, the vectorFn name guard in AddVariantDecl has apparently been removed and instead
something is indeed added to Mappings. And then setVectorVariantNames fails with
opt: ../lib/Transforms/Utils/ModuleUtils.cpp:352: void llvm::VFABI::setVectorVariantNames(llvm::CallInst *, ArrayRef<std::string>): AssertionVI && "Cannot add an invalid VFABI name."' failed.
`
since it couldn't demangle the mapping (since there is no vector name).

Is this change on purpose?

We had some similar problems a couple of years ago with
https://reviews.llvm.org/D67572
which was then reverted and improved in
https://reviews.llvm.org/D72734
so that vectorization would work even without a vector function name.

EDIT: The commit message says
"This patch is only a refactoring stage where there is no change to the compiler's behaviour."
which makes me think the change in behaviour indeed is not on purpose.

@JolantaJensen
Copy link
Contributor Author

Hi @mikaelholmen ,
I'll push a patch covering the case of an empty vector function name.

JolantaJensen added a commit to JolantaJensen/jjensen-llvm-project that referenced this pull request Oct 3, 2023
A guard for empty vector function name was removed in llvm#66656.
This patch adds the guard back.
JolantaJensen added a commit that referenced this pull request Oct 3, 2023
A guard for empty vector function name was removed in #66656. This patch
adds the guard back.
JolantaJensen added a commit to JolantaJensen/jjensen-llvm-project that referenced this pull request Oct 5, 2023
…ed names

        and the call's argument types to construct the VFInfo.

Currently the tryDemangleForVFABI is incorrectly making assumptions that when
the mangled name has __LLVM__ set as ISA and _vlen_ is set to scalable/VLA
the module can be queried to identify the correct VF. This works currently only
by chance rather than by design and limits the ability to maximise the use for
the Vector ABI. This behaviour is changed by
llvm#66656 and
llvm#67308
and means demangling can be implemented as expected based solely on the rules
as documented within the VFABI.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants