Fix mechanism propagating mangled names for TLI function mappings #66656

JolantaJensen · 2023-09-18T15:25:05Z

Currently the mappings from TLI are used to generate the list of available "scalar to vector" mappings attached to scalar calls as "vector-function-abi-variant" LLVM IR attribute. Function names from TLI are wrapped in mangled name following the pattern:
ZGV<scalar_name>[(<vector_redirection>)] The problem is the mangled name uses LLVM as the ISA name which prevents the compiler to compute vectorization factor for scalable vectors as it cannot make any decision based on the LLVM ISA. If we use "s" as the ISA name, the compiler can make decisions based on VFABI specification where SVE spacific rules are described.

This patch is only a refactoring stage where there is no change to the compiler's behaviour.

llvmbot · 2023-09-18T15:26:17Z

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Changes

Currently the mappings from TLI are used to generate the list of available "scalar to vector" mappings attached to scalar calls as "vector-function-abi-variant" LLVM IR attribute. Function names from TLI are wrapped in mangled name following the pattern:
ZGV<isa><mask><vlen><parameters><scalar_name>[(<vector_redirection>)] The problem is the mangled name uses LLVM as the ISA name which prevents the compiler to compute vectorization factor for scalable vectors as it cannot make any decision based on the LLVM ISA. If we use "s" as the ISA name, the compiler can make decisions based on VFABI specification where SVE spacific rules are described.

This patch is only a refactoring stage where there is no change to the compiler's behaviour.

Patch is 117.37 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/66656.diff

9 Files Affected:

(modified) llvm/include/llvm/Analysis/TargetLibraryInfo.h (+23-8)
(modified) llvm/include/llvm/Analysis/VecFuncs.def (+667-667)
(modified) llvm/include/llvm/Analysis/VectorUtils.h (-21)
(modified) llvm/lib/Analysis/TargetLibraryInfo.cpp (+13-10)
(modified) llvm/lib/Analysis/VectorUtils.cpp (-16)
(modified) llvm/lib/CodeGen/ReplaceWithVeclib.cpp (+1-2)
(modified) llvm/lib/Transforms/Utils/InjectTLIMappings.cpp (+7-7)
(modified) llvm/test/Transforms/Util/add-TLI-mappings.ll (+13-3)
(modified) llvm/unittests/Analysis/VectorFunctionABITest.cpp (-13)

diff --git a/llvm/include/llvm/Analysis/TargetLibraryInfo.h b/llvm/include/llvm/Analysis/TargetLibraryInfo.h
index 5d62e837c1f3d5f..b15f20f43ab29a2 100644
--- a/llvm/include/llvm/Analysis/TargetLibraryInfo.h
+++ b/llvm/include/llvm/Analysis/TargetLibraryInfo.h
@@ -27,11 +27,25 @@ class Triple;
 /// Describes a possible vectorization of a function.
 /// Function 'VectorFnName' is equivalent to 'ScalarFnName' vectorized
 /// by a factor 'VectorizationFactor'.
+/// The MangledName string holds scalar-to-vector mapping:
+///    _ZGV<isa><mask><vlen><vparams>_<scalarname>(<vectorname>)
+///
+/// where:
+///
+/// <isa> = "_LLVM_"
+/// <mask> = "M" if masked, "N" if no mask.
+/// <vlen> = Number of concurrent lanes, stored in the `VectorizationFactor`
+///          field of the `VecDesc` struct. If the number of lanes is scalable
+///          then 'x' is printed instead.
+/// <vparams> = "v", as many as are the numArgs.
+/// <scalarname> = the name of the scalar function.
+/// <vectorname> = the name of the vector function.
 struct VecDesc {
   StringRef ScalarFnName;
   StringRef VectorFnName;
   ElementCount VectorizationFactor;
   bool Masked;
+  StringRef MangledName;
 };
 
   enum LibFunc : unsigned {
@@ -163,18 +177,18 @@ class TargetLibraryInfoImpl {
   /// Return true if the function F has a vector equivalent with vectorization
   /// factor VF.
   bool isFunctionVectorizable(StringRef F, const ElementCount &VF) const {
-    return !(getVectorizedFunction(F, VF, false).empty() &&
-             getVectorizedFunction(F, VF, true).empty());
+    return !(getVectorizedFunction(F, VF, false).first.empty() &&
+             getVectorizedFunction(F, VF, true).first.empty());
   }
 
   /// Return true if the function F has a vector equivalent with any
   /// vectorization factor.
   bool isFunctionVectorizable(StringRef F) const;
 
-  /// Return the name of the equivalent of F, vectorized with factor VF. If no
-  /// such mapping exists, return the empty string.
-  StringRef getVectorizedFunction(StringRef F, const ElementCount &VF,
-                                  bool Masked) const;
+  /// Return the name of the equivalent of F, vectorized with factor VF and it's
+  /// mangled name. If no such mapping exists, return empty strings.
+  std::pair<StringRef, StringRef>
+  getVectorizedFunction(StringRef F, const ElementCount &VF, bool Masked) const;
 
   /// Set to true iff i32 parameters to library functions should have signext
   /// or zeroext attributes if they correspond to C-level int or unsigned int,
@@ -350,8 +364,9 @@ class TargetLibraryInfo {
   bool isFunctionVectorizable(StringRef F) const {
     return Impl->isFunctionVectorizable(F);
   }
-  StringRef getVectorizedFunction(StringRef F, const ElementCount &VF,
-                                  bool Masked = false) const {
+  std::pair<StringRef, StringRef>
+  getVectorizedFunction(StringRef F, const ElementCount &VF,
+                        bool Masked = false) const {
     return Impl->getVectorizedFunction(F, VF, Masked);
   }
 
diff --git a/llvm/include/llvm/Analysis/VecFuncs.def b/llvm/include/llvm/Analysis/VecFuncs.def
index 98bcfe3843669f7..77cc458823ebc8b 100644
--- a/llvm/include/llvm/Analysis/VecFuncs.def
+++ b/llvm/include/llvm/Analysis/VecFuncs.def
@@ -14,7 +14,7 @@
 
 #if defined(TLI_DEFINE_MASSV_VECFUNCS_NAMES)
 #define TLI_DEFINE_MASSV_VECFUNCS
-#define TLI_DEFINE_VECFUNC(SCAL, VEC, VF) VEC,
+#define TLI_DEFINE_VECFUNC(SCAL, VEC, VF, MANGLN) VEC,
 #endif
 
 #define FIXED(NL) ElementCount::getFixed(NL)
@@ -23,860 +23,860 @@
 #define MASKED true
 
 #if !(defined(TLI_DEFINE_VECFUNC))
-#define TLI_DEFINE_VECFUNC(SCAL, VEC, VF) {SCAL, VEC, VF, NOMASK},
+#define TLI_DEFINE_VECFUNC(SCAL, VEC, VF, MANGLN) {SCAL, VEC, VF, NOMASK, MANGLN},
 #endif
 
 #if defined(TLI_DEFINE_ACCELERATE_VECFUNCS)
 // Accelerate framework's Vector Functions
 
 // Floating-Point Arithmetic and Auxiliary Functions
-TLI_DEFINE_VECFUNC("ceilf", "vceilf", FIXED(4))
-TLI_DEFINE_VECFUNC("fabsf", "vfabsf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.fabs.f32", "vfabsf", FIXED(4))
-TLI_DEFINE_VECFUNC("floorf", "vfloorf", FIXED(4))
-TLI_DEFINE_VECFUNC("sqrtf", "vsqrtf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.sqrt.f32", "vsqrtf", FIXED(4))
+TLI_DEFINE_VECFUNC("ceilf", "vceilf", FIXED(4), "_ZGV_LLVM_N4v_ceilf(vceilf)")
+TLI_DEFINE_VECFUNC("fabsf", "vfabsf", FIXED(4), "_ZGV_LLVM_N4v_fabsf(vfabsf)")
+TLI_DEFINE_VECFUNC("llvm.fabs.f32", "vfabsf", FIXED(4), "_ZGV_LLVM_N4v_llvm.fabs.f32(vfabsf)")
+TLI_DEFINE_VECFUNC("floorf", "vfloorf", FIXED(4), "_ZGV_LLVM_N4v_floorf(vfloorf)")
+TLI_DEFINE_VECFUNC("sqrtf", "vsqrtf", FIXED(4), "_ZGV_LLVM_N4v_sqrtf(vsqrtf)")
+TLI_DEFINE_VECFUNC("llvm.sqrt.f32", "vsqrtf", FIXED(4), "_ZGV_LLVM_N4v_llvm.sqrt.f32(vsqrtf)")
 
 // Exponential and Logarithmic Functions
-TLI_DEFINE_VECFUNC("expf", "vexpf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.exp.f32", "vexpf", FIXED(4))
-TLI_DEFINE_VECFUNC("expm1f", "vexpm1f", FIXED(4))
-TLI_DEFINE_VECFUNC("logf", "vlogf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.log.f32", "vlogf", FIXED(4))
-TLI_DEFINE_VECFUNC("log1pf", "vlog1pf", FIXED(4))
-TLI_DEFINE_VECFUNC("log10f", "vlog10f", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.log10.f32", "vlog10f", FIXED(4))
-TLI_DEFINE_VECFUNC("logbf", "vlogbf", FIXED(4))
+TLI_DEFINE_VECFUNC("expf", "vexpf", FIXED(4), "_ZGV_LLVM_N4v_expf(vexpf)")
+TLI_DEFINE_VECFUNC("llvm.exp.f32", "vexpf", FIXED(4), "_ZGV_LLVM_N4v_llvm.exp.f32(vexpf)")
+TLI_DEFINE_VECFUNC("expm1f", "vexpm1f", FIXED(4), "_ZGV_LLVM_N4v_expm1f(vexpm1f)")
+TLI_DEFINE_VECFUNC("logf", "vlogf", FIXED(4), "_ZGV_LLVM_N4v_logf(vlogf)")
+TLI_DEFINE_VECFUNC("llvm.log.f32", "vlogf", FIXED(4), "_ZGV_LLVM_N4v_llvm.log.f32(vlogf)")
+TLI_DEFINE_VECFUNC("log1pf", "vlog1pf", FIXED(4), "_ZGV_LLVM_N4v_log1pf(vlog1pf)")
+TLI_DEFINE_VECFUNC("log10f", "vlog10f", FIXED(4), "_ZGV_LLVM_N4v_log10f(vlog10f)")
+TLI_DEFINE_VECFUNC("llvm.log10.f32", "vlog10f", FIXED(4), "_ZGV_LLVM_N4v_llvm.log10.f32(vlog10f)")
+TLI_DEFINE_VECFUNC("logbf", "vlogbf", FIXED(4), "_ZGV_LLVM_N4v_logbf(vlogbf)")
 
 // Trigonometric Functions
-TLI_DEFINE_VECFUNC("sinf", "vsinf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.sin.f32", "vsinf", FIXED(4))
-TLI_DEFINE_VECFUNC("cosf", "vcosf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.cos.f32", "vcosf", FIXED(4))
-TLI_DEFINE_VECFUNC("tanf", "vtanf", FIXED(4))
-TLI_DEFINE_VECFUNC("asinf", "vasinf", FIXED(4))
-TLI_DEFINE_VECFUNC("acosf", "vacosf", FIXED(4))
-TLI_DEFINE_VECFUNC("atanf", "vatanf", FIXED(4))
+TLI_DEFINE_VECFUNC("sinf", "vsinf", FIXED(4), "_ZGV_LLVM_N4v_sinf(vsinf)")
+TLI_DEFINE_VECFUNC("llvm.sin.f32", "vsinf", FIXED(4), "_ZGV_LLVM_N4v_llvm.sin.f32(vsinf)")
+TLI_DEFINE_VECFUNC("cosf", "vcosf", FIXED(4), "_ZGV_LLVM_N4v_cosf(vcosf)")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "vcosf", FIXED(4), "_ZGV_LLVM_N4v_llvm.cos.f32(vcosf)")
+TLI_DEFINE_VECFUNC("tanf", "vtanf", FIXED(4), "_ZGV_LLVM_N4v_tanf(vtanf)")
+TLI_DEFINE_VECFUNC("asinf", "vasinf", FIXED(4), "_ZGV_LLVM_N4v_asinf(vasinf)")
+TLI_DEFINE_VECFUNC("acosf", "vacosf", FIXED(4), "_ZGV_LLVM_N4v_acosf(vacosf)")
+TLI_DEFINE_VECFUNC("atanf", "vatanf", FIXED(4), "_ZGV_LLVM_N4v_atanf(vatanf)")
 
 // Hyperbolic Functions
-TLI_DEFINE_VECFUNC("sinhf", "vsinhf", FIXED(4))
-TLI_DEFINE_VECFUNC("coshf", "vcoshf", FIXED(4))
-TLI_DEFINE_VECFUNC("tanhf", "vtanhf", FIXED(4))
-TLI_DEFINE_VECFUNC("asinhf", "vasinhf", FIXED(4))
-TLI_DEFINE_VECFUNC("acoshf", "vacoshf", FIXED(4))
-TLI_DEFINE_VECFUNC("atanhf", "vatanhf", FIXED(4))
+TLI_DEFINE_VECFUNC("sinhf", "vsinhf", FIXED(4), "_ZGV_LLVM_N4v_sinhf(vsinhf)")
+TLI_DEFINE_VECFUNC("coshf", "vcoshf", FIXED(4), "_ZGV_LLVM_N4v_coshf(vcoshf)")
+TLI_DEFINE_VECFUNC("tanhf", "vtanhf", FIXED(4), "_ZGV_LLVM_N4v_tanhf(vtanhf)")
+TLI_DEFINE_VECFUNC("asinhf", "vasinhf", FIXED(4), "_ZGV_LLVM_N4v_asinhf(vasinhf)")
+TLI_DEFINE_VECFUNC("acoshf", "vacoshf", FIXED(4), "_ZGV_LLVM_N4v_acoshf(vacoshf)")
+TLI_DEFINE_VECFUNC("atanhf", "vatanhf", FIXED(4), "_ZGV_LLVM_N4v_atanhf(vatanhf)")
 
 #elif defined(TLI_DEFINE_DARWIN_LIBSYSTEM_M_VECFUNCS)
 // Darwin libsystem_m vector functions.
 
 // Exponential and Logarithmic Functions
-TLI_DEFINE_VECFUNC("exp", "_simd_exp_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.exp.f64", "_simd_exp_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("expf", "_simd_exp_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.exp.f32", "_simd_exp_f4", FIXED(4))
+TLI_DEFINE_VECFUNC("exp", "_simd_exp_d2", FIXED(2), "_ZGV_LLVM_N2v_exp(_simd_exp_d2)")
+TLI_DEFINE_VECFUNC("llvm.exp.f64", "_simd_exp_d2", FIXED(2), "_ZGV_LLVM_N2v_llvm.exp.f64(_simd_exp_d2)")
+TLI_DEFINE_VECFUNC("expf", "_simd_exp_f4", FIXED(4), "_ZGV_LLVM_N4v_expf(_simd_exp_f4)")
+TLI_DEFINE_VECFUNC("llvm.exp.f32", "_simd_exp_f4", FIXED(4), "_ZGV_LLVM_N4v_llvm.exp.f32(_simd_exp_f4)")
 
 // Trigonometric Functions
-TLI_DEFINE_VECFUNC("acos", "_simd_acos_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("acosf", "_simd_acos_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("asin", "_simd_asin_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("asinf", "_simd_asin_f4", FIXED(4))
-
-TLI_DEFINE_VECFUNC("atan", "_simd_atan_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("atanf", "_simd_atan_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("atan2", "_simd_atan2_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("atan2f", "_simd_atan2_f4", FIXED(4))
-
-TLI_DEFINE_VECFUNC("cos", "_simd_cos_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.cos.f64", "_simd_cos_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("cosf", "_simd_cos_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.cos.f32", "_simd_cos_f4", FIXED(4))
-
-TLI_DEFINE_VECFUNC("sin", "_simd_sin_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.sin.f64", "_simd_sin_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("sinf", "_simd_sin_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.sin.f32", "_simd_sin_f4", FIXED(4))
+TLI_DEFINE_VECFUNC("acos", "_simd_acos_d2", FIXED(2), "_ZGV_LLVM_N2v_acos(_simd_acos_d2)")
+TLI_DEFINE_VECFUNC("acosf", "_simd_acos_f4", FIXED(4), "_ZGV_LLVM_N4v_acosf(_simd_acos_f4)")
+TLI_DEFINE_VECFUNC("asin", "_simd_asin_d2", FIXED(2), "_ZGV_LLVM_N2v_asin(_simd_asin_d2)")
+TLI_DEFINE_VECFUNC("asinf", "_simd_asin_f4", FIXED(4), "_ZGV_LLVM_N4v_asinf(_simd_asin_f4)")
+
+TLI_DEFINE_VECFUNC("atan", "_simd_atan_d2", FIXED(2), "_ZGV_LLVM_N2v_atan(_simd_atan_d2)")
+TLI_DEFINE_VECFUNC("atanf", "_simd_atan_f4", FIXED(4), "_ZGV_LLVM_N4v_atanf(_simd_atan_f4)")
+TLI_DEFINE_VECFUNC("atan2", "_simd_atan2_d2", FIXED(2), "_ZGV_LLVM_N2v_atan2(_simd_atan2_d2)")
+TLI_DEFINE_VECFUNC("atan2f", "_simd_atan2_f4", FIXED(4), "_ZGV_LLVM_N4v_atan2f(_simd_atan2_f4)")
+
+TLI_DEFINE_VECFUNC("cos", "_simd_cos_d2", FIXED(2), "_ZGV_LLVM_N2v_cos(_simd_cos_d2)")
+TLI_DEFINE_VECFUNC("llvm.cos.f64", "_simd_cos_d2", FIXED(2), "_ZGV_LLVM_N2v_llvm.cos.f64(_simd_cos_d2)")
+TLI_DEFINE_VECFUNC("cosf", "_simd_cos_f4", FIXED(4), "_ZGV_LLVM_N4v_cosf(_simd_cos_f4)")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_simd_cos_f4", FIXED(4), "_ZGV_LLVM_N4v_llvm.cos.f32(_simd_cos_f4)")
+
+TLI_DEFINE_VECFUNC("sin", "_simd_sin_d2", FIXED(2), "_ZGV_LLVM_N2v_sin(_simd_sin_d2)")
+TLI_DEFINE_VECFUNC("llvm.sin.f64", "_simd_sin_d2", FIXED(2), "_ZGV_LLVM_N2v_llvm.sin.f64(_simd_sin_d2)")
+TLI_DEFINE_VECFUNC("sinf", "_simd_sin_f4", FIXED(4), "_ZGV_LLVM_N4v_sinf(_simd_sin_f4)")
+TLI_DEFINE_VECFUNC("llvm.sin.f32", "_simd_sin_f4", FIXED(4), "_ZGV_LLVM_N4v_llvm.sin.f32(_simd_sin_f4)")
 
 // Floating-Point Arithmetic and Auxiliary Functions
-TLI_DEFINE_VECFUNC("cbrt", "_simd_cbrt_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("cbrtf", "_simd_cbrt_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("erf", "_simd_erf_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("erff", "_simd_erf_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("pow", "_simd_pow_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.pow.f64", "_simd_pow_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("powf", "_simd_pow_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.pow.f32", "_simd_pow_f4", FIXED(4))
+TLI_DEFINE_VECFUNC("cbrt", "_simd_cbrt_d2", FIXED(2), "_ZGV_LLVM_N2v_cbrt(_simd_cbrt_d2)")
+TLI_DEFINE_VECFUNC("cbrtf", "_simd_cbrt_f4", FIXED(4), "_ZGV_LLVM_N4v_cbrtf(_simd_cbrt_f4)")
+TLI_DEFINE_VECFUNC("erf", "_simd_erf_d2", FIXED(2), "_ZGV_LLVM_N2v_erf(_simd_erf_d2)")
+TLI_DEFINE_VECFUNC("erff", "_simd_erf_f4", FIXED(4), "_ZGV_LLVM_N4v_erff(_simd_erf_f4)")
+TLI_DEFINE_VECFUNC("pow", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v_pow(_simd_pow_d2)")
+TLI_DEFINE_VECFUNC("llvm.pow.f64", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v_llvm.pow.f64(_simd_pow_d2)")
+TLI_DEFINE_VECFUNC("powf", "_simd_pow_f4", FIXED(4), "_ZGV_LLVM_N4v_powf(_simd_pow_f4)")
+TLI_DEFINE_VECFUNC("llvm.pow.f32", "_simd_pow_f4", FIXED(4), "_ZGV_LLVM_N4v_llvm.pow.f32(_simd_pow_f4)")
 
 // Hyperbolic Functions
-TLI_DEFINE_VECFUNC("sinh", "_simd_sinh_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("sinhf", "_simd_sinh_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("cosh", "_simd_cosh_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("coshf", "_simd_cosh_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("tanh", "_simd_tanh_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("tanhf", "_simd_tanh_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("asinh", "_simd_asinh_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("asinhf", "_simd_asinh_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("acosh", "_simd_acosh_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("acoshf", "_simd_acosh_f4", FIXED(4))
-TLI_DEFINE_VECFUNC("atanh", "_simd_atanh_d2", FIXED(2))
-TLI_DEFINE_VECFUNC("atanhf", "_simd_atanh_f4", FIXED(4))
+TLI_DEFINE_VECFUNC("sinh", "_simd_sinh_d2", FIXED(2), "_ZGV_LLVM_N2v_sinh(_simd_sinh_d2)")
+TLI_DEFINE_VECFUNC("sinhf", "_simd_sinh_f4", FIXED(4), "_ZGV_LLVM_N4v_sinhf(_simd_sinh_f4)")
+TLI_DEFINE_VECFUNC("cosh", "_simd_cosh_d2", FIXED(2), "_ZGV_LLVM_N2v_cosh(_simd_cosh_d2)")
+TLI_DEFINE_VECFUNC("coshf", "_simd_cosh_f4", FIXED(4), "_ZGV_LLVM_N4v_coshf(_simd_cosh_f4)")
+TLI_DEFINE_VECFUNC("tanh", "_simd_tanh_d2", FIXED(2), "_ZGV_LLVM_N2v_tanh(_simd_tanh_d2)")
+TLI_DEFINE_VECFUNC("tanhf", "_simd_tanh_f4", FIXED(4), "_ZGV_LLVM_N4v_tanhf(_simd_tanh_f4)")
+TLI_DEFINE_VECFUNC("asinh", "_simd_asinh_d2", FIXED(2), "_ZGV_LLVM_N2v_asinh(_simd_asinh_d2)")
+TLI_DEFINE_VECFUNC("asinhf", "_simd_asinh_f4", FIXED(4), "_ZGV_LLVM_N4v_asinhf(_simd_asinh_f4)")
+TLI_DEFINE_VECFUNC("acosh", "_simd_acosh_d2", FIXED(2), "_ZGV_LLVM_N2v_acosh(_simd_acosh_d2)")
+TLI_DEFINE_VECFUNC("acoshf", "_simd_acosh_f4", FIXED(4), "_ZGV_LLVM_N4v_acoshf(_simd_acosh_f4)")
+TLI_DEFINE_VECFUNC("atanh", "_simd_atanh_d2", FIXED(2), "_ZGV_LLVM_N2v_atanh(_simd_atanh_d2)")
+TLI_DEFINE_VECFUNC("atanhf", "_simd_atanh_f4", FIXED(4), "_ZGV_LLVM_N4v_atanhf(_simd_atanh_f4)")
 
 #elif defined(TLI_DEFINE_LIBMVEC_X86_VECFUNCS)
 // GLIBC Vector math Functions
 
-TLI_DEFINE_VECFUNC("sin", "_ZGVbN2v_sin", FIXED(2))
-TLI_DEFINE_VECFUNC("sin", "_ZGVdN4v_sin", FIXED(4))
+TLI_DEFINE_VECFUNC("sin", "_ZGVbN2v_sin", FIXED(2), "_ZGV_LLVM_N2v_sin(_ZGVbN2v_sin)")
+TLI_DEFINE_VECFUNC("sin", "_ZGVdN4v_sin", FIXED(4), "_ZGV_LLVM_N4v_sin(_ZGVdN4v_sin)")
 
-TLI_DEFINE_VECFUNC("sinf", "_ZGVbN4v_sinf", FIXED(4))
-TLI_DEFINE_VECFUNC("sinf", "_ZGVdN8v_sinf", FIXED(8))
+TLI_DEFINE_VECFUNC("sinf", "_ZGVbN4v_sinf", FIXED(4), "_ZGV_LLVM_N4v_sinf(_ZGVbN4v_sinf)")
+TLI_DEFINE_VECFUNC("sinf", "_ZGVdN8v_sinf", FIXED(8), "_ZGV_LLVM_N8v_sinf(_ZGVdN8v_sinf)")
 
-TLI_DEFINE_VECFUNC("llvm.sin.f64", "_ZGVbN2v_sin", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.sin.f64", "_ZGVdN4v_sin", FIXED(4))
+TLI_DEFINE_VECFUNC("llvm.sin.f64", "_ZGVbN2v_sin", FIXED(2), "_ZGV_LLVM_N2v_llvm.sin.f64(_ZGVbN2v_sin)")
+TLI_DEFINE_VECFUNC("llvm.sin.f64", "_ZGVdN4v_sin", FIXED(4), "_ZGV_LLVM_N4v_llvm.sin.f64(_ZGVdN4v_sin)")
 
-TLI_DEFINE_VECFUNC("llvm.sin.f32", "_ZGVbN4v_sinf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.sin.f32", "_ZGVdN8v_sinf", FIXED(8))
+TLI_DEFINE_VECFUNC("llvm.sin.f32", "_ZGVbN4v_sinf", FIXED(4), "_ZGV_LLVM_N4v_llvm.sin.f32(_ZGVbN4v_sinf)")
+TLI_DEFINE_VECFUNC("llvm.sin.f32", "_ZGVdN8v_sinf", FIXED(8), "_ZGV_LLVM_N8v_llvm.sin.f32(_ZGVdN8v_sinf)")
 
-TLI_DEFINE_VECFUNC("cos", "_ZGVbN2v_cos", FIXED(2))
-TLI_DEFINE_VECFUNC("cos", "_ZGVdN4v_cos", FIXED(4))
+TLI_DEFINE_VECFUNC("cos", "_ZGVbN2v_cos", FIXED(2), "_ZGV_LLVM_N2v_cos(_ZGVbN2v_cos)")
+TLI_DEFINE_VECFUNC("cos", "_ZGVdN4v_cos", FIXED(4), "_ZGV_LLVM_N4v_cos(_ZGVdN4v_cos)")
 
-TLI_DEFINE_VECFUNC("cosf", "_ZGVbN4v_cosf", FIXED(4))
-TLI_DEFINE_VECFUNC("cosf", "_ZGVdN8v_cosf", FIXED(8))
+TLI_DEFINE_VECFUNC("cosf", "_ZGVbN4v_cosf", FIXED(4), "_ZGV_LLVM_N4v_cosf(_ZGVbN4v_cosf)")
+TLI_DEFINE_VECFUNC("cosf", "_ZGVdN8v_cosf", FIXED(8), "_ZGV_LLVM_N8v_cosf(_ZGVdN8v_cosf)")
 
-TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVbN2v_cos", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVdN4v_cos", FIXED(4))
+TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVbN2v_cos", FIXED(2), "_ZGV_LLVM_N2v_llvm.cos.f64(_ZGVbN2v_cos)")
+TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVdN4v_cos", FIXED(4), "_ZGV_LLVM_N4v_llvm.cos.f64(_ZGVdN4v_cos)")
 
-TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVbN4v_cosf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVdN8v_cosf", FIXED(8))
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVbN4v_cosf", FIXED(4), "_ZGV_LLVM_N4v_llvm.cos.f32(_ZGVbN4v_cosf)")
+TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVdN8v_cosf", FIXED(8), "_ZGV_LLVM_N8v_llvm.cos.f32(_ZGVdN8v_cosf)")
 
-TLI_DEFINE_VECFUNC("pow", "_ZGVbN2vv_pow", FIXED(2))
-TLI_DEFINE_VECFUNC("pow", "_ZGVdN4vv_pow", FIXED(4))
+TLI_DEFINE_VECFUNC("pow", "_ZGVbN2vv_pow", FIXED(2), "_ZGV_LLVM_N2vv_pow(_ZGVbN2vv_pow)")
+TLI_DEFINE_VECFUNC("pow", "_ZGVdN4vv_pow", FIXED(4), "_ZGV_LLVM_N4vv_pow(_ZGVdN4vv_pow)")
 
-TLI_DEFINE_VECFUNC("powf", "_ZGVbN4vv_powf", FIXED(4))
-TLI_DEFINE_VECFUNC("powf", "_ZGVdN8vv_powf", FIXED(8))
+TLI_DEFINE_VECFUNC("powf", "_ZGVbN4vv_powf", FIXED(4), "_ZGV_LLVM_N4vv_powf(_ZGVbN4vv_powf)")
+TLI_DEFINE_VECFUNC("powf", "_ZGVdN8vv_powf", FIXED(8), "_ZGV_LLVM_N8vv_powf(_ZGVdN8vv_powf)")
 
-TLI_DEFINE_VECFUNC("__pow_finite", "_ZGVbN2vv___pow_finite", FIXED(2))
-TLI_DEFINE_VECFUNC("__pow_finite", "_ZGVdN4vv___pow_finite", FIXED(4))
+TLI_DEFINE_VECFUNC("__pow_finite", "_ZGVbN2vv___pow_finite", FIXED(2), "_ZGV_LLVM_N2vv___pow_finite(_ZGVbN2vv___pow_finite)")
+TLI_DEFINE_VECFUNC("__pow_finite", "_ZGVdN4vv___pow_finite", FIXED(4), "_ZGV_LLVM_N4vv___pow_finite(_ZGVdN4vv___pow_finite)")
 
-TLI_DEFINE_VECFUNC("__powf_finite", "_ZGVbN4vv___powf_finite", FIXED(4))
-TLI_DEFINE_VECFUNC("__powf_finite", "_ZGVdN8vv___powf_finite", FIXED(8))
+TLI_DEFINE_VECFUNC("__powf_finite", "_ZGVbN4vv___powf_finite", FIXED(4), "_ZGV_LLVM_N4vv___powf_finite(_ZGVbN4vv___powf_finite)")
+TLI_DEFINE_VECFUNC("__powf_finite", "_ZGVdN8vv___powf_finite", FIXED(8), "_ZGV_LLVM_N8vv___powf_finite(_ZGVdN8vv___powf_finite)")
 
-TLI_DEFINE_VECFUNC("llvm.pow.f64", "_ZGVbN2vv_pow", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.pow.f64", "_ZGVdN4vv_pow", FIXED(4))
+TLI_DEFINE_VECFUNC("llvm.pow.f64", "_ZGVbN2vv_pow", FIXED(2), "_ZGV_LLVM_N2vv_llvm.pow.f64(_ZGVbN2vv_pow)")
+TLI_DEFINE_VECFUNC("llvm.pow.f64", "_ZGVdN4vv_pow", FIXED(4), "_ZGV_LLVM_N4vv_llvm.pow.f64(_ZGVdN4vv_pow)")
 
-TLI_DEFINE_VECFUNC("llvm.pow.f32", "_ZGVbN4vv_powf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.pow.f32", "_ZGVdN8vv_powf", FIXED(8))
+TLI_DEFINE_VECFUNC("llvm.pow.f32", "_ZGVbN4vv_powf", FIXED(4), "_ZGV_LLVM_N4vv_llvm.pow.f32(_ZGVbN4vv_powf)")
+TLI_DEFINE_VECFUNC("llvm.pow.f32", "_ZGVdN8vv_powf", FIXED(8), "_ZGV_LLVM_N8vv_llvm.pow.f32(_ZGVdN8vv_powf)")
 
-TLI_DEFINE_VECFUNC("exp", "_ZGVbN2v_exp", FIXED(2))
-TLI_DEFINE_VECFUNC("exp", "_ZGVdN4v_exp", FIXED(4))
+TLI_DEFINE_VECFUNC("exp", "_ZGVbN2v_exp", FIXED(2), "_ZGV_LLVM_N2v_exp(_ZGVbN2v_exp)")
+TLI_DEFINE_VECFUNC("exp", "_ZGVdN4v_exp", FIXED(4), "_ZGV_LLVM_N2v_exp(_ZGVdN4v_exp)")
 
-TLI_DEFINE_VECFUNC("expf", "_ZGVbN4v_expf", FIXED(4))
-TLI_DEFINE_VECFUNC("expf", "_ZGVdN8v_expf", FIXED(8))
+TLI_DEFINE_VECFUNC("expf", "_ZGVbN4v_expf", FIXED(4), "_ZGV_LLVM_N4v_expf(_ZGVbN4v_expf)")
+TLI_DEFINE_VECFUNC("expf", "_ZGVdN8v_expf", FIXED(8), "_ZGV_LLVM_N8v_expf(_ZGVdN8v_expf)")
 
-TLI_DEFINE_VECFUNC("__exp_finite", "_ZGVbN2v___exp_finite", FIXED(2))
-TLI_DEFINE_VECFUNC("__exp_finite", "_ZGVdN4v___exp_finite", FIXED(4))
+TLI_DEFINE_VECFUNC("__exp_finite", "_ZGVbN2v___exp_finite", FIXED(2), "_ZGV_LLVM_N2v___exp_finite(_ZGVbN2v___exp_finite)")
+TLI_DEFINE_VECFUNC("__exp_finite", "_ZGVdN4v___exp_finite", FIXED(4), "_ZGV_LLVM_N4v___exp_finite(_ZGVdN4v___exp_finite)")
 
-TLI_DEFINE_VECFUNC("__expf_finite", "_ZGVbN4v___expf_finite", FIXED(4))
-TLI_DEFINE_VECFUNC("__expf_finite", "_ZGVdN8v___expf_finite", FIXED(8))
+TLI_DEFINE_VECFUNC("__expf_finite", "_ZGVbN4v___expf_finite", FIXED(4), "_ZGV_LLVM_N4v___expf_finite(_ZGVbN4v___expf_finite)")
+TLI_DEFINE_VECFUNC("__expf_finite", "_ZGVdN8v___expf_finite", FIXED(8), "_ZGV_LLVM_N8v___expf_finite(_ZGVdN8v___expf_finite)")
 
-TLI_DEFINE_VECFUNC("llvm.exp.f64", "_ZGVbN2v_exp", FIXED(2))
-TLI_DEFINE_VECFUNC("llvm.exp.f64", "_ZGVdN4v_...
[truncated]

paulwalker-arm · 2023-09-18T15:51:51Z

llvm/include/llvm/Analysis/VecFuncs.def

-TLI_DEFINE_VECFUNC("floorf", "vfloorf", FIXED(4))
-TLI_DEFINE_VECFUNC("sqrtf", "vsqrtf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.sqrt.f32", "vsqrtf", FIXED(4))
+TLI_DEFINE_VECFUNC("ceilf", "vceilf", FIXED(4), "_ZGV_LLVM_N4v_ceilf(vceilf)")


MANGLN shouldn't need to be a full vector-function-abi-variant string because that string has a specific use and is built up based on the fields within this table. We should try to keep things minimal and that means certainly dropping the (vector-name) part of the string.

I cannot quite decide whether the scalar name is useful to include or not but that likely comes down to the interface used to extract the data. You're current returning the mangling and the vector name so that's likely where the need comes from. However if you instead have queries that return a reference to the whole record then you might be able to restrict the new field to just mangling prefix because the scalar name will be accessible.

I'm afraid I cannot shorten the mangled name removing the optional part.
If I do I'll crash in VFABI::setVectorVariantNames() with
assert(VI && "Cannot add an invalid VFABI name.").
The culprit is the code below. VectorName will only be changed if there
is an optional part. For LLVM isa this causes a std::nullopt return
because the VectorName is unchanged and equals OriginalName.

std::optional VFABI::tryDemangleForVFABI(StringRef MangledName,
const Module &M) {
const StringRef OriginalName = MangledName;
StringRef VectorName = MangledName;
...
// Reduce MangledName to [()].
MangledName = MangledName.ltrim(ScalarName);
// Find the optional custom name redirection.
if (MangledName.consume_front("(")) {
if (!MangledName.consume_back(")"))
return std::nullopt;
// Update the vector variant with the one specified by the user.
VectorName = MangledName;
// If the vector name is missing, bail out.
if (VectorName.empty())
return std::nullopt;
}
// LLVM internal mapping via the TargetLibraryInfo (TLI) must be
// redirected to an existing name.
if (ISA == VFISAKind::LLVM && VectorName == OriginalName)
return std::nullopt;

Sorry for not formatting the above as code but this removes all line breaks for me.

I think perhaps you've misunderstood the intent of this work and thus my request. I'm not expecting there to be any changes to the value of the vector-function-abi-variant attribute. I'm only expecting the way the text is generated to be changed so that for follow on patches we can stop tryDemangleForVFABI from incorrectly looking through the module in order to figure out the element count for scalable vector functions.

If you look at how mangleTLIVectorName is used you'll see that it's given data that has been extracted from TargetLibraryInfo (via its database of VecDesc objects). The only missing piece of information is the Vector ABI mangling prefix that it instead computes. The request is for this prefix to be store within TargetLibraryInfo as well.

Hopefully fixed now.

paulwalker-arm · 2023-09-18T15:54:00Z

llvm/include/llvm/Analysis/TargetLibraryInfo.h

+  std::pair<StringRef, StringRef>
+  getVectorizedFunction(StringRef F, const ElementCount &VF, bool Masked) const;


This feels clunky and I think the original interface is just wrong because it's not going to scale well with different variants for the same scalar functions. I think it's best to return the complete record (essentially turn VecDesc into a class) so that all the related information is kept together.

NOTE: I'm not saying you need to convert all the code paths, feel free to sit the old interface on top of new one and then just use the new one where you need it (i.e. when you need access to the mangle string).

paulwalker-arm · 2023-09-18T16:01:58Z

llvm/include/llvm/Analysis/VecFuncs.def

+TLI_DEFINE_VECFUNC("erf", "_simd_erf_d2", FIXED(2), "_ZGV_LLVM_N2v_erf(_simd_erf_d2)")
+TLI_DEFINE_VECFUNC("erff", "_simd_erf_f4", FIXED(4), "_ZGV_LLVM_N4v_erff(_simd_erf_f4)")
+TLI_DEFINE_VECFUNC("pow", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v_pow(_simd_pow_d2)")
+TLI_DEFINE_VECFUNC("llvm.pow.f64", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v_llvm.pow.f64(_simd_pow_d2)")


Not sure how many of these are wrong but pow takes two parameters and yet the vector ABI string only shows one v (i.e. one operand). I had thought there would be existing tests to ensure this couldn't go wrong. I'm guessing now there are no such existing tests?

It might be worth using a macro a bit like we have for the ElementCount fields to cover the most common usages (e.g. VABI_MANGLE_OneOp(#NoElts), VABI_MANGLE_TwoOp(#NoElts)).

Yes, pow usually takes two arguments. This one seems to be an exception. From the veclib-calls-libsystem-darwin.ll test file:
declare double @pow(double) ... %call = tail call double @pow(double %lv)
All darwin tests are passing but I cannot guarantee the tests cover all mappings.

This looks like a bug with the original implementation. @fhahn does the use of pow/powf (i.e. only having a single operand) within veclib-calls-libsystem-darwin.ll look correct to you?

I think we're moving towards only the prefix needing to be store. If true then that reenforces the idea of using a macros of some form to help reduce the chances of typos.

paulwalker-arm · 2023-09-26T15:59:08Z

llvm/include/llvm/Analysis/TargetLibraryInfo.h

-  /// Return the name of the equivalent of F, vectorized with factor VF. If no
-  /// such mapping exists, return the empty string.
+  /// Return the name of the equivalent of F, vectorized with factor VF.
+  /// If no such mapping exists, return empty strings.


Presumably this change can now be reverted?

paulwalker-arm · 2023-09-26T16:00:26Z

llvm/include/llvm/Analysis/TargetLibraryInfo.h

+  /// Return a pointer to a VecDesc object holding all info for scalar to vector
+  /// mappings in TLI for the equivalent of F, vectorized with factor VF.
+  /// If no such mapping exists, return nullpointer.
+  const VecDesc *getMangledTLIVectorName(StringRef F, const ElementCount &VF,


Something like getVectorMappingInfo seems more representative of what the functions does?

paulwalker-arm · 2023-09-26T16:00:45Z

llvm/include/llvm/Analysis/TargetLibraryInfo.h

@@ -354,6 +389,10 @@ class TargetLibraryInfo {
                                  bool Masked = false) const {
    return Impl->getVectorizedFunction(F, VF, Masked);
  }
+  const VecDesc *getMangledTLIVectorName(StringRef F, const ElementCount &VF,


As above regarding the function name.

paulwalker-arm · 2023-09-26T16:20:33Z

llvm/include/llvm/Analysis/VecFuncs.def

-TLI_DEFINE_VECFUNC("floorf", "vfloorf", FIXED(4))
-TLI_DEFINE_VECFUNC("sqrtf", "vsqrtf", FIXED(4))
-TLI_DEFINE_VECFUNC("llvm.sqrt.f32", "vsqrtf", FIXED(4))
+TLI_DEFINE_VECFUNC("ceilf", "vceilf", FIXED(4), "_ZGV_LLVM_N4v_ceilf(vceilf)")


I think perhaps you've misunderstood the intent of this work and thus my request. I'm not expecting there to be any changes to the value of the vector-function-abi-variant attribute. I'm only expecting the way the text is generated to be changed so that for follow on patches we can stop tryDemangleForVFABI from incorrectly looking through the module in order to figure out the element count for scalable vector functions.

If you look at how mangleTLIVectorName is used you'll see that it's given data that has been extracted from TargetLibraryInfo (via its database of VecDesc objects). The only missing piece of information is the Vector ABI mangling prefix that it instead computes. The request is for this prefix to be store within TargetLibraryInfo as well.

paulwalker-arm · 2023-09-26T16:25:09Z

llvm/include/llvm/Analysis/TargetLibraryInfo.h

+  StringRef getScalarFnName() const { return ScalarFnName; }
+  StringRef getVectorFnName() const { return VectorFnName; }
+  ElementCount getVectorizationFactor() const { return VectorizationFactor; }
+  bool getMasked() const { return Masked; }


Can this be isMasked?

paulwalker-arm · 2023-09-26T16:28:39Z

llvm/include/llvm/Analysis/TargetLibraryInfo.h

+  StringRef getVectorFnName() const { return VectorFnName; }
+  ElementCount getVectorizationFactor() const { return VectorizationFactor; }
+  bool getMasked() const { return Masked; }
+  StringRef getMangledName() const { return MangledName; }


What about adding a function like getVectorFunctionABIVariantString that uses all the fields to construct _ZGV<isa><mask><vlen><vparams>_<scalarname>(<vectorname>)?

Once this exists then perhaps getMangledName() becomes redundant and can be removed.

Fixed. getMangledName() is removed.

paulwalker-arm · 2023-09-26T16:29:40Z

llvm/include/llvm/Analysis/TargetLibraryInfo.h

+/// <scalarname> = the name of the scalar function.
+/// <vectorname> = the name of the vector function.
+class VecDesc {
+private:


The private: can be removed because by default class members are private.

paulwalker-arm · 2023-09-26T16:37:41Z

llvm/include/llvm/Analysis/VecFuncs.def

+TLI_DEFINE_VECFUNC("erf", "_simd_erf_d2", FIXED(2), "_ZGV_LLVM_N2v_erf(_simd_erf_d2)")
+TLI_DEFINE_VECFUNC("erff", "_simd_erf_f4", FIXED(4), "_ZGV_LLVM_N4v_erff(_simd_erf_f4)")
+TLI_DEFINE_VECFUNC("pow", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v_pow(_simd_pow_d2)")
+TLI_DEFINE_VECFUNC("llvm.pow.f64", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v_llvm.pow.f64(_simd_pow_d2)")


I think we're moving towards only the prefix needing to be store. If true then that reenforces the idea of using a macros of some form to help reduce the chances of typos.

paulwalker-arm

FYI: I'll take a proper look at VecFuncs.def once I'm happy with the rest of the patch.

paulwalker-arm · 2023-09-28T09:46:07Z

llvm/test/Transforms/Util/add-TLI-mappings.ll

+; SLEEFGNUABI-SAME: [4 x ptr] [
+; SLEEFGNUABI-SAME:   ptr @_ZGVnN2v_sin,
+; SLEEFGNUABI-SAME:   ptr @_ZGVsMxv_sin,
+; SLEEFGNUABI_SAME;   ptr @_ZGVnN4v_log10f,
+; SLEEFGNUABI-SAME:   ptr @_ZGVsMxv_log10f
+; ARMPL-SAME:       [4 x ptr] [
+; ARMPL-SAME:         ptr @armpl_vsinq_f64,
+; ARMPL-SAME:         ptr @armpl_svsin_f64_x,
+; ARMPL-SAME:         ptr @armpl_vlog10q_f32,
+; ARMPL-SAME:         ptr @armpl_svlog10_f32_x


Please can you rebase the pull request to remove the non-relevant changes. I know that might mess up review comment linking but that doesn't matter at this stage.

paulwalker-arm · 2023-09-28T09:47:54Z

llvm/lib/Analysis/VectorUtils.cpp

+std::string
+VFABI::getVectorFunctionABIVariantString(const StringRef MangledNamePrefix,
+                                         const StringRef ScalarFnName,
+                                         const StringRef VectorFnName) {


I think this function would be better as a VecDesc member function.

paulwalker-arm · 2023-09-28T10:04:07Z

llvm/include/llvm/Analysis/VecFuncs.def

@@ -14,7 +14,7 @@

 #if defined(TLI_DEFINE_MASSV_VECFUNCS_NAMES)
 #define TLI_DEFINE_MASSV_VECFUNCS
-#define TLI_DEFINE_VECFUNC(SCAL, VEC, VF) VEC,
+#define TLI_DEFINE_VECFUNC(SCAL, VEC, VF, MANGLN) VEC,


Now that we've settled on just storing the prefix I suggest VABI_PREFIX as a more accurate name.

Currently the mappings from TLI are used to generate the list of available "scalar to vector" mappings attached to scalar calls as "vector-function-abi-variant" LLVM IR attribute. Function names from TLI are wrapped in mangled name following the pattern: _ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)] The problem is the mangled name uses _LLVM_ as the ISA name which prevents the compiler to compute vectorization factor for scalable vectors as it cannot make any decision based on the _LLVM_ ISA. If we use "s" as the ISA name, the compiler can make decisions based on VFABI specification where SVE spacific rules are described. This patch is only a refactoring stage where there is no change to the compiler's behaviour.

the functionality.

and adjusted the implementation.

renamed a macro field and rebased on top of the main branch.

paulwalker-arm

I know we've not heard back about the pow issue but I think it's just a bug that we can fix in this patch.

paulwalker-arm · 2023-09-29T12:18:07Z

llvm/include/llvm/Analysis/VecFuncs.def

-TLI_DEFINE_VECFUNC("exp", "_ZGVbN2v_exp", FIXED(2))
-TLI_DEFINE_VECFUNC("exp", "_ZGVdN4v_exp", FIXED(4))
+TLI_DEFINE_VECFUNC("exp", "_ZGVbN2v_exp", FIXED(2), "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("exp", "_ZGVdN4v_exp", FIXED(4), "_ZGV_LLVM_N2v")


Should this be _ZGV_LLVM_N4v?

paulwalker-arm · 2023-09-29T13:50:53Z

llvm/include/llvm/Analysis/VecFuncs.def

+TLI_DEFINE_VECFUNC("pow", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.pow.f64", "_simd_pow_d2", FIXED(2), "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("powf", "_simd_pow_f4", FIXED(4), "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.pow.f32", "_simd_pow_f4", FIXED(4), "_ZGV_LLVM_N4v")


These take two parameters and so show end vv.

paulwalker-arm · 2023-09-29T13:54:21Z

llvm/include/llvm/Analysis/VecFuncs.def

+TLI_DEFINE_VECFUNC("atan2", "__atan2d2", FIXED(2), "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("atan2f", "__atan2f4", FIXED(4), "_ZGV_LLVM_N4v")


These take two parameters and this should end vv.

paulwalker-arm · 2023-09-29T13:54:59Z

llvm/include/llvm/Analysis/VecFuncs.def

+TLI_DEFINE_VECFUNC("atan2", "_simd_atan2_d2", FIXED(2), "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("atan2f", "_simd_atan2_f4", FIXED(4), "_ZGV_LLVM_N4v")


These take two parameters and so show end vv.

mikaelholmen · 2023-10-03T10:48:05Z

Hi @JolantaJensen and others,

I'm seeing problems with this patch for my out-of-tree target.

For this targert we have a bunch of target intrinsics that we say are vectorizable, but we don't provide a name of the vector version of the intrinsic. This means that before this patch, addMappingsFromTLI wouldn't do much since TLI.getVectorizedFunction(ScalarName, VF, Predicate) returned "" and AddVariantDecl wouldn't add anything.

Vectorization then worked as we wanted anyway.

With this patch, the vectorFn name guard in AddVariantDecl has apparently been removed and instead
something is indeed added to Mappings. And then setVectorVariantNames fails with
opt: ../lib/Transforms/Utils/ModuleUtils.cpp:352: void llvm::VFABI::setVectorVariantNames(llvm::CallInst *, ArrayRef<std::string>): AssertionVI && "Cannot add an invalid VFABI name."' failed.
`
since it couldn't demangle the mapping (since there is no vector name).

Is this change on purpose?

We had some similar problems a couple of years ago with
https://reviews.llvm.org/D67572
which was then reverted and improved in
https://reviews.llvm.org/D72734
so that vectorization would work even without a vector function name.

EDIT: The commit message says
"This patch is only a refactoring stage where there is no change to the compiler's behaviour."
which makes me think the change in behaviour indeed is not on purpose.

JolantaJensen · 2023-10-03T12:27:46Z

Hi @mikaelholmen ,
I'll push a patch covering the case of an empty vector function name.

A guard for empty vector function name was removed in llvm#66656. This patch adds the guard back.

A guard for empty vector function name was removed in #66656. This patch adds the guard back.

…ed names and the call's argument types to construct the VFInfo. Currently the tryDemangleForVFABI is incorrectly making assumptions that when the mangled name has __LLVM__ set as ISA and _vlen_ is set to scalable/VLA the module can be queried to identify the correct VF. This works currently only by chance rather than by design and limits the ability to maximise the use for the Vector ABI. This behaviour is changed by llvm#66656 and llvm#67308 and means demangling can be implemented as expected based solely on the rules as documented within the VFABI.

JolantaJensen self-assigned this Sep 18, 2023

llvmbot added llvm:analysis llvm:transforms labels Sep 18, 2023

JolantaJensen requested review from mgabka and paulwalker-arm September 18, 2023 15:26

paulwalker-arm reviewed Sep 18, 2023

View reviewed changes

paulwalker-arm reviewed Sep 26, 2023

View reviewed changes

paulwalker-arm reviewed Sep 28, 2023

View reviewed changes

JolantaJensen force-pushed the TLI-NFC-Fix-mechanism-propagating-mangled-names-for-TLI-function-mappings branch from e466c0e to 477a0b3 Compare September 28, 2023 15:53

JolantaJensen added 5 commits September 28, 2023 17:25

Changed struct VecDesc into a class VecDesc and adjusted

b7d3c9c

the functionality.

Corrected the TLI macros to only contain the mangled name prefix

d419001

and adjusted the implementation.

Made VFABI::getVectorFunctionABIVariantString a VecDesc member function,

7e2ed4a

renamed a macro field and rebased on top of the main branch.

Some renaming.

1ec6a90

JolantaJensen force-pushed the TLI-NFC-Fix-mechanism-propagating-mangled-names-for-TLI-function-mappings branch from 477a0b3 to 1ec6a90 Compare September 28, 2023 17:56

paulwalker-arm reviewed Sep 29, 2023

View reviewed changes

Correction to VABI prefixes and the tests.

28ea2e6

paulwalker-arm approved these changes Sep 29, 2023

View reviewed changes

JolantaJensen merged commit 01797da into llvm:main Oct 2, 2023
3 checks passed

JolantaJensen added a commit to JolantaJensen/jjensen-llvm-project that referenced this pull request Oct 3, 2023

[TLI] Added a guard for empty vector function name.

feb323c

A guard for empty vector function name was removed in llvm#66656. This patch adds the guard back.

JolantaJensen mentioned this pull request Oct 3, 2023

[TLI] Added a guard for empty vector function name. #68100

Merged

JolantaJensen added a commit that referenced this pull request Oct 3, 2023

[TLI] Added a guard for empty vector function name. (#68100)

9546b5d

A guard for empty vector function name was removed in #66656. This patch adds the guard back.

		std::pair<StringRef, StringRef>
		getVectorizedFunction(StringRef F, const ElementCount &VF, bool Masked) const;

		TLI_DEFINE_VECFUNC("atan2", "__atan2d2", FIXED(2), "_ZGV_LLVM_N2v")
		TLI_DEFINE_VECFUNC("atan2f", "__atan2f4", FIXED(4), "_ZGV_LLVM_N4v")

		TLI_DEFINE_VECFUNC("atan2", "_simd_atan2_d2", FIXED(2), "_ZGV_LLVM_N2v")
		TLI_DEFINE_VECFUNC("atan2f", "_simd_atan2_f4", FIXED(4), "_ZGV_LLVM_N4v")

Fix mechanism propagating mangled names for TLI function mappings #66656

Fix mechanism propagating mangled names for TLI function mappings #66656

Conversation

JolantaJensen commented Sep 18, 2023

llvmbot commented Sep 18, 2023 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

JolantaJensen Sep 19, 2023 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

paulwalker-arm left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

paulwalker-arm left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mikaelholmen commented Oct 3, 2023 • edited

JolantaJensen commented Oct 3, 2023

llvmbot commented Sep 18, 2023 •

edited

JolantaJensen Sep 19, 2023 •

edited

mikaelholmen commented Oct 3, 2023 •

edited