Vectorize Atan for Vector64/128/256/512 and TensorPrimitives #126422
stephentoub wants to merge 1 commit into dotnet:main
Add vectorized Atan implementations for float and double across all SIMD vector types.
- AtanDouble: AMD atan.c Remez(4,4) rational polynomial with 5-region argument reduction
- AtanSingle: widens to double, uses AMD atanf.c Remez(2,2) rational polynomial
- Hook up TensorPrimitives.Atan to use vectorized implementations

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
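For readers unfamiliar with the 5-region argument reduction named in the description, here is a minimal scalar sketch. It assumes the region boundaries 7/16, 11/16, 19/16 and 39/16 (the constants that appear in the review excerpt) and the anchor points 0.5, 1 and 1.5 that are conventionally paired with them; the PR's actual constants and polynomial are not reproduced. `AtanReduced` is a hypothetical name, and `Math.Atan` stands in for the Remez rational polynomial that would evaluate the small reduced argument.

```csharp
using System;

// Compare the reduction-based result with the runtime's scalar atan.
foreach (double x in new[] { -3.0, -0.5, 0.0, 0.4375, 1.0, 2.4375, 100.0 })
{
    Console.WriteLine($"{x}: reduced={AtanReduced(x)}, Math.Atan={Math.Atan(x)}");
}

// Hypothetical helper: scalar illustration of a 5-region reduction.
// Uses atan(x) = atan(a) + atan((x - a) / (1 + a * x)) for an anchor a,
// and atan(x) = pi/2 + atan(-1 / x) for large x.
static double AtanReduced(double x)
{
    double ax = Math.Abs(x);
    double c; // atan of the region's anchor point
    double r; // reduced argument, small in magnitude

    if (ax <= 7.0 / 16) { c = 0.0; r = ax; }
    else if (ax <= 11.0 / 16) { c = Math.Atan(0.5); r = (2 * ax - 1) / (2 + ax); }
    else if (ax <= 19.0 / 16) { c = Math.PI / 4; r = (ax - 1) / (1 + ax); }
    else if (ax <= 39.0 / 16) { c = Math.Atan(1.5); r = (ax - 1.5) / (1 + 1.5 * ax); }
    else { c = Math.PI / 2; r = -1 / ax; } // NaN also lands here and propagates

    // Math.Atan is a stand-in for the polynomial applied to the reduced argument.
    double result = c + Math.Atan(r);

    // atan is odd, so restore the sign of the original input (keeps -0.0 as -0.0).
    return Math.CopySign(result, x);
}
```

A vectorized version would compute all five candidates per lane and pick one with comparison masks instead of branching, which is why the choice of comparison operator at the boundaries matters.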
Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
Pull request overview
Adds SIMD-accelerated Atan support to the System.Runtime.Intrinsics vector APIs and wires it up for use by TensorPrimitives (float/double), improving throughput for element-wise atan operations across supported vector widths.
Changes:
- Introduces `Vector64/128/256/512.Atan` overloads for `float` and `double`, backed by new `VectorMath` implementations.
- Adds new vectorized `atan` polynomial/range-reduction implementations in `VectorMath` for `double` and `float` (via widening to double).
- Enables `TensorPrimitives.Atan<T>` vectorization for `float`/`double` on `NET11_0_OR_GREATER` and updates tensor tests to use explicit tolerances for `Atan`/`AtanPi`.
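As a rough illustration of the caller-facing surface these changes affect, the sketch below calls the existing `TensorPrimitives.Atan<T>` span API from the System.Numerics.Tensors package. The input values are arbitrary; whether a given call takes the vectorized path depends on the target framework and hardware, which this snippet does not detect.

```csharp
using System;
using System.Numerics.Tensors;

float[] input = { -1f, 0f, 0.5f, 1f, float.PositiveInfinity };
float[] output = new float[input.Length];

// Element-wise arc tangent over the whole span; results are in radians.
TensorPrimitives.Atan<float>(input, output);

for (int i = 0; i < input.Length; i++)
{
    Console.WriteLine($"atan({input[i]}) = {output[i]}");
}
```

The call site is the same whether or not the vectorized path is used, so callers should see only a throughput difference and, possibly, small last-digit differences in results.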
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/libraries/System.Runtime.Intrinsics/ref/System.Runtime.Intrinsics.cs | Adds new public ref API surface for Vector*.Atan (float/double). |
| src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/VectorMath.cs | Adds core vector math implementations: AtanDouble and AtanSingle (widen-to-double). |
| src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector64.cs | Adds Vector64.Atan public APIs + scalar fallback helper. |
| src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector128.cs | Adds Vector128.Atan public APIs delegating to VectorMath or to Vector64 halves. |
| src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector256.cs | Adds Vector256.Atan public APIs delegating to VectorMath or to Vector128 halves. |
| src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector512.cs | Adds Vector512.Atan public APIs delegating to VectorMath or to Vector256 halves. |
| src/libraries/System.Numerics.Tensors/tests/TensorPrimitives.Generic.cs | Updates tensor test tolerances for Atan / AtanPi. |
| src/libraries/System.Numerics.Tensors/src/System/Numerics/Tensors/netcore/TensorPrimitives.Atan.cs | Enables vectorized TensorPrimitives.Atan for float/double (NET11+), mapping to Vector*.Atan. |

    TVectorDouble r1 = TVectorDouble.LessThan(ax, TVectorDouble.Create(R7_16));
    TVectorDouble r2 = TVectorDouble.LessThan(ax, TVectorDouble.Create(R11_16));
    TVectorDouble r3 = TVectorDouble.LessThan(ax, TVectorDouble.Create(R19_16));
    TVectorDouble r4 = TVectorDouble.LessThan(ax, TVectorDouble.Create(R39_16));

In AtanSingleCoreDouble, the region boundary masks use LessThan for R7_16/R11_16/R19_16/R39_16. This misclassifies inputs exactly equal to the boundary values (e.g., x == 39/16 falls into the pi/2 - atan(1/x) region), which contradicts the documented ranges (<= upper bound) and can introduce discontinuities. Use LessThanOrEqual for these comparisons (or otherwise adjust the selection logic) so boundary values map to the intended region.
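To make the boundary concern concrete, the following sketch uses the public `Vector128` API rather than the PR's internal generic vector abstraction. The region tags 4 and 5 are hypothetical labels chosen only to show which branch each lane would select; they are not values from the PR.

```csharp
using System;
using System.Runtime.Intrinsics;

// 39/16 = 2.4375 is exactly representable in binary floating point,
// so an input can hit the boundary exactly.
const double R39_16 = 39.0 / 16;

// Lane 0 sits exactly on the boundary; lane 1 is above it.
Vector128<double> ax = Vector128.Create(R39_16, 2.5);
Vector128<double> bound = Vector128.Create(R39_16);

// Each mask lane is all-bits-set when the comparison holds for that lane.
Vector128<double> strict = Vector128.LessThan(ax, bound);
Vector128<double> inclusive = Vector128.LessThanOrEqual(ax, bound);

// Hypothetical tags: 4 = region with upper bound 39/16, 5 = the pi/2 - atan(1/x) region.
Vector128<double> region4 = Vector128.Create(4.0);
Vector128<double> region5 = Vector128.Create(5.0);

Console.WriteLine($"strict:    {Vector128.ConditionalSelect(strict, region4, region5)}");
Console.WriteLine($"inclusive: {Vector128.ConditionalSelect(inclusive, region4, region5)}");
```

With the strict comparison, the lane holding exactly 39/16 selects region 5; with the inclusive comparison it selects region 4, which matches a documented range of the form "<= upper bound". The lane holding 2.5 selects region 5 in both cases.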

    /// <summary>Computes the arc tangent of each element in a vector.</summary>
    /// <param name="vector">The vector whose arc tangent is to be computed.</param>
    /// <returns>A vector whose elements are the arc tangent of the corresponding elements in <paramref name="vector" />.</returns>
    /// <remarks>The angles are returned in radians.</remarks>
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static Vector64<double> Atan(Vector64<double> vector)
    {
        if (IsHardwareAccelerated)
        {
            return VectorMath.AtanDouble<Vector64<double>>(vector);
        }

New Vector*.Atan APIs don't appear to have direct coverage in System.Runtime.Intrinsics tests (there are tests for Vector*.Asin, but no Atan tests). Please add analogous test cases in src/libraries/System.Runtime.Intrinsics/tests/Vectors/Vector{64,128,256,512}Tests.cs to validate accuracy/variance across representative inputs (including NaN, ±Infinity, and signed zero).
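A minimal sketch of what such coverage could check is shown below. It is a standalone console program rather than a test in the repository's own test framework, and everything named in it is an assumption: `AtanUnderTest` is a hypothetical stand-in that calls scalar `Math.Atan` per lane (to be swapped for the new `Vector128.Atan` once it is available), and the tolerance is an illustrative value, not the one the PR uses.

```csharp
using System;
using System.Runtime.Intrinsics;

// Hypothetical stand-in for the API under test; replace the body with the
// vectorized Atan call when building against a runtime that includes it.
static Vector128<double> AtanUnderTest(Vector128<double> v) =>
    Vector128.Create(Math.Atan(v.GetElement(0)), Math.Atan(v.GetElement(1)));

// Special values first, then the region boundaries and a few ordinary inputs.
double[] inputs =
{
    double.NaN, double.PositiveInfinity, double.NegativeInfinity, 0.0,
    -0.0, 7.0 / 16, 11.0 / 16, 19.0 / 16,
    39.0 / 16, 1.0, -1.0, 1e10,
};

const double Tolerance = 1e-15; // illustrative value only

for (int i = 0; i < inputs.Length; i += 2)
{
    Vector128<double> actual = AtanUnderTest(Vector128.Create(inputs[i], inputs[i + 1]));

    for (int lane = 0; lane < 2; lane++)
    {
        double x = inputs[i + lane];
        double expected = Math.Atan(x);
        double got = actual.GetElement(lane);

        // NaN must map to NaN; otherwise require closeness and a matching sign bit
        // so that atan(-0.0) is checked to be -0.0 rather than +0.0.
        bool ok = double.IsNaN(expected)
            ? double.IsNaN(got)
            : Math.Abs(expected - got) <= Tolerance
              && double.IsNegative(expected) == double.IsNegative(got);

        Console.WriteLine($"{x}: {(ok ? "ok" : "MISMATCH")}");
    }
}
```

Comparing the sign bit separately matters because `-0.0 == 0.0` is true in IEEE 754 arithmetic, so a plain difference check would not catch a lost negative zero.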