Vectorize Atan for Vector64/128/256/512 and TensorPrimitives #126422

Closed
stephentoub wants to merge 1 commit into dotnet:main from stephentoub:copilot/vectorize-atan

@stephentoub
Member

Add vectorized Atan implementations for float and double across all SIMD vector types.

- AtanDouble: AMD atan.c Remez(4,4) rational polynomial with 5-region argument reduction
- AtanSingle: widens to double, uses AMD atanf.c Remez(2,2) rational polynomial
- Hook up TensorPrimitives.Atan to use vectorized implementations
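
For readers unfamiliar with the 5-region reduction named above, here is a minimal scalar sketch of the idea. It is not the code from this PR: `AtanReductionSketch` is a hypothetical name, `Math.Atan` stands in for the Remez rational polynomial that would evaluate the reduced argument, and the inclusive upper bounds are an assumption. It relies only on the identity atan(x) = atan(c) + atan((x - c) / (1 + c*x)) for positive x and c.

```csharp
using System;

public static class AtanReductionSketch
{
    // Scalar illustration of a five-region argument reduction for atan.
    // Assumption: Math.Atan stands in for the rational polynomial that a
    // real kernel would apply to the small reduced argument t.
    public static double Atan(double x)
    {
        if (double.IsNaN(x))
        {
            return x;
        }

        double ax = Math.Abs(x);
        double c; // atan of the region's centre point
        double t; // reduced argument, small in magnitude

        if (ax <= 7.0 / 16.0)
        {
            c = 0.0;
            t = ax;
        }
        else if (ax <= 11.0 / 16.0)
        {
            c = Math.Atan(0.5);
            t = (2.0 * ax - 1.0) / (2.0 + ax);   // (ax - 0.5) / (1 + 0.5 * ax)
        }
        else if (ax <= 19.0 / 16.0)
        {
            c = Math.PI / 4;                     // atan(1)
            t = (ax - 1.0) / (1.0 + ax);
        }
        else if (ax <= 39.0 / 16.0)
        {
            c = Math.Atan(1.5);
            t = (ax - 1.5) / (1.0 + 1.5 * ax);
        }
        else
        {
            c = Math.PI / 2;                     // limit of atan at +infinity
            t = -1.0 / ax;
        }

        // atan is odd, so restore the sign of the input at the end.
        return Math.CopySign(c + Math.Atan(t), x);
    }
}
```

A vectorized version would compute all five candidates per lane and pick one with comparison masks instead of branching.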

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 1, 2026 18:12
@stephentoub stephentoub closed this Apr 1, 2026
@stephentoub stephentoub deleted the copilot/vectorize-atan branch April 1, 2026 18:12
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
See info in area-owners.md if you want to be subscribed.

Contributor

Copilot AI left a comment

Pull request overview

Adds SIMD-accelerated Atan support to the System.Runtime.Intrinsics vector APIs and wires it up for use by TensorPrimitives (float/double), improving throughput for element-wise atan operations across supported vector widths.

Changes:

  • Introduces Vector64/128/256/512.Atan overloads for float and double, backed by new VectorMath implementations.
  • Adds new vectorized atan polynomial/range-reduction implementations in VectorMath for double and float (via widening to double).
  • Enables TensorPrimitives.Atan<T> vectorization for float/double on NET11_0_OR_GREATER and updates tensor tests to use explicit tolerances for Atan / AtanPi.
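
As a rough illustration of the "float via widening to double" approach listed above, the sketch below widens a `Vector128<float>` into two `Vector128<double>` halves, evaluates atan in double precision, and narrows back. `AtanSingleSketch` and `AtanDoubleStandIn` are hypothetical names, and the stand-in calls `Math.Atan` per lane rather than the PR's polynomial kernel.

```csharp
using System;
using System.Runtime.Intrinsics;

public static class AtanSingleSketch
{
    // Hypothetical stand-in for a double-precision vector kernel:
    // applies Math.Atan to each of the two lanes.
    private static Vector128<double> AtanDoubleStandIn(Vector128<double> v) =>
        Vector128.Create(Math.Atan(v.GetElement(0)), Math.Atan(v.GetElement(1)));

    public static Vector128<float> Atan(Vector128<float> v)
    {
        // Widen the lower and upper pairs of floats to double precision.
        Vector128<double> lower = Vector128.WidenLower(v);
        Vector128<double> upper = Vector128.WidenUpper(v);

        // Evaluate in double, then narrow both halves back into one float vector.
        return Vector128.Narrow(AtanDoubleStandIn(lower), AtanDoubleStandIn(upper));
    }
}
```

Working in double gives the float result extra headroom, at the cost of processing half as many lanes per operation.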

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/libraries/System.Runtime.Intrinsics/ref/System.Runtime.Intrinsics.cs | Adds new public ref API surface for Vector*.Atan (float/double). |
| src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/VectorMath.cs | Adds core vector math implementations: AtanDouble and AtanSingle (widen-to-double). |
| src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector64.cs | Adds Vector64.Atan public APIs + scalar fallback helper. |
| src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector128.cs | Adds Vector128.Atan public APIs delegating to VectorMath or to Vector64 halves. |
| src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector256.cs | Adds Vector256.Atan public APIs delegating to VectorMath or to Vector128 halves. |
| src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector512.cs | Adds Vector512.Atan public APIs delegating to VectorMath or to Vector256 halves. |
| src/libraries/System.Numerics.Tensors/tests/TensorPrimitives.Generic.cs | Updates tensor test tolerances for Atan / AtanPi. |
| src/libraries/System.Numerics.Tensors/src/System/Numerics/Tensors/netcore/TensorPrimitives.Atan.cs | Enables vectorized TensorPrimitives.Atan for float/double (NET11+), mapping to Vector*.Atan. |
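
The "delegating to halves" pattern mentioned for Vector128/256/512 can be pictured roughly as follows. This is a sketch under assumptions, not the PR's code: `AtanHalvesSketch` is a hypothetical class, and the `Vector64<double>` case simply calls `Math.Atan` on its single element as a scalar fallback.

```csharp
using System;
using System.Runtime.Intrinsics;

public static class AtanHalvesSketch
{
    // Smallest width: Vector64<double> holds one element, so fall back to scalar.
    public static Vector64<double> Atan(Vector64<double> v) =>
        Vector64.Create(Math.Atan(v.ToScalar()));

    // Each wider type splits into two halves, processes them, and recombines.
    public static Vector128<double> Atan(Vector128<double> v) =>
        Vector128.Create(Atan(v.GetLower()), Atan(v.GetUpper()));

    public static Vector256<double> Atan(Vector256<double> v) =>
        Vector256.Create(Atan(v.GetLower()), Atan(v.GetUpper()));
}
```

In the described change, this path would only be taken when the wider type is not hardware accelerated; otherwise the call goes straight to the shared VectorMath implementation.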

Comment on lines +3416 to +3419
TVectorDouble r1 = TVectorDouble.LessThan(ax, TVectorDouble.Create(R7_16));
TVectorDouble r2 = TVectorDouble.LessThan(ax, TVectorDouble.Create(R11_16));
TVectorDouble r3 = TVectorDouble.LessThan(ax, TVectorDouble.Create(R19_16));
TVectorDouble r4 = TVectorDouble.LessThan(ax, TVectorDouble.Create(R39_16));

Copilot AI Apr 1, 2026

In AtanSingleCoreDouble, the region boundary masks use LessThan for R7_16/R11_16/R19_16/R39_16. This misclassifies inputs exactly equal to the boundary values (e.g., x == 39/16 falls into the pi/2 - atan(1/x) region), which contradicts the documented ranges (<= upper bound) and can introduce discontinuities. Use LessThanOrEqual for these comparisons (or otherwise adjust the selection logic) so boundary values map to the intended region.
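
A small sketch of the boundary behaviour described above, using the existing `Vector128.LessThan` and `Vector128.LessThanOrEqual` APIs. `BoundaryMaskSketch` and `InRegion` are hypothetical names, and `R39_16` is assumed to equal 39/16 as the comment implies.

```csharp
using System;
using System.Runtime.Intrinsics;

public static class BoundaryMaskSketch
{
    // Assumed value of the R39_16 constant: 39/16 = 2.4375, exactly representable.
    public const double R39_16 = 39.0 / 16.0;

    // Returns true when lane 0 of the comparison mask is fully set,
    // i.e. when ax would be selected into the region bounded above by R39_16.
    public static bool InRegion(double ax, bool inclusive)
    {
        Vector128<double> value = Vector128.Create(ax);
        Vector128<double> bound = Vector128.Create(R39_16);

        Vector128<double> mask = inclusive
            ? Vector128.LessThanOrEqual(value, bound)
            : Vector128.LessThan(value, bound);

        // A true lane is all bits set; a false lane is all bits clear.
        return mask.AsUInt64().GetElement(0) == ulong.MaxValue;
    }
}
```

With `inclusive: false`, an input of exactly 39/16 is not selected and falls through to the next region; with `inclusive: true` it stays in the bounded region.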

Copilot uses AI. Check for mistakes.
Comment on lines +822 to +832
/// <summary>Computes the arc tangent of each element in a vector.</summary>
/// <param name="vector">The vector whose arc tangent is to be computed.</param>
/// <returns>A vector whose elements are the arc tangent of the corresponding elements in <paramref name="vector" />.</returns>
/// <remarks>The angles are returned in radians.</remarks>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static Vector64<double> Atan(Vector64<double> vector)
{
    if (IsHardwareAccelerated)
    {
        return VectorMath.AtanDouble<Vector64<double>>(vector);
    }

Copilot AI Apr 1, 2026

New Vector*.Atan APIs don't appear to have direct coverage in System.Runtime.Intrinsics tests (there are tests for Vector*.Asin, but no Atan tests). Please add analogous test cases in src/libraries/System.Runtime.Intrinsics/tests/Vectors/Vector{64,128,256,512}Tests.cs to validate accuracy/variance across representative inputs (including NaN, ±Infinity, and signed zero).
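
A minimal sketch of the kind of special-value checks requested above. It is deliberately framework-free: the actual tests would presumably follow the xunit conventions already used for Vector*.Asin, and `AtanSpecialValueChecks`, `ReferenceAtan`, and `Check` are hypothetical names. The implementation under test is passed in as a delegate; here a per-lane `Math.Atan` reference stands in for it.

```csharp
using System;
using System.Runtime.Intrinsics;

public static class AtanSpecialValueChecks
{
    // Scalar reference used here as a stand-in for the implementation under test.
    public static Vector128<double> ReferenceAtan(Vector128<double> v) =>
        Vector128.Create(Math.Atan(v.GetElement(0)), Math.Atan(v.GetElement(1)));

    // Returns true when the supplied atan handles NaN, signed zero, and
    // both infinities the way a scalar atan is expected to.
    public static bool Check(Func<Vector128<double>, Vector128<double>> atan)
    {
        Vector128<double> a = atan(Vector128.Create(double.NaN, -0.0));
        Vector128<double> b = atan(Vector128.Create(double.PositiveInfinity, double.NegativeInfinity));

        double negZero = a.GetElement(1);

        return double.IsNaN(a.GetElement(0))                        // NaN propagates
            && negZero == 0.0 && double.IsNegative(negZero)         // sign of zero is kept
            && Math.Abs(b.GetElement(0) - Math.PI / 2) <= 1e-15     // atan(+inf) is pi/2
            && Math.Abs(b.GetElement(1) + Math.PI / 2) <= 1e-15;    // atan(-inf) is -pi/2
    }
}
```

Calling `AtanSpecialValueChecks.Check(AtanSpecialValueChecks.ReferenceAtan)` exercises the checks against the scalar reference; a real test would pass the new vector API instead and add accuracy comparisons over representative finite inputs.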
