Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Arm64: Add S.P.CoreLib instrinsics for AddAcross #27663

Merged

Conversation

@TamarChristinaArm
Copy link
Contributor

TamarChristinaArm commented Nov 4, 2019

This implements the AddAcross intrinsics for Arm64.

AddAcross is a sum reduction operation.

See dotnet/corefx#26581 and is split of #26769
CoreFX reference assembly update at dotnet/corefx#42359

/CC @tannergooding @CarolEidt @echesakovMSFT

Thanks,
Tamar

@tannergooding

This comment has been minimized.

Copy link
Member

tannergooding commented Nov 4, 2019

There are some merge conflicts that need to be resolved here.

@TamarChristinaArm TamarChristinaArm force-pushed the TamarChristinaArm:implement-arm64-addacross branch from ed00c5a to 1bec747 Nov 4, 2019
@TamarChristinaArm

This comment has been minimized.

Copy link
Contributor Author

TamarChristinaArm commented Nov 4, 2019

Rebased :)

@TamarChristinaArm TamarChristinaArm force-pushed the TamarChristinaArm:implement-arm64-addacross branch 2 times, most recently from 6a6c1c8 to d9056bb Nov 5, 2019
Copy link
Member

echesakovMSFT left a comment

@TamarChristinaArm Is there a floating-point version of the intrinsic?

@maryamariyan

This comment has been minimized.

Copy link
Member

maryamariyan commented Nov 6, 2019

Thank you for your contribution. As announced in #27549 this repository will be moving to dotnet/runtime on November 13. If you would like to continue working on this PR after this date, the easiest way to move the change to dotnet/runtime is:

  1. In your coreclr repository clone, create patch by running git format-patch origin
  2. In your runtime repository clone, apply the patch by running git apply --directory src/coreclr <path to the patch created in step 1>
@TamarChristinaArm TamarChristinaArm force-pushed the TamarChristinaArm:implement-arm64-addacross branch from d9056bb to a639bd6 Nov 7, 2019
@TamarChristinaArm

This comment has been minimized.

Copy link
Contributor Author

TamarChristinaArm commented Nov 7, 2019

@TamarChristinaArm Is there a floating-point version of the intrinsic?

Not for AdvSimd only SVE. In AdvSimd you have FADDP that'll do pairwise addition, which will work but will do extra additions and you need multiple of them for vectors of more than 2 numbers. So it's not equivalent and so is a different intrinsic.

@tannergooding

This comment has been minimized.

Copy link
Member

tannergooding commented Nov 7, 2019

@TamarChristinaArm, this does a sum of all elements where-as FADDP is a pairwise addition (equivalent to Horizontal Add on x86) and also has integer equivalents, correct?

/// int32_t vaddvq_s32(int32x4_t a)
/// A64: ADDV Sd, Vn.4S
/// </summary>
public static int AddAcross(Vector128<int> value) { throw new PlatformNotSupportedException(); }

This comment has been minimized.

Copy link
@tannergooding

tannergooding Nov 7, 2019

Member

There is no Vector64<int> overload because it is equivalent to a AddPairwise, correct?

This comment has been minimized.

Copy link
@tannergooding

tannergooding Nov 7, 2019

Member

(so the manual marks size = 10, Q = 0 and size = 11, Q = 1 as reserved; with size = 11, Q = 0 as reserved because it would be a nop).

This comment has been minimized.

Copy link
@TamarChristinaArm

TamarChristinaArm Nov 7, 2019

Author Contributor

Yeah, vector64<int> would be an ADDP, but I couldn't see anyway to encode that in the intrinsics table in hwintrinsiclistarm64.h as it assumes you have the same instruction for all sizes of the same type unless I'm mistaken? It doesn't seem to allow overloading on the maximum vector size if the name stays the same.

This comment has been minimized.

Copy link
@tannergooding

tannergooding Nov 7, 2019

Member

It doesn't seem to allow overloading on the maximum vector size if the name stays the same.

Right, you would need to have special-handling in codegen to ensure the right instruction was selected.

Given that the underlying instruction encoding on ADDV is blocked and users have the same functionality via the explicit AddPairwise, I think not exposing it is the right decision.

/// uint32_t vaddvq_u32(uint32x4_t a)
/// A64: ADDV Sd, Vn.4S
/// </summary>
public static uint AddAcross(Vector128<uint> value) => AddAcross(value);

This comment has been minimized.

Copy link
@tannergooding

tannergooding Nov 7, 2019

Member

What is the behavior for overflow, if any?

This comment has been minimized.

Copy link
@TamarChristinaArm

TamarChristinaArm Nov 7, 2019

Author Contributor

There is none, in this case you wouldn't know about it, and this instruction doesn't have a saturated version either.

@TamarChristinaArm

This comment has been minimized.

Copy link
Contributor Author

TamarChristinaArm commented Nov 7, 2019

@TamarChristinaArm, this does a sum of all elements where-as FADDP is a pairwise addition (equivalent to Horizontal Add on x86) and also has integer equivalents, correct?

Yup that's correct.

@tannergooding

This comment has been minimized.

Copy link
Member

tannergooding commented Nov 7, 2019

This can be merged after @echesakovMSFT or @CarolEidt also sign-off.

Copy link
Member

echesakovMSFT left a comment

LGTM. Thank you!

@echesakovMSFT echesakovMSFT merged commit 12b9ed4 into dotnet:master Nov 7, 2019
50 checks passed
50 checks passed
WIP Ready for review
Details
coreclr-ci Build #20191107.4 succeeded
Details
coreclr-ci (Build Linux arm checked) Build Linux arm checked succeeded
Details
coreclr-ci (Build Linux arm64 checked) Build Linux arm64 checked succeeded
Details
coreclr-ci (Build Linux arm64 release) Build Linux arm64 release succeeded
Details
coreclr-ci (Build Linux x64 checked) Build Linux x64 checked succeeded
Details
coreclr-ci (Build Linux_musl x64 checked) Build Linux_musl x64 checked succeeded
Details
coreclr-ci (Build Linux_musl x64 release) Build Linux_musl x64 release succeeded
Details
coreclr-ci (Build Linux_rhel6 x64 release) Build Linux_rhel6 x64 release succeeded
Details
coreclr-ci (Build OSX x64 checked) Build OSX x64 checked succeeded
Details
coreclr-ci (Build OSX x64 release) Build OSX x64 release succeeded
Details
coreclr-ci (Build Test Pri0 Linux arm checked) Build Test Pri0 Linux arm checked succeeded
Details
coreclr-ci (Build Test Pri0 Linux arm64 checked) Build Test Pri0 Linux arm64 checked succeeded
Details
coreclr-ci (Build Test Pri0 OSX x64 checked) Build Test Pri0 OSX x64 checked succeeded
Details
coreclr-ci (Build Test Pri0 OSX x64 release) Build Test Pri0 OSX x64 release succeeded
Details
coreclr-ci (Build Test Pri0 Windows_NT arm checked) Build Test Pri0 Windows_NT arm checked succeeded
Details
coreclr-ci (Build Test Pri0 Windows_NT arm64 checked) Build Test Pri0 Windows_NT arm64 checked succeeded
Details
coreclr-ci (Build Test Pri0 Windows_NT x64 checked) Build Test Pri0 Windows_NT x64 checked succeeded
Details
coreclr-ci (Build Test Pri0 Windows_NT x86 checked) Build Test Pri0 Windows_NT x86 checked succeeded
Details
coreclr-ci (Build Windows_NT arm checked) Build Windows_NT arm checked succeeded
Details
coreclr-ci (Build Windows_NT arm release) Build Windows_NT arm release succeeded
Details
coreclr-ci (Build Windows_NT arm64 checked) Build Windows_NT arm64 checked succeeded
Details
coreclr-ci (Build Windows_NT arm64 release) Build Windows_NT arm64 release succeeded
Details
coreclr-ci (Build Windows_NT x64 checked) Build Windows_NT x64 checked succeeded
Details
coreclr-ci (Build Windows_NT x64 debug) Build Windows_NT x64 debug succeeded
Details
coreclr-ci (Build Windows_NT x64 release) Build Windows_NT x64 release succeeded
Details
coreclr-ci (Build Windows_NT x86 checked) Build Windows_NT x86 checked succeeded
Details
coreclr-ci (Build Windows_NT x86 debug) Build Windows_NT x86 debug succeeded
Details
coreclr-ci (Checkout (Unix)) Checkout (Unix) succeeded
Details
coreclr-ci (Checkout (Windows)) Checkout (Windows) succeeded
Details
coreclr-ci (Formatting Linux x64) Formatting Linux x64 succeeded
Details
coreclr-ci (Formatting Windows_NT x64) Formatting Windows_NT x64 succeeded
Details
coreclr-ci (Run Test Pri0 Linux arm checked) Run Test Pri0 Linux arm checked succeeded
Details
coreclr-ci (Run Test Pri0 Linux arm64 checked) Run Test Pri0 Linux arm64 checked succeeded
Details
coreclr-ci (Run Test Pri0 Linux x64 checked) Run Test Pri0 Linux x64 checked succeeded
Details
coreclr-ci (Run Test Pri0 Linux_musl x64 checked) Run Test Pri0 Linux_musl x64 checked succeeded
Details
coreclr-ci (Run Test Pri0 Linux_musl x64 release) Run Test Pri0 Linux_musl x64 release succeeded
Details
coreclr-ci (Run Test Pri0 OSX x64 checked) Run Test Pri0 OSX x64 checked succeeded
Details
coreclr-ci (Run Test Pri0 Windows_NT arm checked) Run Test Pri0 Windows_NT arm checked succeeded
Details
coreclr-ci (Run Test Pri0 Windows_NT arm64 checked) Run Test Pri0 Windows_NT arm64 checked succeeded
Details
coreclr-ci (Run Test Pri0 Windows_NT x64 checked) Run Test Pri0 Windows_NT x64 checked succeeded
Details
coreclr-ci (Run Test Pri0 Windows_NT x86 checked) Run Test Pri0 Windows_NT x86 checked succeeded
Details
coreclr-ci (Run Test Pri0 CoreFX Linux x64 checked) Run Test Pri0 CoreFX Linux x64 checked succeeded
Details
coreclr-ci (Run Test Pri0 CoreFX Windows_NT x64 checked) Run Test Pri0 CoreFX Windows_NT x64 checked succeeded
Details
coreclr-ci (Run Test Pri0 R2R Linux x64 checked) Run Test Pri0 R2R Linux x64 checked succeeded
Details
coreclr-ci (Run Test Pri0 R2R OSX x64 checked) Run Test Pri0 R2R OSX x64 checked succeeded
Details
coreclr-ci (Run Test Pri0 R2R Windows_NT x64 checked) Run Test Pri0 R2R Windows_NT x64 checked succeeded
Details
coreclr-ci (Run Test Pri0 R2R Windows_NT x86 checked) Run Test Pri0 R2R Windows_NT x86 checked succeeded
Details
coreclr-ci (Test crossgen-comparison Linux arm checked) Test crossgen-comparison Linux arm checked succeeded
Details
license/cla All CLA requirements met.
Details
Dotnet-GitSync-Bot pushed a commit to Dotnet-GitSync-Bot/corefx that referenced this pull request Nov 7, 2019
Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
Dotnet-GitSync-Bot pushed a commit to Dotnet-GitSync-Bot/mono that referenced this pull request Nov 7, 2019
Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
Anipik added a commit to dotnet/corefx that referenced this pull request Nov 7, 2019
Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
marek-safar added a commit to mono/mono that referenced this pull request Nov 7, 2019
Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
Dotnet-GitSync-Bot pushed a commit to Dotnet-GitSync-Bot/corert that referenced this pull request Nov 8, 2019
Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
@TamarChristinaArm TamarChristinaArm deleted the TamarChristinaArm:implement-arm64-addacross branch Nov 8, 2019
jkotas added a commit to dotnet/corert that referenced this pull request Nov 8, 2019
Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.