Review the multi-op instruction usage for Arm64 #68028

Open
15 of 28 tasks
Tracked by #94464
tannergooding opened this issue Apr 14, 2022 · 12 comments
Labels
arch-arm64 area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI Priority:3 Work that is nice to have

@tannergooding
Member

tannergooding commented Apr 14, 2022

After seeing the msub PR (#66621), which folds mul, sub into a single msub, I went and looked in the Arm64 manual for other interesting "combined operation" instructions.

There is a set of (shifted register) instructions which can combine a shift, op (the variants ending with s also set flags):

  • add, adds - Add
  • sub, subs - Subtract
  • cmp - Compare #84605
  • cmn - Compare Negative #84667
  • neg, negs - Negate #84667
  • and, ands - Bitwise AND
  • bic, bics - Bitwise bit clear (@TIHan)
  • eon - Bitwise exclusive OR NOT (@TIHan)
  • eor - Bitwise exclusive OR
  • orr - Bitwise inclusive OR
  • mvn - Bitwise NOT (@TIHan)
  • orn - Bitwise inclusive OR NOT (@TIHan)
  • tst - Test Bits (@TIHan)
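
To make the shifted register forms concrete, here is a rough C sketch of the kind of source patterns involved. The function names are made up for illustration, and the instruction named in each comment is only the form a compiler could pick; what actually gets emitted depends on the compiler or JIT.

```c
#include <assert.h>
#include <stdint.h>

/* a + (b << 3): candidate for `add x0, x0, x1, lsl #3` */
uint64_t add_shifted(uint64_t a, uint64_t b) { return a + (b << 3); }

/* a - (b >> 2), logical shift: candidate for `sub x0, x0, x1, lsr #2` */
uint64_t sub_shifted(uint64_t a, uint64_t b) { return a - (b >> 2); }

/* a & ~(b << 4): candidate for `bic x0, x0, x1, lsl #4` */
uint64_t bic_shifted(uint64_t a, uint64_t b) { return a & ~(b << 4); }

/* a ^ ~(b << 1): candidate for `eon x0, x0, x1, lsl #1` */
uint64_t eon_shifted(uint64_t a, uint64_t b) { return a ^ ~(b << 1); }

/* (a & (b << 8)) != 0: candidate for `tst x0, x1, lsl #8` plus a flag read */
int tst_shifted(uint64_t a, uint64_t b) { return (a & (b << 8)) != 0; }

/* Usage: expected values worked out by hand. */
void check_shifted_forms(void) {
    assert(add_shifted(1, 2) == 17);        /* 1 + (2 << 3) */
    assert(bic_shifted(0xFF, 0x3) == 0xCF); /* 0xFF & ~0x30 */
    assert(tst_shifted(0x100, 1) == 1);     /* bit 8 is set in both */
}
```

Without the fold, each of these would typically take a separate shift followed by the op.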

There is a set of (extended register) instructions which can combine a zero-extend, op or sign-extend, op:

  • add, adds - Add
  • sub, subs - Subtract
  • cmp - Compare
  • cmn - Compare Negative
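
As a sketch of what the extended register forms cover, the C below mixes a narrow operand into a 64-bit op. The names are hypothetical, and the instructions in the comments are the forms a compiler might choose, not a statement about what the JIT emits today.

```c
#include <assert.h>
#include <stdint.h>

/* base + sign-extended 32-bit value: candidate for `add x0, x0, w1, sxtw` */
int64_t add_sxtw(int64_t base, int32_t idx) { return base + (int64_t)idx; }

/* base + zero-extended 32-bit value: candidate for `add x0, x0, w1, uxtw` */
uint64_t add_uxtw(uint64_t base, uint32_t idx) { return base + (uint64_t)idx; }

/* base - zero-extended byte: candidate for `sub x0, x0, w1, uxtb` */
uint64_t sub_uxtb(uint64_t base, uint8_t b) { return base - (uint64_t)b; }

/* 64-bit compare against a sign-extended 32-bit value:
   candidate for `cmp x0, w1, sxtw` plus a flag read */
int eq_sxtw(int64_t a, int32_t b) { return a == (int64_t)b; }

/* Usage: expected values worked out by hand. */
void check_extended_forms(void) {
    assert(add_sxtw(100, -1) == 99);
    assert(sub_uxtb(300, 255) == 45);
    assert(eq_sxtw(-1, -1) == 1);
}
```

The typical place this shows up is indexing with a 32-bit index on a 64-bit base, where the extend would otherwise be its own instruction.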

There is a set of (carry) instructions which can utilize the carry from a previous operation:

  • adc, adcs - Add with carry
  • sbc, sbcs - Subtract with carry
  • ngc, ngcs - Negate with carry
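
The usual motivating case for the carry forms is arithmetic wider than a register. Below is a minimal C sketch of 128-bit add, subtract, and negate built from two 64-bit halves; the `u128` struct and function names are assumptions for illustration. A compiler can map the low/high pairs to `adds`/`adc`, `subs`/`sbc`, and `negs`/`ngc`, though nothing guarantees it will.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit value split into two 64-bit halves. */
typedef struct { uint64_t lo, hi; } u128;

u128 make_u128(uint64_t hi, uint64_t lo) { u128 r; r.lo = lo; r.hi = hi; return r; }

/* Low half sets the carry (adds), high half consumes it (adc). */
u128 add128(u128 a, u128 b) {
    u128 r;
    r.lo = a.lo + b.lo;
    uint64_t carry = r.lo < a.lo;      /* 1 if the low add wrapped */
    r.hi = a.hi + b.hi + carry;
    return r;
}

/* Low half sets the borrow (subs), high half consumes it (sbc). */
u128 sub128(u128 a, u128 b) {
    u128 r;
    uint64_t borrow = a.lo < b.lo;     /* 1 if the low subtract wraps */
    r.lo = a.lo - b.lo;
    r.hi = a.hi - b.hi - borrow;
    return r;
}

/* 0 - a: negs on the low half, ngc on the high half. */
u128 neg128(u128 a) { return sub128(make_u128(0, 0), a); }

/* Usage: expected values worked out by hand. */
void check_carry_forms(void) {
    u128 s = add128(make_u128(0, UINT64_MAX), make_u128(0, 1));
    assert(s.hi == 1 && s.lo == 0);
    u128 n = neg128(make_u128(0, 1));
    assert(n.hi == UINT64_MAX && n.lo == UINT64_MAX);
}
```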

There are the multiply integer instructions which can combine an op, mul:

  • madd - Multiply-add
  • msub - Multiply-subtract
  • mneg - Multiply-negate

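There are then some long multiply instructions which can return a product twice the size of the inputs (i32 * i32 = i64 or similar; effectively good for covering zero or sign extend, multiply):

  • smull, umull - Multiply long
  • smaddl, umaddl - Multiply-add long
  • smsubl, umsubl - Multiply-subtract long
  • smnegl, umnegl - Multiply-negate long
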
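The long multiply shapes come down to widening both 32-bit operands before a 64-bit multiply. A rough C sketch follows, with made-up function names; the comments name the instruction a compiler could select for each pattern, which is an assumption about codegen rather than a guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* sign-extend, multiply: candidate for `smull x0, w0, w1` */
int64_t mul_long_s(int32_t a, int32_t b) { return (int64_t)a * (int64_t)b; }

/* zero-extend, multiply: candidate for `umull x0, w0, w1` */
uint64_t mul_long_u(uint32_t a, uint32_t b) { return (uint64_t)a * (uint64_t)b; }

/* acc + widened product: candidate for `smaddl x0, w1, w2, x0` */
int64_t mul_add_long_s(int64_t acc, int32_t a, int32_t b) {
    return acc + (int64_t)a * (int64_t)b;
}

/* acc - widened product: candidate for `umsubl x0, w1, w2, x0` */
uint64_t mul_sub_long_u(uint64_t acc, uint32_t a, uint32_t b) {
    return acc - (uint64_t)a * (uint64_t)b;
}

/* negated widened product: candidate for `smnegl x0, w0, w1` */
int64_t mul_neg_long_s(int32_t a, int32_t b) { return -((int64_t)a * (int64_t)b); }

/* Usage: expected values worked out by hand. */
void check_long_multiply_forms(void) {
    assert(mul_long_s(INT32_MIN, 2) == -4294967296LL);
    assert(mul_long_u(0xFFFFFFFFu, 0xFFFFFFFFu) == 0xFFFFFFFE00000001ULL);
    assert(mul_add_long_s(10, -3, 4) == -2);
}
```
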
Finally, there are the "multiply high" instructions, which return just the upper bits of a wide multiply:

  • smulh, umulh - Multiply High
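
To pin down what "multiply high" means, here is a portable C reference for the upper 64 bits of a 64 x 64 multiply, built from 32-bit pieces. The function names are hypothetical, and this is only a sketch of the semantics: on Arm64 the whole unsigned routine is a single `umulh`, and the signed one a single `smulh`.

```c
#include <assert.h>
#include <stdint.h>

/* Upper 64 bits of the 128-bit product a * b (unsigned). */
uint64_t mul_high_u(uint64_t a, uint64_t b) {
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;
    uint64_t p0 = a_lo * b_lo;   /* contributes to bits 0..63   */
    uint64_t p1 = a_lo * b_hi;   /* contributes to bits 32..95  */
    uint64_t p2 = a_hi * b_lo;   /* contributes to bits 32..95  */
    uint64_t p3 = a_hi * b_hi;   /* contributes to bits 64..127 */
    /* Sum of the pieces landing in bits 32..63; cannot overflow 64 bits. */
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;
    return p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}

/* Upper 64 bits of the signed product, derived from the unsigned one.
   The final cast assumes two's complement wraparound on conversion. */
int64_t mul_high_s(int64_t a, int64_t b) {
    uint64_t hi = mul_high_u((uint64_t)a, (uint64_t)b);
    if (a < 0) hi -= (uint64_t)b;
    if (b < 0) hi -= (uint64_t)a;
    return (int64_t)hi;
}

/* Usage: expected values worked out by hand. */
void check_multiply_high(void) {
    assert(mul_high_u(UINT64_MAX, UINT64_MAX) == 0xFFFFFFFFFFFFFFFEULL);
    assert(mul_high_u(1ULL << 32, 1ULL << 32) == 1);
    assert(mul_high_s(-1, 1) == -1);
    assert(mul_high_s(-1, -1) == 0);
}
```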

There may be other interesting instructions as well, but these are the ones that may have broader usage/application and which it would likely be good to validate we are covering.

category:implementation
theme:intrinsics

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Apr 14, 2022
@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@tannergooding tannergooding added arch-arm64 area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels Apr 14, 2022
@ghost

ghost commented Apr 14, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

After seeing the msub PR (#66621), which folds mul, sub into a single msub, I went and looked in the Arm64 manual for other interesting "combined operation" instructions.

There is a set of (shifted register) instructions which can combine a shift, op (the variants ending with s also set flags):

  • add, adds - Add
  • sub, subs - Subtract
  • cmp - Compare
  • cmn - Compare Negative
  • neg, negs - Negate
  • and, ands - Bitwise AND
  • bic, bics - Bitwise bit clear
  • eon - Bitwise exclusive OR NOT
  • eor - Bitwise exclusive OR
  • orr - Bitwise inclusive OR
  • mvn - Bitwise NOT
  • orn - Bitwise inclusive OR NOT
  • tst - Test Bits

There is a set of (extended register) instructions which can combine a zero-extend, op or sign-extend, op:

  • add, adds - Add
  • sub, subs - Subtract
  • cmp - Compare
  • cmn - Compare Negative

There is a set of (carry) instructions which can utilize the carry from a previous operation:

  • adc, adcs - Add with carry
  • sbc, sbcs - Subtract with carry
  • ngc, ngcs - Negate with carry

There are the multiply integer instructions which can combine an op, mul:

  • madd - Multiply-add
  • msub - Multiply-subtract
  • mneg - Multiply-negate

There are then some long multiply instructions which can return a product twice the size of the inputs (i32 * i32 = i64 or similar):

  • smull, umull - Multiply long
  • smaddl, umaddl - Multiply-add long
  • smsubl, umsubl - Multiply-subtract long
  • smnegl, umnegl - Multiply-negate long

Finally, there are the "multiply high" instructions, which return just the upper bits of a wide multiply:

  • smulh, umulh - Multiply High

There may be other interesting instructions as well, but these are the ones that may have broader usage/application and which it would likely be good to validate we are covering.

Author: tannergooding
Assignees: -
Labels:

arch-arm64, area-CodeGen-coreclr, untriaged

Milestone: -

@TIHan
Member

TIHan commented Apr 14, 2022

These look like really good optimization opportunities.

@JulieLeeMSFT
Member

Thanks @tannergooding for compiling all these cases. Let me put this in as a future item.

@a74nh
Contributor

a74nh commented Nov 18, 2022

@tannergooding and @kunalspathak: is anyone looking at the remaining instructions? If not, then this could be a good set of items for @SwapnilGaikwad to work through.

While doing this, it's probably worth adding some DisasmCheck tests for them too.

@kunalspathak
Member

@TIHan - are you planning to work on any of these?

@kunalspathak
Member

> this could be a good set of items for @SwapnilGaikwad to work through.

Sounds good to me.

@a74nh
Contributor

a74nh commented Dec 19, 2022

MNEG support: #79550

@kunalspathak
Member

@a74nh - Are there more items that you or @SwapnilGaikwad will be working on?

@SwapnilGaikwad
Contributor

Hi @kunalspathak, we plan to look at the extended register instruction combinations next (add, adds, sub, subs, cmp and cmn).

@kunalspathak
Member

> Hi @kunalspathak, we plan to look at the extended register instruction combinations next (add, adds, sub, subs, cmp and cmn).

Sounds good.

@kunalspathak
Member

@TIHan - Could you please update the issue description with the instructions you are planning to work on? It will help @SwapnilGaikwad pick some other instructions to optimize.

@TIHan TIHan modified the milestones: 8.0.0, 9.0.0 Jul 17, 2023

6 participants