Review the multi-op instruction usage for Arm64 #68028

tannergooding · 2022-04-14T16:57:59Z

After seeing the msub PR (#66621), which folds mul, sub into a single msub, I went and looked in the Arm64 manual for other interesting "combined operation" instructions.

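There is a set of (shifted register) instructions which can combine a shift, op (the variants ending with s also set flags):

add, adds - Add
sub, subs - Subtract
cmp - Compare
cmn - Compare Negative
neg, negs - Negate
and, ands - Bitwise AND
bic, bics - Bitwise bit clear
eon - Bitwise exclusive OR NOT
eor - Bitwise exclusive OR
orr - Bitwise inclusive OR
mvn - Bitwise NOT
orn - Bitwise inclusive OR NOT
tst - Test Bits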

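As a rough illustration of the shifted-register forms, the C functions below each pair a shift with an op. The function names are made up for this sketch, and the instruction named in each comment is only what an Arm64 code generator could plausibly select, not a guarantee of what any particular compiler or JIT emits.

```c
#include <stdint.h>

/* a + (b << 3): a candidate for `add x0, x0, x1, lsl #3`. */
uint64_t add_shifted(uint64_t a, uint64_t b) {
    return a + (b << 3);
}

/* a & ~(b >> 2): a candidate for `bic x0, x0, x1, lsr #2`. */
uint64_t bic_shifted(uint64_t a, uint64_t b) {
    return a & ~(b >> 2);
}

/* a ^ (b >> 7): a candidate for `eor x0, x0, x1, lsr #7`. */
uint64_t eor_shifted(uint64_t a, uint64_t b) {
    return a ^ (b >> 7);
}
```

Without the fold, each of these would need a separate shift instruction and a temporary register before the op.
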
There is a set of (extended register) instructions which can combine a zero-extend, op or sign-extend, op:

add, adds - Add
sub, subs - Subtract
cmp - Compare
cmn - Compare Negative
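
A small sketch of source patterns that match the extended-register forms, assuming a 64-bit op whose second operand is a widened 32-bit value. The function names are hypothetical, and the instructions in the comments are what one would hope to see rather than confirmed output.

```c
#include <stdint.h>

/* Sign-extend b, then add: a candidate for `add x0, x0, w1, sxtw`. */
int64_t add_sign_extended(int64_t a, int32_t b) {
    return a + (int64_t)b;
}

/* Zero-extend b, then subtract: a candidate for `sub x0, x0, w1, uxtw`. */
uint64_t sub_zero_extended(uint64_t a, uint32_t b) {
    return a - (uint64_t)b;
}

/* Sign-extend b, then compare: a candidate for `cmp x0, w1, sxtw`. */
int less_than_extended(int64_t a, int32_t b) {
    return a < (int64_t)b;
}
```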

There is a set of (carry) instructions which can utilize the carry from a previous operation:

adc, adcs - Add with carry
sbc, sbcs - Subtract with carry
ngc, ngcs - Negate with carry
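
The typical use for the carry forms is multi-word arithmetic. The sketch below models a 128-bit integer as two 64-bit halves (the `u128` struct and function names are assumptions for illustration) and writes the carry and borrow out by hand. Ideally the low half maps to a flag-setting `adds`/`subs`/`negs` and the high half to `adc`/`sbc`/`ngc`, but whether that pattern is recognized depends on the code generator.

```c
#include <stdint.h>

typedef struct { uint64_t lo; uint64_t hi; } u128;

/* 128-bit add: low half could be `adds`, high half `adc`. */
u128 add_u128(u128 a, u128 b) {
    u128 r;
    r.lo = a.lo + b.lo;
    uint64_t carry = r.lo < a.lo;   /* wraparound means a carry out */
    r.hi = a.hi + b.hi + carry;
    return r;
}

/* 128-bit subtract: low half could be `subs`, high half `sbc`. */
u128 sub_u128(u128 a, u128 b) {
    u128 r;
    uint64_t borrow = a.lo < b.lo;  /* low half needs to borrow */
    r.lo = a.lo - b.lo;
    r.hi = a.hi - b.hi - borrow;
    return r;
}

/* 128-bit negate: low half could be `negs`, high half `ngc`. */
u128 neg_u128(u128 a) {
    u128 r;
    uint64_t borrow = a.lo != 0;    /* 0 - lo borrows unless lo is 0 */
    r.lo = 0 - a.lo;
    r.hi = 0 - a.hi - borrow;
    return r;
}
```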

There are the multiply integer instructions which can combine an op, mul:

madd - Multiply-add
msub - Multiply-subtract
mneg - Multiply-negate
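
For reference, these are the source shapes that correspond to the three instructions, written as a C sketch with hypothetical function names. The last function shows the remainder pattern, where a divide followed by `msub` avoids a separate multiply and subtract. The comments name the instruction one would hope for, not verified output.

```c
#include <stdint.h>

/* a + b * c: a candidate for `madd x0, x1, x2, x0`. */
int64_t mul_add(int64_t a, int64_t b, int64_t c) {
    return a + b * c;
}

/* a - b * c: a candidate for `msub x0, x1, x2, x0`. */
int64_t mul_sub(int64_t a, int64_t b, int64_t c) {
    return a - b * c;
}

/* -(b * c): a candidate for `mneg x0, x0, x1`. */
int64_t mul_neg(int64_t b, int64_t c) {
    return -(b * c);
}

/* a % b spelled out: a candidate for `sdiv` followed by `msub`. */
int64_t remainder_via_msub(int64_t a, int64_t b) {
    return a - (a / b) * b;
}
```

Signed overflow is undefined in C, so the sketch is only meaningful for inputs whose products fit in 64 bits.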

There are then some long multiply instructions which can return a product twice the size of the inputs (i32 * i32 = i64 or similar; effectively good for covering zero or sign extend, multiply):

smull, umull - Multiply long
smaddl, umaddl - Multiply-add long (Optimise long multiply + add/sub/neg on arm64. #91886)
smsubl, umsubl - Multiply-subtract long (Optimise long multiply + add/sub/neg on arm64. #91886)
smnegl, umnegl - Multiply-negate long (Optimise long multiply + add/sub/neg on arm64. #91886)
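
To make the "extend, multiply" shape concrete, here is a C sketch in which both 32-bit inputs are widened before a 64-bit multiply. The function names are invented for illustration, and the instructions in the comments are the hoped-for selections rather than confirmed codegen.

```c
#include <stdint.h>

/* Sign-extend both, multiply: a candidate for `smull x0, w0, w1`. */
int64_t mul_long_signed(int32_t a, int32_t b) {
    return (int64_t)a * (int64_t)b;
}

/* Zero-extend both, multiply: a candidate for `umull x0, w0, w1`. */
uint64_t mul_long_unsigned(uint32_t a, uint32_t b) {
    return (uint64_t)a * (uint64_t)b;
}

/* acc + widened product: a candidate for `smaddl x0, w1, w2, x0`. */
int64_t mul_add_long(int64_t acc, int32_t a, int32_t b) {
    return acc + (int64_t)a * (int64_t)b;
}

/* acc - widened product: a candidate for `smsubl x0, w1, w2, x0`. */
int64_t mul_sub_long(int64_t acc, int32_t a, int32_t b) {
    return acc - (int64_t)a * (int64_t)b;
}

/* Negated widened product: a candidate for `smnegl x0, w0, w1`. */
int64_t mul_neg_long(int32_t a, int32_t b) {
    return -((int64_t)a * (int64_t)b);
}
```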

Finally, there are the "multiply high" instructions, which return just the upper bits of a wide multiply:

smulh, umulh - Multiply High
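
As a reference for what "upper bits" means here, the C sketch below computes the high 64 bits of a 64 x 64 product from 32-bit halves, which is the work a single `umulh` or `smulh` replaces. The `_portable` names are hypothetical, and the signed version assumes the usual two's-complement conversion from `uint64_t` to `int64_t`.

```c
#include <stdint.h>

/* Upper 64 bits of the unsigned 128-bit product a * b. */
uint64_t umulh_portable(uint64_t a, uint64_t b) {
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t lo_lo = a_lo * b_lo;
    uint64_t hi_lo = a_hi * b_lo;
    uint64_t lo_hi = a_lo * b_hi;
    uint64_t hi_hi = a_hi * b_hi;

    /* Middle column: cannot overflow 64 bits for any inputs. */
    uint64_t cross = (lo_lo >> 32) + (uint32_t)hi_lo + lo_hi;
    return hi_hi + (hi_lo >> 32) + (cross >> 32);
}

/* Upper 64 bits of the signed 128-bit product a * b. */
int64_t smulh_portable(int64_t a, int64_t b) {
    uint64_t hi = umulh_portable((uint64_t)a, (uint64_t)b);
    /* Correct the unsigned result for negative operands. */
    if (a < 0) hi -= (uint64_t)b;
    if (b < 0) hi -= (uint64_t)a;
    return (int64_t)hi;
}
```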

There may be other interesting instructions as well, but these are the ones that may have broader usage/application and which would likely be good to validate that we are covering.

category:implementation
theme:intrinsics

dotnet-issue-labeler · 2022-04-14T16:58:03Z

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

ghost · 2022-04-14T16:58:42Z

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

After seeing the msub PR (#66621) which folds mul, sub into a single msub I went and looked in the Arm64 manual for other interesting "combined operation" instructions.

There is a set of (shifted register) instructions which can combine a shift, op (the variants ending with s also set flags):

add, adds - Add
sub, subs - Subtract
cmp - Compare
cmn - Compare Negative
neg, negs - Negate
and, ands - Bitwise AND
bic, bics - Bitwise bit clear
eon - Bitwise exclusive OR NOT
eor - Bitwise exclusive OR
orr - Bitwise inclusive OR
mvn - Bitwise NOT
orn - Bitwise inclusive OR NOT
tst - Test Bits

Author:	tannergooding
Assignees:	-
Labels:	`arch-arm64`, `area-CodeGen-coreclr`, `untriaged`
Milestone:	-

TIHan · 2022-04-14T17:18:10Z

These look like really good optimization opportunities.

JulieLeeMSFT · 2022-04-18T23:41:51Z

Thanks @tannergooding for compiling all these cases. Let me put this in the Future milestone.

a74nh · 2022-11-18T10:29:14Z

@tannergooding and @kunalspathak : is there anyone looking at the remaining instructions? If not, then this could be a good set of items for @SwapnilGaikwad to work through.

While doing this, it's probably worth adding some DisasmCheck tests for them too.

kunalspathak · 2022-11-18T18:05:59Z

@TIHan - are you planning to work on any of these?

kunalspathak · 2022-12-01T17:26:49Z

> this could be a good set of items for @SwapnilGaikwad to work through.

Sounds good to me.

a74nh · 2022-12-19T16:47:33Z

MNEG support: #79550

kunalspathak · 2023-01-23T17:55:48Z

@a74nh - Are there more items that you or @SwapnilGaikwad will be working on?

SwapnilGaikwad · 2023-01-24T17:40:56Z

Hi @kunalspathak, we plan to look at the extended register instruction combinations next (add, adds, sub, subs, cmp and cmn).

kunalspathak · 2023-01-24T18:03:41Z

> Hi @kunalspathak, we plan to look at the extended register instruction combinations next (add, adds, sub, subs, cmp and cmn).

Sounds good.

kunalspathak · 2023-04-24T15:03:02Z

@TIHan - Could you please update the issue description with the instructions you are planning to work on? It will help @SwapnilGaikwad pick other instructions to optimize.

dotnet-issue-labeler bot added the `untriaged` (New issue has not been triaged by the area owner) label Apr 14, 2022

tannergooding added the `arch-arm64` and `area-CodeGen-coreclr` (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) labels Apr 14, 2022

JulieLeeMSFT removed the `untriaged` (New issue has not been triaged by the area owner) label Apr 18, 2022

JulieLeeMSFT added this to the Future milestone Apr 18, 2022

JulieLeeMSFT mentioned this issue Apr 18, 2022: Improving ARM64 Performance in .NET 7.0 #64820 (Closed, 32 tasks)

tannergooding mentioned this issue Sep 19, 2022: Support a few "shifted register" operations on Arm64 #75823 (Merged)

tannergooding mentioned this issue Sep 27, 2022: Remove GT_ADDEX and replace with more generalized containment handling #76273 (Merged)

xoofx mentioned this issue Nov 12, 2022: ARM64: Missing combine eor+lsr and duplicate constant reloads #78263 (Closed)

SwapnilGaikwad mentioned this issue Dec 12, 2022: Emit mneg for mul+neg on Arm64 #79550 (Merged)

kunalspathak mentioned this issue Jan 23, 2023: Improving Arm64 Performance in .NET 8.0 #77010 (Closed, 28 tasks)

JulieLeeMSFT mentioned this issue Feb 8, 2023: What's new in .NET 8 Preview 1 dotnet/core#8133 (Closed, 3 tasks)

kunalspathak assigned TIHan Apr 6, 2023

JulieLeeMSFT modified the milestones: Future, 8.0.0 Apr 6, 2023

TIHan mentioned this issue Apr 18, 2023: [JIT] ARM64 - Combine 'neg' and 'cmp' to 'cmn' #84667 (Merged)

TIHan modified the milestones: 8.0.0, 9.0.0 Jul 17, 2023

kunalspathak mentioned this issue Sep 20, 2023: Optimise long multiply + add/sub/neg on arm64. #91886 (Merged)

kunalspathak mentioned this issue Oct 6, 2023: Arm64: Add SVE/SVE2 support in .NET 9 #93095 (Closed, 31 tasks)

kunalspathak mentioned this issue Oct 13, 2023: JIT: Recognize ORN on ARM64 #93435 (Closed)

kunalspathak mentioned this issue Nov 7, 2023: Improving Arm64 Performance in .NET 9.0 #94464 (Closed, 13 tasks)

TIHan added the `Priority:3` (Work that is nice to have) label May 6, 2024

TIHan modified the milestones: 9.0.0, Future Jul 25, 2024
