Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Arm64] ASIMD Implement widening, narrowing, saturating instructions #35379

Conversation

echesakov
Copy link
Contributor

@echesakov echesakov commented Apr 23, 2020

Adds new instructions for #32512 #35143

Also fixes #35303

Attached are jitDisasm and WinDbg u command outputs.
jitDisasm
WinDbg(u)

@echesakov echesakov added arch-arm64 area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels Apr 23, 2020
src/coreclr/src/jit/codegenarm64.cpp Show resolved Hide resolved
case IF_BI_0B:
case IF_BI_0C:
case IF_BI_1A:
case IF_BI_1B:
Copy link
Contributor Author

@echesakov echesakov Apr 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BruceForstall @briansull Do you know why this giant sequence of case-s is needed? It's obviously not checking if fmt argument passed from outside (i.e. from one of the emitIns_X_Y_Z functions) matches the instruction format specified in INST1() in hwintrinsiclistarm64.h table. If someone makes a mistake in this table - it will go unnoticed (it happened to me yesterday). I believe it is better to replace this with the following logic in default: Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems reasonable to me. I guess the idea here is that for INST1, there is exactly one fmt. For INST2+, there is a "pseudo-format" in the "fmt" part of the macro that then maps to the actual format.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ones named here were all of the formats that only had one possible formats. This switch insures that you properly update this method when adding new instr formats. With your change the assumption is made that a newly added instrFormat is an INST1 case by default. Where as before your chanfge it would assert on the newly added format.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But after looking into your change I believe that what you have added here will also work fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ones named here were all of the formats that only had one possible formats. This switch insures that you properly update this method when adding new instr formats. With your change the assumption is made that a newly added instrFormat is an INST1 case by default. Where as before your chanfge it would assert on the newly added format.

That would assert only if you haven't added a new format and set fmt to the same value in one of emitIns_X_Y_Z functions. But if the latter is false (i.e. the function returns wrong format) then it wouldn't assert.

After this change it is not longer required to update this switch and it will automatically check that the fmt set in emitIns is consistent with the one declared in instrsarm64.h table.

@echesakov echesakov marked this pull request as ready for review April 24, 2020 00:03
@@ -7864,6 +7858,40 @@ void CodeGen::genArm64EmitterUnitTests()
theEmitter->emitIns_R_R(INS_fcvtn2, EA_8BYTE, REG_V0, REG_V1);
#endif

#ifdef ALL_ARM64_EMITTER_UNIT_TESTS
// sadalp vector
theEmitter->emitIns_R_R(INS_sadalp, EA_8BYTE, REG_V0, REG_V1, INS_OPTS_8B);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are a little odd because Vd and Vn take different arrangement specifiers, even though one implies the other. So which register does this insOpts specifier refer to? There must be other SIMD instructions like this as well. Should we have a function that takes two, and asserts they are as expected (e.g., if we have Vd.4H then we have Vn.8B)?

Are there instructions where the two specifies can both be arbitrary (and unrelated)?

Copy link
Contributor Author

@echesakov echesakov Apr 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say most of the added "long" instructions (except high narrow - addhn{2}, subhn{2}) specify their source register(s) arrangement specifiers.

I don't think we want to specify two arrangements when emitting at the moment - this would complicate the logic in the intrinsic backend - you would not only need to infer the insOpts from an incoming argument/return value but also to compute the second arrangement somehow.

Are there instructions where the two specifies can both be arbitrary (and unrelated)?

I have not seen such instructions - I hope there won't be any

Copy link
Contributor

@briansull briansull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

case IF_BI_0B:
case IF_BI_0C:
case IF_BI_1A:
case IF_BI_1B:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ones named here were all of the formats that only had one possible formats. This switch insures that you properly update this method when adding new instr formats. With your change the assumption is made that a newly added instrFormat is an INST1 case by default. Where as before your chanfge it would assert on the newly added format.

case IF_BI_0B:
case IF_BI_0C:
case IF_BI_1A:
case IF_BI_1B:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But after looking into your change I believe that what you have added here will also work fine.

@echesakov
Copy link
Contributor Author

Last commits are:

  1. Fix dissasembly of saddw{2}, uaddw{2}, ssubw{2}, usubw{2}
  2. Address Bruce's feedback regarding diagnostic message
  3. Merge latest master

@echesakov echesakov merged commit f03d585 into dotnet:master Apr 29, 2020
@echesakov echesakov deleted the Arm64-ASIMD-Widening-Narrowing-Saturating-Instructions branch April 29, 2020 03:54
@ghost ghost locked as resolved and limited conversation to collaborators Dec 9, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
arch-arm64 area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[arm64] sxtb, sxth, sxtw instructions disassembly incorrect
3 participants