JIT: Avoid mask<->vector optimization for masks used in unhandled ways #110307

Merged
merged 3 commits into dotnet:main from fix-110306
Dec 3, 2024

Conversation

jakobbotsch
Member

@jakobbotsch jakobbotsch commented Dec 2, 2024

When a local is used as a return buffer it is not address exposed, so the address-exposure check was not sufficient. Add checks for LCL_ADDR, LCL_FLD and STORE_LCL_FLD to make sure any use of a mask local that is not converted disqualifies it from participating in the optimization.

Also avoid doing unnecessary work for locals that are not SIMD/mask typed (the common case); previously we would perform superfluous hash table lookups and other bookkeeping for such locals.

Fix #110306
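
For illustration, a minimal sketch of the kind of pattern involved; this is an assumption made up for exposition (the `Produce`/`Mix` names and shapes are invented), not the actual Fuzzlyn repro from #110306:

```csharp
using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics;

static class MaskLocalSketch
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    static Vector512<int> Produce() => Vector512<int>.Zero;

    static Vector512<int> Mix(Vector512<int> a, Vector512<int> b, bool flag)
    {
        // One definition of 'v' is a comparison, which the JIT can model as a
        // mask converted to a vector; the other is a call whose result may be
        // written into 'v' through a hidden return buffer on xarch, without
        // address-exposing 'v'. The optimization has to see that non-converted
        // definition and leave 'v' as a vector-typed local.
        Vector512<int> v = flag ? Vector512.GreaterThan(a, b) : Produce();
        return v;
    }
}
```

The point is only that retyping `v` as a mask local is safe when every def and use of `v` is one of the recognized conversion shapes; any other access has to disqualify it.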

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Dec 2, 2024
@amanasifkhalid
Member

I haven't looked into them, but I suspect this will also fix the "Assert unreached()" failures from the latest Antigen run.

@jakobbotsch jakobbotsch marked this pull request as ready for review December 2, 2024 16:18
@jakobbotsch
Member Author

cc @dotnet/jit-contrib PTAL @a74nh @tannergooding

No diffs

@tannergooding
Member

Do we not have an outerloop job testing SVE under the AltJit?

Is that why these didn't get caught after the initial PR went in, and only after the x64 support came online?

@kunalspathak
Member

Do we not have an outerloop job testing SVE under the AltJit?

Is that why these didn't get caught after the initial PR went in, and only after the x64 support came online?

The existing pipeline is broken because it was not testing the right thing. I had #107475 open, but I saw some other non-SVE failures that I didn't get time to investigate. Cobalt machines will be added to CI shortly, so going forward this will get caught naturally.

@jakobbotsch
Member Author

Is that why these didn't get caught after the initial PR went in, and only after the x64 support came online?

This doesn't repro on arm64 because SIMD16 is always returned by value there, which the existing checks handled. On xarch we have the possibility of the vector being defined via a return buffer, in which case it doesn't end up address exposed. The checks weren't sufficient for that.
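
As a rough illustration of the difference (assumed shapes; the comments restate the ABI behavior described in this thread rather than anything this PR changes):

```csharp
using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics;

static class ReturnShapes
{
    // arm64: a Vector128<T> (SIMD16) result comes back by value in a SIMD
    // register, so the caller's local is defined by an ordinary store.
    static Vector128<int> Simd16(Vector128<int> a, Vector128<int> b)
        => Vector128.GreaterThan(a, b);

    // xarch: a larger vector result may instead be written by the callee
    // through a hidden return-buffer pointer, so the caller's local can be
    // defined without ever being address exposed.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static Vector512<int> Simd64(Vector512<int> a, Vector512<int> b)
        => Vector512.GreaterThan(a, b);

    static int Caller(Vector512<int> a, Vector512<int> b)
    {
        // On xarch 'v' may serve as the return buffer for this call.
        Vector512<int> v = Simd64(a, b);
        return Vector512.Sum(v);
    }
}
```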

@a74nh
Contributor

a74nh commented Dec 2, 2024

When a local is used as a return buffer it is not address exposed, so the address-exposure check was not sufficient. Add checks for LCL_ADDR, LCL_FLD and STORE_LCL_FLD to make sure any use of a mask local that is not converted disqualifies it from participating in the optimization.

Could this appear on Arm64? Interesting as I never saw any failures myself on Fuzzlyn Arm64.

@tannergooding
Member

This doesn't repro on arm64 because SIMD16 is always returned by value there

I don't believe we do this for cases like struct S { Vector128<T> field; } or other HVA qualifying structs today; it is a known gap/mismatch in the ABI handling.

@jakobbotsch
Member Author

This doesn't repro on arm64 because SIMD16 is always returned by value there

I don't believe we do this for cases like struct S { Vector128<T> field; } or other HVA qualifying structs today; it is a known gap/mismatch in the ABI handling.

16 byte structs on ARM64 are returned in two registers, not via return buffer.

@jakobbotsch
Member Author

Could this appear on Arm64? Interesting as I never saw any failures myself on Fuzzlyn Arm64.

The LCL_FLD and STORE_LCL_FLD cases could probably occur, but it takes some odd-looking C# to make them appear.
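
One hedged guess at what such odd-looking C# might be (an invented example, not taken from an actual failure): reinterpreting part of a vector local, which the JIT's local-address analysis may turn into field-style (`LCL_FLD`) accesses of that local rather than whole-vector uses.

```csharp
using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics;

static class OddLooking
{
    static long ReadLowHalf(Vector128<int> a, Vector128<int> b)
    {
        // 'mask' is defined by a comparison (a convertible mask def), but the
        // reinterpreting read below is a partial, differently-typed access of
        // the local, so it should disqualify 'mask' from the mask rewrite.
        Vector128<int> mask = Vector128.GreaterThan(a, b);
        return Unsafe.As<Vector128<int>, long>(ref mask);
    }
}
```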

@tannergooding
Member

16 byte structs on ARM64 are returned in two registers, not via return buffer.

For some classifications, not all of them. Multiple factors influence this, including total size and whether the struct is all integer, all floating-point, all SIMD, or a mix of data types.

The classifications are effectively Composite Type - Known Size, Composite Type - Unknown Size, HFA, and HVA. It is only a Composite Type - Known Size of no more than 16 bytes that may get passed/returned in two general-purpose registers. An HFA/HVA may instead be passed in 1-4 floating-point/SIMD registers, while all other structs are effectively passed via a return buffer. Similarly, other special considerations (such as copy constructors in C++) may change the classification.
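
A hedged C# illustration of those buckets (struct names invented; the comments restate the AAPCS64 classification described above, and, per the earlier comment, the runtime's handling of HVA returns may not fully match this today):

```csharp
using System.Runtime.Intrinsics;

// Composite Type - Known Size, no more than 16 bytes: may be passed/returned
// in up to two general-purpose registers.
struct TwoLongs { public long A, B; }

// HFA: up to four fields of the same floating-point type, so it may be
// passed/returned in floating-point registers instead.
struct Hfa { public float X, Y, Z, W; }

// HVA: up to four fields of the same short-vector type, so it may be
// passed/returned in SIMD registers.
struct Hva { public Vector128<int> V0, V1; }

// Composite type larger than 16 bytes that is neither an HFA nor an HVA:
// effectively passed/returned through a buffer supplied by the caller.
struct Big { public long A, B, C; }
```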

But regardless, my point was that I imagine this general issue should have been reproducible with SVE; it is not unique to x64.

@jakobbotsch
Member Author

But regardless, my point was that I imagine this general issue should've been reproducible on SVE, it is not unique to x64

If we had SIMD types larger than TYP_SIMD16 on arm64, then yes, it would presumably be reproducible. But as things stand, since we only ever end up with TYP_SIMD16, it is never passed via a return buffer, so we never see this there.

Member

@kunalspathak kunalspathak left a comment

LGTM

@amanasifkhalid
Member

/azp run Antigen


Azure Pipelines successfully started running 1 pipeline(s).

@jakobbotsch
Member Author

/ba-g Helix work item was dead-lettered

@jakobbotsch jakobbotsch merged commit 05fa881 into dotnet:main Dec 3, 2024
110 of 122 checks passed
@jakobbotsch jakobbotsch deleted the fix-110306 branch December 3, 2024 08:42
eduardo-vp pushed a commit to eduardo-vp/runtime that referenced this pull request Dec 5, 2024
JIT: Avoid mask<->vector optimization for masks used in unhandled ways (dotnet#110307)

mikelle-rogers pushed a commit to mikelle-rogers/runtime that referenced this pull request Dec 10, 2024
JIT: Avoid mask<->vector optimization for masks used in unhandled ways (dotnet#110307)

Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Development

Successfully merging this pull request may close these issues.

JIT: Assertion failed 'newLclValue.BothDefined()' during 'Do value numbering'
5 participants