
Arm64: Don't use GT_LEA for masks#128684

Open
a74nh wants to merge 2 commits into dotnet:main from a74nh:async_sve_github

@a74nh
Contributor

@a74nh a74nh commented May 28, 2026

genCreateAddrMode() will need rewrites for scalable vectors/masks. Until then, avoid using LEA nodes.

In addition, remove invalid code for LDR/STR from the emitter. The emitter expects the offset to be a multiple of the vector length (VL) or predicate length (PL).

Fixes #127605
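
The offset constraint described above can be illustrated with a small standalone check. This is a minimal sketch, not code from the JIT: the name `isValidSveLdrStrOffset` and its parameters are hypothetical, and it assumes the unpredicated SVE LDR/STR form, whose immediate is a signed 9-bit count of whole vector (VL) or predicate (PL) lengths. It also assumes the register length in bytes is known to the caller, which is not generally true at JIT time for scalable vectors.

```cpp
#include <cstdint>

// Hypothetical helper (not from the JIT sources): returns true when a byte
// offset can be encoded directly in an unpredicated SVE LDR/STR, i.e. it is
// a whole number of register lengths and that count fits in a signed 9-bit
// immediate. On success, *imm receives the scaled immediate.
// regSizeBytes is VL (vector register bytes) or PL (predicate register bytes).
bool isValidSveLdrStrOffset(int64_t byteOffset, int64_t regSizeBytes, int64_t* imm)
{
    if (regSizeBytes <= 0 || (byteOffset % regSizeBytes) != 0)
    {
        return false; // not a multiple of VL/PL
    }

    int64_t scaled = byteOffset / regSizeBytes;
    if (scaled < -256 || scaled > 255)
    {
        return false; // does not fit in a signed 9-bit immediate
    }

    *imm = scaled;
    return true;
}
```

With a 16-byte register length, a 32-byte offset passes with an immediate of 2, while an 8-byte offset is rejected because it is not a multiple of the register length.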

Copilot AI review requested due to automatic review settings May 28, 2026 08:55
@github-actions github-actions Bot added the area-CodeGen-coreclr label May 28, 2026
@dotnet-policy-service dotnet-policy-service Bot added the community-contribution label May 28, 2026
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Contributor

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Removes the workaround logic in the ARM64 SVE emitter that handled out-of-range immediate offsets for ldr/str by scaling or stashing to a reserved register, and instead relies on the lowering phase to avoid generating address modes for SVE mask/SIMD loads/stores. The emitter now accepts SP as a base register and simply asserts the immediate is a valid signed 9-bit value.

Changes:

  • Lowering refuses to create address modes for TYP_MASK/TYP_SIMD parents on ARM64 (SVE TODO).
  • SVE ldr/str emitter paths drop the scaled-immediate workaround and now allow SP (encoded via encodingSPtoZR).
  • Removes the now-unused emitIns_valid_imm_for_scaled_sve_ldst_offset helper from header and implementation.
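
The first change in the list above can be pictured with a short standalone sketch. It is an illustration under stated assumptions only: `VarType` and `canCreateAddrMode` are hypothetical stand-ins, not the JIT's real types or functions, and the actual check in lower.cpp may be structured differently.

```cpp
// Hypothetical, simplified stand-ins for the JIT's type tags.
enum class VarType
{
    Int,
    Long,
    Simd, // stands in for TYP_SIMD
    Mask  // stands in for TYP_MASK
};

// Sketch of the decision described in the review summary: on ARM64, do not
// fold the address computation of a mask/SIMD load or store into an address
// mode. Every other case is left eligible for address-mode creation.
bool canCreateAddrMode(VarType parentType, bool targetIsArm64)
{
    if (targetIsArm64 && (parentType == VarType::Mask || parentType == VarType::Simd))
    {
        return false;
    }
    return true;
}
```

In this sketch the address arithmetic for a rejected parent would stay as separate instructions, so the load or store only ever sees a plain base register.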

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/coreclr/jit/lower.cpp | Skips address-mode creation for SVE mask/SIMD parents. |
| src/coreclr/jit/emitarm64sve.cpp | Simplifies SVE ldr/str emission; allows SP base; removes imm scaling/reserved-reg fallback. |
| src/coreclr/jit/emitarm64.h | Removes declaration of unused SVE scaled imm validator. |
| src/coreclr/jit/emitarm64.cpp | Removes implementation of unused SVE scaled imm validator. |

Comment thread src/coreclr/jit/emitarm64sve.cpp
Comment thread src/coreclr/jit/lower.cpp
@jakobbotsch
Member

/azp run Fuzzlyn

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@a74nh a74nh requested a review from jakobbotsch May 28, 2026 09:36
@a74nh a74nh marked this pull request as ready for review May 28, 2026 09:36
@jakobbotsch
Member

/azp run Fuzzlyn

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Member

@jakobbotsch jakobbotsch left a comment

Thanks

@jakobbotsch
Member

This helps a lot on Fuzzlyn: https://dev.azure.com/dnceng-public/public/_build/results?buildId=1438694&view=ms.vss-build-web.run-extensions-tab

I still see a number of crash issues with runtime async for arm64 specifically. I think it is likely there is another issue hiding here.
The Fuzzlyn CI run does not automatically reduce crashes (that takes a very long time). Can you try to see if you are able to reproduce one of those issues manually with Fuzzlyn? E.g. I see

6034233908977920848-async,runtimeasync,vectort,vector64,vector128,armadvsimd,armadvsimdarm64,armaes,armarmbase,armarmbasearm64,armcrc32,armcrc32arm64,armdp,armrdm,armrdmarm64,armsha1,armsha256,armsve,armsve2

as a crashing example in the above. On SVE capable hardware you should be able to reproduce this with

<path to Fuzzlyn> --host <path to checked corerun> --seed 6034233908977920848-async,runtimeasync,vectort,vector64,vector128,armadvsimd,armadvsimdarm64,armaes,armarmbase,armarmbasearm64,armcrc32,armcrc32arm64,armdp,armrdm,armrdmarm64,armsha1,armsha256,armsve,armsve2 --reduce

If it reports no issue then the issue is intermittent and this will be harder to track down. Hopefully it reproduces consistently.

@jakobbotsch
Member

/ba-g x64 NativeAOT timeouts

@a74nh
Contributor Author

a74nh commented May 29, 2026

> 6034233908977920848-async,runtimeasync,vectort,vector64,vector128,armadvsimd,armadvsimdarm64,armaes,armarmbase,armarmbasearm64,armcrc32,armcrc32arm64,armdp,armrdm,armrdmarm64,armsha1,armsha256,armsve,armsve2

I ran this one in a loop and didn't see any failures after 84 runs.

I also ran Fuzzlyn for 50mins / 109800 examples. I got one failure in if conversion, which I raised as #128749.

@a74nh
Contributor Author

a74nh commented May 29, 2026

> 6034233908977920848-async,runtimeasync,vectort,vector64,vector128,armadvsimd,armadvsimdarm64,armaes,armarmbase,armarmbasearm64,armcrc32,armcrc32arm64,armdp,armrdm,armrdmarm64,armsha1,armsha256,armsve,armsve2
>
> I ran this one in a loop and didn't see any failures after 84 runs.
>
> I also ran Fuzzlyn for 50mins / 109800 examples. I got one failure in if conversion, which I raised as #128749.

Scratch that. Not sure why, but my HEAD was a month old.

Change-Id: I3295d1856c2cc5a5c4403a28dffedec1ff6390d9
@a74nh
Contributor Author

a74nh commented May 29, 2026

> 6034233908977920848-async,runtimeasync,vectort,vector64,vector128,armadvsimd,armadvsimdarm64,armaes,armarmbase,armarmbasearm64,armcrc32,armcrc32arm64,armdp,armrdm,armrdmarm64,armsha1,armsha256,armsve,armsve2
>
> I ran this one in a loop and didn't see any failures after 84 runs.
>
> I also ran Fuzzlyn for 50mins / 109800 examples. I got one failure in if conversion, which I raised as #128749.
>
> Scratch that. Not sure why, but my HEAD was a month old.

OK, after merging up to latest HEAD:

  • The above example didn't fail in 476 attempts
  • In 45mins I generated 104200 programs and got 0 failures

@jakobbotsch
Member

> OK, after merging up to latest HEAD:
>
> * The above example didn't fail in 476 attempts
> * In 45mins I generated 104200 programs and got 0 failures

Note that the failures are with runtime-async only. To enable runtime-async you need to pass --gen-extensions default,async,runtimeasync to Fuzzlyn. But I am not sure how simple it is going to be to get a repro for this.

@a74nh
Contributor Author

a74nh commented May 29, 2026

> > OK, after merging up to latest HEAD:
> >
> > * The above example didn't fail in 476 attempts
> > * In 45mins I generated 104200 programs and got 0 failures
>
> Note that the failures are with runtime-async only. To enable runtime-async you need to pass --gen-extensions default,async,runtimeasync to Fuzzlyn. But I am not sure how simple it is going to be to get a repro for this.

Yes, I made sure to include that:

~/dotnet/Fuzzlyn/Fuzzlyn/bin/Release/net8.0/linux-arm64/Fuzzlyn --seconds-to-run 3000 --output-events-to out.txt --host $CORE_ROOT/corerun --parallelism -1 --known-errors dotnet/runtime --gen-extensions async,runtimeasync,vectort,vector64,vector128,armadvsimd,armadvsimdarm64,armaes,armarmbase,armarmbasearm64,armcrc32,armcrc32arm64,armdp,armrdm,armrdmarm64,armsha1,armsha256,armsve,armsve2

This PR should be good to go though?


Labels

area-CodeGen-coreclr: CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
community-contribution: Indicates that the PR has been added by a community member

Development

Successfully merging this pull request may close these issues.

JIT: Invalid runtime async result on arm64

3 participants