Arm64: Don't use GT_LEA for masks#128684
Conversation
genCreateAddrMode() will need rewrites for scalable vectors/masks. Until then, avoid using LEA nodes. In addition, remove invalid code for LDR/STR from the emitter. The emitter expects the offset to be a multiple of the VL/PL Fixes dotnet#127605
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Removes the workaround logic in the ARM64 SVE emitter that handled out-of-range immediate offsets for ldr/str by scaling or stashing to a reserved register, and instead relies on the lowering phase to avoid generating address modes for SVE mask/SIMD loads/stores. The emitter now accepts SP as a base register and simply asserts the immediate is a valid signed 9-bit value.
Changes:
- Lowering refuses to create address modes for
TYP_MASK/TYP_SIMDparents on ARM64 (SVE TODO). - SVE
ldr/stremitter paths drop the scaled-immediate workaround and now allow SP (encoded viaencodingSPtoZR). - Removes the now-unused
emitIns_valid_imm_for_scaled_sve_ldst_offsethelper from header and implementation.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/coreclr/jit/lower.cpp | Skips address-mode creation for SVE mask/SIMD parents. |
| src/coreclr/jit/emitarm64sve.cpp | Simplifies SVE ldr/str emission; allows SP base; removes imm scaling/reserved-reg fallback. |
| src/coreclr/jit/emitarm64.h | Removes declaration of unused SVE scaled imm validator. |
| src/coreclr/jit/emitarm64.cpp | Removes implementation of unused SVE scaled imm validator. |
|
/azp run Fuzzlyn |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run Fuzzlyn |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
This helps a lot on Fuzzlyn: https://dev.azure.com/dnceng-public/public/_build/results?buildId=1438694&view=ms.vss-build-web.run-extensions-tab I still see a number of crash issues with runtime async for arm64 specifically. I think it is likely there is another issue hiding here. as a crashing example in the above. On SVE capable hardware you should be able to reproduce this with If it reports no issue then the issue is intermittent and this will be harder to track down. Hopefully it reproduces consistently. |
|
/ba-g x64 NativeAOT timeouts |
I ran this one in a loop and didn't see any failures after 84 runs. I also ran Fuzzlyn for 50mins / 109800 examples. I got one failure in if conversion, which I raised as #128749. |
Scratch that. Not sure why, but my HEAD was a month old. |
Change-Id: I3295d1856c2cc5a5c4403a28dffedec1ff6390d9
Ok, after merged up to latest HEAD:
|
Note that the failures are with runtime-async only. To enable runtime-async you need to pass |
Yes, I made sure to include that: This PR should be good to go though? |
genCreateAddrMode() will need rewrites for scalable vectors/masks. Until then, avoid using LEA nodes.
In addition, remove invalid code for LDR/STR from the emitter. The emitter expects the offset to be a multiple of the VL/PL
Fixes #127605