JIT: Don't allow broadcast lowering to remove cast from decomposed long #116002

Merged 1 commit on May 27, 2025


saucecontrol
Member

Fixes #115202

I was unable to create a minimal repro for this, but using the latest SPMI context from #115202 (comment), I found the cause.
 
The problematic IR starts out as:

N025 (???,???) [016563] -ACXG+-----                t16563 = *  CALL      long   <unknown method>
                                                            /--*  t16563 long   
N026 (???,???) [016564] -ACXG+-----                t16564 = *  CAST      int <- long
                                                            /--*  t16564 int    
N027 (???,???) [016565] -ACXG+-----                t16565 = *  HWINTRINSIC simd32 32 int Create

This Vector256.Create is lowered to Avx2.BroadcastScalarToVector256 <- Vector256.CreateScalarUnsafe <- CAST(int <- long).

The containment check for broadcast, which looks for that specific pattern, then removes both the CreateScalarUnsafe and the GT_CAST, leaving an uncontained GT_LONG as the direct operand of the broadcast:

               [038143] -----------                t38143 =    LCL_FLD   int    V1485 tmp1445    [+0]
               [038144] -----------                t38144 =    LCL_FLD   int    V1485 tmp1445    [+4]
                                                            /--*  t38143 int    
                                                            +--*  t38144 int    
               [038145] -----------                t38145 = *  LONG      long  
                                                            /--*  t38145 long   
N027 (???,???) [016565] -ACXG+-----                t16565 = *  HWINTRINSIC simd32 32 int BroadcastScalarToVector256

In this case, the cast should have been preserved, and it should contain the GT_LONG.
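
To illustrate why the cast matters, here is a minimal sketch of a decomposed long, assuming a target where a 64-bit value is represented as two 32-bit halves. `DecomposedLong`, `Decompose`, and `TruncateToInt32Bits` are hypothetical names for illustration, not JIT APIs. The truncating cast is what selects the low half; without it, the consumer is handed a two-part value it cannot use directly.

```cpp
#include <cstdint>

// Hypothetical model of LONG(lo, hi): a 64-bit value split into two 32-bit halves.
struct DecomposedLong {
    uint32_t lo; // bits 0..31  (the field at offset +0 in the dump above)
    uint32_t hi; // bits 32..63 (the field at offset +4)
};

// Split a 64-bit value into its halves.
DecomposedLong Decompose(uint64_t value) {
    return { static_cast<uint32_t>(value), static_cast<uint32_t>(value >> 32) };
}

// Model of CAST int <- long on a decomposed value:
// truncation keeps the low 32 bits and ignores the high half.
uint32_t TruncateToInt32Bits(const DecomposedLong& value) {
    return value.lo;
}
```

In this model the cast is the only node that knows how to turn the pair into a single 32-bit value, which is why dropping it leaves nothing to consume the LONG.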

Correct IR after lowering:

               [038143] -----------                t38143 =    LCL_FLD   int    V1485 tmp1445    [+0]
               [038144] -----------                t38144 =    LCL_FLD   int    V1485 tmp1445    [+4]
                                                            /--*  t38143 int    
                                                            +--*  t38144 int    
               [038145] -c---------                t38145 = *  LONG      long  
                                                            /--*  t38145 long   
N026 (???,???) [016564] -ACXG+-----                t16564 = *  CAST      int <- long
                                                            /--*  t16564 int    
N027 (???,???) [016565] -ACXG+-----                t16565 = *  HWINTRINSIC simd32 32 int BroadcastScalarToVector256
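
The intended behavior can be sketched with a toy IR. This is not the actual JIT change: `Node`, `Oper`, `Type`, `BroadcastOperand`, and the `longIsDecomposed` flag are all hypothetical, and the rule that a cast from a non-decomposed source may be skipped is an assumption made only to show the contrast.

```cpp
#include <cassert>

// Toy IR for illustration only; these are not the real GenTree types.
enum class Oper { Call, Cast, CreateScalarUnsafe, Broadcast };
enum class Type { Int, Long };

struct Node {
    Oper  oper;
    Type  type; // result type of the node
    Node* op;   // single operand, or nullptr for a leaf
};

// Returns the node the broadcast would consume directly once the
// containment check has looked through CreateScalarUnsafe.
// longIsDecomposed models a target where a long is split into two int halves.
Node* BroadcastOperand(Node* broadcast, bool longIsDecomposed) {
    assert(broadcast->oper == Oper::Broadcast && broadcast->op != nullptr);
    Node* n = broadcast->op;

    // Look through the scalar-to-vector wrapper.
    if (n->oper == Oper::CreateScalarUnsafe) {
        assert(n->op != nullptr);
        n = n->op;
    }

    // Only skip the cast when its source is not a decomposed long;
    // otherwise the cast must stay so that it can contain the LONG.
    if (n->oper == Oper::Cast) {
        assert(n->op != nullptr);
        bool fromDecomposedLong = (n->op->type == Type::Long) && longIsDecomposed;
        if (!fromDecomposedLong) {
            n = n->op;
        }
    }
    return n;
}
```

With a `Broadcast <- CreateScalarUnsafe <- Cast(int <- long) <- Call` chain, this sketch returns the cast when `longIsDecomposed` is true, matching the shape of the correct IR above, and returns the cast's source otherwise.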


Member

@tannergooding tannergooding left a comment

CC. @dotnet/jit-contrib, @jakobbotsch for secondary review.

It would be nice if we could get a small regression test added for this, but much like @saucecontrol I wasn't able to get a minimal repro either. I was never able to get even the smaller sample given in the issue to reproduce locally, although it reproduces in the SPMI context.

@jakobbotsch jakobbotsch merged commit b04d40e into dotnet:main May 27, 2025
113 checks passed
@jakobbotsch
Member

Thanks for fixing this @saucecontrol. I am ok with not adding a test if it is unreasonably hard to come up with one. Fuzzlyn has also had issues reducing these test cases, although it would be interesting to understand why that is the case.

@saucecontrol saucecontrol deleted the fix-115202 branch May 27, 2025 14:49
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member
Linked issue
Assertion failed 'tree->IsUnusedValue()' in 'Program:M0()' during 'Linear scan register alloc'