
Conversation

@merykitty (Member) commented Nov 20, 2025

Hi,

This fixes the crash in Load/StoreVectorMaskedNode::Ideal. The issue is that the graph is not canonical during idealization, which leads to us processing a dead node. The fix I propose is to bail out when that happens.

To be more specific, for this issue we have a graph that looks like:

ConI -> ConvI2L -> CastLL(0..32) -> VectorMaskGen

with ConI being 45 and MaxVectorSize being 32. In this instance, CastLL is processed before ConvI2L, and when it is processed it sees the type of ConvI2L as its bottom type. As a result, it cannot determine that its own type is top, and since we are after macro expansion, which is after loop opts, the CastLL goes away, leaving us with:

ConI -> ConvI2L -> VectorMaskGen

After ConvI2L is processed, we know that the input of VectorMaskGen is the constant 45, which is larger than MaxVectorSize, leading to the assert failure.
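For intuition only, the ordering hazard can be simulated in plain Java (this is a toy model, not HotSpot code; every name below is invented):

```java
// Toy model of the bug report, NOT HotSpot code. A narrowing CastLL(0..32)
// normally marks an out-of-range input as a dead path ("top", modeled here
// as null). Once post-loop-opts widening lets the cast fold to identity,
// the raw constant 45 reaches the consumer and trips its precondition.
public class CastFoldDemo {
    static final long MAX_VECTOR_SIZE = 32;

    // Before widening: CastLL(0..32) maps out-of-range inputs to "top" (null).
    static Long castWithRange(long v) {
        return (v >= 0L && v <= 32L) ? Long.valueOf(v) : null;
    }

    // After widening the cast folds to identity: the raw value flows through.
    static long castFolded(long v) {
        return v;
    }

    // Stand-in for VectorMaskGen / Load-StoreVectorMasked's expectation.
    static String maskGen(Long len) {
        if (len == null) {
            return "dead path, bail out";
        }
        if (len > MAX_VECTOR_SIZE) {
            return "assert: Unexpected load/store size";
        }
        return "ok, len=" + len;
    }

    public static void main(String[] args) {
        long conI = 45L; // the ConI constant from the report
        System.out.println(maskGen(castWithRange(conI))); // dead path, bail out
        System.out.println(maskGen(castFolded(conI)));    // assert: Unexpected load/store size
    }
}
```

With the cast intact, the out-of-range constant is recognized as a dead path; once the cast folds to identity, the raw 45 reaches the consumer and trips its precondition, mirroring the assert above.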

Please take a look and leave your thoughts, thanks a lot.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8371964: C2 compilation asserts with "Unexpected load/store size" (Bug - P2)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/28410/head:pull/28410
$ git checkout pull/28410

Update a local copy of the PR:
$ git checkout pull/28410
$ git pull https://git.openjdk.org/jdk.git pull/28410/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 28410

View PR using the GUI difftool:
$ git pr show -t 28410

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/28410.diff

Using Webrev

Link to Webrev Comment

bridgekeeper bot commented Nov 20, 2025

👋 Welcome back qamai! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk bot commented Nov 20, 2025

@merykitty This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8371964: C2 compilation asserts with "Unexpected load/store size"

Reviewed-by: chagedorn, epeter

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 231 new commits pushed to the master branch.

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Nov 20, 2025
openjdk bot commented Nov 20, 2025

@merykitty The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the rfr Pull request is ready for review label Nov 20, 2025
mlbridge bot commented Nov 20, 2025

Webrevs

Comment on lines 1148 to 1152
assert(load_sz <= MaxVectorSize, "Unexpected load size");
if (load_sz > MaxVectorSize) {
  // Dead node, should go away
  return nullptr;
}
Member

Do I understand correctly that this widening/removal of the CastLL node is happening on an actual dead path that is going to be removed anyway?

It sounds like this problem is specific to post loop opts IGVN phases where we are allowed to widen CastII/LL nodes. Could we assert that this bailout only happens after post loop opts?

Apart from that, I think your fix is reasonable. Were you able to also extract a reproducer?

Member Author

Done. Running compiler/arraycopy/TestArrayCopyDisjoint.java with -XX:+UnlockDiagnosticVMOptions -XX:-TieredCompilation -XX:+StressArrayCopyMacroNode -XX:+StressLCM -XX:+StressGCM -XX:+StressIGVN -XX:+StressCCP -XX:+StressMacroExpansion -XX:+StressMethodHandleLinkerInlining -XX:+StressCompiledExceptionHandlers -XX:VerifyConstraintCasts=1 -XX:+StressLoopPeeling reproduces this issue. Do you think it is necessary to add a separate case for that test, then?

Member

Thanks for the update! If it's a short running test/config, then I think it would be good to have this extra config to cover the changes of this patch.

Contributor

Is it truly specific to the post-loop opts phase? Isn't it yet another paradoxical IR shape occurring in effectively dead code?

In the longer term, it would be good to ensure such effectively dead nodes eventually go away. Or, better, to eagerly trigger their elimination. Otherwise, it could cause issues later in the compilation process unless the problematic conditions are explicitly handled everywhere (e.g., during matching or code generation for vmask_gen_imm on x64 and AArch64).

Member Author

That's a good idea, so I changed the function to return top in those cases. For the VectorMaskGenNode itself, the situation seems harder: because it can float anywhere, GVN may common multiple different instances of it after loop opts and cast-node removal, so it may end up executed with an out-of-bounds input value. I think we just need to make sure that its uses can live with the fact that the result would then be unspecified.

eme64 (Contributor) commented Nov 25, 2025

Is this issue at all related to #24575?

It seems we remove a CastLL from the graph, because the input type is wider than the Cast's type, right?

If I remember correctly from #24575, if a CastLL is narrowing, we don't want to remove it, see ConstraintCastNode::Identity.

Can you elaborate a bit more on where the CastLL came from, and what it is supposed to do?

@merykitty (Member Author)

@eme64 Yes, it is indeed similar. The issue here is that after loop opts, we try to remove almost all Cast nodes so that the graph can be GVN-ed better (think of x = a + b and y = cast(a) + b).
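As a toy illustration of that GVN motivation (a made-up value-numbering table, not C2's actual hashing):

```java
import java.util.HashMap;
import java.util.Map;

// Toy value-numbering table, NOT C2 code; all keys are invented strings.
// With a cast in the way, "a + b" and "cast(a) + b" are distinct keys and
// stay as two nodes; once the cast is removed, both collapse to one entry.
public class GvnDemo {
    // Count distinct nodes after value-numbering the given expression keys.
    static int countNodes(String... keys) {
        Map<String, Integer> valueTable = new HashMap<>();
        for (String key : keys) {
            valueTable.putIfAbsent(key, valueTable.size());
        }
        return valueTable.size();
    }

    public static void main(String[] args) {
        // With the cast present, the two adds remain distinct:
        System.out.println(countNodes("Add(a, b)", "Add(CastLL(a), b)")); // 2
        // Once the cast is removed, the keys coincide and GVN commons them:
        System.out.println(countNodes("Add(a, b)", "Add(a, b)"));         // 1
    }
}
```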

Can you elaborate a bit more on where the CastLL came from, and what it is supposed to do?

Macro expansion tries to be smart for an array copy and does this:

byte[] dst;
byte[] src;
int len;
if (len <= 32) {
    int casted_len = cast(len, 0, 32);
    vectormask<byte, 32> mask = VectorMaskGen(casted_len);
    vector<byte, 32> v = LoadVectorMasked(src, 0, mask);
    StoreVectorMasked(dst, 0, v, mask);
} else {
    // do the copy normally;
}

As you can see, the masked accesses are only meaningful if len <= 32. But after loop opts, the cast is gone, leaving us with a len which happens to be larger than 32. The path should be dead, but IGVN reaches the LoadVectorMaskedNode first, which triggers the assert.
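For readers unfamiliar with masked accesses, the intended semantics of the fast path above can be sketched in plain Java (a toy model with invented names, not the actual node semantics):

```java
// Toy model of the macro-expanded masked byte copy, NOT HotSpot code.
public class MaskedCopyDemo {
    // VectorMaskGen(len): lane i is active iff i < len. Only meaningful for
    // 0 <= len <= 32, which is exactly what the CastLL was guaranteeing.
    static boolean[] maskGen(int len) {
        boolean[] mask = new boolean[32];
        for (int i = 0; i < 32; i++) {
            mask[i] = i < len;
        }
        return mask;
    }

    // LoadVectorMasked + StoreVectorMasked: copy only the active lanes.
    static void maskedCopy(byte[] src, byte[] dst, int len) {
        boolean[] mask = maskGen(len);
        for (int i = 0; i < 32; i++) {
            if (mask[i]) {
                dst[i] = src[i];
            }
        }
    }

    public static void main(String[] args) {
        byte[] src = new byte[32];
        for (int i = 0; i < 32; i++) {
            src[i] = (byte) i;
        }
        byte[] dst = new byte[32];
        maskedCopy(src, dst, 3); // copies only lanes 0..2
        System.out.println(dst[0] + "," + dst[1] + "," + dst[2] + "," + dst[3]); // 0,1,2,0
    }
}
```

The real VectorMaskGen only supports lengths up to MaxVectorSize (hence the assert), which is why for len = 45 the path must stay dead rather than be reached.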

eme64 (Contributor) commented Nov 26, 2025

@merykitty Thanks for the explanations!
So the CastLL is a narrowing cast, right? And ConstraintCastNode::Identity removes it, because the input type is wider, right? To me this part sounds incorrect. Narrowing casts should only be removed if the input is already narrower. No?

Any opinions from @rwestrel ?

@rwestrel (Contributor)

@merykitty Thanks for the explanations! So the CastLL is a narrowing cast, right? And ConstraintCastNode::Identity removes it, because the input type is wider, right? To me this part sounds incorrect. Narrowing casts should only be removed if the input is already narrower. No?

But the type of the CastLL is widened after loop opts, right?
So it's similar to #24575 but with a constant input to the cast. That's a case that #24575 doesn't address (it doesn't prevent constant folding of a cast) and can cause issues. See #24575 (comment).
I intend to create a follow-up to #24575 that will address the remaining issues in a way similar to what @merykitty proposes here.

eme64 (Contributor) commented Nov 26, 2025

@rwestrel Ok, thanks for the clarifying details. That makes sense. I missed the widening after loop opts: before it, the constant input lay outside the cast's range; after it, the constant is inside, so the CastLL is folded and replaced with the (wrong) constant rather than top.

eme64 (Contributor) commented Nov 26, 2025

@rwestrel Is there any conflict with your solution? If not, we can go ahead with @merykitty 's solution here.

@rwestrel (Contributor)

@rwestrel Is there any conflict with your solution? If not, we can go ahead with @merykitty 's solution here.

No, no conflict.

shipilev (Member) commented Dec 1, 2025

Are we moving forward with this? Still too many failures in local testing without this fix :)

@dholmes-ora (Member)

We either need this fix or a backout of whatever caused the problem. The fork is this week and this causes a lot of failures in testing.

chhagedorn (Member) left a comment

Otherwise, looks good to me.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 2, 2025
eme64 (Contributor) left a comment

Looks good to me, thanks for fixing this @merykitty !

eme64 (Contributor) commented Dec 2, 2025

@merykitty Hold off with integration for a few hours, @chhagedorn just launched some internal testing.

@merykitty (Member Author)

Thanks a lot for your reviews, please reapprove when the tests pass.

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Dec 2, 2025
chhagedorn (Member) left a comment

Testing passed!

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 2, 2025
shipilev (Member) commented Dec 2, 2025

Let's go then? I am eager to try and enable deeper CTW testing again :)

@merykitty (Member Author)

Thanks for the approval!
/integrate

openjdk bot commented Dec 2, 2025

Going to push as commit ca4ae80.
Since your change was applied there have been 243 commits pushed to the master branch.

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Dec 2, 2025
@openjdk openjdk bot closed this Dec 2, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Dec 2, 2025
openjdk bot commented Dec 2, 2025

@merykitty Pushed as commit ca4ae80.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

