8354282: C2: more crashes in compiled code because of dependency on removed range check CastIIs #24575

Closed
rwestrel wants to merge 19 commits into openjdk:master from rwestrel:JDK-8354282

Conversation

@rwestrel
Contributor

@rwestrel rwestrel commented Apr 10, 2025

This is a variant of 8332827. In 8332827, an array access becomes
dependent on a range check CastII for another array access. When,
after loop opts are over, that RC CastII was removed, the array
access could float and an out of bound access happened. With the fix
for 8332827, RC CastIIs are no longer removed.

With this one what happens is that some transformations applied after
loop opts are over widen the type of the RC CastII. As a result, the
type of the RC CastII is no longer narrower than that of its input,
the CastII is removed and the dependency is lost.

There are 2 transformations that cause this to happen:

  • after loop opts are over, the types of the CastII nodes are widened
    so that nodes that have the same inputs but slightly different types
    can common.

  • When pushing a CastII through an Add, if the types of both inputs
    of the Add are non-constant, then we end up widening the type
    (the resulting Add has a type that's wider than that of the
    initial CastII).
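
To illustrate how widening alone can make a cast look redundant, here is a minimal toy model in C++. It is not HotSpot code: `IntRange`, `strictly_narrower`, `cast_is_removable` and `widen` are hypothetical names, integer types are reduced to closed intervals, and the widening rule is an assumption chosen for the example.

```cpp
#include <cassert>
#include <climits>

// Hypothetical stand-in for a C2 integer type: the closed interval [lo, hi].
struct IntRange {
  int lo;
  int hi;
};

// True if `a` is contained in `b` and is not equal to it.
inline bool strictly_narrower(const IntRange& a, const IntRange& b) {
  assert(a.lo <= a.hi && b.lo <= b.hi);
  return a.lo >= b.lo && a.hi <= b.hi && (a.lo != b.lo || a.hi != b.hi);
}

// Toy removal rule: a cast whose type no longer narrows its input's type
// is considered redundant.
inline bool cast_is_removable(const IntRange& cast_type, const IntRange& input_type) {
  return !strictly_narrower(cast_type, input_type);
}

// Toy widening step (an assumption, not C2's exact rule): keep only the
// sign information of the lower bound and drop the upper bound.
inline IntRange widen(const IntRange& t) {
  return IntRange{t.lo >= 0 ? 0 : INT_MIN, INT_MAX};
}
```

With an input of type [0, INT_MAX], a cast to [0, 9] narrows and is kept. After `widen`, the cast has type [0, INT_MAX], no longer narrows, and the toy rule would remove it together with the dependency it carried.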

There are already 3 types of Cast nodes depending on the
optimizations that are allowed. Either the Cast is floating
(depends_only_on_test() returns true) or pinned. Either the Cast
can be removed if it no longer narrows the type of its input or
not. We already have variants of the CastII:

  • if the Cast can float and be removed when it doesn't narrow the type
    of its input.

  • if the Cast is pinned and can be removed when it doesn't narrow the type
    of its input.

  • if the Cast is pinned and can't be removed when it doesn't narrow
    the type of its input.

What we need here, I think, is the 4th combination:

  • if the Cast can float and can't be removed when it doesn't narrow
    the type of its input.
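
The four combinations can be pictured as two independent flags. The sketch below is a hypothetical model, not the HotSpot class: `CastDependency`, the `k...` constants and `describe` are names invented for this illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of a cast's dependency as two independent flags.
struct CastDependency {
  bool floating;   // true: depends only on its test; false: pinned at its control
  bool narrowing;  // true: may be removed once it no longer narrows its input's type
};

// The three existing variants described above ...
constexpr CastDependency kFloatingNarrowing{true, true};
constexpr CastDependency kPinnedNarrowing{false, true};
constexpr CastDependency kPinnedNonNarrowing{false, false};
// ... and the fourth combination that this change needs.
constexpr CastDependency kFloatingNonNarrowing{true, false};

// Human-readable summary of what optimizations the flags allow.
inline std::string describe(const CastDependency& d) {
  std::string s = d.floating ? "floating" : "pinned";
  s += d.narrowing ? ", removable when not narrowing"
                   : ", never removed for not narrowing";
  return s;
}
```

Keeping the two properties separate makes it clear that "can float" and "can be removed" are orthogonal questions, which is what the fourth combination relies on.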

Anyway, things are becoming confusing with all these different
variants named in ways that don't always help figure out what
constraints each of them operates under. So I refactored this and that's
the biggest part of this change. The fix consists in marking Cast
nodes, when their type is widened, in a way that prevents them from being
optimized out.
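
The idea of the fix can be sketched as follows. This is a toy model under stated assumptions, not the patch itself: `ToyCast`, `widen_type` and `can_be_removed` are hypothetical names, and types are plain integer intervals.

```cpp
#include <cassert>
#include <climits>

// Hypothetical toy cast: its own integer type [lo, hi] plus two flags.
struct ToyCast {
  int lo;
  int hi;
  bool floating;   // not pinned at its control input
  bool narrowing;  // removable once it no longer narrows its input's type
};

// Widen the cast's type and mark the cast so that the lost precision is not
// mistaken for redundancy. Whether the cast floats is left unchanged.
inline ToyCast widen_type(ToyCast c, int new_lo, int new_hi) {
  assert(new_lo <= c.lo && c.hi <= new_hi);  // widening never shrinks the type
  c.lo = new_lo;
  c.hi = new_hi;
  c.narrowing = false;  // the "mark": this cast must now survive
  return c;
}

// A cast may be dropped only if it is allowed to be and it no longer narrows
// an input of type [in_lo, in_hi].
inline bool can_be_removed(const ToyCast& c, int in_lo, int in_hi) {
  const bool narrows = c.lo > in_lo || c.hi < in_hi;
  return c.narrowing && !narrows;
}
```

In this model a floating, narrowing cast of type [0, 9] that is widened to [0, INT_MAX] stays floating but can no longer be removed, whereas an unmarked cast with the same wide type would be.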

Tobias ran performance testing with a slightly different version of
this change and there was no regression.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8354282: C2: more crashes in compiled code because of dependency on removed range check CastIIs (Bug - P3)(⚠️ The fixVersion in this issue is [26] but the fixVersion in .jcheck/conf is 27, a new backport will be created when this pr is integrated.)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/24575/head:pull/24575
$ git checkout pull/24575

Update a local copy of the PR:
$ git checkout pull/24575
$ git pull https://git.openjdk.org/jdk.git pull/24575/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 24575

View PR using the GUI difftool:
$ git pr show -t 24575

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/24575.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Apr 10, 2025

👋 Welcome back roland! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Apr 10, 2025

@rwestrel This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8354282: C2: more crashes in compiled code because of dependency on removed range check CastIIs

Reviewed-by: chagedorn, qamai, galder, epeter

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 173 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@rwestrel rwestrel marked this pull request as ready for review April 10, 2025 15:17
@openjdk

openjdk bot commented Apr 10, 2025

@rwestrel The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added hotspot-compiler hotspot-compiler-dev@openjdk.org rfr Pull request is ready for review labels Apr 10, 2025
@mlbridge

mlbridge bot commented Apr 10, 2025

@merykitty
Member

If a CastII that does not narrow its input has its type being a constant, do you think GVN should transform it into a constant, or such nodes should return the bottom type so that it is not folded into a floating ConNode?

Contributor

@eme64 eme64 left a comment

@rwestrel thanks for looking into this one!

I have not yet deeply studied the PR, but am feeling some confusion about the naming.

I think the DependencyType is really a good step into the right direction, it helps clean things up.

I'm wondering if we should pick either depends_only_on_test or pinned, and use it everywhere consistently. Having both around as near synonyms (antonyms?) is a bit confusing for me.

I'll look into the code more later.

Comment on lines +36 to +39
const ConstraintCastNode::DependencyType ConstraintCastNode::RegularDependency(true, true, "regular dependency"); // not pinned, narrows type
const ConstraintCastNode::DependencyType ConstraintCastNode::WidenTypeDependency(true, false, "widen type dependency"); // not pinned, doesn't narrow type
const ConstraintCastNode::DependencyType ConstraintCastNode::StrongDependency(false, true, "strong dependency"); // pinned, narrows type
const ConstraintCastNode::DependencyType ConstraintCastNode::UnconditionalDependency(false, false, "unconditional dependency"); // pinned, doesn't narrow type
Contributor

Is there really a good reason to have the names Regular, WidenType, Strong and Unconditional? Did we just get used to these names over time, or do they really have a good reason for existence? They just don't really mean that much to me. Calling them (non)pinned and (non)narrowing would make more sense to me.

Contributor Author

So NonPinnedNarrowingDependency, NonPinnedNonNarrowingDependency, PinnedNarrowingDependency and PinnedNonNarrowingDependency?

Or to avoid using a negation for the one that's the weakest dependency:

FloatingNarrowingDependency, FloatingNonNarrowingDependency, NonFloatingNarrowingDependency and NonFloatingNonNarrowingDependency ?

What do you think @eme64 ?

Contributor

Either of these sound great :)

Comment on lines +56 to +58
bool depends_only_on_test() const {
  return _depends_only_on_test;
}
Contributor

Is this synonymous with non_pinning? Might that be more descriptive?

Comment on lines +90 to +91
const bool _depends_only_on_test; // Does this Cast depends on its control input or is it pinned?
const bool _narrows_type; // Does this Cast narrows the type i.e. if input type is narrower can it be removed?
Contributor

I think it would be good to have a really strong definition of these two, because everything else depends on it.

I would recommend either using depends_only_on_test as the "primary" word here, or else pinned. But then try to consistently use the chosen one everywhere. Just to avoid confusion with these near synonyms.

It may also be helpful to have an example for each of the 4 combinations, just as an illustration of your definitions.

@rwestrel
Contributor Author

If a CastII that does not narrow its input has its type being a constant, do you think GVN should transform it into a constant, or such nodes should return the bottom type so that it is not folded into a floating ConNode?

The current patch constant folds the CastII in that case. I could write a test case where that's an issue (it causes an out of bound load to float above the range check it depends on). I'm working on an update to the patch to address this.

@rwestrel
Contributor Author

@eme64 thanks for the comments. As mentioned in another comment, I'm in the process of reworking the patch.

I'm wondering if we should pick either depends_only_on_test or pinned, and use it everywhere consistently. Having both around as near synonyms (antonyms?) is a bit confusing for me.

depends_only_on_test comes from Node::depends_only_on_test.

@TobiHartmann
Member

Just wondering, since we are getting closer to RDP 1 for JDK 25 (June 05, 2025), should we defer this to JDK 26?

@rwestrel
Contributor Author

Just wondering, since we are getting closer to RDP 1 for JDK 25 (June 05, 2025), should we defer this to JDK 26?

Deferring makes sense. This is a corner case anyway. I've been reworking the patch and it's getting more complicated so it will likely need more time for reviews.

@TobiHartmann
Member

Sounds good, I'll defer it to JDK 26 then. Thanks for the quick reply!

@openjdk

openjdk bot commented May 13, 2025

@rwestrel this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout JDK-8354282
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added the merge-conflict Pull request has merge conflict with target branch label May 13, 2025
@bridgekeeper

bridgekeeper bot commented Jun 10, 2025

@rwestrel This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a /touch or /keepalive command to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@bridgekeeper

bridgekeeper bot commented Jul 8, 2025

@rwestrel This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

@bridgekeeper bridgekeeper bot closed this Jul 8, 2025
@rwestrel
Contributor Author

rwestrel commented Jul 8, 2025

/keepalive

@openjdk

openjdk bot commented Jul 8, 2025

@rwestrel This command can only be used in open pull requests.

@rwestrel
Contributor Author

rwestrel commented Jul 8, 2025

/open

@openjdk openjdk bot reopened this Jul 8, 2025
@openjdk

openjdk bot commented Jul 8, 2025

@rwestrel This pull request is now open

@bridgekeeper

bridgekeeper bot commented Aug 5, 2025

@rwestrel This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a /touch or /keepalive command to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@bridgekeeper

bridgekeeper bot commented Sep 2, 2025

@rwestrel This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

@bridgekeeper bridgekeeper bot closed this Sep 2, 2025
Member

@chhagedorn chhagedorn left a comment

Thanks for the update, it looks good to me! If @eme64 also agrees with the latest patch, we can submit some testing and then hopefully get it in right before the fork.

Comment on lines +96 to +101
const DependencyType& with_pinned_dependency() const {
  if (_narrows_type) {
    return NonFloatingNarrowing;
  }
  return NonFloatingNonNarrowing;
}
Member

Yes, that's true. I was also unsure about whether we should stick with one or just allow both interchangeably. I guess since there are so many uses, we can just move forward with what you have now and still come back to clean it up if necessary - we can always do that.

// used when a floating node is sunk out of loop: we don't want the cast that forces the node to be out of loop to
// be removed in any case otherwise the sunk node floats back into the loop.
static const DependencyType NonFloatingNonNarrowing;

Member

Thanks for taking it over :-)

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 2, 2025
Contributor

@eme64 eme64 left a comment

@rwestrel Nice work! We not only fixed the bug but also made the concepts much clearer. This makes me very happy 😊


// All the possible combinations of floating/narrowing with example use cases:

// Use case example: Range Check CastII
Member

I believe this is incorrect, a range check should be floating non-narrowing. It is only narrowing if the length of the array is a constant. It is because this cast encodes the dependency on the condition index u< length. This condition cannot be expressed in terms of Type unless length is a constant.

Contributor Author

Range check CastIIs were added to protect the ConvI2L in the address expression on 64-bit platforms. The problem there was, in some cases, that the ConvI2L would float above the range check (because ConvI2L has no control input) and could end up with an out of range input (which in turn would cause the ConvI2L to become top in places where it wasn't expected).
So CastII doesn't carry the control dependency of an array access on its range check. That dependency is carried by the MemNode which has its control input set to the range check.
What you're saying, if I understand it correctly, would be true if the CastII was required to prevent an array Load from floating. But that's not the case.

Member

Got it, sorry I misunderstood!

Co-authored-by: Emanuel Peter <emanuel.peter@oracle.com>
@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Dec 5, 2025
@rwestrel
Contributor Author

rwestrel commented Dec 8, 2025

@merykitty @eme64 @chhagedorn thanks for the reviews
Does testing need to be run on this before I integrate?

@eme64
Contributor

eme64 commented Dec 8, 2025

@rwestrel I'll run some testing now ...

@eme64
Contributor

eme64 commented Dec 9, 2025

@chhagedorn I see that an internal IR test is failing - one that you added a while back. Could you have a look what may have gone wrong?

@chhagedorn
Member

chhagedorn commented Dec 9, 2025

I had a look and it seems that the internal test is relying on a CastII node to be removed after loop opts, when we widen CastII nodes, to trigger an ideal optimization. That is no longer the case with this patch because we keep the CastII node in the graph. The fix would be to improve the ideal optimization to look through cast nodes. However, this feels out of scope, especially since this PR is a bug fix for JDK 26.
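
The thread does not say which ideal optimization is involved, so the following is only a generic sketch of what "looking through cast nodes" means. `ToyNode`, `look_through_casts` and `same_value_ignoring_casts` are hypothetical names for a toy IR, not HotSpot classes or functions.

```cpp
#include <cassert>

// Hypothetical, minimal IR node: either a cast wrapping one input,
// or any other kind of node.
struct ToyNode {
  bool is_cast;
  const ToyNode* input;  // the casted value when is_cast is true, else unused
};

// Skip any chain of casts and return the underlying value, so that a pattern
// match is not defeated by a cast that stays in the graph.
inline const ToyNode* look_through_casts(const ToyNode* n) {
  while (n != nullptr && n->is_cast) {
    n = n->input;
  }
  return n;
}

// Example pattern: "do both operands denote the same value?", ignoring casts.
inline bool same_value_ignoring_casts(const ToyNode* a, const ToyNode* b) {
  return look_through_casts(a) == look_through_casts(b);
}
```

A pattern written this way keeps matching whether or not a cast sits between the optimization and the value it inspects, which is the property the internal test was implicitly relying on cast removal to provide.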

I therefore propose to fix the internal test before integrating this PR and then follow up with an RFE to fix the ideal optimization. I can take care of this and let you know once this is done.

@rwestrel
Contributor Author

rwestrel commented Dec 9, 2025

I therefore propose to fix the internal test before integrating this PR and then follow up with an RFE to fix the ideal optimization. I can take care of this and let you know once this is done.

That sounds good to me. Should I take care of the ideal transformation? Let me know when the internal test is fixed so I can proceed with the integration.

@chhagedorn
Member

Thanks Roland! I'll let you know and file a follow-up RFE and assign it to you. I will dump all the relevant information in there with a test case.

@chhagedorn
Member

The internal test is fixed and sanity testing passed - you can move forward with integrating this PR :-)

@rwestrel
Contributor Author

@chhagedorn @eme64 @merykitty thanks for the reviews and testing

@rwestrel
Contributor Author

/integrate

@openjdk

openjdk bot commented Dec 10, 2025

@rwestrel This pull request has not yet been marked as ready for integration.

@rwestrel
Contributor Author

/integrate

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 10, 2025
@openjdk

openjdk bot commented Dec 10, 2025

Going to push as commit 00068a8.
Since your change was applied there have been 173 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Dec 10, 2025
@openjdk openjdk bot closed this Dec 10, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Dec 10, 2025
@openjdk

openjdk bot commented Dec 10, 2025

@rwestrel Pushed as commit 00068a8.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@rwestrel
Contributor Author

/backport :jdk26

@openjdk

openjdk bot commented Dec 18, 2025

@rwestrel the backport was successfully created on the branch backport-rwestrel-00068a80-jdk26 in my personal fork of openjdk/jdk. To create a pull request with this backport targeting openjdk/jdk:jdk26, just click the following link:

➡️ Create pull request

The title of the pull request is automatically filled in correctly and below you find a suggestion for the pull request body:

Hi all,

This pull request contains a backport of commit 00068a80 from the openjdk/jdk repository.

The commit being backported was authored by Roland Westrelin on 10 Dec 2025 and was reviewed by Christian Hagedorn, Quan Anh Mai, Galder Zamarreño and Emanuel Peter.

Thanks!

If you need to update the source branch of the pull then run the following commands in a local clone of your personal fork of openjdk/jdk:

$ git fetch https://github.com/openjdk-bots/jdk.git backport-rwestrel-00068a80-jdk26:backport-rwestrel-00068a80-jdk26
$ git checkout backport-rwestrel-00068a80-jdk26
# make changes
$ git add paths/to/changed/files
$ git commit --message 'Describe additional changes made'
$ git push https://github.com/openjdk-bots/jdk.git backport-rwestrel-00068a80-jdk26

Labels

graal graal-dev@openjdk.org hotspot hotspot-dev@openjdk.org hotspot-compiler hotspot-compiler-dev@openjdk.org integrated Pull request has been integrated shenandoah shenandoah-dev@openjdk.org

6 participants