@dafedafe
Contributor

@dafedafe dafedafe commented Jul 30, 2025

Issue

While compiling java.util.zip.ZipFile with C2, this assert is triggered:

assert(Opcode() == Op_Initialize, "Only seen when there are no use of init memory");

Cause

While compiling the constructor of java.util.zip.ZipFile$CleanableResource the following happens:

  • we insert a trailing MemBarStoreStore in the constructor
    (image: before_folding)
  • during IGVN we completely fold the memory subtree of the MemBarStoreStore node. The node still has a control output attached.
    (image: after_folding)
  • later during the same IGVN run the MemBarStoreStore node is handled and we try to remove it (because the Allocate node of the MemBar is not escaping the thread)
    if ((alloc != nullptr) && alloc->is_Allocate() &&
    alloc->as_Allocate()->does_not_escape_thread()) {
  • the assert
    assert(Opcode() == Op_Initialize, "Only seen when there are no use of init memory");

    triggers because the barrier has only 1 (control) output and is a MemBarStoreStore (not Initialize) barrier

The issue happens only when UseStoreStoreForCtor is set (which is also the default), which makes C2 use MemBarStoreStore instead of MemBarRelease at the end of constructors. MemBarStoreStore nodes are processed separately by EA, and this happens after the IGVN pass that folds the memory subtree. MemBarRelease nodes, on the other hand, are handled during the same IGVN pass, before the memory subtree gets removed, while the node still has 2 outputs (so the assert is skipped).
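
To make the failing condition concrete, here is a minimal standalone sketch of the check described above. It is a toy model, not HotSpot code: `ToyMemBar`, the `Opcode` enum and `original_assert_holds` are hypothetical names introduced only for illustration, and the real node classes carry far more state.

```cpp
#include <cassert>

// Toy stand-ins for the C2 opcodes mentioned above; not the real HotSpot enum.
enum class Opcode { Initialize, MemBarRelease, MemBarStoreStore };

// Hypothetical model of a barrier: just its opcode and number of out edges.
struct ToyMemBar {
  Opcode opcode;
  int outcnt;  // 2 = control + memory projections, 1 = control only
};

// Mirrors the shape of the original check: a barrier that does not have
// exactly 2 outputs is only expected to be an Initialize barrier.
bool original_assert_holds(const ToyMemBar& mb) {
  if (mb.outcnt != 2) {
    return mb.opcode == Opcode::Initialize;
  }
  return true;
}
```

In this model, a MemBarRelease that still has both outputs passes the check, while a MemBarStoreStore whose memory subtree was already folded (one output left) fails it, which matches the crash described above.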

Fix

Adapting the assert to accept that MemBarStoreStore can also have != 2 outputs (when +UseStoreStoreForCtor is used) seems to be an OK solution as this seems like a perfectly plausible situation.

Testing

Unfortunately, reproducing the issue with a simple regression test has proven very hard: the failure seems to rely on very peculiar profiling and a specific IGVN worklist sequence. The JBS replay compilation passes. Running JCK's api/java_util 100 times triggers the assert a couple of times on average before the fix and never after.
Tier 1-3+ tests passed.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (2 reviews required, with at least 1 Reviewer, 1 Author)

Issue

  • JDK-8360031: C2 compilation asserts in MemBarNode::remove (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/26556/head:pull/26556
$ git checkout pull/26556

Update a local copy of the PR:
$ git checkout pull/26556
$ git pull https://git.openjdk.org/jdk.git pull/26556/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 26556

View PR using the GUI difftool:
$ git pr show -t 26556

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/26556.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Jul 30, 2025

👋 Welcome back dfenacci! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Jul 30, 2025

@dafedafe This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8360031: C2 compilation asserts in MemBarNode::remove

Reviewed-by: dlong, kvn, shade

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 869 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk

openjdk bot commented Jul 30, 2025

@dafedafe The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Jul 30, 2025
@dafedafe dafedafe changed the title from "JDK-8360031: compilation asserts in MemBarNode::remove" to "JDK-8360031: Compilation asserts in MemBarNode::remove" Jul 31, 2025
@dafedafe dafedafe changed the title from "JDK-8360031: Compilation asserts in MemBarNode::remove" to "JDK-8360031: C2 compilation asserts in MemBarNode::remove" Jul 31, 2025
@openjdk openjdk bot changed the title from "JDK-8360031: C2 compilation asserts in MemBarNode::remove" to "8360031: C2 compilation asserts in MemBarNode::remove" Jul 31, 2025
@dafedafe dafedafe marked this pull request as ready for review August 13, 2025 14:12
@dafedafe
Contributor Author

@shipilev you might want to have a look. Thanks!

@openjdk openjdk bot added the rfr Pull request is ready for review label Aug 13, 2025
@mlbridge

mlbridge bot commented Aug 13, 2025

Webrevs

Member

@shipilev shipilev left a comment

This looks reasonable to me. So it looks to be an overly zealous assert rather than a compiler bug? Someone more savvy with C2 code needs to look and confirm.

/reviewers 2

@shipilev
Member

Oh, maybe pull from the recent master to get GHA fixes, and other fixes?

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Aug 14, 2025
@openjdk

openjdk bot commented Aug 14, 2025

@shipilev
The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 1 Reviewer, 1 Author).

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Aug 14, 2025
@anton-seoane
Contributor

Hi! I wanted to mention that this might be related to JDK-8330062, which I'm looking into at the moment.

@dean-long
Member

This looks OK on the surface, but isn't handling MemBarStoreStore and MemBarRelease differently asking for trouble? Is there a reason why they need to be handled in different passes?

@dafedafe
Contributor Author

dafedafe commented Sep 1, 2025

This looks OK on the surface, but isn't handling MemBarStoreStore and MemBarRelease differently asking for trouble? Is there a reason why they need to be handled in different passes?

I'm not sure of the reason why EA handles MemBarStoreStore separately. Maybe @vnkozlov can shed some light...

BTW the original assert with condition Opcode() == Op_Initialize seems to have been added because that was the case of the JDK-8269771 bug (PR). I'm not sure that there couldn't be any other additional case (apart from the current two) that makes the membar node have only one out edge.

@dean-long
Member

I stepped through the crash with the replay file, and I'm not convinced that the problem is only with MemBarStoreStore and not MemBarRelease. What happens in the replay crash is the MemBarStoreStore gets onto the worklist through an indirect route in ConnectionGraph::split_unique_types() because of its memory edge. I think this explains why it is intermittent and hard to reproduce. A MemBarRelease on the other hand would get added to the worklist directly in compute_escape() if it has a Precedent edge.
The different handling of MemBarStoreStore vs MemBarRelease in this code is confusing. The MemBarRelease code came from JDK-6934604. It adds the node to the worklist, and lets MemBarNode::Ideal remove it based on does_not_escape_thread() on the alloc node. Contrast that with the MemBarStoreStore handling, which came from JDK-7121140, and instead of removing the node, it replaces it with a MemBarCPUOrder based on not_global_escape() on the alloc node. This MemBarStoreStore handling is for "MemBarStoreStore nodes added in library_call.cpp" and seems to fail to work for MemBarStoreStore nodes added in the ctor, which means MemBarStoreStore nodes added in the ctor only get on the worklist by accident, as mentioned above.
I think the conservative fix is to have compute_escape() always add the MemBarStoreStore to the worklist if it has a Precedent edge. Because of StressIGVN randomizing the worklist, I think the outcnt() can be 1 for either MemBarStoreStore or MemBarRelease, so we should relax the assert accordingly. I'm not sure how useful the assert will be after that. It might be better to remove it.
Longer-term, it might be nice to get rid of the separate handling of "MemBarStoreStore nodes added in library_call.cpp" if the MemBarCPUOrder is not really needed.
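
The ordering argument above can be illustrated with a small standalone toy model. This is only a sketch under simplifying assumptions: `ToyBarrier`, `fold_memory_subtree` and `outcnt_seen_at_removal` are hypothetical names, and the real IGVN worklist involves many more nodes and transformations.

```cpp
#include <cassert>

// Hypothetical model: a barrier starts with two out edges (control + memory).
struct ToyBarrier {
  int outcnt = 2;
};

// Folding the memory subtree removes the memory use, leaving only control.
void fold_memory_subtree(ToyBarrier& mb) {
  if (mb.outcnt == 2) {
    mb.outcnt = 1;
  }
}

// Returns the out-edge count the removal step would observe, depending on
// whether the barrier is taken off the worklist before or after the fold.
int outcnt_seen_at_removal(bool barrier_processed_first) {
  ToyBarrier mb;
  if (!barrier_processed_first) {
    fold_memory_subtree(mb);
  }
  return mb.outcnt;
}
```

Nothing in this model depends on the barrier kind, which is the point made above: once the worklist order can vary, either kind of barrier may be seen with a single out edge at removal time.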

@dafedafe
Contributor Author

dafedafe commented Sep 8, 2025

What happens in the replay crash is the MemBarStoreStore gets onto the worklist through an indirect route in ConnectionGraph::split_unique_types() because of its memory edge.

Oh I see! Thanks @dean-long! I noticed that MemBarStoreStore was added later on but didn't really figure out where/why.

I think the conservative fix is to have compute_escape() always add the MemBarStoreStore to the worklist if it has a Precedent edge. Because of StressIGVN randomizing the worklist, I think the outcnt() can be 1 for either MemBarStoreStore or MemBarRelease, so we should relax the assert accordingly. I'm not sure how useful the assert will be after that. It might be better to remove it.

I made compute_escape() add MemBarStoreStore to the worklist. By doing so the assert doesn't trigger anymore with the reproducer but, as you wrote, there seems to be no reason why outcnt() couldn't be 1 for MemBarStoreStore or MemBarRelease. So I modified the assert to leave only the outcnt() <= 2 part.

}

void MemBarNode::remove(PhaseIterGVN *igvn) {
  if (outcnt() != 2) {
Member

By itself, this allows outcnt() == 0, so maybe we need to continue to fail if that happens.

Contributor Author

I added the condition to the assert.
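
As a rough sketch of the relaxed condition discussed in this exchange (no more than two out edges, and not zero), the predicate could look like the following. `relaxed_assert_holds` is a hypothetical helper written for illustration; the exact expression in the integrated change may differ.

```cpp
#include <cassert>

// Hypothetical predicate modelling the relaxed check discussed in this thread:
// at least one out edge and no more than two, regardless of barrier kind.
bool relaxed_assert_holds(int outcnt) {
  return outcnt > 0 && outcnt <= 2;
}
```

Under this predicate a barrier with only its control output (1) or with control and memory outputs (2) is accepted, while a barrier with no outputs at all is still rejected.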

@dafedafe
Contributor Author

dafedafe commented Sep 9, 2025

The fix made the ConstructorBarrier.java JTREG test fail because the argument of the consume method wasn't actually escaping (and IGVN was removing the MemBar). So I added an assignment to a volatile field to make it escape.

Member

@dean-long dean-long left a comment

LGTM, but let's wait for @vnkozlov to approve it.

@bridgekeeper

bridgekeeper bot commented Oct 8, 2025

@dafedafe This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a /touch or /keepalive command to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@dafedafe
Contributor Author

dafedafe commented Oct 8, 2025

LGTM, but let's wait for @vnkozlov to approve it.

@vnkozlov, would you mind having a look when you get a chance? Thanks!

Contributor

@vnkozlov vnkozlov left a comment

Yes, looks good. Thank you @dean-long for the suggestions and @dafedafe for the implementation. I agree with this.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Oct 8, 2025
@dafedafe
Contributor Author

dafedafe commented Oct 9, 2025

Thanks a lot for your reviews @dean-long @shipilev @vnkozlov!

@dafedafe
Contributor Author

dafedafe commented Oct 9, 2025

/integrate

@openjdk

openjdk bot commented Oct 9, 2025

Going to push as commit 991f8e6.
Since your change was applied there have been 879 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Oct 9, 2025
@openjdk openjdk bot closed this Oct 9, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Oct 9, 2025
@openjdk

openjdk bot commented Oct 9, 2025

@dafedafe Pushed as commit 991f8e6.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

