
8342775: [Graal] java/util/concurrent/locks/Lock/OOMEInAQS.java fails OOME thrown from the UncaughtExceptionHandler #21745


Closed
wants to merge 5 commits into from

Conversation

tkrodriguez
Contributor

@tkrodriguez tkrodriguez commented Oct 28, 2024

Deoptimization with escape analysis can fail when trying to rematerialize objects, as described in JDK-8227309. In this test, that can happen in Xcomp mode inside the test's harness code, resulting in a test failure. Making the number of threads non-final avoids scalar replacement and thus the OOM during deopt.
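As a rough illustration of the idea in the description, the sketch below contrasts an array whose length is a compile-time constant with one whose length is read from a non-final static field. It is not the actual OOMEInAQS.java source: the class `HarnessSketch` and its method names are hypothetical, and whether a particular JIT scalar-replaces either array depends on the compiler and its inlining decisions. The sketch only shows the shape of the two allocations.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not the real test source.
class HarnessSketch {
    // Compile-time constant: the allocation below is effectively `new Thread[2]`,
    // a fixed-length array that escape analysis may try to scalar-replace.
    static final int CONSTANT_NTHREADS = 2;

    // Non-final: the length is loaded from the field at run time, so the
    // compiler cannot treat the array length as a constant.
    static int nthreads = 2;

    // Starts one thread per slot, joins them all, and reports how many ran.
    static int runAll(Thread[] threads) {
        AtomicInteger ran = new AtomicInteger();
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(ran::incrementAndGet);
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ran.get();
    }

    static int withConstantLength() {
        return runAll(new Thread[CONSTANT_NTHREADS]);
    }

    static int withFieldLength() {
        return runAll(new Thread[nthreads]);
    }

    public static void main(String[] args) {
        System.out.println(withConstantLength() + " " + withFieldLength());
    }
}
```

Both variants behave identically from the program's point of view; the difference is only in what the JIT can assume about the array at compile time.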


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8342775: [Graal] java/util/concurrent/locks/Lock/OOMEInAQS.java fails OOME thrown from the UncaughtExceptionHandler (Bug - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/21745/head:pull/21745
$ git checkout pull/21745

Update a local copy of the PR:
$ git checkout pull/21745
$ git pull https://git.openjdk.org/jdk.git pull/21745/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 21745

View PR using the GUI difftool:
$ git pr show -t 21745

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/21745.diff

Using Webrev

Link to Webrev Comment

… OOME thrown from the UncaughtExceptionHandler
@bridgekeeper

bridgekeeper bot commented Oct 28, 2024

👋 Welcome back never! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Oct 28, 2024

@tkrodriguez This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8342775: [Graal] java/util/concurrent/locks/Lock/OOMEInAQS.java fails OOME  thrown from the UncaughtExceptionHandler

Reviewed-by: jpai, dholmes

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 2 new commits pushed to the master branch:

  • 2cce5ee: 8349142: [JMH] compiler.MergeLoadBench.getCharBV fails
  • 305bbda: 8348402: PerfDataManager stalls shutdown for 1ms

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot changed the title 8342775: [Graal] java/util/concurrent/locks/Lock/OOMEInAQS.java fails OOME thrown from the UncaughtExceptionHandler 8342775: [Graal] java/util/concurrent/locks/Lock/OOMEInAQS.java fails OOME thrown from the UncaughtExceptionHandler Oct 28, 2024
@openjdk openjdk bot added the rfr Pull request is ready for review label Oct 28, 2024
@openjdk

openjdk bot commented Oct 28, 2024

@tkrodriguez The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the core-libs core-libs-dev@openjdk.org label Oct 28, 2024
@mlbridge

mlbridge bot commented Oct 28, 2024

Webrevs

- static final int NTHREADS = 2; // intentionally not a scalable test; > 2 is very slow
+ // Intentionally non-final to avoid EA of the threads array in main which can cause this test to
+ // fail in Xcomp mode.
+ static int NTHREADS = 2; // intentionally not a scalable test; > 2 is very slow
Member

Hello Tom, I don't have the necessary knowledge of runtime compilers, so consider these drive-by questions rather than a review.

On its own, the static final construct appears to be the correct one for this field. Removing the final to address an escape analysis implementation detail appears odd.
Do you know if the failure happens only in -Xcomp mode? Looking at JDK-8342775, it wasn't clear to me that this was the case. If it's happening only in -Xcomp mode, perhaps due to additional work being done by the compiler threads (?) and the fact that this test intentionally runs with a very low -Xmx, maybe we should just skip the test in -Xcomp mode instead of changing the field declaration? We have several such tests which we skip in -Xcomp mode by using:

@requires (vm.compMode != "Xcomp")
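For context, a jtreg `@requires` tag lives in the comment header at the top of the test file. The following is a minimal sketch of such a header, assuming a hypothetical test class `SkipInXcompSketch`; the `-Xmx16m` value is an illustrative placeholder and is not taken from the actual test.

```java
/*
 * @test
 * @summary Hypothetical example of a test that is skipped in -Xcomp mode
 * @requires (vm.compMode != "Xcomp")
 * @run main/othervm -Xmx16m SkipInXcompSketch
 */
class SkipInXcompSketch {
    // Reports the heap limit the JVM was started with, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        // With a low -Xmx, the body of a real OOME test would exhaust the heap here.
        System.out.println("max heap (MiB): " + maxHeapMiB());
    }
}
```

When the expression in `@requires` evaluates to false, jtreg does not run the test at all, so no coverage is obtained in that configuration.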

Contributor Author

I think it's better to actually execute the test, which is trying to ensure that OOM is handled by library code in all configurations. Using final is just a stylistic choice in the harness, so removing it to allow the test to run in more configurations seems better than avoiding those configurations.

@bridgekeeper

bridgekeeper bot commented Nov 30, 2024

@tkrodriguez This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@bridgekeeper

bridgekeeper bot commented Jan 20, 2025

@tkrodriguez This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

@bridgekeeper bridgekeeper bot closed this Jan 20, 2025
@tkrodriguez
Contributor Author

/open

@openjdk openjdk bot reopened this Jan 21, 2025
@openjdk

openjdk bot commented Jan 21, 2025

@tkrodriguez This pull request is now open

@tkrodriguez
Contributor Author

@AlanBateman Could you take a look at this one?

@viktorklang-ora
Contributor

@tkrodriguez I'm a bit hesitant about the proposed change; it sounds to me like it relies too much on implementation details. If EA decides to treat NTHREADS as effectively final (at some point), then this test will start failing again under Xcomp. What are the alternatives here?

@tkrodriguez
Contributor Author

tkrodriguez commented Jan 21, 2025

I'm not sure there are a lot of great alternatives. There's no easy global solution to the problem of EA being unable to materialize values during deopt, so these kinds of problems crop up in tests, particularly with Xcomp. So the choices boil down to either not running it under Xcomp with Graal or working around it. Something about the structure of this test reliably causes an unreached deopt in this code path, and that seems to have been caused by some relatively recent random JDK change, as this had been passing previously.

Modifying the test in this way seems like a pragmatic way of dealing with it. The main method is effectively the harness, so the change has no effect on the validity of the test itself. We can always revisit this if it crops up again. But I'm happy to make whatever change you'd prefer.

@dholmes-ora
Member

Changing the test this way does not sit well with me either. This adjustment to the test is far too intricately entwined with the implementation details of a particular JIT. I'm much more inclined to exclude Xcomp mode for OOME tests; or perhaps just disable EA (if Graal supports that) so we get most Xcomp coverage whilst side-stepping the EA issue.

@tkrodriguez
Contributor Author

I've disabled the test for Graal with Xcomp. Is that an acceptable solution? Testing indicates that it properly stops it for Graal without affecting other configurations.
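A condition of this kind could be written roughly as in the header below. This is a sketch only: the exact `@requires` expression used in the integrated commit is not shown in this conversation, and the class `GraalXcompSkipSketch` and its `shouldRun` helper are hypothetical. It assumes the jtreg properties `vm.compMode` and `vm.graal.enabled`. The helper spells out the same truth table in plain Java so the intended behavior can be checked.

```java
/*
 * @test
 * @summary Hypothetical header: skip only when Graal is the JIT and -Xcomp is in effect
 * @requires (vm.compMode != "Xcomp" | !vm.graal.enabled)
 */
class GraalXcompSkipSketch {
    // Mirrors the @requires expression above: run unless both conditions hold.
    static boolean shouldRun(String compMode, boolean graalEnabled) {
        return !"Xcomp".equals(compMode) || !graalEnabled;
    }

    public static void main(String[] args) {
        for (String mode : new String[] {"Xint", "Xmixed", "Xcomp"}) {
            for (boolean graal : new boolean[] {false, true}) {
                System.out.println(mode + ", graal=" + graal + " -> run=" + shouldRun(mode, graal));
            }
        }
    }
}
```

Only the combination of Graal and Xcomp is excluded; every other configuration, including Xcomp with a different JIT, still runs the test.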

@AlanBateman
Contributor

I've disabled the test for Graal with Xcomp. Is that an acceptable solution?

Yes, this should be okay.

@jaikiran
Member

jaikiran commented Feb 2, 2025

Hello Tom, the change to the OOMEInAQS.java test to skip it when running the combination of Graal and -Xcomp looks fine to me. I see that OOMEInStampedLock.java has been updated to do the same. The original bug report doesn't mention this test. Is the change to OOMEInStampedLock.java done as a precaution against the same issue?

@tkrodriguez
Contributor Author

At the time the bug was filed, OOMEInStampedLock.java didn't exist, but it's a direct clone of OOMEInAQS.java and has the exact same problem in our testing.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Feb 3, 2025
Member

@dholmes-ora dholmes-ora left a comment

Thanks for the fix.

@tkrodriguez
Contributor Author

Thanks!

/integrate

@openjdk

openjdk bot commented Feb 3, 2025

Going to push as commit bb837d2.
Since your change was applied there have been 7 commits pushed to the master branch:

  • a57c9b1: 8349184: [JMH] jdk.incubator.vector.ColumnFilterBenchmark.filterDoubleColumn fails on linux-aarch64
  • d330421: 8337548: Parallel class loading can pass is_superclass true for interfaces
  • 3f1d9b5: 8348575: SpinLockT is typedef'ed but unused
  • 6f4fc82: 8348675: TrayIcon tests fail in Ubuntu 24.10 Wayland
  • 9aa6d09: 8326485: Assertion due to Type.addMetadata adding annotations to already-annotated type
  • 2cce5ee: 8349142: [JMH] compiler.MergeLoadBench.getCharBV fails
  • 305bbda: 8348402: PerfDataManager stalls shutdown for 1ms

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Feb 3, 2025
@openjdk openjdk bot closed this Feb 3, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Feb 3, 2025
@openjdk

openjdk bot commented Feb 3, 2025

@tkrodriguez Pushed as commit bb837d2.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
core-libs core-libs-dev@openjdk.org integrated Pull request has been integrated

5 participants