8267293: vmTestbase/vm/mlvm/anonloader/stress/oome/metaspace/Test.java fails when JTREG_JOBS > 25 #4076

Closed
DamonFool wants to merge 2 commits

DamonFool
Member

@DamonFool DamonFool commented May 17, 2021

Hi all,

vmTestbase/vm/mlvm/anonloader/stress/oome/metaspace/Test.java fails on our many-core machines because it ends up running with -XX:MaxRAMPercentage=0.
The value passed to -XX:MaxRAMPercentage becomes 0 whenever JTREG_JOBS > 25 [1].

The bug can also be reproduced on almost any machine with: make test TEST="vmTestbase/vm/mlvm/anonloader/stress/oome/metaspace/Test.java" JTREG="JOBS=26"
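
To see why the threshold sits at 25, here is a minimal sketch. It assumes the percentage is derived by integer division of 25 by the job count, which is consistent with the behavior described above but is not copied from RunTests.gmk. The class and method names are hypothetical.

```java
// Hypothetical illustration; not code from the JDK build system.
public class MaxRamPercentageSketch {
    // Assumption: the per-JVM percentage is 25 divided by the number of
    // concurrent jtreg jobs, using integer division (the fraction is dropped).
    public static int maxRamPercentage(int jobs) {
        return 25 / jobs;
    }

    public static void main(String[] args) {
        // Print the derived flag for a few job counts around the threshold.
        for (int jobs : new int[] {1, 8, 25, 26, 64}) {
            System.out.println("JOBS=" + jobs
                    + " -> -XX:MaxRAMPercentage=" + maxRamPercentage(jobs));
        }
    }
}
```

Under that assumption, 25 jobs still yields 1, while any job count above 25 truncates to 0.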

This fix makes the test more robust, as suggested by @shipilev [2]; many thanks to him.

Thanks.
Best regards,
Jie

[1] https://github.com/openjdk/jdk/blob/master/make/RunTests.gmk#L741
[2] #4062 (review)


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8267293: vmTestbase/vm/mlvm/anonloader/stress/oome/metaspace/Test.java fails when JTREG_JOBS > 25

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/4076/head:pull/4076
$ git checkout pull/4076

Update a local copy of the PR:
$ git checkout pull/4076
$ git pull https://git.openjdk.java.net/jdk pull/4076/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 4076

View PR using the GUI difftool:
$ git pr show -t 4076

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/4076.diff

@bridgekeeper

bridgekeeper bot commented May 17, 2021

👋 Welcome back jiefu! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label May 17, 2021
@openjdk

openjdk bot commented May 17, 2021

@DamonFool The following label will be automatically applied to this pull request:

  • hotspot-runtime

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-runtime hotspot-runtime-dev@openjdk.org label May 17, 2021
@mlbridge

mlbridge bot commented May 17, 2021

Webrevs

Member

@tstuefe tstuefe left a comment

Hi Jie,

You can safely get by with less. A very safe bet for all platforms would be:

  • 256m heap
  • 8m MaxMetaspaceSize

The latter could probably be reduced further (the smaller the metaspace, the faster the test reaches a conclusion), down to 6m or 4m. On my machine 4m works for both 64-bit and 32-bit. But 8m is probably safe on all platforms.

Cheers, Thomas

@shipilev
Member

Yeah, this works on my machines:

diff --git a/test/hotspot/jtreg/vmTestbase/vm/mlvm/anonloader/stress/oome/metaspace/Test.java b/test/hotspot/jtreg/vmTestbase/vm/mlvm/anonloader/stress/oome/metaspace/Test.java
index 0d5f1a1626f..980c9303f04 100644
--- a/test/hotspot/jtreg/vmTestbase/vm/mlvm/anonloader/stress/oome/metaspace/Test.java
+++ b/test/hotspot/jtreg/vmTestbase/vm/mlvm/anonloader/stress/oome/metaspace/Test.java
@@ -36,7 +36,7 @@
  * @build vm.mlvm.anonloader.stress.oome.metaspace.Test
  * @run driver vm.mlvm.share.IndifiedClassesBuilder
  *
- * @run main/othervm -XX:-UseGCOverheadLimit -XX:MetaspaceSize=10m -XX:MaxMetaspaceSize=20m vm.mlvm.anonloader.stress.oome.metaspace.Test
+ * @run main/othervm -Xmx256m -XX:-UseGCOverheadLimit -XX:MaxMetaspaceSize=8m vm.mlvm.anonloader.stress.oome.metaspace.Test
  */
 
 package vm.mlvm.anonloader.stress.oome.metaspace;
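
The flags in the diff above cap the heap with -Xmx256m and the metaspace with -XX:MaxMetaspaceSize=8m. As a minimal sketch for checking which limits a JVM actually ended up with, the following can be run with the same flags. It assumes a HotSpot JVM, where the metaspace memory pool is reported under the name "Metaspace"; the class name is hypothetical and this is not part of the test itself.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper; not part of the vmTestbase test.
public class EffectiveLimitsSketch {
    // Returns the maximum size in bytes of the pool named "Metaspace"
    // (an assumption that holds on HotSpot), or -1 if the pool is missing
    // or has no defined maximum.
    public static long metaspaceMaxBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                MemoryUsage usage = pool.getUsage();
                return usage == null ? -1L : usage.getMax();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        // maxMemory() reflects the heap ceiling the JVM settled on.
        System.out.println("max heap bytes:      " + Runtime.getRuntime().maxMemory());
        System.out.println("max metaspace bytes: " + metaspaceMaxBytes());
    }
}
```

Running it once with explicit -Xmx and -XX:MaxMetaspaceSize values and once with only -XX:MaxRAMPercentage=0 should make the difference in the resulting heap ceiling visible.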

@DamonFool
Member Author

Thanks @tstuefe and @shipilev.

The test time has been reduced from 41.8s to 34.8s,
and the test passes on all our x86 platforms.
Updated.
Thanks.

@shipilev
Member

Seems like GHA is barfing up due to infrastructure issues. Please restart the jobs once the current run finishes.

@DamonFool
Member Author

/test

@openjdk

openjdk bot commented May 18, 2021

@DamonFool you need to get approval to run the tests in tier1 for commits up until 2e618c1

@openjdk openjdk bot added the test-request label May 18, 2021
@shipilev
Member

The "/test" command does not work. GHA runs start automatically on PR updates, or can be triggered manually: https://github.com/DamonFool/jdk/actions/workflows/submit.yml

@DamonFool
Member Author

> The "/test" command does not work. GHA runs start automatically on PR updates, or can be triggered manually: https://github.com/DamonFool/jdk/actions/workflows/submit.yml

Got it.
Thanks for your help, @shipilev.

@DamonFool
Member Author

> Seems like GHA is barfing up due to infrastructure issues. Please restart the jobs once the current run finishes.

Hi @shipilev,

Some jobs seem to be blocked.
But I think the unfinished jobs don't matter here, since they don't cover the changed test at all.

Thanks.

Member

@tstuefe tstuefe left a comment

LGTM

@openjdk

openjdk bot commented May 19, 2021

@DamonFool This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8267293: vmTestbase/vm/mlvm/anonloader/stress/oome/metaspace/Test.java fails when JTREG_JOBS > 25

Reviewed-by: stuefe, shade

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 26 new commits pushed to the master branch:

  • 70f6c67: 8233380: CHT: Node allocation and freeing
  • 2563a6a: 8266962: Add arch supporting check for "Op_VectorLoadConst" before creating the node
  • 4954383: 8267364: Remove mask.incr which is introduced by JDK-8256973
  • c2b50f9: 8266480: Implicit null check optimization does not update control of hoisted memory operation
  • 3f883e8: 8267351: runtime/cds/SharedBaseAddress.java fails on x86_32 due to Unrecognized VM option 'UseCompressedOops'
  • 7aa6568: 8256973: Intrinsic creation for VectorMask query (lastTrue,firstTrue,trueCount) APIs
  • 65a8bf5: 8265126: [REDO] unified handling for VectorMask object re-materialization during de-optimization
  • ff84577: 8267098: AArch64: C1 StubFrames end confusingly
  • 0daec49: 8267246: -XX:MaxRAMPercentage=0 is unreasonable for jtreg tests on many-core machines
  • 324defe: 8267212: test/jdk/java/util/Collections/FindSubList.java intermittent crash with "no reachable node should have no use"
  • ... and 16 more: https://git.openjdk.java.net/jdk/compare/cd1c17c0a6416a8d16cf2035f3e97dba95b6b8af...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label May 19, 2021
@DamonFool
Member Author

Thanks @tstuefe and @shipilev.
/integrate

@openjdk openjdk bot closed this May 19, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels May 19, 2021
@DamonFool DamonFool deleted the JDK-8267293 branch May 19, 2021 09:04
@openjdk

openjdk bot commented May 19, 2021

@DamonFool Since your change was applied there have been 26 commits pushed to the master branch:

  • 70f6c67: 8233380: CHT: Node allocation and freeing
  • 2563a6a: 8266962: Add arch supporting check for "Op_VectorLoadConst" before creating the node
  • 4954383: 8267364: Remove mask.incr which is introduced by JDK-8256973
  • c2b50f9: 8266480: Implicit null check optimization does not update control of hoisted memory operation
  • 3f883e8: 8267351: runtime/cds/SharedBaseAddress.java fails on x86_32 due to Unrecognized VM option 'UseCompressedOops'
  • 7aa6568: 8256973: Intrinsic creation for VectorMask query (lastTrue,firstTrue,trueCount) APIs
  • 65a8bf5: 8265126: [REDO] unified handling for VectorMask object re-materialization during de-optimization
  • ff84577: 8267098: AArch64: C1 StubFrames end confusingly
  • 0daec49: 8267246: -XX:MaxRAMPercentage=0 is unreasonable for jtreg tests on many-core machines
  • 324defe: 8267212: test/jdk/java/util/Collections/FindSubList.java intermittent crash with "no reachable node should have no use"
  • ... and 16 more: https://git.openjdk.java.net/jdk/compare/cd1c17c0a6416a8d16cf2035f3e97dba95b6b8af...master

Your commit was automatically rebased without conflicts.

Pushed as commit 2d407e1.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@DamonFool
Member Author

> Hi Jie,
>
> You can safely get by with less. A very safe bet for all platforms would be:
>
>   • 256m heap
>   • 8m MaxMetaspaceSize
>
> The latter could probably be reduced further (the smaller the metaspace, the faster the test reaches a conclusion), down to 6m or 4m. On my machine 4m works for both 64-bit and 32-bit. But 8m is probably safe on all platforms.
>
> Cheers, Thomas

Maybe -Xmx256m is not enough.
OOME on aarch64: https://bugs.openjdk.java.net/browse/JDK-8267404
Will do more investigation tomorrow.

@tstuefe
Member

tstuefe commented May 19, 2021

Hmm... sorry for that. Weird though since I would not have expected a lot of platform dependency here (apart from 32 vs 64bit).

Ideally this test should be rewritten to load large classes, not these tiny ones, to get a better ratio between metaspace and heap space.

@DamonFool
Member Author

> Hmm... sorry for that. Weird though since I would not have expected a lot of platform dependency here (apart from 32 vs 64bit).
>
> Ideally this test should be rewritten to load large classes, not these tiny ones, to get a better ratio between metaspace and heap space.

Maybe there is something special about Oracle's aarch64 platforms.
None of our aarch64 machines can reproduce the OOME.

I've asked @dcubed-ojdk to help test with -Xmx1g and -Xmx2g.
If it still fails, I'd like to revert the change.

Thanks.
