Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes #1492

Closed
wants to merge 9 commits into from

Conversation

@DamonFool
Copy link
Member

@DamonFool DamonFool commented Nov 28, 2020

Hi all,

Ergonomics for InitialHeapSize can be broken if the memory resource is limited by the administrator.
For example, this assert [1] fired on our testing boxes.

It can be reproduced by the following two steps on Linux-64:

  1) ulimit -v 8388608
  2) java -XX:MinHeapSize=5g -version

The reason was that limit_by_allocatable_memory() [2] returns a value less than MinHeapSize.

One more important fact is that this bug can be more common on Linux-32 systems.
Since the virtual memory is limited to 3800M [3] on Linux-32, it can be always reproduced when MinHeapSize > 1900M.

Testing:

  • tier1 ~ tier3 on Linux/x64

Thanks.
Best regards,
Jie

[1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shared/gcArguments.cpp#L96
[2] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/arguments.cpp#L1907
[3] https://github.com/openjdk/jdk/blob/master/src/hotspot/os/posix/os_posix.cpp#L567


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/1492/head:pull/1492
$ git checkout pull/1492

…cided on incompatible initial and minimum heap sizes
@DamonFool
Copy link
Member Author

@DamonFool DamonFool commented Nov 28, 2020

/issue add JDK-8257230
/test
/label add hotspot-runtime
/cc hotspot-runtime

Loading

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Nov 28, 2020

👋 Welcome back jiefu! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

Loading

@openjdk openjdk bot added the rfr label Nov 28, 2020
@openjdk
Copy link

@openjdk openjdk bot commented Nov 28, 2020

@DamonFool This issue is referenced in the PR title - it will now be updated.

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Nov 28, 2020

@DamonFool
The hotspot-runtime label was successfully added.

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Nov 28, 2020

@DamonFool The hotspot-runtime label was already applied.

Loading

@mlbridge
Copy link

@mlbridge mlbridge bot commented Nov 28, 2020

Loading

@dholmes-ora
Copy link
Member

@dholmes-ora dholmes-ora commented Nov 30, 2020

This is really something the GC folk should review.

/label add hotspot-gc

Loading

@openjdk openjdk bot added the hotspot-gc label Nov 30, 2020
@openjdk
Copy link

@openjdk openjdk bot commented Nov 30, 2020

@dholmes-ora
The hotspot-gc label was successfully added.

Loading

Copy link
Contributor

@tschatzl tschatzl left a comment

I think the change is good, but please add a test for this.

E.g. vmTestbase/nsk/jvmti/Allocate/alloc001/alloc001.java shows how to run a command with an ulimit prepended.

Loading

@@ -1906,7 +1906,7 @@ void Arguments::set_heap_size() {

reasonable_initial = limit_by_allocatable_memory(reasonable_initial);

FLAG_SET_ERGO(InitialHeapSize, (size_t)reasonable_initial);
FLAG_SET_ERGO(InitialHeapSize, MAX2((size_t)reasonable_initial, MinHeapSize));
Copy link
Contributor

@kstefanj kstefanj Nov 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is a good fix, this would violate the limit set by:
limit_by_allocatable_memory(reasonable_initial);

Wouldn't the proper fix be to make sure that MinHeapSize is also limited by what's allowed to allocate?

Loading

Copy link
Member Author

@DamonFool DamonFool Nov 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change just follows what is done for MaxHeapSize.

MaxHeapSize is limited by reasonable_max = limit_by_allocatable_memory(reasonable_max) [1].
But it will be set to MAX2(reasonable_max, (julong)MinHeapSize) [2] if MinHeapSize is set.

What do you think?

[1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/arguments.cpp#L1876
[2] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/arguments.cpp#L1885

Loading

Copy link
Contributor

@kstefanj kstefanj Nov 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the fix is in line with the current code and I guess setting MinHeapSize should override MaxVirtMemFraction and allow us to use more than half the address space specified.

In this case I think I would prefer moving the the call limit_by_allocatable_memory(reasonable_initial); [1] to right after the calculation on line 1902 [2]. This way we would only have one line doing lower limiting and one line doing upper limiting.

Makes sense? Or will that lead to some other problem?

[1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/arguments.cpp#L1907
[2] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/arguments.cpp#L1902

Loading

Copy link
Member Author

@DamonFool DamonFool Nov 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case I think I would prefer moving the the call limit_by_allocatable_memory(reasonable_initial); [1] to right after the calculation on line 1902 [2]. This way we would only have one line doing lower limiting and one line doing upper limiting.

Good suggestion!

Will test it soon.
Thanks.

Loading

Copy link
Member Author

@DamonFool DamonFool Dec 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No regression.
Thanks.

Loading

@DamonFool
Copy link
Member Author

@DamonFool DamonFool commented Dec 1, 2020

I think the change is good, but please add a test for this.

E.g. vmTestbase/nsk/jvmti/Allocate/alloc001/alloc001.java shows how to run a command with an ulimit prepended.

The jtreg test had been added.
And the fix had been refined based on @kstefanj 's suggestion.
Thanks.

Loading

@DamonFool
Copy link
Member Author

@DamonFool DamonFool commented Dec 1, 2020

/test

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Dec 1, 2020

@DamonFool you need to get approval to run the tests in tier1 for commits up until 0389bc4

Loading

@openjdk openjdk bot added the test-request label Dec 1, 2020
@DamonFool
Copy link
Member Author

@DamonFool DamonFool commented Dec 3, 2020

/test

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Dec 3, 2020

@DamonFool you need to get approval to run the tests in tier1 for commits up until 92208d4

Loading

@DamonFool
Copy link
Member Author

@DamonFool DamonFool commented Dec 3, 2020

Hi @tschatzl and @kstefanj , could you help to review this change?
Thanks.

Loading

Copy link
Contributor

@kstefanj kstefanj left a comment

Took a closer look at the test now, some comment below.

Loading

* @comment Not run on AIX as it does not support ulimit -v
* @requires os.family != "aix"
Copy link
Contributor

@kstefanj kstefanj Dec 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would change this to only be run on Linux:

 * @requires os.family == "linux"

Loading

public static void main(String[] args) throws Throwable {
String cmd = ProcessTools.getCommandLine(ProcessTools.createTestJvm(
"-XX:MinHeapSize=" + "260m", "-version"));
cmd = escapeCmd(cmd);
Copy link
Contributor

@kstefanj kstefanj Dec 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we change to only run on Linux, this is not needed.

Loading

String cmd = ProcessTools.getCommandLine(ProcessTools.createTestJvm(
"-XX:MinHeapSize=" + "260m", "-version"));
Copy link
Contributor

@kstefanj kstefanj Dec 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

createTestJvm() will pick up additional options passed by the test runner and there might be conflicting options. So I would suggest not picking up any additional options.

Loading

oa.shouldNotContain("hs_err")
.shouldNotContain("Internal Error");
Copy link
Contributor

@kstefanj kstefanj Dec 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it also check that the exit status is ok, or can't we expect that for sure?

Loading

@DamonFool
Copy link
Member Author

@DamonFool DamonFool commented Dec 4, 2020

Hi @kstefanj ,

Thanks for your review.

The fix has been updated according to your comments.

I don't check the exit value because I found different platforms may return different values (Zero on MacOS, but non-zero on Linux).
Maybe, It should return 0 on Linux too, which I think is more reasonable.
And I'd like to file another bug to fix it in the future.

Thanks.
Best regards,
Jie

Loading

@kstefanj
Copy link
Contributor

@kstefanj kstefanj commented Dec 4, 2020

It should be 0 on Linux and after the addition of @requires os.family == "linux" it should only be run on Linux. Doing some manual runs show that the JVM can't start with an ulimit as low as in the test. If the startup in the test is not successful I don't think the test has any value, so we need to find values that make it reliable.

Loading

@DamonFool
Copy link
Member Author

@DamonFool DamonFool commented Dec 4, 2020

It should be 0 on Linux and after the addition of @requires os.family == "linux" it should only be run on Linux. Doing some manual runs show that the JVM can't start with an ulimit as low as in the test. If the startup in the test is not successful I don't think the test has any value, so we need to find values that make it reliable.

Hi @kstefanj ,

The test is used to check whether the assert is triggered.
Before the fix, it failed.
After the fix, it passed.

As I mentioned above, there seems to be another bug on Linux.
It does return 0 on MacOS.
And I also think it should return 0 on Linux.
I'll file another bug to fix it.

What do you think?

Thanks.
Best regards,
Jie

Loading

@kstefanj
Copy link
Contributor

@kstefanj kstefanj commented Dec 4, 2020

Yes it might check that the assert doesn't trigger, but if the test is not robust enough to always manage to execute java -version we might start to see other failures in that test. In some sense the test is just lucky that it fails in a way that a hs_err-file is not created.

Loading

@DamonFool
Copy link
Member Author

@DamonFool DamonFool commented Dec 4, 2020

Yes it might check that the assert doesn't trigger, but if the test is not robust enough to always manage to execute java -version we might start to see other failures in that test. In some sense the test is just lucky that it fails in a way that a hs_err-file is not created.

Hi @kstefanj ,

After some experiments, I finally got a configuration which can return 0 on Linux.
Could you please review it again?
Thanks.

Loading

Copy link
Contributor

@kstefanj kstefanj left a comment

This looks better. I'll run it through or testing env to make sure it passes there as well.

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Dec 4, 2020

@DamonFool This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes

Reviewed-by: tschatzl, sjohanss

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 33 new commits pushed to the master branch:

  • 05dac03: 8257803: Add -Xbatch to compiler/blackhole tests
  • 29a09c8: 8257668: SA JMap - skip non-java thread stack dump for heap dump
  • e590618: 8252505: C1/C2 compiler support for blackholes
  • 972bc3b: 8256167: Convert JDK use of Reference::get to Reference::refersTo
  • 78be334: 8242332: Add SHA3 support to SunPKCS11 provider
  • c4339c3: 8243614: Typo in ReentrantLock's Javadoc
  • d3ac1bf: 8198390: Test MultiResolutionDrawImageWithTransformTest.java fails when -esa is passed
  • 51d325e: 8257633: Missing -mmacosx-version-min=X flag when linking libjvm
  • e27ea4d: 8257750: writeBuffer field of java.io.DataOutputStream should be final
  • dd0b945: 8257531: Super word not applied to a loop of simple Buffer operations
  • ... and 23 more: https://git.openjdk.java.net/jdk/compare/f83fd4acb4c04285d14eae6b8fee0de58bfcdd45...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

Loading

@openjdk openjdk bot added the ready label Dec 4, 2020
@DamonFool
Copy link
Member Author

@DamonFool DamonFool commented Dec 5, 2020

Hi @tschatzl ,

Are you OK with the latest change?
Thanks.

Loading

@kstefanj
Copy link
Contributor

@kstefanj kstefanj commented Dec 6, 2020

Didn't see any problems with the test in the testing environment, so I'm good.

Loading

@DamonFool
Copy link
Member Author

@DamonFool DamonFool commented Dec 7, 2020

Thanks @kstefanj and @tschatzl for your review and help.
/integrate

Loading

@openjdk openjdk bot closed this Dec 7, 2020
@openjdk openjdk bot added integrated and removed ready rfr labels Dec 7, 2020
@openjdk
Copy link

@openjdk openjdk bot commented Dec 7, 2020

@DamonFool Since your change was applied there have been 33 commits pushed to the master branch:

  • 05dac03: 8257803: Add -Xbatch to compiler/blackhole tests
  • 29a09c8: 8257668: SA JMap - skip non-java thread stack dump for heap dump
  • e590618: 8252505: C1/C2 compiler support for blackholes
  • 972bc3b: 8256167: Convert JDK use of Reference::get to Reference::refersTo
  • 78be334: 8242332: Add SHA3 support to SunPKCS11 provider
  • c4339c3: 8243614: Typo in ReentrantLock's Javadoc
  • d3ac1bf: 8198390: Test MultiResolutionDrawImageWithTransformTest.java fails when -esa is passed
  • 51d325e: 8257633: Missing -mmacosx-version-min=X flag when linking libjvm
  • e27ea4d: 8257750: writeBuffer field of java.io.DataOutputStream should be final
  • dd0b945: 8257531: Super word not applied to a loop of simple Buffer operations
  • ... and 23 more: https://git.openjdk.java.net/jdk/compare/f83fd4acb4c04285d14eae6b8fee0de58bfcdd45...master

Your commit was automatically rebased without conflicts.

Pushed as commit 7620124.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Loading

@DamonFool DamonFool deleted the JDK-8257230 branch Dec 7, 2020
@mlbridge
Copy link

@mlbridge mlbridge bot commented Dec 7, 2020

Mailing list message from David Holmes on hotspot-runtime-dev:

On 7/12/2020 7:30 pm, Jie Fu wrote:

On Fri, 4 Dec 2020 12:37:42 GMT, Stefan Johansson <sjohanss at openjdk.org> wrote:

Jie Fu has updated the pull request incrementally with one additional commit since the last revision:

Check the exit status

This looks better. I'll run it through or testing env to make sure it passes there as well.

Thanks @kstefanj and @tschatzl for your review and help.

We are seeing the new test crash on Aarch64 due to native OOM.

https://bugs.openjdk.java.net/browse/JDK-8257820

Cheers,
David

Loading

@DamonFool
Copy link
Member Author

@DamonFool DamonFool commented Dec 7, 2020

Mailing list message from David Holmes on hotspot-runtime-dev:

On 7/12/2020 7:30 pm, Jie Fu wrote:

On Fri, 4 Dec 2020 12:37:42 GMT, Stefan Johansson wrote:

Jie Fu has updated the pull request incrementally with one additional commit since the last revision:
Check the exit status

This looks better. I'll run it through or testing env to make sure it passes there as well.

Thanks @kstefanj and @tschatzl for your review and help.

We are seeing the new test crash on Aarch64 due to native OOM.

https://bugs.openjdk.java.net/browse/JDK-8257820

Cheers,
David

Is it always reproducible?
I'll try to find an aarch64 machine to reproduce it.
Thanks.

Loading

@kstefanj
Copy link
Contributor

@kstefanj kstefanj commented Dec 7, 2020

I had hoped my testing would have caught this, but might be that the test is to brittle after all. I saw it pass on aarch64, so not happening every time.

I would be good with removing the test as a fix.

Loading

@mlbridge
Copy link

@mlbridge mlbridge bot commented Dec 7, 2020

Mailing list message from David Holmes on hotspot-runtime-dev:

On 7/12/2020 10:15 pm, Jie Fu wrote:

On Mon, 7 Dec 2020 09:27:02 GMT, Jie Fu <jiefu at openjdk.org> wrote:

This looks better. I'll run it through or testing env to make sure it passes there as well.

Thanks @kstefanj and @tschatzl for your review and help.

_Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on [hotspot-runtime-dev](mailto:hotspot-runtime-dev at openjdk.java.net):_

On 7/12/2020 7:30 pm, Jie Fu wrote:

On Fri, 4 Dec 2020 12:37:42 GMT, Stefan Johansson <sjohanss at openjdk.org> wrote:

Jie Fu has updated the pull request incrementally with one additional commit since the last revision:
Check the exit status

This looks better. I'll run it through or testing env to make sure it passes there as well.

Thanks @kstefanj and @tschatzl for your review and help.

We are seeing the new test crash on Aarch64 due to native OOM.

https://bugs.openjdk.java.net/browse/JDK-8257820

Cheers,
David

Is it always reproducible?

No. It passed when this commit was integrated, but then failed in later
test runs.

David

Loading

@DamonFool
Copy link
Member Author

@DamonFool DamonFool commented Dec 7, 2020

I had hoped my testing would have caught this, but might be that the test is to brittle after all. I saw it pass on aarch64, so not happening every time.

I would be good with removing the test as a fix.

OK.
I'm fine to remove it and will do it soon.

Loading

@DamonFool
Copy link
Member Author

@DamonFool DamonFool commented Dec 7, 2020

I had hoped my testing would have caught this, but might be that the test is to brittle after all. I saw it pass on aarch64, so not happening every time.
I would be good with removing the test as a fix.

OK.
I'm fine to remove it and will do it soon.

It has been assigned to @tschatzl .
Thanks for fixing it.

Loading

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
4 participants