8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to swapping not working #3286

Closed
@DamonFool
Member

@DamonFool DamonFool commented Mar 31, 2021

Hi all,

jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails on some of our testing platforms.
This is because testMemoryFailCount [1] fails when the container is OOM-killed.
The test cannot avoid being OOM-killed [2] if memory.failcnt always stays at 0.

With this fix, the test prints "Not OOM killed" when it is not OOM-killed.
It also fixes another bug that occurs if the test returns here [3].

Testing:

  • jdk/internal/platform/docker/ hotspot/jtreg/containers on Linux/x64

Thanks.
Best regards,
Jie

[1] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java#L80
[2] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java#L87
[3] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java#L96


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to swapping not working

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/3286/head:pull/3286
$ git checkout pull/3286

Update a local copy of the PR:
$ git checkout pull/3286
$ git pull https://git.openjdk.java.net/jdk pull/3286/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 3286

View PR using the GUI difftool:
$ git pr show -t 3286

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/3286.diff

@DamonFool
Member Author

@DamonFool DamonFool commented Mar 31, 2021

/issue add JDK-8264524
/test
/label add hotspot-runtime
/cc hotspot-runtime

@bridgekeeper

@bridgekeeper bridgekeeper bot commented Mar 31, 2021

👋 Welcome back jiefu! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

…ils due to OOM killed
@openjdk openjdk bot added the rfr label Mar 31, 2021
@openjdk

@openjdk openjdk bot commented Mar 31, 2021

@DamonFool This issue is referenced in the PR title - it will now be updated.

@openjdk

@openjdk openjdk bot commented Mar 31, 2021

@DamonFool
The hotspot-runtime label was successfully added.

@openjdk

@openjdk openjdk bot commented Mar 31, 2021

@DamonFool The hotspot-runtime label was already applied.

@mlbridge

@mlbridge mlbridge bot commented Mar 31, 2021

Webrevs

        OutputAnalyzer oa = DockerTestUtils.dockerRunJava(opts);
        String output = oa.getOutput();
        if (output.contains("Not OOM killed")) {
            if (!output.contains("Ignoring test")) {
                oa.shouldHaveExitValue(0).shouldContain("TEST PASSED!!!");
            }
        }
Comment on lines 114 to 120

@jerboaa

jerboaa Mar 31, 2021
Contributor

Consider a broken implementation of Metrics.systemMetrics().getMemoryFailCount(). Wouldn't the test now (falsely) pass?

What is the actual test output on those systems where the test fails? There should be a docker log file. Does it enter line 91?
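To make the concern concrete, here is a minimal sketch of the nested check quoted above, rewritten as a pure function. The class name VacuousPassDemo, the method harnessAccepts, and the sample exit value 137 (128 + SIGKILL) are illustrative assumptions and not code from this PR. It shows that any run whose output lacks "Not OOM killed" is accepted without a single assertion.

```java
final class VacuousPassDemo {
    // Hypothetical stand-in for the harness logic quoted above.
    // Returns true when the harness would NOT report a failure.
    static boolean harnessAccepts(String output, int exitValue) {
        if (output.contains("Not OOM killed")) {
            if (!output.contains("Ignoring test")) {
                // Only on this path is anything actually asserted.
                return exitValue == 0 && output.contains("TEST PASSED!!!");
            }
        }
        // No assertion was made, so the run is accepted as-is.
        return true;
    }

    public static void main(String[] args) {
        // A container killed before printing anything useful is still accepted.
        System.out.println(harnessAccepts("[failcount]", 137));
        // A normal passing run is accepted.
        System.out.println(harnessAccepts("Not OOM killed\nTEST PASSED!!!", 0));
        // A run that was not OOM-killed but exited non-zero is rejected.
        System.out.println(harnessAccepts("Not OOM killed", 1));
    }
}
```

If getMemoryFailCount() were broken and always returned 0, the tester would keep allocating until the container is killed, which is the first case in main.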

@DamonFool

DamonFool Mar 31, 2021
Author Member

Thanks @jerboaa for your review.

Test output when failing is just:

[failcount] 

It doesn't enter line 91 of test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java.

I assume memory.failcnt has been disabled on these platforms since it's always 0.
What about checking the value of memory.failcnt after testMemoryFailCount like:

// check after testMemoryFailCount()
if (memory.failcnt is zero) {
  // memory.failcnt has been disabled
  pass
} else {
  // a broken implementation of Metrics.systemMetrics().getMemoryFailCount()
  fail
}

Thanks.
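The proposal above could look roughly like the following in Java. This is only a sketch under assumptions: the class FailCntCheck, the Verdict enum, and the cgroup v1 path /sys/fs/cgroup/memory/memory.failcnt are not taken from the PR, and the path differs on cgroup v2 systems.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

final class FailCntCheck {
    // Assumed cgroup v1 location of the counter inside the container.
    static final String FAILCNT_PATH = "/sys/fs/cgroup/memory/memory.failcnt";

    enum Verdict { PASS_COUNTER_INACTIVE, FAIL_METRICS_BROKEN }

    // Called only after the Metrics API never reported a non-zero fail count.
    static Verdict classify(long rawFailCnt) {
        // Raw counter is also zero: the kernel never recorded a failure, so pass.
        // Raw counter is non-zero: the Metrics implementation missed it, so fail.
        return rawFailCnt == 0 ? Verdict.PASS_COUNTER_INACTIVE
                               : Verdict.FAIL_METRICS_BROKEN;
    }

    // Reads the raw counter value from a file containing a single integer.
    static long readRawFailCnt(String path) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(path));
        return Long.parseLong(new String(bytes, StandardCharsets.UTF_8).trim());
    }

    public static void main(String[] args) throws IOException {
        System.out.println(classify(readRawFailCnt(FAILCNT_PATH)));
    }
}
```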

@DamonFool

DamonFool Apr 1, 2021
Author Member

Hi @jerboaa ,

After thinking about it more, I believe memory.failcnt is always 0 because there is no swap space on the host machine.
So testMemoryFailCount should be skipped in that case.

But is there an API that can be used to get the swap space size of the host machine?

Thanks.
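One possible way to answer this on Linux, as a sketch only: /proc/meminfo contains a SwapTotal line reported in kB. The class HostSwapProbe and its methods are hypothetical helpers, not part of the JDK test library, and this is not the approach the PR ended up taking.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.OptionalLong;

final class HostSwapProbe {
    // Extracts the SwapTotal value (in kB) from /proc/meminfo-style lines.
    static OptionalLong parseSwapTotalKb(List<String> lines) {
        for (String line : lines) {
            if (line.startsWith("SwapTotal:")) {
                String[] parts = line.trim().split("\\s+");
                if (parts.length >= 2) {
                    try {
                        return OptionalLong.of(Long.parseLong(parts[1]));
                    } catch (NumberFormatException e) {
                        return OptionalLong.empty();
                    }
                }
            }
        }
        return OptionalLong.empty();
    }

    // Returns false when the file is missing, unreadable, or reports 0 kB.
    static boolean hostHasSwap() {
        try {
            OptionalLong kb = parseSwapTotalKb(Files.readAllLines(Paths.get("/proc/meminfo")));
            return kb.isPresent() && kb.getAsLong() > 0;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Host has swap: " + hostHasSwap());
    }
}
```

Note that a process inside a container may see the host's /proc/meminfo values, so where this probe runs matters.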

@DamonFool

DamonFool Apr 2, 2021
Author Member

Consider a broken implementation of Metrics.systemMetrics().getMemoryFailCount(). Wouldn't the test now (falsely) pass?

Hi @jerboaa ,

A pre-test run has been added to check whether swapping really works for testMemoryFailCount.
Swapping must work for the memory.failcnt test; otherwise the test fails because the container is OOM-killed.

What do you think?
Thanks.

@jerboaa

jerboaa Apr 6, 2021
Contributor

@DamonFool Hmm, if swap not working is the issue, the test shouldn't enter the branch that you say is failing. From MetricsMemoryTester.java:

    private static void testMemoryFailCount() {
        long memAndSwapLimit = Metrics.systemMetrics().getMemoryAndSwapLimit();
        long memLimit = Metrics.systemMetrics().getMemoryLimit();

        // We need swap to execute this test or will SEGV
        if (memAndSwapLimit <= memLimit) { // <=============== This is checking whether or not swap works
            System.out.println("No swap memory limits, test case skipped");
        } else {

That check was added with JDK-8250984. So now I'm even more confused about what's going on here...

@DamonFool

DamonFool Apr 6, 2021
Author Member

if (memAndSwapLimit <= memLimit)

Hi @jerboaa ,

Unfortunately, this check does not work on a host machine that doesn't support swapping (e.g., the total swap size is 0 bytes).

You can still specify --memory and --memory-swap as you like on a host machine without swap space.
And Metrics.systemMetrics().getMemoryLimit()/getMemoryAndSwapLimit() do return the values you specified.
But swapping will never happen since there is no swap space on the host machine.

Thanks.
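The gap described here can be illustrated with a small sketch. The class FailCountGate, its method names, and the idea of passing in a host swap size are assumptions made for illustration; the existing check only compares the two limits, and the PR itself uses a pre-test run rather than a host swap query.

```java
final class FailCountGate {
    // Same comparison as the existing check in MetricsMemoryTester:
    // the configured limits leave room for swap.
    static boolean limitsAllowSwap(long memLimit, long memAndSwapLimit) {
        return memAndSwapLimit > memLimit;
    }

    // Hypothetical stricter gate: the limits allow swap AND the host has swap space.
    static boolean swapActuallyUsable(long memLimit, long memAndSwapLimit, long hostSwapBytes) {
        return limitsAllowSwap(memLimit, memAndSwapLimit) && hostSwapBytes > 0;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Example values only: --memory=200m --memory-swap=300m on a host with no swap.
        System.out.println("limits allow swap: " + limitsAllowSwap(200 * mb, 300 * mb));
        System.out.println("swap usable:       " + swapActuallyUsable(200 * mb, 300 * mb, 0));
    }
}
```

With the example values, the limit comparison succeeds while the stricter gate does not, which matches the situation where the test is not skipped but the container is still OOM-killed.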

@DamonFool DamonFool changed the title from "8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to OOM killed" to "8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to swapping not working" Apr 2, 2021
@@ -102,6 +102,21 @@ private static void testMemoryLimit(String value) throws Exception {

    private static void testMemoryFailCount(String value) throws Exception {
        Common.logNewTestCase("testMemoryFailCount" + value);

        // Check whether swapping really works for this test

@jerboaa

jerboaa Apr 6, 2021
Contributor

Please explain what "swapping not working" actually means in this comment. One version of it is already handled via JDK-8250984 so this is sort-of ambiguous. Suggestion: "On some systems there is no swap space enabled. On those systems running java -version??? with a memory limit fails due to swap space size being 0". Or something like that.

@DamonFool

DamonFool Apr 6, 2021
Author Member

Please explain what "swapping not working" actually means in this comment.

Updated.
Thanks.
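For readers without the diff at hand, the general shape of such a pre-test can be sketched as follows. This is not the code from the PR: the class SwapPreTest, the in-image path /jdk/bin/java, and the success criterion (exit value 0 and output containing "version") are all assumptions, based only on the review suggestion that the pre-test runs java -version under a memory limit.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

final class SwapPreTest {
    // Builds a docker command that runs "java -version" under a memory limit.
    static List<String> buildCommand(String image, String memLimit) {
        return List.of("docker", "run", "--rm",
                       "--memory=" + memLimit,
                       image,
                       "/jdk/bin/java", "-version"); // assumed JDK location in the image
    }

    // Assumed success criterion for the pre-test run.
    static boolean preTestSucceeded(int exitValue, String output) {
        return exitValue == 0 && output.contains("version");
    }

    // Runs the pre-test; the caller would skip the fail count test on false.
    static boolean swapLooksUsable(String image, String memLimit)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(buildCommand(image, memLimit))
                .redirectErrorStream(true) // java -version prints to stderr
                .start();
        String output = new String(p.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        int exitValue = p.waitFor();
        return preTestSucceeded(exitValue, output);
    }
}
```

A caller would invoke swapLooksUsable before testMemoryFailCount and print a skip message when it returns false.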

@jerboaa
jerboaa approved these changes Apr 6, 2021
Contributor

@jerboaa jerboaa left a comment

LGTM.

@openjdk

@openjdk openjdk bot commented Apr 6, 2021

@DamonFool This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8264524: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails due to swapping not working

Reviewed-by: sgehwolf

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 68 new commits pushed to the master branch:

  • 114e3c3: 8263856: Github Actions for macos/aarch64 cross-build
  • a611c46: 8264048: Fix caching in Jar URL connections when an entry is missing
  • bf26a25: 8264027: Refactor "CLEANUP" region printing
  • eb6330e: 8264047: Duplicate global variable 'jvm' in libjavajpeg and libawt
  • 8132548: 8264359: Compiler directives should enable DebugNonSafepoints when PrintAssembly is requested
  • ec7b002: 8264626: C1 should be able to inline excluded methods
  • ff22353: 8264565: Templatize num_arguments() functions of DCmd subclasses
  • 54b4070: 8264634: CollectCLDClosure collects duplicated CLDs when dumping dynamic archive
  • 43d4a6f: 8264564: AArch64: use MOVI instead of FMOV to zero FP register
  • dc608fd: 8264411: serviceability/jvmti/HeapMonitor tests intermittently fail due to large TLAB size
  • ... and 58 more: https://git.openjdk.java.net/jdk/compare/6225ae636e303a56d66449c895989f1ec46c6530...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Apr 6, 2021
@DamonFool
Member Author

@DamonFool DamonFool commented Apr 6, 2021

LGTM.

Thanks @jerboaa for your review.
/integrate

@openjdk openjdk bot closed this Apr 6, 2021
@openjdk openjdk bot added integrated and removed ready rfr labels Apr 6, 2021
@openjdk

@openjdk openjdk bot commented Apr 6, 2021

@DamonFool Since your change was applied there have been 72 commits pushed to the master branch:

  • a756d8d: 8264759: x86_32 Minimal VM build failure after JDK-8262355
  • 0f13e22: 8264791: java/util/Random/RandomTestBsi1999.java failed "java.security.SecureRandom nextFloat consecutive"
  • 4bb80f3: 8262898: com/sun/net/httpserver/bugs/8199849/ParamTest.java times out
  • 2f51699: 8264554: X509KeyManagerImpl calls getProtectionParameter with incorrect alias
  • 114e3c3: 8263856: Github Actions for macos/aarch64 cross-build
  • a611c46: 8264048: Fix caching in Jar URL connections when an entry is missing
  • bf26a25: 8264027: Refactor "CLEANUP" region printing
  • eb6330e: 8264047: Duplicate global variable 'jvm' in libjavajpeg and libawt
  • 8132548: 8264359: Compiler directives should enable DebugNonSafepoints when PrintAssembly is requested
  • ec7b002: 8264626: C1 should be able to inline excluded methods
  • ... and 62 more: https://git.openjdk.java.net/jdk/compare/6225ae636e303a56d66449c895989f1ec46c6530...master

Your commit was automatically rebased without conflicts.

Pushed as commit bfb034a.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@DamonFool DamonFool deleted the DamonFool:JDK-8264524 branch Apr 6, 2021
2 participants