8324817: Parallel GC does not pre-touch all heap pages when AlwaysPreTouch enabled and large page disabled #17610

Closed
wants to merge 9 commits into from

Conversation

sandlerwang
Contributor

@sandlerwang sandlerwang commented Jan 29, 2024

AlwaysPreTouch requires all freshly committed pages to be pre-touched, but currently Parallel GC (PS GC) does not pre-touch all heap pages with -XX:+AlwaysPreTouch.
This is related to JDK-8315923, which fixed the issue when huge pages are used. But the bug still stands if regular pages are used. On linux we can reproduce this bug when /sys/kernel/mm/transparent_hugepage/enabled is madvise or never, but cannot reproduce when it is always.

A simple way to reproduce it (please make sure large pages are NOT used):
Run a test with the following parameters:
-XX:+AlwaysPreTouch -Xms31g -Xmx31g -XX:+UseParallelGC
There is no pre-touching during heap initialization. After initialization, the RSS of the test process is much smaller than the committed memory.
In contrast, run the test again with -XX:+UseG1GC:
-XX:+AlwaysPreTouch -Xms31g -Xmx31g -XX:+UseG1GC
It takes several seconds to pre-touch the heap pages during initialization, and the RSS after initialization is close to the committed memory. This is the expected behavior of AlwaysPreTouch.

This issue is related to JDK-8283935, which uses alignment() instead of os::vm_page_size() as the pre-touching step size. The value of alignment() is usually larger than the OS page size, so most heap pages are not pre-touched.
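
To make the effect of the step size concrete, here is a minimal Java sketch that only models the touching loop; it is not HotSpot code. The class and method names are hypothetical, and the 4 KiB page size and 512 KiB alignment are assumed values chosen for illustration.

```java
// Sketch only: models a pre-touch loop, it does not touch real memory.
final class PreTouchStrideSketch {
    // Counts the distinct OS pages hit when one byte is touched every `step` bytes.
    static long touchedPages(long rangeBytes, long step, long osPageSize) {
        long touched = 0;
        long lastPage = -1;
        for (long offset = 0; offset < rangeBytes; offset += step) {
            long page = offset / osPageSize;
            if (page != lastPage) {
                touched++;
                lastPage = page;
            }
        }
        return touched;
    }

    public static void main(String[] args) {
        long osPageSize = 4L * 1024;   // assumed OS page size
        long alignment = 512L * 1024;  // assumed value of alignment()
        long range = 1L << 30;         // 1 GiB of committed heap
        long total = range / osPageSize;
        System.out.println("step = OS page size: "
                + touchedPages(range, osPageSize, osPageSize) + " of " + total + " pages");
        System.out.println("step = alignment:    "
                + touchedPages(range, alignment, osPageSize) + " of " + total + " pages");
    }
}
```

Under these assumed sizes, stepping by the alignment reaches only one page in every 128, which matches the observation that RSS stays far below the committed size.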


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8324817: Parallel GC does not pre-touch all heap pages when AlwaysPreTouch enabled and large page disabled (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/17610/head:pull/17610
$ git checkout pull/17610

Update a local copy of the PR:
$ git checkout pull/17610
$ git pull https://git.openjdk.org/jdk.git pull/17610/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 17610

View PR using the GUI difftool:
$ git pr show -t 17610

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/17610.diff


@bridgekeeper

bridgekeeper bot commented Jan 29, 2024

👋 Welcome back wzhuo! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Jan 29, 2024

@sandlerwang The following label will be automatically applied to this pull request:

  • hotspot-gc

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-gc hotspot-gc-dev@openjdk.org label Jan 29, 2024
@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 29, 2024
@mlbridge

mlbridge bot commented Jan 29, 2024

@albertnetymk
Member

On linux we can reproduce this bug when /sys/kernel/mm/transparent_hugepage/enabled is madvise or never, but cannot reproduce when it is always.

Could you explain why this bug is affected by that OS config flag? As you pointed out, alignment() (is GenAlignment as I read the code) can be different from os::vm_page_size(), so pretouching uses the wrong page-size. However, I don't get how transparent_hugepage is related here.

A semi-related issue, if alignment() can have the wrong page-size, does it mean numa_setup_pages (a few lines above) also needs revision?

@sandlerwang
Contributor Author

On linux we can reproduce this bug when /sys/kernel/mm/transparent_hugepage/enabled is madvise or never, but cannot reproduce when it is always.

Could you explain why this bug is affected by that OS config flag? As you pointed out, alignment() (is GenAlignment as I read the code) can be different from os::vm_page_size(), so pretouching uses the wrong page-size. However, I don't get how transparent_hugepage is related here.

A semi-related issue, if alignment() can have the wrong page-size, does it mean numa_setup_pages (a few lines above) also needs revision?

This bug is affected by that OS config flag because JDK-8315923 added a check for the flag and sets page_size to os::vm_page_size() when the flag is always. But page_size is still alignment() when the flag is madvise or never.
Please check src/hotspot/os/linux/os_linux.cpp os::pd_pretouch_memory. In this function if HugePages::thp_mode() equals THPMode::always, page_size is set to os::vm_page_size();

For numa_setup_pages, because numa_setup_pages sets memory for from/to/eden regions, I think alignment() is sufficient here. There is some discussion about page size in numa_setup_pages here, please check. #8090
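
The THP dependence described in this reply can be modelled as a small decision function. This is a hedged sketch of the behaviour before this change, not the code in os::pd_pretouch_memory; the enum, the method name and the sizes are hypothetical.

```java
// Sketch only: mirrors the step-size choice described in the reply above.
final class PreTouchStepSketch {
    enum ThpMode { ALWAYS, MADVISE, NEVER }

    // With THP mode "always" the step falls back to the OS page size;
    // otherwise the step requested by the caller (here: alignment) is kept.
    static long effectiveStep(ThpMode mode, long requestedStep, long osPageSize) {
        return mode == ThpMode.ALWAYS ? osPageSize : requestedStep;
    }

    public static void main(String[] args) {
        long osPageSize = 4L * 1024;   // assumed OS page size
        long alignment = 512L * 1024;  // assumed value of alignment()
        for (ThpMode mode : ThpMode.values()) {
            System.out.println(mode + " -> step "
                    + effectiveStep(mode, alignment, osPageSize) + " bytes");
        }
    }
}
```

Only the ALWAYS case ends up with a page-sized step, which is why the bug cannot be reproduced with that setting.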

Member

@albertnetymk albertnetymk left a comment

Please check src/hotspot/os/linux/os_linux.cpp os::pd_pretouch_memory. In this function if HugePages::thp_mode() equals THPMode::always, page_size is set to os::vm_page_size();

Thank you for the clarification.

For numa_setup_pages, because numa_setup_pages sets memory for from/to/eden regions, I think alignment() is sufficient here.

I dug into the numa_setup_pages, and it seems that page_size is treated as alignment, not os-page-size... (The naming here is quite misleading.)

/**
* @test TestAlwaysPreTouchBehavior
* @summary Tests AlwaysPreTouch Bahavior, pages of java heap should be pretouched with AlwaysPreTouch enabled. This test reads RSS of test process, which should be bigger than heap size(1g) with AlwaysPreTouch enabled.
* @requires os.family == "linux"
Member

I believe * @requires vm.gc.Parallel is also required here.

Comment on lines 89 to 94
} else if (rss < base) {
System.out.println("RSS = " + rss + " smaller than committed heap memory");
} else {
System.out.println("Passed RSS = " + rss + " base value " + base);
}
Asserts.assertTrue(rss >= base, "heap rss should be bigger than committed heap mem");
Member

Could this be just Asserts.assertTrue(rss >= base, "RSS: " + rss + " should be >= Committed Heap Mem: " + base);?

Contributor Author

I think it is OK to use committed heap mem. Test updated.
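
For illustration, the single-assertion form suggested in this thread could look roughly like the sketch below. It uses a plain AssertionError instead of the test library's Asserts class so that it is self-contained; the class and method names are hypothetical.

```java
// Sketch only: one check whose failure message carries both values.
final class RssCheckSketch {
    static void checkPreTouched(long rssKb, long committedKb) {
        if (rssKb < committedKb) {
            throw new AssertionError(
                    "RSS: " + rssKb + " should be >= Committed Heap Mem: " + committedKb);
        }
    }

    public static void main(String[] args) {
        long committedKb = Runtime.getRuntime().totalMemory() / 1024; // in KB
        // Stand-in RSS value; a real test would read it from the OS.
        long rssKb = committedKb;
        checkPreTouched(rssKb, committedKb);
        System.out.println("check passed");
    }
}
```

Putting both numbers in the message removes the need for separate println branches before the assertion.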

long rss = 0;
Runtime runtime = Runtime.getRuntime();
long committedMemory = (runtime.totalMemory()) / 1024; // in kb
long base = (long)(committedMemory * 0.9);
Member

What's the meaning of 0.9 here? Why can't one use committedMemory directly?

Contributor Author

0.9 removed; the updated test uses committedMemory directly.

Comment on lines 88 to 92
} else if (rss < committedMemory) {
System.out.println("RSS = " + rss + " smaller than committed heap memory " + committedMemory);
} else {
System.out.println("Passed RSS = " + rss + " committed memory " + committedMemory);
}
Member

Why are these branches/prints here? Could these println calls be incorporated into the assert message below?

Contributor Author

These prints are not necessary. The information has been moved to the assertion message.

Comment on lines 70 to 72
} catch (Exception e) {
return EXCEPTION_VALUE;
}
Member

This is identical to what the caller does. I believe the try-catch can be omitted in this method (the callee), keeping only the try-catch in the caller.

Contributor Author

Thanks. Redundant try-catch removed.

public static void main(String [] args) {
long rss = 0;
Runtime runtime = Runtime.getRuntime();
long committedMemory = (runtime.totalMemory()) / 1024; // in kb
Member

The () does not seem to be needed. Also, one doesn't need to introduce a tmp runtime var.

Contributor Author

Redundant () removed.

/**
* @test TestAlwaysPreTouchBehavior
* @summary Tests AlwaysPreTouch Bahavior, pages of java heap should be pretouched with AlwaysPreTouch enabled. This test reads RSS of test process, which should be bigger than heap size(1g) with AlwaysPreTouch enabled.
* @requires vm.gc.Serial & os.family == "linux" & os.maxMemory > 2G
Member

It should be Parallel, not Serial. The GC requirement should probably be on its own line to distinguish between the VM and OS constraints.

Contributor Author

That's a real problem. Serial was changed to Parallel, on a separate line.

@openjdk

openjdk bot commented Jan 31, 2024

@sandlerwang This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8324817: Parallel GC does not pre-touch all heap pages when AlwaysPreTouch enabled and large page disabled

Reviewed-by: ayang, tschatzl

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 79 new commits pushed to the master branch:

  • 692c9f8: 8325201: (zipfs) Disable TestPosix.setPermissionsShouldConvertToUnix which fails on Windows
  • ed06846: 8325037: x86: enable and fix hotspot/jtreg/compiler/vectorization/TestRoundVectFloat.java
  • a18b03b: 8324635: (zipfs) Regression in Files.setPosixFilePermissions called on existing MSDOS entries
  • 7476e29: 8323680: SA PointerFinder code can do a better job of leveraging existing code to determine if an address is in the TLAB
  • 63cb1f8: 8321396: Retire test/jdk/java/util/zip/NoExtensionSignature.java
  • f613e13: 8313739: ZipOutputStream.close() should always close the wrapped stream
  • adc3604: 8325148: Enable restricted javac warning in java.base
  • 1ae8513: 8324858: [vectorapi] Bounds checking issues when accessing memory segments
  • 38c0197: 8318105: [jmh] the test java.security.HSS failed with 2 active threads
  • 6787c4c: 8325055: Rename Injector.h
  • ... and 69 more: https://git.openjdk.org/jdk/compare/2e748c998ee490d8c3b1c7ab2fadfcb4596fc07b...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@albertnetymk, @tschatzl) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 31, 2024
Comment on lines 77 to 83
} catch (Exception e) {
rss = EXCEPTION_VALUE;
}
if (rss == EXCEPTION_VALUE) {
System.out.println("cannot get RSS, just skip");
return; // Did not get avaiable RSS, just ignore this test
}
Contributor

The EXCEPTION_VALUE constant seems superfluous, and the marked code can be simplified to just

Suggested change
- } catch (Exception e) {
- rss = EXCEPTION_VALUE;
- }
- if (rss == EXCEPTION_VALUE) {
- System.out.println("cannot get RSS, just skip");
- return; // Did not get avaiable RSS, just ignore this test
- }
+ } catch (Exception e) {
+ System.out.println("cannot get RSS, just skip");
+ return; // Did not get avaiable RSS, just ignore this test
+ }

imo.

Actually, thinking a bit further, I think there is no reason to actually try to catch anything here. If the proc filesystem isn't readable (or something changes in its contents) I would think it is worth looking at the environment or fixing the test.

Contributor Author

Thanks, the EXCEPTION_VALUE-related code was deleted. Test updated.

Comment on lines 57 to 69
StringBuilder stringBuilder = new StringBuilder();
String line = null;
while ((line = reader.readLine()) != null) {
if (line.startsWith("VmRSS:")) {
stringBuilder.append(line);
break;
}
}
stringBuilder.deleteCharAt(stringBuilder.length() - 1);
reader.close();

String content = stringBuilder.toString();
return Long.valueOf(content.split("\\s+")[1].trim());
Contributor

There does not seem to be a reason to use the StringBuilder here. I also do not see a reason for the deleteCharAt call. Just use line.

Suggested change
- StringBuilder stringBuilder = new StringBuilder();
- String line = null;
- while ((line = reader.readLine()) != null) {
- if (line.startsWith("VmRSS:")) {
- stringBuilder.append(line);
- break;
- }
- }
- stringBuilder.deleteCharAt(stringBuilder.length() - 1);
- reader.close();
- String content = stringBuilder.toString();
- return Long.valueOf(content.split("\\s+")[1].trim());
+ String line = null;
+ while ((line = reader.readLine()) != null) {
+ if (line.startsWith("VmRSS:")) {
+ break;
+ }
+ }
+ reader.close();
+ return Long.valueOf(line.split("\\s+")[1].trim());

If line is null, the NPE will be caught in the catch in the caller all the same.

Contributor

Note that in case of an IOException the file will not be closed in either case. Maybe some try-with-resources is appropriate here when opening the Reader. I do not see this as critical as the test will be terminated soon enough anyway, freeing the file handle.
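
As a hedged illustration of the try-with-resources idea, the following Java sketch parses the VmRSS line and closes the reader on every path. It is not the code that was integrated; the class and method names are hypothetical, and reading /proc/self/status (instead of /proc/$pid/status) is an assumption made to keep the sketch short.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch only: reads the resident set size in KB from procfs-style text.
final class RssReaderSketch {
    // Returns the VmRSS value; the reader is closed even if readLine() throws.
    static long parseVmRssKb(Reader source) throws IOException {
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith("VmRSS:")) {
                    // Line format: "VmRSS:", whitespace, number, whitespace, "kB"
                    return Long.parseLong(line.split("\\s+")[1]);
                }
            }
        }
        throw new IOException("no VmRSS line found");
    }

    // Linux only; /proc/self/status is an assumed shortcut for the current process.
    static long currentProcessRssKb() throws IOException {
        return parseVmRssKb(Files.newBufferedReader(Path.of("/proc/self/status")));
    }

    public static void main(String[] args) throws IOException {
        // Sample input so the sketch also runs on systems without procfs.
        String sample = "Name:\tjava\nVmRSS:\t  1052672 kB\nThreads:\t20\n";
        System.out.println(parseVmRssKb(new StringReader(sample)));
    }
}
```

A missing VmRSS line surfaces as an IOException here rather than a NullPointerException, which is a design choice of this sketch and not something the review asked for.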

Runtime runtime = Runtime.getRuntime();
long committedMemory = runtime.totalMemory() / 1024; // in kb
try {
rss = getProcessRssInKb();
Contributor

Inconsistent indentation compared to other indentation - this is 3 spaces, the file uses 4 spaces otherwise.

Contributor Author

Inconsistent indentation fixed.

try {
rss = getProcessRssInKb();
} catch (Exception e) {
rss = EXCEPTION_VALUE;
Contributor

Indentation.

* @summary Tests AlwaysPreTouch Bahavior, pages of java heap should be pretouched with AlwaysPreTouch enabled. This test reads RSS of test process, which should be bigger than heap size(1g) with AlwaysPreTouch enabled.
* @requires vm.gc.Parallel
* @requires os.family == "linux" & os.maxMemory > 2G
* @library /test/lib
Contributor

The test only works in release builds, as debug builds incidentally always pre-touch the pages.
I.e. add @requires vm.debug != true.
There does not seem to be a @requires vm.release or something similar.

Contributor

Another option is to use @requires vm.debug and add -XX:-ZapUnusedHeapArea to the run options (suggested by @albertnetymk).

Contributor Author

Thanks, added @requires vm.debug != true

Contributor

@tschatzl tschatzl left a comment

Please fix the typo I did not find last time before integrating.

Lgtm.

@@ -48,36 +49,27 @@
import java.io.*;

public class TestAlwaysPreTouchBehavior {
static final long EXCEPTION_VALUE = -1L;
public static long getProcessRssInKb() throws IOException {
String pid = ManagementFactory.getRuntimeMXBean().getName().split("@")[0];
// Read RSS from /proc/$pid/status. Only avaiable on Linux
Contributor

Suggested change
- // Read RSS from /proc/$pid/status. Only avaiable on Linux
+ // Read RSS from /proc/$pid/status. Only available on Linux.

Contributor Author

Of course. Typo fixed, thanks

@sandlerwang
Contributor Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Feb 5, 2024
@openjdk

openjdk bot commented Feb 5, 2024

@sandlerwang
Your change (at version 0594700) is now ready to be sponsored by a Committer.

@D-D-H
Contributor

D-D-H commented Feb 5, 2024

/sponsor

@openjdk

openjdk bot commented Feb 5, 2024

Going to push as commit 80642dd.
Since your change was applied there have been 79 commits pushed to the master branch; they are the same commits listed in the pre-integration comment of Jan 31, 2024 above.

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Feb 5, 2024
@openjdk openjdk bot closed this Feb 5, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Feb 5, 2024
@openjdk

openjdk bot commented Feb 5, 2024

@D-D-H @sandlerwang Pushed as commit 80642dd.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@mmyxym

mmyxym commented Feb 22, 2024

/backport jdk21u-dev

@openjdk

openjdk bot commented Feb 22, 2024

@mmyxym the backport was successfully created on the branch backport-mmyxym-80642dd7 in my personal fork of openjdk/jdk21u-dev. To create a pull request with this backport targeting openjdk/jdk21u-dev:master, just click the following link:

➡️ Create pull request

The title of the pull request is automatically filled in correctly and below you find a suggestion for the pull request body:

Hi all,

This pull request contains a backport of commit 80642dd7 from the openjdk/jdk repository.

The commit being backported was authored by Wang Zhuo on 5 Feb 2024 and was reviewed by Albert Mingkun Yang and Thomas Schatzl.

Thanks!

If you need to update the source branch of the pull then run the following commands in a local clone of your personal fork of openjdk/jdk21u-dev:

$ git fetch https://github.com/openjdk-bots/jdk21u-dev.git backport-mmyxym-80642dd7:backport-mmyxym-80642dd7
$ git checkout backport-mmyxym-80642dd7
# make changes
$ git add paths/to/changed/files
$ git commit --message 'Describe additional changes made'
$ git push https://github.com/openjdk-bots/jdk21u-dev.git backport-mmyxym-80642dd7

⚠️ @mmyxym You are not yet a collaborator in my fork openjdk-bots/jdk21u-dev. An invite will be sent out and you need to accept it before you can proceed.

Labels
hotspot-gc hotspot-gc-dev@openjdk.org integrated Pull request has been integrated
5 participants