8319969: os::large_page_init() turns off THPs for ZGC #16690

Closed
wants to merge 5 commits into from

stefank (Member) commented Nov 16, 2023

There is code in os::large_page_init() that checks /sys/kernel/mm/transparent_hugepage/enabled and forcefully turns off UseTransparentHugePages if anonymous THPs are disabled in the OS:

  if (UseTransparentHugePages && !HugePages::supports_thp()) {
    if (!FLAG_IS_DEFAULT(UseTransparentHugePages)) {
      log_warning(pagesize)("UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system.");
    }
    UseLargePages = UseTransparentHugePages = false;
    return;
  }

This is problematic because ZGC doesn't use the /sys/kernel/mm/transparent_hugepage/enabled THPs, but instead the /sys/kernel/mm/transparent_hugepage/shmem_enabled THPs. So, with the following settings:

/sys/kernel/mm/transparent_hugepage/enabled: never
/sys/kernel/mm/transparent_hugepage/shmem_enabled: advise

the above code will force ZGC to run without THPs.
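
To make the problematic combination concrete, here is a minimal sketch that is not part of the patch. It assumes the sysfs files mark the active value with square brackets (the format printed by the `thp` helper later in this thread, e.g. `always madvise [never]`). The function names `selected_thp_mode` and `zgc_hit_by_anon_check` are hypothetical and exist only for illustration.

```cpp
#include <string>

// Hypothetical helper: returns the value enclosed in [brackets] in a
// sysfs THP setting line, or "" if no bracketed value is present.
std::string selected_thp_mode(const std::string& line) {
  const std::string::size_type open = line.find('[');
  if (open == std::string::npos) {
    return "";
  }
  const std::string::size_type close = line.find(']', open + 1);
  if (close == std::string::npos) {
    return "";
  }
  return line.substr(open + 1, close - open - 1);
}

// Hypothetical helper: true for the combination described above, where
// anonymous THPs are off ("never") while shared memory THPs are available,
// so a check based only on the anonymous setting would wrongly turn off
// THPs for a shmem-backed heap.
bool zgc_hit_by_anon_check(const std::string& anon_line,
                           const std::string& shmem_line) {
  const std::string anon = selected_thp_mode(anon_line);
  const std::string shmem = selected_thp_mode(shmem_line);
  const bool shmem_off = shmem.empty() || shmem == "never" || shmem == "deny";
  return anon == "never" && !shmem_off;
}
```

With the two settings listed above (`enabled: never`, `shmem_enabled: advise`), `zgc_hit_by_anon_check` returns true under these assumptions.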

This PR is a proposal for how to work around this in the ZGC code without disturbing the rest of the JVM too much. The patch:

  1. remembers the initial values for UseLargePages and UseTransparentHugePages and saves those so that ZGC can continue using THPs even though they have been disabled for the rest of the JVM.

  2. adds better logic to figure out if ZGC is actually going to get THPs for the heap or not. This is then used to more accurately log the current situation and allows for a precise usage of madvise + MADV_HUGEPAGE.

  3. tweaks the generic pagesize logging to better reflect the situation when anonymous THPs are disabled but shared memory THPs are enabled and ZGC is used.

The result of this change can be seen in these tables:

ZGC large pages log output:

E (T)     = Enabled (Transparent)
E (T, OS) = Enabled (Transparent, OS enforced)
D         = Disabled
D (OS)    = Disabled (OS enforced)

-XX:+UseTransparentHugePages

shmem \ anon | always | madvise | never
-------------+--------+---------+-------
always       | E (T)  | E (T)   | E (T)
within_size  | E (T)  | E (T)   | E (T)
advise       | E (T)  | E (T)   | E (T)
never        | D (OS) | D (OS)  | D (OS)
deny         | D (OS) | D (OS)  | D (OS)
force        | E (T)  | E (T)   | E (T)

-XX:-UseTransparentHugePages

shmem \ anon | always    | madvise   | never
-------------+-----------+-----------+-----------
always       | E (T, OS) | E (T, OS) | E (T, OS)
within_size  | E (T, OS) | E (T, OS) | E (T, OS)
advise       | D         | D         | D
never        | D         | D         | D
deny         | D         | D         | D
force        | E (T, OS) | E (T, OS) | E (T, OS)
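
The two tables above depend only on the shmem setting and the flag, not on the anonymous setting. The following sketch encodes them as a lookup. It is derived purely from the tables, not from the patch itself; `ShmemMode` and `zgc_large_pages_status` are hypothetical names.

```cpp
#include <string>

// Values of /sys/kernel/mm/transparent_hugepage/shmem_enabled.
enum class ShmemMode { Always, WithinSize, Advise, Never, Deny, Force };

// Hypothetical function: returns the ZGC large pages status shown in the
// tables, given the shmem setting and whether -XX:+UseTransparentHugePages
// is set.
std::string zgc_large_pages_status(ShmemMode shmem, bool use_thp_flag) {
  switch (shmem) {
    case ShmemMode::Never:
    case ShmemMode::Deny:
      // The OS never hands out shmem THPs; requesting them cannot help.
      return use_thp_flag ? "Disabled (OS enforced)" : "Disabled";
    case ShmemMode::Advise:
      // THPs are only used when the JVM asks for them.
      return use_thp_flag ? "Enabled (Transparent)" : "Disabled";
    case ShmemMode::Always:
    case ShmemMode::WithinSize:
    case ShmemMode::Force:
      // The OS hands out THPs whether or not the JVM asks for them.
      return use_thp_flag ? "Enabled (Transparent)"
                          : "Enabled (Transparent, OS enforced)";
  }
  return "Disabled";  // Not reached; keeps compilers quiet.
}
```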

OS reported usage of shared memory huge pages

Y = Yes
- = No

-XX:+UseTransparentHugePages

shmem \ anon | always | madvise | never
-------------+--------+---------+-------
always       | Y      | Y       | Y
within_size  | Y      | Y       | Y
advise       | Y      | Y       | Y
never        | -      | -       | -
deny         | -      | -       | -
force        | Y      | Y       | Y

-XX:-UseTransparentHugePages

shmem \ anon | always | madvise | never
-------------+--------+---------+-------
always       | Y      | Y       | Y
within_size  | Y      | Y       | Y
advise       | -      | -       | -
never        | -      | -       | -
deny         | -      | -       | -
force        | Y      | Y       | Y

OS reported usage of anonymous memory huge pages

Y = Yes
- = No

-XX:+UseTransparentHugePages

shmem \ anon | always | madvise | never
-------------+--------+---------+-------
always       | Y      | Y       | -
within_size  | Y      | Y       | -
advise       | Y      | Y       | -
never        | Y      | Y       | -
deny         | Y      | Y       | -
force        | Y      | Y       | -

-XX:-UseTransparentHugePages

shmem \ anon | always | madvise | never
-------------+--------+---------+-------
always       | Y      | -       | -
within_size  | Y      | -       | -
advise       | Y      | -       | -
never        | Y      | -       | -
deny         | Y      | -       | -
force        | Y      | -       | -
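
The anonymous-memory tables above are independent of the shmem setting, so they reduce to a small function of the anonymous mode and the flag. This is a sketch derived from the tables only; `AnonMode` and `anon_thp_reported` are hypothetical names, not identifiers from the patch.

```cpp
// Values of /sys/kernel/mm/transparent_hugepage/enabled.
enum class AnonMode { Always, Madvise, Never };

// Hypothetical function: true where the tables show "Y" for OS-reported
// usage of anonymous memory huge pages.
bool anon_thp_reported(AnonMode anon, bool use_thp_flag) {
  switch (anon) {
    case AnonMode::Always:
      return true;          // The OS uses THPs regardless of the flag.
    case AnonMode::Madvise:
      return use_thp_flag;  // Only when the JVM asks for them.
    case AnonMode::Never:
      return false;         // The OS never uses anonymous THPs.
  }
  return false;  // Not reached; keeps compilers quiet.
}
```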

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8319969: os::large_page_init() turns off THPs for ZGC (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16690/head:pull/16690
$ git checkout pull/16690

Update a local copy of the PR:
$ git checkout pull/16690
$ git pull https://git.openjdk.org/jdk.git pull/16690/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 16690

View PR using the GUI difftool:
$ git pr show -t 16690

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16690.diff

Webrev

Link to Webrev Comment

bridgekeeper bot commented Nov 16, 2023

👋 Welcome back stefank! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk bot commented Nov 16, 2023

@stefank The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label Nov 16, 2023
@stefank stefank force-pushed the 8319969_zgc_thp_workaround branch 3 times, most recently from ea584ae to 7c70720 Compare November 20, 2023 13:41
@stefank stefank marked this pull request as ready for review December 1, 2023 09:46
@openjdk openjdk bot added the rfr Pull request is ready for review label Dec 1, 2023
mlbridge bot commented Dec 1, 2023

Webrevs

tstuefe (Member) commented Dec 1, 2023

At first glance it looks reasonable, but I will look at it closer next week (no time). Thanks for following the clean separation of OS-info vs what-the-jvm-does-with-it.

stefank (Member, Author) commented Dec 1, 2023

> At first glance it looks reasonable, but I will look at it closer next week (no time). Thanks for following the clean separation of OS-info vs what-the-jvm-does-with-it.

Thanks!

tstuefe (Member) left a comment

Small question:

https://wiki.openjdk.org/display/zgc/Main#Main-EnablingTransparentHugePagesOnLinux

mentions that to use THPs with ZGC, one needs both

/sys/kernel/mm/transparent_hugepage/enabled -> "madvise" and /sys/kernel/mm/transparent_hugepage/shmem_enabled -> "advise" in conjunction. Is that correct, the latter needs the former? I did not read this from https://www.kernel.org/doc/html/next/admin-guide/mm/transhuge.html.

src/hotspot/os/linux/os_linux.cpp
return;
}

log_warning(pagesize)("UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system.");

Member

Would it be not clearer to define when to warn, as we do in warn_no_large_pages?

Related to that, should we not warn if ZGC is used and shmem THP is configured but anonymous THP is not? I am not sure the heap is the only part of the JVM that uses THP; other parts, e.g. the code heap, would still use anonymous THP, right?

Also, maybe a better message for the poor admin who tries to set this up. E.g.:

bool requires_shmem_thp = UseTHP && UseZGC
bool requires_anon_thp = UseTHP
bool off = false

if (requires_shmem_thp && !shmem_thp_configured)
  log_warning("Shmem THPs are not supported. Set /sys/kernel/mm/transparent_hugepage/shmem_enabled to advise to support shmem THPs")
  off = true

if (requires_anon_thp && !anon_thp_configured)
  log_warning("Anonymous THPs are not supported. Set /sys/kernel/mm/transparent_hugepage/enabled to madvise")
  off = true

if (off)
  UseTHP = false
  log_warning("UseTHP disabled (see previous messages)")

if ZGC and !supports shmemthp or

Member Author

> Would it be not clearer to define when to warn, as we do in warn_no_large_pages?

I don't understand what you are suggesting with this question / request, so I'm not sure exactly what you are looking for. Instead, I made my own version of the pseudo code you posted.

These are the warnings I get with that change:

Without ZGC:

$ thp never never
always madvise [never]
always within_size advise [never] deny force

$ java -XX:+UseTransparentHugePages -version
[0.002s][warning][pagesize] Anonymous transparent huge pages are not enabled in the OS. Set /sys/kernel/mm/transparent_hugepage/enabled to 'madvise' to enable them.
[0.002s][warning][pagesize] UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system.
...

$ thp never advise
always madvise [never]
always within_size [advise] never deny force

$ java -XX:+UseTransparentHugePages -version
[0.002s][warning][pagesize] Anonymous transparent huge pages are not enabled in the OS. Set /sys/kernel/mm/transparent_hugepage/enabled to 'madvise' to enable them.
[0.002s][warning][pagesize] UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system.
...

$ thp madvise never
always [madvise] never
always within_size advise [never] deny force

$ java -XX:+UseTransparentHugePages -version
...

$ thp madvise advise
always [madvise] never
always within_size [advise] never deny force

$ java -XX:+UseTransparentHugePages -version
...

With ZGC:

$ thp never never
always madvise [never]
always within_size advise [never] deny force

$ java -XX:+UseTransparentHugePages -XX:+UseZGC -version
[0.002s][warning][pagesize] Shared memory transparent huge pages are not enabled in the OS. Set /sys/kernel/mm/transparent_hugepage/shmem_enabled to 'advise' to enable them.
[0.002s][warning][pagesize] Anonymous transparent huge pages are not enabled in the OS. Set /sys/kernel/mm/transparent_hugepage/enabled to 'madvise' to enable them.
[0.002s][warning][pagesize] UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system.
...

$ thp never advise
always madvise [never]
always within_size [advise] never deny force

$ java -XX:+UseTransparentHugePages -XX:+UseZGC -version
[0.001s][warning][pagesize] Anonymous transparent huge pages are not enabled in the OS. Set /sys/kernel/mm/transparent_hugepage/enabled to 'madvise' to enable them.
[0.001s][warning][pagesize] UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system.
...

$ thp madvise never
always [madvise] never
always within_size advise [never] deny force

$ java -XX:+UseTransparentHugePages -XX:+UseZGC -version
[0.002s][warning][pagesize] Shared memory transparent huge pages are not enabled in the OS. Set /sys/kernel/mm/transparent_hugepage/shmem_enabled to 'advise' to enable them.
...

$ thp madvise advise
always [madvise] never
always within_size [advise] never deny force

$ java -XX:+UseTransparentHugePages -XX:+UseZGC -version
...

Please take a look and see if this is an OK solution.
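
The behaviour visible in the logs above can be summarised in a short sketch. It is reconstructed from the log output only, not from the actual change, and it assumes `-XX:+UseTransparentHugePages` was given explicitly. The function name `thp_warnings` and its parameters are hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical function: returns the warnings seen in the logs above for a
// given combination of collector and OS settings, in the order they appear.
std::vector<std::string> thp_warnings(bool use_zgc,
                                      bool anon_thp_enabled,
                                      bool shmem_thp_enabled) {
  std::vector<std::string> warnings;

  // Only ZGC depends on shared memory THPs.
  if (use_zgc && !shmem_thp_enabled) {
    warnings.push_back(
        "Shared memory transparent huge pages are not enabled in the OS. "
        "Set /sys/kernel/mm/transparent_hugepage/shmem_enabled to 'advise' "
        "to enable them.");
  }

  // In the logs, the flag is reported as disabled only when anonymous THPs
  // are missing; a missing shmem setting alone yields just one warning.
  if (!anon_thp_enabled) {
    warnings.push_back(
        "Anonymous transparent huge pages are not enabled in the OS. "
        "Set /sys/kernel/mm/transparent_hugepage/enabled to 'madvise' "
        "to enable them.");
    warnings.push_back(
        "UseTransparentHugePages disabled, transparent huge pages are not "
        "supported by the operating system.");
  }

  return warnings;
}
```

For example, `thp_warnings(true, true, false)` corresponds to the `thp madvise never` run with ZGC and yields only the shared memory warning.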

Member

I think this is okay, what do you think? Too many messages?

@@ -3713,16 +3733,24 @@ struct LargePageInitializationLoggerMark {
       os::page_sizes().print_on(&ls);
       ls.print_cr(". Default large page size: " EXACTFMT ".", EXACTFMTARGS(os::large_page_size()));
     } else {
-      ls.print("Large page support disabled.");
+      ls.print("Large page support %sdisabled.", uses_zgc_shmem_thp() ? "partially " : "");

Member

I wonder whether we could make our life simpler by not supporting mixes: we could require that, for ZGC to use THP, both shmem and anon THPs have to be active. Would that be acceptable, or do you think there are too many misconfigured systems out there?

Member Author

I would prefer to not force users to set both.

Member

Fair enough. It is better to be able to run efficiently on as many configurations as possible.

tstuefe (Member) left a comment

I think this is good. Thank you.

openjdk bot commented Dec 4, 2023

@stefank This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8319969: os::large_page_init() turns off THPs for ZGC

Reviewed-by: stuefe, aboldtch

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been no new commits pushed to the master branch. If another commit should be pushed before you perform the /integrate command, your PR will be automatically rebased. If you prefer to avoid any potential automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 4, 2023
xmas92 (Member) left a comment

lgtm.

stefank (Member, Author) left a comment

Thanks all for reviewing!

I've run this through our tier1-tier5 testing.

stefank (Member, Author) commented Dec 6, 2023

/integrate

openjdk bot commented Dec 6, 2023

Going to push as commit f482260.
Since your change was applied there have been 10 commits pushed to the master branch:

  • 3edc24a: 8321073: Defer policy of disabling annotation processing by default
  • dc9c77b: 8318756: Create better internal buffer for AEADs
  • a9cb120: 8320948: NPE due to unreported compiler error
  • 4ef24e2: 8319938: TestFileChooserSingleDirectorySelection.java fails with "getSelectedFiles returned empty array"
  • cc25d8b: 8319662: ForkJoinPool trims worker threads too slowly
  • 90e433d: 8320144: Compilation crashes when a custom annotation with invalid default value is used
  • 50f3124: 8320892: AArch64: Restore FPU control state after JNI
  • 0217b5a: 8321248: ClassFile API ClassModel::verify is inconsistent with the rest of the API
  • 7fbfb3b: 8321369: Unproblemlist gc/cslocker/TestCSLocker.java
  • 2678e4c: 8319111: Mismatched MemorySegment heap access is not consistently intrinsified

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Dec 6, 2023
@openjdk openjdk bot closed this Dec 6, 2023
@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Dec 6, 2023
@openjdk openjdk bot removed the rfr Pull request is ready for review label Dec 6, 2023
openjdk bot commented Dec 6, 2023

@stefank Pushed as commit f482260.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.
