JDK-8312018: Improve zero-base-optimized reservation of class space #14867

tstuefe · 2023-07-13T10:17:40Z

TL;DR This patch introduces a new reservation API to reserve memory in low address space; depending on the OS, it may use optimized placement techniques. That new API is used to optimize the placement of class space and CDS for zero-based encoding.

A future RFE will use the same API to optimize the zero-based heap reservation and thereby consolidate a lot of coding. We also plan to use this API in other places, e.g. for Shenandoah CollectionSet reservation.

With CDS off or at dump time, we currently attempt to optimize class space location by reserving in low address ranges.

We do this by examining the end of the Java heap (which has already been reserved at that point) and, if the heap was placed in a low address region, attempting to allocate adjacent to it. Essentially, we piggyback on what we hope is an optimized heap placement. If that fails, we attempt to map at HeapBaseMinAddress.

This approach has many disadvantages:

It depends on the VM either using CompressedOops and getting a zero-based heap, or on the region around HeapBaseMinAddress being free.
HeapBaseMinAddress is an odd choice: it is 2G on all platforms, for reasons unknown to me, but that denies us half of the valuable low address range below 4G right away.
We only get 1 shot. It's either one of these two addresses.
And we only use this strategy for CDS=off or CDS=dumptime; we don't use it for the CDS-runtime-fallback-case when attaching to the primary attach point failed.
It assumes narrow Klass encoding uses the same geometry (bit size, shift) as narrow Oops, which is not guaranteed with future developments (lilliput).
It actually reduces the chance of getting a zero-based java heap. This is because when attempting to place the heap, we leave a gap for what we assume will be the later class space. That gap is CompressedClassSpaceSize bytes, which is often grossly over-dimensioned. A zero-based heap is more valuable than a zero-based class space. Therefore the heap should get the best chance of low-address heap reservation.
It introduces an unnecessary dependency between heap reservation, narrow Oop encoding, and class space reservation. That makes the code base brittle.
Getting the heap region in order to place class space adjacent to it is actually tricky. We lack a common get-heap-range API because of ZGC. This code misuses the CompressedOops interface. But CompressedOops describes the encoding range and is thus only loosely correlated with the heap range (the encoding range must contain the heap range). The fact that CompressedOops::end() returns the heap range end can be seen as an aberration, since the actual encoding range end usually reaches far beyond the heap range end. But as long as this code relies on it returning the heap range end, we cannot fix that.
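For readers less familiar with the term, here is a minimal sketch of what makes a reservation "zero-base-friendly". It is not HotSpot code: the struct and function names are made up for illustration, and the 32-bit width and shift of 3 are assumed typical values. The geometry is kept as a parameter precisely because narrow Klass and narrow Oop geometry may diverge.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative geometry only; the names are hypothetical, not HotSpot's.
struct NarrowGeometry {
  unsigned bits;   // width of the narrow value, e.g. 32
  unsigned shift;  // left shift applied on decode, e.g. 0 or 3
};

constexpr NarrowGeometry kUnscaled{32, 0};  // assumed: zero shift
constexpr NarrowGeometry kMaxShift{32, 3};  // assumed: maximum shift of 3

// Highest address (exclusive) that is reachable with an encoding base of 0.
constexpr uint64_t zero_based_limit(NarrowGeometry g) {
  return uint64_t{1} << (g.bits + g.shift);
}

// A range [start, start + size) can use base == 0 only if it ends at or
// below that limit.
constexpr bool fits_zero_based(uint64_t start, uint64_t size, NarrowGeometry g) {
  return start + size <= zero_based_limit(g);
}

// Decode with an explicit base; with base == 0 the addition disappears,
// which is what makes low placement attractive.
constexpr uint64_t decode(uint32_t narrow, uint64_t base, NarrowGeometry g) {
  return base + (uint64_t{narrow} << g.shift);
}
```

With these assumed values, the limit is 4G for the zero-shift case and 32G for the shifted case, so a region reserved at 2G or above has already given up half of the zero-shift range.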

This RFE instead proposes a different approach:

Let us have an API, os::attempt_reserve_memory_below(address max, ...). This API will do its best to reserve memory with a given size and alignment below a given max address. It will, on supporting OSes, attempt to use OS-specific means to find a suitable address space hole to place the reservation in. Otherwise, it will do the typical ladder-reservation approach with an adjustable maximum number of tries.
Let's use this API to reserve zero-base-friendly class space. Let's remove all knowledge of heap and CompressedOops. Now we are independent of what the heap does. It may or may not be located in lower address ranges. If it is, the new API will work around it and find a gap to place the class space; if it is not, even better.
Let's remove the "leave a gap for class space" logic from heap reservation. We don't need it, and it is harmful: the heap should have the best chance of being zero-based. If I can only have one, I would rather have a zero-based heap than a zero-based class space.
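The "ladder-reservation approach with an adjustable maximum number of tries" could look roughly like the sketch below. This is an illustration under assumptions, not the patch's implementation: the function name, the callback type, the top-down direction, and the stride are all hypothetical choices, and the OS is replaced by a toy predicate so the logic can be followed in isolation.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical callback: try to reserve [addr, addr + size) and report success.
// In a real VM this would wrap an OS call that takes an address hint.
using TryReserveAt = std::function<bool(uint64_t addr, uint64_t size)>;

// Walk down from the highest aligned candidate below `max` in fixed strides,
// giving up after `max_tries` attempts. Returns 0 on failure.
// `alignment` must be a non-zero power of two.
uint64_t ladder_reserve_below(uint64_t max, uint64_t size, uint64_t alignment,
                              unsigned max_tries, const TryReserveAt& try_at) {
  if (size == 0 || size > max) return 0;
  uint64_t candidate = (max - size) & ~(alignment - 1);               // align down
  const uint64_t stride = (size + alignment - 1) & ~(alignment - 1);  // size, aligned up
  for (unsigned i = 0; i < max_tries; i++) {
    if (candidate == 0) break;                 // never hand out address zero
    if (try_at(candidate, size)) return candidate;
    if (candidate < stride) break;             // would step below zero
    candidate -= stride;
  }
  return 0;
}

// Toy stand-in for the OS: pretend everything at or above 3 GiB is taken.
bool toy_try_reserve(uint64_t addr, uint64_t size) {
  return addr + size <= (uint64_t{3} << 30);
}
```

Asking this sketch for 512M below 4G with eight tries rejects the candidates at 3.5G and 3G and settles on 2.5G; with only two tries it gives up. That is the trade-off the try limit controls.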

The end result will be a JVM with a much better chance of getting a zero-based class space and a zero-based heap. We will have removed the dependencies between heap and class space, and we will have an API that can be used for similar problems: an obvious future enhancement would be to use this new reservation API for zero-based heap reservation as well, and other places could use it too, e.g. Shenandoah CollectionSet reservation.

Progress

Change must be properly reviewed (1 review required, with at least 1 Reviewer)
Change must not contain extraneous whitespace
Commit message must refer to an issue

Issue

JDK-8312018: Improve zero-base-optimized reservation of class space (Enhancement - P3)

Reviewers

Roman Kennke (@rkennke - Reviewer) ⚠️ Review applies to 8bb7d705

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/14867/head:pull/14867
$ git checkout pull/14867

Update a local copy of the PR:
$ git checkout pull/14867
$ git pull https://git.openjdk.org/jdk.git pull/14867/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 14867

View PR using the GUI difftool:
$ git pr show -t 14867

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/14867.diff

Webrev

Link to Webrev Comment

bridgekeeper · 2023-07-13T10:19:13Z

👋 Welcome back stuefe! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk · 2023-07-13T10:21:03Z

@tstuefe The following label will be automatically applied to this pull request:

hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

tstuefe · 2023-07-14T10:03:05Z

Pinging @iklam

mlbridge · 2023-07-14T10:07:31Z

Webrevs

rkennke

I am not familiar enough with these places of HotSpot (and the OSes, for that matter), but I have questions/comments.

src/hotspot/share/memory/metaspace.cpp

src/hotspot/os/linux/os_linux.cpp

tstuefe · 2023-07-14T13:01:10Z

> I am not familiar enough with these places of HotSpot (and the OSes, for that matter), but I have questions/comments.

Thank you Roman. I worked in your feedback; while testing I found an off-by-one, and a minor flaw with tracing.

About your procfs question: this should be safe. We only do this once, at startup, and have a reasonable fallback. Note that HotSpot already reads from this file for other purposes, and that seems to work well.
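To make the procfs idea concrete, here is a small sketch of finding the first sufficiently large hole below a limit from text in the /proc/<pid>/maps format. The thread does not name the file, so /proc/self/maps is an assumption, and the function names are hypothetical. Alignment handling and error reporting are left out on purpose.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <sstream>
#include <string>

// Scan text in /proc/<pid>/maps format ("start-end perms ...", hex, sorted by
// address) and return the start of the first gap of at least `size` bytes that
// lies entirely in [min_addr, max_addr). Returns 0 if there is none.
uint64_t first_hole_below(std::istream& maps, uint64_t min_addr,
                          uint64_t max_addr, uint64_t size) {
  uint64_t cursor = min_addr;  // lowest address not yet known to be mapped
  std::string line;
  while (std::getline(maps, line)) {
    unsigned long long start = 0, end = 0;
    if (std::sscanf(line.c_str(), "%llx-%llx", &start, &end) != 2) continue;
    if (start >= max_addr) break;  // remaining mappings are above the limit
    if (start > cursor && start - cursor >= size) return cursor;
    if (end > cursor) cursor = end;
  }
  // Tail gap between the last low mapping and max_addr.
  return (cursor < max_addr && max_addr - cursor >= size) ? cursor : 0;
}

// Convenience wrapper for text that is already in memory.
uint64_t first_hole_in_text(const std::string& text, uint64_t min_addr,
                            uint64_t max_addr, uint64_t size) {
  std::istringstream in(text);
  return first_hole_below(in, min_addr, max_addr, size);
}
```

On Linux the stream could be a std::ifstream opened on /proc/self/maps. Because another thread may map memory between the scan and the actual reservation, a result from such a scan is only a hint, which is why a fallback is still needed.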

rkennke

Looks good to me now, but somebody else who is more familiar with these things should review it as well. Thank you!

openjdk · 2023-07-14T14:36:51Z

@tstuefe This change is no longer ready for integration - check the PR body for details.

dholmes-ora · 2023-07-17T05:04:52Z

src/hotspot/share/memory/metaspace.cpp

-    }
-
-    // ...failing that, reserve anywhere, but let platform do optimized placement:
+    // ...otherwise let JVM chose the best placing:


s/chose/choose/

dholmes-ora · 2023-07-17T05:06:58Z

src/hotspot/share/utilities/globalDefinitions.hpp

+// Convert pointer to uintptr_t
+inline uintptr_t p2u(const volatile void* p) {
+  return (uintptr_t) p;
+}
+


Seems overkill for one use.

tstuefe · 2023-07-18T16:42:50Z

Thanks @dholmes-ora . I worked in your feedback.

iklam · 2023-07-18T19:32:30Z

I think we should document the interaction with "-Xshare:dump". Maybe we should add comments that the value of CompressedKlassPointers::base() is irrelevant to the dumped CDS archive when running "java -Xshare:dump", because of this code.

I.e., if CDS is enabled, we always use a non-zero based encoding.

narrowKlass ArchiveBuilder::get_requested_narrow_klass(Klass* k) {
  assert(DumpSharedSpaces, "sanity");
  k = get_buffered_klass(k);
  Klass* requested_k = to_requested(k);
  address narrow_klass_base = _requested_static_archive_bottom; // runtime encoding base == runtime mapping start
  const int narrow_klass_shift = ArchiveHeapWriter::precomputed_narrow_klass_shift;
  return CompressedKlassPointers::encode_not_null(requested_k, narrow_klass_base, narrow_klass_shift);
}
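The point about the dump-time base can be illustrated with a simplified model of the snippet above. This is a sketch, not the HotSpot code: it assumes that to_requested() is a plain rebase from the dump-time buffer to the requested runtime address, and all names below are hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for the three-argument encode call quoted above:
// the narrow value is the shifted offset from an explicit base.
constexpr uint32_t encode_with_base(uint64_t addr, uint64_t base, unsigned shift) {
  return static_cast<uint32_t>((addr - base) >> shift);
}

// Translate a dump-time (buffered) address to the address it is meant to have
// at run time, given where each copy of the archive starts (assumed rebase).
constexpr uint64_t to_requested_addr(uint64_t buffered, uint64_t buffer_bottom,
                                     uint64_t requested_bottom) {
  return requested_bottom + (buffered - buffer_bottom);
}

// Narrow value written into the archive for a Klass at `offset` bytes into it.
constexpr uint32_t archived_narrow(uint64_t buffer_bottom, uint64_t requested_bottom,
                                   uint64_t offset, unsigned shift) {
  return encode_with_base(
      to_requested_addr(buffer_bottom + offset, buffer_bottom, requested_bottom),
      requested_bottom, shift);
}
```

In this model the archived value depends only on the offset into the archive and the shift. Where the buffer happened to live at dump time, and therefore whatever encoding base the dumping JVM itself chose, drops out of the result.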

iklam · 2023-07-18T19:40:39Z

src/hotspot/share/runtime/os.cpp

+char* os::get_lowest_attach_address() {
+  return (char*)os::vm_allocation_granularity();
+}
+
+char* os::get_highest_attach_address() {
+  return (char*)(
+#ifdef _LP64
+  (128 * 1024 * G)
+#else
+  SIZE_MAX
+#endif
+  - os::vm_page_size());
+}


I am not sure what "attach" means in this sense. If it's the usable address range, wouldn't it need to be OS-specific?

tstuefe · 2023-07-21

Yes, highest and lowest usable address. You are right that it's OS-specific. I'll move these to the respective OS files.
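As a side note on the 64-bit constant in that snippet, the arithmetic can be checked in a few lines. The helper name is hypothetical, and the constants are redefined here rather than taken from HotSpot headers.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t K = 1024;
constexpr uint64_t G = K * K * K;

// The 64-bit upper bound from the snippet, before subtracting the page size.
constexpr uint64_t kUpperBound64 = 128 * 1024 * G;

static_assert(kUpperBound64 == (uint64_t{1} << 47), "128 * 1024 * G is 2^47");

// Hypothetical helper: the highest usable address for a given page size.
constexpr uint64_t highest_address(uint64_t page_size) {
  return kUpperBound64 - page_size;
}
```

The bound is 2^47, which corresponds to a 47-bit user address space such as x86-64 Linux with 4-level paging. Other platforms use different widths, which supports moving the limits into the OS-specific files.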

iklam · 2023-07-18T19:43:26Z

src/hotspot/share/memory/metaspace.cpp

+  // cause the reserved range to nestle alongside the heap.
+  {
+    // First try for zero-base zero-shift (lower 4G); failing that, try for zero-based with max shift (lower 32G)
+    constexpr int num_tries = 8;


num_tries should be computed instead of hard-coded.

tstuefe · 2023-07-21

Why? In pre-existing code that does similar things, we always hardcode them implicitly (typically by attempt-mapping from A->B in hardcoded stride C). And how would I calculate it?

iklam · 2023-07-23

Hmm, I am a bit unsure about the meaning of the num_tries parameter. It looks like Windows and Linux will scan the system's memory map and look for the first hole that's larger than size. However, if there are a lot of small holes at lower addresses, then doing 8 tries won't find a large enough block, even though such a block may exist below unscaled_max.

The default implementation will try to find a free block with fixed steps. In this case, it seems like

num_tries = (unscaled_max + size - 1) / size;

would be more appropriate. Otherwise if size changes in the future (to be smaller), again you won't be able to find an appropriate block, even if one exists.

So I am not sure if the caller passing in an arbitrary num_tries parameter is a good idea. Maybe the API needs to be redesigned.
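A quick worked version of the suggested formula shows how the required number of fixed-stride tries grows as size shrinks. The function name is hypothetical; the body is just the ceiling division from the comment above.

```cpp
#include <cassert>
#include <cstdint>

// Ceiling division, as in the suggested formula; `size` must be non-zero.
constexpr uint64_t tries_to_cover(uint64_t range_end, uint64_t size) {
  return (range_end + size - 1) / size;
}
```

For a 4G range this gives 4 tries at a size of 1G, 8 tries at 512M, and 16 tries at 256M, so a fixed count of 8 would stop covering the whole range once size drops below 512M.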

tstuefe · 2023-07-19T12:39:06Z

> I think we should document the interaction with "-Xshare:dump". Maybe we should add comments that the value of CompressedKlassPointers::base() is irrelevant to the dumped CDS archive when running "java -Xshare:dump", because of this code.

Okay. Maybe in a different RFE? It's a bit tangential to what this patch does.

> I.e., if CDS is enabled, we always use a non-zero based encoding.

Not necessarily. If CDS is enabled and we don't get the preferred mapping address, we will fall back to traditional Klass range reservation and potentially go zero-based. With this patch, that path is optimized too.

tstuefe · 2023-07-26T07:53:49Z

@iklam @rkennke @dholmes-ora Thanks for your feedback. I'll close this PR in favor of a fresh one, since I made considerable changes to the API and I don't want to flood your mailboxes with Skara spam.

openjdk bot added the hotspot hotspot-dev@openjdk.org label Jul 13, 2023

tstuefe force-pushed the JDK-8312018-Improve-zero-base-optimized-reservation-of-class-space branch 5 times, most recently from 194e7c4 to 814675f Compare July 13, 2023 17:38

better zero-based reservation strategy

22fe1e4

tstuefe force-pushed the JDK-8312018-Improve-zero-base-optimized-reservation-of-class-space branch from 814675f to 22fe1e4 Compare July 13, 2023 17:41

tstuefe marked this pull request as ready for review July 14, 2023 10:00

openjdk bot added the rfr Pull request is ready for review label Jul 14, 2023

rkennke reviewed Jul 14, 2023

src/hotspot/share/memory/metaspace.cpp (outdated)

src/hotspot/os/linux/os_linux.cpp

Feedback Roman; fix off-by-1; fix tracing

8bb7d70

rkennke approved these changes Jul 14, 2023

openjdk bot added the ready Pull request is ready to be integrated label Jul 14, 2023

Fix Windows

8d6a1ed

dholmes-ora reviewed Jul 17, 2023

tstuefe added 2 commits July 18, 2023 18:04

Merge branch 'master' into JDK-8312018-Improve-zero-base-optimized-re…

ce694a8

…servation-of-class-space

Feedback David

60923aa

iklam reviewed Jul 18, 2023

wip

7ff7131

tstuefe added 3 commits July 25, 2023 08:41

rework API

0e176b5

wip

8bef3ff

wip

b9863cb

tstuefe marked this pull request as draft July 25, 2023 14:38

openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jul 25, 2023

tstuefe added 3 commits July 25, 2023 16:49

wip

8ade913

wip

dc32dcf

cowabonga

b315888

tstuefe closed this Jul 26, 2023

tstuefe mentioned this pull request Jul 26, 2023

JDK-8312018: Improve reservation of class space and CDS #15041

Closed

3 tasks
