JVMTI: Don't use object header for marking #45

rkennke · 2022-03-23T19:16:58Z

JVMTI marks objects in order to track whether or not it has already visited them during heap walking. This uses the usual GC marking bits in the object header. However, this proves to be confusing and brittle because some GCs also use those header bits for marking and/or indicating forwarded objects. In particular, it becomes unreliable for Shenandoah GC to distinguish JVMTI-marked objects from forwarded objects.

JVMTI should have no business in marking objects in their header. This change proposes to let JVMTI use its own (temporary) marking bitmap instead. This decouples JVMTI better from GCs.

Testing:

tier1
tier2
tier3

Progress

Change must not contain extraneous whitespace
Change must be properly reviewed

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/lilliput pull/45/head:pull/45
$ git checkout pull/45

Update a local copy of the PR:
$ git checkout pull/45
$ git pull https://git.openjdk.java.net/lilliput pull/45/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 45

View PR using the GUI difftool:
$ git pr show -t 45

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/lilliput/pull/45.diff

bridgekeeper · 2022-03-23T19:18:33Z

👋 Welcome back rkennke! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

tstuefe

Hi Roman,

Trying to understand.

You reserve a bitmap to cover the whole reserved heap space, with one bit per possible object location? So the memory use would be -Xmx / 12, right? (8 bits per byte, 3 bit shift)?

So I calculate ~90MB per GB address space. Am I thinking right?

This may hurt a bit for applications which use jvmti heap walk as an OOM analysis tool. But I guess short term it's okay.

In the course of walking, will every object be visited, so that every bit is eventually set, or would the bitmap be sparser? If the latter, we may reduce the footprint in the future by making an on-demand-committed bitmap, only committing pages with set bits. That would make clearing faster too.

Do we need a bitmap for the whole reserved range? Would covering only the committed range not be simpler? But that is probably difficult to do if multiple committed regions exist.

Oh, but then more motivation in the future to support a sparse bitmap.

Comment before RestoreMarksClosure needs massaging.

Cheers, Thomas

tstuefe · 2022-03-24T08:55:31Z

src/hotspot/share/prims/jvmtiTagMap.cpp

-GrowableArray<markWord>* ObjectMarker::_saved_mark_stack = NULL;
-bool ObjectMarker::_needs_reset = true;  // need to reset mark bits by default
+void ObjectMarker::initialize(MemRegion heap_region) {
+  new (&_mark_bit_map) MarkBitMap();


This construct is a bit awkward. It would be cleaner to simply allocate the bitmap with new and release it with delete, or even to build a compound object containing the bitmap and its ReservedSpace.

tstuefe · 2022-03-24T09:18:16Z

src/hotspot/share/prims/jvmtiTagMap.cpp

+  new (&_mark_bit_map) MarkBitMap();
+  size_t bitmap_size = MarkBitMap::compute_size(heap_region.byte_size());
+  ReservedSpace bitmap(bitmap_size);
+  _bitmap_region = MemRegion((HeapWord*) bitmap.base(), bitmap.size() / HeapWordSize);


I'd use BitsPerWord, not HeapWordSize. Potential conflict: MemRegion uses word size in HeapWord, but Bitmap ultimately expects a memory range in bm_word_t, and uses BitsPerWord internally.

It's a bit of a brain teaser. I would feel better if MarkBitMap::initialize() asserted that the bitmap region covers what it thinks it should cover. Currently it does not check the bitmap region size at all.

tstuefe · 2022-03-24T09:20:23Z

src/hotspot/share/prims/jvmtiTagMap.hpp

+  static MarkBitMap _mark_bit_map;
+  static MemRegion  _bitmap_region;
+public:
+  static void initialize(MemRegion heap_region);


So, initialize() runs at VM startup, init() runs when the heap walk starts, if ever?

I'd rename init() to prepare_heapwalk or something similar. initialize vs init is not clear.

tstuefe · 2022-03-24T09:24:30Z

src/hotspot/share/prims/jvmtiTagMap.cpp

-  _saved_oop_stack = new (ResourceObj::C_HEAP, mtServiceability) GrowableArray<oop>(4000, mtServiceability);
+  if (!os::commit_memory((char*)_bitmap_region.start(), _bitmap_region.byte_size(), false)) {
+    vm_exit_out_of_memory(_bitmap_region.byte_size(), OOM_MALLOC_ERROR,
+                          "Could not commit native memory for auxiliary marking bitmap for JVMTI object marking");


Use os::commit_memory_or_exit?

Also, it's not an OOM_MALLOC.

tstuefe · 2022-03-24T09:25:38Z

src/hotspot/share/prims/jvmtiTagMap.cpp

-    // flag to the default for the next call.
-    set_needs_reset(true);
+  if (!os::uncommit_memory((char*)_bitmap_region.start(), _bitmap_region.byte_size())) {
+    log_warning(gc)("Could not uncommit native memory for auxiliary marking bitmap for JVMTI object marking");
  }


Clear bitmap?

tstuefe · 2022-03-24T10:08:26Z

While thinking this through more fully, if the bitmap is used sparsely it won't hurt that much, since we don't touch a lot of memory, even if committed. It only hurts if you hit the commit limit.

tstuefe · 2022-03-24T10:09:55Z

Just thinking, you probably don't need to clear the bitmap at all. os::commit_memory() remaps the space, so it's a fresh mmap segment, and anonymous mmap segments are zero-initialized.
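The zero-initialization claim can be checked in isolation on a POSIX system. The sketch below is not HotSpot code and does not call os::commit_memory(); it only assumes that a commit ends up as a fresh anonymous mapping, and the helper name fresh_anonymous_mapping_is_zeroed is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>  // POSIX only: mmap, munmap

// Hypothetical helper for illustration: maps `bytes` of fresh anonymous
// memory and reports whether every byte reads back as zero.
bool fresh_anonymous_mapping_is_zeroed(std::size_t bytes) {
  void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED) {
    return false;  // could not map at all; treat as "not verified"
  }
  const unsigned char* c = static_cast<const unsigned char*>(p);
  bool all_zero = true;
  for (std::size_t i = 0; i < bytes; ++i) {
    if (c[i] != 0) {
      all_zero = false;
      break;
    }
  }
  munmap(p, bytes);
  return all_zero;
}
```

If the commit path on a given platform does not go through a fresh anonymous mapping, this reasoning does not carry over and an explicit clear would still be needed.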

rkennke · 2022-03-24T10:28:12Z

You reserve a bitmap to cover the whole reserved heap space, with one bit per possible object location? So the memory use would be -Xmx / 12, right? (8 bits per byte, 3 bit shift)?

No. Each bit covers one word in Java heap (or more, if object alignment is higher). Bitmap size is 1/64 of heap size.

So I calculate ~90MB per GB address space. Am I thinking right?

16MB/1GB address space.
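The arithmetic behind the 1/64 figure can be written out as follows. This is a minimal sketch assuming a 64-bit VM with 8-byte heap words; bitmap_bytes is a hypothetical helper, not MarkBitMap::compute_size().

```cpp
#include <cassert>
#include <cstdint>

// Assumed constants for a 64-bit VM with default 8-byte object alignment.
constexpr std::uint64_t kHeapWordBytes = 8;  // size of one heap word
constexpr std::uint64_t kBitsPerByte   = 8;

// Bytes of bitmap needed to cover `heap_bytes` with one bit per
// object-alignment granule (hypothetical helper for illustration).
constexpr std::uint64_t bitmap_bytes(std::uint64_t heap_bytes,
                                     std::uint64_t obj_align_bytes = kHeapWordBytes) {
  // One bit per possible object start, rounded up to whole bytes.
  return (heap_bytes / obj_align_bytes + kBitsPerByte - 1) / kBitsPerByte;
}

// 1 GB / 8 bytes per word = 2^27 bits = 2^24 bytes = 16 MB, i.e. heap / 64.
static_assert(bitmap_bytes(1ull << 30) == (16ull << 20), "1 GB heap -> 16 MB bitmap");
```

With a larger object alignment each bit covers more heap, so for example 16-byte alignment halves the bitmap to 8 MB per GB.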

This may hurt a bit for applications which use jvmti heap walk as an OOM analysis tool. But I guess short term it's okay.

It should only be short-lived, as long as the heap walk takes.

In the course of walking, will every object be visited, so that every bit is eventually set, or would the bitmap be sparser? If the latter, we may reduce the footprint in the future by making an on-demand-committed bitmap, only committing pages with set bits. That would make clearing faster too.

Yes, that would be an option. I assume it would be relatively sparse: objects are larger than single words, many regions are unused, e.g. to have headroom for GC, etc.

Do we need a bitmap for the whole reserved range? Would covering only the committed range not be simpler? But that is probably difficult to do if multiple committed regions exist.

Yes. We do that for example in Shenandoah GC: we only commit marking bitmaps for ranges where we have committed heap regions. But it requires better integration with GC. I have been thinking of moving ObjectMarker into GC, accessible by a GC interface, and possibly implemented in a GC-specific manner, and then GCs could optimize this, or even keep using object-header-marking if it doesn't cause problems with the GC (I believe it only really causes trouble with Shenandoah right now). Maybe this would be preferable overall?

Thanks,
Roman

tstuefe · 2022-03-24T10:50:51Z

You reserve a bitmap to cover the whole reserved heap space, with one bit per possible object location? So the memory use would be -Xmx / 12, right? (8 bits per byte, 3 bit shift)?

No. Each bit covers one word in Java heap (or more, if object alignment is higher). Bitmap size is 1/64 of heap size.

Right, thanks.

This may hurt a bit for applications which use jvmti heap walk as an OOM analysis tool. But I guess short term it's okay.

It should only be short-lived, as long as the heap walk takes.

Sure, but in OOM situations you may not have that memory. E.g. cloudfoundry jvmkill does a jvmti heapwalk before killing the VM due to OOM.

It is not an argument against your patch, especially since it just costs 1/64th. Just saying, it may degrade usefulness of jvmti heap walk. There are ways around it though, and possible optimizations later.

In the course of walking, will every object be visited, so that every bit is eventually set, or would the bitmap be sparser? If the latter, we may reduce the footprint in the future by making an on-demand-committed bitmap, only committing pages with set bits. That would make clearing faster too.

Yes, that would be an option. I assume it would be relatively sparse: objects are larger than single words, many regions are unused, e.g. to have headroom for GC, etc.

Do we need a bitmap for the whole reserved range? Would covering only the committed range not be simpler? But that is probably difficult to do if multiple committed regions exist.

Yes. We do that for example in Shenandoah GC: we only commit marking bitmaps for ranges where we have committed heap regions. But it requires better integration with GC. I have been thinking of moving ObjectMarker into GC, accessible by a GC interface, and possibly implemented in a GC-specific manner, and then GCs could optimize this, or even keep using object-header-marking if it doesn't cause problems with the GC (I believe it only really causes trouble with Shenandoah right now). Maybe this would be preferable overall?

Up to you. I think it's fine as it is, and leaving this in jvmti has its charm too.

Personally, I probably would prefer a more generic solution, e.g. a bitmap which on the fly commits pages with set bits. Bitmap clear would be super simple, just uncommit the whole area. getbit() would return false for uncommitted areas. It is simpler in that you have no dependencies to GC or anything else, easier to reuse, saves more memory (since you only commit what is set, not what is committed). It would be an optimization one could do later though.
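The on-demand idea can be sketched roughly as below. This is a simplified stand-in under stated assumptions: chunks are ordinary heap allocations kept in a hash map rather than pages committed inside a reserved range, and SparseBitmap with all of its members is a hypothetical name, not an existing HotSpot class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical sparse bitmap: storage is split into fixed-size chunks that
// are allocated the first time a bit inside them is set.
class SparseBitmap {
  static constexpr std::size_t kChunkBits = 4096 * 8;  // one 4 KB "page" of bits
  std::unordered_map<std::size_t, std::vector<std::uint64_t>> chunks_;

 public:
  // Sets the bit and returns true if it was previously clear.
  bool set(std::size_t bit) {
    std::vector<std::uint64_t>& chunk = chunks_[bit / kChunkBits];
    if (chunk.empty()) {
      chunk.assign(kChunkBits / 64, 0);  // "commit" the chunk on first touch
    }
    const std::size_t off = bit % kChunkBits;
    const std::uint64_t mask = std::uint64_t{1} << (off % 64);
    const bool was_clear = (chunk[off / 64] & mask) == 0;
    chunk[off / 64] |= mask;
    return was_clear;
  }

  // Bits in chunks that were never touched read as clear.
  bool get(std::size_t bit) const {
    auto it = chunks_.find(bit / kChunkBits);
    if (it == chunks_.end()) {
      return false;
    }
    const std::size_t off = bit % kChunkBits;
    return ((it->second[off / 64] >> (off % 64)) & 1u) != 0;
  }

  // Dropping all chunks plays the role of uncommitting the whole area.
  void clear() { chunks_.clear(); }

  std::size_t committed_chunks() const { return chunks_.size(); }
};
```

Only chunks that contain at least one set bit consume memory, so the footprint follows the set of visited objects rather than the committed heap, and clearing does not need to touch the bits themselves.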

Thanks, Roman

rkennke · 2022-03-24T11:13:20Z

This may hurt a bit for applications which use jvmti heap walk as an OOM analysis tool. But I guess short term it's okay.

It should only be short-lived, as long as the heap walk takes.

Sure, but in OOM situations you may not have that memory. E.g. cloudfoundry jvmkill does a jvmti heapwalk before killing the VM due to OOM.

Right. The current implementation also allocates native memory, since it needs to preserve object headers of locked objects. It's probably very small, though.

It is not an argument against your patch, especially since it just costs 1/64th. Just saying, it may degrade usefulness of jvmti heap walk. There are ways around it though, and possible optimizations later.

Hmm, yeah.

In the course of walking, will every object be visited, so that every bit is eventually set, or would the bitmap be sparser? If the latter, we may reduce the footprint in the future by making an on-demand-committed bitmap, only committing pages with set bits. That would make clearing faster too.

Yes, that would be an option. I assume it would be relatively sparse: objects are larger than single words, many regions are unused, e.g. to have headroom for GC, etc.

Do we need a bitmap for the whole reserved range? Would covering only the committed range not be simpler? But that is probably difficult to do if multiple committed regions exist.

Yes. We do that for example in Shenandoah GC: we only commit marking bitmaps for ranges where we have committed heap regions. But it requires better integration with GC. I have been thinking of moving ObjectMarker into GC, accessible by a GC interface, and possibly implemented in a GC-specific manner, and then GCs could optimize this, or even keep using object-header-marking if it doesn't cause problems with the GC (I believe it only really causes trouble with Shenandoah right now). Maybe this would be preferable overall?

Up to you. I think it's fine as it is, and leaving this in jvmti has its charm too.

I would say the marking bits in the object header are a GC concern, and managing a marking bitmap for the heap walk would also be a GC concern (it requires knowledge of the heap layout, even more so if we wanted to implement your proposed optimizations).

Personally, I probably would prefer a more generic solution, e.g. a bitmap which on the fly commits pages with set bits. Bitmap clear would be super simple, just uncommit the whole area. getbit() would return false for uncommitted areas. It is simpler in that you have no dependencies to GC or anything else, easier to reuse, saves more memory (since you only commit what is set, not what is committed). It would be an optimization one could do later though.

Yes. Let me implement a little abstraction in GC and a single implementation (which uses object header) for the start. I will do the bitmap-based heapwalk in the Shenandoah patch, and we can implement optimizations then.

Thank you!
Roman

rkennke · 2022-03-24T19:11:47Z

I've reworked the change as follows:

Added GC interface ObjectMarker and ObjectMarkerController by which JVMTI can interact with GC to do object marking.
I provide two implementations: HeaderObjectMarker - which is 1:1 the existing implementation and BitmapObjectMarker which is the new implementation using a bitmap (still without any possible optimizations).
Add a flag, mostly for testing, to switch implementations. Shenandoah may hardwire the flag later, or we drop the flag altogether and hardwire init_object_marker() in the CollectedHeap subclass.

This seems much cleaner than the previous implementation and also cleaner than the previous proposed change (no in-place constructor, no global state (except temporary global pointer to ObjectMarker impl), proper separation of concerns, etc).
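As a rough illustration of that separation, the two strategies could sit behind one interface like this. The class names ObjectMarker, HeaderObjectMarker and BitmapObjectMarker come from the description above; everything else (FakeObject, the method names mark and is_marked, the kMarkedBits value, the constructor arguments) is an assumption made for this sketch and not the actual patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Stand-in for a heap object; only the header word matters here.
struct FakeObject {
  std::uintptr_t header = 0;
};

// Assumed shape of the interface JVMTI would call during a heap walk.
class ObjectMarker {
 public:
  virtual ~ObjectMarker() = default;
  virtual void mark(FakeObject* obj) = 0;
  virtual bool is_marked(const FakeObject* obj) const = 0;
};

// Marks in the object header; saves the old header and restores it when the
// marker goes away at the end of the walk.
class HeaderObjectMarker : public ObjectMarker {
  static constexpr std::uintptr_t kMarkedBits = 0x3;  // assumed bit pattern
  std::vector<std::pair<FakeObject*, std::uintptr_t>> saved_;

 public:
  ~HeaderObjectMarker() override {
    for (auto& entry : saved_) {
      entry.first->header = entry.second;  // undo marking
    }
  }
  void mark(FakeObject* obj) override {
    if (is_marked(obj)) return;
    saved_.emplace_back(obj, obj->header);
    obj->header |= kMarkedBits;
  }
  bool is_marked(const FakeObject* obj) const override {
    return (obj->header & kMarkedBits) == kMarkedBits;
  }
};

// Marks in a side bitmap indexed by the object's slot; headers stay untouched.
class BitmapObjectMarker : public ObjectMarker {
  const FakeObject* base_;
  std::vector<bool> bits_;

  std::size_t index_of(const FakeObject* obj) const {
    return static_cast<std::size_t>(obj - base_);
  }

 public:
  BitmapObjectMarker(const FakeObject* base, std::size_t slots)
      : base_(base), bits_(slots, false) {}
  void mark(FakeObject* obj) override { bits_[index_of(obj)] = true; }
  bool is_marked(const FakeObject* obj) const override {
    return bits_[index_of(obj)];
  }
};
```

The heap walk only sees ObjectMarker, so a collector that cannot tolerate header marking can hand out the bitmap variant while others keep the header variant.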

mlbridge · 2022-03-24T19:18:50Z

Webrevs

coleenp · 2022-03-24T20:47:49Z

Just a question. Can you make this change for mainline?

rkennke · 2022-03-24T20:49:35Z

Just a question. Can you make this change for mainline?

ha! Yes sure I can, if you think there is interest. Maybe without the BitmapObjectMarker stuff, but possibly with some improvements to the oop/mark saving machinery.

rkennke · 2022-03-25T20:16:03Z

Just a question. Can you make this change for mainline?

See openjdk/jdk#7964

openjdk · 2022-03-28T14:17:41Z

⚠️ @rkennke This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request is integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).

rkennke · 2022-03-28T16:11:24Z

Withdrawing this in favor of openjdk/jdk#7964

JVMTI: Don't use object header for marking

7f42dd6

rkennke requested review from shipilev and tstuefe March 23, 2022 20:28

tstuefe reviewed Mar 24, 2022


Roman Kennke added 2 commits March 24, 2022 20:04

GC abstraction for ObjectMarker

861b8dc

Revert leftover changes

1385125

rkennke marked this pull request as ready for review March 24, 2022 19:13

openjdk bot added the rfr Pull request is ready for review label Mar 24, 2022

Restore missing include

f2fb036

rkennke mentioned this pull request Mar 25, 2022

8283710: JVMTI: Use BitSet for object marking openjdk/jdk#7964

Closed

8 tasks

Roman Kennke added 2 commits March 28, 2022 16:13

Simpler needs_reset handling

e3e310c

Merge remote-tracking branch 'origin/jvmti-marking' into jvmti-marking

5c21f22

rkennke closed this Mar 28, 2022
