Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JVMTI: Don't use object header for marking #45

Closed
wants to merge 6 commits into from

Conversation

rkennke
Copy link
Collaborator

@rkennke rkennke commented Mar 23, 2022

JVMTI marks objects in order to track whether or not it has already visited objects during heap walking. This uses the usual GC marking bits in the object header. However, this proves to be confusing and brittle because some GCs also uses those header bits for marking and/or indicating forwarded objects. In particular, it becomes unreliable for Shenandoah GC to distinguish JVMTI marked objects from forwarded objects.

JVMTI should have no business in marking objects in their header. This change proposes to let JVMTI use its own (temporary) marking bitmap instead. This decouples JVMTI better from GCs.

Testing:

  • tier1
  • tier2
  • tier3

Progress

  • Change must not contain extraneous whitespace
  • Change must be properly reviewed

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/lilliput pull/45/head:pull/45
$ git checkout pull/45

Update a local copy of the PR:
$ git checkout pull/45
$ git pull https://git.openjdk.java.net/lilliput pull/45/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 45

View PR using the GUI difftool:
$ git pr show -t 45

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/lilliput/pull/45.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Mar 23, 2022

👋 Welcome back rkennke! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

Copy link
Member

@tstuefe tstuefe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Roman,

Trying to understand.

You reserve a bitmap to cover the whole reserved heap space, with one bit per possible object location? So the memory use would be -Xmx / 12, right? (8 bits per byte, 3 bit shift)?

So I calculate ~90MB per GB address space. Am I thinking right?

This may hurt a bit for applications which use jvmti heap walk as OOM analysis tool. But I guess short term its okay.

Will in the course of walking every object be visited, so every bit be set eventually, or would the bitmap be more sparse? If the latter, we may reduce the footprint in the future by making a on-demand-commited bitmap, only committing pages with set bits. Would make clearing faster too.

Do we need a bitmap for the whole reserved range, would a committed range not simpler? But probably difficult to do if multiple committed regions exist.

Oh, but then more motivation in the future to support a sparse bitmap.


Comment before RestoreMarksClosure needs massaging.

Cheers, Thomas

GrowableArray<markWord>* ObjectMarker::_saved_mark_stack = NULL;
bool ObjectMarker::_needs_reset = true; // need to reset mark bits by default
void ObjectMarker::initialize(MemRegion heap_region) {
new (&_mark_bit_map) MarkBitMap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This construct is a bit awkward. Would be cleaner to have the bitmap just newd and deleted, or even build a compound object containing the bitmap and its ReservedSpace.

new (&_mark_bit_map) MarkBitMap();
size_t bitmap_size = MarkBitMap::compute_size(heap_region.byte_size());
ReservedSpace bitmap(bitmap_size);
_bitmap_region = MemRegion((HeapWord*) bitmap.base(), bitmap.size() / HeapWordSize);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use BitsPerWord, not HeapWordSize. Potential conflict: MemRegion uses word size in HeapWord, but Bitmap ultimately expects a memory range in bm_word_t, and uses BitsPerWord internally.

Its a bit of a brain teaser. I would feel better if MarkBitMap::initialize() would assert that the bitmap region covers what it thinks it should cover. Currently it does not check the bitmap region size at all.

static MarkBitMap _mark_bit_map;
static MemRegion _bitmap_region;
public:
static void initialize(MemRegion heap_region);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, initialize() runs at VM startup, init() runs when the heap walk starts, if ever?

I'd rename init() to prepare_heapwalk or something similar. initialize vs init is not clear.

_saved_oop_stack = new (ResourceObj::C_HEAP, mtServiceability) GrowableArray<oop>(4000, mtServiceability);
if (!os::commit_memory((char*)_bitmap_region.start(), _bitmap_region.byte_size(), false)) {
vm_exit_out_of_memory(_bitmap_region.byte_size(), OOM_MALLOC_ERROR,
"Could not commit native memory for auxiliary marking bitmap for JVMTI object marking");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use os::commit_memory_or_exit ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, its not a OOM_MALLOC

// flag to the default for the next call.
set_needs_reset(true);
if (!os::uncommit_memory((char*)_bitmap_region.start(), _bitmap_region.byte_size())) {
log_warning(gc)("Could not uncommit native memory for auxiliary marking bitmap for JVMTI object marking");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clear bitmap?

@tstuefe
Copy link
Member

tstuefe commented Mar 24, 2022

While thinking this through more fully, if the bitmap is used sparsely it won't hurt that much, since we don't touch a lot of memory, even if committed. It only hurts if you hit the commit limit.

@tstuefe
Copy link
Member

tstuefe commented Mar 24, 2022

Just thinking, you probably don't need to clear the bitmap at all. os::commit_memory() remaps the space, so its a fresh mmap segment, and anonymous mmap segments are zero-initialized.

@rkennke
Copy link
Collaborator Author

rkennke commented Mar 24, 2022

You reserve a bitmap to cover the whole reserved heap space, with one bit per possible object location? So the memory use would be -Xmx / 12, right? (8 bits per byte, 3 bit shift)?

No. Each bit covers one word in Java heap (or more, if object alignment is higher). Bitmap size is 1/64 of heap size.

So I calculate ~90MB per GB address space. Am I thinking right?

16MB/1GB address space.

This may hurt a bit for applications which use jvmti heap walk as OOM analysis tool. But I guess short term its okay.

It should only be short-lived, as long as the heap walk takes.

Will in the course of walking every object be visited, so every bit be set eventually, or would the bitmap be more sparse? If the latter, we may reduce the footprint in the future by making a on-demand-commited bitmap, only committing pages with set bits. Would make clearing faster too.

Yes, that would be an option. I assume it would be relatively sparse: objects are larger than single words, many regions are unused, e.g. to have headroom for GC, etc.

Do we need a bitmap for the whole reserved range, would a committed range not simpler? But probably difficult to do if multiple committed regions exist.

Yes. We do that for example in Shenandoah GC: we only commit marking bitmaps for ranges where we have committed heap regions. But it requires better integration with GC. I have been thinking to move ObjectMarker into GC, accessible by a GC interface, and possibly implemented in a GC specific manner, and then GCs could optimize this, or even keep using object-header-marking if it doesn't cause problems with the GC (I believe it only really causes troubles with Shenandoah right now). Maybe this would be preferable overall. ?

Thanks,
Roman

@tstuefe
Copy link
Member

tstuefe commented Mar 24, 2022

You reserve a bitmap to cover the whole reserved heap space, with one bit per possible object location? So the memory use would be -Xmx / 12, right? (8 bits per byte, 3 bit shift)?

No. Each bit covers one word in Java heap (or more, if object alignment is higher). Bitmap size is 1/64 of heap size.

Right, thanks.

This may hurt a bit for applications which use jvmti heap walk as OOM analysis tool. But I guess short term its okay.

It should only be short-lived, as long as the heap walk takes.

Sure, but in OOM situations you may not have that memory. E.g. cloudfoundry jvmkill does a jvmti heapwalk before killing the VM due to OOM.

It is not an argument against your patch, especially since it just costs 1/64th. Just saying, it may degrade usefulness of jvmti heap walk. There are ways around it though, and possible optimizations later.

Will in the course of walking every object be visited, so every bit be set eventually, or would the bitmap be more sparse? If the latter, we may reduce the footprint in the future by making a on-demand-commited bitmap, only committing pages with set bits. Would make clearing faster too.

Yes, that would be an option. I assume it would be relatively sparse: objects are larger than single words, many regions are unused, e.g. to have headroom for GC, etc.

Do we need a bitmap for the whole reserved range, would a committed range not simpler? But probably difficult to do if multiple committed regions exist.

Yes. We do that for example in Shenandoah GC: we only commit marking bitmaps for ranges where we have committed heap regions. But it requires better integration with GC. I have been thinking to move ObjectMarker into GC, accessible by a GC interface, and possibly implemented in a GC specific manner, and then GCs could optimize this, or even keep using object-header-marking if it doesn't cause problems with the GC (I believe it only really causes troubles with Shenandoah right now). Maybe this would be preferable overall. ?

Up to you. I think it's fine as it is, and leaving this in jvmti has its charm too.

Personally, I probably would prefer a more generic solution, e.g. a bitmap which on the fly commits pages with set bits. Bitmap clear would be super simple, just uncommit the whole area. getbit() would return false for uncommitted areas. It is simpler in that you have no dependencies to GC or anything else, easier to reuse, saves more memory (since you only commit what is set, not what is committed). It would be an optimization one could do later though.

Thanks, Roman

@rkennke
Copy link
Collaborator Author

rkennke commented Mar 24, 2022

This may hurt a bit for applications which use jvmti heap walk as OOM analysis tool. But I guess short term its okay.

It should only be short-lived, as long as the heap walk takes.

Sure, but in OOM situations you may not have that memory. E.g. cloudfoundry jvmkill does a jvmti heapwalk before killing the VM due to OOM.

Right. The current implementation also allocates native memory, in needs to preserve object headers of locked objects. It's probably very small, though.

It is not an argument against your patch, especially since it just costs 1/64th. Just saying, it may degrade usefulness of jvmti heap walk. There are ways around it though, and possible optimizations later.

Hmm, yeah.

Will in the course of walking every object be visited, so every bit be set eventually, or would the bitmap be more sparse? If the latter, we may reduce the footprint in the future by making a on-demand-commited bitmap, only committing pages with set bits. Would make clearing faster too.

Yes, that would be an option. I assume it would be relatively sparse: objects are larger than single words, many regions are unused, e.g. to have headroom for GC, etc.

Do we need a bitmap for the whole reserved range, would a committed range not simpler? But probably difficult to do if multiple committed regions exist.

Yes. We do that for example in Shenandoah GC: we only commit marking bitmaps for ranges where we have committed heap regions. But it requires better integration with GC. I have been thinking to move ObjectMarker into GC, accessible by a GC interface, and possibly implemented in a GC specific manner, and then GCs could optimize this, or even keep using object-header-marking if it doesn't cause problems with the GC (I believe it only really causes troubles with Shenandoah right now). Maybe this would be preferable overall. ?

Up to you. I think it's fine as it is, and leaving this in jvmti has its charm too.

I would say, the marking bits in object header is GC concern, and managing a marking bitmap for heapwalk would also be a GC concern (it requires knowledge of heap layout, even more so if we wanted to implement your proposed optimizations).

Personally, I probably would prefer a more generic solution, e.g. a bitmap which on the fly commits pages with set bits. Bitmap clear would be super simple, just uncommit the whole area. getbit() would return false for uncommitted areas. It is simpler in that you have no dependencies to GC or anything else, easier to reuse, saves more memory (since you only commit what is set, not what is committed). It would be an optimization one could do later though.

Yes. Let me implement a little abstraction in GC and a single implementation (which uses object header) for the start. I will do the bitmap-based heapwalk in the Shenandoah patch, and we can implement optimizations then.

Thank you!
Roman

@rkennke
Copy link
Collaborator Author

rkennke commented Mar 24, 2022

I've reworked the change as follows:

  • Added GC interface ObjectMarker and ObjectMarkerController by which JVMTI can interact with GC to do object marking.
  • I provide two implementations: HeaderObjectMarker - which is 1:1 the existing implementation and BitmapObjectMarker which is the new implementation using a bitmap (still without any possible optimizations).
  • Add a flag, mostly for testing, to switch implementations. Shenandoah may hardwire the flag later, or we drop the flag altogether and hardwire init_object_marker() in the CollectedHeap subclass.

This seems much cleaner than the previous implementation and also cleaner than the previous proposed change (no in-place constructor, no global state (except temporary global pointer to ObjectMarker impl), proper separation of concerns, etc).

@rkennke rkennke marked this pull request as ready for review March 24, 2022 19:13
@openjdk openjdk bot added the rfr Pull request is ready for review label Mar 24, 2022
@mlbridge
Copy link

mlbridge bot commented Mar 24, 2022

Webrevs

@coleenp
Copy link
Collaborator

coleenp commented Mar 24, 2022

Just a question. Can you make this change for mainline?

@rkennke
Copy link
Collaborator Author

rkennke commented Mar 24, 2022

Just a question. Can you make this change for mainline?

ha! Yes sure I can, if you think there is interest. Maybe without the BitmapObjectMarker stuff, but possibly with some improvements to the oop/mark saving machinery.

@rkennke
Copy link
Collaborator Author

rkennke commented Mar 25, 2022

Just a question. Can you make this change for mainline?

See openjdk/jdk#7964

@openjdk
Copy link

openjdk bot commented Mar 28, 2022

⚠️ @rkennke This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request in integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).

@rkennke
Copy link
Collaborator Author

rkennke commented Mar 28, 2022

Withdrawing this in favor of openjdk/jdk#7964

@rkennke rkennke closed this Mar 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rfr Pull request is ready for review
Development

Successfully merging this pull request may close these issues.

3 participants