Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8283710: JVMTI: Use BitSet for object marking #7964

Closed
wants to merge 13 commits into from

Conversation

rkennke
Copy link
Contributor

@rkennke rkennke commented Mar 25, 2022

JVMTI heap walking marks objects in order to track which have been visited already. In order to do that, it uses bits in the object header. Those are the same bits that are also used by some GCs to mark objects (the lowest two bits, also used by locking code). Some GCs also use the bits in order to indicate 'forwarded' objects, where the upper bits of the header represent the forward-pointer. In the case of Shenandoah, it's even more problematic because this happens concurrently, even while JVMTI heap walks can intercept. So far we carefully worked around that problem, but it becomes very problematic in Lilliput, where accesses to the Klass* also requires to decode the header, and figure out what bits means what.

In addition to that, marking objects in their header requires that the original header gets saved and restored. We only do that for 'interesting' headers, that is headers that have a stack-lock, monitor or hash-code. All other headers are reset to their default value. This means we are losing object's GC age. This is not catastrophic, but nontheless interferes with GC.

JFR already has a datastructure called BitSet to support object marking without messing with object's headers. We can use that in JVMTI too.

Testing:

  • tier1
  • tier2
  • tier3
  • serviceability/jvmti
  • vmTestbase/nsk/jvmti

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/7964/head:pull/7964
$ git checkout pull/7964

Update a local copy of the PR:
$ git checkout pull/7964
$ git pull https://git.openjdk.java.net/jdk pull/7964/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 7964

View PR using the GUI difftool:
$ git pr show -t 7964

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/7964.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Mar 25, 2022

👋 Welcome back rkennke! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Mar 25, 2022

@rkennke The following labels will be automatically applied to this pull request:

  • hotspot
  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added serviceability serviceability-dev@openjdk.org hotspot hotspot-dev@openjdk.org labels Mar 25, 2022
@rkennke rkennke marked this pull request as ready for review March 28, 2022 13:23
@openjdk openjdk bot added the rfr Pull request is ready for review label Mar 28, 2022
@mlbridge
Copy link

mlbridge bot commented Mar 28, 2022

@tstuefe
Copy link
Member

tstuefe commented Mar 31, 2022

Hi Roman,

I have a question about this code, unrelated to your patch. I'd like to make sure I understand it correctly.

When walking the heap for JVMTI:

  1. If the object had been locked, or has an identity hash != 0, we remember the mark word.
  2. We then set each object mark word to 00..11 - "no hash, marked".
    After we are done walking, we:
  3. iterate over all objects and set them to 00..01 - "no hash no lock".
  4. reinstall the mark words for those objects we remembered at (1) - which had a hash, or were locked.

I try to understand why we zero out the upper 62 bits of the mark word at all at (2). It seems to me the JVMTI heap walker does not use those bits. AFAIU its read only, it only reports the objects it did find without modifying them. Would it not be sufficient just to set the lock bits to "marked" and leave the rest of the mark word intact?

IIUC the only thing we really need to restore are the lock bits, for locked objects. So whenever lock bits had not been "01". But a single byte would be sufficient for that, no need to save the whole 63bits, or? So _saved_mark_stack and _saved_oop_stack could be made a lot smaller.

Another question, what is the effect of age bits getting reset by JVMTI heap walking? Won't that interfere with the next GC?

Thanks a a lot,

..Thomas

Copy link
Member

@tstuefe tstuefe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks straightforward enough.

One question, ObjectMarker::mark() had been a static inline function before, now we call via the new Controller, so it is not inlined and a virtual function call. Won't that cost performance?

Cheers, Thomas

src/hotspot/share/prims/jvmtiTagMap.cpp Show resolved Hide resolved
@@ -427,6 +430,8 @@ class CollectedHeap : public CHeapObj<mtGC> {

virtual void initialize_serviceability() = 0;

virtual ObjectMarker* init_object_marker();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does creation happen in collectedHeap.cpp, not objectMarker.cpp? And why expose this function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intention here was to be able for a GC to provide an alternative implementation, especially in the context of Lilliput. See for example here: openjdk/lilliput#32

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I get that :) I just wondered whether implementing this with, say, metaprogramming may be faster. E.g. make the controller a template and plugging in a Marker implementation that provides a "mark" function. Which then could be inlined. That way you don't pay for virtual calls at runtime. For a function that is invoked for every object this may be significant.

May also be done as a future RFE.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I get that :) I just wondered whether implementing this with, say, metaprogramming may be faster. E.g. make the controller a template and plugging in a Marker implementation that provides a "mark" function. Which then could be inlined. That way you don't pay for virtual calls at runtime. For a function that is invoked for every object this may be significant.

May also be done as a future RFE.

Right. Let me thinker with this idea for a bit.

@rkennke
Copy link
Contributor Author

rkennke commented Mar 31, 2022

When walking the heap for JVMTI:

1. If the object had been locked, or has an identity hash != 0, we remember the mark word.

2. We then set each object mark word to 00..11 - "no hash, marked".
   After we are done walking, we:

3. iterate over all objects and set them to 00..01 - "no hash no lock".

4. reinstall the mark words for those objects we remembered at (1) - which had a hash, or were locked.

This is correct.

I try to understand why we zero out the upper 62 bits of the mark word at all at (2). It seems to me the JVMTI heap walker does not use those bits. AFAIU its read only, it only reports the objects it did find without modifying them. Would it not be sufficient just to set the lock bits to "marked" and leave the rest of the mark word intact?

Yes, that would be sufficient, and actually better:

  1. In the context of Lilliput, it would preserve the Klass* bits (we fixed this in Lilliput already).
  2. We would preserve the GC age.

IIUC the only thing we really need to restore are the lock bits, for locked objects. So whenever lock bits had not been "01". But a single byte would be sufficient for that, no need to save the whole 63bits, or? So _saved_mark_stack and _saved_oop_stack could be made a lot smaller.

Absolutely! First step for improving this would be to use preservedMarks.hpp|cpp infrastructure. This would avoid the headache of two distinct lists going out of sync (they don't, really, but brain is used anyway), and improve code sharing (GC code uses this for full-GCs). Second step might be to only preserve mark-bits and make the structure smaller, and avoid thrashing upper bits (although, GCs would overwrite some upper bits with the fwdptr anyway, so may not be worth).

Another question, what is the effect of age bits getting reset by JVMTI heap walking? Won't that interfere with the next GC?

It would make all somewhat-aged objects young again. Yes, it's an interference, but not catastrophic. If we can avoid it easily, then it would be better, though.

Thanks,
Roman

@rkennke
Copy link
Contributor Author

rkennke commented Mar 31, 2022

This looks straightforward enough.

One question, ObjectMarker::mark() had been a static inline function before, now we call via the new Controller, so it is not inlined and a virtual function call. Won't that cost performance?

Possibly. I don't think this is particularily performance-critical code, though, and calling those methods most likely not what really matters.

@tstuefe
Copy link
Member

tstuefe commented Mar 31, 2022

This looks straightforward enough.
One question, ObjectMarker::mark() had been a static inline function before, now we call via the new Controller, so it is not inlined and a virtual function call. Won't that cost performance?

Possibly. I don't think this is particularily performance-critical code, though, and calling those methods most likely not what really matters.

This was not about performance, just about keeping interfaces small. I like to keep things local if possible.

@tstuefe
Copy link
Member

tstuefe commented Mar 31, 2022

IIUC the only thing we really need to restore are the lock bits, for locked objects. So whenever lock bits had not been "01". But a single byte would be sufficient for that, no need to save the whole 63bits, or? So _saved_mark_stack and _saved_oop_stack could be made a lot smaller.

Absolutely! First step for improving this would be to use preservedMarks.hpp|cpp infrastructure. This would avoid the headache of two distinct lists going out of sync (they don't, really, but brain is used anyway), and improve code sharing (GC code uses this for full-GCs). Second step might be to only preserve mark-bits and make the structure smaller, and avoid thrashing upper bits (although, GCs would overwrite some upper bits with the fwdptr anyway, so may not be worth).

If all one needs to preserve are the two lock bits, could one hide them in the lowest bit of the oop? Then you'd get away with a single array of 64-bit elements.

Copy link
Contributor

@coleenp coleenp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm finally now looking at this. What I was hoping you'd do instead is have JVMTI not mark objects at all, and use an {oop} hashtable to determine which objects are marked. It already has to do the growable array to restore the marks, so it already costs space.
Use ResourceHashtable in preference to hashtable.hpp Hashtable though.

@rkennke
Copy link
Contributor Author

rkennke commented Mar 31, 2022

Sorry, I'm finally now looking at this. What I was hoping you'd do instead is have JVMTI not mark objects at all, and use an {oop} hashtable to determine which objects are marked. It already has to do the growable array to restore the marks, so it already costs space. Use ResourceHashtable in preference to hashtable.hpp Hashtable though.

That might actually be the better solution: it leaves the header alone and would not require any interaction with the GC. I will try that instead. Thank you!

@tstuefe
Copy link
Member

tstuefe commented Mar 31, 2022

Hi Coleen,

Sorry, I'm finally now looking at this. What I was hoping you'd do instead is have JVMTI not mark objects at all, and use an {oop} hashtable to determine which objects are marked. It already has to do the growable array to restore the marks, so it already costs space. Use ResourceHashtable in preference to hashtable.hpp Hashtable though.

I don't understand. You would put every visited oop into that hashset? So, every object, if the walker walks the whole heap? Would that not be very expensive? You'd pay at least 16 bytes per object (oop + whatever the table entry costs). Plus, we need a good hash distribution.

There would be a cutoff where the Romans bitmap proposal in Lilliput would be cheaper. And the bitmap would have O(1) lookup.

@coleenp
Copy link
Contributor

coleenp commented Mar 31, 2022

Wouldn't the hashtable contain only the oops that you wanted to mark? That's not all of them? Maybe it is and it's a bad idea.

@tstuefe
Copy link
Member

tstuefe commented Apr 1, 2022

Wouldn't the hashtable contain only the oops that you wanted to mark? That's not all of them? Maybe it is and it's a bad idea.

AFAIU the JVMTI heap walk iterates over whole heap, and marks visited oops. So, eventually, that would be all of them. The RA allocated stacks are just for a subset of objects (locked or hashed).

@coleenp
Copy link
Contributor

coleenp commented Apr 1, 2022

I'm told by @fisk that there's code in jfr/leakprofiler/chains/bitset.hpp that does some non-gc dependent oop marking. Maybe something like this can be used instead? At one point in rewriting the JvmtiTagMap table we were doing obj->hash() on the oops during the tagging so marking them turned out to be a huge pain. It would be nice not to mark them.
See RFE https://bugs.openjdk.java.net/browse/JDK-8256072 filed by @kimbarrett

@rkennke
Copy link
Contributor Author

rkennke commented Apr 4, 2022

I'm told by @fisk that there's code in jfr/leakprofiler/chains/bitset.hpp that does some non-gc dependent oop marking. Maybe something like this can be used instead? At one point in rewriting the JvmtiTagMap table we were doing obj->hash() on the oops during the tagging so marking them turned out to be a huge pain. It would be nice not to mark them. See RFE https://bugs.openjdk.java.net/browse/JDK-8256072 filed by @kimbarrett

This is an excellent suggestion! I moved bitset.* to share/utilities and changed JVMTI to use that class instead. It seems to do the job just fine.
One open question is which MEMFLAGS to use. mtTracing doesn't seem to be exactly right. Should I templatize BitSet and make JVMTI use mtServiceability and JRF use mtTracing as it did before?

@rkennke
Copy link
Contributor Author

rkennke commented Apr 4, 2022

And yes, I will change the title and description text once we settled on how we want to do it. ;-)

@coleenp
Copy link
Contributor

coleenp commented Apr 4, 2022

One open question is which MEMFLAGS to use. mtTracing doesn't seem to be exactly right. Should I templatize BitSet and make JVMTI use mtServiceability and JRF use mtTracing as it did before?

Yes, I think templatizing for MEMFLAGS makes sense (concurrentHashTable.hpp is too).

Wow, this came out really well! Thank you @fisk again for the idea.

@fisk
Copy link
Contributor

fisk commented Apr 4, 2022

One open question is which MEMFLAGS to use. mtTracing doesn't seem to be exactly right. Should I templatize BitSet and make JVMTI use mtServiceability and JRF use mtTracing as it did before?

Yes, I think templatizing for MEMFLAGS makes sense (concurrentHashTable.hpp is too).

Wow, this came out really well! Thank you @fisk again for the idea.

Happy to help!

@tstuefe
Copy link
Member

tstuefe commented Apr 8, 2022

BTW, one neat solution to the MEMFLAGS problem would be to make it a property of the thread, and stackable. With sort of a MemflagMark. Each Thread could have a property "default MEMFLAGS", set with a MemFlagMark, and that gets used by default with every malloc in this thread. This way we may be able to get rid of a lot (most?) manual MEMFLAG declarations.

@tschatzl
Copy link
Contributor

tschatzl commented Apr 8, 2022

A general thought, maybe for a future RFE:

We now have BitMap and BitSet, which do almost the same thing. Instead of having a new class BitSet, whose job is to be sparse, we could give BitMap the ability to be sparse. We'd save code, reuse the rich BitMap interface and get all the test code for free.

One way to do that would be to make BitMap responsible for committing its underlying backing memory, in chunks, on-demand only. To do that BitMap would need a secondary map - a commit map - to remember which fragments of its backing memory are committed. Commit map would have a bit per 64M fragment.

Please don't do this. BitMap (well, BitMapView) is also used as a lightweight way to overlay it on arbitrary memory temporarily (i.e. remembered sets bitmaps). This typically boils down to no additional memory usage (and overhead) at all.
So any additional memory usage or initialization should not be added lightly.
Also in most other cases, BitMap is used for very tiny bitmaps anyway, this would just add unnecessary functionality (and overhead) for 99% of the usage of BitMap.

Further GCs are using BitMap to implement such sparse bit sets already, managing their memory with the corresponding Java heap memory; so adding this functionality at the BitMap level would just duplicate functionality with probably not-so-hilarious side effects.

I also do think that the use cases for a dense BitMap are significantly different from (huge) sparse "BitMap"s to warrant its own separate class (not saying that one couldn't use the other internally, or that if that BitSet could be reused in GCs if configurable enough).

I am also not sure that the automatic lazy backing of the OS isn't sufficient for these use cases (e.g. JVMTI) too.

Thanks,
Thomas

@tstuefe
Copy link
Member

tstuefe commented Apr 8, 2022

Hi Thomas,

A general thought, maybe for a future RFE:
We now have BitMap and BitSet, which do almost the same thing. Instead of having a new class BitSet, whose job is to be sparse, we could give BitMap the ability to be sparse. We'd save code, reuse the rich BitMap interface and get all the test code for free.
One way to do that would be to make BitMap responsible for committing its underlying backing memory, in chunks, on-demand only. To do that BitMap would need a secondary map - a commit map - to remember which fragments of its backing memory are committed. Commit map would have a bit per 64M fragment.

Please don't do this. BitMap (well, BitMapView) is also used as a lightweight way to overlay it on arbitrary memory temporarily (i.e. remembered sets bitmaps). This typically boils down to no additional memory usage (and overhead) at all. So any additional memory usage or initialization should not be added lightly. Also in most other cases, BitMap is used for very tiny bitmaps anyway, this would just add unnecessary functionality (and overhead) for 99% of the usage of BitMap.

Further GCs are using BitMap to implement such sparse bit sets already, managing their memory with the corresponding Java heap memory; so adding this functionality at the BitMap level would just duplicate functionality with probably not-so-hilarious side effects.

Yes, you are right. I had some more time to think about this and came to the same conclusion. Also abstracting bitwise getters and setters as I thought originally would probably not be enough.

I also do think that the use cases for a dense BitMap are significantly different from (huge) sparse "BitMap"s to warrant its own separate class (not saying that one couldn't use the other internally, or that if that BitSet could be reused in GCs if configurable enough).

I am also not sure that the automatic lazy backing of the OS isn't sufficient for these use cases (e.g. JVMTI) too.

I'm not sure about this. We often run against the commit charge of the underlying OS, which seems to be more frequent than running out of physical memory. I think voluntary uncommit does make sense. Whether the libc actually uncommits on free() is a different question, but in this case, I think they do, the fragments are large enough.

Thanks, Thomas

@rkennke
Copy link
Contributor Author

rkennke commented Apr 8, 2022

One open question is which MEMFLAGS to use. mtTracing doesn't seem to be exactly right. Should I templatize BitSet and make JVMTI use mtServiceability and JRF use mtTracing as it did before?

Yes, I think templatizing for MEMFLAGS makes sense (concurrentHashTable.hpp is too).

I haven't had time to look at the code, but I don't know about this. Slapping a template parameter on everything isn't necessarily a good idea. We recently (JDK-8283368) undid exactly this sort of thing in the cardset code, instead making the MEMFLAGS value a runtime parameter provided at construction time. This avoids a bunch of generated code duplication, additional template syntax, and allows more implementation be put in .cpp files because it isn't a template.

The trouble is that ObjectBitSet is a subclass of CHeapObj, which takes a template parameter for memflags itself. How could I avoid templatizing the ObjectBitSet itself?

@coleenp
Copy link
Contributor

coleenp commented Apr 8, 2022

Here's an idea. Add an mtflag called mtBitMap and use that. There's plenty of bits in mt flags.

@rkennke
Copy link
Contributor Author

rkennke commented Apr 8, 2022

Here's an idea. Add an mtflag called mtBitMap and use that. There's plenty of bits in mt flags.

Naa, I don't think this is justified. It isn't used as widely or frequently to warrant its own flag.

@tstuefe
Copy link
Member

tstuefe commented Apr 8, 2022

Here's an idea. Add an mtflag called mtBitMap and use that. There's plenty of bits in mt flags.

Naa, I don't think this is justified. It isn't used as widely or frequently to warrant its own flag.

Plus, it would muddy the water, since what we actually want is the memory accounted toward the subsystem on behalf of which we create the data structure. mtBitMap would be a bit like "mtArray" for GrowableArray.

Note that there was the idea, and an associated JBS issue (https://bugs.openjdk.java.net/browse/JDK-8281819) of making NMT categories groupable in a one- or multi-layered hierarchy. That way, one could account to "mtGC" -> "mtBitSet", for instance.

For now, I opt to either live with what we have with the typedef'ed BitSet, and leave MEMFLAG cleanups to a later RFE.

Alternatively, make the MEMFLAG a runtime parameter of the bitset, use that for the heap allocated fragments, but use "mtInternal" for the actual class. I mean, those are just a few bytes.

@tstuefe
Copy link
Member

tstuefe commented Apr 8, 2022

Third alternative would be not to derive from CHeapObj at all, since we don't new it anywhere if I see correctly.

@coleenp
Copy link
Contributor

coleenp commented Apr 8, 2022

We must call new on it somewhere. I am not opposed to making this an mtFlags template then. This is a small point in this improvement.

@rkennke
Copy link
Contributor Author

rkennke commented Apr 8, 2022

We must call new on it somewhere. I am not opposed to making this an mtFlags template then. This is a small point in this improvement.

Yes, at some point we need to allocate the fragments and fragments table. We could perhaps work around it by passing MEMFLAGS as constexpr, but I don't think this would change generated code much.

@tstuefe
Copy link
Member

tstuefe commented Apr 8, 2022

We must call new on it somewhere. I am not opposed to making this an mtFlags template then. This is a small point in this improvement.

Yes, at some point we need to allocate the fragments and fragments table. We could perhaps work around it by passing MEMFLAGS as constexpr, but I don't think this would change generated code much.

(all bikeshedding since we all agree the current version is fine, no?)

No, sorry, I meant make BitSet a normal class. Not derived from CHeapObj, since it gets not newed, only used via composition or on a stack. And give it a runtime parm to set the MEMFLAG.

So:

class BitSet {
  const MEMFLAGS _flags;
public:
  BitSet(MEMFLAGS f = mtInternal) : _flags(f) {...}
};

and use those flags for the c-heap allocation of the fragments.

@rkennke
Copy link
Contributor Author

rkennke commented Apr 8, 2022

We must call new on it somewhere. I am not opposed to making this an mtFlags template then. This is a small point in this improvement.

Yes, at some point we need to allocate the fragments and fragments table. We could perhaps work around it by passing MEMFLAGS as constexpr, but I don't think this would change generated code much.

(all bikeshedding since we all agree the current version is fine, no?)

No, sorry, I meant make BitSet a normal class. Not derived from CHeapObj, since it gets not newed, only used via composition or on a stack. And give it a runtime parm to set the MEMFLAG.

So:

class BitSet {
  const MEMFLAGS _flags;
public:
  BitSet(MEMFLAGS f = mtInternal) : _flags(f) {...}
};

and use those flags for the c-heap allocation of the fragments.

Yes, I get that. But the fragments and fragment-table are themselves inner classes that take a MEMFLAGS template. I could (perhaps) either use a constexpr MEMFLAGS arg and pass this through, or do at some point a switch like:

switch (_flags) {
  case mtServiceability:
  ... new BitMapFragmentTable<mtServiceability>(); break;
  case mtServiceability:
  ... new BitMapFragmentTable<mtServiceability>(); break;
  default: ShouldNotReachHere();
}

Which seems kinda-ugly but would work (I think), and avoid making the outer class template-ized.

@tstuefe
Copy link
Member

tstuefe commented Apr 9, 2022

Yes, I get that. But the fragments and fragment-table are themselves inner classes that take a MEMFLAGS template. I could (perhaps) either use a constexpr MEMFLAGS arg and pass this through, or do at some point a switch like:

switch (_flags) {
  case mtServiceability:
  ... new BitMapFragmentTable<mtServiceability>(); break;
  case mtServiceability:
  ... new BitMapFragmentTable<mtServiceability>(); break;
  default: ShouldNotReachHere();
}

Which seems kinda-ugly but would work (I think), and avoid making the outer class template-ized.

I see what you mean. This MEMFLAGS template parameter is deeply interwoven into everything. I'd just live with the current solution. It uses established pattern, so at least nobody is surprised.

I think the basic problem is that CHeapObj itself is a template class. Rethinking MEMFLAGS seems worthwhile for a future RFE. As I wrote, one approach could be to make them a property of the current thread, and switchable and stackable via a Mark class. That way, everything allocated within a given range of frames would count toward a given category. No need to decide on a fine-granular basis. No need for templates. Maybe no need even to have a MEMFLAGS argument for every allocation.

@mlbridge
Copy link

mlbridge bot commented Apr 10, 2022

Mailing list message from Kim Barrett on hotspot-dev:

On Apr 9, 2022, at 2:44 AM, Thomas Stuefe <stuefe at openjdk.java.net> wrote:

On Fri, 8 Apr 2022 17:34:57 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

Yes, I get that. But the fragments and fragment-table are themselves inner classes that take a MEMFLAGS template. I could (perhaps) either use a constexpr MEMFLAGS arg and pass this through, or do at some point a switch like:

```
switch (_flags) {
case mtServiceability:
... new BitMapFragmentTable<mtServiceability>(); break;
case mtServiceability:
... new BitMapFragmentTable<mtServiceability>(); break;
default: ShouldNotReachHere();
}
```

Which seems kinda-ugly but would work (I think), and avoid making the outer class template-ized.

I see what you mean. This MEMFLAGS template parameter is deeply interwoven into everything. I'd just live with the current solution. It uses established pattern, so at least nobody is surprised.

I think the basic problem is that CHeapObj itself is a template class. Rethinking MEMFLAGS seems worthwhile for a future RFE. As I wrote, one approach could be to make them a property of the current thread, and switchable and stackable via a Mark class. That way, everything allocated within a given range of frames would count toward a given category. No need to decide on a fine-granular basis. No need for templates. Maybe no need even to have a MEMFLAGS argument for every allocation.

While working on something else I ran into a similar problem and found a different
approach that seemed to work well. I?m planning to explore it in the context of
CHeapObj, but haven?t gotten around to it yet. I should file an RFE in case someone
else is interested.

@rkennke
Copy link
Contributor Author

rkennke commented Apr 11, 2022

Have we reached a consensus that the current proposal is the way to go? If so, could you please mark the latest revision as Reviewed (again), @tstuefe and @coleenp (and whoever else feels like doing so)?

I also added some basic gtests for ObjectBitSet. Please note my remark on the NULL test case.

Copy link
Contributor

@coleenp coleenp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks great and thank you for all your work resolving comments and bringing this in from the lilliput changes you were going to make. This is a big improvement imo and lets us potentially fix other problems in the jvmti tag map implementation. The template instantiation doesn't bother me. There's only 2 and it's not a big code bloat and it's the way the code has been dealing with memflags. Maybe the mechanism should be revisited, like the GC change that Kim pointed to, but I don't see the motivation to make it something different. Maybe a later RFE can propose a different mechanism for memflags in general. We picked this because we thought it was simple.
Thank you for your work on this, Roman and your comments Thomas.

// NOTE: This is a little weird. NULL is not treated any special: ObjectBitSet will happily
// allocate a fragement for the memory range starting at 0 and mark the first bit when passing NULL.
// In the absense of any error handling, I am not sure what would possibly be a reasonable better
// way to do it, though.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comment. If it matters someday we'll have the comment..

Copy link
Member

@tstuefe tstuefe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good

@rkennke
Copy link
Contributor Author

rkennke commented Apr 11, 2022

Thank you all! GHA are green too, so I'll

/integrate

@openjdk
Copy link

openjdk bot commented Apr 11, 2022

Going to push as commit abfd2f9.
Since your change was applied there have been 18 commits pushed to the master branch:

  • 7edd186: 8283507: Create a regression test for RFE 4287690
  • 74835f7: 8283719: java/util/logging/CheckZombieLockTest.java failing intermittently
  • 205cfb8: 8284093: Memory leak: X11SD_DisposeXImage should also free obdata
  • f4edb59: 8284567: Collapse identical catch branches in java.base
  • 40ddb75: 8284641: Doc errors in sun.security.ssl.SSLSessionContextImpl
  • 8ebea44: 8270090: C2: LCM may prioritize CheckCastPP nodes over projections
  • 755bfcb: 8284581: Serial: Remove unused GenCollectedHeap::collect_locked
  • 0c04bf8: 8284198: Undo JDK-8261137: Optimization of Box nodes in uncommon_trap
  • eb3ead9: 8284036: Make ConcurrentHashMap.CollectionView a sealed hierarchy
  • 92f5e42: 8284549: JFR: FieldTable leaks FieldInfoTable member
  • ... and 8 more: https://git.openjdk.java.net/jdk/compare/0a0267590fad6a2d14d499588c97bb11e554feb9...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Apr 11, 2022
@openjdk openjdk bot closed this Apr 11, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Apr 11, 2022
@openjdk
Copy link

openjdk bot commented Apr 11, 2022

@rkennke Pushed as commit abfd2f9.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@rkennke rkennke deleted the JDK-8283710 branch April 11, 2022 14:51
Comment on lines +25 to +26
#ifndef SHARE_JFR_LEAKPROFILER_JFRBITMAP_HPP
#define SHARE_JFR_LEAKPROFILER_JFRBITMAP_HPP
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong - it should be BITSET not BITMAP

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong - it should be BITSET not BITMAP

Oh, good find! I filed a follow-up issue:
https://bugs.openjdk.java.net/browse/JDK-8284725

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot hotspot-dev@openjdk.org integrated Pull request has been integrated serviceability serviceability-dev@openjdk.org
Development

Successfully merging this pull request may close these issues.

7 participants