Skip to content

Conversation

@eme64
Copy link
Contributor

@eme64 eme64 commented Dec 19, 2023

JDK-8247755 introduced the GrowableArrayCHeap. This duplicates the current C-Heap allocation capability in GrowableArray. I now remove that from GrowableArray and move all usages to GrowableArrayCHeap.

This has a few advantages:

  • Clear separation between arena (and resource area) allocating array and C-heap allocating array.
  • We can prevent assigning / copying between arrays of different allocation strategies already at compile time, and not only with asserts at runtime.
  • We should not have multiple implementations of the same thing (C-Heap backed array).
  • GrowableArrayCHeap is NONCOPYABLE. This is a nice restriction, we now know that C-Heap backed arrays do not get copied unknowingly.

Bonus
We can now restrict GrowableArray element type E to be std::is_trivially_destructible<E>::value == true. The idea is that arena / resource allocated arrays get abandoned, often without being even cleared. Hence, the elements in the array are never destructed. But if we only use elements that are trivially destructible, then it makes no difference if the destructors are ever called, or the elements simply abandoned.

For GrowableArrayCHeap, we expect that the user eventually calls the destructor for the array, which in turn calls the destructors of the remaining elements. Hence, it is up to the user to ensure the cleanup. And so we can allow non-trivial destructors.

Testing
Tier1-3 + stress testing: pending


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Integration blockers

 ⚠️ Too few reviewers with at least role reviewer found (have 0, need at least 1) (failed with updated jcheck configuration in pull request)
 ⚠️ Whitespace errors (failed with updated jcheck configuration in pull request)

Issue

  • JDK-8322476: Remove GrowableArray C-Heap version, replace usages with GrowableArrayCHeap (Enhancement - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/17160/head:pull/17160
$ git checkout pull/17160

Update a local copy of the PR:
$ git checkout pull/17160
$ git pull https://git.openjdk.org/jdk.git pull/17160/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 17160

View PR using the GUI difftool:
$ git pr show -t 17160

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/17160.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Dec 19, 2023

👋 Welcome back epeter! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot changed the title 8322476 8322476: Remove GrowableArray C-Heap version, replace usages with GrowableArrayCHeap Dec 19, 2023
@openjdk
Copy link

openjdk bot commented Dec 19, 2023

@eme64 The following labels will be automatically applied to this pull request:

  • graal
  • hotspot
  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added graal graal-dev@openjdk.org serviceability serviceability-dev@openjdk.org hotspot hotspot-dev@openjdk.org labels Dec 19, 2023
@eme64 eme64 marked this pull request as ready for review December 20, 2023 12:53
@openjdk openjdk bot added the rfr Pull request is ready for review label Dec 20, 2023
@mlbridge
Copy link

mlbridge bot commented Dec 20, 2023

Webrevs

Copy link

@kimbarrett kimbarrett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pre-existing: There are a lot of non-static class data members that are pointers to
GrowableArray that seem like they would be better as direct, e.g. non-pointers.

pre-existing: There are a lot of iterations over GrowableArray's that would be
simplified by using range-based-for.

I'm not a fan of the additional clutter in APIs that the static memory types add.
If we had a variant of GrowableArrayCHeap that was not itself dynamically allocatable
and took a memory type to use internally as a constructor argument, then I think a
lot of that clutter could be eliminated. It could be used for ordinary data members
that are direct GAs rather than pointers to GAs. I think there is a way to do something
similar for static data members that are pointers that are dynamically allocated later,
though that probably requires more work.

I've not yet reviewed the changes to growableArray.[ch]pp yet, nor the test changes.
But I've run out of time and energy for this for today.

private:
template <typename T>
static int array_length_or_zero(GrowableArray<T>* array) {
static int array_length_or_zero(GrowableArrayCHeap<T, mtClass>* array) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Argument could be GrowableArrayView<T>*, removing the coupling on the memory type.

Also, pre-existing: the argument should be const.

void dump_java_heap_objects(GrowableArray<Klass*>* klasses) NOT_CDS_JAVA_HEAP_RETURN;
void dump_shared_symbol_table(GrowableArray<Symbol*>* symbols) {
void dump_java_heap_objects(GrowableArrayCHeap<Klass*, mtClassShared>* klasses) NOT_CDS_JAVA_HEAP_RETURN;
void dump_shared_symbol_table(GrowableArrayView<Symbol*>* symbols) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pre-existing: Perhaps the arguments to these should be const.


#if INCLUDE_CDS_JAVA_HEAP
void VM_PopulateDumpSharedSpace::dump_java_heap_objects(GrowableArray<Klass*>* klasses) {
void VM_PopulateDumpSharedSpace::dump_java_heap_objects(GrowableArrayCHeap<Klass*, mtClassShared>* klasses) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pre-existing: Perhaps the argument should be const.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pre-existing and can't attach comment to line#50: int i; is dead variable.


_num_entries_written = 0;
_buckets = NEW_C_HEAP_ARRAY(GrowableArray<Entry>*, _num_buckets, mtSymbol);
_buckets = NEW_C_HEAP_ARRAY(EntryBucket*, _num_buckets, mtSymbol);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pre-existing: It seems like the code could be simpler if the type of _buckets was
GrowableArrayCHeap<EntryBucket*, mtSymbol>.

int init_length = 64;
_methods = new (mtInternal) GrowableArray<Method*>(init_length, mtInternal);
_bcis = new (mtInternal) GrowableArray<int>(init_length, mtInternal);
_methods = new GrowableArrayCHeap<Method*, mtInternal>(init_length);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider renaming init_length => init_capacity.

static GrowableArray<CodeHeap*>* _compiled_heaps;
static GrowableArray<CodeHeap*>* _nmethod_heaps;
static GrowableArray<CodeHeap*>* _allocable_heaps;
typedef GrowableArrayCHeap<CodeHeap*, mtCode> CodeHeapArray;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pre-existing: Consider moving CodeHeapArray to namespace scope and prefer using it to the long-form.
If not at namespace scope, it could at least be public in this class and used throughout, including in public
APIs.

NEW_ARENA_ARRAY(arena, type, 1)

#ifdef ASSERT
bool Arena_contains(const Arena* arena, const void* ptr);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function doesn't seem necessary. Directly calling arena->contains(ptr) in the one place it's being seems
like it should suffice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kimbarrett the reason was that I need to call this from the hpp file, and I encountered some circular dependency I did could not resolve. So I needed to move something off to the cpp files. Either I put it in arena.cpp, or in growableArray.cpp. But If I put things off to growableArray.cpp from the GrowableArray class, then it will not easily instantiate the templates, so that is not possible then. Hence I have to put it into arena.cpp

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think a global API is warranted to support that local debug-only
implementation detail.

The inclusion problem arises because the PR eliminates GrowableArrayMetadata
(a non-templated class), thereby forcing the init_checks helper to be moved to
GrowableArray (a class template). That forced moving the implementation from
the .cpp to the .hpp.

One solution might be to move the new version of init_checks to
growableArray.inline.hpp. After all, breaking circularities is one of the
reasons one might have an inline.hpp file. However, that file doesn't
currently exist, and I think introducing it would have too much fannout. And
it might not even work without even more effort; there might be .hpp files
that contain allocations of GrowableArray.

Much simpler, and I think probably better, is to keep GrowableArrayMetadata
(perhaps under a different name - GrowableArrayArenaHolder?), now holding the
allocation arena and providing init_checks (and anything else that seems
appropriate), with init_checks defined similarly to the current definition in
the .cpp file.

KlassInfoHisto::KlassInfoHisto(KlassInfoTable* cit) :
_cit(cit) {
_elements = new (mtServiceability) GrowableArray<KlassInfoEntry*>(_histo_initial_size, mtServiceability);
_elements = new GrowableArrayCHeap<KlassInfoEntry*, mtServiceability>(_histo_initial_size);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pre-existing: Why is this initialization separate from the ctor-initializer? And this looks like an example of
where it would be better as a direct GA member rather than a pointer to GA.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can name be changed to _histo_initial_capacity?

@eme64
Copy link
Contributor Author

eme64 commented Dec 21, 2023

@kimbarrett Thanks for looking at the PR!
I see you address a lot of "pre-existing" issues. And you would like GrowableArrayCHeap not have the MEMFLAGS in the template argument but maybe as a constructor argument instead. Or maybe a GACH version that only allocates once, though I guess that would limit what kinds of methods you could call on it...
Can we address these issues as separate RFE's?

Copy link
Contributor

@jdksjolen jdksjolen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow! Thank you for this Emanuel.

I went through your changes and I am happy with them. There are some spelling issues and my opinions on how to write the doc strings. I also asked for some "length"/"size" naming to be changed to "capacity", you don't have to do this as it's pre-existing, but it would make that code clearer.

assert(edge != nullptr, "invariant");
if (_leak_context_edges == nullptr) {
_leak_context_edges = new (mtTracing) GrowableArray<const StoredEdge*>(initial_size, mtTracing);
_leak_context_edges = new GrowableArrayCHeap<const StoredEdge*, mtTracing>(initial_size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre-existing: initial_capacity is a better name.

static GrowableArray<T>* c_heap_allocate_array(int size = initial_array_size) {
return new (mtTracing) GrowableArray<T>(size, mtTracing);
static GrowableArrayCHeap<T, mtTracing>* c_heap_allocate_array(int size = initial_array_size) {
return new GrowableArrayCHeap<T, mtTracing>(size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

capacity instead of size


JfrThreadGroup::JfrThreadGroup() :
_list(new (mtTracing) GrowableArray<JfrThreadGroupEntry*>(initial_array_size, mtTracing)) {}
_list(new GrowableArrayCHeap<JfrThreadGroupEntry*, mtTracing>(initial_array_size)) {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

capacity

assert(length >= 1, "invariant");
assert(dcmd_recordings_array == nullptr, "invariant");
dcmd_recordings_array = new (mtTracing) GrowableArray<JfrStartFlightRecordingDCmd*>(length, mtTracing);
dcmd_recordings_array = new GrowableArrayCHeap<JfrStartFlightRecordingDCmd*, mtTracing>(length);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

capacity

template <typename T>
static GrowableArray<T>* c_heap_allocate_array(int size = initial_array_size) {
return new (mtTracing) GrowableArray<T>(size, mtTracing);
static GrowableArrayCHeap<T, mtTracing>* c_heap_allocate_array(int size = initial_array_size) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

capacity


// THE GrowableArray.
// The GrowableArray internal data is allocated from either:
// - Resrouce area (default)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling

//
// Memory overhead: The multiple allocation strategies uses extra metadata
// embedded in the instance.
// Itself, it can be embedded, on stack, resource_arena or arena allocated.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Itself can be allocated on stack, resource area or arena allocated."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That it can be embedded into another class/struct is a given, imho.

//
// For C-Heap allocation use GrowableArrayCHeap.
//
// Note, that with GrowableArray does not deallocate the allocated memory from
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"that the" not "that with"

Comment on lines +636 to +638
// GrowableArray is copyable, but it only creates a shallow copy. Hence, one has
// to be careful not to duplicate the state and then diverge while sharing the
// underlying data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sad but true :-(

static E* allocate(int max) {
return (E*)GrowableArrayResourceAllocator::allocate(max, sizeof(E));
}
// Since GrowableArray is arena / resource area allocated, it is a custom to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"it is a custom to" can basically be removed

"Since Growable array is arena/resource area allocated it does not destruct its elements. Therefore, ..." is sufficient.

@kimbarrett
Copy link

... I also asked for some "length"/"size" naming to be changed to "capacity", you don't have to do this as it's pre-existing, but it would make that code clearer.

I think I only commented on one in my pass over the code, but I agree with all of @jdksjolen suggestions for those.

@stefank
Copy link
Member

stefank commented Dec 22, 2023

I'm not a fan of the additional clutter in APIs that the static memory types add. If we had a variant of GrowableArrayCHeap that was not itself dynamically allocatable and took a memory type to use internally as a constructor argument, then I think a lot of that clutter could be eliminated. It could be used for ordinary data members that are direct GAs rather than pointers to GAs. I think there is a way to do something similar for static data members that are pointers that are dynamically allocated later, though that probably requires more work.

FWIW, I added the GrowableArrayCHeap and the static memory type. I did that because there was a perceived need to minimize the memory usage, because we were going to use an extreme amount of these arrays for one of our subsystems in ZGC. It later turned out that we really didn't need to squeeze out the last bit of memory for that use-case. I would really like to get rid of the the static memory type from GrowableArrayCHeap, and just add it as an instance member.

@eme64
Copy link
Contributor Author

eme64 commented Dec 22, 2023

@stefank Maybe it would then make sense to do that before this change here? Because we will really have to touch all these changes here again for that.

@kimbarrett @stefank @jdksjolen
Or alternatively, we just say that we want to keep the C-Heap functionality inside GrowableArray, and simply remove GrowableArrayCHeap? Though I honestly would prefer a world where every allocation strategy has its own GrowableArray* variant, so that it is clear statically that they cannot be assigned to each other. So my preference would even be to have these:

CHeapGrowableArray
ArenaGrowableArray
ResourceAreaGrowableArray

What do you think?

And how important is it that we do not use virtual with GrowableArray? Because the implementation and testing is much nastier without it. Is the V-table still such a overhead that we need to avoid it?

@mlbridge
Copy link

mlbridge bot commented Dec 23, 2023

Mailing list message from Kim Barrett on graal-dev:

[Kim Barret wrote:]

pre-existing: There are a lot of non-static class data members that are pointers to
GrowableArray that seem like they would be better as direct, e.g. non-pointers.

pre-existing: There are a lot of iterations over GrowableArray's that would be
simplified by using range-based-for.

I'm not a fan of the additional clutter in APIs that the static memory types add.
If we had a variant of GrowableArrayCHeap that was not itself dynamically allocatable
and took a memory type to use internally as a constructor argument, then I think a
lot of that clutter could be eliminated. It could be used for ordinary data members
that are direct GAs rather than pointers to GAs. I think there is a way to do something
similar for static data members that are pointers that are dynamically allocated later,
though that probably requires more work.

On Fri, 22 Dec 2023 13:22:19 GMT, Stefan Karlsson <stefank at openjdk.org> wrote:
FWIW, I added the GrowableArrayCHeap and the static memory type. I did that because there was a perceived need to minimize the memory usage, because we were going to use an extreme amount of these arrays for one of our subsystems in ZGC. It later turned out that we really didn't need to squeeze out the last bit of memory for that use-case. I would really like to get rid of the the static memory type from GrowableArrayCHeap, and just add it as an instance member.

Currently the basic overhead for a static-MEMTYPE GA is 16 bytes (data
pointer, int length and capacity). So 16 bytes on 64bit platforms. Adding
the memory type adds 8 bytes (due to alignment), so 50% increase. Though it
may not matter for dynamically allocated GAs, since std::max_align_t is
probably at least 32 bytes. Though I think we seriously overuse dynamic
allocation for GAs.

Does it really matter. I don't know.

StringDedup uses a lot of GAs. (Though it could use 1/2 as many with some work,
since they are used in pairs that always have the same length and capacity. At
the time I was working on it I was feeling lazy and just used pairs of GAs, with the
intention of coming back to that at some point if it proved to be a significant problem.)

On Dec 22, 2023, at 8:33 AM, Emanuel Peter <epeter at openjdk.org> wrote:
@stefank Maybe it would then make sense to do that before this change here? Because we will really have to touch all these changes here again for that.

@kimbarrett @stefank @jdksjolen
Or alternatively, we just say that we want to keep the C-Heap functionality inside `GrowableArray`, and simply remove `GrowableArrayCHeap`? Though I honestly would prefer a world where every allocation strategy has its own `GrowableArray*` variant, so that it is clear statically that they cannot be assigned to each other. So my preference would even be to have these:

CHeapGrowableArray
ArenaGrowableArray
ResourceAreaGrowableArray

What do you think?

I think having distinct CHeap, Resource, and Arena allocated types would be an
improvement. If we were using std::vector with associated allocators we'd have
something like that. (The allocator type is a template parameter for
std::vector.) Refactoring GA in that direction is certainly possible, and
might be an improvement.

We could also have two variants of CHeap GAs, one with the memtype being a
constructor argument (and possibly different from the memtype used to
dynamically allocate the GA, as is currently the case, although I think that
feature is never used), and a more compact one with a static memtype.

I did some work on HotSpot allocators for standard containers like std::vector
a while ago that could be applied to GA. It includes support for dynamic and
static memtypes, and shows that isn't hard to do.

I thought about whether this PR should go ahead without some of that stuff in
place. It touches a lot of places that would probably be touched again if we
went in that sort of direction. OTOH, a lot of these same places would
probably need to be touched repeatedly anyway, one way or another. For
example, dealing with what appears to be a large amount of unnecessary dynamic
allocation of GAs would also touch a lot of these places, and I'm suspicious
of combining the type changes and the allocation changes in one PR.

And how important is it that we do not use `virtual` with `GrowableArray`? Because the implementation and testing is much nastier without it. Is the V-table still such a overhead that we need to avoid it?

I don't think there is any need for virtual functions in GA.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: Message signed with OpenPGP
URL: <https://mail.openjdk.org/pipermail/graal-dev/attachments/20231222/4d93430b/signature-0001.asc>

@bridgekeeper
Copy link

bridgekeeper bot commented Jan 20, 2024

@eme64 This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@jdksjolen
Copy link
Contributor

Not yet bot!

@openjdk
Copy link

openjdk bot commented Feb 13, 2024

@eme64 this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout JDK-8322476
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added merge-conflict Pull request has merge conflict with target branch and removed rfr Pull request is ready for review labels Feb 13, 2024
@bridgekeeper
Copy link

bridgekeeper bot commented Mar 12, 2024

@eme64 This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@openjdk
Copy link

openjdk bot commented Mar 13, 2024

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@bridgekeeper
Copy link

bridgekeeper bot commented Apr 11, 2024

@eme64 This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@bridgekeeper
Copy link

bridgekeeper bot commented May 9, 2024

@eme64 This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

@bridgekeeper bridgekeeper bot closed this May 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

graal graal-dev@openjdk.org hotspot hotspot-dev@openjdk.org merge-conflict Pull request has merge conflict with target branch serviceability serviceability-dev@openjdk.org

Development

Successfully merging this pull request may close these issues.

4 participants