-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8333658: NMT: Use an allocator with 4-byte pointers to save memory in NativeCallStackStorage #18979
Conversation
👋 Welcome back jsjolen! A progress list of the required criteria for merging this PR into |
@jdksjolen This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 240 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
@jdksjolen The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
558f3d6
to
f275962
Compare
f275962
to
0a69068
Compare
An obvious improvement: Have the returned pointers remember which allocator object it was allocated from when in debug mode to avoid free:ing an element using the wrong allocator. You can also support detecting double frees by assigning each allocation a unique id. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Started looking at this. I like this.
The smaller beauty spot of your implementation is that by relying on GrowableHeapArray, we are bound to the size of its index type, which is int. So we will never be able to hold more than 2G-1 entries. I think thats okay for now though, and if not, its easy to change in GrowableArray.
A bigger beauty spot is that it needs explicit initialization for every member (IIUC). That feels just wasteful, especially in combination with the generous hard-coded power-of-2 growth of GrowableArray.
} else { | ||
// Follow the link to the next free element | ||
free_start = be.link; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels a bit uncommon and contrary to the normal linked list implementation. If possible, I would opt for a more traditional approach that is easier to understand and does not rely on at_grow (e.g. in case we want adapt this to some other form of backing storage).
e.g.
if (_freelist == -1) {
allocate new slot
} else {
reuse slot at freelist
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that at_grow also could have the problem that an error, resulting in a high index, would cause a large allocation to happen.
I think this class could use an optional max value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Thomas! Sorry for the messy code, I felt that it was in good enough shape to show people and see what they think. I'll clean it up and ping you when it's ready. I answered some of your comments also.
/ Johan
@jdksjolen Github is being weird again and shows all your comments twice. Bad Github. |
Could we try another round of coming up with a better name for this utility? HomogenousObjectArray is eerily similar to G1's humongous object arrays. It's also not clear to me what makes this array an homogenous array. Is our other arrays non-homogenous? Would it make sense to put this in utilities/ instead of nmt/. The include guards should be placed before the includes. |
Thanks!
Do we have anyone else that wants to use this? The threshold in terms of API quality is typically higher for something in utilities/ than a local utility. I'm working on some improvements to this allocator already, maybe we can take a move to utilities/ together with those improvements?
I agree on the name part. We could call it a |
Yes, I will use it to replace some code in Metaspace at least, but for that I need some more features (placing this thing atop of an existing memory range, and templatized index type). Can all be done in a follow-up RFE.
Names are hard. We already have two types of arenas in hotspots, plus glibc has arenas, so "arena" is not ideal either. Arena does usually not imply a free list either, nor homogenous sizes. Thinking about it, Array already sort of implies homogeneous sizes, so maybe "homogeneous" is redundant. ArrayWithFreeList? Unsexy but precise. |
We don't, but why forbid the user from doing that? I may want to place a non-trivial object in there, e.g. one where I forbid copying altogether, or one with a single non-default ctor? What would be the problem with that? Forbidding copying would be a good way to make sure the only instance exists at the place the array provides. So its address stable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost good.
alloc.deallocate(i1); | ||
A::I i3 = alloc.allocate(0); | ||
EXPECT_EQ(p1, &alloc.at(i3)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for what little the tests do, testing with different list types is overengineered. You could just scrap both LL and LL2, and just use a HOA directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to push back on that. I believe that while the list tests aren't testing that much they do serve as a form of example of how to use the HOA. It's also useful to have this in the form of a test, as opposed to a comment, as breaking changes to the HOA will break the examples, forcing them to be updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. I am not sold on the "example" benefit. The array is quite easy in itself.
You could scratch the lists, since their implementations don't do anything to really test the AWFL (new acronym, yay). For the saved LOCs could expand the tests to test with a collection of various types, e.g.
- u1, u2, u8
- unaligned structures that need alignment (e.g. struct (void*; int; ))
- trivial objects
Interesting to see would be that it works, that alignment is correct, that nothing breaks across resizes (indexes stay stable, etc.).
I wont insist of it, just thinking that the current test complexity is somewhat wasted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
really test the AWFL (new acronym, yay).
Can't wait for templatized indices so I can write
using AWFUL = ArrayWithFreeListAllocator<int, uint8_t>;
Interesting to see would be that it works, that alignment is correct, that nothing breaks across resizes (indexes stay stable, etc.).
I wont insist of it, just thinking that the current test complexity is somewhat wasted.
I think those tests make sense if we replace the GrowableArray
as the backing memory area with something else, otherwise that's just testing the GA.
OK, how about we scrap the tests for when moving this to utilities and having the index being a part of the template? I've got that in the pipeline already, so it's not going to take too long I hope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine for me
Agreed, let's go with
We can't have what you suggest at the moment, as objects aren't address stable without a fixed and non-growing memory area. Any form of constructor is actually fine, as those are called as they are. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good now.
One last change: I've removed the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, my comments are applied.
Thank you for the in-depth reviewing of this change. /integrate |
Going to push as commit 57f8b91.
Your commit was automatically rebased without conflicts. |
@jdksjolen Pushed as commit 57f8b91. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
Hi @jdksjolen, |
Hi Sonia, thanks for contacting me. It seems like this is a C++20 issue, but Hotspot only supports up to C++14. Here's a Godbolt for reproducing the issue, change the That ought to mean that I don't have to back out this PR and re-introduce it with the bug fixed, but I still think that we should fix the bug because we do want to get to C++20 some day :-). There's something weird with the fact that you get builds with C++20 support enabled, not sure what your configure options are. |
Hi @jdksjolen, thanks for looking into it!
Makes sense. I'll submit a follow-up PR to fix this.
I'm not sure. I don't have any special configure options. I came across this error after a plain |
Mailing list message from Thomas Stüfe on hotspot-dev: Johan, I would just remove the noncopyable. Its questionable anyway, there On Tue 25. Jun 2024 at 21:01, Sonia Zaldana Calles <szaldana at openjdk.org> -------------- next part -------------- |
Whichever is fine by me. The class will still be non-copyable however, as GrowableArrayCHeap already is. |
Hi,
This PR introduces a new allocator,
IndexedFreelistAllocator
. It uses aGrowableArray
in order to have 4-byte "pointers" to its elements and also works as a freelist as unused slots form a linked list. This allocator cannot shrink its used memory. I'm always open for better names.We then use this allocator in order to store the
NativeCallStackStorage
hash table. This saves4 * 4099 = ~16KiB
of memory as each table element is now only 4 bytes. For each stored stack it also saves 4 bytes. The fact that the allocator cannot shrink is fine, we do not ever remove stacks from the storage.The main point of this is to introduce a pointer-generic experimential API, so I also implemented CHeap and Arena allocators. It's currently placed in NMT, but we might want to move it into utilities. It uses a bit of template-magic, but my IDE (Emacs+clangd) was actually able to figure things out when the types didn't line up correctly etc, so it's not an enemy to IDE help.
It sounded expensive to have the GrowableArray continously realloc its underlying data array, so I did a basic test where we allocate 1 000 000 stacks and push them into NativeCallStackStorage backed by different allocators. This is available in the PR.
The results are as follows on linux-x64-slowdebug:
And on linux-x64:
Obviously, this is a very basic benchmark, but it seems like we're faster than CHeap and Arena for this case.
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/18979/head:pull/18979
$ git checkout pull/18979
Update a local copy of the PR:
$ git checkout pull/18979
$ git pull https://git.openjdk.org/jdk.git pull/18979/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 18979
View PR using the GUI difftool:
$ git pr show -t 18979
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/18979.diff
Webrev
Link to Webrev Comment