-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JDK-8297958: NMT: Display peak values #11497
Conversation
👋 Welcome back stuefe! A progress list of the required criteria for merging this PR into |
19d57bb
to
40a4dfd
Compare
Webrevs
|
Important addition to the memory stats in my opinion. I agree having it in release build would be great.
It can be confusing for others not aware of this behavior, and may think that the "peak" value is missing in the output. I believe it would be less confusing if "peak" is "always" displayed. |
Thank you, @ashu-mehra! The first version had "(peak)" written after every line if at current peak. I found the output cluttered, but let's see what others say. I'm fine with reintroducing it again. Cheers, Thomas |
Example output when all peak values are indicated with "(peak)":
Seems a bit verbose, but I'm fine with it if others are. |
If we are at peak, then instead of writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Thomas!
I believe that this patch is in shape to be merged, so I will approve. I would however appreciate if you consider my nits and comments for this PR, especially regarding the peak values that Ashutosh also mentioned.
Cheers,
Johan
|
||
size_t MemoryCounter::peak_size() const { | ||
return Atomic::load(&_peak_size); | ||
} | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit and pre-existing: Can we add in // ASSERT
at end of line here?
DEBUG_ONLY(inline const MemoryCounter& malloc_counter() const { return _malloc; }) | ||
DEBUG_ONLY(inline const MemoryCounter& arena_counter() const { return _arena; }) | ||
const MemoryCounter* malloc_counter() const { return &_malloc; } | ||
const MemoryCounter* arena_counter() const { return &_arena; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me like we can keep these as returning references, if the printers take them in as references instead. Am I missing something?
I'm fine with keeping them as pointers, but I do want to make sure that I'm not missing some reasoning here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(My first answer got lost somehow)
You understand correctly. I have a slight preference for pointers since one can tell them easier apart in coding, but ultimately just a coin toss here.
Okay, "at peak" it is :-) |
Hi @jdksjolen, thanks a lot for the quick review. You and @ashu-mehra both wanted a visual indication for the current peak, I added that:
Cheers, Thomas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A very useful change Thomas and it looks good in general. Just two small things to clean up.
Regarding having it on in release build, I think that would make sense as long as the perf impact isn't to big. I would not expect it to be so bad, but certainly good to measure it closely before doing it.
@@ -58,11 +61,21 @@ void MemReporterBase::print_malloc(size_t amount, size_t count, MEMFLAGS flag) c | |||
amount_in_current_scale(amount), scale); | |||
} | |||
|
|||
// blends out mtChunk count number |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before we explicitly passed in 0 as the count for mtChunk
, but this is not true anymore since we pass in the MemoryCounter
. So if we want to omit mtChunk
count we need to reset the count here or handle it some other way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What it the reason for omitting the mtChunk
count?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh good catch.
I think this has to do with this ugly hack in MallocMemorySnapshot::make_adjustment():
So, we account chunks with mtChunk. Easy. But we also account for arenas. Arenas consists of chunks. To not report memory twice (once under malloced-with-mtChunk, once as "arena") we decrease the malloc size of mtChunk by the combined arena size of all arenas. What's left should be malloc size of free chunks in the free pool.
But that adjustment messes up the mtChunk malloc counter, since it is implemented as one gigantic free. So the counter is garbage.
I'll think of something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation. I guess the easy fix is just:
const size_t count = (flag == mtChunk) ? 0 : c->count();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Christ this is complex. :-(
Both detail reporter and summary reporter share this printer code, but via MEMFLAG the print output is subtly changed. MEMFLAG=mtNone is taken as "print in summary report format" and != mtNone is "print in detail report format". This is all fixable but makes the already convoluted report not prettier.
The other option would be to, in MallocMemorySnapshot::make_adjustment(), set the count to zero. But I don't dare this because afterwards the counter would assert if it were to be decreased further.
See this mess? MallocMemorySnapshot is not really usable anymore as counter holder after the adjustment is made, because mtChunk counter got modified and now may assert for negative overflow. I believe MallocMemorySnapshot::make_adjustment() is only valid if you don't modify the counters anymore afterwards.
I decided to leave it as it is. I don't think that anyone cares. As I wrote, I have plans to simplify this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine by me =)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this comment should go as well, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
@tstuefe This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 70 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
Thanks @jdksjolen and @kstefanj for reviews! I'll wait for the GHAs to complete, then integrate. |
/integrate |
Going to push as commit 336d230.
Your commit was automatically rebased without conflicts. |
Historical peak values can be very useful in analyzing memory footprint.
For example, want to know how much memory the compiler needs during warmup? You have to get an NMT report at the exact time, with compile arenas at their largest combined extensions. But if we had peak values, you'd see how much the compiler needed by just looking at its arena peak size.
We already collect peak values in debug builds, but never actually display them. Since we already pay for them, we might just as well print them.
There is also a small issue that peak size and count are updated separately. It makes not much sense to treat these as independent values. Therefore this patch changes the meaning and implementation of peak count from today's "highest count" to "count at the point peak size was reached".
How this looks like:
Notes:
This RFE just adds peak to the debug VM. In debug, we already have all values, its just a matter of printing them. I would love to see peak values in release builds too, but collecting them does cost one or more CAS per malloc and therefore we must analyze performance before enabling. Having them in release would also remove the remaining #ifdef ASSERT.
I omit printing peak values when we are at peak. So if "peak" is missing, current peak is implied.
I only print peak values in summary and detail mode, not in any of the diff modes, to keep code complexity low and because diff modes are more about baseline compare.
In detail mode, there is a small display issue that call sites will be omitted that have no current allocations. A hypothetical call site that allocated a zillion byte, then freed them all, will not be shown even though its peak value would be interesting. That is an issue for another RFE.
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/11497/head:pull/11497
$ git checkout pull/11497
Update a local copy of the PR:
$ git checkout pull/11497
$ git pull https://git.openjdk.org/jdk pull/11497/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 11497
View PR using the GUI difftool:
$ git pr show -t 11497
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/11497.diff