New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JDK-8268893: jcmd to trim the glibc heap #4510
Conversation
👋 Welcome back stuefe! A progress list of the required criteria for merging this PR into |
@tstuefe The following labels will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command. |
815e076
to
97d71ea
Compare
Webrevs
|
Mailing list message from David Holmes on hotspot-runtime-dev: Hi Thomas, On 17/06/2021 4:41 pm, Thomas Stuefe wrote:
Is it perhaps worthwhile trying to generalize this to a jcmd to request Thanks, |
1 similar comment
Mailing list message from David Holmes on hotspot-runtime-dev: Hi Thomas, On 17/06/2021 4:41 pm, Thomas Stuefe wrote:
Is it perhaps worthwhile trying to generalize this to a jcmd to request Thanks, |
Hi David, thanks for looking into this!
I thought about this too. It would certainly make the jcmd look nicer. But I think that this would be one of the cases where the supporter needs to know the details of what the command does under the hood, so by hiding the complexity behind the platform nothing is gained; you need to know what happens to gauge the influence it has on the system (eg performance loss by dropping caches) and to analyze the precise after-effects. I I am undecided on this. I will ask around offline. ..Thomas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this little new diagnostic command which I think can be quite useful in specific situations.
However, in contrast to other reviewers, I'd rather keep this simple and Glibc specific instead of extending it to a more general but mostly empty command.
I'd therefore propose to rename this command to glibc_trim_heap
to make it evident from the command name already that it is Glibc-specific.
Besides that, just cosmetic changes and suggestions.
Thanks a lot Volker!
Yes, that was my thought too. Lets wait for @dholmes-ora to chime in, whether we can all agree on a glibc specific variant. I also preferred that one.
All of which make sense, I'll work them in. Thanks! ..Thomas |
Mailing list message from David Holmes on hotspot-runtime-dev: On 21/06/2021 7:58 pm, Thomas Stuefe wrote:
I don't totally oppose the specialised variant, but it certainly isn't Cheers, |
1 similar comment
Mailing list message from David Holmes on hotspot-runtime-dev: On 21/06/2021 7:58 pm, Thomas Stuefe wrote:
I don't totally oppose the specialised variant, but it certainly isn't Cheers, |
I'll prepare the CSR and bring it to review. Cheers, Thomas |
Yes, we have. E.g.:
|
} | ||
_output->print_raw(ss_report.base()); | ||
log_info(os)("malloc_trim: "); | ||
log_info(os)("%s", ss_report.base()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might also make sense to put this into a single log line like log_info(os)("malloc_trim:\n%s", ss_report.base());
to get something like:
[139,068s][info][os] malloc_trim:
Virtual size before: 28849744k, after: 28849724k, (-20k)
RSS before: 8685896k, after: 920740k, (-7765156k)
Swap before: 0k, after: 0k, (0k)
instead of:
[139,068s][info][os] malloc_trim:
[139,068s][info][os] Virtual size before: 28849744k, after: 28849724k, (-20k)
RSS before: 8685896k, after: 920740k, (-7765156k)
Swap before: 0k, after: 0k, (0k)
Hi, I created the CSR for this command: https://bugs.openjdk.java.net/browse/JDK-8269345 When you have time, please take a look. Thank you. ..Thomas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the cleanups. Looks good to me now.
@tstuefe This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 92 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
/csr |
@jerboaa this pull request will not be integrated until the CSR request JDK-8269345 for issue JDK-8268893 has been approved. |
I renamed the command as agreed upon in the CSR discussion. |
Gentle ping. Need a second review. @dholmes-ora ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Thomas,
This seems okay to me.
Thanks,
David
Thank you David and Volker! |
/integrate |
@tstuefe This PR has not yet been marked as ready for integration. |
The CSR https://bugs.openjdk.java.net/browse/JDK-8269345 is in Provisional. Once it's approved it should become ready for integration. |
Oh, you are right. Good that skara caught that. |
Not sure what to do here honestly. The CSR seems stuck in "provisional" without any indication if any action is required from my side. This PR has gathered enough reviews meanwhile, but I cannot integrate it without the CSR moving forward. Someone from Oracle, please advise? |
You have to move the CSR to "Proposed" or "Finalized" otherwise it probably won't appear on Joe's work list. I've moved it to "Finalized" now because I think all the issues have been resolved. Please feel free to change the state again if you think that's not appropriate. |
Thanks a lot Volker! |
/integrate |
Going to push as commit 6096dd9.
Your commit was automatically rebased without conflicts. |
Proposal to add a Linux+glibc-only jcmd to manually induce malloc_trim(3) in the VM process.
The glibc is somewhat notorious for retaining released C Heap memory: calling free(3) returns memory to the glibc, and most libc variants will return at least a portion of it back to the Operating System, but the glibc often does not.
This depends on the granularity of the allocations and a number of other factors, but we found that many small allocations in particular may cause the process heap segment (hence RSS) to get bloaty. This can cause the VM to not recover from C-heap usage spikes.
The glibc offers an API, "malloc_trim", which can be used to cause the glibc to return free'd memory back to the Operating System.
This may cost performance, however, and therefore I hesitate to call malloc_trim automatically. That may be an idea for another day.
Instead of an automatic trim I propose to add a jcmd which allows to manually trigger a libc heap trim. Such a command would have two purposes:
Note that this command also helps with analyzing libc peaks which had nothing to do with the VM - e.g. peaks created by customer code which just happens to share the same process as the VM. Such memory does not even have to show up in NMT.
I propose to introduce this command for Linux only. Other OSes (apart maybe AIX) do not seem to have this problem, but Linux is arguably important enough in itself to justify a Linux specific jcmd.
CSR for this command: https://bugs.openjdk.java.net/browse/JDK-8269345
Note that an alternative to a Linux-only jcmd would be a command which would trim the C-heap on all platforms, with implementations to be filled out later.
=========
This patch:
=========
Example:
A programm causes a temporary peak in C-heap usage (in this case, triggered via Unsafe.allocateMemory), right away frees the memory again, so its not leaky. The peak in RSS was ~8G (even though the user allocation was way smaller - glibc has a lot of overhead). The effects of this peak linger even after returning that memory to the glibc:
We execute the new trim command via jcmd:
It prints out reduction in virtual size, rss and swap. The virtual size did not decrease since no mappings had been unmapped by the glibc. However, the process heap was shrunk heavily by the glibc, resulting in a large drop in RSS (8.5G->900M), freeing >7G of memory:
When the VM is started with -Xlog:os, this is also logged:
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/4510/head:pull/4510
$ git checkout pull/4510
Update a local copy of the PR:
$ git checkout pull/4510
$ git pull https://git.openjdk.java.net/jdk pull/4510/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 4510
View PR using the GUI difftool:
$ git pr show -t 4510
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/4510.diff