-
Notifications
You must be signed in to change notification settings - Fork 5.8k
JDK-8263495: Gather liveness info in the mark phase of G1 full gc #2966
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Welcome back mli! A progress list of the required criteria for merging this PR into |
@Hamlin-Li The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
@tschatzl Hi Thomas, the patch is ready, would you mind to review the patch when available? Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than that, initial review looks good. Thanks for extracting this piece of functionality, which makes the follow-up change likely easier to review.
for (uint j = 0; j < _heap->max_regions(); j++) { | ||
sum += _live_stats[j]._live_words; | ||
} | ||
_heap->set_used(sum * HeapWordSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
used()
is different to live()
in that used()
contains all space that is not allocatable (or should at least). This includes space lost due to fragmentation or just not compacting it. I think this hunk should be removed at this time until it is required.
Also, this value will be overwritten by the call to rebuild_region_sets
in G1FullCollector::prepare_heap_for_mutators()
as far as I can tell.
G1RegionMarkStatsCache _mark_region_cache; | ||
// Number of entries in the per-task stats entry. This seems enough to have a very | ||
// low cache miss rate. | ||
static const uint RegionMarkStatsCacheSize = 1024; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move this constant to G1RegionMarkStatsCache
and reuse in both places where this constant is used instead of duplicating it.
@@ -74,7 +80,8 @@ class G1FullGCMarker : public CHeapObj<mtGC> { | |||
inline void follow_array(objArrayOop array); | |||
inline void follow_array_chunk(objArrayOop array, int index); | |||
public: | |||
G1FullGCMarker(G1FullCollector* collector, uint worker_id, PreservedMarks* preserved_stack); | |||
G1FullGCMarker(G1FullCollector* collector, uint worker_id, | |||
PreservedMarks* preserved_stack, G1RegionMarkStats* mark_stats); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is better at this point to put one argument per line.
@@ -31,6 +31,7 @@ | |||
#include "gc/g1/g1FullGCCompactionPoint.hpp" | |||
#include "gc/g1/g1FullGCMarker.hpp" | |||
#include "gc/g1/g1FullGCOopClosures.inline.hpp" | |||
#include "gc/g1/g1RegionMarkStatsCache.inline.hpp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be needed, the cache is not accessed in this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought so at the beginning, but pre-submit test report compilation error on mac and linux x86, so I added these includes.
compilation error log is at: https://github.com/Hamlin-Li/jdk/runs/2095446669?check_suite_focus=true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It complains that mark_object()
needs it, so it should be enough to add the include to this file: src/hotspot/share/gc/g1/g1FullGCMarker.inline.hpp
which is where mark_object()
is defined.
@@ -27,6 +27,7 @@ | |||
#include "gc/g1/g1FullCollector.hpp" | |||
#include "gc/g1/g1FullGCMarker.inline.hpp" | |||
#include "gc/g1/g1FullGCOopClosures.inline.hpp" | |||
#include "gc/g1/g1RegionMarkStatsCache.inline.hpp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file does not seem to use the G1RegionMarkStatsCache
so this include seems unnecessary.
@@ -31,6 +31,7 @@ | |||
#include "gc/g1/g1FullGCOopClosures.inline.hpp" | |||
#include "gc/g1/g1FullGCPrepareTask.hpp" | |||
#include "gc/g1/g1HotCardCache.hpp" | |||
#include "gc/g1/g1RegionMarkStatsCache.inline.hpp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed.
@@ -28,6 +28,7 @@ | |||
#include "gc/g1/g1FullGCMarker.hpp" | |||
#include "gc/g1/g1FullGCOopClosures.inline.hpp" | |||
#include "gc/g1/g1FullGCReferenceProcessorExecutor.hpp" | |||
#include "gc/g1/g1RegionMarkStatsCache.inline.hpp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed.
@@ -27,6 +27,7 @@ | |||
|
|||
#include "gc/g1/g1BlockOffsetTable.hpp" | |||
#include "gc/g1/g1HeapRegionTraceType.hpp" | |||
#include "gc/g1/g1RegionMarkStatsCache.hpp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file does not use G1RegionMarkStatsCache
, so shouldn't need an include.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing out.
it compiles successfully on my local env, let's see if it will compile in pre-submit build/test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This typically indicates that the header include is missing elsewhere. One thing that often helps is compiling with configure disabled precompiled headers, ie. --disable-precompiled-headers
in configure
.
@@ -44,6 +45,17 @@ G1RegionMarkStatsCache::~G1RegionMarkStatsCache() { | |||
FREE_C_HEAP_ARRAY(G1RegionMarkStatsCacheEntry, _cache); | |||
} | |||
|
|||
// cache size is equal to or bigger than region size to intialize region_index | |||
void G1RegionMarkStatsCache::initialize() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the change really wants to avoid the initial cache misses (for regions != 0), please merge this with the existing reset()
method. I do not think it makes a difference to have this optimization that exploits the simplicity of G1RegionMarkStatsCacheEntry::hash()
.
(And if the hash()
implementation ever changes, this optimization won't work any more).
If you do so, also add a clear comment why the cache is initialized with these values. The comment
// cache size is equal to or bigger than region size to intialize region_index
seems at least misplaced.
The change could, at the assignment (maybe even add an optional parameter to clear()
) state:
// Avoid the initial cache miss and eviction by setting the i'th's cache region_idx to the region_idx due to how the hash is calculated.
Note that resetting to constant values in that loop will likely be folded into a memset
, while this loop needs to be translated as is, and may be a lot slower (with 100k+ regions...) in the pause.
In a textbook implementation, G1RegionMarkStatsCacheEntry::clear()
should probably set the region index to G1_NO_HRM_INDEX
(i.e. (uint)-1) to indicate emptiness so that all initial cache lookups fail (i.e. evict the previous, non-existing value). But then the simple implementation of the G1RegionMarkStatsCache::evict
method will access the liveness array out of bounds... (that's probably why the initial region_idx
values are zero, not as some sort of optimization - we save some conditional in that method).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Thomas, I will merge reset and initialize, and add the comment you suggested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, a quick question: is there a limit on the region number in G1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The limit is 2^32-2
, i.e. the range of uint - 1 for that G1_NO_HRM_INDEX
which is a special value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few additional comments to the ones made by Thomas.
@@ -31,6 +31,7 @@ | |||
#include "gc/g1/g1FullGCCompactionPoint.hpp" | |||
#include "gc/g1/g1FullGCMarker.hpp" | |||
#include "gc/g1/g1FullGCOopClosures.inline.hpp" | |||
#include "gc/g1/g1RegionMarkStatsCache.inline.hpp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It complains that mark_object()
needs it, so it should be enough to add the include to this file: src/hotspot/share/gc/g1/g1FullGCMarker.inline.hpp
which is where mark_object()
is defined.
@@ -25,14 +25,16 @@ | |||
#include "precompiled.hpp" | |||
#include "classfile/classLoaderData.hpp" | |||
#include "gc/g1/g1FullGCMarker.inline.hpp" | |||
#include "gc/g1/g1RegionMarkStatsCache.inline.hpp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't need the inline.hpp
here, .hpp
should be enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Stefan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Starting to look good, just a few more cleanups.
@@ -30,6 +30,7 @@ | |||
#include "gc/g1/g1FullGCMarkTask.hpp" | |||
#include "gc/g1/g1FullGCOopClosures.inline.hpp" | |||
#include "gc/g1/g1FullGCReferenceProcessorExecutor.hpp" | |||
#include "gc/g1/g1RegionMarkStatsCache.inline.hpp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed afaict.
void flush_mark_region_cache() { | ||
_mark_region_cache.evict_all(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move this definition to the cpp-file.
@@ -63,6 +63,9 @@ class G1FullGCMarker : public CHeapObj<mtGC> { | |||
G1FollowStackClosure _stack_closure; | |||
CLDToOopClosure _cld_closure; | |||
|
|||
|
|||
G1RegionMarkStatsCache _mark_region_cache; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add the include to this file:
#include "gc/g1/g1RegionMarkStatsCache.hpp"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Stefan, Just modified as you suggested.
There are windows build failure, but it seems not related to my patch. |
/contributor add Shoubing Ma mashoubing1@huawei.com |
@Hamlin-Li Could not parse
|
Thanks for addressing the comments, I've kicked off some additional functional testing and will do some manual performance testing as well. |
/contributor add Shoubing Ma mashoubing1@huawei.com |
@Hamlin-Li |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some additional sweep through the code showed some more minor issues. Sorry for not noticing earlier.
@@ -60,6 +60,8 @@ void G1FullGCMarkTask::work(uint worker_id) { | |||
|
|||
// Mark stack is populated, now process and drain it. | |||
marker->complete_marking(collector()->oop_queue_set(), collector()->array_queue_set(), &_terminator); | |||
// flush live bytes to regions | |||
marker->flush_mark_region_cache(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use mark_stats
instead of mark_region
in all identifier names. It is kind of jarring to have the class called G1Mark*Stats*Cache
, all other related variables also with something containing mark_stats
and then to use something else in the instances for this for no discernible reason.
I.e. please rename this method and the member from _mark_region_cache
to _mark_stats_cache
.
Of course, if you have a reason to do so please tell.
@@ -60,6 +60,8 @@ void G1FullGCMarkTask::work(uint worker_id) { | |||
|
|||
// Mark stack is populated, now process and drain it. | |||
marker->complete_marking(collector()->oop_queue_set(), collector()->array_queue_set(), &_terminator); | |||
// flush live bytes to regions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment containing full sentences should start with upper case like the one before and end with a full stop. Also that comment is probably better placed at the declaration in the .hpp file.
// Reset liveness of all cache entries to their default values, | ||
// initialize _region_idx to avoid initial cache miss. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments describing the functionality of a method should be at the declaration. Only implementation details should be added at the definition.
void clear() { | ||
_region_idx = 0; | ||
void clear(uint idx = 0) { | ||
_region_idx = idx; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I unfortunately can't comment on the other method is_clear()
here - please remove because it is a) wrong now with that special initialization and b) not used anywhere apparently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed review.
@@ -98,6 +98,10 @@ class G1RegionMarkStatsCache { | |||
|
|||
G1RegionMarkStatsCacheEntry* find_for_add(uint region_idx); | |||
public: | |||
// Number of entries in the per-task stats entry. This seems enough to have a very |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pre-existing: please add "value" after the "This" to read "This value seems enough to...." (or "The suggested value" or something to that effect.
Testing showed that the regression I reported in the other PR is still present for this fix. Same small reproducer can be used:
I've made a change that eliminates the regression by making sure the cache update is not inlined and by doing so allows for other inlining to be done: |
On my local env, it seems make no difference wIth or w/o your patch, for both marking phase and whole full gc pause. But I will apply your patch, Thanks a lot for the benchmarking. |
What inlining gets done will depend a lot on the build env, so it might be that your build don't experience the same issue. Because you mean that you can't reproduce the regression at all right? Or is the regression present even after my patch? |
It's all the same, following data is in the order of master/with/w/o, (I don't "make clean" before make images) $ java -Xlog:gc+phases=trace SystemGCLarge | grep "Phase 1: Mark live objects" $ java -Xlog:gc+phases=trace SystemGCLarge | grep "Phase 1: Mark live objects" $ java -Xlog:gc+phases=trace SystemGCLarge | grep "Phase 1: Mark live objects" |
Thanks for verifying. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm assuming that Windows build failure is just something specific to Github. Thanks.
To answer my own question: it builds internally on Windows. So it's good to go.
@Hamlin-Li This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 791 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good with the current changes as well.
/integrate |
for (uint j = 0; j < heap->max_regions(); j++) { | ||
_live_stats[j].clear(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: i
could be used instead of j
, right?
@Hamlin-Li Since your change was applied there have been 791 commits pushed to the
Your commit was automatically rebased without conflicts. Pushed as commit 8c8d1b3. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
Gather liveness info in the mark phase of G1 full gc.
Per-region liveness info in the mark phase of G1 full gc is for several purposes:
JDK-8262068 needs it to determine whether compaction of a region should be skipped
JDK-8258431 for a JFR event that prints live set size estimate
so add this functionality.
Progress
Issue
Reviewers
Contributors
<mashoubing1@huawei.com>
Download
To checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/2966/head:pull/2966
$ git checkout pull/2966
To update a local copy of the PR:
$ git checkout pull/2966
$ git pull https://git.openjdk.java.net/jdk pull/2966/head