New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8254739: G1: Optimize evacuation failure for regions with few failed objects #5181
8254739: G1: Optimize evacuation failure for regions with few failed objects #5181
Conversation
👋 Welcome back mli! A progress list of the required criteria for merging this PR into |
@Hamlin-Li The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
Initial feedback: we do not care about cases like One problem I can see now with performance is the use of a linked list for storing the regions. Using such requires quite a lot of memory allocation (every I am also not really convinced that the failed object lists should be attached to Overall we can add some cut-off for whole-region iteration at some point as you described as needed later I think. Do you have a breakdown of the time taken for the parts of the I understand this is a bit hard to measure in a realistic environment because while Later I will follow up with a more thorough look. Hth, |
Thanks a lot for the detailed review and suggestion.
I agree.
I have written a simple version of the "array" for this specific usage, I do not implement a buffer to cache the "node" used in "array". Seems it's getting better performance than my previous linked list version. ( I measure the end-to-end time simply)
I agree. We could refactor it thoroughly. Could we do it in a separate issue? please kindly point to the issue if you already track this with an issue.
yes.
I will measure the time of the new implementation with "array", and update the information later.
Thanks a lot, this will be great helpful.
|
…failure objs in one region
Hi,
Yes, we can merge this later; there is the related JDK-8267834: Refactor G1CardSetAllocator and BufferNode::Allocator to use a common base class that may provide some useful base.
I am currently working on separating the G1 young collection algorithm from
I filed JDK-8272977 to not forget on this.
Thanks, |
The test based on new version (segmented array) shows that most (more than 90%) of iteration time is spent on "iterate_internal", compact cost almost no time, and less than 10% time is spent on sort. And I also attach the perf data on the JBS bug, for "end to end" time/"pause young" time/"Evacuate Collection Set” time/"Post Evacuate Collection Set" time/"Remove Self Forwards" time. Generally I think the new implemention works well for G1EvacuationFailureALotCount == 1/2/..., not just for G1EvacuationFailureALotCount >= 10. ( Although I'm not sure why this optimization also gets better data than origin on "Evacuate Collection Set", in my mind it should cost more because we "record" more information when evacuation. Am I miss something here? ) |
Just fyi, I am looking at the change, but it's a relatively significant patch and I want to do some local testing too. |
Performance is good as is, particularly the Remove self-forwards phase is much much faster now. I added a figure to the CR here. Really nice. Although there are a few existing abnormalities with this change (not newly introduced, the old code is as bad), see JDK-8273309. There are a few things that need to be improved:
Let's work on renaming and reusing the infrastructure provided by the remembered set ( |
Agree, I will first work on https://bugs.openjdk.java.net/browse/JDK-8273626 which is to refactor G1CardSetAllocator and related classes to support element size less pointer size. The pr is at #5478 |
Seems I have accidently closed the pr, reopen it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are quite a few comments from me about this change, and at some point I decided to provide a change that would cover these and more suggestions I had (note that this change might contradict some of the other suggestions - sometimes this happens when actually tinkering and seeing the whole code); communicating via text is too cumbersome. The commit containing the direction I think the change should go is available at b793f5b .
Mostly it is about hiding the code for preparing and iterating the sorted list of objects that failed evacuation.
Maybe G1EvacFailureObjsInHR
should be renamed to something like G1EvacFailedObjectsSet
(I do not think the HR
postfix is important).
As for the other question, for now I think it is good to keep the G1EvacFailureObjsInHR
in HeapRegion
.
Please have a look.
Fwiw, I did some cursory testing of this change, currently running tier1-5 in our internal CI without issues so far (~half done). Still there might be bugs :)
Thanks a lot Thomas, I like the refacoring :) Based on your comments and code, I made some adjustment. BTW, Not sure why, but seems the rename from g1EvacFailureObjsInHR.cpp to g1EvacFailureObjectsSet.cpp is not recongnized by git. |
@tschatzl Hi Thomas, is there any way I can reproduce this compilation error (drop_all reference) on my local? |
return cast_to_oop(_bottom + offset); | ||
} | ||
|
||
G1EvacFailureObjectsSet::OffsetInRegion G1EvacFailureObjectsSet::cast_to_offset(oop obj) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be to_offset
to match from_offset
(forgot that) - this is not a cast to me (changing the interpretation of a bit pattern) but a transformation (changing the bit pattern for storage savings).
So cast
seems to be the wrong word to me here.
G1EvacFailureObjectsSet::G1EvacFailureObjectsSet(uint region_idx, HeapWord* bottom) : | ||
DEBUG_ONLY(_region_idx(region_idx) COMMA) | ||
_bottom(bottom), | ||
_offsets("", &_alloc_options, &_free_buffer_list) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fwiw, I recently removed the first parameter of G1SegmentedArray
. Please make sure you merge with tip before pushing.
friend class G1SegmentedArray<OffsetInRegion, mtGC>; | ||
friend class G1SegmentedArrayBuffer<mtGC>; | ||
|
||
G1EvacFailureObjectsSet* _collector; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably rename _collector
to _objects_set
or so to match the type better.
|
||
// Helper class to join, sort and iterate over the previously collected segmented | ||
// array of objects that failed evacuation. | ||
class G1EvacFailureObjectsIterator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this is not an Iterator
in C++ STL sense, so it is a good idea to not name it like that. I understand that Hotspot code (and even the recent remembered set code - there is a CR to fix that) adds to the confusion, but it would be nice to not add to the confusion.
#ifdef ASSERT | ||
// Callback of G1SegmentedArrayBuffer::iterate_elems | ||
// Verify a single element in a segment node | ||
void visit_elem(void* elem) { | ||
uint* ptr = (uint*)elem; | ||
_collector->assert_is_valid_offset(*ptr); | ||
} | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I intentionally removed this visit_elem
method in the original suggestion because I thought that to be too much checking. It is kind of obvious that we only store data in here.
We check that these are valid offsets both when writing to and reading from the array, so this seems to be unnecessary triple-checking.
Fwiw, in Hotspot code we do not use the visit_
prefix but do_
(and pass something that ends with a Closure
, not a Visitor
in the iterate*
methods).
I am aware that in design pattern lingo this is the Visitor pattern, but Hotspot is probably much older than that and it is somewhat jarring to have some code use this terminology and others use another one.
A change here needs to be discussed with a wider audience.
|
||
void HeapRegion::iterate_evac_failure_objs(ObjectClosure* closure) { | ||
_evac_failure_objs.iterate(closure); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(This one is fine to be placed in the cpp file - the additional single call per HeapRegion
does not matter)
@@ -554,6 +557,11 @@ class HeapRegion : public CHeapObj<mtGC> { | |||
|
|||
// Update the region state after a failed evacuation. | |||
void handle_evacuation_failure(); | |||
// Records evac failure objs during evaucation, this will help speed up iteration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/evucation/evacuation.
Please try to avoid "evac failure objs" in text as it seems strange to me grammatically, and isn't particularly obvious what this is to readers not working on this all the time. Just say "Record an object that failed evacuation within this region.` I do not think the second part of the sentence is necessary, i.e. talking about speeding up something here.
// Records evac failure objs during evaucation, this will help speed up iteration | ||
// of these objs later in *remove self forward* phase of post evacuation. | ||
void record_evac_failure_obj(oop obj); | ||
// Iterates evac failure objs which are recorded during evcauation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/evauation/evacuation
Better would probably be "Applies the given closure to all previously recorded objects that failed evacuation in ascending address order"
G1EvacFailureObjectsSet::OffsetInRegion G1EvacFailureObjectsSet::cast_to_offset(oop obj) const { | ||
const HeapWord* o = cast_from_oop<const HeapWord*>(obj); | ||
size_t offset = pointer_delta(o, _bottom); | ||
assert_is_valid_offset(offset); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The from_offset
call below already does this assert.
friend class G1SegmentedArray<OffsetInRegion, mtGC>; | ||
friend class G1SegmentedArrayBuffer<mtGC>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to make visit_*
public here instead of the friends. This is an internal class not visible outside, and we actually use it as a closure.
Probably try to compile with Dug into that a bit deeper - can you try The |
I tried --disable-precompiled-headers before, sometimes it helps to locate this kind of issue, but not for this time. |
Kindly reminder ~ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is good now - great!
There is one minor comment for some refactoring that you may want to consider.
join_and_sort(); | ||
iterate_internal(closure); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably move the array allocation and freeing here instead of having this at the start and end of join_and_sort
and iterate_internal
respectively. Otherwise there is a hidden dependency on the first method allocating and the second freeing it, and looks cleaner as allocation and deallocation is obvious and on the same call level.
I.e.
void iterate(ObjectClosure* closure) {
uint num = _segments->num_allocated_nodes();
_offset_array = NEW_C_HEAP_ARRAY(OffsetInRegion, num, mtGC);
join_and_sort();
iterate_internal(closure);
FREE_C_HEAP_ARRAY(OffsetInRegion, _offset_array);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Thomas, good catch.
@Hamlin-Li This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 52 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still good.
Is someone else available to have a look on this change? Thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only some subjective and minor comments.
|
||
template <class Elem, MEMFLAGS flag> | ||
template <typename BufferClosure> | ||
void G1SegmentedArray<Elem, flag>::iterate_nodes(BufferClosure& cloure) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: cloure
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch.
void G1EvacFailureObjectsSet::iterate(ObjectClosure* closure) { | ||
assert_at_safepoint(); | ||
|
||
G1EvacFailureObjectsIterationHelper helper(this); | ||
helper.iterate(closure); | ||
|
||
_offsets.drop_all(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having some destructive operations (drop_all
) inside a method named iterate
could come as a surprise, IMO. If I understand this correctly, the following would be problematic.
evac_failed_objects.iterate(closure1);
...
evac_failed_objects.iterate(closure2);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
drop_all just returns buffers to free list, it will not destruct the buffers. So, iterate multiple times is OK, because next time it will get memory from free list or allocate a new buffer. Hope this answer your question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since drop_all()
resets all counters (e.g. _num_allocated_nodes
), the subsequent iteration will think the array is empty, won't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, we modify the code as below:
void HeapRegion::iterate_evac_failure_objs(ObjectClosure* closure) {
_evac_failure_objs.iterate(closure);
_evac_failure_objs.iterate(closure);
}
For the second time iteration, all thing will be empty, so iterate_nodes will be an empty operation, QuickSort::sort too, and iterate_internal too. These empty operations will not do harm things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Hamlin-Li : I think @albertnetymk concern is that typically an iterate
method does not modify the list itself. That is surprising for readers. The documentation also does not indicate any of that. I do not think he believes this will cause a VM failure.
Maybe change HeapRegion::iterate_evac_failure_objs
to call a (new) drop()
method on _evac_failure_objs
?
I think such a change would solve Albert's concerns.
An alternative could be renaming iterate
to something else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank Thomas for unpacking my concern more precisely. I used "problematic" to mean the second iteration will not do what developers expect it to do, not necessarily a VM crash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your clarification, Albert, Thomas, I see your point, it make sense to me.
As this patch has been blocking some other issues for a while, and I think it's better to think of some good solution for Albert's concern (seems add a drop is a little bit redundant for me :).)
If you don't mind, can I do this refinement later in another issue? Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a rather minor issue as I stated originally; I am fine either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I created JDK-8276721 to track it.
I think the windows build failure is not related to this change, will push it. |
Thanks @tschatzl @albertnetymk for your reviews. /integrate |
Going to push as commit ed7ecca.
Your commit was automatically rebased without conflicts. |
@Hamlin-Li Pushed as commit ed7ecca. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
This is a try to optimize evcuation failure for regions.
I record every evacuation failure object per region (by G1EvacuationFailureObjsInHR), and then iterate (which indeed includes compact/sort/iteration) these objects directly in RemoveSelfForwardPtrHRClosure.
I have tested it with following parameters,
It saves "Remove Self Forwards" time all the time ,and in most condition it saves "Evacuate Collection Set" time.
It brings some performance degradation when -XX:G1EvacuationFailureALotCount is low, such as 2. To improve this a little, we can record the number evacuation failure object per region, and not record these objects when the number hit some limit. But I'm not sure if it's necessary to do so, as I think such condition is so extreme to be met in real environment, although I'm not quite sure.
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/5181/head:pull/5181
$ git checkout pull/5181
Update a local copy of the PR:
$ git checkout pull/5181
$ git pull https://git.openjdk.java.net/jdk pull/5181/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 5181
View PR using the GUI difftool:
$ git pr show -t 5181
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/5181.diff