New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8271880: Tighten condition for excluding regions from collecting cards with cross-references #5037
8271880: Tighten condition for excluding regions from collecting cards with cross-references #5037
Conversation
👋 Welcome back tschatzl! A progress list of the required criteria for merging this PR into |
Webrevs
|
G1SkipCardEnqueueSetter(G1ScanEvacuatedObjClosure* closure, bool skip_enqueue_cards) : _closure(closure) { | ||
assert(_closure->_skip_card_enqueue == G1ScanEvacuatedObjClosure::Uninitialized, "Must not be set"); | ||
_closure->_skip_card_enqueue = skip_enqueue_cards ? G1ScanEvacuatedObjClosure::True : G1ScanEvacuatedObjClosure::False; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no reason for class G1SkipCardEnqueueSetter
, field _skip_card_enqueue
and method arg skip_enqueue_cards
to not follow the same pattern; both alternatives are fine, "skip enqueue cards" or "skip card enqueue".
@@ -112,17 +112,17 @@ class G1ScanEvacuatedObjClosure : public G1ScanClosureBase { | |||
}; | |||
|
|||
// RAII object to properly set the _scanning_in_young field in G1ScanEvacuatedObjClosure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Obsolete comments; can be dropped completely, IMO.
|
||
enum ScanningInYoungValues { | ||
False = 0, | ||
True, | ||
Uninitialized | ||
}; | ||
|
||
ScanningInYoungValues _scanning_in_young; | ||
ScanningInYoungValues _skip_card_enqueue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The enum name can be tristate
, right?
assert(!dest_attr.is_in_cset(), "must not scan object from cset here"); | ||
G1SkipCardEnqueueSetter x(&_scanner, dest_attr.is_new_survivor() || dest_attr.is_in_cset()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We just assert !dest_attr.is_in_cset()
, why still || dest_attr.is_in_cset()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert(!dest_attr.is_in_cset(), "must not scan object from cset here");
is because this should not happen (yet) as we do not split large objarrays that failed evacuation at this time. I'll remove that one, because something like "not implemented" or "should not happen" isn't great either.
dest_attr.is_new_survivor() || dest_attr.is_in_cset()
is incorrect, just dest_attr.is_new_survivor()
is correct (then again, it could not have happened yet).
// FIXME: check if we could just use dest_attr | ||
dest_attr = _g1h->region_attr(to_array); | ||
assert(!_g1h->region_attr(to_array).is_in_cset(), "must not scan object from cset here"); | ||
G1SkipCardEnqueueSetter x(&_scanner, dest_attr.is_new_survivor() || dest_attr.is_in_cset()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this still WIP?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, but has the same bug as the other similar place.
Fyi, there is a bug (or at least a crash in another tier1-5 run) in the current code that I'm currently tracking down, so please hold off further reviews for now. Thanks, |
Here is a description of the problem: G1 crashes because of a missing remembered set entry in the young generation for j.l.ref.References' discovered field. The reason is that
Previously there has been no issue because the fixup self-forwards phase re-enqueued these cards into the "correct" DCQS. |
@tschatzl This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 35 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
... and using the normal write barrier is good at that time for objects evacuated somewhere else: either this had been into survivor, where we do not care about card marks (and the "young" card filter works as expected), or they are clean (previously unallocated areas in the old gen region). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The patch looks fine. Some confusion around assertions and comments.
Re
// ...
assert(!dest_attr.is_young() || _g1h->heap_region_containing(to_array)->is_survivor(), "must be");
G1SkipCardEnqueueSetter x(&_scanner, dest_attr.is_young());
Even after reading the preceding comments, I can't follow the assertion logic, and appreciate why it's related to the constructor. I wonder if the following revised assertion/comment is correct and clearer.
// Skip the card enqueue iff the objective (to_array) is in survivor region. However, HeapRegion::is_survivor() is too expensive here.
// Instead, we use dest_attr.is_young() because the two values are always equal: successfully allocated young regions must be survivor regions.
assert(dest_attr.is_young() == _g1h->heap_region_containing(to_array)->is_survivor(), "must be");
G1SkipCardEnqueueSetter x(&_scanner, dest_attr.is_young());
@tschatzl this pull request can not be integrated into git checkout submit/evac-failure-no-scan-during-remove-self-forwards
git fetch https://git.openjdk.java.net/jdk master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push |
The recent change fixes the remaining issue mentioned earlier by introducing a GC specific closure that makes the This fixes the issue (note that the change introduced in JDK-8270842 introduced the same issue, just much less common I believe). Passes tier1-5, a closed test that failed with around 3% failures with these changes now passes always. From a performance POV I tested and analyzed reference processing performance on G1 so far with no particular regressions at both our test suite as well as some detailed look at discovery (i.e. the time spent per discovery/enqueue seems to be the same as before). Currently looking into Parallel GC differences. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for not getting to this one until now, but took a first quick look at this.
Apart from the small comment below, I think we might want to split this into two separate PRs. One to add the new reference processing abstraction with the general use and then do the specific G1 changes as another PR. What do you think about that?
// References to the current collection set are references to objects that failed | ||
// evacuation. Currently these regions are always relabelled as old without | ||
// remembered sets, so skip them. | ||
if (!dest_attr.is_in_cset()) { | ||
enqueue_card_if_tracked(dest_attr, p, obj); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand this correctly, the only time dest_attr.is_in_cset() == true
is when obj
is in a region that failed evacuation, right? To make this even more obvious could we add an else-statement with an assert that this is the case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is unfortunately not possible due to races (I tried :)) - one thread may set the flag, another one in the meanwhile already fail again an object in the region, and the evacuation-failed flag not yet visible to this thread.
I will first revert the faulty JDK-8270842, then add the API, then add the G1 specific code if you prefer. |
Sounds good to me. |
c708d5c
to
cefe7fd
Compare
Please hold off reviewing, it looks I messed something up in the merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
Still I think maybe we should just go with just calling it Survivor
instead of NewSurvivor
. I see how it could lead to some confusion, but the comments could be expanded a bit to explain that these are referring to the newly allocated survivors.
But I'm good with the change as is, so if you prefer this, let's go with this.
Co-authored-by: Stefan Johansson <54407259+kstefanj@users.noreply.github.com>
Not sure about |
@albertnetymk made me aware that we can check whether the |
Thanks @albertnetymk @kstefanj for your reviews |
Going to push as commit 5a12af7.
Your commit was automatically rebased without conflicts. |
Hi,
can I have reviews for this change that by tightening the condition for excluding regions from collecting cards with cross-references allows us to avoid the rescanning of objects that failed evacuation in the fix up self-forwards phase after evacuation failure.
I.e. during gc g1 never collects cards/references from the young gen (including eden) for later refinement - which means that we need to rescan all live objects remaining in eden regions for cross-references.
The problem or the reason why we did that was that we did not want to add cards to refine from survivor regions (i.e. next gc's young gen) because we just don't need to as we always collect young gen, so references from there need not be recorded in the remembered sets (and actually, if we did, we errorneouosly marked cards in young gen which later card table verification will not like) - but we did not have that information on hand anywhere already quickly accessible.
This change solves that problem by actually recording this information in the region attribute table as "NewSurvivor" type region. "NewSurvivor" because I did want to make explicit that these are the survivor regions from the new (next) young generation (i.e. just survivor) and not the survivor regions of the previous gc (that were turned eden at the start of this gc) but something like "NewYoung" or so would be fine with me as well (or certainly just "Survivor", but that might be confusing).
Another interesting addition is probably the new assert in
G1ParThreadScanState::enqueue_card_if_tracked
This, at this time, verifies the assumption that g1 is not trying to collect references to the collection set, i.e. other objects that failed evacuation - after all we later relabel their regions as old without a remembered set; we would do otherwise unnecessarily because the reason is that (currently) cset tracking for these regions is enabled (at least during gc - we only later relabel and drop the remembered sets).
This might change later if we want to move evacuation failed regions into survivor (or keep their remembered sets for some reason), but for now we filter attempts to add cards in the dcqs for those this way.
Testing: tier1-5, gc/g1 with
JAVA_OPTIONS_=-XX+G1EvacuationFailureALot -XX:+VerifyAfterGC
.Thanks,
Thomas
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/5037/head:pull/5037
$ git checkout pull/5037
Update a local copy of the PR:
$ git checkout pull/5037
$ git pull https://git.openjdk.java.net/jdk pull/5037/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 5037
View PR using the GUI difftool:
$ git pr show -t 5037
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/5037.diff