8267073: Race between Card Redirtying and Freeing Collection Set regions results in missing remembered set entries with G1 #4429
An early comment: I haven't thought through all the implications, or whether it would be bad for future work or performance. But wouldn't it be quite convenient to bake the evacuation-failed status into the attribute table? Similar to have the
There are a few concerns, in addition to those you mentioned, that make that proposal less than clear-cut:
Hth,
Tier1-5 testing completed with no particular issues, so I opened it as ready for review.
Thanks Thomas for the reasoning, I think this all makes sense. I agree that the attribute table is not optimal, but at first glance it looked like a good candidate for this information. I think working towards grouping young-gen things together is a good idea, and once that is done we might find an even better abstraction for this.
Generally looks good. One suggestion about the representation of the evacuation-failed info.
@@ -867,6 +867,8 @@ class G1CollectedHeap : public CollectedHeap {

  // Number of regions evacuation failed in the current collection.
  volatile uint _num_regions_failed_evacuation;
  // Records for every region on the heap whether evacuation failed for it.
  volatile bool* _regions_failed_evacuation;
Consider using BitMap and par_xxx operations instead of an array of bool and explicit atomic operations.
I would also be fine with the change as is, if that gets this fix for a fairly noisy bug pushed sooner. So also consider a followup RFE to switch to using BitMap.
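To make the BitMap suggestion concrete, here is a minimal standalone sketch of a bitmap with a parallel set operation backing a per-region "evacuation failed" table. It is an illustration only: `ParBitMapSketch` and `EvacFailedRegionsSketch` are hypothetical names, the code uses `std::atomic` rather than HotSpot's own `BitMap` and `Atomic` facilities, and the relaxed memory ordering is an assumption that the real code may not share.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a bitmap with parallel operations.
// This is NOT HotSpot's BitMap class.
class ParBitMapSketch {
  static constexpr std::size_t kBitsPerWord = 64;
  std::size_t _size_in_bits;
  std::vector<std::atomic<std::uint64_t>> _words;  // value-initialized to zero

public:
  explicit ParBitMapSketch(std::size_t size_in_bits)
    : _size_in_bits(size_in_bits),
      _words((size_in_bits + kBitsPerWord - 1) / kBitsPerWord) {}

  // Atomically sets the bit. Returns true only for the caller that
  // actually changed it from 0 to 1.
  bool par_set_bit(std::size_t bit) {
    assert(bit < _size_in_bits);
    const std::uint64_t mask = std::uint64_t{1} << (bit % kBitsPerWord);
    const std::uint64_t old =
        _words[bit / kBitsPerWord].fetch_or(mask, std::memory_order_relaxed);
    return (old & mask) == 0;
  }

  bool at(std::size_t bit) const {
    assert(bit < _size_in_bits);
    const std::uint64_t mask = std::uint64_t{1} << (bit % kBitsPerWord);
    return (_words[bit / kBitsPerWord].load(std::memory_order_relaxed) & mask) != 0;
  }

  void clear() {
    for (auto& w : _words) {
      w.store(0, std::memory_order_relaxed);
    }
  }
};

// Hypothetical side table: one bit per region plus a failure counter.
class EvacFailedRegionsSketch {
  ParBitMapSketch _failed;
  std::atomic<unsigned> _num_failed{0};

public:
  explicit EvacFailedRegionsSketch(std::size_t max_regions) : _failed(max_regions) {}

  // Safe to call from several workers; only the first call for a given
  // region increments the counter.
  bool record_failure(std::size_t region_idx) {
    const bool first = _failed.par_set_bit(region_idx);
    if (first) {
      _num_failed.fetch_add(1, std::memory_order_relaxed);
    }
    return first;
  }

  bool failed(std::size_t region_idx) const { return _failed.at(region_idx); }
  unsigned num_failed() const { return _num_failed.load(std::memory_order_relaxed); }

  // Intended to run single-threaded, e.g. before the next collection starts.
  void reset() {
    _failed.clear();
    _num_failed.store(0, std::memory_order_relaxed);
  }
};
```

Compared with an array of bool, the set operation's return value gives the "first to record" information for free, so the region counter needs no separate compare-and-swap.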
Changes look good. I like Kim's suggestion to use BitMap, so I would not object to that, but I'm fine with the change as is.
@tschatzl This change now passes all automated pre-integration checks. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 42 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.
Looks good, with a possible change to _regions_failed_evacuation either now or as a future enhancement.
I'll change to bitmaps later; the earlier we get the noise out of CI, the better imho. Thanks @kstefanj and @kimbarrett for your reviews.
/integrate
@tschatzl Since your change was applied there have been 44 commits pushed to the
Your commit was automatically rebased without conflicts. Pushed as commit 2b41459.
Hi all,
can I have reviews for this change that fixes a race between card redirtying and collection set freeing?
Card redirtying tries to be clever about which cards to redirty: it does not keep (redirty) cards in regions that have been fully evacuated, as those cannot contain any references. (We collected these cards precisely for the case of evacuation failure of an old region X: if region X fails evacuation, it is kept and may still contain references to other old regions Y; if it is successfully evacuated, we obviously kept these cards for nothing.)
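A minimal sketch of such a filter, and of one way it could misfire if another thread modifies its inputs while it runs. All names here (`RegionStateSketch`, `will_become_free`, `replay_racy_check`) are hypothetical, and the interleaving shown is an assumption about how a wrong result can arise; it is replayed deterministically on a single thread and is not taken from the G1 sources.

```cpp
#include <atomic>

// Hypothetical per-region state consulted by the redirtying filter.
struct RegionStateSketch {
  std::atomic<bool> in_cset{false};
  std::atomic<bool> evacuation_failed{false};
};

// Redirtying side: a card may be dropped only if its region is going to be
// freed, i.e. it is in the collection set and evacuation did not fail.
inline bool will_become_free(const RegionStateSketch& r) {
  return r.in_cset.load() && !r.evacuation_failed.load();
}

// Replays one problematic interleaving step by step:
//   1. the redirtying thread reads in_cset,
//   2. the freeing thread clears evacuation_failed,
//   3. the redirtying thread reads evacuation_failed.
// For a region that failed evacuation this yields "will become free",
// so its cards would be dropped although the region is kept.
inline bool replay_racy_check(RegionStateSketch& r) {
  const bool in_cset = r.in_cset.load();           // step 1
  r.evacuation_failed.store(false);                // step 2 (other thread)
  const bool failed = r.evacuation_failed.load();  // step 3
  return in_cset && !failed;
}
```

Without the interfering store, `will_become_free` returns false for a region that is in the collection set and failed evacuation; with it, the replayed check returns true for the same region.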
At the same time, since JDK-8214237, G1 frees the collection set, which modifies both the is-in-cset field in the attribute table used by `is_in_cset()` and the region's `_evacuation_failed` field. The check can therefore return the wrong value, dropping cards from the dirty card queue (and, by extension, from the remembered sets).
The suggested fix consists of two parts:
`_evacuation_failed` field in the `HeapRegion`
The first item is fairly trivial: there is actually no need to clear that field at that point. Almost right after this phase the entire attribute table is reset again (in `start_new_collection_set()`).
The second is a bit trickier: if we did not clear the field then, we would need to do it at a later point. While this could be done by adding the clearing somewhere in `pre_evacuate_collection_set_phase()`, the easier (and imo better) solution is to move the `_evacuation_failed` field, which is only necessary during young gc, out of `HeapRegion` and collect it in a side table.
There are two reasons for this: first, I want to split out all transient, young-gc related data and code from `G1CollectedHeap` into a separate class like `G1FullCollector` in the future; second, we need that information available in such a side table for future improved pinned region support (JDK-8254167).
After this change, a test with which I could reproduce this failure fairly consistently with `-XX:+VerifyAfterGC` (rate ~1/100) did not fail in thousands of iterations.
Testing: tier1-5
Thanks,
Thomas
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/4429/head:pull/4429
$ git checkout pull/4429
Update a local copy of the PR:
$ git checkout pull/4429
$ git pull https://git.openjdk.java.net/jdk pull/4429/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 4429
View PR using the GUI difftool:
$ git pr show -t 4429
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/4429.diff