8320525: G1: G1UpdateRemSetTrackingBeforeRebuild::distribute_marked_bytes accesses partially unloaded klass #16766
assert(marked_bytes == 0 || obj_size_in_words * HeapWordSize == marked_bytes,
// regions; also, we should not access their header any more them as their
// klass may have been unloaded.
assert(marked_bytes == 0 || cast_to_oop(hr->bottom())->size() * HeapWordSize == marked_bytes,
Is it possible to separate these two cases into two methods, for instance? Taking a step back, why do we even need to call note_end_of_marking on these effectively empty regions?
We need to call HeapRegion::note_end_of_marking on them even if they are empty, for completeness. It is not strictly necessary because reclamation will probably reset them correctly, but to me it is easier to reason about if empty and non-empty regions are handled the same way.
I.e. so all regions have note_start/end_of_marking called.
I do not think separating these two cases (empty and non-empty regions) makes this easier to understand. Distributing bytes for every region (distributing 0 bytes is fine) and consistently calling the start/end notifications on all regions (note_end works as expected on empty regions too) is easier for me to follow than an unnecessary exception.
Because then the question is: why have that exception?
Even if they do not do anything "meaningful" other than resetting some internal state that is later overwritten for these specially handled empty regions anyway.
so all regions have note_start/end_of_marking called.
Currently, the pair is invoked on all regions and the filtering (skipping young-region for example) is done inside these methods. However, the logic why a particular kind of region can (should) be skipped really belongs to the caller. The region itself doesn't know how to react to marking-start/end. (This is kind of tied to the ticket of moving marking-related fields outside region.)
why have that exception?
Because live-region and effective-region are different, and mixing them causes confusion. I think the existence of the new comment "we should not access their header any more them..." demonstrates that it's not super obvious why the current code (before this PR) is problematic.
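To make the caller-versus-callee point concrete, here is a minimal sketch of the two placements of the filtering logic. All names in it (`MockRegion`, `RegionKind`, `finish_marking`, `note_end_of_marking_filtered`) are hypothetical stand-ins invented for illustration; this is not the HotSpot code, only an approximation of the shape being discussed.

```cpp
#include <cassert>
#include <vector>

// Hypothetical region model; names are illustrative, not HotSpot's.
enum class RegionKind { Young, Old, Humongous };

struct MockRegion {
  RegionKind kind;
  bool end_of_marking_noted = false;

  // Callee-side filtering: the region itself decides whether to react.
  void note_end_of_marking_filtered() {
    if (kind == RegionKind::Young) return;  // policy hidden inside the region
    end_of_marking_noted = true;
  }

  // Unconditional variant: the region only records the event.
  void note_end_of_marking() { end_of_marking_noted = true; }
};

// Caller-side filtering: the marking code owns the skip policy.
inline void finish_marking(std::vector<MockRegion>& regions) {
  for (MockRegion& r : regions) {
    if (r.kind == RegionKind::Young) continue;  // policy visible at the call site
    r.note_end_of_marking();
  }
}
```

Both variants leave the regions in the same state; the difference is only where a reader has to look to learn which regions are skipped and why.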
so all regions have note_start/end_of_marking called.
Currently, the pair is invoked on all regions and the filtering (skipping young-region for example) is done inside these methods. However, the logic why a particular kind of region can (should) be skipped really belongs to the caller. The region itself doesn't know how to react to marking-start/end. (This is kind of tied to the ticket of moving marking-related fields outside region.)
We already agreed that these note* methods are basically part of the caller, placed in the wrong location because the members it accesses are in the wrong location. I do not think messing with this here in this CR half-heartedly is a good idea. As soon as the work to move TAMS and PB starts, this is going to change anyway and is imho a more appropriate time to reconsider this (and will probably naturally fix itself).
The problematic code is assertion code, which quite often accesses internals one normally would not. The regular code is independent of whether the region's klass is live or not after all.
Deleting this assert would fix the issue at hand as well. Another option would be to just not do class unloading this early; there is no particular reason to do it right after marking completed.
why have that exception?
Because live-region and effective-region are different, and mixing them causes confusion.
What is an "effective-region" in this context? I do not understand this sentence.
I think the existence of the new comment "we should not access their header any more them..." demonstrates that it's not super obvious why the current code (before this PR) is problematic.
To me this change indicates that the sanity check code (this problematic statement is part of sanity check code - regular code does not use it) is doing things it should not. The original author (me I guess) correctly considered that we already unloaded classes super-early for some reason, wanted to have some extra check there, but then botched the refactoring (factoring out the obj_size_in_words calculation for the two uses).
That particular new comment is only to make it abundantly clear to not factor out any kind of obj_size calculation any more (I removed the second use in this change).
Maybe not adding the comment would have been better, as the marked_bytes == 0 predicate already indicates just that (and the second use of obj_size_in_words is gone).
Looking at the code again, another source of confusion may be the comment placement. I will improve that.
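The refactoring hazard described above can be sketched in isolation. The following is a simplified model, not the HotSpot code: `MockKlass`, `MockObject`, `kHeapWordSize`, `check_hoisted`, and `check_guarded` are hypothetical names, and a null klass pointer stands in for a klass that has been unloaded. It only illustrates why hoisting the size computation out of the assert condition defeats the `marked_bytes == 0` guard.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-ins for HotSpot types; not the real API.
constexpr std::size_t kHeapWordSize = 8;  // assumed word size for the sketch

struct MockKlass { std::size_t size_in_words; };

struct MockObject {
  const MockKlass* klass;  // nullptr models an already unloaded klass

  // Computing the size needs the klass, i.e. it reads the object header.
  std::size_t size() const {
    if (klass == nullptr) throw std::logic_error("header access after klass unload");
    return klass->size_in_words;
  }
};

// Hoisted shape: the size is computed before the check, so the header is
// read even when marked_bytes == 0.
bool check_hoisted(const MockObject& obj, std::size_t marked_bytes) {
  std::size_t obj_size_in_words = obj.size();
  return marked_bytes == 0 || obj_size_in_words * kHeapWordSize == marked_bytes;
}

// Guarded shape: || short-circuits, so size() only runs for regions that
// still have marked bytes.
bool check_guarded(const MockObject& obj, std::size_t marked_bytes) {
  return marked_bytes == 0 || obj.size() * kHeapWordSize == marked_bytes;
}
```

In this model, `check_guarded` never touches the klass of an object with zero marked bytes, while `check_hoisted` throws for the same input.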
albertnetymk
left a comment
this is going to change anyway and is imho a more appropriate time to reconsider this (and will probably naturally fix itself).
OK.
What is an "effective-region" in this context?
I meant effectively-empty region.
Thanks @albertnetymk @walulyai for your reviews
Going to push as commit 21d361e.
Your commit was automatically rebased without conflicts.
Hi all,
please review this fix that removes the access to a partially unloaded (i.e. unlinked only) Klass used for debug code in G1UpdateRemSetTrackingBeforeRebuild::distribute_marked_bytes.
This starts to fail if metadata purging happens before the call to this method (as https://bugs.openjdk.org/browse/JDK-8317809 suggests). The test gc/g1/humongousObjects/TestHumongousClassLoader.java starts to crash on linux-x86 with 100% reproduction because it more aggressively uncommits memory when purging metaspace.
The fix changes the asserts to access the klass only when it should not have been unloaded yet.
Testing: the failing test case no longer fails, gha
Thanks,
Thomas
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16766/head:pull/16766
$ git checkout pull/16766
Update a local copy of the PR:
$ git checkout pull/16766
$ git pull https://git.openjdk.org/jdk.git pull/16766/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 16766
View PR using the GUI difftool:
$ git pr show -t 16766
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16766.diff