-
Notifications
You must be signed in to change notification settings - Fork 6.2k
8358735: GenShen: block_start() may be incorrect after class unloading #27353
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
be removed after debugging of a rare crash observed in pipeline testing.
…ock_start to aid better serviceability. Might be placed under #ifdef ASSERT to avoid perff impact in release builds.
assert under adverse conditions that the planned fix is expected to correct.
be fixed before this is ready. In particular, fails reliably with TestClone w/genshen (at least).
|
/author: ysramakrishna |
|
👋 Welcome back kdnilsen! A progress list of the required criteria for merging this PR into |
|
@kdnilsen This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 151 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
This reverts commit 80198ab.
|
/contributor add @ysramakrishna |
|
@kdnilsen |
ysramakrishna
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like a few older comments had not been published previously. I am flushing those and will take a fresh look at the review.
| HeapWord* get_next_marked_addr(const HeapWord* addr, | ||
| const HeapWord* limit) const; | ||
|
|
||
| // Return the last marked address in the range [limit, addr], or addr+1 if none found. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Symmetry would have preferred (limit, addr] as the range with limit if none found.
However, may be usage of this method prefers the present shape?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. The reason for the asymmetry is that forward-looking limit may not be a legitimate address (may be end of heap), whereas backward looking limit is a legitimate address.
| template<bm_word_t flip, bool aligned_left> | ||
| inline idx_t get_prev_bit_impl(idx_t l_index, idx_t r_index) const; | ||
|
|
||
| inline idx_t get_next_one_offset(idx_t l_index, idx_t r_index) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please document analogous to line 131.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry. I overlooked this request in prior response. Done.
| inline idx_t get_next_one_offset(idx_t l_index, idx_t r_index) const; | ||
|
|
||
| void clear_large_range (idx_t beg, idx_t end); | ||
| // Search for last one in the range [l_index, r_index). Return r_index if not found. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Symmetry arguments wrt spec for get_next_one_offset may have preferred range (l_index, r_index], returning l_index if none found. May be its (transitive) usage prefers this shape? (See similar comment at line 180.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment above regarding asymmetry. It is by design, due to shape of the data.
| // Search for last one in the range [l_index, r_index). Return r_index if not found. | ||
| inline idx_t get_prev_one_offset(idx_t l_index, idx_t r_index) const; | ||
|
|
||
| void clear_large_range(idx_t beg, idx_t end); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
documentation comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
l_index <-> beg
r_index <-> end
in either comment or formal args to make them mutually consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a comment here as well.
| // common. | ||
| HeapWord* p = _scc->block_start(dirty_l); | ||
| assert(ctx != nullptr || heap->old_generation()->is_parsable(), "Error"); | ||
| HeapWord* p = _scc->first_object_start(dirty_l, ctx, tams, dirty_r); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing ctx, tams, and dirty_r into this method seems interesting. Let's see how they are used.
| // If not null, ctx holds the complete marking context of the old generation. If null, | ||
| // we expect that the marking context isn't available and the crossing maps are valid. | ||
| // Note that crossing maps may be invalid following class unloading and before dead | ||
| // or unloaded objects have been coalesced and filled (updating the crossing maps). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good comment!
What's still not clear is why tams and last_relevant_card_index are passed here. Does it reduce the work in the caller? I'd expect this to just return the first object on the card index or null if no such object exists. I realize ctx is used when one must consult the marking context in preference to the "crossing maps". The relevance of the last 2 arguments isn't clear from this documentation comment.
May be I'll see why these are passed in when I look at the method definition, but I suspect there may be some leakage of abstraction & functionality here between caller and callee.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for identifying this "confusion". I'm making an attempt to improve documentation for this comment.
|
|
||
| // if marking context is valid and we are below tams, we use the marking bit map to find the first marked object that | ||
| // intersects with this card, and if no such object exists, we return null | ||
| if ((ctx != nullptr) && (left < tams)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like the caller should check if left >= tams and short-circuit rather than have this method do that work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That comment is wrong, which is what caused you to request the alternative semantics for this function. Your comments and questions motivated me to rewrite the comments describing the behavior of this function. Rewriting the comments helped me realize the API was a bit ill-defined. I made some improvements to the behavior so that the definition could be more clearly defined. The new implementation now passes all tests again.
|
/integrate |
|
@kdnilsen This pull request has not yet been marked as ready for integration. |
|
/integrate |
|
Going to push as commit 8531fa1.
Your commit was automatically rebased without conflicts. |
ysramakrishna
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still going through this, but it seems as if there's a bunch of potential clean-ups to do here. I realize this has already been integrated; I can may be create a separate task to perhaps clean up some of the questions raised in this.
Since there are quite a few comments now, I am going to flush these for now as a record of some of my thoughts and create a separate task in which I can see if these concerns are real and if the code can be somewhat simplified in a few places.
Nothing specific to do here at this time in response to these stream-of-consciousness comments.
| // Search for last one in the range [l_index, r_index). Return r_index if not found. | ||
| inline idx_t get_prev_one_offset(idx_t l_index, idx_t r_index) const; | ||
|
|
||
| void clear_large_range(idx_t beg, idx_t end); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
l_index <-> beg
r_index <-> end
in either comment or formal args to make them mutually consistent.
| if (end_addr <= left) { | ||
| // The range of addresses to be scanned is empty | ||
| continue; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When would this happen? We start off with dirty_l to the left of dirty_r, and with dirty_r having started at a card that would correspond to end_addr. I am not convinced this check is needed. I'd rather assert here that: assert(left <= end_addr, "left should remain left of end_addr established above");
| // on a very large object, i.e. one spanning multiple cards, | ||
| // if its head card is dirty. If not, (i.e. its head card is clean) | ||
| // we'll call it each time we process a new dirty range on the object. | ||
| // This is always the case for large object arrays, which are typically more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of This is always the case ... may be we can say The latter is aways the case ...?
(Mea culpa for the old comment.)
| // common. | ||
| HeapWord* p = _scc->block_start(dirty_l); | ||
| assert(ctx != nullptr || heap->old_generation()->is_parsable(), "Error"); | ||
| HeapWord* p = _scc->first_object_start(dirty_l, ctx, tams, right); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: Wondering if we need to pass both tams and right, or just the max of the two. Will look at first_object_start().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, looks like at this point we might potentially have ctx == nullptr and tams == nullptr. I wonder if we can do better here in terms of passing a sensible single right and dispense with passing tams entirely? Let me go back and look at the implementation of the method again.
| // over clusters processed by different workers, with each worked responsible | ||
| // for scanning the portion of the obj-array overlapping the dirty cards in | ||
| // its cluster. | ||
| // 3. Non-array objects are precisely dirtied by the interpreter and the compilers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should say "imprecisely" at line 296, I think?
| assert(!region->is_humongous(), "Use region->humongous_start_region() instead"); | ||
| #endif | ||
|
|
||
| HeapWord* right = MIN2(region->top(), end_range_of_interest); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a safe thing to do, but doesn't the caller already establish the invariant that
region->top() >= end_range_of_interest ? Can we just assert that instead of doing the clip/clamp? (And rename the formal parameter name from end_range_of_interest to right?)
If so, we might also want to change the name of the formal parameter from card_index to left, and change it to be a (card-aligned) heap address for symmetry in the API.
| HeapWord* end_of_search_next = MIN2(right, tams); | ||
| size_t last_relevant_card_index; | ||
| if (end_range_of_interest == _end_of_heap) { | ||
| last_relevant_card_index = _rs->card_index_for_addr(end_range_of_interest - 1); | ||
| } else { | ||
| last_relevant_card_index = _rs->card_index_for_addr(end_range_of_interest); | ||
| if (_rs->addr_for_card_index(last_relevant_card_index) == end_range_of_interest) { | ||
| last_relevant_card_index--; | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure this is necessary. I'd just adjust the caller so this is ensured, avoiding this computation here. I think the caller has the last dirty card address and can just use that? I realize there's a bit of an issue with the address for tams not necessarily being card-aligned, but I think we should be able to deal with that in the caller as well once we remember that all of this always happens within a single region. (We can add such an assertion so that future adjustments do not render this assumption invalid if the code is changed/adjusted later.)
| #ifdef ASSERT | ||
| ShenandoahHeap* heap = ShenandoahHeap::heap(); | ||
| ShenandoahHeapRegion* r = heap->heap_region_containing(addr); | ||
| ShenandoahMarkingContext* ctx = heap->marking_context(); | ||
| HeapWord* tams = ctx->top_at_mark_start(r); | ||
| assert(limit != nullptr, "limit must not be null"); | ||
| assert(limit >= r->bottom(), "limit must be more than bottom"); | ||
| assert(addr <= tams, "addr must be less than TAMS"); | ||
| #endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it make more sense for these checks to move to the caller? It appears to me to be a leakage of abstraction to test these conditions here. We should be able to return the address for the marked bit found without interpreting the semantics of the bits themselves?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I notice that this is the case for the get_next_... version below as well; if my comment makes some sense, this can be addressed separately.
Perhaps frugality in testing the conditions required us to site these assertions here, which I kind of understand, although the right thing in that case is to have the wrapper class, viz. ShenandoahMarkingContext make those checks before calling here.
| #endif | ||
|
|
||
| HeapWord* right = MIN2(region->top(), end_range_of_interest); | ||
| HeapWord* end_of_search_next = MIN2(right, tams); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the caller ensure that tams is always valid (e.g. when ctx == nullptr)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The caller seems to allow for tams==nullptr and ctx==nullptr. In that case wouldn't we get end_of_search_next==nullptr?
| // Expects to be called for a card affiliated with the old generation in | ||
| // generational mode. | ||
| HeapWord* block_start(size_t card_index) const; | ||
| // Given a card_index, return the starting address of the first live object in the heap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The interface/API comment should describe a Dijkstra-like pre- and post-condition. i.e. if these conditions are satisfied, we'll give you this result.
A description of what the method does (i.e. how it implements the functionality) belongs in the method implementation.
Here, the two are conflated making the interface description unnecessarily long and convoluted. Sometimes this might indicate that the interface isn't as frugal as it should be.
I might state this more succinctly as follows:
// Given:
// `card_index`: a valid index of a card in the old generation
// `ctx` : a valid marking context for the old generation
// `tams` : a valid top-at-mark-start address for the old generation
// region in which the card_index is located
// `end_range_of_interest` : an address in that region beyond which we need
// not locate an object
//
// Returns:
// the address of the object, if any, at the least address that overlaps with
// the address range between the start of card_index and end_range_of_interest,
// or nullptr if no such object exists in the given range.
Once you look at the spec in this manner, you realize that the first and last arguments go together and define a suitable address range, and the second and third arguments go together and provide a context. This allows you to divide the assertion checking and call interface most optimally between caller and callee.
When scanning a range of dirty cards within the GenShen remembered set, we need to find the object that spans the beginning of the left-most dirty card. The existing code is not reliable following class unloading.
The new code uses the marking context when it is available to determine the location of live objects that reside below TAMS within each region. Above TAMS, all objects are presumed live and parsable.
Progress
Issue
Reviewers
Contributors
<ysr@openjdk.org>Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27353/head:pull/27353$ git checkout pull/27353Update a local copy of the PR:
$ git checkout pull/27353$ git pull https://git.openjdk.org/jdk.git pull/27353/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 27353View PR using the GUI difftool:
$ git pr show -t 27353Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27353.diff
Using Webrev
Link to Webrev Comment