Skip to content

Conversation

@kdnilsen
Copy link
Contributor

@kdnilsen kdnilsen commented Sep 17, 2025

When scanning a range of dirty cards within the GenShen remembered set, we need to find the object that spans the beginning of the left-most dirty card. The existing code is not reliable following class unloading.

The new code uses the marking context when it is available to determine the location of live objects that reside below TAMS within each region. Above TAMS, all objects are presumed live and parsable.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8358735: GenShen: block_start() may be incorrect after class unloading (Bug - P3)

Reviewers

Contributors

  • Y. Srinivas Ramakrishna <ysr@openjdk.org>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27353/head:pull/27353
$ git checkout pull/27353

Update a local copy of the PR:
$ git checkout pull/27353
$ git pull https://git.openjdk.org/jdk.git pull/27353/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 27353

View PR using the GUI difftool:
$ git pr show -t 27353

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27353.diff

Using Webrev

Link to Webrev Comment

ysramakrishna and others added 20 commits June 6, 2025 01:48
be removed after debugging of a rare crash observed in pipeline testing.
…ock_start to aid better serviceability.

Might be placed under #ifdef ASSERT to avoid perff impact in release
builds.
assert under adverse conditions that the planned fix is expected to
correct.
be fixed before this is ready. In particular, fails reliably with
TestClone w/genshen (at least).
@kdnilsen kdnilsen marked this pull request as draft September 17, 2025 20:12
@kdnilsen
Copy link
Contributor Author

/author: ysramakrishna

@bridgekeeper
Copy link

bridgekeeper bot commented Sep 17, 2025

👋 Welcome back kdnilsen! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Sep 17, 2025

@kdnilsen This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8358735: GenShen: block_start() may be incorrect after class unloading

Co-authored-by: Y. Srinivas Ramakrishna <ysr@openjdk.org>
Reviewed-by: wkemper

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 151 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk
Copy link

openjdk bot commented Sep 17, 2025

@kdnilsen The following labels will be automatically applied to this pull request:

  • hotspot-gc
  • shenandoah

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added hotspot-gc hotspot-gc-dev@openjdk.org shenandoah shenandoah-dev@openjdk.org labels Sep 17, 2025
@kdnilsen
Copy link
Contributor Author

kdnilsen commented Nov 6, 2025

/contributor add @ysramakrishna

@openjdk
Copy link

openjdk bot commented Nov 6, 2025

@kdnilsen
Contributor Y. Srinivas Ramakrishna <ysr@openjdk.org> successfully added.

Copy link
Member

@ysramakrishna ysramakrishna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like a few older comments had not been published previously. I am flushing those and will take a fresh look at the review.

HeapWord* get_next_marked_addr(const HeapWord* addr,
const HeapWord* limit) const;

// Return the last marked address in the range [limit, addr], or addr+1 if none found.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Symmetry would have preferred (limit, addr] as the range with limit if none found.
However, may be usage of this method prefers the present shape?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. The reason for the asymmetry is that forward-looking limit may not be a legitimate address (may be end of heap), whereas backward looking limit is a legitimate address.

template<bm_word_t flip, bool aligned_left>
inline idx_t get_prev_bit_impl(idx_t l_index, idx_t r_index) const;

inline idx_t get_next_one_offset(idx_t l_index, idx_t r_index) const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document analogous to line 131.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry. I overlooked this request in prior response. Done.

inline idx_t get_next_one_offset(idx_t l_index, idx_t r_index) const;

void clear_large_range (idx_t beg, idx_t end);
// Search for last one in the range [l_index, r_index). Return r_index if not found.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Symmetry arguments wrt spec for get_next_one_offset may have preferred range (l_index, r_index], returning l_index if none found. May be its (transitive) usage prefers this shape? (See similar comment at line 180.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above regarding asymmetry. It is by design, due to shape of the data.

// Search for last one in the range [l_index, r_index). Return r_index if not found.
inline idx_t get_prev_one_offset(idx_t l_index, idx_t r_index) const;

void clear_large_range(idx_t beg, idx_t end);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

documentation comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

l_index <-> beg
r_index <-> end

in either comment or formal args to make them mutually consistent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a comment here as well.

// common.
HeapWord* p = _scc->block_start(dirty_l);
assert(ctx != nullptr || heap->old_generation()->is_parsable(), "Error");
HeapWord* p = _scc->first_object_start(dirty_l, ctx, tams, dirty_r);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing ctx, tams, and dirty_r into this method seems interesting. Let's see how they are used.

// If not null, ctx holds the complete marking context of the old generation. If null,
// we expect that the marking context isn't available and the crossing maps are valid.
// Note that crossing maps may be invalid following class unloading and before dead
// or unloaded objects have been coalesced and filled (updating the crossing maps).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good comment!

What's still not clear is why tams and last_relevant_card_index are passed here. Does it reduce the work in the caller? I'd expect this to just return the first object on the card index or null if no such object exists. I realize ctx is used when one must consult the marking context in preference to the "crossing maps". The relevance of the last 2 arguments isn't clear from this documentation comment.

May be I'll see why these are passed in when I look at the method definition, but I suspect there may be some leakage of abstraction & functionality here between caller and callee.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for identifying this "confusion". I'm making an attempt to improve documentation for this comment.


// if marking context is valid and we are below tams, we use the marking bit map to find the first marked object that
// intersects with this card, and if no such object exists, we return null
if ((ctx != nullptr) && (left < tams)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the caller should check if left >= tams and short-circuit rather than have this method do that work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That comment is wrong, which is what caused you to request the alternative semantics for this function. Your comments and questions motivated me to rewrite the comments describing the behavior of this function. Rewriting the comments helped me realize the API was a bit ill-defined. I made some improvements to the behavior so that the definition could be more clearly defined. The new implementation now passes all tests again.

@kdnilsen kdnilsen changed the title 8358735: GenShen: bug in #undef'd code in block_start() 8358735: GenShen: block_start() may be incorrect after class unloading Nov 6, 2025
@openjdk openjdk bot added the ready Pull request is ready to be integrated label Nov 6, 2025
@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Nov 7, 2025
@kdnilsen
Copy link
Contributor Author

/integrate

@openjdk
Copy link

openjdk bot commented Nov 11, 2025

@kdnilsen This pull request has not yet been marked as ready for integration.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Nov 11, 2025
@kdnilsen
Copy link
Contributor Author

/integrate

@openjdk
Copy link

openjdk bot commented Nov 11, 2025

Going to push as commit 8531fa1.
Since your change was applied there have been 151 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Nov 11, 2025
@openjdk openjdk bot closed this Nov 11, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Nov 11, 2025
@openjdk
Copy link

openjdk bot commented Nov 11, 2025

@kdnilsen Pushed as commit 8531fa1.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Copy link
Member

@ysramakrishna ysramakrishna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still going through this, but it seems as if there's a bunch of potential clean-ups to do here. I realize this has already been integrated; I can may be create a separate task to perhaps clean up some of the questions raised in this.

Since there are quite a few comments now, I am going to flush these for now as a record of some of my thoughts and create a separate task in which I can see if these concerns are real and if the code can be somewhat simplified in a few places.

Nothing specific to do here at this time in response to these stream-of-consciousness comments.

// Search for last one in the range [l_index, r_index). Return r_index if not found.
inline idx_t get_prev_one_offset(idx_t l_index, idx_t r_index) const;

void clear_large_range(idx_t beg, idx_t end);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

l_index <-> beg
r_index <-> end

in either comment or formal args to make them mutually consistent.

Comment on lines +165 to +168
if (end_addr <= left) {
// The range of addresses to be scanned is empty
continue;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would this happen? We start off with dirty_l to the left of dirty_r, and with dirty_r having started at a card that would correspond to end_addr. I am not convinced this check is needed. I'd rather assert here that: assert(left <= end_addr, "left should remain left of end_addr established above");

// on a very large object, i.e. one spanning multiple cards,
// if its head card is dirty. If not, (i.e. its head card is clean)
// we'll call it each time we process a new dirty range on the object.
// This is always the case for large object arrays, which are typically more
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of This is always the case ... may be we can say The latter is aways the case ...?

(Mea culpa for the old comment.)

// common.
HeapWord* p = _scc->block_start(dirty_l);
assert(ctx != nullptr || heap->old_generation()->is_parsable(), "Error");
HeapWord* p = _scc->first_object_start(dirty_l, ctx, tams, right);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: Wondering if we need to pass both tams and right, or just the max of the two. Will look at first_object_start().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, looks like at this point we might potentially have ctx == nullptr and tams == nullptr. I wonder if we can do better here in terms of passing a sensible single right and dispense with passing tams entirely? Let me go back and look at the implementation of the method again.

// over clusters processed by different workers, with each worked responsible
// for scanning the portion of the obj-array overlapping the dirty cards in
// its cluster.
// 3. Non-array objects are precisely dirtied by the interpreter and the compilers
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should say "imprecisely" at line 296, I think?

assert(!region->is_humongous(), "Use region->humongous_start_region() instead");
#endif

HeapWord* right = MIN2(region->top(), end_range_of_interest);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a safe thing to do, but doesn't the caller already establish the invariant that
region->top() >= end_range_of_interest ? Can we just assert that instead of doing the clip/clamp? (And rename the formal parameter name from end_range_of_interest to right?)

If so, we might also want to change the name of the formal parameter from card_index to left, and change it to be a (card-aligned) heap address for symmetry in the API.

Comment on lines +252 to +261
HeapWord* end_of_search_next = MIN2(right, tams);
size_t last_relevant_card_index;
if (end_range_of_interest == _end_of_heap) {
last_relevant_card_index = _rs->card_index_for_addr(end_range_of_interest - 1);
} else {
last_relevant_card_index = _rs->card_index_for_addr(end_range_of_interest);
if (_rs->addr_for_card_index(last_relevant_card_index) == end_range_of_interest) {
last_relevant_card_index--;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure this is necessary. I'd just adjust the caller so this is ensured, avoiding this computation here. I think the caller has the last dirty card address and can just use that? I realize there's a bit of an issue with the address for tams not necessarily being card-aligned, but I think we should be able to deal with that in the caller as well once we remember that all of this always happens within a single region. (We can add such an assertion so that future adjustments do not render this assumption invalid if the code is changed/adjusted later.)

Comment on lines +62 to +70
#ifdef ASSERT
ShenandoahHeap* heap = ShenandoahHeap::heap();
ShenandoahHeapRegion* r = heap->heap_region_containing(addr);
ShenandoahMarkingContext* ctx = heap->marking_context();
HeapWord* tams = ctx->top_at_mark_start(r);
assert(limit != nullptr, "limit must not be null");
assert(limit >= r->bottom(), "limit must be more than bottom");
assert(addr <= tams, "addr must be less than TAMS");
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it make more sense for these checks to move to the caller? It appears to me to be a leakage of abstraction to test these conditions here. We should be able to return the address for the marked bit found without interpreting the semantics of the bits themselves?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice that this is the case for the get_next_... version below as well; if my comment makes some sense, this can be addressed separately.

Perhaps frugality in testing the conditions required us to site these assertions here, which I kind of understand, although the right thing in that case is to have the wrapper class, viz. ShenandoahMarkingContext make those checks before calling here.

#endif

HeapWord* right = MIN2(region->top(), end_range_of_interest);
HeapWord* end_of_search_next = MIN2(right, tams);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the caller ensure that tams is always valid (e.g. when ctx == nullptr)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The caller seems to allow for tams==nullptr and ctx==nullptr. In that case wouldn't we get end_of_search_next==nullptr?

// Expects to be called for a card affiliated with the old generation in
// generational mode.
HeapWord* block_start(size_t card_index) const;
// Given a card_index, return the starting address of the first live object in the heap
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interface/API comment should describe a Dijkstra-like pre- and post-condition. i.e. if these conditions are satisfied, we'll give you this result.

A description of what the method does (i.e. how it implements the functionality) belongs in the method implementation.

Here, the two are conflated making the interface description unnecessarily long and convoluted. Sometimes this might indicate that the interface isn't as frugal as it should be.

I might state this more succinctly as follows:

// Given:
// `card_index`: a valid index of a card in the old generation
// `ctx` : a valid marking context for the old generation
// `tams` : a valid top-at-mark-start address for the old generation
//                region in which the card_index is located
// `end_range_of_interest` : an address in that region beyond which we need
//                not locate an object
//
// Returns:
// the address of the object, if any, at the least address that overlaps with
// the address range between the start of card_index and end_range_of_interest,
// or nullptr if no such object exists in the given range.

Once you look at the spec in this manner, you realize that the first and last arguments go together and define a suitable address range, and the second and third arguments go together and provide a context. This allows you to divide the assertion checking and call interface most optimally between caller and callee.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

hotspot-gc hotspot-gc-dev@openjdk.org integrated Pull request has been integrated shenandoah shenandoah-dev@openjdk.org

Development

Successfully merging this pull request may close these issues.

3 participants