Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8319373: Serial: Refactor dirty cards scanning during Young GC #16492

Closed
wants to merge 22 commits into from

Conversation

albertnetymk
Copy link
Member

@albertnetymk albertnetymk commented Nov 3, 2023

Reading the new code directly is probably easier. The structure is quite similar to its counterpart in Parallel.

It's mostly perf-neutral, except when dirty cards are scarce. Using card_scan.java in JDK-8310031, I observed ~40% reduction in young-gc pause time with stride = 32 * 64.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8319373: Serial: Refactor dirty cards scanning during Young GC (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16492/head:pull/16492
$ git checkout pull/16492

Update a local copy of the PR:
$ git checkout pull/16492
$ git pull https://git.openjdk.org/jdk.git pull/16492/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 16492

View PR using the GUI difftool:
$ git pr show -t 16492

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16492.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Nov 3, 2023

👋 Welcome back ayang! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot changed the title 8319373 8319373: Serial: Refactor dirty cards scanning during Young GC Nov 3, 2023
@openjdk openjdk bot added the rfr Pull request is ready for review label Nov 3, 2023
@openjdk
Copy link

openjdk bot commented Nov 3, 2023

@albertnetymk The following label will be automatically applied to this pull request:

  • hotspot-gc

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-gc hotspot-gc-dev@openjdk.org label Nov 3, 2023
@mlbridge
Copy link

mlbridge bot commented Nov 3, 2023

Copy link

@kimbarrett kimbarrett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just getting started on review, and noticed a few minor things.


auto is_word_aligned = [] (const void* const p) {
return (((uintptr_t)p) % sizeof(Word)) == 0;
};

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have is_aligned in utilities/align.hpp. Please use that.

static_assert(clean_card_val() == (CardValue)-1, "inv");
constexpr Word clean_word = (Word)-1;

CardValue* i = start_card;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't much like naming a pointer variable "i". "i" is a really commonly used index variable name.

for (size_t j = 0; j < sizeof(Word); ++j) {
if (is_dirty(i + j)) {
return i + j;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of duplicating the byte search, this could just break out of the word search.

CardValue* const end_card) {
using Word = uintptr_t;

auto is_word_aligned = [] (const void* const p) {
Copy link
Contributor

@tschatzl tschatzl Nov 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just my opinion, feel free to ignore: don't use auto if unavoidable, in this case use the specific type bool. Apart from that I am not sure the gain of defining a helper method for simple single-use-single-line expression is worth the effort.

src/hotspot/share/gc/serial/cardTableRS.cpp Outdated Show resolved Hide resolved
return i_card;
}

// final obj in dirty-chunk crosses card-boundary
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// final obj in dirty-chunk crosses card-boundary
// Last object in dirty chunk crosses card-boundary.

I think "last" is the more appropriate term compared to "final" as to me "final" indicates that this is the overall very last object.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"last" can mean "previous" so I tried to avoid it.

return i_card;
}

// final card occupied by this obj
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// final card occupied by this obj
// Final card occupied by this obj. We can use this to return and continue our search because this object is imprecisely marked and is always looked at for references as a whole anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this method should be renamed, its name is misleading. Or it should have additional documentation. It does not find the first clean card, but the first clean card after the recently processed object.

Also the objArray exception should be documented, see my suggestions, but it is maybe better to collect this information at the top.

I.e. in this case

Heap   | ObjectVeryLong                          |
Cards  |   x  |      |      |   x  |      |      |

x = dirty card

the method does not find the first clean card (the second card), but the last card covering the object (assuming it's not aligned).
This is not a problem because the caller cleans up the extra dirty marks, but still it is surprising.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added doc.

src/hotspot/share/gc/serial/cardTableRS.cpp Outdated Show resolved Hide resolved
using Word = uintptr_t;

auto is_word_aligned = [] (const void* const p) {
return is_aligned(p, sizeof(Word));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The include for is_aligned seems to be missing.

src/hotspot/share/gc/serial/cardTableRS.cpp Outdated Show resolved Hide resolved
Comment on lines +476 to +477
CardValue* const clear_limit_card = is_card_aligned(mr.end()) ? end_card - 1
: end_card - 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment similar to the one in the PSStripeShadowCardTable constructor here, otherwise this code will be too hard to understand in six months.

src/hotspot/share/gc/serial/cardTableRS.cpp Outdated Show resolved Hide resolved
src/hotspot/share/gc/serial/cardTableRS.cpp Outdated Show resolved Hide resolved
static CardTable::CardValue* find_first_dirty_card(CardValue* const start_card,
CardValue* const end_card);
template<typename Func>
CardTable::CardValue* find_first_clean_card(CardValue* const start_card,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already mentioned that this method should probably be renamed or at least documented more exactly what it does with all its optimizations.

albertnetymk and others added 11 commits November 10, 2023 11:52
Co-authored-by: Thomas Schatzl <59967451+tschatzl@users.noreply.github.com>
Co-authored-by: Thomas Schatzl <59967451+tschatzl@users.noreply.github.com>
Co-authored-by: Thomas Schatzl <59967451+tschatzl@users.noreply.github.com>
Co-authored-by: Thomas Schatzl <59967451+tschatzl@users.noreply.github.com>
Co-authored-by: Thomas Schatzl <59967451+tschatzl@users.noreply.github.com>
Co-authored-by: Thomas Schatzl <59967451+tschatzl@users.noreply.github.com>
Co-authored-by: Thomas Schatzl <59967451+tschatzl@users.noreply.github.com>
Copy link
Contributor

@tschatzl tschatzl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@openjdk
Copy link

openjdk bot commented Nov 13, 2023

@albertnetymk This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8319373: Serial: Refactor dirty cards scanning during Young GC

Reviewed-by: kbarrett, tschatzl

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 11 new commits pushed to the master branch:

  • 21d361e: 8320525: G1: G1UpdateRemSetTrackingBeforeRebuild::distribute_marked_bytes accesses partially unloaded klass
  • dc256fb: 8320061: [nmt] Multiple issues with peak accounting
  • adad132: 8320767: Use := wherever possible in spec.gmk.in
  • 69c0b24: 8320714: java/util/Locale/LocaleProvidersRun.java and java/util/ResourceBundle/modules/visibility/VisibilityTest.java timeout after passing
  • 66ae6d5: 8320899: Select the correct Makefile when running make in build directory
  • ebbef62: 8320769: Remove ill-adviced "make install" target
  • 86bb804: 8320863: dsymutil command leaves around temporary directories
  • db7fedf: 8320358: GHA: ignore jdk* branches
  • e33b6c1: 8319437: NMT should show library names in call stacks
  • 2fae07f: 8319311: JShell Process Builder should be configurable
  • ... and 1 more: https://git.openjdk.org/jdk/compare/4bcda602668835c35e2ac6ff6702d15cd249bc2a...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Nov 13, 2023

static void clear_cards(CardValue* start, CardValue* end);

static CardTable::CardValue* find_first_dirty_card(CardValue* const start_card,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CardTable qualification for return type isn't needed here, or a couple lines later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got compilation error without it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is weird. Which compiler(s)? Unqualified CardValue should be fine here, and is used in other places in the same signature. It would be needed in the (out of class) definition, where the scoping is different, but shouldn't be needed here (within the class definition). (That scoping issue could be removed by using trailing return type, which has different scope too, but that would be strange style in HotSpot.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From clang 14:

cardTableRS.cpp:349:1: error: unknown type name 'CardValue'
CardValue* CardTableRS::find_first_dirty_card(CardValue* const start_card,

From gcc 13:

cardTableRS.cpp:349:1: error: 'CardValue' does not name a type; did you mean 'JavaValue'?
  349 | CardValue* CardTableRS::find_first_dirty_card(CardValue* const start_card,

Can you build successfully without the prefix?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My original comment was about the .hpp file. My followup comment noted that the qualification is needed in
the .cpp file, because of the different scoping (in the class definition body vs outside it). The error you are
mentioning is in the .cpp file. A leading return type is not in the scope of the function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't understand your request. Compiling works fine if I remove the prefix in header only. However, I think it's more consistent when the declaration in header and definition have the identical signature.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a number of ways in which a declaration and a separate definition may or even must differ.
I think this is a false consistency, and confusing because the qualifier is unnecessary and eye catching.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed. I still believe the declaration and definition should be trivially duplicated, because they are semantically the same thing. Subjective, I guess.

while (i_card + sizeof(Word) <= end_card) {
Word* i_word = reinterpret_cast<Word*>(i_card);
if (*i_word != clean_word) {
// Found a clean card in this word; fall back to per-CardValue comparison.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/clean/non-clean/ or s/clean/dirty/.

using Word = uintptr_t;

static_assert(clean_card_val() == (CardValue)-1, "inv");
constexpr Word clean_word = (Word)-1;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CardTable provides clean_card_row_val() for this sort of thing.


// Word comparison
while (i_card + sizeof(Word) <= end_card) {
Word* i_word = reinterpret_cast<Word*>(i_card);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe CardTable should provide a helper for accessing a "row" of cards?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

g1RemSet.cpp also has sth like this. I will handle them together in another PR.

clear_cl.do_MemRegion(mr);
if (dirty_l == end_card) {
// No dirty cards to iterate.
break;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe s/break/return/ ?

Copy link

@kimbarrett kimbarrett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more minor nits.

static_assert(clean_card_val() == (CardValue)-1, "inv");
constexpr Word clean_word = (Word)CardTable::clean_card_row_val();

CardValue* i_card = start_card;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the "i_" prefix here mean? Something like "cur_card" might be a better name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's for "iterator".

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I would have ever guessed that, which suggests it might not be an ideal naming choice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern regarding cur_card is that there is no contrasting prev_ or next_ in this context. The sole message I want to convey here is that we are iterating through cards. Using the conventional loop variable i plus card seems like the most natural approach, in my opinion. It is concise, so it doesn't introduce much overhead despite its frequent occurrence in the context. Therefore, I prefer the current name. By the way, the same style and convention are also employed in other methods in this PR. Additionally, similar naming conventions can be found in the already merged code of Parallel (PSCardTable) and G1 (G1ScanHRForRegionClosure).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I concur with Kim and would like to ask to rename these variables to use the cur_ prefix as is much more common. I do not see a connection of cur with linked list prev/next pointers; in the previous recent PRs that introduced this naming I (in hindsight wrongly) did not speak up because although I did not like it, nobody else objected either.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "cur" is a frequently used naming convention for iteration, irrespective of there being related
"prev" or "next". I don't like the similar naming in those other places you mention either. Like I said,
"i" usually indicates an integral/index variable, and using it otherwise is (at least to me) confusing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that there is now a strong objection to this naming style, and I did not really like the i_ prefix in the other CRs either but did not speak up, I suggest we just use the existing common cur_ prefix here. From my point of view there is no relation of cur_ to prev/next either.

The other uses can be fixed up separately.

CardValue* const end_card) {
using Word = uintptr_t;

static_assert(clean_card_val() == (CardValue)-1, "inv");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that clean_card_row_val() is being used, why do we need this static_assert?

// Word comparison
while (i_card + sizeof(Word) <= end_card) {
Word* i_word = reinterpret_cast<Word*>(i_card);
if (*i_word != clean_word) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just use clean_card_row_val() here directly? Rather than introducing the clean_word variable and
only using it once, here.

// Implemented word-iteration to skip long consecutive clean cards.
CardTable::CardValue* CardTableRS::find_first_dirty_card(CardValue* const start_card,
CardValue* const end_card) {
using Word = uintptr_t;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better if Word was consistent with clean_card_row_val, to eliminate casts now and in
the future when you address my earlier comment about needing a clean-row accessor. So I suggest
using Word = decltype(clean_card_row_val());.
Maybe as part of the clean-row accessor also add a CardRow type alias.
Note that I think that type should be unsigned (as currently here, e.g. uintptr_t) rather than signed
(as currently provided by clean_card_row_val).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The similar card-scanning code in g1remset also defines Word. I will extract them together. This PR targets Serial-specific issue, so I wanna minimize touching the shared code, e.g. introducing a new CardWord type and operations around it.

using Word = uintptr_t;

static_assert(clean_card_val() == (CardValue)-1, "inv");
constexpr Word clean_word = (Word)CardTable::clean_card_row_val();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is clean_card_row_val qualified here?

static_assert(clean_card_val() == (CardValue)-1, "inv");
constexpr Word clean_word = (Word)CardTable::clean_card_row_val();

CardValue* i_card = start_card;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I would have ever guessed that, which suggests it might not be an ideal naming choice.


static CardValue* find_first_dirty_card(CardValue* const start_card,
CardValue* const end_card);
template<typename Func>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a blank line between these two function declarations.

static void clear_cards(CardValue* start, CardValue* end);

static CardValue* find_first_dirty_card(CardValue* const start_card,
CardValue* const end_card);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const qualifiers are noise here, having no effect. Similarly in find_first_clean_card. (And why isn't ct
similarly qualified there?) The second const qualifiers in is_dirty/clean don't add much, and it's not
typical usage, but at least they have an effect.

The const qualifiers in the implementations of find_first_xxx are fine, its an implementation detail
that those variables don't get modified. That implementation detail doesn't belong here though.

CardValue* find_first_clean_card(CardValue* const start_card,
CardValue* const end_card,
CardTableRS* ct,
Func&& object_start);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neither rvalue references nor universal references are permitted in HotSpot code (yet). Usual style is to
take function arguments by value. The copy may not even matter (may be elided/optimized away), given
this function is called once and potentially inlined. Maybe declare this function inline to encourage that?
I'm waffling over rejecting this, since I think we ought to at least permit universal references, but that
hasn't been proposed yet.

Copy link

@kimbarrett kimbarrett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@albertnetymk
Copy link
Member Author

Thanks for the review.

/integrate

@openjdk
Copy link

openjdk bot commented Nov 29, 2023

Going to push as commit 77d604a.
Since your change was applied there have been 22 commits pushed to the master branch:

  • 38cfb22: 8318706: Implement JEP 423: Region Pinning for G1
  • e44d4b2: 8320858: Move jpackage tests to tier3
  • 5dcf3a5: 8320715: Improve the tests of test/hotspot/jtreg/compiler/intrinsics/float16
  • 78b6c2b: 8320898: exclude compiler/vectorapi/reshape/TestVectorReinterpret.java on ppc64(le) platforms
  • 9a6ca23: 8320918: Fix errors in the built-in Catalog implementation
  • 5e1b771: 8316422: TestIntegerUnsignedDivMod.java triggers "invalid layout" assert in FrameValues::validate
  • a657aa3: 8320681: [macos] Test tools/jpackage/macosx/MacAppStoreJlinkOptionsTest.java timed out on macOS
  • 3ccd02f: 8320379: C2: Sort spilling/unspilling sequence for better ld/st merging into ldp/stp on AArch64
  • 2c4c6c9: 8320049: PKCS10 would not discard the cause when throw SignatureException on invalid key
  • f93b18f: 8320932: [BACKOUT] dsymutil command leaves around temporary directories
  • ... and 12 more: https://git.openjdk.org/jdk/compare/4bcda602668835c35e2ac6ff6702d15cd249bc2a...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Nov 29, 2023
@openjdk openjdk bot closed this Nov 29, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Nov 29, 2023
@openjdk
Copy link

openjdk bot commented Nov 29, 2023

@albertnetymk Pushed as commit 77d604a.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@albertnetymk albertnetymk deleted the s1-young-gc branch November 29, 2023 10:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-gc hotspot-gc-dev@openjdk.org integrated Pull request has been integrated
3 participants