8345732: Provide helpers for using PartialArrayState #22622

Closed
wants to merge 14 commits into from

@kimbarrett kimbarrett commented Dec 6, 2024

Please review this change that introduces two new helper classes to simplify
the usage of PartialArrayStates to manage splitting the processing of large
object arrays into parallelizable chunks. G1 and Parallel young GCs are
changed to use this new mechanism.

PartialArrayTaskStats is used to collect and report statistics related to
array splitting. It replaces the direct implementation in PSPromotionManager,
and is now also used by G1 young GCs.

PartialArraySplitter packages up most of the work involved in splitting and
processing tasks. It provides task allocation and release, enqueuing, chunk
claiming, and statistics tracking. It does this by encapsulating existing
objects and functionality. Using array splitting is mostly reduced to calling
the splitter's start function and then calling its step function to process
partial states. This substantially reduces the amount of code for each client
to perform this work.
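
As a rough illustration of the start-then-claim pattern described above (the step function was renamed to claim during review), here is a minimal, single-threaded sketch. `ArrayState`, `ChunkSplitter`, `sum_in_chunks`, and the use of `std::deque` as a task queue are hypothetical stand-ins, not the HotSpot classes or their signatures; the real splitter also handles task allocation and release and statistics, which are omitted here.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for a PartialArrayState: shared progress for one array.
struct ArrayState {
  std::size_t length;             // total number of elements
  std::atomic<std::size_t> next;  // index of the first unclaimed element
  explicit ArrayState(std::size_t len) : length(len), next(0) {}
};

// Hypothetical splitter; names and signatures are illustrative only.
class ChunkSplitter {
  std::size_t _chunk_size;

public:
  // Half-open range [start, end) of array indices.
  struct Claim {
    std::size_t start;
    std::size_t end;
  };

  explicit ChunkSplitter(std::size_t chunk_size) : _chunk_size(chunk_size) {
    assert(chunk_size > 0);
  }

  // Enqueue one task per chunk beyond the first, then hand the first chunk
  // back to the caller. Tasks are pushed before the caller processes anything
  // so that other workers could take them in the meantime.
  Claim start(ArrayState& state, std::deque<ArrayState*>& queue) const {
    std::size_t initial_end = std::min(_chunk_size, state.length);
    state.next.store(initial_end, std::memory_order_relaxed);
    std::size_t remaining = state.length - initial_end;
    std::size_t tasks = (remaining + _chunk_size - 1) / _chunk_size;
    for (std::size_t i = 0; i < tasks; ++i) {
      queue.push_back(&state);
    }
    return Claim{0, initial_end};
  }

  // Atomically claim the next unprocessed chunk of the array.
  Claim claim(ArrayState& state) const {
    std::size_t start = state.next.fetch_add(_chunk_size, std::memory_order_relaxed);
    assert(start < state.length);
    return Claim{start, std::min(start + _chunk_size, state.length)};
  }
};

// Client code: process the initial chunk, then drain the queued tasks.
long sum_in_chunks(const std::vector<int>& data, std::size_t chunk_size) {
  ChunkSplitter splitter(chunk_size);
  ArrayState state(data.size());
  std::deque<ArrayState*> queue;
  long total = 0;
  ChunkSplitter::Claim c = splitter.start(state, queue);
  for (std::size_t i = c.start; i < c.end; ++i) total += data[i];
  while (!queue.empty()) {
    ArrayState* task = queue.front();
    queue.pop_front();
    c = splitter.claim(*task);
    for (std::size_t i = c.start; i < c.end; ++i) total += data[i];
  }
  return total;
}
```

In this sketch the client only calls `start` once and `claim` once per dequeued task, which is the reduction in per-client code the description refers to.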

Testing: mach5 tier1-5

Manually ran some test programs with each of G1 and Parallel, with taskqueue
stats logging enabled, and checked that the logged statistics looked okay.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8345732: Provide helpers for using PartialArrayState (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/22622/head:pull/22622
$ git checkout pull/22622

Update a local copy of the PR:
$ git checkout pull/22622
$ git pull https://git.openjdk.org/jdk.git pull/22622/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 22622

View PR using the GUI difftool:
$ git pr show -t 22622

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/22622.diff

Using Webrev

Link to Webrev Comment


bridgekeeper bot commented Dec 6, 2024

👋 Welcome back kbarrett! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.


openjdk bot commented Dec 6, 2024

@kimbarrett This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8345732: Provide helpers for using PartialArrayState

Reviewed-by: tschatzl, ayang, zgu, iwalulya

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 22 new commits pushed to the master branch:

  • 484229e: 8346306: Unattached thread can cause crash during VM exit if it calls wait_if_vm_exited
  • b0c40aa: 8340401: DcmdMBeanPermissionsTest.java and SystemDumpMapTest.java fail with assert(_stack_base != nullptr) failed: Sanity check
  • 6b89954: 8346475: RISC-V: Small improvement for MacroAssembler::ctzc_bit
  • 00d8407: 8346016: Problemlist vm/mlvm/indy/func/jvmti/mergeCP_indy2manyDiff_a in virtual thread mode
  • 5db0a13: 8346132: fallbacklinker.c failed compilation due to unused variable
  • 5590669: 8346570: SM cleanup of tests for Beans and Serialization
  • c8e94ab: 8346532: XXXVector::rearrangeTemplate misses null check
  • f7f2b42: 8346300: Add @test annotation to TCKZoneId.test_constant_OLD_IDS_POST_2024b test
  • a0b7c4f: 8346324: javax/swing/JScrollBar/4865918/bug4865918.java fails in CI
  • 8efc558: 8346378: Cannot use DllMain in libnet for static builds
  • ... and 12 more: https://git.openjdk.org/jdk/compare/4f44cf6bf2423a57a841be817f348e3b1e88f0eb...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the rfr Pull request is ready for review label Dec 6, 2024

openjdk bot commented Dec 6, 2024

@kimbarrett The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label Dec 6, 2024

mlbridge bot commented Dec 6, 2024

Webrevs

@albertnetymk
Member

/cc hotspot-gc

@openjdk openjdk bot added the hotspot-gc hotspot-gc-dev@openjdk.org label Dec 11, 2024

openjdk bot commented Dec 11, 2024

@albertnetymk
The hotspot-gc label was successfully added.

Contributor

@tschatzl tschatzl left a comment

Apart from these comments not being in the right place, seems good.

Comment on lines 44 to 45
// Push any needed partial scan tasks. Pushed before processing the initial
// chunk to allow other workers to steal while we're processing.
Contributor

This comment (last two lines) now imo better belongs to where this method is called. Same with similar comment in step().

Author

I was going to suggest the comment does belong here, but could perhaps be
written more clearly. But on further consideration, I don't think this comment
is needed at all. That behavior is the whole point of the splitter class, as
somewhat discussed in the comments in the header. I've expanded the comments
there to be more explicit.

Also, I really don't want to need to be adding comments about this to each
current and future caller. Part of the point of this class is to minimize the
amount of duplication among clients, and needing (near) duplicated comments
would count against that.

Member

@walulyai walulyai left a comment

LGTM!

Minor nit:


template<typename Queue>
PartialArraySplitter::Step
PartialArraySplitter::step(PartialArrayState* state, Queue* queue, bool stolen) {
Member

Probably easier to read if we rename this to claim; step is used as a noun in many other places.

Author

I like the suggested name change.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 13, 2024
@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Dec 13, 2024
#else
#define TASKQUEUE_STATS_ONLY(code)
#endif // TASKQUEUE_STATS

Contributor

Author

Oops. Fixed.

Comment on lines 74 to 77
void inc_split(size_t n = 1) { _split += n; }
void inc_pushed(size_t n = 1) { _pushed += n; }
void inc_stolen(size_t n = 1) { _stolen += n; }
void inc_processed(size_t n = 1) { _processed += n; }
Member

I skimmed through callers of these, but can't find a strong reason to use default-arg-value here. Will there be more call-sites that justify this usage?

Author

@kimbarrett kimbarrett Dec 17, 2024

Currently, inc_pushed needs an argument while others don't. Given this stats object is likely
mostly encapsulated in and modified by the splitter object, that might always be the case for these
functions. Though consistency has some benefit, maybe not here? I'll wire in the usage, and we
can adjust later if needed.
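
For illustration, a minimal sketch of how the default argument plays out at call sites. `TaskStats` and `record_split` are hypothetical names, and the accessors are added only so the sketch can be checked; they are not taken from the patch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical counterpart of the stats class under review.
class TaskStats {
  std::size_t _split = 0;
  std::size_t _pushed = 0;
  std::size_t _stolen = 0;
  std::size_t _processed = 0;

public:
  // Each increment defaults to 1, so callers that count a single event can
  // omit the argument, while callers that batch can pass a count.
  void inc_split(std::size_t n = 1) { _split += n; }
  void inc_pushed(std::size_t n = 1) { _pushed += n; }
  void inc_stolen(std::size_t n = 1) { _stolen += n; }
  void inc_processed(std::size_t n = 1) { _processed += n; }

  std::size_t split() const { return _split; }
  std::size_t pushed() const { return _pushed; }
  std::size_t stolen() const { return _stolen; }
  std::size_t processed() const { return _processed; }
};

// Hypothetical call site: one array is split, and several tasks are pushed
// for it at once, so only inc_pushed takes an explicit count.
TaskStats record_split(std::size_t pushed_tasks) {
  TaskStats stats;
  stats.inc_split();
  stats.inc_pushed(pushed_tasks);
  stats.inc_processed();
  return stats;
}
```

This matches the situation described above: only the pushed count needs an argument, and the other functions keep the default for consistency.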

};

template<typename StatsAccess>
void PartialArrayTaskStats::log_set(uint num_stats,
Member

Can this be merged with its declaration? Seems kind of odd that these duplicates (method signature) are next to each other.

Author

That would implicitly declare it inline, which doesn't seem particularly
desirable here. And it doesn't seem worth the overhead of splitting out into
a .inline.hpp file. (That would let the logging includes be moved there,
rather than here in the .hpp file. But that seems like a small benefit, since
I don't think there are going to be that many includes of this file.)

But the implicit inlining probably doesn't really matter after all, since the
access function is probably different in every use, so we'll have 1-1
uses to instantiations anyway. So sure, merging.
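
A minimal sketch of the merged form, assuming an accessor that maps an index to a pointer to a stats object. `TaskStats`, `format_set`, and `demo_report` are hypothetical; in particular, this version returns a string instead of writing to unified logging, so that it stays self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stats class with only two counters.
class TaskStats {
  std::size_t _pushed = 0;
  std::size_t _processed = 0;

public:
  void inc_pushed(std::size_t n = 1) { _pushed += n; }
  void inc_processed(std::size_t n = 1) { _processed += n; }

  // Declared and defined in one place. A member function defined inside the
  // class body is implicitly inline; as a template it is instantiated once
  // per distinct StatsAccess type anyway.
  template<typename StatsAccess>
  static std::string format_set(unsigned num_stats, StatsAccess access, const char* title) {
    std::ostringstream out;
    out << title << '\n';
    TaskStats total;
    for (unsigned i = 0; i < num_stats; ++i) {
      const TaskStats* s = access(i);  // assumed accessor contract: index -> pointer
      out << i << ": pushed=" << s->_pushed << " processed=" << s->_processed << '\n';
      total._pushed += s->_pushed;
      total._processed += s->_processed;
    }
    out << "total: pushed=" << total._pushed << " processed=" << total._processed << '\n';
    return out.str();
  }
};

// Hypothetical client: each call site typically passes its own lambda type.
std::string demo_report() {
  std::vector<TaskStats> per_worker(2);
  per_worker[0].inc_pushed(2);
  per_worker[0].inc_processed();
  per_worker[1].inc_pushed(3);
  per_worker[1].inc_processed(4);
  return TaskStats::format_set(
      static_cast<unsigned>(per_worker.size()),
      [&](unsigned i) { return &per_worker[i]; },
      "demo");
}
```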

TASKQUEUE_STATS_ONLY(PartialArrayTaskStats _stats;)

public:
explicit PartialArraySplitter(PartialArrayStateManager* manager,
Member

Why explicit for a method that has two args?

Author

Forgot to remove it when the 2nd argument was added. Originally that number came from the manager, but
a potentially long-lived and reused manager with dynamic selection of worker threads made
that wrong.
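
As background on what the keyword does in this position: since C++11, `explicit` on a multi-argument constructor is not a no-op, because it blocks copy-list-initialization such as `return {a, b};`. The sketch below shows the difference. `Manager`, `LooseSplitter`, `StrictSplitter`, and the worker-count parameter are hypothetical names chosen for illustration, not the actual constructor signature.

```cpp
#include <cassert>
#include <cstddef>

struct Manager {};  // hypothetical placeholder for a state manager

class LooseSplitter {
  Manager* _manager;
  unsigned _workers;

public:
  LooseSplitter(Manager* manager, unsigned workers)
    : _manager(manager), _workers(workers) {}
  Manager* manager() const { return _manager; }
  unsigned workers() const { return _workers; }
};

class StrictSplitter {
  Manager* _manager;
  unsigned _workers;

public:
  explicit StrictSplitter(Manager* manager, unsigned workers)
    : _manager(manager), _workers(workers) {}
  Manager* manager() const { return _manager; }
  unsigned workers() const { return _workers; }
};

// Copy-list-initialization of the return value works without explicit.
LooseSplitter make_loose(Manager* m) { return {m, 4u}; }

// With explicit, `return {m, 4u};` would be ill-formed; the type must be named.
StrictSplitter make_strict(Manager* m) { return StrictSplitter(m, 4u); }
```

Direct initialization, such as `StrictSplitter s(m, 4u);`, is unaffected either way.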

Member

@albertnetymk albertnetymk left a comment

Some minor suggestions.


// Result type for claim(), carrying multiple values. Provides the claimed
// chunk's start and end array indices.
struct Claim {
Member

I feel Chunk is a better name.

Author

I think Chunk is overly generic and used a lot elsewhere. It could just as
easily be Region (e.g. the "claimed region" instead of the "claimed chunk").
I think the "claim-ness" is the important feature here.

//
// title: A string title for the table.
template<typename StatsAccess>
static void log_set(uint num_stats, StatsAccess access, const char* title) {
Member

Going through all its call sites, I believe print_stats is more readable.

Author

The name log_set was chosen to suggest that it does "UL logging", and to
indicate that it is for dealing with a set of stats objects. I think
print_stats loses both of those cues and is less clear because of that.

Member

Why is "set" more important than "stats" in "set of stats objects"? If "UL logging" is critical, "log_stats" would be better. When I first read this name, I thought it was related to "set" as in "getter/setter" of log...

Author

"stats" is redundant here. Recall this is a static function. A client call is
going to look like PartialArrayTaskStats::log_set(...), so it's already
obvious it's related to "stats" at the call site.

A value assigning function would have a "set_" prefix. Using a "_set" suffix
for that would be really weird and non-idiomatic (and a reader would be quite
right to complain about such).

Member

I don't feel that the redundancy here is bad, since the first two args are tied to "stats". OTOH, I find the trailing "set" super confusing. This function is to log/print multiple stats, and the most intuitive choice would have been "log/print" + "stats", because it directly communicates the action being performed (logging stats).

Emphasizing the collective noun instead of the actual noun seems odd. YMMV.

PartialArraySplitter::Claim
PartialArraySplitter::claim(PartialArrayState* state, Queue* queue, bool stolen) {
#if TASKQUEUE_STATS
if (stolen) _stats.inc_stolen();
Member

Breaking it into multiple lines makes the control flow more explicit.

Author

This stylistic difference has been discussed at length in the past.
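
For readers unfamiliar with the stats guards in the snippet under review, here is a minimal sketch of the conditional-compilation pattern. The macro names `SKETCH_STATS` and `SKETCH_STATS_ONLY` and the `Worker` class are hypothetical; the enabled branch of the macro is an assumption modeled on the empty `#else` branch of `TASKQUEUE_STATS_ONLY` quoted earlier in this conversation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical switch; the reviewed code uses TASKQUEUE_STATS instead.
#ifndef SKETCH_STATS
#define SKETCH_STATS 1
#endif

#if SKETCH_STATS
#define SKETCH_STATS_ONLY(code) code  // assumed shape of the enabled branch
#else
#define SKETCH_STATS_ONLY(code)
#endif

class Worker {
  std::size_t _claims = 0;
  // The member only exists when stats are compiled in.
  SKETCH_STATS_ONLY(std::size_t _stolen = 0;)

public:
  void claim(bool stolen) {
#if SKETCH_STATS
    if (stolen) _stolen += 1;
#else
    (void)stolen;  // avoid an unused-parameter warning when stats are off
#endif
    _claims += 1;
  }

  std::size_t claims() const { return _claims; }

  std::size_t stolen() const {
#if SKETCH_STATS
    return _stolen;
#else
    return 0;
#endif
  }
};
```

When the switch is off, both the counter and its updates disappear from the build, so the bookkeeping adds no cost.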

}

void PartialArrayTaskStats::reset() {
*this = PartialArrayTaskStats();
Member

Can we do sth like static_assert(std::is_trivially_copyable<PartialArrayTaskStats>::value) here?

Author

I think you mean is_trivially_assignable. I don't think it's a useful
assertion here. Depending on details of the class, one might reasonably
implement such an operation in the same way even if it isn't trivially
assignable.
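
A small sketch of the point being made, using two hypothetical classes. `TaskStats` happens to satisfy both traits mentioned above, while `NamedStats` holds a `std::string`, is not trivially assignable, and still resets correctly with the same assign-from-default idiom.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical plain-counter class: trivially copyable and assignable.
class TaskStats {
  std::size_t _pushed = 0;

public:
  void inc_pushed(std::size_t n = 1) { _pushed += n; }
  std::size_t pushed() const { return _pushed; }
  void reset() { *this = TaskStats(); }
};

static_assert(std::is_trivially_copyable<TaskStats>::value,
              "plain counters are trivially copyable");
static_assert(std::is_trivially_assignable<TaskStats&, TaskStats>::value,
              "plain counters are trivially assignable");

// Hypothetical class with a non-trivial member: the same reset idiom is
// still well defined, so an assertion on triviality would reject valid code.
class NamedStats {
  std::string _name;
  std::size_t _count = 0;

public:
  void set_name(const std::string& name) { _name = name; }
  void inc(std::size_t n = 1) { _count += n; }
  const std::string& name() const { return _name; }
  std::size_t count() const { return _count; }
  void reset() { *this = NamedStats(); }
};

static_assert(!std::is_trivially_assignable<NamedStats&, NamedStats>::value,
              "std::string makes assignment non-trivial");
```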

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 17, 2024
Contributor

@zhengyu123 zhengyu123 left a comment

LGTM

@kimbarrett
Author

Thanks all for the reviews.

@kimbarrett
Author

/integrate


openjdk bot commented Dec 19, 2024

Going to push as commit 2344a1a.
Since your change was applied there have been 25 commits pushed to the master branch:

  • 572ce26: 8345266: java/util/concurrent/locks/StampedLock/OOMEInStampedLock.java JTREG_TEST_THREAD_FACTORY=Virtual fails with OOME
  • f6e7713: 8339356: Test javax/net/ssl/SSLSocket/Tls13PacketSize.java failed with java.net.SocketException: An established connection was aborted by the software in your host machine
  • 23d6f74: 8346463: Add test coverage for deploying the default provider as a module
  • 484229e: 8346306: Unattached thread can cause crash during VM exit if it calls wait_if_vm_exited
  • b0c40aa: 8340401: DcmdMBeanPermissionsTest.java and SystemDumpMapTest.java fail with assert(_stack_base != nullptr) failed: Sanity check
  • 6b89954: 8346475: RISC-V: Small improvement for MacroAssembler::ctzc_bit
  • 00d8407: 8346016: Problemlist vm/mlvm/indy/func/jvmti/mergeCP_indy2manyDiff_a in virtual thread mode
  • 5db0a13: 8346132: fallbacklinker.c failed compilation due to unused variable
  • 5590669: 8346570: SM cleanup of tests for Beans and Serialization
  • c8e94ab: 8346532: XXXVector::rearrangeTemplate misses null check
  • ... and 15 more: https://git.openjdk.org/jdk/compare/4f44cf6bf2423a57a841be817f348e3b1e88f0eb...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Dec 19, 2024
@openjdk openjdk bot closed this Dec 19, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Dec 19, 2024

openjdk bot commented Dec 19, 2024

@kimbarrett Pushed as commit 2344a1a.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@kimbarrett kimbarrett deleted the pa-splitter branch December 19, 2024 16:03
Labels
hotspot hotspot-dev@openjdk.org hotspot-gc hotspot-gc-dev@openjdk.org integrated Pull request has been integrated

5 participants