-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8345732: Provide helpers for using PartialArrayState #22622
Conversation
👋 Welcome back kbarrett! A progress list of the required criteria for merging this PR into |
@kimbarrett This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 22 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
@kimbarrett The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
/cc hotspot-gc |
@albertnetymk |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from these comments not being in the right place, seems good.
// Push any needed partial scan tasks. Pushed before processing the initial | ||
// chunk to allow other workers to steal while we're processing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment (last two lines) now imo better belongs to where this method is called. Same with similar comment in step()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to suggest the comment does belong here, but could perhaps be
written more clearly. But on further consideration, I don't think this comment
is needed at all. That behavior is the whole point of the splitter class, as
somewhat discussed in the comments in the header. I've expanded the comments
there to be more explicit.
Also, I really don't want to need to be adding comments about this to each
current and future caller. Part of the point of this class is to minimize the
amount of duplication among clients, and needing (near) duplicated comments
would count against that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Minor nit:
|
||
template<typename Queue> | ||
PartialArraySplitter::Step | ||
PartialArraySplitter::step(PartialArrayState* state, Queue* queue, bool stolen) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably easier to read if we rename to claim
, step is used as noun in many other places
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the suggested name change.
#else | ||
#define TASKQUEUE_STATS_ONLY(code) | ||
#endif // TASKQUEUE_STATS | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicated definition in TaskQueue.hpp
https://github.com/openjdk/jdk/blob/ab1dbd4089a1a15bdf1b6b39994d5b1faacc40ab/src/hotspot/share/gc/shared/taskqueue.hpp#L39-51 should be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops. Fixed.
void inc_split(size_t n = 1) { _split += n; } | ||
void inc_pushed(size_t n = 1) { _pushed += n; } | ||
void inc_stolen(size_t n = 1) { _stolen += n; } | ||
void inc_processed(size_t n = 1) { _processed += n; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I skimmed through callers of these, but can't find a strong reason to use default-arg-value here. Will there be more call-sites that justify this usage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, inc_pushed needs an argument while others don't. Given this stats object is likely
mostly encapsulated in and modified by the splitter object, that might always be the case for these
functions. Though consistency has some benefit, maybe not here? I'll wire in the usage, and we
can adjust later if needed.
}; | ||
|
||
template<typename StatsAccess> | ||
void PartialArrayTaskStats::log_set(uint num_stats, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be merged with its declaration? Seems kind of odd that these duplicates (method signature) are next to each other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would implicitly declare it inline, which doesn't seem particularly
desirable here. And it doesn't seem worth the overhead of splitting out into
a .inline.hpp file. (That would let the logging includes be moved there,
rather than here in the .hpp file. But that seems like a small benefit, since
I don't think there are going to be that many includes of this file.)
But the implicit inlining probably doesn't really matter after all, since the
access function is probably different in every use, so we'll so we'll have 1-1
uses to instantiations anyway. So sure, merging.
TASKQUEUE_STATS_ONLY(PartialArrayTaskStats _stats;) | ||
|
||
public: | ||
explicit PartialArraySplitter(PartialArrayStateManager* manager, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why explicit
for a method that has two args.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot to remove when 2nd argument added. Originally that number from the manager, but
a potentially long-lived and reused manager with dynamic selection of worker threads made
that wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor suggestion.
|
||
// Result type for claim(), carrying multiple values. Provides the claimed | ||
// chunk's start and end array indices. | ||
struct Claim { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel Chunk
is a better name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Chunk is overly generic and used a lot elsewhere. It could just as
easily be Region (e.g. the "claimed region" instead of the "claimed chunk").
I think the "claim-ness" is the important feature here.
// | ||
// title: A string title for the table. | ||
template<typename StatsAccess> | ||
static void log_set(uint num_stats, StatsAccess access, const char* title) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going through all its call sites, I believe print_stats
is more readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name log_set was chosen to suggest that it does "UL logging", and to
indicate that it is for dealing with a set of stats objects. I think
print_stats loses both of those cues and is less clear because of that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is "set" more than important than "stats" in "set of stats objects"? If "UL logging" is critical, "log_stats" would be better. When I first read this name, I thought it's related to "set" as in "getter/setter" of log...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"stats" is redundent here. Recall this is a static function. A client call is
going to look like PartialArrayTaskStats::log_set(...)
, so it's already
obvious it's related to "stats" at the call site.
A value assigning function would have a "set_" prefix. Using a "_set" suffix
for that would be really weird and non-idiomatic (and a reader would be quite
right to complain about such).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't feel that the redundancy here is bad, since the first two args are tied to "stats". OTOH, I find the trailing "set" super confusing. This function is to log/print multiple stats, and the most intuitive choice would have been "log/print" + "stats", because it directly communicates the action being performed (logging stats).
Emphasizing the collective noun instead of the actual noun seems odd. YMMV.
PartialArraySplitter::Claim | ||
PartialArraySplitter::claim(PartialArrayState* state, Queue* queue, bool stolen) { | ||
#if TASKQUEUE_STATS | ||
if (stolen) _stats.inc_stolen(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Breaking it into multiple lines make the control flow more explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This stylistic difference has been discussed at length in the past.
} | ||
|
||
void PartialArrayTaskStats::reset() { | ||
*this = PartialArrayTaskStats(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do sth like static_assert(std::is_trivially_copyable<PartialArrayTaskStats>::value)
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you mean is_trivially_assignable. I don't think it's a useful
assertion here. Depending on details of the class, one might reasonably
implement such an operation in the same way even if it isn't trivially
assignable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks all for the reviews. |
/integrate |
Going to push as commit 2344a1a.
Your commit was automatically rebased without conflicts. |
@kimbarrett Pushed as commit 2344a1a. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
Please review this change that introduces two new helper classes to simplify
the usage of PartialArrayStates to manage splitting the processing of large
object arrays into parallelizable chunks. G1 and Parallel young GCs are
changed to use this new mechanism.
PartialArrayTaskStats is used to collect and report statistics related to
array splitting. It replaces the direct implementation in PSPromotionManager,
and is now also used by G1 young GCs.
PartialArraySplitter packages up most of the work involved in splitting and
processing tasks. It provides task allocation and release, enqueuing, chunk
claiming, and statistics tracking. It does this by encapsulating existing
objects and functionality. Using array splitting is mostly reduced to calling
the splitter's start function and then calling it's step function to process
partial states. This substantially reduces the amount of code for each client
to perform this work.
Testing: mach5 tier1-5
Manually ran some test programs with each of G1 and Parallel, with taskqueue
stats logging enabled, and checked that the logged statistics looked okay.
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/22622/head:pull/22622
$ git checkout pull/22622
Update a local copy of the PR:
$ git checkout pull/22622
$ git pull https://git.openjdk.org/jdk.git pull/22622/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 22622
View PR using the GUI difftool:
$ git pr show -t 22622
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/22622.diff
Using Webrev
Link to Webrev Comment