Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8324649: Shenandoah: replace implementation of free set #17561

Closed
wants to merge 126 commits into from

Conversation

kdnilsen
Copy link
Contributor

@kdnilsen kdnilsen commented Jan 24, 2024

Several objectives:

  1. Reduce humongous allocation failures by segregating regular regions from humongous regions
  2. Do not retire regions just because an allocation failed within the region if the memory remaining within the region is large enough to represent a LAB
  3. Track range of empty regions in addition to range of available regions in order to expedite humongous allocations
  4. Treat collector reserves as available for Mutator allocations after evacuation completes
  5. Improve encapsulation so as to enable an OldCollector reserve for future integration of generational Shenandoah

We have compared performance of existing FreeSet implementation with the proposed PR over a broad set of performance workloads and see that the impact is mostly neutral.

Comparing 105235.0 metrics from control, 220638.0 from experiment.
Compare: 0.589s
Most impacted benchmarks | Most impacted metrics

                             Shenandoah/jython |                                          cwr_total


                            Only in experiment |                                    Only in control

             crypto.signverify/trigger_failure |                        crypto.rsa/cmr_thread_roots
            extremem-large-31g/adjust_pointers |       scimark.sparse.small/concurrent_thread_roots
        extremem-large-31g/calculate_addresses |              xml.transform/concurrent_thread_roots
  crypto.signverify/class_unloading_rendezvous |                    mpegaudio/concurrent_weak_roots
                              serial/cmr_total |                        crypto.rsa/ctr_thread_roots

Shenandoah

+5.64% jython/cwr_total p=0.00037
Control: 1.928ms (+/-272.40us) 170
Test: 2.037ms (+/-322.73us) 344


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8324649: Shenandoah: replace implementation of free set (Bug - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/17561/head:pull/17561
$ git checkout pull/17561

Update a local copy of the PR:
$ git checkout pull/17561
$ git pull https://git.openjdk.org/jdk.git pull/17561/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 17561

View PR using the GUI difftool:
$ git pr show -t 17561

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/17561.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Jan 24, 2024

👋 Welcome back kdnilsen! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jan 24, 2024

@kdnilsen The following labels will be automatically applied to this pull request:

  • hotspot-gc
  • shenandoah

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added hotspot-gc hotspot-gc-dev@openjdk.org shenandoah shenandoah-dev@openjdk.org labels Jan 24, 2024
@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 24, 2024
@mlbridge
Copy link

mlbridge bot commented Jan 24, 2024

Webrevs

Copy link
Contributor

@earthling-amzn earthling-amzn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor suggestions, questions.

@kdnilsen
Copy link
Contributor Author

I've been trying to understand the reported trigger failure regression on hyperalloc. I cannot reproduce that result. The host on which I'm experimenting apparently has more cores than what runs the pipeline, so I have to push hyperalloc to higher allocation rates and higher memory utilization in order to see any "trigger failures at all". On my host, I see a small number of trigger failures beginning at 6144 KB/s allocation rate with live memory 4096 MB out of heap size 10 GB. On this workload, I see slightly more trigger failures with the original free set implementation than with the new one. Here are the results of all my recent experiments:

Screenshot 2024-01-26 at 10 43 47 AM

Copy link
Member

@ysramakrishna ysramakrishna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few scattered comments as I skimmed through this.
Will probably go back and look again, more carefully but thought I'd leave these here for now until I get back to this later, take a fresh look and do a more careful and complete review.

src/hotspot/share/gc/shenandoah/shenandoahFreeSet.hpp Outdated Show resolved Hide resolved
src/hotspot/share/gc/shenandoah/shenandoahFreeSet.hpp Outdated Show resolved Hide resolved
src/hotspot/share/gc/shenandoah/shenandoahFreeSet.hpp Outdated Show resolved Hide resolved
src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp Outdated Show resolved Hide resolved
src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp Outdated Show resolved Hide resolved
src/hotspot/share/gc/shenandoah/shenandoahFreeSet.hpp Outdated Show resolved Hide resolved
src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp Outdated Show resolved Hide resolved
src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp Outdated Show resolved Hide resolved
src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp Outdated Show resolved Hide resolved
Copy link
Member

@ysramakrishna ysramakrishna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more feedback comments. Will try and finish in next round, but flushing the review comments buffer for now until I get back to this later.

src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp Outdated Show resolved Hide resolved
src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp Outdated Show resolved Hide resolved
src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp Outdated Show resolved Hide resolved
src/hotspot/share/gc/shenandoah/shenandoahFreeSet.hpp Outdated Show resolved Hide resolved
src/hotspot/share/gc/shenandoah/shenandoahFreeSet.hpp Outdated Show resolved Hide resolved
src/hotspot/share/gc/shenandoah/shenandoahFreeSet.hpp Outdated Show resolved Hide resolved
@kdnilsen
Copy link
Contributor Author

kdnilsen commented May 7, 2024

@rkennke : Thanks for all your careful feedback. If you could be more specific about loops that might result in quadratic behavior, I'll look more closely. There are some loops within loops, but I'm thinking these are not quadratic because the inner loop bounds are constants.

Copy link
Member

@ysramakrishna ysramakrishna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll re-review the changes since my last review which predates the new bitmap based set implementation. Will try and get that done ASAP in the next day or two. Sorry for the delay.

Copy link
Contributor

@rkennke rkennke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's mostly there. Just a few remainging minor nits.

case NotFree: return "NotFree";
case Mutator: return "Mutator";
case Collector: return "Collector";
default: return "Unrecognized";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, ok. So maybe at least put a ShouldNotReachHere(); before the return?

Comment on lines 315 to 319
#ifdef _LP64
log_info(gc)("%6s: %18s %18s %18s", "index", "Mutator Bits", "Collector Bits", "NotFree Bits");
#else
log_info(gc)("%6s: %10s %10s %10s", "index", "Mutator Bits", "Collector Bits", "NotFree Bits");
#endif
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What Aleksey said.

void ShenandoahFreeSet::increase_used(size_t num_bytes) {
shenandoah_assert_heaplocked();
_used += num_bytes;
ShenandoahRegionPartitions::~ShenandoahRegionPartitions() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's no need to put this trivial destructor definition in the .cpp file. Just place it with the declaration in the .hpp file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Fixed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Fixed.

if (idx >= _max) {
return _max;
} else {
// _membership[which_partition].is_set(idx) may not be true if we are shrinking the interval
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What Aleksey said. ;-)

inline idx_t ShenandoahRegionPartitions::rightmost(ShenandoahFreeSetPartitionId which_partition) const {
assert (which_partition < NumPartitions, "selected free partition must be valid");
idx_t idx = _rightmosts[int(which_partition)];
// _membership[which_partition].is_set(idx) may not be true if we are shrinking the interval
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I've rewritten both of those comments to clarify what I was thinking... Hope this is more clear now.


const char* ShenandoahRegionPartitions::partition_membership_name(idx_t idx) const {
ShenandoahFreeSetPartitionId result = membership(idx);
return partition_name(result);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could make it even more succinct by doing return partition_name(membership(idx));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvement. Thanks. Changed.

@@ -0,0 +1,292 @@
/*
* Copyright (c) 2016, 2021, Red Hat, Inc. All rights reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The copyright seems wrong. This is entirely new code by you, right? Should only have Amazon copyright notice, then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for review. I'll work through these suggestions. Question about copyright: I guess this is pretty much "all new", but the code was derived from original code, and much of the public API was borrowed from original. So I think maybe I should keep Red Hat copyright, and add Amazon copyright. What do you advise?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But has it been derived from Red Hat code? My understanding was that it's basically a fork of src/hotspot/share/utilities/bitMap.* ? Which is Oracle-copyrighted. Is that correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry. I didn't look closely. This was not derived from original code. I wrote this blind to the original, and then I made changes under guidance of reviewers to harmonize the API with Oracle implementation of bitMap. So I think your original comment about copyright is appropriate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I've fixed copyright notices for ShenandoahSimpleBitMap.?pp

return array_idx;
}

inline idx_t alignment() const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this could even be constexpr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Fixed.

@@ -0,0 +1,101 @@
/*
* Copyright (c) 2016, 2019, Red Hat, Inc. All rights reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same copyright issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Comment on lines +98 to +105
inline bool ShenandoahFreeSet::can_allocate_from(ShenandoahHeapRegion *r) const {
return r->is_empty() || (r->is_trash() && !_heap->is_concurrent_weak_root_in_progress());
}

inline bool ShenandoahFreeSet::can_allocate_from(size_t idx) const {
ShenandoahHeapRegion* r = _heap->get_region(idx);
return can_allocate_from(r);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of this abstraction leakage can be avoided by making

"can_allocate_from(size_t idx)" a method on the _heap object, and calling that, letting the heap do the test:

bool ShenandoahHeap::can_allocate_from(size_t idx) {
  ShenandoahHeapRegion* r = get_region(idx);
  return r->is_empty() || (r->is_trash() && !is_concurrent_weak_root_in_progress());
}

with ShFreeSet merely delegating to its _heap.

Probably similar pattern for alloc_capacity method below.

(I'll keep leaving single comments, so I don't slow down the review/feedback across potential interruptions. I don't expect there will be many, but still.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the rationale for these changes. In my mind, the "notions" of can_allocate_from() and alloc_capacity() as used here within ShenandoahFreeSet are not universal. These notions are specific to what the free set understands about the life-cycle of regions. I think moving this functionality into heap actually leaks more encapsulation, in that nobody else cares to use these novel interpretations of can_allocate_from() and alloc_capacity().

Maybe you can elaborate?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was looking at the structural effect of moving the methods there and failed to see that these were policies encoded in the free set abstraction rather than policies of (or characteristics of) ShenandoahHeap used by the free set (as I was thinking).

No change is needed in that case. Sorry for my confusion.

Comment on lines 91 to 93
void dump_bitmap_row(ssize_t idx) const;
void dump_bitmap_range(ssize_t start_idx, ssize_t end_idx) const;
void dump_bitmap_all() const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the first case you can may be say region_idx instead of idx ?

Also, it seems dump_bitmap() would be sufficient (at least for the case of dump_bitmap_all(), although in general for all).

However, given that these are private methods that will only be found in the implementation and nowhere else in its clients' code, may be current naming is ok (indeed, it looks like the first two are work functions in service of the last).

PS: I don't see any current usage of dump_bitmap_all(). May be you wanted it in some debug tracing code or something, or keeping it for possible future debugging usage? (Or may be my naive web-page search of this review page is naive/faulty.) It's perhaps useful to keep around for debug/tracing though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Making these changes to argument names, and conditionalizing this code on #ifndef PRODUCT

void make_all_regions_unavailable();

// Set the partition id for a particular region without adjusting interval bounds or usage/capacity tallies
inline void raw_set_membership(size_t idx, ShenandoahFreeSetPartitionId p) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize that the word set can be a little ambiguous here (from the mathematical notion of a set as a collection of things), although the fact that it has a void return type tells us that set is the imperative verb here, not a collection of things. May be raw_assign_membership(size_t region_idx, ShFSPId p) if you want to, but I may be splitting hairs here :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Making this change.


// Set the Mutator intervals, usage, and capacity according to arguments. Reset the Collector intervals, used, capacity
// to represent empty Collector free set.
void establish_intervals(ssize_t mutator_leftmost, ssize_t mutator_rightmost,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not clarify in name via establish_mutator_intervals ? (I noted there isn't a corresponding method for "collector intervals", although there are corresponding fields for its left and rightmosts etc.) A documentation comment somewhere indicating how/when those get established would be useful. I see them being incrementally updated in the incremental manipulation of set memberships, but wanted to get a sense of the reasons for this asymmetry (and for that reason to be documented when we revisit this code many moons in the future).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Making this change and adding a comment to motivate this "special" method for mutator intervals.

Copy link
Contributor

@rkennke rkennke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look ok now, but you've got an assert failure on linux-x64 in GHA that needs to be fixed:
TEST: gc/shenandoah/oom/TestThreadFailure.java

Internal Error (/home/runner/work/jdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp:533), pid=23127, tid=23231

assert(beg_off >= leftmost_empty(ShenandoahFreeSetPartitionId::Mutator)) failed: free empty regions before the leftmost: 119, bound 128

Using the accessor methods may cause misbehavior if an accessor method
falsely believes a range is empty because the ranges have not yet been
adjusted to represent regions that were newly inserted into a particular
partition.
Copy link
Member

@ysramakrishna ysramakrishna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed. Note: I only skimmed the bitmap implementation and didn't review it closely. I am relying on the unit tests you wrote, your testing of the code in codepipeline stress testing and on others' reviews to have caught ay issues, so don't feel further deep review of the bitmap implementation by me is going to add additional value.

Copy link
Contributor

@rkennke rkennke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now! Thanks a lot for following through all the review so far!

@kdnilsen
Copy link
Contributor Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label May 16, 2024
@openjdk
Copy link

openjdk bot commented May 16, 2024

@kdnilsen
Your change (at version 3219119) is now ready to be sponsored by a Committer.

@ysramakrishna
Copy link
Member

/sponsor

@openjdk
Copy link

openjdk bot commented May 16, 2024

Going to push as commit dc184f1.
Since your change was applied there have been 136 commits pushed to the master branch:

  • 2599151: 8331202: Support for Duration until another Instant
  • 6f7ddbe: 8260633: [macos] java/awt/dnd/MouseEventAfterStartDragTest/MouseEventAfterStartDragTest.html test failed
  • fa3e94d: 8332393: Problemlist compiler/rangechecks/TestArrayAccessAboveRCAfterRCCastIIEliminated.java
  • a33cb90: 8332098: Add missing @ since tags to jdk.jdi
  • 910d77b: 8331953: ubsan: metaspaceShared.cpp:1305:57: runtime error: applying non-zero offset 12849152 to null pointer
  • f9f8d0b: 8332101: Add an @since to StandardOperation:REMOVE in jdk.dynalink
  • f398cd2: 8331575: C2: crash when ConvL2I is split thru phi at LongCountedLoop
  • 96c5c3f: 8329998: Remove double initialization for parts of small TypeArrays in ZObjArrayAllocator
  • ee4a9d3: 8321622: ClassFile.verify(byte[] bytes) throws unexpected ConstantPoolException, IAE
  • ab8d7b0: 8324517: C2: crash in compiled code because of dependency on removed range check CastIIs
  • ... and 126 more: https://git.openjdk.org/jdk/compare/b20fa7b48b0f0a64c0760f26188d4c11c3233b61...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label May 16, 2024
@openjdk openjdk bot closed this May 16, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels May 16, 2024
@openjdk
Copy link

openjdk bot commented May 16, 2024

@ysramakrishna @kdnilsen Pushed as commit dc184f1.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-gc hotspot-gc-dev@openjdk.org integrated Pull request has been integrated shenandoah shenandoah-dev@openjdk.org
Development

Successfully merging this pull request may close these issues.

5 participants