Closed
87 commits
0372026
Card table as DCQ
tschatzl Feb 11, 2025
7782295
* remove some commented out debug code
tschatzl Feb 24, 2025
9e26abb
* remove mention of "enqueue" or "enqueuing" for actions related to p…
tschatzl Feb 24, 2025
3004a96
* fix crash when writing dirty cards for memory regions during card t…
tschatzl Feb 24, 2025
b8100b9
* mdoerr review: fix comments in ppc code
tschatzl Feb 24, 2025
0100d8e
* only provide byte map base for JavaThreads
tschatzl Feb 25, 2025
9ef9c5f
* remove unnecessarily added logging
tschatzl Feb 25, 2025
e51eec8
* ayang review 1
tschatzl Feb 28, 2025
7d361fc
* ayang review 1 (ctd)
tschatzl Feb 28, 2025
d87935a
* fix assert
tschatzl Feb 28, 2025
810bf2d
* fix comment (trailing whitespace)
tschatzl Mar 3, 2025
b3dd008
ayang review 2
tschatzl Mar 3, 2025
8f46dc9
* iwalulya initial comments
tschatzl Mar 4, 2025
9e2ee54
* do not change card table base for gc threads during swapping
tschatzl Mar 4, 2025
442d9ea
* iwalulya review 2
tschatzl Mar 4, 2025
fc674f0
* ayang review - fix comment
tschatzl Mar 4, 2025
b4d19d9
iwalulya review
tschatzl Mar 4, 2025
4a97811
ayang review
tschatzl Mar 4, 2025
a457e6e
* fix whitespace
tschatzl Mar 5, 2025
350a4fa
* iwalulya review
tschatzl Mar 6, 2025
c994000
* ayang review 3
tschatzl Mar 7, 2025
93b884f
* fix card table verification crashes: in the first refinement phase,…
tschatzl Mar 8, 2025
758fac0
* optimized RISCV gen_write_ref_array_post_barrier() implementation c…
tschatzl Mar 11, 2025
aec9505
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Mar 12, 2025
3766b76
* ayang review
tschatzl Mar 12, 2025
7861117
* when aborting refinement during full collection, the global card ta…
tschatzl Mar 13, 2025
51a9eed
* ayang review
tschatzl Mar 14, 2025
b073017
Merge branch 'master' into 8342381-card-table-instead-of-dcq
tschatzl Mar 14, 2025
447fe39
* more documentation on why we need to rendezvous the gc threads
tschatzl Mar 15, 2025
4d0afd5
* obsolete G1UpdateBufferSize
tschatzl Mar 17, 2025
ff9eb26
Merge branch 'master' into 8342382-card-table-instead-of-dcq3
tschatzl Mar 18, 2025
c833bc8
* factor out card table and refinement table merging into a single
tschatzl Mar 18, 2025
f419556
* fix IR code generation tests that change due to barrier cost changes
tschatzl Mar 19, 2025
5e76a51
* make young gen length revising independent of refinement thread
tschatzl Mar 20, 2025
d931104
Merge branch 'master' into submit/8342382-card-table-instead-of-dcq
tschatzl Mar 21, 2025
6d574da
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Mar 26, 2025
51fb6e6
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Apr 1, 2025
27b3dd6
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Apr 4, 2025
1c5a669
* missing file from merge
tschatzl Apr 4, 2025
c5d5f3a
Reorder includes
robcasloz Apr 9, 2025
9481821
Refine needs_liveness_data
robcasloz Apr 9, 2025
855ec8d
Do not unnecessarily pass around tmp2 in x86
robcasloz Apr 9, 2025
d4649ed
* ayang review: revising young gen length
tschatzl Apr 9, 2025
63b1de8
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Apr 10, 2025
39aa903
* fixes after merge related to 32 bit x86 removal
tschatzl Apr 10, 2025
fcf96a2
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Apr 10, 2025
fd77531
* remove support for 32 bit x86 in the barrier generation code, follo…
tschatzl Apr 10, 2025
068d2a3
* indentation fix
tschatzl Apr 10, 2025
e683152
* ayang review (part 1)
tschatzl Apr 11, 2025
a3b2386
* ayang review (part 2 - yield duration changes)
tschatzl Apr 11, 2025
e4bf1ac
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Apr 23, 2025
51dfbe5
Merge branch 'master' into card-table-as-dcq-merge
tschatzl Apr 29, 2025
8b56880
* ayang review: remove sweep_epoch
tschatzl Apr 29, 2025
1def83a
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl May 15, 2025
c07a73d
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Jun 10, 2025
750ed2d
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Jun 27, 2025
441c234
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Jul 7, 2025
5ab928e
Merge branch 'master' into pull/23739
tschatzl Jul 14, 2025
4b21868
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Jul 17, 2025
cea0e1b
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Jul 23, 2025
dd83638
* remove unused G1DetachedRefinementStats_lock
tschatzl Jul 23, 2025
23aa2c8
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Jul 28, 2025
188fc81
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Aug 5, 2025
7fe518e
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Aug 12, 2025
6c88f1d
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Aug 22, 2025
e8a8282
* forgot to actually save the files
tschatzl Aug 22, 2025
cc4b7a0
* fix merge error
tschatzl Aug 22, 2025
4a41b40
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Sep 1, 2025
b3873d6
* commit merge changes
tschatzl Sep 1, 2025
104d506
* improve logging for refinement, making it similar to marking logging
tschatzl Sep 3, 2025
2a614a2
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Sep 4, 2025
4601bf8
* sort includes
tschatzl Sep 8, 2025
87b4136
* iwalulya: remove confusing comment
tschatzl Sep 10, 2025
e7c3a06
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Sep 10, 2025
de1469d
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Sep 10, 2025
d0ca906
* aph review, fix some comment
tschatzl Sep 10, 2025
b47c7b0
* walulyai review
tschatzl Sep 10, 2025
c469c13
* walulyai review
tschatzl Sep 10, 2025
74e9240
* therealaph suggestion for avoiding the register aliasin in gen_writ…
tschatzl Sep 12, 2025
1ced9f9
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Sep 12, 2025
bf8cab3
* iwalulya review
tschatzl Sep 12, 2025
b5d22d5
Merge branch 'master' into 8342382-card-table-instead-of-dcq
tschatzl Sep 22, 2025
53ef008
* improved gen_write_ref_array_post_barrier() for riscv, contributed …
tschatzl Sep 22, 2025
6e37f8d
* iwalulya: "Amount of" -> "Number of" in new flag description
tschatzl Sep 22, 2025
311bb3e
* walulyai: remove cost_per_pending_card_ms_default array since we on…
tschatzl Sep 22, 2025
d80d690
* walulyai: remove unnecessarily introduced newline
tschatzl Sep 22, 2025
3c889e9
* walulyai: bufferNodeList can be removed
tschatzl Sep 22, 2025
7 changes: 7 additions & 0 deletions src/hotspot/share/gc/g1/g1CollectedHeap.cpp
@@ -62,6 +62,7 @@
#include "gc/g1/g1RegionPinCache.inline.hpp"
#include "gc/g1/g1RegionToSpaceMapper.hpp"
#include "gc/g1/g1RemSet.hpp"
#include "gc/g1/g1ReviseYoungListTargetLengthTask.hpp"
#include "gc/g1/g1RootClosures.hpp"
#include "gc/g1/g1RootProcessor.hpp"
#include "gc/g1/g1SATBMarkQueueSet.hpp"
@@ -1160,6 +1161,7 @@ G1CollectedHeap::G1CollectedHeap() :
_service_thread(nullptr),
_periodic_gc_task(nullptr),
_free_arena_memory_task(nullptr),
_revise_young_length_task(nullptr),
_workers(nullptr),
_refinement_epoch(0),
_last_synchronized_start(0),
@@ -1468,6 +1470,11 @@ jint G1CollectedHeap::initialize() {
_free_arena_memory_task = new G1MonotonicArenaFreeMemoryTask("Card Set Free Memory Task");
_service_thread->register_task(_free_arena_memory_task);

if (policy()->use_adaptive_young_list_length()) {
_revise_young_length_task = new G1ReviseYoungLengthTargetLengthTask("Revise Young Length List Task");
_service_thread->register_task(_revise_young_length_task);
}

// Here we allocate the dummy G1HeapRegion that is required by the
// G1AllocRegion class.
G1HeapRegion* dummy_region = _hrm.get_dummy_region();
2 changes: 2 additions & 0 deletions src/hotspot/share/gc/g1/g1CollectedHeap.hpp
@@ -84,6 +84,7 @@ class MemoryPool;
class nmethod;
class PartialArrayStateManager;
class ReferenceProcessor;
class G1ReviseYoungLengthTargetLengthTask;
class STWGCTimer;
class WorkerThreads;

@@ -172,6 +173,7 @@ class G1CollectedHeap : public CollectedHeap {
G1ServiceThread* _service_thread;
G1ServiceTask* _periodic_gc_task;
G1MonotonicArenaFreeMemoryTask* _free_arena_memory_task;
G1ReviseYoungLengthTargetLengthTask* _revise_young_length_task;

WorkerThreads* _workers;

83 changes: 10 additions & 73 deletions src/hotspot/share/gc/g1/g1ConcurrentRefine.cpp
@@ -392,7 +392,11 @@ bool G1ConcurrentRefineSweepState::are_java_threads_synched() const {
uint64_t G1ConcurrentRefine::adjust_threads_period_ms() const {
// Instead of a fixed value, this could be a command line option. But then
// we might also want to allow configuration of adjust_threads_wait_ms().
return 50;

// Use a prime number close to 50ms that differs from the wait times of other
// components deriving theirs from the try_get_available_bytes_estimate() call,
// to minimize interference.
return 53;
}

static size_t minimum_pending_cards_target() {
@@ -514,14 +518,6 @@ void G1ConcurrentRefine::adjust_after_gc(double logged_cards_time_ms,
}
}

// Wake up the control thread less frequently when the time available until
// the next GC is longer. But don't increase the wait time too rapidly.
// This reduces the number of control thread wakeups that just immediately
// go back to waiting, while still being responsive to behavior changes.
static uint64_t compute_adjust_wait_time_ms(double available_ms) {
return static_cast<uint64_t>(sqrt(available_ms) * 4.0);
}

uint64_t G1ConcurrentRefine::adjust_threads_wait_ms() const {
assert_current_thread_is_control_refinement_thread();
if (is_pending_cards_target_initialized()) {
@@ -531,9 +527,9 @@ uint64_t G1ConcurrentRefine::adjust_threads_wait_ms() const {
if (_heap_was_locked) {
return 1;
}
double available_ms = _threads_needed.predicted_time_until_next_gc_ms();
uint64_t wait_time_ms = compute_adjust_wait_time_ms(available_ms);
return MAX2(wait_time_ms, adjust_threads_period_ms());
double available_time_ms = _threads_needed.predicted_time_until_next_gc_ms();

return _policy->adjust_wait_time_ms(available_time_ms, adjust_threads_period_ms());
} else {
// If target not yet initialized then wait forever (until explicitly
// activated). This happens during startup, when we don't bother with
@@ -542,55 +538,6 @@ uint64_t G1ConcurrentRefine::adjust_threads_wait_ms() const {
}
}

class G1ConcurrentRefine::RemSetSamplingClosure : public G1HeapRegionClosure {
size_t _sampled_code_root_rs_length;

public:
RemSetSamplingClosure() :
_sampled_code_root_rs_length(0) {}

bool do_heap_region(G1HeapRegion* r) override {
G1HeapRegionRemSet* rem_set = r->rem_set();
_sampled_code_root_rs_length += rem_set->code_roots_list_length();
return false;
}

size_t sampled_code_root_rs_length() const { return _sampled_code_root_rs_length; }
};

// Adjust the target length (in regions) of the young gen, based on the
// current length of the remembered sets.
//
// At the end of the GC G1 determines the length of the young gen based on
// how much time the next GC can take, and when the next GC may occur
// according to the MMU.
//
// The assumption is that a significant part of the GC pause is spent scanning
// the remembered sets (among other components), so this thread constantly
// reevaluates the prediction for the remembered set scanning costs, and
// potentially resizes the young gen. This may trigger a premature GC or even
// increase the young gen size to meet the pause time goal.
void G1ConcurrentRefine::adjust_young_list_target_length() {
if (_policy->use_adaptive_young_list_length()) {
G1CollectedHeap* g1h = G1CollectedHeap::heap();
G1CollectionSet* cset = g1h->collection_set();
RemSetSamplingClosure cl;
cset->iterate(&cl);

size_t pending_cards;
size_t current_to_collection_set_cards;
{
MutexLocker x(G1RareEvent_lock, Mutex::_no_safepoint_check_flag);
G1Policy* p = g1h->policy();
pending_cards = p->current_pending_cards();
current_to_collection_set_cards = p->current_to_collection_set_cards();
}
_policy->revise_young_list_target_length(pending_cards,
current_to_collection_set_cards,
cl.sampled_code_root_rs_length());
}
}

bool G1ConcurrentRefine::adjust_num_threads_periodically() {
assert_current_thread_is_control_refinement_thread();

@@ -607,18 +554,8 @@ bool G1ConcurrentRefine::adjust_num_threads_periodically() {

// Reset pending request.
_needs_adjust = false;
// Getting used young bytes requires holding Heap_lock. But we can't use
// normal lock and block until available. Blocking on the lock could
// deadlock with a GC VMOp that is holding the lock and requesting a
// safepoint. Instead try to lock, and if fail then skip adjustment for
// this iteration and retry the adjustment later.
if (Heap_lock->try_lock()) {
size_t used_bytes = _policy->estimate_used_young_bytes_locked();
Heap_lock->unlock();

adjust_young_list_target_length();
size_t young_bytes = _policy->young_list_target_length() * G1HeapRegion::GrainBytes;
size_t available_bytes = young_bytes - MIN2(young_bytes, used_bytes);
size_t available_bytes = 0;
if (_policy->try_get_available_bytes_estimate(available_bytes)) {
adjust_threads_wanted(available_bytes);
_last_adjust = Ticks::now();
} else {
3 changes: 0 additions & 3 deletions src/hotspot/share/gc/g1/g1ConcurrentRefine.hpp
@@ -249,9 +249,6 @@ class G1ConcurrentRefine : public CHeapObj<mtGC> {

uint64_t adjust_threads_period_ms() const;

class RemSetSamplingClosure; // Helper class for adjusting young length.
void adjust_young_list_target_length();

void adjust_threads_wanted(size_t available_bytes);

NONCOPYABLE(G1ConcurrentRefine);
2 changes: 1 addition & 1 deletion src/hotspot/share/gc/g1/g1ConcurrentRefineThread.cpp
@@ -216,7 +216,7 @@ void G1ConcurrentRefineThread::do_refinement() {
{
// The young gen revising mechanism reads the predictor and the values set
// here. Avoid inconsistencies by locking.
MutexLocker x(G1RareEvent_lock, Mutex::_no_safepoint_check_flag);
MutexLocker x(G1ReviseYoungLength_lock, Mutex::_no_safepoint_check_flag);
policy->record_dirtying_stats(TimeHelper::counter_to_millis(G1CollectedHeap::heap()->last_refinement_epoch_start()),
TimeHelper::counter_to_millis(next_epoch_start),
stats->cards_pending(),
22 changes: 2 additions & 20 deletions src/hotspot/share/gc/g1/g1ConcurrentRefineThreadsNeeded.cpp
@@ -52,28 +52,10 @@ void G1ConcurrentRefineThreadsNeeded::update(uint active_threads,
size_t available_bytes,
size_t num_cards,
size_t target_num_cards) {
const G1Analytics* analytics = _policy->analytics();

// Estimate time until next GC, based on remaining bytes available for
// allocation and the allocation rate.
double alloc_region_rate = analytics->predict_alloc_rate_ms();
double alloc_bytes_rate = alloc_region_rate * G1HeapRegion::GrainBytes;
if (alloc_bytes_rate == 0.0) {
// A zero rate indicates we don't yet have data to use for predictions.
// Since we don't have any idea how long until the next GC, use a time of
// zero.
_predicted_time_until_next_gc_ms = 0.0;
} else {
// If the heap size is large and the allocation rate is small, we can get
// a predicted time until next GC that is so large it can cause problems
// (such as overflow) in other calculations. Limit the prediction to one
// hour, which is still large in this context.
const double one_hour_ms = 60.0 * 60.0 * MILLIUNITS;
double raw_time_ms = available_bytes / alloc_bytes_rate;
_predicted_time_until_next_gc_ms = MIN2(raw_time_ms, one_hour_ms);
}
_predicted_time_until_next_gc_ms = _policy->predict_time_to_next_gc_ms(available_bytes);

// Estimate number of cards that need to be processed before next GC.
const G1Analytics* analytics = _policy->analytics();

double incoming_rate = analytics->predict_dirtied_cards_rate_ms();
double raw_cards = incoming_rate * _predicted_time_until_next_gc_ms;
46 changes: 44 additions & 2 deletions src/hotspot/share/gc/g1/g1Policy.cpp
@@ -633,9 +633,9 @@ void G1Policy::record_dirtying_stats(double last_mutator_start_dirty_ms,
double yield_duration_ms,
size_t next_pending_cards_from_gc,
size_t next_to_collection_set_cards) {
assert(SafepointSynchronize::is_at_safepoint() || G1RareEvent_lock->is_locked(),
assert(SafepointSynchronize::is_at_safepoint() || G1ReviseYoungLength_lock->is_locked(),
"must be (at safepoint %s locked %s)",
BOOL_TO_STR(SafepointSynchronize::is_at_safepoint()), BOOL_TO_STR(G1RareEvent_lock->is_locked()));
BOOL_TO_STR(SafepointSynchronize::is_at_safepoint()), BOOL_TO_STR(G1ReviseYoungLength_lock->is_locked()));
// Record mutator's card logging rate.

// Unlike above for conc-refine rate, here we should not require a
@@ -1433,6 +1433,48 @@ size_t G1Policy::allowed_waste_in_collection_set() const {
return G1HeapWastePercent * _g1h->capacity() / 100;
}

bool G1Policy::try_get_available_bytes_estimate(size_t& available_bytes) const {
// Getting used young bytes requires holding Heap_lock. But we can't take the
// lock and block until it is available: blocking on the lock could deadlock
// with a GC VMOp that is holding the lock and requesting a safepoint. Instead
// try to lock, and return the result of that attempt, together with the
// estimate if successful.
if (Heap_lock->try_lock()) {
size_t used_bytes = estimate_used_young_bytes_locked();
Heap_lock->unlock();

size_t young_bytes = young_list_target_length() * G1HeapRegion::GrainBytes;
available_bytes = young_bytes - MIN2(young_bytes, used_bytes);
return true;
} else {
available_bytes = 0;
return false;
}
}

double G1Policy::predict_time_to_next_gc_ms(size_t available_bytes) const {
double alloc_region_rate = _analytics->predict_alloc_rate_ms();
double alloc_bytes_rate = alloc_region_rate * G1HeapRegion::GrainBytes;
if (alloc_bytes_rate == 0.0) {
// A zero rate indicates we don't yet have data to use for predictions.
// Since we don't have any idea how long until the next GC, use a time of
// zero.
return 0.0;
} else {
// If the heap size is large and the allocation rate is small, we can get
// a predicted time until next GC that is so large it can cause problems
// (such as overflow) in other calculations. Limit the prediction to one
// hour, which is still large in this context.
const double one_hour_ms = 60.0 * 60.0 * MILLIUNITS;
double raw_time_ms = available_bytes / alloc_bytes_rate;
return MIN2(raw_time_ms, one_hour_ms);
}
}

uint64_t G1Policy::adjust_wait_time_ms(double wait_time_ms, uint64_t min_time_ms) {
return MAX2(static_cast<uint64_t>(sqrt(wait_time_ms) * 4.0), min_time_ms);
}

double G1Policy::last_mutator_dirty_start_time_ms() {
return TimeHelper::counter_to_millis(_g1h->last_refinement_epoch_start());
}
19 changes: 18 additions & 1 deletion src/hotspot/share/gc/g1/g1Policy.hpp
@@ -335,7 +335,6 @@ class G1Policy: public CHeapObj<mtGC> {
// Amount of allowed waste in bytes in the collection set.
size_t allowed_waste_in_collection_set() const;


private:

// Predict the number of bytes of surviving objects from survivor and old
@@ -369,10 +368,28 @@

bool use_adaptive_young_list_length() const;

// Try to get an estimate of the bytes currently available in the young gen.
// This operation is low-priority: if other threads need the resources required
// to obtain the information, it returns false to indicate that the caller
// should retry "soon".
bool try_get_available_bytes_estimate(size_t& bytes) const;
// Estimate time until next GC, based on remaining bytes available for
// allocation and the allocation rate.
double predict_time_to_next_gc_ms(size_t available_bytes) const;

// Adjust wait times so that threads wake up less frequently the further away
// the next GC is. Don't increase the wait time too rapidly, and bound it below
// by min_time_ms. This reduces the number of thread wakeups that just
// immediately go back to waiting, while still being responsive to behavior changes.
uint64_t adjust_wait_time_ms(double wait_time_ms, uint64_t min_time_ms);

private:
// Return an estimate of the number of bytes used in young gen.
// precondition: holding Heap_lock
size_t estimate_used_young_bytes_locked() const;

public:

void transfer_survivors_to_cset(const G1SurvivorRegions* survivors);

// Record and log stats and pending cards to update predictors.