8257774: G1: Trigger collect when free region count drops below threshold to prevent evacuation failures #3143

Closed · wants to merge 15 commits · Changes from 9 commits
@@ -164,21 +164,21 @@ class G1AllocRegion : public CHeapObj<mtGC> {
size_t desired_word_size,
size_t* actual_word_size);

// Second-level allocation: Should be called while holding a
// lock. It will try to first allocate lock-free out of the active
// region or, if it's unable to, it will try to replace the active
// alloc region with a new one. We require that the caller takes the
// appropriate lock before calling this so that it is easier to make
// it conform to its locking protocol.
inline HeapWord* attempt_allocation_locked(size_t word_size);
// Same as attempt_allocation_locked(size_t, bool), but allowing specification
// of minimum word size of the block in min_word_size, and the maximum word
// size of the allocation in desired_word_size. The actual size of the block is
// returned in actual_word_size.
// Second-level allocation: Should be called while holding a
// lock. We require that the caller takes the appropriate lock
// before calling this so that it is easier to make it conform
// to the locking protocol.
// Tries to allocate at least min_word_size words, and at most desired_word_size.
// Returns the actual size of the block in actual_word_size.
Comment from @kstefanj (Contributor), Apr 30, 2021:

Please format these lines to fit the rest of the comment:

Suggested change
// to the locking protocol.
// Tries to allocate at least min_word_size words, and at most desired_word_size.
// Returns the actual size of the block in actual_word_size.
// to the locking protocol. The min and desired word size allow
// specifying a minimum and maximum size of the allocation. The
// actual size of allocation is returned in actual_word_size.

inline HeapWord* attempt_allocation_locked(size_t min_word_size,
size_t desired_word_size,
size_t* actual_word_size);

inline HeapWord* attempt_allocation_use_new_region(size_t min_word_size,
size_t desired_word_size,
size_t* actual_word_size);
adityamandaleeka marked this conversation as resolved.

// Should be called to allocate a new region even if the max of this
// type of regions has been reached. Should only be called if other
// allocation attempts have failed and we are not holding a valid
@@ -98,16 +98,19 @@ inline HeapWord* G1AllocRegion::attempt_allocation_locked(size_t word_size) {
inline HeapWord* G1AllocRegion::attempt_allocation_locked(size_t min_word_size,
size_t desired_word_size,
size_t* actual_word_size) {
// First we have to redo the allocation, assuming we're holding the
// appropriate lock, in case another thread changed the region while
// we were waiting to get the lock.
HeapWord* result = attempt_allocation(min_word_size, desired_word_size, actual_word_size);
if (result != NULL) {
return result;
}

return attempt_allocation_use_new_region(min_word_size, desired_word_size, actual_word_size);
}
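The retry-then-replace flow in attempt_allocation_locked() above can be sketched in isolation. This is a hedged stand-in, not HotSpot API: the two callables substitute for the allocator's real methods.

```cpp
#include <functional>

// Sketch of attempt_allocation_locked()'s shape: while we waited for the
// lock, another thread may have installed a fresh active region, so retry
// the ordinary allocation first; only if that fails do we retire the
// region and allocate out of a new one.
template <typename HeapWordT>
HeapWordT* allocate_locked(const std::function<HeapWordT*()>& attempt_allocation,
                           const std::function<HeapWordT*()>& attempt_allocation_use_new_region) {
  HeapWordT* result = attempt_allocation();
  if (result != nullptr) {
    return result;  // the region changed under us and had room
  }
  return attempt_allocation_use_new_region();
}
```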

inline HeapWord* G1AllocRegion::attempt_allocation_use_new_region(size_t min_word_size,
size_t desired_word_size,
size_t* actual_word_size) {
adityamandaleeka marked this conversation as resolved.
retire(true /* fill_up */);
result = new_alloc_region_and_allocate(desired_word_size, false /* force */);
HeapWord* result = new_alloc_region_and_allocate(desired_word_size, false /* force */);
if (result != NULL) {
*actual_word_size = desired_word_size;
trace("alloc locked (second attempt)", min_word_size, desired_word_size, *actual_word_size, result);
@@ -112,10 +112,21 @@ class G1Allocator : public CHeapObj<mtGC> {

// Allocate blocks of memory during mutator time.

// Attempt allocation in the current alloc region. If use_retained_region_if_available
Comment from @brstaffoMS, Apr 29, 2021:
Is the reference to use_retained_region_if_available out of date?

Comment from @kstefanj (Contributor), Apr 30, 2021:
use_retained_region_if_available is no longer used; remove it from the comment.

// is set and a retained region is available, the allocation will first be tried in the
// retained region.
inline HeapWord* attempt_allocation(size_t min_word_size,
size_t desired_word_size,
size_t* actual_word_size);

// Attempt allocation, retiring the current region and allocating a new one. It is
// assumed that attempt_allocation() has been tried and failed already first.
inline HeapWord* attempt_allocation_use_new_region(size_t word_size);

// This is to be called when holding an appropriate lock. It first tries in the
// current allocation region, and then attempts an allocation using a new region.
inline HeapWord* attempt_allocation_locked(size_t word_size);

inline HeapWord* attempt_allocation_force(size_t word_size);

size_t unsafe_max_tlab_alloc();
@@ -52,16 +52,29 @@ inline HeapWord* G1Allocator::attempt_allocation(size_t min_word_size,
size_t desired_word_size,
size_t* actual_word_size) {
uint node_index = current_node_index();


Comment from @kstefanj (Contributor), Apr 30, 2021:
Remove at least one of those two blank lines.

HeapWord* result = mutator_alloc_region(node_index)->attempt_retained_allocation(min_word_size, desired_word_size, actual_word_size);
if (result != NULL) {
return result;
}

return mutator_alloc_region(node_index)->attempt_allocation(min_word_size, desired_word_size, actual_word_size);
}

inline HeapWord* G1Allocator::attempt_allocation_use_new_region(size_t word_size) {
uint node_index = current_node_index();
size_t temp;
HeapWord* result = mutator_alloc_region(node_index)->attempt_allocation_use_new_region(word_size, word_size, &temp);
assert(result != NULL || mutator_alloc_region(node_index)->get() == NULL,
"Must not have a mutator alloc region if there is no memory, but is " PTR_FORMAT, p2i(mutator_alloc_region(node_index)->get()));
adityamandaleeka marked this conversation as resolved.
return result;
}

inline HeapWord* G1Allocator::attempt_allocation_locked(size_t word_size) {
uint node_index = current_node_index();
HeapWord* result = mutator_alloc_region(node_index)->attempt_allocation_locked(word_size);

assert(result != NULL || mutator_alloc_region(node_index)->get() == NULL,
"Must not have a mutator alloc region if there is no memory, but is " PTR_FORMAT, p2i(mutator_alloc_region(node_index)->get()));
return result;
@@ -423,26 +423,42 @@ HeapWord* G1CollectedHeap::attempt_allocation_slow(size_t word_size) {
HeapWord* result = NULL;
for (uint try_count = 1, gclocker_retry_count = 0; /* we'll return */; try_count += 1) {
bool should_try_gc;
bool proactive_collection_required = false;
uint gc_count_before;

{
MutexLocker x(Heap_lock);
result = _allocator->attempt_allocation_locked(word_size);

// Now that we have the lock, we first retry the allocation in case another
// thread changed the region while we were waiting to acquire the lock.
size_t actual_size;
result = _allocator->attempt_allocation(word_size, word_size, &actual_size);
if (result != NULL) {
return result;
}

// If the GCLocker is active and we are bound for a GC, try expanding young gen.
// This is different to when only GCLocker::needs_gc() is set: try to avoid
// waiting because the GCLocker is active to not wait too long.
if (GCLocker::is_active_and_needs_gc() && policy()->can_expand_young_list()) {
// No need for an ergo message here, can_expand_young_list() does this when
// it returns true.
result = _allocator->attempt_allocation_force(word_size);
proactive_collection_required = policy()->proactive_collection_required(1);
if (!proactive_collection_required) {
// We've already attempted a lock-free allocation above, so we don't want to
// do it again. Let's jump straight to replacing the active region.
result = _allocator->attempt_allocation_use_new_region(word_size);
if (result != NULL) {
return result;
}

// If the GCLocker is active and we are bound for a GC, try expanding young gen.
// This is different to when only GCLocker::needs_gc() is set: try to avoid
// waiting because the GCLocker is active to not wait too long.
if (GCLocker::is_active_and_needs_gc() && policy()->can_expand_young_list()) {
// No need for an ergo message here, can_expand_young_list() does this when
// it returns true.
result = _allocator->attempt_allocation_force(word_size);
if (result != NULL) {
return result;
}
}
}

// Only try a GC if the GCLocker does not signal the need for a GC. Wait until
// the GCLocker initiated GC has been performed and then retry. This includes
// the case when the GC Locker is not active but has not been performed.
@@ -452,9 +468,10 @@ HeapWord* G1CollectedHeap::attempt_allocation_slow(size_t word_size) {
}

if (should_try_gc) {
GCCause::Cause gc_cause = proactive_collection_required ? GCCause::_g1_proactive_collection
: GCCause::_g1_inc_collection_pause;
bool succeeded;
result = do_collection_pause(word_size, gc_count_before, &succeeded,
GCCause::_g1_inc_collection_pause);
result = do_collection_pause(word_size, gc_count_before, &succeeded, gc_cause);
if (result != NULL) {
assert(succeeded, "only way to get back a non-NULL result");
log_trace(gc, alloc)("%s: Successfully scheduled collection returning " PTR_FORMAT,
@@ -848,21 +865,25 @@ HeapWord* G1CollectedHeap::attempt_allocation_humongous(size_t word_size) {
HeapWord* result = NULL;
for (uint try_count = 1, gclocker_retry_count = 0; /* we'll return */; try_count += 1) {
bool should_try_gc;
bool proactive_collection_required = false;
uint gc_count_before;


{
MutexLocker x(Heap_lock);

// Given that humongous objects are not allocated in young
// regions, we'll first try to do the allocation without doing a
// collection hoping that there's enough space in the heap.
result = humongous_obj_allocate(word_size);
if (result != NULL) {
size_t size_in_regions = humongous_obj_size_in_regions(word_size);
policy()->old_gen_alloc_tracker()->
add_allocated_humongous_bytes_since_last_gc(size_in_regions * HeapRegion::GrainBytes);
return result;
size_t size_in_regions = humongous_obj_size_in_regions(word_size);
proactive_collection_required = policy()->proactive_collection_required((uint)size_in_regions);
if (!proactive_collection_required) {
// Given that humongous objects are not allocated in young
// regions, we'll first try to do the allocation without doing a
// collection hoping that there's enough space in the heap.
result = humongous_obj_allocate(word_size);
if (result != NULL) {
policy()->old_gen_alloc_tracker()->
add_allocated_humongous_bytes_since_last_gc(size_in_regions * HeapRegion::GrainBytes);
return result;
}
}

// Only try a GC if the GCLocker does not signal the need for a GC. Wait until
@@ -874,9 +895,10 @@ HeapWord* G1CollectedHeap::attempt_allocation_humongous(size_t word_size) {
}

if (should_try_gc) {
GCCause::Cause gc_cause = proactive_collection_required ? GCCause::_g1_proactive_collection
: GCCause::_g1_humongous_allocation;
bool succeeded;
result = do_collection_pause(word_size, gc_count_before, &succeeded,
GCCause::_g1_humongous_allocation);
result = do_collection_pause(word_size, gc_count_before, &succeeded, gc_cause);
if (result != NULL) {
assert(succeeded, "only way to get back a non-NULL result");
log_trace(gc, alloc)("%s: Successfully scheduled collection returning " PTR_FORMAT,
@@ -68,6 +68,8 @@ G1Policy::G1Policy(STWGCTimer* gc_timer) :
_reserve_regions(0),
_young_gen_sizer(),
_free_regions_at_end_of_collection(0),
_predicted_surviving_bytes_from_survivor(0),
_predicted_surviving_bytes_from_old(0),
_rs_length(0),
_rs_length_prediction(0),
_pending_cards_at_gc_start(0),
@@ -450,6 +452,7 @@ void G1Policy::record_full_collection_end() {
// also call this on any additional surv rate groups

_free_regions_at_end_of_collection = _g1h->num_free_regions();
update_survival_estimates_for_next_collection();
_survivor_surv_rate_group->reset();
update_young_list_max_and_target_length();
update_rs_length_prediction();
@@ -779,6 +782,7 @@ void G1Policy::record_collection_pause_end(double pause_time_ms, bool concurrent
_free_regions_at_end_of_collection = _g1h->num_free_regions();

update_rs_length_prediction();
update_survival_estimates_for_next_collection();

// Do not update dynamic IHOP due to G1 periodic collection as it is highly likely
// that in this case we are not running in a "normal" operating mode.
@@ -1400,6 +1404,85 @@ void G1Policy::calculate_optional_collection_set_regions(G1CollectionSetCandidat
num_optional_regions, max_optional_regions, prediction_ms);
}

// Number of regions required to store the given number of bytes, taking
// into account the target amount of wasted space in PLABs.
static size_t get_num_regions_adjust_for_plab_waste(size_t byte_count) {
size_t byte_count_adjusted = byte_count * (size_t)(100 + TargetPLABWastePct) / 100.0;

// Round up the region count
return (byte_count_adjusted + HeapRegion::GrainBytes - 1) / HeapRegion::GrainBytes;
}
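As a standalone sketch of that arithmetic: the 4 MB region size and 10% waste percentage below are illustrative assumptions standing in for HeapRegion::GrainBytes and TargetPLABWastePct, and integer division is used where the patch divides by 100.0.

```cpp
#include <cstddef>

// Illustrative stand-ins for HeapRegion::GrainBytes and TargetPLABWastePct.
static const size_t kGrainBytes = 4u * 1024 * 1024;  // assume 4 MB regions
static const size_t kTargetPLABWastePct = 10;        // assume the default 10%

// Inflate the byte count by the target PLAB waste, then round the result
// up to a whole number of regions.
static size_t num_regions_adjust_for_plab_waste(size_t byte_count) {
  size_t adjusted = byte_count * (100 + kTargetPLABWastePct) / 100;
  return (adjusted + kGrainBytes - 1) / kGrainBytes;
}
```

For example, exactly one region's worth of bytes needs two regions once the waste allowance is added.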

bool G1Policy::proactive_collection_required(uint alloc_region_count) {
if (!Universe::is_fully_initialized()) {
adityamandaleeka marked this conversation as resolved.
// Don't attempt any proactive GC's before initialization is complete.
return false;
}

if (_g1h->young_regions_count() == 0 && _collection_set->candidates() != NULL && _collection_set->candidates()->is_empty()) {
Comment from @kstefanj (Contributor), Apr 30, 2021:
Add a comment that we have no regions to collect. Adding a has_candidates() to G1CollectionSet seems to make sense now when we check for candidates a couple of times and will make this condition read a bit better as well:

Suggested change
if (_g1h->young_regions_count() == 0 && _collection_set->candidates() != NULL && _collection_set->candidates()->is_empty()) {
if (_g1h->young_regions_count() == 0 && !_collection_set->has_candidates()) {
// Don't attempt a proactive GC when there are no regions to collect.

Comment from @adityamandaleeka (Contributor, Author), May 3, 2021:
Thomas and I talked about this before but decided not to do it because the two cases where we're doing this are doing slightly different things (one is checking for NULL || empty and the other is checking !NULL && empty, which is not the inverse of the first). Looking at it closer now though, I think this one should also be bailing in the case where candidates is NULL right? If we do that, we can definitely add the has_candidates function and use that in both cases.

return false;
}

uint eden_count = _g1h->eden_regions_count();
size_t const eden_surv_bytes_pred = _eden_surv_rate_group->accum_surv_rate_pred(eden_count) * HeapRegion::GrainBytes;
size_t const total_young_predicted_surviving_bytes = eden_surv_bytes_pred + _predicted_surviving_bytes_from_survivor;

uint required_regions = (uint)(get_num_regions_adjust_for_plab_waste(total_young_predicted_surviving_bytes) +
get_num_regions_adjust_for_plab_waste(_predicted_surviving_bytes_from_old));

if (required_regions > _g1h->num_free_regions() - alloc_region_count) {
log_debug(gc, ergo, cset)("Proactive GC, insufficient free regions. Predicted need %u. Curr Eden %u (Pred %u). Curr Survivor %u (Pred %u). Curr Old %u (Pred %u) Free %u Alloc %u",
required_regions,
eden_count,
(uint)get_num_regions_adjust_for_plab_waste(eden_surv_bytes_pred),
_g1h->survivor_regions_count(),
(uint)get_num_regions_adjust_for_plab_waste(_predicted_surviving_bytes_from_survivor),
_g1h->old_regions_count(),
(uint)get_num_regions_adjust_for_plab_waste(_predicted_surviving_bytes_from_old),
_g1h->num_free_regions(),
alloc_region_count);

return true;
}

return false;
}
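Stripped of the prediction machinery, the trigger condition above reduces to a region-count comparison. A simplified sketch: required_regions is taken as a given input here, and we assume alloc_region_count never exceeds the free count (the unsigned subtraction would otherwise wrap, as it would in the patch's own uint arithmetic).

```cpp
// Trigger a proactive GC when the regions predicted to be needed for the
// next evacuation exceed what would remain free after the pending
// allocation takes its regions.
static bool proactive_gc_needed(unsigned required_regions,
                                unsigned free_regions,
                                unsigned alloc_region_count) {
  return required_regions > free_regions - alloc_region_count;
}
```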

void G1Policy::update_survival_estimates_for_next_collection() {
// Predict the number of bytes of surviving objects from survivor and old
// regions and update the associated members.

// Survivor regions
size_t survivor_bytes = 0;
const GrowableArray<HeapRegion*>* survivor_regions = _g1h->survivor()->regions();
for (GrowableArrayIterator<HeapRegion*> it = survivor_regions->begin();
it != survivor_regions->end();
++it) {
survivor_bytes += predict_bytes_to_copy(*it);
}

_predicted_surviving_bytes_from_survivor = survivor_bytes;

// Old regions
G1CollectionSetCandidates *candidates = _collection_set->candidates();
if (candidates == NULL || candidates->is_empty()) {
Comment from @kstefanj (Contributor), Apr 30, 2021:
Could use !_collection_set->has_candidates() if added.

_predicted_surviving_bytes_from_old = 0;
return;
}

// Use the minimum old gen collection set as conservative estimate for the number
// of regions to take for this calculation.
uint iterate_count = MIN2(candidates->num_remaining(), calc_min_old_cset_length(candidates));
uint current_index = candidates->cur_idx();
size_t old_bytes = 0;
for (uint i = 0; i < iterate_count; i++) {
HeapRegion *region = candidates->at(current_index + i);
old_bytes += predict_bytes_to_copy(region);
}

_predicted_surviving_bytes_from_old = old_bytes;
}
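Both loops above accumulate per-region copy-cost predictions: the survivor loop over every survivor region, the old-gen loop over a capped prefix of the candidates. A simplified sketch, with per-region predicted byte counts given directly as a stand-in for predict_bytes_to_copy():

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sum the predicted surviving bytes of at most max_regions regions, as the
// old-gen estimate does with MIN2(num_remaining, calc_min_old_cset_length).
// Passing predicted_bytes.size() as the cap gives the survivor-regions
// behavior of summing every region.
static size_t sum_predicted_survivor_bytes(const std::vector<size_t>& predicted_bytes,
                                           size_t max_regions) {
  size_t total = 0;
  size_t count = std::min(predicted_bytes.size(), max_regions);
  for (size_t i = 0; i < count; i++) {
    total += predicted_bytes[i];
  }
  return total;
}
```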

void G1Policy::transfer_survivors_to_cset(const G1SurvivorRegions* survivors) {
note_start_adding_survivor_regions();

@@ -98,6 +98,11 @@ class G1Policy: public CHeapObj<mtGC> {

uint _free_regions_at_end_of_collection;

// These values are predictions of how much we think will survive in each
// section of the heap.
size_t _predicted_surviving_bytes_from_survivor;
size_t _predicted_surviving_bytes_from_old;

size_t _rs_length;

size_t _rs_length_prediction;
@@ -345,7 +350,17 @@ class G1Policy: public CHeapObj<mtGC> {
double time_remaining_ms,
uint& num_optional_regions);

// Returns whether a collection should be done proactively, taking into
// account the current number of free regions and the expected survival
// rates in each section of the heap.
bool proactive_collection_required(uint region_count);

private:

// Predict the number of bytes of surviving objects from survivor and old
// regions and update the associated members.
void update_survival_estimates_for_next_collection();

// Set the state to start a concurrent marking cycle and clear
// _initiate_conc_mark_if_possible because it has now been
// acted on.
@@ -122,7 +122,7 @@ VM_G1CollectForAllocation::VM_G1CollectForAllocation(size_t word_size,
void VM_G1CollectForAllocation::doit() {
G1CollectedHeap* g1h = G1CollectedHeap::heap();

if (_word_size > 0) {
if (_word_size > 0 && _gc_cause != GCCause::_g1_proactive_collection) {
Comment from @kstefanj (Contributor), Apr 30, 2021:
What do you think about adding a helper to `VM_G1CollectForAllocation`:

bool try_allocate_before_gc() {
  if (_gc_cause == GCCause::_g1_proactive_collection) {
    // Never allocate before proactive GCs.
    return false;
  }
  return true;
}

The condition would then read a bit better:

Suggested change
if (_word_size > 0 && _gc_cause != GCCause::_g1_proactive_collection) {
if (try_allocate_before_gc() && _word_size > 0) {

// An allocation has been requested. So, try to do that first.
_result = g1h->attempt_allocation_at_safepoint(_word_size,
false /* expect_null_cur_alloc_region */);
@@ -99,6 +99,9 @@ const char* GCCause::to_string(GCCause::Cause cause) {
case _g1_periodic_collection:
return "G1 Periodic Collection";

case _g1_proactive_collection:
return "G1 Proactive Collection";

case _dcmd_gc_run:
return "Diagnostic Command";

@@ -73,6 +73,7 @@ class GCCause : public AllStatic {
_g1_inc_collection_pause,
_g1_humongous_allocation,
_g1_periodic_collection,
_g1_proactive_collection,

_dcmd_gc_run,