8257774: G1: Trigger collect when free region count drops below threshold to prevent evacuation failures #1650

Closed
62 changes: 37 additions & 25 deletions src/hotspot/share/gc/g1/g1CollectedHeap.cpp
@@ -420,25 +420,30 @@ HeapWord* G1CollectedHeap::attempt_allocation_slow(size_t word_size) {
HeapWord* result = NULL;
for (uint try_count = 1, gclocker_retry_count = 0; /* we'll return */; try_count += 1) {
bool should_try_gc;
bool force_gc = false;
Contributor:

force_gc and should_try_gc seem to overlap a bit here. At least the naming isn't perfect, because we may not do a GC even if force_gc is true, which is what the name would lead me to expect.

I do not have a good replacement name right now.

Member Author:

It will be removed as part of my next round of changes.

uint gc_count_before;

{
MutexLocker x(Heap_lock);
result = _allocator->attempt_allocation_locked(word_size);
if (result != NULL) {
return result;
}

// If the GCLocker is active and we are bound for a GC, try expanding young gen.
// This is different to when only GCLocker::needs_gc() is set: try to avoid
// waiting because the GCLocker is active to not wait too long.
if (GCLocker::is_active_and_needs_gc() && policy()->can_expand_young_list()) {
// No need for an ergo message here, can_expand_young_list() does this when
// it returns true.
result = _allocator->attempt_allocation_force(word_size);
if (policy()->can_mutator_consume_free_regions(1)) {
Contributor:

I would prefer if force_gc (or whatever name it will have) were set here unconditionally, as the matching else is pretty far away.

I.e.

force_gc = policy()->can_mutator_consume_free_regions(1);
  
  if (force_gc) { // needing to use the name "force_gc" here shows that the name is wrong...
    ... try allocation
    ... check if we should expand young gen beyond regular size due to GCLocker
  }

The other issue I have with using can_mutator_consume_free_regions() here is that there is already a very similar G1Policy::should_allocate_mutator_region(); and in any case, the attempt_allocation_locked call may actually succeed without requiring a new region (in fact, it is not uncommon that another thread got a free region while this thread was trying to take the Heap_lock).

I think a better place for can_mutator_consume_free_regions() is in G1Policy::should_allocate_mutator_region() for this case.

attempt_allocation_locked, however, does not currently return a reason for why allocation failed. Maybe it is better to let it return a tuple of result and reason (or a second "out" parameter)? I haven't tried how this would look, but it seems worth trying and better than the current way of handling this.

This one could be used in the following code.

Member Author:

For this code I am investigating moving the check into either G1Policy::should_allocate_mutator_region() or its caller, so I can distinguish the reason why allocation failed. Hopefully I have something ready to push this week. I am on vacation, so my responses are a little slow.

result = _allocator->attempt_allocation_locked(word_size);
if (result != NULL) {
return result;
}

// If the GCLocker is active and we are bound for a GC, try expanding young gen.
// This is different to when only GCLocker::needs_gc() is set: try to avoid
// waiting because the GCLocker is active to not wait too long.
if (GCLocker::is_active_and_needs_gc() && policy()->can_expand_young_list()) {
// No need for an ergo message here, can_expand_young_list() does this when
// it returns true.
result = _allocator->attempt_allocation_force(word_size);
if (result != NULL) {
return result;
}
}
} else {
force_gc = true;
}
// Only try a GC if the GCLocker does not signal the need for a GC. Wait until
// the GCLocker initiated GC has been performed and then retry. This includes
@@ -451,7 +456,7 @@ HeapWord* G1CollectedHeap::attempt_allocation_slow(size_t word_size) {
if (should_try_gc) {
bool succeeded;
result = do_collection_pause(word_size, gc_count_before, &succeeded,
GCCause::_g1_inc_collection_pause);
GCCause::_g1_inc_collection_pause, force_gc);
Contributor:

I really think it is better to add a new GCCause for this instead of the additional parameter. We are not doing a GC because we are out of space for the allocation of word_size, but rather a "pre-emptive" GC due to the allocation of word_size.
This warrants a new GC cause imho.

if (result != NULL) {
assert(succeeded, "only way to get back a non-NULL result");
log_trace(gc, alloc)("%s: Successfully scheduled collection returning " PTR_FORMAT,
@@ -844,21 +849,26 @@ HeapWord* G1CollectedHeap::attempt_allocation_humongous(size_t word_size) {
HeapWord* result = NULL;
for (uint try_count = 1, gclocker_retry_count = 0; /* we'll return */; try_count += 1) {
bool should_try_gc;
bool force_gc = false;
uint gc_count_before;


{
MutexLocker x(Heap_lock);

// Given that humongous objects are not allocated in young
// regions, we'll first try to do the allocation without doing a
// collection hoping that there's enough space in the heap.
result = humongous_obj_allocate(word_size);
if (result != NULL) {
size_t size_in_regions = humongous_obj_size_in_regions(word_size);
policy()->old_gen_alloc_tracker()->
add_allocated_humongous_bytes_since_last_gc(size_in_regions * HeapRegion::GrainBytes);
return result;
size_t size_in_regions = humongous_obj_size_in_regions(word_size);
if (policy()->can_mutator_consume_free_regions((uint)size_in_regions)) {
Contributor:

Again, I would prefer that the result of this is stored in a local not called force_gc :) and then used, instead of appending such a small else after the long'ish true case.

// Given that humongous objects are not allocated in young
// regions, we'll first try to do the allocation without doing a
// collection hoping that there's enough space in the heap.
result = humongous_obj_allocate(word_size);
if (result != NULL) {
policy()->old_gen_alloc_tracker()->
add_allocated_humongous_bytes_since_last_gc(size_in_regions * HeapRegion::GrainBytes);
return result;
}
} else {
force_gc = true;
}

// Only try a GC if the GCLocker does not signal the need for a GC. Wait until
@@ -872,7 +882,7 @@ HeapWord* G1CollectedHeap::attempt_allocation_humongous(size_t word_size) {
if (should_try_gc) {
bool succeeded;
result = do_collection_pause(word_size, gc_count_before, &succeeded,
GCCause::_g1_humongous_allocation);
GCCause::_g1_humongous_allocation, force_gc);
Contributor:

I believe the reason for this GC, if force_gc is true, is not that we do not have enough space for the humongous allocation; rather, we are doing a "pre-emptive" GC because of the allocation of word_size, similar to before.

if (result != NULL) {
assert(succeeded, "only way to get back a non-NULL result");
log_trace(gc, alloc)("%s: Successfully scheduled collection returning " PTR_FORMAT,
@@ -2635,12 +2645,14 @@ void G1CollectedHeap::verify_numa_regions(const char* desc) {
HeapWord* G1CollectedHeap::do_collection_pause(size_t word_size,
uint gc_count_before,
bool* succeeded,
GCCause::Cause gc_cause) {
GCCause::Cause gc_cause,
bool force_gc) {
assert_heap_not_locked_and_not_at_safepoint();
VM_G1CollectForAllocation op(word_size,
gc_count_before,
gc_cause,
policy()->max_pause_time_ms());
policy()->max_pause_time_ms(),
force_gc);
VMThread::execute(&op);

HeapWord* result = op.result();
3 changes: 2 additions & 1 deletion src/hotspot/share/gc/g1/g1CollectedHeap.hpp
@@ -761,7 +761,8 @@ class G1CollectedHeap : public CollectedHeap {
HeapWord* do_collection_pause(size_t word_size,
uint gc_count_before,
bool* succeeded,
GCCause::Cause gc_cause);
GCCause::Cause gc_cause,
bool force_gc);
Contributor:

If that parameter is kept, please add documentation.


void wait_for_root_region_scanning();

63 changes: 63 additions & 0 deletions src/hotspot/share/gc/g1/g1Policy.cpp
@@ -69,6 +69,8 @@ G1Policy::G1Policy(STWGCTimer* gc_timer) :
_reserve_regions(0),
_young_gen_sizer(),
_free_regions_at_end_of_collection(0),
_predicted_survival_bytes_from_survivor(0),
_predicted_survival_bytes_from_old(0),
_rs_length(0),
_rs_length_prediction(0),
_pending_cards_at_gc_start(0),
@@ -451,6 +453,7 @@ void G1Policy::record_full_collection_end() {
// also call this on any additional surv rate groups

_free_regions_at_end_of_collection = _g1h->num_free_regions();
calculate_required_regions_for_next_collect();
_survivor_surv_rate_group->reset();
update_young_list_max_and_target_length();
update_rs_length_prediction();
@@ -782,6 +785,9 @@ void G1Policy::record_collection_pause_end(double pause_time_ms, bool concurrent

update_rs_length_prediction();

// Is this the right place? Should it be in the below?
calculate_required_regions_for_next_collect();

// Do not update dynamic IHOP due to G1 periodic collection as it is highly likely
// that in this case we are not running in a "normal" operating mode.
if (_g1h->gc_cause() != GCCause::_g1_periodic_collection) {
@@ -1441,6 +1447,63 @@ void G1Policy::calculate_optional_collection_set_regions(G1CollectionSetCandidat
num_optional_regions, max_optional_regions, prediction_ms);
}

bool G1Policy::can_mutator_consume_free_regions(uint alloc_region_count) {
uint eden_count = _g1h->eden_regions_count();
if (eden_count < 1) {
Contributor:

I'd prefer "eden_count == 0" here since it is an uint anyway.

return true;
}
size_t const predicted_survival_bytes_from_eden = _eden_surv_rate_group->accum_surv_rate_pred(eden_count) * HeapRegion::GrainBytes;
size_t const total_predicted_survival_bytes = predicted_survival_bytes_from_eden + _predicted_survival_bytes_from_survivor + _predicted_survival_bytes_from_old;
// adjust the total survival bytes by the target amount of wasted space in PLABs.
Contributor:

This adjustment for wasted space in PLABs (the (100 + TargetPLABWastePct) / 100.0 code) is now done twice. Please extract a (maybe static) helper function.

// should old bytes be adjusted and turned into a region count on its own?
size_t const adjusted_survival_bytes_bytes = (size_t)(total_predicted_survival_bytes * (100 + TargetPLABWastePct) / 100.0);
Contributor:

Answering the question: yes, both young gen and old gen survivors must be treated separately and rounded up to regions separately.


uint required_regions = ceil((double)adjusted_survival_bytes_bytes / (double)HeapRegion::GrainBytes);
if (required_regions <= _g1h->num_free_regions() - alloc_region_count) {
return true;
}
log_debug(gc, ergo, cset)("Forcing GC, insufficient free regions for GC predicted %u current eden %u (%u) survivor %u (%u) old %u (%u) free %u alloc %u",
required_regions,
eden_count,
(uint)(predicted_survival_bytes_from_eden / HeapRegion::GrainBytes),
_g1h->survivor_regions_count(),
(uint)(_predicted_survival_bytes_from_survivor / HeapRegion::GrainBytes),
_g1h->old_regions_count(),
(uint)(_predicted_survival_bytes_from_old / HeapRegion::GrainBytes),
_g1h->num_free_regions(),
alloc_region_count);
return false;
}

void G1Policy::calculate_required_regions_for_next_collect() {
// calculate the survival bytes from survivor in the next GC
Contributor:

Comments that are sentences should start upper case and end with a full stop.

size_t survivor_bytes = 0;
const GrowableArray<HeapRegion*>* survivor_regions = _g1h->survivor()->regions();
for (GrowableArrayIterator<HeapRegion*> it = survivor_regions->begin();
it != survivor_regions->end();
++it) {

survivor_bytes += predict_bytes_to_copy(*it);
}
_predicted_survival_bytes_from_survivor = survivor_bytes;

// calculate the survival bytes from old in the next GC
Contributor:

Same problem as above with the comment style. :) Please also add a sentence saying that we use the minimum old gen collection set as a conservative estimate for the number of regions to take for this calculation.

_predicted_survival_bytes_from_old = 0;
G1CollectionSetCandidates *candidates = _collection_set->candidates();
if ((candidates != NULL) && !candidates->is_empty()) {
uint predicted_old_region_count = calc_min_old_cset_length();
Contributor:

Or add the comment here that we intentionally use calc_min_old_cset_length as an estimate for the number of old regions for this calculation.

uint num_remaining = candidates->num_remaining();
uint iterate_count = num_remaining < predicted_old_region_count ? num_remaining : predicted_old_region_count;
Contributor:

I kind of prefer using the MIN2() expression here instead of the "if", as taking the minimum is what you want anyway (it's the same of course, but using MIN2 directly indicates that we are taking the minimum).

Not sure the two locals predicted_old_region_count and num_remaining should be kept then. They do not add much (and should be const).

I'm also not sure the second part of the condition of the outer if (i.e. && !candidates->is_empty()) is useful; candidates->num_remaining() should be zero in that case.

uint current_index = candidates->cur_idx();
size_t old_bytes = 0;
for (uint i = 0; i < iterate_count; i++) {
HeapRegion *region = candidates->at(current_index + i);
old_bytes += predict_bytes_to_copy(region);
}
_predicted_survival_bytes_from_old = old_bytes;
}
}

void G1Policy::transfer_survivors_to_cset(const G1SurvivorRegions* survivors) {
note_start_adding_survivor_regions();

5 changes: 5 additions & 0 deletions src/hotspot/share/gc/g1/g1Policy.hpp
@@ -98,6 +98,9 @@ class G1Policy: public CHeapObj<mtGC> {

uint _free_regions_at_end_of_collection;

size_t _predicted_survival_bytes_from_survivor;
size_t _predicted_survival_bytes_from_old;
Contributor:

As a non-native English speaker I think "survival_bytes" is strange, as "survival" isn't an adjective. Maybe "surviving_bytes" sounds better?
The code in calculate_required_regions_for_next_collect uses "survivor", btw. I would still prefer "surviving" in some form, as it differs from the "survivor" in "survivor regions", but let's at least keep the nomenclature consistent.

Also, please add a comment explaining what these are used for.

Member Author:

I will change the names and properly comment them.


size_t _rs_length;

size_t _rs_length_prediction;
@@ -362,6 +365,8 @@ class G1Policy: public CHeapObj<mtGC> {
double time_remaining_ms,
uint& num_optional_regions);

bool can_mutator_consume_free_regions(uint region_count);
Contributor:

Comments missing.

Member Author:

I will add a comment in my next revision.

void calculate_required_regions_for_next_collect();
Contributor:

"... for_next_collection" sounds better? The method name seems self-explanatory, but the method does not actually calculate required regions; it updates the two new members, which hold values in bytes.

private:
// Set the state to start a concurrent marking cycle and clear
// _initiate_conc_mark_if_possible because it has now been
6 changes: 4 additions & 2 deletions src/hotspot/share/gc/g1/g1VMOperations.cpp
@@ -106,9 +106,11 @@ void VM_G1TryInitiateConcMark::doit() {
VM_G1CollectForAllocation::VM_G1CollectForAllocation(size_t word_size,
uint gc_count_before,
GCCause::Cause gc_cause,
double target_pause_time_ms) :
double target_pause_time_ms,
bool force_gc) :
VM_CollectForAllocation(word_size, gc_count_before, gc_cause),
_gc_succeeded(false),
_force_gc(force_gc),
_target_pause_time_ms(target_pause_time_ms) {

guarantee(target_pause_time_ms > 0.0,
@@ -120,7 +122,7 @@ VM_G1CollectForAllocation::VM_G1CollectForAllocation(size_t word_size,
void VM_G1CollectForAllocation::doit() {
G1CollectedHeap* g1h = G1CollectedHeap::heap();

if (_word_size > 0) {
if (_word_size > 0 && !_force_gc) {
Contributor:

Why not start the GC with a requested word_size == 0 here and remove the need for the flag (though there is still a need to record the purpose of this GC somehow)? In some way it makes sense, as space is not really exhausted.

// An allocation has been requested. So, try to do that first.
_result = g1h->attempt_allocation_at_safepoint(_word_size,
false /* expect_null_cur_alloc_region */);
4 changes: 3 additions & 1 deletion src/hotspot/share/gc/g1/g1VMOperations.hpp
@@ -68,13 +68,15 @@ class VM_G1TryInitiateConcMark : public VM_GC_Operation {

class VM_G1CollectForAllocation : public VM_CollectForAllocation {
bool _gc_succeeded;
bool _force_gc;
Contributor:

Not completely happy about using an extra flag for these forced GCs here; it also makes them indistinguishable from other GCs in the logs, as far as I can see.
What do you think about adding GCCause(s) instead, to automatically make them stand out in the logs?
Or at least make sure that they stand out in the logs for debugging issues.

Member Author:

I will remove the _force_gc flag and add a Preemptive GCCause.

double _target_pause_time_ms;

public:
VM_G1CollectForAllocation(size_t word_size,
uint gc_count_before,
GCCause::Cause gc_cause,
double target_pause_time_ms);
double target_pause_time_ms,
bool force_gc = false);
virtual VMOp_Type type() const { return VMOp_G1CollectForAllocation; }
virtual void doit();
bool gc_succeeded() const { return _gc_succeeded; }
9 changes: 5 additions & 4 deletions test/hotspot/jtreg/gc/g1/TestGCLogMessages.java
@@ -313,16 +313,17 @@ public static void main(String [] args) {
static class GCTestWithEvacuationFailure {
private static byte[] garbage;
private static byte[] largeObject;
private static Object[] holder = new Object[200]; // Must be larger than G1EvacuationFailureALotCount
private static Object[] holder = new Object[800]; // Must be larger than G1EvacuationFailureALotCount
Contributor:

Just curious about these changes: it is not immediately obvious to me why they are necessary, as the mechanism to force evacuation failure (G1EvacuationFailureALotCount et al.) should be independent of these changes.

And the 17 MB (for the humongous object) + 16 MB of garbage should be enough for at least one GC; but maybe these changes trigger an early GC?

Member Author:

Yes, the GC was being triggered earlier and not getting the evacuation failure. With these adjustments it consistently gets an evacuation failure. I was seeing this with my initial prototype, so I will verify that it is still required.


public static void main(String [] args) {
largeObject = new byte[16*1024*1024];
System.out.println("Creating garbage");
// Create 16 MB of garbage. This should result in at least one GC,
// Create 64 MB of garbage. This should result in at least one GC,
// (Heap size is 32M, we use 17MB for the large object above)
// which is larger than G1EvacuationFailureALotInterval.
// which is larger than G1EvacuationFailureALotInterval and enough
// will survive to cause the evacuation failure.
for (int i = 0; i < 16 * 1024; i++) {
holder[i % holder.length] = new byte[1024];
holder[i % holder.length] = new byte[4096];
}
System.out.println("Done");
}