New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8257774: G1: Trigger collect when free region count drops below threshold to prevent evacuation failures #1650
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -420,25 +420,30 @@ HeapWord* G1CollectedHeap::attempt_allocation_slow(size_t word_size) { | |
HeapWord* result = NULL; | ||
for (uint try_count = 1, gclocker_retry_count = 0; /* we'll return */; try_count += 1) { | ||
bool should_try_gc; | ||
bool force_gc = false; | ||
uint gc_count_before; | ||
|
||
{ | ||
MutexLocker x(Heap_lock); | ||
result = _allocator->attempt_allocation_locked(word_size); | ||
if (result != NULL) { | ||
return result; | ||
} | ||
|
||
// If the GCLocker is active and we are bound for a GC, try expanding young gen. | ||
// This is different to when only GCLocker::needs_gc() is set: try to avoid | ||
// waiting because the GCLocker is active to not wait too long. | ||
if (GCLocker::is_active_and_needs_gc() && policy()->can_expand_young_list()) { | ||
// No need for an ergo message here, can_expand_young_list() does this when | ||
// it returns true. | ||
result = _allocator->attempt_allocation_force(word_size); | ||
if (policy()->can_mutator_consume_free_regions(1)) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would prefer if I.e.
The other issue I have with using I think a better place for
This one could be used in the following code. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For this code I am investigating moving the check into either 'G1Policy::should_allocate_mutator_region()' or to the caller of 'G1Policy::should_allocate_mutator_region()' so I can distinguish the reason why it failed to allocate. Hopefully I have something ready to push this week. I am on vacation so my responses are a little slow |
||
result = _allocator->attempt_allocation_locked(word_size); | ||
if (result != NULL) { | ||
return result; | ||
} | ||
|
||
// If the GCLocker is active and we are bound for a GC, try expanding young gen. | ||
// This is different to when only GCLocker::needs_gc() is set: try to avoid | ||
// waiting because the GCLocker is active to not wait too long. | ||
if (GCLocker::is_active_and_needs_gc() && policy()->can_expand_young_list()) { | ||
// No need for an ergo message here, can_expand_young_list() does this when | ||
// it returns true. | ||
result = _allocator->attempt_allocation_force(word_size); | ||
if (result != NULL) { | ||
return result; | ||
} | ||
} | ||
} else { | ||
force_gc = true; | ||
} | ||
// Only try a GC if the GCLocker does not signal the need for a GC. Wait until | ||
// the GCLocker initiated GC has been performed and then retry. This includes | ||
|
@@ -451,7 +456,7 @@ HeapWord* G1CollectedHeap::attempt_allocation_slow(size_t word_size) { | |
if (should_try_gc) { | ||
bool succeeded; | ||
result = do_collection_pause(word_size, gc_count_before, &succeeded, | ||
GCCause::_g1_inc_collection_pause); | ||
GCCause::_g1_inc_collection_pause, force_gc); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I really think it is better to add a new GCCause for this instead of the additional parameter. We are not doing a GC because we are out of space because of the allocation of |
||
if (result != NULL) { | ||
assert(succeeded, "only way to get back a non-NULL result"); | ||
log_trace(gc, alloc)("%s: Successfully scheduled collection returning " PTR_FORMAT, | ||
|
@@ -844,21 +849,26 @@ HeapWord* G1CollectedHeap::attempt_allocation_humongous(size_t word_size) { | |
HeapWord* result = NULL; | ||
for (uint try_count = 1, gclocker_retry_count = 0; /* we'll return */; try_count += 1) { | ||
bool should_try_gc; | ||
bool force_gc = false; | ||
uint gc_count_before; | ||
|
||
|
||
{ | ||
MutexLocker x(Heap_lock); | ||
|
||
// Given that humongous objects are not allocated in young | ||
// regions, we'll first try to do the allocation without doing a | ||
// collection hoping that there's enough space in the heap. | ||
result = humongous_obj_allocate(word_size); | ||
if (result != NULL) { | ||
size_t size_in_regions = humongous_obj_size_in_regions(word_size); | ||
policy()->old_gen_alloc_tracker()-> | ||
add_allocated_humongous_bytes_since_last_gc(size_in_regions * HeapRegion::GrainBytes); | ||
return result; | ||
size_t size_in_regions = humongous_obj_size_in_regions(word_size); | ||
if (policy()->can_mutator_consume_free_regions((uint)size_in_regions)) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Again, I would prefer that the result of this is stored in a local not called |
||
// Given that humongous objects are not allocated in young | ||
// regions, we'll first try to do the allocation without doing a | ||
// collection hoping that there's enough space in the heap. | ||
result = humongous_obj_allocate(word_size); | ||
if (result != NULL) { | ||
policy()->old_gen_alloc_tracker()-> | ||
add_allocated_humongous_bytes_since_last_gc(size_in_regions * HeapRegion::GrainBytes); | ||
return result; | ||
} | ||
} else { | ||
force_gc = true; | ||
} | ||
|
||
// Only try a GC if the GCLocker does not signal the need for a GC. Wait until | ||
|
@@ -872,7 +882,7 @@ HeapWord* G1CollectedHeap::attempt_allocation_humongous(size_t word_size) { | |
if (should_try_gc) { | ||
bool succeeded; | ||
result = do_collection_pause(word_size, gc_count_before, &succeeded, | ||
GCCause::_g1_humongous_allocation); | ||
GCCause::_g1_humongous_allocation, force_gc); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I believe the reason for this GC if |
||
if (result != NULL) { | ||
assert(succeeded, "only way to get back a non-NULL result"); | ||
log_trace(gc, alloc)("%s: Successfully scheduled collection returning " PTR_FORMAT, | ||
|
@@ -2635,12 +2645,14 @@ void G1CollectedHeap::verify_numa_regions(const char* desc) { | |
HeapWord* G1CollectedHeap::do_collection_pause(size_t word_size, | ||
uint gc_count_before, | ||
bool* succeeded, | ||
GCCause::Cause gc_cause) { | ||
GCCause::Cause gc_cause, | ||
bool force_gc) { | ||
assert_heap_not_locked_and_not_at_safepoint(); | ||
VM_G1CollectForAllocation op(word_size, | ||
gc_count_before, | ||
gc_cause, | ||
policy()->max_pause_time_ms()); | ||
policy()->max_pause_time_ms(), | ||
force_gc); | ||
VMThread::execute(&op); | ||
|
||
HeapWord* result = op.result(); | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -761,7 +761,8 @@ class G1CollectedHeap : public CollectedHeap { | |
HeapWord* do_collection_pause(size_t word_size, | ||
uint gc_count_before, | ||
bool* succeeded, | ||
GCCause::Cause gc_cause); | ||
GCCause::Cause gc_cause, | ||
bool force_gc); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If that parameter is kept, please add documentation. |
||
|
||
void wait_for_root_region_scanning(); | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -69,6 +69,8 @@ G1Policy::G1Policy(STWGCTimer* gc_timer) : | |
_reserve_regions(0), | ||
_young_gen_sizer(), | ||
_free_regions_at_end_of_collection(0), | ||
_predicted_survival_bytes_from_survivor(0), | ||
_predicted_survival_bytes_from_old(0), | ||
_rs_length(0), | ||
_rs_length_prediction(0), | ||
_pending_cards_at_gc_start(0), | ||
|
@@ -451,6 +453,7 @@ void G1Policy::record_full_collection_end() { | |
// also call this on any additional surv rate groups | ||
|
||
_free_regions_at_end_of_collection = _g1h->num_free_regions(); | ||
calculate_required_regions_for_next_collect(); | ||
_survivor_surv_rate_group->reset(); | ||
update_young_list_max_and_target_length(); | ||
update_rs_length_prediction(); | ||
|
@@ -782,6 +785,9 @@ void G1Policy::record_collection_pause_end(double pause_time_ms, bool concurrent | |
|
||
update_rs_length_prediction(); | ||
|
||
// Is this the right place? Should it be in the below? | ||
calculate_required_regions_for_next_collect(); | ||
|
||
// Do not update dynamic IHOP due to G1 periodic collection as it is highly likely | ||
// that in this case we are not running in a "normal" operating mode. | ||
if (_g1h->gc_cause() != GCCause::_g1_periodic_collection) { | ||
|
@@ -1441,6 +1447,63 @@ void G1Policy::calculate_optional_collection_set_regions(G1CollectionSetCandidat | |
num_optional_regions, max_optional_regions, prediction_ms); | ||
} | ||
|
||
bool G1Policy::can_mutator_consume_free_regions(uint alloc_region_count) { | ||
uint eden_count = _g1h->eden_regions_count(); | ||
if (eden_count < 1) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd prefer "eden_count == 0" here since it is an uint anyway. |
||
return true; | ||
} | ||
size_t const predicted_survival_bytes_from_eden = _eden_surv_rate_group->accum_surv_rate_pred(eden_count) * HeapRegion::GrainBytes; | ||
size_t const total_predicted_survival_bytes = predicted_survival_bytes_from_eden + _predicted_survival_bytes_from_survivor + _predicted_survival_bytes_from_old; | ||
// adjust the total survival bytes by the target amount of wasted space in PLABs. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This adjustment for wasted space (the |
||
// should old bytes be adjusted and turned into a region count on its own? | ||
size_t const adjusted_survival_bytes_bytes = (size_t)(total_predicted_survival_bytes * (100 + TargetPLABWastePct) / 100.0); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Answering the question: yes, both young gen and old gen survivors must be treated separately and rounded up to regions seperately. |
||
|
||
uint required_regions = ceil((double)adjusted_survival_bytes_bytes / (double)HeapRegion::GrainBytes); | ||
if (required_regions <= _g1h->num_free_regions() - alloc_region_count) { | ||
return true; | ||
} | ||
log_debug(gc, ergo, cset)("Forcing GC, insufficient free regions for GC predicted %u current eden %u (%u) survivor %u (%u) old %u (%u) free %u alloc %u", | ||
required_regions, | ||
eden_count, | ||
(uint)(predicted_survival_bytes_from_eden / HeapRegion::GrainBytes), | ||
_g1h->survivor_regions_count(), | ||
(uint)(_predicted_survival_bytes_from_survivor / HeapRegion::GrainBytes), | ||
_g1h->old_regions_count(), | ||
(uint)(_predicted_survival_bytes_from_old / HeapRegion::GrainBytes), | ||
_g1h->num_free_regions(), | ||
alloc_region_count); | ||
return false; | ||
} | ||
|
||
void G1Policy::calculate_required_regions_for_next_collect() { | ||
// calculate the survival bytes from survivor in the next GC | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Comments that are sentences should start upper case and end with a full stop. |
||
size_t survivor_bytes = 0; | ||
const GrowableArray<HeapRegion*>* survivor_regions = _g1h->survivor()->regions(); | ||
for (GrowableArrayIterator<HeapRegion*> it = survivor_regions->begin(); | ||
it != survivor_regions->end(); | ||
++it) { | ||
|
||
survivor_bytes += predict_bytes_to_copy(*it); | ||
} | ||
_predicted_survival_bytes_from_survivor = survivor_bytes; | ||
|
||
// calculate the survival bytes from old in the next GC | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same problem as above with the comment style. :) Please add a sentence that we use the minimum old gen collection set as conservative estimate for the number of regions to take for this calculation. |
||
_predicted_survival_bytes_from_old = 0; | ||
G1CollectionSetCandidates *candidates = _collection_set->candidates(); | ||
if ((candidates != NULL) && !candidates->is_empty()) { | ||
uint predicted_old_region_count = calc_min_old_cset_length(); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Or add the comment that we intentionally use |
||
uint num_remaining = candidates->num_remaining(); | ||
uint iterate_count = num_remaining < predicted_old_region_count ? num_remaining : predicted_old_region_count; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I kind of prefer using the Not sure if adding the two locals I'm also not sure if the second part of the condition of the outer if (i.e. |
||
uint current_index = candidates->cur_idx(); | ||
size_t old_bytes = 0; | ||
for (uint i = 0; i < iterate_count; i++) { | ||
HeapRegion *region = candidates->at(current_index + i); | ||
old_bytes += predict_bytes_to_copy(region); | ||
} | ||
_predicted_survival_bytes_from_old = old_bytes; | ||
} | ||
} | ||
|
||
void G1Policy::transfer_survivors_to_cset(const G1SurvivorRegions* survivors) { | ||
note_start_adding_survivor_regions(); | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -98,6 +98,9 @@ class G1Policy: public CHeapObj<mtGC> { | |
|
||
uint _free_regions_at_end_of_collection; | ||
|
||
size_t _predicted_survival_bytes_from_survivor; | ||
size_t _predicted_survival_bytes_from_old; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As a non-English native speaker I think "survival_bytes" is strange as "survival" isn't an adjective. Maybe "surviving_bytes" sounds better? Also please add a comment what these are used for. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I will change the names and properly comment them. |
||
|
||
size_t _rs_length; | ||
|
||
size_t _rs_length_prediction; | ||
|
@@ -362,6 +365,8 @@ class G1Policy: public CHeapObj<mtGC> { | |
double time_remaining_ms, | ||
uint& num_optional_regions); | ||
|
||
bool can_mutator_consume_free_regions(uint region_count); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Comments missing. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I will add a comment in my next revision |
||
void calculate_required_regions_for_next_collect(); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. "... for_next_collection" sounds better? This method name seems self-explaining, but it does not calculate the required regions but updates the two new members which contain values stored in bytes. |
||
private: | ||
// Set the state to start a concurrent marking cycle and clear | ||
// _initiate_conc_mark_if_possible because it has now been | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -106,9 +106,11 @@ void VM_G1TryInitiateConcMark::doit() { | |
VM_G1CollectForAllocation::VM_G1CollectForAllocation(size_t word_size, | ||
uint gc_count_before, | ||
GCCause::Cause gc_cause, | ||
double target_pause_time_ms) : | ||
double target_pause_time_ms, | ||
bool force_gc) : | ||
VM_CollectForAllocation(word_size, gc_count_before, gc_cause), | ||
_gc_succeeded(false), | ||
_force_gc(force_gc), | ||
_target_pause_time_ms(target_pause_time_ms) { | ||
|
||
guarantee(target_pause_time_ms > 0.0, | ||
|
@@ -120,7 +122,7 @@ VM_G1CollectForAllocation::VM_G1CollectForAllocation(size_t word_size, | |
void VM_G1CollectForAllocation::doit() { | ||
G1CollectedHeap* g1h = G1CollectedHeap::heap(); | ||
|
||
if (_word_size > 0) { | ||
if (_word_size > 0 && !_force_gc) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why not start the GC with a requested word_size == 0 here and remove the need for the flag (still there is need to record somehow the purpose of this gc)? In some way it makes sense as space is not really exhausted. |
||
// An allocation has been requested. So, try to do that first. | ||
_result = g1h->attempt_allocation_at_safepoint(_word_size, | ||
false /* expect_null_cur_alloc_region */); | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -68,13 +68,15 @@ class VM_G1TryInitiateConcMark : public VM_GC_Operation { | |
|
||
class VM_G1CollectForAllocation : public VM_CollectForAllocation { | ||
bool _gc_succeeded; | ||
bool _force_gc; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not completely happy about using an extra flag for these forced GCs here; also this makes them indistinguishable with other GCs in the logs as far as I can see. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I will remove the _force_gc flag and add a Preemptive GCCause. |
||
double _target_pause_time_ms; | ||
|
||
public: | ||
VM_G1CollectForAllocation(size_t word_size, | ||
uint gc_count_before, | ||
GCCause::Cause gc_cause, | ||
double target_pause_time_ms); | ||
double target_pause_time_ms, | ||
bool force_gc = false); | ||
virtual VMOp_Type type() const { return VMOp_G1CollectForAllocation; } | ||
virtual void doit(); | ||
bool gc_succeeded() const { return _gc_succeeded; } | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -313,16 +313,17 @@ public static void main(String [] args) { | |
static class GCTestWithEvacuationFailure { | ||
private static byte[] garbage; | ||
private static byte[] largeObject; | ||
private static Object[] holder = new Object[200]; // Must be larger than G1EvacuationFailureALotCount | ||
private static Object[] holder = new Object[800]; // Must be larger than G1EvacuationFailureALotCount | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just curious about these changes: it is not immediately obvious to me why they are necessary as the mechanism to force evacuation failure (G1EvacuationFailureALotCount et al) should be independent of these changes. And the 17MB (for the humonguous object)+ 16MB of garbage should be enough for at least one gc; but maybe these changes trigger an early gc? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes the GC was being triggered earlier and not getting the evacuation failure. With these adjustments it consistently gets an evacuation failure. I was seeing this with my initial prototype so I will verify that it is still required. |
||
|
||
public static void main(String [] args) { | ||
largeObject = new byte[16*1024*1024]; | ||
System.out.println("Creating garbage"); | ||
// Create 16 MB of garbage. This should result in at least one GC, | ||
// Create 64 MB of garbage. This should result in at least one GC, | ||
// (Heap size is 32M, we use 17MB for the large object above) | ||
// which is larger than G1EvacuationFailureALotInterval. | ||
// which is larger than G1EvacuationFailureALotInterval and enough | ||
// will survive to cause the evacuation failure. | ||
for (int i = 0; i < 16 * 1024; i++) { | ||
holder[i % holder.length] = new byte[1024]; | ||
holder[i % holder.length] = new byte[4096]; | ||
} | ||
System.out.println("Done"); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
force_gc
andshould_try_gc
seems to overlap a bit here. At least the naming isn't perfect because we may not do a gc even ifforce_gc
is true which I'd kind of expect.I do not have a good new name right now how to fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be removed as part of my next round of changes