src/hotspot/share/gc/shenandoah/heuristics/shenandoahAdaptiveHeuristics.cpp
@@ -240,13 +240,14 @@ bool ShenandoahAdaptiveHeuristics::should_start_gc() {
log_debug(gc)("should_start_gc? available: %zu, soft_max_capacity: %zu"
", allocated: %zu", available, capacity, allocated);

+  // Track allocation rate even if we decide to start a cycle for other reasons.
+  double rate = _allocation_rate.sample(allocated);

if (_start_gc_is_pending) {
log_trigger("GC start is already pending");
return true;
}

-  // Track allocation rate even if we decide to start a cycle for other reasons.
-  double rate = _allocation_rate.sample(allocated);
_last_trigger = OTHER;

size_t min_threshold = min_free_threshold();
@@ -360,16 +361,32 @@ ShenandoahAllocationRate::ShenandoahAllocationRate() :
_rate_avg(int(ShenandoahAdaptiveSampleSizeSeconds * ShenandoahAdaptiveSampleFrequencyHz), ShenandoahAdaptiveDecayFactor) {
}

+double ShenandoahAllocationRate::force_sample(size_t allocated, size_t &unaccounted_bytes_allocated) {
+  const double MinSampleTime = 0.002; // Do not sample if time since last update is less than 2 ms
Contributor:
Would it make sense to "carry over" the unaccounted_bytes_allocated if, at the time of reset, the last sample was taken less than ShenandoahAllocationRate::_interval_sec ago (instead of MinSampleTime)? I believe this would result in more samples being carried over.
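A minimal sketch of the variant being suggested, reusing the fields force_sample() already touches (illustrative only, not code from the PR):

double ShenandoahAllocationRate::force_sample(size_t allocated, size_t &unaccounted_bytes_allocated) {
  double now = os::elapsedTime();
  // Suggested: carry the residue over whenever we are still inside the normal
  // sampling interval (~100 ms), not just inside the 2 ms MinSampleTime window.
  if (now - _last_sample_time < _interval_sec) {
    unaccounted_bytes_allocated = allocated - _last_sample_value;
    _last_sample_value = 0;
    return 0.0;
  }
  double rate = instantaneous_rate(now, allocated);
  _rate.add(rate);
  _rate_avg.add(_rate.avg());
  _last_sample_time = now;
  _last_sample_value = allocated;
  unaccounted_bytes_allocated = 0;
  return rate;
}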

Contributor Author (@kdnilsen), Sep 23, 2025:

I've done experiments with a variety of different values for MinSampleTime (e.g. 50 ms, 25 ms, 2 ms). I find that any value greater than 1 ms works pretty well, in the sense that the measured rate is pretty consistent with the average rate as computed by the preexisting mechanisms.

The biggest problem I observed was when we tried to measure the rate over an interval of a few microseconds, and these intervals sometimes had zero allocations.

We could make the change you suggest, but that would actually result in fewer samples rather than more. It would also make the value of bytes_allocated_since_gc_start() less precise, in that this quantity would carry up to 100 ms of allocations that actually occurred before the start of the current GC.

Here's an ASCII art attempt to illustrate how it works in the current implementation.

Here is a timeline with a tick mark to illustrate the desired sample time, every 100 ms:

          .1s       .2s       .3s       .4s       .5s       .6s       .7s       .8s       .9s       1s        1.1s      1.2s
|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---
              ^                                                                                        ^
Existing      |                                                                                        |
Samples:  A   B                                                                                        C         D         E
              |                                                                                        |
Proposed      |                                                                                        |
Samples:  A   |                                                                                        B         C         D
              |                                                                                        |
              |                                                                                        |
              Trigger GC at 0.14s                                                                      Finish GC at 1.03s

In either approach, we sample at 0.1s, and this sample contributes to the average allocation rate that causes us to trigger at time 0.14s. In the current implementation, at time 0.14s, we take a new sample representing the most current allocation rate, as computed over the span between 0.1s and 0.14s. As implemented, if this time delta (0.14s - 0.10s = 0.04s) is less than 2 ms, we'll skip sample B. I think you are proposing that we skip sample B whenever this delta is less than 100 ms (_interval_sec).

We do not sample during GC. As soon as GC finishes, we begin asking the question: should I start another GC? This resumes the sampling activity, aligning the periodic samples on the time at which we finished the previous GC cycle.

Contributor:

With the change in the PR, I was thinking of "skipped" samples as being "carried over" to the next cycle (per unaccounted_bytes_allocated = allocated - _last_sample_value;).
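To make the carry-over concrete with hypothetical numbers: if the last periodic sample ran at t = 0.139s with _last_sample_value = 40 MB, and a GC triggers at t = 0.140s when allocated = 41 MB, then time_since_last_update = 0.001s < MinSampleTime (0.002s), so force_sample() records no rate sample and returns unaccounted_bytes_allocated = 41 MB - 40 MB = 1 MB; the caller seeds the next cycle's allocation counter with that 1 MB rather than dropping it.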

+  double now = os::elapsedTime();
+  double time_since_last_update = now - _last_sample_time;
+  if (time_since_last_update < MinSampleTime) {
+    unaccounted_bytes_allocated = allocated - _last_sample_value;
+    _last_sample_value = 0;
+    return 0.0;
+  } else {
+    double rate = instantaneous_rate(now, allocated);
+    _rate.add(rate);
+    _rate_avg.add(_rate.avg());
+    _last_sample_time = now;
+    _last_sample_value = allocated;
+    unaccounted_bytes_allocated = 0;
+    return rate;
+  }
+}

double ShenandoahAllocationRate::sample(size_t allocated) {
double now = os::elapsedTime();
double rate = 0.0;
if (now - _last_sample_time > _interval_sec) {
-    if (allocated >= _last_sample_value) {
-      rate = instantaneous_rate(now, allocated);
-      _rate.add(rate);
-      _rate_avg.add(_rate.avg());
-    }

+    rate = instantaneous_rate(now, allocated);
+    _rate.add(rate);
+    _rate_avg.add(_rate.avg());
_last_sample_time = now;
Contributor Author:
Note: in the original implementation, the first sample() collected following the start of GC typically finds that allocated < _last_sample_value, so we ignore all data associated with that time gap. With Shenandoah, the next sample happens the first time we ask should_start_gc() following the completion of the current GC. With GenShen, the next sample happens after marking, when we calculate the allocation runway in order to size our reserves.

Contributor Author:
It is possible that the first sample() collected following the start of GC finds allocated > _last_sample_value. If this happens, the above computes a bogus result: it thinks we've only allocated (new_sample_value - _last_sample_value) bytes in total time (now - _last_sample_time). In truth, we've allocated new_sample_value bytes plus whatever was allocated after _last_sample_time but before the start of GC.
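Putting hypothetical numbers on the bogus case: suppose _last_sample_time = 0.10s with _last_sample_value = 50 MB, another 8 MB is allocated before GC starts at 0.14s, the counter is reset to zero at GC start, and the first post-GC sample at 1.10s sees allocated = 60 MB. Because 60 MB > 50 MB, the guarded computation yields a rate of (60 MB - 50 MB) / (1.10s - 0.10s) = 10 MB/s, even though 8 MB + 60 MB = 68 MB was actually allocated over that same second.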

_last_sample_value = allocated;
}
src/hotspot/share/gc/shenandoah/heuristics/shenandoahAdaptiveHeuristics.hpp
@@ -37,10 +37,12 @@ class ShenandoahAllocationRate : public CHeapObj<mtGC> {
explicit ShenandoahAllocationRate();
void allocation_counter_reset();

+  double force_sample(size_t allocated, size_t &unaccounted_bytes_allocated);
Member:
Document these two APIs: force_sample() and sample().

A small block documentation comment on ShenandoahAllocationRate() is also a good idea.
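One possible shape for the requested comments, offered as a sketch only (wording mine, not from the PR):

// ShenandoahAllocationRate tracks the mutator allocation rate as a decaying
// average over periodic samples of bytes-allocated-since-GC-start.
//
// sample(allocated): called periodically with the bytes allocated since GC start;
//   records an instantaneous rate sample if at least _interval_sec has elapsed.
//
// force_sample(allocated, unaccounted_bytes_allocated): called once at GC start,
//   before the allocation counter is reset, so the tail of the cycle contributes
//   to the rate statistics. If less than MinSampleTime has elapsed since the last
//   sample, no rate is recorded and the not-yet-sampled bytes are returned via
//   unaccounted_bytes_allocated so the caller can credit them to the next cycle.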

double sample(size_t allocated);

double upper_bound(double sds) const;
bool is_spiking(double rate, double threshold) const;

private:

double instantaneous_rate(double time, size_t allocated) const;
@@ -71,18 +73,18 @@ class ShenandoahAdaptiveHeuristics : public ShenandoahHeuristics {

virtual void choose_collection_set_from_regiondata(ShenandoahCollectionSet* cset,
RegionData* data, size_t size,
-                                             size_t actual_free);
+                                             size_t actual_free) override;

-  void record_cycle_start();
-  void record_success_concurrent();
-  void record_success_degenerated();
-  void record_success_full();
+  virtual void record_cycle_start() override;
+  virtual void record_success_concurrent() override;
+  virtual void record_success_degenerated() override;
+  virtual void record_success_full() override;

-  virtual bool should_start_gc();
+  virtual bool should_start_gc() override;

-  virtual const char* name() { return "Adaptive"; }
-  virtual bool is_diagnostic() { return false; }
-  virtual bool is_experimental() { return false; }
+  virtual const char* name() override { return "Adaptive"; }
+  virtual bool is_diagnostic() override { return false; }
+  virtual bool is_experimental() override { return false; }

private:
// These are used to adjust the margin of error and the spike threshold
@@ -150,6 +152,13 @@ class ShenandoahAdaptiveHeuristics : public ShenandoahHeuristics {
_last_trigger = trigger_type;
ShenandoahHeuristics::accept_trigger();
}

+ public:
+  virtual size_t force_alloc_rate_sample(size_t bytes_allocated) override {
+    size_t unaccounted_bytes;
+    _allocation_rate.force_sample(bytes_allocated, unaccounted_bytes);
+    return unaccounted_bytes;
+  }
};

#endif // SHARE_GC_SHENANDOAH_HEURISTICS_SHENANDOAHADAPTIVEHEURISTICS_HPP
src/hotspot/share/gc/shenandoah/heuristics/shenandoahHeuristics.hpp
@@ -241,6 +241,11 @@ class ShenandoahHeuristics : public CHeapObj<mtGC> {

double elapsed_cycle_time() const;

+  virtual size_t force_alloc_rate_sample(size_t bytes_allocated) {
Member:
I realize none of these public APIs have any documentation of intent, but it might be a good idea to start doing that with this new API you are adding here.
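A possible documentation block for the new hook, offered as a suggestion only (wording mine, not from the PR):

// Called by the heap at the start of a GC cycle, immediately before the generation's
// allocation counter is reset, giving the heuristic one last chance to fold the bytes
// allocated since its previous sample into its allocation-rate statistics. Returns the
// number of bytes that could not be attributed to a rate sample; the caller seeds the
// next cycle's allocation counter with this value so those bytes are not lost. The
// default implementation tracks no allocation rate and returns 0.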

+    // do nothing
+    return 0;
+  }

// Format prefix and emit log message indicating a GC cycle has been triggered
void log_trigger(const char* fmt, ...) ATTRIBUTE_PRINTF(2, 3);
};
src/hotspot/share/gc/shenandoah/heuristics/shenandoahSpaceInfo.hpp
@@ -41,6 +41,10 @@ class ShenandoahSpaceInfo {
virtual size_t soft_available() const = 0;
virtual size_t available() const = 0;
virtual size_t used() const = 0;

+  // Return an approximation of the bytes allocated since GC start. The value returned is monotonically non-decreasing
+  // in time within each GC cycle. For certain GC cycles, the value returned may include some bytes allocated before
+  // the start of the current GC cycle.
virtual size_t bytes_allocated_since_gc_start() const = 0;
};

src/hotspot/share/gc/shenandoah/shenandoahGeneration.cpp (2 additions & 2 deletions)
@@ -151,8 +151,8 @@ size_t ShenandoahGeneration::bytes_allocated_since_gc_start() const {
return AtomicAccess::load(&_bytes_allocated_since_gc_start);
}

-void ShenandoahGeneration::reset_bytes_allocated_since_gc_start() {
-  AtomicAccess::store(&_bytes_allocated_since_gc_start, (size_t)0);
+void ShenandoahGeneration::reset_bytes_allocated_since_gc_start(size_t initial_bytes_allocated) {
+  AtomicAccess::store(&_bytes_allocated_since_gc_start, initial_bytes_allocated);
}

void ShenandoahGeneration::increase_allocated(size_t bytes) {
src/hotspot/share/gc/shenandoah/shenandoahGeneration.hpp (1 addition & 1 deletion)
@@ -142,7 +142,7 @@ class ShenandoahGeneration : public CHeapObj<mtGC>, public ShenandoahSpaceInfo {
size_t soft_available() const override;

size_t bytes_allocated_since_gc_start() const override;
-  void reset_bytes_allocated_since_gc_start();
+  void reset_bytes_allocated_since_gc_start(size_t initial_bytes_allocated);
void increase_allocated(size_t bytes);

// These methods change the capacity of the generation by adding or subtracting the given number of bytes from the current
src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp (19 additions & 4 deletions)
@@ -2319,12 +2319,27 @@ address ShenandoahHeap::in_cset_fast_test_addr() {
}

void ShenandoahHeap::reset_bytes_allocated_since_gc_start() {
+  // It is important to call force_alloc_rate_sample() before the associated generation's bytes_allocated has been reset.
+  // Note that there is no lock to prevent additional allocations between sampling bytes_allocated_since_gc_start() and
+  // reset_bytes_allocated_since_gc_start(). If additional allocations happen, they will be ignored in the average
+  // allocation rate computations. This effect is considered to be negligible.
+
+  // unaccounted_bytes is the bytes not accounted for by our forced sample. If the sample interval is too short,
+  // the "forced sample" will not happen, and any recently allocated bytes are "unaccounted for". We pretend these
+  // bytes are allocated after the start of the subsequent GC.
Comment on lines +2322 to +2329
Member:
I think these comments should be made slightly more abstract and moved to the specs of the reset and force methods. As written, they feel like specifics of the sampling and "residue" computation leaking into objects they shouldn't leak into, so some thought on how to organize this by pushing everything into one place would improve code maintainability and robustness.

As I understand it, the issue is that we use two separate APIs, with no locking, to read the allocated bytes and to reset them, all while they are being concurrently incremented. So you want the reset method to detect the residue allocated between the two points and ascribe it to the rate computation used for the next sample. I feel there may be a simpler way to do this: push all of these mechanics and details down into the sampling method, which has direct access to the counter and can manipulate it for computing rates and, upon request, for resetting.
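A rough sketch of that consolidation, with hypothetical names, assuming the rate tracker owns the allocation counter (an illustration of the suggestion, not code from the PR):

class ShenandoahAllocationRateTracker {
 public:
  // Periodic path: fold the current counter value into the rate statistics.
  void sample() { record_rate(os::elapsedTime(), AtomicAccess::load(&_bytes_allocated)); }

  // GC-start path: take one final sample and reset the counter in one place,
  // keeping any residue bookkeeping internal to the tracker.
  void sample_and_reset() {
    size_t allocated = AtomicAccess::load(&_bytes_allocated);
    record_rate(os::elapsedTime(), allocated);
    AtomicAccess::store(&_bytes_allocated, (size_t)0);
    _last_sample_value = 0;
  }

 private:
  void record_rate(double now, size_t allocated);
  volatile size_t _bytes_allocated; // incremented by mutators at allocation time
  double _last_sample_time;
  size_t _last_sample_value;
};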

Contributor:

I agree. What we're really trying to track is the mutator's allocation rate. There's no particular reason it needs to be implemented in terms of bytes allocated since GC start. Of course, we must reset (or at least, reduce) the value periodically to avoid rollover, but again, we don't need to do that only when we start a GC. As @kdnilsen noted, having the control thread take the samples also means we don't collect samples during the GC cycle. We should consider having mutators (or some other thread) collect the samples and manage the allocation counter.

Contributor Author:

Thanks for the review. I'll update the docs in a separate PR (since I integrated already).

+  size_t unaccounted_bytes;
   if (mode()->is_generational()) {
-    young_generation()->reset_bytes_allocated_since_gc_start();
-    old_generation()->reset_bytes_allocated_since_gc_start();
+    size_t bytes_allocated = young_generation()->bytes_allocated_since_gc_start();
+    unaccounted_bytes = young_generation()->heuristics()->force_alloc_rate_sample(bytes_allocated);
+    young_generation()->reset_bytes_allocated_since_gc_start(unaccounted_bytes);
+    unaccounted_bytes = 0;
+    old_generation()->reset_bytes_allocated_since_gc_start(unaccounted_bytes);
+  } else {
+    size_t bytes_allocated = global_generation()->bytes_allocated_since_gc_start();
+    // Single-gen Shenandoah uses global heuristics.
+    unaccounted_bytes = heuristics()->force_alloc_rate_sample(bytes_allocated);
+  }

-  global_generation()->reset_bytes_allocated_since_gc_start();
+  global_generation()->reset_bytes_allocated_since_gc_start(unaccounted_bytes);
}

void ShenandoahHeap::set_degenerated_gc_in_progress(bool in_progress) {