**_For background and results see https://github.com/eclipse/omr/issues/5829_**

> Using overhead data (busy/stall times for managing and synchronizing threads), Adaptive Threading aims to identify sub-optimal/detrimental parallelism and continuously adjusts the GC thread count to reach optimal parallelism.

_Adaptive Threading changes are implemented at the various phases of GC as follows:_

1. **Pre-collection _(during task/thread dispatch)_:** adjust the thread count based on the previous cycle's recommendation.
2. **Collection:** gather parallelization overhead data for the running cycle.
3. **Post-collection _(after worker threads are suspended)_:** project/calculate the optimal thread count and give a recommendation for the next cycle, based on the GC that has just completed.

Adaptive Threading is enabled by default. The user may enable or disable it through the `-XX:[+-]AdaptiveThreading` options. However, Adaptive Threading is ignored, even if it is enabled, when the GC thread count is forced (e.g. the user specifies `-Xgcthreads`). The user can also specify an upper thread limit for Adaptive Threading using the `-Xgcmaxthreads` option.

_The specifics of the implementation are as follows:_

- The thread count associated with a parallel task instance indicates the projected optimal number of threads to complete the task.
  - It drives Adaptive Threading: it changes with each dispatch of the task, based on observations made when previously completing the same task.
  - Currently, Adaptive Threading is only used with _Scavenger_, hence this only applies to tasks of type `MM_ParallelScavengeTask`, so a recommended thread count is provided with each new `ParallelScavengeTask` instance (i.e. when a scavenge is run).
  - `getRecommendedWorkingThreads` is introduced to the `MM_Task` base class. The base implementation returns `UDATA_MAX` (signifies no adaptive threading); `ParallelScavengeTask` overrides it to return the adaptive thread count.
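The relationship between the task's recommendation, a forced thread count, and the upper thread limit can be sketched roughly as below. This is a simplified illustration, not the OMR code: `Task`, `ScavengeTask`, `NO_RECOMMENDATION` (standing in for `UDATA_MAX`) and the helper `resolveThreadCount` are hypothetical names, and clamping the result to at least one thread is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Stand-in for UDATA_MAX: signals "no adaptive recommendation".
constexpr std::uintptr_t NO_RECOMMENDATION = UINTPTR_MAX;

// Simplified stand-in for the MM_Task base class (hypothetical).
class Task {
public:
    virtual ~Task() = default;
    // Base implementation: no adaptive threading.
    virtual std::uintptr_t getRecommendedWorkingThreads() const { return NO_RECOMMENDATION; }
};

// Simplified stand-in for a scavenge task that carries a recommendation.
class ScavengeTask : public Task {
public:
    explicit ScavengeTask(std::uintptr_t recommendedThreads)
        : _recommendedThreads(recommendedThreads) {}
    std::uintptr_t getRecommendedWorkingThreads() const override { return _recommendedThreads; }
private:
    std::uintptr_t _recommendedThreads;
};

// Hypothetical helper: choose how many threads to use for a task.
// threadLimit is the forced count when the user fixed it, otherwise the upper limit.
std::uintptr_t resolveThreadCount(const Task &task, std::uintptr_t threadLimit, bool threadCountForced)
{
    assert(threadLimit >= 1);
    const std::uintptr_t recommended = task.getRecommendedWorkingThreads();
    if (threadCountForced || recommended == NO_RECOMMENDATION) {
        // Adaptive threading is ignored: fall back to the configured count.
        return threadLimit;
    }
    // Assumption: keep the recommendation within [1, threadLimit].
    return std::min(std::max<std::uintptr_t>(recommended, 1), threadLimit);
}
```

With this sketch, a `ScavengeTask` recommending 4 threads under a limit of 8 resolves to 4, while the same task resolves to 8 when the count is forced.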
during collection):

- `_workerScavengeStartTime` and `_workerScavengeEndTime`
  - Thread-local task (scavenge) start/end timestamps, taken when a worker thread starts/completes a task (scavenge). They are used to determine how long a worker takes to start the collection task from the time the cycle starts, and how long a worker waits for others once it has completed its task.
- `_adjustedSyncStallTime`
  - Similar to the existing `_SyncStallTime` stat, **except** that it accounts for critical section duration. That is, it subtracts the critical section duration from the stall time of a synced thread. The stall time added by a critical section is independent of the number of threads being synchronized. This independent stall time cannot be adjusted for, hence it must be ignored. This is relevant for the `syncAndReleaseSingle/Master` APIs, as they are the only APIs that record a stall time with critical sections; without these, `_SyncStallTime` == `_adjustedSyncStallTime`.
  - The existing `addToSyncStallTime` method is extended to update `_adjustedSyncStallTime` in addition to `_SyncStallTime`. `addToSyncStallTime` now takes `criticalSectionDuration` (defaults to 0) as an input; a value is passed for it when a critical section is executed prior to updating the stall stats.
- `_notifyStallTime`
  - Used to record the stall times resulting from notifying other waiting threads. Note that this stat is not all-inclusive: it only records notify times relevant to adaptive threading.
  - We are more concerned with recording 'notify_all' than 'notify_one', as 'notify_all' depends on the number of threads being notified.
- Introduced `calculateRecommendedWorkingThreads` routine
  - Called at the end of each successful scavenge to project the optimal thread count for the next scavenge.
  - Implements the Adaptive Model _(see background for more info)_.
  - Calculates averages of the stall stats and uses them as inputs to the adaptive model; the projected optimal thread count output is stored in the new member `_recommendedThreads`.
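The difference between the raw and adjusted stall stats can be illustrated with a small sketch. This is an assumption-laden simplification rather than the OMR implementation: the struct `StallStatsSketch`, its field names, the start/end timestamp parameters, and the clamp to zero when the critical section is longer than the measured stall are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for the scavenger stats described above.
struct StallStatsSketch {
    std::uint64_t syncStallTime = 0;          // raw stall time, critical sections included
    std::uint64_t adjustedSyncStallTime = 0;  // stall time with critical sections subtracted

    // criticalSectionDuration defaults to 0, so callers that sync without a
    // critical section update both stats identically.
    void addToSyncStallTime(std::uint64_t startTime, std::uint64_t endTime,
                            std::uint64_t criticalSectionDuration = 0)
    {
        assert(endTime >= startTime);
        const std::uint64_t stall = endTime - startTime;
        syncStallTime += stall;
        // Assumption: never let the adjusted contribution go below zero.
        adjustedSyncStallTime += (stall > criticalSectionDuration)
            ? (stall - criticalSectionDuration)
            : 0;
    }
};
```

For example, a stall measured from 200 to 300 with a critical section of 30 adds 100 to the raw stat but only 70 to the adjusted one.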
- Timestamps taken for worker thread scavenge start/end time
- `notifyStallTime` recorded for `notify_all` on `_scanCacheMonitor` _(some instances are ignored as they are not relevant to adaptive threading, such as backout cases)_
- `MergeThreadGCStats` updated to merge the new scavenger stats (discussed above)
  - Trace point added here to break down thread-local stats for Adaptive Threading
- Dispatcher now queries the task for the recommended thread count when determining the number of threads to release from the thread pool to complete the task
  - Ensures the thread count is bounded properly to respect the max thread count, either the user-provided adaptiveThreadCount or the default max.
- `_syncCriticalSectionStartTime` & `_syncCriticalSectionDuration` introduced to record critical section times for the adjusted stall time
  - `_syncCriticalSectionStartTime` is recorded when all threads are synced and the released thread is about to exit the sync API to execute the critical section.
  - `_syncCriticalSectionDuration` is recorded when the thread executing the critical section is about to release the synced threads (indicating the critical section is complete). As synced threads are released, they update their stall time with the newly set critical section duration.
- `notifyStallTime` recorded for `notify_all` on synced threads

Signed-off-by: Salman Rana <salman.rana@ibm.com>