runtime: avoid unnecessary sysmon preemptions #60693
Labels
compiler/runtime
Issues related to the Go compiler and/or runtime.
NeedsFix
The path to resolution is known, but the work has not been done.
sysmon preempts long-running goroutines to allow scheduling other goroutines. Today this is done unconditionally. However, the scheduler guarantees work conservation. This means that if there is runnable work, we must start a P to run the work. From this, we can deduce the contrapositive: if there are idle Ps, then there must not be runnable work.
sysmon could use this property to reduce its preemptions: only preempt if there are no idle Ps.
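To make the proposed rule concrete, here is a minimal sketch of the decision as a standalone toy model. It is not runtime code: `shouldPreempt`, `preemptThreshold`, and the `conditional` flag are hypothetical names for illustration, and the 10ms threshold is an assumed value for the sketch.

```go
package main

import (
	"fmt"
	"time"
)

// preemptThreshold is an illustrative time slice after which a running
// goroutine becomes a preemption candidate (assumed value for this sketch).
const preemptThreshold = 10 * time.Millisecond

// shouldPreempt is a hypothetical model of the decision discussed above.
// With conditional == false it mimics today's unconditional behavior.
// With conditional == true it preempts only when there are no idle Ps,
// relying on work conservation: idle Ps imply there is no runnable work.
func shouldPreempt(runningFor time.Duration, idlePs int, conditional bool) bool {
	if runningFor < preemptThreshold {
		return false // has not run long enough to be a candidate
	}
	if !conditional {
		return true // unconditional: always preempt long-running goroutines
	}
	return idlePs == 0 // conditional: preempting is pointless if a P is idle
}

func main() {
	long := 2 * preemptThreshold
	fmt.Println(shouldPreempt(long, 2, false)) // unconditional, idle Ps present
	fmt.Println(shouldPreempt(long, 2, true))  // conditional, idle Ps present
	fmt.Println(shouldPreempt(long, 0, true))  // conditional, no idle Ps
}
```

In this model, the only case where the two modes differ is a long-running goroutine while at least one P is idle, which is exactly the preemption the proposal would skip.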
This came up recently when investigating #55160, which is a bug in the scheduler causing it to lose work conservation. Unconditional sysmon preemptions masked the bug by inducing preemptions, which allows (most) programs to continue making forward progress with fewer Ps. If sysmon preempted only if there were no idle Ps, programs would hang in the presence of this bug, making it more apparent.
@felixge proposed this earlier this year in https://go.dev/cl/460541. We held off due to two sources of work that break our work conservation guarantee: fractional GC workers, and the trace reader goroutine. Neither of these has a direct waker that does a wakeup when the work is runnable. Instead, they both depend on the scheduler eventually running and noticing the work.
In light of new information about how this can be useful in uncovering scheduler bugs, I think we should reconsider addressing these cases. We could also introduce this change as a debug mode without addressing these; neither fractional GC workers nor the trace reader are required for general correctness.
cc @mknyszek @aclements