You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
sysmon preempts long running goroutines to allow scheduling other goroutines. Today this is done unconditionally. However, the scheduler guarantees work conservation. This means that if there is runnable work, we must start a P to run the work. From this, we can deduce the reverse: if there are idle Ps, then there must not be runnable work.
sysmon could use this property to reduce its preemptions: only preempt if there are no idle Ps.
This came up recently when investigating #55160, which is a bug in the scheduler causing it to lose work conservation. Unconditional sysmon preemptions masked the bug by inducing preemptions, which allows (most) programs to continue making forward progress with fewer Ps. If sysmon preempted only if there were no idle Ps, programs would hang in the presence of this bug, making it more apparent.
@felixge proposed this earlier this year in https://go.dev/cl/460541. We held off due to two sources of work that break our work conservation guarantee: fractional GC workers, and the trace reader goroutine. Neither of these have a direct waker that does a wakeup when the work is runnable. Instead, they both depend on the scheduler eventually running and noticing the work.
In light of new information about how this can be useful in uncovering scheduler bugs, I think we should reconsider addresses these cases. We could also introduce this change as a debug mode without addressing these; neither fractional GC workers nor the trace reader are required for general correctness.
neither fractional GC workers nor the trace reader are required for general correctness.
Small note: fractional GC workers are supposed to be required for GC progress, they just aren't because their scheduling is too loose and idle-time GC workers are the current backstop required for progress (if there's only a fractional worker).
goschedImpl transitions the current goroutine from _Grunning to
_Grunnable and places it on the global run queue before calling into
It does _not_ call wakep after adding the global run queue. I believe
the intuition behind skipping wakep is that since we are immediately
calling the scheduler so we don't need to wake anything to run this
work. Unfortunately, this intuition is not correct, as it breaks
coordination with spinning Ms .
Consider this example scenario:
M0: Running P0, G0
M1: Spinning, holding P1 and looking for work
M1: Fails to find work; drops P
M0: newproc adds G1 to P0 runq
M0: does not wakep because there is a spinning M
M1: clear mp.spinning, decrement sched.nmspinning (now in "delicate dance")
M1: check sched.runqsize -> no global runq work
M0: gosched preempts G0; adds G0 to global runq
M0: does not wakep because gosched doesn't wakep
M0: schedules G1 from P0 runq
M1: check P0 runq -> no work
M1: no work -> park
G0 is stranded on the global runq with no M/P looking to run it. This is
a loss of work conservation.
As a result, G0 will have unbounded* scheduling delay, only getting
scheduled when G1 yields. Even once G1 yields, we still won't start
another P, so both G0 and G1 will switch back and forth sharing one P
when they should start another.
*The caveat to this is that today sysmon will preempt G1 after 10ms,
effectively capping the scheduling delay to 10ms, but not solving the P
underutilization problem. Sysmon's behavior here is theoretically
unnecessary, as our work conservation guarantee should allow sysmon to
avoid preemption if there are any idle Ps. Issue #60693 tracks changing
this behavior and the challenges involved.
 It would be OK if we unconditionally entered the scheduler as a
spinning M ourselves, as that would require schedule to call wakep when
it finds work in case there is more work.
Reviewed-by: Michael Knyszek <firstname.lastname@example.org>
TryBot-Result: Gopher Robot <email@example.com>
Run-TryBot: Michael Pratt <firstname.lastname@example.org>
Auto-Submit: Michael Pratt <email@example.com>