runtime: stronger affinity between G ↔ P ↔ M ↔ CPU? #65694
Labels: compiler/runtime, NeedsInvestigation, Performance, Scalability
Currently the runtime makes some attempts to maintain affinity between these resources:
There are cases where we explicitly do not maintain affinity:
The lack of perfect affinity is usually evident in an execution trace, where Gs can be seen moving around even when there are idle resources. It is especially evident across a stop-the-world (STW) pause, but movement can be seen during normal execution as well.
In the example trace, G11 through G19 all move between threads several times (thanks to @aktau for raising this).

These migrations are certainly a (minor) annoyance when viewing traces.
They may also degrade performance. For example, CPU caches are likely cold after a migration, causing additional cache misses. There could even be NUMA effects: if a G's allocations came from a P's mcache with spans that the OS has placed on one NUMA node, moving the G to a different M/CPU could make its memory accesses slower.
None of these potential performance effects has been measured to determine whether it is noticeable. For example, migration will clearly have cache effects, but migrations tend to occur at intervals of 10ms or much longer, and it isn't clear that cache effects would be noticeable at such long time scales. More research is required.
cc @mknyszek @aclements @aktau