cmd/go/internal/test: test concurrency limited by sequential printing #61233
Labels
GoCommand
cmd/go
NeedsInvestigation
Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
ToolSpeed
Milestone
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
Yes
What did you do?
Ran
go test -debug-trace=trace -debug-actiongraph=actiongraph ./...
What did you expect to see?
Tests are run with maximum concurrency
What did you see instead?
test run
tasks are started but not doing any work, blocking other work from progressing.Example simple reproducer:
Basically this just makes a few packages, one with a slow test, and a few empty packages.
When we run this, I would expect to have one worker blocked on the
b
test (sleep 1
), while the rest execute in the other worker and quickly complete.Instead, we see this:
This looks odd - the
e
package doesn't have any tests, so why istest run e
taking 1.3s?The reason seems to be related to https://go-review.git.corp.google.com/c/go/+/448357 (although I am not certain it worked different before that CL). One the
runTestActor
starts, it is immediately blocked on<-r.prev
. This, in turn, means it is blocked until the previous test (d
) finishes.Ultimately, this blocks an entire worker doing nothing.
A real world example of this can be found here (warning: 170mb). You can see it occurring in a few places, most prominently around 300s we have almost all 16 workers blocked, most/all on "no test files" packages.
I think based on the way actions are pushed into the
ready
queue, any action that has zero remaining dependencies may be pushed onto the queue. In some of my testing, I saw the occasionally some of the very last packages in the test ordering would get scheduled first - blocking an entire worker for almost the entire process.I believe this new logic is only need for
-json
mode. One option is to just enable it only if-json
is enabled.I tested this locally against the same repo as the "real world example" above.
With Go 1.20, the tests took 70s to execute. With the patch, they took 30s.
With the same patch, the simple reproducer looks as expected:
Before trace After trace (warning: 150mb)
Another approach is to add each
test run
action as a dependency for the previous, to serialize the ordering. However, this makes things far worse, as this is serializing things much stricter -- the current serialization just ensures the starts are serialized, while this approach would make sure the entire test execution is as well - not what we want.In my testing this made things worse.
A similar approach may be to make another action type
test start
, but I didn't test this outThe text was updated successfully, but these errors were encountered: