Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build: all.bash fails to saturate 6 cores #17751

Open
bcmills opened this issue Nov 2, 2016 · 93 comments
Open

build: all.bash fails to saturate 6 cores #17751

bcmills opened this issue Nov 2, 2016 · 93 comments

Comments

@bcmills
Copy link
Member

@bcmills bcmills commented Nov 2, 2016

go version devel +eed2bb7 Wed Nov 2 19:21:16 2016 +0000 linux/amd64

What did you do?

./clean.bash
time ./all.bash

What did you expect to see?

I do my development on a virtualized 6-core Xeon running Ubuntu. (runtime.NumCPU() returns 6.)

Per @rsc's comments on #10571, I expected "Both the build half and the test half of all.bash should consume nearly all the CPUs for most of the time."

What did you see instead?

all.bash seems to saturate at ~2 CPU cores for a significant fraction of the run. It consumes less than twice as much user time than real. Oddly, sys usage is even higher than user; I'm not sure what to make of that.

real    8m55.199s
user    14m13.511s
sys     8m51.402s

Here's a chart of CPU saturation over time as measured by mpstat. You can see that there are a couple bursts of parallelism with mostly poor saturation in between. The machine doesn't appear to be hitting swap during the build.

@bradfitz
Copy link
Contributor

@bradfitz bradfitz commented Nov 2, 2016

I don't have access to your CPU saturation chart. google.com accounts can't share outside of Google, IIRC. Use your golang account?

@bcmills
Copy link
Member Author

@bcmills bcmills commented Nov 2, 2016

That's annoying. I don't have a golang account; I'll see if I can just upload an image instead.

@bcmills
Copy link
Member Author

@bcmills bcmills commented Nov 2, 2016

image

@minux
Copy link
Member

@minux minux commented Nov 2, 2016

@bradfitz
Copy link
Contributor

@bradfitz bradfitz commented Nov 2, 2016

on a virtualized 6-core Xeon

Which virtualization? (might explain high sys time?)

@josharian
Copy link
Contributor

@josharian josharian commented Nov 3, 2016

#15736 will help with understanding this. Sorry I didn't get it into 1.8. Making compilation more concurrent will also help saturate the cores (as would #15734, but the consensus is to focus on intra-package concurrency instead).

@bradfitz
Copy link
Contributor

@bradfitz bradfitz commented Nov 3, 2016

Also, the make.bash portion of Bryan's graph doesn't look super terrible.

The terrible part is running the tests, almost none of which use t.Parallel so are run serialized.

If people could go around and sprinkle some magic t.Parallel fairy dusty on tests where it's safe to do (where it doesn't introduce flakiness under load), that would be great.

@josharian
Copy link
Contributor

@josharian josharian commented Nov 3, 2016

Here are the 20 slowest packages to test, at least on my machine:

$ pbpaste | grep "^ok" | awk '{print $3"\t"$2}' | sort -n -r | head -n 20
67.474s cmd/go
54.461s runtime
42.232s cmd/compile/internal/gc
10.048s net/http
9.891s  go/build
9.311s  cmd/objdump
8.648s  cmd/cover
8.295s  cmd/vet
7.839s  net
6.487s  cmd/nm
5.483s  cmd/pack
4.813s  cmd/compile
4.534s  os/signal
3.899s  cmd/addr2line
2.784s  net/http/cgi
2.692s  time
2.615s  go/types
2.246s  log/syslog
2.209s  compress/flate
2.176s  runtime/pprof

Of these, cmd/go, runtime, net/http, net, os/signal, net/http/cgi, time, and runtime/pprof seem the most likely to become flakier under load. I'll take a look at cmd/compile/internal/gc and leave the rest for others.

@dsnet
Copy link
Member

@dsnet dsnet commented Nov 3, 2016

I can hammer on compress/flate. It's a package whose tests I want to clean up anyways.

@bradfitz
Copy link
Contributor

@bradfitz bradfitz commented Nov 3, 2016

Here are the 20 slowest packages to test, at least on my machine

I have the timing data for every trybot + build run on all machines in BigQuery, btw.

@gopherbot
Copy link

@gopherbot gopherbot commented Nov 3, 2016

CL https://golang.org/cl/32585 mentions this issue.

gopherbot pushed a commit that referenced this issue Nov 3, 2016
TestAssembly takes 20s on my machine,
which is too slow for normal operation.
Marking as -short has its dangers (#17472),
but hopefully we'll soon have a builder for that.

All the SSA tests are hermetic and not time sensitive
and can thus be run in parallel.
Reduces the cmd/compile/internal/gc test time during
all.bash on my laptop from 42s to 7s.

Updates #17751

Change-Id: Idd876421db23b9fa3475e8a9b3355a5dc92a5a29
Reviewed-on: https://go-review.googlesource.com/32585
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@josharian josharian changed the title all.bash fails to saturate 6 cores build: all.bash fails to saturate 6 cores Nov 3, 2016
@gopherbot
Copy link

@gopherbot gopherbot commented Nov 3, 2016

CL https://golang.org/cl/32673 mentions this issue.

@bcmills
Copy link
Member Author

@bcmills bcmills commented Nov 3, 2016

@bradfitz

Which virtualization? (might explain high sys time?)

Xen (via Ganeti), I believe.

I suspect that I/O performance on my system is pretty poor, so that might explain the high sys time.

gopherbot pushed a commit that referenced this issue Nov 4, 2016
Rebuild cmd/objdump once instead of twice.
Speeds up standalone 'go test cmd/objdump' on my
machine from ~1.4s to ~1s.

Updates #17751

Change-Id: I15fd79cf18c310f892bc28a9e9ca47ee010c989a
Reviewed-on: https://go-review.googlesource.com/32673
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@gopherbot
Copy link

@gopherbot gopherbot commented Nov 4, 2016

CL https://golang.org/cl/32684 mentions this issue.

gopherbot pushed a commit that referenced this issue Nov 4, 2016
Before: 8.9 seconds for go test -short
 After: 2.8 seconds

There are still 250 tests without t.Parallel, but I got the important
onces using a script:

    $ go test -short -v 2>&1 | go run ~/slowtests.go

Where slowtests.go is https://play.golang.org/p/9mh5Wg1nLN

The remaining 250 (the output lines from slowtests.go) all have a
reported duration of 0ms, except one 50ms test which has to be serial.

Where tests can't be parallel, I left a comment at the top saying why,
so people don't add t.Parallel later and get surprised at failures.

Updates #17751

Change-Id: Icbe32cbe2b996e23c89f1af6339287fa22af5115
Reviewed-on: https://go-review.googlesource.com/32684
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
@gopherbot
Copy link

@gopherbot gopherbot commented Nov 4, 2016

CL https://golang.org/cl/32751 mentions this issue.

gopherbot pushed a commit that referenced this issue Nov 4, 2016
I used the slowtests.go tool as described in
https://golang.org/cl/32684 on packages that stood out.

go test -short std drops from ~56 to ~52 seconds.

This isn't a huge win, but it was mostly an exercise.

Updates #17751

Change-Id: I9f3402e36a038d71e662d06ce2c1d52f6c4b674d
Reviewed-on: https://go-review.googlesource.com/32751
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@gopherbot
Copy link

@gopherbot gopherbot commented Nov 4, 2016

CL https://golang.org/cl/32754 mentions this issue.

gopherbot pushed a commit that referenced this issue Nov 6, 2016
Was 2.3 seconds. Now 1.4 seconds.

Next win would be not running a child process and refactoring main so
it could be called from tests easily. But that would also require
rewriting the errchk written in Perl. This appears to be the last user
of errchk in the tree.

Updates #17751

Change-Id: Id7c3cec76f438590789b994e756f55b5397be07f
Reviewed-on: https://go-review.googlesource.com/32754
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
@gopherbot
Copy link

@gopherbot gopherbot commented Nov 7, 2016

CL https://golang.org/cl/32850 mentions this issue.

gopherbot pushed a commit that referenced this issue Nov 7, 2016
Cuts tests from 35 to 25 seconds.

Many of these could be parallel if the test runner were modified to
give each test its own workdir cloned from the tempdir files they
use. But later. This helps for now.

Updates #17751

Change-Id: Icc2ff87cca60a33ec5fd8abb1eb0a9ca3e85bf95
Reviewed-on: https://go-review.googlesource.com/32850
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@rsc rsc added this to the Unplanned milestone Jan 4, 2017
@josharian
Copy link
Contributor

@josharian josharian commented Aug 4, 2017

I'm curious whether concurrent compilation in 1.9 helps much here.

@bcmills
Copy link
Member Author

@bcmills bcmills commented Sep 13, 2017

I'm curious whether concurrent compilation in 1.9 helps much here.

The initial build transient seems to saturate a bit better, and a few of the tests spike to saturation, but the overall utilization is still pretty similar.

From a clean build at tip:
chart

gopherbot pushed a commit that referenced this issue Feb 19, 2020
Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: Ie46118eddbd7c3ed0bb9ecee4bdc1cb6fdaf06a0
Reviewed-on: https://go-review.googlesource.com/c/go/+/214291
Reviewed-by: Jay Conrod <jayconrod@google.com>
gopherbot pushed a commit that referenced this issue Feb 19, 2020
This one's a bit complex and required writing support go programs
within the test script.

Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: I9e91225b20b1b043f032b77a55c5825cb9d9a4b1
Reviewed-on: https://go-review.googlesource.com/c/go/+/214292
Reviewed-by: Jay Conrod <jayconrod@google.com>
gopherbot pushed a commit that referenced this issue Feb 19, 2020
Adds a in-script go binary called inarchive to check that an
archive is produced. A weaker could be done faster using grep,
but this is more faithful to the original test.

Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: I001fa0698063be80fe3da947c81d4eb0829be47f
Reviewed-on: https://go-review.googlesource.com/c/go/+/214295
Reviewed-by: Jay Conrod <jayconrod@google.com>
gopherbot pushed a commit that referenced this issue Feb 19, 2020
Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: I1328a87e2481b4555b01df5c898f1a8015412adc
Reviewed-on: https://go-review.googlesource.com/c/go/+/214296
Reviewed-by: Jay Conrod <jayconrod@google.com>
gopherbot pushed a commit that referenced this issue Feb 19, 2020
Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: I27e52c4eabfcd1782965f17c098719dd0ea7e3ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/214390
Reviewed-by: Jay Conrod <jayconrod@google.com>
gopherbot pushed a commit that referenced this issue Feb 19, 2020
Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: Ib1c55a48fafb5ce040ac70707bbc2a3ee5e2ddd4
Reviewed-on: https://go-review.googlesource.com/c/go/+/214382
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Jay Conrod <jayconrod@google.com>
gopherbot pushed a commit that referenced this issue Feb 19, 2020
Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: I69c69809fb1698c8198ef3ea00103a9acb7b6ce7
Reviewed-on: https://go-review.googlesource.com/c/go/+/214387
Run-TryBot: Michael Matloob <matloob@golang.org>
Reviewed-by: Jay Conrod <jayconrod@google.com>
gopherbot pushed a commit that referenced this issue Feb 19, 2020
Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: I2170f427c238e9fe8c029b43b346621d82c5e8fc
Reviewed-on: https://go-review.googlesource.com/c/go/+/214388
Reviewed-by: Jay Conrod <jayconrod@google.com>
gopherbot pushed a commit that referenced this issue Feb 19, 2020
Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: I16fb0910196c96caef6ed380f96010a548407f9e
Reviewed-on: https://go-review.googlesource.com/c/go/+/214424
Reviewed-by: Bryan C. Mills <bcmills@google.com>
gopherbot pushed a commit that referenced this issue Feb 19, 2020
Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: I2d14c07c590cc618c66f27fdc3a2bb8120c6d646
Reviewed-on: https://go-review.googlesource.com/c/go/+/214427
Reviewed-by: Jay Conrod <jayconrod@google.com>
gopherbot pushed a commit that referenced this issue Feb 19, 2020
Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: I1cb3e2e28700b05b08933f4e24cd996268c1f163
Reviewed-on: https://go-review.googlesource.com/c/go/+/214428
Reviewed-by: Jay Conrod <jayconrod@google.com>
gopherbot pushed a commit that referenced this issue Feb 19, 2020
Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: I6db652a4a515daf6e87645d34191dc9a441f5720
Reviewed-on: https://go-review.googlesource.com/c/go/+/214431
Reviewed-by: Bryan C. Mills <bcmills@google.com>
@gopherbot
Copy link

@gopherbot gopherbot commented Feb 20, 2020

Change https://golang.org/cl/220178 mentions this issue: cmd/go: roll forward "convert TestShadowingLogic to the script framework"

gopherbot pushed a commit that referenced this issue Feb 27, 2020
Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: I1020feaa4ddb40ff52c46728bc4973cea4c7b066
Reviewed-on: https://go-review.googlesource.com/c/go/+/214391
Reviewed-by: Jay Conrod <jayconrod@google.com>
gopherbot pushed a commit that referenced this issue Feb 27, 2020
…t framework

The name of the test is too long to fit on the first line. It's
TestGoTestBuildsAnXtestContainingOnlyNonRunnableExamples.

Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: I490748271b10a85cbe1d34f9dbecb86ccf0101a7
Reviewed-on: https://go-review.googlesource.com/c/go/+/214423
Reviewed-by: Jay Conrod <jayconrod@google.com>
gopherbot pushed a commit that referenced this issue Feb 27, 2020
Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: If1e591f28d6399a07b37ed7f4a1419bf7cd915eb
Reviewed-on: https://go-review.googlesource.com/c/go/+/214425
Reviewed-by: Jay Conrod <jayconrod@google.com>
gopherbot pushed a commit that referenced this issue Feb 27, 2020
Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: I9a99aa5d37300c83a2f95fb906949cb4c1d5356f
Reviewed-on: https://go-review.googlesource.com/c/go/+/214426
Reviewed-by: Jay Conrod <jayconrod@google.com>
gopherbot pushed a commit that referenced this issue Feb 27, 2020
Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: I394844da1ffc0dcde7f5862c41ed8efa7c5ca088
Reviewed-on: https://go-review.googlesource.com/c/go/+/214429
Reviewed-by: Jay Conrod <jayconrod@google.com>
gopherbot pushed a commit that referenced this issue Feb 27, 2020
It looks like TestGoBuildGOPATHOrderBroken has been fixed so I've converted
that too, without the skip.

Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: I7ee77f22fb888811c175bcdc5eb814c80fbec420
Reviewed-on: https://go-review.googlesource.com/c/go/+/214432
Reviewed-by: Jay Conrod <jayconrod@google.com>
gopherbot pushed a commit that referenced this issue Feb 27, 2020
Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: I601e0fcee32b8c5bf2107b520d1dfbe12a19ad3f
Reviewed-on: https://go-review.googlesource.com/c/go/+/213223
Reviewed-by: Jay Conrod <jayconrod@google.com>
gopherbot pushed a commit that referenced this issue Feb 27, 2020
Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: Ib386838081abad8bc6b01c1f0a4656553d0b6ff3
Reviewed-on: https://go-review.googlesource.com/c/go/+/214579
Reviewed-by: Jay Conrod <jayconrod@google.com>
gopherbot pushed a commit that referenced this issue Feb 27, 2020
I think this test needs to be split up eventually. It's one of
the longest tests.

Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: If2168fac040d78fd0ec3dcbdef2affd2a8f48f6d
Reviewed-on: https://go-review.googlesource.com/c/go/+/214158
Reviewed-by: Jay Conrod <jayconrod@google.com>
gopherbot pushed a commit that referenced this issue Feb 27, 2020
This is a bit complex. There's a driver program to run go with modifications
to the GOPATH used to test Windows.

Also remove the cd method on testgoData, because this was the last function
that used it.

Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: I3e8e27f37fd3701bd36b6365b128dd73b69181c0
Reviewed-on: https://go-review.googlesource.com/c/go/+/214578
Reviewed-by: Jay Conrod <jayconrod@google.com>
gopherbot pushed a commit that referenced this issue Feb 27, 2020
…ork"

This rolls forward the change golang.org/cl/214431, which was reverted
in golang.org/cl/220217. The cl was broken because
TestVersionControlErrorMessageIncludesCorrectDirectory, which is going
to be removed in golang.org/cl/214429 hadn't been submitted yet.

Original change description:

Part of converting all tests to script framework to improve
test parallelism.

Updates #36320
Updates #17751

Change-Id: I87b3f9acb8575fbcbd58d454b5f9bac4923429b3
Reviewed-on: https://go-review.googlesource.com/c/go/+/220178
Run-TryBot: Michael Matloob <matloob@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
@gopherbot
Copy link

@gopherbot gopherbot commented Mar 17, 2020

Change https://golang.org/cl/223745 mentions this issue: cmd/go: replace TestCgoDependsOnSyscall with a simpler script test

gopherbot pushed a commit that referenced this issue Mar 17, 2020
The existing test attempted to remove '_race' binaries from
GOROOT/pkg, which could not only fail if GOROOT is read-only, but also
interfere with other tests run in parallel.

Updates #30316
Updates #37573
Updates #17751

Change-Id: Id7e2286ab67f8333baf4d52244b7f4476aa93a46
Reviewed-on: https://go-review.googlesource.com/c/go/+/223745
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Jay Conrod <jayconrod@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
@gopherbot
Copy link

@gopherbot gopherbot commented Jun 24, 2020

Change https://golang.org/cl/239738 mentions this issue: [release-branch.go1.14] cmd/go: convert TestBuildIDContainsArchModeEnv to the script framework

@gopherbot
Copy link

@gopherbot gopherbot commented Jun 24, 2020

Change https://golang.org/cl/239777 mentions this issue: [release-branch.go1.13] cmd/go: convert TestBuildIDContainsArchModeEnv to the script framework

gopherbot pushed a commit that referenced this issue Jun 29, 2020
…v to the script framework

Part of converting all tests to script framework to improve
test parallelism.

Fixes #39824
Updates #36320
Updates #17751

Change-Id: I69c69809fb1698c8198ef3ea00103a9acb7b6ce7
Reviewed-on: https://go-review.googlesource.com/c/go/+/214387
Run-TryBot: Michael Matloob <matloob@golang.org>
Reviewed-by: Jay Conrod <jayconrod@google.com>
(cherry picked from CL 214387)
Reviewed-on: https://go-review.googlesource.com/c/go/+/239738
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
Reviewed-by: Carlos Amedee <carlos@golang.org>
gopherbot pushed a commit that referenced this issue Jun 29, 2020
…v to the script framework

Part of converting all tests to script framework to improve
test parallelism.

Fixes #39823
Updates #36320
Updates #17751

Change-Id: I69c69809fb1698c8198ef3ea00103a9acb7b6ce7
Reviewed-on: https://go-review.googlesource.com/c/go/+/214387
Run-TryBot: Michael Matloob <matloob@golang.org>
Reviewed-by: Jay Conrod <jayconrod@google.com>
(cherry picked from CL 214387)
Reviewed-on: https://go-review.googlesource.com/c/go/+/239777
Run-TryBot: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
7 participants
You can’t perform that action at this time.