x/build: shard and scale the longtest SlowBots #37439

Closed
bcmills opened this issue Feb 25, 2020 · 4 comments

Comments

@bcmills bcmills commented Feb 25, 2020

I missed a Windows test failure in CL 220645 because I forgot to run it against the windows-amd64-longtest SlowBot. I forgot to run it against that SlowBot because I'm not in the habit of doing so.

I'm not in the habit of running that SlowBot because it is currently much too slow. To pick some relevant runs:

  • The first run on CL 220717 started at 5:07 PM and completed at 5:31 PM (24 minutes).
  • The second run on that CL started at 5:51 PM and completed at 6:33 PM (42 minutes).
  • The run on CL 220722 started at 5:48 PM and completed at 6:28 PM (40 minutes).

In contrast, a regular TryBot typically caps out around 10 minutes (#32632), and we consider runs that take longer than 20 minutes to be unacceptably slow (#36629, #36482).
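
As a quick check of the arithmetic above, the following minimal Go sketch recomputes the three run durations from the quoted start and end times and compares each with the 20-minute mark. The `run` type, `elapsed` function, and `slowThreshold` constant are names made up for this illustration; they are not part of x/build.

```go
package main

import (
	"fmt"
	"time"
)

// run is a hypothetical record of one SlowBot run. The field names are
// illustrative and are not taken from any x/build type.
type run struct {
	label      string
	start, end string // clock times in time.Kitchen layout, e.g. "5:07PM"
}

// slowThreshold is the 20-minute mark the issue describes as unacceptably slow.
const slowThreshold = 20 * time.Minute

// elapsed parses the two clock times and returns end minus start.
// It assumes both times fall on the same day.
func elapsed(r run) (time.Duration, error) {
	s, err := time.Parse(time.Kitchen, r.start)
	if err != nil {
		return 0, err
	}
	e, err := time.Parse(time.Kitchen, r.end)
	if err != nil {
		return 0, err
	}
	return e.Sub(s), nil
}

func main() {
	// Start and end times as quoted in the issue.
	runs := []run{
		{"CL 220717, first run", "5:07PM", "5:31PM"},
		{"CL 220717, second run", "5:51PM", "6:33PM"},
		{"CL 220722", "5:48PM", "6:28PM"},
	}
	for _, r := range runs {
		d, err := elapsed(r)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("%s: %v (over 20 min: %t)\n", r.label, d, d > slowThreshold)
	}
}
```

All three runs come out above the 20-minute mark, matching the 24, 42, and 40 minute figures listed.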

Since there is nothing particularly special about the hardware needed to run the longtest builds (they're just large VMs), I think we should adjust the builder configuration to run the -longtest SlowBots with 4 or more shards each. That way, the end-to-end latency impact of adding one of these bots to a CL will be minimal, and we will not only have less of a disincentive to using them, but also have much faster feedback in order to inform revert-or-fix decisions when one breaks.
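
To illustrate why sharding reduces end-to-end latency, here is a rough Go sketch, not the actual x/build scheduling logic. It greedily assigns test items to the least-loaded of n shards and treats the slowest shard as the wall-clock time. The `testItem` type, the `shard` and `wallClock` functions, and the per-item costs are all invented for the example.

```go
package main

import (
	"fmt"
	"time"
)

// testItem is a hypothetical unit of work (for example, one package's tests)
// with an assumed running time.
type testItem struct {
	name string
	cost time.Duration
}

// shard assigns each item, in order, to whichever of the n shards currently
// has the smallest total, and returns the per-shard totals.
func shard(items []testItem, n int) []time.Duration {
	if n < 1 {
		n = 1
	}
	loads := make([]time.Duration, n)
	for _, it := range items {
		least := 0
		for i := range loads {
			if loads[i] < loads[least] {
				least = i
			}
		}
		loads[least] += it.cost
	}
	return loads
}

// wallClock returns the largest shard total: the run is only finished
// once the slowest shard is finished.
func wallClock(loads []time.Duration) time.Duration {
	var longest time.Duration
	for _, l := range loads {
		if l > longest {
			longest = l
		}
	}
	return longest
}

func main() {
	// Invented workload: eight items of five minutes each (40 minutes total).
	var items []testItem
	for i := 0; i < 8; i++ {
		items = append(items, testItem{name: fmt.Sprintf("item%d", i), cost: 5 * time.Minute})
	}
	for _, n := range []int{1, 2, 4} {
		fmt.Printf("%d shard(s): %v\n", n, wallClock(shard(items, n)))
	}
}
```

With this evenly sized workload, four shards finish in a quarter of the single-shard time. Real test suites are less evenly divisible and carry per-shard setup cost, so actual gains would be smaller.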

CC @golang/osp-team

@cagedmantis cagedmantis commented Feb 28, 2020

@dmitshur dmitshur self-assigned this Nov 6, 2020

@gopherbot gopherbot commented Nov 6, 2020

Change https://golang.org/cl/268037 mentions this issue: dashboard: try to speed up pre-submit longtest builders

gopherbot pushed a commit to golang/build that referenced this issue Nov 7, 2020
The longtest builders are currently primarily post-submit builders,
where it's okay for them to be as slow as they need to be in order
to provide additional test coverage. In this context, whether they
take 40 minutes or 50 makes little difference.

The longtest builders are also sometimes requested via SlowBots for
changes that are riskier than usual, or otherwise desire additional
coverage beyond the normal TryBots. They're also always enabled for
CLs to release branches. In such contexts, speeding up SlowBot runs
from 40 minutes to 20 or less would be appreciated and in turn help
people use longtest SlowBots more frequently.

Longtest builders are already configured to use sharded tests.
Configure them to use additional helpers to speed up test execution.
Try out 3, 5, and 9 helpers to see how much it helps before settling.

For golang/go#37439.

Change-Id: I425bc0257b7a54bb32c0eb1719fea7ba3f4fd461
Reviewed-on: https://go-review.googlesource.com/c/build/+/268037
Trust: Dmitri Shuralyov <dmitshur@golang.org>
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Alexander Rakoczy <alex@golang.org>
Reviewed-by: Carlos Amedee <carlos@golang.org>

@dmitshur dmitshur commented Nov 7, 2020

I sent https://golang.org/cl/268037 for this.

I had a chance to try out all 3 values for additional TryBot helpers (3, 5, and 9) in at least one CL and the times so far were:

| windows-amd64-longtest (+3 helpers) | linux-386-longtest (+5) | linux-amd64-longtest (+9) |
|---|---|---|
| 12 min 21 sec | 12 min 6 sec | 10 min 15 sec |
10 min 9 sec

It seems that even just 3 additional helpers go a long way toward speeding up the longtest SlowBot. I'll collect some more timing data next.

(It's not a completely fair comparison since the builders are different, but it's enough to get some idea.)
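
One way to picture what "additional helpers" buy is a shared queue of tests drained by the main builder plus N helper workers. The Go sketch below uses goroutines to stand in for helper machines. It is an assumption-laden illustration rather than the coordinator's real implementation, and `runWithHelpers` is a hypothetical name.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runWithHelpers runs every test on 1+helpers concurrent workers that pull
// from a shared queue. It returns how many tests each worker ran; index 0
// stands for the main builder and the rest for the helpers.
func runWithHelpers(tests []func(), helpers int) []int {
	if helpers < 0 {
		helpers = 0
	}
	workers := 1 + helpers
	queue := make(chan func())
	counts := make([]int, workers)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			for t := range queue {
				t()
				counts[w]++ // each worker writes only its own slot
			}
		}(w)
	}
	for _, t := range tests {
		queue <- t
	}
	close(queue)
	wg.Wait()
	return counts
}

func main() {
	var ran int64
	tests := make([]func(), 12)
	for i := range tests {
		tests[i] = func() { atomic.AddInt64(&ran, 1) }
	}
	// 3 helpers, as in the windows-amd64-longtest trial above.
	counts := runWithHelpers(tests, 3)
	fmt.Println("tests run:", atomic.LoadInt64(&ran))
	fmt.Println("per-worker counts:", counts)
}
```

Because workers pull from the queue as they become free, the split between them varies from run to run, and a slow test delays only the worker that picked it up.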

@gopherbot gopherbot commented Dec 21, 2020

Change https://golang.org/cl/279513 mentions this issue: dashboard: pick 4 TryBot helpers for -longtest SlowBots
