Skip to content

x/build/cmd/coordinator: bad test partitioning for plan9-arm #49343

Closed
@millerresearch

Description

@millerresearch

Currently a lot of short tests are being executed individually in separate invocations of go tool dist test, instead of being grouped together. This matters for overall runtime, because on the raspberry pi builders each invocation of go tool dist test adds an overhead of >30 seconds, even for a test which takes a small fraction of a second.
See for example this excerpt from an execution log on plan9-arm:

  2021-10-25T17:16:23Z still_waiting_on_test go_test:internal/abi
  2021-10-25T17:16:53Z still_waiting_on_test go_test:internal/abi
  2021-10-25T17:17:18Z finish_run_tests_multi after 1m24.6s; pi4g: [go_test:internal/abi go_test:internal/buildcfg go_test:internal/cpu]
  2021-10-25T17:17:18Z run_test:go_test:internal/execabs pi4g
  2021-10-25T17:17:48Z still_waiting_on_test go_test:internal/execabs
  2021-10-25T17:17:53Z finish_run_test:go_test:internal/execabs after 35.3s; pi4g
  2021-10-25T17:17:53Z run_test:go_test:internal/fmtsort pi4g
  2021-10-25T17:18:23Z still_waiting_on_test go_test:internal/fmtsort
  2021-10-25T17:18:37Z finish_run_test:go_test:internal/fmtsort after 43.5s; pi4g
  2021-10-25T17:18:37Z run_test:go_test:internal/fuzz pi4g
  2021-10-25T17:19:07Z still_waiting_on_test go_test:internal/fuzz
  2021-10-25T17:19:13Z finish_run_test:go_test:internal/fuzz after 36.3s; pi4g
  2021-10-25T17:19:13Z run_test:go_test:internal/itoa pi4g
  2021-10-25T17:19:43Z still_waiting_on_test go_test:internal/itoa
  2021-10-25T17:19:48Z finish_run_test:go_test:internal/itoa after 34.6s; pi4g
  2021-10-25T17:19:48Z run_test:go_test:internal/poll pi4g
  2021-10-25T17:20:18Z still_waiting_on_test go_test:internal/poll
  2021-10-25T17:20:35Z finish_run_test:go_test:internal/poll after 47.1s; pi4g
  2021-10-25T17:20:35Z run_test:go_test:internal/profile pi4g
  2021-10-25T17:21:05Z still_waiting_on_test go_test:internal/profile
  2021-10-25T17:21:11Z finish_run_test:go_test:internal/profile after 36.1s; pi4g
  2021-10-25T17:21:11Z run_test:go_test:internal/reflectlite pi4g
  2021-10-25T17:21:41Z still_waiting_on_test go_test:internal/reflectlite
  2021-10-25T17:22:04Z finish_run_test:go_test:internal/reflectlite after 53s; pi4g
  2021-10-25T17:22:04Z run_test:go_test:internal/singleflight pi4g
  2021-10-25T17:22:34Z still_waiting_on_test go_test:internal/singleflight
  2021-10-25T17:22:38Z finish_run_test:go_test:internal/singleflight after 34s; pi4g
  2021-10-25T17:22:38Z run_test:go_test:internal/trace pi4g
  2021-10-25T17:23:08Z still_waiting_on_test go_test:internal/trace
  2021-10-25T17:23:14Z finish_run_test:go_test:internal/trace after 35.5s; pi4g
  2021-10-25T17:23:14Z run_test:go_test:internal/unsafeheader pi4g
  2021-10-25T17:23:44Z still_waiting_on_test go_test:internal/unsafeheader
  2021-10-25T17:23:48Z finish_run_test:go_test:internal/unsafeheader after 34.8s; pi4g
  2021-10-25T17:23:48Z run_test:go_test:internal/xcoff pi4g
  2021-10-25T17:24:18Z still_waiting_on_test go_test:internal/xcoff
  2021-10-25T17:24:24Z finish_run_test:go_test:internal/xcoff after 35.4s; pi4g
  2021-10-25T17:24:24Z run_test:go_test:io pi4g
  2021-10-25T17:24:54Z still_waiting_on_test go_test:io
  2021-10-25T17:25:11Z finish_run_test:go_test:io after 47.4s; pi4g
  2021-10-25T17:25:11Z run_test:go_test:io/fs pi4g
  2021-10-25T17:25:41Z still_waiting_on_test go_test:io/fs
  2021-10-25T17:25:46Z finish_run_test:go_test:io/fs after 34.9s; pi4g
  2021-10-25T17:25:46Z run_test:go_test:io/ioutil pi4g
  2021-10-25T17:26:16Z still_waiting_on_test go_test:io/ioutil
  2021-10-25T17:26:39Z finish_run_test:go_test:io/ioutil after 52.4s; pi4g
  2021-10-25T17:26:39Z run_test:go_test:log pi4g
  2021-10-25T17:27:09Z still_waiting_on_test go_test:log
  2021-10-25T17:27:15Z finish_run_test:go_test:log after 35.9s; pi4g
  2021-10-25T17:27:15Z run_test:go_test:math pi4g
  2021-10-25T17:27:45Z still_waiting_on_test go_test:math
  2021-10-25T17:28:04Z finish_run_test:go_test:math after 49.6s; pi4g
  2021-10-25T17:28:04Z run_test:go_test:math/big pi4g
  2021-10-25T17:28:34Z still_waiting_on_test go_test:math/big
  2021-10-25T17:28:51Z finish_run_test:go_test:math/big after 46.5s; pi4g
  2021-10-25T17:28:51Z run_test:go_test:math/bits pi4g
  2021-10-25T17:29:21Z still_waiting_on_test go_test:math/bits
  2021-10-25T17:29:42Z finish_run_test:go_test:math/bits after 50.9s; pi4g
  2021-10-25T17:29:42Z run_test:go_test:math/cmplx pi4g
  2021-10-25T17:30:12Z still_waiting_on_test go_test:math/cmplx
  2021-10-25T17:30:19Z finish_run_test:go_test:math/cmplx after 36.7s; pi4g
  2021-10-25T17:30:19Z run_test:go_test:math/rand pi4g
  2021-10-25T17:30:49Z still_waiting_on_test go_test:math/rand
  2021-10-25T17:30:58Z finish_run_test:go_test:math/rand after 39.4s; pi4g
  2021-10-25T17:30:58Z run_test:go_test:mime pi4g
  2021-10-25T17:31:28Z still_waiting_on_test go_test:mime
  2021-10-25T17:31:35Z finish_run_test:go_test:mime after 36.9s; pi4g
  2021-10-25T17:31:35Z run_test:go_test:mime/multipart pi4g
  2021-10-25T17:32:05Z still_waiting_on_test go_test:mime/multipart
  2021-10-25T17:32:13Z finish_run_test:go_test:mime/multipart after 37.8s; pi4g
  2021-10-25T17:32:13Z run_test:go_test:mime/quotedprintable pi4g
  2021-10-25T17:32:43Z still_waiting_on_test go_test:mime/quotedprintable
  2021-10-25T17:32:49Z finish_run_test:go_test:mime/quotedprintable after 36.1s; pi4g
  2021-10-25T17:32:49Z run_test:go_test:net pi4g
  2021-10-25T17:33:19Z still_waiting_on_test go_test:net
  2021-10-25T17:33:49Z still_waiting_on_test go_test:net
  2021-10-25T17:34:12Z finish_run_test:go_test:net after 1m23.2s; pi4g
  2021-10-25T17:34:12Z run_test:go_test:net/http pi4g
  2021-10-25T17:34:42Z still_waiting_on_test go_test:net/http
  2021-10-25T17:35:12Z still_waiting_on_test go_test:net/http
  2021-10-25T17:35:42Z still_waiting_on_test go_test:net/http
  2021-10-25T17:36:12Z still_waiting_on_test go_test:net/http
  2021-10-25T17:36:20Z finish_run_test:go_test:net/http after 2m8.1s; pi4g
  2021-10-25T17:36:20Z run_tests_multi pi4g: [go_test:net/http/cgi go_test:net/http/cookiejar go_test:net/http/fcgi]
  2021-10-25T17:36:50Z still_waiting_on_test go_test:net/http/cgi
  2021-10-25T17:37:02Z finish_run_tests_multi after 41s; pi4g: [go_test:net/http/cgi go_test:net/http/cookiejar go_test:net/http/fcgi]

compared to the same subsequence of tests run on another slow platform, freebsd-arm64:

  2021-10-25T19:14:46Z still_waiting_on_test go_test:internal/abi
  2021-10-25T19:14:51Z finish_run_tests_multi after 34.4s; aarch64.localdomain: [go_test:internal/abi go_test:internal/buildcfg go_test:internal/cpu]
  2021-10-25T19:14:51Z run_tests_multi aarch64.localdomain: [go_test:internal/execabs go_test:internal/fmtsort go_test:internal/fuzz]
  2021-10-25T19:14:59Z finish_run_tests_multi after 8.31s; aarch64.localdomain: [go_test:internal/execabs go_test:internal/fmtsort go_test:internal/fuzz]
  2021-10-25T19:14:59Z run_tests_multi aarch64.localdomain: [go_test:internal/itoa go_test:internal/poll go_test:internal/profile]
  2021-10-25T19:15:11Z finish_run_tests_multi after 12.3s; aarch64.localdomain: [go_test:internal/itoa go_test:internal/poll go_test:internal/profile]
  2021-10-25T19:15:11Z run_tests_multi aarch64.localdomain: [go_test:internal/reflectlite go_test:internal/singleflight go_test:internal/trace]
  2021-10-25T19:15:27Z finish_run_tests_multi after 15.4s; aarch64.localdomain: [go_test:internal/reflectlite go_test:internal/singleflight go_test:internal/trace]
  2021-10-25T19:15:27Z run_tests_multi aarch64.localdomain: [go_test:internal/unsafeheader go_test:internal/xcoff go_test:io]
  2021-10-25T19:15:38Z finish_run_tests_multi after 11.7s; aarch64.localdomain: [go_test:internal/unsafeheader go_test:internal/xcoff go_test:io]
  2021-10-25T19:15:38Z run_tests_multi aarch64.localdomain: [go_test:io/fs go_test:io/ioutil go_test:log]
  2021-10-25T19:15:42Z finish_run_tests_multi after 3.83s; aarch64.localdomain: [go_test:io/fs go_test:io/ioutil go_test:log]
  2021-10-25T19:15:42Z run_tests_multi aarch64.localdomain: [go_test:log/syslog go_test:math go_test:math/big]
  2021-10-25T19:15:59Z finish_run_tests_multi after 16.6s; aarch64.localdomain: [go_test:log/syslog go_test:math go_test:math/big]
  2021-10-25T19:15:59Z run_tests_multi aarch64.localdomain: [go_test:math/bits go_test:math/cmplx go_test:math/rand]
  2021-10-25T19:16:12Z finish_run_tests_multi after 13.3s; aarch64.localdomain: [go_test:math/bits go_test:math/cmplx go_test:math/rand]
  2021-10-25T19:16:12Z run_tests_multi aarch64.localdomain: [go_test:mime go_test:mime/multipart go_test:mime/quotedprintable]
  2021-10-25T19:16:17Z finish_run_tests_multi after 4.66s; aarch64.localdomain: [go_test:mime go_test:mime/multipart go_test:mime/quotedprintable]
  2021-10-25T19:16:17Z run_tests_multi aarch64.localdomain: [go_test:net go_test:net/http go_test:net/http/cgi]
  2021-10-25T19:16:47Z still_waiting_on_test go_test:net
  2021-10-25T19:16:56Z finish_run_tests_multi after 39.2s; aarch64.localdomain: [go_test:net go_test:net/http go_test:net/http/cgi]

Naive question: on single builders without helpers, why are the tests partitioned at all? Running the whole set of tests under a single invocation of go tool dist test would reduce the overhead considerably.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Buildersx/build issues (builders, bots, dashboards)NeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions