Optimize CI #1310
Labels
experience-medium
This issue is of medium difficulty, and requires some experience
google-20%-project
Potential 20% project for a Google employee
help wanted
Extra attention is needed
As of this writing, our CI tests (specified in .github/workflows/ci.yml) take ~5.5m to run end-to-end during PR development and ~19.5m to run end-to-end in the merge queue. This significantly affects developer velocity, especially when developing a sequence of features which stack (i.e., one PR needs to land before the next PR can be seriously considered).
This task tracks optimizing our end-to-end CI latency. Anything is on the table!
Note that both the PR latency and the merge queue latency are on the table. The PR latency is obviously the more important metric, since PR tests may run multiple times during PR development. However, given that GitHub has no automated way to merge a stack of PRs, we often have to actively keep an eye on the merge queue in order to know when we can kick off the next PR's merge. For this reason, merge queue latency is important as well.
Advice
As of this writing, we skip 5 out of 7 build targets and all Miri tests during PR development. Thus, the merge queue CI tests have somewhat different performance characteristics than PR CI tests.
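To make the difference between the two configurations concrete, here is a minimal Python sketch of how a test matrix might be pruned for PR runs. The target names, the `Job` type, and the choice of which two targets stay enabled are placeholders for illustration, not the contents of ci.yml; only the event names `pull_request` and `merge_group` come from GitHub Actions itself.

```python
from dataclasses import dataclass
from itertools import product

# Placeholder names: seven hypothetical targets, not the real list in ci.yml.
ALL_TARGETS = [f"target-{i}" for i in range(1, 8)]
# Assumption for illustration: PR runs keep only the first two targets.
PR_TARGETS = set(ALL_TARGETS[:2])


@dataclass(frozen=True)
class Job:
    target: str
    miri: bool  # whether this job runs the tests under Miri


def full_matrix() -> list[Job]:
    """Every (target, miri) combination, as the merge queue would run it."""
    return [Job(t, m) for t, m in product(ALL_TARGETS, (False, True))]


def jobs_for_event(event: str) -> list[Job]:
    """Select the jobs for a CI run triggered by the given event name."""
    jobs = full_matrix()
    if event == "pull_request":
        # PR runs drop 5 of the 7 targets and every Miri job.
        jobs = [j for j in jobs if j.target in PR_TARGETS and not j.miri]
    return jobs
```

Under these assumptions the merge queue runs the full matrix while a PR run keeps only the non-Miri jobs for two targets, which is why the two latencies need to be profiled separately.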
In my own investigations, I've discovered the following:
The bottleneck appears to be the build_test job, which encompasses the primary test matrix (there are other ancillary jobs such as kani, check_fmt, etc.; these do not appear to be the bottleneck).
We've already done some work to speed up Miri test execution (recently, #1307, #1308, and #1313). There is probably a lot more that could be done there.
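One way to check which job dominates is to compare per-job wall-clock times: GitHub Actions runs the jobs of a workflow in parallel unless they declare dependencies, so end-to-end latency is bounded below by the slowest job. The following is a minimal sketch, assuming the durations have been collected by hand into a dict. The `bottleneck` helper is hypothetical, and the numbers in the usage example are made-up placeholders, not measurements from this repository.

```python
def bottleneck(durations_s: dict[str, float]) -> tuple[str, float]:
    """Return the slowest job and the seconds that would be saved if it
    ran no longer than the second-slowest job."""
    if not durations_s:
        raise ValueError("no job durations given")
    # Sort jobs from slowest to fastest.
    ranked = sorted(durations_s.items(), key=lambda kv: kv[1], reverse=True)
    name, slowest = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    return name, slowest - runner_up


# Placeholder durations in seconds, NOT real measurements.
example = {"build_test": 300.0, "kani": 120.0, "check_fmt": 20.0}
print(bottleneck(example))  # prints ('build_test', 180.0)
```

The second value is the most that optimizing the slowest job alone can save; beyond that point the next-slowest job becomes the limit.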
There are probably also a lot of other optimization opportunities besides Miri; I just haven't taken the time to investigate in detail.
See also: #1312, #1314
Failed attempts
I tried these, but found no speedup, or wasn't able to get them working: