Commit 6700ed7

various improvements resulting from reading Testing with CI
1 parent 22e1486 commit 6700ed7

2 files changed

+65
-65
lines changed

src/tests/ci.md

Lines changed: 53 additions & 53 deletions
Original file line number | Diff line number | Diff line change
@@ -7,12 +7,12 @@ From a high-level point of view, when you open a pull request at
77
`rust-lang/rust`, the following will happen:
88

99
- A small [subset](#pull-request-builds) of tests and checks are run after each
10-
push to the PR. This should help catching common errors.
10+
push to the PR. This should help catch common errors.
1111
- When the PR is approved, the [bors] bot enqueues the PR into a [merge queue].
1212
- Once the PR gets to the front of the queue, bors will create a merge commit
1313
and run the [full test suite](#auto-builds) on it. The merge commit either
1414
contains only one specific PR or it can be a ["rollup"](#rollups) which
15-
combines multiple PRs together, to save CI costs.
15+
combines multiple PRs together, to reduce CI costs and merge delays.
1616
- Once the whole test suite finishes, two things can happen. Either CI fails
1717
with an error that needs to be addressed by the developer, or CI succeeds and
1818
the merge commit is then pushed to the `master` branch.
@@ -38,12 +38,12 @@ input, which contains a declarative configuration of all our CI jobs.
3838
> orchestrating the scripts that drive the process.
3939
4040
In essence, all CI jobs run `./x test`, `./x dist` or some other command with
41-
different configurations, across various operating systems, targets and
41+
different configurations, across various operating systems, targets, and
4242
platforms. There are two broad categories of jobs that are executed, `dist` and
4343
non-`dist` jobs.
4444

4545
- Dist jobs build a full release of the compiler for a specific platform,
46-
including all the tools we ship through rustup; Those builds are then uploaded
46+
including all the tools we ship through rustup. Those builds are then uploaded
4747
to the `rust-lang-ci2` S3 bucket and are available to be locally installed
4848
with the [rustup-toolchain-install-master] tool. The same builds are also used
4949
for actual releases: our release process basically consists of copying those
@@ -70,7 +70,7 @@ these execute the `x86_64-gnu-llvm-X`, `x86_64-gnu-tools`, `pr-check-1`, `pr-che
7070
and `tidy` jobs, all running on Linux. These execute a relatively short
7171
(~40 minutes) and lightweight test suite that should catch common issues. More
7272
specifically, they run a set of lints, they try to perform a cross-compile check
73-
build to Windows mingw (without producing any artifacts) and they test the
73+
build to Windows mingw (without producing any artifacts), and they test the
7474
compiler using a *system* version of LLVM. Unfortunately, it would take too many
7575
resources to run the full test suite for each commit on every PR.
7676

@@ -95,17 +95,16 @@ jobs that exercise various tests across operating systems and targets. The full
9595
test suite is quite slow; it can take several hours until all the `auto` CI
9696
jobs finish.
9797

98-
Most platforms only run the build steps, some run a restricted set of tests,
98+
Most platforms only run the build steps, some run a restricted set of tests;
9999
only a subset run the full suite of tests (see Rust's [platform tiers]).
100100

101101
Auto jobs are defined in the `auto` section of [`jobs.yml`]. They are executed
102-
on the `auto` branch under the `rust-lang/rust` repository and
103-
their results can be seen [here](https://github.com/rust-lang/rust/actions),
104-
although usually you will be notified of the result by a comment made by bors on
105-
the corresponding PR.
102+
on the `auto` branch under the `rust-lang/rust` repository,
103+
and the final result will be reported via a comment made by bors on the corresponding PR.
104+
The live results can be seen on [the GitHub Actions workflows page].
106105

107106
At any given time, at most a single `auto` build is being executed. Find out
108-
more [here](#merging-prs-serially-with-bors).
107+
more in [Merging PRs serially with bors](#merging-prs-serially-with-bors).
109108

110109
[platform tiers]: https://forge.rust-lang.org/release/platform-support.html#rust-platform-support
111110

@@ -125,7 +124,7 @@ There are several use-cases for try builds:
125124
when you start a try build). To create a try build and schedule it for a
126125
performance benchmark, you can use the `@bors try @rust-timer queue` command
127126
combination.
128-
- Check the impact of the PR across the Rust ecosystem, using a [crater] run.
127+
- Check the impact of the PR across the Rust ecosystem, using a [Crater](crater.md) run.
129128
Again, a working compiler build is needed for this, which can be produced by
130129
the [dist-x86_64-linux] CI job.
131130
- Run a specific CI job (e.g. Windows tests) on a PR, to quickly test if it
@@ -134,11 +133,11 @@ There are several use-cases for try builds:
134133
By default, if you send a comment with `@bors try`, the jobs defined in the `try` section of
135134
[`jobs.yml`] will be executed. We call this mode a "fast try build". Such a try build
136135
will not execute any tests, and it will allow compilation warnings. It is useful when you want to
137-
get an optimized toolchain as fast as possible, for a crater run or performance benchmarks,
136+
get an optimized toolchain as fast as possible, for a Crater run or performance benchmarks,
138137
even if it might not be working fully correctly. If you want to do a full build for the default try job,
139138
specify its job name in a job pattern (explained below).
140139

141-
If you want to run custom CI job(s) in a try build and make sure that they pass all tests and do
140+
If you want to run custom CI jobs in a try build and make sure that they pass all tests and do
142141
not produce any compilation warnings, you can select CI jobs to be executed by specifying a *job pattern*,
143142
which can be used in one of two ways:
144143
- You can add a set of `try-job: <job pattern>` directives to the PR description (described below) and then
@@ -151,7 +150,7 @@ which can be used in one of two ways:
151150

152151
Each job pattern can either be an exact name of a job or a glob pattern that matches multiple jobs,
153152
for example `*msvc*` or `*-alt`. You can start at most 20 jobs in a single try build. When using
154-
glob patterns in the PR description, you can (but do not have to) wrap them in backticks (`` ` ``) to avoid GitHub rendering
153+
glob patterns in the PR description, you can optionally wrap them in backticks (`` ` ``) to avoid GitHub rendering
155154
the pattern as Markdown if it contains e.g. an asterisk. Note that this escaping will not work when using
156155
the `@bors jobs=` parameter.
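
As a rough illustration of how a job pattern selects jobs, the sketch below matches glob patterns against a list of job names and enforces the 20-job limit mentioned above. The `select_jobs` helper, the example job names, and the use of Python's `fnmatch` glob semantics are assumptions made for this example; bors may match patterns differently.

```python
from fnmatch import fnmatchcase

MAX_TRY_JOBS = 20  # limit on jobs per try build, as stated in the text


def select_jobs(patterns, available_jobs):
    """Return the jobs matched by at least one pattern, in their original order.

    Hypothetical helper for illustration; not part of bors or citool.
    """
    selected = [
        job
        for job in available_jobs
        if any(fnmatchcase(job, pattern) for pattern in patterns)
    ]
    if len(selected) > MAX_TRY_JOBS:
        raise ValueError(
            f"patterns matched {len(selected)} jobs, limit is {MAX_TRY_JOBS}"
        )
    return selected


# Illustrative job names only; the real list lives in jobs.yml.
example_jobs = [
    "x86_64-msvc-1",
    "i686-msvc-1",
    "x86_64-gnu-alt",
    "dist-x86_64-linux",
]

print(select_jobs(["*msvc*", "*-alt"], example_jobs))
```

An exact job name is simply a pattern with no wildcard, so both forms go through the same matching step.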
157156

@@ -190,18 +189,17 @@ of [`jobs.yml`]:
190189
> that are exercised this way.
191190
192191
Try builds are executed on the `try` branch under the `rust-lang/rust` repository and
193-
their results can be seen [here](https://github.com/rust-lang/rust/actions),
192+
their results can be seen on [the GitHub Actions workflows page],
194193
although usually you will be notified of the result by a comment made by bors on
195194
the corresponding PR.
196195
197196
Multiple try builds can execute concurrently across different PRs, but there can be at most
198197
a single try build running on a single PR at any given time.
199198
200-
Note that try builds are handled using the new [bors][new-bors] implementation.
199+
Note that try builds are handled using the [new bors] implementation.
201200
202201
[rustc-perf]: https://github.com/rust-lang/rustc-perf
203-
[crater]: https://github.com/rust-lang/crater
204-
[new-bors]: https://github.com/rust-lang/bors
202+
[new bors]: https://github.com/rust-lang/bors
205203
206204
### Modifying CI jobs
207205
@@ -211,8 +209,7 @@ If you want to modify what gets executed on our CI, you can simply modify the
211209
You can also modify what gets executed temporarily, for example to test a
212210
particular platform or configuration that is challenging to test locally (for
213211
example, if a Windows build fails, but you don't have access to a Windows
214-
machine). Don't hesitate to use CI resources in such situations to try out a
215-
fix!
212+
machine). Don't hesitate to use CI resources in such situations.
216213
217214
You can perform an arbitrary CI job in two ways:
218215
- Use the [try build](#try-builds) functionality, and specify the CI jobs that
@@ -255,8 +252,8 @@ purposes.
255252
</div>
256253

257254
Although you are welcome to use CI, just be conscious that this is a shared
258-
resource with limited concurrency. Try not to enable too many jobs at once (one
259-
or two should be sufficient in most cases).
255+
resource with limited concurrency. Try not to enable too many jobs at once;
256+
one or two should be sufficient in most cases.
260257

261258
## Merging PRs serially with bors
262259

@@ -280,12 +277,12 @@ by listening for either Commit Statuses or Check Runs. Since the merge commit is
280277
based on the latest `master` and only one can be tested at the same time, when
281278
the results are green, `master` is fast-forwarded to that merge commit.
282279

283-
Unfortunately testing a single PR at the time, combined with our long CI (~2
284-
hours for a full run), means we can’t merge too many PRs in a single day, and a
285-
single failure greatly impacts our throughput for the day. The maximum number of
280+
Unfortunately, testing a single PR at a time, combined with our long CI (~2
281+
hours for a full run), means we can’t merge a lot of PRs in a single day, and a
282+
single failure greatly impacts our throughput. The maximum number of
286283
PRs we can merge in a day is around 10.
287284

288-
The large CI run times and requirement for a large builder pool is largely due
285+
The long CI run times and the requirement for a large builder pool are largely due
289286
to the fact that full release artifacts are built in the `dist-` builders. This
290287
is worth it because these release artifacts:
291288

@@ -298,12 +295,11 @@ is worth it because these release artifacts:
298295

299296
Some PRs don’t need the full test suite to be executed: trivial changes like
300297
typo fixes or README improvements *shouldn’t* break the build, and testing every
301-
single one of them for 2+ hours is a big waste of time. To solve this, we
298+
single one of them for 2+ hours would be wasteful. To solve this, we
302299
regularly create a "rollup", a PR where we merge several pending trivial PRs so
303300
they can be tested together. Rollups are created manually by a team member using
304301
the "create a rollup" button on the [merge queue]. The team member uses their
305-
judgment to decide if a PR is risky or not, and are the best tool we have at the
306-
moment to keep the queue in a manageable state.
302+
judgment to decide if a PR is risky or not.
307303

308304
## Docker
309305

@@ -316,18 +312,22 @@ platform’s custom [Docker container]. This has a lot of advantages for us:
316312
- We can use ancient build environments to ensure maximum binary compatibility,
317313
for example [using older CentOS releases][dist-x86_64-linux] on our Linux
318314
builders.
319-
- We can avoid reinstalling tools (like QEMU or the Android emulator) every time
315+
- We can avoid reinstalling tools (like QEMU or the Android emulator) every time,
320316
thanks to Docker image caching.
321-
- Users can run the same tests in the same environment locally by just running
322-
`cargo run --manifest-path src/ci/citool/Cargo.toml run-local <job-name>`, which is awesome to debug failures. Note that there are only linux docker images available locally due to licensing and
317+
- Users can run the same tests in the same environment locally by just running this command:
318+
319+
cargo run --manifest-path src/ci/citool/Cargo.toml run-local <job-name>
320+
321+
This is helpful for debugging failures.
322+
Note that there are only Linux Docker images available locally due to licensing and
323323
other restrictions.
324324

325-
The docker images prefixed with `dist-` are used for building artifacts while
325+
The Docker images prefixed with `dist-` are used for building artifacts while
326326
those without that prefix run tests and checks.
327327

328328
We also run tests for less common architectures (mainly Tier 2 and Tier 3
329-
platforms) in CI. Since those platforms are not x86 we either run everything
330-
inside QEMU or just cross-compile if we don’t want to run the tests for that
329+
platforms) in CI. Since those platforms are not x86, we either run everything
330+
inside QEMU, or we just cross-compile if we don’t want to run the tests for that
331331
platform.
332332

333333
These builders are running on a special pool of builders set up and maintained
@@ -364,41 +364,41 @@ invalidated if one of the following changes:
364364
[ghcr.io]: https://github.com/rust-lang/rust/pkgs/container/rust-ci
365365
[Docker registry caching]: https://docs.docker.com/build/cache/backends/registry/
366366

367-
### LLVM caching with sccache
367+
### LLVM caching with Sccache
368368

369-
We build some C/C++ stuff in various CI jobs, and we rely on [sccache] to cache
369+
We build some C/C++ stuff in various CI jobs, and we rely on [Sccache] to cache
370370
the intermediate LLVM artifacts. Sccache is a distributed ccache developed by
371371
Mozilla, which can use an object storage bucket as the storage backend.
372372

373-
With sccache there's no need to calculate the hash key ourselves. Sccache
373+
With Sccache there's no need to calculate the hash key ourselves. Sccache
374374
invalidates the cache automatically when it detects changes to relevant inputs,
375375
such as the source code, the version of the compiler, and important environment
376376
variables.
377-
So we just pass the sccache wrapper on top of cargo and sccache does the rest.
377+
So we just pass the Sccache wrapper on top of Cargo and Sccache does the rest.
378378

379-
We store the persistent artifacts on the S3 bucket `rust-lang-ci-sccache2`. So
380-
when the CI runs, if sccache sees that LLVM is being compiled with the same C/C++
381-
compiler and the LLVM source code is the same, sccache retrieves the individual
379+
We store the persistent artifacts on the S3 bucket, `rust-lang-ci-sccache2`. So
380+
when the CI runs, if Sccache sees that LLVM is being compiled with the same C/C++
381+
compiler and the LLVM source code is the same, Sccache retrieves the individual
382382
compiled translation units from S3.
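
To make the invalidation idea concrete, here is a minimal sketch of a content-derived cache key: any change to the source, the compiler version, or the selected environment variables produces a different key, so a stale entry is never looked up. This only illustrates the general technique; the `cache_key` function and its inputs are hypothetical and do not reflect how Sccache actually computes its keys.

```python
import hashlib


def cache_key(source: bytes, compiler_version: str, env: dict[str, str]) -> str:
    """Derive a cache key from the inputs that affect the compiled output.

    Hypothetical helper for illustration; not Sccache's real algorithm.
    """
    digest = hashlib.sha256()
    digest.update(source)
    digest.update(b"\0")
    digest.update(compiler_version.encode())
    # Sort the variables so the key does not depend on dictionary order.
    for name in sorted(env):
        digest.update(b"\0" + name.encode() + b"=" + env[name].encode())
    return digest.hexdigest()


# Same inputs give the same key; a different compiler version does not.
same = cache_key(b"int main() {}", "clang 17.0.0", {"CFLAGS": "-O2"})
other = cache_key(b"int main() {}", "clang 18.0.0", {"CFLAGS": "-O2"})
print(same == other)
```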
383383

384384
[sccache]: https://github.com/mozilla/sccache
385385

386386
## Custom tooling around CI
387387

388-
During the years we developed some custom tooling to improve our CI experience.
388+
Over the years, we have developed some custom tooling to improve our CI experience.
389389

390390
### Rust Log Analyzer to show the error message in PRs
391391

392392
The build logs for `rust-lang/rust` are huge, and it’s not practical to find
393-
what caused the build to fail by looking at the logs. To improve the developers’
394-
experience we developed a bot called [Rust Log Analyzer][rla] (RLA) that
395-
receives the build logs on failure and extracts the error message automatically,
396-
posting it on the PR.
393+
what caused the build to fail by looking at the logs.
394+
We therefore developed a bot called [Rust Log Analyzer][rla] (RLA) that
395+
receives the build logs on failure, and extracts the error message automatically,
396+
posting it on the PR thread.
397397

398398
The bot is not hardcoded to look for error strings, but was trained with a bunch
399399
of build failures to recognize which lines are common between builds and which
400400
are not. While the generated snippets can be weird sometimes, the bot is pretty
401-
good at identifying the relevant lines even if it’s an error we've never seen
401+
good at identifying the relevant lines, even if it’s an error we've never seen
402402
before.
403403

404404
[rla]: https://github.com/rust-lang/rust-log-analyzer
@@ -430,11 +430,11 @@ More information is available in the [toolstate documentation].
430430

431431
## Public CI dashboard
432432

433-
To monitor the Rust CI, you can have a look at the [public dashboard] maintained by the infra-team.
433+
To monitor the Rust CI, you can have a look at the [public dashboard] maintained by the infra team.
434434

435435
These are some useful panels from the dashboard:
436436

437-
- Pipeline duration: check how long the auto builds takes to run.
437+
- Pipeline duration: check how long the auto builds take to run.
438438
- Top slowest jobs: check which jobs are taking the longest to run.
439439
- Change in median job duration: check which jobs are slower than before. Useful
440440
to detect regressions.
@@ -457,8 +457,7 @@ this:
457457
2. Choose the job you are interested in on the left-hand side.
458458
3. Click on the gear icon and choose "View raw logs"
459459
4. Search for the string "Configure the build"
460-
5. All of the build settings are listed below that starting with the
461-
`configure:` prefix.
460+
5. All of the build settings are listed on the line containing the text `build.configure-args`.
462461

463462
[GitHub Actions]: https://github.com/rust-lang/rust/actions
464463
[`jobs.yml`]: https://github.com/rust-lang/rust/blob/master/src/ci/github-actions/jobs.yml
@@ -468,3 +467,4 @@ this:
468467
[homu]: https://github.com/rust-lang/homu
469468
[merge queue]: https://bors.rust-lang.org/queue/rust
470469
[dist-x86_64-linux]: https://github.com/rust-lang/rust/blob/master/src/ci/docker/host-x86_64/dist-x86_64-linux/Dockerfile
470+
[the GitHub Actions workflows page]: https://github.com/rust-lang/rust/actions

src/tests/crater.md

Lines changed: 12 additions & 12 deletions
@@ -8,30 +8,30 @@ stable compiler versions.
88

99
## When to run Crater
1010

11-
You should request a crater run if your PR makes large changes to the compiler
11+
You should request a Crater run if your PR makes large changes to the compiler
1212
or could cause breakage. If you are unsure, feel free to ask your PR's reviewer.
1313

1414
## Requesting Crater Runs
1515

16-
The rust team maintains a few machines that can be used for running crater runs
17-
on the changes introduced by a PR. If your PR needs a crater run, leave a
16+
The Rust team maintains a few machines that can be used for Crater runs
17+
on the changes introduced by a PR. If your PR needs a Crater run, leave a
1818
comment for the triage team in the PR thread. Please inform the team whether you
19-
require a "check-only" crater run, a "build only" crater run, or a
20-
"build-and-test" crater run. The difference is primarily in time; the
21-
conservative (if you're not sure) option is to go for the build-and-test run. If
19+
require a "check-only" Crater run, a "build-only" Crater run, or a
20+
"build-and-test" Crater run. The difference is primarily in time;
21+
if you're not sure, go for the build-and-test run. If
2222
making changes that will only have an effect at compile-time (e.g., implementing
23-
a new trait) then you only need a check run.
23+
a new trait), then you only need a check run.
2424

2525
Your PR will be enqueued by the triage team and the results will be posted when
26-
they are ready. Check runs will take around ~3-4 days, with the other two taking
26+
they are ready. Check runs will take around 3-4 days, with the other two taking
2727
5-6 days on average.
2828

29-
While crater is really useful, it is also important to be aware of a few
29+
While Crater is really useful, it is also important to be aware of a few
3030
caveats:
3131

3232
- Not all code is on crates.io! There is a lot of code in repos on GitHub and
3333
elsewhere. Also, companies may not wish to publish their code. Thus, a
34-
successful crater run is not a magically green light that there will be no
34+
successful Crater run does not mean there will be no
3535
breakage; you still need to be careful.
3636

3737
- Crater only runs Linux builds on x86_64. Thus, other architectures and
@@ -41,5 +41,5 @@ caveats:
4141
the crate doesn't compile any more (e.g. used old nightly features), has
4242
broken or flaky tests, requires network access, or other reasons.
4343

44-
- Before crater can be run, `@bors try` needs to succeed in building artifacts.
45-
This means that if your code doesn't compile, you cannot run crater.
44+
- Before Crater can be run, `@bors try` needs to succeed in building artifacts.
45+
This means that if your code doesn't compile, you cannot run Crater.
