
refactor: add rustc-perf submodule to src/tools #125166

Merged: 6 commits, May 21, 2024

Conversation

lovesegfault
Contributor

Currently, it's very challenging to perform a sandboxed opt-dist
bootstrap because the tool requires rustc-perf to be present, but
there is no proper management/tracking of it. Instead, a specific commit
is hardcoded where it is needed, and a non-checksummed zip is fetched
ad-hoc. This happens in two places:

src/ci/docker/host-x86_64/dist-x86_64-linux/Dockerfile:

ENV PERF_COMMIT 4f313add609f43e928e98132358e8426ed3969ae
RUN curl -LS -o perf.zip https://ci-mirrors.rust-lang.org/rustc/rustc-perf-$PERF_COMMIT.zip && \
    unzip perf.zip && \
    mv rustc-perf-$PERF_COMMIT rustc-perf && \
    rm perf.zip

src/tools/opt-dist/src/main.rs

// FIXME: add some mechanism for synchronization of this commit SHA with
// Linux (which builds rustc-perf in a Dockerfile)
// rustc-perf version from 2023-10-22
const PERF_COMMIT: &str = "4f313add609f43e928e98132358e8426ed3969ae";

let url = format!("https://ci-mirrors.rust-lang.org/rustc/rustc-perf-{PERF_COMMIT}.zip");
let client = reqwest::blocking::Client::builder()
    .timeout(Duration::from_secs(60 * 2))
    .connect_timeout(Duration::from_secs(60 * 2))
    .build()?;
let response = retry_action(
    || Ok(client.get(&url).send()?.error_for_status()?.bytes()?.to_vec()),
    "Download rustc-perf archive",
    5,
)?;

This causes a few issues:

  1. Maintainers need to be careful to bump PERF_COMMIT in both places
    every time
  2. In order to run opt-dist in a sandbox, you need to provide your own
    rustc-perf (feat(tools/opt-dist): allow local builds to specify a rustc-perf checkout #125125), but to
    figure out which commit to provide you need to grep the Dockerfile
  3. Even if you manage to provide the correct rustc-perf, its
    dependencies are not included in the vendor/ dir created during
    dist, so it will fail to build from the published source tarballs
  4. It is hard to provide any level of automation around updating the
    rustc-perf in use, leading to staleness
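To make issue 1 concrete, here is a minimal sketch of the kind of consistency check the `FIXME` asks for. It is not part of this PR, bootstrap, or opt-dist; the function names `dockerfile_perf_commit` and `check_pins_in_sync` are hypothetical, and the check assumes the Dockerfile keeps the `ENV PERF_COMMIT <sha>` form shown above.

```rust
/// Extracts the SHA from an `ENV PERF_COMMIT <sha>` line in Dockerfile text.
/// Hypothetical helper, written only to illustrate the duplicated pin.
fn dockerfile_perf_commit(dockerfile: &str) -> Option<&str> {
    dockerfile.lines().find_map(|line| {
        let mut words = line.split_whitespace();
        match (words.next(), words.next(), words.next()) {
            (Some("ENV"), Some("PERF_COMMIT"), Some(sha)) => Some(sha),
            _ => None,
        }
    })
}

/// Compares the Dockerfile pin with the constant hardcoded in opt-dist and
/// returns a description of the mismatch if the two have drifted apart.
fn check_pins_in_sync(dockerfile: &str, opt_dist_commit: &str) -> Result<(), String> {
    match dockerfile_perf_commit(dockerfile) {
        Some(sha) if sha == opt_dist_commit => Ok(()),
        Some(sha) => Err(format!(
            "Dockerfile pins {sha}, but opt-dist pins {opt_dist_commit}"
        )),
        None => Err("no `ENV PERF_COMMIT` line found in Dockerfile".to_string()),
    }
}
```

A single pinned submodule makes a check like this unnecessary, because there is only one place left to update.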

Fundamentally, this means rustc-src tarballs no longer contain
everything you need to bootstrap Rust, and packagers hoping to leverage
opt-dist need to go out of their way to keep track of this "hidden"
dependency on rustc-perf.

This change adds rustc-perf as a git submodule, pinned to the current
PERF_COMMIT 4f313ad. Subsequent
commits ensure the submodule is initialized when necessary, and make use
of it in opt-dist.
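As a rough illustration of what "make use of it in opt-dist" can look like, the sketch below resolves the rustc-perf directory from an optional override (in the spirit of #125125) and otherwise falls back to the submodule path. The helper names are hypothetical and this is not the code from the PR; using a top-level `Cargo.toml` as the "is the submodule checked out" probe is also an assumption.

```rust
use std::path::{Path, PathBuf};

/// Location of the submodule relative to the root of the Rust checkout.
const RUSTC_PERF_SUBMODULE: &str = "src/tools/rustc-perf";

/// Picks the rustc-perf checkout to use: an explicit override if one was
/// given, otherwise the submodule inside the Rust checkout.
/// Hypothetical helper, not the actual opt-dist implementation.
fn rustc_perf_dir(checkout_root: &Path, override_dir: Option<&Path>) -> PathBuf {
    match override_dir {
        Some(dir) => dir.to_path_buf(),
        None => checkout_root.join(RUSTC_PERF_SUBMODULE),
    }
}

/// An uninitialized submodule is an empty directory, so this sketch treats
/// the presence of a top-level `Cargo.toml` as a cheap initialization probe.
fn looks_initialized(rustc_perf: &Path) -> bool {
    rustc_perf.join("Cargo.toml").is_file()
}
```

With this shape, no network access is needed at run time: either the caller supplies a directory or the checkout already contains one.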

@rustbot
Collaborator

rustbot commented May 15, 2024

r? @Mark-Simulacrum

rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot
Collaborator

rustbot commented May 15, 2024

⚠️ Warning ⚠️

  • These commits modify submodules.

@rustbot added labels on May 15, 2024:

  • A-testsuite (Area: The testsuite used to check the correctness of rustc)
  • S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.)
  • T-bootstrap (Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap))
  • T-infra (Relevant to the infrastructure team, which will review and decide on the PR/issue.)

Contributor

@Kobzol Kobzol left a comment


This is an interesting idea! How long does it take to checkout rustc-perf and how large is it (in vendored/dist archives)? It contains a lot of code.

@@ -1032,6 +1032,10 @@ impl Step for PlainSourceTarball {
.arg(builder.src.join("./compiler/rustc_codegen_gcc/Cargo.toml"))
.arg("--sync")
.arg(builder.src.join("./src/bootstrap/Cargo.toml"))
.arg("--sync")
Contributor

I'm not sure if we should do this unconditionally, since opt-dist is only used in two CI dist jobs, most of them don't use it at all 🤔

Contributor Author

> since opt-dist is only used in two CI dist jobs, most of them don't use it at all

That's true within the Rust project, but my main goal here is to make opt-dist usable for packagers / end users who need to bootstrap Rust themselves, but still want to benefit from PGO and BOLT.

Contributor Author

> I'm not sure if we should do this unconditionally

I wish we had a "src tarball for developers", which would not include any submodules for example, and a "src tarball for distributing" which would be "complete", but that would be a pretty large yak to shave.

Member

Which "developers" do you think are using it?

Contributor Author

> Which "developers" do you think are using it?

I think few people use it today, but that's likely because of how hard it is to leverage opt-dist outside the Rust CI. I think, in the future, it would be used:

  • Directly by anyone who wants to package Rust, needs to bootstrap it themselves, and wants to build an optimized dist with PGO+BOLT. Namely, large distros and companies would benefit from this.
  • Indirectly by anyone consuming Rust from the above distributors. Namely, folks using Rust from their package manager, and people using Rust at large companies.

Contributor Author

For future reference, my argument boils down to:

premises

  • the point of the rustc-dev-src dist tarball is to allow people to bootstrap Rust
  • most people bootstrapping Rust want a production-grade toolchain as a result
  • the vanilla bootstrap is not sufficient to bootstrap a production-grade rust-toolchain
  • it is undesirable for users to be forced to reinvent the wheel and roll their own complex PGO/BOLT bootstrapping pipeline

therefore

  • it should be possible to bootstrap a production-grade Rust toolchain from the rustc-dev-src tarball using opt-dist

@lovesegfault
Contributor Author

@Kobzol

> How long does it take to checkout rustc-perf

I'd be happy to measure this, but I feel doing so on my machine will not be representative. I could use something like trickle to artificially limit bandwidth. Is there a particular bandwidth target you feel is fair?

> how large is it (in vendored/dist archives)?

That's my main concern. I submitted the PR while waiting for the build of master to finish locally to compare, here's the result:

 Size Name
415Mi rustc-1.80.0-dev-src-master.tar.gz
606Mi rustc-1.80.0-dev-src-with-rustc-perf.tar.gz
~46% larger 

231Mi rustc-1.80.0-dev-src-master.tar.xz
318Mi rustc-1.80.0-dev-src-with-rustc-perf.tar.xz
~37% larger

It's not great.
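For readers who want to check the percentages, the relative increase is `(after - before) / before`. The small sketch below recomputes it from the rounded MiB figures in the listing; the function names are illustrative only, and because the inputs are rounded, the results land close to, not exactly on, the quoted values.

```rust
/// Relative size increase of `after` over `before`, in percent.
fn percent_increase(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

/// Formats the two comparisons from the listing above, using the rounded
/// MiB sizes as inputs.
fn report() -> Vec<String> {
    let rows = [("tar.gz", 415.0, 606.0), ("tar.xz", 231.0, 318.0)];
    rows.iter()
        .map(|(name, before, after)| {
            format!("{name}: +{:.1}%", percent_increase(*before, *after))
        })
        .collect()
}
```

From the rounded sizes this gives about 46% for the gzip tarball and between 37% and 38% for the xz tarball.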

I wonder whether it'd be possible to not use all of rustc-perf to build with PGO/BOLT.

@Mark-Simulacrum
Member

The source tarball here is only really used by distros or other downstream builders, right? I feel like the size of it doesn't matter that much...

@lovesegfault lovesegfault marked this pull request as ready for review May 15, 2024 23:47
@rustbot
Collaborator

rustbot commented May 15, 2024

The list of allowed third-party dependencies may have been modified! You must ensure that any new dependencies have compatible licenses before merging.

cc @davidtwco, @wesleywiser

These commits modify the Cargo.lock file. Unintentional changes to Cargo.lock can be introduced when switching branches and rebasing PRs.

If this was unintentional then you should revert the changes before this PR is merged.
Otherwise, you can ignore this comment.

@lovesegfault lovesegfault requested a review from Kobzol May 15, 2024 23:48
@lqd
Member

lqd commented May 16, 2024

> I wonder whether it'd be possible to not use all of rustc-perf to build with PGO/BOLT.

Yes, we don’t use a lot from rustc-perf actually: not all the compile time benchmarks, the runtime benchmarks, the website or its dependencies, and basically only use the collector like a script to run rustc over the chosen benchmarks, without doing any DB/network perf collection and these dependencies.

@lovesegfault
Contributor Author

> Yes, we don’t use a lot from rustc-perf actually: not all the compile time benchmarks, the runtime benchmarks, the website or its dependencies, and basically only use the collector like a script to run rustc over the chosen benchmarks, without doing any DB/network perf collection and these dependencies.

Nice, while I don't think it should block this PR, I wonder if it would make sense to either:

  1. Split the rustc-perf repo into rustc-perf-collector, rustc-perf-corpus, and rustc-perf, and then change this logic in the future to only pick up the collector and corpus.
  2. Split the rustc-perf collector into its own repo and create a separate corpus for opt-dist

@Kobzol, thoughts? I don't know how de/coupled the collector/corpus/website are.

@Kobzol
Contributor

Kobzol commented May 16, 2024

They are very coupled, and amongst other things the website depends on code from collector. I would very much like to avoid any splitting of the repository into more repos, that would make working on it very painful (if anything, I would like to merge repos into monorepos, if possible :) ).

I guess that we could create a branch that would only contain the collector directory, but I'm not sure if it's worth it.

@Kobzol
Contributor

Kobzol commented May 16, 2024

Although it was not really an original goal of opt-dist, making it easier for distro maintainers to perform an optimized build is definitely a non-trivial benefit. If the src archives are anyway mostly only used by distro maintainers and similar downstream users, probably increasing its size is not such a big deal, as Mark said.

So I guess that it's fine to just vendor everything. What I was more concerned about was the regular development workflow for people hacking on rustc, that's why I was asking about the checkout speed (something like 50 Mbit/s might be reasonable to simulate?) But we have llvm-project as a submodule, so rustc-perf probably won't be the worst offender in this regard :)

@Nilstrieb
Member

> But we have llvm-project as a submodule,

people working on this repo often don't have that checked out though

@lovesegfault
Contributor Author

lovesegfault commented May 16, 2024

> What I was more concerned about was the regular development workflow for people hacking on rustc, that's why I was asking about the checkout speed (something like 50 Mbit/s might be reasonable to simulate?)

As-is, the submodule is only initialized/updated when either:

  1. You build opt-dist
  2. You build the dist target

I don't think developer workflows will be impacted by this change at all, unless you're specifically doing a lot of clean dist bootstraps, in which case the submodule fetch is not really going to be a significant portion of the overall build time.

I was careful to not include the rustc-perf submodule in other update_submodule calls.
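The gating described above can be sketched as follows. This is only an illustration of the idea, not bootstrap's `update_submodule`: the `Step` enum and both function names are hypothetical, and the sketch only builds the `git submodule update --init` command without running it.

```rust
use std::path::Path;
use std::process::Command;

/// Build steps distinguished in this sketch (hypothetical names).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Step {
    Compiler,
    Std,
    OptDist,
    Dist,
}

/// Only building opt-dist or the dist target needs the rustc-perf sources.
fn needs_rustc_perf(step: Step) -> bool {
    matches!(step, Step::OptDist | Step::Dist)
}

/// Returns the git command that would initialize the submodule, or `None`
/// for steps that should not pay the cost of fetching it.
fn submodule_update_command(checkout_root: &Path, step: Step) -> Option<Command> {
    if !needs_rustc_perf(step) {
        return None;
    }
    let mut cmd = Command::new("git");
    cmd.current_dir(checkout_root)
        .args(["submodule", "update", "--init", "--", "src/tools/rustc-perf"]);
    Some(cmd)
}
```

A caller would run the returned command with `status()` before building; for ordinary compiler or standard library builds nothing is fetched.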

@Kobzol
Contributor

Kobzol commented May 16, 2024

> people working on this repo often don't have that checked out though

Oh, I thought that it happens by default. Well, as @lovesegfault said, rustc-perf would only be checked out when you build opt-dist.

I'm still not sure if we need to check out the submodule and (primarily) vendor rustc-perf when we do dist for platforms other than Linux x86 (well, and technically Windows x64, but I kind of doubt that any "Windows x64 distro maintainers" need PGO/BOLT builds :) ) though. On other platforms, opt-dist is not really applicable at the moment.

@lovesegfault
Contributor Author

> On other platforms, opt-dist is not really applicable at the moment.

For what it's worth, you can leverage opt-dist to bootstrap Rust with PGO on aarch64-unknown-linux-gnu today, and it works, though you're right the project isn't doing this yet.

BOLT, on the other hand, really only works on x86_64 for now.

@lovesegfault
Contributor Author

Gathered data on cloning rustc-perf:

$ hyperfine --warmup 2 --prepare 'rm -rf ./rustc-perf' 'trickle -s -d 50000 git clone https://github.com/rust-lang/rustc-perf'
Benchmark 1: trickle -s -d 50000 git clone https://github.com/rust-lang/rustc-perf
  Time (mean ± σ):      3.027 s ±  0.029 s    [User: 3.181 s, System: 2.307 s]
  Range (min … max):    2.988 s …  3.079 s    10 runs

$ hyperfine --warmup 2 --prepare 'rm -rf ./rustc-perf' 'trickle -s -d 25000 git clone https://github.com/rust-lang/rustc-perf'
Benchmark 1: trickle -s -d 25000 git clone https://github.com/rust-lang/rustc-perf
  Time (mean ± σ):      4.341 s ±  0.022 s    [User: 3.668 s, System: 3.521 s]
  Range (min … max):    4.312 s …  4.372 s    10 runs

$ hyperfine --warmup 2 --prepare 'rm -rf ./rustc-perf' 'trickle -s -d 10000 git clone https://github.com/rust-lang/rustc-perf'
Benchmark 1: trickle -s -d 10000 git clone https://github.com/rust-lang/rustc-perf
  Time (mean ± σ):      8.535 s ±  0.037 s    [User: 4.533 s, System: 6.928 s]
  Range (min … max):    8.485 s …  8.604 s    10 runs
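A note for reading these numbers: trickle's manual page documents the `-d` rate in KB/s. Assuming that reading, and taking 1 KB as 1000 bytes, the conversion below suggests the three runs were capped at roughly 400, 200, and 80 Mbit/s, which is well above the 50 Mbit/s target mentioned earlier. The function name is illustrative.

```rust
/// Converts a rate in kilobytes per second to megabits per second,
/// assuming 1 KB = 1000 bytes (with 1 KB = 1024 bytes the result is
/// about 2.4% higher).
fn kb_per_s_to_mbit_per_s(kb_per_s: f64) -> f64 {
    kb_per_s * 8.0 / 1000.0
}
```

Under the same assumption, a 50 Mbit/s cap would correspond to `-d 6250`.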

@lqd
Member

lqd commented May 16, 2024

Another datapoint: a computer I was using to analyze a windows-gnu bug takes 40s to do that git clone 😄

@Kobzol
Contributor

Kobzol commented May 16, 2024

In the grand scheme of things, cloning the submodule will probably be fine, since it's optional. I'm generally in favour of this change, it resolves a minor maintenance annoyance (keeping the rustc-perf version in sync), and could help more distro maintainers get perf. benefits for their Rust toolchain.

Let's see if it works in the current state.

@bors try @rust-timer queue


@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 16, 2024
@bors
Contributor

bors commented May 16, 2024

⌛ Trying commit 83a847f with merge bf9ad42...

("fluent-langneg", "Apache-2.0"), // rustc (fluent translations)
("fortanix-sgx-abi", "MPL-2.0"), // libstd but only for `sgx` target. FIXME: this dependency violates the documentation comment above.
("instant", "BSD-3-Clause"), // rustc_driver/tracing-subscriber/parking_lot
("mdbook", "MPL-2.0"), // mdbook
("openssl", "Apache-2.0"), // opt-dist
Member

Why are we no longer tracking the deps for opt-dist? I see that there's a bunch of dependencies that disappeared from our lockfile as well -- where are those being locked? I don't see a new Cargo.lock added here...

Contributor Author

We were able to remove those dependencies from opt-dist, since it no longer needs to fetch anything over the network.

I removed those lines because, IIRC, tidy complained that they were no longer reflected in the Cargo.lock once I removed the deps from opt-dist, which made sense.

Member

Ah, perfect. I missed that part of the diff in my skim I guess :)

Cutting some dependencies is great.

@bors
Contributor

bors commented May 20, 2024

☔ The latest upstream changes (presumably #125313) made this pull request unmergeable. Please resolve the merge conflicts.

This avoids having normal builds pay the cost of initializing that
submodule, while still ensuring it's available whenever `opt-dist` is
built.

Note that, at this point, `opt-dist` will not yet use the submodule;
that will be handled in a subsequent commit.

This replaces the hardcoded rustc-perf commit and ad-hoc downloading and
unpacking of its zipped source with defaulting to use the new rustc-perf
submodule.

While it would be nice to make `opt-dist` able to initialize the
submodule automatically when pointing to a Rust checkout _other_ than
the one opt-dist was built in, that would require a bigger refactor that
moved `update_submodule`, from bootstrap, into build_helper.

Regardless, I imagine it must be quite rare to use `opt-dist` with a
checkout that is neither from a rust-src tarball (which will contain the
submodule), nor the checkout opt-dist itself was built in (bootstrap will
update the submodule when opt-dist is built).

It is now available as a submodule in src/tools/rustc-perf, and is
initialized when building opt-dist.
@rustbot rustbot added the A-meta Area: Issues about the rust-lang/rust repository. label May 20, 2024
@Mark-Simulacrum
Member

@bors r+

@bors
Contributor

bors commented May 20, 2024

📌 Commit e253718 has been approved by Mark-Simulacrum

It is now in the queue for this repository.

@bors added S-waiting-on-bors (Status: Waiting on bors to run and complete tests. Bors will change the label on completion.) and removed S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.) on May 20, 2024
@bors
Contributor

bors commented May 20, 2024

⌛ Testing commit e253718 with merge 60faa27...

@bors
Contributor

bors commented May 21, 2024

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing 60faa27 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label May 21, 2024
@bors bors merged commit 60faa27 into rust-lang:master May 21, 2024
7 checks passed
@rustbot rustbot added this to the 1.80.0 milestone May 21, 2024
@rust-timer
Collaborator

Finished benchmarking commit (60faa27): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (primary -0.6%, secondary -2.4%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.5% | [2.5%, 2.5%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -2.1% | [-2.3%, -1.9%] | 2 |
| Improvements ✅ (secondary) | -2.4% | [-2.4%, -2.4%] | 1 |
| All ❌✅ (primary) | -0.6% | [-2.3%, 2.5%] | 3 |

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 670.311s -> 675.351s (0.75%)
Artifact size: 315.45 MiB -> 315.41 MiB (-0.01%)

tshepang added a commit to ferrocene/ferrocene that referenced this pull request May 21, 2024
An upstream change resulted in a conflict,
due to rust-lang/rust#125166.
@lovesegfault lovesegfault deleted the embed-rustc-perf branch May 21, 2024 13:35
bors-ferrocene bot added a commit to ferrocene/ferrocene that referenced this pull request May 21, 2024
639: move submodule entry to a place where it will not conflict r=pietroalbini a=tshepang

An upstream change resulted in a conflict,
due to rust-lang/rust#125166.

Co-authored-by: Tshepang Mbambo <tshepang.mbambo@ferrous-systems.com>
weihanglo added a commit to weihanglo/rust that referenced this pull request May 23, 2024
These crates are the default set of packages that opt-dist requires to
work correctly, so anyone wanting to build a production-grade rustc in a
sandboxed / air-gapped environment needs them vendored.

The size of `rustc-src-nightly.tar.xz` before and after this change:

* Before: 298M
* After: 323M (+8%)

Size change might or might not be a concern.
See the previous discussion: rust-lang#125166 (comment)

Previous efforts on making:

* rust-lang#125125
* rust-lang#125166

---

Note that extra work still needs to be done to make it fully vendored.

* The currently pinned rustc-perf uses `tempfile::TempDir` as the working
  directory when collecting profiles from some of these packages.
  This "tmp" working directory makes it impossible for Cargo to pick
  up the correct vendor sources setting in the `.cargo/config.toml` bundled
  in the rustc-src tarball. [^1]
* opt-dist verifies the final built rustc against a subset of the rustc test
  suite. However, it rolls out its own `config.toml` without setting
  `vendor = true`, which results in the `./vendor/` directory being removed.
  [^2]
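
The first point follows from how Cargo discovers configuration: it looks for `.cargo/config.toml` in the current directory and each of its parents. The sketch below is a simplified model of that walk (it ignores `$CARGO_HOME` and other details, and the function name and example paths are illustrative) to show why a working directory under a system temp location never sees the config bundled in the source tree.

```rust
use std::path::{Path, PathBuf};

/// Walks from `cwd` up through its ancestors and returns the first
/// `.cargo/config.toml` that exists. The `exists` predicate is passed in so
/// the sketch can be exercised without touching the real filesystem; in real
/// use it could be `|p| p.is_file()`.
fn find_cargo_config(cwd: &Path, exists: impl Fn(&Path) -> bool) -> Option<PathBuf> {
    cwd.ancestors()
        .map(|dir| dir.join(".cargo").join("config.toml"))
        .find(|candidate| exists(candidate.as_path()))
}
```

A benchmark copied into a temp directory has no ancestor inside the unpacked tarball, so the walk never reaches the vendoring configuration.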

[^1]: https://github.com/rust-lang/rustc-perf/blob/4f313add609f43e928e98132358e8426ed3969ae/collector/src/compile/benchmark/mod.rs#L164-L173
[^2]: https://github.com/rust-lang/rust/blob/606afbb617a2949a4e35c4b0258ff94c980b9451/src/tools/opt-dist/src/tests.rs#L62-L77
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request May 24, 2024
bootstrap: vendor crates required by opt-dist to collect profiles

weihanglo added a commit to weihanglo/rust that referenced this pull request May 31, 2024