Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Basic standard library support. #7216

Merged
merged 9 commits into from Sep 3, 2019

Conversation

@ehuss
Copy link
Contributor

commented Aug 5, 2019

This is not intended to be useful to anyone. If people want to try it, that's great, but do not rely on this. This is only for experimenting and setting up for future work.

This adds a flag -Zbuild-std to build the standard library with a project. The flag can also take optional comma-separated crate names, like -Zbuild-std=core. Default is std,core,panic_unwind,compiler_builtins.

Closes rust-lang/wg-cargo-std-aware#10.

Note: I can probably break some of the refactoring into smaller PRs if necessary.

Overview

The general concept here is to use two resolvers, and to combine everything in the Unit graph. There are a number of changes to support this:

  • A synthetic workspace for the standard library is created to set up the patches and members correctly.
  • Decouple unit_dependencies from Context to make it easier to manage.
  • Add features to Unit to keep it unique and to remove the need to query a resolver.
  • Add a UnitDep struct which encodes the edges between Units. This removes the need to query a resolver for extern_crate_name and public.
  • Remove Resolver from BuildContext to avoid any confusion and to keep the complexity focused in unit_dependencies.
  • Remove Links from Context since it used the resolver. Adjusted so that instead of checking links at runtime, they are all checked at once in the beginning. Note that it does not check links for the standard lib, but it should be safe? I think compiler-rt is the only links?

I currently went with a strategy of linking the standard library dependencies using --extern (instead of --sysroot or -L). This has some benefits but some significant drawbacks. See below for some questions.

For future PRs

  • Add Cargo.toml support. See rust-lang/wg-cargo-std-aware#5
  • Source is not downloaded. It assumes you have run rustup component add rust-src. See rust-lang/wg-cargo-std-aware#11
  • cargo metadata does not include any information about std. I don't know how this should work.
  • cargo clean is not std-aware.
  • cargo fetch does not fetch std dependencies.
  • cargo vendor does not vendor std dependencies.
  • cargo pkgid is not std-aware.
  • --target is required on the command-line. This should default to host-as-target.
  • -p is not std aware.
  • A synthetic Cargo.toml workspace is created which has to know about things like rustc-std-workspace-core. Perhaps rust-lang/rust should publish the source with this Cargo.toml already created?
  • compiler_builtins uses default features (pure Rust implementation, etc.). See rust-lang/wg-cargo-std-aware#15
    • compiler_builtins may need to be built without debug assertions, see this. Could maybe use profile overrides.
  • Panic issues:
    • panic_abort is not yet supported, though it should probably be easy. It could maybe look at the profile to determine which panic implementation to use? This requires more hard-coding in Cargo to know about rustc implementation details.
    • This should probably be handled where panic is set for panic_abort and compiler_builtins. I would like to get a test case for it. This can maybe be done with profile overrides?
  • Using two resolvers is quite messy and causes a lot of complications. It would be ideal if it could only use one, though that may not be possible for the foreseeable future. See rust-lang/wg-cargo-std-aware#12
  • Features are hard-coded. See rust-lang/wg-cargo-std-aware#13
  • Lots of various platform-specific support is not included (musl, wasi, windows-gnu, etc.).
  • Default backtrace is used with C compiler. See rust-lang/wg-cargo-std-aware#16
  • Sanitizers are not built. See rust-lang/wg-cargo-std-aware#17
  • proc_macro has some hacky code to synthesize its dependencies. See rust-lang/wg-cargo-std-aware#18. This may not be necessary if this uses --sysroot instead.
  • Profile overrides cause weird linker errors.
    That is:
    [profile.dev.overrides.std]
    opt-level = 2
    Using [profile.dev.overrides."*"] works. I tried fiddling with it, but couldn't figure it out.
    We may also want to consider altering the syntax for profile overrides. Having to repeat the same profile for std and core and alloc and everything else would not be ideal.
  • Context::unit_deps does not handle build overrides, see #7215. FIXED

Questions for this PR

  • I went with the strategy of using --extern to link the standard lib. This seems to work, and I haven't found any problems, but it seems risky. It also forces Cargo to know about certain implicit dependencies like compiler_builtins and panic_*. The alternative is to create a sysroot and copy all the crates to that directory and pass --sysroot. However, this is complicated by pipelining, which would require special support to copy .rmeta files when they are generated. Let me know if you think I should use a different strategy. I'm on the fence here, and I think using --sysroot may be safer, but adds more complexity.
    • As an aside, if rustc ever tries to grab a crate from sysroot that was not passed in via --extern, then it results in duplicate lang items. For example, saying extern crate proc_macro; without specifying proc_macro as a dependency. We could prevent rustc from ever trying by passing --sysroot=/nonexistent to prevent it from trying. Or add an equivalent flag to rustc.
  • How should this be tested? I added a janky integration test, but it has some drawbacks. It requires internet access. It is slow. Since it is slow, it reuses the same target directory for multiple tests which makes it awkward to work with.
    • What interesting things are there to test?
    • We may want to disable the test before merging if it seems too annoying to make it the default. It requires rust-src to be downloaded, and takes several minutes to run, and are somewhat platform-dependent.
  • How to test that it is actually linking the correct standard library? I did tests locally with a modified libcore, but I can't think of a good way to do that in the test suite.
  • I did not add __CARGO_DEFAULT_LIB_METADATA to the hash. I had a hard time coming up with a test case where it would matter.
    • My only thought is that it is a problem because libstd includes a dylib, which prevents the hash from being added to the filename. It does cause recompiles when switching between compilers, for example, when it normally wouldn't.
    • Very dumb question: Why exactly does libstd include a dylib? This can cause issues (see rust-lang/rust#56443).
    • This should probably change, but I want to better understand it first.
  • The bin_nostd test needs to link libc on linux, and I'm not sure I understand why. I'm concerned there is something wrong there. libstd does not do that AFAIK.
@bors

This comment has been minimized.

Copy link
Contributor

commented Aug 6, 2019

☔️ The latest upstream changes (presumably #7215) made this pull request unmergeable. Please resolve the merge conflicts.

@ehuss ehuss force-pushed the ehuss:build-std branch from 3fdb3b8 to fbd9d16 Aug 6, 2019

@alexcrichton
Copy link
Member

left a comment

This is some amazing progress @ehuss, can't thank you enough again for leading the charge here on this!

For future PRs

These are some amazingly detailed and thorough thougts. I'd want to make sure they're not lost! Are you figuring that these would become issues on the wg-std-aware Cargo repository after this PR is merged?

compiler_builtins may need to be built without debug assertions, see this. Could maybe use profile overrides.

This is also true of panic_unwind and panic_abort sorta. What we really need to do here is to delete src/bootstrap/bin/rustc.rs in the Rust repository as much as possible. We should be far more aggressively leading on RUSTFLAGS and for things like compiler-builtins/panic-abort we should be leaning more on profile overrides.

I think this is fine for now in general to be clear, but I'd like to see more config codified in rust-lang/rust's Cargo.toml that we can reuse here in Cargo.

(reading your comments I think you and I are already thinking the same thing...)

It could maybe look at the profile to determine which panic implementation to use?

This matches my thinking as well. I think, however, we should probably "stabilize" an interface between Cargo/libstd about this. For example if libstd is built in unwinding mode it should activate everything necessary to get the panic_unwind runtime working. If built in abort mode it should skip all that. Ideally I think we can basically get by without explicitly hardcoding anything about the panic_* crates but just with libstd features in the long run.

Using two resolvers is quite messy and causes a lot of complications. It would be ideal if it could only use one, though that may not be possible for the foreseeable future.

FWIW I think your refactoring of removing Resolve from the backend was great here and it feels pretty clean to me. This seems like a reasonable-ish way to handle this going forward, even possibly through to stabilization.

Lots of various platform-specific support is not included (musl, wasi, windows-gnu, etc.).

I'm curious to hear more about this, what else is needed for these platforms? I think like musl/wasi will have difficulty in the sense that they assume a C standard library is already built, but are you thinking of other issues? (I'm not sure what the difficulty is with windows-gnu for example)

Profile overrides cause weird linker errors.

Mind sending me a gist of what this looks like?

I went with the strategy of using --extern to link the standard lib.

I wrote this in the comments inline below as well, but I think that --extern is probably the right solution, even in the long-term. I do think, however, that we may want to emulate --sysroot by basically passing --extern to all sysroot crates to all compilations which I think should solve some of the issues about hardcoding too much in Cargo perhaps?

How should this be tested?

This is a really good question!

I think I'd ideally like to see two forms of testing. One is that it actually uses the rust-lang/rust components and such and runs full integration tests. I think these tests can be pretty few and far between though.

Otherwise I think it'd be good if we can create a mock "libstd dependency set" for Cargo itself which we use in tests. Sort of like how we stub out the registry. Ideally these mock crates would just reexport the actual core/std crates in Rust itself (somehow? Maybe some magical test-only feature that makes them visible?) and would be empty crates otherwise to compile very quickly.

How to test that it is actually linking the correct standard library?

With the idea of a "mock root set up by Cargo" this would be easy enough to test perhaps?

I did not add __CARGO_DEFAULT_LIB_METADATA to the hash. I had a hard time coming up with a test case where it would matter.

This is fine I think to cover later, this feature is purely a "let's just smash things until it works" feature so if it works otherwise it's not needed.

Very dumb question: Why exactly does libstd include a dylib?

Otherwise it's impossible to produce dynamic programs at all in Rust (e.g. rustc wouldn't exist). It's a bad reason.

Rust, oh-so-long ago, only supported the dylib crate type. Later rlibs were added and they were chosen as the default for crates. The compiler has always been dynamically linked, with librustc_driver.so being shared between the rustc.exe and rustdoc.exe executables. Additionally librustc_driver.so is used by rls.exe nowadays too.

For almost all use cases other than the compiler itself having a dylib is basically extraneous and not helpful.

It's an extremely longstanding bug in Cargo that you can't configure crate types at build time. For std-aware cargo there's basically zero reason for Cargo to produce a dynamic library of the standard library.

ci/azure-test-all.yml Outdated Show resolved Hide resolved
src/cargo/core/compiler/context/mod.rs Show resolved Hide resolved
src/cargo/core/compiler/unit_dependencies.rs Outdated Show resolved Hide resolved
src/cargo/core/compiler/unit_dependencies.rs Outdated Show resolved Hide resolved
src/cargo/core/compiler/unit_dependencies.rs Outdated Show resolved Hide resolved
/// Parse the `-Zbuild-std` flag.
pub fn parse_unstable_flag(value: Option<&str>) -> Vec<String> {
// This is a temporary hack until there is a more principled way to
// declare dependencies in Cargo.toml.

This comment has been minimized.

Copy link
@alexcrichton

alexcrichton Aug 9, 2019

Member

This definitely is fine for a first pass, it gets anything working! In the long run though I think we're going to want to rely on stable crate names like std and core (and maybe compiler_builtins?) and then work backwards from them in the crate graph to figure out what else to link.

This comment has been minimized.

Copy link
@ehuss

ehuss Aug 9, 2019

Author Contributor

Definitely! Especially if we automatically pass every sysroot crate in, it won't be necessary.

src/cargo/core/compiler/standard_lib.rs Outdated Show resolved Hide resolved
spec_pkgs.push("test".to_string());
let spec = Packages::Packages(spec_pkgs);
let specs = spec.to_package_id_specs(&std_ws)?;
let features = vec!["panic-unwind".to_string(), "backtrace".to_string()];

This comment has been minimized.

Copy link
@alexcrichton

alexcrichton Aug 9, 2019

Member

The panic-unwind here is particularly tricky, but I think that long-term we're definitely going to want to select this based on the profile we're going to be compiling with. For example if we compile libstd with panic=abort we shouldn't use this crate, but if we do we should use the crate. Now knowing whether we're compiling with panic=abort is actually pretty tricky because it's defaulted to abort on a bunch of targets (like wasm).

tests/testsuite/standard_lib.rs Outdated Show resolved Hide resolved
@ehuss

This comment has been minimized.

Copy link
Contributor Author

commented Aug 9, 2019

Thanks for taking a look!

Are you figuring that these would become issues on the wg-std-aware Cargo repository after this PR is merged?

Yes. I wanted to hold off in case any of them became non-issues through the course of updating this PR.

I think this is fine for now in general to be clear, but I'd like to see more config codified in rust-lang/rust's Cargo.toml that we can reuse here in Cargo.

Yep. My goal with this was to try to avoid needing to make any changes to rust-lang/rust at first. We can then start rolling out any changes there that will make this easier later.

if libstd is built in unwinding mode it should activate everything necessary to get the panic_unwind runtime working.

Hm, you just made me think of something. When running cargo test some targets can be built with unwind (like tests) and some may be built with abort (like bins). That may complicate the panic support. It would essentially require 2 copies of std in that situation. I will need to think about that more.

what else is needed for these platforms?

I'm not sure, I didn't even try. I'm just thinking of things like this and this and this. I'm not sure if that is actually needed, but I suspect there will be some rough edges for some platforms.

Profile overrides cause weird linker errors.

Here: https://gist.github.com/ehuss/6973e2fad5663f10592f71de08b3eff0
This is with [profile.dev.overrides.std] opt-level=2. Adding core and compiler_builtins doesn't help. I did not investigate much.

@alexcrichton

This comment has been minimized.

Copy link
Member

commented Aug 9, 2019

When running cargo test some targets can be built with unwind (like tests) and some may be built with abort (like bins). That may complicate the panic support. It would essentially require 2 copies of std in that situation.

Oh dear, that's a good point! I think though we'd functionally want to do what libstd does today which is to compile everything with panic=abort, have two runtimes, and then let the final compile choose which it prefers.

I'm just thinking of things like this and this and this.

Ah ok I see what you mean. I do think that those will cause issues and will be good to track! Of course not a blocker for this though

I did not investigate much.

Hm ok, that does indeed look horrifyingly confusing. I'll try to help debug once this goes through as well

@ehuss ehuss force-pushed the ehuss:build-std branch from fbd9d16 to 12efdd2 Aug 11, 2019

@ehuss

This comment has been minimized.

Copy link
Contributor Author

commented Aug 15, 2019

Profile overrides cause weird linker errors.

matklad had an interesting idea for a related problem here: rust-lang/rust#63484 (comment)
I wonder if it isn't something similar? Or maybe related to re-exports?

@alexcrichton

This comment has been minimized.

Copy link
Member

commented Aug 15, 2019

Hm so in theory optimization settings should affect symbol visibility to the point of causing linker errors. That's just theory though, and libstd is always a special child, so there's probably something wrong here or there. I would need to dig into it more locally though to figure it out.

@ehuss ehuss force-pushed the ehuss:build-std branch from 12efdd2 to 85c5427 Aug 16, 2019

alexcrichton added a commit to alexcrichton/rust that referenced this pull request Aug 16, 2019
bootstrap: Merge the libtest build step with libstd
Since its inception rustbuild has always worked in three stages: one for
libstd, one for libtest, and one for rustc. These three stages were
architected around crates.io dependencies, where rustc wants to depend
on crates.io crates but said crates don't explicitly depend on libstd,
requiring a sysroot assembly step in the middle. This same logic was
applied for libtest where libtest wants to depend on crates.io crates
(`getopts`) but `getopts` didn't say that it depended on std, so it
needed `std` built ahead of time.

Lots of time has passed since the inception of rustbuild, however,
and we've since gotten to the point where even `std` itself is depending
on crates.io crates (albeit with some wonky configuration). This
commit applies the same logic to the two dependencies that the `test`
crate pulls in from crates.io, `getopts` and `unicode-width`. Over the
many years since rustbuild's inception `unicode-width` was the only
dependency picked up by the `test` crate, so the extra configuration
necessary to get crates building in this crate graph is unlikely to be
too much of a burden on developers.

After this patch it means that there are now only two build phasese of
rustbuild, one for libstd and one for rustc. The libtest/libproc_macro
build phase is all lumped into one now with `std`.

This was originally motivated by rust-lang/cargo#7216 where Cargo was
having to deal with synthesizing dependency edges but this commit makes
them explicit in this repository.
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Aug 16, 2019
bootstrap: Merge the libtest build step with libstd
Since its inception rustbuild has always worked in three stages: one for
libstd, one for libtest, and one for rustc. These three stages were
architected around crates.io dependencies, where rustc wants to depend
on crates.io crates but said crates don't explicitly depend on libstd,
requiring a sysroot assembly step in the middle. This same logic was
applied for libtest where libtest wants to depend on crates.io crates
(`getopts`) but `getopts` didn't say that it depended on std, so it
needed `std` built ahead of time.

Lots of time has passed since the inception of rustbuild, however,
and we've since gotten to the point where even `std` itself is depending
on crates.io crates (albeit with some wonky configuration). This
commit applies the same logic to the two dependencies that the `test`
crate pulls in from crates.io, `getopts` and `unicode-width`. Over the
many years since rustbuild's inception `unicode-width` was the only
dependency picked up by the `test` crate, so the extra configuration
necessary to get crates building in this crate graph is unlikely to be
too much of a burden on developers.

After this patch it means that there are now only two build phasese of
rustbuild, one for libstd and one for rustc. The libtest/libproc_macro
build phase is all lumped into one now with `std`.

This was originally motivated by rust-lang/cargo#7216 where Cargo was
having to deal with synthesizing dependency edges but this commit makes
them explicit in this repository.
@alexcrichton

This comment has been minimized.

Copy link
Member

commented Aug 16, 2019

FWIW I'm personally feeling pretty good about landing this, so @ehuss I don't mind waiting if you've got blockers you'd prefer to handle first but I think it's otherwise fine to start opening issues and adding comments for what should be removed when and then r+ this

alexcrichton added a commit to alexcrichton/rust that referenced this pull request Aug 19, 2019
bootstrap: Merge the libtest build step with libstd
Since its inception rustbuild has always worked in three stages: one for
libstd, one for libtest, and one for rustc. These three stages were
architected around crates.io dependencies, where rustc wants to depend
on crates.io crates but said crates don't explicitly depend on libstd,
requiring a sysroot assembly step in the middle. This same logic was
applied for libtest where libtest wants to depend on crates.io crates
(`getopts`) but `getopts` didn't say that it depended on std, so it
needed `std` built ahead of time.

Lots of time has passed since the inception of rustbuild, however,
and we've since gotten to the point where even `std` itself is depending
on crates.io crates (albeit with some wonky configuration). This
commit applies the same logic to the two dependencies that the `test`
crate pulls in from crates.io, `getopts` and `unicode-width`. Over the
many years since rustbuild's inception `unicode-width` was the only
dependency picked up by the `test` crate, so the extra configuration
necessary to get crates building in this crate graph is unlikely to be
too much of a burden on developers.

After this patch it means that there are now only two build phasese of
rustbuild, one for libstd and one for rustc. The libtest/libproc_macro
build phase is all lumped into one now with `std`.

This was originally motivated by rust-lang/cargo#7216 where Cargo was
having to deal with synthesizing dependency edges but this commit makes
them explicit in this repository.
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Aug 19, 2019
bootstrap: Merge the libtest build step with libstd
Since its inception rustbuild has always worked in three stages: one for
libstd, one for libtest, and one for rustc. These three stages were
architected around crates.io dependencies, where rustc wants to depend
on crates.io crates but said crates don't explicitly depend on libstd,
requiring a sysroot assembly step in the middle. This same logic was
applied for libtest where libtest wants to depend on crates.io crates
(`getopts`) but `getopts` didn't say that it depended on std, so it
needed `std` built ahead of time.

Lots of time has passed since the inception of rustbuild, however,
and we've since gotten to the point where even `std` itself is depending
on crates.io crates (albeit with some wonky configuration). This
commit applies the same logic to the two dependencies that the `test`
crate pulls in from crates.io, `getopts` and `unicode-width`. Over the
many years since rustbuild's inception `unicode-width` was the only
dependency picked up by the `test` crate, so the extra configuration
necessary to get crates building in this crate graph is unlikely to be
too much of a burden on developers.

After this patch it means that there are now only two build phasese of
rustbuild, one for libstd and one for rustc. The libtest/libproc_macro
build phase is all lumped into one now with `std`.

This was originally motivated by rust-lang/cargo#7216 where Cargo was
having to deal with synthesizing dependency edges but this commit makes
them explicit in this repository.
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Aug 19, 2019
bootstrap: Merge the libtest build step with libstd
Since its inception rustbuild has always worked in three stages: one for
libstd, one for libtest, and one for rustc. These three stages were
architected around crates.io dependencies, where rustc wants to depend
on crates.io crates but said crates don't explicitly depend on libstd,
requiring a sysroot assembly step in the middle. This same logic was
applied for libtest where libtest wants to depend on crates.io crates
(`getopts`) but `getopts` didn't say that it depended on std, so it
needed `std` built ahead of time.

Lots of time has passed since the inception of rustbuild, however,
and we've since gotten to the point where even `std` itself is depending
on crates.io crates (albeit with some wonky configuration). This
commit applies the same logic to the two dependencies that the `test`
crate pulls in from crates.io, `getopts` and `unicode-width`. Over the
many years since rustbuild's inception `unicode-width` was the only
dependency picked up by the `test` crate, so the extra configuration
necessary to get crates building in this crate graph is unlikely to be
too much of a burden on developers.

After this patch it means that there are now only two build phasese of
rustbuild, one for libstd and one for rustc. The libtest/libproc_macro
build phase is all lumped into one now with `std`.

This was originally motivated by rust-lang/cargo#7216 where Cargo was
having to deal with synthesizing dependency edges but this commit makes
them explicit in this repository.
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Aug 19, 2019
bootstrap: Merge the libtest build step with libstd
Since its inception rustbuild has always worked in three stages: one for
libstd, one for libtest, and one for rustc. These three stages were
architected around crates.io dependencies, where rustc wants to depend
on crates.io crates but said crates don't explicitly depend on libstd,
requiring a sysroot assembly step in the middle. This same logic was
applied for libtest where libtest wants to depend on crates.io crates
(`getopts`) but `getopts` didn't say that it depended on std, so it
needed `std` built ahead of time.

Lots of time has passed since the inception of rustbuild, however,
and we've since gotten to the point where even `std` itself is depending
on crates.io crates (albeit with some wonky configuration). This
commit applies the same logic to the two dependencies that the `test`
crate pulls in from crates.io, `getopts` and `unicode-width`. Over the
many years since rustbuild's inception `unicode-width` was the only
dependency picked up by the `test` crate, so the extra configuration
necessary to get crates building in this crate graph is unlikely to be
too much of a burden on developers.

After this patch it means that there are now only two build phasese of
rustbuild, one for libstd and one for rustc. The libtest/libproc_macro
build phase is all lumped into one now with `std`.

This was originally motivated by rust-lang/cargo#7216 where Cargo was
having to deal with synthesizing dependency edges but this commit makes
them explicit in this repository.
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Aug 19, 2019
bootstrap: Merge the libtest build step with libstd
Since its inception rustbuild has always worked in three stages: one for
libstd, one for libtest, and one for rustc. These three stages were
architected around crates.io dependencies, where rustc wants to depend
on crates.io crates but said crates don't explicitly depend on libstd,
requiring a sysroot assembly step in the middle. This same logic was
applied for libtest where libtest wants to depend on crates.io crates
(`getopts`) but `getopts` didn't say that it depended on std, so it
needed `std` built ahead of time.

Lots of time has passed since the inception of rustbuild, however,
and we've since gotten to the point where even `std` itself is depending
on crates.io crates (albeit with some wonky configuration). This
commit applies the same logic to the two dependencies that the `test`
crate pulls in from crates.io, `getopts` and `unicode-width`. Over the
many years since rustbuild's inception `unicode-width` was the only
dependency picked up by the `test` crate, so the extra configuration
necessary to get crates building in this crate graph is unlikely to be
too much of a burden on developers.

After this patch it means that there are now only two build phasese of
rustbuild, one for libstd and one for rustc. The libtest/libproc_macro
build phase is all lumped into one now with `std`.

This was originally motivated by rust-lang/cargo#7216 where Cargo was
having to deal with synthesizing dependency edges but this commit makes
them explicit in this repository.
Centril added a commit to Centril/rust that referenced this pull request Aug 20, 2019
Rollup merge of rust-lang#63637 - alexcrichton:remove-libtest-step, r…
…=Mark-Simulacrum

bootstrap: Merge the libtest build step with libstd

Since its inception rustbuild has always worked in three stages: one for
libstd, one for libtest, and one for rustc. These three stages were
architected around crates.io dependencies, where rustc wants to depend
on crates.io crates but said crates don't explicitly depend on libstd,
requiring a sysroot assembly step in the middle. This same logic was
applied for libtest where libtest wants to depend on crates.io crates
(`getopts`) but `getopts` didn't say that it depended on std, so it
needed `std` built ahead of time.

Lots of time has passed since the inception of rustbuild, however,
and we've since gotten to the point where even `std` itself is depending
on crates.io crates (albeit with some wonky configuration). This
commit applies the same logic to the two dependencies that the `test`
crate pulls in from crates.io, `getopts` and `unicode-width`. Over the
many years since rustbuild's inception `unicode-width` was the only
dependency picked up by the `test` crate, so the extra configuration
necessary to get crates building in this crate graph is unlikely to be
too much of a burden on developers.

After this patch it means that there are now only two build phasese of
rustbuild, one for libstd and one for rustc. The libtest/libproc_macro
build phase is all lumped into one now with `std`.

This was originally motivated by rust-lang/cargo#7216 where Cargo was
having to deal with synthesizing dependency edges but this commit makes
them explicit in this repository.
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Aug 20, 2019
bootstrap: Merge the libtest build step with libstd
Since its inception rustbuild has always worked in three stages: one for
libstd, one for libtest, and one for rustc. These three stages were
architected around crates.io dependencies, where rustc wants to depend
on crates.io crates but said crates don't explicitly depend on libstd,
requiring a sysroot assembly step in the middle. This same logic was
applied for libtest where libtest wants to depend on crates.io crates
(`getopts`) but `getopts` didn't say that it depended on std, so it
needed `std` built ahead of time.

Lots of time has passed since the inception of rustbuild, however,
and we've since gotten to the point where even `std` itself is depending
on crates.io crates (albeit with some wonky configuration). This
commit applies the same logic to the two dependencies that the `test`
crate pulls in from crates.io, `getopts` and `unicode-width`. Over the
many years since rustbuild's inception `unicode-width` was the only
dependency picked up by the `test` crate, so the extra configuration
necessary to get crates building in this crate graph is unlikely to be
too much of a burden on developers.

After this patch it means that there are now only two build phasese of
rustbuild, one for libstd and one for rustc. The libtest/libproc_macro
build phase is all lumped into one now with `std`.

This was originally motivated by rust-lang/cargo#7216 where Cargo was
having to deal with synthesizing dependency edges but this commit makes
them explicit in this repository.
bors added a commit to rust-lang/rust that referenced this pull request Aug 21, 2019
Auto merge of #63637 - alexcrichton:remove-libtest-step, r=Mark-Simul…
…acrum

bootstrap: Merge the libtest build step with libstd

Since its inception rustbuild has always worked in three stages: one for
libstd, one for libtest, and one for rustc. These three stages were
architected around crates.io dependencies, where rustc wants to depend
on crates.io crates but said crates don't explicitly depend on libstd,
requiring a sysroot assembly step in the middle. This same logic was
applied for libtest where libtest wants to depend on crates.io crates
(`getopts`) but `getopts` didn't say that it depended on std, so it
needed `std` built ahead of time.

Lots of time has passed since the inception of rustbuild, however,
and we've since gotten to the point where even `std` itself is depending
on crates.io crates (albeit with some wonky configuration). This
commit applies the same logic to the two dependencies that the `test`
crate pulls in from crates.io, `getopts` and `unicode-width`. Over the
many years since rustbuild's inception `unicode-width` was the only
dependency picked up by the `test` crate, so the extra configuration
necessary to get crates building in this crate graph is unlikely to be
too much of a burden on developers.

After this patch it means that there are now only two build phasese of
rustbuild, one for libstd and one for rustc. The libtest/libproc_macro
build phase is all lumped into one now with `std`.

This was originally motivated by rust-lang/cargo#7216 where Cargo was
having to deal with synthesizing dependency edges but this commit makes
them explicit in this repository.
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Aug 23, 2019
bootstrap: Merge the libtest build step with libstd
Since its inception rustbuild has always worked in three stages: one for
libstd, one for libtest, and one for rustc. These three stages were
architected around crates.io dependencies, where rustc wants to depend
on crates.io crates but said crates don't explicitly depend on libstd,
requiring a sysroot assembly step in the middle. This same logic was
applied for libtest where libtest wants to depend on crates.io crates
(`getopts`) but `getopts` didn't say that it depended on std, so it
needed `std` built ahead of time.

Lots of time has passed since the inception of rustbuild, however,
and we've since gotten to the point where even `std` itself is depending
on crates.io crates (albeit with some wonky configuration). This
commit applies the same logic to the two dependencies that the `test`
crate pulls in from crates.io, `getopts` and `unicode-width`. Over the
many years since rustbuild's inception `unicode-width` was the only
dependency picked up by the `test` crate, so the extra configuration
necessary to get crates building in this crate graph is unlikely to be
too much of a burden on developers.

After this patch it means that there are now only two build phasese of
rustbuild, one for libstd and one for rustc. The libtest/libproc_macro
build phase is all lumped into one now with `std`.

This was originally motivated by rust-lang/cargo#7216 where Cargo was
having to deal with synthesizing dependency edges but this commit makes
them explicit in this repository.
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Aug 23, 2019
bootstrap: Merge the libtest build step with libstd
Since its inception rustbuild has always worked in three stages: one for
libstd, one for libtest, and one for rustc. These three stages were
architected around crates.io dependencies, where rustc wants to depend
on crates.io crates but said crates don't explicitly depend on libstd,
requiring a sysroot assembly step in the middle. This same logic was
applied for libtest where libtest wants to depend on crates.io crates
(`getopts`) but `getopts` didn't say that it depended on std, so it
needed `std` built ahead of time.

Lots of time has passed since the inception of rustbuild, however,
and we've since gotten to the point where even `std` itself is depending
on crates.io crates (albeit with some wonky configuration). This
commit applies the same logic to the two dependencies that the `test`
crate pulls in from crates.io, `getopts` and `unicode-width`. Over the
many years since rustbuild's inception `unicode-width` was the only
dependency picked up by the `test` crate, so the extra configuration
necessary to get crates building in this crate graph is unlikely to be
too much of a burden on developers.

After this patch it means that there are now only two build phasese of
rustbuild, one for libstd and one for rustc. The libtest/libproc_macro
build phase is all lumped into one now with `std`.

This was originally motivated by rust-lang/cargo#7216 where Cargo was
having to deal with synthesizing dependency edges but this commit makes
them explicit in this repository.
bors added a commit to rust-lang/rust that referenced this pull request Aug 23, 2019
Auto merge of #63637 - alexcrichton:remove-libtest-step, r=Mark-Simul…
…acrum

bootstrap: Merge the libtest build step with libstd

Since its inception rustbuild has always worked in three stages: one for
libstd, one for libtest, and one for rustc. These three stages were
architected around crates.io dependencies, where rustc wants to depend
on crates.io crates but said crates don't explicitly depend on libstd,
requiring a sysroot assembly step in the middle. This same logic was
applied for libtest where libtest wants to depend on crates.io crates
(`getopts`) but `getopts` didn't say that it depended on std, so it
needed `std` built ahead of time.

Lots of time has passed since the inception of rustbuild, however,
and we've since gotten to the point where even `std` itself is depending
on crates.io crates (albeit with some wonky configuration). This
commit applies the same logic to the two dependencies that the `test`
crate pulls in from crates.io, `getopts` and `unicode-width`. Over the
many years since rustbuild's inception `unicode-width` was the only
dependency picked up by the `test` crate, so the extra configuration
necessary to get crates building in this crate graph is unlikely to be
too much of a burden on developers.

After this patch it means that there are now only two build phasese of
rustbuild, one for libstd and one for rustc. The libtest/libproc_macro
build phase is all lumped into one now with `std`.

This was originally motivated by rust-lang/cargo#7216 where Cargo was
having to deal with synthesizing dependency edges but this commit makes
them explicit in this repository.
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Aug 23, 2019
bootstrap: Merge the libtest build step with libstd
Since its inception rustbuild has always worked in three stages: one for
libstd, one for libtest, and one for rustc. These three stages were
architected around crates.io dependencies, where rustc wants to depend
on crates.io crates but said crates don't explicitly depend on libstd,
requiring a sysroot assembly step in the middle. This same logic was
applied for libtest where libtest wants to depend on crates.io crates
(`getopts`) but `getopts` didn't say that it depended on std, so it
needed `std` built ahead of time.

Lots of time has passed since the inception of rustbuild, however,
and we've since gotten to the point where even `std` itself is depending
on crates.io crates (albeit with some wonky configuration). This
commit applies the same logic to the two dependencies that the `test`
crate pulls in from crates.io, `getopts` and `unicode-width`. Over the
many years since rustbuild's inception `unicode-width` was the only
dependency picked up by the `test` crate, so the extra configuration
necessary to get crates building in this crate graph is unlikely to be
too much of a burden on developers.

After this patch it means that there are now only two build phasese of
rustbuild, one for libstd and one for rustc. The libtest/libproc_macro
build phase is all lumped into one now with `std`.

This was originally motivated by rust-lang/cargo#7216 where Cargo was
having to deal with synthesizing dependency edges but this commit makes
them explicit in this repository.
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Aug 23, 2019
bootstrap: Merge the libtest build step with libstd
Since its inception rustbuild has always worked in three stages: one for
libstd, one for libtest, and one for rustc. These three stages were
architected around crates.io dependencies, where rustc wants to depend
on crates.io crates but said crates don't explicitly depend on libstd,
requiring a sysroot assembly step in the middle. This same logic was
applied for libtest where libtest wants to depend on crates.io crates
(`getopts`) but `getopts` didn't say that it depended on std, so it
needed `std` built ahead of time.

Lots of time has passed since the inception of rustbuild, however,
and we've since gotten to the point where even `std` itself is depending
on crates.io crates (albeit with some wonky configuration). This
commit applies the same logic to the two dependencies that the `test`
crate pulls in from crates.io, `getopts` and `unicode-width`. Over the
many years since rustbuild's inception `unicode-width` was the only
dependency picked up by the `test` crate, so the extra configuration
necessary to get crates building in this crate graph is unlikely to be
too much of a burden on developers.

After this patch it means that there are now only two build phasese of
rustbuild, one for libstd and one for rustc. The libtest/libproc_macro
build phase is all lumped into one now with `std`.

This was originally motivated by rust-lang/cargo#7216 where Cargo was
having to deal with synthesizing dependency edges but this commit makes
them explicit in this repository.

@ehuss ehuss force-pushed the ehuss:build-std branch from b67f3e1 to 9382be1 Sep 3, 2019

@ehuss

This comment has been minimized.

Copy link
Contributor Author

commented Sep 3, 2019

Minor fix due to #7310.

@bors r=alexcrichton

@bors

This comment has been minimized.

Copy link
Contributor

commented Sep 3, 2019

📌 Commit 9382be1 has been approved by alexcrichton

@bors

This comment has been minimized.

Copy link
Contributor

commented Sep 3, 2019

⌛️ Testing commit 9382be1 with merge 49fbf52...

bors added a commit that referenced this pull request Sep 3, 2019
Auto merge of #7216 - ehuss:build-std, r=alexcrichton
Basic standard library support.

This is not intended to be useful to anyone. If people want to try it, that's great, but do not rely on this. This is only for experimenting and setting up for future work.

This adds a flag `-Zbuild-std` to build the standard library with a project. The flag can also take optional comma-separated crate names, like `-Zbuild-std=core`. Default is `std,core,panic_unwind,compiler_builtins`.

Closes rust-lang/wg-cargo-std-aware#10.

Note: I can probably break some of the refactoring into smaller PRs if necessary.

## Overview
The general concept here is to use two resolvers, and to combine everything in the Unit graph. There are a number of changes to support this:

- A synthetic workspace for the standard library is created to set up the patches and members correctly.
- Decouple `unit_dependencies` from `Context` to make it easier to manage.
- Add `features` to `Unit` to keep it unique and to remove the need to query a resolver.
- Add a `UnitDep` struct which encodes the edges between `Unit`s. This removes the need to query a resolver for `extern_crate_name` and `public`.
- Remove `Resolver` from `BuildContext` to avoid any confusion and to keep the complexity focused in `unit_dependencies`.
- Remove `Links` from `Context` since it used the resolver. Adjusted so that instead of checking links at runtime, they are all checked at once in the beginning. Note that it does not check links for the standard lib, but it should be safe? I think `compiler-rt` is the only `links`?

I currently went with a strategy of linking the standard library dependencies using `--extern` (instead of `--sysroot` or `-L`). This has some benefits but some significant drawbacks. See below for some questions.

## For future PRs
- Add Cargo.toml support. See rust-lang/wg-cargo-std-aware#5
- Source is not downloaded. It assumes you have run `rustup component add rust-src`. See rust-lang/wg-cargo-std-aware#11
- `cargo metadata` does not include any information about std. I don't know how this should work.
- `cargo clean` is not std-aware.
- `cargo fetch` does not fetch std dependencies.
- `cargo vendor` does not vendor std dependencies.
- `cargo pkgid` is not std-aware.
- `--target` is required on the command-line. This should default to host-as-target.
- `-p` is not std aware.
- A synthetic `Cargo.toml` workspace is created which has to know about things like `rustc-std-workspace-core`. Perhaps rust-lang/rust should publish the source with this `Cargo.toml` already created?
- `compiler_builtins` uses default features (pure Rust implementation, etc.). See rust-lang/wg-cargo-std-aware#15
    - `compiler_builtins` may need to be built without debug assertions, see [this](https://github.com/rust-lang/rust/blob/8e917f48382c6afaf50568263b89d35fba5d98e4/src/bootstrap/bin/rustc.rs#L210-L214). Could maybe use profile overrides.
- Panic issues:
    - `panic_abort` is not yet supported, though it should probably be easy. It could maybe look at the profile to determine which panic implementation to use? This requires more hard-coding in Cargo to know about rustc implementation details.
    - [This](https://github.com/rust-lang/rust/blob/8e917f48382c6afaf50568263b89d35fba5d98e4/src/bootstrap/bin/rustc.rs#L186-L201) should probably be handled where `panic` is set for `panic_abort` and `compiler_builtins`. I would like to get a test case for it. This can maybe be done with profile overrides?
- Using two resolvers is quite messy and causes a lot of complications. It would be ideal if it could only use one, though that may not be possible for the foreseeable future. See rust-lang/wg-cargo-std-aware#12
- Features are hard-coded. See rust-lang/wg-cargo-std-aware#13
- Lots of various platform-specific support is not included (musl, wasi, windows-gnu, etc.).
- Default `backtrace` is used with C compiler. See rust-lang/wg-cargo-std-aware#16
- Sanitizers are not built. See rust-lang/wg-cargo-std-aware#17
- proc_macro has some hacky code to synthesize its dependencies. See rust-lang/wg-cargo-std-aware#18. This may not be necessary if this uses `--sysroot` instead.
- Profile overrides cause weird linker errors.
  That is:
  ```toml
  [profile.dev.overrides.std]
  opt-level = 2
  ```
  Using `[profile.dev.overrides."*"]` works. I tried fiddling with it, but couldn't figure it out.
  We may also want to consider altering the syntax for profile overrides. Having to repeat the same profile for `std` and `core` and `alloc` and everything else would not be ideal.
- ~~`Context::unit_deps` does not handle build overrides, see #7215.~~ FIXED

## Questions for this PR
- I went with the strategy of using `--extern` to link the standard lib. This seems to work, and I haven't found any problems, but it seems risky. It also forces Cargo to know about certain implicit dependencies like `compiler_builtins` and `panic_*`. The alternative is to create a sysroot and copy all the crates to that directory and pass `--sysroot`. However, this is complicated by pipelining, which would require special support to copy `.rmeta` files when they are generated. Let me know if you think I should use a different strategy. I'm on the fence here, and I think using `--sysroot` may be safer, but adds more complexity.
    - As an aside, if rustc ever tries to grab a crate from sysroot that was not passed in via `--extern`, then it results in duplicate lang items. For example, saying `extern crate proc_macro;` without specifying `proc_macro` as a dependency. We could prevent rustc from ever trying by passing `--sysroot=/nonexistent` to prevent it from trying. Or add an equivalent flag to rustc.
- How should this be tested? I added a janky integration test, but it has some drawbacks. It requires internet access. It is slow. Since it is slow, it reuses the same target directory for multiple tests which makes it awkward to work with.
    - What interesting things are there to test?
    - We may want to disable the test before merging if it seems too annoying to make it the default. It requires rust-src to be downloaded, and takes several minutes to run, and are somewhat platform-dependent.
- How to test that it is actually linking the correct standard library? I did tests locally with a modified libcore, but I can't think of a good way to do that in the test suite.
- I did not add `__CARGO_DEFAULT_LIB_METADATA` to the hash. I had a hard time coming up with a test case where it would matter.
    - My only thought is that it is a problem because libstd includes a dylib, which prevents the hash from being added to the filename. It does cause recompiles when switching between compilers, for example, when it normally wouldn't.
    - Very dumb question: Why exactly does libstd include a dylib? This can cause issues (see rust-lang/rust#56443).
    - This should probably change, but I want to better understand it first.
- The `bin_nostd` test needs to link libc on linux, and I'm not sure I understand why. I'm concerned there is something wrong there. libstd does not do that AFAIK.
@bors

This comment has been minimized.

Copy link
Contributor

commented Sep 3, 2019

☀️ Test successful - checks-azure
Approved by: alexcrichton
Pushing 49fbf52 to master...

@bors bors merged commit 9382be1 into rust-lang:master Sep 3, 2019

10 checks passed

homu Test successful
Details
rust-lang.cargo Build #20190903.10 succeeded
Details
rust-lang.cargo (Linux beta) Linux beta succeeded
Details
rust-lang.cargo (Linux nightly) Linux nightly succeeded
Details
rust-lang.cargo (Linux stable) Linux stable succeeded
Details
rust-lang.cargo (Windows x86_64-msvc) Windows x86_64-msvc succeeded
Details
rust-lang.cargo (docs) docs succeeded
Details
rust-lang.cargo (macOS) macOS succeeded
Details
rust-lang.cargo (resolver) resolver succeeded
Details
rust-lang.cargo (rustfmt) rustfmt succeeded
Details
@ehuss ehuss referenced this pull request Sep 4, 2019
bors added a commit to rust-lang/rust that referenced this pull request Sep 4, 2019
Auto merge of #64150 - ehuss:update-cargo-books, r=alexcrichton
Update cargo, books

## cargo

8 commits in 22f7dd0495cd72ce2082d318d5a9b4dccb9c5b8c..fe0e5a48b75da2b405c8ce1ba2674e174ae11d5d
2019-08-27 16:10:51 +0000 to 2019-09-04 00:51:27 +0000
- Rename `--all` to `--workspace` (rust-lang/cargo#7241)
- Basic standard library support. (rust-lang/cargo#7216)
- Allow using 'config.toml' instead of just 'config' files. (rust-lang/cargo#7295)
- Retry on SSL Connect Error. (rust-lang/cargo#7318)
- minimal-copy `deserialize` for `InternedString` (rust-lang/cargo#7310)
- Fix typo in cargo vendor examples (rust-lang/cargo#7320)
- Fixes around multiple `[patch]` per crate (rust-lang/cargo#7303)
- Improve error messages on mkdir failure (rust-lang/cargo#7306)

## reference

7 commits in d191a0cdd3b92648e0f1e53b13140a14677cc65b..090c015f7939665866432c334957bd536c811870
2019-08-15 08:42:23 +0200 to 2019-09-03 13:59:28 -0700
- Fix rust-lang-nursery/reference#664: Review Oxford comma usage. (rust-lang-nursery/reference#668)
- Fix some links. (rust-lang-nursery/reference#667)
- Remove trait object warning. (rust-lang-nursery/reference#666)
- Specify pattern types in `let` statements and `for` expressions (rust-lang-nursery/reference#663)
- Fix loop expression link. (rust-lang-nursery/reference#662)
- async-await initial reference material (rust-lang-nursery/reference#635)
- Correct errors in the reference of extern functions definitions and declarations (rust-lang-nursery/reference#652)

## rust-by-example

1 commits in 580839d90aacd537f0293697096fa8355bc4e673..e76be6b2dc84c6a992e186157efe29d625e29b94
2019-08-17 23:17:50 -0300 to 2019-09-03 07:42:26 -0300
- Change link to russian translation repository (rust-lang/rust-by-example#1245)

## embedded-book

1 commits in 432ca26686c11d396eed6a59499f93ce1bf2433c..5ca585c4a7552efb546e7681c3de0712f4ae4fdc
2019-08-09 23:20:22 +0000 to 2019-08-27 13:39:14 +0000
- Fixup book CI  (rust-embedded/book#205)
@Ericson2314

This comment has been minimized.

Copy link
Contributor

commented Sep 6, 2019

Congrats and thank you!!

@phil-opp

This comment has been minimized.

Copy link
Contributor

commented Sep 9, 2019

Thanks a lot for implementing this! I don't know if this is the right place to share feedback, so please tell me if there is a more appropriate place!

I tried -Z build-std for https://github.com/phil-opp/blog_os with the following results:

  • cargo build -Z build-std=core,alloc works without problems. I only needed to add a dependency on rlibc because there is currently no way to enable the mem feature of compiler_builtins.
  • cargo run -Z build-std=core,alloc also worked fine with a custom QEMU runner.
  • cargo test -Z build-std=core,alloc failed because it tries to build libtest for a bare-metal target. We use a custom test framework, so compiling libtest is not needed for our case.
  • Specifying the -Z flag explicitly on each invokation is a bit cumbersome. Is there perhaps a way to set it in a .cargo/config file (e.g. similar to the build.rustflags key)?

In summary, I would say that this MVP is already able to replace most uses of cargo-xbuild. The only blocker is the is the missing support for custom test frameworks on bare-metal targets. Thanks again for your work!

@ehuss

This comment has been minimized.

Copy link
Contributor Author

commented Sep 9, 2019

cargo test -Z build-std=core,alloc failed because it tries to build libtest for a bare-metal target. We use a custom test framework, so compiling libtest is not needed for our case.

You should be able to set harness=false for all the targets ([lib], [[test]], etc.) and it should skip libtest.

@phil-opp

This comment has been minimized.

Copy link
Contributor

commented Sep 9, 2019

@ehuss Thanks for the suggestion. Setting harness=false disables the test framework completely, so it does not work together with the custom_test_frameworks feature and the test_runner attribute (e.g. used in in this test).

monoflo added a commit to monoflo/rust that referenced this pull request Sep 12, 2019
bootstrap: Merge the libtest build step with libstd
Since its inception rustbuild has always worked in three stages: one for
libstd, one for libtest, and one for rustc. These three stages were
architected around crates.io dependencies, where rustc wants to depend
on crates.io crates but said crates don't explicitly depend on libstd,
requiring a sysroot assembly step in the middle. This same logic was
applied for libtest where libtest wants to depend on crates.io crates
(`getopts`) but `getopts` didn't say that it depended on std, so it
needed `std` built ahead of time.

Lots of time has passed since the inception of rustbuild, however,
and we've since gotten to the point where even `std` itself is depending
on crates.io crates (albeit with some wonky configuration). This
commit applies the same logic to the two dependencies that the `test`
crate pulls in from crates.io, `getopts` and `unicode-width`. Over the
many years since rustbuild's inception `unicode-width` was the only
dependency picked up by the `test` crate, so the extra configuration
necessary to get crates building in this crate graph is unlikely to be
too much of a burden on developers.

After this patch it means that there are now only two build phasese of
rustbuild, one for libstd and one for rustc. The libtest/libproc_macro
build phase is all lumped into one now with `std`.

This was originally motivated by rust-lang/cargo#7216 where Cargo was
having to deal with synthesizing dependency edges but this commit makes
them explicit in this repository.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.