`std` Aware Cargo #2663
Co-Authored-By: jamesmunns <james@onevariable.com>
Suggestion cleanups Co-Authored-By: jamesmunns <james@onevariable.com>
Regarding "stable features"... I think that might be better thought of as bringing Cargo features to See also: rust-lang/api-guidelines#95 ("Determine how crates should expose 'unstable' APIs") Anyway, that's just a matter of semantics.
## Should we allow configurable `core` and `std`

If we are to uphold stability guarantees for all configurations of `core` and `std`, this could require testing 2^(n+m) versions of Rust, where `n` is the number of `core` features, and `m` is the number of `std` features. This would have a negative impact on CI times.
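To get a feel for how quickly that count grows, here is a minimal sketch that computes 2^(n+m). The function name `config_count` and the feature counts used with it are hypothetical and do not come from the RFC.

```rust
/// Number of distinct on/off configurations for `n` `core` features and
/// `m` `std` features, i.e. 2^(n+m).
/// Returns `None` when the result does not fit in a `u128`.
fn config_count(n: u32, m: u32) -> Option<u128> {
    let total = n.checked_add(m)?;
    // `checked_shl` returns `None` when the shift amount is >= 128 bits.
    1u128.checked_shl(total)
}
```

With hypothetical counts of 10 `core` features and 10 `std` features, this already yields 1,048,576 configurations per target.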
Our CI times are already quite stretched; I think bors is the bottleneck overall in our development processes. This would also require 2^(n+m) for reasonably many platforms, not just for one.
If core & std (and probably alloc) are to be configurable, it should, for the foreseeable future, be limited exclusively to removing things from the standard library, rather than changing algorithms and whatnot.
If Rust opens up to embedded, we can contribute plenty of CI machines.
cc @rust-lang/infra ^
We're certainly not going to test all 2^(m+n) combinations on the CI; this is like mindlessly aiming for 100% code coverage. Having one test for all features disabled and one for all features enabled is more than enough.
> Having one test for all features disabled and one for all features enabled is more than enough.

In my experience that's rarely the case - features interact with each other in weird ways.
IMO 100% coverage is impossible, but that should not be the goal. The goal should be to get as close to 100% coverage as possible while using a tiny fraction of resources.
I don't think we can achieve that goal by using fixed rules (like testing everything, or testing just A and B). Achieving the goal is going to require investing time into evaluating which features and feature groups make sense to test where.
For example, it might make sense to test more combinations on the x86_64 tier 1 targets only, and it might also make sense to set up a weekly cron job that tests even more combinations, random combinations, etc. But which features should be tested in isolation and which ones can be grouped in the tests with other features is something that we should evaluate and constantly re-evaluate as new features get added on a 1:1 basis.
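As one illustration of testing only a small fraction of the combinations, the sketch below builds a matrix that grows linearly with the number of features: everything off, everything on, each feature alone, and each feature left out. This heuristic was not proposed in the thread; the function name `reduced_matrix` and the feature names are hypothetical.

```rust
/// Build a reduced test matrix for a list of feature names.
/// The result has at most 2 + 2n entries for n features,
/// compared with 2^n for the full power set.
fn reduced_matrix<'a>(features: &[&'a str]) -> Vec<Vec<&'a str>> {
    let mut configs: Vec<Vec<&'a str>> = Vec::new();
    configs.push(Vec::new()); // all features disabled
    configs.push(features.to_vec()); // all features enabled
    for (i, f) in features.iter().enumerate() {
        configs.push(vec![*f]); // only this feature enabled
        let without: Vec<&'a str> = features
            .iter()
            .enumerate()
            .filter(|(j, _)| *j != i)
            .map(|(_, g)| *g)
            .collect();
        configs.push(without); // everything except this feature
    }
    // Small feature lists produce duplicates (e.g. one feature alone is
    // also "all enabled" when there is only one feature); drop them.
    configs.sort();
    configs.dedup();
    configs
}
```

For four features this gives 10 configurations instead of 16; which of them run on which targets, and how often, is still the kind of per-feature judgment call described above.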
@gnzlbg Cargo features are supposed to be additive; features interacting within libstd is already a bug IMO.
For sure if it turns out A+C+F has weird behavior compared with no features and A+B+C+D+E+F+G features, we could create a special `run-make` test to guarantee that special behavior in A+C+F.
Another option in this area is to force the use of profile overrides, as specified by [RFC2282](https://github.com/rust-lang/rfcs/blob/master/text/2282-profile-dependencies.md).

## Should providing a custom `core` or `std` require a nightly compiler?
Yes, I think this should be considered a requirement. We cannot reasonably have stability and at the same time allow people to do this with a stable compiler.
But isn't this already possible on stable? Just specify a `core`/`std` dependency in your `Cargo.toml` to override the default. Rust even automatically imports the prelude of the custom `std`.
We use this approach in our `stm32f7-discovery` crate to augment the `core` library with the required `Future` implementations for using async/await on `no_std`.
Ugh... @alexcrichton ^-- Was this intended and who intended it?
(the `Future` link is dead)
Ah sorry, fixed it. Basically we provide our own implementation for the parts of std::future that currently use thread local storage (including the `await` macro).
Also, part of the discussion on the pre-RFC was that overriding `std`/`core` on a stable compiler should be fine because actually implementing `std`/`core` will require using unstable features that will require a nightly compiler.
EDIT: and if in the future it’s possible to provide an implementation of `std`/`core` without using any unstable features, why should the act of switching your dependencies to it require a nightly compiler?
Otherwise if your crates can finagle some way to compile on stable but still override libstd, then more power to them! (aka you link to the real std and maybe wrap some of its functionality).
Should cargo provide a way to use the non-patched dependency from the patched one?
> But isn't this already possible on stable?

@phil-opp Unlike `[patch.sysroot]`, having a crates.io dependency that happens to be named `core` or `std` only affects your own crate. And when doing so on stable, you’re very limited in what this dependency can do. In particular it cannot (re)define lang items.

> Forcing unstable compiler just to be able to pick an std that works on embedded

@aep Do you mean that works with async/await without thread-local storage? IMO this is a failure of async/await, not of the `core` vs. `std` structure.
@SimonSapin yes. async is in core now, so we need to patch core. Hence the need for a way to have alt stable std/core in cargo
@aep Sorry, I haven’t followed all the recent development around async. Why do you need to patch `core`? Is there a path so that you don’t need to in the future?
In addition to being much preferable IMO, very practically for you, changes to async could happen sooner than stabilization of everything needed to define `core`.
The only parts of `core` that the `async_await` feature uses are `core::{future::Future, task::{Poll, Context}}`, these are such trivial constructs that I don't see how you could change them while still having the `async_await` feature work? (I guess replacing `Context` would be possible since `async_await` only passes the type around, but doing that would definitely be regarded as operating outside standard procedures).
`async_await` also uses some functions from `std` currently, that is just an implementation defect (rust-lang/rust#56974) and doesn't really seem like a good motivation for stabilizing the ability to provide your own `std` (it would be much better to put the energy towards fixing the defect so that you can just use the builtin async transform).
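To make the point about `core::{future::Future, task::{Poll, Context}}` concrete, here is a minimal sketch of a hand-written future that is polled once using only items from `core`. The names `Ready`, `noop_raw_waker`, and `poll_once` are illustrative, not part of any crate mentioned in this thread, and the no-op waker is only suitable for a demonstration like this one.

```rust
use core::future::Future;
use core::pin::Pin;
use core::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// A future that is immediately ready with a value (illustrative only).
struct Ready(Option<u32>);

impl Future for Ready {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u32> {
        // `Ready` is `Unpin`, so the pinned reference can be mutated directly.
        Poll::Ready(self.0.take().expect("polled after completion"))
    }
}

/// Build a waker that does nothing; it needs no allocator and no `std`.
fn noop_raw_waker() -> RawWaker {
    fn clone(_: *const ()) -> RawWaker {
        noop_raw_waker()
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    RawWaker::new(core::ptr::null(), &VTABLE)
}

/// Poll a future exactly once with the no-op waker.
fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    // Sound because every vtable function ignores the (null) data pointer.
    let waker = unsafe { Waker::from_raw(noop_raw_waker()) };
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}
```

Nothing here touches `std` or thread-local storage, which is consistent with the claim that the `std` dependency of the async transform is an implementation defect rather than something inherent.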
When compiling for a non-standard target, users may specify their target using a target specification file, rather than a pre-defined target.

> NOTE: The current target specification is described in JSON, and contains some
In jamesmunns#1 (comment) you noted that we should be "blind" to this format and what it contains and that stabilizing could be done later. My question is then: how can we reasonably change anything if people start relying on the format on stable? Stability is not just about what you say, it must also be practical.
With my language team hat on (but speaking for myself only) it is important that we be able to feasibly use other backends than LLVM. It should not just be a theoretical possibility to use Cranelift or some other backend, but practically possible in cargo for e.g. debug builds or whatnot.
I would suggest embracing the otherwise incremental approach of this RFC where we start with other things first and let custom target specs be unstable until we are confident that stabilizing won't cause headaches wrt. backends.
Meanwhile we can also do what should be a straightforward switch to TOML.
+1 for the switch to TOML as a first step.
However, I am very hesitant of the idea of stabilising the current custom target format. While there is an RFC describing the format, it feels very much like an unstable implementation detail at the moment, and I've personally put up with many problems with it because it's not meant to be a permanent solution and is not a priority at the moment; I wonder if this reflects other people's thoughts.
On alternative backends: if we do end up stabilising this, the minimum change I think we'd need to make would be to split some options into a `backend-options` map (like has been done with `pre-link-args`, for example). For example, you'd be required to write (in the current JSON format):

    "backend-options": {
        "llvm": {
            "data-layout": "e-m:e-i64:64-f80:128-n8:16:32:64-S128",
            "target": "x86_64-unknown-none",
        }
        {{other backends could then be added with backwards compatibility here}}
    }

You could then only build a target with a specific backend if it has a corresponding entry in the map.
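The lookup rule in the last sentence can be sketched as follows. This is a hypothetical model, not rustc's actual target-spec handling: the type alias `BackendOptions`, the helper names, and the use of plain string maps are all assumptions made for illustration, and `"cranelift"` appears only as an example of a backend without an entry.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory form of the proposed `backend-options` map:
/// backend name -> (option name -> option value).
type BackendOptions = HashMap<String, HashMap<String, String>>;

/// The example spec from the comment above, with only an `llvm` entry.
fn example_spec() -> BackendOptions {
    let mut llvm = HashMap::new();
    llvm.insert(
        "data-layout".to_string(),
        "e-m:e-i64:64-f80:128-n8:16:32:64-S128".to_string(),
    );
    llvm.insert("target".to_string(), "x86_64-unknown-none".to_string());

    let mut spec = BackendOptions::new();
    spec.insert("llvm".to_string(), llvm);
    spec
}

/// A target can only be built with `backend` if the spec has an entry for it.
fn options_for<'a>(
    spec: &'a BackendOptions,
    backend: &str,
) -> Result<&'a HashMap<String, String>, String> {
    spec.get(backend).ok_or_else(|| {
        format!("target spec has no `backend-options` entry for backend `{}`", backend)
    })
}
```

Adding a second backend later means adding a second key, which leaves existing specs valid.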
@Centril being blind to something is just parametricity :). Only rustc cares about the format, cargo need not care about the format at all. What it does need to do however is be able to tell if two machine configs are equal, since Cargo controls caching. So we need a:
abstract type MachineConfig: Eq;
Cargo can forward the contents as raw bytes to rustc (as a CLI arg, or temp file Cargo creates to avoid races, or many other things). As to equality, Cargo can take the hash of the config and use that as a cache key. Sure, this is never complete (concrete formats usually have a more flexible notion of equality), but it is always sound. That's just fine for now.
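A minimal sketch of that opaque-type idea, assuming the config is held as raw bytes. The method names are hypothetical, and `DefaultHasher` stands in for whatever stable hash a real build tool would use, since its output is not guaranteed to stay the same across Rust releases.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Opaque machine config: the tool never parses the bytes, it only
/// compares them, hashes them, and forwards them to the compiler.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct MachineConfig(Vec<u8>);

impl MachineConfig {
    fn from_bytes(bytes: &[u8]) -> Self {
        MachineConfig(bytes.to_vec())
    }

    /// Cache key derived from the raw bytes only.
    fn cache_key(&self) -> u64 {
        let mut hasher = DefaultHasher::new();
        self.hash(&mut hasher);
        hasher.finish()
    }

    /// The bytes to hand to the compiler unchanged.
    fn raw(&self) -> &[u8] {
        &self.0
    }
}
```

Byte equality is sound but not complete: two files that differ only in whitespace describe the same target yet compare unequal here, which costs a redundant rebuild and never a wrong cache hit.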
This one was a bit tricky to assign teams to... but:
(that's a lot of teams but I expect one team to take the lead and others to review... it may also change...)
(Implementation detail) I suspect there will need to be special considerations for proc macros and build scripts. My guess is that proc macros will be sensitive to having a different libstd, and may need to only be built with the libstd binaries from the rustc sysroot. This might be quite expensive for some projects.
Currently, `compiler-builtins` contains components implemented in the C programming language. While these dependencies have been highly optimized, the use of them would require the builder of the root crate to also have a working compilation environment for compilation in C.

This RFC proposes instead to use the [pure rust implementation] when compiling for a custom target, removing the need for a C compiler.
Sorry if this is a dumb question, would `backtrace` also require building a C library?
Definitely not a dumb question, I hadn't thought of this. @alexcrichton is there a pure Rust version of `libbacktrace`? Or are there any other C-dependencies you are aware of that would also need to be listed here?
This is rust-lang/rust#46439.
It's correct, yeah, that libbacktrace requires a C compiler right now, and while all the pieces exist in basic forms in pure Rust (e.g. gimli, addr2line examples for gimli, etc.), they haven't been integrated in such a way yet to get pulled into the standard library (the issue @kennytm pointed out is tracking that).
## Should profile changes always prompt a rebuild of `core`/`std`?

For example, if a user sets their debug build to use `opt-level = 'z'`, should this rebuild `core`/`std` to use that opt level? Or should an additional flag, such as `apply-to-sysroot` be required to opt-in to this behavior, unless otherwise needed?
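The opt-in variant of this open question could look roughly like the sketch below. Only the `apply-to-sysroot` name comes from the RFC text; the struct, its fields, and the idea of comparing against the opt-level of the prebuilt libraries are assumptions made for illustration.

```rust
/// Hypothetical inputs to the rebuild decision.
struct SysrootRequest<'a> {
    /// Opt-level the precompiled `core`/`std` were built with (assumed known).
    prebuilt_opt_level: &'a str,
    /// Opt-level requested by the active profile, e.g. "z".
    profile_opt_level: &'a str,
    /// Whether the user opted in via a flag such as `apply-to-sysroot`.
    apply_to_sysroot: bool,
    /// Whether a rebuild is needed anyway, e.g. for a custom target.
    custom_target: bool,
}

/// Opt-in policy: profile differences alone never trigger a rebuild.
fn should_rebuild_sysroot(req: &SysrootRequest<'_>) -> bool {
    if req.custom_target {
        // The "unless otherwise needed" case: no prebuilt artifacts exist.
        return true;
    }
    req.apply_to_sysroot && req.prebuilt_opt_level != req.profile_opt_level
}
```

Under this policy a user who only changes `opt-level` keeps the fast clean build, at the cost of a larger binary until they opt in.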
I'm a little confused on this point. Would it rebuild core/std even if they are not listed in `[dependencies]` or `[patch]`? I fear that always rebuilding when using non-default profile settings would be too much disruption. It would also be a little confusing, because there are multiple profiles, none of which match the settings used in the precompiled libraries. Perhaps core/std should only be rebuilt when they are explicitly listed in `Cargo.toml`?
Another potential issue is that the features to enable for libstd currently are driven by bootstrap's config.toml, so it may not be obvious to the user that they need to enable things like "backtrace" to have feature parity with the defaults (which change per platform). How do you switch to a custom-built std that retains feature parity with the default distribution per platform?
That is exactly the open question here. On one hand: if the user states they want "opt-level = 's'", and we can now give them that for libstd too, it would make sense to keep the total program size down.
On the other hand, this could surprise some users, as it could drastically increase a clean build time by rebuilding core and standard for them.
Good call regarding `config.toml`, I was not aware of how this mechanism worked. In `xargo`, you can set some (or all?) of these flags using a `feature = [ ... ]` syntax. I would guess we would need to expose these features as "Cargo Features" rather than config.toml features so the users may configure them, with `default-features` matching what the CI builds of `libstd` and `libcore` currently are.
I think they are "Cargo Features", `config.toml` just drives the defaults. This is done here.
One thing I'm aware of that is not driven via features is sanitizer support. That is done here. It looks like sanitizers are used on the linux builds, but I don't really know anything about them. Looks like it requires llvm?
Maybe the defaults could be captured somewhere when the distribution is made, and Cargo could read those (maybe in the "src" component? maybe rewrite the `Cargo.toml`?)? EDIT: 🤔 Except that won't work for non-host targets.
It's a general question whether any dependency needs to be rebuilt for these non-abi-affecting switches, so as usual I hope for a Cargo solution that doesn't special-case `core` and `std`.
Is it necessary to introduce the concept of "stable features" in this RFC? Can this be in a separate RFC from the building/source specification? I brought this up in the pre-RFC: I think the RFC should say something about the current What about the
That's just a cross compiling concern. I'm strongly of the opinion that one should always be cross compiling, and native compiling is just cross compiling where the platform you are building for and the platform you're building on happen to be the same. Under this philosophy, it should be possible to do (I'm strongly of this opinion having previously refactored two existing build tools to adopt this philosophy, even though they initially didn't at all. It's not too late for Cargo either.)
In today's Rust environment, `core` and `std` are shipped as precompiled objects. This was done for a number of reasons, including faster compile times, and a more consistent experience for users of these dependencies. This design has served the bulk of users fairly well. However there are a number of less common uses of Rust, that are not well served by this approach. Examples include:

* Supporting new/arbitrary targets, such as those defined by a custom target (".json") file
* Modifying `core` or `std` through use of feature flags
Cargo feature flags? Or compiler feature flags (`-C target-cpu`, `-C target-feature`, `--cfg foo`, etc.)? Or both?
It is necessary to use unstable features to build `core`. To allow users of a stable compiler to build `core`, we would set the `RUSTC_BOOTSTRAP` environment variable **ONLY** for the compilation of `core`.

This should be considered sound, as stable users may not change the source used to build `core`, or the features used to build `core`.
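As a rough sketch of "only for the compilation of `core`", the helper below prepares a `rustc` invocation and sets `RUSTC_BOOTSTRAP=1` on that one command, leaving the environment of every other crate's compilation untouched. The function names are hypothetical and this is not Cargo's actual implementation; the command is constructed but never executed here.

```rust
use std::ffi::OsStr;
use std::process::Command;

/// Prepare (but do not run) a compiler invocation for `crate_name`.
/// The bootstrap override is scoped to the `core` build only.
fn rustc_command(crate_name: &str) -> Command {
    let mut cmd = Command::new("rustc");
    cmd.arg("--crate-name").arg(crate_name);
    if crate_name == "core" {
        // Set on this child process only; the parent environment and
        // the commands for other crates are not affected.
        cmd.env("RUSTC_BOOTSTRAP", "1");
    }
    cmd
}

/// Check whether a prepared command explicitly sets the override.
fn sets_bootstrap(cmd: &Command) -> bool {
    cmd.get_envs()
        .any(|(key, value)| key == OsStr::new("RUSTC_BOOTSTRAP") && value == Some(OsStr::new("1")))
}
```

Because the variable is attached to a single `Command`, a user crate compiled in the same build never observes it.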
Why wouldn't this also be sound for crates on crates.io (or for alloc and libstd) ?
Also maybe mention that even when compiled with `RUSTC_BOOTSTRAP`, unstable features from core are not available to stable users.
In general, the same restrictions for building `core` will apply to building `std`. These include:

* Users of the stable compiler must use the source used to build the current Rust compiler
* Only compile time features considered `stable` may be used outside of nightly. Initially the list of `stable` features would be empty, and stabilizing these features would require a PR/RFC to `libstd`.
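The second restriction amounts to an allowlist check, sketched below. The `Channel` enum, the function name, and the treatment of beta like stable are assumptions for illustration; `"backtrace"` is used only because it is mentioned elsewhere in this thread as a libstd feature.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Channel {
    Stable,
    Beta,
    Nightly,
}

/// Reject any requested `std` feature that is not on the stable list,
/// unless the compiler is nightly. Returns the rejected feature names.
fn check_std_features(
    channel: Channel,
    requested: &[&str],
    stable: &[&str],
) -> Result<(), Vec<String>> {
    if channel == Channel::Nightly {
        return Ok(());
    }
    let rejected: Vec<String> = requested
        .iter()
        .filter(|f| !stable.contains(*f))
        .map(|f| (*f).to_string())
        .collect();
    if rejected.is_empty() {
        Ok(())
    } else {
        Err(rejected)
    }
}
```

With the initially empty stable list, every feature request on stable or beta is rejected, so the first stabilized feature is what changes observable behavior.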
We would have to support these stable features forever, so each feature should go through the RFC process. A PR + mini FCP is not an option for this in my opinion.
Sorry, that was meant to be RFC + PR, rather than either.
By using stable feature flags for `std`, we could say that `std` as a crate with `default-features = false` would essentially be `no_core`, or with `features = ["core"]`, we would be the same as `no_std`.

This abstraction may not map to the actual implementation of `libcore` or `libstd`, but instead be an abstraction layer for the end developer.
Note that we would still need to allow using `[dependency.core]` forever and somehow map that to a unified version of the library.
Or in other words: removing the concept of `core` and `std` (e.g. into a unified `std` that uses the portability lint) would be a breaking change if this RFC was merged as is.
I don't know how hard it would be to have a unified library that's provided as a "split" one simultaneously (or in an edition-dependent way) for backwards compatibility.
It seems like this could easily be dealt with by making `core` (and `alloc`) a facade over `std` with a limited set of default active features. As long as the current `core` -> `std` dependency order is not observable somehow.
One of the things I'm worried about is the trigger that causes Cargo to build libstd/libcore/etc. The RFC currently says that only the root crate can do it and it happens when a custom target is used, feature flags are modified, profile settings change, or
I'm a little uneasy about how we're going to handle
Hmm there should be some language in the RFC saying by default crates are assumed to depend on Then Cargo simply builds the dependencies which are needed per the normal rules.
If crates there are allowed to
I personally prefer several user facing crates (core, alloc, collections, etc.) over a single Also I think we should keep in mind PAL proposal: https://internals.rust-lang.org/t/4301
manifest: only include miri on the nightly channel miri needs to build std with xargo, which doesn't allow stable/beta: <japaric/xargo#204 (comment)> Therefore, at this time there's no point in making miri available on any but the nightly channel. If we get a stable way to build `std`, like [RFC 2663], then we can re-evaluate whether to start including miri, perhaps still as `miri-preview`. [RFC 2663]: rust-lang/rfcs#2663
What is the status of this?
@mark-i-m this is probably due an update to the RFC to take the feedback from the comments, and summarize the open discussion points. Unfortunately I will not have time to address this until early May, but I do plan to continue pushing for this then. As far as I know, there have not been any critical issues raised, though there are a number of open points that need to be addressed or discussed further.
I have been lax in responding to some points made in this RFC; I hope to change that soon.
We discussed this RFC in the @rust-lang/lang meeting today! In general, it seemed like there weren't a lot of lang-team specific concerns at this stage, though going forward there may be more. You can watch the video and find some notes here. On a personal note, I wanted to raise a concern about the use of cargo features: the
Good point. One way to address this, though it might generally be considered an anti-pattern, would be to not have any on-by-default features, and instead have features to disable things. Otherwise, I suppose it would be necessary to carefully choose what core features should not be behind feature flags. And then when (perhaps inevitably) a need arises for a way to disable one of those, that would end up with a feature to disable it rather than enable it (in contrast to some other features). I suppose the other extreme would be to have And then if all the crates (not only the top level one) might be specifying different std dependencies...
Luckily, the main point of this RFC can be implemented and stabilized without the "stable features", which could be added later (by initially making cargo reject any attempt to set features for std). Though this loses some, but not all, of the benefits of this feature.
Edit: Another conservative approach: initially ban
This RFC proposes the following concrete changes, which may or may not be implemented in this order, and may be done incrementally. The details and caveats around these stages are discussed in the [Reference Level Explanation][reference-level-explanation].

In this document, we use the term "root crate" to refer to the Rust project being built directly by Cargo. This crate contains the Cargo.toml used to guide the modifications described below. This would typically be a crate containing a binary application, or a standalone item, such as an `rlib`.
I feel that “root crate” is not the right concept. In a virtual workspace, there may not be a root crate at all. As far as I can tell what’s important for the purpose of this RFC is having a root `Cargo.toml` file in which to specify some configuration, whether or not there’s a corresponding compiler artifact.
So “root manifest” may be better here. It refers to the `Cargo.toml` file pointed to by `--manifest-path`, or in the current directory.
Yes, frankly I wish we always had a virtual workspace root. That would make the distinction between a "workspace query" and "crate graph solution" a lot clearer.
1. Allow developers of root crates to recompile `core` (and `compiler-builtins`) when their desired target does not match one available as a `rustup target add` target, without the usage of a nightly compiler. This version of `core` would be built from the same source files used to build the current version of `rustc`/`cargo`.
I don’t see a reason to limit this to libcore. In other words, I think we should do part 3 from the start and support all standard library crates. Currently this is `core`, `alloc`, `std`, `proc_macro`, and `test`. Maybe also their dependencies?
In the same vein, please use “standard library crates” throughout the RFC wherever it currently says “`core`”.
Users of a stable compiler would not be able to customize `core` outside of these profile settings.

For users of a nightly compiler, compile time features of `core` may be specified using the same syntax used for other crate dependencies. These specified features may include unstable features.
The syntax should be similar to that for crates.io dependencies. But I feel rather strongly that it should not be exactly the same. Currently the example below refers to https://crates.io/crates/core. The fact that the crates.io server rejects uploads to that name is a separate concern, unrelated to the meaning of a given bit of `Cargo.toml` syntax.
How does Cargo know the difference between a sysroot dependency and a crates.io dependency? I think we should not hard-code a list of known standard library crate names.
Cargo already has a concept of “source” for all packages, with the default being crates.io. Some keys in the TOML table/dictionary for a dependency can change the source: `path`, `git`, and `registry`. I think `sysroot = true` (name to be bikeshedded) should be another one of these, and necessary to refer to standard library crates.
I believe this is compatible (aligns, in the underlying concepts in Cargo) with the `[patch.sysroot]` syntax proposed later in this RFC.
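The "source is chosen by a key in the dependency table" idea can be modeled with the sketch below. It is not Cargo's real data model: the `Source` enum, the string-only table, and the key precedence are assumptions, and `sysroot` is the bikesheddable key name suggested in the comment above.

```rust
use std::collections::BTreeMap;

/// Hypothetical set of package sources, with crates.io as the default.
#[derive(Debug, PartialEq, Eq)]
enum Source {
    CratesIo,
    Path(String),
    Git(String),
    Registry(String),
    Sysroot,
}

/// Pick a source from the keys of a dependency table, modeled here as a
/// flat string map. The crate's name plays no role in the decision.
fn source_of(dep: &BTreeMap<&str, &str>) -> Source {
    if dep.get("sysroot") == Some(&"true") {
        Source::Sysroot
    } else if let Some(path) = dep.get("path") {
        Source::Path(path.to_string())
    } else if let Some(url) = dep.get("git") {
        Source::Git(url.to_string())
    } else if let Some(name) = dep.get("registry") {
        Source::Registry(name.to_string())
    } else {
        Source::CratesIo
    }
}
```

Because the decision depends only on the table's keys, no list of standard library crate names has to be hard-coded: a dependency named `core` without `sysroot = true` still resolves to crates.io.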
@SimonSapin the problem is that which libraries are special-cased to the implementation is necessarily an implementation specific detail. We want to move crates to crates.io when they become 100% stable code and not have things break. I think the compiler should provide a "workspace override" which does the `[patch.sysroot]`-ing and/or we start to distinguish the "default source" from `crates.io` (the default source may be a composition of sources).
@Ericson2314 I’m sorry. I feel like I should respond to your message here, but I have a very hard time parsing it :/ It seems based on multiple unstated assumptions. For example:
> an implementation specific detail

Are you imagining a world where there are multiple implementations of Rust where they each have their take on the entire toolchain? Today mrustc exists, but doesn’t have its own Cargo.
My objection was to hard-coding a list of names known to be crates of the standard library in Cargo, because Cargo and the standard library are developed separately.
> We want to move crates to crates.io when they become 100% stable code and not have things break.

I know from previous interactions with you that you have this idea of the standard library somehow having its source of truth on crates.io. You take it as a given here but I still maintain that this idea is fundamentally contradictory, because of the difference in versioning between these two worlds.
A library gets to pick what other libraries from crates.io it depends on, and it gets to pick their version numbers. A program can even end up with multiple versions of the “same” library in its dependency graph.
A library does not however get to pick what version of the compiler is being used. (Only which versions it tries to be compatible with.) The standard library is by definition what ships with a given compiler. It doesn’t have its own version number, it shares the compiler’s version number.
I think that one can make a valid argument that less functionality should be the responsibility of the standard library. For example, `std::sync::mpsc` could be deprecated without a replacement in `std` and https://crates.io/crates/crossbeam-channel be recommended instead. But being shipped with the compiler is what “standard library” means. “Moving `std` to crates.io” doesn’t make any sense in my opinion.
That's how the standard library works today, but there's no rule that it must always be that way. It's true that some parts of it, mostly in `core`, are tightly bound to the compiler implementation, such as wrappers around intrinsics and definitions of language items. But significant chunks are not, including most of `std` itself (all the wrappers around OS functionality) as well as significant parts of `core` (e.g. unicode, flt2dec, iterator adapters). The current implementations of those APIs may or may not use unstable features, but the APIs don't inherently require it. The deprecation route you mentioned could work in theory, but I don't think users would appreciate a large percentage of the standard library being deprecated.
Personally, I would like to see a new "even more core than `core`" crate that has the absolute bare minimum amount of code required to expose all of the compiler's functionality. Then the rest can be moved to `crates.io`, and can be made optional – useful for cases where, say, you're extremely constrained on binary size and want to be in control of every single line of code that actually makes it into the output binary.
> Then the rest can be moved to crates.io

`rand` moved from the standard library to crates.io. Because it was not marked as stable in `std` (or it was before 1.0, I don’t remember), we could simply remove it from there.
But there’s not a lot of that left that we can do. For APIs that are `#[stable]` in `std`, we promised to keep having them in `std`. At most we can deprecate them, which as you noted has downsides. With that in mind, what does “move to crates.io” even mean? Something can not move if it also stays at its current location. Are you saying you want two copies of the same APIs? How is that useful? Can they diverge, in particular if the crates.io one wants to make breaking changes in a 2.0 version?
> can be made optional

This RFC proposes cargo feature flags for standard library crates, which presumably will allow making parts of the standard library optional: you can disable them if you don’t use them. How does “move to crates.io” help with not using something?
> for cases where, say, you're extremely constrained on binary size

The decision to move Unicode tables into libcore was based explicitly on them not affecting the binary size of programs that don’t use them, with confirmation from the Embedded WG. (The linker eliminates unused symbols.) So I feel that binary size is not a valid argument for a smaller standard library.
> Are you imagining a world where there are multiple implementations of Rust where they each have their take on the entire toolchain? Today mrustc exists, but doesn’t have its own Cargo.
>
> My objection was to hard-coding a list of names known to be crates of the standard library in Cargo, because Cargo and the standard library are developed separately.

You may be surprised that I agree with that. When I said implementation-specific I meant rustc, not Cargo. Much unstable code exists because it is more tightly coupled with rustc. A different compiler or interpreter may need to be tightly coupled in a different way.
> You take it as a given here but I still maintain that this idea is fundamentally contradictory, because of the difference in versioning between these two worlds. [...]

I hear what you are saying. But I don't hear any contradictions. To be clear, we agree that `std` needs to continue to export everything it does today. But nothing says it cannot reexport stuff rather than implement it itself.
Maybe we don't agree that reexporting like this is desirable, but let's first agree that it's possible. OK?
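As a rough illustration of what "reexport rather than implement" means mechanically, here is a self-contained sketch. The modules `external_channel` and `my_std` and the `Counter` type are hypothetical stand-ins (for a crates.io crate and for `std`); nothing here reflects how the real standard library is organized.

```rust
/// Stand-in for a hypothetical crates.io crate that holds the implementation.
mod external_channel {
    pub struct Counter {
        value: u32,
    }

    impl Counter {
        pub fn new() -> Self {
            Counter { value: 0 }
        }

        /// Increment and return the new value.
        pub fn increment(&mut self) -> u32 {
            self.value += 1;
            self.value
        }
    }
}

/// Stand-in for `std`: the public path stays the same for users,
/// but the item is re-exported instead of being defined here.
mod my_std {
    pub mod sync {
        // Relative path so the sketch works wherever the two modules are siblings.
        pub use super::super::external_channel::Counter;
    }
}
```

Because a re-export names the same item, `my_std::sync::Counter` and `external_channel::Counter` are one type, so existing users of the old path keep compiling. Whether this is desirable for `std` is exactly what is being debated above.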
@comex I don't think we can have a single compiler-specific crate unless we allow crates to have cyclic dependencies, but yes glad to hear we both want a clean separation between rustc-specific and rustc-agnostic code.
> The decision to move Unicode tables into libcore was based explicitly on them not affecting the binary size of programs that don’t use them, with confirmation from the Embedded WG. (The linker eliminates unused symbols.) So I feel that binary size is not a valid argument for a smaller standard library.

I know. But if you don't want something in the binary, it's cleaner to have it not present at all, rather than needing to manually avoid calling something that's always in scope (e.g. `char::is_whitespace`). This is especially weird with the format machinery (which is baked into libcore functions like `panic_bounds_check`); I see a comment of yours from last year saying it is possible to avoid it being linked in, but it's definitely nonintuitive what exactly the constraints are.
But sure, a minimal core could also be achieved by putting everything else behind Cargo feature flags, rather than actually creating a separate crate.
By the way, another use case for a minimal core is if you just want to write your own standard library with a different design, and want to be able to replace things like impls on primitives (`impl char`, `impl<T> [T]`, `impl<T> *const T`), or the precise list of methods on `Iterator` (even if some basic functionality has to be fixed for the compiler to codegen `for` loops). Those things are lang items, so letting third-party crates implement them on stable would require additional work. However, a minimal core is a prerequisite.
> With that in mind, what does “move to crates.io” even mean? Something can not move if it also stays at its current location. Are you saying you want two copies of the same APIs? How is that useful? Can they diverge, in particular if the crates.io one wants to make breaking changes in a 2.0 version?

As I imagine it, the source of truth would be on crates.io, but the Rust distribution would also include a copy (which would simply be a copy of some version of the crates.io crate), mainly for backwards compatibility purposes. (Edit: Cargo would probably default to pulling the latest version from crates.io when you run `cargo update`, though we'd want a better concept of 'minimum compiler version' to ensure it only picks versions that are compatible with the current compiler.) A 2.0 could be done in theory but is probably inadvisable. Benefits of having a separate, stable-source crate include:
- Ability to patch the source on stable (this RFC already proposes `patch` functionality but only for nightly).
- Ability for conservatively-minded users to update to a newer compiler while holding back on changes to `std`, e.g. the recent wholesale replacement of `HashMap` with `hashbrown` or the upcoming replacement of synchronization primitives with `parking_lot`.
- On the flipside: ability for experimentally-minded users to test such changes in advance of a full release, without having to also change their compiler configuration (which makes comparison more difficult).
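The "minimum compiler version" idea mentioned above could look roughly like the following sketch: among all published releases, pick the newest one whose declared minimum compiler is not newer than the compiler in use. The `Release` struct, its `min_rustc` field, and `newest_compatible` are assumptions made for illustration, not an existing Cargo feature.

```rust
/// A published release of a hypothetical stable-source library crate.
/// Versions are (major, minor, patch) tuples, which compare lexicographically.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Release {
    version: (u32, u32, u32),
    /// Oldest compiler this release claims to support (assumed metadata).
    min_rustc: (u32, u32, u32),
}

/// Return the newest release that the given compiler can build,
/// or `None` if every release requires a newer compiler.
fn newest_compatible(releases: &[Release], rustc: (u32, u32, u32)) -> Option<Release> {
    releases
        .iter()
        .copied()
        .filter(|r| r.min_rustc <= rustc)
        .max_by_key(|r| r.version)
}
```

With releases that require compilers 1.30, 1.34, and 1.36, a 1.35 compiler would get the middle release, which is the "holding back" behaviour described in the second bullet.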
```
cargo build --target thumbv7em-freertos-eabihf.json
```

In general, any of the following would prompt Cargo to recompile `core`, rather than use a pre-compiled version:
Rather than trying to make an exhaustive list of cases where a standard library crate is re-compiled, this could instead specify that a crate is compiled whenever an identical binary is not available pre-compiled, where “identical” is based on the target, the feature flags, the profile settings, the sources, etc.
To nitpick: does “custom” target mean only JSON files? Not targets known to rustc but where a precompiled libcore is not available? Removing this list of conditions removes the need to make it accurate.
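One way to read this suggestion is as a cache lookup keyed on everything that affects the output. The sketch below assumes a hypothetical `BuildKey` with four illustrative fields; the real set of inputs Cargo would need to compare is not specified in this thread.

```rust
use std::collections::{BTreeSet, HashSet};

/// Everything that must match for a pre-compiled artifact to count as
/// "identical". The fields are illustrative assumptions, not Cargo's real key.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct BuildKey {
    target: String,
    features: BTreeSet<String>, // ordered set, so feature order never matters
    opt_level: u8,
    source_hash: u64,
}

/// Convenience constructor for the sketch.
fn build_key(target: &str, features: &[&str], opt_level: u8, source_hash: u64) -> BuildKey {
    BuildKey {
        target: target.to_string(),
        features: features.iter().map(|f| f.to_string()).collect(),
        opt_level,
        source_hash,
    }
}

/// Build locally exactly when no identical pre-compiled artifact exists.
fn needs_local_build(wanted: &BuildKey, precompiled: &HashSet<BuildKey>) -> bool {
    !precompiled.contains(wanted)
}
```

There is no list of "reasons to rebuild" to keep accurate here: a changed profile, a custom target, or patched sources all simply produce a key that is not in the set.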
Wonderful point! "explicit invalidation" is basically incorrect by construction. Bad in code, and bad in writing.
#### Stabilization of a Target Specification Format

As the custom target specifications (currently JSON) would become part of the stable interface of Cargo, the format used by this file must be stabilized, and further changes must be made in a backwards compatible way to guarantee stability.
Please clarify this “must”. Is this RFC proposing to stabilize the JSON file syntax (and set of keys) as-is? Or is it saying that another RFC will need to propose stabilization, possibly with changes? Based on other comments, I think it should be the latter as the set of keys needs work.
Nobody contradicted me when I said this can be a black box that Cargo compares byte-by-byte as a stop-gap. I think trading some cache hits for future compat is a good move here.
## Should Cargo obtain or verify the source code for `libcore` or `libstd`?

Right now we depend on `rustup` to obtain the correct source code for these libraries, and we rely on the user not to tamper with the contents. Are these reasonable decisions?
It is at least consistent with the `RUSTC_BOOTSTRAP` environment variable being a simple boolean flag rather than something more tricky like it used to be. (I think it was a somewhat-obfuscated release-specific key?) The stability mechanism is a social contract; it does not do much technically to stop someone determined to use unstable features on the stable channel.
## Should the custom built `libcore` and `libstd` reside locally or globally?

e.g., should the build artifacts be placed in `target/`, only usable by this project, or in `.cargo/`, to be possibly reused by multiple projects, if they happen to have the same settings?
The conservative default is `target/`.
If we are sharing local builds of `std` between projects in `~/.cargo`, why not also share local builds of, say, `serde`? Or any crate from crates.io, or other source? This is an idea worth exploring, but probably better left to another RFC.
## How do we handle `libcore` and `libstd`'s `Cargo.lock` file?

Right now these are built using the global lock file in `rust-lang/rust`. Should this always be true? How should Cargo handle this gracefully?
I think this is important and needs to be resolved before this RFC can be stabilized.
Related: currently when `std` depends on a crate from crates.io like `libc`, that crate is compiled on CI together with `std` and shipped with it in the sysroot. Users end up with a sysroot `libc` which is a separate crate from a `libc` that they would depend on from crates.io themselves. I believe `std` does not reexport any struct or enum type from `libc` but if it did, in a user crate it would be a different incompatible type than the one with the same name from crates.io `libc`.
Based on the principle that locally-compiled sysroot should behave the same as pre-compiled, I think we want to preserve this separation. Even if Cargo will download them from the same place (and share the download cache), sysroot+crates.io packages should be separate from crates.io packages as far as dependency graph resolution is concerned. (Similar to how a git dependency is separate from a crates.io dependency of the same name.)
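A small sketch of the separation described here, under the assumption that package identity in the resolver includes where the package came from. `Origin`, `PackageId`, and `distinct_packages` are hypothetical names, and the version string is a placeholder; Cargo's real resolver types are not shown.

```rust
use std::collections::HashSet;

/// Which part of the dependency graph a package belongs to (assumed model).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Origin {
    CratesIo,
    Sysroot, // downloaded from crates.io, but resolved as part of the sysroot
}

/// Identity of a node in the dependency graph: the origin is part of it.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct PackageId {
    name: String,
    version: String,
    origin: Origin,
}

/// Count how many distinct graph nodes a list of package ids collapses to.
fn distinct_packages(ids: &[PackageId]) -> usize {
    ids.iter().collect::<HashSet<_>>().len()
}
```

Under this model a user's `libc` and the sysroot's `libc` never unify, even at the same version, which mirrors how a git dependency stays separate from a crates.io dependency of the same name.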
Haven't private dependencies landed? I think that is the best way to express this. Unlike a non-crates.io-source trick, it also catches std leaking those types in the first place.
I hope that someday:
- Cargo gets a global cache
- Sysroot goes away
- stdlib crates can safely have public dependencies

But that is a problem for another day.
Another option in this area is to force the use of profile overrides, as specified by [RFC 2282](https://github.com/rust-lang/rfcs/blob/master/text/2282-profile-dependencies.md).

## Should providing a custom `core` or `std` require a nightly compiler?
> But isn't this already possible on stable?

@phil-opp Unlike `[patch.sysroot]`, having a crates.io dependency that happens to be named `core` or `std` only affects your own crate. And when doing so on stable, you’re very limited in what this dependency can do. In particular it cannot (re)define lang items.
> Forcing unstable compiler just to be able to pick an std that works on embedded

@aep Do you mean that works with async/await without thread-local storage? IMO this is a failure of async/await, not of the `core` vs. `std` structure.
With the ability to build these crates on demand, we may want to decide not to ship `target` bundles for any users.

This would come at a cost of increased compile times, at least for the first build, if the artifacts are cached globally. However it would remove a mental snag of having to sometimes run `rustup target add`, and confusion among some users about why parts of `std` and `core` have different optimization settings (particularly for debug builds) when debugging.
Wouldn’t this RFC as-is already make `rustup target add` unnecessary? (In at least some cases.) When the pre-compiled standard library is not available for a given target, Cargo falls back to compiling it. We could emit a warning for targets that are known to have a pre-compiled copy available through rustup but not installed locally.
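The fallback and warning described in this comment can be sketched as a small decision function. `StdPlan` and `plan_std` are hypothetical names, and the two target lists are plain inputs to the sketch rather than anything queried from rustup.

```rust
/// What to do about the standard library for one target (assumed model).
#[derive(Debug, PartialEq, Eq)]
enum StdPlan {
    /// A pre-compiled copy is installed locally; use it.
    UsePrecompiled,
    /// Compile from source; optionally hint that rustup could have provided it.
    BuildLocally { warn_rustup_available: bool },
}

/// Decide between the installed pre-compiled std and a local build.
/// `installed` lists targets with a local pre-compiled std;
/// `rustup_offers` lists targets for which rustup is known to ship one.
fn plan_std(installed: &[&str], rustup_offers: &[&str], target: &str) -> StdPlan {
    if installed.contains(&target) {
        StdPlan::UsePrecompiled
    } else {
        StdPlan::BuildLocally {
            warn_rustup_available: rustup_offers.contains(&target),
        }
    }
}
```

A custom JSON target such as the `thumbv7em-freertos-eabihf` example would fall through to a silent local build, while a known-but-not-installed target would build and also produce the suggested warning.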
That's not just an anti-pattern, that doesn't work with how Cargo treats features. If crate
This doesn't actually seem that crazy to me (I have released a crate which is empty without any features active), you could even take it to the extreme and start with
That seems like how it must work, if The above is written from the perspective that
Just because line comments can get lost, I wrote what I'm pretty sure is a complete solution for the
We are closing this PR for the time being to take a step back and break this down into smaller pieces. We have created a new repository at https://github.com/rust-lang/wg-cargo-std-aware/ to continue discussion on individual issues. Everyone is welcome and encouraged to continue discussion on that repo. We envision separating this into the following tasks, some of which can begin immediately. How these map to future RFCs is yet to be decided.
We'd like to thank everyone for the discussion so far; it has been very useful! We hope that taking this approach will allow us to better focus on the issues and make incremental progress sooner.
Rendered.
Please see Pre-RFC discussion here.
Note: I'm happy to squash/rebase out the multiple commits below. I have included them as they were part of the Pre-RFC discussion.