std Aware Cargo #2663

Closed
wants to merge 14 commits into from
335 changes: 335 additions & 0 deletions text/0000-std-aware-cargo.md
@@ -0,0 +1,335 @@
- Feature Name: cargo_the_std_awakens
- Start Date: 2018-03-16
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary
[summary]: #summary

As of Rust 1.33.0, the `core` and `std` components of Rust are handled in a different way than Cargo handles other crate dependencies. This causes issues for non-mainstream targets, such as WASM, Embedded, and new not-yet-tier-1 targets. The following RFC proposes a roadmap to address these concerns in a consistent and incremental process.

# Motivation
[motivation]: #motivation

In today's Rust environment, `core` and `std` are shipped as precompiled objects. This was done for a number of reasons, including faster compile times and a more consistent experience for users of these dependencies. This design has served the bulk of users fairly well. However, a number of less common uses of Rust are not well served by this approach. Examples include:

* Supporting new/arbitrary targets, such as those defined by a custom target (".json") file
* Modifying `core` or `std` through use of feature flags
Contributor

Cargo feature flags? or compiler feature flags (-C target-cpu, -C target-feature, --cfg foo, etc. ?) Or both?

* Users who would like to make different optimizations to `core` or `std`, such as `opt-level = 's'`, with `panic = "abort"`

Previously, these needs were somewhat addressed by the external tool [xargo], which managed the recompilation of these dependencies when necessary. However, this tool has become [deprecated], and even when supported, it required a nightly version of the compiler for all operations.

Integrating this functionality into Cargo has [gathered support] from various [Rust team members]. This RFC takes inspiration from the workflows of tools like [xargo] and aims to integrate them into Cargo itself.

[xargo]: https://github.com/japaric/xargo
[deprecated]: https://github.com/japaric/xargo/issues/193
[gathered support]: https://github.com/japaric/xargo/issues/193#issuecomment-359180429
[Rust team members]: https://www.ncameron.org/blog/cargos-next-few-years/

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

This proposal aims to make `core` and `std` feel a little bit less like a special case compared to other dependencies to the end users of Cargo. It aims to minimize the number of new concepts needed to achieve this: users would interact with, configure, modify, and patch `core` and `std` in a manner similar to other dependencies.

This RFC proposes the following concrete changes, which may or may not be implemented in this order, and may be done incrementally. The details and caveats around these stages are discussed in the [Reference Level Explanation][reference-level-explanation].

In this document, we use the term "root crate" to refer to the Rust project being built directly by Cargo. This crate contains the Cargo.toml used to guide the modifications described below. This would typically be a crate containing a binary application, or a standalone item, such as an `rlib`.
Contributor

I feel that “root crate” is not the right concept. In a virtual workspace, there may not be a root crate at all. As far as I can tell what’s important for the purpose of this RFC is having a root Cargo.toml file in which to specify some configuration, whether or not there’s a corresponding compiler artifact.

So “root manifest” may be better here. It refers to the Cargo.toml file pointed to by --manifest-path, or in the current directory.

Contributor

Yes, frankly I wish we always had a virtual workspace root. That would make the distinction between a "workspace query" and "crate graph solution" a lot clearer.


1. Allow developers of root crates to recompile `core` (and `compiler-builtins`) when their desired target does not match one available as a `rustup target add` target, without the usage of a nightly compiler. This version of `core` would be built from the same source files used to build the current version of `rustc`/`cargo`.
Contributor

I don’t see a reason to limit this to libcore. In other words, I think we should do part 3 from the start and support all standard library crates. Currently this is core, alloc, std, proc_macro, and test. Maybe also their dependencies?

In the same vein, please use “standard library crates” throughout the RFC wherever it currently says “core”.

2. Allow the usage of Cargo features with the `core` library, additionally introducing the concept of "stable features" for `core`, which allow the end user to influence the behavior of their custom version of `core` without the use of a nightly compiler.
3. Extend the new behaviors described in step 1 and 2 for `std` (and `alloc`).
4. Allow the user to provide their own custom source versions of `core` and `std`, allowing for deep customizations when necessary. This will require a nightly version of the compiler.

As a new concept, the items above propose the existence of "stable features" for `core` and `std`. These features would be considered stable with the same degree of guarantees made for stability in the rest of the language. These features would allow configuration of certain functionalities of `core` or `std`, in a way decided at compile time.

For example, we could propose a feature called `force-tiny-fmt`, which would use different algorithms to implement `fmt` for use on resource constrained systems. The developer of the root crate would be able to choose the default behavior, or the `force-tiny-fmt` behavior while still retaining the ability of using a stable compiler.


# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

A reference-level explanation is made for each of the items enumerated above.

## 1 - Allow developers of root crates to recompile `core`

### Use Case

For developers working with new targets not yet supported by the Rust project, this feature would allow the compilation of `core` for any target that can be specified as a valid [custom target specification].

[custom target specification]: https://rust-lang.github.io/rfcs/0131-target-specification.html

This functionality would be possible even with the use of a stable compiler.

Users of a nightly compiler would be able to set compile time feature flags for `core` through settings made in their `Cargo.toml`.

### Caveats

For users of a stable compiler, it would not be possible to modify the source code contents of `core`, or change any compile time features of `core` from the defaults used when publishing pre-compiled versions of `core`.

`core` would be built from the source code that corresponds to the compiler used for building the current project.

### User Interaction

When compiling for a non-standard target, users may specify their target using a target specification file, rather than a pre-defined target.

> NOTE: The current target specification is described in JSON, and contains some
Contributor

In jamesmunns#1 (comment) you noted that we should be "blind" to this format and what it contains and that stabilizing could be done later. My question is then: how can we reasonably change anything if people start relying on the format on stable? Stability is not just about what you say, it must also be practical.

With my language team hat on (but speaking for myself only) it is important that we be able to feasibly use other backends than LLVM. It should not just be a theoretical possibility to use Cranelift or some other backend, but practically possible in cargo for e.g. debug builds or whatnot.

I would suggest embracing the otherwise incremental approach of this RFC where we start with other things first and let custom target specs be unstable until we are confident that stabilizing won't cause headaches wrt. backends.

Meanwhile we can also do what should be a straightforward switch to TOML.


+1 for the switch to TOML as a first step.

However, I am very hesitant of the idea of stabilising the current custom target format. While there is an RFC describing the format, it feels very much like an unstable implementation detail at the moment, and I've personally put up with many problems with it because it's not meant to be a permanent solution and is not a priority at the moment; I wonder if this reflects other people's thoughts.


On alternative backends: if we do end up stabilising this, the minimum change I think we'd need to make would be to split some options into a backend-options map (like has been done with pre-link-args, for example). For example, you'd be required to write (in the current JSON format):

"backend-options": {
    "llvm": {
        "data-layout": "e-m:e-i64:64-f80:128-n8:16:32:64-S128",
        "target": "x86_64-unknown-none",
    }

    {{other backends could then be added with backwards compatibility here}}
}

You could then only build a target with a specific backend if it has a corresponding entry in the map.

Contributor

@Centril being blind to something is just parametricity :). Only rustc cares about the format, cargo need not care about the format at all. What it does need to do however is be able to tell if two machine configs are equal, since Cargo controls caching. So we need a:

abstract type MachineConfig: Eq;

Cargo can forward the contents as raw bytes to rustc (as a CLI arg, or temp file Cargo creates to avoid races, or many other things). As to equality, Cargo can take the hash of the config and use that as a cache key. Sure, this is never complete (concrete formats usually have a more flexible notion of equality), but it is always sound. That's just fine for now.
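
The hashing idea above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: `MachineConfig`, `from_bytes`, and `cache_key` are hypothetical names, not Cargo APIs, and `DefaultHasher` stands in for whatever hash Cargo would really use (its output is not guaranteed to be stable across Rust releases, so a real cache key would need a fixed algorithm).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Opaque wrapper around the raw bytes of a target specification file.
/// Hypothetical type: nothing here parses or interprets the contents.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct MachineConfig(Vec<u8>);

impl MachineConfig {
    /// Store the bytes exactly as read from disk.
    pub fn from_bytes(raw: &[u8]) -> Self {
        MachineConfig(raw.to_vec())
    }

    /// Cache key derived only from the bytes. Two files that differ in
    /// whitespace get different keys: sound, but not complete.
    pub fn cache_key(&self) -> u64 {
        let mut hasher = DefaultHasher::new();
        self.0.hash(&mut hasher);
        hasher.finish()
    }
}
```

Byte-for-byte identical specifications share a key, while semantically equal but differently formatted files simply miss the cache, which matches the "never complete, always sound" trade-off described above.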

> implementation details regarding the use of LLVM as the compiler backend. This
> RFC does not prescribe any changes to the Target Specification format, and is
> intended to work with whatever the current/stable method of specifying a
> custom target is.

For example, currently a user may cross-compile by specifying a target known by Rust:

```sh
cargo build --target thumbv7em-none-eabihf
```

Users would also be able to specify a target specification file, by providing a path to the file to be used.

```sh
cargo build --target thumbv7em-freertos-eabihf.json
```
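
As a rough sketch of how the two invocations above might be told apart, the following assumes that a `--target` value ending in `.json` names a specification file and that anything else is a target triple known to the compiler. `TargetArg` and `classify_target` are hypothetical names used only for illustration.

```rust
use std::path::PathBuf;

/// The two kinds of `--target` value discussed above (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum TargetArg {
    /// A target triple known to the compiler, e.g. `thumbv7em-none-eabihf`.
    Builtin(String),
    /// A path to a custom target specification file.
    SpecFile(PathBuf),
}

/// Classify a `--target` argument. Assumption: the `.json` suffix alone
/// distinguishes a specification file from a built-in triple.
pub fn classify_target(arg: &str) -> TargetArg {
    if arg.ends_with(".json") {
        TargetArg::SpecFile(PathBuf::from(arg))
    } else {
        TargetArg::Builtin(arg.to_string())
    }
}
```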

In general, any of the following would prompt Cargo to recompile `core`, rather than use a pre-compiled version:
Contributor

Rather than trying to make an exhaustive list of cases where a standard library crate is re-compiled, this could instead specify that a crate is re-compiled whenever an identical binary is not available pre-compiled. Here “identical” is based on the target, the feature flags, the profile settings, the sources, etc.

To nitpick: does “custom” target mean only JSON files? Not targets known to rustc but where a precompiled libcore is not available? Removing this list of conditions removes the need to make it accurate.

Contributor

Wonderful point! "explicit invalidation" is basically incorrect by construction. Bad in code, and bad in writing.


* A custom target specification is used
* The root crate has modified the feature flags of `core`
* The root crate has set certain profile settings, such as opt-level, etc.
* The root crate has specified a `patch.sysroot` (this is defined in a later section)
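
The four conditions listed above could be modelled roughly as follows. This is a sketch only: `CoreBuildRequest` and its field names are hypothetical and do not correspond to existing Cargo internals.

```rust
/// Facts about the current build that matter for `core` (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub struct CoreBuildRequest {
    /// A custom target specification is used.
    pub custom_target_spec: bool,
    /// The root crate has modified the feature flags of `core`.
    pub features_modified: bool,
    /// The root crate has set relevant profile settings, such as opt-level.
    pub profile_modified: bool,
    /// The root crate has specified a `patch.sysroot`.
    pub sysroot_patched: bool,
}

impl CoreBuildRequest {
    /// True if any listed condition holds, meaning the pre-compiled `core`
    /// cannot be used and Cargo would recompile it.
    pub fn needs_rebuild(&self) -> bool {
        self.custom_target_spec
            || self.features_modified
            || self.profile_modified
            || self.sysroot_patched
    }
}
```

A default request (built-in target, default features and profile, no patch) keeps the pre-compiled `core`; setting any one field triggers a rebuild.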

Users of a stable compiler would not be able to customize `core` outside of these profile settings.

For users of a nightly compiler, compile time features of `core` may be specified using the same syntax used for other crate dependencies. These specified features may include unstable features.
Contributor

The syntax should be similar to that for crates.io dependencies. But I feel rather strongly that it should not be exactly the same. Currently the example below refers to https://crates.io/crates/core. The fact that the crates.io server rejects uploads to that name is a separate concern, unrelated to the meaning of a given bit of Cargo.toml syntax.

How does Cargo know the difference between a sysroot dependency and a crates.io dependency? I think we should not hard-code a list of known standard library crate names.

Cargo already has a concept of “source” for all packages, with the default being crates.io. Some keys in the TOML table/dictionary for a dependency can change the source: path, git, and registry. I think sysroot = true (name to be bikeshedded) should be another one of these, and necessary to refer to standard library crates.

I believe this is compatible (aligns, in the underlying concepts in Cargo) with the [patch.sysroot] syntax proposed later in this RFC.

Contributor

@SimonSapin the problem is that which libraries are special-cased to the implementation is necessarily an implementation specific detail. We want to move crates to crates.io when they become 100% stable code and not have things break. I think the compiler should provide a "workspace override" which does the [patch.sysroot]-ing and/or we start to distinguish the "default source" from crates.io (the default source may be a composition of sources).

Contributor

@Ericson2314 I’m sorry. I feel like I should respond to your message here, but I have a very hard time parsing it :/ It seems based on multiple unstated assumptions. For example:

> an implementation specific detail

Are you imagining a world where there are multiple implementations of Rust where they each have their take on the entire toolchain? Today mrustc exists, but doesn’t have its own Cargo.

My objection was to hard-coding a list of names known to be crates of the standard library in Cargo, because Cargo and the standard library are developed separately.

> We want to move crates to crates.io when they become 100% stable code and not have things break.

I know from previous interactions with you that you have this idea of the standard library somehow having its source of truth on crates.io. You take it as a given here but I still maintain that this idea is fundamentally contradictory, because of the difference in versioning between these two worlds.

A library gets to pick what other libraries from crates.io it depends on, and it gets to pick their version numbers. A program can even end up with multiple versions of the “same” library in its dependency graph.

A library does not however get to pick what version of the compiler is being used. (Only which versions it tries to be compatible with.) The standard library is by definition what ships with a given compiler. It doesn’t have its own version number, it shares the compiler’s version number.

I think that one can make a valid argument that less functionality should be the responsibility of the standard library. For example, std::sync::mpsc could be deprecated without a replacement in std and https://crates.io/crates/crossbeam-channel be recommended instead. But being shipped with the compiler is what “standard library” means. “Moving std to crates.io” doesn’t make any sense in my opinion.

@comex comex Jun 9, 2019

That's how the standard library works today, but there's no rule that it must always be that way. It's true that some parts of it, mostly in core, are tightly bound to the compiler implementation, such as wrappers around intrinsics and definitions of language items. But significant chunks are not, including most of std itself (all the wrappers around OS functionality) as well as significant parts of core (e.g. unicode, flt2dec, iterator adapters). The current implementations of those APIs may or may not use unstable features, but the APIs don't inherently require it. The deprecation route you mentioned could work in theory, but I don't think users would appreciate a large percentage of the standard library being deprecated.

Personally, I would like to see a new "even more core than core" crate that has the absolute bare minimum amount of code required to expose all of the compiler's functionality. Then the rest can be moved to crates.io, and can be made optional – useful for cases where, say, you're extremely constrained on binary size and want to be in control of every single line of code that actually makes it into the output binary.

Contributor

> Then the rest can be moved to crates.io

rand moved from the standard library to crates.io. Because it was not marked as stable in std (or it was before 1.0, I don’t remember), we could simply remove it from there.

But there’s not a lot of that left that we can do. For APIs that are #[stable] in std, we promised to keep having them in std. At most we can deprecate them, which as you noted has downsides. With that in mind, what does “move to crates.io” even mean? Something can not move if it also stays at its current location. Are you saying you want two copies of the same APIs? How is that useful? Can they diverge, in particular if the crates.io one wants to make breaking changes in a 2.0 version?

> can be made optional

This RFC proposes cargo feature flags for standard library crates, which presumably will allow making parts of the standard library optional: you can disable them if you don’t use them. How does “move to crates.io” help with not using something?

> for cases where, say, you're extremely constrained on binary size

The decision to move Unicode tables into libcore was based explicitly on them not affecting the binary size of programs that don’t use them, with confirmation from the Embedded WG. (The linker eliminates unused symbols.) So I feel that binary size is not a valid argument for a smaller standard library.

Contributor

@Ericson2314 Ericson2314 Jun 9, 2019

@SimonSapin

> Are you imagining a world where there are multiple implementations of Rust where they each have their take on the entire toolchain? Today mrustc exists, but doesn’t have its own Cargo.
>
> My objection was to hard-coding a list of names known to be crates of the standard library in Cargo, because Cargo and the standard library are developed separately.

You may be surprised that I agree with that. When I said implementation-specific I meant rustc, not Cargo. Much unstable code exists because it is more tightly coupled with rustc. A different compiler or interpreter may need to be tightly coupled in a different way.

> You take it as a given here but I still maintain that this idea is fundamentally contradictory, because of the difference in versioning between these two worlds. [...]

I hear what you are saying. But I don't hear any contradictions. To be clear, we agree that std needs to continue to export everything it does today. But nothing says it cannot reexport stuff rather than implement it itself.

Maybe we don't agree that reexporting like this is desirable, but let's first agree that it's possible. OK?

Contributor

@comex I don't think we can have a single compiler-specific crate unless we allow crates to have cyclic dependencies, but yes glad to hear we both want a clean separation between rustc-specific and rustc-agnostic code.

@comex comex Jun 10, 2019

> The decision to move Unicode tables into libcore was based explicitly on them not affecting the binary size of programs that don’t use them, with confirmation from the Embedded WG. (The linker eliminates unused symbols.) So I feel that binary size is not a valid argument for a smaller standard library.

I know. But if you don't want something in the binary, it's cleaner to have it not present at all, rather than needing to manually avoid calling something that's always in scope (e.g. char::is_whitespace). This is especially weird with the format machinery (which is baked into libcore functions like panic_bounds_check); I see a comment of yours from last year saying it is possible to avoid it being linked in, but it's definitely nonintuitive what exactly the constraints are.

But sure, a minimal core could also be achieved by putting everything else behind Cargo feature flags, rather than actually creating a separate crate.

By the way, another use case for a minimal core is if you just want to write your own standard library with a different design, and want to be able to replace things like impls on primitives (impl char, impl<T> [T], impl<T> *const T), or the precise list of methods on Iterator (even if some basic functionality has to be fixed for the compiler to codegen for loops). Those things are lang items, so letting third-party crates implement them on stable would require additional work. However, a minimal core is a prerequisite.

> With that in mind, what does “move to crates.io” even mean? Something can not move if it also stays at its current location. Are you saying you want two copies of the same APIs? How is that useful? Can they diverge, in particular if the crates.io one wants to make breaking changes in a 2.0 version?

As I imagine it, the source of truth would be on crates.io, but the Rust distribution would also include a copy (which would simply be a copy of some version of the crates.io crate), mainly for backwards compatibility purposes. (Edit: Cargo would probably default to pulling the latest version from crates.io when you run cargo update, though we'd want a better concept of 'minimum compiler version' to ensure it only picks versions that are compatible with the current compiler.) A 2.0 could be done in theory but is probably inadvisable. Benefits of having a separate, stable-source crate include:

  • Ability to patch the source on stable (this RFC already proposes patch functionality but only for nightly).
  • Ability for conservatively-minded users to update to a newer compiler while holding back on changes to std, e.g. the recent wholesale replacement of HashMap with hashbrown or the upcoming replacement of synchronization primitives with parking_lot.
  • On the flipside: ability for experimentally-minded users to test such changes in advance of a full release, without having to also change their compiler configuration (which makes comparison more difficult).


```toml
[dependencies.core]
default-features = false
features = [...]
```

It is not necessary to explicitly mention the dependency of `core`, unless changes to features are necessary.

Cargo would use the source of `core` located in the user's `SYSROOT` directory. This source code would be obtained in the same way as necessary today, through the use of `rustup component add rust-src`. If this component is missing, Cargo would exit with an error code, and would prompt the user to execute the command specified above.
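
The lookup and error path described above might look roughly like the sketch below. It assumes the `rust-src` component is unpacked under `<sysroot>/lib/rustlib/src/rust` and that the sysroot path has already been obtained (for example from `rustc --print sysroot`). `locate_rust_src` is a hypothetical helper, not an existing Cargo function.

```rust
use std::path::{Path, PathBuf};

/// Return the directory holding the standard library sources, or an error
/// message telling the user how to install them.
pub fn locate_rust_src(sysroot: &Path) -> Result<PathBuf, String> {
    // Assumed layout of the `rust-src` component inside the sysroot.
    let src = sysroot.join("lib").join("rustlib").join("src").join("rust");
    if src.is_dir() {
        Ok(src)
    } else {
        Err(format!(
            "standard library sources not found at {}; \
             run `rustup component add rust-src`",
            src.display()
        ))
    }
}
```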

### Technical Implications

#### Stabilization of a Target Specification Format

The custom target specifications (currently JSON) would become part of the stable interface of Cargo. The format used by this file must therefore be stabilized, and further changes must be made in a backwards compatible way to guarantee stability.
Contributor

Please clarify this “must”. Is this RFC proposing to stabilize the JSON file syntax (and set of keys) as-is? Or is it saying that another RFC will need to propose stabilization, possibly with changes? Based on other comments, I think it should be the latter as the set of keys needs work.

Contributor

Nobody contradicted me when I said this can be a black box that Cargo compares byte-by-byte as a stop-gap. I think trading some cache hits for future compat is a good move here.


#### Building of `compiler-builtins`

Currently, `compiler-builtins` contains components implemented in the C programming language. While these dependencies have been highly optimized, using them would require the builder of the root crate to also have a working C compilation environment.

This RFC proposes instead to use the [pure Rust implementation] when compiling for a custom target, removing the need for a C compiler.
Contributor

Again, what does “custom” mean exactly?

Ideally, I’d like to get to a point where shipping a pre-compiled standard library is only a matter of speeding up the initial compilation of most projects. We could in theory not ship any without changing the behavior of compiled programs.

For builtins, I think this means there should be a Cargo feature flag to pick the C or Rust ones. Possibly an unstable flag, at first.


While this may have code size or performance implications, this would allow for maximum portability.

[pure rust implementation]: https://github.com/rust-lang-nursery/compiler-builtins

#### `RUSTC_BOOTSTRAP`

It is necessary to use unstable features to build `core`. To allow users of a stable compiler to build `core`, we would set the `RUSTC_BOOTSTRAP` environment variable **ONLY** for the compilation of `core`.

This should be considered sound, as stable users may not change the source used to build `core`, or the features used to build `core`.
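
A minimal sketch of the scoping described above, assuming Cargo launches one `rustc` process per crate: the environment variable is attached only to the invocation that compiles `core`, never to the user's own crates. `rustc_invocation` is a hypothetical helper, and a real invocation would carry many more arguments.

```rust
use std::process::Command;

/// Build (but do not run) the `rustc` command for one crate.
pub fn rustc_invocation(crate_name: &str) -> Command {
    let mut cmd = Command::new("rustc");
    cmd.arg("--crate-name").arg(crate_name);
    // Only `core` may use unstable features on a stable toolchain.
    if crate_name == "core" {
        cmd.env("RUSTC_BOOTSTRAP", "1");
    }
    cmd
}
```

Because the variable is set per process rather than in Cargo's own environment, it does not leak into the compilation of the root crate or its other dependencies.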

Contributor

Why wouldn't this also be sound for crates on crates.io (or for alloc and libstd) ?

Contributor

Also maybe mention that even when compiled with RUSTC_BOOTSTRAP unstable features from core are not available to stable users.

## 2 - Introduce the concept of "stable features" for `core`
Contributor

I think “stable feature” should not be a new stand-alone concept. We already have a concept of feature flags in Cargo, and a stability mechanism (with explicit opt-in, only on Nightly) for language features and standard library items, as well as another stability mechanism for Cargo functionality.

What’s new is having a stability mechanism for feature flags.

Even though it’s not for public consumption, this RFC should mention what this mechanism looks like. I imagine it could be:

[features]
foo = []  # This becomes short for:
foo = { dependencies = [] }  # `unstable = false` is the default
bar = { dependencies = [], unstable = true }

Specifying the unstable key is itself a permanently-unstable Cargo functionality that standard library crates can opt into with cargo-features = ["feature-stability"].

Member

> Specifying the unstable key is itself a permanently-unstable Cargo functionality that standard library crates can opt into with cargo-features = ["feature-stability"].

It would be great to have this eventually stabilized for use in other crates that have optional support for nightly features (but this is pretty disconnected from the point of this RFC so should be RFCed separately).

Contributor

I wrote this as a parallel with the #[unstable] attribute which is currently not intended to ever be used outside of the standard library. But yes, possibly another RFC could propose to reconsider this in the future.


### Use Case

In some cases, it may be desirable to modify `core` in a set of predefined ways. For example, on some targets it may be necessary to have lighter weight machinery for `fmt`.

This step would provide a path for stabilization of compile time `core` features, which would be a subset of all total compile time features of `core`.

### Caveats

Initially, the list of stable compile time features for `core` would be empty, as none of the current features have had an explicit decision to be stable or not.

### User Interaction

Compile time features for `core` may be specified using the same Cargo.toml syntax used for other crates.

The syntax is the same when using `unstable` and `stable` features, however the former may only be used with a nightly compiler, and use of an `unstable` feature with a stable compiler would result in a compile time error.
Contributor

What is the interaction of unstable and default, for feature flags? Options include:

  • A feature flag cannot have both, or
  • Users on the stable channel cannot use default-features = false if any feature flag of the corresponding dependencies has both
  • Users on the stable channel cannot use default-features = false ever, for standard library dependencies

The second option makes it a breaking change to add a default unstable feature flag after the initial release that stabilizes this RFC.

Contributor

As @nikomatsakis pointed out default-features = false is already broken as it makes adding new default features a breaking change. Let me modify his workaround slightly

  • foo = { version = bound, default-features = false } means no default features as of the earliest crate allowed by bound. Features added by then have their natural default.

  • To add a new feature, make it default = true (same as before)

  • To make existing functionality optional, make a new default = false feature.

  • To split an existing feature, make a new default = false feature and make sure the old feature depends on (implies) the new feature.

  • To combine features, make all the code gated by one and make the other one just depend on it.

Somebody should work through the lattice homomorphisms from version to version, but I'm pretty sure this solves the problem and makes all of the above non-breaking changes.
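
The feature-splitting rule in the list above can be expressed with Cargo's existing `[features]` table. This is only a sketch for a hypothetical crate: the feature names `fmt` and `fmt_float` are invented for illustration, and it uses today's `default = [...]` list rather than the per-feature default flag proposed in the comment.

```toml
# Hypothetical manifest fragment (feature names invented for illustration).
[features]
default = ["fmt"]

# The pre-existing feature now implies the feature split out of it, so
# dependents that already enable `fmt` keep the float formatting code.
fmt = ["fmt_float"]

# The newly split-out feature; it is not listed in `default` directly.
fmt_float = []
```

A dependent that only wants the smaller piece could then enable `fmt_float` on its own, while existing users of `fmt` see no change.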


The syntax for these features would look as follows:

```toml
[dependencies.core]
default-features = false
features = [...]
```

It is not necessary to explicitly mention the dependency on `core`, unless changes to features are necessary.
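
As an illustration only, a crate that wants the lighter-weight `fmt` machinery mentioned in the use case might write something like the following. The feature name `lightweight_fmt` is hypothetical; this RFC does not define any `core` features, and the initial list of stable features would be empty.

```toml
# Hypothetical sketch: `lightweight_fmt` is an invented feature name used
# only for illustration; no such `core` feature is defined by this RFC.
[dependencies.core]
default-features = false
features = ["lightweight_fmt"]
```

If such a feature were still `unstable`, this manifest would presumably build only with a nightly compiler.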

### Technical Implications

#### Path to stabilization

The stabilization of a `core` feature flag would require a process similar to the stabilization of a feature in the language:
Contributor

Please replace

> similar to the stabilization of a feature in the language

with:

> identical to the stabilization of an item in the standard library

While the existence of feature flags for standard library crates is for the Language and Cargo teams to decide, once that’s accepted the behavior and stability of individual such flags is part of the API of the standard library, and therefore the responsibility of the Library team.


* Any new feature begins as unstable, requiring a nightly compiler
* When the feature is sufficiently explored, an RFC/PR can be made to `libcore` to promote this feature to stable
* When this has been accepted, the feature of `core` may be used with the stable compiler.

#### Implementation of Stable Features

There would be some mechanism of differentiating between flags used to build core, sorting them into the groups `unstable` and `stable`. This RFC does not prescribe a certain way of implementation.

## 3 - Extend the new behaviors described for `std` (and `alloc`)

### Use Case

Once the design and implications of these changes have been settled for `core`, it will be necessary to extend these abilities to `std`, including components like `liballoc`.

### Caveats

In general, the same restrictions for building `core` will apply to building `std`. These include:

* Users of the stable compiler must use the source used to build the current Rust compiler
* Only compile time features considered `stable` may be used outside of nightly. Initially the list of `stable` features would be empty, and stabilizing these features would require a PR/RFC to `libstd`.
Contributor

We would have to support these stable features forever, so each feature should go through the RFC process. A PR + mini FCP is not an option for this in my opinion.

Member Author

Sorry, that was meant to be RFC + PR, rather than either.


### User Interaction

The building of `std` would respect the current build profile, including optimization settings.
Contributor

Does the compilation of core not include optimization settings? If it does, why is this mentioned specifically in this section?


The syntax for these features would look as follows:

```toml
[dependencies.std]
default-features = false
features = [
"force_alloc_system",
]
```

It is not necessary to explicitly mention the dependency of `std`, unless changes to features are necessary.
Contributor

So effectively, every crate has an implicit dependency on std.

How does this interact with #![no_std]? When a pre-compiled std is not available, how does Cargo decide whether it should build one?

In theory I think it should always, unless all crates in the dependency graph are no_std, but that’s not known until the crates start parsing which is the responsibility of rustc and not Cargo. Should the (un)availability of std be a property of the target?

We could imagine a way to express in Cargo.toml the absence of dependency on std, to replace the #![no_std] crate attribute. However #![no_std] is stable, so this new way could only become mandatory in a future edition.

Contributor

Yes Cargo.toml must include a way to opt-out of back-compat explicit dependencies. (And frankly I would advocate later deprecating explicit dependencies as I want depending just on core to be "first class".)

To make things a little more succinct, in #1133 I proposed that if a crate had any explicit standard library deps, there would be no implicit deps. Then only core in practice would need to explicitly opt-out as everything else presumably depends on std or core.

Contributor

Whatever new mechanism we add, the constraint here is existing stable crates who don’t use it (obviously, since it doesn’t exist yet) but who do use #![no_std]. We need to find a way not to break those, on targets that don’t have a working std at all.

Contributor

@SimonSapin Are targets that don't have a working std stable enough that that matters? I can't think of anything that is not a horrible hack letting Cargo find the #![no_std] in Rust code----I don't think trying to do that is worth it at all.


### Technical Implications

None beyond the technical implications listed for `core`.

## 4 - Allow the user to provide their own custom source versions of `core` and `std`

### Use Case

This will allow users of a nightly compiler to provide a custom version of `core` and `std`, without requiring the recompilation of the compiler itself.

### Caveats

As stability guarantees cannot be made around modified versions of `core` or `std`, a nightly compiler must always be used.

### User Interaction

For this interaction, the existing `patch` syntax of Cargo.toml will be used. For example:

```toml
[patch.sysroot]
core = { path = 'my/local/core' }
std = { git = 'https://github.com/example/std' }
```

> NOTE: The use of `sysroot` as a category may be changed to a less loaded
> category name. This is likely an area for bikeshedding. `sysroot` will be
> used for the remainder of the document for consistency.

### Technical Implications

The `patch.sysroot` term will be introduced for patch when referring to components such as `std` and `core`.

# Drawbacks
[drawbacks]: #drawbacks

This RFC introduces new concepts to the use of Rust and Cargo, and could be confusing for current users of Rust who have not had to consider changes to `core` or `std` previously. However, in the normal case, most users are unlikely to need these settings, while they allow users that DO need to make changes to control important steps of the build process.

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

> Why is this design the best in the space of possible designs?

This approach borrows from existing behaviors used by Cargo to allow configuration of `core` and `std`, as if they were a regular crate dependency.

This design can also be developed and applied incrementally, allowing time to find corner cases not considered by this RFC.

> What other designs have been considered and what is the rationale for not choosing them?

To the knowledge of this RFC's author, there are no other open designs, other than the use of tools that wrap Cargo entirely, such as [xargo].

[xargo]: https://github.com/japaric/xargo

> What is the impact of not doing this?

By not doing this, Rust will continue to be difficult to use for users and platforms "on the edge", such as new platform developers or embedded and WASM users.

# Prior art
[prior-art]: #prior-art

* [RFC1133] - This RFC from 2015 proposed making cargo aware of std. I still need to review in more detail to find the parts and syntax that may solve some open questions.
* [xargo] - This external tool was used to achieve a similar workflow as described above, limited to use with a nightly compiler
* [Cargo Issue 5002] - This issue proposed a syntax for explicit dependency on std
* [Cargo Issue 5003] - This issue discussed how to be backwards compatible with crates that don't explicitly mention std

[RFC1133]: https://github.com/rust-lang/rfcs/pull/1133
[Cargo Issue 5002]: https://github.com/rust-lang/cargo/issues/5002
[Cargo Issue 5003]: https://github.com/rust-lang/cargo/issues/5003

# Unresolved questions
[unresolved-questions]: #unresolved-questions

## How are dependencies (or non-dependency) on `core` and `std` specified?

For example in a `no_core` or `no_std` crate, how would we tell Cargo **not** to build the `core` and/or `std` dependencies?

## Should `std` be rebuilt if `core` is rebuilt?

Is it necessary to rebuild `std` using the customized `core`, even if no changes to `std` are necessary?
Contributor

Yes. Like any dependent crate is rebuilt whenever a dependency changes.

Contributor

Right, it's probably better to be explicit about this being like any other crate, and implicit on what that means. Otherwise we get a "denormalized" spec that separately specifies identical behavior for stdlib and non-stdlib crates, which is subject to confusion and bit-rot.


## Should Cargo obtain or verify the source code for `libcore` or `libstd`?

Right now we depend on `rustup` to obtain the correct source code for these libraries, and we rely on the user not to tamper with the contents. Are these reasonable decisions?
Contributor

It is at least consistent with the RUSTC_BOOTSTRAP environment variable being a simple boolean flag rather than something more tricky like it used to. (I think it was a somewhat-obfuscated release-specific key?) The stability mechanism is a social contract, it does not do much technically to stop someone determined to use unstable features on the stable channel.


## Should the custom built `libcore` and `libstd` reside locally or globally?

e.g., should the build artifacts be placed in `target/`, only usable by this project, or in `.cargo/`, to be possibly reused by multiple projects, if they happen to have the same settings?
Contributor

The conservative default is target/.

If we are sharing local builds of std between projects in ~/.cargo, why not also share local builds of, say, serde? Or any crate from crates.io, or other source? This is an idea worth exploring, but probably better left to another RFC.


## How do we handle `libcore` and `libstd`'s `Cargo.lock` file?

Right now these are built using the global lock file in `rust-lang/rust`. Should this always be true? How should Cargo handle this gracefully?
Contributor

I think this is important and needs to be resolved before this RFC can be stabilized.

Related: currently when std depends on a crate from crates.io like libc, that crate is compiled on CI together with std and shipped with it in the sysroot. Users end up with a sysroot libc which is a separate crate from a libc that they would depend on from crates.io themselves. I believe std does not reexport any struct or enum type from libc but if it did, in a user crate it would be a different incompatible type than the one with the same name from crates.io libc.

Based on the principle that locally-compiled sysroot should behave the same as pre-compiled, I think we want to preserve this separation. Even if Cargo will download them from the same place (and share the download cache), sysroot+crates.io packages should be separate from crates.io packages as far as dependency graph resolution is concerned. (Similar to how a git dependency is separate from a crates.io dependency of the same name.)

Contributor

Haven't private dependencies landed? I think that is the best way to express this. Unlike a non-crates.io-source trick, it also catches std leaking those types in the first place.

Contributor

I hope that someday:

  1. Cargo gets a global cache

  2. Sysroot goes away

  3. stdlib crates can safely have public dependencies

But that is a problem for another day.


## Should profile changes always prompt a rebuild of `core`/`std`?

For example, if a user sets their debug build to use `opt-level = 'z'`, should this rebuild `core`/`std` to use that opt level? Or should an additional flag, such as `apply-to-sysroot` be required to opt-in to this behavior, unless otherwise needed?
Contributor

I'm a little confused on this point. Would it rebuild core/std even if they are not listed in [dependencies] or [patch]? I fear that always rebuilding when using non-default profile settings would be too much disruption. It would also be a little confusing, because there are multiple profiles, none of which match the settings used in the precompiled libraries. Perhaps core/std should only be rebuilt when they are explicitly listed in Cargo.toml?

Another potential issue is that the features to enable for libstd currently are driven by bootstrap's config.toml, so it may not be obvious to the user that they need to enable things like "backtrace" to have feature parity with the defaults (which change per platform). How do you switch to a custom-built std that retains feature parity with the default distribution per platform?

Member Author

That is exactly the open question here. On one hand: if the user states they want "opt-level = 's'", and we can now give them that for libstd too, it would make sense to keep the total program size down.

On the other hand, this could surprise some users, as it could drastically increase a clean build time by rebuilding core and standard for them.

Good call regarding config.toml, I was not aware of how this mechanism worked. In xargo, you can set some (or all?) of these flags using a feature = [ ... ] syntax. I would guess we would need to expose these features as "Cargo Features" rather than config.toml features so the users may configure them, with default-features matching what the CI builds of libstd and libcore currently are.

Contributor @ehuss, Mar 18, 2019

I think they are "Cargo Features", config.toml just drives the defaults. This is done here.

One thing I'm aware of that is not driven via features is sanitizer support. That is done here. It looks like sanitizers are used on the linux builds, but I don't really know anything about them. Looks like it requires llvm?

Maybe the defaults could be captured somewhere when the distribution is made, and Cargo could read those (maybe in the "src" component? maybe rewrite the Cargo.toml?)? EDIT: 🤔 Except that won't work for non-host targets.

Contributor

It's a general question whether any dependency needs to be rebuilt for these non-abi-affecting switches, so as usual I hope for a Cargo solution that doesn't special-case core and std.


This could increase compile times for users that have set profile overrides, but have not previously needed a custom `core` or `std`.

Another option in this area is to force the use of profile overrides, as specified by [RFC2282](https://github.com/rust-lang/rfcs/blob/master/text/2282-profile-dependencies.md).
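
A sketch of how the two options might look in a manifest follows. The `[profile.dev.package.<name>]` form is the per-package override syntax that grew out of the profile-overrides RFC; whether `core` and `std` could be named there, and whether the top-level `opt-level` would reach them without it, are assumptions that this question leaves open.

```toml
[profile.dev]
# Applies to the user's own crate and its regular dependencies today. Under
# the first option it would also trigger a rebuild of `core`/`std`.
opt-level = "z"

# Under the second option, sysroot crates would be rebuilt only when named
# explicitly. Naming `core` or `std` here is hypothetical.
[profile.dev.package.core]
opt-level = "z"

[profile.dev.package.std]
opt-level = "z"
```

The explicit form costs a few extra lines but avoids surprising users with a longer clean build.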

## Should providing a custom `core` or `std` require a nightly compiler?
Contributor @Centril, Mar 17, 2019

Yes, I think this should be considered a requirement. We cannot reasonably have stability and at the same time allow people to do this with a stable compiler.

@phil-opp, Mar 17, 2019

But isn't this already possible on stable? Just specify a core/std dependency in your Cargo.toml to override the default. Rust even automatically imports the prelude of the custom std.

We use this approach in our stm32f7-discovery crate to augment the core library with the required Future implementations for using async/await on no_std.

Contributor

Ugh... @alexcrichton ^-- Was this intended and who intended it?

(the Future link is dead)

@phil-opp, Mar 17, 2019

Ah sorry, fixed it. Basically we provide our own implementation for the parts of std::future that currently use thread local storage (including the await macro).

Member @Nemo157, Mar 17, 2019

Also, part of the discussion on the pre-RFC was that overriding std/core on a stable compiler should be fine because actually implementing std/core will require using unstable features that will require a nightly compiler.

EDIT: and if in the future it’s possible to provide an implementation of std/core without using any unstable features, why should the act of switching your dependencies to it require a nightly compiler?

Contributor

Otherwise if your crates can finagle some way to compile on stable but still override libstd, then more power to them! (aka you link to the real std and maybe wrap some of its functionality).

Should cargo provide a way to use the non-patched dependency from the patched one?

Contributor

> But isn't this already possible on stable?

@phil-opp Unlike [patch.sysroot], having a crates.io dependency that happens to be named core or std only affects your own crate. And when doing so on stable, you’re very limited in what this dependency can do. In particular it cannot (re)define lang items.

> Forcing unstable compiler just to be able to pick an std that works on embedded

@aep Do you mean that works with async/await without thread-local storage? IMO this is a failure of async/await, not of the core v.s. std structure.

@aep, Jun 7, 2019

@SimonSapin yes. async is in core now, so we need to patch core. Hence the need for a way to have alt stable std/core in cargo

Contributor

@aep Sorry, I haven’t followed all the recent development around async. Why do you need to patch core? Is there a path so that you don’t need to in the future?

In addition to being much preferable IMO, very practically for you, changes to async could happen sooner than stabilization of everything needed to define core.

Member

The only parts of core that the async_await feature uses are core::{future::Future, task::{Poll, Context}}, these are such trivial constructs that I don't see how you could change them while still having the async_await feature work? (I guess replacing Context would be possible since async_await only passes the type around, but doing that would definitely be regarded as operating outside standard procedures).

async_await also uses some functions from std currently, that is just an implementation defect (rust-lang/rust#56974) and doesn't really seem like a good motivation for stabilizing the ability to provide your own std (it would be much better to put the energy towards fixing the defect so that you can just use the builtin async transform).


It is currently unknown whether it is possible to provide a custom version of `core` or `std` without unstable features, as there are some compiler intrinsics and "magic" that are necessary (the format macros and box keyword come to mind).

I initially wrote the RFC in this manner, however I was later convinced this was not possible to do.

I am of the opinion that if you could, then it should be allowed to use a stable compiler, but that might be too theoretical for this RFC.

We could also move forward with the current restriction to nightly, and allow that to be lifted later by a follow-on RFC if this is possible and necessary.

## Should we allow configurable `core` and `std`?

If we are to uphold stability guarantees for all configurations of `core` and `std`, this could require testing 2^(n+m) versions of Rust, where `n` is the number of `core` features, and `m` is the number of `std` features. This would have a negative impact on CI times.
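
To give a sense of scale, assume purely for illustration that `core` had n = 4 features and `std` had m = 6 (both numbers are invented, not taken from the current libraries). The number of distinct configurations would then be:

```latex
2^{n+m} = 2^{4+6} = 2^{10} = 1024
```

Each additional independent feature doubles this count, and the total is multiplied again by the number of targets tested.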
Contributor

Our CI times are already quite stretched; I think bors is the bottleneck overall in our development processes. This would also require 2^(n+m) for reasonably many platforms, not just for one.

If core & std (and probably alloc) are to be configurable, it should, for the foreseeable future be exclusively limited to removing things from the standard library, rather than changing algorithms and whatnot.

Copy link


if rust opens to embedded, we can contribute plenty CI machines.

Contributor

cc @rust-lang/infra ^

Member

We're certainly not going to test all 2^(m+n) combinations on the CI; this is like mindlessly aiming for 100% code coverage. Having one test for all features disabled and one for all features enabled is more than enough.

Contributor

> Having one test for all features disabled and one for all features enabled is more than enough.

In my experience that's rarely the case - features interact with each other in weird ways.

IMO 100% coverage is impossible, but that should not be the goal. The goal should be to get as close to 100% coverage as possible while using a tiny fraction of resources.

I don't think we can achieve that goal by using fixed rules (like testing everything, or testing just A and B). Achieving the goal is going to require investing time into evaluating which features and feature groups make sense to test where.

For example, it might make sense to test more combinations on the x86_64 tier 1 targets only, and it might also make sense to set up a weekly cron job that tests even more combinations, random combinations, etc. But which features should be tested in isolation and which ones can be grouped in the tests with other features is something that we should evaluate and constantly re-evaluate as new features get added on a 1:1 basis.

Member

@gnzlbg Cargo features are supposed to be additive, features interacting within libstd is already a bug IMO.

For sure if it turns out A+C+F has weird behavior compared with no features and A+B+C+D+E+F+G features, we could create a special run-make test to guarantee that special behavior in A+C+F.


# Future possibilities
[future-possibilities]: #future-possibilities

## Unified `core` and `std`

With the mechanisms specified above, it could be possible to remove the concept of `core` and `std` from the user, leaving only `std`.

By using stable feature flags for `std`, we could say that `std` as a crate with `default-features = false` would essentially be `no_core`, or with `features = ["core"]`, we would be the same as `no_std`.

This abstraction may not map to the actual implementation of `libcore` or `libstd`, but instead be an abstraction layer for the end developer.
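
Under that model, the `no_std`-like case described above might be written as follows. This is a sketch only: the `core` feature name comes from the paragraph above, and none of this is implemented.

```toml
# Hypothetical: a crate that today would use `#![no_std]`.
[dependencies.std]
default-features = false
features = ["core"]   # omit this line for the `no_core`-like case
```

The crate would still name a single library, `std`, and select how much of it to build through features.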

Contributor

Note that we would still need to allow using [dependencies.core] forever and somehow map that to a unified version of the library.

Or in other words: removing the concept of core and std (e.g. into a unified std that uses the portability lint) would be a breaking change if this RFC was merged as is.

I don't know how hard it would be to have a unified library that is simultaneously provided as a "split" one (or in an edition-dependent way) for backwards compatibility.

Member

It seems like this could easily be dealt with by making core (and alloc) a facade over std with a limited set of default active features. As long as the current core -> std dependency order is not observable somehow.

## Stop shipping pre-compiled `core` and `std`

With the ability to build these crates on demand, we may want to decide not to ship `target` bundles for any users.

This would come at a cost of increased compile times, at least for the first build, if the artifacts are cached globally. However it would remove a mental snag of having to sometimes run `rustup target add`, and confusion from some users why parts of `std` and `core` have different optimization settings (particularly for debug builds) when debugging.
Contributor

Wouldn’t this RFC as-is already make rustup target add unnecessary? (In at least some cases.) When the pre-compiled standard library is not available for a given target, Cargo falls back to compiling it. We could emit a warning for targets that are known to have a pre-compiled copy available through rustup but not installed locally.