Stabilize the `#[alloc_error_handler]` attribute (for no_std + liballoc) #66740

Open
SimonSapin opened this issue Nov 25, 2019 · 6 comments

@SimonSapin SimonSapin commented Nov 25, 2019

Summary

This issue formally proposes stabilizing the #[alloc_error_handler] attribute as-is, after adding some documentation.

Tracking issue: #51540

Normally the tracking issue is where we propose FCP to stabilize, but this one already has many comments that go into a number of sub-topics. Since the feature did not originally go through the RFC process, this proposal is loosely structured after the RFC template.

Background

Heap memory in libstd

Many parts of the standard library rely on a global heap memory allocator. For example Box::new takes a single parameter, the value to be boxed, and returns a struct that wraps a pointer to newly-allocated heap memory. The allocator is not part of this API (on many platforms it defaults to malloc) and neither is the possibility that allocation fails: the return value always contains a valid pointer.

Allocation can in fact fail (malloc can return a null pointer), but in practice this is uncommon enough and hard enough to recover from that Box::new and many other APIs make the choice of not propagating that error to callers. We call these APIs “infallible” because allocation failure is not a concern of the caller (as opposed to “fallible” APIs like Vec::try_reserve which return a Result). “Infallible” APIs deal with failures by calling the handle_alloc_error(Layout) -> ! function, which never returns. The current behavior in libstd is to print an error message and abort the process. Any low-level code that makes allocations and wants to expose an infallible API is expected to call this function. For example a custom container library could look like:

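use std::alloc::{Layout, alloc, handle_alloc_error};
use std::ptr::NonNull;

// Minimal owning pointer; a real container would also implement Drop and Deref.
pub struct MyBox<T>(NonNull<T>);

impl<T> MyBox<T> {
    pub fn new(x: T) -> Self {
        let layout = Layout::new::<T>();
        assert!(layout.size() > 0); // Not dealing with the zero-size case for example brevity
        let maybe_null = unsafe { alloc(layout) };
        // On failure, divert to the allocation error handler, which never returns.
        let ptr: NonNull<T> = NonNull::new(maybe_null)
            .unwrap_or_else(|| handle_alloc_error(layout))
            .cast();
        // Safety: `ptr` is non-null and was allocated with the layout of `T`.
        unsafe { ptr.as_ptr().write(x) };
        Self(ptr)
    }
}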
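
For contrast, here is a rough sketch of what a fallible counterpart to the constructor above might look like. The AllocFailed error type and the try_new and get names are made up for this illustration; they are not part of std or of this proposal.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ptr::NonNull;

/// Hypothetical error type for this sketch: carries the layout that could not be allocated.
#[derive(Debug, PartialEq)]
pub struct AllocFailed(pub Layout);

pub struct MyBox<T>(NonNull<T>);

impl<T> MyBox<T> {
    /// Fallible constructor: reports allocation failure to the caller
    /// instead of calling `handle_alloc_error`.
    pub fn try_new(x: T) -> Result<Self, AllocFailed> {
        let layout = Layout::new::<T>();
        assert!(layout.size() > 0); // zero-sized types are out of scope here as well
        let ptr = NonNull::new(unsafe { alloc(layout) })
            .ok_or(AllocFailed(layout))?
            .cast::<T>();
        // Safety: `ptr` is non-null and was allocated with the layout of `T`.
        unsafe { ptr.as_ptr().write(x) };
        Ok(Self(ptr))
    }

    pub fn get(&self) -> &T {
        // Safety: the pointer stays valid and initialized for the lifetime of `self`.
        unsafe { self.0.as_ref() }
    }
}

impl<T> Drop for MyBox<T> {
    fn drop(&mut self) {
        // Safety: the value was initialized in `try_new` and allocated with this layout.
        unsafe {
            std::ptr::drop_in_place(self.0.as_ptr());
            dealloc(self.0.as_ptr().cast(), Layout::new::<T>());
        }
    }
}
```

The only difference from the infallible version is who decides what happens on failure: here the caller gets an Err, whereas the infallible version hands control to the allocation error handler.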

no_std and liballoc

The Rust standard library is split into three crates (that are relevant to this issue): core, alloc, and std.

  • std expects much functionality to be provided by the underlying operating system or environment: a filesystem, threads, a network stack, … and relevant here: a memory allocator and a way to abort the current process. Large parts of its code are target-specific. Porting it to a new target can take non-trivial effort.

  • core contains the subset of std that has almost no such requirement. A crate can use the #![no_std] attribute to opt into having its implicit dependency on std replaced by an implicit dependency on core. When all crates in an application do this, this enables porting to a target that might not have std at all. Notably, @rust-embedded does this with micro-controllers that do not have an operating system.

  • alloc is in-between. It depends on core and std depends on it. It contains the subset of std that relies on heap memory allocation, but makes no other external requirements over those of core. Specifically, using alloc requires:

    • A heap memory allocator, that provides an implementation of the alloc function and related functions.
    • An allocation error handler, that provides an implementation of the handle_alloc_error function.

    The std crate provides both of these, so linking it in an application (having any crate in the dependency graph that doesn’t have #![no_std], or has extern crate std;) is sufficient to use alloc. Of course this doesn’t work for targets/environments where std is not available.
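
As a small illustration of the split described above, the following sketch uses only items from core and alloc. The function names are invented for the example. In a real library the crate root would also start with #![no_std]; that attribute is left out here so the snippet builds on its own with std linked, which (as noted above) satisfies both of alloc’s requirements.

```rust
// Assumption: in a real `no_std` library, `#![no_std]` would appear at the crate root.
extern crate alloc;

use alloc::string::String;
use alloc::vec::Vec;
use core::fmt::Write;

/// Collects the first `n` square numbers into a heap-allocated `Vec`.
pub fn squares(n: u32) -> Vec<u32> {
    (1..=n).map(|i| i * i).collect()
}

/// Joins numbers into a comma-separated `String`, using only `core` and `alloc`.
pub fn join_numbers(values: &[u32]) -> String {
    let mut out = String::new();
    for (i, v) in values.iter().enumerate() {
        if i > 0 {
            out.push(',');
        }
        // `String` implements `core::fmt::Write`; writing to it does not fail.
        write!(out, "{}", v).expect("writing to a String cannot fail");
    }
    out
}
```

Both Vec and String here use the “infallible” APIs, so an allocation failure inside them ends up in handle_alloc_error.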

#[panic_handler]

core does have an external requirement: a way to handle panics. std normally provides this by printing a message to stderr, optionally with a stack trace, and unwinding the thread. In a no_std application however there may not be a stderr to print to, and unwinding may not be supported. Such apps can therefore provide a handler:

#[panic_handler]
fn panic(panic_info: &core::panic::PanicInfo) -> ! {
    // …
}

(See also in the Nomicon.)

The attribute is effectively a procedural macro that checks the signature of the function and turns it into an extern "Rust" fn with a known symbol name, so that it can be called without going through Rust’s usual crate/module/path name resolution.

The compiler also checks for “top-level” compilations (executables, cdylibs, etc.) that there is exactly one panic handler in the whole crate dependency graph. std (effectively) provides one, so the attribute is both necessary for no_std applications and can only be used there.
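
The “known symbol name” mechanism can be imitated in ordinary code, which may help make it concrete. This is only an analogy under stated assumptions: the symbol name demo_hook_v1 and both module names are invented, the real symbol used for the panic handler is a compiler implementation detail, and the two modules stand in for two crates with no dependency between them. The unsafe(...) attribute and unsafe extern spellings assume Rust 1.82 or later.

```rust
// "Definition" side: plays the role of the application crate providing the handler.
mod app {
    // Exported under a fixed, unmangled symbol name (hypothetical name for this sketch).
    #[unsafe(no_mangle)]
    pub extern "Rust" fn demo_hook_v1(code: u32) -> u32 {
        code + 1
    }
}

// "Caller" side: plays the role of `core`, which cannot name the application crate.
mod lib_side {
    // Declares the symbol without any path to its definition; the linker resolves it.
    unsafe extern "Rust" {
        fn demo_hook_v1(code: u32) -> u32;
    }

    pub fn call_hook(code: u32) -> u32 {
        // Safety: the definition in `app` has exactly this signature.
        unsafe { demo_hook_v1(code) }
    }
}
```

Nothing in a hand-written version like this checks the signature or enforces that exactly one definition exists; as described above, the compiler does both for the attribute.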

#[global_allocator]

Depending on the workload, an alternative allocator may be more performant than the platform’s default. In earlier versions of Rust, the standard library used jemalloc. In order to leave that choice to users, Rust 1.28 stabilized the GlobalAlloc trait and #[global_allocator] attribute, and changed the standard library’s default to the system’s allocator.

This incidentally enabled (in Nightly) the use of alloc in no_std applications which can now provide an allocator implementation not just to be used instead of std’s default, but where std is not necessarily available at all. However such applications still require Rust Nightly in order to fulfil alloc’s second requirement: the allocation error handler.

#[global_allocator] is similar to #[panic_handler] in two ways: it also expands to extern "Rust" fn definitions that can be called by a crate (this time alloc instead of core) that has no Cargo-level dependency on the crate containing the definition, and the compiler checks for “top-level” compilations that it isn’t used twice. (It differs in that it can be used when std is linked, in which case it overrides std’s default.)
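
To give an idea of what such an allocator implementation can look like without std, here is a minimal sketch of a bump allocator over a fixed arena. BumpAlloc, ARENA_SIZE and ARENA_ALIGN are names chosen for this example, the arena size is arbitrary, and memory is never reused; it is meant to show the shape of a GlobalAlloc impl, not to be a practical allocator.

```rust
use core::alloc::{GlobalAlloc, Layout};
use core::cell::UnsafeCell;
use core::ptr;
use core::sync::atomic::{AtomicUsize, Ordering};

pub const ARENA_SIZE: usize = 4096; // arbitrary size for the sketch
const ARENA_ALIGN: usize = 16; // must match the `repr(align)` below

#[repr(align(16))]
struct Arena(UnsafeCell<[u8; ARENA_SIZE]>);

/// Hypothetical bump allocator: hands out consecutive chunks and never frees them.
pub struct BumpAlloc {
    arena: Arena,
    next: AtomicUsize, // offset of the first unused byte in the arena
}

// Safety: offsets are claimed atomically, so callers never receive overlapping ranges.
unsafe impl Sync for BumpAlloc {}

impl BumpAlloc {
    pub const fn new() -> Self {
        BumpAlloc {
            arena: Arena(UnsafeCell::new([0; ARENA_SIZE])),
            next: AtomicUsize::new(0),
        }
    }
}

unsafe impl GlobalAlloc for BumpAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // The arena itself is only 16-byte aligned, so stricter requests are refused.
        if layout.align() > ARENA_ALIGN {
            return ptr::null_mut();
        }
        let mut start = 0;
        let claimed = self.next.fetch_update(Ordering::Relaxed, Ordering::Relaxed, |cur| {
            // Round the current offset up to the requested alignment (a power of two).
            start = (cur + layout.align() - 1) & !(layout.align() - 1);
            let end = start.checked_add(layout.size())?;
            if end <= ARENA_SIZE { Some(end) } else { None }
        });
        match claimed {
            // Safety: `start + layout.size()` was checked to lie within the arena.
            Ok(_) => unsafe { (self.arena.0.get() as *mut u8).add(start) },
            // Returning null signals failure; infallible APIs then call `handle_alloc_error`.
            Err(_) => ptr::null_mut(),
        }
    }

    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
        // A bump allocator does not reclaim individual allocations.
    }
}

// In a `no_std` application this would be registered with:
// #[global_allocator]
// static ALLOCATOR: BumpAlloc = BumpAlloc::new();
```

The null return on exhaustion is where the second requirement comes in: something has to define what handle_alloc_error does when alloc’s infallible APIs see that null pointer.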

Motivation

As of Rust 1.36, specifying an allocation error handler is the only requirement for using the alloc crate in no_std environments (i.e. without the std crate being also linked in the program) that cannot be fulfilled by users on the Stable release channel.

Stabilizing #[alloc_error_handler] as the way to fulfil this requirement would allow:

  • no_std + liballoc applications to start running on the Stable channel
  • no_std applications that run on Stable to start using liballoc

Guide-level explanation

Many of the APIs in the alloc crate that allocate memory are said to be “infallible”. Allocation appears to always succeed as far as their signatures are concerned. When allocation does fail, they call alloc::alloc::handle_alloc_error which never returns. For example, Vec::reserve is said to be infallible while Vec::try_reserve is fallible (and returns a Result). Other libraries that want to expose this infallible style of API may also call handle_alloc_error.
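
The difference between the two styles shows up in the signatures. A short sketch, with helper names invented for the example:

```rust
use std::collections::TryReserveError;

/// Fallible style: the caller sees allocation failure as an `Err`.
pub fn append_fallible(buf: &mut Vec<u8>, extra: &[u8]) -> Result<(), TryReserveError> {
    buf.try_reserve(extra.len())?;
    // Capacity was reserved above, so this does not need to allocate.
    buf.extend_from_slice(extra);
    Ok(())
}

/// Infallible style: nothing in the signature mentions allocation failure.
/// If `reserve` cannot allocate, it ends up in `handle_alloc_error`.
pub fn append_infallible(buf: &mut Vec<u8>, extra: &[u8]) {
    buf.reserve(extra.len());
    buf.extend_from_slice(extra);
}
```

An impossible request such as Vec::<u8>::new().try_reserve(usize::MAX) returns an Err rather than diverting to the handler.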

We call an application no_std if it doesn’t link the std crate. That is, if all crates in its dependency graph have the #![no_std] attribute and (after cfg-expansion) do not contain extern crate std;.

A no_std application may use the standard library’s alloc crate if and only if it specifies both a global allocator with the #[global_allocator] attribute, and an allocation error handler with the #[alloc_error_handler] attribute. Each may only be defined once in the crate dependency graph. They can be defined anywhere, not necessarily in the top-level crate. The handler defines what to do when handle_alloc_error is called. It must be a function with the following signature:

#[alloc_error_handler]
fn my_example_handler(layout: core::alloc::Layout) -> ! {
    panic!("memory allocation of {} bytes failed", layout.size())
}

The handler is given the Layout of the allocation that failed, for diagnostic purposes. Because it is called in cases that are considered unrecoverable, it must not return. std achieves this by aborting the process. In a no_std environment (which might not have processes in the first place), panicking calls the #[panic_handler], which is also required not to return.

Reference-level explanation

#[alloc_error_handler] is very similar to #[panic_handler]: it locally checks that it is used on a function with the appropriate signature and turns it into an extern "Rust" fn with a known symbol name, so that alloc::alloc::handle_alloc_error can call it.

Like with the panic handler, the compiler also checks for “top-level” compilations (executables, cdylibs, etc.) that there is exactly one allocation error handler in the whole crate dependency graph. std literally provides one, so the attribute is both necessary for no_std applications and can only be used there.

The above is already implemented, although not well documented. This issue is about deciding to stabilize the attribute. If we find consensus on this direction, documentation should come before or with a stabilization PR. The alloc crate’s doc-comment could be a good place for this documentation, which could be based on the guide-level explanation above.

#[panic_handler] is already stable, so the Rust project is already committed to maintaining this style of attribute.

Alternatives

@SimonSapin SimonSapin commented Nov 25, 2019

Proposing FCP to stabilize, as described above:

@rfcbot fcp merge

@rfcbot rfcbot commented Nov 25, 2019

Team member @SimonSapin has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@SimonSapin SimonSapin commented Nov 25, 2019

Instead of stabilizing a way to fulfil the requirement to define a handler, another way to unlock the no_std + liballoc on Stable use case could be to remove that requirement: when no handler is defined, the compiler could inject a default handler that panics.

If that default handler is accepted, should we not stabilize this attribute and wait for a more general "global things" mechanism?

@rfcbot fcp what if default

@SimonSapin SimonSapin commented Nov 25, 2019

@rfcbot concern what if default

@withoutboats withoutboats commented Nov 25, 2019

Just throwing this out there (and maybe it's already been proposed and rejected): if we wanted to go with the default handler approach, we could make it possible through the PanicInfo API to distinguish alloc failures from other panics, making it straightforward to handle this specially in your panic handler, unlike other panics. Two ways we could do this:

  • Rather than picking a default message, make the Layout the &dyn Any of the payload, or a newtype wrapping Layout. A successful downcast of the payload to this type would then indicate an alloc error. Possibly this interacts poorly with existing users' panic handlers.
  • Just directly provide a new field and method on PanicInfo which contains the layout in the case of alloc failures.

Probably this would mean there's little point in having alloc_error_handler.
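
To make the first option concrete, here is a sketch of the downcasting idea using std's existing payload machinery (panic_any and catch_unwind). The AllocFailure newtype and both function names are hypothetical, and this only imitates the idea in a std program: a no_std panic handler would go through PanicInfo instead, where no such payload exists today.

```rust
use std::alloc::Layout;
use std::any::Any;
use std::panic;

/// Hypothetical newtype payload wrapping the `Layout` of the failed allocation.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct AllocFailure(pub Layout);

/// Stand-in for a default allocation error handler that panics with a typed payload.
pub fn default_handler(layout: Layout) -> ! {
    panic::panic_any(AllocFailure(layout))
}

/// What a panic handler could do: a successful downcast identifies an alloc failure.
pub fn classify(payload: &(dyn Any + Send)) -> Option<Layout> {
    payload.downcast_ref::<AllocFailure>().map(|failure| failure.0)
}
```

Catching a panic raised by default_handler and passing the payload to classify yields Some(layout), while the payload of an ordinary panic!("…") yields None.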

I don't have any opinion about which of these is best. In my opinion we should just make core + alloc work on stable as soon as reasonable.

@SimonSapin SimonSapin commented Nov 25, 2019

This is an interesting idea! (Though perhaps one for #66741. The perils of starting two related threads at the same time…)

One thing is that there’s currently no stable way in no_std to end up with PanicInfo::payload returning Some, so making the default allocation error handler do this would be slightly magic. Single-argument core::panic! requires &str and creates an fmt::Arguments, unlike single-argument std::panic! which takes a generic T: Any + Send and creates a payload.

We could extend single-argument core::panic! to also be generic and make a dyn Any payload. In order to not regress also setting a message for &str we’d need some kind of specialization, although it’s the kind that can be hacked together on Stable with auto-ref. Or maybe we can regress this, since PanicInfo::message is still unstable. (CC #66745)
