RFC: Allow changing the default allocator #1183
sfackler
reviewed
Jun 30, 2015
|
|
||
| ```rust | ||
| extern { | ||
| fn __rust_allocate(size: usize, align: usize) -> *mut u8; |
sfackler
Jun 30, 2015
Member
Why are we using magic symbol names instead of annotation-tagged functions a la #[lang_item="foo"] or #[plugin_registrar]?
alexcrichton
Jun 30, 2015
Author
Member
Implementation-wise, this is what everything will boil down to (pre-defined symbols), and this is currently the path of least resistance forward. This is all unstable, however, so we'll definitely be able to change it in the future to perhaps using lang items or more official attributes. The current downsides of attributes are:
- During a compilation, there may actually be two loaded allocators in the crate store (but we won't link one of them), so the compiler would detect duplicate lang items and yield an error. Extra logic would have to be added to "not worry about" the allocator lang items.
- None of the signatures are currently typechecked, and having an official attribute makes it feel like it should be typechecked.
Basically I'd love to move to using attributes and such, but I don't see much immediate benefit over just defining some symbols in the short-term. I also don't mind adding some words to this effect in the RFC, though, and we could perhaps spec the "ideal implementation" here where the actual implementation just has some TODOs.
My ideal situation would be to have an attribute-per-function which defines the symbol, visibility, and typechecks the signature. We'd then also have a check that an #![allocator] crate contains the necessary functions (tagged with attributes). That's a good deal of attribute-surface-area to start stabilizing right off the bat though.
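To make "pre-defined symbols" concrete, here is a minimal sketch of functions with the shape shown in the RFC snippet under review. The names `sketch_allocate` and `sketch_deallocate` are hypothetical stand-ins for the `__rust_allocate` family, the deallocation signature is an assumption (only the allocate signature appears in the quoted diff), and forwarding to `std::alloc::System` is just one convenient backend. A real allocator crate would additionally export these under the fixed symbol names the compiler expects.

```rust
use std::alloc::{GlobalAlloc, Layout, System};

/// Hypothetical stand-in for `__rust_allocate(size, align) -> *mut u8`.
/// Returns null for a zero size or an invalid alignment.
pub extern "C" fn sketch_allocate(size: usize, align: usize) -> *mut u8 {
    match Layout::from_size_align(size, align) {
        // `System.alloc` requires a non-zero size, so guard against it here.
        Ok(layout) if layout.size() > 0 => unsafe { System.alloc(layout) },
        _ => std::ptr::null_mut(),
    }
}

/// Hypothetical counterpart for freeing memory; the signature is an assumption.
///
/// Safety: `ptr` must come from `sketch_allocate(old_size, align)` and must
/// not have been freed already.
pub unsafe extern "C" fn sketch_deallocate(ptr: *mut u8, old_size: usize, align: usize) {
    if ptr.is_null() {
        return;
    }
    if let Ok(layout) = Layout::from_size_align(old_size, align) {
        unsafe { System.dealloc(ptr, layout) }
    }
}
```

Note that nothing here checks the signatures against what the compiler expects, which is the typechecking gap mentioned above.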
nagisa
reviewed
Jun 30, 2015
|
|
||
| ### High level design | ||
|
|
||
| The design of this RFC from 10,000 feet (referred to below), which was |
nagisa
Jun 30, 2015
Contributor
nit: use SI, imperial is deprecated everywhere except a certain country
cuviper
Jul 7, 2015
Member
Aviation still uses feet for altitude in most of the world, so let the idiom be!
nagisa
reviewed
Jun 30, 2015
| # Alternatives | ||
|
|
||
| The compiler's knowledge about allocators could be simplified quite a bit to the | ||
| point where a compiler flag is used to just turn injection on/off, and then it's |
nagisa
Jun 30, 2015
Contributor
I think this would be a way to go, similar to how we have #![no_std] and #![no_main]. #![no_allocator] would blend in quite well, after which you’d just have to define your own allocation language items. A custom allocator then could be chosen by using something similar to
#[allocator]
extern crate my_awesome_allocator;
which is also pretty similar to how you’d use a custom standard library.
alexcrichton
Jun 30, 2015
Author
Member
The difference between no_main and no_std, however, is that choosing an allocator is a global decision, not a local one. There's a number of bad use cases you can get into when dealing with the compiler otherwise, for example if you're linking to a Rust dynamic library, then it had to have an allocator defined when it was linked, so you have no choice but to use that, yet you can still happily link to your own.
I put this alternative here mostly as a gut feeling rather than having anything concrete in mind. On the surface, though, this RFC proposes basically 0 overhead on consumer crates of allocators (they do nothing or otherwise just have one extern crate statement). Some extra error messages may pop up here and there, but very little is actually changing about how an allocator is used.
Do you have some specific aspects of this RFC you feel are too ambitious?
nagisa
Jul 1, 2015
Contributor
> The difference between no_main and no_std, however, is that choosing an allocator is a global decision, not a local one.
The compiler, on the other hand, has the full power to propagate the top-most choice down the dependency chain (except, of course, when staticlibs or dylibs are encountered; I actually maintain a viewpoint that neither of these should have an allocator built in, and the producer of the final executable should link all the appropriate allocator libraries instead), no?
> Do you have some specific aspects of this RFC you feel are too ambitious?
Rather than too ambitious, the alternative looks at first sight like a more elegant solution to me, that’s it.
alexcrichton
Jul 1, 2015
Author
Member
I did not have much concrete in mind when I wrote this alternative, and I'm not quite sure what you're thinking of, so could you go into more detail about how you would envision "dumbing down" the compiler's knowledge of allocators?
> I actually maintain a viewpoint that neither of these should have allocator built-in and producer of the final executable should link all the appropriate allocator libraries instead
Unfortunately a dynamic library will not link on Windows unless all symbols are resolved (unlike Linux, where you can have unresolved symbols in a dynamic library).
alexcrichton
added
T-libs
T-lang
labels
Jul 1, 2015
Are any stable interfaces proposed here? Or are we just changing the way the allocator is automatically picked as far as stable Rust is concerned? I find it hard to tell. I like the general goal, but as I said in the other Core, alloc, and log all have a need to use functionality defined elsewhere, and traits won't cut it, so it would be nice to really think through a language-level way to solve this problem once and for all (something like ML functors on the crate level, probably). If nothing is being stabilized here, great! This is definitely a better situation than what we have currently. If interfaces are being stabilized, then I would rather wait for a general solution for all three crates.
Tobba
commented
Jul 1, 2015
|
I'm pretty sure what everyone has wanted in this area for a very long time is trait-based allocator selection a la RFC #39 (which we sadly never got due to some GC-related concerns, and the GC-aware version was such an abomination everyone pretends it never happened). This would allow you to adjust the allocator for not just an entire crate, but for individual objects and in a much cleaner fashion. |
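For readers unfamiliar with the idea, here is a minimal sketch of what per-object, trait-based allocator selection could look like. Everything in it (`ObjectAllocator`, `SystemBacked`, `Counted`, `BoxIn`) is a hypothetical name invented for illustration; it is not the design of RFC #39 or of this RFC, only an example of choosing an allocator per value rather than per program.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::cell::Cell;
use std::ptr::NonNull;

/// Hypothetical allocator trait; callers must pass a non-zero-sized layout.
pub trait ObjectAllocator {
    unsafe fn allocate(&self, layout: Layout) -> *mut u8;
    unsafe fn deallocate(&self, ptr: *mut u8, layout: Layout);
}

/// Forwards to the system allocator.
pub struct SystemBacked;

impl ObjectAllocator for SystemBacked {
    unsafe fn allocate(&self, layout: Layout) -> *mut u8 {
        unsafe { System.alloc(layout) }
    }
    unsafe fn deallocate(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

/// Wraps another allocator and counts live allocations.
pub struct Counted<A> {
    inner: A,
    live: Cell<usize>,
}

impl<A> Counted<A> {
    pub fn new(inner: A) -> Self {
        Counted { inner, live: Cell::new(0) }
    }
    pub fn live(&self) -> usize {
        self.live.get()
    }
}

impl<A: ObjectAllocator> ObjectAllocator for Counted<A> {
    unsafe fn allocate(&self, layout: Layout) -> *mut u8 {
        let p = unsafe { self.inner.allocate(layout) };
        if !p.is_null() {
            self.live.set(self.live.get() + 1);
        }
        p
    }
    unsafe fn deallocate(&self, ptr: *mut u8, layout: Layout) {
        self.live.set(self.live.get() - 1);
        unsafe { self.inner.deallocate(ptr, layout) }
    }
}

/// Owning pointer whose allocator is chosen per object.
pub struct BoxIn<'a, T, A: ObjectAllocator> {
    ptr: NonNull<T>,
    alloc: &'a A,
}

impl<'a, T, A: ObjectAllocator> BoxIn<'a, T, A> {
    /// Returns `None` if the allocator reports failure.
    pub fn new(value: T, alloc: &'a A) -> Option<Self> {
        let layout = Layout::new::<T>();
        let ptr = if layout.size() == 0 {
            // Zero-sized types need no backing memory.
            NonNull::dangling()
        } else {
            NonNull::new(unsafe { alloc.allocate(layout) } as *mut T)?
        };
        unsafe { ptr.as_ptr().write(value) };
        Some(BoxIn { ptr, alloc })
    }

    pub fn get(&self) -> &T {
        unsafe { self.ptr.as_ref() }
    }
}

impl<'a, T, A: ObjectAllocator> Drop for BoxIn<'a, T, A> {
    fn drop(&mut self) {
        let layout = Layout::new::<T>();
        unsafe {
            std::ptr::drop_in_place(self.ptr.as_ptr());
            if layout.size() != 0 {
                self.alloc.deallocate(self.ptr.as_ptr() as *mut u8, layout);
            }
        }
    }
}
```

With this shape, `BoxIn::new(42u64, &counted)` and `BoxIn::new(42u64, &SystemBacked)` can coexist in one program, which is the per-object flexibility a global switch does not provide.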
nagisa
reviewed
Jul 1, 2015
| allocation functions used by Rust, defined as: | ||
|
|
||
| ```rust | ||
| extern { |
nagisa
Jul 1, 2015
Contributor
Must it be C ABI?
I’d rather have something #[lang]-ish here as well.
alexcrichton
Jul 1, 2015
Author
Member
The C ABI is not required, but leaves the door open to allowing external implementations of an allocator in the future (e.g. implementing one in C instead of Rust).
I discussed #[lang] above which may be of interest as well.
I'm more sympathetic to not stabilizing an allocators interface until we have GC, but it seems pretty harmless to implement something like #39 without stabilizing it, and just use it behind |
Currently, no
I see the concept of collection-specific allocators as orthogonal to this RFC, and implementation-wise there basically must be some global symbols which represent the "allocator interface". This RFC is just connecting the dots to allow programs to switch the global allocator, not have a full-blown allocation API (hence the instability of all items proposed here) |
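As a point of comparison only: the mechanism below is the `#[global_allocator]` attribute and `GlobalAlloc` trait that later became stable in Rust, not the `#![allocator]` crate mechanism this RFC proposes. It is a sketch of what "switching the global allocator" looks like from a program's point of view; `CountingSystem` and `ALLOCATIONS` are hypothetical names.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of allocation calls seen so far (illustrative only).
pub static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical global allocator: counts calls, then defers to the system allocator.
pub struct CountingSystem;

unsafe impl GlobalAlloc for CountingSystem {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// One global choice for the whole program: every Box, Vec, String, and so on
// now goes through `CountingSystem`.
#[global_allocator]
static GLOBAL: CountingSystem = CountingSystem;
```

The choice is made once for the final artifact, which matches the "global decision, not a local one" framing used earlier in this thread.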
nnethercote
commented
Jul 2, 2015
|
From my point of view this all looks quite plausible. Thank you, @alexcrichton. |
alexcrichton
self-assigned this
Jul 2, 2015
cmr
reviewed
Jul 6, 2015
|
|
||
| * `alloc_system` is a crate that will be tagged with `#![allocator]` and will | ||
| redirect allocation requests to the system allocator. | ||
| * `alloc_jemalloc` is another allocator crate that will bundle a static copy of |
cmr
Jul 6, 2015
Member
#![allocator] instead of allocator would be less confusing (I wasn't sure if it was implied that it would not have the tag)
cmr
reviewed
Jul 6, 2015
|
|
||
| ### Default allocator specifications | ||
|
|
||
| Target specifications will be extended with two keys: `lib_allocation_crate` |
It might be worth discussing how this RFC solves or improves on the situation described in Reenix: Implementing a Unix-Like Operating System in Rust 3.3 Critical Problem: Allocation. |
@gnzlbg this is somewhat orthogonal in the sense that it's not stabilizing an allocator API, nor is it altering the semantics of what to do on a failed allocation. It would only help in terms of switching out which allocator is used by default. |
It is possible if an allocator trait is created to only introduce the system allocator (as per this RFC) in std. That would force libcollections to be allocator agnostic :D. |
@alexcrichton would it be possible to modify this API to return |
I'm writing Rust libraries that I expect to be linked statically with both C and Rust programs. Would there be a way to say "Use malloc if linked with C, and whatever Rust program wants when linked with Rust"? i.e. my library doesn't care about which allocator is used, but doesn't want to impose any allocator on the client. |
@pornel |
@gnzlbg Sure it could possibly use one of those types eventually, but this RFC isn't stabilizing the signatures of these functions currently, just adding infrastructure to swap them out. @pornel You could manually link to |
@alexcrichton Great! |
I think that is an unfair characterization on multiple levels. In the second RFC you are referencing (#244), the handling of GC issues certainly had problems, but feeding more type-metadata into a high-level allocator is not an inherently bad idea, IMO. Anyway, trait-based allocator selection is a distinct issue that we are planning to address independently of this RFC. Having a high-level / low-level split in the trait definitions may or may not be necessary, but I suspect it will be the only way to actually placate all of the parties involved. |
erickt
commented
Jul 8, 2015
|
@Tobba: I suspect you were making a joke, but please keep from describing other people's work in that way. |
I guess another way to put it is: this unresolved question suggests that there is one rather obvious case we didn't analyze as thoroughly as the others. We know that calling Rust from C makes Rust use the allocator. We know that pure Rust gets to use the builtin jemalloc. But we really want to make sure that C used by Rust will use jemalloc too! And naturally this gets into the static/dynamic linking question, and (for dynamic linking in particular) the differences between platforms, right? It feels like there ought to be some obvious precedent to follow here! Why don't other big C frameworks have this sort of problem? I guess nobody is in quite our position of wanting to simultaneously function as
@nikomatsakis I expect the more general approach is to just use malloc/free, and let the executable link an unprefixed jemalloc implementation if desired, or let the user set Of course, you don't get any advanced jemalloc functionality this way, unless perhaps you create weak fallbacks for those extra functions. |
Ah I should clarify in that I'm not sure how to do this on all platforms. On linux I believe if we just don't prefix jemalloc then "everything should work out", but I'm less certain how to override the system allocator on OSX and Windows. I think we can coerce the system allocator on OSX to be overridden (and jemalloc may already do this), but I haven't tested any of these use cases.
I agree! This is a very good point. I think one of the problems here is that it's a very platform-specific issue. For example on many unixes you can just use Otherwise some C libraries provide the ability to define an allocator (e.g. via a virtual function call), but that's definitely a library-specific concern.
Right. The goal of the current design was to give us the full advantage of jemalloc when Rust was in charge, and fall back to the system allocator otherwise.
pnkfelix
reviewed
Jul 9, 2015
| funnel Rust allocations to the same source as the host application's allocations | ||
| then a crate can be written and linked in. | ||
|
|
||
| Finally, providers of allocators will simply provide a crate to do so, and then |
pnkfelix
Jul 9, 2015
Member
Can you add text to this section (either in this paragraph or in a separate one) spelling out how a client who wants to provide a wrapper around Rust's default allocator (or otherwise instrument it) would do so?
This use case was alluded to, at the end of the motivation section, but I am not 100% clear on how arduous the process will be, in particular whether one will be confident that the allocator one is injecting is truly a wrapper around the allocator that Rust would have selected otherwise (that is, without the injection)
pnkfelix
Jul 9, 2015
Member
(if the answer is "It is indeed a bit arduous to write such a wrapper robustly, e.g. involving cfg switches to select properly between alloc_system and alloc_jemalloc in the alloc crate one is injecting", that is acceptable. I just want to know up front if that is the expectation.)
pnkfelix
Jul 9, 2015
Member
(it's also possible that the answer involves somehow observing the values of lib_allocation_crate and exe_allocation_crate during the compilation of the crate I want to inject, and just assuming they will stay the same at the time of the final link where I am being injected? Still wondering out loud; probably should just wait for @alexcrichton to answer...)
alexcrichton
Jul 9, 2015
Author
Member
Unfortunately this RFC doesn't currently easily allow this sort of instrumentation to happen. If we wanted to support this right out of the gate, this RFC would necessitate four crates:
- Two crates for implementing the allocation API, but not tagged with #![allocator]. There'd be one crate for jemalloc and one for the system.
- Two crates for linking to the previous crates, but are tagged with #![allocator] and redirect the formal allocation API into the desired crate.
In a nutshell, if you want to write an allocator which can be instrumented or shimmed, then you need to write a crate which is not tagged #![allocator] but probably still exposes the allocation API via normal Rust functions. The provider of the allocator would then write their own shims that redirect to the desired allocator after the instrumentation has happened.
Does that make sense? If so I'll add some words.
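The layering described above could be sketched roughly as follows, using modules in place of separate crates. The module names, function names, and the byte counter are all hypothetical; `backend` plays the role of the crate that is not tagged `#![allocator]`, and `instrumented` is the wrapper that a tagged shim crate would then redirect the formal allocation API into.

```rust
/// Stand-in for an allocator crate that is NOT tagged `#![allocator]`:
/// it exposes the allocation API as ordinary Rust functions.
pub mod backend {
    use std::alloc::{GlobalAlloc, Layout, System};

    /// Safety: `size` must be non-zero and `align` a valid alignment.
    pub unsafe fn allocate(size: usize, align: usize) -> *mut u8 {
        match Layout::from_size_align(size, align) {
            Ok(layout) => unsafe { System.alloc(layout) },
            Err(_) => std::ptr::null_mut(),
        }
    }

    /// Safety: `ptr` must come from `allocate(old_size, align)`.
    pub unsafe fn deallocate(ptr: *mut u8, old_size: usize, align: usize) {
        if let Ok(layout) = Layout::from_size_align(old_size, align) {
            unsafe { System.dealloc(ptr, layout) }
        }
    }
}

/// Stand-in for the instrumentation layer written by a third party.
pub mod instrumented {
    use super::backend;
    use std::sync::atomic::{AtomicUsize, Ordering};

    static LIVE_BYTES: AtomicUsize = AtomicUsize::new(0);

    /// Bytes currently allocated through this layer.
    pub fn live_bytes() -> usize {
        LIVE_BYTES.load(Ordering::SeqCst)
    }

    /// Safety: same contract as `backend::allocate`.
    pub unsafe fn allocate(size: usize, align: usize) -> *mut u8 {
        let p = unsafe { backend::allocate(size, align) };
        if !p.is_null() {
            LIVE_BYTES.fetch_add(size, Ordering::SeqCst);
        }
        p
    }

    /// Safety: same contract as `backend::deallocate`.
    pub unsafe fn deallocate(ptr: *mut u8, old_size: usize, align: usize) {
        LIVE_BYTES.fetch_sub(old_size, Ordering::SeqCst);
        unsafe { backend::deallocate(ptr, old_size, align) }
    }
}
```

The final piece, the crate tagged `#![allocator]` that forwards into `instrumented`, is omitted because it depends on the unstable symbols under discussion.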
pnkfelix
Aug 6, 2015
Member
hmm I missed this response back when it was written.
I guess I would have liked for some more concrete details in the RFC regarding use cases like this, i.e. spelling out what the steps are for the expected uses of this RFC, and then also including little sketches like the one in your comment for unexpected use cases.
Anyway I plan to have a shot at playing around with the PR rust-lang/rust#27400 since I am finding myself needing to do some allocation debugging. Perhaps it will inspire me to write an amendment for the RFC with such notes.
alexcrichton added commits to alexcrichton/rust that referenced this pull request (Aug 10 to Aug 14, 2015)
bors added commits to rust-lang/rust that referenced this pull request (Aug 13 to Aug 14, 2015)
froydnj
commented
Apr 15, 2016
|
It would be splendid if the RFC described the semantics of the various Some of these things can be derived from exploring the built-in crates of Rust, but it'd be much nicer for people who have to implement custom allocators to have the function semantics written down somewhere. |
@froydnj this RFC actually intentionally left out the specifications for each symbol (because they're all unstable), and the exact semantics/requirements may change over time (depending on how allocators shake out). So in that sense I don't believe these have been highly scrutinized in terms of solidifying what the semantics should be vs what they do now. Essentially the only "stable implementations" of a custom allocator are You can learn more about what we currently require, however, from reading the |
froydnj
commented
Apr 15, 2016
|
@alexcrichton thanks for the explanation! It seems quite odd to introduce an interface that's stable (that's my understanding of the Rust RFC process, anyway), but then to not define interface semantics because the interface is subject to change over time. I see after a more careful reading that the RFC does call this out, though. I guess at some point these interfaces will be stabilized and then their API will be documented? The |
It's not stable. |
The acceptance of an RFC is only the first step on the road to stability. The implementation of an RFC will almost always land unstable, and can still change after that point, until it is formally stabilized. |
@froydnj yeah as mentioned by @Ericson2314 and @sfackler most of this RFC isn't actually stable. The only stable feature is that dylibs/staticlibs use the system allocator whereas executables use jemalloc. Beyond that everything is unstable and feature gated. Now that being said, if you guys need any help about clarifications of the current implementation or find it falls short, please let me know as I'd love to help out or help tweak the design :) |
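The stable behaviour described here can be modelled in a few lines. This is an illustrative sketch, not the compiler's code: `CrateType`, `TargetSpec`, and `default_allocator` are hypothetical names, the two fields mirror the `lib_allocation_crate` and `exe_allocation_crate` keys mentioned in the RFC text, and treating an rlib as "no allocator chosen yet" is an assumption.

```rust
/// Kinds of output artifact (hypothetical enum for illustration).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CrateType {
    Executable,
    Dylib,
    Staticlib,
    Rlib,
}

/// Hypothetical mirror of the two target-specification keys from the RFC text.
pub struct TargetSpec {
    pub lib_allocation_crate: &'static str,
    pub exe_allocation_crate: &'static str,
}

/// Picks the default allocator crate for an artifact, if one is linked at all.
pub fn default_allocator(target: &TargetSpec, kind: CrateType) -> Option<&'static str> {
    match kind {
        CrateType::Executable => Some(target.exe_allocation_crate),
        CrateType::Dylib | CrateType::Staticlib => Some(target.lib_allocation_crate),
        // Assumption: an rlib defers the decision to whatever final artifact links it.
        CrateType::Rlib => None,
    }
}
```

For a target configured with `alloc_system` for libraries and `alloc_jemalloc` for executables, this yields jemalloc for binaries and the system allocator for dylibs and staticlibs, matching the comment above.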
froydnj
commented
Apr 18, 2016
|
Thanks for the clarifications! I feel enlightened. :) |
alexcrichton commented Jun 30, 2015 (edited by mbrubeck)
Add support to the compiler to override the default allocator, allowing a
different allocator to be used by default in Rust programs. Additionally,
switch the default allocator for dynamic libraries and static libraries to using
the system malloc instead of jemalloc.