
Allocator traits and std::heap #32838

Open
6 of 12 tasks
nikomatsakis opened this issue Apr 8, 2016 · 414 comments
Labels
A-allocators B-RFC-approved B-unstable C-tracking-issue disposition-merge finished-final-comment-period Libs-Tracked S-tracking-needs-summary T-lang T-libs-api

Comments

@nikomatsakis
Contributor

nikomatsakis commented Apr 8, 2016

📢 This feature has a dedicated working group; please direct comments and concerns to the working group's repo.

The remainder of this post is no longer an accurate summary of the current state; see that dedicated working group instead.

Old content

Original Post:


FCP proposal: #32838 (comment)
FCP checkboxes: #32838 (comment)


Tracking issue for rust-lang/rfcs#1398 and the std::heap module.

State of std::heap after #42313:

pub struct Layout { /* ... */ }

impl Layout {
    pub fn new<T>() -> Self;
    pub fn for_value<T: ?Sized>(t: &T) -> Self;
    pub fn array<T>(n: usize) -> Option<Self>;
    pub fn from_size_align(size: usize, align: usize) -> Option<Layout>;
    pub unsafe fn from_size_align_unchecked(size: usize, align: usize) -> Layout;

    pub fn size(&self) -> usize;
    pub fn align(&self) -> usize;
    pub fn align_to(&self, align: usize) -> Self;
    pub fn padding_needed_for(&self, align: usize) -> usize;
    pub fn repeat(&self, n: usize) -> Option<(Self, usize)>;
    pub fn extend(&self, next: Self) -> Option<(Self, usize)>;
    pub fn repeat_packed(&self, n: usize) -> Option<Self>;
    pub fn extend_packed(&self, next: Self) -> Option<(Self, usize)>;
}

pub enum AllocErr {
    Exhausted { request: Layout },
    Unsupported { details: &'static str },
}

impl AllocErr {
    pub fn invalid_input(details: &'static str) -> Self;
    pub fn is_memory_exhausted(&self) -> bool;
    pub fn is_request_unsupported(&self) -> bool;
    pub fn description(&self) -> &str;
}

pub struct CannotReallocInPlace;

pub struct Excess(pub *mut u8, pub usize);

pub unsafe trait Alloc {
    // required
    unsafe fn alloc(&mut self, layout: Layout) -> Result<*mut u8, AllocErr>;
    unsafe fn dealloc(&mut self, ptr: *mut u8, layout: Layout);

    // provided
    fn oom(&mut self, _: AllocErr) -> !;
    fn usable_size(&self, layout: &Layout) -> (usize, usize);
    unsafe fn realloc(&mut self,
                      ptr: *mut u8,
                      layout: Layout,
                      new_layout: Layout) -> Result<*mut u8, AllocErr>;
    unsafe fn alloc_zeroed(&mut self, layout: Layout) -> Result<*mut u8, AllocErr>;
    unsafe fn alloc_excess(&mut self, layout: Layout) -> Result<Excess, AllocErr>;
    unsafe fn realloc_excess(&mut self,
                             ptr: *mut u8,
                             layout: Layout,
                             new_layout: Layout) -> Result<Excess, AllocErr>;
    unsafe fn grow_in_place(&mut self,
                            ptr: *mut u8,
                            layout: Layout,
                            new_layout: Layout) -> Result<(), CannotReallocInPlace>;
    unsafe fn shrink_in_place(&mut self,
                              ptr: *mut u8,
                              layout: Layout,
                              new_layout: Layout) -> Result<(), CannotReallocInPlace>;

    // convenience
    fn alloc_one<T>(&mut self) -> Result<Unique<T>, AllocErr>
        where Self: Sized;
    unsafe fn dealloc_one<T>(&mut self, ptr: Unique<T>)
        where Self: Sized;
    fn alloc_array<T>(&mut self, n: usize) -> Result<Unique<T>, AllocErr>
        where Self: Sized;
    unsafe fn realloc_array<T>(&mut self,
                               ptr: Unique<T>,
                               n_old: usize,
                               n_new: usize) -> Result<Unique<T>, AllocErr>
        where Self: Sized;
    unsafe fn dealloc_array<T>(&mut self, ptr: Unique<T>, n: usize) -> Result<(), AllocErr>
        where Self: Sized;
}

/// The global default allocator
pub struct Heap;

impl Alloc for Heap {
    // ...
}

impl<'a> Alloc for &'a Heap {
    // ...
}

/// The "system" allocator
pub struct System;

impl Alloc for System {
    // ...
}

impl<'a> Alloc for &'a System {
    // ...
}
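
For orientation, here is a minimal usage sketch of the API above as it looked on nightly at the time. This is a sketch only: the `#![feature(allocator_api)]` gate and `Layout: Clone` reflect that era's unstable interface, which has since changed.

```rust
#![feature(allocator_api)]

use std::heap::{Alloc, Heap, Layout};

fn main() {
    unsafe {
        // Describe the size/alignment of a single u64.
        let layout = Layout::new::<u64>();
        // Allocate from the global default allocator, aborting on failure.
        let ptr = Heap.alloc(layout.clone()).unwrap_or_else(|e| Heap.oom(e));
        *(ptr as *mut u64) = 42;
        // Deallocation must be given the same layout used to allocate.
        Heap.dealloc(ptr, layout);
    }
}
```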
@nikomatsakis nikomatsakis added B-RFC-approved T-lang T-libs-api B-unstable labels Apr 8, 2016
@gereeter
Contributor

gereeter commented Apr 11, 2016

I unfortunately wasn't paying close enough attention to mention this in the RFC discussion, but I think that realloc_in_place should be replaced by two functions, grow_in_place and shrink_in_place, for two reasons:

  • I can't think of a single use case (short of implementing realloc or realloc_in_place) where it is unknown whether the size of the allocation is increasing or decreasing. Using more specialized methods makes it slightly more clear what is going on.
  • The code paths for growing and shrinking allocations tend to be radically different - growing involves testing whether adjacent blocks of memory are free and claiming them, while shrinking involves carving off properly sized subblocks and freeing them. While the cost of a branch inside realloc_in_place is quite small, using grow and shrink better captures the distinct tasks that an allocator needs to perform.

Note that these can be added backwards-compatibly next to realloc_in_place, but this would constrain which functions would be by default implemented in terms of which others.

For consistency, realloc would probably also want to be split into grow and shrink, but the only advantage to having an overloadable realloc function that I know of is to be able to use mmap's remap option, which does not have such a distinction.

@gereeter
Contributor

gereeter commented Apr 11, 2016

Additionally, I think that the default implementations of realloc and realloc_in_place should be slightly adjusted - instead of checking against the usable_size, realloc should just first try to realloc_in_place. In turn, realloc_in_place should by default check against the usable size and return success in the case of a small change instead of universally returning failure.

This makes it easier to produce a high-performance implementation of realloc: all that is required is improving realloc_in_place. However, the default performance of realloc does not suffer, as the check against the usable_size is still performed.
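
Concretely, the layering proposed here might look like the following. This is a sketch only, written against the `grow_in_place`/`shrink_in_place` split from the API above; the allocate-copy-deallocate fallback body is illustrative, not the actual default.

```rust
// Assumed to live inside the `Alloc` trait from the issue description,
// with `core::{cmp, ptr}` in scope.
unsafe fn realloc(&mut self,
                  ptr: *mut u8,
                  layout: Layout,
                  new_layout: Layout) -> Result<*mut u8, AllocErr> {
    // First try to resize in place; the in-place methods themselves
    // default to checking `usable_size`, so small changes succeed here.
    let in_place = if new_layout.size() >= layout.size() {
        self.grow_in_place(ptr, layout.clone(), new_layout.clone())
    } else {
        self.shrink_in_place(ptr, layout.clone(), new_layout.clone())
    };
    match in_place {
        Ok(()) => Ok(ptr),
        Err(CannotReallocInPlace) => {
            // Fall back: fresh allocation, copy the overlap, free the old block.
            let new_ptr = self.alloc(new_layout.clone())?;
            ptr::copy_nonoverlapping(ptr,
                                     new_ptr,
                                     cmp::min(layout.size(), new_layout.size()));
            self.dealloc(ptr, layout);
            Ok(new_ptr)
        }
    }
}
```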

eddyb added a commit to eddyb/rust that referenced this issue Oct 18, 2016
鈥akis

`#[may_dangle]` attribute

`#[may_dangle]` attribute

Second step of rust-lang#34761. Last big hurdle before we can work in earnest towards Allocator integration (rust-lang#32838)

Note: I am not clear if this is *also* a syntax-breaking change that needs to be part of a breaking-batch.
@pnkfelix
Member

pnkfelix commented Oct 26, 2016

Another issue: The doc for fn realloc_in_place says that if it returns Ok, then one is assured that ptr now "fits" new_layout.

To me this implies that it must check that the alignment of the given address matches any constraint implied by new_layout.

However, I don't think the spec for the underlying fn reallocate_inplace function implies that it will perform any such check.

  • Furthermore, it seems reasonable that any client diving into using fn realloc_in_place will themselves be ensuring that the alignments work (in practice I suspect it means that the same alignment is required everywhere for the given use case...)

So, should the implementation of fn realloc_in_place really be burdened with checking that the alignment of the given ptr is compatible with that of new_layout? It is probably better in this case (of this one method) to push that requirement back to the caller...
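
For example (illustrative only, using the names from the API above), pushing the alignment requirement to the caller might look like:

```rust
// Sketch: the caller asserts alignment compatibility once, up front,
// so the allocator's in-place path need not re-check it.
unsafe fn try_grow<A: Alloc>(a: &mut A,
                             ptr: *mut u8,
                             old: Layout,
                             new: Layout) -> Result<(), CannotReallocInPlace> {
    assert_eq!(old.align(), new.align(),
               "caller must keep alignment compatible");
    a.grow_in_place(ptr, old, new)
}
```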

@pnkfelix
Member

pnkfelix commented Oct 26, 2016

@gereeter you make good points; I will add them to the check list I am accumulating in the issue description.

@pnkfelix
Member

pnkfelix commented Oct 31, 2016

(at this point I am waiting for #[may_dangle] support to ride the train into the beta channel so that I will then be able to use it for std collections as part of allocator integration)

@joshlf
Contributor

joshlf commented Jan 4, 2017

I'm new to Rust, so forgive me if this has been discussed elsewhere.

Is there any thought on how to support object-specific allocators? Some allocators such as slab allocators and magazine allocators are bound to a particular type, and do the work of constructing new objects, caching constructed objects which have been "freed" (rather than actually dropping them), returning already-constructed cached objects, and dropping objects before freeing the underlying memory to an underlying allocator when required.

Currently, this proposal doesn't include anything along the lines of ObjectAllocator<T>, but it would be very helpful. In particular, I'm working on an implementation of a magazine allocator object-caching layer (link above), and while I can have this only wrap an Allocator and do the work of constructing and dropping objects in the caching layer itself, it'd be great if I could also have this wrap other object allocators (like a slab allocator) and truly be a generic caching layer.

Where would an object allocator type or trait fit into this proposal? Would it be left for a future RFC? Something else?

@Ericson2314
Contributor

Ericson2314 commented Jan 4, 2017

I don't think this has been discussed yet.

You could write your own ObjectAllocator<T>, and then do impl<T: Allocator, U> ObjectAllocator<U> for T { .. }, so that every regular allocator can serve as an object-specific allocator for all objects.

Future work would be modifying collections to use your trait for their nodes, instead of plain ole' (generic) allocators directly.
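
A rough sketch of what that blanket impl could look like (the trait name, method names, and shapes below are all hypothetical; nothing like this is in the RFC):

```rust
use std::heap::{Alloc, AllocErr}; // unstable at the time
use std::ptr::{self, Unique};     // `Unique` was also unstable

// Hypothetical type-aware allocator trait layered over the raw `Alloc`.
pub unsafe trait ObjectAllocator<T> {
    unsafe fn alloc_obj(&mut self) -> Result<Unique<T>, AllocErr>;
    unsafe fn dealloc_obj(&mut self, obj: Unique<T>);
}

// Blanket impl: every raw allocator serves as a trivial object allocator
// that does no caching -- it pairs `alloc_one` with drop-then-`dealloc_one`.
unsafe impl<U, T: Alloc> ObjectAllocator<U> for T {
    unsafe fn alloc_obj(&mut self) -> Result<Unique<U>, AllocErr> {
        self.alloc_one::<U>()
    }
    unsafe fn dealloc_obj(&mut self, obj: Unique<U>) {
        ptr::drop_in_place(obj.as_ptr());
        self.dealloc_one(obj)
    }
}
```

A caching implementation (slab, magazine) would instead keep freed objects constructed and hand them back from `alloc_obj` rather than going to the raw allocator each time.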

@nikomatsakis
Contributor Author

nikomatsakis commented Jan 4, 2017

@pnkfelix

> (at this point I am waiting for #[may_dangle] support to ride the train into the beta channel so that I will then be able to use it for std collections as part of allocator integration)

I guess this has happened?

@joshlf
Contributor

joshlf commented Jan 4, 2017

@Ericson2314 Yeah, writing my own is definitely an option for experimental purposes, but I think there'd be much more benefit to it being standardized in terms of interoperability (for example, I plan on also implementing a slab allocator, but it would be nice if a third-party user of my code could use somebody else's slab allocator with my magazine caching layer). My question is simply whether an ObjectAllocator<T> trait or something like it is worth discussing. Although it seems that it might be best for a different RFC? I'm not terribly familiar with the guidelines for how much belongs in a single RFC and when things belong in separate RFCs...

@steveklabnik
Member

steveklabnik commented Jan 4, 2017

@joshlf

> Where would an object allocator type or trait fit into this proposal? Would it be left for a future RFC? Something else?

Yes, it would be another RFC.

> I'm not terribly familiar with the guidelines for how much belongs in a single RFC and when things belong in separate RFCs...

That depends on the scope of the RFC itself, which is decided by the person who writes it; feedback is then given by everyone.

But really, as this is a tracking issue for this already-accepted RFC, thinking about extensions and design changes isn't really for this thread; you should open a new one over on the RFCs repo.

@Ericson2314
Contributor

Ericson2314 commented Jan 4, 2017

@joshlf Ah, I thought ObjectAllocator<T> was supposed to be a trait. I meant prototype the trait not a specific allocator. Yes that trait would merit its own RFC as @steveklabnik says.


@steveklabnik Yeah, discussion would now be better elsewhere. But @joshlf was also raising the issue in case it exposed a hitherto unforeseen flaw in the accepted but unimplemented API design. In that sense it matches the earlier posts in this thread.

@joshlf
Contributor

joshlf commented Jan 4, 2017

@Ericson2314 Yeah, I thought that was what you meant. I think we're on the same page :)

@steveklabnik Sounds good; I'll poke around with my own implementation and submit an RFC if it ends up seeming like a good idea.

@alexreg
Contributor

alexreg commented Jan 4, 2017

@joshlf I don't see any reason why custom allocators would go into the compiler or standard library. Once this RFC lands, you could easily publish your own crate that does an arbitrary sort of allocation (even a fully-fledged allocator like jemalloc could be custom-implemented!).

@joshlf
Contributor

joshlf commented Jan 4, 2017

@alexreg This isn't about a particular custom allocator, but rather a trait that specifies the type of all allocators which are parametric on a particular type. So just like RFC 1398 defines a trait (Allocator) that is the type of any low-level allocator, I'm asking about a trait (ObjectAllocator<T>) that is the type of any allocator which can allocate/deallocate and construct/drop objects of type T.

@Ericson2314
Contributor

Ericson2314 commented Jan 4, 2017

@alexreg See my earlier point about using standard library collections with custom object-specific allocators.

@alexreg
Contributor

alexreg commented Jan 4, 2017

Sure, but I'm not sure that would belong in the standard library. Could easily go into another crate, with no loss of functionality or usability.

@alexreg
Contributor

alexreg commented Jan 4, 2017

I think you'd want to use standard-library collections (any heap-allocated value) with an arbitrary custom allocator; i.e. not limited to object-specific ones.

@joshlf
Contributor

joshlf commented Jan 4, 2017

> Sure, but I'm not sure that would belong in the standard library. Could easily go into another crate, with no loss of functionality or usability.

Yes, but you probably want some standard library functionality to rely on it (such as what @Ericson2314 suggested).

> I think you'd want to use standard-library collections (any heap-allocated value) with an arbitrary custom allocator; i.e. not limited to object-specific ones.

Ideally you'd want both - to accept either type of allocator. There are very significant benefits to object-specific caching: for example, both slab allocation and magazine caching deliver substantial performance wins - take a look at the papers I linked to above if you're curious.

@alexreg
Contributor

alexreg commented Jan 4, 2017

But the object allocator trait could simply be a subtrait of the general allocator trait. It's as simple as that, as far as I'm concerned. Sure, certain types of allocators can be more efficient than general-purpose allocators, but neither the compiler nor the standard library really needs to (or indeed should) know about this.

@joshlf
Contributor

joshlf commented Jan 4, 2017

> But the object allocator trait could simply be a subtrait of the general allocator trait. It's as simple as that, as far as I'm concerned. Sure, certain types of allocators can be more efficient than general-purpose allocators, but neither the compiler nor the standard library really needs to (or indeed should) know about this.

Ah, so the problem is that the semantics are different. Allocator allocates and frees raw byte blobs. ObjectAllocator<T>, on the other hand, would allocate already-constructed objects and would also be responsible for dropping these objects (including being able to cache constructed objects which could be handed out later in lieu of constructing a newly-allocated object, which is expensive). The trait would look something like this:

trait ObjectAllocator<T> {
    fn alloc(&mut self) -> T;
    fn free(&mut self, t: T);
}

This is not compatible with Allocator, whose methods deal with raw pointers and have no notion of type. Additionally, with Allocators, it is the caller's responsibility to drop the object being freed first. This is really important - knowing about the type T allows ObjectAllocator<T> to do things like call T's drop method, and since free(t) moves t into free, the caller cannot drop t first - it is instead the ObjectAllocator<T>'s responsibility. Fundamentally, these two traits are incompatible with one another.
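
Purely for illustration, here is a toy caching allocator of that shape (every name in it is hypothetical):

```rust
// Because `free` takes `t` by value, the allocator -- not the caller --
// decides whether to cache the object for reuse or drop it.
struct CachingAllocator<T> {
    cache: Vec<T>,
    max_cached: usize,
}

impl<T: Default> CachingAllocator<T> {
    fn alloc(&mut self) -> T {
        // Hand back a previously freed object if one is cached;
        // otherwise construct a fresh one.
        self.cache.pop().unwrap_or_else(T::default)
    }

    fn free(&mut self, t: T) {
        if self.cache.len() < self.max_cached {
            self.cache.push(t); // kept constructed for later reuse
        } // else: `t` goes out of scope and is dropped here
    }
}
```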

@KodrAus KodrAus added the Libs-Tracked label Jul 29, 2020
jacob-hughes added a commit to jacob-hughes/rustgc that referenced this issue Sep 23, 2020
At first this seems counter-intuitive: why put memory that should be
managed with RAII under control of the collector? However, this is
necessary to prevent leaks if we want to take advantage of `NoFinalize`
optimisations. This is best explained with an example:

    let bigvec: Vec<usize> = ...;
    let gc = Gc::new(bigvec);

Here, `bigvec` has a `drop` method which does the following: recursively
call drop on each element; then, deallocate the memory for the backing store
(`RawVec`). When `bigvec` is moved into `Gc`, its `drop` method is
converted from a destructor to a finalizer, and may run later on when
Boehm determines that it's no longer reachable. Unfortunately, this kind
of finalization is very expensive, and leads to bottlenecks in the
collector which can make it impractical to use.

The solution to this is to use the `NoFinalize` marker trait, e.g.:

    impl NoFinalize for Vec<usize> {}

This tells the collector that calling `bigvec`'s `drop` method is
unnecessary: there is no need to recursively call `drop` on the element
type (`usize`) since it's trivially destructible. This is essential to
win back the performance lost to excess finalization.

However, recursively calling drop on element types is only one part of
`Vec<usize>::drop`: it also deallocates the backing `RawVec`.  Omitting
the finalizer -- as would be the case when marked `NoFinalize` -- means
that the `RawVec`'s memory will leak. This is why we now allocate *all*
memory through `gc_malloc`. From now on, just like with `Gc` values, the
`RawVec` will be reclaimed when the collector sees that it is
unreachable.

There are two major concerns about this change which immediately jump out:

    1. We now have non-deterministic freeing for RAII values, which
    breaks a user's mental model of how Rust manages memory.

    2. Using `gc_malloc` to manage non-gc'd memory will have performance
    overhead.

Fortunately, I think these concerns are unfounded. First, `gc_free`
works with `gc_malloc`, just as it does with `gc_malloc_uncollectable`,
and we *have not* removed `free` calls for standard non-gc Rust
allocation. In essence, this is identical to what we had before with the
exception that leaks are cleaned up later on by the collector.

Second, the performance costs of using `gc_malloc` over
`gc_malloc_uncollectable` are not hugely different, since blocks
allocated with the latter call are still scanned for GC pointers during
marking anyway. The table below shows a comparison of using `gc_malloc`
and `gc_malloc_uncollectable` for non-GC'd Rust values with no
additional optimizations.

--------------------------------------------------------------------------------
  Benchmark     Executor                  Suite   Extra   Core  #Smpls Mean (ms)
--------------------------------------------------------------------------------
  Bounce        gc_malloc                 micro       2      1   10          71
  Bounce        gc_malloc_uncollectable   micro       2      1   10         105
  BubbleSort    gc_malloc                 micro       3      1   10          81
  BubbleSort    gc_malloc_uncollectable   micro       3      1   10         100
  DeltaBlue     gc_malloc                 macro      50      1   10          90
  DeltaBlue     gc_malloc_uncollectable   macro      50      1   10         109
  Dispatch      gc_malloc                 micro       2      1   10          46
  Dispatch      gc_malloc_uncollectable   micro       2      1   10          73
  Fannkuch      gc_malloc                 micro       6      1   10          54
  Fannkuch      gc_malloc_uncollectable   micro       6      1   10          53
  Fibonacci     gc_malloc                 micro       3      1   10         226
  Fibonacci     gc_malloc_uncollectable   micro       3      1   10         232
  FieldLoop     gc_malloc                 micro       1      1   10          36
  FieldLoop     gc_malloc_uncollectable   micro       1      1   10          55
  GraphSearch   gc_malloc                 macro       4      1   10          24
  GraphSearch   gc_malloc_uncollectable   macro       4      1   10          27
  IntegerLoop   gc_malloc                 micro       2      1   10          89
  IntegerLoop   gc_malloc_uncollectable   micro       2      1   10         147
  JsonSmall     gc_malloc                 macro       1      1   10          99
  JsonSmall     gc_malloc_uncollectable   macro       1      1   10         102
  List          gc_malloc                 micro       2      1   10         106
  List          gc_malloc_uncollectable   micro       2      1   10         109
  Loop          gc_malloc                 micro       5      1   10         130
  Loop          gc_malloc_uncollectable   micro       5      1   10         170
  Mandelbrot    gc_malloc                 micro      30      1   10         121
  Mandelbrot    gc_malloc_uncollectable   micro      30      1   10         183
  NBody         gc_malloc                 macro     500      1   10          65
  NBody         gc_malloc_uncollectable   macro     500      1   10          64
  PageRank      gc_malloc                 macro      40      1   10         107
  PageRank      gc_malloc_uncollectable   macro      40      1   10         113
  Permute       gc_malloc                 micro       3      1   10          93
  Permute       gc_malloc_uncollectable   micro       3      1   10         111
  Queens        gc_malloc                 micro       2      1   10          69
  Queens        gc_malloc_uncollectable   micro       2      1   10          95
  QuickSort     gc_malloc                 micro       1      1   10          33
  QuickSort     gc_malloc_uncollectable   micro       1      1   10          40
  Recurse       gc_malloc                 micro       3      1   10          67
  Recurse       gc_malloc_uncollectable   micro       3      1   10          96
  Richards      gc_malloc                 macro       1      1   10        2247
  Richards      gc_malloc_uncollectable   macro       1      1   10        2282
  Sieve         gc_malloc                 micro       4      1   10         131
  Sieve         gc_malloc_uncollectable   micro       4      1   10         185
  Storage       gc_malloc                 micro       1      1   10          61
  Storage       gc_malloc_uncollectable   micro       1      1   10          53
  Sum           gc_malloc                 micro       2      1   10          45
  Sum           gc_malloc_uncollectable   micro       2      1   10          72
  Towers        gc_malloc                 micro       2      1   10         111
  Towers        gc_malloc_uncollectable   micro       2      1   10         116
  TreeSort      gc_malloc                 micro       1      1   10         150
  TreeSort      gc_malloc_uncollectable   micro       1      1   10         143
  WhileLoop     gc_malloc                 micro      10      1   10         108
  WhileLoop     gc_malloc_uncollectable   micro      10      1   10         171

Surprisingly, `gc_malloc` appears to perform better in a few benchmarks;
I'd speculate this is due to locality, since `gc_malloc_uncollectable`
always allocates in pages distinct from GC-managed blocks.

This does feel a bit like using a sledgehammer to crack a nut: ideally
we would only want to replace `gc_malloc_uncollectable` with `gc_malloc`
for certain collections (like `Vec`) where this problem can arise.
Unfortunately, this is not practical until Rust provides a way to use
collections with custom allocators [1]. An even more ideal solution
would be to use this in combination with _destination propagation_:
allocating a value directly on the GC heap when it can be determined
statically that it gets moved into a GC. I've prototyped the latter (see
cb06b9), and right now, without proper custom allocation support, the
benefits don't justify the sheer complexity it adds.

[1]: rust-lang#32838
bors bot added a commit to softdevteam/rustgc that referenced this issue Sep 23, 2020
13: Replace gc_malloc_uncollectable with gc_malloc for non-GC'd values r=ltratt a=jacob-hughes

Co-authored-by: Jacob Hughes <jh@jakehughes.uk>
@adsnaider

adsnaider commented Nov 23, 2021

Figured I would put this in here. I ran into a compiler bug when using the allocator_api feature with Box: #90911

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Nov 23, 2021
鈥locator-api-in-tuple, r=davidtwco

Suggestion to wrap inner types using 'allocator_api' in tuple

This PR adds a suggestion to wrap the inner types in a tuple when they appear together with 'allocator_api'.

Closes rust-lang#83250

```rust
fn main() {
    let _vec: Vec<u8, _> = vec![]; //~ ERROR use of unstable library feature 'allocator_api'
}
```

```diff
 error[E0658]: use of unstable library feature 'allocator_api'
   --> $DIR/suggest-vec-allocator-api.rs:2:23
    |
 LL |     let _vec: Vec<u8, _> = vec![];
-   |                       ^
+   |                   ----^
+   |                   |
+   |                   help: consider wrapping the inner types in tuple: `(u8, _)`
    |
    = note: see issue rust-lang#32838 <rust-lang#32838> for more information
    = help: add `#![feature(allocator_api)]` to the crate attributes to enable
```
@joshtriplett joshtriplett added the S-tracking-needs-summary label Jan 19, 2022
@RalfJung RalfJung added the A-allocators label Jul 4, 2022
@RalfJung
Member

RalfJung commented Jul 4, 2022

Figured I would put this in here. I ran into a compiler bug when using the allocator_api feature with Box: #90911

In fact this feature is littered with ICEs. That's because it fits quite badly with MIR and the MIR-consuming parts of the compiler -- codegen and Miri both need to do lots of special-casing for Box, and adding an allocator field made it all a lot worse. See #95453 for more details. IMO we shouldn't stabilize Box<T, A> until these design issues are resolved.
