Tracking issue for custom allocators in standard collections #42774

alexcrichton · 2017-06-20T15:11:06Z

The new Alloc trait defined in RFC 1398 was added in #42313, but the libs team decided to hold off on Vec integration for now to be conservative.

This issue is intended to track the integration of the Alloc trait into standard collections like Vec, BTreeMap, etc. I'm not personally aware of any current blockers, but what we'll probably want to do as part of this issue includes:

Develop a set of conventions for collections which support custom allocators
Ensure no regressions with integration of a custom allocator:
- error messages should look as they do today
- no performance regressions at runtime
- ideally minimal compile-time impact


Ericson2314 · 2017-06-20T15:40:54Z

There are some very interesting issues here:

"Flat collections" (Vec, VecDeque) can easily store a &mut to the allocator they were created with; indirection-heavy ones would be significantly bloated if every Box also had an allocator reference.
Ideally all collections would be resilient in the face of allocation failure, but that will take some major algorithm design work. It's better to convert them quickly, still blowing up on OOM as they do today.
I had the idea of doing trait InfallableAllocator = Allocator<Error=!> to signify an allocator that wouldn't fail (and thus must panic on OOM). If, when we convert collections en masse, we have them take an InfallableAllocator parameter, it will be clearer to the end user that the collections are not OOM-resilient, and there would be fewer panics strewn about.
- I'm not up to date with all the discussion of invalid Layouts, so I'll need to double-check that ! wouldn't instead need to be some struct Unsupported;. Then again, it isn't like the collections will be able to handle an unsupported layout either...
Not sure what the ETA on default parameters is, so we're probably best off re-exporting collections in std with the allocator parameter specialized to the global allocator. This will work today, right?
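A minimal sketch of the infallible-allocator idea, under stated assumptions: the `Allocator` trait below is a toy model with an associated `Error` type (not the real `Alloc` trait), `std::convert::Infallible` stands in for `!` so the code builds on stable Rust, and the marker trait is spelled `InfallibleAllocator` here. `Budget`, `PanicOnOom`, and `reserve` are hypothetical names used only for illustration.

```rust
use std::convert::Infallible;

/// Toy allocator trait with an associated error type (an assumption; the
/// real `Alloc` trait in this thread has a fixed error type).
trait Allocator {
    type Error;
    /// Pretends to allocate `size` bytes and returns the amount handed out.
    fn alloc(&mut self, size: usize) -> Result<usize, Self::Error>;
}

/// Stable stand-in for `trait InfallableAllocator = Allocator<Error = !>`.
trait InfallibleAllocator: Allocator<Error = Infallible> {}
impl<A: Allocator<Error = Infallible>> InfallibleAllocator for A {}

#[derive(Debug, PartialEq)]
struct Exhausted;

/// Fallible toy allocator that tracks a byte budget instead of real memory.
struct Budget {
    remaining: usize,
}

impl Allocator for Budget {
    type Error = Exhausted;
    fn alloc(&mut self, size: usize) -> Result<usize, Exhausted> {
        if size > self.remaining {
            return Err(Exhausted);
        }
        self.remaining -= size;
        Ok(size)
    }
}

/// Adapter that turns every failure into a panic, so the error type
/// becomes uninhabited.
struct PanicOnOom<A>(A);

impl<A: Allocator> Allocator for PanicOnOom<A> {
    type Error = Infallible;
    fn alloc(&mut self, size: usize) -> Result<usize, Infallible> {
        match self.0.alloc(size) {
            Ok(n) => Ok(n),
            Err(_) => panic!("out of memory"),
        }
    }
}

/// A collection-style operation that accepts only infallible allocators,
/// so it needs no error path of its own.
fn reserve<A: InfallibleAllocator>(a: &mut A, size: usize) -> usize {
    match a.alloc(size) {
        Ok(n) => n,
        Err(never) => match never {},
    }
}

fn main() {
    let mut a = PanicOnOom(Budget { remaining: 64 });
    println!("reserved {} bytes", reserve(&mut a, 16));
}
```

The bound on `reserve` makes the "panics on OOM" policy visible in the signature, which is the readability benefit the comment argues for.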

joshlf · 2017-06-20T16:10:42Z

@Ericson2314

I had the idea of doing trait InfallableAllocator = Allocator<Error=!> to signify an allocator that wouldn't fail (and thus must panic on OOM). If, when we convert collections en masse, we have them take an InfallableAllocator parameter, it will be clearer to the end user that the collections are not OOM-resilient, and there would be fewer panics strewn about.

I believe that the current design doesn't allow you to parametrize the error, so it must return the concrete error type with two arms - Unsupported and Exhausted - and if you want to panic on OOM, it's your (the client's) responsibility to do that by panicking yourself or by calling the oom method.
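For contrast, a rough sketch of the design as described in this comment: one concrete error type with two arms, and the client choosing to panic. The enum is simplified (the real variants may carry extra data), and `Budget`, `oom`, and `alloc_or_panic` are hypothetical stand-ins rather than the actual API.

```rust
/// Simplified stand-in for the concrete error type with two arms.
#[derive(Debug, PartialEq)]
enum AllocErr {
    Unsupported,
    Exhausted,
}

/// Toy allocator that tracks a byte budget instead of real memory.
struct Budget {
    remaining: usize,
}

impl Budget {
    fn alloc(&mut self, size: usize) -> Result<usize, AllocErr> {
        if size == 0 {
            // This toy treats zero-sized requests as unsupported.
            return Err(AllocErr::Unsupported);
        }
        if size > self.remaining {
            return Err(AllocErr::Exhausted);
        }
        self.remaining -= size;
        Ok(size)
    }

    /// Hypothetical counterpart of the `oom` method: never returns.
    fn oom(&self, err: AllocErr) -> ! {
        panic!("allocation failed: {:?}", err)
    }
}

/// The client, not the allocator, decides the policy: here, panic on failure.
fn alloc_or_panic(a: &mut Budget, size: usize) -> usize {
    match a.alloc(size) {
        Ok(n) => n,
        Err(e) => a.oom(e),
    }
}

fn main() {
    let mut a = Budget { remaining: 32 };
    println!("got {} bytes", alloc_or_panic(&mut a, 8));
}
```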

Ericson2314 · 2017-06-20T17:01:38Z

@joshlf err yes, this is predicated on us adding an associated error type.

ghost · 2017-07-11T11:56:59Z

To support the allocator trait in the collection types, the constructors will need to be duplicated.
@pnkfelix has used the _in suffix and appended the allocator to the list of arguments:

impl<T, A: Alloc> RawVec<T, A> {
    fn with_capacity(n: usize) -> RawVec<T, HeapAlloc>;
    fn with_capacity_in(n: usize, a: A) -> RawVec<T, A>;
}

I personally quite like this approach.
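The `_in` convention can be sketched with a toy collection. Everything here is an assumption made for illustration: `Alloc` is reduced to a single `name` method, `Buffer` is backed by an ordinary `Vec`, and `HeapAlloc` and `Arena` are placeholder allocator types.

```rust
/// Hypothetical minimal allocator trait, a stand-in for `Alloc`.
trait Alloc {
    fn name(&self) -> &'static str;
}

struct HeapAlloc;
impl Alloc for HeapAlloc {
    fn name(&self) -> &'static str {
        "heap"
    }
}

struct Arena;
impl Alloc for Arena {
    fn name(&self) -> &'static str {
        "arena"
    }
}

/// Toy collection that remembers which allocator it was created with.
struct Buffer<T, A: Alloc> {
    items: Vec<T>,
    alloc: A,
}

/// The familiar constructor exists only for the default allocator.
impl<T> Buffer<T, HeapAlloc> {
    fn with_capacity(n: usize) -> Self {
        Self::with_capacity_in(n, HeapAlloc)
    }
}

/// The `_in` variant takes the allocator as the last argument.
impl<T, A: Alloc> Buffer<T, A> {
    fn with_capacity_in(n: usize, alloc: A) -> Self {
        Buffer { items: Vec::with_capacity(n), alloc }
    }

    fn capacity(&self) -> usize {
        self.items.capacity()
    }

    fn allocator_name(&self) -> &'static str {
        self.alloc.name()
    }
}

fn main() {
    let a = Buffer::<u8, HeapAlloc>::with_capacity(4);
    let b = Buffer::<u8, Arena>::with_capacity_in(4, Arena);
    println!("{} ({}), {} ({})", a.allocator_name(), a.capacity(), b.allocator_name(), b.capacity());
}
```

Existing callers of `with_capacity` keep compiling unchanged, while allocator-aware code opts in through the `_in` variant.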

ghost · 2017-07-14T15:46:52Z

@Ericson2314 mentioned this:

"Flat collections" (Vec, VecDeque) can easily store a &mut to the allocator they were created with; indirection-heavy ones would be significantly bloated if every Box also had an allocator reference.

I am currently implementing a non-trivial allocator myself (for a relocatable heap) and noticed that it doesn't need to keep anything around for deallocation (the type information is still needed).

This still works when multiple allocators are used.

I wonder if this could be integrated somehow? Associated types to indicate what needs to be stored?

joshlf · 2017-07-14T17:27:25Z

@s3bk

I am currently implementing a non-trivial allocator myself (for a relocatable heap) and noticed that it doesn't need to keep anything around for deallocation (the type information is still needed).

This still works when multiple allocators are used.

I wonder if this could be integrated somehow? Associated types to indicate what needs to be stored?

So the idea is that, for allocators where all instances of the type refer to the same global singleton, you wouldn't actually need to store a reference? Presumably that wouldn't work for allocators that could actually have multiple distinct instances, right?

This motivates a follow-up: if you were implementing a global singleton, couldn't you just make the actual Alloc type that you expose to users be a ZST that delegates to some private instance under the hood?
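A small sketch of the zero-sized-handle idea, assuming a hypothetical `Pool` allocator whose state lives in a private static. The "allocations" only adjust a byte counter, which is enough to show that every handle reaches the same instance and that storing a handle costs no space.

```rust
use std::mem::size_of;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Private singleton state shared by every handle (hypothetical).
struct PoolState {
    bytes_in_use: AtomicUsize,
}

static POOL: PoolState = PoolState { bytes_in_use: AtomicUsize::new(0) };

/// Public zero-sized handle: all values of this type delegate to `POOL`.
#[derive(Clone, Copy, Default)]
pub struct Pool;

impl Pool {
    /// Records an allocation and returns the new total in use.
    pub fn alloc(&self, size: usize) -> usize {
        POOL.bytes_in_use.fetch_add(size, Ordering::SeqCst) + size
    }

    /// Records a deallocation and returns the new total in use.
    pub fn dealloc(&self, size: usize) -> usize {
        POOL.bytes_in_use.fetch_sub(size, Ordering::SeqCst) - size
    }
}

/// A node that stores its allocator handle without growing.
pub struct Node<T> {
    pub value: T,
    pub alloc: Pool,
}

fn main() {
    let node = Node { value: 7u64, alloc: Pool };
    let total = node.alloc.alloc(8);
    println!("value {}, {} bytes in use, node is {} bytes", node.value, total, size_of::<Node<u64>>());
    node.alloc.dealloc(8);
}
```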

ghost · 2017-07-15T09:36:09Z

@joshlf

So the idea is that, for allocators where all instances of the type refer to the same global singleton, you wouldn't actually need to store a reference? Presumably that wouldn't work for allocators that could actually have multiple distinct instances, right?

Yes, for allocators with multiple instances the allocation function would benefit from non-zero knowledge.
(Mine needs to know the base address for instance. And working with more than one allocator is perfectly reasonable.)
What I was trying to point out is that the pointer passed to dealloc is enough information to recover the actual instance used to allocate the memory.
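One way such recovery could work, shown purely as an assumption since the thread does not describe the actual allocator: if every heap owns a region whose base address is a multiple of a power-of-two region size, masking the low bits of any pointer yields the base of the owning heap. The sketch simulates addresses with plain integers; `Heap`, `REGION_SIZE`, and `owning_base` are hypothetical.

```rust
/// Each hypothetical heap owns one region of `REGION_SIZE` bytes whose
/// base address is a multiple of `REGION_SIZE` (a power of two).
const REGION_SIZE: usize = 1 << 16;

struct Heap {
    base: usize,
    next: usize,
}

impl Heap {
    /// Creates the heap for region number `index` (simulated addresses).
    fn new(index: usize) -> Heap {
        Heap { base: index * REGION_SIZE, next: 0 }
    }

    /// Bump-allocates `size` bytes and returns the simulated address.
    fn alloc(&mut self, size: usize) -> Option<usize> {
        if size > REGION_SIZE - self.next {
            return None;
        }
        let addr = self.base + self.next;
        self.next += size;
        Some(addr)
    }
}

/// Recovers the base of the owning region from an address alone.
fn owning_base(addr: usize) -> usize {
    addr & !(REGION_SIZE - 1)
}

fn main() {
    let mut first = Heap::new(1);
    let mut second = Heap::new(2);
    let p = first.alloc(24).unwrap();
    let q = second.alloc(40).unwrap();
    println!("{:#x} belongs to {:#x}", p, owning_base(p));
    println!("{:#x} belongs to {:#x}", q, owning_base(q));
}
```

With this layout, a deallocation routine needs no stored allocator reference, at the cost of constraining how regions are placed in memory.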

hanna-kruppe · 2017-07-15T15:47:58Z

@s3bk How common is the case where you will only ever need to deallocate? All the standard collections and smart pointers have methods that allocate new memory from the same allocator (usually clone; Rc and Arc have make_mut).

Hm, now that I think about it, those methods already need to clone the allocator, so that would potentially offer a way out. Although it's unclear to me if we can switch between storing A and A::DeallocInfo in the collection depending on whether A: Clone, and it's also not clear to me if that would be a good design (it seems to work out mostly by chance, assuming it works out at all).

ghost · 2017-07-15T19:20:40Z

@rkruppe there could be

trait Alloc {
    type DeallocInfo;
    fn dealloc(&self, ptr: *mut u8, layout: Layout);
    // deallocate using only the stored info, without access to the allocator itself
    fn dealloc_with(info: Self::DeallocInfo, ptr: *mut u8, layout: Layout);
    // snip
}
struct Rc<T, A: Alloc> {
    dealloc: A::DeallocInfo,
    // snip
}
impl<T> Rc<T, HeapAlloc> {
    fn make_mut(&mut self) -> &mut T { 
         // use HeapAlloc to potentially clone T
    }
}
impl<T, A: Alloc> Rc<T, A> {
    fn make_mut_in(&mut self, a: A) -> &mut T { 
         // use A to potentially clone T
    }
}

This would keep Rc as thin as possible, not break existing code, and allow code that is generic over Alloc to use make_mut_in.

~~edit: The case for only dealloc: Box, and many non-std types that only provide a minimal api.~~
edit 2: completely wrong.

hanna-kruppe · 2017-07-15T19:39:45Z

@s3bk make_mut_in makes HeapAlloc a special case for no good reason, granting it better ergonomics than other allocators that can support make_mut just as well.

edit: The case for only dealloc: Box, and many non-std types that only provide a minimal api.

Box::clone allocates. Do you have examples of such non-std types?

ghost · 2017-07-15T20:12:43Z

@rkruppe It looks like I was completely wrong here. Most cases are covered by Box and the others reallocate.
Not sure how the ergonomics could be solved, apart from adding another trait.

joshlf · 2017-07-19T17:28:17Z

@s3bk

Most cases are covered by Box and the others reallocate.

Is that a problem? The key property of DeallocInfo is the ability to, given a pointer allocated from the allocator, locate the allocator itself. Thus, you should be able to have any method that takes an already-allocated pointer have a _with variant - realloc, shrink, etc.

joshlf · 2017-07-21T20:06:01Z

@s3bk

There's another angle to consider here: how do we obtain a DeallocInfo? Presumably we need to guarantee that an object of that type has the same lifetime as the Alloc it was obtained from, so how do we ensure that? I'm imagining something like fn dealloc_info(&mut self) -> &mut Self::DeallocInfo, but then the lifetimes get pretty hairy. In particular, once somebody has a &mut DeallocInfo, the allocator can't be used again until the reference goes away because it's a mutable reference. This means you can allocate one Box at a time from an allocator...

ghost · 2017-07-22T16:58:44Z

@joshlf The lifetime problem is indeed tricky…

What if a lifetime would be added to the Alloc trait?

trait Alloc<'a> {
    type DeallocInfo;
    fn dealloc_info(&self) -> Self::DeallocInfo;
    ...
}
impl<'a> Alloc<'a> for &'a MyAllocator {
    type DeallocInfo = MyDeallocInfo<'a>;
    …
}

joshlf · 2017-07-22T20:52:46Z

@s3bk that's an interesting approach, but doesn't doing impl<'a> Alloc<'a> for &'a MyAllocator require all allocators to be threadsafe? I'm writing an allocator right now that very much benefits from not needing to be threadsafe.

ghost · 2017-07-22T21:34:37Z

@joshlf this crude example should not be Send or Sync, so it could safely be non-threadsafe.
Or am I missing something?

joshlf · 2017-07-23T04:51:15Z

@s3bk Ah I think you're right about that. Good call.

Ericson2314 · 2017-07-23T20:16:18Z

Not quite sure where the conversation is now, but we should probably discuss

type SomeCollection<T> = collections::SomeCollection<T, GlobalAlloc>;

As a temporary way to get the ball rolling here without breaking std, and without needing newtypes or language design work for default params with aliases.

(Picking up the discussion from #43112 (comment).)
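The alias approach can be sketched as follows. The `collections` module, its `Alloc` trait, and the `Vec`-backed `SomeCollection` are placeholders invented for this example; only the shape of the `type` alias comes from the comment above.

```rust
/// Stand-in for the crate that defines the allocator-generic collection.
mod collections {
    pub trait Alloc {}

    pub struct GlobalAlloc;
    impl Alloc for GlobalAlloc {}

    /// Generic over the allocator, with no default type parameter.
    pub struct SomeCollection<T, A: Alloc> {
        items: Vec<T>,
        alloc: A,
    }

    impl<T, A: Alloc> SomeCollection<T, A> {
        pub fn new_in(alloc: A) -> Self {
            SomeCollection { items: Vec::new(), alloc }
        }

        pub fn push(&mut self, item: T) {
            self.items.push(item);
        }

        pub fn len(&self) -> usize {
            self.items.len()
        }

        pub fn allocator(&self) -> &A {
            &self.alloc
        }
    }

    impl<T> SomeCollection<T, GlobalAlloc> {
        pub fn new() -> Self {
            Self::new_in(GlobalAlloc)
        }
    }
}

/// What the facade would re-export: the allocator parameter is fixed,
/// so existing one-parameter uses keep working.
pub type SomeCollection<T> = collections::SomeCollection<T, collections::GlobalAlloc>;

fn main() {
    let mut c: SomeCollection<u32> = SomeCollection::new();
    c.push(1);
    c.push(2);
    let _global: &collections::GlobalAlloc = c.allocator();
    println!("len = {}", c.len());
}
```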

Ericson2314 · 2017-12-28T03:48:17Z

Cross post #32838 (comment)

I've started generalizing the collections myself; the easiest way to demonstrate how the associated return type helps is to just do the code change, I suppose.

alexreg · 2018-04-30T02:40:16Z

So, as far as I can tell, the real issue here is knowing, from the type alone, which allocator to use for further allocations or deallocation. In the case of a single instance of each Alloc type, this can be solved statically (no memory or runtime overhead), as mentioned. For multiple instances of an allocator type, there needs to be a (mutable) reference in each value to the allocator used. I would thus propose an associated type on Alloc with an AllocRef trait bound. The AllocRef trait would have a constructor method that takes the Alloc instance. For a singleton Alloc type (including the global allocator), the type implementing AllocRef would be zero-sized, and the constructor method wouldn't care about the instance it was passed, whereas it would simply store the reference for multiple-instance allocators.

This would seem to achieve the best of both worlds – no additional cost whatsoever over the current situation when you use the global allocator, and the necessary (but minimum) cost when you use custom allocators.

SimonSapin · 2018-04-30T09:24:04Z

@alexreg I don’t think there is a need for an AllocRef trait.

What happens at the moment for RawVec is that it has a type parameter A: Alloc and contains a value of type A directly. When the allocator type is zero-sized like struct Global; in std::heap, this field has no overhead. For allocators that need to keep per-instance state, the idea is that the trait would be implemented not for the allocator struct, but for references to it: impl<'a> Alloc for &'a MyAllocator, and what you pass to generic collections is such a reference. (In this case with &_ rather than &mut _, so that you can use the same allocator in multiple collections at the same time, the allocator needs to use something like Cell or Mutex internally.)

SimonSapin · 2018-11-29T20:34:02Z

Regarding Default and FromIterator: those can exist iff the allocator type implements Default.
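A sketch of that conditional availability, using a toy `MyVec` backed by an ordinary `Vec` and a marker-only `Alloc` trait (both assumptions for illustration). `Arena` deliberately has no `Default`, so it can only be used through `new_in`.

```rust
use std::iter::FromIterator;

/// Hypothetical marker trait standing in for `Alloc`.
trait Alloc {}

#[derive(Default)]
struct Global;
impl Alloc for Global {}

/// Needs an explicit instance, so it does not implement `Default`.
struct Arena {
    capacity: usize,
}
impl Alloc for Arena {}

struct MyVec<T, A: Alloc> {
    items: Vec<T>,
    alloc: A,
}

impl<T, A: Alloc> MyVec<T, A> {
    fn new_in(alloc: A) -> Self {
        MyVec { items: Vec::new(), alloc }
    }

    fn len(&self) -> usize {
        self.items.len()
    }

    fn allocator(&self) -> &A {
        &self.alloc
    }
}

/// `Default` exists only when the allocator can be conjured from nothing.
impl<T, A: Alloc + Default> Default for MyVec<T, A> {
    fn default() -> Self {
        Self::new_in(A::default())
    }
}

/// Likewise for `FromIterator`, which has no slot for an allocator argument.
impl<T, A: Alloc + Default> FromIterator<T> for MyVec<T, A> {
    fn from_iter<I: IntoIterator<Item = T>>(iter: I) -> Self {
        let mut v = Self::default();
        v.items.extend(iter);
        v
    }
}

fn main() {
    let collected: MyVec<u32, Global> = (1..=3).collect();
    let in_arena = MyVec::<u32, Arena>::new_in(Arena { capacity: 16 });
    println!("{} items; arena capacity {}", collected.len(), in_arena.allocator().capacity);
}
```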

Regarding Box::leak, this is an interesting case. I think today’s type system might not be able to express this in its full generality as that would require something like "associated lifetimes" in traits (if that even makes sense). The simplest for now would be to only implement it for Box<T, std::alloc::Global>.

Even if "deallocation" of the memory directly owned by Vec is a no-op, you should still run drop_in_place on the (slice of) elements. You could for example have Vec<Box<Foo>, &'bump Bump> and those inner boxes do need dropping to avoid memory leaks.
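To illustrate the point about dropping elements, here is a sketch with a toy `ArenaVec` over fixed inline storage. It stands in for a vector whose backing memory is never freed individually; `Tracked` and `ArenaVec` are hypothetical types, and the drop counter exists only to make the effect observable.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;
use std::ptr;
use std::rc::Rc;

/// Element type that counts how many times it has been dropped.
struct Tracked(Rc<Cell<usize>>);

impl Drop for Tracked {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Toy vector whose storage release is a no-op, as with a bump arena.
struct ArenaVec<T, const N: usize> {
    slots: [MaybeUninit<T>; N],
    len: usize,
}

impl<T, const N: usize> ArenaVec<T, N> {
    fn new() -> Self {
        ArenaVec { slots: std::array::from_fn(|_| MaybeUninit::uninit()), len: 0 }
    }

    fn push(&mut self, value: T) {
        assert!(self.len < N, "capacity exceeded");
        self.slots[self.len].write(value);
        self.len += 1;
    }
}

impl<T, const N: usize> Drop for ArenaVec<T, N> {
    fn drop(&mut self) {
        // There is nothing to deallocate, but the initialized elements may
        // own resources, so they are still dropped in place.
        let initialized =
            ptr::slice_from_raw_parts_mut(self.slots.as_mut_ptr() as *mut T, self.len);
        // SAFETY: the first `len` slots were initialized by `push` and are
        // never accessed again after this point.
        unsafe { ptr::drop_in_place(initialized) };
    }
}

fn main() {
    let drops = Rc::new(Cell::new(0));
    {
        let mut v = ArenaVec::<Tracked, 4>::new();
        v.push(Tracked(Rc::clone(&drops)));
        v.push(Tracked(Rc::clone(&drops)));
    }
    println!("elements dropped: {}", drops.get());
}
```

Without the `drop_in_place` call, each `Tracked` value (or each inner `Box` in the `Vec<Box<Foo>, &'bump Bump>` case) would be leaked.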

Ericson2314 · 2018-11-29T20:58:11Z

Yes that is what I did for Default. We can add additional methods like new_in and from_iterator_in; new classes can be made later.

glandium · 2018-11-29T21:26:51Z

@fitzgen See the allocator_api crate, which has the Alloc trait, Box and RawVec.

fitzgen · 2018-11-29T21:27:43Z

@SimonSapin

Regarding Box::leak, this is an interesting case. I think today’s type system might not be able to express this in its full generality as that would require something like "associated lifetimes" in traits (if that even makes sense). The simplest for now would be to only implement it for Box<T, std::alloc::Global>.

I think this signature would work with Rust today (although I have not tried it yet):

impl<T, A: Alloc> Box<T, A> {
    pub fn leak<'a>(self) -> &'a T
    where
        A: 'a,
    { ... }
}
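A sketch of how that signature behaves, using a toy `MyBox` that wraps an ordinary `Box` plus an allocator handle. This is an assumption for illustration: the memory really comes from the global heap here, so the example only demonstrates how the `A: 'a` bound limits the lifetime of the leaked reference. `ArenaRef` is a hypothetical allocator handle that borrows some arena.

```rust
use std::mem;

/// Hypothetical marker trait standing in for `Alloc`.
trait Alloc {}

struct Global;
impl Alloc for Global {}

/// Allocator handle that borrows arena storage for `'arena`.
struct ArenaRef<'arena> {
    _arena: &'arena [u8],
}
impl<'arena> Alloc for ArenaRef<'arena> {}

/// Toy box: the value lives in a normal `Box`, and `alloc` models the
/// allocator handle a real `Box<T, A>` would carry.
struct MyBox<T, A: Alloc> {
    value: Box<T>,
    alloc: A,
}

impl<T, A: Alloc> MyBox<T, A> {
    fn new_in(value: T, alloc: A) -> Self {
        MyBox { value: Box::new(value), alloc }
    }

    /// The leaked reference may not outlive anything the allocator borrows.
    fn leak<'a>(self) -> &'a T
    where
        A: 'a,
    {
        // Forget the handle so it can never release the memory.
        mem::forget(self.alloc);
        Box::leak(self.value)
    }
}

fn main() {
    // `Global` is `'static`, so the leaked reference can be `'static` too.
    let forever: &'static u32 = MyBox::new_in(7u32, Global).leak();

    // With an arena-borrowing handle, the reference is tied to `backing`.
    let backing = [0u8; 4];
    let scoped = MyBox::new_in(String::from("x"), ArenaRef { _arena: &backing }).leak();
    println!("{} {}", forever, scoped);
}
```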

fitzgen · 2018-11-29T21:29:28Z

@fitzgen See the allocator_api crate, which has the Alloc trait, Box and RawVec.

Filed fitzgen/bumpalo#2

lachlansneff · 2019-02-16T05:53:48Z

Has there been any progress with stabilizing this?

Ericson2314 · 2019-02-16T06:23:40Z

@glandium just got their PR for Box less blocked. Once that lands I'll rebase my PR for the other collections.

joshlf · 2019-02-16T07:53:15Z

@Ericson2314 @glandium Link to that PR?

passchaos · 2019-02-16T08:42:14Z

@joshlf maybe #58457

joshlf · 2019-02-16T11:11:27Z

@passchaos Looks like it; thanks!

lachlansneff · 2019-02-16T18:01:36Z

It'd be useful if we could use the traditional collection apis (new, with_capacity, etc) if our allocator type parameter implemented Default.

SimonSapin · 2019-02-17T11:41:23Z

As long as #27336 is not implemented, that would make some currently-valid programs ambiguous because type inference doesn’t know what allocator to pick.

lachlansneff · 2019-02-17T16:10:24Z

That's been feature-gated for 3 years. Can it not just be stabilized?

SimonSapin · 2019-02-17T17:09:17Z

Assuming you mean #27336, it’s better discussed in the relevant tracking issue. I haven’t been following this one, but from a quick glance at the thread its implementation is not complete yet and it’s not "just" a matter of flipping the stabilization switch.

Manishearth · 2019-03-27T20:49:01Z

Is there an RFC or design document talking about what the current plans for allocators in stdlib collections are? The allocators RFC leaves this stuff undecided, and this issue seems to be an implementation tracking issue.

SimonSapin · 2019-03-27T21:41:55Z

At this point I think we need a champion who would summarize what’s still unresolved in the (numerous) discussions so far, try to build consensus, and make a set of proposals (possibly as a new RFC).

Note that I am not volunteering at this time :]

gnzlbg · 2019-03-27T22:44:14Z

Such a champion might probably want to start a working group, similar to the unsafe-code-guidelines, with its own github repo, where multiple parts of the "Allocators issue" can be explored in parallel, with active meetings to make progress, and where the parts of the issue that achieve consensus get merged slowly into a document with rationale, which is then sliced into suitable RFCs.

Summarizing the discussions effectively and putting them into a single comment would be a lot of work, and it won't IMO help much, because 10 people are going to answer to the summary, asking questions and discussing 10 different aspects of it.

The champion of such a working group would probably need to be a core team or lib team member with the time and will to see this through, and that will probably require somehow fitting allocators in the 2019 roadmap.

SimonSapin · 2019-03-27T22:55:03Z

Regarding the 2019 roadmap, allocators are already the one item specifically mentioned for the library team in rust-lang/rfcs#2657. Though to some extent this might be wishful thinking as long as no one steps up to lead this work.

I feel that this is a much smaller topic than unsafe code guidelines, but on further thought you’re probably right that this is a better model to keep sub-discussions organized.

Ericson2314 · 2019-03-29T17:09:57Z

Once Box<T, A> has landed, I'll finish off https://github.com/QuiltOS/rust/commits/allocator-error / #52420 . I'd be happy to make an RFC at that point describing what I did.

alexcrichton · 2019-04-22T23:51:02Z

I've made a post on internals where the libs team is considering spinning up a working group to drive this issue to completion, and if you're interested please feel free to comment there!

TimDiekmann · 2019-05-04T12:26:23Z

As we now have a dedicated repository for the allocators WG this issue may be closed/locked to keep discussions in one place?

mark-i-m · 2019-05-04T16:32:41Z

I think this issue still serves to funnel people in the direction of that repo. Nobody will know to go there otherwise.

alexcrichton · 2019-05-06T15:18:33Z

I agree with @TimDiekmann that centralizing discussion is probably best now, so I'm going to close this in favor of the WG repository.

I'll open an issue on the repository about discoverability.

alexcrichton added the B-unstable and T-libs-api labels Jun 20, 2017
alexcrichton mentioned this issue Jun 20, 2017: Allocator traits and std::heap #32838 (open, 12 tasks)
alexcrichton mentioned this issue Jul 12, 2017: Consider reverting the merge of collections into alloc #43112 (closed)
Mark-Simulacrum added the C-tracking-issue label Jul 22, 2017
Ericson2314 mentioned this issue Mar 16, 2018: Refactor std to improve ease of porting rust-lang-nursery/portability-wg#1 (open)
steveklabnik mentioned this issue Apr 3, 2018: Include custom allocators in the book rust-lang/book#1270 (closed)
matthewjasper mentioned this issue Apr 3, 2018: Custom global allocators (RFC 1398, RFC 1974) rust-lang/reference#287 (open, 2 tasks)
steveklabnik mentioned this issue Apr 4, 2018: Update for custom allocators rust-lang/rust-by-example#1059 (closed)
lachlansneff mentioned this issue Feb 17, 2019: Tracking Issue for RFC 213: Default Type Parameter Fallback #27336 (closed)
fitzgen mentioned this issue Apr 2, 2019: Add collect_in for iterators. fitzgen/bumpalo#12 (closed)
cuviper mentioned this issue May 2, 2019: Add feature = "zeroize" rust-num/num-bigint#92 (open)
alexcrichton closed this as completed May 6, 2019
alexcrichton mentioned this issue May 6, 2019: Increase discoverability of this repository? rust-lang/wg-allocators#22 (open)
Ericson2314 mentioned this issue May 10, 2019: WIP: Allocator- and fallibility-polymorphic collections #60703 (closed)
TjeuKayim mentioned this issue Jul 21, 2020: Serialize and deserialize interfaces rusqlite/rusqlite#786 (open)
