Allocator RFC, take II #244

Closed — wants to merge 70 commits
@pnkfelix
Member

pnkfelix commented Sep 17, 2014

(rendered)

Summary

Add a standard allocator interface and support for user-defined allocators, with the following goals:

  1. Allow libraries to be generic with respect to the allocator, so that users can supply their own memory allocator and still make use of library types like Vec or HashMap (assuming that the library types are updated to be parameterized over their allocator).

    In particular, stateful per-container allocators are supported.

  2. Support the ability of a garbage collector (GC) to identify roots stored in statically-typed user-allocated objects outside the GC-heap, without sacrificing efficiency for code that does not use Gc<T>.

  3. Do not require an allocator itself to extract metadata (such as the size of an allocation) from the initial address of the allocated block; instead, force the client to supply size at the deallocation site.

    In other words, do not provide a free(ptr)-based API. (This can improve overall performance on certain hot paths.)

  4. Incorporate data alignment constraints into the API, as many allocators have efficient support for meeting such constraints built-in, rather than forcing the client to build their own re-aligning wrapper around a malloc-style interface.
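To make goals 3 and 4 concrete, here is a minimal sketch of such an interface. The trait and its names are hypothetical (not the RFC's actual API), and it uses today's `std::alloc` for the backing heap purely for illustration:

```rust
use std::alloc::{alloc, dealloc, Layout};

// Hypothetical trait, not the RFC's API: the client supplies both size and
// alignment at the deallocation site (goal 3), and alignment is part of the
// interface rather than bolted on by the caller (goal 4).
trait RawAllocSketch {
    unsafe fn raw_alloc(&mut self, size: usize, align: usize) -> *mut u8;
    unsafe fn raw_dealloc(&mut self, ptr: *mut u8, size: usize, align: usize);
}

// Trivial implementation backed by the global heap.
struct Heap;

impl RawAllocSketch for Heap {
    unsafe fn raw_alloc(&mut self, size: usize, align: usize) -> *mut u8 {
        unsafe { alloc(Layout::from_size_align(size, align).unwrap()) }
    }
    unsafe fn raw_dealloc(&mut self, ptr: *mut u8, size: usize, align: usize) {
        unsafe { dealloc(ptr, Layout::from_size_align(size, align).unwrap()) }
    }
}

fn main() {
    let mut a = Heap;
    unsafe {
        let p = a.raw_alloc(64, 16);
        assert!(!p.is_null());
        assert_eq!(p as usize % 16, 0); // alignment request is honored
        a.raw_dealloc(p, 64, 16); // client supplies size; no free(ptr)-style lookup
    }
}
```

Note how `raw_dealloc` never has to consult allocator metadata: everything it needs arrives as arguments, which is exactly what goal 3 asks for.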

pnkfelix added some commits Aug 27, 2014

more content.
fleshed out discussion of call correspondence.

added initial work on standard high-level 1:n Alloc.

Moved note-to-self on Reap support to an unresolved question.
Checkpoint.
* Added note at outset that `typed_alloc` is not currently for direct
  implementation.

* Added "Example usage" section with one example motivated by a
  `Vec`-like library.  (Tried to add an example of a library using
  `InstanceAlloc`, but I could not make something small that also
  motivated use of `InstanceAlloc`.)

* Added sub-sections that follow `typed_alloc` module API.

* Removed `BaseAlloc` and its `tightly_gc_integrated`

* Attempted to clarify overflow behavior on `ArrayAlloc` methods.

* Removed the weak note I was worried about in `deinit_range`.  Also,
  alpha-renamed `deinit_range` to `reinit_range`.

* Added references to "fully-backed" terminology in `StdAlloc` and
  `Direct`.

* Added an unresolved question about whether I should represent the
  default allocator as just `()`.
@pcwalton

Contributor

pcwalton commented Sep 19, 2014

Trace hooks would be implemented as a trait, so they would have no overhead in the no-GC case.

@nikomatsakis

Contributor

nikomatsakis commented Sep 20, 2014

So, a lot of good points have been raised in this discussion so far. It's late here, and I want to keep this short, so for now I'll just restrict this comment to how I think this RFC (and a hypothetical GC integration) would support @thestinger's example. I don't see why the traits in this RFC prevent that from happening. I think the misunderstanding is that the typed allocator traits defined in the RFC do not have to be the only legal way to allocate memory -- rather they are just the most convenient (and forwards compatible) way, because they handle all the GC integration for you. (They may also be more efficient than implementing a custom trace hook, should your values contain roots.)

Building a type like HashMap using a RawAlloc directly remains an option, but it means that this type would have to handle the GC integration itself. Of course GC integration is not described in the RFC, so that implies that any type that consumes a RawAlloc directly is inherently unstable until such time as the GC integration is stabilized, or we decide not to support a tracing GC at all. That's how I understand it, anyway. @pnkfelix can confirm.

If it turns out that we opt not to do GC, either because it has no purpose or because it can't be made to work, then this doesn't invalidate any code using the APIs described in this RFC. Instead it just means that the RawAlloc consumers are correct as is, since there are no additional requirements for GC integration. It seems to me that the traits described here are mostly still a pretty good fit, even without GC -- though we'd probably want to deprecate the deinit method, which wouldn't be needed. In particular, I'd still want the option of using a typed malloc interface, since it's just nicer to use.

@nikomatsakis

Contributor

nikomatsakis commented Sep 20, 2014

Of course I see a lot of comments were posted in the meantime while I was drafting that response. :) I'll try to catch up later on, perhaps what I wrote up winds up duplicating some of the discussion...

@Ericson2314

Contributor

Ericson2314 commented Sep 20, 2014

Trace hooks would be implemented as a trait, so they would have no overhead in the no-GC case.

So this would be a trait, probably derivable (which would suffice for most cases), that would call the main tracing function on whatever field/address is necessary (and how to recur, etc)? That function and the method in this trait would just co-recur to traverse the whole heap? That was how I was thinking/hoping it would work.
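A minimal sketch of the co-recurring scheme described above; every name here is hypothetical (there is no such trait in the RFC), and a plain address-holding struct stands in for a real Gc<T>:

```rust
// Hypothetical derivable trace trait: the collector hands each value a
// visitor, and the value reports its roots and recurs into its fields.
// Types with no GC handles get a no-op impl, so the no-GC case is free.
trait TraceSketch {
    fn trace(&self, visitor: &mut dyn FnMut(usize));
}

impl TraceSketch for i32 {
    fn trace(&self, _visitor: &mut dyn FnMut(usize)) {} // no roots: no-op
}

// Stand-in for Gc<T>: the one leaf the tracer cares about.
struct FakeGc(usize);

impl TraceSketch for FakeGc {
    fn trace(&self, visitor: &mut dyn FnMut(usize)) {
        visitor(self.0); // report the address this handle keeps alive
    }
}

// What a derived impl would expand to: recur into each field in turn.
struct Node {
    id: i32,
    left: FakeGc,
    right: FakeGc,
}

impl TraceSketch for Node {
    fn trace(&self, visitor: &mut dyn FnMut(usize)) {
        self.id.trace(visitor);
        self.left.trace(visitor);
        self.right.trace(visitor);
    }
}

fn collect_roots(node: &Node) -> Vec<usize> {
    let mut roots = Vec::new();
    node.trace(&mut |addr| roots.push(addr));
    roots
}

fn main() {
    let n = Node { id: 7, left: FakeGc(0x1000), right: FakeGc(0x2000) };
    assert_eq!(collect_roots(&n), vec![0x1000, 0x2000]);
}
```

The visitor and the per-type `trace` methods co-recur exactly as the question suggests: the collector drives, and each impl only knows its own immediate fields.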

@bjadamson


cc me

@pnkfelix

Member

pnkfelix commented Sep 22, 2014

Okay, some thoughts (scroll to the bottom for the pseudo-questions for you, dear reader):

  • I am glad that @thestinger provided a concrete example (namely HashMap<K,V>) where the provided typed_alloc API is insufficient.
  • If we were to only adopt the RawAlloc portions of this RFC (or if we were to stick with the status quo of calling alloc::heap::allocate(size, align) for most library allocations), I would see that as being equivalent to giving up on language support for GC. That would be a real shame since I think we are really close to having a nice low-level language with a static semantics that can really make GC integration nice.
  • As @nikomatsakis said, the two traits in the typed_alloc module written here are meant to maximize forward-compatibility for their clients, under the assumption that we will add a tracing GC in the future.
    • I want to try to encourage client code to reach for some type-based abstraction before considering trace-hooks for GC integration (the latter, as @thestinger points out, are not part of this RFC). There are a collection of object layouts, inferable from the type alone, that we can special-case in a GC to keep its inner loop fast. I'm thinking of "a contiguous series of words of type Gc<T>, followed by an inline [u8]", for example. If we only offered a trace-hook API, it may be infeasible to get the same kind of inner loop in the GC; which would not be the end of the world, but still unfortunate. (I believe that such object layouts will also be common in code that resorts to GC, but this is just a guess at this point.)
    • Having said that, I am now thinking it is inevitable that we will need to expose some sort of trace hook API along the lines proposed by @pcwalton eventually, for essentially the reasons that both @pcwalton and @thestinger have given here (and that I think others have argued in various issues on rust-lang/rust). But I do not want to attempt to specify GC integration to that level of detail before I have at least implemented a prototype to work with directly.
    • The strategy @nikomatsakis lays out, of allowing clients to bypass typed_alloc (at least for now) with the understanding that they may need to be revised in the future, while clients that can use the given typed_alloc interface will be guaranteed forward-compatibility, seems reasonable to me. I guess the relevant question there is how many clients can usefully employ the traits in the mod typed_alloc from the RFC?
    • Related to that latter bullet: while reflecting on the typed_alloc API over the weekend, I think the posted RFC omits support for many of the interesting DST types, such as struct S { count: uint, data: [int] }. The AllocCore API (that was moved to a non-normative appendix) is able to describe the allocation of instances of S, but it was deliberately removed from the normative part of the RFC due to its untested and complex nature.

Okay, so, some closing thoughts, which I would like feedback on from the participants:

  1. I still think this RFC has value, but it needs to say up front that it does not handle every use case.
  2. It may need to be revised so that it can cover arbitrary DSTs as well. I am hoping we can come up with an acceptable API in the short-term that does not need something as complex as the given AllocCore, though I am not sure whether that is feasible.
    • If I'm going to try to put something like AllocCore into the RFC, it may be nice to figure out if it can be further generalized to cover HashMap<K,V>'s three-inner-vectors in one allocation case as well. But perhaps that is over-engineering.
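To illustrate why DSTs like struct S above complicate the API: the allocation size depends on the runtime length of the trailing slice, so a typed allocator for such types must take that length as a parameter. A hedged sketch of the layout arithmetic (layout_of_s is a hypothetical helper, not part of the RFC; u64 is used for both fields so the numbers are platform-independent):

```rust
use std::mem::{align_of, size_of};

fn round_up(n: usize, align: usize) -> usize {
    (n + align - 1) & !(align - 1)
}

// Hypothetical layout computation for a DST shaped like
// `struct S { count: u64, data: [u64] }`: header, padding to the element
// alignment, then `len` trailing elements.
fn layout_of_s(len: usize) -> (usize, usize) {
    let align = align_of::<u64>();
    let data_offset = round_up(size_of::<u64>(), align);
    (data_offset + len * size_of::<u64>(), align)
}

fn main() {
    // 8-byte header + 4 trailing u64 elements = 40 bytes
    assert_eq!(layout_of_s(4).0, 40);
    assert_eq!(layout_of_s(0).0, 8);
}
```

A `Sized` type's layout is a compile-time constant, so the RFC's typed traits can hide this arithmetic; for a DST it becomes part of the allocation call itself, which is what forces something AllocCore-shaped into the picture.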
@pczarn

pczarn Sep 22, 2014

nit: if one interested (is?)


@pczarn

pczarn commented Sep 22, 2014

The thing that is needed to cover 99% of cases, currently including collections implemented in the std lib, is allocation of such unzipped 'multi-vectors'. To my understanding, trace hooks can handle many instances and arrays at arbitrary offsets in one allocation. Perhaps there's middle ground. In place of simple trace hooks, could there be an option to provide a function from *mut T to e.g. (*mut U, *mut W)? Have multidimensional DSTs ever been discussed?

I'm planning to make HashMap's internal representation have two-inner-vectors, and possibly also allocate them separately, if there's benefit to it.

@Gankro

Contributor

Gankro commented Sep 22, 2014

The kind of "multivector" thing @pczarn mentioned is definitely something I'd like to see looked at in some form or another. BTree also wants it for its nodes, and I understand it's a very common pattern in (at least) gamedev for avoiding multiple allocations.

@pczarn

pczarn commented Sep 23, 2014

However, BTree's nodes may contain vectors that vary in length. I guess the handling of this case would be more complex.

@Gankro

Contributor

Gankro commented Sep 23, 2014

If you could express "allocate me a buffer to fit x T's, y U's, and z V's, and return a ptr to it" and "interpret this ptr as a buffer to x T's, y U's, and z V's, and return me a ptr for each buffer", I think that would address the use case. Then you can wrap that logic to represent the capacities in as efficient of a way as possible/desired in the given domain. In the case of BTree, the capacities are (x, x, x+1), so the wrapper just needs to remember x. A generic multi-vector could be built that only supports all-equal capacity, or only supports all-different capacity, if desired.
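The "allocate me a buffer to fit x T's, y U's, and z V's" idea above reduces to a layout computation: one total size plus an offset for each sub-buffer, with padding between them for alignment. A hypothetical sketch (three_buffers is not a proposed API; the element types T=u64, U=u32, V=u8 are fixed arbitrarily for the sketch, where a real interface would be generic):

```rust
use std::mem::{align_of, size_of};

fn round_up(n: usize, align: usize) -> usize {
    (n + align - 1) & !(align - 1)
}

// One allocation holding x u64's, y u32's, and z u8's: returns the total
// size and the byte offset of each sub-buffer within the block.
fn three_buffers(x: usize, y: usize, z: usize) -> (usize, [usize; 3]) {
    let off_t = 0;
    let off_u = round_up(off_t + x * size_of::<u64>(), align_of::<u32>());
    let off_v = round_up(off_u + y * size_of::<u32>(), align_of::<u8>());
    (off_v + z * size_of::<u8>(), [off_t, off_u, off_v])
}

fn main() {
    // BTree-style capacities (x, x, x+1) with x = 2: only x needs storing,
    // because the offsets can always be recomputed from it.
    let (total, offs) = three_buffers(2, 2, 3);
    assert_eq!(offs, [0, 16, 24]);
    assert_eq!(total, 27);
}
```

The reverse direction ("interpret this ptr as three buffers") is the same arithmetic: add each offset to the base pointer, which is also the shape of the *mut T to (*mut U, *mut W) function @pczarn floated.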

@nikomatsakis

Contributor

nikomatsakis commented Sep 23, 2014

On Mon, Sep 22, 2014 at 04:38:28AM -0700, Felix S Klock II wrote:

  1. I still think this RFC has value, but it needs to say up front that it does not handle every use case.

Sure.

  2. It may need to be revised so that it can cover arbitrary DSTs as well. I am hoping we can come up with an acceptable API in the short-term that does not need something as complex as the given AllocCore, though I am not sure whether that is feasible.

I personally don't think this needs to be in this RFC and could come
in a follow-up at some later time (or may not even need an RFC at all
-- it's just a new library, basically).

To me, the key questions that decide the fate of this RFC at this time
are (1) whether we think these APIs make sense even without a GC
and (2) what is the impetus to standardize now. I think it's clear
that a typed interface is useful with a GC, and it seems like we've
established that we can accommodate custom layouts in various ways.
But what happens if we decide that support for GC types doesn't carry
its weight?

It seems to me that having typed APIs is still useful even without GC,
just as a convenience. We might simplify the API somewhat. We
certainly wouldn't need a deinit method. We might not want a trait
at all, and we could instead make the typed methods just be top-level
fns or default methods in the RawAlloc trait. That said, I can still
imagine wanting to use the typed API in order to have type-specific
policies (though it requires either reflection or impl specialization
to do so, since you need some way to "test" a type T and decide what
its properties are). So it seems that if we wound up without GC, the APIs
in this RFC might be slightly more complex than necessary, but not too
much -- at worst we have one deprecated method (deinit) and a
wrapper type to go from raw to typed that seems somewhat superfluous
(but hey, composition over inheritance is all the rage these days
anyway).

Still, as @strcat's original message suggested, we can maintain the
status quo and hold off on standardizing any allocator API. The main
drivers for standardizing sooner, I think, are the following:

a. it encourages people to use the typed APIs when suitable, thus
making more code forwards compatible with tracing later. Otherwise
tracing will invalidate quite a bit of unsafe code.
b. it allows us to make a number of types -- not necessarily including
hashtable -- parameterized by an allocator in a stable way.

I'm not sure yet whether it makes more sense to take this RFC, some
variant of this RFC that tries to do a bit less, or no RFC yet. I
personally am not ready yet to give up on support for tracing -- it
seems to be within reach and without great cost (to code that doesn't
use it), though obviously the jury is still out.

@zwarich

Contributor

zwarich commented Sep 26, 2014

@pnkfelix What are the drawbacks of a GC API based entirely on a GcTraceable trait, where the compiler identifies the local variables that must be traced via this trait, with no allocator support? Obviously you get into situations like trait objects where a type is abstracted over with an existential, and for soundness you need to expose GcTraceable in the type of the trait object, but what else goes wrong?

@pnkfelix

Member

pnkfelix commented Sep 26, 2014

@zwarich The main issue that the allocator support here is trying to resolve is the case where you have blocks of allocated data (not managed by the GC heap) that still need to be treated as additional roots by the GC [*]. Note that some such blocks might currently only be reachable via pointers held in native libraries; they may not be visible via any path from a local variable that is statically known to the compiler.

There are a number of ways one can attempt to address the above problem. (For example, one could require separate manual registration of such blocks of memory before allowing the pointers to them from being lost from the locally scanned roots.) The design of this RFC is to choose a high-level approach that can accommodate many different approaches, at a level of abstraction that was thought to be easy to understand.

[*] I posted a more complete explanation of my thinking here: http://discuss.rust-lang.org/t/on-native-allocations-pointing-into-a-gc-heap/564

@mitsuhiko

Contributor

mitsuhiko commented Oct 1, 2014

One thing I want to add is that the horrible allocator system in the STL is the primary reason why a large part of the C++ community that really likes performance (gamedev) does not use it. Because Rust's language design generally encourages stuff to allocate memory everywhere I think an allocator API generally is important.

One part that has not been mentioned here however is that it's not just collections that want to allocate memory. There are also algorithms and other things. One common pattern is that you have a certain subsystem and you want to make sure that everything that subsystem does is allocated through one allocator so that you can track your memory patterns.

Maybe it would make sense to collect some outsider feedback on allocators?

@huonw

Member

huonw commented Oct 1, 2014

Another thing for the typed allocation stuff: the Vec.map_in_place method converts a Vec<T> to a Vec<U> in place (T and U have to be compatible); this presumably wreaks havoc on trying to 'statically' classify allocations.

@nikomatsakis

Contributor

nikomatsakis commented Oct 2, 2014

On Wed, Oct 01, 2014 at 04:06:13PM -0700, Huon Wilson wrote:

Another thing for the typed allocation stuff, the
Vec.map_in_place
method

converts a Vec<T> to a Vec<U> in place (T and U have to be
compatible); this presumably wrecks havoc on trying to 'statically'
classify allocations.

Indeed. That seems like it would require a "reclassify" method to
accommodate (or else a custom tracer for vecs, but if even vecs
require a custom tracer than perhaps more-or-less everything does).
Of course it might be easier -- and satisfactory -- to slap a
NoManaged bound on the man-in-place method.

(As an aside, this seems like a nice use-case for having a marker trait
that checks that types have same size/alignment/whatever. This could
be used for transmute() as well in place of the ad-hoc compiler checks
we do now. I was thinking of trying to implement this in my copious
spare time just to see how hard it is (should be fairly easy, I
think).)

@glaebhoerl referenced this pull request Oct 25, 2014: Garbage collection #415 (Open)

@Gankro (Contributor) commented Nov 11, 2014

@pnkfelix Are the contents of this RFC still relevant for discussion, or has your design substantially changed?

@pnkfelix (Member) commented Nov 13, 2014

Closing; I want to do a take III at some point that handles cases like the HashMap implementation more properly.

@pnkfelix pnkfelix closed this Nov 13, 2014

@Gankro (Contributor) commented Nov 17, 2014
I have been instructed to leave any allocator notes here, for future reference. A lot of this is me just trying to transcribe what was discussed in IRC.

First off, I really like the high-level API design here; it removes a lot of the boiler-plate we're seeing in libcollections now. Vec and RingBuf, as well as HashMap's RawTable and BTree's Node (in rust-lang/rust#18028), are duplicating a lot of logic that I'd like to see moved into a higher-level API. I've been deferring trying to do this because it might trivially fall out of this allocator work. However, it looks like I may create a mock version of the top-level API as a shim to improve code quality while this bakes.

I'd like to note that non-global allocators put libcollections in an awkward spot. Unless the local allocator is managed by some global one, we will almost certainly have to have every Box/Rc/etc contain a pointer back to its local allocator instance. This would add an extra uint that is strictly unnecessary in the context of single-allocator collections. Therefore collections that use Box (currently TrieMap, TreeMap, and DList) may be forced to use only raw ptrs in order to avoid this. This would rack up our unsafeness in those collections by a significant amount. The interaction with unwinding is also unfortunate.

One solution to this would be an aborting-on-destruction linear type Box. It would basically be a Box that you have to manually deallocate. Let's call it Mox. Mox could have the following API:

pub struct Mox<T, A> {
  ptr: *mut T,
  allocator: ContravariantType<A>,
}

// add allocation/emplace/new logic here

impl<T, A> Mox<T, A> {
  /// Deallocate and drop contents; unsafe because it assumes the right allocator was given
  unsafe fn deallocate(self, allocator: A);
  /// Deallocate and return contents; unsafe because it assumes the right allocator was given
  unsafe fn into_inner(self, allocator: A) -> T;
}

impl<T, A> Drop for Mox<T, A> {
  fn drop(&mut self) {
    // we'll never be correctly freed if this happens, so just abort rather than leaking memory
    abort!();
  }
}

This gets rid of most of the boiler-plate, and ensures that either we handle the memory correctly or abort. However there doesn't seem to be any way to ensure that the right allocator is used beyond the fact that it was the right kind of allocator. Of course just accepting the uint-per-box overhead will always work exactly right.

I would like to reaffirm my desire for a multialloc! macro that takes an allocator and (Type, len) pairs, allocates all the space needed in a single buffer, and then returns a tuple of all the ptrs to the "individual" arrays. We would also need a multidealloc!. HashMap wants this; BTree will want this soon.

We also need to formally work out the OOM semantics for libstd, and libcollections in particular. On IRC @nikomatsakis suggested that panic! would be acceptable for the standard library. This sucks for anyone who wants different behaviour, though (kernels, embedded, ..?). To accommodate them we would have to, I guess, return Results for every allocating operation. This would be very noisy, and likely less efficient. It has been suggested that in the future we could offer an alternative collections library with these semantics, although the maintenance cost would likely be nasty. Maybe codegen can handle it.

Still, if we're all happy with panic for now, I would like to have a "just panic on OOM" (or I guess call alloc::oom()?) wrapper for allocators so that users can focus on the logic that matters, and not checking for nulls. Having a wrapper that returns a Result would also be interesting.

@nikomatsakis also suggested the possibility of adding more built-in pointer types, like one that is explicitly not null, but is otherwise unsafe to use (might be uninitialized, for one). The suggested wrappers could use such a ptr type.

Finally, I'm not sure if this is strictly in the scope of the allocators API, but it feels strongly related, so I want to note it in case it has any influence on design. There is strong desire to eventually have support for a HeapVec (what we have now), StackVec (uses a [T, ..n] as backing storage-- potentially avoids storing any capacity field in favour of having n part of its type), and HybridVec (StackVec that falls back to HeapVec when too full). Of course, there's nothing fundamental to Vec; all of our array-based collections could leverage this functionality. I believe this can be achieved by making these collections generic over a Buffer provider. This is very similar to what the allocator API provides, but a bit higher level. The capacity and ptr management would be completely abstracted away. In this way the StackBuffer can omit storing a capacity, and the HeapBuffer can reallocate.

Here's a sketch of what this would look like (haven't dug too deeply into it because I don't know what shape the allocator API will take):

pub trait Buffer<T> {
    /// Gets the capacity this buffer has. 
    ///
    /// May be zero for HeapBuffer, in which case it hasn't actually allocated 
    /// yet. Need to call `grow` on it first.
    ///
    /// For a StackBuffer this would be a fixed constant.
    fn capacity(&self) -> uint;

    // maybe this?
    /// This buffer cannot reserve more space than this. 
    ///
    /// Useful for dealing with next_power_of_two amortization strategies 
    /// with something inflexible like StackBuffer.
    // fn max_capacity(&self) -> uint; 

    /// Gets a raw ptr to the head of the buffer's storage. This should not be
    /// cached between operations, as `grow` and `shrink` may change where the
    /// buffer is.
    fn ptr(&mut self) -> *mut T;

    /// Asks the buffer to grow. Returns the new size. 
    ///
    /// StackBuffer naturally ignores this. Unsure if this should be handled by:
    /// * panic
    /// * return a Result
    /// * simply report that the capacity is the same, expect them to check
    fn grow(&mut self, amount: uint) -> uint;

    /// Asks the buffer to shrink. Returns the new size. StackBuffer naturally 
    /// ignores this.
    ///
    /// see above
    fn shrink(&mut self, amount: uint) -> uint;

    // Should Buffers free themselves when Dropped, or have a manual deallocate
    // call? Do we want to support taking an &[T] as a Buffer so someone can
    // populate it with a Vec that gets mem::forgotten?

    // May want try_grow_inplace for RingBuf, as it wants to manually handle
    // copying in the non-inplace case.
}

struct HeapBuffer<T, A: Allocator> {
    ptr: *mut T,
    capacity: uint,
    allocator: A,
}

// hypothetical world where n can be the type, and you can manually_drop the
// contents of a fixed-size array.
struct StackBuffer<T, n> {
    buf: [T, ..n],
}

struct HybridBuffer<T, n, A: Allocator> {
    on_stack: bool,
    stack_buf: StackBuffer<T, n>,
    heap_buf: HeapBuffer<T, A>,
}

// impl buffer for all these...

struct MetaVec<T, B: Buffer<T>> {
    len: uint,
    buf: B, // note, for B=HeapBuffer, this is identical to current Vec
}

impl<T, B: Default> MetaVec<T, B> {
    fn new() -> MetaVec<T, B> {
        MetaVec {
            len: 0,
            // Makes a [T, ..n] of uninitialized() for StackVec, just makes a null ptr for HeapVec
            buf: Default::default(),
        }
    }
}

// support with_buffer as well?

impl<T, B: Buffer<T>> MetaVec<T, B> {
    fn push(&mut self, elem: T) {
        self.reserve(1);
        let len = self.len as int;
        unsafe {
            // using my proposed ptr methods here
            self.buf.ptr().offset(len).write(elem);
        }
        self.len += 1;
    }

    fn reserve(&mut self, additional: uint) {
        let len = self.len;
        let desired_cap = len.checked_add(additional).expect("capacity overflow");
        let old_cap = self.buf.capacity();
        if desired_cap > old_cap {
            let new_cap = checked_next_power_of_two(desired_cap).expect("capacity overflow");
            // This will immediately panic for StackBuffer (assuming we choose those semantics)
            // May also panic for OOM, capacity overflow, or "max size of contiguous buffer" issues
            self.buf.grow(new_cap - old_cap);
        }
    }
}

// drop/clear/etc 
@Gankro (Contributor) commented Nov 17, 2014

Modified the MetaVec example after I realized max_capacity is unnecessary.

@Ericson2314 (Contributor) commented Nov 18, 2014

@Gankro, can't the Allocator instance for a global allocator be a zero-sized type so there is no overhead in the global allocator case?

@Gankro (Contributor) commented Nov 18, 2014

Yes, global allocators can have no overhead, as is the case today.

@Ericson2314 (Contributor) commented Nov 18, 2014

> I'd like to note that non-global allocators put libcollections in an awkward spot. Unless the local allocator is managed by some global one, we will almost certainly have to have every Box/Rc/etc contain a pointer back to their local allocator instance. This would add an extra uint that is strictly unnecessary in the context of single-allocator collections.

Wait, then what is the problem you mean by this?

@Gankro (Contributor) commented Nov 18, 2014

If you have a non-global allocator, that is, one you make several concrete instances of and dole out to specific collections at runtime, then the Boxes that use these allocators need a pointer to their local allocator (or some other identifier that prevents the allocator from being dropped and lets the Box's destructor call into it).

Of course there are some possible work-arounds, like having the allocators managed by a global allocator that can figure out which sub-allocators control which pointers. But in general some local allocators will need the boxes to be bloated with runtime metadata, especially since the value of local custom allocators (I'm told) is in their simplicity. Presumably this is unacceptable overhead for our collections to inherit when all the boxes in a single collection "use" the same allocator.

Another thing I just remembered: we need a mechanism to identify whether two Allocators are the same at runtime. This is necessary to support functionality like moving a Node of one DList to another without allocating.

@pnkfelix (Member) commented Nov 18, 2014

@Gankro I'm not sure if you addressed this or not, but another work-around for the desired mapping -- that is, the mapping from a value (i.e. the address of some heap block) plus a zero-sized per-value source-allocator type, to the allocator instance -- would be to play tricks within the allocator itself, so that one can derive a pointer to the allocator instance from the heap block address.

Concrete example: the allocator could ensure that all the blocks it hands out come from a pool where applying a bitmask to the addresses it hands out yields the address of a pointer to the source allocator. This provides a foundation for a cheap mapping from an object's address to its source allocator.

Note: this is basically trading fragmentation in one spot (the per-allocation back-pointer) for fragmentation elsewhere (namely, the requirement that the local allocator pre-reserve chunks of the address space -- chunks that may be larger than you expected).

But my point is, the same tricks that are used in modern global allocators can also be employed by the local ones, in order to avoid per-allocation overheads.

Note 2: I explicitly noted that the mapping gets the zero-sized source-allocator type as input, to make it clear that the protocol described (mapping the allocated address to a pointer to the allocator) is not a global scheme that all allocators would have to subscribe to. It is a local scheme that we "know" is sound to use because we are given the zero-sized type of the source allocator as a marker telling us that this is the right scheme to use.

@Gankro (Contributor) commented Nov 19, 2014

Yep, that's basically the idea I was getting at in the above comment. Totally viable, but I'm not sure if we can abandon allocators that don't go in for such a scheme.

@pnkfelix (Member) commented Nov 19, 2014

@Gankro I don't know what you mean by "abandon allocators that don't go in for such a scheme" -- the design outlined by this RFC (and probably others) can support both strategies, and in fact such allocators can be mixed together in the same program, as I tried to outline in "Note 2".

@Gankro (Contributor) commented Nov 19, 2014

Yeah, sorry, I was a bit overly dramatic there. I just mean treating them as second-class citizens, efficiency-wise, because not doing that would necessitate More Unsafe (and some dubious tricks to deal with unwinding).

@Ericson2314 (Contributor) commented Nov 19, 2014

@pnkfelix That is exactly what I was thinking with my "Deallocator trait" proposal. Since the type one parametrizes a Box with is really a (usually zero-sized) proxy, and it is mainly needed for freeing, I figured Deallocator would be a good name for the trait. Allocator would be implemented by the local instance itself.

A variation, to implement @Gankro's Mox trick: make the Deallocator trait's methods take the associated allocator. That way metadata can be stored both once per collection and once per pointer (here the Allocator trait would probably be implemented by a pointer type), rather than obligating implementers to find the instance solely from the pointer that dealloc is called with.

A very rough sketch (ignoring the raw vs. nice traits and other proposed stuff, just for the simplicity of the example):

trait Deallocator {
  type A: Allocator;
  // I explain why self is consumed below
  fn dealloc(self, &A, *mut u8);
  // realloc, etc...
}

struct Box<D, T> {
  // This is not &D so that there is no overhead in the global allocator common case.
  // Since D is stored one copy per pointer, it might as well be consumed by `dealloc`.
  per_pointer_metadata: D, 
  raw: *T, // or whatever normally hides in a Box
}

struct Collection<D, T> {
  // stuff in array dealloc'd with this as second arg to `dealloc`
  per_collection_metadata: Arc<D::A>,
  ptrs: [Box<D, T>, ..16],
}
@thestinger commented Nov 19, 2014

I don't think there's a use case for more stateless allocators. The general-purpose allocator is already great, and there's little that can be done to improve over it. Rust's general-purpose allocator already supports making isolated arenas rather than using the default ones. The whole point of custom allocator support is enabling allocators with very low constant costs by having a pointer to the arena and making good use of sized deallocation.

@Ericson2314 (Contributor) commented Nov 19, 2014

@thestinger I'm confused -- is that at me or at earlier comments? I made those contortions so the general-purpose allocator would fit the allocator API without any overhead whatsoever. Even if the global allocator is the only stateless one in practice, I'd think supporting it without overhead would be reason enough to contort the API.

@pnkfelix referenced this pull request Dec 22, 2014: user-defined allocators #538 (Closed)

withoutboats pushed a commit to withoutboats/rfcs that referenced this pull request Jan 15, 2017: Merge pull request #244 from dwrensha/bilock-public-types (Export BiLockAcquire and BiLockAcquired.)

@pnkfelix referenced this pull request Nov 22, 2017: Allocator traits and std::heap #32838 (Open, 5 of 12 tasks complete)