Allocator RFC, take II #244
Conversation
pnkfelix added some commits Aug 27, 2014
If you could express "allocate me a buffer to fit x T's, y U's, and z V's, and return a ptr to it" and "interpret this ptr as a buffer of x T's, y U's, and z V's, and return a ptr for each sub-buffer", I think that would address the use case. Then you can wrap that logic to represent the capacities as efficiently as possible (or as desired) in the given domain. In the case of BTree, the capacities are (x, x, x+1), so the wrapper just needs to remember x. A generic multi-vector could be built that only supports all-equal capacities, or only all-different capacities, if desired.
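A hedged sketch of what such a grouped-allocation interface could look like, written against today's `std::alloc::Layout` purely for illustration; the helper names `group_layout` and `alloc_group` are invented here, not taken from the RFC:

```rust
use std::alloc::{alloc, Layout};

/// Compute a single layout that packs `x` Ts, then `y` Us, then `z` Vs,
/// returning the combined layout plus the byte offsets of the U and V
/// sub-buffers. Recomputable from the capacities alone, so a wrapper
/// (e.g. a BTree node remembering only `x`) can reinterpret or free later.
fn group_layout<T, U, V>(x: usize, y: usize, z: usize) -> (Layout, usize, usize) {
    let l = Layout::array::<T>(x).unwrap();
    let (l, off_u) = l.extend(Layout::array::<U>(y).unwrap()).unwrap();
    let (l, off_v) = l.extend(Layout::array::<V>(z).unwrap()).unwrap();
    (l.pad_to_align(), off_u, off_v)
}

/// Allocate one block big enough for all three sub-buffers and hand back a
/// pointer to the start of each. Error handling and zero-size edge cases
/// are omitted for brevity.
unsafe fn alloc_group<T, U, V>(x: usize, y: usize, z: usize) -> (*mut T, *mut U, *mut V) {
    let (layout, off_u, off_v) = group_layout::<T, U, V>(x, y, z);
    let base = alloc(layout);
    assert!(!base.is_null(), "allocation failure");
    (base as *mut T, base.add(off_u) as *mut U, base.add(off_v) as *mut V)
}
```

A BTree-style wrapper would store only `x` and call `group_layout` again at deallocation time, which matches the "remember the capacities, not the pointer metadata" approach described above.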
On Mon, Sep 22, 2014 at 04:38:28AM -0700, Felix S Klock II wrote:
Sure.
I personally don't think this needs to be in this RFC and could come …

To me, the key questions that decide the fate of this RFC at this …

It seems to me that having typed APIs is still useful even without GC, …

Still, as @strcat's original message suggested, we can maintain the …

a. it encourages people to use the typed APIs when suitable, thus …

I'm not sure yet whether it makes more sense to take this RFC, some …
@pnkfelix What are the drawbacks of a GC API based entirely on a …
@zwarich The main issue that the allocator support here is trying to resolve is the case where you have blocks of allocated data (not managed by the GC heap) that still need to be treated as additional roots by the GC [*]. Note that some such blocks might currently only be reachable via pointers held in native libraries; they may not be visible via any path from a local variable that is statically known to the compiler.

There are a number of ways one can attempt to address the above problem. (For example, one could require separate manual registration of such blocks of memory before allowing the pointers to them to be lost from the locally scanned roots.) The design of this RFC is to choose a high-level approach that can accommodate many different strategies, at a level of abstraction that was thought to be easy to understand.

[*] I posted a more complete explanation of my thinking here: http://discuss.rust-lang.org/t/on-native-allocations-pointing-into-a-gc-heap/564
One thing I want to add is that the horrible allocator system in the STL is the primary reason why a large part of the C++ community that really cares about performance (gamedev) does not use it. Because Rust's language design generally encourages code to allocate memory everywhere, I think an allocator API is important.

One part that has not been mentioned here, however, is that it's not just collections that want to allocate memory. There are also algorithms and other things. One common pattern is that you have a certain subsystem and you want to make sure that everything that subsystem does is allocated through one allocator, so that you can track your memory patterns.

Maybe it would make sense to collect some outsider feedback on allocators?
Another thing for the typed allocation stuff, the …
On Wed, Oct 01, 2014 at 04:06:13PM -0700, Huon Wilson wrote:
Indeed. That seems like it would require a "reclassify" method to …

(As an aside, this seems like a nice use-case for having a marker trait …
alexcrichton force-pushed the rust-lang:master branch from b9e2b8c to 5020131 on Oct 29, 2014
@pnkfelix Are the contents of this RFC still relevant for discussion, or has your design substantially changed?
closing; I want to do a take III at some point that handles cases like the …
pnkfelix closed this on Nov 13, 2014
I have been instructed to leave any allocator notes here, for future reference. A lot of this is me just trying to transcribe what was discussed in IRC.

First off, I really like the high-level API design here; it removes a lot of the boiler-plate we're seeing in libcollections now. Vec and Ringbuf, as well as HashMap's RawTable and BTree's Node (in rust-lang/rust#18028), are duplicating a lot of logic that I'd like to see moved into a higher-level API. I've been deferring trying to do this because it might trivially fall out of this allocator work. However, it looks like I may create a mock version of the top-level API as a shim to improve code quality while this bakes.

I'd like to note that non-global allocators put libcollections in an awkward spot. Unless the local allocator is managed by some global one, we will almost certainly have to have every Box/Rc/etc. contain a pointer back to its local allocator instance. This would add an extra uint that is strictly unnecessary in the context of single-allocator collections. Therefore collections that use Box (currently TrieMap, TreeMap, and DList) may be forced to use only raw ptrs in order to avoid this. That would rack up the unsafeness in those collections by a significant amount. The interaction with unwinding is also unfortunate.

One solution to this would be an aborting-on-destruction linear-type Box: basically a Box that you have to manually deallocate. Let's call it Mox. Mox could have the following API:
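As a rough illustration only (the concrete signatures below are guesses for the sake of the discussion, not the API sketched in the original comment), such a Mox might look like:

```rust
use std::process::abort;
use std::ptr::NonNull;

/// Hypothetical "Mox": a Box that must be explicitly handed back to its
/// allocator; if it is dropped without that happening, the process aborts
/// rather than leaking or freeing through the wrong allocator.
struct Mox<T> {
    ptr: NonNull<T>,
}

impl<T> Mox<T> {
    /// Wrap an allocation produced by some allocator; the caller promises to
    /// return it to that same allocator via `into_raw`.
    unsafe fn from_raw(ptr: NonNull<T>) -> Mox<T> {
        Mox { ptr }
    }

    /// Consume the Mox without running the aborting destructor, yielding the
    /// raw pointer so the owning collection can deallocate it itself.
    fn into_raw(this: Mox<T>) -> NonNull<T> {
        let ptr = this.ptr;
        std::mem::forget(this);
        ptr
    }
}

impl<T> std::ops::Deref for Mox<T> {
    type Target = T;
    fn deref(&self) -> &T {
        unsafe { self.ptr.as_ref() }
    }
}

impl<T> Drop for Mox<T> {
    // Reaching this destructor means the memory was never handed back.
    fn drop(&mut self) {
        abort();
    }
}
```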
This gets rid of most of the boiler-plate, and ensures that either we handle the memory correctly or abort. However, there doesn't seem to be any way to ensure that the right allocator is used, beyond the fact that it was the right kind of allocator. Of course, just accepting the …

I would like to reaffirm my desire for a …

We also need to formally work out the OOM semantics for libstd and libcollections in particular. On IRC @nikomatsakis suggested that …

Still, if we're all happy with panic for now, I would like to have a "just panic on OOM" (or I guess call …

@nikomatsakis also suggested the possibility of adding more built-in pointer types, like one that is explicitly not null but is otherwise unsafe to use (might be uninitialized, for one). The suggested wrappers could use such a ptr type.

Finally, I'm not sure if this is strictly in the scope of the allocators API, but it feels strongly related, so I want to note it in case it has any influence on design. There is strong desire to eventually have support for a HeapVec (what we have now), a StackVec (uses a [T, ..n] as backing storage; potentially avoids storing any …

Here's a sketch of what this would look like (I haven't dug too deeply into it because I don't know what shape the allocator API will take):
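As a stand-in, here is one hypothetical shape for such a storage-generic vector; the `Buffer` trait, its methods, and the `MetaVec` fields are invented for this illustration and are not the original MetaVec example:

```rust
use std::marker::PhantomData;

/// Invented for this illustration: an abstract backing store, so HeapVec and
/// StackVec become instantiations of one generic vector.
trait Buffer<T> {
    /// Pointer to the start of the storage.
    fn ptr(&mut self) -> *mut T;
    /// Number of elements the current storage can hold.
    fn capacity(&self) -> usize;
    /// Grow to hold at least `min_cap` elements; a fixed stack buffer may refuse.
    fn grow(&mut self, min_cap: usize) -> bool;
}

/// The generic vector tracks only `len`; where the elements live is entirely
/// the buffer's business.
struct MetaVec<T, B: Buffer<T>> {
    buf: B,
    len: usize,
    _marker: PhantomData<T>,
}

impl<T, B: Buffer<T>> MetaVec<T, B> {
    fn push(&mut self, value: T) {
        if self.len == self.buf.capacity() {
            assert!(self.buf.grow(self.len + 1), "backing buffer is full");
        }
        unsafe {
            self.buf.ptr().add(self.len).write(value);
        }
        self.len += 1;
    }
}
```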
Modified the MetaVec example after I realized max_capacity is unnecessary.
@Gankro, can't the Allocator instance for a global allocator be a zero-sized type, so there is no overhead in the global-allocator case?
Yes, global allocators can have no overhead, as is the case today.
Wait, then what is the problem you're referring to?
If you have a non-global allocator (that is, one you make several concrete instances of and dole out to specific collections at runtime), the Boxes that use these allocators need a pointer to their local allocator (or some other identifier that prevents the allocator from being dropped, and lets the Box's destructor call into it).

Of course there are some possible work-arounds, like if the allocators are managed by a global allocator that can figure out which sub-allocators control which pointers. But in general some local allocators will need the boxes to be bloated with runtime metadata, especially since the value of local custom allocators (I'm told) is in their simplicity. Presumably this is unacceptable overhead for our collections to inherit when all the boxes in a single collection "use" the same allocator.

Another thing I just remembered: we need a mechanism to identify whether two Allocators are the same at runtime. This is necessary to support functionality like moving a Node of one DList to another without allocating.
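A small sketch of the overhead being described, with stand-in types invented for illustration (`MyBox`, `Arena`, and the handle types are not from the RFC): parameterizing the box over an allocator handle shows exactly where the extra word comes from.

```rust
use std::mem::size_of;

struct Arena;                         // stand-in for some local allocator's state
struct GlobalHandle;                  // zero-sized: the global allocator needs no back-pointer
struct LocalHandle<'a>(&'a Arena);    // a local allocator's box must point back at its arena

struct MyBox<T, H> {
    ptr: *mut T,
    handle: H, // zero cost for GlobalHandle, one extra word for LocalHandle
}

fn main() {
    // The global case stays one word per box...
    assert_eq!(size_of::<MyBox<u64, GlobalHandle>>(), size_of::<usize>());
    // ...while carrying a local-allocator pointer doubles it.
    assert_eq!(size_of::<MyBox<u64, LocalHandle<'static>>>(), 2 * size_of::<usize>());
}
```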
@Gankro I'm not sure if you addressed this or not, but another work-around for the desired mapping (from a value, i.e. the address of some heap block, plus a zero-sized per-value source-allocator type, to the allocator instance) would be to play tricks within the allocator itself, so that one can derive a pointer to the allocator instance from the heap block address.

Concrete example: the allocator could ensure that all the blocks it hands out come from a pool where applying a bitmask to the addresses it hands out yields the address of a pointer to the source allocator. This provides a foundation for a cheap mapping from an object's address to its source allocator.

Note: this is basically trading fragmentation in one spot (the per-allocation back-pointer) for fragmentation elsewhere (namely, the requirement that the local allocator pre-reserve chunks of the address space, chunks that may be larger than you expected). But my point is, the same tricks that are used in modern global allocators can also be employed by local ones, in order to avoid per-allocation overheads.

Note 2: I explicitly noted that the mapping gets the zero-sized source-allocator type as input, to make it clear that the protocol that maps the allocated address to the pointer to the allocator is not a global scheme that all allocators would have to subscribe to; it is a local scheme that we "know" is sound to use because we are given the zero-sized type of the source allocator as a marker telling us that this is the right scheme to use.
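A sketch of the masking trick being described; the pool size and all names are invented for illustration. Pools are aligned to a power-of-two boundary and each pool's header stores the back-pointer, so any block address can be masked down to find its owner.

```rust
/// 1 MiB pools, chosen arbitrarily for the example; every pool is allocated
/// at an address that is a multiple of POOL_SIZE.
const POOL_SIZE: usize = 1 << 20;

struct LocalAllocator { /* free lists, stats, ... */ }

/// Recover the owning allocator from any address inside one of its pools.
/// Only sound for blocks that really were handed out under this scheme,
/// which is what the zero-sized "source allocator" marker type attests.
unsafe fn owner_of(block: *mut u8) -> *mut LocalAllocator {
    let pool_base = (block as usize) & !(POOL_SIZE - 1);
    // The pool header's first word is the back-pointer to the allocator.
    *(pool_base as *const *mut LocalAllocator)
}
```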
Yep, that's basically the idea I was getting at in the above comment. Totally viable, but I'm not sure if we can abandon allocators that don't go in for such a scheme.
@Gankro I don't know what you mean by "abandon allocators that don't go in for such a scheme": the design outlined by this RFC (and probably others) can support both strategies, and in fact such allocators can be mixed together in the same program, as I tried to outline in "Note 2".
Yeah, sorry, I was a bit overly dramatic there. I just mean treating them as second-class citizens, efficiency-wise, because not doing that would necessitate More Unsafe (and some dubious tricks to deal with unwinding).
@pnkfelix That is exactly what I was thinking with my Deallocator trait proposal. Since the type one parametrizes a Box with is really a (usually zero-sized) proxy, and it is mainly needed for freeing, I figured Deallocator would be a good name for that trait. Allocator would be implemented by the local instance itself. A variation, to implement @Gankro's …

A very rough sketch (ignoring the raw vs. nice traits and other proposed stuff, just for the simplicity of the example):

```rust
use std::sync::Arc;

// Stub standing in for the allocation trait that this sketch deliberately ignores.
trait Allocator { /* alloc, realloc, etc. elided */ }

trait Deallocator {
    type A: Allocator;
    // I explain below why `self` is consumed.
    fn dealloc(self, heap: &Self::A, ptr: *mut u8);
    // realloc, etc...
}

struct Box<D: Deallocator, T> {
    // This is `D`, not `&D`, so that there is no overhead in the common global-allocator case.
    // Since one copy of `D` is stored per pointer, it might as well be consumed by `dealloc`.
    per_pointer_metadata: D,
    raw: *mut T, // or whatever normally hides in a Box
}

struct Collection<D: Deallocator, T> {
    // Everything in `ptrs` is dealloc'd with this as the second argument to `dealloc`.
    per_collection_metadata: Arc<D::A>,
    ptrs: [Box<D, T>; 16],
}
```
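Continuing that sketch (the `GlobalHeap` type and the empty body are invented for illustration), the global case would use a zero-sized proxy, so the per-pointer metadata costs nothing:

```rust
// Hypothetical global-allocator instantiation of the sketch above.
struct GlobalHeap;
impl Allocator for GlobalHeap { /* elided */ }

// Zero-sized proxy: storing one copy per Box adds no space.
struct GlobalDealloc;

impl Deallocator for GlobalDealloc {
    type A = GlobalHeap;
    fn dealloc(self, _heap: &GlobalHeap, _ptr: *mut u8) {
        // would forward to the process-wide allocator; elided here
    }
}
```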
thestinger commented Nov 19, 2014
I don't think there's a use case for more stateless allocators. The general-purpose allocator is already great, and there's little that can be done to improve over it. Rust's general-purpose allocator already supports making isolated arenas rather than using the default ones. The whole point of custom allocator support is enabling allocators with very low constant costs by having a pointer to the arena and making good use of sized deallocation.
@thestinger I'm confused, is that aimed at me or at earlier comments? I made those contortions so the general-purpose allocator would fit the allocator API without any overhead whatsoever. Even if the global allocator is the only stateless one in practice, I'd think supporting it without overhead would be reason enough to contort the API.
pnkfelix commented Sep 17, 2014
(rendered)
Summary
Add a standard allocator interface and support for user-defined allocators, with the following goals:
- Allow libraries to be generic with respect to the allocator, so that users can supply their own memory allocator and still make use of library types like `Vec` or `HashMap` (assuming that the library types are updated to be parameterized over their allocator). In particular, stateful per-container allocators are supported.
- Support the ability of a garbage collector (GC) to identify roots stored in statically-typed user-allocated objects outside the GC heap, without sacrificing efficiency for code that does not use `Gc<T>`.
- Do not require an allocator itself to extract metadata (such as the size of an allocation) from the initial address of the allocated block; instead, force the client to supply the size at the deallocation site. In other words, do not provide a `free(ptr)`-based API. (This can improve overall performance on certain hot paths.)
- Incorporate data alignment constraints into the API, as many allocators have efficient built-in support for meeting such constraints, rather than forcing the client to build their own re-aligning wrapper around a `malloc`-style interface.
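As a rough illustration of how these goals shape an interface (this is not the RFC's actual trait; the names and signatures below are invented), both size and alignment flow through allocation and deallocation:

```rust
/// Illustrative only: the caller supplies size and alignment on both the
/// allocation and deallocation paths, so the allocator never has to recover
/// metadata from the pointer alone (no `free(ptr)`-style entry point).
trait RawAllocator {
    /// Allocate `size` bytes aligned to `align`; null on failure.
    unsafe fn alloc(&mut self, size: usize, align: usize) -> *mut u8;

    /// Free a block previously returned by `alloc` with the same size/align.
    unsafe fn dealloc(&mut self, ptr: *mut u8, size: usize, align: usize);
}

/// A container generic over its allocator, per the first goal; a stateful
/// per-container allocator simply becomes a non-zero-sized `A`.
struct MyVec<T, A: RawAllocator> {
    ptr: *mut T,
    len: usize,
    cap: usize,
    alloc: A,
}
```

Passing the size back at `dealloc` is what lets an allocator avoid per-block headers, and carrying `align` through both calls is what the last goal asks for.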