Consider reverting the merge of collections into alloc #43112
Comments
brson added this to the 1.20 milestone on Jul 7, 2017
brson added the P-high and T-libs labels on Jul 7, 2017
@brson I'm not sure I agree with the reasons you laid out. In my mind liballoc provides the ability to depend on dynamic memory allocation, but nothing else. On top of that one assumption you can build up the entire crate (pointers + collections) with no more assumptions than libcore already has. In that sense I don't see this as a violation of layering but rather just giving you one layer and then everything else that it implies.
Is this a problem? Shouldn't linkers gc'ing unused sections take care of this? Isn't this equivalent to libcore bringing in utf-8 formatting when you otherwise don't need it?
Do you have an example of this? |
I'm in favor of undoing the merge. In general, I'd prefer not to reduce the number of crates in the std facade. In the particular case of alloc / collections, it feels odd to have a full suite … Personally, I'd prefer if …
If we're going to separate this crate then I think we need to be super clear about why. For example the previous incarnation of
Yes, and to me this is not a reason to separate the crates. This is already a "problem" in the standard library where the "fix" is otherwise too unergonomic to work with. |
If cargo had the ability to run with multiple different forks of std (see this RFC), this would be much less of an issue.
brson referenced this issue on Jul 11, 2017: Implement From<&[T]> and others for Arc/Rc (RFC 1845) #42565 (merged)
It is a problem. I do not think depending on the linker to throw away unwanted code is a reasonable excuse for making unsatisfactory architectural decisions. If this were an out-of-tree project where people have to spend CPU time compiling code, then telling people to just merge your abstraction layers, compile it all, and throw most of it away later would not be acceptable.
Yes. The entire motivation for this merge is to follow it up with this PR, which is seemingly not possible without the collections being literally in the alloc crate. No other collections crate can achieve this.
This was indeed a weakness of the previous alloc crate. I'd love to pull the smart pointers out if possible, and would love to not make the conflation worse.
I don't think this follows. Each layer in the facade adds new capabilities. Alloc is a crucial capability that separates different classes of Rust software. |
So merging alloc and collections happened to be able to |
If the goal is to minimize library code bloat, should we move to |
@brson thanks for reopening discussion, and @japaric thanks for the ping.
Exactly. I have a few reasons why I think combining facade crates relating to allocation is bad, but I want to start positively with why I think breaking facade crates further apart in general is good. Writing this with @joshlf's help.

Small, quickly compiled binaries are nice, but more important is the efficiency of the development process. We want a large, diverse, and flexible no-std ecosystem, because ultimately we are targeting a large, diverse space of projects. Small libraries are key to this both because they reduce the cognitive load of everyone developing those libraries, and because they allow more developers to participate. For cognitive load, it's important both to be utterly ignorant of the code that doesn't matter---as in, don't know what you don't know---and to incrementally learn the code that does. For distributed development and diverse goals, the benefits are probably more obvious. Lighter coupling means less coordination, so more people can scratch their own itches in isolation. But it also allows complexity to be managed by splitting the ecosystem into small, modular components - this allows people to take only what they need, and thus only opt in to as much complexity as their application requires.

All this begs the question---how far do we go down the road of splitting coarse crates into fine crates? I think quite far. It is my hope that as crates behind the facade and the language mature, more crates will be able to be written in stable (or otherwise trustworthy) code, and be moved outside rust-lang/rust into their own repos. Likewise, std should be able to (immediately and transitively) depend on third-party crates just fine---the lockfile keeps this safe. Rustbuild and my RFC #1133 are our friends here. To put it another way, there should be as little magic in rust-lang/rust as possible, because magic cannot be replicated outside of rust-lang/rust. By splitting crates, we decrease the risk that a given piece of near-stable code will be bundled with code that cannot be stabilized, thus preventing the former from ever becoming stabilized and thus "de-magicked" in the sense of becoming eligible for being moved outside of rust-lang/rust. This runs counter to the arguments in favor of the collections/alloc merge. In concrete terms, I recall there being talk of incorporating …

Back to allocation in particular: @joshlf's been doing some great work with object allocation (type-parametric allocators that only allocate objects of a particular type and cache initialized objects for performance), and it would be nice to use the default collections with that: the fact is, most collections only need to allocate a few types of objects, and so could work with object allocators just fine. Now if we were to combine alloc and collections and the object allocator traits in one crate, that would be a very unwieldy crate playing not just double but triple duty.
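To make the object-allocator idea concrete, here is a minimal sketch; the `ObjectAlloc` trait and the `Stack`/`Node` types are hypothetical illustrations for this discussion, not the API from the work referenced above.

```rust
// Hypothetical sketch: a type-parametric "object allocator" that only hands
// out objects of a single type T (and could cache initialized ones).
pub unsafe trait ObjectAlloc<T> {
    /// Returns a pointer to an initialized `T`, or `None` when exhausted.
    fn alloc_object(&mut self) -> Option<*mut T>;
    /// Takes back an object previously returned by `alloc_object`.
    unsafe fn dealloc_object(&mut self, obj: *mut T);
}

// A node-based collection only ever allocates one kind of object, so it could
// be generic over such an allocator instead of over a general-purpose one.
pub struct Node<T> {
    pub value: T,
    pub next: Option<*mut Node<T>>,
}

pub struct Stack<T, A: ObjectAlloc<Node<T>>> {
    head: Option<*mut Node<T>>,
    alloc: A,
}
```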
Besides the technical reasons, as @japaric and I mentioned elsewhere, including anything allocation related in core, even something as harmless as a trait that need not be used, will scare away developers of real-time systems. OTOH, I like the idea of allocator types being usable without …

Also, there's an analogous problematic cycle to avoid when defining the static allocator even if it is initialized statically: for crates which implement global allocators it's incorrect for them to use …

CC @eternaleye
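For reference, the cycle above can be seen in a sketch of a crate that installs a global allocator. This is written against the `GlobalAlloc` trait and `#[global_allocator]` attribute as they later stabilized (the hooks were still unstable when this thread was written), and the counting allocator itself is only illustrative.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

// Delegates to the system allocator and counts calls. The allocator itself
// must not allocate (e.g. via Vec or String) on this path, or it would
// recurse into itself -- the problematic cycle described above.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        System.alloc(layout)
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        System.dealloc(ptr, layout)
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

fn main() {
    let v = vec![1u8, 2, 3];
    println!("{:?} after {} heap allocations", v, ALLOCATIONS.load(Ordering::Relaxed));
}
```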
Given @Ericson2314 's comment, we (again, jointly) would like to make a proposal for how all of these various components might be structured. We have the following goals:
Thus, we propose the following:
I don't think that the arguments provided suggest that that should happen. Keeping all of the collections together doesn't increase the difficulty of understanding how things work because, for the most part, collections do not depend on one another, and do not share magic under the hood. From a usability perspective, they're logically related, so it would not be surprising to a developer to find them together in the same crate. The arguments we and others have presented do suggest splitting collections into its own thing - separate from, e.g., |
@joshlf ^ seems to imply to me that it would suggest that level of granularity. |
Ah, we definitely didn't mean to imply that. @Ericson2314 can weigh in when he gets a chance, but speaking for myself, I would interpret that as "quite far within reason." I don't think that our reasoning provides a good argument for splitting collections up into separate crates, although maybe @Ericson2314 will disagree and will think we should even go that far. |
Well I... wouldn't fight that level of granularity if many others want it :). Trying to think of a principle for why it's more important to separate alloc from collections than each collection from the others, I arrived at a sort of tick-tock model where one crate (tick) adds some new capability, and the next (tock) builds a bunch of things with the capabilities added so far (it's "marginally pure"). Crates like alloc or kernel bindings (e.g. …
@brson: Just a minor correction here: The referenced PR doesn't require collections to be within the alloc crate. It only requires …

An out-of-tree collections crate would be able to make the same impls, as it could have both …
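A sketch of that point under the coherence rules: any crate that defines its own pointer type can write the same impls, because only the `Self` type has to be local. `MyRc` below is a hypothetical stand-in for such a type.

```rust
use std::rc::Rc;

pub struct MyRc<T: ?Sized>(pub Rc<T>);

// Legal here because `MyRc` is local to this crate, even though `String`,
// `Vec`, and the slice type are not.
impl From<String> for MyRc<str> {
    fn from(s: String) -> MyRc<str> {
        MyRc(Rc::from(s.as_str())) // copy the bytes into the shared allocation
    }
}

impl<'a, T: Clone> From<&'a [T]> for MyRc<[T]> {
    fn from(slice: &'a [T]) -> MyRc<[T]> {
        MyRc(Rc::from(slice.to_vec()))
    }
}

fn main() {
    let s: MyRc<str> = String::from("hello").into();
    let v: MyRc<[u32]> = (&[1, 2, 3][..]).into();
    println!("{} {:?}", &*s.0, &*v.0);
}
```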
Presumably the visibility requirement exists because |
@joshlf: Yes, that's correct. |
In that case, that'd be my suggestion - to either make a separate |
I don't think I entirely agree with this. The whole premise of this crate is that we're shipping it in binary form, so the compilation time doesn't matter too much. We absolutely rely on gc-sections for so many other features that I think the ship has long since sailed on making it an optional feature of the linker we invoke.

I think it's also important to keep this all in perspective and extra concrete. On 2017-06-01 liballoc took 0.3s to compile in release mode and libcollections took 3s. Today (2017-07-11) liballoc takes 3.2s to compile in release mode. This is practically nothing compared to crates in the ecosystem.
I think this is too quick an interpretation, though. As @murarth pointed out above we're not empowering std collections with extra abilities. Any collection outside std can have these impls.
I think my main point is that we should not automatically go back to what we were doing before. I believe the separation before didn't make sense, and I believe the current separation makes more sense. If there's a desire to separate the concept of allocation from the default collections that's fine by me, but I don't think we should blindly attempt to preserve what existed previously, which wasn't really all that well thought out (the alloc crate looked basically exactly the same as when I first made it ever so long ago).

I'd personally find it quite useful if we stick to concrete suggestions here. The facade is full of subtle intricacies that make seemingly plausible proposals impossible to implement today and only marginally possible in the future. One alternative is the title of this issue, "Consider reverting the merge of collections into alloc". I have previously stated why I disagree with this. Another alternative by @joshlf above relies on defining collection types that don't know about …

I also furthermore disagree with the rationale that keeps …
I wasn't thinking that you'd add default type parameters after-the-fact, but rather re-export as a newtype. E.g., in collections:

```rust
pub struct Vec<T, A: Alloc> { ... }
```

And then in std:

```rust
use collections::vec;
use heap::Heap;

pub type Vec<T, A: Alloc = Heap> = vec::Vec<T, A>;
```
That's fine - as we mentioned, keeping |
no-std devs will be recompiling it.
How much longer does it take to build a final binary that depends only on alloc not collections? I suspect that will tell a different story?
Huh? The issue is that Vec, Arc, and Rc need to live in the same crate, but that crate need not contain the allocator traits. I'd say we do indeed have a problem, and while moving all 3 of those to collections is a good step, there's still a problem because anyone else writing their own vec-like thing runs into the same issue.
I think there is some consensus, besides you, that the first step could be making a smaller alloc than before: with no Rc or Arc, and maybe not Box either? Heap and the Alloc traits would stay in alloc, and then as a second step either the traits would move to core, or heap would move to its own crate.
@joshlf beat me to using an alias (or, if that fails, a newtype). CC @Gankro because HashMap and the hasher are 100% analogous.
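For reference, the hasher analogy in (present-day) std looks like this: `HashMap` is generic over its build-hasher with `RandomState` as a defaulted type parameter, which is the same shape an allocator parameter on `Vec` would take.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

fn main() {
    // The defaulted hasher parameter is invisible in everyday use...
    let mut a: HashMap<&str, u32> = HashMap::new();
    a.insert("x", 1);

    // ...but callers who care can substitute their own build-hasher.
    let mut b: HashMap<&str, u32, BuildHasherDefault<DefaultHasher>> = HashMap::default();
    b.insert("y", 2);

    assert_eq!(a["x"] + b["y"], 3);
}
```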
So can I, but nobody was saying just put it in libstd! libcore <- liballoc <- {libcollections | liballoc-system} <- libstd: I can see any sub-graph (including libcore) being useful.
Yes I understand, and I'm re-emphasizing that this does not work today. Whether it will ever work is still up in the air. As a result it's not a viable proposal at this time.
No, I highly doubt it. Everything in
This is missing the point. @brson originally thought that by moving …
Can you articulate precisely what you think this problem is?
I disagree with two things here. I don't think it makes sense to couple
Again, this is not possible. I was the one that put |
Let's consider this from another point of view. I see the current crate hierarchy as follows:
As a no-std/embedded developer, I do not see any practical use in having what's currently in liballoc split into any number of crates. It is permeated by infallibility on every level, from …

The savings in terms of code size do not exist because the majority of embedded software is compiled with LTO and at least opt-level 1 even for debugging; without LTO, libcore alone would blow through the entire storage budget, and without opt-level 1, the generated code is all of: too large, too slow, and too hard to read at that special moment you have to read assembly listings.

It seems straightforward and obviously good to put the …
If you, @whitequark, as an embedded developer, don't mind putting the trait in libcore, then I think your opinion overrides mine and @Ericson2314's since we aren't embedded devs :) |
@joshlf Mostly it's that I don't really understand the argument for keeping it out. The argument for having separate libcore and liballoc goes: libcore's only dependency in terms of linking is libcompiler_builtins, and it introduces no global entities, whereas liballoc introduces linking dependencies to the …

A trait that isn't used has no cost.
Right; my apologies; I forgot about that.
I thought it was a coherence issue. If it's a name-reachability issue with
Ah, there is some naming confusion here because @joshlf used a type alias. A wrapper struct is what I consider a newtype, and that would work, right? Wrapping every inherent method or using a one-off trait is annoying, but works.
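A sketch of that newtype approach (all names here are hypothetical): std would wrap the allocator-generic collection in its own struct and forward the inherent methods, which is the annoying part.

```rust
use std::marker::PhantomData;

// In the lower-level crate: a collection fully generic over its allocator.
pub struct RawVec<T, A> {
    len: usize,
    _marker: PhantomData<(T, A)>,
}

impl<T, A> RawVec<T, A> {
    pub fn with_alloc(_alloc: A) -> Self {
        RawVec { len: 0, _marker: PhantomData }
    }
    pub fn len(&self) -> usize {
        self.len
    }
}

// Hypothetical handle to the default global heap.
pub struct Heap;

// In std: a newtype that fixes the allocator to `Heap`...
pub struct Vec<T>(RawVec<T, Heap>);

impl<T> Vec<T> {
    pub fn new() -> Self {
        Vec(RawVec::with_alloc(Heap))
    }
    // ...and forwards each inherent method one by one.
    pub fn len(&self) -> usize {
        self.0.len()
    }
}
```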
tarcieri commented on Jul 18, 2017
This is perhaps getting a bit off topic, but what I really want is |
@tarcieri the best solution to that is custom preludes.
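For what it's worth, the custom-prelude pattern looks roughly like this; the module and item names are illustrative, and `extern crate alloc` is shown as it later works on stable Rust.

```rust
#![no_std]
extern crate alloc;

// One module re-exports the names you want available everywhere...
pub mod prelude {
    pub use alloc::borrow::ToOwned;
    pub use alloc::string::String;
    pub use alloc::vec::Vec;
}

// ...and each file in the crate glob-imports it instead of relying on the
// built-in prelude.
use crate::prelude::*;

pub fn words(s: &str) -> Vec<&str> {
    s.split(' ').collect()
}

pub fn dup(s: &str) -> String {
    s.to_owned()
}
```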
tarcieri commented on Jul 18, 2017 (edited)
@Ericson2314 custom preludes don't really solve the problem I'd like to solve, which is allowing crates to leverage …

The whole point of automatically adding them to a core/alloc prelude would be to remove this explicit dependency. Then use of …
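Concretely, the explicit opt-in being discussed is the `extern crate alloc` line below (shown as it later works on stable Rust; at the time of this thread it was still feature-gated):

```rust
#![no_std]
extern crate alloc; // the explicit dependency this proposal would make implicit

use alloc::string::String;
use alloc::vec::Vec;

pub fn lengths(parts: &[&str]) -> Vec<usize> {
    parts.iter().map(|s| s.len()).collect()
}

pub fn shout(s: &str) -> String {
    let mut out = String::new();
    for c in s.chars() {
        out.extend(c.to_uppercase());
    }
    out
}
```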
@tarcieri I'm not sure what to tell you, sorry. Some insta-stable trick to expose items through abnormal means, based on whether anything implements the allocator trait anywhere, is... unwise, in my book. I'd say stabilize the crates quicker, but @whitequark brings up a good point that our current story around handling allocation failure in everything but the allocator trait itself is atrocious: unergonomic and unsafe. I'm loath to stabilize the "beneath the facade" interfaces until that is fixed.
What? That's the exact opposite of reality. It is safe (because crashes are safe), and it's ergonomic, because explicitly handling allocation failures in typical server, desktop, or even hosted embedded software has a high cost/benefit ratio. Consider this. With mandatory explicit representation of allocation failure, almost every function that returns or mutates a …
Also, let's say you have a public function that returns …

The only remotely workable solution I see is extending the API of …
To add to this, the majority of small embedded devices whose memory is truly scarce cope in one of two ways:
As such, I feel like the primary structure providing fallible allocations would be some sort of memory pool. This can be easily handled outside the alloc crate. |
As a former libs team member, I'm not opposed to adding try_push, try_reserve, etc. to the stdlib at this point in time. Someone just needs to put in the legwork to design the API, which I think was partially blocked on landing allocator APIs -- to provide guidance on what allocation error types are like -- when this first came up. I believe the gecko team broadly wants these functions, as there are some allocations (often user-controlled, like images) which are relatively easy and useful to make fallible.
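A sketch of the kind of API being discussed, written against `Vec::try_reserve` as it eventually landed (it did not exist at the time of this comment); `try_push` is a free-function stand-in for the hypothetical inherent method.

```rust
use std::collections::TryReserveError;

fn try_push<T>(v: &mut Vec<T>, value: T) -> Result<(), TryReserveError> {
    v.try_reserve(1)?; // surface allocation failure to the caller...
    v.push(value); // ...so this push no longer needs to allocate
    Ok(())
}

fn main() {
    let mut v = Vec::new();
    try_push(&mut v, 42).expect("out of memory");
    assert_eq!(v, [42]);
}
```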
@whitequark Sorry, I meant handling allocation in the client is unergonomic/unsafe. Everyone agrees that what we do today is both safe and ergonomic, but not flexible in that regard. |
Then this sounds like the best solution to me, yeah. |
This thread has gone in a lot of different directions, and I'm having …

I'm afraid I framed this incorrectly by presenting a defense of the …

This was a major breaking architectural decision, done out of process. I suggest we take the following actions:
tarcieri commented on Jul 19, 2017 (edited)
My gut feeling is there are some complex interrelationships between these types which are difficult to model given the previous alloc/collections split. I think that was the motivation for the merge in the first place. As an example, at the error-chain meeting we discussed moving …

This means any crates that want to work with …

I guess my question is: is there still a path forward for unlocking …
@tarcieri with or without the revert, it can go in alloc. If …

@brson Ah, I wasn't actually sure what the process is for changing crates whose very existence is unstable. Thanks for clarifying.
tarcieri commented on Jul 19, 2017
My apologies if that's all orthogonal to the revert. If that's the case, I have no objections. |
A couple of process points:
From the libs team meeting, it sounded like @alexcrichton was quite confident that the original crate organization was buying us nothing, and that the problems people are interested in solving are better addressed in some other way. I think it'd be good for @alexcrichton and @brson to sync up, and then summarize their joint perspective on thread, before we make more changes here. |
tonychain referenced this issue on Jul 20, 2017: "allocator" and "nightly" features (for no_std environments) #153 (closed)
Ericson2314 referenced this issue on Jul 23, 2017: Tracking issue for custom allocators in standard collections #42774 (open)
Mark-Simulacrum added the C-bug label on Jul 28, 2017
There doesn't seem to have been any progress on this in the last 3 weeks, and PR #42565 is blocked on this resolving one way or the other. What steps do we need to take to unstick this?

"Watch me hit this beehive with a big ol' stick", he said, pressing Comment.
tarcieri commented on Aug 11, 2017
#42565 was the sort of thing I was alluding to earlier. Is there a path forward for that if the merger were to be reverted? |
Yes -- move just |
I've re-read this entire thread, and to me the key points are:

Clarity of layering. I think @whitequark put it best: …

… though we are working toward supporting fallible allocation in …

Crates providing features that aren't used. It's true that the crates.io ecosystem in general skews toward small crates, but these core library crates are a very important exception to that. The thread has made clear that bloat isn't an issue (due to linkers), nor is compilation time (which is trivial here, due to generics).

Special treatment of core crates. Grouping type and/or trait definitions together can enable impls that are not possible externally. However, (1) the very same issues apply to any choice of breaking up crates and (2) the standard library already makes substantial and vital use of its ability to provide impls locally.

Separating collections from the global heap assumption. There is not currently a plausible path to do this, but with some language additions there may eventually be. But by the same token, this kind of question also seems amenable to a feature flag treatment.

Personally, I find the new organization has a much more clear rationale than the previous one, and is simpler as well. Stakeholders from impacted communities (the …

@rfcbot fcp close
rfcbot commented on Aug 12, 2017 (edited)
Team member @aturon has proposed to close this. The next step is review by the rest of the tagged teams: No concerns currently listed. Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
rfcbot added the proposed-final-comment-period label on Aug 12, 2017
Bummer. Then I suppose the next fronts for pushing for modularity are:
rfcbot added the final-comment-period label and removed the proposed-final-comment-period label on Aug 22, 2017
rfcbot commented on Aug 22, 2017
Please don't revert it. This change has been out in the wild for a long time, and many projects have updated to the new organization of |
I'm going to close this issue now that the full libs team has signed off. (@jackpot51, to be clear, that means: we will not revert.) |
brson commented on Jul 7, 2017 (edited)
This PR merges the collections crate into the alloc crate, with the intent of enabling this PR.
Here are some reasons against the merge:
It is a violation of layering / separation of responsibilities. There is no conceptual reason for collections and allocation to be in the same crate. It seems to have been done to solve a language limitation, for the enablement of a fairly marginal feature. The tradeoff does not seem worth it to me.
It forces any no_std projects that want allocation to also take collections with it. There are presumably use cases for wanting allocator support without the Rust collections design (we know the collections are insufficient for some use cases).
It gives the std collections crate special capabilities that other collections may not implement themselves - no other collections will be able to achieve feature parity with the conversion this merger is meant to enable.
Personally I value the decomposition of the standard library into individual reusable components and think the merge is moving in the wrong direction.
I am curious to know what embedded and OS developers think of this merge. cc @japaric @jackpot51
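For context, these are the kinds of conversions that PR #42565 (RFC 1845) adds, shown here as they later stabilized in std:

```rust
use std::rc::Rc;
use std::sync::Arc;

fn main() {
    let a: Arc<[u32]> = Arc::from(&[1, 2, 3][..]);   // From<&[T]> for Arc<[T]>
    let r: Rc<str> = Rc::from("hello");              // From<&str> for Rc<str>
    let s: Arc<str> = Arc::from(String::from("hi")); // From<String> for Arc<str>
    println!("{:?} {} {}", a, r, s);
}
```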