Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for Iterator::collect_into #94780

Open
1 of 4 tasks
frengor opened this issue Mar 9, 2022 · 5 comments
Open
1 of 4 tasks

Tracking Issue for Iterator::collect_into #94780

frengor opened this issue Mar 9, 2022 · 5 comments
Labels
C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@frengor
Copy link
Contributor

frengor commented Mar 9, 2022

Feature gate: #![feature(iter_collect_into)]

This is a tracking issue for adding the collect_into method to the Iterator trait.
Iterator::collect_into lets an iterator to be collected into a collection which implements the Extend trait, consuming the iterator and adding every of its item to the collection.
Adding this method has also the benefit of making the Extend trait more discoverable.

Public API

trait Iterator {
    type Item;

    fn collect_into<E: Extend<Self::Item>>(self, collection: &mut E) -> &mut E
    where
        Self: Sized;
}

Steps / History

Unresolved Questions

@frengor frengor added C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Mar 9, 2022
@Corfucinas
Copy link

This would be a useful method when using MPSC and appending to different vectors depending on the object.

@mqudsi
Copy link
Contributor

mqudsi commented Nov 8, 2022

Not sure if this is where any discussion should go but I would like it to formally request from now that consideration should be given to the ability to (try to?) collect into a fixed-size buffer well, primarily for purposes of avoiding heap allocation. Presumably this would just be a separate feature (try_collect_into?) but if the design of that feature would in any way conflict with the current collect_into feature, then I would like us to discuss these issues here and now before collect_into stabilizes any further.

collect_into is nice because it lets you reduce allocations by directly extending a previous buffer/collection rather than allocating into a wholly separate one and then forcing you to merge the two, which (boilerplate aside) might be less efficient. On the other hand, collecting into a fixed-size buffer could let you use iterators to directly collect into a stack-allocated array/slice, which is pretty much guaranteed to be a huge performance win. I am happy to open a separate issue to specifically track such a feature, but again, I just wanted to mention it here so that if there's something the collect_into feature should (or shouldn't) have in order to facilitate such a try_collect_into in the future, we can hopefully fix it.


The idea is that sometimes you want to use the iterator façade to simplify code transforming an existing collection and your input set is either of a known size or a capped size. Being able to collect into a stack-allocated array or mutable slice thereof would allow collecting the results of an enumeration cleanly. The semantics can (but don't necessarily have to) differ from those of collect_into, as certain things (like tracking the number of elements collected) become harder and there's the question of what to do when the provided fixed-length destination doesn't fit everything (return false and just leave the remainder unread/uncollected in the iterator? panic?).

You can currently mock this yourself by implementing Extend<A> (though the Extend api assumes infallibility) and then using .inspect() before .collect_into() to track the number of elements actually written to the collection.


EDIT:

It may very well just make more sense to loop over the items and add them to the array rather than building a huge façade around that. (Technically the same goes for collect_into as well...)

@feature-engineer
Copy link

feature-engineer commented Jan 18, 2023

Regarding the last point:

Is it worth it to have this API? The Iterator interface is already pretty large, and use cases can easily be written differently without this API.

I have an example use case, and would like to know how it should be written differently without this API:

I'm implementing a limited size priority queue (i.e. I want to keep only the top N elements of the queue at any given time).
For this I create a struct which has a collection and a size as its members.
I create this struct with a given size field - which is why it has to be created before it is being collected to.
Using this queue after implementing its insert fn, and extend fn is trivial when this API exists:

q = SizedQueue::new(5);
<some iter over large data>.collect_into(q);

How would it look with an alternative API?

@frengor
Copy link
Contributor Author

frengor commented Jan 18, 2023

q = SizedQueue::new(5);
<some iter over large data>.collect_into(q);

How would it look with an alternative API?

q = SizedQueue::new(5);
q.extend(<some iter over large data>);

The two codes are equivalent (in fact, collect_into is implemented calling Extend::extend).

@frengor
Copy link
Contributor Author

frengor commented Jan 18, 2023

Presumably this would just be a separate feature (try_collect_into?)

Yeah, I think it should be. However, seeing that try_collect doesn't handle fallible allocations, the name of either try_collect_into or try_collect should be changed to maintain coherence (also see the unresolved questions of try_collect).

but if the design of that feature would in any way conflict with the current collect_into feature, then I would like us to discuss these issues here and now before collect_into stabilizes any further.

I don't think it will really impact collect_into, I suppose it would mostly require adding something like a TryExtend trait (it would be good to add a TryFromIterator trait, too), but maybe there is a better way to handle fallible allocations (?).

It may very well just make more sense to loop over the items and add them to the array rather than building a huge façade around that. (Technically the same goes for collect_into as well...)

That would probably just be the implementation of TryExtend for slices I think, and having it implemented once in the stdlib doesn't require users to duplicate the same code over and over. collect_into uses Extend for this exact reason.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

4 participants