Tracking Issue for `Iterator::try_collect` #94047

a-lafrance · 2022-02-16T08:02:29Z

Feature gate: #![feature(iterator_try_collect)]

This is a tracking issue for adding the try_collect() method to the Iterator trait, as originally discussed here. Iterator::try_collect() is a fallible variation of Iterator::collect() analogous to similar methods for reduce and fold, among others, that provides a simpler, generic way to collect iterators of Try elements into Try-wrapped types.

Difference from `Iterator::collect`

In terms of functionality, the main difference between try_collect() and collect() is that try_collect() allows you to fallibly collect iterators yielding types that implement Try but not FromIterator, which collect() can't do. Concretely this means you can try_collect() iterators yielding ControlFlow<_, i32> into ControlFlow<_, Vec<i32>>, which you can't do with collect().

It's also a &mut self method instead of collect's self, so that you can resume iteration after it early exits on an error. (Like how try_fold is &mut self while fold is just self.)

Another benefit of try_collect() is discoverability. Because the discoverability of the "collect-into-Result" appears to be lower than desired, "promoting" the technique into its own method seems like a good way to increase its discoverability and reach more users who would find it useful. (One of many examples of people just not being able to find the Option: FromIterator themselves: https://users.rust-lang.org/t/try-collect-as-iter-consuming-operation/20479?u=scottmcm.)

Finally, try_collect() presents a small ergonomics benefit in terms of type hints when collecting, namely that you only have to hint at the output type you're collecting into, not the whole Try type. For example, that means you can collect an iterator yielding Result<i32, SomeComplicatedError> with try_collect::<Vec<_>>(), as opposed to having to specify the whole Result with collect::<Result<Vec<_>, _>>().

Public API

trait Iterator {
    type Item;

    fn try_collect<B>(&mut self) -> ChangeOutputType<Self::Item, B>
    where
        Self: Sized,
        <Self as Iterator>::Item: Try,
        <<Self as Iterator>::Item as Try>::Residual: Residual<B>,
        B: FromIterator<<Self::Item as Try>::Output>;
}

Steps / History

Implementation: Add a try_collect() helper method to Iterator #94041
Final comment period (FCP)
Stabilization PR

Unresolved Questions

Should it have a more complicated signature to be able to return the partial results too? (@CAD97 https://internals.rust-lang.org/t/idea-fallible-iterator-mapping-with-try-map/15715/6?u=scottmcm )
Should it take self rather than &mut self, to prevent users from accidentally continuing to use the iterator after a try_collect() failure? Note that you can still continue to use the iterator if you use by_ref() first, so it's not necessarily a functionality change.
Does the name try_collect() conflict too much with the idea of collecting that's fallible in allocation? (i.e. collecting with Vec::try_reserve or similar)

The text was updated successfully, but these errors were encountered:

ChayimFriedman2 · 2022-02-16T10:12:39Z

The name try_collect() may become a problem if one day we would want a collect()-like method with fallible allocations (like e.g. Vec::try_reserve()).

hellow554 · 2022-02-16T12:19:55Z

Maybe I'm blind and I can't find it, but you can already to something like this:

fn main() {
    let a = vec![Some(3), Some(4)];
    let b = vec![Some(3), None];
    let c = vec![Result::<_, ()>::Ok(3), Ok(4)];
    let d = vec![Ok(3), Err(())];

    println!("{:?}", a.into_iter().collect::<Option<Vec<_>>>());
    println!("{:?}", b.into_iter().collect::<Option<Vec<_>>>());
    println!("{:?}", c.into_iter().collect::<Result<Vec<_>, _>>());
    println!("{:?}", d.into_iter().collect::<Result<Vec<_>, _>>());
}

What is the difference to try_collect?

ChayimFriedman2 · 2022-02-16T12:22:04Z

IIRC the point was to ease the discovery of the method.

Xiretza · 2022-02-16T14:11:31Z

The name try_collect() may become a problem if one day we would want a collect()-like method with fallible allocations (like e.g. Vec::try_reserve()).

Right, the name of this function made me think it was about collecting regular (non-Try) values into things like arrays or fallibly allocated Vecs.

a-lafrance · 2022-02-16T19:39:52Z

What is the difference to try_collect?

@hellow554 the main differences are:

Discoverability (as @ChayimFriedman2 pointed out): It seems like the discoverability of this technique is lower than ideal, so adding a try_collect() helper is meant to make the technique more discoverable by "promoting" it into its own method.
Try but not FromIterator types: This is the main functional difference between try_collect() and collect(). collect() requires the Try type involved to also implement FromIterator, which is fine for simple cases like Option and Result, but doesn't allow other "Try-not-FromIterator" types like ControlFlow to be fallibly collected. In contrast, try_collect() only requires that the output type being collected into implements FromIterator, so I can try_collect() an iterator yielding ControlFlow<_, i32> into a ControlFlow<_, Vec<i32>>, which there's no way to do with collect().
Resuming after failure: collect() takes ownership of the iterator, so you can't continue to use it after a failure occurs when collecting. try_collect() mutably borrows the iterator, so even after a failure occurs, you can continue to perform iteration (e.g. you can call try_collect() again).
Type hints: This is a relatively minor benefit, but you don't have to hint the whole complicated Try type when using try_collect() like you do with collect(). Concretely, if I have an iterator yielding Result<i32, SomeComplicatedError>, I can call try_collect::<Vec<_>>() to perform fallible collection, but I have to call collect::<Result<Vec<_>, SomeComplicatedError>() to do the same thing.

Perhaps I wasn't clear enough about the difference between try_collect() and collect() in my original description of the feature. If it would make things more clear, I'm happy to modify the feature description up above to get the points I just mentioned across.

The name try_collect() may become a problem if one day we would want a collect()-like method with fallible allocations (like e.g. Vec::try_reserve()).

Yeah, I see how that might become a conflict; I guess there's two distinct ways that collecting can fail. If anyone has suggestions for method names that are clearer, I'd love to hear them.

TrolledWoods · 2022-02-17T04:59:11Z

On the point about collect consuming the iterator, I think you could already do .by_ref().collect(). This might be useful to prevent mistakenly using the iterator again, the by_ref makes sure the programmer actually wanted to continue using the iterator. So would it be a benefit for try_collect to also consume the iterator and then mention using by_ref in the docs if you actually want to resume using it?

hellow554 · 2022-02-17T07:03:09Z

@a-lafrance thanks for that explanation. That helps me understanding your intent. And yes, IMHO that should be in the original post as a description :)

@TrolledWoods That was my first thought too, why don't you just use by_ref. Then I looked at the other three functions and they are each taking &mut self.
I mean... it is what is is, but self would have been better IMHO :) But now let's stick to that. I doesn't really matter, does it?

ChayimFriedman2 · 2022-02-17T07:06:51Z

@hellow554 If there is an argument for them to take self by reference, fine. But we should not repeat past mistakes in new APIs 😃

a-lafrance · 2022-02-17T09:02:29Z

Yeah, given that the by_ref() technique exists I can definitely see the desire to take self rather than &mut self, so I'm definitely willing to consider that change. I guess since the PR is already landing we can leave it as-is for now and revisit it later, so I'll add it as an unresolved question for now.

scottmcm · 2022-02-18T07:53:31Z

Should it take self rather than &mut self, to prevent users from accidentally continuing to use the iterator after a try_collect() failure? Note that you can still continue to use the iterator if you use by_ref() first, so it's not necessarily a functionality change.

Note that by_ref is a huge performance footgun, since it prevents using internal iteration, as it uses the &mut I where I: Iterator impl that, because it works for dyn Iterator, can't use any of the generics. The benchmarks in core can be used to see the extra cost:

rust/library/core/benches/iter.rs

Lines 150 to 158 in f838a42

    
           #[bench] 
        
           fn $bench_sum(b: &mut Bencher) { 
        
               b.iter(|| -> i64 { $iter.map(black_box).sum() }); 
        
           } 
        
           #[bench] 
        
           fn $bench_ref_sum(b: &mut Bencher) { 
        
               b.iter(|| -> i64 { $iter.map(black_box).by_ref().sum() }); 
        
           }

It's absolutely by design that anything which can early-exit takes &mut self. That's why all and any take &mut self, for example, while count and max and such don't. And because all takes &mut self, that means try_fold needs to as well in order to meet its primary purpose (#45595) of being the one internal iteration method in terms of which every single Iterator method can be defined.

There's nothing wrong with resuming iteration after getting an error from try_collect. And sure, you can do that wrong and try to resume iteration after getting a success from try_collect on a non-fused iterator, but that's exactly the same mistake as resuming iteration after getting None from .next(), so that seems fine for the same reasons.

a-lafrance · 2022-02-18T08:38:24Z

Oh that's an interesting point @scottmcm. I was wondering why some methods took self while others took &mut self so this clears it up for me. In that case I'm personally in favor of continuing to take &mut self rather than self.

ChayimFriedman2 · 2022-02-18T09:50:07Z

Note that by_ref is a huge performance footgun, since it prevents using internal iteration, as it uses the &mut I where I: Iterator impl that, because it works for dyn Iterator, can't use any of the generics.

Can this situation be improved using specialization? I.e. specialize impl<I: Sized> Iterator for &mut I to use internal iteration where possible? If not, I think this definitely should be mentioned on the docs, since it is not at all obvious.

TrolledWoods · 2022-02-18T17:46:58Z

Thinking about it more there's also the argument that &mut self signifies that it's intended to sometimes resume the value, whereas if it's self by value, you should expect weird behaviour when resuming, so it may be a better idea, yeah. Surprising that by_ref is a performance footgun though, that feels like something that should be fixed asap

scottmcm · 2022-02-18T21:53:22Z

I probably shouldn't have used a word as strong as "footgun" for this. If you're just iterating over, say, a slice, then it makes no difference at all.

It's only in certain cases -- Chain being the usual example -- where it matters. And even then only when the body of the loop is tiny. As soon as the body is doing anything substantial, any difference immediately disappears in the noise.

So it's very important in core, where we make sure to avoid pessimizing code. But outside of that it's often unimportant, and the knowledge lives in things like that blog post I linked. I also don't want people to overindex on this and start telling anyone using a for loop or by_ref that they're "doing it wrong" because of a note in the docs either.

I don't know whether specialization in currently in a place where it could fix it.

a-lafrance · 2022-02-25T09:52:11Z

Just a thought: I was looking back at Rust by Example recently and I noticed that this section mentions fallible collection with Iterator::collect. Would it be worth updating to include try_collect upon stabilization? Just so that people are aware of the new technique that we've introduced.

Let `try_collect` take advantage of `try_fold` overrides No public API changes. With this change, `try_collect` (rust-lang#94047) is no longer going through the `impl Iterator for &mut impl Iterator`, and thus will be able to use `try_fold` overrides instead of being forced through `next` for every element. Here's the test added, to see that it fails before this PR (once a new enough nightly is out): https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=462f2896f2fed2c238ee63ca1a7e7c56 This might as well go to the same person as my last `try_process` PR (rust-lang#93572), so r? `@yaahc`

Let `try_collect` take advantage of `try_fold` overrides No public API changes. With this change, `try_collect` (rust-lang#94047) is no longer going through the `impl Iterator for &mut impl Iterator`, and thus will be able to use `try_fold` overrides instead of being forced through `next` for every element. Here's the test added, to see that it fails before this PR (once a new enough nightly is out): https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=462f2896f2fed2c238ee63ca1a7e7c56 This might as well go to the same person as my last `try_process` PR (rust-lang#93572), so r? ``@yaahc``

…llect The Rust team has begun to introduce try_collect. I will keep an eye on this implementation and revisit this, but for the time being, I'm going to disambiguate this so that I can move on without worrying about a future breakage. - rust-lang/rust#94047 - https://doc.rust-lang.org/nightly/std/iter/trait.Iterator.html#method.try_collect

TennyZhuang · 2022-06-09T02:45:53Z

I'd like to share a benefit from try_collect vs collect::<Result<?, ?>>. try_collect allows me to hide the concrete type of the Result::E type, which is convenient when we want to split an error from a more extensive scope to a smaller one. If we use try_collect, we don't need to change every code that uses which Result, and ? will do the implicit cast from the low-level error to the high-level one.

An example risingwavelabs/risingwave#3081

cryptoquick · 2022-07-14T00:48:39Z

Maybe I'm blind and I can't find it, but you can already to something like this:

fn main() {
    let a = vec![Some(3), Some(4)];
    let b = vec![Some(3), None];
    let c = vec![Result::<_, ()>::Ok(3), Ok(4)];
    let d = vec![Ok(3), Err(())];

    println!("{:?}", a.into_iter().collect::<Option<Vec<_>>>());
    println!("{:?}", b.into_iter().collect::<Option<Vec<_>>>());
    println!("{:?}", c.into_iter().collect::<Result<Vec<_>, _>>());
    println!("{:?}", d.into_iter().collect::<Result<Vec<_>, _>>());
}

What is the difference to try_collect?

First of all, thank you for this, I needed that snippet. Also, if this ever gets merged, it'd be nice to have a clippy lint so I don't have to remember to remove where I've copied and pasted that snippet all over my codebase. :3

rben01 · 2022-07-29T21:26:38Z

The name try_collect() may become a problem if one day we would want a collect()-like method with fallible allocations (like e.g. Vec::try_reserve()).

Right, the name of this function made me think it was about collecting regular (non-Try) values into things like arrays or fallibly allocated Vecs.

I think the iterator methods prefixed with try_ can all be (and should continue to be) taken to be mean "this iterator method short-circuits with the first error encountered". The try_ in Vec::try_reserve is like the try_ in Rc::try_new and array::try_from — it simply means "this thing might not work out". Since the proposed method is/would be defined on iterators and not more general Rust objects/types, I think try_collect is the correct name for it. (One could say that the iterator-specific meaning of the try_ prefix "shadows" the language-wide meaning.)

If collecting can fail due to a TryReserveError — which AFAIK is the only way collecting into a FromIterator could fail — then I think the appropriate name for that method would be something like collect_try_reserve. And if more general errors need to be handled then I think a TryFromIterator trait would be appropriate.

Xiretza · 2022-12-12T19:07:19Z

If collecting can fail due to a TryReserveError — which AFAIK is the only way collecting into a FromIterator could fail — then I think the appropriate name for that method would be something like collect_try_reserve. And if more general errors need to be handled then I think a TryFromIterator trait would be appropriate.

It's not specific to reservation of Vecs at all, there are many data structures that would benefit from a fallible FromIterator and such a try_collect() method: arrays (as previously mentioned) are an obvious example, but another one would be a Grid type that can be constructed from an iterator of iterators (the rows of the grid), and which would fail if one of the rows has a different length from the others.

ilyvion · 2023-05-05T12:18:22Z

I don't like the name try_collect for this; it doesn't fit the rest of the standard library API; i.e. it doesn't use a TryFromIterator trait, but rather a FromIterator over items that implement Try, which is a different idea.

Based on its name, I would've expected try_collect to behave similarly to how try_into() can be used where it only succeeds where the source can be converted to the target; i.e. I'd expect to be able to try_collect() into a [T; 10], e.g., where it would only succeed if the iterator had exactly 10 items.

This seems like an unnecessary generalization of the idea of being able to collect() to Result/Option, which already exists today and similar implementations could be added to any new type that implements Try in the future without "stealing" the try_collect name for a different purpose than (at least I think) seems the most natural -- i.e. fallible collection, not collecting fallible items.

It might seem a more clunky name, but this seems like it should be named collect_try rather than try_collect, if it should exist at all.

withoutboats · 2023-06-29T09:31:44Z

(NOT A CONTRIBUTION)

It is my strongly held position that this API should be removed from the standard library. Its usefulness is far far less than the cognitive load it brings and therefore it does not carry its weight.

The biggest issue with this API is that its signature is completely unreasonable. Return <<Self::Item as Try>::Residual as Residua<B>l>::TryType? Multiple associated type projections to perform type changing like this is not a pattern that should be accepted into std without extremely compelling motivation for the simple reason that is totally inscrutable to most users.

And the motivation to add this API to std is not compelling at all. From the API docs and this thread I see:

Types that implement Try but not FromIterator can be collected into (the overwhelmingly most common Try types, Result and Option, do implement FromIterator).
You don't have to specify Result with try_collect whereas you do with collect.
Mutably borrowing the iterator, instead of taking ownership of it, so you can continue iterating after.

Remember that collect is just a convenient sugar over a for loop. For that final stated advantage, I would strongly prefer just writing a for loop that doesn't short circuit on the error case.

When contrasting these benefits with the signature of this function, I do not have any doubt that this is not a motivation that justifies a signature like this. When you consider other problems brought up in this thread - like that it takes a name that could plausibly useful for a fallibly allocating collect API, which would presumably have a simpler signature and a more compelling motivation - I am honestly shocked that this API has been added to std.

Jules-Bertholet · 2023-06-29T12:58:01Z

The biggest issue with this API is that its signature is completely unreasonable. Return <<Self::Item as Try>::Residual as Residual<B>>::TryType? Multiple associated type projections to perform type changing like this is not a pattern that should be accepted into std without extremely compelling motivation for the simple reason that is totally inscrutable to most users.

The Try trait is still unstable, it may be that the final design won't be so inscrutable. I would suspend judgment until Try is stabilized.

Another potential issue is that a hypothetical faillibly-allocating collection function might want to be called try_collect.

WhyNotHugo · 2023-08-30T13:38:52Z

What use cases are served by Iterator::try_collect which aren't already covered by Result::from_iter?

a-lafrance added C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Feb 16, 2022

a-lafrance mentioned this issue Feb 16, 2022

Add a try_collect() helper method to Iterator #94041

Merged

scottmcm mentioned this issue Feb 18, 2022

Let try_collect take advantage of try_fold overrides #94115

Merged

This comment was marked as off-topic.

Sign in to view

ghost mentioned this issue Nov 7, 2022

Use a GAT for Residual #104128

Closed

ilyagr mentioned this issue Dec 19, 2022

Allow jj git push to push multiple branches/changes at once martinvonz/jj#912

Merged

3 tasks

frengor mentioned this issue Jan 18, 2023

Tracking Issue for Iterator::collect_into #94780

Open

4 tasks

danakj mentioned this issue Jul 30, 2023

Add Iterator::try_collect chromium/subspace#295

Closed

polarathene mentioned this issue Oct 17, 2023

refactor: Use a common from_value method mehcode/config-rs#472

Open

allada mentioned this issue Dec 8, 2023

Add Completeness Checking Store TraceMachina/nativelink#404

Merged

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tracking Issue for `Iterator::try_collect` #94047

Tracking Issue for `Iterator::try_collect` #94047

a-lafrance commented Feb 16, 2022 •

edited by scottmcm

ChayimFriedman2 commented Feb 16, 2022

hellow554 commented Feb 16, 2022 •

edited

ChayimFriedman2 commented Feb 16, 2022 •

edited

Xiretza commented Feb 16, 2022

a-lafrance commented Feb 16, 2022

TrolledWoods commented Feb 17, 2022

hellow554 commented Feb 17, 2022

ChayimFriedman2 commented Feb 17, 2022

a-lafrance commented Feb 17, 2022

scottmcm commented Feb 18, 2022 •

edited

a-lafrance commented Feb 18, 2022

ChayimFriedman2 commented Feb 18, 2022

TrolledWoods commented Feb 18, 2022

scottmcm commented Feb 18, 2022

a-lafrance commented Feb 25, 2022

This comment was marked as off-topic.

TennyZhuang commented Jun 9, 2022 •

edited

cryptoquick commented Jul 14, 2022

rben01 commented Jul 29, 2022

Xiretza commented Dec 12, 2022 •

edited

ilyvion commented May 5, 2023 •

edited

withoutboats commented Jun 29, 2023 •

edited

Jules-Bertholet commented Jun 29, 2023

WhyNotHugo commented Aug 30, 2023 •

edited

Tracking Issue for Iterator::try_collect #94047

Tracking Issue for Iterator::try_collect #94047

Comments

a-lafrance commented Feb 16, 2022 • edited by scottmcm

Difference from Iterator::collect

Public API

Steps / History

Unresolved Questions

ChayimFriedman2 commented Feb 16, 2022

hellow554 commented Feb 16, 2022 • edited

ChayimFriedman2 commented Feb 16, 2022 • edited

Xiretza commented Feb 16, 2022

a-lafrance commented Feb 16, 2022

TrolledWoods commented Feb 17, 2022

hellow554 commented Feb 17, 2022

ChayimFriedman2 commented Feb 17, 2022

a-lafrance commented Feb 17, 2022

scottmcm commented Feb 18, 2022 • edited

a-lafrance commented Feb 18, 2022

ChayimFriedman2 commented Feb 18, 2022

TrolledWoods commented Feb 18, 2022

scottmcm commented Feb 18, 2022

a-lafrance commented Feb 25, 2022

This comment was marked as off-topic.

TennyZhuang commented Jun 9, 2022 • edited

cryptoquick commented Jul 14, 2022

rben01 commented Jul 29, 2022

Xiretza commented Dec 12, 2022 • edited

ilyvion commented May 5, 2023 • edited

withoutboats commented Jun 29, 2023 • edited

Jules-Bertholet commented Jun 29, 2023

WhyNotHugo commented Aug 30, 2023 • edited

Tracking Issue for `Iterator::try_collect` #94047

Tracking Issue for `Iterator::try_collect` #94047

a-lafrance commented Feb 16, 2022 •

edited by scottmcm

Difference from `Iterator::collect`

hellow554 commented Feb 16, 2022 •

edited

ChayimFriedman2 commented Feb 16, 2022 •

edited

scottmcm commented Feb 18, 2022 •

edited

TennyZhuang commented Jun 9, 2022 •

edited

Xiretza commented Dec 12, 2022 •

edited

ilyvion commented May 5, 2023 •

edited

withoutboats commented Jun 29, 2023 •

edited

WhyNotHugo commented Aug 30, 2023 •

edited