Add `Iterator::for_each` #42782

Merged
merged 3 commits into from Jun 30, 2017

@cuviper
Member

cuviper commented Jun 20, 2017

This works like a for loop in functional style, applying a closure to
every item in the Iterator. It doesn't allow break/continue like
a for loop, nor any other control flow outside the closure, but it may
be a more legible style for tying up the end of a long iterator chain.
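
A minimal sketch of what the missing `break`/`continue` means in practice, assuming the method behaves as described above. The function names `evens_with_loop` and `evens_with_for_each` are made up for illustration. Inside the closure, `return` only leaves the closure, so it behaves like `continue`; nothing in the closure can act as `break`.

```rust
/// Collects the even numbers below `limit` with a plain `for` loop.
fn evens_with_loop(limit: u32) -> Vec<u32> {
    let mut out = Vec::new();
    for x in 0..limit {
        if x % 2 != 0 {
            continue; // skip odd numbers
        }
        out.push(x);
    }
    out
}

/// Same result with `for_each`. `return` exits only the closure,
/// so it plays the role of `continue`.
fn evens_with_for_each(limit: u32) -> Vec<u32> {
    let mut out = Vec::new();
    (0..limit).for_each(|x| {
        if x % 2 != 0 {
            return; // acts like `continue`
        }
        out.push(x);
    });
    out
}

fn main() {
    assert_eq!(evens_with_loop(7), vec![0, 2, 4, 6]);
    assert_eq!(evens_with_for_each(7), evens_with_loop(7));
}
```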

This was tried before in #14911, but nobody made the case for using it
with longer iterators. There was also Iterator::advance at that time
which was more capable than for_each, but that no longer exists.

The itertools crate has Itertools::foreach with the same behavior,
but thankfully the names won't collide. The rayon crate also has a
ParallelIterator::for_each where simple for loops aren't possible.

I really wish we had for_each on seq iterators. Having to use a
dummy operation is annoying. - @nikomatsakis

Add `Iterator::for_each`
This works like a `for` loop in functional style, applying a closure to
every item in the `Iterator`.  It doesn't allow `break`/`continue` like
a `for` loop, nor any other control flow outside the closure, but it may
be a more legible style for tying up the end of a long iterator chain.

This was tried before in #14911, but nobody made the case for using it
with longer iterators.  There was also `Iterator::advance` at that time
which was more capable than `for_each`, but that no longer exists.

The `itertools` crate has `Itertools::foreach` with the same behavior,
but thankfully the names won't collide.  The `rayon` crate also has a
`ParallelIterator::for_each` where simple `for` loops aren't possible.

> I really wish we had `for_each` on seq iterators. Having to use a
> dummy operation is annoying.  - [@nikomatsakis][1]

[1]: rayon-rs/rayon#367 (comment)
@rust-highfive

Collaborator

rust-highfive commented Jun 20, 2017

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@cuviper

Member Author

cuviper commented Jun 20, 2017

@bluss I'm curious about your "interesting reasons" to use fold in Itertools::foreach. What sort of optimizations come out of that?

@alexcrichton

Member

alexcrichton commented Jun 20, 2017

👍 I'm game!

@rust-lang/libs, any others have thoughts?

@aturon

Member

aturon commented Jun 21, 2017

Works for me!

@sfackler

Member

sfackler commented Jun 21, 2017

I seem to remember there being philosophical objections to this back in the day, but I don't feel particularly strongly.

@cuviper

Member Author

cuviper commented Jun 21, 2017

@cuviper

Member Author

cuviper commented Jun 21, 2017

Woah, for_each based on fold is a clear win over a for loop when chain is involved. I'll update it and see about adding benchmarks to show this benefit, and we should figure out how to express this in the documentation.

Use `fold` to implement `Iterator::for_each`
The benefit of using internal iteration is shown in new benchmarks:

    test iter::bench_for_each_chain_fold     ... bench:     635,110 ns/iter (+/- 5,135)
    test iter::bench_for_each_chain_loop     ... bench:   2,249,983 ns/iter (+/- 42,001)
    test iter::bench_for_each_chain_ref_fold ... bench:   2,248,061 ns/iter (+/- 51,940)
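
A rough sketch of the idea behind this commit, not the actual patch: `for_each` can be written as a `fold` over a unit accumulator, so adaptors such as `Chain` that provide their own `fold` get to drive the loop themselves. The trait `ForEachViaFold`, its method `for_each_via_fold`, and the helper `sum_chain` are hypothetical names used only for illustration.

```rust
/// Hypothetical extension trait, for illustration only.
trait ForEachViaFold: Iterator + Sized {
    fn for_each_via_fold<F>(self, mut f: F)
    where
        F: FnMut(Self::Item),
    {
        // The accumulator is `()`; only the side effects of `f` matter.
        self.fold((), move |(), item| f(item));
    }
}

impl<I: Iterator> ForEachViaFold for I {}

/// Sums a chained range through the fold-based `for_each`.
fn sum_chain(n: u64) -> u64 {
    let mut sum = 0;
    (0..n).chain(n..2 * n).for_each_via_fold(|x| sum += x);
    sum
}

fn main() {
    // Same total as an ordinary `for` loop over the same chain.
    let mut expected = 0;
    for x in (0..1000u64).chain(1000..2000) {
        expected += x;
    }
    assert_eq!(sum_chain(1000), expected);
}
```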

@cuviper cuviper force-pushed the cuviper:iterator_for_each branch from c5c238b to 4a8ddac Jun 21, 2017

/// #![feature(iterator_for_each)]
///
/// let mut v = vec![];
/// (0..5).for_each(|x| v.push(x * 100));

@frewsxcv

frewsxcv Jun 22, 2017

Member

i'm not necessarily opposed to the current example you have written, but i find the current example slightly less idiomatic (subjectively) than something like:

let v: Vec<_> = (0..5).map(|x| x * 100).collect();

@cuviper

cuviper Jun 22, 2017

Author Member

Sure, I was just aiming for something simple and testable. I would definitely use collect or extend for that in real code. Any ideas for something more meaningful?

Similarly, the added benchmarks are just sums.

@budziq

budziq Jun 24, 2017

Contributor

Any ideas for something more meaningful?

New cookbook contributors usually have trouble consuming a functional flow they have built just for the sake of its side effects (when they do not want to obtain a value, as with fold or collect). Switching to an imperative for loop just to get the side effects does not feel idiomatic.

Some artificial examples that might not be any better 😸

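use std::sync::mpsc::channel;

// Send each computed value over a channel.
let (tx, rx) = channel();
(0..5).map(|x| x * 2 + 1).for_each(|x| { tx.send(x).unwrap(); } );

// Print the squares of the entries that parse as numbers.
["1", "2", "lol", "baz", "55"]
    .iter()
    .filter_map(|s| s.parse::<u16>().ok())
    .map(|v| v * v)
    .for_each(|v| println!("{}", v));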

@cuviper

cuviper Jun 26, 2017

Author Member

@budziq I'm glad to hear of more motivation for this!

Your additional examples are OK, but still not testing anything besides successfully compiling. Note that my first example has an assert_eq with the for-loop result, so we actually get some sanity check that it really works, as trivial as that is. Your channel example could read the rx side to check the result though.

@budziq

budziq Jun 27, 2017

Contributor

@cuviper so something like that might be ok?

use std::sync::mpsc::channel;

let (tx, rx) = channel();
(0..5).map(|x| x * 2 + 1).for_each(|x| { tx.send(x).unwrap(); } );
assert_eq!(vec![1, 3, 5, 7, 9], rx.iter().take(5).collect::<Vec<_>>());

@cuviper

cuviper Jun 27, 2017

Author Member

That would work. Do folks like that better? Or perhaps as one more example?

@steveklabnik

Member

steveklabnik commented Jun 27, 2017

I personally really wanted a foreach but yeah, it was rejected, what's changed since then?

@cuviper

Member Author

cuviper commented Jun 27, 2017

@steveklabnik I think at that time the main point was that short iterators are probably better off with a for loop and more control-flow options. It wasn't properly considered for use with longer iterator chains. When your iterator takes up multiple lines, a for loop either makes awkward formatting or requires saving to a local first, both annoying.

Also, the performance benefit of internal iteration via fold is pretty compelling, and we can implement that in for_each without users having to understand it.
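
To make the formatting point concrete, here is a small sketch with made-up helper names (`squares_with_loop`, `squares_with_for_each`). With a `for` loop the multi-line chain is first saved to a local; with `for_each` the chain simply ends in the closure.

```rust
/// Squares the entries that parse as numbers, using a `for` loop.
fn squares_with_loop(words: &[&str]) -> Vec<u32> {
    let mut out = Vec::new();
    // The chain is saved to a local so the loop header stays readable.
    let squares = words
        .iter()
        .filter_map(|s| s.parse::<u32>().ok())
        .map(|v| v * v);
    for v in squares {
        out.push(v);
    }
    out
}

/// Same work, with `for_each` closing the chain.
fn squares_with_for_each(words: &[&str]) -> Vec<u32> {
    let mut out = Vec::new();
    words
        .iter()
        .filter_map(|s| s.parse::<u32>().ok())
        .map(|v| v * v)
        .for_each(|v| out.push(v));
    out
}

fn main() {
    let words = ["1", "2", "lol", "baz", "55"];
    assert_eq!(squares_with_loop(&words), vec![1, 4, 3025]);
    assert_eq!(squares_with_for_each(&words), squares_with_loop(&words));
}
```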

@steveklabnik

Member

steveklabnik commented Jun 27, 2017

It wasn't properly considered for use with longer iterator chains.

I must have done a bad job advocating back then, then. Oh well.

@alexcrichton

Member

alexcrichton commented Jun 27, 2017

Ok sounds like there's no reason to not experiment with this at this point, @cuviper if you want to update the example (which I think the comments indicate?) then I'll r+

@steveklabnik

Member

steveklabnik commented Jun 27, 2017

History time!

I would argue that rust-lang/rfcs#1064 (comment) by @nagisa is a decent summary:

So far none of the points arguing against the RFC have become false:

  • There’s a more general version of this function already available in the standard library as well as various chains that produce the desired behaviour;
  • There’s the for-loop construct;
  • itertools package still provides a convenience wrapper for the for-loop.

The first bullet point has changed. The other ones are still true.

@cuviper

Member Author

cuviper commented Jun 27, 2017

Thanks for finding more context! My search-fu is apparently weak...

I don't think we have to falsify every point against -- only compare them to the points for:

@cuviper

Member Author

cuviper commented Jun 27, 2017

In general, I feel like there are a lot of people that do want this, and the people against are just "meh".

Anyway, I updated the first example like @budziq's suggestion.

@nikomatsakis

Contributor

nikomatsakis commented Jun 29, 2017

Obviously I've already been quoted at the top, but I agree with @cuviper that I think the pros outweigh the cons. I think that there is indeed new information since the last time this was discussed:

  1. The performance benefits of "internal iteration" were not widely discussed at the time (unless I remember incorrectly; I confess I didn't bother to click all of @steveklabnik's links, I'm just going based on my memory).
  2. The fact that, for parallel iteration, for_each is necessary.

The two together mean that encouraging the use of for_each when it is convenient for sequential iteration will make code faster by default, and also facilitate the conversion to parallel execution. And it's more ergonomic to boot! Seems like a win-win to me.

The main argument against is basically "TMWTDI" (there's more than one way to do it), which -- from my POV -- is a good enough reason to stop things without compelling advantages, but not an absolute blocker.

I also think one of the points made at the time is at least somewhat false:

There’s a more general version of this function already available in the standard library as well as various chains that produce the desired behaviour;

I presume that this is referring to fold or all? I don't consider that a real alternative. It's true that one can model for_each this way, but it's misleading and makes the code harder to read -- you have to realize that the function is being abused for something other than its intended purpose, which imo starts to defeat the point of using iterators. Moreover, as a consequence of that, code written in this style cannot be parallelized as efficiently or as well (e.g., the all combinator will waste time propagating booleans and checking for shortcircuits; fold isn't even available).
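
For comparison, a sketch of the workarounds in question, assuming they are indeed `all` and `fold` as guessed above. The three function names are invented for this example. All three produce the same result, but only the last one says what it means.

```rust
/// Emulates `for_each` with `all`: the closure has to return `true`
/// every time, or the iteration stops early.
fn push_with_all(n: u32) -> Vec<u32> {
    let mut seen = Vec::new();
    let _always_true = (0..n).all(|x| {
        seen.push(x);
        true
    });
    seen
}

/// Emulates `for_each` with `fold` over a unit accumulator.
fn push_with_fold(n: u32) -> Vec<u32> {
    let mut seen = Vec::new();
    (0..n).fold((), |(), x| seen.push(x));
    seen
}

/// The direct spelling.
fn push_with_for_each(n: u32) -> Vec<u32> {
    let mut seen = Vec::new();
    (0..n).for_each(|x| seen.push(x));
    seen
}

fn main() {
    assert_eq!(push_with_all(4), vec![0, 1, 2, 3]);
    assert_eq!(push_with_fold(4), push_with_all(4));
    assert_eq!(push_with_for_each(4), push_with_all(4));
}
```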

///
/// let (tx, rx) = channel();
/// (0..5).map(|x| x * 2 + 1)
///       .for_each(move |x| tx.send(x).unwrap());

@nikomatsakis

nikomatsakis Jun 29, 2017

Contributor

Is the move necessary here? I would not expect so.

@cuviper

cuviper Jun 29, 2017

Author Member

Maybe it's too sneaky, but that lets tx drop automatically, and then rx won't block.
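
A self-contained sketch of that effect, wrapped in a hypothetical helper `odds_through_channel` so it can be run on its own. The `move` closure takes ownership of `tx`, and the closure is dropped when `for_each` returns, which closes the sending side.

```rust
use std::sync::mpsc::channel;

/// Sends 1, 3, 5, 7, 9 through a channel and collects them again.
fn odds_through_channel() -> Vec<i32> {
    let (tx, rx) = channel();
    // `move` transfers `tx` into the closure; the closure (and `tx` with it)
    // is dropped as soon as `for_each` returns.
    (0..5).map(|x| x * 2 + 1)
          .for_each(move |x| tx.send(x).unwrap());
    // With every sender gone, `rx.iter()` ends instead of blocking.
    rx.iter().collect()
}

fn main() {
    assert_eq!(odds_through_channel(), vec![1, 3, 5, 7, 9]);
}
```

Without `move`, `tx` would stay alive until the end of the enclosing scope, so the receiving side would need something like `take(5)` or an explicit `drop(tx)` to avoid blocking.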

@alexcrichton

Member

alexcrichton commented Jun 30, 2017

@bors: r+

@bors

Contributor

bors commented Jun 30, 2017

📌 Commit e72ee6e has been approved by alexcrichton

@bors

Contributor

bors commented Jun 30, 2017

⌛️ Testing commit e72ee6e with merge 919c4a6...

bors added a commit that referenced this pull request Jun 30, 2017

Auto merge of #42782 - cuviper:iterator_for_each, r=alexcrichton
Add `Iterator::for_each`

@bors

Contributor

bors commented Jun 30, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing 919c4a6 to master...

@bors bors merged commit e72ee6e into rust-lang:master Jun 30, 2017

@cuviper

Member Author

cuviper commented Jun 30, 2017

Yay!

Now, since I left it unstable with issue = "0", do we need to open an issue and update that with a PR?

@Mark-Simulacrum

Member

Mark-Simulacrum commented Jun 30, 2017

Oh, yes, please do. We probably shouldn't have merged yet actually, but no worries!

@cuviper

Member Author

cuviper commented Jun 30, 2017

See #42987 for that update.

@phimuemue

phimuemue commented Apr 2, 2018

Hi, I stumbled upon this and wondered why for_each does not support break. I hope I'm not missing something essential, but I imagined it would be no problem for for_each to take a function returning a value that determines whether to break or not.

This value could be - as it is now - a () (never causing the loop to break), a bool indicating whether we want to break, or an Option<T> indicating whether we want to "break with a certain item".

In code, this could be captured by a trait BreakIndicator as follows:

trait BreakIndicator {
    fn is_break(&self) -> bool;
    fn final_continue() -> Self;
}

impl BreakIndicator for () {
    // always continue
    fn is_break(&self) -> bool { false }
    fn final_continue() -> () {}
}

impl BreakIndicator for bool {
    // true means break; false means continue
    fn is_break(&self) -> bool { *self }
    fn final_continue() -> bool { false }
}

impl<T> BreakIndicator for Option<T> {
    // Some(v) means "break with value v"; None means continue
    fn is_break(&self) -> bool { self.is_some() }
    fn final_continue() -> Option<T> { None }
}

I implemented a fn breaking_for_each on top of Iterator to see how that would work out:

trait IteratorWithBreakingForEach : Iterator {
    fn breaking_for_each<F, BI>(self, f: F) -> BI
        where F: FnMut(Self::Item) -> BI,
              BI: BreakIndicator,
    ;
}

impl<I> IteratorWithBreakingForEach for I where I: Iterator {
    fn breaking_for_each<F, BI>(self, mut f: F) -> BI
        where F: FnMut(Self::Item) -> BI,
              BI: BreakIndicator,
    {
        for item in self {
            let break_indicator = f(item);
            if break_indicator.is_break() {
                return break_indicator;
            }
        }
        BI::final_continue()
    }
}

Usage could be as follows:

fn main() {
    (1..10).breaking_for_each(|i| {
        println!("{}", i) // does not break at all
    });
    (1..10).breaking_for_each(|i| {
        println!("{}", i);
        i>=5 // break if i>=5
    });
    let x = (1..10).breaking_for_each(|i| {
        println!("{}", i);
        if i>=5 {
            Some(i) // break with value
        } else {
            None // continue
        }
    });
    println!("{:?}", x);
}

Has this ever been thought about? And if so, why was it apparently rejected?

@cuviper

Member Author

cuviper commented Apr 2, 2018

@phimuemue There's a form of that built around the Try trait with try_for_each (docs). There's also some discussion in #42327 (comment) whether Try should be re-framed more like Break/Continue.
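
A small sketch of the bool-like case from the question, assuming a toolchain where `try_for_each` is available. It uses `Option<()>` as the closure's return type, so `None` stops the iteration and `Some(())` continues. The helper `visit_until` is a made-up name for illustration.

```rust
/// Pushes items from 1..10 into `visited` and stops once `i >= limit`.
/// Returns `true` if the iteration stopped early.
fn visit_until(limit: u32, visited: &mut Vec<u32>) -> bool {
    let outcome = (1..10u32).try_for_each(|i| {
        visited.push(i);
        if i >= limit {
            None // break
        } else {
            Some(()) // continue
        }
    });
    outcome.is_none()
}

fn main() {
    let mut visited = Vec::new();
    assert!(visit_until(5, &mut visited));
    assert_eq!(visited, vec![1, 2, 3, 4, 5]);

    let mut all = Vec::new();
    assert!(!visit_until(100, &mut all));
    assert_eq!(all.len(), 9);
}
```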

@scottmcm

Member

scottmcm commented Apr 2, 2018

@phimuemue As an example of "break with a certain item", check out how find is implemented:

fn find<P>(&mut self, mut predicate: P) -> Option<Self::Item> where
    Self: Sized,
    P: FnMut(&Self::Item) -> bool,
{
    self.try_for_each(move |x| {
        if predicate(&x) { LoopState::Break(x) }
        else { LoopState::Continue(()) }
    }).break_value()
}

(That LoopState type is currently internal, but personally I'd like Try to use it, as @cuviper said.)

If you wanted to do the same in your code today†, it can be done with Result like this:

    self.try_for_each(move |x| { 
        if predicate(&x) { Err(x) } 
        else { Ok(()) } 
    }).err() 

† Well, once someone makes a stabilization PR for try_for_each...
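
The `Result`-based fragment above can be tried outside the standard library as a free function. This is only a sketch: `find_with_try_for_each` is a hypothetical name, and it assumes a toolchain where `try_for_each` is available.

```rust
/// Returns the first item matching `predicate`, built on `try_for_each`.
/// `Err(x)` is used purely as "break with value x", not as a failure.
fn find_with_try_for_each<I, P>(mut iter: I, mut predicate: P) -> Option<I::Item>
where
    I: Iterator,
    P: FnMut(&I::Item) -> bool,
{
    iter.try_for_each(move |x| {
        if predicate(&x) { Err(x) } else { Ok(()) }
    })
    .err()
}

fn main() {
    assert_eq!(find_with_try_for_each(1..10, |&x| x % 4 == 0), Some(4));
    assert_eq!(find_with_try_for_each(1..10, |&x| x > 100), None);
}
```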

@cuviper cuviper referenced this pull request Apr 16, 2018

Closed

Add Iterator::exhaust #49990
