Add std::iter::unfold #55869

SimonSapin · 2018-11-11T11:23:52Z

This adds an unstable ~~std::iter::iterate~~ std::iter::unfold function and ~~std::iter::Iterate~~ std::iter::Unfold type that trivially wrap a ~~FnMut() -> Option<T>~~ FnMut(&mut State) -> Option<T> closure to create an iterator. ~~Iterator state can be kept in the closure’s environment or captures.~~

This is intended to help reduce the amount of boilerplate needed when defining an iterator that is only created in one place. Compare the existing example of the std::iter module: (explanatory comments elided)

struct Counter {
    count: usize,
}

impl Counter {
    fn new() -> Counter {
        Counter { count: 0 }
    }
}

impl Iterator for Counter {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        self.count += 1;
        if self.count < 6 {
            Some(self.count)
        } else {
            None
        }
    }
}

… with the same algorithm rewritten to use this new API:

fn counter() -> impl Iterator<Item=usize> {
    std::iter::unfold(0, |count| {
        *count += 1;
        if *count < 6 {
            Some(*count)
        } else {
            None
        }
    })
}
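For anyone who wants to experiment without the feature gate, the wrapper can be approximated in a few lines. This is only a minimal sketch, not the code in this PR: `Unfold` and `unfold` below are local definitions whose names mirror the proposal, and the field layout is an assumption.

```rust
/// Local stand-in for the proposed `std::iter::Unfold` (field layout is an assumption).
struct Unfold<St, F> {
    state: St,
    f: F,
}

impl<St, T, F> Iterator for Unfold<St, F>
where
    F: FnMut(&mut St) -> Option<T>,
{
    type Item = T;

    fn next(&mut self) -> Option<T> {
        // Hand the closure a mutable borrow of the state on every call.
        (self.f)(&mut self.state)
    }
}

/// Local stand-in for the proposed `std::iter::unfold`.
fn unfold<St, T, F>(initial_state: St, f: F) -> Unfold<St, F>
where
    F: FnMut(&mut St) -> Option<T>,
{
    Unfold { state: initial_state, f }
}

/// The counter from the description, built on the local sketch.
fn counter() -> impl Iterator<Item = usize> {
    unfold(0, |count| {
        *count += 1;
        if *count < 6 { Some(*count) } else { None }
    })
}
```

Collecting `counter()` gives 1 through 5, the same sequence as the `Counter` struct.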

This also adds unstable std::iter::successors which takes an (optional) initial item and a closure that takes an item and computes the next one (its successor).

let powers_of_10 = successors(Some(1_u16), |n| n.checked_mul(10));
assert_eq!(powers_of_10.collect::<Vec<_>>(), &[1, 10, 100, 1_000, 10_000]);
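The sequence above stops at 10_000 because the next multiplication overflows `u16`, so `checked_mul` returns `None` and that ends the iteration. The sketch below spells this out and adds two more cases. It assumes a toolchain where `std::iter::successors` is usable without a feature attribute (as in current stable Rust), and `halvings` is a hypothetical helper invented for illustration.

```rust
use std::iter::successors;

fn powers_of_10() -> Vec<u16> {
    // 10_000 * 10 = 100_000 does not fit in a u16 (max 65_535),
    // so `checked_mul` yields `None` and the iterator ends.
    successors(Some(1_u16), |n| n.checked_mul(10)).collect()
}

/// Hypothetical example: keep halving until 1 has been produced.
fn halvings(start: u32) -> Vec<u32> {
    successors(Some(start), |&n| if n > 1 { Some(n / 2) } else { None }).collect()
}

/// With no initial item the closure is never called and nothing is yielded.
fn empty_len() -> usize {
    successors(None::<u16>, |n| n.checked_mul(10)).count()
}
```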

rust-highfive · 2018-11-11T11:23:54Z

r? @aidanhs

(rust_highfive has picked a reviewer for you, use r? to override)

SimonSapin · 2018-11-11T11:28:37Z

For another example, the other somewhat-general purpose iterator that I initially considered submitting can be expressed on top of iterate like below. However it feels not quite general enough to belong in the standard library. iterate on the other hand feels more fundamental, as a way of bridging closures (and their very nice syntax) to iterators.

fn successors<T, F>(mut next: Option<T>, mut succ: F) -> impl Iterator<Item = T>
where
    F: FnMut(&T) -> Option<T>,
{
    std::iter::iterate(move || {
        next.take().map(|item| {
            next = succ(&item);
            item
        })
    })
}

Edit: with explicit state:

fn successors<T, F>(first: Option<T>, mut succ: F) -> impl Iterator<Item = T>
where
    F: FnMut(&T) -> Option<T>,
{
    std::iter::unfold(first, move |next| {
        next.take().map(|item| {
            *next = succ(&item);
            item
        })
    })
}

nagisa · 2018-11-11T12:44:06Z

NB: this is closer to Haskell’s unfoldr than iterate. I’m fine with the proposed name, but there may be some value in using the same name as Haskell does.

One thing that could be tweaked about this function is making the state explicit instead of implicit in the closure as it is now. That would make the function’s signature

pub fn iterate<T, S, F: FnMut(S) -> Option<(T, S)>>(state: S, f: F) -> Iterate<F>;

and the motivating example look like this:

fn counter() -> impl Iterator<Item=usize> {
    std::iter::iterate(0, move |count| {
        // `count` is the last value yielded; stop after 5 to match `Counter`.
        if count < 5 {
            Some((count + 1, count + 1))
        } else {
            None
        }
    })
}

which, among other things, would make this strictly more composable and flexible. It would also be closer design-wise to our other iterators (esp. Iterator::fold).
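To make the comparison concrete, here is a sketch of how the by-value, tuple-returning design could be implemented. `Unfoldr` and `unfoldr` are hypothetical names used only for this illustration, and the bound in `counter` is chosen so the output matches the original `Counter` (1 through 5).

```rust
/// Hypothetical iterator for the by-value design; not part of this PR.
struct Unfoldr<S, F> {
    // `None` once the closure has signalled the end: the state was moved
    // into the closure and not handed back.
    state: Option<S>,
    f: F,
}

impl<T, S, F> Iterator for Unfoldr<S, F>
where
    F: FnMut(S) -> Option<(T, S)>,
{
    type Item = T;

    fn next(&mut self) -> Option<T> {
        // Move the state out, call the closure, store the new state back.
        let state = self.state.take()?;
        let (item, next_state) = (self.f)(state)?;
        self.state = Some(next_state);
        Some(item)
    }
}

fn unfoldr<T, S, F>(state: S, f: F) -> Unfoldr<S, F>
where
    F: FnMut(S) -> Option<(T, S)>,
{
    Unfoldr { state: Some(state), f }
}

fn counter() -> impl Iterator<Item = usize> {
    unfoldr(0, |count| {
        if count < 5 {
            Some((count + 1, count + 1))
        } else {
            None
        }
    })
}
```

One consequence of moving the state is that this version needs an `Option` slot to move out of, and it stays exhausted after the first `None`.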

SimonSapin · 2018-11-11T13:55:45Z

What does the r mean in unfoldr?

How is explicit state more composable or flexible? Returning an (optional) tuple feels kinda awkward to me.

nagisa · 2018-11-11T15:04:03Z

r stands for "right-associative", and there’s not really a l variant of unfold, at least not for their built-in lazy lists (which are somewhat close to iterators in Rust). I suspect the exact origin of the r in unfoldr is from unfoldr being dual to foldr. foldr has different semantics when compared to foldl (which is what our Iterator::fold is).

Associativity matters more in Haskell than it does in Rust, though, so I think there’s little reason to worry about the associativity here.

How is explicit state more composable or flexible? Returning an (optional) tuple feels kinda awkward to me.

At the very least it is possible to use non-closures with the function. I agree with the tuple feeling slightly more awkward than captured state in this specific example, but to me it seems that given a different example it could go the other way as well. All that being said, I feel that Iterator::fold having explicit state is a strong incentive to have similarly explicit state in related functions as well.

ljedrz · 2018-11-11T15:36:39Z

A blast from the past.

bluetech · 2018-11-11T17:36:16Z

If it will be possible to use generators to create iterators, would this function still be useful?

SimonSapin · 2018-11-11T20:43:38Z

@ljedrz In my opinion we’ve been very conservative in the past with the standard library, especially in the months before and after 1.0. You can see that the given deprecation reason can be summarized as "Meh."

@bluetech Indeed it would be less useful. However as far as I know such generators (or generators at all, other than as an unstable implementation detail of async fns) are not on the roadmap at the moment.

I’ve renamed this to unfold / Unfold and added explicit state. However, rather than being moved, the state is passed to the closure as &mut St and is not part of the closure’s return type.

rust-highfive · 2018-11-11T20:46:36Z

The job x86_64-gnu-llvm-5.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.


travis_time:end:00d501e1:start=1541968869333660002,finish=1541968924170331674,duration=54836671672
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#Pull-Requests-and-Security-Restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
Setting environment variables from .travis.yml
$ export IMAGE=x86_64-gnu-llvm-5.0

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

ljedrz · 2018-11-11T20:51:34Z

@SimonSapin don't get me wrong, I'm a big fan of the idea :).

clarfonthey · 2018-11-11T22:38:54Z

I'm worried that this will discourage people from providing proper size hints for their iterators. This is also very similar to repeat_with, and perhaps that should be defined in terms of this?

Centril · 2018-11-11T23:30:31Z

src/libcore/iter/sources.rs

@@ -386,3 +386,68 @@ impl<T> FusedIterator for Once<T> {}
 pub fn once<T>(value: T) -> Once<T> {
    Once { inner: Some(value).into_iter() }
 }
+
+/// Creates a new iterator where each iteration calls the provided closure
+/// `F: FnMut() -> Option<T>`.


Suggested change

/// `F: FnMut() -> Option<T>`.

/// `F: FnMut(&mut St) -> Option<T>`.

Centril · 2018-11-11T23:33:13Z

src/libcore/iter/sources.rs

+/// without using the more verbose syntax of creating a dedicated type
+/// and implementing the `Iterator` trait for it.
+/// Iterator state can be kept in the closure’s captures and environment.
+///


Suggested change

///

/// An initial state can also be passed in the first argument of `unfold`.

///

Centril · 2018-11-11T23:34:12Z

src/libcore/iter/sources.rs

+/// ```
+/// #![feature(iter_unfold)]
+/// let counter = std::iter::unfold(0, |count| {
+///     // increment our count. This is why we started at zero.


Suggested change

/// // increment our count. This is why we started at zero.

/// // Increment our count. This is why we started at zero.

(should be fixed in module level docs also...)

Centril · 2018-11-11T23:34:19Z

src/libcore/iter/sources.rs

+///     // increment our count. This is why we started at zero.
+///     *count += 1;
+///
+///     // check to see if we've finished counting or not.


Suggested change

/// // check to see if we've finished counting or not.

/// // Check to see if we've finished counting or not.

Centril · 2018-11-11T23:35:24Z

src/libcore/iter/sources.rs

+    }
+}
+
+/// An iterator where each iteration calls the provided closure `F: FnMut() -> Option<T>`.


Suggested change

/// An iterator where each iteration calls the provided closure `F: FnMut() -> Option<T>`.

/// An iterator where each iteration calls the provided closure

/// `F: FnMut(&mut St) -> Option<T>`.

Centril · 2018-11-11T23:36:43Z

src/libcore/iter/sources.rs

+///     }
+/// });
+/// assert_eq!(counter.collect::<Vec<_>>(), &[1, 2, 3, 4, 5]);
+/// ```


Maybe leave a note that this iterator is not fused and that calling it after it returned None the first time might return Some afterwards.
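A small sketch of what "not fused" means in practice. It uses `std::iter::from_fn` with captured state as a stand-in for the closure-based iterator discussed here. That substitution, and the helper names `gappy`, `observe` and `observe_fused`, are assumptions made for the example.

```rust
use std::iter::from_fn;

/// Yields 1, 2, then `None` at every multiple of 3, then carries on.
fn gappy() -> impl Iterator<Item = u32> {
    let mut n = 0;
    from_fn(move || {
        n += 1;
        if n % 3 == 0 { None } else { Some(n) }
    })
}

/// Four raw `next()` calls: the closure is simply called again after `None`.
fn observe() -> Vec<Option<u32>> {
    let mut it = gappy();
    (0..4).map(|_| it.next()).collect()
}

/// The same four calls through `.fuse()`, which pins the iterator at `None`.
fn observe_fused() -> Vec<Option<u32>> {
    let mut it = gappy().fuse();
    (0..4).map(|_| it.next()).collect()
}
```

Callers that need the usual "`None` means done forever" behaviour can opt in with `.fuse()`.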

Centril · 2018-11-11T23:37:36Z

src/libcore/iter/sources.rs

+#[unstable(feature = "iter_unfold", issue = /* FIXME */ "0")]
+pub struct Unfold<St, F> {
+    /// The current state of the iterator
+    pub state: St,


Double check: are we comfortable with exposing a public field at this point in time?

It was public in the old Unfold that we removed in 1.3 and I didn’t see a reason for it not to be. But I’ve undone this change for now in case it makes anyone feel better.

Centril · 2018-11-11T23:38:46Z

src/libcore/iter/sources.rs

+/// See its documentation for more.
+///
+/// [`unfold`]: fn.unfold.html
+#[derive(Copy, Clone, Debug)]


This will add an impl Debug for Unfold<St, F> where St: Debug, F: Debug. Note in particular F: Debug; this is probably not what the user wants so I think the Debug implementation should be hand rolled in this instance.

We do this all over the place with iterators, for example:

impl<I: Debug, P> Debug for Filter<I, P>

No P: Debug here. ;)
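A hand-rolled impl along those lines might look like the sketch below. The struct is a local stand-in with an assumed field layout, and `debug_output` is a hypothetical helper that only exists to show the result.

```rust
use std::fmt;

/// Local stand-in for the proposed type (field layout is an assumption).
struct Unfold<St, F> {
    state: St,
    f: F,
}

// Only `St: Debug` is required. `F` is deliberately left unbounded so the
// impl also applies when `F` is a closure, which does not implement `Debug`.
impl<St: fmt::Debug, F> fmt::Debug for Unfold<St, F> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("Unfold")
            .field("state", &self.state)
            .finish()
    }
}

fn debug_output() -> String {
    let it = Unfold { state: 3_u8, f: |s: &mut u8| Some(*s) };
    format!("{:?}", it)
}
```

With `#[derive(Debug)]` the same `format!` call would not compile, because the derive adds an `F: Debug` bound that the closure cannot satisfy.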

Centril · 2018-11-11T23:41:51Z

I think this is a good idea and I agree with @SimonSapin that generators aren't sufficient justification to skip this, given that generators haven't even had an RFC accepted yet (experimental ones don't count) and are thus far in the future.

The design seems mostly right to me and the name is also good (follows the Haskell theme we generally have for iterators).

@clarcharr I think it's fine to not provide proper size hints for your iterator; premature optimization and all that... First prototype your application and make it work correctly; then, if profiling shows a problem, tune things with a custom iterator.

Centril · 2018-11-11T23:42:33Z

src/libcore/iter/sources.rs

+///     }
+/// });
+/// assert_eq!(counter.collect::<Vec<_>>(), &[1, 2, 3, 4, 5]);
+/// ```


Maybe also leave a note about size_hint.

SimonSapin · 2018-11-12T11:36:18Z

I’ve added a note about size_hint and FusedIterator.

FWIW in the use case that triggered this PR (tree traversal) there is no cheap way to compute a size_hint better than (0, None). (Or perhaps (0, Some(usize::MAX / size_of::<Node>())) would be accurate, but it wouldn’t be particularly useful.)
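To illustrate the size_hint concern: a closure-backed iterator cannot know how many items the closure will produce, so it can only report the default bounds. The sketch below uses `std::iter::from_fn` from current stable Rust as a stand-in for the closure-based iterator (an assumption for the example) and compares it with a range that yields the same items.

```rust
use std::iter::from_fn;

/// Bounds reported by a closure-backed iterator that yields 1 through 5.
fn closure_hint() -> (usize, Option<usize>) {
    let mut n = 0_u32;
    let it = from_fn(move || {
        n += 1;
        if n < 6 { Some(n) } else { None }
    });
    it.size_hint()
}

/// Bounds reported by a range yielding the same five items.
fn range_hint() -> (usize, Option<usize>) {
    (1..6_u32).size_hint()
}
```

`collect` still produces the right result in both cases. The hint only affects things like how much capacity is reserved up front.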

cuviper · 2018-11-12T23:29:30Z

FWIW, itertools picked up unfold after it was evicted from std::iter. But these are free functions anyway, so they won't conflict the way additions to Iterator can with Itertools.

alexcrichton · 2018-11-13T16:47:41Z

On reading the signature here my first reaction was "why take a state parameter because the closure env also has state?" but it looks like that's how this function started and it's ended up here with a state parameter. On reading the discussion here I'm not quite sure why, but @SimonSapin how come you made the switch? (or @cuviper, do you know why it's the way it is in itertools?)

cuviper · 2018-11-13T17:54:13Z

I believe @bluss just took the Unfold design as-is from std: rust-itertools/itertools@edc9295. Recently, @matklad had a similar question about state vs. captures in rust-itertools/itertools#298.

I guess the state parameter can make it easier to write a single-expression unfold, rather than having to let-bind the initial state separately. And with captures, if you need to leave the initial scope, you have to remember to move into the closure too.

In some sense, the unfold state just mirrors the fold accumulator. Maybe that's enough.

matklad · 2018-11-13T19:04:43Z

Note that we already can avoid creating a struct using the Iterator::scan method. So, the original example can be written, albeit very unintuitively, as

fn counter() -> impl Iterator<Item=i32> {
    let mut state = 0;
    std::iter::repeat(())
        .scan((), move |(), ()| {
            state += 1;
            if state < 6 {
                Some(state)
            } else {
                None
            }
        })
}

matklad · 2018-11-13T19:17:06Z

I personally really enjoy Kotlin's generateSequence design: it fits the common case better, and is much more intuitive than unfold. Using generate, the counter example looks like:

fn counter() -> impl Iterator<Item=usize> {
    generate(Some(1), |it| {
        if it + 1 < 6 { Some(it + 1) } else { None }
    })
}

playground

I'd rather we add generate to std, though we should probably try it out in itertools first (PR).

SimonSapin · 2018-11-13T19:21:27Z

@alexcrichton I initially liked the simplicity of the next() impl that does nothing but call the closure, but I don’t really mind explicit state. I made the switch because I didn’t really have a counter-point to the consistency argument:

#55869 (comment)

making the state explicit instead of implicit in the closure as it is now […] would also be closer design-wise to our other iterators (esp. Iterator::fold).

… but I’m ok with switching back if other people prefer the more "trivial" behavior.

@cuviper Indeed, re move keyword. The earlier version of this PR had an example in the doc-comment that used it with fn counter() -> impl Iterator<Item=u32>.

@matklad I think this generate is identical to my successors in #55869 (comment) ?

matklad · 2018-11-13T19:30:56Z

@matklad I think this generate is identical to my successors in #55869 (comment) ?

Agreed, I didn't see that comment. From my experience though, generate/successor is what you need in practice in the overwhelming majority of cases (like, fs::read_to_string). I think unfold can be expressed in terms of generate and map, where generate gens a pair of (state, item). There's also a possibility to add both, of course.
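A sketch of that encoding, using `std::iter::successors` in the role of generate. `unfold_via_successors` is a hypothetical helper, and the by-value `step` signature and the `(state, item)` tuple order are choices made for this example rather than anything from the PR.

```rust
use std::iter::successors;

/// Hypothetical helper: an unfold-like iterator built from `successors`
/// (playing the role of `generate`) plus `map`.
fn unfold_via_successors<St, T, F>(init: St, mut step: F) -> impl Iterator<Item = T>
where
    St: Clone,
    F: FnMut(St) -> Option<(St, T)>,
{
    // The generated sequence carries (state, item) pairs.
    let first = step(init);
    successors(first, move |(state, _)| step(state.clone()))
        // `map` then drops the state and keeps only the item.
        .map(|(_, item)| item)
}

fn counter() -> impl Iterator<Item = usize> {
    unfold_via_successors(0_usize, |count| {
        if count < 5 { Some((count + 1, count + 1)) } else { None }
    })
}
```

The encoding has a cost: `successors` only lends the previous pair to the closure, so the state has to be `Clone`, and each step is computed one item ahead of when it is yielded.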

Probably a good thing to do would be to grep something like servo for usages of Unfold, and check how many will be simplified with generate.

bluss · 2018-11-13T19:32:54Z

Implicit state seems more elegant, but I don't use unfold enough in the wild to really know what works best. What is awkward about the &mut St reference is that you'll often go on to read (*state) and write back (*state = x;) updates.

I think fold has a good reason to use an explicit accumulator: when ownership is passed through each accumulation.

SimonSapin · 2018-11-13T19:34:15Z

(FWIW Servo does not use itertools::Unfold. And std::iter::Unfold was removed long ago, this PR is about adding it back.)

cuviper · 2018-11-13T21:25:40Z

This adds an unstable ~~std::iter::iterate~~ std::iter::unfold function and ~~std::iter::Iterate~~ std::iter::Unfold type that trivially wrap a ~~FnMut() -> Option<T>~~ FnMut(&mut State) -> Option<T> closure to create an iterator. ~~Iterator state can be kept in the closure’s environment or captures.~~

BTW, I didn't see your original code, but std::iter::iterate did exist once too:

pub fn iterate<T, F>(
    seed: T,
    f: F,
) -> Unfold<(F, Option<T>, bool), fn(&mut (F, Option<T>, bool)) -> Option<T>>
where
    T: Clone,
    F: FnMut(T) -> T,

And there's a similar itertools::iterate, with its own return type and not requiring Clone:

pub fn iterate<St, F>(initial_value: St, f: F) -> Iterate<St, F>
where
    F: FnMut(&St) -> St,

But unlike @SimonSapin's successors / @matklad's generate, these iterates don't terminate.
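The difference is easy to see when an iterate-style function is written in terms of successors: the closure always produces a next value, so termination has to come from the caller. In this sketch `iterate` is a local, hypothetical helper modelled on the itertools signature quoted above, not the historical std function.

```rust
use std::iter::successors;

/// Hypothetical `iterate`-style helper: the closure always yields a next
/// value, so the sequence never ends on its own.
fn iterate<T, F>(seed: T, mut f: F) -> impl Iterator<Item = T>
where
    F: FnMut(&T) -> T,
{
    successors(Some(seed), move |prev| Some(f(prev)))
}

fn first_doublings() -> Vec<u32> {
    // The caller has to bound the sequence, for example with `take`.
    iterate(1_u32, |n| n * 2).take(5).collect()
}
```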

SimonSapin · 2018-11-13T21:36:21Z

Hmm, I probably shouldn’t have squashed the commits. The original code was something like this:

impl<T, F: FnMut() -> Option<T>> Iterator for Foo<F> {
    type Item = T;
    fn next(&mut self) -> Option<T> {
        (self.0)()
    }
}

SimonSapin · 2018-11-15T13:40:58Z

I’ve optimistically created a tracking issue at #55977.

From my experience though, generate/successor is what you need in practice in the overwhelming majority of cases

I’ve added successors.

clarfonthey · 2018-11-17T19:10:04Z

Bikeshed: successors as continue_with, maybe?

alexcrichton · 2018-11-19T16:47:59Z

Ok these seem reasonable enough to me to add unstable, r=me with @Centril's doc comments as well

SimonSapin · 2018-11-20T17:23:11Z

@bors r=alexcrichton

bors · 2018-11-20T17:23:12Z

📌 Commit a4279a0 has been approved by alexcrichton


@ghost

Rollup of 14 pull requests Successful merges: - #55767 (Disable some pretty-printers when gdb is rust-enabled) - #55838 (Fix #[cfg] for step impl on ranges) - #55869 (Add std::iter::unfold) - #55945 (Ensure that the argument to `static_assert` is a `bool`) - #56022 (When popping in CTFE, perform validation before jumping to next statement to have a better span for the error) - #56048 (Add rustc_codegen_ssa to sysroot) - #56091 (Fix json output in the self-profiler) - #56097 (Fix invalid bitcast taking bool out of a union represented as a scalar) - #56116 (ci: Download clang/lldb from tarballs) - #56120 (Add unstable Literal::subspan().) - #56154 (Pass additional linker flags when targeting Fuchsia) - #56162 (std::str Adapt documentation to reality) - #56163 ([master] Backport 1.30.1 release notes) - #56168 (Fix the tracking issue for hash_raw_entry) Failed merges: r? @ghost

SimonSapin added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. A-iterators Area: Iterators labels Nov 11, 2018

rust-highfive assigned aidanhs Nov 11, 2018

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 11, 2018

SimonSapin force-pushed the iterate branch from eac089d to 515fb03 Compare November 11, 2018 20:39

SimonSapin changed the title ~~Add std::iter::iterate~~ Add std::iter::unfold Nov 11, 2018

SimonSapin force-pushed the iterate branch from 515fb03 to a25af8e Compare November 11, 2018 20:53

Centril reviewed Nov 11, 2018


SimonSapin force-pushed the iterate branch from a25af8e to e009d90 Compare November 12, 2018 11:29

SimonSapin mentioned this pull request Nov 15, 2018

Tracking issue for std::iter::from_fn #55977

Closed

2 tasks

SimonSapin force-pushed the iterate branch from e009d90 to f79b7c6 Compare November 15, 2018 13:40

SimonSapin added 6 commits November 20, 2018 18:22

Add std::iter::unfold

48aae09

Unfold<St, F>: Debug without F: Debug

544ad37

Copy is best avoided on iterators

2222818

Add std::iter::successors

641c490

Add tracking issue for unfold and successors

8a5bbd9

Capitalize

a4279a0

SimonSapin force-pushed the iterate branch from f79b7c6 to a4279a0 Compare November 20, 2018 17:22

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 20, 2018

kennytm mentioned this pull request Nov 23, 2018

Rollup of 14 pull requests #56186

Merged

bors merged commit a4279a0 into rust-lang:master Nov 23, 2018

SimonSapin mentioned this pull request Feb 1, 2019

Tracking issue for std::iter::successors #58045

Closed

SimonSapin deleted the iterate branch November 28, 2019 12:04

