New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for Iterator::flatten #48213

Closed
Centril opened this Issue Feb 14, 2018 · 35 comments

Comments

Projects
None yet
9 participants
@Centril
Contributor

Centril commented Feb 14, 2018

Tracking issue for #48115, Iterator::flatten and FlatMap.

@leonardo-m

This comment has been minimized.

leonardo-m commented Feb 15, 2018

A flatten could be useful, but I think there are more important/useful functions/iterators to add before a flatten.

@clarcharr

This comment has been minimized.

Contributor

clarcharr commented Feb 15, 2018

@leonardo-m Clearly not important enough for someone to write a PR/RFC before this one, though. ;)

@clarcharr

This comment has been minimized.

Contributor

clarcharr commented Feb 15, 2018

Also I can't list the number of times I've done .flat_map(|x| x), so, I'm glad this was implemented.

One thing that I think should be clarified is that this only flattens one level, i.e. it will not transform [[[1, 2], [3, 4]], [[5, 6], [7, 8]]] into [1, 2, 3, 4, 5, 6, 7, 8] but [[1, 2], [3, 4], [5, 6], [7, 8]].

@Centril

This comment has been minimized.

Contributor

Centril commented Feb 15, 2018

@leonardo-m I don't think it has to be an either-or proposition - if there are more useful additions, and there certainly are some from Itertools, they should be made irrespective of this one. Someone just has to write the PRs for those =)

Regarding .flatten() specifically, I believe it gives us easier-to-express specialization opportunities than .flat_map() does such as, flattening a vec::IntoIter<Vec<_>> with a single allocation.
Also, it was mentioned explicitly that neither of .flat_map(|x| x) nor .flat_map(identity) were rather nice and that .flatten() was clearer at rust-lang/rfcs#2306 (comment), rust-lang/rfcs#2306 (comment), and rust-lang/rfcs#2306 (review).

@clarcharr hmm, it does say in the docs that:

and you want to remove one level of indirection.

Perhaps that's not clear enough? Can we fix this during stabilization perhaps if it is not?

@clarcharr

This comment has been minimized.

Contributor

clarcharr commented Feb 15, 2018

@Centril I mentioned that having not actually read the docs yet. ;)

I honestly think that including an example would be clearer; there are tools for languages such as JavaScript, for example, which will completely flatten an array regardless of how many levels it has. Although such a thing would be very difficult and/or impossible to do with Rust's type system, I think it'd be best for it to be clear that this is not a "deep" flatten, just a shallow one.

@Centril

This comment has been minimized.

Contributor

Centril commented Feb 15, 2018

@clarcharr Updated the docs with your example.

@clarcharr

This comment has been minimized.

Contributor

clarcharr commented Feb 16, 2018

Awesome! Thanks

@yrashk

This comment has been minimized.

yrashk commented Feb 26, 2018

Not sure, how important this angle is, but just in case -- it looks like this introduces a failure for nightly builds on code that relies on itertools' version of flatten:

https://it.sit-it.org/issue/fe5e68e5-22a1-4bc3-8ebf-36586460ba27
https://travis-ci.org/sit-it/sit/jobs/346132639#L1129

It's a trivial fix for the code that uses it (the patch is in the first link) but I thought it is worth mentioning this aspect.

yrashk added a commit to sit-fyi/sit that referenced this issue Feb 26, 2018

Problem: calling .flatten() from itertools fails on nightly
```
   --> sit-web/src/webapp.rs:202:32
       |
   202 |             let record = match sit_core::repository::IssueRecordIter::flatten(issue.record_iter().unwrap()).find(|r| r.encoded_hash() == record) {
       |                                                                       ^^^^^^^ multiple `flatten` found
       |
       = note: candidate #1 is defined in an impl of the trait `std::iter::Iterator` for the type `sit_core::repository::IssueRecordIter<'_>`
       = note: candidate #2 is defined in an impl of the trait `itertools::Itertools` for the type `_`
       = note: candidate #3 is defined in the trait `rayon::iter::ParallelIterator`
```

This is, perhaps, related to the addition of the [`iterator_flatten`
nightly feature](rust-lang/rust#48213).

Solution: use itertools' version explicitly
@Vurich

This comment has been minimized.

Vurich commented Feb 26, 2018

I don't think arbitrary-level flattening is impossible with Rust's type system, it's just that it would rely on specialisation and could produce confusing results.

For example, [[Some(1), None], [None, None]].deep_flatten() would produce [1], obvious to a Rust veteran but maybe unintuitive for a newbie, plus implementing Iterator for a type would massively change the meaning of the expression.

Also, this would cause an infinite loop in the compiler when resolving the output type of deep_flatten:

struct Foo;

impl Iterator for Foo {
    type Item = Foo;

    fn next(&mut self) -> Foo {
        unimplemented!()
    }
}

JS is dynamically typed and so deep flattening is useful since you don't know the nesting level until runtime. With Rust you know ahead of time precisely how many levels deep an iterator will be nested.

@SoniEx2

This comment has been minimized.

@Centril

This comment has been minimized.

Contributor

Centril commented Feb 28, 2018

@SoniEx2 I think the documentation has a sufficient number of examples, more would be too many.

@leonardo-m

This comment has been minimized.

leonardo-m commented Feb 28, 2018

@leonardo-m I don't think it has to be an either-or proposition - if there are more useful additions, and there certainly are some from Itertools, they should be made irrespective of this one. Someone just has to write the PRs for those =)

I don't agree. So far in my whole codebase I have seen zero places where to use a flatten(), on the other hand from the standard library I miss several iterable things already present in itertools. The API of the std library should be kept small. Adding functions is not free, the more things you add, the more time you need to use the API and find the right function. In my opinion the whole business of adding iterators to the Rust std library is currently broken. It should be based on real data. What are the most used iterators of the itertools crate? Add those to the std library.

@SoniEx2

This comment has been minimized.

SoniEx2 commented Feb 28, 2018

I have written scan+flat_map(|x|x) at least 3 times so far. And I don't even use Rust that often yet!

The more I use rust, the more I run into scan+flatten.

See also: rust-lang/rfcs#1978

It's most useful for string parsing, and partial collects.

@Centril

This comment has been minimized.

Contributor

Centril commented Jun 10, 2018

@SimonSapin This has been in nightly for a few months now; could we stabilize?

@kennytm

This comment has been minimized.

Member

kennytm commented Jun 10, 2018

We may want another (check-only) crater run before stabilizing due to conflict with Itertools::flatten after that.

@SimonSapin

This comment has been minimized.

Contributor

SimonSapin commented Jun 11, 2018

@reuvenpo How would that work exactly? As far as I know the type system cannot express this sort of type-level recursion (and boundary condition).

@reuvenpo

This comment has been minimized.

reuvenpo commented Jun 11, 2018

To be honest, I'm still fairly new to rust and found my way here looking for this sort of feature, so I don't know for certain whether this sort of thing is possible.

But at least theoretically, there exists a type system not much different from the subset of the rust type system that I'm familiar with, which would be able to statically generate concrete functions that flatten a nested structure, based on its type.

As you said, that would require type-level recursion with a stop condition.

I was hoping that someone with more in-depth knowledge of the current type system than myself would come across this thread and determine if it's possible or not.

@Centril

This comment has been minimized.

Contributor

Centril commented Jun 11, 2018

In either case, whether it is possible or not, I don't think a nested flatten operation should be called .flatten() since it would lead to surprises. Given that .map corresponds to functor mapping and that .flat_map corresponds to monadic bind, then .flatten should correspond to monadic join.

@reuvenpo

This comment has been minimized.

reuvenpo commented Jun 11, 2018

I've been thinking about this issue for a while, and i came up with something, but sadly i don't currently have the time to try a full implementation, so I'll write down a sketch of it, maybe it makes some sense (If it does I'll invest a few hours into learning how to properly contribute over the weekend).

Basically, have a trait Nested with a method deep_flatten<I: Iterator>(self) -> I (or similar, i hope you get the idea) which returns the final iterator.
This trait can be implemented for all iterators to just return self. That is, flattening a "flat" iterator does nothing.
For any iterator of iterators (exact type signature missing for lack of author's experience), as a specialization, this trait can be implemented to return an iterator that collects its items from calling deep_flatten on its sub-iterators.
If i understand the type system correctly, this should be equivalent to the "type-level recursion with a stop condition" discussed earlier.

@Centril

This comment has been minimized.

Contributor

Centril commented Jun 11, 2018

@SimonSapin

This comment has been minimized.

Contributor

SimonSapin commented Jun 11, 2018

@reuvenpo That sounds maybe plausible with specialization of a trait that has an associated type (for the returned iterator). But the design of the trait specialization feature itself in the language is still evolving, so it might be a long time before "deep flatten" is possible at all.

In the mean time, the existing Iterator::flatten can be stabilized very soon. I feel that the potential of a similar-but-different feature to maybe be added later is not enough to block stabilization now with the shorter name.

bors added a commit that referenced this issue Jun 11, 2018

Auto merge of #51511 - Centril:feature/stabilize_iterator_flatten, r=…
…<try>

Stabilize Iterator::flatten in 1.29, fixes #48213.

This PR stabilizes [`Iterator::flatten`](https://doc.rust-lang.org/nightly/std/iter/trait.Iterator.html#method.flatten) in *version 1.29* (1.28 goes to beta in 10 days, I don't think there's enough time to land it in that time, but let's see...).

Tracking issue is:  #48213.

cc @bluss re. itertools.
r? @SimonSapin
ping @pietroalbini -- let's do a crater run when this passes CI :)
@reuvenpo

This comment has been minimized.

reuvenpo commented Jun 12, 2018

Would it be appropriate to open a tracking issue for Iterator::unnest (name taken from @Centril 's earlier gist) to discuss possible implementations of this idea?

Or should an RFC be made first?

@Vurich

This comment has been minimized.

Vurich commented Jun 13, 2018

Generally it's considered bad practice to have the behaviour rely on specialisation (rather than specialisation being an optimisation) because it means that Box::new(some_val) as Box<Trait> can act differently to just some_val depending on the circumstance.

In JavaScript the unnest method would be useful since you don't know until runtime the level of nesting (and different elements can nest different depths). Essentially, unnest is a tree-flattening operation in a dynamically-typed language. In a strongly-, strictly-typed language like Rust there is very little use for a method like unnest because the tree depth is known at compile-time. I don't mean to say this in a rude way, but I can't think of a place where this would be useful. Of course, if you can find an example of where this would be useful then I would like to see it.

@reuvenpo

This comment has been minimized.

reuvenpo commented Jun 16, 2018

@Vurich, i'll start by addressing the second part of your message, and then elaborate on it, hopefully answering the first part too.

It's true that in dynamically typed languages this is useful because both the author and the language may not know the depth of nesting for a given object, but it doesn't necessarily mean that in a statically typed language the author would like to care about the level of nesting of a given object, in some contexts.

My example would be anything resembling a tree structure with unbounded depth (For example a directory structure). It may be desirable to have a function generalizes over Nested<String> (A trait) which provides a method unnest returning an iterator (presumably of type Unnest) over all the leaf-nodes in that tree directly(i.e. the names of plain files) instead of manually unnesting the structure.

In this context, say you had a function that received a structure whose concrete type was something like Vec<Vec<Vec<Vec<i32>>>>(but by some generalization it wasn't necessarily known how many layers of vectors there were), but the function would just like to iterate over the i32s inside. That's impossible to do right now (AFAIK). What if instead this function wanted to iterate over Vec<i32>s, or Vec<Vec<i32>>s, etc?

The type information is all there, it's just impossible to use currently, for these purposes.

It would be best if you couldn't call thing.unnest() on a concrete object, but rather had to specify exactly to which level of unnesting you wanted to reach(e.g. thing.unnest::<String>()), but if you were in a generic function where thing was already bound by a Nested<SomeThing>, then you could just unnest() it.

Under this system, Box::new(some_val) as Box<Trait> wouldn't act that differently differently than some_val. If Trait was Nested<Vec<i32>> but some_val was Vec<Vec<Vec<i32>>> then you couldn't call unnest::<Vec<Vec<i32>>>() on the trait-object anymore, but you could still call unnest::<Vec<i32>>() and unnest::<i32>() on it, plus you can now call unnest() on it, which would be equivalent to calling unnest::<Vec<i32>>(). This behavior would make sense in this context, restricting the object's behavior according to the trait bound, and extending it's behavior slightly by the same bound.

Of course, this is all pretty difficult, if at all possible, to implement today. I certainly don't know the "right" way to do this, if it was possible, I just think this interface makes sense and is doable, even if it still requires some features that aren't yet stabilized, or available at all.

P.S. Obviously for cases where the concrete type is known, one can just call .flatten() the necessary number of times to reach the desired type.

EDIT 1: changed Unnestable to Nested

@SoniEx2

This comment has been minimized.

SoniEx2 commented Jun 16, 2018

"unnestable" is weird for a trait name. it sounds like you can't put it inside stuff.

@reuvenpo

This comment has been minimized.

reuvenpo commented Jun 16, 2018

The name is just a proposal too. It can be called anything else that makes sense.

(Denestable? Nested? Unnesting? I like Nested better now that i think of it.)

@Vurich

This comment has been minimized.

Vurich commented Jun 18, 2018

The design where you specify the iterator type still fails for this:

struct Foo;

impl Iterator for Foo {
    type Item = Foo;

    fn next(&mut self) -> Foo {
        Foo
    }
}
@reuvenpo

This comment has been minimized.

reuvenpo commented Jun 18, 2018

I'm not sure i understand.
Do you mean iterators of a type that always yield iterators of the same type?

(Is that even a useful pattern?)

@SoniEx2

This comment has been minimized.

SoniEx2 commented Jun 18, 2018

It would (should) do the least amount of work to return the desired result.

You want an Iterator<Item=Foo>? Easy, one operation. You want an Foo? Zero operations. You want Iterator<Item=Iterator<Item=Foo>>? Also mostly easy. Etc.

@reuvenpo

This comment has been minimized.

reuvenpo commented Jun 18, 2018

I don't see where i would use this pattern (since it prevents Foo from being an iterator over anything else, and this is a weird recursive way of doing what std::iter::repeat(...) does) but i agree that in this case a Foo can be considered as a Nested<Foo>, Nested<Iterator<Item=Foo>>, etc, but trying to .unnest() any of these constructs would result in an infinite recursion when trying to yield the first item, because of the recursive definition of Iterator for Foo

But again... What is the use case? Would in that use case it be useful to treat a Foo as a Nested<Foo>?

@Centril

This comment has been minimized.

Contributor

Centril commented Jun 18, 2018

Can we move discussion on flattening nested stuff to internals.rust-lang.org?

@reuvenpo

This comment has been minimized.

reuvenpo commented Jun 18, 2018

(I'll have to open an account, but)
Sure

bors added a commit that referenced this issue Jun 30, 2018

Auto merge of #51511 - Centril:feature/stabilize_iterator_flatten, r=…
…SimonSapin

Stabilize Iterator::flatten in 1.29, fixes #48213.

This PR stabilizes [`Iterator::flatten`](https://doc.rust-lang.org/nightly/std/iter/trait.Iterator.html#method.flatten) in *version 1.29* (1.28 goes to beta in 10 days, I don't think there's enough time to land it in that time, but let's see...).

Tracking issue is:  #48213.

cc @bluss re. itertools.
r? @SimonSapin
ping @pietroalbini -- let's do a crater run when this passes CI :)

kennytm added a commit to kennytm/rust that referenced this issue Jun 30, 2018

Rollup merge of rust-lang#51511 - Centril:feature/stabilize_iterator_…
…flatten, r=SimonSapin

Stabilize Iterator::flatten in 1.29, fixes rust-lang#48213.

This PR stabilizes [`Iterator::flatten`](https://doc.rust-lang.org/nightly/std/iter/trait.Iterator.html#method.flatten) in *version 1.29* (1.28 goes to beta in 10 days, I don't think there's enough time to land it in that time, but let's see...).

Tracking issue is:  rust-lang#48213.

cc @bluss re. itertools.
r? @SimonSapin
ping @pietroalbini -- let's do a crater run when this passes CI :)

bors added a commit that referenced this issue Jun 30, 2018

Auto merge of #51948 - kennytm:rollup, r=kennytm
Rollup of 9 pull requests

Successful merges:

 - #51511 (Stabilize Iterator::flatten in 1.29, fixes #48213.)
 - #51853 (Fix some doc links)
 - #51864 (Update liblibc)
 - #51869 (Avoid needless allocations in `liveness_of_locals`.)
 - #51883 (Suggest correct comparison against negative literal)
 - #51890 (Fix inconsequential typo in GlobalAlloc doc example)
 - #51920 (use literal span for concrete type suggestion)
 - #51921 (improve the error message when `#[panic_implementation]` is missing)
 - #51931 (Use in-tree libbacktrace on Fuchsia)

Failed merges:

r? @ghost

pietroalbini added a commit to pietroalbini/rust that referenced this issue Jul 1, 2018

Rollup merge of rust-lang#51511 - Centril:feature/stabilize_iterator_…
…flatten, r=SimonSapin

Stabilize Iterator::flatten in 1.29, fixes rust-lang#48213.

This PR stabilizes [`Iterator::flatten`](https://doc.rust-lang.org/nightly/std/iter/trait.Iterator.html#method.flatten) in *version 1.29* (1.28 goes to beta in 10 days, I don't think there's enough time to land it in that time, but let's see...).

Tracking issue is:  rust-lang#48213.

cc @bluss re. itertools.
r? @SimonSapin
ping @pietroalbini -- let's do a crater run when this passes CI :)

bors added a commit that referenced this issue Jul 1, 2018

Auto merge of #51965 - pietroalbini:rollup, r=pietroalbini
Rollup of 7 pull requests

Successful merges:

 - #51511 (Stabilize Iterator::flatten in 1.29, fixes #48213.)
 - #51853 (Fix some doc links)
 - #51890 (Fix inconsequential typo in GlobalAlloc doc example)
 - #51920 (use literal span for concrete type suggestion)
 - #51922 (rename the llvm-tools component to llvm-tools-preview and tweak its image)
 - #51936 (rename rustc's lld to rust-lld)
 - #51961 (Fix typo in /src/librustc_resolve/lib.rs)

Failed merges:

r? @ghost

pietroalbini added a commit to pietroalbini/rust that referenced this issue Jul 1, 2018

Rollup merge of rust-lang#51511 - Centril:feature/stabilize_iterator_…
…flatten, r=SimonSapin

Stabilize Iterator::flatten in 1.29, fixes rust-lang#48213.

This PR stabilizes [`Iterator::flatten`](https://doc.rust-lang.org/nightly/std/iter/trait.Iterator.html#method.flatten) in *version 1.29* (1.28 goes to beta in 10 days, I don't think there's enough time to land it in that time, but let's see...).

Tracking issue is:  rust-lang#48213.

cc @bluss re. itertools.
r? @SimonSapin
ping @pietroalbini -- let's do a crater run when this passes CI :)

bors added a commit that referenced this issue Jul 1, 2018

Auto merge of #51969 - pietroalbini:rollup, r=pietroalbini
Rollup of 7 pull requests

Successful merges:

 - #51511 (Stabilize Iterator::flatten in 1.29, fixes #48213.)
 - #51853 (Fix some doc links)
 - #51890 (Fix inconsequential typo in GlobalAlloc doc example)
 - #51920 (use literal span for concrete type suggestion)
 - #51921 (improve the error message when `#[panic_implementation]` is missing)
 - #51922 (rename the llvm-tools component to llvm-tools-preview and tweak its image)
 - #51961 (Fix typo in /src/librustc_resolve/lib.rs)

Failed merges:

r? @ghost

@bors bors closed this in #51969 Jul 1, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment