Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IntoIterator should be implemented for [T; N] not just references #25725

Open
mdinger opened this issue May 22, 2015 · 36 comments · May be fixed by #65819
Open

IntoIterator should be implemented for [T; N] not just references #25725

mdinger opened this issue May 22, 2015 · 36 comments · May be fixed by #65819

Comments

@mdinger
Copy link
Contributor

@mdinger mdinger commented May 22, 2015

Currently IntoIterator is implemented for arrays of &'a mut [T; N] and &'a [T; N] but not for [T; N] making it only possible to collect() arrays of references, not arrays of values:

// These are the same in spite of `iter()` changing to `into_iter()`.
let vec_ref1: Vec<&i32> = [1, 2].iter().collect();
let vec_ref2: Vec<&i32> = [1, 2].into_iter().collect();

// This is what `into_iter()` should do but it doesn't actually work.
let vec: Vec<i32> = [1, 2].into_iter().collect();

This is something users are likely to run headlong into (probably accidentally) when testing out examples from the iterator docs. So, IntoIterator should be implemented so that last option is possible. The API stabilization RFC: rust-lang/rfcs#1105 states:

any breakage in a minor release ... it must always be possible to locally fix the problem through some kind of disambiguation that could have been done in advance

Meaning this would be allowed as a minor change in spite of [1, 2].into_iter() behavior changing because it could be changed beforehand to [1, 2].iter().

@mdinger

This comment has been minimized.

Copy link
Contributor Author

@mdinger mdinger commented May 22, 2015

cc @bluss

@steveklabnik steveklabnik added the A-libs label May 25, 2015
@alexcrichton

This comment has been minimized.

Copy link
Member

@alexcrichton alexcrichton commented May 26, 2015

This is tricky to implement for an arbitrary T as panics need to be handled where a moved-out element isn't dropped, and this also leads to a large number of iterators where the reference-based IntoIterator implementations just return a normal slice iterator.

@bluss

This comment has been minimized.

Copy link
Member

@bluss bluss commented May 26, 2015

For the record, crate arrayvec implements a by-value iterator using one type and a trait for a range of array sizes. I think it has solved all safety concerns, but I'll definitely look into what happens during panic again.

@mdinger

This comment has been minimized.

Copy link
Contributor Author

@mdinger mdinger commented May 27, 2015

I take it this would/will be much easier once rust-lang/rfcs#1038 is implemented someday. Then at a minimum, this would be fixable shortly after that issue is.

@steveklabnik

This comment has been minimized.

Copy link
Member

@steveklabnik steveklabnik commented Dec 21, 2015

The lack of this shows up in the beginnings of the book, and it's really annoying. I've shown off arrays, I've shown off loops, I want to show for over arrays. But I haven't introduced method syntax yet, so for e in a.iter() is something new. But you won't often be calling iter() explicitly when you're using Vecs, but we haven't introduced Vec yet either.

@mitchmindtree

This comment has been minimized.

Copy link
Contributor

@mitchmindtree mitchmindtree commented Jul 22, 2016

Just thought I'd give this a +1 and add that I come across this quite frequently. I even occasionally resort to once(a).chain(once(b)).chain(once(c)) during some calls to flat_map just so I can remain on the stack when the only other alternatives are vec![a, b, c] or having to import another crate like arrayvec.

This is tricky to implement for an arbitrary T as panics need to be handled where a moved-out element isn't dropped

At least 90% of my use cases for this are for arrays where the elements are Copy, which I think avoids this issue as a type cannot impl both Copy and Drop. Perhaps it could be useful to first provide an IntoIterator implementation that requires the Copy bound, which may be removed later down the track if we're able to come up with a solution for non-Copy types? Alternatively, perhaps a .copy_iter() that takes self where T: Copy might be useful until a more generalised .into_iter() can be implemented?

Edit: I just want to add that often, .iter().cloned() is not a possible solution when your array has a very short lifetime, i.e.

let (r, g, b) = rgb();
let rgbas: Vec<_> = alphas.into_iter()
    .flat_map(|a| [r, g, b, a].iter().cloned()) // This fails due to an insufficient lifetime
    // .flat_map(|a| [r, g, b, a]) // This would be sooo nice
    .collect();
@Lisoph

This comment has been minimized.

Copy link

@Lisoph Lisoph commented Aug 10, 2017

+1, I also ran against this problem yesterday with this snippet (the Vec variant compiles, array does not). This is very surprising, especially for a beginner.

@kennytm

This comment has been minimized.

Copy link
Member

@kennytm kennytm commented Aug 10, 2017

As discussed in #32871, we either need:

  1. impl Trait to be stable, so we could secretly use the impl IntoIterator for [T; 0] to [T; 32] trick (probably also need rust-lang/rfcs#2071 to name the associated type), or
  2. Get rust-lang/rfcs#2000 merged + implemented first (assuming stable code can use it without stabilizing const-generic, otherwise we need to count in stabilization time as well)
bors added a commit that referenced this issue Mar 14, 2018
impl IntoIterator for arrays

This allows an array to move its values out through iteration.

This was attempted once before in #32871, but closed because the `IntoIter<T, [T; $N]>` type is not something we would want to stabilize.  However, RFC 2000's const generics (#44580) are now on the horizon, so we can plan on changing this to `IntoIter<T, const N: usize>` before stabilization.

Adding the `impl IntoIterator` now will allows folks to go ahead and iterate arrays in stable code.  They just won't be able to name the `array::IntoIter` type or use its inherent `as_slice`/`as_mut_slice` methods until they've stabilized.

Quite a few iterator examples were already using `.into_iter()` on arrays, getting auto-deref'ed to the slice iterator.  These were easily fixed by calling `.iter()` instead, but it shows that this might cause a lot of breaking changes in the wild, and we'll need a crater run to evaluate this.  Outside of examples, there was only one instance of in-tree code that had a problem.

Fixes #25725.

r? @alexcrichton
@orlp

This comment has been minimized.

Copy link
Contributor

@orlp orlp commented Oct 19, 2018

@kennytm impl Trait is now stable, so can this be implemented now?

@kennytm

This comment has been minimized.

Copy link
Member

@kennytm kennytm commented Oct 20, 2018

@orlp I think you could give it a try. @rust-lang/libs WDYT?

(We'll also need the corresponding clippy lint rust-lang/rust-clippy#1565 to reduce the damage of inference breakage after stabilization.)

@orlp

This comment has been minimized.

Copy link
Contributor

@orlp orlp commented Oct 20, 2018

@kennytm Actually, IntoIterator needs member type IntoIter: Iterator, so we need existential types first, as you said :(

@orlp

This comment has been minimized.

Copy link
Contributor

@orlp orlp commented Oct 20, 2018

@kennytm I did come up with a way to do IntoIter<T> instead of IntoIter<[T; N]>: https://gist.github.com/orlp/2c0a704c3a30f6203fcaa33aca941f1c.

But I don't really think that helps as a temporary solution, because in the end we still want IntoIter<T, N>.

@RReverser

This comment has been minimized.

Copy link
Contributor

@RReverser RReverser commented Oct 21, 2018

@orlp I changed your implementation slightly to avoid extra VarArray enum: https://gist.github.com/RReverser/bb9fd5250dba7e126f1df6dc6fb17662

Technically this is equivalent to your implementation, but avoids storing length twice both as an enum tag and as a separate field.

@orlp

This comment has been minimized.

Copy link
Contributor

@orlp orlp commented Oct 21, 2018

@RReverser I intentionally did not do that because it creates uninitialized memory, which is UB for T that have invalid bit patterns. Just creating uninitialized memory for those T is UB, even if you never access the uninitialized memory.

Here is a minimal example showing something that could happen: https://play.rust-lang.org/?version=stable&mode=debug&edition=2015&gist=23a761b2cda00df365dd11f128ee3409.

@RReverser

This comment has been minimized.

Copy link
Contributor

@RReverser RReverser commented Oct 21, 2018

@orlp Hmm, these answers ("just creating uninitialized memory") don't seem to make much sense, but I haven't dug into ManuallyDrop changes they mention. Either way, you can replace uninitialized with zeroed which will avoid this issue.

@orlp

This comment has been minimized.

Copy link
Contributor

@orlp orlp commented Oct 21, 2018

@RReverser Did you check my minimal example? zeroed is just as bad.

@RReverser

This comment has been minimized.

Copy link
Contributor

@RReverser RReverser commented Oct 21, 2018

@orlp That's a completely different issue and is related to layout optimisations, not zeroed or uninitialized behaviour.

@orlp

This comment has been minimized.

Copy link
Contributor

@orlp orlp commented Oct 21, 2018

@RReverser So you don't think it's undefined behavior that if I wrap your iterator type into an Option it could sometimes magically become None even when it's not?

@RReverser

This comment has been minimized.

Copy link
Contributor

@RReverser RReverser commented Oct 21, 2018

No, these optimisations are not UB albeit not explicitly defined for flexibility (these are different things). Point is, they are intended to use free bit patterns for enum tags, which is exactly what happens in your example, and it's a safe assumption compiler makes there. But if you actually fill that buffer with your own valid values, they are still guaranteed to come back the way you put them.

@orlp

This comment has been minimized.

Copy link
Contributor

@orlp orlp commented Oct 21, 2018

@RReverser But you don't fill the buffer with valid values! The buffer remains partially uninitialized when len() < MAX_ARRAY_INTO_ITER_SIZE.

So when T is, say, &i32 you now have a buffer with potentially a bunch of null references in them returned to the user. They're not accessible to the user, but that doesn't matter as shown in my example above (just wrapping the returned value in an Option can blow things up).

According to https://doc.rust-lang.org/reference/behavior-considered-undefined.html, that is very much UB.

@orlp

This comment has been minimized.

Copy link
Contributor

@orlp orlp commented Oct 21, 2018

@RReverser Here I made your code fail with 100% safe code: https://play.rust-lang.org/?version=stable&mode=debug&edition=2015&gist=d6957a01e83bfaf273f58360cc92a847

I only changed uninitialized for zeroed. Mind you that if I didn't do this, the bug would still exist but only pop up sometimes, which is significantly worse.

@RReverser

This comment has been minimized.

Copy link
Contributor

@RReverser RReverser commented Oct 21, 2018

Yeah, you're right, I also just realised that 0-size array would be still a problem (anything 1+ should work though because it's the first element that has to be properly filled to prevent these opts).

But to be reliably on the safe side, we'd need to replace the new ManuallyDrop (which seems to break many assumptions) with something like NoDrop instead.

@orlp

This comment has been minimized.

Copy link
Contributor

@orlp orlp commented Oct 21, 2018

Yeah, you're right, I also just realised that 0-size array would be still a problem (anything 1+ should work though because it's the first element that has to be properly filled to prevent these opts).

"Should work" isn't really the standard we should be going for. It only works until someone decides to change the way stuff is laid out in a future compiler version. What you're doing is explicitly mentioned as undefined behavior, and might break in the future.

This isn't just theoretical. The old versions of https://crates.io/crates/slotmap are yanked because I was unaware that simply creating invalid bit patterns (not accessing them) is defined as UB. It "worked" before, and then a compiler update came and all of a sudden my crate was unsafe.

It has nothing to do with ManuallyDrop. That's just the symptom. The disease is creating uninitialized memory for a type with invalid bit patterns.

That's why I made my version above, which is a bit hacky, but is perfectly safe.

@RReverser

This comment has been minimized.

Copy link
Contributor

@RReverser RReverser commented Oct 21, 2018

Creating uninitialized memory on its own is not UB, no matter the underlying type, it's accessing it afterwards (including via wrapper types) that can be if not careful.

Sorry, I'd continue this discussion, but I don't really appreciate the tone it starts getting. Hopefully that's just a limitation of the written text, and if so, happy to discuss another time in person.

@orlp

This comment has been minimized.

Copy link
Contributor

@orlp orlp commented Oct 21, 2018

@RReverser No particular tone is intended. I'm just trying to discuss the matter at hand, nothing more.

Creating uninitialized memory on its own is not UB, no matter the underlying type, it's accessing it afterwards (including via wrapper types) that can be if not careful.

The Rust reference states that having invalid values is incorrect:

Rust code, including within unsafe blocks and unsafe functions is incorrect if it exhibits any of the behaviors in the following list. It is the programmer's responsibility when writing unsafe code that it is not possible to let safe code exhibit these behaviors.

  • Dereferencing a null or dangling raw pointer.
    ...
  • Invalid values in primitive types, even in private fields and locals:
    • Dangling or null references and boxes.
    • A value other than false (0) or true (1) in a bool.
    • A discriminant in an enum not included in the type definition.
    • A value in a char which is a surrogate or above char::MAX.
    • Non-UTF-8 byte sequences in a str.

Note how null references are mentioned twice, once for dereferencing them, and another time for simply having them. But it's not just null references, using uninitialized for any of the other mentioned types would also be UB.

@SimonSapin

This comment has been minimized.

Copy link
Contributor

@SimonSapin SimonSapin commented Oct 21, 2018

The appropriate wrapper type for a partially-initialized array is std::mem::MaybeUninit.

bors bot added a commit to rust-lang/rust-clippy that referenced this issue Nov 2, 2018
Merge #3344
3344: Added lints `into_iter_on_ref` and `into_iter_on_array`. r=phansch a=kennytm

Fixes #1565.

`into_iter_on_array` is a correctness lint against `into_iter()` on `[T; n]` and `PathBuf` (please provide a concise noun that covers both types 🙃).

`into_iter_on_ref` is a style lint against `into_iter()` on `&C` where `C` is `[T]`, `Vec<T>`, `BTreeSet<T>` etc.

Both suggests replacing the `into_iter()` with `iter()` or `iter_mut()`.

`into_iter_on_array` is a correctness lint since it is very likely the standard library would provide an `into_iter()` method for `[T; n]` (rust-lang/rust#25725) and existing type inference of `[a, b, c].into_iter()` will be broken. `PathBuf` is also placed under this lint since `PathBuf::into_iter` currently doesn't exist and it makes some sense to implement `IntoIterator` on it yielding `OsString`s.

Co-authored-by: kennytm <kennytm@gmail.com>
@LukasKalbertodt LukasKalbertodt mentioned this issue Jul 24, 2019
5 of 5 tasks complete
@stephanemagnenat

This comment has been minimized.

Copy link

@stephanemagnenat stephanemagnenat commented Aug 5, 2019

As an experienced programmer in C++ but newcomer to Rust, I stumbled on this problem as well, for the same use case as mentioned before (processing pixel components).

@Diggsey

This comment has been minimized.

Copy link
Contributor

@Diggsey Diggsey commented Aug 5, 2019

It also comes up if you want to add some items to a Vec, eg.

let mut vec = ...;
if condition {
    vec.extend(["foo", "bar", "etc"]); // Error
}

And the easiest workaround (using vec![]) results in an unnecessary extra allocation.

@RReverser

This comment has been minimized.

Copy link
Contributor

@RReverser RReverser commented Aug 5, 2019

@Diggsey vec.extend([...].iter().copied()) is fairly easy too and would avoid allocation, although it would be still nice to avoid doing that manually.

@rockmenjack

This comment has been minimized.

Copy link

@rockmenjack rockmenjack commented Sep 3, 2019

Implicitly changing into_iter() to iter() for [T;N] also makes below statement inconsistent:

There are three common methods which can create iterators from a collection:
iter(), which iterates over &T.
iter_mut(), which iterates over &mut T.
into_iter(), which iterates over T.

@SimonSapin

This comment has been minimized.

Copy link
Contributor

@SimonSapin SimonSapin commented Sep 3, 2019

@rockmenjack We’re not changing it that way. We want to add <[T; N] as IntoIterator>::into_iter, which doesn’t exist today but would make things more consistent per the pattern that you quote. But since it doesn’t exist, code that does let x: [T; N] = …; x.into_iter(); can compile through auto-ref using <&[T; N] as IntoIterator>::into_iter which gives an iterator of &T. So we’d be changing the meaning of valid code and likely break it.

@rockmenjack

This comment has been minimized.

Copy link

@rockmenjack rockmenjack commented Sep 3, 2019

let x: [T; N] = …; x.into_iter(); can compile through auto-ref using <&[T; N] as IntoIterator>::into_iter which gives an iterator of &T. So we’d be changing the meaning of valid code and likely break it.

@SimonSapin this is what I meant inconsistent, call to [T;N]::into_iter() should be rejected by the compiler not auto-ref.
For a broader case, we shall block SomeTrait::some_fn(self) taking &T

@SimonSapin

This comment has been minimized.

Copy link
Contributor

@SimonSapin SimonSapin commented Sep 3, 2019

Auto-ref in the general case is important and useful. Removing it would break lot of foo.bar() method calls where foo is owned and bar borrows &self, which would have to be rewritten as (&foo).bar(). It’s not only not allowed by the language’s stability promise, but I think it wouldn’t be desirable at all.

@rockmenjack

This comment has been minimized.

Copy link

@rockmenjack rockmenjack commented Sep 3, 2019

Auto-ref in the general case is important and useful. Removing it would break lot of foo.bar() method calls where foo is owned and bar borrows &self, which would have to be rewritten as (&foo).bar(). It’s not only not allowed by the language’s stability promise, but I think it wouldn’t be desirable at all.

I agree with the case you mentioned, but that is bar(&self).
The function signature of into_iter() is:

fn into_iter(self) -> Self::IntoIter

user of into_iter might expect into_iter() taking the ownership of self which is not true at the moment for all types. And yeah, self can be &[T;N], just some what counterintuitive.

@SimonSapin

This comment has been minimized.

Copy link
Contributor

@SimonSapin SimonSapin commented Sep 3, 2019

It’s important to look at what the type of self is. In impl<'a, T> IntoIterator for &'a [T; 4] for example, the Self type is &'a [T; 4], not [T; 4]. The into_iter method does take ownership of its receiver, but that receiver itself is a reference that borrows the array.

Having that impl is important so that you can write a for loop like let x: [T; 4] = …; for i in &x { … } when you want to to iterate over &T references to the array’s items.

Centril added a commit to Centril/rust that referenced this issue Oct 25, 2019
…=scottmcm

Add by-value iterator for arrays

This adds an iterator that can iterate over arrays by value, yielding all elements by value. However, **this PR does _not_ add a corresponding `IntoIterator` impl for arrays**. The `IntoIterator` impl needs some discussion about backwards-compatibility that should take place in a separate PR. With this patch, this code should work (but there is currently still a bug):

```rust
#![feature(array_value_iter)]
use std::array::IntoIter;

let arr = [1, 2, 3];
for x in IntoIter::new(arr) {
    println!("{}", x);
}
```

**TODO**:
- [x] Get initial feedback
- [x] Add tests
- [x] Figure out why stage1 produces weird bugs ([comment](rust-lang#62959 (comment)))
- [x] Add UI tests as mentioned [here](rust-lang#62959 (comment)) (will do that soon-ish)
- [x] Fix [this new bug](rust-lang#62959 (comment))

**Notes for reviewers**
- Is the use of `MaybeUninit` correct here? I think it has to be used due to the `Clone` impl which has to fill the dead array elements with something, but cannot fill it with a correct instance.
- Are the unit tests sufficient?

CC rust-lang#25725
@LukasKalbertodt LukasKalbertodt linked a pull request that will close this issue Oct 25, 2019
2 of 6 tasks complete
@LukasKalbertodt

This comment has been minimized.

Copy link
Member

@LukasKalbertodt LukasKalbertodt commented Oct 25, 2019

FYI: an by-value iterator for arrays was implemented in #62959. It is still unstable and missing the actual IntoIterator impl for arrays. That impl will (maybe) be added in #65819, but some discussion regarding backwards compatibility is necessary first. In case you want to chime in.

bors added a commit that referenced this issue Oct 25, 2019
…r=<try>

Add `IntoIterator` impl for arrays by value (`for [T; N]`)

Closes #25725

Initially part of #62959, this PR adds this impl:

```rust
impl<T, const N: usize> IntoIterator for [T; N]
where
    [T; N]: LengthAtMost32
```

Tests and some docs were adjusted. There are probably more places where docs need adjustment, but I postponed doing a proper search until we decided whether we want this. Which brings us to...

---

## The backwards compatibility problem

Adding this impl is not as straight-forward as it seems: there are some backwards compatibility hazards. In particular, due to autoref, this compiles today:

```rust
let array = [true; 3];
for x in array.into_iter() {
    let _ = *x; // x is a reference
}
```

With this change, that code wouldn't compile anymore, as `x` inside the loop would have the type `bool` and not `&bool` (like it does today). One should note that this only happens when using the `.method` call syntax with `.into_iter()`. It does not affect `.iter()` (different method) and it does not affect the `for`-loop syntax (does not involve autoref).

There has been some discussion in #49000 and in #62959. Most agree that a crater run would be very useful. That's what this PR is for. But the fact that this change didn't break anything in the compiler is already promising.

### Arguments to add this `impl` despite the potential breakage:
- It doesn't really make sense to call `.into_iter()`, as `.iter()` is shorter and the `for` loop desugars to `into_iter()` anyway. So hopefully no one used this in the real world.
- [RFC 1105](https://github.com/nox/rust-rfcs/blob/master/text/1105-api-evolution.md) clearly specifies that "implementing any non-fundamental trait" is a "minor change". It also acknowledges that "implementing any existing trait can cause breakage", "However, as before, this kind of breakage is considered 'minor'".
- @scottmcm wrote [a comment](#62959 (comment)) that I (and apparently many others) completely agree with:
  > My personal opinion: having this impl is so obviously the correct thing that I'd be willing to bend stability guarantees to have it, but we don't even need to because adding a new trait impl is an allowed change, no matter whether it breaks code. And the only code that it breaks, today, is code that was doing `a.into_iter()` when `a.iter()` would have done exactly the same thing for less typing, so any workaround needed to not trigger this change will make the code strictly better regardless.

---

Finally, *if* we want to add this impl, we need to decide if/what we do to limit the fallout of potential breakage, like adding a clippy lint or making this a built-in rustc warning.

CC @Centril @Mark-Simulacrum @cuviper
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

You can’t perform that action at this time.