New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

impl IntoIterator for arrays #49000

Closed
wants to merge 1 commit into
base: master
from

Conversation

@cuviper
Copy link
Member

cuviper commented Mar 13, 2018

This allows an array to move its values out through iteration.

This was attempted once before in #32871, but closed because the IntoIter<T, [T; $N]> type is not something we would want to stabilize. However, RFC 2000's const generics (#44580) are now on the horizon, so we can plan on changing this to IntoIter<T, const N: usize> before stabilization.

Adding the impl IntoIterator now will allows folks to go ahead and iterate arrays in stable code. They just won't be able to name the array::IntoIter type or use its inherent as_slice/as_mut_slice methods until they've stabilized.

Quite a few iterator examples were already using .into_iter() on arrays, getting auto-deref'ed to the slice iterator. These were easily fixed by calling .iter() instead, but it shows that this might cause a lot of breaking changes in the wild, and we'll need a crater run to evaluate this. Outside of examples, there was only one instance of in-tree code that had a problem.

Fixes #25725.

r? @alexcrichton

impl<T> IntoIterator for [T; $N]
This allows an array to move its values out through iteration.  I find
this especially handy for `flat_map`, to expand one item into several
without having to allocate a `Vec`, like one of the new tests:

    fn test_iterator_flat_map() {
        assert!((0..5).flat_map(|i| [2 * i, 2 * i + 1]).eq(0..10));
    }

Note the array must be moved for the iterator to own it, so you probably
don't want this for large `T` or very many items.  But for small arrays,
it should be faster than bothering with a vector and the heap.
@cuviper

This comment has been minimized.

Copy link
Member Author

cuviper commented Mar 13, 2018

It comes up fairly often that users are surprised they can't directly iterate array values. For example, this reddit user didn't understand why the compiler was creating an iterator with reference items:
https://www.reddit.com/r/rust/comments/83z5zt/help_with_iterating_over_values_instead_of/

@kennytm

This comment has been minimized.

Copy link
Member

kennytm commented Mar 14, 2018

@bors try

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Mar 14, 2018

⌛️ Trying commit 032f93b with merge be75eb3...

bors added a commit that referenced this pull request Mar 14, 2018

Auto merge of #49000 - cuviper:array-intoiter, r=<try>
impl IntoIterator for arrays

This allows an array to move its values out through iteration.

This was attempted once before in #32871, but closed because the `IntoIter<T, [T; $N]>` type is not something we would want to stabilize.  However, RFC 2000's const generics (#44580) are now on the horizon, so we can plan on changing this to `IntoIter<T, const N: usize>` before stabilization.

Adding the `impl IntoIterator` now will allows folks to go ahead and iterate arrays in stable code.  They just won't be able to name the `array::IntoIter` type or use its inherent `as_slice`/`as_mut_slice` methods until they've stabilized.

Quite a few iterator examples were already using `.into_iter()` on arrays, getting auto-deref'ed to the slice iterator.  These were easily fixed by calling `.iter()` instead, but it shows that this might cause a lot of breaking changes in the wild, and we'll need a crater run to evaluate this.  Outside of examples, there was only one instance of in-tree code that had a problem.

Fixes #25725.

r? @alexcrichton
@@ -210,6 +214,21 @@ macro_rules! array_impls {
}
}

#[unstable(feature = "array_into_iter", issue = "0")]

This comment has been minimized.

@oli-obk

oli-obk Mar 14, 2018

Contributor

Trait implementations are still insta-stable, right?

This comment has been minimized.

@cuviper

cuviper Mar 14, 2018

Author Member

Oh, yeah. I'll need to update the truly unstable items with a tracking issue (if this PR is accepted), so I'll change the impls to stable then.

This comment has been minimized.

@cramertj

cramertj Mar 15, 2018

Member

Event if the IntoIter type is left as unstable, it will still be reachable through <<[u8; 0]> as IntoIter>::Iter>, right?

This comment has been minimized.

@cuviper

cuviper Mar 15, 2018

Author Member

True, you can write something like:

fn my_iter() -> <[u8; 0] as IntoIterator>::IntoIter {
    [].into_iter()
}

I think that's OK. We could change the associated type entirely without breaking such code.

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Mar 14, 2018

☀️ Test successful - status-travis
State: approved= try=True

@ishitatsuyuki

This comment has been minimized.

Copy link
Member

ishitatsuyuki commented Mar 14, 2018

I'm curious whether the discussion in postponed rust-lang/rfcs#2185 applies to this as well.

@cuviper

This comment has been minimized.

Copy link
Member Author

cuviper commented Mar 14, 2018

@ishitatsuyuki

I'm curious whether the discussion in postponed rust-lang/rfcs#2185 applies to this as well.

I was just following up on my older PR. I wasn't aware of that RFC, but it would apply, yes. If nothing else, I guess we'll have this PR as a proof of concept, that the implementation is feasible.

I don't agree with the conclusion though. I think this lack is a usability wart that needs fixing, and it is easily fixed. The IntoIter signature is the only piece lacking, and I think it's fine to leave that unstable now, having a clear path to stabilization with const generics.

@novacrazy

This comment has been minimized.

Copy link

novacrazy commented Mar 15, 2018

Do you mind if I incorporate some of this into the generic-array crate? I like how you've handled the Clone implementation and as_slice methods in particular.

@cuviper

This comment has been minimized.

Copy link
Member Author

cuviper commented Mar 15, 2018

@novacrazy I don't mind at all - go for it!

#[inline]
fn next(&mut self) -> Option<T> {
if self.index < self.index_back {
let p = &self.array.as_slice()[self.index];

This comment has been minimized.

@cramertj

cramertj Mar 15, 2018

Member

Should these indexes use ptr::offset so as not to acquire &[T] to a partially uninitialized [T]?

The same applies down in next_back.

This comment has been minimized.

@cuviper

cuviper Mar 15, 2018

Author Member

Is an uninitialized slice really a problem if we never read from those parts? I would think it's fine, otherwise sequences like mem::uninitialized() + ptr::write() would be a problem too.

This comment has been minimized.

@rkruppe

rkruppe Mar 15, 2018

Member

It's certainly problematic under the types-as-contracts view. AFAICT the unsafe code guidelines aren't solid enough to really give a hard answer yet, but it would certainly be better to not create invalid slices. Incidentially, yes, mem::uninitialized is problematic :)

This comment has been minimized.

@cuviper

cuviper Mar 15, 2018

Author Member

If that's the case, how can I even get the pointer base to offset from? Deref for ManuallyDrop will create a momentary reference here too. Or even with a local union type, getting the pointer to the field will require forming a reference first.

This comment has been minimized.

@cramertj

cramertj Mar 15, 2018

Member

I'd add an unstable ManuallyDrop::as_ptr and mark ManuallyDrop as repr(transparent). This will tell you that *const ManuallyDrop<T> points to the same place as*const T would.

This comment has been minimized.

@cuviper

cuviper Mar 16, 2018

Author Member

At the moment, repr(transparent) is only allowed on a struct. The RFC left unions as future work: rust-lang/rfcs#1758 (comment)

This comment has been minimized.

@rkruppe

rkruppe Mar 16, 2018

Member

I don't think you need repr(transparent) for this, since this is just a question of memory layout, not calling convention. repr(C) would suffice, or inside libstd we could even simply rely on what rustc currently does (which should place the union contents at the same address as the union).

This comment has been minimized.

@cramertj

cramertj Mar 16, 2018

Member

@rkruppe I've been thinking more about this, and if my conclusion in #49056 (comment) is correct, then this PR would be fine with the current "invalid slice" usage, since the values left in the slice would still be valid values of type T, just ones in the "dropped" typestate.

This comment has been minimized.

@cuviper

cuviper Mar 16, 2018

Author Member

FWIW, here's a comparison using pointers to avoid possibly dropped/uninit refs:
cuviper/rust@array-intoiter...cuviper:array-intoiter-ptr

I can submit the ManuallyDrop additions in a separate PR, if you think it's worth it.

since the values left in the slice would still be valid values of type T, just ones in the "dropped" typestate.

The Clone implementation will create truly uninitialized entries too. IMO the unsafe rules should not declare such references invalid if they're not read, but we can deal with it in pointers if necessary.

This comment has been minimized.

@cramertj

cramertj Mar 16, 2018

Member

Yes, that clone implementation is certainly become UB due to the uninitialized, but that's true of basically every use of mem::uninitialized, so we'll have to go clean them all up later no matter what once the uninit RFC lands

#[inline]
#[unstable(feature = "array_into_iter", issue = "0")]
pub fn as_mut_slice(&mut self) -> &mut [T] {
&mut self.array.as_mut_slice()[self.index..self.index_back]

This comment has been minimized.

@cramertj

cramertj Mar 15, 2018

Member

Same question here-- I think this implementation should use ptr::offset and from_raw_parts so as not to acquire a reference to a partially uninitialized slice.

This comment has been minimized.

@cuviper

cuviper Mar 15, 2018

Author Member

Same question as above, how can I get a base pointer at all without forming some reference?

This comment has been minimized.

@cramertj

cramertj Mar 15, 2018

Member

As above-- if you make ManuallyDrop repr(transparent), then you know that pointers to it will point directly to its element.

novacrazy added a commit to novacrazy/generic-array that referenced this pull request Mar 15, 2018

@xfix

xfix approved these changes Mar 16, 2018

@clarcharr

This comment has been minimized.

Copy link
Contributor

clarcharr commented Mar 16, 2018

Genuine question: considering how index and index_back are usize anyway, wouldn't it be reasonable to just replace them with pointers like the Iter and IterMut implementations? That way there's no extra offsets done in the iterator itself.

@ishitatsuyuki

This comment has been minimized.

Copy link
Member

ishitatsuyuki commented Mar 16, 2018

@clarcharr The Iterator may be moved, which changes the address.

@pietroalbini

This comment has been minimized.

Copy link
Member

pietroalbini commented Mar 20, 2018

Ping from triage @alexcrichton! This PR needs your review.

// Only values in array[index..index_back] are alive at any given time.
// Values from array[..index] and array[index_back..] are already moved/dropped.
array: ManuallyDrop<A>,
index: usize,

This comment has been minimized.

@alexcrichton

alexcrichton Mar 20, 2018

Member

One thing I've personally enjoyed doing historically is storing these two indices as Range<usize> as it helps make implementations for the iterator methods easier

@@ -1242,7 +1242,7 @@ pub trait Iterator {
/// // let's try that again
/// let a = [1, 2, 3];
///
/// let mut iter = a.into_iter();

This comment has been minimized.

@alexcrichton

alexcrichton Mar 20, 2018

Member

Oh dear, these changes mean that this is technically a breaking change, right?

This comment has been minimized.

@oli-obk

oli-obk Mar 20, 2018

Contributor

If I understand correctly what's going on, this "breaking change" is no different from the way adding a new method to Iterator which (if e.g. itertools already implements that method) would break all users of itertools who used the method syntax instead of the UFC syntax to call said method.

This comment has been minimized.

@alexcrichton

alexcrichton Mar 20, 2018

Member

Correct yeah. This doesn't mean it's impossible to land, just that it needs more scrutiny/review.

This comment has been minimized.

@scottmcm

scottmcm Mar 20, 2018

Member

This does feel like one of those changes that, while (allowed) breaking, people have wanted for a very long time and we definitely want to do at some point.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Mar 20, 2018

cc @rust-lang/libs, it looks like this may cause compiling code to stop compiling but fall under the umbrella of "accepted type inference breakage". Along those lines I wanted to cc others on this (especially as everything added is unstable) to make see if anyone's got concerns.

I'd personally prefer to hold off on stabilizing these for now until we can get the true IntoIter<T, N> rather than IntoIter<T, [T; N]> if we can, but I think otherwise we may want to run crater on this PR to assess the possible breakage

@oli-obk

This comment has been minimized.

Copy link
Contributor

oli-obk commented Mar 21, 2018

If we hold off we should at least create a future compat lint that suggests iter over into_iter, so the breakage will be less severe

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Mar 21, 2018

I think we should merge this, but not stabilize the types until they have the correct signature. However I think you'll be able to use them on stable (without naming the type) because the IntoIterator impl will be stable. I'm fine with this, the only part I'm not fine with stabilizing is the signature, which you can't get yet.

I'd also like to see a crater run, but unless its really bad I'm not that perturbed because it feels like you're walking into breakage by calling into_iter to get the slice iterator on an array.

@ishitatsuyuki

This comment has been minimized.

Copy link
Member

ishitatsuyuki commented Mar 21, 2018

It's terrifying enough that we already had code subject to breakage in our code base. If traits are insta-stable (or they can't act if they doesn't exist if the feature is disabled), we cannot really just merge this.

@cuviper

This comment has been minimized.

Copy link
Member Author

cuviper commented Mar 21, 2018

We could use inherent methods to have the iterator without the IntoIterator collision, like IntoIter::new(A) or [T; $N]::iter_values(self). But these would be less ergonomic, and only usable on nightly until we stabilize the type signature.

Could we make the impl IntoIterator only visible to 2018 edition? It couldn't be a simple #[cfg(2108)], since core's edition is independent of the user's. Maybe some other attribute could mask its visibility?

I'm game for adding a compat/deprecation warning, but I'm not sure how to do this, at least not at the library level. I guess it would have to be checked as a special case within the compiler?

@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented Mar 21, 2018

Shadowing existing […].into_iter() calls is business as usual when adding methods or impls to standard library types. (Though it looks like we have yet to do that crater run.)

I’m much more uncomfortable with merging this if the feature is effectively insta-stable while we have definite plans to change the "shape" of the return types later, even if these types can’t be named directly yet.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Mar 22, 2018

cc @rust-lang/infra, can this get a crater run?

@bors: try

@shepmaster

This comment has been minimized.

Copy link
Member

shepmaster commented Apr 30, 2018

Thanks for the PR, @cuviper ! Since we haven't heard from you in a few weeks, I'm going to close this for now to keep the queue tidy. Please feel free to reopen this PR when you have time to address the issues, thanks!

@shepmaster shepmaster closed this Apr 30, 2018

@cramertj

This comment has been minimized.

Copy link
Member

cramertj commented Jul 13, 2018

@cuviper Did you ever get the clippy lint in? Were you able to make progress on fixing the errors?

@cuviper

This comment has been minimized.

Copy link
Member Author

cuviper commented Jul 13, 2018

I noted my WIP branch in rust-lang/rust-clippy#1565 (comment). It appears to work for arrays and slices, but I haven't gone out and started fixing things, except for the few instances that cropped up in clippy itself. That clippy issue is also requesting more general support for collections.

@cramertj

This comment has been minimized.

Copy link
Member

cramertj commented Jan 4, 2019

@cuviper Ping on this-- what's the status here? I see that rust-lang/rust-clippy#3344 was merged back in October. What are the next steps?

@cuviper

This comment has been minimized.

Copy link
Member Author

cuviper commented Jan 11, 2019

We should get a feel for how much that clippy lint actually triggers on large codebases -- e.g. across rustc itself once #56595 lands. Is it possible to run crater for specific clippy lints too?

@cramertj cramertj referenced this pull request Jan 11, 2019

Closed

Test Clippy Lint #388

@cramertj

This comment has been minimized.

Copy link
Member

cramertj commented Jan 11, 2019

It isn't possible today. I've opened rust-lang-nursery/crater#388 to track this. cc @pietroalbini.

@ishitatsuyuki

This comment has been minimized.

Copy link
Member

ishitatsuyuki commented Jan 12, 2019

I think this lint has record of triggering on std (vague memory), and I think it's not possible to apply this patch without some isolation (however feature-gate does not apply to traits, and edition-gate cannot be used for this purpose as well), given that this lint has triggered on at least one of my project I contribute to.

@cuviper

This comment has been minimized.

Copy link
Member Author

cuviper commented Jan 14, 2019

I think this will technically qualify as a minor breaking change -- the new trait implementation could break some existing code, but is easily worked around. The same is true of many library additions. But crater can help judge how pervasive this is, so we can decide if it's too much headache.

A fallback position is that we forgo IntoIterator for arrays, but still provide some other explicit method to create that iterator. It would be a little sad to lose the magic associated with IntoIterator though.

@clarcharr

This comment has been minimized.

Copy link
Contributor

clarcharr commented Jan 14, 2019

Perhaps we could still add an into_iter method for arrays that isn't IntoIterator::into_iter?

@cuviper

This comment has been minimized.

Copy link
Member Author

cuviper commented Jan 14, 2019

No, it's precisely the method call syntax that gets us into trouble, where auto-deref currently resolves array.into_iter() to the slice method. An inherent array into_iter() would break that just as much as the trait method does. We would have to call it something else, and document why it's not idiomatic.

Direct uses of IntoIterator, like for loops and the zip argument, wouldn't be a problem because you can't use array values there at all yet.

@cuviper

This comment has been minimized.

Copy link
Member Author

cuviper commented Jan 14, 2019

Hmm, we could add an inherent array into_iter() that forwards to slice, and then have our array IntoIterator by value. So auto-deref method resolution would still find the "legacy" behavior, while arrays would be properly iterable elsewhere.

@Mark-Simulacrum

This comment has been minimized.

Copy link
Member

Mark-Simulacrum commented Jan 14, 2019

To expand on that a little, why not even mark the inherent into_iter as a deprecated method, perhaps in some future release? That would give us the warning to current users more firmly, and presumably, at some later point, we can change into_iter on arrays to return the by-value iterator. This would perhaps be done at the next edition if we figure out some way to cfg methods on edition boundaries, but I think figuring that out concretely isn't super important to do right now.

In hindsight, it might be good to recommend that any collection-like types implementing Deref to another collection (e.g., slices, vec) should always "reserve" their into_iter ahead of time (i.e., by implementing it with a panic or similar) if they don't want to implement it or can't implement it yet. Might be interesting for the libs team to consider adding to the API guidelines.

@scottmcm

This comment has been minimized.

Copy link
Member

scottmcm commented Jan 14, 2019

This feels like one of those things we just need to do, even if it's a bit painful. There are so many places just calling out for this, like being able to do .flat_map(|x| [x, x]).

@xfix

This comment has been minimized.

Copy link
Contributor

xfix commented Jan 14, 2019

What about doing something like this?

  • Add a insta-deprecated method called into_iter method for arrays that returns slice::Iter<T>.
  • Implement IntoIterator for arrays.

Explicitly calling into_iter for arrays in such a scenario will call the deprecated into_iter method, as it's an inherent method, not a trait method. The depreciation warning will allow developers to update their code to not call into_iter method.

@cuviper

This comment has been minimized.

Copy link
Member Author

cuviper commented Jan 14, 2019

@xfix Yes, that's what @Mark-Simulacrum and I are talking about in our most recent comments.

I guess we would need a new #[lang = "array"] for this, but there might be more to it:

error[E0118]: no base type found for inherent implementation
   --> src/libcore/array.rs:194:21
    |
194 |               impl<T> [T; $N] {
    |                       ^^^^^^^ impl requires a base type
@pietroalbini

This comment has been minimized.

Copy link
Member

pietroalbini commented Jan 27, 2019

It is now possible to start Crater runs with mode=clippy. Thanks @ecstatic-morse!

@cramertj

This comment has been minimized.

Copy link
Member

cramertj commented Jan 28, 2019

I don't think I have permissions for this, but I can give it a shot and maybe someone else will pick it up for me when it fails ;)

@craterbot run name=clippy-test-run mode=clippy start=stable end=stable+rustflags=-Dclippy::into_iter_on_array

@craterbot

This comment has been minimized.

Copy link
Collaborator

craterbot commented Jan 28, 2019

👌 Experiment clippy-test-run created and queued.
🔍 You can check out the queue and this experiment's details.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot

This comment has been minimized.

Copy link
Collaborator

craterbot commented Jan 28, 2019

🚧 Experiment clippy-test-run is now running on agent aws-1.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot

This comment has been minimized.

Copy link
Collaborator

craterbot commented Jan 29, 2019

🎉 Experiment clippy-test-run is completed!
📊 11872 regressed and 1 fixed (50551 total)
📰 Open the full report.

⚠️ If you notice any spurious failure please add them to the blacklist!
ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@pietroalbini

This comment has been minimized.

Copy link
Member

pietroalbini commented Jan 29, 2019

Tons of failures with:

[INFO] [stderr] error[E0602]: unknown lint: `clippy::into_iter_on_array`
[INFO] [stderr]   |
[INFO] [stderr]   = note: requested on the command line with `-D clippy::into_iter_on_array`
@ecstatic-morse

This comment has been minimized.

Copy link
Contributor

ecstatic-morse commented Jan 29, 2019

It seems passing clippy lints via RUSTFLAGS fails for projects which have a build script.

@cuviper

This comment has been minimized.

Copy link
Member Author

cuviper commented Feb 11, 2019

We discussed the inherent-method trick at all-hands, but discovered that it doesn't actually work on bare values. See in this playground that the final bare call does not hit the inherent method. Method resolution will choose our new <[T; N] as IntoIterator>::into_iter(self) before auto-ref'ing to call an inherent into_iter(&self), and the latter must be a reference to borrow for slice::Iter. So it seems we can't actually cover and deprecate the call we need, unless we involve some further compiler hackery.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment