Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add implementations of last in terms of next_back on a bunch of DoubleEndedIterators #60130

Merged
merged 1 commit into from May 14, 2019

Conversation

Projects
None yet
10 participants
@khuey
Copy link
Contributor

commented Apr 20, 2019

Provided a DoubleEndedIterator has finite length, Iterator::last is equivalent to DoubleEndedIterator::next_back. But searching forwards through the iterator when it's unnecessary is obviously not good for performance. I ran into this on one of the collection iterators.

I tried adding appropriate overloads for a bunch of the iterator adapters like filter, map, etc, but I ran into a lot of type inference failures after doing so.

The other interesting case is what to do with Repeat. Do we consider it part of the contract that Iterator::last will loop forever on it? The docs do say that the iterator will be evaluated until it returns None. This is also relevant for the adapters, it's trivially easy to observe whether a Map adapter invoked its closure a zillion times or just once for the last element.

@rust-highfive

This comment has been minimized.

Copy link
Collaborator

commented Apr 20, 2019

r? @Mark-Simulacrum

(rust_highfive has picked a reviewer for you, use r? to override)

@the8472

This comment has been minimized.

Copy link

commented Apr 20, 2019

The docs do say that the iterator will be evaluated until it returns None

Maybe the docs should be adjusted to mention that this only applies to the default implementation.

@Mark-Simulacrum

This comment has been minimized.

Copy link
Member

commented Apr 20, 2019

@sfackler

This comment has been minimized.

Copy link
Member

commented Apr 20, 2019

I think the behavior we want is that the method should behave as if the iterator was advanced until it returns None. So doing this for e.g. slice::Iter is totally fine since it doesn't have side effects, but not for iter::Repeat since it never returns None or iter::Map since it does have side effects.

@khuey

This comment has been minimized.

Copy link
Contributor Author

commented Apr 20, 2019

That's somewhat unfortunate because it means this optimization can't apply to any combinator.

@sfackler

This comment has been minimized.

Copy link
Member

commented Apr 20, 2019

It would also be unfortunate to break code that is using these methods as documented.

@khuey

This comment has been minimized.

Copy link
Contributor Author

commented Apr 20, 2019

Indeed.

@scottmcm

This comment has been minimized.

Copy link
Member

commented Apr 24, 2019

Could we just lint the use of last() on DEIs?

@khuey

This comment has been minimized.

Copy link
Contributor Author

commented Apr 24, 2019

Could we just lint the use of last() on DEIs?

To raise just one issue, in generic code you may not even know that your Iterator is a DoubleEndedIterator.

@scottmcm

This comment has been minimized.

Copy link
Member

commented Apr 25, 2019

@khuey My experience in C++ is that having things (such as C++'s advance) be sometimes constant and sometimes linear is actually a misfeature, as it causes performance surprises once someone tries to use it with more complex iterators.

@the8472

This comment has been minimized.

Copy link

commented Apr 25, 2019

@scottmcm on the other hand you might end up wasting joules of the many to reduce surprises of the few.

@khuey

This comment has been minimized.

Copy link
Contributor Author

commented Apr 26, 2019

@khuey My experience in C++ is that having things (such as C++'s advance) be sometimes constant and sometimes linear is actually a misfeature, as it causes performance surprises once someone tries to use it with more complex iterators.

I find this uncompelling for all the obvious reasons, so I'll just say that method specialization for performance has prior art in stdlib. e.g. every specialization of Read::read_exact, Seek::stream_position, etc.

@scottmcm

This comment has been minimized.

Copy link
Member

commented Apr 26, 2019

Really, I'm kinda surprised that .last() exists at all. .first() doesn't exist, and most things that need the back (like .rev()) require DEI. I feel like it's there for legacy, but might not be something we'd be willing to add these days. (I'd personally be tempted to just deprecate it, but that's probably overly aggressive.)

@khuey

This comment has been minimized.

Copy link
Contributor Author

commented May 14, 2019

Ping?

@sfackler

This comment has been minimized.

Copy link
Member

commented May 14, 2019

Oops, lost track of this!

@bors r+

@bors

This comment has been minimized.

Copy link
Contributor

commented May 14, 2019

📌 Commit 3e86cf3 has been approved by sfackler

@bors

This comment has been minimized.

Copy link
Contributor

commented May 14, 2019

🌲 The tree is currently closed for pull requests below priority 500, this pull request will be tested once the tree is reopened

@khuey

This comment has been minimized.

Copy link
Contributor Author

commented May 14, 2019

ta

Centril added a commit to Centril/rust that referenced this pull request May 14, 2019

Rollup merge of rust-lang#60130 - khuey:efficient_last, r=sfackler
Add implementations of last in terms of next_back on a bunch of DoubleEndedIterators

Provided a `DoubleEndedIterator` has finite length, `Iterator::last` is equivalent to `DoubleEndedIterator::next_back`. But searching forwards through the iterator when it's unnecessary is obviously not good for performance. I ran into this on one of the collection iterators.

I tried adding appropriate overloads for a bunch of the iterator adapters like filter, map, etc, but I ran into a lot of type inference failures after doing so.

The other interesting case is what to do with `Repeat`. Do we consider it part of the contract that `Iterator::last` will loop forever on it? The docs do say that the iterator will be evaluated until it returns None. This is also relevant for the adapters, it's trivially easy to observe whether a `Map` adapter invoked its closure a zillion times or just once for the last element.

bors added a commit that referenced this pull request May 14, 2019

Auto merge of #60834 - Centril:rollup-fikyi9i, r=Centril
Rollup of 9 pull requests

Successful merges:

 - #60130 (Add implementations of last in terms of next_back on a bunch of DoubleEndedIterators)
 - #60443 (as_ptr returns a read-only pointer)
 - #60444 (forego caching for all participants in cycles, apart from root node)
 - #60719 (Allow subdirectories to be tested by x.py test)
 - #60780 (fix Miri)
 - #60788 (default to $ARCH-apple-macosx10.7.0 LLVM triple for darwin targets)
 - #60799 (Allow late-bound regions in existential types)
 - #60808 (Improve the "must use" lint for `Future`)
 - #60819 (submodules: update clippy from 3710ec5 to ad3269c)

Failed merges:

r? @ghost

@bors bors merged commit 3e86cf3 into rust-lang:master May 14, 2019

1 check passed

Travis CI - Pull Request Build Passed
Details
@adlerd

This comment has been minimized.

Copy link

commented May 22, 2019

Why was this merged when it breaks the documented contract for the function, as noted above? The documentation hasn't even been updated to reflect the change! Here it is:

This method will evaluate the iterator until it returns None.

Having used rust for many years now, I'm shocked to learn that the team believes that documentation of "provided" trait methods only applies to the default implementation. That does not at all align with... well, any other trait in the library.

To be very clear, this change did not only change iterators where the difference cannot be observed. To wit:

impl<I> Iterator for Rev<I> where I: DoubleEndedIterator {
    ...
    #[inline]
    fn last(mut self) -> Option<Self::Item> {
        self.next_back()
    }
    ...
}

This change breaks the documented contract in an observable way.

The following program produces different side-effects before and after this change:

fn main() {
    let mut iter = [1, 2, 3, 4, 5].iter().inspect(|_| { println!("next()") }).rev();
    
    let _ = iter.last();
}

(@sfackler who reviewed because I'm not sure whether anyone will see this otherwise.)

@sfackler

This comment has been minimized.

Copy link
Member

commented May 22, 2019

Oh, oops. I didn't notice that the Rev implementation was still in this change. That part definitely needs to be reverted.

@sfackler

This comment has been minimized.

Copy link
Member

commented May 22, 2019

Actually, all of the slice overrides are also observable, since they call the user-provided closure. I'm just going to revert the whole PR.

@sfackler

This comment has been minimized.

Copy link
Member

commented May 22, 2019

@adlerd

This comment has been minimized.

Copy link

commented May 22, 2019

@sfackler, is there documentation on adding unit tests to std? I'd love to contribute a test or two which would have failed. I've improved a doc test before but that's probably a much simpler exercise.

Of course, I guess it has to be added separately for each iterator type so it may not really make sense; maybe you can provide guidance there too. Maybe a standard suite of tests for each iterator impl?

@the8472

This comment has been minimized.

Copy link

commented May 22, 2019

Arguably this is a flaw in the iterator API design. The java stream APIs mandate statelessness and side-effect freedom for most of their intermediate operations and by default don't guarantee intermediate operations to be invoked in any particular order, or even at all. Exceptions to that general policy are spelled out on individual operations.

https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/stream/package-summary.html

This allows fairly aggressive optimization of pipelines by fusing intermediate ops to produce the same end results.

@sfackler

This comment has been minimized.

Copy link
Member

commented May 22, 2019

Adding tests to std is basically the same as adding them to any other crate. For example, btree iterator tests are here: https://github.com/rust-lang/rust/blob/master/src/liballoc/tests/btree/map.rs

It seems like it should be possible to make a standard iterator test suite by having a function that compares an iterable to a slice of expected values.

@sfackler

This comment has been minimized.

Copy link
Member

commented May 22, 2019

@the8472 that doesn't really seem actionable in any way?

@khuey

This comment has been minimized.

Copy link
Contributor Author

commented May 22, 2019

Wait, the change that's observable here are the things that invoke user defined callbacks, which, fair enough.

But what's wrong with Rev?

@sfackler

This comment has been minimized.

Copy link
Member

commented May 22, 2019

Rev can wrap an iterator with side effects.

@adlerd

This comment has been minimized.

Copy link

commented May 22, 2019

Rev enabled my "failure" case (really, "observably different", but I could have structured it as a contrived
regression). Without Rev it wouldn't have worked because inspect()'s DEI impl obviously didn't get the last() change. That is, Rev "wrapped" the obviously-side-effectful inspect. Actually I don't understand sfackler's comment that this is observable for slices; I think they're fine because a slice Iter as the outermost can't have side-effects.

@the8472

This comment has been minimized.

Copy link

commented May 22, 2019

@sfackler at least it is something to consider when designing adding new methods on Iterator. Keep the guarantees to an absolute minimum.

@khuey

This comment has been minimized.

Copy link
Contributor Author

commented May 22, 2019

Ah, right, Rev can wrap an arbitrary iterator, ok.

@sfackler

This comment has been minimized.

Copy link
Member

commented May 22, 2019

I wasn't talking about slice::Iter, but rather things like slice::Split, where a user-defined callback is invoke in next and next_back.

@khuey

This comment has been minimized.

Copy link
Contributor Author

commented May 22, 2019

@the8472 unfortunately rust doesn't have a FnPure that would enable these sorts of optimizations.

@adlerd

This comment has been minimized.

Copy link

commented May 22, 2019

@sfackler, wow, yeah, I missed that. It's not 100% clear that Split actually guarantees calling the closure on each element when consumed the way inspect() does, but it's pretty clearly implied, though.

@the8472

This comment has been minimized.

Copy link

commented May 22, 2019

@khuey I don't think pure functions are necessary for most optimizations even though they are sufficient. Making explicit non-guarantees about how/when they are invoked gets you most of the way.

@khuey

This comment has been minimized.

Copy link
Contributor Author

commented May 22, 2019

env::Args isn't safe for this optimization either because it could skip past a panic we would otherwise hit.

@timvermeulen

This comment has been minimized.

Copy link
Contributor

commented May 23, 2019

in generic code you may not even know that your Iterator is a DoubleEndedIterator.

@khuey Did you have any specific use case in mind here? I haven't been able to come up with examples of any useful calls to last on an iterator that may or may not implement DoubleEndedIterator (other than for implementing last on iterator adaptors).

@RalfJung

This comment has been minimized.

Copy link
Member

commented May 23, 2019

FWIW, liballoc tests are split across the liballoc crate itself (like normal unit tests) and the dedicated test crate at tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.