Add slice::ExactChunks and ::ExactChunksMut iterators #47126

Merged
@bors merged 11 commits into rust-lang:master from sdroege:exact-chunks on Jan 15, 2018

@sdroege
Contributor

commented Jan 2, 2018

These guarantee that every returned slice has exactly the requested size;
any leftover elements at the end are ignored. This allows LLVM to
get rid of bounds checks in the code using the iterator.

This is inspired by the same iterators provided by ndarray.

Fixes #47115

I'll add unit tests for all this if the general idea and behaviour make sense to everybody.
Also see #47115 (comment) for an example of what this improves.
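
To make the difference from `chunks` concrete, here is a small sketch. It is written against today's stable API, where the method proposed in this PR as `exact_chunks` is available as `chunks_exact`. The helper names `chunk_lens`, `exact_chunk_lens` and `sum_exact_chunks` are made up for illustration and are not part of the PR.

```rust
/// Lengths of the slices yielded by `chunks`: the last one may be shorter.
fn chunk_lens(v: &[i32], chunk_size: usize) -> Vec<usize> {
    v.chunks(chunk_size).map(|c| c.len()).collect()
}

/// Lengths of the slices yielded by `chunks_exact`: always `chunk_size`.
fn exact_chunk_lens(v: &[i32], chunk_size: usize) -> Vec<usize> {
    v.chunks_exact(chunk_size).map(|c| c.len()).collect()
}

/// Sums every full chunk and also returns the leftover elements that the
/// exact iterator skips, obtained through `ChunksExact::remainder`.
fn sum_exact_chunks(v: &[i32], chunk_size: usize) -> (Vec<i32>, &[i32]) {
    let iter = v.chunks_exact(chunk_size);
    // `remainder` hands back the tail that will never be yielded.
    let rest = iter.remainder();
    let sums: Vec<i32> = iter.map(|c| c.iter().sum::<i32>()).collect();
    (sums, rest)
}
```

With seven elements and a chunk size of three, `chunks` yields slices of length 3, 3 and 1, while the exact variant yields only the two full chunks. Because every yielded slice is known to have `chunk_size` elements, code that indexes into a chunk gives the optimizer a better chance of removing bounds checks.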

@rust-highfive

Collaborator

commented Jan 2, 2018

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @bluss (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@kennytm kennytm added the T-libs label Jan 2, 2018

} else {
let start = self.v.len() - self.chunk_size;
Some(&self.v[start..])
}

@bluss

bluss Jan 2, 2018

Contributor

We can use next_back here to avoid that code duplication.

}

#[unstable(feature = "exact_chunks", issue = "47115")]
impl<'a, T> ExactSizeIterator for ExactChunksMut<'a, T> {}

@bluss

bluss Jan 2, 2018

Contributor

We can implement is_empty since it is actually simpler than size_hint/len (saves the division)

} else {
let start = (self.v.len() - self.chunk_size) / self.chunk_size * self.chunk_size;
Some(&mut self.v[start..])
}

@bluss

bluss Jan 2, 2018

Contributor

Same, we can use next_back here. (The code here in ExactChunksMut::last is not updated to take advantage of the property that self.v is evenly divisible by the chunk size)

}

#[inline]
fn nth(&mut self, n: usize) -> Option<Self::Item> {

@bluss

bluss Jan 2, 2018

Contributor

I think this method can be much simpler if we just use the fact that self.v is evenly divisible by the chunk size. Same for the mutable version.

@bluss

bluss Jan 2, 2018

Contributor

My first try would be to use n to find the new start of self.v, then call self.next(). Maybe that can be improved upon.
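
A minimal sketch of what these review suggestions amount to, assuming a simplified stand-in type. `ExactChunksSketch` is a hypothetical name and this is not the actual libcore code; it only shows `nth` computing the new start and delegating to `next`, and `last` delegating to `next_back`, both relying on `v` being evenly divisible by `chunk_size`.

```rust
/// Simplified stand-in for the iterator in this PR (not the real std type).
struct ExactChunksSketch<'a, T> {
    v: &'a [T],
    chunk_size: usize,
}

impl<'a, T> ExactChunksSketch<'a, T> {
    fn new(slice: &'a [T], chunk_size: usize) -> Self {
        assert!(chunk_size != 0, "chunk_size must be non-zero");
        // Drop the leftover tail up front so `v` is always evenly divisible.
        let len = slice.len() - slice.len() % chunk_size;
        ExactChunksSketch { v: &slice[..len], chunk_size }
    }
}

impl<'a, T> Iterator for ExactChunksSketch<'a, T> {
    type Item = &'a [T];

    fn next(&mut self) -> Option<&'a [T]> {
        if self.v.len() < self.chunk_size {
            None
        } else {
            let (fst, snd) = self.v.split_at(self.chunk_size);
            self.v = snd;
            Some(fst)
        }
    }

    fn nth(&mut self, n: usize) -> Option<&'a [T]> {
        // Use n to find the new start of `v`, then let `next` do the rest.
        let (start, overflow) = n.overflowing_mul(self.chunk_size);
        if start >= self.v.len() || overflow {
            self.v = &[];
            None
        } else {
            let (_, snd) = self.v.split_at(start);
            self.v = snd;
            self.next()
        }
    }

    fn last(mut self) -> Option<&'a [T]> {
        // The last chunk is simply the first one seen from the back.
        self.next_back()
    }
}

impl<'a, T> DoubleEndedIterator for ExactChunksSketch<'a, T> {
    fn next_back(&mut self) -> Option<&'a [T]> {
        if self.v.len() < self.chunk_size {
            None
        } else {
            let (fst, snd) = self.v.split_at(self.v.len() - self.chunk_size);
            self.v = fst;
            Some(snd)
        }
    }
}
```

For a seven-element slice and a chunk size of three, this yields two chunks, and `nth(1)` and `last()` both return the second one.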

@bluss

Contributor

commented Jan 2, 2018

I've submitted code review, but I think we need the libs team and the ticky boxes to weigh in on whether to include this in libcore & libstd. I think these methods seem fine; a bit obscure (*). An unfortunate point is that these would be yet better with const generics and producing &[T; N] and &mut [T; N] respectively. (Unfortunate since such ideas mean that we need to wait for it to be available in Rust).

cc @rust-lang/libs

(*) Looking closer at the resulting difference it's not exactly obscure, it's essential functionality for that type of code

@sdroege

Contributor Author

commented Jan 2, 2018

@bluss Thanks for your review comments, I've updated everything accordingly. Still no tests yet, they'll come if this is considered a good idea :)

let (_, snd) = self.v.split_at(start);
self.v = snd;
assert!(self.v.len() == self.chunk_size);
self.next()

@bluss

bluss Jan 2, 2018

Contributor

Why the assertion? It doesn't look correct as written, maybe >= was intended?

I'd probably avoid the assertion, there will be a bounds check equivalent to it anyhow in next(?).

@sdroege

sdroege Jan 2, 2018

Author Contributor

Indeed, I was confused. Thanks!

@bluss

bluss approved these changes Jan 2, 2018

@sdroege

Contributor Author

commented Jan 2, 2018

@bluss was this what you were thinking of with regards to TrustedRandomAccess in #47115 (comment) ?

@sdroege

Contributor Author

commented Jan 3, 2018

The TrustedRandomAccess impl minimally changes the assembly in my testcase (same number of instructions, basically equivalent) but has no real effect on the performance.

@bluss

Contributor

commented Jan 3, 2018

@sdroege Yep, that's what I was thinking, and it's good to know that it doesn't change anything. I think it can still be a good optimization in other cases?

@sdroege

Contributor Author

commented Jan 3, 2018

I'm not entirely sure about that, also with regards to #47142. I'll do some benchmarking tomorrow.

Basically (for zip) we assume here that the compiler will optimize away the multiplication on each access. Without the TrustedRandomAccess it would only be an increment every time instead of a multiplication, if the compiler does not optimize the multiplication away.

I assume when the trait was added and the specialized implementation for zip, this was all measured and taken into account though.
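
To illustrate the two access patterns being compared here, a rough sketch follows. It is not the std implementation; `sum_chunks_indexed` and `sum_chunks_incremental` are hypothetical helpers that only show where the per-access multiplication comes from.

```rust
/// Counter-index style (what random access by index implies):
/// chunk `i` is located with a multiplication on every access.
fn sum_chunks_indexed(v: &[u32], chunk_size: usize) -> Vec<u32> {
    assert!(chunk_size != 0);
    let n = v.len() / chunk_size;
    (0..n)
        .map(|i| {
            let start = i * chunk_size; // multiplication per access
            v[start..start + chunk_size].iter().sum::<u32>()
        })
        .collect()
}

/// Increment style (what a plain `next` does):
/// the slice is advanced by `chunk_size` on every step.
fn sum_chunks_incremental(mut v: &[u32], chunk_size: usize) -> Vec<u32> {
    assert!(chunk_size != 0);
    let mut out: Vec<u32> = Vec::new();
    while v.len() >= chunk_size {
        let (head, tail) = v.split_at(chunk_size);
        out.push(head.iter().sum::<u32>());
        v = tail; // only an offset update, no multiplication
    }
    out
}
```

Both functions compute the same result; whether the multiplication in the first one costs anything depends on whether the optimizer can turn it into an increment.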

@bluss

Contributor

commented Jan 3, 2018

@sdroege I think you bring up details that matter; maybe a smarter zip specialization could be adopted that helps with that, or maybe even adding a more special case.

In my understanding, the current zip specialization helps with something "dumber" than that. In completely general .zip(), we have two input iterators, and for each iteration, we need to ask them both if they have a next element; and this didn't compile well for some common loops with slices at least at the time when the specialization was added. With zip specialization, we only need to check "is there a next element" one time per iteration, and this is true even if we have a n-ary combination of .zip()s.
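
A hand-written sketch of the difference described above, using a dot product over two slices. This only mirrors the shape of the general and the specialized `.zip()`; the function names are hypothetical and the real specialization lives inside the standard library.

```rust
/// Shape of the general `zip`: every iteration asks both iterators
/// whether they have a next element.
fn dot_two_checks(a: &[i64], b: &[i64]) -> i64 {
    let (mut ia, mut ib) = (a.iter(), b.iter());
    let mut acc = 0i64;
    loop {
        let x = match ia.next() {
            Some(x) => x,
            None => break,
        };
        let y = match ib.next() {
            Some(y) => y,
            None => break,
        };
        acc += x * y;
    }
    acc
}

/// Shape of the specialized `zip`: the common length is computed once,
/// so each iteration has a single "is there a next element" check.
fn dot_one_check(a: &[i64], b: &[i64]) -> i64 {
    let len = a.len().min(b.len());
    let (a, b) = (&a[..len], &b[..len]);
    let mut acc = 0i64;
    for i in 0..len {
        acc += a[i] * b[i];
    }
    acc
}
```

The second form extends naturally to an n-ary combination: the minimum length is still computed once and the loop keeps a single counter.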

@sdroege

Contributor Author

commented Jan 4, 2018

@bluss If you don't mind, I would suggest dropping the commit that adds the TrustedRandomAccess impl from this PR so that we can discuss the API itself instead, and moving that commit to another PR at a later time.

In the meantime I'll do some more analysis and benchmarking of the effect of the trait impl for the normal Chunks (and the other three).

> @sdroege I think you bring up details that matter; maybe a smarter zip specialization could be adopted that helps with that, or maybe even adding a more special case.

Something that just increments on each iteration would be useful, as that would not rely on the optimizer to get rid of the multiplication. Something like a next_unsafe that you must only call if you know that there is a next item. That also seems like it would solve the original purpose of the trait.

But this all seems like something to discuss in another issue.

@bluss

Contributor

commented Jan 4, 2018

Yes, let's split off that commit. Fwiw, both the current approach and a next unchecked approach were considered the first time. I think current was marginally better, but why not look at it again and with a wider use case.

@sdroege sdroege force-pushed the sdroege:exact-chunks branch from b306440 to 391d837 Jan 4, 2018

@sdroege

Contributor Author

commented Jan 4, 2018

Removed that commit. Now this should all be good to be reviewed for the actual new API.

I'm going to do some archaeology about the original implementation and reason for the trait being like this, and then open a new issue about it.

@sdroege

Contributor Author

commented Jan 4, 2018

@bluss Ok, you already did all that very same investigation I was going to do back then :) See #33090 (comment) and #33090 (comment)

Basically, the counter-index approach can be optimized better by LLVM than the pointer-increment approach. So I guess let's keep it at that and I'll re-add the commit here again. The chunks iterators are more or less the same as the normal slice iterators in every regard.


So in summary, I think the open questions here are the following:

  • Does it make sense to add such a specialized chunks variant?
    #47115 (comment) and #47115 (comment) would suggest so as it can improve performance a lot. It also potentially improves usability a bit as the code using the iterator can really assume that each slice will be exactly that many elements.

  • Should it use const generics?
    This would mean using &[T; N] instead of &[T] (with a fixed size the compiler can infer). The function signature would be considerably more complicated, as would calling it (exact_chunks(n) vs exact_chunks::<n>()), but it seems more explicit (you have the slice length directly in your types).
    It also has the disadvantage that having the function available and having it be stabilized would be coupled with const generics.
    But I think the biggest disadvantage, and a reason why having both might be useful, is that the chunk size would have to be always known at compile-time.
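
A rough sketch of what the const-generics variant from the second question could look like, assuming a compiler with const generics and building on the stable `chunks_exact` (the name under which `exact_chunks` is available today). The function name `fixed_chunks` and its signature are hypothetical, not an existing std API.

```rust
/// Yields `&[T; N]` instead of `&[T]`; leftover elements are skipped.
/// Like `chunks_exact`, this panics if `N` is 0.
fn fixed_chunks<'a, T, const N: usize>(
    slice: &'a [T],
) -> impl Iterator<Item = &'a [T; N]> {
    slice.chunks_exact(N).map(|chunk| {
        // Every chunk has exactly N elements, so the conversion from a
        // slice to an array reference cannot fail here.
        <&[T; N] as std::convert::TryFrom<&[T]>>::try_from(chunk)
            .expect("chunks_exact always yields N elements")
    })
}
```

Here the chunk size is part of the type, so `N` has to be known at compile time, which is exactly the limitation named in the last point above. It can be inferred from the use site, for example by collecting into a `Vec<&[u8; 2]>`.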

@bluss

bluss approved these changes Jan 6, 2018

@leonardo-m

commented Jan 6, 2018

> and a reason why having both might be useful, is that the chunk size would have to be always known at compile-time.

I agree that having both could be better, despite the larger API.

@sdroege sdroege force-pushed the sdroege:exact-chunks branch from 9b8d9c1 to 8aa1b90 Jan 9, 2018

@sdroege

Contributor Author

commented Jan 9, 2018

Rebased against latest master to solve a couple of merge conflicts.

@mbrubeck

Contributor

commented Jan 9, 2018

Should the documentation mention that these methods may be faster than the existing ones? This seems like important information for helping users choose the appropriate iterator for a given use case.

@sdroege

Contributor Author

commented Jan 9, 2018

> Should the documentation mention that these methods may be faster than the existing ones? This seems like important information for helping users choose the appropriate iterator for a given use case.

Done, thanks

@Kimundi

Member

commented Jan 9, 2018

I think the name exact_chunks for an API that works with dynamic, but fixed sizes is fine, and consistent with other std API like read_exact.

In the same sense, if we provided an API that works with const generics and [T; N], then we should call that fixed_chunks to be consistent with the naming of fixed-size arrays. This naturally leads to providing both APIs, in my opinion.

In the libs meeting, a possible concern was raised that it is not obvious that elements might get dropped at the end. I wonder if a solution to that would be what we did for copy_from_slice: just panic if the length of the slice is not evenly divisible into chunks, forcing the user to explicitly handle the case up front.
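
For comparison, a sketch of the panicking behaviour floated here, written as a wrapper around the stable `chunks_exact` (today's name for `exact_chunks`). The wrapper `exact_chunks_or_panic` is hypothetical; it only illustrates the alternative and is not what this PR implements.

```rust
use std::slice::ChunksExact;

/// Like `chunks_exact`, but refuses slices whose length is not a
/// multiple of `chunk_size` instead of silently skipping the tail.
fn exact_chunks_or_panic<T>(slice: &[T], chunk_size: usize) -> ChunksExact<'_, T> {
    assert!(chunk_size != 0, "chunk_size must be non-zero");
    assert!(
        slice.len() % chunk_size == 0,
        "slice length {} is not a multiple of chunk size {}",
        slice.len(),
        chunk_size
    );
    slice.chunks_exact(chunk_size)
}
```

With this variant the caller has to trim or otherwise handle the leftover elements before iterating, for example by slicing to `&v[..v.len() - v.len() % chunk_size]`.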

@alexcrichton

Member

commented Jan 10, 2018

The libs team discussed this today and was overall on board with landing this (@Kimundi commented above) so @bluss feel free to r+ when you're satisfied!

sdroege added some commits Jan 2, 2018

Apply review comments from @bluss
- Simplify nth() by making use of the fact that the slice is evenly
  divisible by the chunk size, and calling next() instead of
  duplicating it
- Call next_back() in last(), they are equivalent
- Implement ExactSizeIterator::is_empty()

Mention in the exact_chunks docs that this can often be optimized better by the compiler
And also link from the normal chunks iterator to the exact_chunks one.

Use assert_eq!() instead of assert!(a == b) in slice chunks_mut() unit test
This way more useful information is printed if the test ever fails.

Test the whole chunks instead of just an element in the chunks/chunks_mut tests
Easy enough to do and ensures that the whole chunk is as expected
instead of just the element that was looked at before.

Add unit tests for exact_chunks/exact_chunks_mut
These are basically modified copies of the chunks/chunks_mut tests.

@sdroege sdroege force-pushed the sdroege:exact-chunks branch from a196f41 to 5f4fc82 Jan 13, 2018

@sdroege

Contributor Author

commented Jan 13, 2018

Thanks, changed the "Fixes ..." to "See ...", everything else the same.

@bluss

Contributor

commented Jan 13, 2018

Thanks!

@bors r+

@bors

Contributor

commented Jan 13, 2018

📌 Commit 5f4fc82 has been approved by bluss

@sdroege

Contributor Author

commented Jan 13, 2018

> Thanks @sdroege. The tracking issue you set up for this is #47115, I'll edit it a bit and maybe you can put in the remaining questions.

I've added the open question there now (panic or skip left-over elements). I don't think there were any other questions here

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Jan 14, 2018

Rollup merge of rust-lang#47126 - sdroege:exact-chunks, r=bluss
Add slice::ExactChunks and ::ExactChunksMut iterators

bors added a commit that referenced this pull request Jan 15, 2018

Auto merge of #47435 - GuillaumeGomez:rollup, r=GuillaumeGomez
Rollup of 8 pull requests

- Successful merges: #47120, #47126, #47277, #47330, #47398, #47413, #47417, #47432
- Failed merges:

kennytm added a commit to kennytm/rust that referenced this pull request Jan 15, 2018

Rollup merge of rust-lang#47126 - sdroege:exact-chunks, r=bluss
Add slice::ExactChunks and ::ExactChunksMut iterators

bors added a commit that referenced this pull request Jan 15, 2018

Auto merge of #47445 - kennytm:rollup, r=kennytm
Rollup of 11 pull requests

- Successful merges: #47120, #47126, #47277, #47330, #47334, #47368, #47372, #47414, #47417, #47432, #47443
- Failed merges:

kennytm added a commit to kennytm/rust that referenced this pull request Jan 15, 2018

Rollup merge of rust-lang#47126 - sdroege:exact-chunks, r=bluss
Add slice::ExactChunks and ::ExactChunksMut iterators

bors added a commit that referenced this pull request Jan 15, 2018

Auto merge of #47445 - kennytm:rollup, r=kennytm
Rollup of 10 pull requests

- Successful merges: #47120, #47126, #47277, #47330, #47368, #47372, #47414, #47417, #47432, #47443
- Failed merges: #47334

@bors bors merged commit 5f4fc82 into rust-lang:master Jan 15, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
///
/// Due to each chunk having exactly `chunk_size` elements, the compiler
/// can often optimize the resulting code better than in the case of
/// [`chunks`].

@Kerollmops

Kerollmops Jan 20, 2018

Contributor

Whoops, it seems this is not referenced, resulting in a broken link.

@sdroege

sdroege Jan 21, 2018

Author Contributor

Indeed, thanks for noticing. I'll submit a PR later, not sure yet why... or why rustdoc does not error out on broken links. Oh well :)

@sdroege

sdroege Jan 21, 2018

Author Contributor

Ah understood why!

@Kerollmops

Kerollmops Jan 21, 2018

Contributor

why ? :)

@sdroege

sdroege Jan 21, 2018

Author Contributor

See sdroege@1756f68 . It seems like you have to provide "full" paths somewhere in your doc chunk for "shortened" links. rustdoc doesn't seem to check the current scope for things with the same name.

@Kerollmops

Kerollmops Jan 21, 2018

Contributor

I knew why: you didn't reference the links in the scope :)
I was asking more about:

> why rustdoc does not error out on broken links

@steveklabnik

steveklabnik Jan 22, 2018

Member

it should.....

///
/// Due to each chunk having exactly `chunk_size` elements, the compiler
/// can often optimize the resulting code better than in the case of
/// [`chunks_mut`].

@Kerollmops

Kerollmops Jan 20, 2018

Contributor

broken too.

sdroege added a commit to sdroege/rust that referenced this pull request Jan 21, 2018

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Jan 21, 2018

Rollup merge of rust-lang#47632 - sdroege:exact-chunks-docs-broken-links, r=kennytm

Fix broken links to other slice functions in chunks/chunks_mut/exact_chunk/exact_chunks_mut docs

See rust-lang#47126 (comment)

#[inline]
fn next(&mut self) -> Option<&'a [T]> {
if self.v.len() < self.chunk_size {

@Kerollmops

Kerollmops May 30, 2018

Contributor

This condition can probably be simplified to just self.v.is_empty(): we already know that the length of the slice is a multiple of chunk_size, so the only way the slice can be too short is if it is empty.

@sdroege

sdroege May 30, 2018

Author Contributor

Yes, but for that see also #47115 (comment)

#[inline]
fn nth(&mut self, n: usize) -> Option<Self::Item> {
let (start, overflow) = n.overflowing_mul(self.chunk_size);
if start >= self.v.len() || overflow {

@Kerollmops

Kerollmops May 30, 2018

Contributor

This condition seems wrong: we must test the overflow flag before comparing start against self.v.len(), because after an overflow start has wrapped around and can be smaller than self.v.len(), in which case the returned value would be wrong.

We probably want to panic here if the computation overflows, no?
https://doc.rust-lang.org/std/iter/trait.Iterator.html#panics

EDIT: I was wrong about the order of the conditions: this is a ||, so the order doesn't matter in this case.
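
The overflow case in question can be checked in isolation. The sketch below pulls the offset computation from `nth` into standalone functions; `nth_start_silent` mirrors the condition quoted above, while `nth_start_panicking` is a hypothetical variant showing the panicking behaviour suggested as an alternative. Neither function exists in std.

```rust
/// Start offset of chunk `n`, using the same check as the `nth` above:
/// `None` if the offset is out of range or the multiplication overflowed.
fn nth_start_silent(n: usize, chunk_size: usize, len: usize) -> Option<usize> {
    let (start, overflow) = n.overflowing_mul(chunk_size);
    // `start` may have wrapped to a small value, but `overflow` is then
    // true, so the `||` rejects it regardless of operand order.
    if start >= len || overflow {
        None
    } else {
        Some(start)
    }
}

/// Hypothetical alternative: panic on overflow instead of returning `None`.
fn nth_start_panicking(n: usize, chunk_size: usize, len: usize) -> Option<usize> {
    let start = n
        .checked_mul(chunk_size)
        .expect("chunk offset overflowed usize");
    if start >= len {
        None
    } else {
        Some(start)
    }
}
```

With `n = usize::MAX / 2 + 1` and a chunk size of 2 the product wraps around to 0, which would pass the range check on its own; the overflow flag is what makes the silent version return `None`.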

@sdroege

sdroege May 30, 2018

Author Contributor

You still think that an overflow should panic instead of doing nothing? This is currently the same behaviour as for the other chunks iterators and what was stabilized for them

@Kerollmops

Kerollmops May 30, 2018

Contributor

Yes, I think checked but silent overflow is not good behavior. But overflowing while addressing the nth element is really rare, so this is not something we should worry about.

@sdroege

sdroege May 30, 2018

Author Contributor

I agree but I think it's more problematic to have inconsistent behaviour between the different chunk iterators (and we can't change the stabilized existing ones). But if there's disagreement I'd be happy to change it

@Kerollmops

Kerollmops May 31, 2018

Contributor

If you think that consistency between the chunk iterators is important, then I think we should have consistency with all other iterators and panic if an overflow occurs, like the Enumerate adapter does. So it could be a great improvement to update the current implementation of Chunks/ChunksMut to follow this rule, don't you think?

@sdroege

sdroege May 31, 2018

Author Contributor

Can you open an issue about that? I would generally agree (and also think that panicking would be cleaner here) but changing Chunks/ChunksMut could be considered a breaking change

@Kerollmops

Kerollmops May 31, 2018

Contributor

Here is the issue #51254

@sdroege

sdroege May 31, 2018

Author Contributor

Thanks :)
