
Tracking Issue for Iterator::next_chunk #98326

Open
1 of 3 tasks
rossmacarthur opened this issue Jun 21, 2022 · 19 comments
Labels
A-autovectorization Issue related to autovectorization, which can impact perf or code size. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@rossmacarthur
Contributor

rossmacarthur commented Jun 21, 2022

Feature gate: #![feature(iter_next_chunk)]

This is a tracking issue for the .next_chunk() method on iterators, which advances the iterator by N elements and returns them as an array [T; N].

Public API

use core::array;

trait Iterator {
    type Item;
    
    fn next_chunk<const N: usize>(&mut self) -> Result<[Self::Item; N], array::IntoIter<Self::Item, N>>
    where
        Self: Sized;
}

Steps / History

Unresolved Questions

  • Naming: other options include next_array() or next_array_chunk().
  • Should we also add next_chunk_back to DoubleEndedIterator?
  • How should we handle N = 0?
@rossmacarthur rossmacarthur added C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Jun 21, 2022
@ChayimFriedman2
Contributor

Should this method support N = 0? itertools does not, which sounds plausible; however, if we want to do the same, we will share the concern of <[T]>::array_chunks(), which cannot be stabilized until generic_const_exprs (or some subset of it) is stabilized.

@the8472
Member

the8472 commented Jun 25, 2022

I'm wondering about the intended uses of this API.

If it is for performance (autovectorization) then it seems fairly brittle. I just tried doing this benchmark:

// Requires #![feature(test, iter_next_chunk)] and `extern crate test;`.
use test::{black_box, Bencher};

#[bench]
fn bench_next_chunk(b: &mut Bencher) {
    let v = vec![13u8; 2048];

    b.iter(|| {
        const CHUNK: usize = 8;

        let mut sum = [0u32; CHUNK];
        let mut iter = black_box(v.clone()).into_iter();

        // Sum the bytes lane-wise, CHUNK at a time; the final partial
        // chunk (if any) is discarded when next_chunk returns Err.
        while let Ok(chunk) = iter.next_chunk::<CHUNK>() {
            for i in 0..CHUNK {
                sum[i] += chunk[i] as u32;
            }
        }

        sum
    })
}

and it took 17,272 ns/iter when compiled with target-cpu=znver2, while a hand-rolled implementation took 211 ns/iter. I encountered some 10x slowdowns in a few other combinations. Sometimes codegen-units=1 produced even worse assembly because the vectorizer did silly things such as loading one byte at a time into a SIMD register with vpinsrb.
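For comparison, a hand-rolled reduction might look like the following (an illustrative sketch, not necessarily the exact code that measured 211 ns/iter); chunks_exact gives the vectorizer a clean, fixed-width inner loop:

```rust
// Illustrative hand-rolled equivalent of the benchmark body above
// (assumption: the actual benchmarked code may have differed).
fn chunked_sum(v: &[u8]) -> [u32; 8] {
    let mut sum = [0u32; 8];
    // chunks_exact yields &[u8] slices of length exactly 8, which the
    // compiler can widen into SIMD lane-wise additions.
    for chunk in v.chunks_exact(8) {
        for i in 0..8 {
            sum[i] += chunk[i] as u32;
        }
    }
    sum
}

fn main() {
    let v = vec![13u8; 2048];
    // 2048 / 8 = 256 full chunks, so every lane sums to 256 * 13 = 3328.
    assert_eq!(chunked_sum(&v), [3328u32; 8]);
}
```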

And that's with direct, linear memory access on a Vec::IntoIter, which should be relatively straightforward to turn into masked SIMD loads. Many iterators yield references. A [&T; N] would need a gather load, which is expensive and only available on newer CPUs; this is unlike slice::array_chunks, which yields &[T; N]. We could try specializing slice.iter().next_chunk(), but then you might as well use array_chunks. And of course it gets more complex with additional adapters.

Another limitation is that it can't be used in a for-in loop or chained further with other iterator adapters, an issue #92393 didn't have.

@Lokathor
Contributor

Lokathor commented Sep 7, 2022

This is probably an example of a case where you should first use copied(), then chunk it, then turn that array into a SIMD type for the work, before possibly converting back at the end of your loop.

And the change into and out of SIMD form has a cost, so you'll want to ensure you're doing enough work inside each loop pass, or that cost might exceed the cost of the scalar-only code.

@the8472
Member

the8472 commented Sep 7, 2022

Yeah, that was just an example to illustrate that the default implementation is suboptimal; we'll need optimized implementations on most linear-memory sources and length-preserving adapters.

@Lokathor
Contributor

Lokathor commented Sep 7, 2022

Mm, yeah.

Myself I would imagine using this most on iterators that aren't linear memory sources, and "the point" would be to have a clean and unified way to turn some crazy data source into simple SIMD chunks.

@rossmacarthur
Contributor Author

rossmacarthur commented Nov 1, 2022

Might be good to align the naming of this function with Iterator::array_chunks, i.e. either

  • Iterator::chunks
  • Iterator::next_chunk

Or

  • Iterator::array_chunks
  • Iterator::next_array_chunk

@mark-i-m
Member

So I'm finding myself caring less and less about the actual naming. I've wanted this feature at least 3 times in the last year and had to write my own version to work around its absence. Can we just pick something and get the feature stabilized?

@ZhennanWu

I believe the current implementation also suffers from this compiler optimization issue: #110734

Godbolt: https://godbolt.org/z/Te5Wo9eh7

@the8472
Member

the8472 commented Apr 23, 2023

I'm not sure what that issue has to do with next_chunk specifically. It seems to be mostly about the codegen for it.collect::<Vec<_>>(), which I think is easily disturbed by anything mutating the iterator.

@pitaj
Contributor

pitaj commented May 25, 2023

Let's get this stabilized. Here are my answers to the Unresolved Questions:

Naming: other options include next_array() or next_array_chunk().

next_chunk seems like the best option to me.

Should we also add next_chunk_back to DoubleEndedIterator?

Probably, but that should be under a separate feature and not block this.

How should we handle N = 0?

It should be allowed and return a zero-length array without advancing the iterator. No good reason to forbid it.

@the8472
Member

the8472 commented May 25, 2023

It should be allowed and return a zero-length array without advancing the iterator. No good reason to forbid it.

For next_chunk it might not be a problem, but for the chunks iterators on slices and for iterator.array_chunks() it is. So we might as well do something about it here.

@pitaj
Contributor

pitaj commented May 25, 2023

I disagree. For those, it's necessary by definition. There's no reason to apply the same logic here besides, in my opinion, a misguided desire for consistency. The behavior of next_chunk is totally reasonable for array length 0, unlike the others you mentioned.

"Doing something about it" will pretty much require const expressions just like all the other ones, needlessly delaying the stabilization.

@the8472
Member

the8472 commented May 25, 2023

Well, it would prevent implementations of next_chunk from using those things internally.

@pitaj
Contributor

pitaj commented May 25, 2023

I can see that maybe being the case down the line unless we add something like const if or const match. But currently they just panic when N = 0, so one could put a simple if N == 0 { return Ok([]); } at the top, which would easily get optimized out.

Edit: actually the above does NOT work and would need some kind of const if as well. I'll need to think about this.

It's possible to achieve a similar thing via specialization but it's not ideal.

@frewsxcv
Member

@rust-lang/libs Is there anything blocking a stabilization pull request? I'm happy to open that

@workingjubilee workingjubilee added the A-autovectorization Issue related to autovectorization, which can impact perf or code size. label Jul 22, 2023
@Andlon

Andlon commented May 18, 2024

I think for the name next_chunk it's slightly unfortunate that a "chunk" in this context is an array and not a slice, like slice.chunks(). This seems inconsistent and a clear potential source of confusion. next_array_chunk or similar removes any such ambiguities, and should be preferred, in my opinion.

If things are otherwise blocked on the handling of N == 0, would it be possible to delay this decision by panicking (at compile time?) for now, and then possibly allow it in the future? My understanding is that the case N == 0 is an edge case that would only come up in some unusual situations involving generics, and it seems unfortunate for such an edge case to delay stabilization for the common use cases of this method.

@the8472
Member

the8472 commented May 18, 2024

I think for the name next_chunk it's slightly unfortunate that a "chunk" in this context is an array and not a slice, like slice.chunks(). This seems inconsistent and a clear potential source of confusion. next_array_chunk or similar removes any such ambiguities, and should be preferred, in my opinion.

Whether a chunk is a slice or an array is already inconsistent. E.g. slices already have as_chunks, which returns a reference to arrays. I don't see much chance for confusion, because -> [Self::Item] signatures are not possible and -> &[Self::Item] signatures would be neither practical nor useful to implement.

If things are otherwise blocked on the handling of N == 0, would it be possible to delay this decision by panicking (at compile time?) for now, and then possibly allow it in the future?

That would block the option of adding a where N > 0 bound once that's possible

@Andlon

Andlon commented May 18, 2024

Whether a chunk is a slice or an array already is not consistent. E.g. slices already have as_chunks that returns a reference to arrays. I don't see much of a chance for confusion because -> [Self::Item] signatures are not possible and -> &[Self::Item] signatures would be neither practical nor useful to implement.

Fair point, though I don't think any of these methods have been stabilized, unless I'm mistaken. as_chunks would probably also be better as as_array_chunks, for the same reason.

Actually, upon reviewing this post, I see that last_chunk was recently made available in 1.77. Unfortunate. Oh well. In that case, you are correct that it's already, unfortunately, inconsistent.

In any case, the confusion I point to has to do with readability. By using clear and distinct naming, you do not have to stop and think through a logical argument like the one you suggest when scanning over the code.

I just don't see any downside in using a few more characters to avoid the ambiguity. But I also don't think we need to bikeshed this further.

That would block the option of adding a where N > 0 bound once that's possible

Thanks for pointing this out, that's what I was missing.

@betelgeuse

Before making this stable, it would be useful to improve the documentation so that it provides an example with a loop. Currently the examples make a fixed number of calls to next_chunk. I would think the more common use case is operating on input whose length isn't known in advance.

https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.next_chunk

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jun 26, 2024
…cuviper

fix Drop items getting leaked in Filter::next_chunk

The optimization only makes sense for non-drop elements anyway. Use the default implementation for items that are Drop instead.

It also simplifies the implementation.

fixes rust-lang#126872
tracking issue rust-lang#98326
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Jun 26, 2024
Rollup merge of rust-lang#126879 - the8472:next-chunk-filter-drop, r=cuviper
