Missing auto-vectorization for take+sum #115160
Comments
The manual loop just nests all the I can think of several ways to fix this. @rustbot claim
Same problem with
Same problem with
On the library level those are a lot harder to optimize than
Is this meant to highlight take + sum in general or specifically for slice iters? Getting vectorization for other iters is more complicated compared to slices.
Ideally, there should be no difference between iterators and explicit loops, since any such difference contradicts the idea of zero-cost abstractions: https://doc.rust-lang.org/book/ch13-04-performance.html
But if it's hard to implement, it's fine to cover only slices and other simple cases, and there should probably be a note that this isn't always the case :)
My question was more about what you're actually reporting. You started with slice.take.sum. Then you followed up with slice.take_while.sum and slice.map_while.sum. The more you generalize the less specific and more open-ended the issue becomes.
You pointed out that So the problem is the specialization of
#115273 should fix it for slice.take.sum
LLVM cannot auto-vectorize the following code:
ASM
But it can auto-vectorize when using loop instead of fold or sum:
ASM
https://rust.godbolt.org/z/o1hcvczTW
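For illustration, here is a minimal sketch of the two shapes being compared. The function names and the `u32` element type are assumptions made for this example; the exact code is the one behind the Godbolt link above.

```rust
// Hypothetical reproduction: the names `sum_take` and `sum_loop` and the
// u32 element type are assumptions, not taken from the original report.

/// Iterator form (take + sum): the shape reported as not auto-vectorized.
pub fn sum_take(data: &[u32], n: usize) -> u32 {
    data.iter().take(n).sum()
}

/// Explicit-loop form: the shape reported as auto-vectorized.
pub fn sum_loop(data: &[u32], n: usize) -> u32 {
    let mut acc = 0u32;
    for x in data.iter().take(n) {
        acc += *x;
    }
    acc
}
```

Both functions return the same value for the same input; the report is only about the generated assembly, which may vary with the compiler version and optimization flags.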