Make Iterators.partition split arrays into views for faster and easier parallelism #33533
Cool!!
Argument for why this change is an acceptable minor change even though it is technically breaking:
Both of these potential usages seem pretty unlikely. The main usage seems to be just looking at different partitions of the original array without mutating either the original or the partition. With this change, a new use case becomes possible: modifying the original array by operating on partitions. Going back from views to copies would be more breaking, since code relying on that use case would stop working.
Another nice coincidence: @vtjnash is doing compiler work that will make taking views usually non-allocating and therefore much more efficient in many more cases, so this could get even more efficient by the time 1.4 is actually ready.
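A minimal sketch of that new use case, assuming a Julia version that includes this change (1.4 or later). The matrix `m` and the chunk length of 4 are made up for illustration:

```julia
# Hypothetical data: a 2×3 matrix, stored column-major as 1, 2, 3, 4, 5, 6.
m = reshape(collect(1:6), 2, 3)

# Each `chunk` is a view into `m`, so in-place broadcasting writes
# through to the original array instead of to a temporary copy.
for chunk in Iterators.partition(m, 4)
    chunk .*= 10
end

println(m)
```

With the old copying behavior for non-`Vector` arrays, the same loop would have modified only the temporary copies and left `m` untouched.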
FWIW I've been using
This is great!
Hasn't been officially triaged, but if anyone has any objections, please post them here. I think this seems popular enough based on the reactions to be merged without a triage debate.
Previously only `::Vector` was special-cased to use views. The trade-off here is that we lose the ability to predict the concrete eltype -- since arrays can potentially choose to return something different from `vec` or `view`. Generic iterables still collect their elements into a freshly-allocated `Vector`, like before.
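As a rough illustration of the two code paths in that commit message, assuming Julia 1.4 or later (the variable names are hypothetical): a plain `Array` happens to give back a `SubArray`, though other array types may return something else, while a generator falls back to collecting.

```julia
A = reshape(collect(1:6), 2, 3)

# Array input: the chunk is a view over the first two elements of `A`.
array_chunk = first(Iterators.partition(A, 2))

# Generic iterable (here a generator): the chunk is collected into a
# freshly allocated Vector, as before.
gen_chunk = first(Iterators.partition((x^2 for x in 1:5), 2))

println(typeof(array_chunk))
println(typeof(gen_chunk))
```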
Also recompute ranges instead of using views for partitions of ranges. Since `Iterators.partition` is so handy for dividing up iteration spaces, it makes sense to optimize this as much as possible. While it is inherently a "linear" operation, it is a batched linear operation that allows us to skip doing all the effective ind2sub work on every single iteration.
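A small sketch of what that looks like from the caller's side, assuming Julia 1.4 or later. The helper name `chunked_sum` and the chunk sizes are invented for this example and are not part of the PR:

```julia
# Partitioning a range yields ranges again (recomputed, not views).
chunks = collect(Iterators.partition(1:10, 4))

# Chunked loop over an index space: the outer loop walks the batches,
# the inner loop is a plain linear loop over one contiguous index range.
function chunked_sum(a::Vector{<:Real}, n::Integer)
    total = zero(eltype(a))
    for idxs in Iterators.partition(eachindex(a), n)
        @inbounds @simd for i in idxs
            total += a[i]
        end
    end
    return total
end

println(chunks)
println(chunked_sum(collect(1.0:100.0), 16))
```

Splitting the work into an outer loop over batches and an inner linear loop is the same shape that makes each batch a candidate for handing to a separate task.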
function iterate(itr::PartitionIterator{<:AbstractRange}, state=1)
    state > length(itr.c) && return nothing
    r = min(state + itr.n - 1, length(itr.c))
    return @inbounds itr.c[state:r], r + 1
end
OT: Are we missing an abstraction here: should we define that `view(::AbstractRange, slice::AbstractRange) isa AbstractRange`, or would that confuse consumers of `view`?
That's #26872 — looks like it got stalled because it was proposed before we really had a handle on minor changes.
shorter lines, more spaces, and comments (because I had already forgotten how this worked myself)
Triage was in favor and I addressed the style review... and while I was at it I added a few more comments because I didn't understand how this worked anymore.
Very nice! :)
I've frequently wanted to use `Iterators.partition` to subdivide an iteration space into chunks amenable to parallelism, but its implementation, which defaulted to copying all elements of each partition into a new `Vector`, left some efficiencies to be desired. It did have one special case, wherein a `partition` of a `Vector` itself would helpfully use views. This PR makes that the default for all `AbstractArray`s. It goes further and adds very helpful performance specializations for some ranges and `CartesianIndices`, essential for doing loops over `partition(eachindex(...))`.

This PR is divided into three commits:

- `AbstractArray`s return views
- `CartesianIndices`
- `partition` we can easily split the algorithm into a `@simd`able section, leading to up to 10x speedups for code like `a .== 0`: