WIP: Add Cartesian product iteration. Fixes #1917 #6437
Any progress on this and the inlining issues? I would also very much like to see this functionality, because I have also been thinking about iterators for running over higher-dimensional arrays. For example, you could create iterators where you specify a permutation of the array dimensions. You could then write

    for I,J=zip(iterate(A,1:N),iterate(B,permutation))
        B[J]=A[I]
    end

If you combine this with iterators that access the data according to different patterns, e.g. some cache-friendly blocking order or some divide-and-conquer order, you could write very simple expressions for cache-friendly methods on higher-order arrays. I could easily imagine writing complicated tensor contractions with three simple for loops, just like matrix multiplication, where the iterators take care of matching all the right elements and at the same time ensure cache-friendliness. But of course, none of this is possible as long as the inline keyword is not in place.
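The permuted-copy loop in the comment above can be sketched in present-day Julia (1.x). This is only an illustration under assumptions: `CartesianIndices` is the current Base index iterator, and `permuted_indices` is a hypothetical helper standing in for the proposed `iterate(B, permutation)`; neither name is taken from this PR.

```julia
# Hypothetical helper: for each index I of A (in A's native order), produce
# the matching index into the permuted array, i.e. J[k] = I[perm[k]].
permuted_indices(A, perm) =
    (CartesianIndex(ntuple(k -> I[perm[k]], ndims(A))) for I in CartesianIndices(A))

A = reshape(collect(1:24), 2, 3, 4)
perm = (3, 1, 2)   # dimension k of B corresponds to dimension perm[k] of A

# Allocate B with the permuted shape.
B = zeros(Int, ntuple(k -> size(A, perm[k]), ndims(A)))

# Walk both index streams in lockstep, as in the proposed zip-based loop.
for (I, J) in zip(CartesianIndices(A), permuted_indices(A, perm))
    B[J] = A[I]
end
```

The loop body stays a single assignment; all knowledge of the permutation lives in the index iterator, which is the separation the comment is asking for. A blocked or divide-and-conquer traversal would replace only the iterator, not the loop.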
@JeffBezanson,
After warmup:
The
I also put a little time into trying to figure out why
which as far as I understand it seems like a fairly straightforward lookup-a-value-from-pointer-indexed-memory and add it to
To me it looks like the tuple is not being elided into a pair of integer loop variables.
Finally, I seem to remember that there's an issue (couldn't find it despite searching, unfortunately) where @carlobaldassi showed that if you replace the tuple with an immutable, you get rid of the performance hit. I seem to remember that Carlo himself thought that approach was a little too ugly, because you need an Iterator3 immutable for 3 dimensions, an Iterator4 for 4 dimensions, etc. But it does show that in principle this can be done.
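As a rough sketch of the immutable-state idea, written in present-day Julia (1.x) syntax rather than the syntax available at the time, a single immutable parameterized by the number of dimensions can play the role of `Iterator3`, `Iterator4`, and so on. The names `Idx` and `increment` are hypothetical and chosen only for this illustration; this is not Carlo's code.

```julia
# Hypothetical immutable index type: one definition covers every dimension N.
struct Idx{N}
    I::NTuple{N,Int}
end

# Advance to the next index in column-major order, carrying into slower
# dimensions when a faster one overflows. Returns `nothing` past the end.
function increment(idx::Idx{N}, dims::NTuple{N,Int}) where {N}
    t = idx.I
    for d in 1:N
        if t[d] < dims[d]
            # Bump dimension d and reset all faster dimensions to 1.
            return Idx{N}(ntuple(k -> k < d ? 1 : (k == d ? t[k] + 1 : t[k]), N))
        end
    end
    return nothing
end

# Usage: step through a 2x3x4 index space.
dims = (2, 3, 4)
second = increment(Idx((1, 1, 1)), dims)   # fastest dimension advances
carried = increment(Idx((2, 1, 1)), dims)  # first dimension wraps, second advances
finished = increment(Idx((2, 3, 4)), dims) # no index left
```

Whether the compiler keeps such a state in registers is exactly the performance question raised in this thread; the sketch shows only the shape of the data structure, not a measured result.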
Efficient cartesian iteration (new version of #6437)
This is a WIP because `next` is not currently being inlined, and consequently this has a massive performance hit.

I'm aware that most of the performance hit comes from not eliding tuples in places where we ideally could. But once that problem gets solved, we'll have to face the next performance problem, which stems more specifically from the lack of inlining. (Heck, we were worried about #5469, and compared to inlining the increment operation, moving the placement of `state += 1` is peanuts.) So we'll need to fix the inlining anyway, and it neatly sidesteps the (harder?) problem of tuple elision in this particular case.

Once the problems are fixed we can expect to be within a small factor of manually-coded (or Cartesian-coded) loops. See discussion in #3796.
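For readers comparing the two styles, here is a minimal sketch of a manually coded loop next to a Cartesian-iterator loop. It assumes present-day Julia (1.x), where the iteration protocol is `iterate` rather than `start`/`next`/`done` and the Base iterator is called `CartesianIndices`; the function names `sum_manual` and `sum_cartesian` are hypothetical, and the code makes no claim about the API or timings of this PR.

```julia
# Hand-written nested loops, specific to two dimensions.
function sum_manual(A::AbstractMatrix)
    s = zero(eltype(A))
    for j in axes(A, 2), i in axes(A, 1)   # column-major: i varies fastest
        s += A[i, j]
    end
    return s
end

# One loop over a Cartesian index iterator, valid for any dimensionality.
# Each step of this loop advances a multidimensional index, which is the
# increment operation whose inlining is discussed above.
function sum_cartesian(A::AbstractArray)
    s = zero(eltype(A))
    for I in CartesianIndices(A)
        s += A[I]
    end
    return s
end

M = reshape(collect(1:12), 3, 4)
T = reshape(collect(1:24), 2, 3, 4)
```

Both functions visit elements in the same column-major order, so they return the same value for a matrix; only `sum_cartesian` also accepts the three-dimensional `T` without being rewritten.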