@mateuszbaran
Contributor

This is the first batch of changes related to #160. Most of them follow from the fact that map is much friendlier to the Julia compiler than broadcasting, so I've replaced a bunch of unnecessary broadcasts with map. The other issue is that the loop in the assignment broadcast is very problematic: ideally the compiler would constant-propagate N and unroll the loop, but doing so seems to break some other optimizations. This change helps a bit, but it's still not enough.
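
To illustrate the idea (this is only a sketch, not the code from this PR): the type `Parts` and the functions `add_loop!` and `add_map!` below are hypothetical names made up for the example. It assumes a container that stores its pieces in a tuple of arrays, and contrasts an index loop over that tuple with a `map` over it.

```julia
# Hypothetical container, not taken from the PR: a tuple of arrays
# whose element types may differ from one piece to the next.
struct Parts{T<:Tuple}
    x::T
end

# Index loop over the tuple. With a heterogeneous tuple, the type of
# c.x[i] depends on the runtime value of i, which can make this loop
# harder for the compiler to specialize unless it gets unrolled.
function add_loop!(c::Parts, a::Parts, b::Parts)
    for i in 1:length(c.x)
        c.x[i] .= a.x[i] .+ b.x[i]
    end
    return c
end

# map over the tuples instead. The tuple length is part of the type,
# so each call of the closure sees concretely typed pieces.
function add_map!(c::Parts, a::Parts, b::Parts)
    map((w, u, v) -> (w .= u .+ v; nothing), c.x, a.x, b.x)
    return c
end

a = Parts(([1.0, 2.0], [3, 4, 5]))
b = Parts(([10.0, 20.0], [30, 40, 50]))
c = Parts((zeros(2), zeros(Int, 3)))
add_map!(c, a, b)
```

Both functions compute the same result; whether the `map` version is actually faster in a given case is something to check with a benchmark rather than assume.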

@mateuszbaran
Contributor Author

I've improved that copyto! method, and now my example from #160 is about 100x faster than before 🎉:

julia> @benchmark f!($c, $a, $b)
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
 Range (min … max):  7.885 ns … 34.613 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     7.972 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   7.990 ns ±  0.406 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                                  ▂▂█▁▄▄                      
  ▂▂▂▂▁▁▂▁▂▂▁▂▂▂▂▂▂▃▃▃▃▃▃▄▄▅▄▅▄▄▇▆██████▇▇▄▅▅▄▇▆▇▄▄▃▃▂▂▃▂▃▃▃ ▃
  7.88 ns        Histogram: frequency by time         830 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

Member

@ChrisRackauckas ChrisRackauckas left a comment

oh yes, just mapping over the tuple is a lot better, great idea.
