Significantly slower broadcasting #356

@astrozot

I have been testing the speed of broadcast operations on OffsetArrays of StaticArrays, and there appears to be a significant time penalty. On my laptop, the same operation on a plain Vector gives:

julia> using StaticArrays, OffsetArrays, BenchmarkTools

julia> xs = [SVector{2}(rand(2)) for _ in 1:10_000];

julia> ys = similar(xs);

julia> @benchmark (@. $ys = $xs / (first($xs)^2 + last($xs)^2))
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):   9.419 μs … 43.605 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     10.142 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   11.559 μs ±  2.524 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █    ▁ ▅                                                     
  █▅▆▄▆█▃█▂▃▂▂▃▂▂▃▃▂▂▃▂▁▂▂▂▂▂▂▂▂▃▂▁▂▂▁▁▁▁▁▃▄▅▂▂▂▄▂▃▁▃▂█▆▂▁▁▅▆ ▃
  9.42 μs         Histogram: frequency by time        15.2 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

When using OffsetArrays instead these are the results:

julia> oxs = OffsetArray(xs, -1000);

julia> oys = OffsetArray(ys, -1000);

julia> @benchmark (@. $oys = $oxs / (first($oxs)^2 + last($oxs)^2))
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  48.245 μs … 159.888 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     50.522 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   53.394 μs ±  10.011 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▇███▄▁            ▃  ▃ ▂    ▁  ▄   ▃                         ▂
  ██████▇▇▁▇▆▆▇▅▇▃█▆█▇██▆█▇▆█▅█▆▅█▆▆▅█▆▆▆▆▆▆█▆▆▆▅▆▆▅▆▆▅▆▆▆▆▆▆▆ █
  48.2 μs       Histogram: log(frequency) by time      97.1 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

The $> 5 \times$ penalty is even larger than the SIMD vector size of my laptop (4 for Float64 arrays).
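One way to test whether the overhead comes from broadcasting over the offset axes, rather than from the arithmetic itself, is to run the same broadcast on the underlying parent arrays. The following is only a sketch under that assumption: `normalize_by_sqnorm!` is a hypothetical helper name, not part of OffsetArrays, and no timings are claimed for it here.

```julia
using StaticArrays, OffsetArrays

# Hypothetical helper: run the broadcast on the parent arrays so that
# no offset indexing is involved, then return the OffsetArray wrapper.
function normalize_by_sqnorm!(dest::OffsetArray, src::OffsetArray)
    # The workaround is only valid when both wrappers share the same axes.
    axes(dest) == axes(src) || throw(DimensionMismatch("axes must match"))
    pd, ps = parent(dest), parent(src)
    # Same expression as in the benchmarks above, applied element-wise.
    @. pd = ps / (first(ps)^2 + last(ps)^2)
    return dest
end

xs = [SVector{2}(rand(2)) for _ in 1:10_000]
oxs = OffsetArray(xs, -1000)
oys = similar(oxs)

normalize_by_sqnorm!(oys, oxs)
```

Timing `normalize_by_sqnorm!($oys, $oxs)` with `@benchmark` and comparing it against the two results above would indicate how much of the gap is attributable to the offset wrapper.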

Any help or clarification is really appreciated!
