Closed
Hello,
Here is a simple example in which the 5-argument mul! allocates memory. In some cases this happens only when the sparse matrix A is wrapped in a struct; in other cases it happens even without the struct.
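For reference, the 5-argument form `mul!(y, A, x, α, β)` overwrites `y` in place with `A*x*α + y*β`. The snippet below is a minimal sketch that checks this against the out-of-place expression; the `_small` variables, the sizes, and the scalar values are illustrative assumptions and are not part of the benchmarks that follow.

```julia
using LinearAlgebra
using SparseArrays

# Illustrative data only (names and sizes are assumptions).
A_small = sprand(Float64, 50, 50, 0.1)
x_small = rand(50)
y_small = rand(50)
a, b = 0.5, 2.0

# Out-of-place reference, computed before y_small is overwritten.
expected = A_small * x_small * a + y_small * b

# In-place update: y_small = A_small * x_small * a + y_small * b
mul!(y_small, A_small, x_small, a, b)

y_small ≈ expected
```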
julia> using LinearAlgebra
julia> using SparseArrays
julia> using BenchmarkTools
julia> T = Float64;
julia> N = 1000;
julia> A = sprand(T, N, N, 5 / N);
julia> x = rand(T, N);
julia> y = similar(x);
julia> α = rand(T);
julia> β = rand(T);
julia> @benchmark mul!($y, $A, $x, $α, false) # This is ok
BenchmarkTools.Trial: 10000 samples with 9 evaluations.
 Range (min … max):  2.092 μs …  3.589 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.146 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.184 μs ± 68.468 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%
           ▁▇█▄                                               
  ▂▂▂▂▂▂▂▂▄████▇▄▂▂▂▂▁▂▂▂▂▂▂▂▂▁▂▂▁▂▂▂▃▄▆██▆▄▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂ ▃
  2.09 μs        Histogram: frequency by time        2.35 μs <
 Memory estimate: 0 bytes, allocs estimate: 0.
julia> @benchmark mul!($y, $A, $x, $α, $β) # This is not ok
BenchmarkTools.Trial: 10000 samples with 9 evaluations.
 Range (min … max):  2.164 μs …   5.236 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.193 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.261 μs ± 138.081 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%
  ▁▅▇█▇▆▄▂▂▂▁           ▁▁    ▁  ▂▄▆▆▆▄▃                ▁▁    ▂
  ████████████▆▅▄▃▅▄▅▃▆██████▇███████████▆▆▅▅▄▄▃▁▅▄▃▅▅▆█████▆ █
  2.16 μs      Histogram: log(frequency) by time      2.54 μs <
 Memory estimate: 80 bytes, allocs estimate: 2.
Involving a simple constructor
julia> struct my_MatrixOperator{MT}
           A::MT
       end
julia> function LinearAlgebra.mul!(v::AbstractVecOrMat, L::my_MatrixOperator, u::AbstractVecOrMat)
           mul!(v, L.A, u)
       end
julia> function LinearAlgebra.mul!(v::AbstractVecOrMat,
               L::my_MatrixOperator,
               u::AbstractVecOrMat,
               α,
               β)
           mul!(v, L.A, u, α, β)
       end
julia> A_op_my = my_MatrixOperator(A);
julia> @benchmark mul!($y, $A_op_my, $x, $α, false) # This is no longer ok
BenchmarkTools.Trial: 10000 samples with 9 evaluations.
 Range (min … max):  2.553 μs …   5.645 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.627 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.653 μs ± 116.667 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%
  ▁▆▇▇▆▃▁▂▅██▇▅▂              ▁ ▂▃▄▄▃▂   ▂▃▃▂            ▁▂▂▂ ▃
  ██████████████▇▄▅▄▃▇████▇▇▇█████████▇▆██████▅▅▄▁▄▁▄▅▆▅▆████ █
  2.55 μs      Histogram: log(frequency) by time         3 μs <
 Memory estimate: 80 bytes, allocs estimate: 2.
julia> @benchmark mul!($y, $A_op_my, $x, $α, $β) # This is not ok
BenchmarkTools.Trial: 10000 samples with 9 evaluations.
 Range (min … max):  2.160 μs …  12.129 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.190 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.254 μs ± 176.229 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%
  ▁▅██▆▄▂          ▁▅▆▆▅▃▂▁  ▁▃▄▃▂              ▁▂▁           ▂
  ████████▆▅▅▃▃▄▄▅▇████████▇▇██████▆▆▅▅▅▁▅▅▅▅▅▇█████▆▆▅▅▄▅▄▁▃ █
  2.16 μs      Histogram: log(frequency) by time       2.6 μs <
 Memory estimate: 80 bytes, allocs estimate: 2.
julia> @benchmark mul!($y, $A_op_my, $x) # This is ok
BenchmarkTools.Trial: 10000 samples with 9 evaluations.
 Range (min … max):  2.030 μs …  3.575 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.053 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.064 μs ± 54.170 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%
   ▄▆███▇▅▄▁                                        ▁▁       ▂
  ██████████▇▅▄▄▃▁▁▄▄▁▁▄▃▁▄▆▇███▇█▇▇▇▅▆▅▄▄▅▅▄▄▅▄▅▄▄▇███▇▅▄▄▆ █
  2.03 μs      Histogram: log(frequency) by time     2.31 μs <
 Memory estimate: 0 bytes, allocs estimate: 0.
Possible partial fix
I've noticed that annotating the methods above with @inline or Base.@constprop :aggressive fixes the mul!($y, $A_op_my, $x, $α, false) case, but why this works is still a mystery to me.
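For concreteness, the two annotated variants might look like the sketch below. The wrapper names `InlineOp` and `ConstpropOp`, the `_demo` variables, and the `count_allocs` helper are hypothetical and were introduced only so the snippet does not clash with the definitions above; `Base.@constprop` is assumed to be available (Julia 1.7 or later). Whether either variant removes the allocations may depend on the Julia and SparseArrays versions, so no allocation counts are claimed here.

```julia
using LinearAlgebra
using SparseArrays

# Hypothetical wrappers, one per annotation, mirroring my_MatrixOperator.
struct InlineOp{MT}
    A::MT
end

struct ConstpropOp{MT}
    A::MT
end

# Variant 1: ask the compiler to inline the forwarding method.
@inline function LinearAlgebra.mul!(v::AbstractVecOrMat, L::InlineOp,
        u::AbstractVecOrMat, α, β)
    mul!(v, L.A, u, α, β)
end

# Variant 2: request aggressive constant propagation, so that a literal
# `false` passed as β can be seen by the inner call.
Base.@constprop :aggressive function LinearAlgebra.mul!(v::AbstractVecOrMat,
        L::ConstpropOp, u::AbstractVecOrMat, α, β)
    mul!(v, L.A, u, α, β)
end

# Hypothetical helper: bytes allocated by one call, measured after a warm-up
# call so that compilation is not counted.
function count_allocs(y, L, x, α)
    mul!(y, L, x, α, false)
    return @allocated mul!(y, L, x, α, false)
end

# Illustrative data with the same shape as the benchmarks above.
A_demo = sprand(Float64, 1000, 1000, 5 / 1000)
x_demo = rand(1000)
α_demo = rand()
y_inline = zeros(1000)
y_constprop = zeros(1000)

bytes_inline = count_allocs(y_inline, InlineOp(A_demo), x_demo, α_demo)
bytes_constprop = count_allocs(y_constprop, ConstpropOp(A_demo), x_demo, α_demo)
```

Both variants only forward to the inner `mul!`, so they should give the same numerical result as calling `mul!` on the wrapped matrix directly; the only intended difference is how much the compiler knows about `β` at the inner call site.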