Memory allocations of 5-argument mul! when involving struct #566

@albertomercurio

Hello,

Here is a simple example in which the 5-argument mul! allocates memory. In some cases the allocations appear only when the sparse matrix A is wrapped in a struct; in other cases they appear even without the struct.

julia> using LinearAlgebra

julia> using SparseArrays

julia> using BenchmarkTools

julia> T = Float64;

julia> N = 1000;

julia> A = sprand(T, N, N, 5 / N);

julia> x = rand(T, N);

julia> y = similar(x);

julia> α = rand(T);

julia> β = rand(T);

julia> @benchmark mul!($y, $A, $x, $α, false) # This is ok
BenchmarkTools.Trial: 10000 samples with 9 evaluations.
 Range (min … max):  2.092 μs …  3.589 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.146 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.184 μs ± 68.468 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

           ▁▇█▄                                               
  ▂▂▂▂▂▂▂▂▄████▇▄▂▂▂▂▁▂▂▂▂▂▂▂▂▁▂▂▁▂▂▂▃▄▆██▆▄▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂ ▃
  2.09 μs        Histogram: frequency by time        2.35 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark mul!($y, $A, $x, $α, $β) # This is not ok
BenchmarkTools.Trial: 10000 samples with 9 evaluations.
 Range (min … max):  2.164 μs …   5.236 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.193 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.261 μs ± 138.081 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▁▅▇█▇▆▄▂▂▂▁           ▁▁    ▁  ▂▄▆▆▆▄▃                ▁▁    ▂
  ████████████▆▅▄▃▅▄▅▃▆██████▇███████████▆▆▅▅▄▄▃▁▅▄▃▅▅▆█████▆ █
  2.16 μs      Histogram: log(frequency) by time      2.54 μs <

 Memory estimate: 80 bytes, allocs estimate: 2.

Involving a simple wrapper struct

julia> struct my_MatrixOperator{MT}
           A::MT
       end

julia> function LinearAlgebra.mul!(v::AbstractVecOrMat, L::my_MatrixOperator, u::AbstractVecOrMat)
           mul!(v, L.A, u)
       end

julia> function LinearAlgebra.mul!(v::AbstractVecOrMat,
               L::my_MatrixOperator,
               u::AbstractVecOrMat,
               α,
               β)
           mul!(v, L.A, u, α, β)
       end

julia> A_op_my = my_MatrixOperator(A);

julia> @benchmark mul!($y, $A_op_my, $x, $α, false) # This is no longer ok
BenchmarkTools.Trial: 10000 samples with 9 evaluations.
 Range (min … max):  2.553 μs …   5.645 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.627 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.653 μs ± 116.667 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▁▆▇▇▆▃▁▂▅██▇▅▂              ▁ ▂▃▄▄▃▂   ▂▃▃▂            ▁▂▂▂ ▃
  ██████████████▇▄▅▄▃▇████▇▇▇█████████▇▆██████▅▅▄▁▄▁▄▅▆▅▆████ █
  2.55 μs      Histogram: log(frequency) by time         3 μs <

 Memory estimate: 80 bytes, allocs estimate: 2.

julia> @benchmark mul!($y, $A_op_my, $x, $α, $β) # This is not ok
BenchmarkTools.Trial: 10000 samples with 9 evaluations.
 Range (min … max):  2.160 μs …  12.129 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.190 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.254 μs ± 176.229 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▁▅██▆▄▂          ▁▅▆▆▅▃▂▁  ▁▃▄▃▂              ▁▂▁           ▂
  ████████▆▅▅▃▃▄▄▅▇████████▇▇██████▆▆▅▅▅▁▅▅▅▅▅▇█████▆▆▅▅▄▅▄▁▃ █
  2.16 μs      Histogram: log(frequency) by time       2.6 μs <

 Memory estimate: 80 bytes, allocs estimate: 2.

julia> @benchmark mul!($y, $A_op_my, $x) # This is ok
BenchmarkTools.Trial: 10000 samples with 9 evaluations.
 Range (min … max):  2.030 μs …  3.575 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.053 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.064 μs ± 54.170 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

   ▄▆███▇▅▄▁                                        ▁▁       ▂
  ██████████▇▅▄▄▃▁▁▄▄▁▁▄▃▁▄▆▇███▇█▇▇▇▅▆▅▄▄▅▅▄▄▅▄▅▄▄▇███▇▅▄▄▆ █
  2.03 μs      Histogram: log(frequency) by time     2.31 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

Possible partial fix

I've noticed that annotating the method definitions above with @inline or Base.@constprop :aggressive fixes the mul!($y, $A_op_my, $x, $α, false) case, but the cause is still a mystery to me.
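For reference, the two annotations mentioned above could be applied roughly as follows. This is only a sketch: the struct names `InlinedMatrixOperator` and `ConstpropMatrixOperator` are hypothetical and used here so that each variant gets its own method, and the snippet shows where the annotations go rather than explaining why they help.

```julia
using LinearAlgebra
using SparseArrays

# Hypothetical wrapper types (not part of the original report), one per variant.
struct InlinedMatrixOperator{MT}
    A::MT
end

struct ConstpropMatrixOperator{MT}
    A::MT
end

# Variant 1: ask the compiler to inline the forwarding method.
@inline function LinearAlgebra.mul!(v::AbstractVecOrMat,
        L::InlinedMatrixOperator,
        u::AbstractVecOrMat,
        α,
        β)
    mul!(v, L.A, u, α, β)
end

# Variant 2: request aggressive constant propagation into the forwarding method.
Base.@constprop :aggressive function LinearAlgebra.mul!(v::AbstractVecOrMat,
        L::ConstpropMatrixOperator,
        u::AbstractVecOrMat,
        α,
        β)
    mul!(v, L.A, u, α, β)
end

# Same setup as in the report.
T = Float64
N = 1000
A = sprand(T, N, N, 5 / N)
x = rand(T, N)
α = rand(T)

y_inline = similar(x)
y_constprop = similar(x)

# With β = false the previous contents of the output vector are ignored.
mul!(y_inline, InlinedMatrixOperator(A), x, α, false)
mul!(y_constprop, ConstpropMatrixOperator(A), x, α, false)
```

Both calls should produce α * A * x. Whether the allocations actually disappear can be checked by rerunning the @benchmark lines above with these wrappers; the result may depend on the Julia and SparseArrays versions.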
