# Benchmark Performance of argmin()

In [2]:
using BenchmarkTools

In [3]:
x = rand(450);

In [25]:
@btime findmin(x)

  516.057 ns (1 allocation: 32 bytes)


(0.0017632793527515567, 315)
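A note on methodology: `x` here is a non-constant global, and BenchmarkTools recommends interpolating such variables with `$` so that the timing does not include access to an untyped global. The single 32-byte allocation reported above is probably an artifact of that, though this is an assumption and was not verified on the original machine. A minimal sketch of the interpolated form (the variable names `t_interp` and `t_global` are illustrative only):

```julia
using BenchmarkTools

x = rand(450)

# Interpolating with `$` splices the value of `x` into the benchmarked
# expression, so only the `findmin` call itself is measured.
t_interp = @belapsed findmin($x) seconds=1

# Without interpolation the benchmark also pays for looking up and
# dispatching on the non-constant global `x`.
t_global = @belapsed findmin(x) seconds=1

println("interpolated: ", t_interp * 1e9, " ns")
println("global:       ", t_global * 1e9, " ns")
```

Both timings are minimum elapsed times in seconds, so they are multiplied by `1e9` for display in nanoseconds.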

In [26]:
@btime findmin(@view x[begin:end])

  567.935 ns (3 allocations: 112 bytes)


(0.0017632793527515567, 315)

In [27]:
@btime findmin(@view x[1:449])

  537.698 ns (2 allocations: 80 bytes)


(0.0017632793527515567, 315)

In [28]:
@btime findmin(@view x[1:448])

  536.376 ns (2 allocations: 80 bytes)


(0.0017632793527515567, 315)
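The views above all start at index 1, so the index returned by `findmin` coincides with the index into `x`. For a view that starts elsewhere, the returned index is relative to the view, not to the parent array. A small sketch with made-up data (the array contents and the name `offset` are illustrative, not from the notebook):

```julia
x = [5.0, 4.0, 3.0, 0.5, 2.0, 1.0]

# View starting at 1: view indices coincide with parent indices.
val_a, idx_a = findmin(@view x[1:5])

# View starting at 3: the returned index counts from the start of the view.
offset = 3
val_b, idx_b = findmin(@view x[offset:end])

# Map the view-relative index back to an index into `x`.
parent_idx = idx_b + offset - 1

println((val_a, idx_a))        # (0.5, 4)
println((val_b, idx_b))        # (0.5, 2)
println(parent_idx)            # 4
```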

In [29]:
@benchmark findmin(x)

BenchmarkTools.Trial: 10000 samples with 192 evaluations.
 Range (min … max):  516.714 ns …  8.406 μs  ┊ GC (min … max): 0.00% … 92.77%
 Time  (median):     522.135 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   537.012 ns ± 83.419 ns  ┊ GC (mean ± σ):  0.15% ±  0.93%

 [histogram omitted: truncated in the export]

In [30]:
@code_lowered findmin(x)

CodeInfo(
1 ─ %1 = Base.:(var"#findmin#860")(Base.:(:), #self#, A)
└──      return %1
)

In [9]:
using LoopVectorization

In [31]:
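# Return (minimum value, index) over dij[1:n], using a SIMD loop via @turbo.
function fast_findmin(dij, n)
    best = 1
    @inbounds dij_min = dij[1]
    @turbo for here in 2:n
        # Branch-free updates so the loop body can be vectorized
        newmin = dij[here] < dij_min
        best = newmin ? here : best
        dij_min = newmin ? dij[here] : dij_min
    end
    return dij_min, best
end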

fast_findmin (generic function with 1 method)
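Before trusting the timings, it may help to have a dependency-free reference with the same contract as `fast_findmin` (minimum over the first `n` elements, returned as `(value, index)`). The sketch below is a plain scalar loop; `simple_findmin` is a hypothetical helper and not part of the notebook. It is checked against `findmin` on a view, and the same comparison could be run against `fast_findmin`.

```julia
# Hypothetical reference helper: scalar search over v[1:n], no packages needed.
function simple_findmin(v::AbstractVector, n::Integer)
    @assert 1 <= n <= length(v)
    best = 1
    vmin = v[1]
    for i in 2:n
        # Strict `<` keeps the first occurrence on ties, like Base.findmin.
        if v[i] < vmin
            vmin = v[i]
            best = i
        end
    end
    return vmin, best
end

v = rand(450)

# Compare against Base.findmin on a prefix view for several lengths.
for n in (1, 2, 17, 449, 450)
    @assert simple_findmin(v, n) == findmin(@view v[1:n])
end
```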

In [32]:
@btime fast_findmin(x, 450)

  97.736 ns (1 allocation: 32 bytes)


(0.0017632793527515567, 315)
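For a rough sense of scale, the two `@btime` minimums reported so far (516.057 ns for `findmin(x)` and 97.736 ns for `fast_findmin(x, 450)`) can be turned into a speedup factor. The variable names below are illustrative, and the figure applies only to this run and this array length:

```julia
# Minimum times copied from the two @btime outputs above, in nanoseconds.
t_findmin = 516.057
t_fast = 97.736

speedup = t_findmin / t_fast
println("fast_findmin is about ", round(speedup; digits = 2), "x faster")
# prints: fast_findmin is about 5.28x faster
```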

In [33]:
@benchmark fast_findmin(x, 450)

BenchmarkTools.Trial: 10000 samples with 946 evaluations.
 Range (min … max):   97.736 ns …  1.478 μs  ┊ GC (min … max): 0.00% … 92.26%
 Time  (median):     100.598 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   102.964 ns ± 32.326 ns  ┊ GC (mean ± σ):  0.75% ±  2.24%

 [histogram omitted: truncated in the export]

In [34]:
@benchmark for j in 450:-1:1 fast_findmin(x, j) end

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  26.250 μs …  1.311 ms  ┊ GC (min … max): 0.00% … 96.40%
 Time  (median):     26.875 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   28.079 μs ± 20.770 μs  ┊ GC (mean ± σ):  1.25% ±  1.67%

 [histogram omitted: truncated in the export]

In [35]:
@benchmark for j in 450:-1:1 findmin(@view x[1:j]) end

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  125.542 μs …  1.908 ms  ┊ GC (min … max): 0.00% … 92.30%
 Time  (median):     127.250 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   131.795 μs ± 56.608 μs  ┊ GC (mean ± σ):  1.40% ±  3.04%

 [histogram omitted: truncated in the export]