# Benchmarking type-stable code

In [1]:
using BenchmarkTools

[1m[36mINFO: [39m[22m[36mRecompiling stale cache file /Users/david/.julia/lib/v0.6/JLD.ji for module JLD.


In [2]:
function sumsqrtn(n::Int)
    r = 0
    for i = 1:n
        r = r + sqrt(i)
    end
    return r
end

sumsqrtn (generic function with 1 method)

In [5]:
@time sumsqrtn(10)
@time sumsqrtn(10^7)

  0.000004 seconds (34 allocations: 640 bytes)
  0.292040 seconds (30.00 M allocations: 457.764 MB, 10.44% gc time)


2.1081852648716972e10

In [6]:
typeof(sqrt(4)), typeof(0)

(Float64,Int64)

#### What is wrong with the previous code?

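Notice that **```r```** changes its type inside the function.
It is first defined as an integer (```0``` is an ```Int64```), but after the first iteration it becomes a ```Float64```, because ```sqrt(i)``` returns a ```Float64```. The compiler cannot assign a single concrete type to ```r```, which is consistent with the 30 million allocations reported by ```@time``` above.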


You can check how the compiler infers the types of the variables inside a function using the **```@code_warntype```** macro.

In [7]:
@code_warntype sumsqrtn(10)

Variables:
  #self#::#sumsqrtn
  n::Int64
  i::Int64
  #temp#@_4::Int64
  r::Any
  #temp#@_6::Core.MethodInstance
  #temp#@_7::Float64

Body:
  begin 
      r::Any = 0 # line 3:
      SSAValue(2) = (Base.select_value)((Base.sle_int)(1,n::Int64)::Bool,n::Int64,(Base.box)(Int64,(Base.sub_int)(1,1)))::Int64
      #temp#@_4::Int64 = 1
      5: 
      unless (Base.box)(Base.Bool,(Base.not_int)((#temp#@_4::Int64 === (Base.box)(Int64,(Base.add_int)(SSAValue(2),1)))::Bool)) goto 30
      SSAValue(3) = #temp#@_4::Int64
      SSAValue(4) = (Base.box)(Int64,(Base.add_int)(#temp#@_4::Int64,1))
      i::Int64 = SSAValue(3)
      #temp#@_4::Int64 = SSAValue(4) # line 4:
      unless (r::Union{Float64,Int64} isa Float64)::Any goto 15
      #temp#@_6::Core.MethodInstance = MethodInstance for +(::Float64, ::Float64)
      goto 24
      15: 
      unless (r::Union{Float64,Int64}
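
The same information can also be checked programmatically. The sketch below is not part of the original notebook: it assumes Julia 1.x (where `isconcretetype` is available) and uses the unexported reflection helper `Base.return_types`. The function names `unstable_sum` and `stable_sum` are hypothetical stand-ins for the two variants discussed in this section.

```julia
# Type-unstable: r starts as Int64 and becomes Float64 inside the loop.
function unstable_sum(n::Int)
    r = 0
    for i = 1:n
        r = r + sqrt(i)
    end
    return r
end

# Type-stable: r is a Float64 for the whole function.
function stable_sum(n::Int)
    r = 0.0
    for i = 1:n
        r = r + sqrt(i)
    end
    return r
end

# Base.return_types gives one inferred return type per matching method.
unstable_type = Base.return_types(unstable_sum, (Int,))[1]
stable_type   = Base.return_types(stable_sum, (Int,))[1]

println(isconcretetype(unstable_type))  # false: the result may be Int64 or Float64
println(isconcretetype(stable_type))    # true: the result is always Float64
```

A non-concrete inferred return type is a quick hint that a variable changes type somewhere in the body; `@code_warntype` then shows where.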

Let us rewrite the function, defining ```r``` as a float from the start. This avoids changing its type inside the function, so it runs faster.

In [11]:
function sumsqrtn2(n::Int64) 
    r = 0.
    for i = 1:n
        r = r + sqrt(i)
    end
    return r
end

sumsqrtn2 (generic function with 2 methods)

In [13]:
@time sumsqrtn2(10^3)
@time sumsqrtn2(10^7)

  0.000014 seconds (6 allocations: 192 bytes)
  0.095450 seconds (6 allocations: 192 bytes)


2.1081852648716972e10

In [14]:
@code_warntype sumsqrtn2(10)

Variables:
  #self#::#sumsqrtn2
  n::Int64
  i::Int64
  #temp#::Int64
  r::Float64

Body:
  begin 
      r::Float64 = 0.0 # line 3:
      SSAValue(3) = (Base.select_value)((Base.sle_int)(1,n::Int64)::Bool,n::Int64,(Base.box)(Int64,(Base.sub_int)(1,1)))::Int64
      #temp#::Int64 = 1
      5: 
      unless (Base.box)(Base.Bool,(Base.not_int)((#temp#::Int64 === (Base.box)(Int64,(Base.add_int)(SSAValue(3),1)))::Bool)) goto 16
      SSAValue(4) = #temp#::Int64
      SSAValue(5) = (Base.box)(Int64,(Base.add_int)(#temp#::Int64,1))
      i::Int64 = SSAValue(4)
      #temp#::Int64 = SSAValue(5) # line 4:
      SSAValue(2) = (Base.Math.box)(Base.Math.Float64,(Base.Math.sqrt_llvm)((Base.box)(Float64,(Base.sitofp)(Float64,i::Int64))))::Float64
      r::Float64 = (Base.box)(Base.Float64,(Base.add_float)(r::Float64,SSAValue(2)))
      14: 
      goto 5
      16:  # line 6:
      return r::Float64
  end::Float64
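
The fix above hard-codes `0.` as the starting value. As a hedged alternative, the accumulator can be initialized with a zero of whatever type `sqrt` returns for the input, so the function stays type-stable without naming `Float64` explicitly. The name `sumsqrtn_generic` is hypothetical and does not appear in the original notebook.

```julia
# Hypothetical variant: the accumulator starts as a zero of the same
# type that sqrt produces for the argument type.
function sumsqrtn_generic(n::Integer)
    r = zero(sqrt(one(n)))   # 0.0 when n is an Int, so r never changes type
    for i = 1:n
        r = r + sqrt(i)
    end
    return r
end

println(typeof(sumsqrtn_generic(10)))  # Float64
```

For `Int` input this behaves like `sumsqrtn2`; the only difference is how the initial value is obtained.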


### Optimize operations

In [15]:
function sum_opt2(n) 
    r = 0.
    for i = 1:n
        r = r + sqrt(i)
    end
    return r
end

sum_opt2 (generic function with 1 method)

In [17]:
@time sum_opt2(10^7)

  0.098142 seconds (6 allocations: 192 bytes)


2.1081852648716972e10

In [18]:
function sum_opt2(n,d) 
    r = zeros(d)
    for i = 1:n
        for j in 1:d
            r[j] = r[j] + sqrt(i)
        end
    end
    return r
end

sum_opt2 (generic function with 2 methods)

In [20]:
@benchmark sum_opt2(10^5,5)

BenchmarkTools.Trial: 
  memory estimate:  128 bytes
  allocs estimate:  1
  --------------
  minimum time:     428.228 μs (0.00% GC)
  median time:      440.947 μs (0.00% GC)
  mean time:        482.978 μs (0.00% GC)
  maximum time:     1.379 ms (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

In [21]:
function sum_opt3(n,d) 
    r = zeros(d)
    for i = 1:n
        r = r + sqrt(i)
    end
    return r
end

sum_opt3 (generic function with 1 method)

In [22]:
@benchmark sum_opt3(10^5,5)

BenchmarkTools.Trial: 
  memory estimate:  12.21 MiB
  allocs estimate:  100001
  --------------
  minimum time:     3.490 ms (0.00% GC)
  median time:      4.458 ms (15.20% GC)
  mean time:        4.474 ms (9.15% GC)
  maximum time:     8.388 ms (12.12% GC)
  --------------
  samples:          1107
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

In [24]:
function sum_opt4(n,d) 
    r = zeros(d)
    for i = 1:n
        r .= r .+ sqrt(i)
    end
    return r
end

sum_opt4 (generic function with 1 method)

In [25]:
@benchmark sum_opt4(10^5,5)

BenchmarkTools.Trial: 
  memory estimate:  128 bytes
  allocs estimate:  1
  --------------
  minimum time:     1.114 ms (0.00% GC)
  median time:      1.121 ms (0.00% GC)
  mean time:        1.144 ms (0.00% GC)
  maximum time:     2.245 ms (0.00% GC)
  --------------
  samples:          4356
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
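
The benchmark figures above (12.21 MiB and 100001 allocations for `sum_opt3`, against 128 bytes and 1 allocation for `sum_opt2` and `sum_opt4`) are consistent with `sum_opt3` creating a new array on every iteration, while the other two reuse the array returned by `zeros(d)`. The sketch below isolates that difference. The helper names `add_allocating` and `add_inplace!` are hypothetical, and the allocating version is written as `r .+ s` because in Julia 1.x adding a scalar to an array with plain `+` raises a `MethodError`.

```julia
# Allocating update: builds a new array and rebinds the local name r to it.
function add_allocating(r::Vector{Float64}, s::Float64)
    r = r .+ s
    return r
end

# In-place update: writes the result into the existing array.
function add_inplace!(r::Vector{Float64}, s::Float64)
    r .= r .+ s
    return r
end

a = zeros(3)
b = add_allocating(a, 1.0)   # a is left untouched
c = add_inplace!(a, 2.0)     # a itself is overwritten

println(b === a)   # false: b is a freshly allocated array
println(c === a)   # true: the same array, updated in place
```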

#### Verify type instability using @code_warntype

In [100]:
# In this case r is Any 
# The type of r changes and this heavily penalizes speed
#@code_warntype sumsqrtn(10);

In [101]:
# In this case r is Float64 during the execution
#@code_warntype sumsqrtn2(10)

## Use ```local``` to fix the type of a variable

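If we do not want to inspect the code with **```@code_warntype```**, we can use **```local```** with a type annotation to fix the type of a variable. Every value assigned to the variable is then converted to the declared type, and an error is raised when the conversion cannot be done exactly.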

In [26]:
workspace()

In [27]:
function sumsqrtn(n::Int)
    local r::Int64 = 0
    
    for i = 1:n
        r = r + sqrt(i)
    end
    return r
end

sumsqrtn (generic function with 1 method)

In [28]:
sumsqrtn(10)

LoadError: InexactError()
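
The `InexactError` arises because `r` is declared as `Int64`, so a value such as `1 + sqrt(2)` cannot be converted to it exactly. A minimal sketch of the working counterpart, assuming the intent is a floating-point accumulator, declares `r` as `Float64` instead. The name `sumsqrtn_local` is hypothetical and not part of the original notebook.

```julia
function sumsqrtn_local(n::Int)
    local r::Float64 = 0   # the Int literal 0 is converted to 0.0 on assignment
    for i = 1:n
        r = r + sqrt(i)    # every assignment is converted to Float64
    end
    return r
end

println(typeof(sumsqrtn_local(10)))  # Float64
```

With the declaration in place, an accidental assignment of a non-convertible value fails loudly instead of silently changing the type of `r`.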