### Performance remarks

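* Global vs. local scope
* Type stability
* Memory allocations
* Beware of premature optimization
* Tuples vs. Arrays
* Annotate struct fields with concrete types; type annotations on function arguments do not improve performance
* Profiling with Profile and ProfileView
* Vectorization: not needed for performance in Julia, plain loops are fast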
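
The point about struct fields versus function arguments can be sketched as follows. The names `LooseParticle`, `TypedParticle`, `ParamParticle` and `kinetic` are hypothetical and only serve as an illustration. A field without a concrete type is stored as `Any`, so code that reads it cannot be specialized, whereas an unannotated function argument costs nothing, because Julia compiles a specialized method for each concrete argument type it is called with.

```julia
# Hypothetical types, for illustration only.
struct LooseParticle
    x              # no annotation: the field type is Any
end

struct TypedParticle
    x::Float64     # concrete field type
end

struct ParamParticle{T<:Real}
    x::T           # concrete once T is known, and still generic
end

# No argument annotation needed: Julia specializes on the argument type anyway.
kinetic(p) = 0.5 * p.x^2

println(isconcretetype(fieldtype(LooseParticle, :x)))           # false
println(isconcretetype(fieldtype(TypedParticle, :x)))           # true
println(isconcretetype(fieldtype(ParamParticle{Float64}, :x)))  # true
println(kinetic(TypedParticle(2.0)))                            # 2.0
```

The parametric variant is usually the most flexible choice, since it keeps the field concrete without fixing it to `Float64`.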

In [30]:
n,m = 10000,100
a = rand(n,m)
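function mysum(a)
    s = zero(eltype(a))   # accumulator with the element type of a, keeps the loop type-stable
    @simd for el in a     # @simd allows the additions to be reordered so the loop can vectorize
        s = s + el
    end
    s
end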

mysum (generic function with 1 method)

In [32]:
@time mysum(a)

  0.001276 seconds (5 allocations: 176 bytes)


499961.1704976735

In [36]:
@which sum(a)

**Always put performance-critical code into functions**
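
A minimal sketch of why this matters, assuming a script run at top level. The names `data`, `total` and `sum_in_function` are hypothetical. Inside a function every variable has a type the compiler can infer, while a loop at global scope works on non-constant globals whose type could change at any time.

```julia
data = rand(1000)

# Inside a function, s and el have known types, so the loop compiles to fast code.
function sum_in_function(v)
    s = zero(eltype(v))
    for el in v
        s += el
    end
    return s
end

# The same loop at global scope: total is a non-constant global,
# so every update has to go through a dynamic lookup.
total = 0.0
for el in data
    global total += el
end

println(total ≈ sum_in_function(data))   # true
```

Both versions compute the same value; only the function version gives the compiler enough information to optimize.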

### Type stability

*Use `@code_warntype` to spot type instabilities*
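
As a small sketch of how an instability can be detected, the hypothetical functions `unstable` and `stable` below differ only in the fallback value. Besides `@code_warntype`, the `@inferred` macro from the `Test` standard library can be used to check inference in a test suite.

```julia
using InteractiveUtils  # provides @code_warntype outside the REPL and notebooks
using Test              # provides @inferred

# Hypothetical examples: for a Float64 argument, `unstable` returns either
# a Float64 (x) or an Int (the literal 0), depending on the value of x.
unstable(x) = x > 0 ? x : 0
stable(x)   = x > 0 ? x : zero(x)   # zero(x) has the same type as x

@code_warntype unstable(1.5)   # the reported return type is a Union of two types
@code_warntype stable(1.5)     # the reported return type is a single concrete type

@inferred stable(1.5)          # passes silently and returns 1.5
# @inferred unstable(1.5)      # would throw: actual and inferred return types differ
```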

### Cache efficiency
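
Julia arrays are stored in column-major order, so the elements of one column are adjacent in memory. A minimal sketch of the resulting loop-order rule, with the hypothetical helpers `sum_colmajor` and `sum_rowmajor`: both return the same value, but the first walks through memory contiguously and is the one to prefer for large arrays.

```julia
# Inner loop over rows: consecutive iterations touch adjacent memory.
function sum_colmajor(a)
    s = zero(eltype(a))
    for j in axes(a, 2)        # columns in the outer loop
        for i in axes(a, 1)    # rows in the inner loop
            s += a[i, j]
        end
    end
    return s
end

# Inner loop over columns: consecutive iterations jump by one column length.
function sum_rowmajor(a)
    s = zero(eltype(a))
    for i in axes(a, 1)
        for j in axes(a, 2)
            s += a[i, j]
        end
    end
    return s
end

a = reshape(collect(1.0:6.0), 2, 3)
println(sum_colmajor(a) == sum_rowmajor(a))   # true
```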

### Memory allocations

In [66]:
n,m = 100,100000
a = rand(n,m)
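# Sum each column through a view: the column data is not copied
function colsum(a)
    n,m = size(a)
    s = zeros(eltype(a),m)
    for i_m = 1:m
        s[i_m] = sum(view(a,:,i_m))
    end
    s
end
# Sum each column through a slice: a[:,i_m] copies the column on every iteration
function colsum2(a)
    n,m = size(a)
    s = zeros(eltype(a),m)
    for i_m = 1:m
        s[i_m] = sum(a[:,i_m])
    end
    s
end
# Sum each column with an explicit loop; the inner loop runs down one column
function colsum3(a)
    n,m = size(a)
    s = zeros(eltype(a),m)
    for i_m = 1:m
        @simd for i_n = 1:n
            s[i_m] += a[i_n,i_m]
        end
    end
    s
end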

colsum3 (generic function with 1 method)

In [69]:
using BenchmarkTools
@btime colsum(a)
@btime colsum3(a);

  14.285 ms (100002 allocations: 5.34 MiB)
  94.177 ms (2 allocations: 781.33 KiB)
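
The difference between slicing and viewing a column can also be checked without a benchmarking package. The sketch below uses the `@allocated` macro from Base; the helper names `copy_col_sum` and `view_col_sum` are hypothetical, and the exact byte counts depend on the Julia version, so none are shown.

```julia
a = rand(100, 1000)

copy_col_sum(a, j) = sum(a[:, j])         # a[:, j] copies the column into a new array
view_col_sum(a, j) = sum(view(a, :, j))   # view wraps the existing memory

# Call each function once so that compilation is not counted below.
copy_col_sum(a, 1)
view_col_sum(a, 1)

bytes_copy = @allocated copy_col_sum(a, 1)
bytes_view = @allocated view_col_sum(a, 1)
println("copy: ", bytes_copy, " bytes, view: ", bytes_view, " bytes")

# The built-in reduction over the first dimension gives the same column sums.
s = vec(sum(a, dims=1))
println(s ≈ [view_col_sum(a, j) for j in axes(a, 2)])   # true
```

The copying version has to allocate at least the 100 `Float64` values of the column on every call, which is what adds up inside a loop over all columns.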


In [40]:
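a = rand(10)              # global a; the function below never uses it
function dosomething()
    # a is a new local variable here; its type depends on a runtime value
    if rand()<0.5
        a = [1.0,1.0]     # Vector{Float64}
    else
        a = [1,1]         # Vector{Int}
    end

    for ia in a
        b = ia            # b is local to the loop body
    end
    b                     # refers to the global Main.b, hence the return type Any
end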

dosomething (generic function with 1 method)

In [41]:
@code_warntype dosomething()

Body::Any
3  1 ── %1   = Random.GLOBAL_RNG::Random.MersenneTwister
   │    %2   = (Base.getfield)(%1, :idxF)::Int64
   │    %3   = Random.MT_CACHE_F::Int64
   │    %4   = (%2 === %3)::Bool
   └───        goto #3 if not %4
   2 ── %6   = $(Expr(:gc_preserve_begin, :(%1)))
   │    %7   = (Base.getfield)(%1, :state)::Random.DSFMT.DSFMT_state
   │    %8   = (Base.getfield)(%1, :vals)::Array{Float64,1}
   │    %9   = $(Expr(:foreigncall, :(:jl_array_ptr), Pt

   └───        goto #42 if not %117
   41 ─        goto #29
12 42 ─        return Main.b
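
One possible type-stable rewrite of `dosomething`, as a sketch only: the name `dosomething_stable` is hypothetical. Both branches produce a `Vector{Float64}`, and `b` is initialized in the function scope, so the loop assigns to that local variable and the function no longer falls back to a global `b`.

```julia
function dosomething_stable()
    # Both branches yield a Vector{Float64}, so the type of a is known at compile time.
    a = rand() < 0.5 ? [1.0, 1.0] : Float64[1, 1]

    b = zero(eltype(a))   # b now lives in the function scope
    for ia in a
        b = ia            # assigns to the enclosing local b
    end
    return b
end

println(dosomething_stable())   # 1.0
```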


### Additional tips

* Use mutating forms of functions
* Use functional form of reductions

In [72]:
x = rand(10000)
sort!(x)

10000-element Array{Float64,1}:
 1.3883558455107803e-5
 5.6604690438000205e-5
 0.0005575192030908838
 0.000738246141811949 
 0.0008507676806963627
 0.0008914074820833839
 0.0010088300582937748
 0.0010271315316172647
 0.001037274153034673 
 0.0011415070098375057
 0.0011539138914347102
 0.0012038245127756753
 0.0012297237868841293
 ⋮                    
 0.9989860613566759   
 0.9989911506307019   
 0.9990080084515858   
 0.9991901037248367   
 0.9991912531659881   
 0.9993670073790266   
 0.9994944602921907   
 0.9995723673831312   
 0.9997367301949187   
 0.9997961802796427   
 0.9997995674587632   
 0.9999835347399164   

In [74]:
x = [1,2,3,4,2]
replace(x,2=>5)

5-element Array{Int64,1}:
 1
 5
 3
 4
 5

In [75]:
x

5-element Array{Int64,1}:
 1
 2
 3
 4
 2

In [77]:
replace!(x,2=>5)
x

5-element Array{Int64,1}:
 1
 5
 3
 4
 5
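
The same convention can be followed in user code. The sketch below defines a hypothetical mutating function `scale!` that writes into a preallocated output array; the fused broadcast `.=` avoids allocating a temporary for the result.

```julia
# Hypothetical helper: the trailing ! signals that the first argument is modified.
function scale!(out, x, c)
    out .= c .* x     # fused broadcast, writes directly into out
    return out
end

x = [1.0, 2.0, 3.0]
out = similar(x)      # allocate the output once, reuse it across calls
scale!(out, x, 2.0)
println(out)          # [2.0, 4.0, 6.0]
```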

In [83]:
x = [i^2 for i=1:10]

sum(i for i in 1:10 if i<5)

10
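
A few more reductions in functional form, as a brief sketch; the variable names are placeholders. Passing a function as the first argument, or using a generator, avoids building the intermediate array that a comprehension would create.

```julia
x = 1:10

s_fun = sum(abs2, x)                # function form, no intermediate array
s_gen = sum(i^2 for i in x)         # generator form, no intermediate array
s_arr = sum([i^2 for i in x])       # comprehension: allocates a 10-element array first

n_even = count(iseven, x)           # number of elements for which the predicate holds
largest = mapreduce(abs2, max, x)   # map and reduce in a single pass

println((s_fun, s_gen, s_arr, n_even, largest))   # (385, 385, 385, 5, 100)
```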