# Performance of Array-Style vs Loop-Style Code

This notebook demonstrates that, as of Julia v0.6.2, loop-style code still outperforms array-style code. The benchmark code is borrowed directly from Arch D. Robison's Julia performance workshop at JuliaCon 2016.

See the video at https://www.youtube.com/watch?v=szE4txAD8mk

#### Array-style

In [1]:
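function foo(c, w, i, j, dx, dy)
    # each slice copies a column, and the subtraction allocates another vector
    dw = w[:,i] - w[:,j]
    # `+=` on a slice reads a copy, adds a newly scaled vector, then writes back
    c[:,1] += dw * dx
    c[:,2] += dw * dy
end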

#### Loop-style

In [2]:
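function bar(c, w, i, j, dx, dy)
    m = size(w, 1)  # number of rows; the column count is not needed
    for k = 1:m
        # scalar arithmetic only, so no temporary arrays are created
        dw = w[k,i] - w[k,j]
        c[k,1] += dw * dx
        c[k,2] += dw * dy
    end
end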

bar (generic function with 1 method)

#### Verify both functions work the same way

In [3]:
# sample data to work with
z = rand(10,10)

10×10 Array{Float64,2}:
 0.865573  0.729865  0.969893  0.262898   …  0.654395   0.609925   0.426222 
 0.475022  0.150922  0.730453  0.607109      0.513592   0.801204   0.271387 
 0.885364  0.65157   0.248171  0.771997      0.0775057  0.855948   0.0049222
 0.372987  0.988638  0.927938  0.163775      0.290307   0.607091   0.780549 
 0.336583  0.426304  0.528664  0.0754067     0.911839   0.0213179  0.19275  
 0.238581  0.219954  0.713346  0.106057   …  0.525553   0.635506   0.340465 
 0.370079  0.579456  0.211741  0.126389      0.646627   0.589706   0.624614 
 0.585515  0.384464  0.261348  0.965125      0.560094   0.14339    0.896444 
 0.678774  0.599189  0.184715  0.16788       0.460699   0.168255   0.543777 
 0.272665  0.466196  0.789117  0.809847      0.938572   0.990326   0.215268 

In [4]:
# test foo function
c = zeros(10, 2)
foo(c, z, 1, 2, 3.0, 5.0)
c

10×2 Array{Float64,2}:
  0.407125    0.678542 
  0.9723      1.6205   
  0.701382    1.16897  
 -1.84695    -3.07826  
 -0.269164   -0.448607 
  0.0558799   0.0931332
 -0.628131   -1.04689  
  0.603151    1.00525  
  0.238756    0.397927 
 -0.580595   -0.967659 

In [5]:
# test bar function; it should return the same results as foo above
c = zeros(10, 2)
bar(c, z, 1, 2, 3.0, 5.0)
c

10×2 Array{Float64,2}:
  0.407125    0.678542 
  0.9723      1.6205   
  0.701382    1.16897  
 -1.84695    -3.07826  
 -0.269164   -0.448607 
  0.0558799   0.0931332
 -0.628131   -1.04689  
  0.603151    1.00525  
  0.238756    0.397927 
 -0.580595   -0.967659 
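
Comparing the two printed matrices by eye works for a 10×2 result, but a numeric check makes the comparison explicit. The sketch below restates `foo` and `bar` so that it runs on its own. The names `c_foo`, `c_bar` and `max_diff` are introduced here for illustration, and the `1e-12` tolerance is an assumption chosen to allow for floating-point rounding, not a value from the workshop.

```julia
# Array-style kernel, restated from the cell above
function foo(c, w, i, j, dx, dy)
    dw = w[:,i] - w[:,j]
    c[:,1] += dw * dx
    c[:,2] += dw * dy
end

# Loop-style kernel, restated from the cell above
function bar(c, w, i, j, dx, dy)
    m, n = size(w)
    for k = 1:m
        dw = w[k,i] - w[k,j]
        c[k,1] += dw * dx
        c[k,2] += dw * dy
    end
end

z = rand(10, 10)

# run each kernel on its own zeroed accumulator
c_foo = zeros(10, 2)
foo(c_foo, z, 1, 2, 3.0, 5.0)

c_bar = zeros(10, 2)
bar(c_bar, z, 1, 2, 3.0, 5.0)

# largest absolute difference between the two results
max_diff = maximum(abs.(c_foo .- c_bar))
println("results agree: ", max_diff < 1e-12)
```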

#### Create a new function for benchmarking

In [6]:
function fb(g, z)
    # allocate a fresh accumulator so that every run starts from zeros
    c = zeros(10, 2)
    g(c, z, 1, 2, 3.0, 5.0)
    return c
end

fb (generic function with 1 method)

#### Start benchmarking

In [7]:
using BenchmarkTools

In [8]:
# warm up: call each function once so that both are compiled before benchmarking
fb(foo, z);
fb(bar, z);

In [9]:
@btime fb(foo, z)

  597.367 ns (16 allocations: 1.73 KiB)


10×2 Array{Float64,2}:
  0.407125    0.678542 
  0.9723      1.6205   
  0.701382    1.16897  
 -1.84695    -3.07826  
 -0.269164   -0.448607 
  0.0558799   0.0931332
 -0.628131   -1.04689  
  0.603151    1.00525  
  0.238756    0.397927 
 -0.580595   -0.967659 
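
The allocations reported for `fb(foo, z)` are consistent with the temporaries that array-style code creates: each column slice is a copy, and each arithmetic step produces a new vector. One way to look at this directly is Base's `@allocated` macro, sketched below. The helper `bytes_allocated` is hypothetical (it is not part of the workshop code), and the byte counts it reports will depend on the Julia version and machine.

```julia
# Array-style kernel, restated from the notebook
function foo(c, w, i, j, dx, dy)
    dw = w[:,i] - w[:,j]
    c[:,1] += dw * dx
    c[:,2] += dw * dy
end

# Loop-style kernel, restated from the notebook
function bar(c, w, i, j, dx, dy)
    m, n = size(w)
    for k = 1:m
        dw = w[k,i] - w[k,j]
        c[k,1] += dw * dx
        c[k,2] += dw * dy
    end
end

# Hypothetical helper: bytes allocated by one call of kernel g.
# Measuring inside a function keeps untyped globals out of the measurement.
function bytes_allocated(g, c, w)
    g(c, w, 1, 2, 3.0, 5.0)                    # first call compiles g
    return @allocated g(c, w, 1, 2, 3.0, 5.0)  # second call is measured
end

z = rand(10, 10)
c = zeros(10, 2)   # reused accumulator; its values are irrelevant here

foo_bytes = bytes_allocated(foo, c, z)
bar_bytes = bytes_allocated(bar, c, z)
println("foo: ", foo_bytes, " bytes, bar: ", bar_bytes, " bytes")
```

Unlike `fb`, this helper does not allocate the accumulator inside the measured call, so the numbers reflect the kernels alone.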

In [10]:
@btime fb(bar, z)

  87.114 ns (1 allocation: 240 bytes)


10×2 Array{Float64,2}:
  0.407125    0.678542 
  0.9723      1.6205   
  0.701382    1.16897  
 -1.84695    -3.07826  
 -0.269164   -0.448607 
  0.0558799   0.0931332
 -0.628131   -1.04689  
  0.603151    1.00525  
  0.238756    0.397927 
 -0.580595   -0.967659 
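
A possible middle ground between the two styles is to keep array syntax but remove most temporaries with views and fused broadcasting. The function `baz` below is a hypothetical variant, not part of the workshop code, and it assumes a Julia version that provides `@views` and dot-fusion (0.6 or later). No timing is claimed for it here; the sketch only checks that it produces the same result as `bar`.

```julia
# Loop-style kernel, restated from the notebook as the reference
function bar(c, w, i, j, dx, dy)
    m, n = size(w)
    for k = 1:m
        dw = w[k,i] - w[k,j]
        c[k,1] += dw * dx
        c[k,2] += dw * dy
    end
end

# Hypothetical array-style variant using views and fused broadcasts
function baz(c, w, i, j, dx, dy)
    @views begin
        dw = w[:, i] .- w[:, j]   # slices are views; one vector is allocated for dw
        c[:, 1] .+= dw .* dx      # fused update written in place into column 1
        c[:, 2] .+= dw .* dy      # fused update written in place into column 2
    end
    return c
end

z = rand(10, 10)

c_bar = zeros(10, 2)
bar(c_bar, z, 1, 2, 3.0, 5.0)

c_baz = zeros(10, 2)
baz(c_baz, z, 1, 2, 3.0, 5.0)

# largest absolute difference between the loop and the fused variant
max_diff = maximum(abs.(c_bar .- c_baz))
println("baz matches bar: ", max_diff < 1e-12)
```

To time it in the same way as the cells above, `baz` can be passed to the notebook's wrapper, for example `@btime fb(baz, $z)`. The `$` interpolates the global `z` into the benchmark expression, which BenchmarkTools recommends so that access to an untyped global is not part of the measurement.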