## Benchmarking Perceptron


#### About profiling Julia code

- https://thirld.com/blog/2015/05/30/julia-profiling-cheat-sheet/
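
As a rough illustration of the workflow the cheat sheet covers, here is a minimal sketch of the built-in sampling profiler. It assumes Julia 1.x, where `Profile` is a standard library that has to be imported (the commented cells at the end of this notebook call it without an import). The function `busy` is a hypothetical stand-in for the code under study.

```julia
using Profile   # standard library in Julia 1.x (assumption: not the 0.6 build used in this notebook)

# Hypothetical workload: sum of square roots, just something that takes measurable time.
function busy(n)
    s = 0.0
    for i in 1:n
        s += sqrt(i)
    end
    return s
end

busy(10)                      # run once first so compilation is not part of the profile
Profile.clear()               # drop samples left over from earlier runs
@profile busy(10^8)           # collect stack samples while the call runs
Profile.print(maxdepth = 8)   # text report of where the samples landed
```

The warm-up call matters: without it, the first samples mostly show the compiler rather than the function itself.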

In [1]:
versioninfo()

Julia Version 0.6.0-dev.2417
Commit e63fec8 (2017-01-27 19:42 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin13.4.0)
  CPU: Intel(R) Core(TM) i7-3720QM CPU @ 2.60GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, ivybridge)


In [None]:
using MNIST
using BenchmarkTools

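# The modules live in ../source relative to this notebook
source_path = joinpath(dirname(pwd()), "source")

# Register the directory only once so the `using` lines below can find the modules
if !(source_path in LOAD_PATH)
    push!(LOAD_PATH, source_path)
end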

using MulticlassPerceptron6
using MulticlassPerceptron5
using MulticlassPerceptron4
using MulticlassPerceptron3
using MulticlassPerceptron2
using MulticlassPerceptron

percep1 = MulticlassPerceptron.MPerceptron(Float32, 10, 784)
percep2 = MulticlassPerceptron2.MPerceptron(Float32, 10, 784)
percep3 = MulticlassPerceptron3.MPerceptron(Float32, 10, 784)
percep4 = MulticlassPerceptron4.MPerceptron(Float32, 10, 784)
percep5 = MulticlassPerceptron5.MPerceptron(Float32, 10, 784)
percep6 = MulticlassPerceptron6.MPerceptron(Float32, 10, 784)
#percep7 = MulticlassPerceptron7.MPerceptron(Float32, 10, 784)

n_classes = 10
n_features = 784
T = Float32

#percep1 = MulticlassPerceptron.MPerceptron{T}(rand(T, n_classes,n_features), zeros(T, n_classes), n_classes, n_features)
#percep2 = MulticlassPerceptron2.MPerceptron{T}(rand(T, n_classes,n_features), zeros(T, n_classes), n_classes, n_features)
#percep3 = MulticlassPerceptron3.MPerceptron{T}(rand(T, n_classes,n_features), zeros(T, n_classes), n_classes, n_features)
#percep4 = MulticlassPerceptron4.MPerceptron{T}(rand(T, n_classes,n_features), zeros(T, n_classes), n_classes, n_features)
#percep5 = MulticlassPerceptron5.MPerceptron{T}(rand(T, n_classes,n_features), zeros(T, n_classes), n_classes, n_features)

In [None]:
X_train, y_train = MNIST.traindata();
X_test, y_test = MNIST.testdata();
# Shift labels from 0-9 to 1-10 so they can be used as 1-based indices
y_train = y_train .+ 1
y_test = y_test .+ 1;

In [None]:
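T = Float32
# Min-max scale the pixels to [0, 1], then convert to T
X_train = Array{T}((X_train .- minimum(X_train)) ./ (maximum(X_train) - minimum(X_train)))
y_train = Array{Int64}(y_train)
# Convert after dividing, as for X_train, so that X_test is also an Array{T}
X_test = Array{T}((X_test .- minimum(X_test)) ./ (maximum(X_test) - minimum(X_test)))
y_test = Array{Int64}(y_test);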

In [6]:
@benchmark MulticlassPerceptron.fit!(percep1, X_train, y_train, 1, 0.0001)

LoadError: InterruptException:

#### MulticlassPerceptron2

Uses views into the training matrix instead of copying each example.
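
To make the difference concrete, here is a small sketch (Julia 1.x syntax, toy data rather than MNIST) of taking one column as a copy and as a view. The variable names are illustrative and are not taken from the `MulticlassPerceptron2` source.

```julia
# Toy design matrix: 2 features x 3 examples, one example per column.
X = Float32[1 2 3; 4 5 6]

x_copy = X[:, 2]          # indexing with a slice allocates a new Vector{Float32}
x_view = view(X, :, 2)    # SubArray that aliases column 2 of X, no data copied

x_copy[1] = 100f0         # changes only the copy
x_view[2] = 200f0         # writes through to X[2, 2]
```

Because the view shares memory with `X`, reading an example through it costs no allocation per update, which is the saving this variant is after.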

In [6]:
@benchmark MulticlassPerceptron2.fit!(percep2, X_train, y_train, 1, 0.0001)

Accuracy epoch 1 is :0.56985
Accuracy epoch 1 is :0.6920833333333334
Accuracy epoch 1 is :0.74235
Accuracy epoch 1 is :0.7710333333333333
Accuracy epoch 1 is :0.7885666666666666
Accuracy epoch 1 is :0.8020166666666667
Accuracy epoch 1 is :0.8117666666666666
Accuracy epoch 1 is :0.8190666666666667
Accuracy epoch 1 is :0.8251833333333334


BenchmarkTools.Trial: 
  memory estimate:  1.14 GiB
  allocs estimate:  2401680
  --------------
  minimum time:     1.588 s (3.78% GC)
  median time:      1.599 s (3.79% GC)
  mean time:        1.611 s (3.75% GC)
  maximum time:     1.658 s (3.70% GC)
  --------------
  samples:          4
  evals/sample:     1

#### MulticlassPerceptron3

- using inbounds
- using views
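
For reference, a minimal sketch of what an `@inbounds` inner loop can look like. `row_dot` is a hypothetical helper, not a function from `MulticlassPerceptron3`, and it assumes ordinary 1-based arrays. The sizes are validated once up front, which is what makes skipping the per-element bounds checks safe.

```julia
# Dot product of row `c` of W with x, with bounds checks elided inside the loop.
function row_dot(W::AbstractMatrix{T}, c::Integer, x::AbstractVector{T}) where {T}
    # Validate once, outside the hot loop.
    size(W, 2) == length(x) ||
        throw(DimensionMismatch("W has $(size(W, 2)) columns, x has $(length(x)) entries"))
    1 <= c <= size(W, 1) || throw(BoundsError(W, (c, 1)))

    s = zero(T)
    @inbounds for j in 1:size(W, 2)
        s += W[c, j] * x[j]
    end
    return s
end
```

Usage: `row_dot(W, 2, view(X, :, m))` scores class 2 for example `m` without copying the column.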

In [7]:
@benchmark MulticlassPerceptron3.fit!(percep3, X_train, y_train, 1, 0.0001)

Accuracy epoch 1 is :0.5972666666666666
Accuracy epoch 1 is :0.7052
Accuracy epoch 1 is :0.74975
Accuracy epoch 1 is :0.7757666666666667
Accuracy epoch 1 is :0.7936666666666666
Accuracy epoch 1 is :0.8056166666666666
Accuracy epoch 1 is :0.8147
Accuracy epoch 1 is :0.8214333333333333
Accuracy epoch 1 is :0.82745
Accuracy epoch 1 is :0.8326333333333333
Accuracy epoch 1 is :0.8377833333333333
Accuracy epoch 1 is :0.8416166666666667


#### MulticlassPerceptron4

- using inbounds
- using views
- prediction vector preallocated beforehand
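
One way to preallocate the prediction vector is sketched below. This assumes Julia 1.x, where `mul!` comes from the `LinearAlgebra` standard library; `predict!` and its argument names are hypothetical and not the module's actual API.

```julia
using LinearAlgebra   # provides mul! in Julia 1.x

# Hypothetical helper: overwrite `scores` with W*x + b and return the winning class.
function predict!(scores::AbstractVector{T}, W::AbstractMatrix{T},
                  b::AbstractVector{T}, x::AbstractVector{T}) where {T}
    mul!(scores, W, x)    # scores = W * x, written into the existing buffer
    scores .+= b          # in-place broadcast, no new array
    return argmax(scores)
end

# Allocate the buffer once, then reuse it for every example.
W = Float32[1 0; 0 1; 1 1]        # 3 classes x 2 features
b = zeros(Float32, 3)
scores = zeros(Float32, 3)
y_hat = predict!(scores, W, b, Float32[1, 2])
```

The buffer is created once outside the training loop, so each prediction reuses the same memory instead of allocating a fresh score vector.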


In [5]:
@benchmark MulticlassPerceptron4.fit!(percep4, X_train, y_train, 1, 0.0001)

  0.186155 seconds (360.00 k allocations: 9.155 MiB)
Accuracy epoch 1 is :0.6003
  0.171435 seconds (360.00 k allocations: 9.155 MiB)
Accuracy epoch 1 is :0.7099833333333333
  0.171120 seconds (360.00 k allocations: 9.155 MiB, 1.64% gc time)
Accuracy epoch 1 is :0.7548666666666667
  0.193690 seconds (360.00 k allocations: 9.155 MiB)
Accuracy epoch 1 is :0.77975
  0.167620 seconds (360.00 k allocations: 9.155 MiB)
Accuracy epoch 1 is :0.7952833333333333
  0.162997 seconds (360.00 k allocations: 9.155 MiB)
Accuracy epoch 1 is :0.80765
  0.174270 seconds (360.00 k allocations: 9.155 MiB, 1.69% gc time)
Accuracy epoch 1 is :0.81675
  0.166941 seconds (360.00 k allocations: 9.155 MiB)
Accuracy epoch 1 is :0.824
  0.180057 seconds (360.00 k allocations: 9.155 MiB, 1.04% gc time)
Accuracy epoch 1 is :0.8295
  0.162910 seconds (360.00 k allocations: 9.155 MiB)
Accuracy epoch 1 is :0.8340666666666666
  0.165267 seconds (360.00 k allocations: 9.155 MiB)
Accuracy epoch 1 is :0.8385
  0.200064 sec

BenchmarkTools.Trial: 
  memory estimate:  85.60 MiB
  allocs estimate:  1162211
  --------------
  minimum time:     895.936 ms (1.49% GC)
  median time:      942.764 ms (1.29% GC)
  mean time:        947.475 ms (1.26% GC)
  maximum time:     988.338 ms (1.03% GC)
  --------------
  samples:          6
  evals/sample:     1

#### MulticlassPerceptron5

- prediction vector preallocated beforehand
- using inbounds
- copying the current datapoint x = X[:,m] at every update
- No in-place ops (`learning_rate * x` is not dotted, so it builds a temporary array first):  
    ```h.W[y_tr[m], :] .+= learning_rate * x```

In [9]:
@benchmark MulticlassPerceptron5.fit!(percep5, X_train, y_train, 1, 0.0001)

Accuracy epoch 1 is :0.6022166666666666
Accuracy epoch 1 is :0.7082
Accuracy epoch 1 is :0.7526833333333334
Accuracy epoch 1 is :0.7774166666666666
Accuracy epoch 1 is :0.79345
Accuracy epoch 1 is :0.8058666666666666
Accuracy epoch 1 is :0.81435
Accuracy epoch 1 is :0.8225166666666667
Accuracy epoch 1 is :0.8285333333333333
Accuracy epoch 1 is :0.8335833333333333
Accuracy epoch 1 is :0.8373166666666667
Accuracy epoch 1 is :0.8413333333333334


BenchmarkTools.Trial: 
  memory estimate:  510.30 MiB
  allocs estimate:  735228
  --------------
  minimum time:     1.350 s (4.41% GC)
  median time:      1.378 s (3.83% GC)
  mean time:        1.379 s (4.02% GC)
  maximum time:     1.409 s (3.14% GC)
  --------------
  samples:          4
  evals/sample:     1

#### MulticlassPerceptron6

- prediction vector preallocated beforehand
- using inbounds
- copying the current datapoint x = X[:,m] at every update
- In-place operations (every operator is dotted, so the multiply and the add fuse into one loop):
    ```h.W[y_tr[m], :] .+= learning_rate .* x```
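
The two update styles can be compared on a toy weight matrix. This is a sketch in Julia 1.x syntax with made-up sizes. Both lines produce the same numbers; the only difference is whether `lr * x` is materialized as a separate array before the broadcast.

```julia
W  = zeros(Float32, 3, 4)     # 3 classes x 4 features
x  = Float32[1, 2, 3, 4]
lr = 0.5f0

# Undotted multiply: `lr * x` is computed into a temporary vector first,
# and that temporary is then added into row 2.
W[2, :] .+= lr * x

# Fully dotted: the multiply and the add are fused into a single broadcast,
# so no temporary vector is built for `lr .* x`.
W[3, :] .+= lr .* x
```

Whether the fused form is actually faster depends on the rest of the loop, so it is worth measuring rather than assuming.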


In [8]:
@benchmark MulticlassPerceptron6.fit!(percep6, X_train, y_train, 1, 0.0001)

Accuracy epoch 1 is :0.8281833333333334
Accuracy epoch 1 is :0.83305
Accuracy epoch 1 is :0.8364
Accuracy epoch 1 is :0.8397666666666667
Accuracy epoch 1 is :0.8430833333333333
Accuracy epoch 1 is :0.84555
Accuracy epoch 1 is :0.8481
Accuracy epoch 1 is :0.8503666666666667


BenchmarkTools.Trial: 
  memory estimate:  268.95 MiB
  allocs estimate:  1231924
  --------------
  minimum time:     1.655 s (2.17% GC)
  median time:      1.788 s (1.68% GC)
  mean time:        1.773 s (1.72% GC)
  maximum time:     1.875 s (1.60% GC)
  --------------
  samples:          3
  evals/sample:     1

#### MulticlassPerceptron7: not working... why?

In [11]:
typeof(X_train)

Array{Float32,2}

In [12]:
#@benchmark MulticlassPerceptron7.fit!(percep7, X_train, y_train, 1, 0.0001)

#### Profiling the code

In [13]:
#Profile.clear()
#@profile MulticlassPerceptron5.fit!(percep5, X_train, y_train, 1, 0.0001)

In [14]:
#using ProfileView
#ProfileView.view()

In [15]:
#a = zeros(Float32,10)

#MulticlassPerceptron5.predict(percep5, X_train[:,3],a)

In [16]:
#@benchmark MulticlassPerceptron5.predict(percep5, X_train[:,3],a)