## Profiling Code

Here we compare the unoptimized version of the VAE to the optimized version. We first load the MNIST data and initialize the classes.

In [30]:
import pickle, gzip
import matplotlib.pyplot as plt 
import numpy as np
import math
import random
import sys
%matplotlib inline

In [31]:
sys.path.append('..\\vae')

In [32]:
with gzip.open('..\\resources\\mnist.pkl.gz', 'rb') as f:
    train, test, val = pickle.load(f, encoding='latin1')
    mnist = train[0]

### Unoptimized Version

In [33]:
from vae_unoptimized import vae_unoptimized

In [34]:
def act(x):
    '''activation function'''
    mat = list()
    for r in x:
        row = list()
        for c in r:
            row.append(1 / (1 + math.exp(- c)))
        mat.append(row)
    return mat
    
def grad_act(x):
    '''grad of activation function'''
    mat = list()
    for r in x:
        row = list()
        for c in r:
            row.append(math.exp(- c) / (1 + math.exp(- c))**2)
        mat.append(row)
    return mat

def loss(A, B):
    '''squared error loss'''
    # Sum 0.5 * (a - b)^2 over every pair of matching entries.
    total = 0.0
    for row_a, row_b in zip(A, B):
        for a, b in zip(row_a, row_b):
            total += 0.5 * (a - b)**2
    return total

def grad_loss(A, B):
    '''grad of squared error loss'''
    mat = list()
    for i, row in enumerate(A):
        r = list()
        for j, column in enumerate(row):
            r.append( -A[i][j] + B[i][j])
        mat.append(r)
    return mat

In [35]:
params = {
    'alpha' : 0.1,
    'max_iter' : 2,
    'activation' : act,
    'grad_act' : grad_act,
    'loss' : loss,
    'grad_loss' : grad_loss,
    'mode' : 'autoencoder'
}

unopt = vae_unoptimized([784, 200], [784, 200], params)

In [36]:
%prun -q -D train_unoptimized.prof unopt.train(mnist[0:10], mnist[0:10])

 
*** Profile stats marshalled to file 'train_unoptimized.prof'. 


In [45]:
import pstats
p = pstats.Stats('train_unoptimized.prof')
p.print_stats()
pass

Sun Apr 30 15:20:47 2017    train_unoptimized.prof

         2086296 function calls in 14.491 seconds

   Random listing order was used

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        8    0.013    0.002    0.017    0.002 <ipython-input-34-a7a6fecb4bff>:1(act)
        8    0.010    0.001    0.011    0.001 ..\vae\vae_unoptimized.py:95(multiply)
        1    0.000    0.000   14.491   14.491 {built-in method builtins.exec}
        1    0.015    0.015   14.487   14.487 ..\vae\vae_unoptimized.py:249(train)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        8    0.146    0.018    0.174    0.022 ..\vae\vae_unoptimized.py:85(scalar_mult)
       18    0.000    0.000    0.013    0.001 ..\vae\vae_unoptimized.py:114(t)
       22   13.905    0.632   13.948    0.634 ..\vae\vae_unoptimized.py:72(matmult)
       18    0.013    0.001    0.013    0.001 ..\vae\vae_unoptimized.py:116(<listcomp>)
        8    0.212    0.02

Nearly all of the runtime (13.905 of 14.491 seconds) is spent in `matmult`. The feedforward and backpropagation steps depend heavily on matrix multiplication, so we will vectorize them.
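To illustrate the kind of change involved, here is a minimal sketch that contrasts a loop-based matrix product with a call to `np.dot`. The function names `matmult_loops` and `matmult_numpy` are hypothetical and only stand in for the idea; they are not the actual `matmult` from `vae_unoptimized.py`, whose source is not shown here.

```python
import numpy as np

def matmult_loops(A, B):
    """Multiply two list-of-lists matrices with explicit Python loops."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            total = 0.0
            for p in range(k):
                total += A[i][p] * B[p][j]
            C[i][j] = total
    return C

def matmult_numpy(A, B):
    """Same product, delegated to NumPy's compiled routine."""
    return np.dot(np.asarray(A, dtype=float), np.asarray(B, dtype=float))

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
print(matmult_loops(A, B))   # [[19.0, 22.0], [43.0, 50.0]]
print(matmult_numpy(A, B))
```

Both functions return the same values. The loop version executes one Python-level multiply-add per inner iteration, while the NumPy version runs the whole product in compiled code, which is where a speedup would be expected to come from.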

### Optimized Version

In [46]:
from vae import vae

In [47]:
params = {
    'alpha' : 0.1,
    'max_iter' : 2,
    'activation' : (lambda x: 1 / (1 + np.exp(-x))),
    'grad_act' : (lambda x: np.exp(-x) / (1 + np.exp(-x))**2),
    'loss' : (lambda y, yhat: 0.5 * np.sum((y - yhat)**2)),
    'grad_loss' : (lambda y, yhat: y - yhat),
    'mode' : 'autoencoder'
}

opt = vae([784, 200], [784, 200], params)
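As a quick sanity check, the sketch below compares a loop-based sigmoid with the vectorized lambda on a small input to confirm they agree. The names `act_loops` and `act_numpy` are introduced here for illustration only; the loop version mirrors the `act` helper defined earlier and the lambda mirrors the `'activation'` entry above.

```python
import math
import numpy as np

def act_loops(x):
    """Element-wise sigmoid over a list of lists, one entry at a time."""
    return [[1 / (1 + math.exp(-c)) for c in r] for r in x]

# Vectorized sigmoid: NumPy applies the expression to the whole array at once.
act_numpy = lambda x: 1 / (1 + np.exp(-x))

x = [[-1.0, 0.0], [0.5, 2.0]]
same = np.allclose(act_loops(x), act_numpy(np.array(x)))
print(same)
```

If the two versions agree on small inputs like this, differences in the profiles can be attributed to how the work is executed rather than to what is computed.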

In [48]:
%prun -q -D train_optimized.prof opt.train(mnist[0:10], mnist[0:10])

 
*** Profile stats marshalled to file 'train_optimized.prof'. 


In [49]:
import pstats
p = pstats.Stats('train_optimized.prof')
p.print_stats()
pass

Sun Apr 30 15:33:31 2017    train_optimized.prof

         36 function calls in 0.016 seconds

   Random listing order was used

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       10    0.001    0.000    0.001    0.000 <ipython-input-47-b0bed5c11f7e>:5(<lambda>)
        1    0.000    0.000    0.016    0.016 {built-in method builtins.exec}
        4    0.000    0.000    0.000    0.000 {built-in method numpy.core.multiarray.arange}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        2    0.003    0.002    0.004    0.002 ..\vae\vae.py:103(feedforward)
        2    0.000    0.000    0.000    0.000 {built-in method numpy.core.multiarray.array}
        1    0.004    0.004    0.016    0.016 ..\vae\vae.py:142(train)
        2    0.000    0.000    0.000    0.000 <ipython-input-47-b0bed5c11f7e>:7(<lambda>)
        2    0.007    0.003    0.008    0.004 ..\vae\vae.py:53(backprop)
        8    0.001    0.000    0.001    

By vectorizing matrix multiplication with NumPy, we cut the total profiled time from 14.491 seconds to 0.016 seconds, a roughly 900-fold speedup.
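The speedup can also be computed from saved profile files rather than read off the printed output. The sketch below is one way to do this, assuming the `total_tt` attribute that `pstats.Stats` exposes for the total profiled time. The helpers `total_seconds` and `speedup` and the `busy` workload are hypothetical names used for illustration; in the notebook, the two paths would be `train_unoptimized.prof` and `train_optimized.prof`.

```python
import cProfile
import os
import pstats
import tempfile

def total_seconds(prof_path):
    """Return the total profiled time stored in a .prof file."""
    return pstats.Stats(prof_path).total_tt

def speedup(slow_path, fast_path):
    """Ratio of total times; infinite if the fast run registered no time."""
    fast = total_seconds(fast_path)
    return float('inf') if fast == 0 else total_seconds(slow_path) / fast

def busy(n):
    """Hypothetical workload standing in for a training run."""
    total = 0
    for i in range(n):
        total += i * i
    return total

# Write two small profiles to a temporary directory so the example is self-contained.
tmp = tempfile.mkdtemp()
paths = {}
for name, n in [('slow', 500000), ('fast', 500)]:
    profiler = cProfile.Profile()
    profiler.runcall(busy, n)
    paths[name] = os.path.join(tmp, name + '.prof')
    profiler.dump_stats(paths[name])

ratio = speedup(paths['slow'], paths['fast'])
print('speedup: %.0fx' % ratio)

# The same arithmetic applied to the totals reported in this section:
print(round(14.491 / 0.016))  # 906
```

When inspecting a profile by hand, `pstats.Stats(path).sort_stats('tottime').print_stats(10)` lists the ten most expensive functions first instead of using a random listing order.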