# Neural Network Benchmark

In [1]:
import numpy as np
import pandas as pd
import sklearn as sk
import sklearn.metrics
from IPython.display import Image
from addutils import css_notebook
import time
import json
css_notebook()

In [2]:
import bokeh.plotting as bk
bk.output_notebook()

## 1 Keras - Theano Benchmark

In this series of notebooks we compare three different neural network libraries on a regression problem over temporal data. The task is to predict the value of the next event using a fully connected feed-forward network. Each notebook fits and predicts the data with a different library, varying the batch size and recording both the prediction error (RMSE) and the time needed to fit the data.

This notebook uses Keras with Theano as its backend. Please refer to the previous notebooks in the series for installation instructions and how-tos.

## 2 Load data

For this series of tests we used a dataset with more examples and an increased model complexity. The dataset comes from the *UCI Machine Learning Repository*: download the [Individual household electric power consumption](http://archive.ics.uci.edu/ml/machine-learning-databases/00235/household_power_consumption.zip) dataset, which has more than two million examples, into `tutorials/machine_learning/example_data/` and unzip it. It contains measurements of electric power consumption in one household, sampled once per minute over a period of almost 4 years. Different electrical quantities and some sub-metering values are available. In our experiment we use *Voltage* as the reference quantity.

The settings for these experiments are similar to the ones used in previous notebooks. It is possible to choose the number of features (i.e. the number of inputs) and the number of future steps to predict. The variable `percentage` is the fraction of the whole dataset used for training. For example, 0.7 means that 70% of the examples are used for training. The remaining 30% of the examples are split equally between the test set and the validation set.

In [3]:
n_inputs = 100
steps_forward = 1
percentage = 0.7

In [4]:
input_file = 'example_data/household_power_consumption.txt'

In [5]:
x = pd.read_csv(input_file, sep=';', na_values='?') 
# automatically converting '?' to NaN removes warning
a = x['Voltage']
a = a[a.notnull()]
a = a.values

vec_size = n_inputs + steps_forward - 1

X = np.zeros((a.size-vec_size,n_inputs))

y = np.zeros((a.size-vec_size,1))
for r in range(a.size-vec_size):
    X[r,:] = a[r:r+n_inputs]
    y[r,:] = a[r+vec_size]

split = int(a.size*percentage) - n_inputs
vt_split = int((X.shape[0]-split)/2)
X_train = X[:split]
X_test = X[split+vt_split:] 
X_valid = X[split:split+vt_split] 
y_train = y[:split]
y_test = y[split+vt_split:]
y_valid = y[split:split+vt_split]
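
The explicit loop above fills `X` one row at a time, which can be slow for two million rows. As a hedged alternative, the same windows can be built with NumPy's `sliding_window_view` (available from NumPy 1.20). The helper name `make_windows` is hypothetical and not part of the original notebook; it is a minimal sketch that is meant to reproduce the `X` and `y` arrays of the loop.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(a, n_inputs, steps_forward=1):
    """Build (X, y) from the 1-D series `a` without a Python loop.

    Hypothetical helper: row r of X is a[r:r+n_inputs] and the target
    y[r] is a[r + n_inputs + steps_forward - 1], as in the loop above.
    """
    a = np.asarray(a, dtype=float)
    vec_size = n_inputs + steps_forward - 1
    n_rows = a.size - vec_size
    # The view has one row per possible window; keep only the rows
    # that still have a target value available, then copy out of the view.
    X = sliding_window_view(a, n_inputs)[:n_rows].copy()
    # The target for row r is a[r + vec_size].
    y = a[vec_size:vec_size + n_rows].reshape(-1, 1).copy()
    return X, y

# Tiny usage example on a toy series
X_demo, y_demo = make_windows(np.arange(10.0), n_inputs=3, steps_forward=1)
print(X_demo.shape, y_demo.shape)
```

The `.copy()` calls materialise the arrays so that later in-place changes do not write back into the original series.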

In [6]:
print('X_train shape: %d, %d' % X_train.shape)
print('X_test shape: %d, %d' % X_test.shape)
print('X_valid shape: %d, %d' % X_valid.shape)
print('y_train shape: %d, %d' % y_train.shape)
print('y_test shape: %d, %d' % y_test.shape)
print('y_valid shape: %d, %d' % y_valid.shape)

X_train shape: 1434396, 100
X_test shape: 307392, 100
X_valid shape: 307392, 100
y_train shape: 1434396, 1
y_test shape: 307392, 1
y_valid shape: 307392, 1
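
The shapes printed above follow directly from the split arithmetic in the data-loading cell. The sketch below repeats that arithmetic in isolation. The function name `split_sizes` is hypothetical, and the series length of 2049280 is not stated in the notebook: it is derived from the printed shapes (the three row counts sum to 2049180, which is the series length minus `vec_size = 100`).

```python
def split_sizes(n_samples, n_inputs, steps_forward, percentage):
    """Return (n_train, n_valid, n_test) using the notebook's split rule."""
    vec_size = n_inputs + steps_forward - 1
    n_rows = n_samples - vec_size                    # rows in X and y
    split = int(n_samples * percentage) - n_inputs   # end of the training rows
    vt_split = int((n_rows - split) / 2)             # size of the validation block
    n_train = split
    n_valid = vt_split
    n_test = n_rows - split - vt_split               # everything that is left
    return n_train, n_valid, n_test

# Series length derived from the printed shapes (an assumption, see above)
print(split_sizes(2049280, n_inputs=100, steps_forward=1, percentage=0.7))
```

Note that `percentage` is applied to the length of the raw series rather than to the number of rows, so the training share of the rows is very slightly below 70%.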


## 3 Test varying batch size

In the following section we perform fitting and prediction of the dataset, using several batch sizes and for each size we record execution time and error. The error measure is Root Mean Squared Error. 
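
For reference, the Root Mean Squared Error is the square root of the mean of the squared differences between targets and predictions. A minimal NumPy sketch is given below; the function name `rmse` is hypothetical, and the notebook itself computes the same quantity with `sklearn.metrics.mean_squared_error` followed by `np.sqrt`.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Squared Error between two arrays of equal length."""
    # Flatten both inputs so that (n, 1) and (n,) arrays compare element-wise.
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Every prediction is off by exactly 3, so the RMSE is 3
print(rmse([0.0, 0.0], [3.0, 3.0]))
```

Because the target here is *Voltage*, the error is expressed in the same unit as the data, i.e. volts.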

Keras uses Theano as its backend. To choose the device (CPU or GPU), modify the `.theanorc` file. You can find a sample `.theanorc` in the utilities directory. The configuration documentation can be found [here](http://deeplearning.net/software/theano/library/config.html). In particular, the `[global]` section lets you specify the default dtype of Theano variables (used by Keras), for example with a statement like `floatX = float32`, and which device to use; for example, `device = gpu0` selects the first GPU on the machine.
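
As an alternative to editing `.theanorc`, Theano also reads the `THEANO_FLAGS` environment variable, a comma-separated list of `key=value` pairs. The sketch below sets the two options mentioned above from Python. The dictionary name `theano_settings` is hypothetical, and the values are only the examples from the text; adapt them to your machine.

```python
import os

# Example settings mirroring the [global] entries discussed above
# (illustrative values, not a recommendation).
theano_settings = {"floatX": "float32", "device": "gpu0"}

# THEANO_FLAGS must be set before Theano (or Keras with the Theano
# backend) is imported, otherwise it has no effect on this session.
os.environ["THEANO_FLAGS"] = ",".join(
    "%s=%s" % (key, value) for key, value in theano_settings.items()
)

print(os.environ["THEANO_FLAGS"])
```

Flags given this way apply only to the current process, which is convenient when comparing CPU and GPU runs without touching the configuration file.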

In [7]:
import keras.models as kmodels
import keras.layers.core as klcore
import keras.optimizers as kopt
from keras.callbacks import EarlyStopping

Using Theano backend.


Using gpu device 0: GeForce GTX TITAN X (CNMeM is disabled)


The network architecture and training settings are as follows:

* Two hidden layers
* First hidden layer with 100 neurons
* Second hidden layer with 50 neurons
* Dropout with p=0.5 after each hidden layer
* Gradient Descent with momentum
* Learning Rate = 0.01
* Momentum = 0.9
* Early Stopping
* One neuron with linear activation in output
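
To make the "Gradient Descent with momentum" item concrete, the sketch below shows the classical momentum update on a toy problem, using the learning rate and momentum listed above. The function name `sgd_momentum_step` and the quadratic toy loss are assumptions made for illustration; this is not the optimizer code that Keras runs.

```python
import numpy as np

def sgd_momentum_step(params, grads, velocity, learning_rate=0.01, momentum=0.9):
    """One classical momentum update: returns (new_params, new_velocity)."""
    # The velocity accumulates an exponentially decaying sum of past gradients.
    velocity = momentum * velocity - learning_rate * grads
    return params + velocity, velocity

# Toy example: minimise f(w) = w**2, whose gradient is 2*w
w = np.array([1.0])
v = np.zeros_like(w)
for _ in range(200):
    w, v = sgd_momentum_step(w, 2.0 * w, v)
print(w)
```

With momentum 0.9, consecutive gradients that point in the same direction build up speed, which usually lets training progress faster than plain gradient descent at the same learning rate.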

The function `fit_predict` creates a neural network with this architecture and a given batch size, then performs fitting and prediction on the dataset. It returns the time elapsed during fitting and the error measure.

In [8]:
def fit_predict(batch_size, X_train, y_train, X_test, y_test, X_valid, y_valid):
    hidden_size1 = 100    # Number of hidden units first layer
    hidden_size2 = 50    # Number of hidden units second layer
    p = 0.5
    epochs = 100
    learning_rate = 0.01
    momentum = 0.9

    nn = kmodels.Sequential()
    nn.add(klcore.Dense(hidden_size1, input_dim=n_inputs, init='normal'))
    nn.add(klcore.Activation('tanh'))
    nn.add(klcore.Dropout(p))
    nn.add(klcore.Dense(hidden_size2, init='normal'))
    nn.add(klcore.Activation('tanh'))
    nn.add(klcore.Dropout(p))
    nn.add(klcore.Dense(1, init='normal'))
    nn.add(klcore.Activation('linear'))

    sgd = kopt.SGD(lr=learning_rate, momentum=momentum)
    nn.compile(loss='mean_squared_error', optimizer=sgd)
    early = EarlyStopping()

    t0 = time.time()
    #nn.fit(X_train, y_train, verbose=0, nb_epoch=epochs, batch_size=batch_size)
    nn.fit(X_train, y_train, verbose=0, nb_epoch=epochs, batch_size=batch_size, 
           validation_data=(X_valid, y_valid), callbacks=[early])
    fit_time = time.time() - t0
    preds = nn.predict(X_test, verbose=0, batch_size=batch_size)
    err = np.sqrt(sk.metrics.mean_squared_error(preds, y_test))
    return fit_time, err
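
Because `EarlyStopping()` is created with its default arguments, training can end long before the 100 epochs are used, so the recorded fit times also depend on when the validation loss stops improving. The sketch below illustrates the general patience rule in plain Python. The function `stopping_epoch` is hypothetical and simplified; it is not Keras' implementation, and the default of `patience=0` is an assumption about the Keras version used here.

```python
def stopping_epoch(val_losses, patience=0):
    """Return the index of the epoch at which training would stop.

    Simplified rule: stop once the validation loss has failed to improve
    on the best value seen so far for more than `patience` epochs in a row.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait > patience:
                return epoch
    # No stop was triggered: training runs through the last epoch.
    return len(val_losses) - 1

# The loss gets worse at epoch 2, so training stops there when patience is 0
print(stopping_epoch([5.0, 4.0, 4.5, 3.0]))
```

With a patience of zero, a single noisy validation epoch is enough to end training, which is worth keeping in mind when comparing fit times across batch sizes.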

In [9]:
keras_result = {'batch_size':[], 'error':[], 'fit_time':[]}
for batch in [128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536, 131072]:
    keras_result['batch_size'].append(batch)
    t, err = fit_predict(batch, X_train, y_train, X_test, y_test, X_valid, y_valid)
    keras_result['error'].append(err)
    keras_result['fit_time'].append(t)

In [10]:
pd.DataFrame(keras_result)

| | batch_size | error | fit_time |
|---|---|---|---|
| 0 | 128 | 2.689021 | 22.78714 |
| 1 | 256 | 3.257902 | 22.597849 |
| 2 | 512 | 2.819646 | 5.967604 |
| 3 | 1024 | 2.7344 | 4.596805 |
| 4 | 2048 | 2.729561 | 5.412578 |
| 5 | 4096 | 2.683255 | 5.942875 |
| 6 | 8192 | 2.760422 | 6.067741 |
| 7 | 16384 | 2.907761 | 8.491305 |
| 8 | 32768 | 3.918299 | 4.112153 |
| 9 | 65536 | 17.775959 | 2.692263 |
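
As a small follow-up, the results can be queried for the batch size with the lowest error and the one with the shortest fit time. The sketch below copies the values from the results above into a new DataFrame so that it runs on its own; the variable names `results`, `best` and `fastest` are hypothetical.

```python
import pandas as pd

# Values copied from the results reported above
results = pd.DataFrame({
    'batch_size': [128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536],
    'error': [2.689021, 3.257902, 2.819646, 2.7344, 2.729561,
              2.683255, 2.760422, 2.907761, 3.918299, 17.775959],
    'fit_time': [22.78714, 22.597849, 5.967604, 4.596805, 5.412578,
                 5.942875, 6.067741, 8.491305, 4.112153, 2.692263],
})

# Row with the lowest RMSE and row with the shortest fit time (seconds)
best = results.loc[results['error'].idxmin()]
fastest = results.loc[results['fit_time'].idxmin()]
print(int(best['batch_size']), best['error'])
print(int(fastest['batch_size']), fastest['fit_time'])
```

In this run the lowest error is obtained with a batch size of 4096, while the fastest fit (batch size 65536) also has by far the largest error.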


Choose the name and location of the destination file where results will be stored.

In [11]:
output_file = 'temp/keras.json'

In [12]:
with open(output_file, 'w') as fp:
    json.dump(keras_result, fp)

---

Visit [www.add-for.com](<http://www.add-for.com/IT>) for more tutorials and updates.

This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.