# Neural Network Benchmark

In [1]:
import numpy as np
import pandas as pd
import sklearn as sk
import sklearn.metrics
from IPython.display import Image
from addutils import css_notebook
import time
import json
css_notebook()

In [2]:
import bokeh.plotting as bk
bk.output_notebook()

## 1 TensorFlow Benchmark

In this serie of notebooks we would like to compare three different libraries for Neural Netwok using a regression problem on temporal data. The problem is to predict the value of the next event using a fully connected feed forward network. Each notebook performs the fitting and prediction of the data using a different library, varying batch size and recording both precision (RMSE) and the time needed to fit the data. 

This notebook uses TensorFlow. Please refer to previous notebooks in the serie for installation instructions and how-tos.

## 2 Load data

For this serie of tests we used a dataset with more examples and an increased model complexity. We downloaded it from the *UCI Machine Learning Repository*. Follow this link and download the [Individual household electric power consumption](http://archive.ics.uci.edu/ml/machine-learning-databases/00235/household_power_consumption.zip) dataset. It has more than two milions examples. Please download it in `tutorials/machine_learning/example_data/` and unzip it. It refers to measurements of electric power consumption in one household with a one-minute sampling rate over a period of almost 4 years. Different electrical quantities and some sub-metering values are available. In our experiment we use *Voltage* as reference quantity.

Settings for this experiments are similar to the one used for previous notebooks. It is possible to choose the number of features (i.e. the number of inputs) and the number of future steps to predict. The variable `percentage` refers to the number of training examples with respect to the whole dataset. For example if you specify 0.7 it means that 70% of the examples will be used for training. The remaining 30% examples are split equally between test set and validation set.

In [3]:
n_inputs = 100
steps_forward = 1
percentage = 0.7

In [4]:
input_file = 'example_data/household_power_consumption.txt'

In [5]:
x = pd.read_csv(input_file, sep=';', na_values='?') # automatically converting '?' to NaN removes warning
a = x['Voltage']
a = a[a.notnull()]
a = a.values

vec_size = n_inputs + steps_forward - 1

X = np.zeros((a.size-vec_size,n_inputs))

y = np.zeros((a.size-vec_size,1))
for r in range(a.size-vec_size):
    X[r,:] = a[r:r+n_inputs]
    y[r,:] = a[r+vec_size]

split = int(a.size*percentage) - n_inputs
vt_split = int((X.shape[0]-split)/2)
X_train = X[:split]
X_test = X[split+vt_split:] 
X_valid = X[split:split+vt_split] 
y_train = y[:split]
y_test = y[split+vt_split:]
y_valid = y[split:split+vt_split]

In [6]:
print('X_train shape: %d, %d' % X_train.shape)
print('X_test shape: %d, %d' % X_test.shape)
print('X_valid shape: %d, %d' % X_valid.shape)
print('y_train shape: %d, %d' % y_train.shape)
print('y_test shape: %d, %d' % y_test.shape)
print('y_valid shape: %d, %d' % y_valid.shape)

X_train shape: 1434396, 100
X_test shape: 307392, 100
X_valid shape: 307392, 100
y_train shape: 1434396, 1
y_test shape: 307392, 1
y_valid shape: 307392, 1


## Test varying batch size

In the following section we perform fitting and prediction of the dataset, using several batch sizes and for each size we record execution time and error. The error measure is Root Mean Squared Error. 

In [7]:
import tensorflow as tf

The architecture of the Neural Network is composed as follows:

* Two hidden layers
* First hidden layer with 100 neurons
* Second hidden layer with 50 neurons
* Dropout with p=0.5 in each layer
* Gradient Descent with momentum 
* Learning Rate = 0.01
* Momentum = 0.9
* Eearly Stopping
* One neuron with linear activation in output

The function `fit_predict` creates a neural network with the specific architecture and a given batch size and perform fitting and prediction on the dataset. It return the error measure and the time elapsed during fitting.

The TensorFlow implementation translates the graph definition into executable operations distributed across available computing resources, such as the CPU or one of the GPU cards. In general you do not have to specify CPUs or GPUs explicitly, because TensorFlow uses the first GPU, if there is one, for as many operations as possible.

For the purpose of this benchmark, however, we would like to specify to run the computation on a single device, such as CPU only, as much as TensorFlow allows us to do it. In order to choose which device to use, CPU or GPU, please change the variable `device` in the cell below. To select CPU use the keyword `/cpu:0`, while with keyword `/gpu:0` it is possible to select the first GPU. If you have more than one GPU installed on your machine, refer to it using the number after the colon, for example the second GPU is `/gpu:1`.

In [8]:
device = '/gpu:0'

In [9]:
def fit_predict(batch_size, X_train, y_train, X_test, y_test, X_valid, y_valid):
    epochs = 100
    hidden_size1 = 100
    hidden_size2 = 50
    p = 0.5

    n_inputs = X_train.shape[1]

    with tf.device(device):
        x = tf.placeholder(tf.float32, [None, n_inputs])
        y_ = tf.placeholder(tf.float32, [None, 1])
        keep_prob = tf.placeholder("float")

        W1_x = tf.Variable(tf.truncated_normal([n_inputs, hidden_size1], stddev=0.05))
        b1_h = tf.Variable(tf.constant(0.1, shape=[hidden_size1]))

        h1 = tf.nn.tanh(tf.matmul(x, W1_x) + b1_h)
        h1_drop = tf.nn.dropout(h1, keep_prob)

        W2_x = tf.Variable(tf.truncated_normal([hidden_size1, hidden_size2], stddev=0.05))
        b2_h = tf.Variable(tf.constant(0.1, shape=[hidden_size2]))

        h2 = tf.nn.tanh(tf.matmul(h1_drop, W2_x) + b2_h)
        h2_drop = tf.nn.dropout(h2, keep_prob)

        W_h = tf.Variable(tf.zeros([hidden_size2, 1]))

        b_y = tf.Variable(tf.zeros([1]))

        y = tf.matmul(h2_drop, W_h) + b_y

    loss = tf.reduce_mean(tf.square(y - y_))
    optimizer = tf.train.GradientDescentOptimizer(0.01) # learning rate         
    train = optimizer.minimize(loss)

    sess = tf.Session()

    sess.run(tf.initialize_all_variables())

    train_batches = len(X_train) // batch_size 
    t0 = time.time()
    prev = None
    for epoch in range(epochs):
        for i in range(train_batches):
            batch_x = X_train[i*batch_size:(i+1)*batch_size]
            batch_y = y_train[i*batch_size:(i+1)*batch_size]
            sess.run(train, feed_dict={x: batch_x, y_:batch_y, keep_prob:p})
        preds = sess.run(y, feed_dict={x: X_valid, keep_prob: 1.0})
        curr = np.sqrt(sk.metrics.mean_squared_error(preds, y_valid))
        if (prev == None):
            prev = curr
        else:
            if curr > prev:
                break
            prev = curr

    fit_time = time.time() - t0

    preds = sess.run(y, feed_dict={x: X_test, keep_prob: 1.0})
    err = np.sqrt(sk.metrics.mean_squared_error(preds, y_test))
    return fit_time, err

In [10]:
tensorflow_result = {'batch_size':[], 'error':[], 'fit_time':[]}
for batch in [128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536, 131072]:
    tensorflow_result['batch_size'].append(batch)
    t, err = fit_predict(batch, X_train, y_train, X_test, y_test, X_valid, y_valid)
    tensorflow_result['error'].append(err)
    tensorflow_result['fit_time'].append(t)

In [11]:
pd.DataFrame(tensorflow_result)

Unnamed: 0,batch_size,error,fit_time
0,128,3.605597,145.099285
1,256,3.547923,94.896823
2,512,2.702137,101.889646
3,1024,2.76709,54.535229
4,2048,2.69552,44.514253
5,4096,3.110668,18.167414
6,8192,4.126063,5.683449
7,16384,4.635352,5.589076
8,32768,4.846756,3.359841
9,65536,4.423262,2.580113


Choose the name and location of the destination file where results will be stored.

In [12]:
output_file = 'temp/tensorflow.json'

In [13]:
with open(output_file, 'w') as fp:
    json.dump(tensorflow_result, fp)

---

Visit [www.add-for.com](<http://www.add-for.com/IT>) for more tutorials and updates.

This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.