# Use this notebook either to reproduce how a simple model is trained in Keras, or just to load the existing trained model and make predictions with it.

Either way, you first need the imports.

In [1]:
from keras.models import Sequential, model_from_json
from keras.layers.core import Dense, Activation, Dropout
import numpy as np

Using Theano backend.
Using gpu device 0: Quadro K5200 (CNMeM is enabled with initial size: 75.0% of memory, cuDNN 5005)


# IMPORTANT: THE NEXT CELLS ARE FOR TRAINING ONLY, UP TO THE NEXT ALL-CAPS MARKDOWN COMMENT

let's define the model...

In [2]:
model = Sequential()
model.add(Dense(128, input_dim=18, activation='linear'))
model.add(Dense(256, activation='tanh'))
model.add(Dropout(0.2))
model.add(Dense(256, activation='tanh'))
model.add(Dropout(0.2))
model.add(Dense(256, init='uniform', activation='linear'))
model.add(Dense(1, activation='linear'))


...and compile it, using `MSE` = Mean Squared Error as loss function and `RMSprop` as optimizer

In [None]:
model.compile(loss='mse', optimizer='rmsprop')
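
To make the layer stack defined above more concrete, here is a rough NumPy sketch of what such a network computes at prediction time. It is not Keras code: the weights are random placeholders rather than trained values, the helper names `dense` and `forward` are made up for this illustration, and dropout is left out because it is only active during training.

```python
import numpy as np

rng = np.random.RandomState(0)

# (inputs, outputs, activation) for each Dense layer in the model definition
layer_specs = [(18, 128, None), (128, 256, np.tanh), (256, 256, np.tanh),
               (256, 256, None), (256, 1, None)]

# random placeholder weights and zero biases, NOT the trained parameters
weights = [(rng.uniform(-0.05, 0.05, size=(n_in, n_out)), np.zeros(n_out))
           for n_in, n_out, _ in layer_specs]

def dense(x, W, b, activation=None):
    """One fully connected layer: x.W + b, optionally followed by an activation."""
    z = x.dot(W) + b
    return z if activation is None else activation(z)

def forward(x):
    """Push a batch of rows through all layers in order."""
    for (W, b), (_, _, activation) in zip(weights, layer_specs):
        x = dense(x, W, b, activation)
    return x

sample = rng.uniform(-1.0, 1.0, size=(1, 18))  # one made-up row of 18 joint values
print(forward(sample).shape)  # (1, 1): one rating per input row
```

The point is only the shape bookkeeping: every row of 18 inputs ends up as a single output value.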

this gives you a nice table with the network configuration and it counts the total number of parameters

In [3]:
model.summary()

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
dense_1 (Dense)                  (None, 128)           2432        dense_input_1[0][0]              
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 256)           33024       dense_1[0][0]                    
____________________________________________________________________________________________________
dropout_1 (Dropout)              (None, 256)           0           dense_2[0][0]                    
____________________________________________________________________________________________________
dense_3 (Dense)                  (None, 256)           65792       dropout_1[0][0]                  
___________________________________________________________________________________________
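
The `Param #` column can be checked by hand: a Dense layer has one weight per input/output pair plus one bias per output unit, and Dropout layers have no parameters. A small sketch of that arithmetic, using the layer widths from the model definition (the helper name `dense_params` is made up here):

```python
def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out

# layer widths from the model definition: 18 inputs -> 128 -> 256 -> 256 -> 256 -> 1
widths = [18, 128, 256, 256, 256, 1]
counts = [dense_params(a, b) for a, b in zip(widths[:-1], widths[1:])]

print(counts)       # [2432, 33024, 65792, 65792, 257]
print(sum(counts))  # 167297
```

The first three values match the rows for `dense_1`, `dense_2` and `dense_3` in the table.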

this is for training only. It's the name of the text file that contains both the training and the test set

In [4]:
bmanDataName = "bb-abc2.csv"

load the file into memory

In [5]:
bmanData = np.loadtxt(bmanDataName, delimiter=",", skiprows=1)
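
If you don't have the CSV at hand, the following sketch shows what the `np.loadtxt` call does, using a tiny in-memory file instead. The header names and values are invented for the example; the real file has 19 columns.

```python
import io
import numpy as np

# made-up CSV with a header line and two data rows (the real file has 19 columns)
csv_text = u"rating,joint1,joint2\n9,0.785,-0.785\n6,0.262,0.349\n"

# skiprows=1 drops the header line, delimiter="," splits the columns
data = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1)

print(data.shape)  # (2, 3)
print(data[:, 0])  # first column, here the rating
```

Everything is parsed as floating point, which is why the ratings show up as `9.` and `6.` rather than integers.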

display the file's contents to verify that the CSV import worked correctly

In [6]:
bmanData

array([[ 9.        ,  0.78539816, -0.78539816, ..., -0.26179939,
        -0.26179939,  0.52359878],
       [ 6.        ,  0.26179939,  0.34906585, ..., -0.26179939,
         0.26179939, -0.52359878],
       [ 4.        ,  0.26179939,  0.34906585, ..., -0.52359878,
         0.        , -0.26179939],
       ..., 
       [ 9.        ,  0.26179939,  0.78539816, ...,  0.26179939,
         0.26179939, -0.26179939],
       [ 4.        , -0.78539816,  0.17453293, ..., -0.52359878,
         0.        ,  0.52359878],
       [ 8.        ,  0.26179939,  0.17453293, ...,  0.26179939,
        -0.26179939,  0.52359878]])

show the amount of data... should be N x 19, where N >= 1000

In [7]:
bmanData.shape

(100000, 19)

cut the data into input (18 joint data elements) and output (1 rating value)

In [8]:
rawData = bmanData[:,1:]
rawLabels = bmanData[:,0]

shuffle the data in order to randomly split into test/training set

In [None]:
indices = np.arange(len(rawLabels))
np.random.shuffle(indices)

split = 0.8 # fraction of the data used for training. split = 0.8 means 80% train, 20% test
splitIdx = int(round(len(rawLabels)*split))

X_train = rawData[indices[:splitIdx]]
y_train = rawLabels[indices[:splitIdx]]

X_test = rawData[indices[splitIdx:]]
y_test = rawLabels[indices[splitIdx:]]
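
The same shuffle-and-split idea can be wrapped in a function. The sketch below is one possible variant, not what the notebook ran: the function name `shuffle_split` and the fixed `seed` (added so the split is reproducible) are assumptions, and the toy arrays stand in for `rawData` and `rawLabels`.

```python
import numpy as np

def shuffle_split(data, labels, train_fraction=0.8, seed=0):
    """Shuffle row indices once, then cut them into a train and a test part."""
    rng = np.random.RandomState(seed)
    indices = rng.permutation(len(labels))
    split_idx = int(round(len(labels) * train_fraction))
    train, test = indices[:split_idx], indices[split_idx:]
    # indexing data and labels with the same indices keeps rows and labels aligned
    return data[train], labels[train], data[test], labels[test]

# toy stand-ins: 10 rows with 2 features each, label i belongs to row i
toy_data = np.arange(20).reshape(10, 2)
toy_labels = np.arange(10)

X_tr, y_tr, X_te, y_te = shuffle_split(toy_data, toy_labels)
print(X_tr.shape)  # (8, 2)
print(X_te.shape)  # (2, 2)
```

Shuffling indices instead of the arrays themselves is what guarantees that each input row stays paired with its label.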

show the amount of test/training data

In [9]:
print (X_train.shape)
print (y_train.shape)

print (X_test.shape)
print (y_test.shape)

(80000, 18)
(80000,)
(20000, 18)
(20000,)


some hyperparameters necessary for training

In [10]:
epochs = 200 # how many passes over the training data
batch_size = 64 # number of samples per weight update within each epoch

val_split = 0.2 # validation split: the fraction of the training data held out for validation during training -
# this is important because the model might overfit (the training loss keeps improving)
# while the loss on the held-out validation data starts getting worse
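
To see what these numbers mean for this dataset, here is the arithmetic spelled out. The variable names `n_train`, `n_val`, `n_fit` and `batches_per_epoch` are only used in this sketch; as far as I know, Keras takes the validation samples from the end of the arrays passed to `fit`.

```python
n_train = 80000   # rows in X_train, see the shapes printed earlier
val_split = 0.2
batch_size = 64

n_val = int(n_train * val_split)             # samples held out for validation
n_fit = n_train - n_val                      # samples actually used for weight updates
batches_per_epoch = -(-n_fit // batch_size)  # ceiling division

print("%d for fitting, %d for validation, %d batches per epoch"
      % (n_fit, n_val, batches_per_epoch))
```

This agrees with the "Train on 64000 samples, validate on 16000 samples" line that Keras prints when training starts.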


This trains the network and shows a little progress bar for each epoch. 
It takes about 5 min on my Nvidia Quadro K5200.

In [11]:
hist = model.fit(X_train, y_train, nb_epoch=epochs, batch_size=batch_size, validation_split=val_split)
# it also stores the per-epoch training history, which can later be used to plot the convergence graph

Train on 64000 samples, validate on 16000 samples
Epoch 1/200
Epoch 2/200
...
Epoch 74/2
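
The object returned by `model.fit` keeps the per-epoch losses in `hist.history`, a dictionary with the keys `loss` and `val_loss` when a validation split is used. A minimal sketch of how one might look for the epoch with the lowest validation loss follows; the helper `best_epoch` is made up, and the numbers are invented for illustration, not results from this notebook.

```python
def best_epoch(history, key="val_loss"):
    """Return (1-based epoch number, value) of the lowest entry under `key`."""
    values = history[key]
    idx = min(range(len(values)), key=values.__getitem__)
    return idx + 1, values[idx]

# invented numbers with the same layout as hist.history
fake_history = {"loss":     [1.9, 0.8, 0.5, 0.4, 0.35],
                "val_loss": [1.7, 0.9, 0.6, 0.62, 0.66]}

print(best_epoch(fake_history))          # (3, 0.6)
print(best_epoch(fake_history, "loss"))  # (5, 0.35)
```

In the invented example the training loss keeps falling while the validation loss rises again after epoch 3, which is the overfitting pattern the validation split is meant to reveal.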

run the model on the test data for evaluation

In [12]:
score = model.evaluate(X_test, y_test, batch_size=batch_size)



In [13]:
score # this gives the mean squared error over the test set

0.20652985680103303
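
For reference, the mean squared error is just the average of the squared differences between labels and predictions. The sketch below recomputes it with NumPy on a few toy values (the function name `mse` and the toy numbers are mine) and also takes the square root of the reported score, which puts the error back on the scale of the ratings.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean of the squared differences between labels and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# toy values: errors of 0.5, -0.5 and 0
print(mse([8.0, 4.0, 6.0], [7.5, 4.5, 6.0]))

reported_mse = 0.20652985680103303  # the score printed above
rmse = float(np.sqrt(reported_mse))
print(rmse)  # about 0.45
```

So on average the predictions are off by a bit less than half a rating point, assuming the ratings are on the scale seen in the first column of the data.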

in order to save/load the trained model, let's give it a name and a version number

In [15]:
modelName = "bb-abc"
modelVersion = 2

Saving and loading the model weights is done in H5 binary format.

To get Python to understand H5, you need to install the h5py extension either with

`sudo apt-get install python-h5py`

or

`sudo pip install h5py`

In [16]:
# first store the model definition (the content of the second cell from the top in this notebook)
with open("model-"+modelName+'-'+str(modelVersion)+'.json', 'w') as f:
    f.write(model.to_json())

# then save the actual weights
model.save_weights("model-"+modelName+'-'+str(modelVersion)+'.h5')
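
Since the same file names are built again when loading, it can help to construct them in one place. This is only a sketch of that idea; the helpers `model_paths` and `missing_files` are hypothetical and not part of Keras or of the original notebook.

```python
import os

def model_paths(name, version, directory=""):
    """Build the JSON and H5 file names used for saving and loading."""
    base = "model-{}-{}".format(name, version)
    return (os.path.join(directory, base + ".json"),
            os.path.join(directory, base + ".h5"))

def missing_files(name, version, directory=""):
    """List the files that would be needed for loading but do not exist."""
    return [p for p in model_paths(name, version, directory)
            if not os.path.exists(p)]

json_path, weights_path = model_paths("bb-abc", 2)
print(json_path)     # model-bb-abc-2.json
print(weights_path)  # model-bb-abc-2.h5
```

Calling `missing_files(modelName, modelVersion)` before loading gives a clearer error than a failed `open`.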

# IMPORTANT: STARTING FROM THE NEXT CELL YOU CAN _USE_ THE MODEL (I.E. USE THE TRAINED MODEL TO DO PREDICTIONS WITHOUT MODIFYING THE MODEL)

If you came here directly without executing any of the cells above (except the imports), 
you need to execute the next cell first.

It is the same as the cell two cells above, repeated here in case you skipped it.

In [24]:
modelName = "bb-abc"
modelVersion = 2

In [20]:
# first load the model definition from the JSON file
with open("model-"+modelName+'-'+str(modelVersion)+'.json') as f:
    model = model_from_json(f.read())
# then load and set the weights of the model to those we trained
model.load_weights("model-"+modelName+'-'+str(modelVersion)+'.h5')

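Now, just to understand the data format, 
show the first row of the test dataset (which we use here to show how predictions are done). 
Note that `X_test` and `y_test` only exist if you ran the data loading and splitting cells in the training part above.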

In [21]:
X_test[0:1] # note: this is different from X_test[0] (the former is a matrix, the latter a vector)

array([[ 0.26179939,  0.78539816,  0.78539816, -0.26179939,  1.3962634 ,
         0.26179939, -0.52359878, -1.04719755,  0.        , -0.78539816,
         0.        , -0.78539816,  0.        ,  0.        , -0.26179939,
         0.52359878,  0.26179939,  0.26179939]])
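
The difference between the two ways of indexing is easiest to see from the shapes. A small sketch with a stand-in array (the name `X` and its contents are made up; only the 18 columns match the real data):

```python
import numpy as np

X = np.arange(36, dtype=float).reshape(2, 18)  # stand-in for X_test

row_as_matrix = X[0:1]  # slicing keeps the row dimension
row_as_vector = X[0]    # plain indexing drops it

print(row_as_matrix.shape)  # (1, 18)
print(row_as_vector.shape)  # (18,)

# a single vector can be turned back into a one-row matrix
restored = row_as_vector.reshape(1, -1)
print(restored.shape)       # (1, 18)
```

The model expects a batch of rows, so the `(1, 18)` form is the one to pass to `predict`.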

this uses the model to predict the output for the first element in the test set

In [22]:
print(model.predict([X_test[0:1]]))

[[ 7.87725258]]


...and this is the actual test set label (the prediction above should be as close to it as possible)

In [23]:
print(y_test[0])

8.0
