# Task 3: Neural Networks

Multi-class Classification: Your goal is to predict a discrete value y (0, 1, 2, 3 or 4) based on a vector x.

Potential approaches / tools to consider: Neural networks / Deep Learning (Theano, TensorFlow, Torch, Lasagne)

In [62]:
import tensorflow
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.wrappers.scikit_learn import KerasClassifier
from keras.optimizers import SGD

from keras.utils import np_utils
from sklearn import model_selection
from sklearn.preprocessing import LabelEncoder
from sklearn.pipeline import Pipeline

#### Data Import

In [63]:
train = pd.read_hdf("data/train.h5", "train")
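# Labels come from column 'y'; features are every column from position 1 onward.
# .values and .iloc replace the deprecated .as_matrix() and .ix accessors.
train_labels = train['y'].values
train_data = train.iloc[:, 1:].astype(float).values
test_data = pd.read_hdf("data/test.h5", "test").values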

#### Split Data into Train and Validation Set 

In [64]:
X_train, X_test, y_train, y_test = model_selection.train_test_split(train_data, train_labels, 
                                                                    test_size=0.33, random_state=42)

#### Convert Labels

In [73]:
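# One-hot encode the integer labels (0..4) into 5-column indicator matrices.
# np_utils is the name imported above; a bare `keras` module is never imported.
labels_cat = np_utils.to_categorical(train_labels, num_classes=5)
y_train_cat = np_utils.to_categorical(y_train, num_classes=5)
y_test_cat = np_utils.to_categorical(y_test, num_classes=5)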
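
As a rough illustration of what this conversion produces, the sketch below one-hot encodes a few labels with plain NumPy. The sample labels and the helper name `one_hot` are made up for the example; the notebook itself relies on Keras for this step.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) matrix with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[labels]

sample_labels = [0, 3, 4]  # hypothetical labels, not taken from the dataset
encoded = one_hot(sample_labels, num_classes=5)
print(encoded)
```

Each row has exactly one nonzero entry, at the column index equal to the class label, which is the target format that `categorical_crossentropy` expects.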

### Neural Network Model: Baseline Model

The cell below builds a baseline neural network: a simple, fully connected network with one hidden layer of 64 neurons. The hidden layer uses the rectifier (ReLU) activation function, a common default for hidden layers. The output layer applies softmax, and the output with the largest value is taken as the class predicted by the model.

The **network topology** can be summarised by: 
*100 inputs -> [64 hidden nodes] -> 5 outputs* 

In [66]:
model = Sequential()
model.add(Dense(64, input_dim=100, kernel_initializer='normal', 
                activation='relu'))
model.add(Dense(5, kernel_initializer='normal', activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='rmsprop', 
              metrics=['accuracy'])
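
To get a feel for the size of this baseline, the following sketch counts its trainable parameters from the layer sizes in the model definition (`input_dim=100`, `Dense(64)`, `Dense(5)`). The helper `dense_params` is a hypothetical name used only here; calling `model.summary()` on the compiled model should report the same totals.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

hidden = dense_params(100, 64)  # 100 inputs -> 64 hidden units
output = dense_params(64, 5)    # 64 hidden units -> 5 classes
total = hidden + output
print(hidden, output, total)    # 6464 325 6789
```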

In [67]:
model.fit(train_data, labels_cat, epochs=10, batch_size=64)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f55f9cfe160>

In [68]:
model.predict(test_data, batch_size=64)

array([[  9.62778926e-01,   8.55894003e-04,   5.47987921e-03,
          5.45128202e-03,   2.54339222e-02],
       [  2.69350447e-02,   9.96673945e-04,   4.78248388e-01,
          4.31844778e-02,   4.50635374e-01],
       [  2.93500954e-04,   4.36020446e-06,   2.73193233e-03,
          3.52676085e-04,   9.96617496e-01],
       ..., 
       [  2.54674233e-03,   3.89729166e-06,   6.36458513e-04,
          9.96709704e-01,   1.03122213e-04],
       [  5.64311406e-08,   6.86394319e-09,   9.99921799e-01,
          7.81826457e-05,   6.84206858e-11],
       [  1.93114975e-03,   5.83559508e-03,   2.51577526e-01,
          1.68578012e-03,   7.38969922e-01]], dtype=float32)
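
Each row of this output is a probability distribution over the five classes. A minimal sketch of turning those probabilities into class labels, using the first three rows printed above, is to take the index of the largest entry in each row:

```python
import numpy as np

# First three rows of the prediction output above (one row per test sample).
probs = np.array([
    [9.62778926e-01, 8.55894003e-04, 5.47987921e-03, 5.45128202e-03, 2.54339222e-02],
    [2.69350447e-02, 9.96673945e-04, 4.78248388e-01, 4.31844778e-02, 4.50635374e-01],
    [2.93500954e-04, 4.36020446e-06, 2.73193233e-03, 3.52676085e-04, 9.96617496e-01],
])

# argmax along axis 1 picks the most probable class for every sample.
predicted_classes = np.argmax(probs, axis=1)
print(predicted_classes)  # [0 2 4]
```

Note that the second row is a close call between classes 2 and 4 (about 0.48 versus 0.45), so the predicted label alone hides how uncertain the model is there.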

### Multilayer Perceptron (MLP) for multi-class softmax classification

In [74]:
model = Sequential()
# Dense(64) is a fully-connected layer with 64 hidden units.

model.add(Dense(64, activation='relu', input_dim=100))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(5, activation='softmax'))

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])

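# Train on the training split only, then evaluate on the held-out split.
model.fit(X_train, y_train_cat, epochs=20, batch_size=128)
# With metrics=['accuracy'], evaluate returns [loss, accuracy].
score = model.evaluate(X_test, y_test_cat, batch_size=128)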

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
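
The `Dropout(0.5)` layers in the MLP above randomly silence half of the hidden units during training. The sketch below illustrates the idea in NumPy as "inverted" dropout, where the surviving units are rescaled so the expected activation stays the same. It is an illustration under that assumption, not the Keras implementation; the function name, the seed, and the all-ones activations are invented for the example.

```python
import numpy as np

def dropout(x, rate, training, rng):
    """Inverted dropout: zero a fraction `rate` of units, rescale the rest."""
    if not training or rate == 0.0:
        return x  # at inference time the layer passes values through unchanged
    keep_mask = rng.random(x.shape) >= rate
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)  # seed chosen arbitrarily for the example
activations = np.ones((4, 8))   # hypothetical hidden-layer activations
dropped = dropout(activations, rate=0.5, training=True, rng=rng)
unchanged = dropout(activations, rate=0.5, training=False, rng=rng)
print(dropped)
```

With a rate of 0.5, every entry of `dropped` is either 0 (unit dropped) or 2 (unit kept and scaled by 1 / (1 - 0.5)), while `unchanged` equals the input.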

In [76]:
score

[0.27697164616084574, 0.91127899960017167]
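
The two numbers are the loss and the accuracy on the held-out split, in that order: a categorical cross-entropy of about 0.277 and an accuracy of about 91.1%. As a hedged sketch of how such a pair can be computed from one-hot targets and predicted probabilities, the NumPy code below uses two made-up samples; the helper names and the clipping constant `eps` are assumptions for the example, not values read from Keras.

```python
import numpy as np

def categorical_crossentropy(y_true_onehot, y_prob, eps=1e-7):
    """Mean negative log-probability assigned to the true class."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(np.sum(y_true_onehot * np.log(y_prob), axis=1)))

def accuracy(y_true_onehot, y_prob):
    """Fraction of samples whose most probable class matches the target."""
    hits = np.argmax(y_true_onehot, axis=1) == np.argmax(y_prob, axis=1)
    return float(np.mean(hits))

# Two hypothetical samples with true classes 0 and 2.
y_true = np.eye(5)[[0, 2]]
y_prob = np.array([
    [0.900, 0.025, 0.025, 0.025, 0.025],  # confident and correct
    [0.100, 0.600, 0.100, 0.100, 0.100],  # predicts class 1, truth is class 2
])

loss = categorical_crossentropy(y_true, y_prob)
acc = accuracy(y_true, y_prob)
print(loss, acc)
```

Here one of the two predictions is right, so the accuracy is 0.5, and the loss is the average of `-log(0.9)` and `-log(0.1)`.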