## Notebook: Intro to Keras (a simple MLP on MNIST)


In [None]:
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.utils import to_categorical
import matplotlib.pyplot as plt
%matplotlib inline

# Data (MNIST)
We obtain the MNIST data set directly through Keras, given as fully labelled training and test sets.

In [None]:
from keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

Each example is an image of 28x28 pixels, given as integer grayscale values from 0 to 255. Each example has a label, an integer 0 to 9. The training set contains 60,000 examples, the test set 10,000.

In [None]:
print(x_train.shape)
print(y_train.shape)
print(x_test.shape)
print(y_test.shape)

We can use matplotlib to quickly visualise some of the data.

In [None]:
example_id = 0  # pick any integer from 0 to 59999 to visualize a training example
example = x_train[example_id]
label = y_train[example_id]
print("Class label:", label)
plt.matshow(example, cmap="gray")
plt.show()

MNIST digits have 28\*28=784 dimensions/pixels, and belong to one of 10 possible classes.

In [None]:
n_dims = 784  # MNIST digits have 28*28=784 dimensions/pixels
n_classes = 10

* We are going to build a multi-layer perceptron, which uses Dense layers. These layers expect the data to be given as vectors, not matrices.
* The pixel values are integers from 0 to 255; we normalise them to floats from 0 to 1.
* Labels are given as integers 0 to 9, but the model needs so-called "one-hot" encodings, e.g. 3 becomes [0,0,0,1,0,0,0,0,0,0].

In [None]:
x_train = x_train.reshape(60000, n_dims)
x_test = x_test.reshape(10000, n_dims)

x_train = x_train.astype("float32")
x_test = x_test.astype("float32")
x_train /= 255
x_test /= 255

y_train = to_categorical(y_train, n_classes)
y_test = to_categorical(y_test, n_classes)

Observe how the shapes of the datasets have changed.

In [None]:
print(x_train.shape)
print(y_train.shape)
print(x_test.shape)
print(y_test.shape)
print("example one-hot encoding:", y_train[0])
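
To make the one-hot idea concrete, here is a minimal NumPy sketch of what such an encoding does. The helper name `one_hot` is hypothetical and only for illustration; it is not how `to_categorical` is implemented, just an equivalent construction for integer labels.

```python
import numpy as np

def one_hot(labels, n_classes):
    """Hypothetical helper: one row per label, with a 1 at the label's index."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, n_classes), dtype="float32")
    # Fancy indexing: in row i, set column labels[i] to 1
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

encoded = one_hot([3, 0, 9], 10)
print(encoded[0])  # the row for label 3 has its single 1 at index 3
```

Every row contains exactly one 1, so each row sums to 1, which matches the form of the softmax output the model will produce.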

# Model architecture & settings

We build an MLP with two hidden layers, with the given number of hidden units. We also include Dropout for each of the layers, with the given dropout rate.

In [None]:
intermediate_dim1 = 256
intermediate_dim2 = 128
dropout_rate = 0.2

* Initialise a Keras Sequential model
* Add two hidden (Dense) layers with ReLU activations, each followed by dropout, then a (Dense) softmax layer with 10 units, one per class, so that the predictions sum to 1. The first layer must explicitly receive the shape of the input; the following layers infer their input shapes automatically.
* Optional: print a summary of the model
* Compile the model with the following settings:
    * use the "adadelta" optimizer, an adaptive-learning-rate variant of stochastic gradient descent, to train the model
    * MNIST is a multi-class classification problem, so use the categorical cross-entropy loss function
    * report accuracy (the fraction of correctly classified instances) when evaluating the model
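
One possible way to carry out these steps is sketched below. It is an illustration under the settings listed above, not necessarily the intended solution; the constants are repeated from earlier cells only so that the block runs on its own.

```python
from keras.models import Sequential
from keras.layers import Dense, Dropout

# Values repeated from the cells above so that this sketch runs on its own
n_dims = 784
n_classes = 10
intermediate_dim1 = 256
intermediate_dim2 = 128
dropout_rate = 0.2

model = Sequential()
# First hidden layer: the input shape is stated explicitly here
# (recent Keras versions may warn and suggest an Input object instead)
model.add(Dense(intermediate_dim1, activation="relu", input_shape=(n_dims,)))
model.add(Dropout(dropout_rate))
# Second hidden layer: its input shape is inferred from the previous layer
model.add(Dense(intermediate_dim2, activation="relu"))
model.add(Dropout(dropout_rate))
# Output layer: one unit per class, softmax so the outputs sum to 1
model.add(Dense(n_classes, activation="softmax"))

# Optional: print a summary of the layers and parameter counts
model.summary()

model.compile(optimizer="adadelta",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

With these layer sizes the model has 200,960 + 32,896 + 1,290 = 235,146 trainable parameters; the Dropout layers add none.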

# Training the model
Train the model with the given batch size for the given number of epochs. We split off 1/12 of the training data (5,000 of the 60,000 samples) as validation data, so that the validation accuracy can be used for hyperparameter tuning.

In [None]:
batch_size = 100
epochs = 20

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_split=1/12)
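
Keras documents `validation_split` as holding out the given fraction from the end of the arrays, before any shuffling. The sketch below mimics that split on plain indices to show which rows end up where; it is an illustration, and the exact rounding Keras applies internally is not reproduced here.

```python
import numpy as np

n_train = 60000
indices = np.arange(n_train)   # stand-in for the rows of x_train

n_val = n_train // 12          # 1/12 of the data: 5,000 samples
train_idx = indices[:-n_val]   # leading 55,000 rows are used for fitting
val_idx = indices[-n_val:]     # trailing 5,000 rows are held out

print(len(train_idx), len(val_idx))
```

Because the held-out rows are always the trailing ones, the validation set stays the same across epochs and never overlaps with the rows used for weight updates.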

# Evaluating the model

We evaluate the model on the test set, obtaining the test loss and the test accuracy (the fraction of test examples classified correctly).

In [None]:
loss, accuracy = model.evaluate(x_test, y_test, verbose=0)

print('Test loss:', loss)
print('Test accuracy:', accuracy)
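
To see what these two numbers measure, the following NumPy sketch computes a categorical cross-entropy and an accuracy by hand on a tiny made-up example. The arrays `probs` and `targets` are hypothetical and not taken from MNIST; Keras additionally clips probabilities for numerical stability, which this sketch omits.

```python
import numpy as np

# Hypothetical softmax outputs for three examples and three classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.4, 0.3]])
# One-hot ground-truth labels for classes 0, 1 and 2
targets = np.array([[1, 0, 0],
                    [0, 1, 0],
                    [0, 0, 1]])

# Cross-entropy: minus the log-probability of the true class, averaged over examples
loss_manual = float(-(targets * np.log(probs)).sum(axis=1).mean())

# Accuracy: fraction of examples whose most probable class is the true class
predicted = probs.argmax(axis=1)
actual = targets.argmax(axis=1)
accuracy_manual = float((predicted == actual).mean())

print("Manual loss:", loss_manual)
print("Manual accuracy:", accuracy_manual)
```

Here the third example is misclassified (class 1 is predicted, class 2 is true), so two of the three predictions are correct.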