### Florian Muthreich   ---   INF368   ---

# Assignment 1

Load necessary packages

In [None]:
import matplotlib.pyplot as plt
import keras
import random
import numpy as np

---
Next I download the MNIST dataset. It is already split into a training and a test set, which are saved to separate variables. The images are stored separately from the labels in arrays.

In [None]:
from keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

---
The MNIST set has been downloaded and I can show the dimensions of the dataset.
In total there are 70000 images of handwritten digits: 60000 in the training set and 10000 in the test set. Each image is 28 x 28 pixels and has only one channel, which means each cell holds the intensity of one pixel. The labels are stored separately in their own one-dimensional array, with one entry per image.

In [None]:
print("Training set:",x_train.shape, ", labels", y_train.shape)
print("Test set:",x_test.shape, ", labels", y_test.shape)

---
Here I plot 6 random images from the MNIST dataset. 3 from the train and 3 from the test set.

In [None]:
plt.rcParams["figure.figsize"] = (14,10)
for x in range(6):
    if x<3:
        # randrange excludes the upper bound, so i is always a valid index
        i = random.randrange(x_train.shape[0])
        plt.subplot(2, 3, x+1)
        plt.imshow(x_train[i])
        plt.title("Train img: {}; label: {}".format(i,y_train[i]))
    else:
        i = random.randrange(x_test.shape[0])
        plt.subplot(2, 3, x+1)
        plt.imshow(x_test[i])
        plt.title("Test img: {}; label: {}".format(i,y_test[i]))
        
plt.show()
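
---
Random indices for an array of length n have to fall in the range 0 to n-1. A minimal sketch of drawing such indices, using a small array of zeros as a stand-in for the image data (the name `fake_images` is hypothetical) and `random.sample`, which draws distinct values from a range:

```python
import random
import numpy as np

# Hypothetical stand-in for an image array: 100 blank 28 x 28 images.
fake_images = np.zeros((100, 28, 28), dtype=np.uint8)

random.seed(0)  # fixed seed only so the sketch is repeatable

# Draw 3 distinct indices, all within 0 .. len(fake_images) - 1.
indices = random.sample(range(fake_images.shape[0]), 3)

for i in indices:
    image = fake_images[i]  # indexing cannot go out of bounds
    print(i, image.shape)
```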

---
I defined a function to reshape the image data into 4D arrays (samples, channels, height, width), cast the pixel values to float32 and scale them. The same function also converts the labels into one-hot encoded vectors.

In [None]:
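def res_rec(data, labels, channels):
    # add a channel axis: (n, height, width) -> (n, channels, height, width)
    tmp = data.reshape(data.shape[0], channels, data.shape[1], data.shape[2])
    tmp = tmp.astype("float32")
    # scale to [0, 1] by the maximum of the array being processed (255 for raw MNIST)
    tmp = tmp/np.amax(data)
    # one-hot encode the labels, one column per distinct class
    lbl = keras.utils.to_categorical(labels, len(np.unique(labels)))
    return tmp, lbl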

x_train, y_train = res_rec(x_train, y_train, 1)
x_test, y_test = res_rec(x_test, y_test, 1)

print("Training set:",x_train.shape, ", labels", y_train.shape)
print("Test set:",x_test.shape, ", labels", y_test.shape)
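
---
To illustrate what this preprocessing does, here is a NumPy-only sketch on made-up data. It assumes 8-bit pixel values (0 to 255) and builds the one-hot matrix by indexing an identity matrix, which is a stand-in for `keras.utils.to_categorical`, not its implementation. The arrays `toy_images` and `toy_labels` are hypothetical.

```python
import numpy as np

# Hypothetical data: one 2 x 2 "image" and four labels from three classes.
toy_images = np.array([[[0, 255], [128, 64]]], dtype=np.uint8)  # shape (1, 2, 2)
toy_labels = np.array([0, 2, 1, 2])
n_classes = 3

# Add a channel axis, cast to float32 and scale to [0, 1].
scaled = toy_images.reshape(toy_images.shape[0], 1, 2, 2).astype("float32") / 255.0

# Row k of the identity matrix is the one-hot vector for class k.
one_hot = np.eye(n_classes, dtype="float32")[toy_labels]

print(scaled.shape)
print(one_hot)
```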

---
This model has one hidden layer with 50 units and an output layer with 10 softmax units. I added a Flatten layer before the hidden layer to convert each image from a 28 x 28 matrix into a vector of 784 values. No weights are learned in this layer; it is just a transformation from 2 dimensions to 1. The number of parameters can be derived directly from this model description.

We have 784 input features that are connected to 50 hidden units in the first layer (1). Each hidden unit also has one bias weight (2). In the next layer, the 50 units of the hidden layer are connected to the 10 softmax units, with one weight per connection plus one bias weight for each softmax unit (3). This brings the total to 39250 + 510 = 39760 parameters. We can see the number of trainable parameters with the model.summary() command.

1) 784 * 50 = 39200

2) 39200 + 50 = 39250

3) 50 * 10 + 10 = 510
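
The same arithmetic as a short Python sketch. The variable names are chosen here for illustration only:

```python
n_inputs = 28 * 28   # 784 features after flattening
n_hidden = 50        # units in the hidden layer
n_outputs = 10       # softmax units

hidden_weights = n_inputs * n_hidden               # step 1
hidden_params = hidden_weights + n_hidden          # step 2: add one bias per hidden unit
output_params = n_hidden * n_outputs + n_outputs   # step 3: weights plus biases
total_params = hidden_params + output_params

print(hidden_weights, hidden_params, output_params, total_params)
```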


In [None]:
from keras.models import Sequential
from keras.layers import Dense, Activation, Flatten

model = Sequential([
    Flatten(input_shape = (1,28, 28), name = "flatten"),
    Dense(50, name = "hidden_1"),
    Activation("sigmoid", name = "act_hidden_1"),
    Dense(10, name = "out"),
    Activation("softmax", name = "act_out")
])

model.compile(optimizer = "sgd",
             loss = "categorical_crossentropy",
             metrics = ["accuracy"])

model.summary()
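
---
For intuition, the forward pass of this architecture can be written out in plain NumPy. This is a sketch with randomly initialised weights and a random input batch (the names `W1`, `b1`, `W2`, `b2`, `sigmoid` and `softmax` are hypothetical helpers), not how Keras implements the layers internally.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # subtract the row maximum for numerical stability
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical batch of 4 images with shape (channels, height, width) = (1, 28, 28).
batch = rng.random((4, 1, 28, 28))

# Randomly initialised parameters with the shapes described in the text.
W1, b1 = rng.normal(0.0, 0.05, (784, 50)), np.zeros(50)
W2, b2 = rng.normal(0.0, 0.05, (50, 10)), np.zeros(10)

flat = batch.reshape(batch.shape[0], -1)   # flatten: (4, 1, 28, 28) -> (4, 784)
hidden = sigmoid(flat @ W1 + b1)           # hidden layer with 50 sigmoid units
probs = softmax(hidden @ W2 + b2)          # output layer with 10 softmax units

n_params = W1.size + b1.size + W2.size + b2.size
print(probs.shape, n_params)
```

Each row of `probs` sums to 1, so it can be read as a probability distribution over the 10 digit classes.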

---
The training data is split into training and validation (development) set. I randomly select 5000 (1/12) images of the training set and their corresponding labels and save them as the validation set, to check performance during training. 

In the last line of the code I create a callback that saves a model checkpoint, including the weights, after each epoch. This way I can go back to any epoch of the training process and restore the model and its weights from that point.

In [None]:
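import os

# draw 1/12 of the training indices (5000 of 60000) without replacement
ind = random.sample(range(x_train.shape[0]), x_train.shape[0] // 12)
x_valid, y_valid = x_train[ind], y_train[ind]
# remove the validation samples from the training set
x_train, y_train = np.delete(x_train, ind, axis = 0), np.delete(y_train, ind, axis = 0)

# make sure the checkpoint directory exists before training starts
os.makedirs("./checkpoints", exist_ok=True)
checkpoints = keras.callbacks.ModelCheckpoint("./checkpoints/model_{epoch:02d}.hdf5", 
                                              monitor='val_loss', 
                                              verbose=0, 
                                              save_best_only=False, save_weights_only=False)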
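
---
As an alternative to `np.delete`, the split can be expressed with one shuffled index array that is cut into a validation part and a training part. A minimal sketch on synthetic data, assuming a fixed seed for repeatability; `toy_x` and `toy_y` are hypothetical and only mirror the 1/12 fraction used here.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: 120 samples whose label equals their feature value.
toy_x = np.arange(120).reshape(120, 1)
toy_y = np.arange(120)

n_valid = toy_x.shape[0] // 12            # 1/12 of the samples
perm = rng.permutation(toy_x.shape[0])    # shuffled indices 0 .. 119
valid_idx, train_idx = perm[:n_valid], perm[n_valid:]

x_valid_toy, y_valid_toy = toy_x[valid_idx], toy_y[valid_idx]
x_train_toy, y_train_toy = toy_x[train_idx], toy_y[train_idx]

print(x_train_toy.shape, x_valid_toy.shape)
```

Because both parts come from the same permutation, no sample can end up in both sets, and images stay aligned with their labels.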


In [None]:
trained = model.fit(x_train, y_train, 
                    epochs = 5, batch_size = 64, 
                    callbacks = [checkpoints], 
                    validation_data = (x_valid, y_valid))

In [None]:
import pandas as pd

tested_eval = model.evaluate(x_test, y_test, batch_size = 128)
tested_pred = model.predict(x_test, batch_size = 128)

actu = np.argmax(y_test, axis = 1, out = None)
pred = np.argmax(tested_pred, axis = 1, out = None)

confusion = pd.crosstab(actu, pred, rownames=['Actual'], colnames=['Predicted'], margins=True)
confusion
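
---
As a cross-check on such a table, the same kind of counts can be built by hand. The sketch below uses made-up label vectors (`toy_actual` and `toy_pred` are hypothetical) and `np.add.at` to count (actual, predicted) pairs. The diagonal holds the correct predictions, so the trace divided by the total gives the accuracy.

```python
import numpy as np

# Hypothetical class labels for 7 samples from 3 classes.
toy_actual = np.array([0, 0, 1, 1, 2, 2, 2])
toy_pred = np.array([0, 1, 1, 1, 2, 0, 2])
n_classes = 3

# Rows are actual classes, columns are predicted classes.
matrix = np.zeros((n_classes, n_classes), dtype=int)
np.add.at(matrix, (toy_actual, toy_pred), 1)

accuracy = matrix.trace() / matrix.sum()
# Fraction of each actual class that was predicted correctly.
recall = matrix.diagonal() / matrix.sum(axis=1)

print(matrix)
print(accuracy, recall)
```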

In [None]:
plt.rcParams["figure.figsize"] = (16,5)
plt.subplot(1, 2, 1)

plt.plot(range(1, 6), trained.history["acc"], color = "blue")
plt.plot(range(1, 6), trained.history["val_acc"], color = "green")
plt.plot(5, tested_eval[1], marker = "o", color = "orange")
plt.ylim(0.5, 1), plt.xticks(range(1,6))
plt.legend(["accuracy train", "accuracy valid", "accuracy test"])
plt.title("Accuracy MLP model"), plt.xlabel("Epochs"), plt.ylabel("Accuracy")

plt.subplot(1, 2, 2)
plt.plot(range(1, 6), trained.history["loss"], color = "blue")
plt.plot(range(1, 6), trained.history["val_loss"], color = "green")
plt.plot(5, tested_eval[0], marker = "o", color = "orange")
plt.ylim(0), plt.xticks(range(1,6))
plt.legend(["loss train", "loss valid", "loss test"])
plt.title("Loss MLP model"), plt.xlabel("Epochs"), plt.ylabel("Loss")
plt.show()

---
Save the model architecture to a JSON string and write it to a file.

Save the weights of the latest model to an HDF5 file.

In [None]:
json_string = model.to_json()
with open("model_mlp.json", "w") as json_file:
    json_file.write(json_string)

model.save_weights("weights_mlp.hdf5")

Load the model from the JSON file and reload the weights.

In [None]:
with open("model_mlp.json", "r") as json_file:
    load_json_model = json_file.read()
    
loaded_model = keras.models.model_from_json(load_json_model)
loaded_model.load_weights("weights_mlp.hdf5", by_name=False)

This code snippet loads the checkpoint saved after the first training epoch, which restores the model together with its weights.

In [None]:
new_model = keras.models.load_model("checkpoints/model_01.hdf5")
new_model.summary()
new_model.evaluate(x_train, y_train, batch_size = 128)
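
---
The file name `model_01.hdf5` comes from the `{epoch:02d}` placeholder in the checkpoint path. A small sketch of how such a template expands with Python's `str.format`, assuming epoch numbers start at 1, as the file loaded above suggests:

```python
template = "./checkpoints/model_{epoch:02d}.hdf5"

# One file name per epoch for a 5-epoch run; 02d pads the number to two digits.
names = [template.format(epoch=e) for e in range(1, 6)]

for name in names:
    print(name)
```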

In [None]:
tbcallback = keras.callbacks.TensorBoard(log_dir='./logs', 
                            histogram_freq=0, 
                            batch_size=32, 
                            write_graph=True, write_grads=True, write_images=True, 
                            embeddings_freq=0, embeddings_layer_names=None, 
                            embeddings_metadata=None, embeddings_data=None, update_freq='epoch')
callbacks = [tbcallback]