# Tutorial 8 - Subclassing and Hyperparameters

Today you will create a simple "Multi-layer Perceptron" model by subclassing the Keras `Model` class. You will then test that model with a variety of settings to explore how changing hyperparameters affects training.

In [None]:
import tensorflow as tf
from tensorflow.keras import layers, models
import numpy as np
import matplotlib.pyplot as plt

# Exercise 1

The outline of a model class is provided below. The model is a multi-layer perceptron with four dense layers, which will predict class probabilities. In the `__init__` method, you need to create the individual layers and assign them as attributes of `self`. In the `call()` method, you should pass the input through each of the layers in turn, and then return the output of the final layer.
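
As a rough illustration of the subclassing pattern only (not the solution to this exercise), here is a minimal sketch of a smaller two-layer model. The class name `TinyModel` and its arguments `hidden_units` and `n_classes` are made up for this example.

```python
import tensorflow as tf
from tensorflow.keras import layers, models


class TinyModel(models.Model):
    """Hypothetical two-layer model, used only to illustrate the pattern."""

    def __init__(self, hidden_units, n_classes):
        super().__init__()
        # Layers assigned as attributes of self are tracked by Keras,
        # so their weights are visible to the optimizer.
        self.hidden = layers.Dense(hidden_units, activation='relu')
        self.out = layers.Dense(n_classes, activation='softmax')

    def call(self, inputs):
        # Pass the input through each layer in turn and return the result.
        x = self.hidden(inputs)
        return self.out(x)


tiny = TinyModel(hidden_units=8, n_classes=3)
# Calling the model on a batch creates the weights and runs call().
probs = tiny(tf.zeros((2, 5)))
```

Here `probs` holds one row of class probabilities per input row. The exercise model follows the same structure, with more layers and with the layer sizes and activations taken from the constructor arguments.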

In [None]:
class MLP_Model(models.Model):
    def __init__(self, units, activations, out_units, out_activation):
        """
        A Multi-layer perceptron model, subclassing models.Model.

        args:
            units, list of integers giving number of units in first three layers
            activations, list of strings giving the activations of the first
                         three layers
            out_units, number of units in the output layer
            out_activation, activation function of the output layer
        """
        super(MLP_Model, self).__init__()

        ### Set up the dense layers and assign them to self
        
    def call(self, inputs):
        # The call method takes the inputs and passes them through each of
        # the layers in succession

        return 

# Exercise 2

Create an instance of the `MLP_Model` class with the following settings:

```
units = [512, 256, 128]
activations = ['relu', 'relu', 'relu']
out_units = 10
out_activation = 'softmax'
```

Compile the model for training on the MNIST image dataset, then load and prepare the dataset for training.

Save the model weights so you can reload the initial settings later.
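
One possible way to keep the initial weights is an in-memory snapshot with `get_weights()` and `set_weights()`. Writing them to disk with `save_weights()` and `load_weights()` is another option. The sketch below assumes the in-memory approach and uses a small stand-in model called `demo` rather than the exercise model. Note that a model has no weights until it has been called on an input (or otherwise built), so the snapshot is taken after a dummy forward pass.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# A small stand-in model; the same calls apply to a subclassed model.
demo = models.Sequential([
    layers.Dense(4, activation='relu'),
    layers.Dense(2, activation='softmax'),
])

# Weights only exist once the model has seen an input shape, so run a
# dummy batch through it before taking the snapshot.
demo(tf.zeros((1, 3)))
initial_weights = demo.get_weights()   # list of NumPy arrays

# Stand-in for training: overwrite every weight with zeros.
demo.set_weights([np.zeros_like(w) for w in initial_weights])

# Restore the snapshot to return to the starting point.
demo.set_weights(initial_weights)
restored = all(np.array_equal(a, b)
               for a, b in zip(demo.get_weights(), initial_weights))
```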

In [None]:
### Create an instance of the MLP_Model with the stated settings.
### Save the initial model weights so we can reset the model to the same
### initial weights.


In [None]:
### Compile the model with the Adam optimizer, Sparse Categorical Crossentropy
### and the accuracy metric


In [None]:
# Load the MNIST image dataset, flatten the images and rescale the pixels
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784)/255.
x_test = x_test.reshape(-1, 784)/255.

# Exercise 3

Train the model using the fit method, and include the test data for validation. A batch size of 256 trains quite quickly on this model. When you call the fit method, you can store a dictionary of losses from the training like so:

```
history = model.fit(...)
history.history # a dictionary of the losses and metrics at each epoch
# check the values available with:
history.history.keys()
```

Make plots of the losses and metrics for the training and test data.
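
A minimal sketch of one way to lay out such plots is given here. The helper name `plot_history` is hypothetical. It assumes that validation entries in the dictionary use the `val_` prefix (for example `loss` and `val_loss`), so check `history.history.keys()` to confirm the names in your version.

```python
import matplotlib.pyplot as plt


def plot_history(history_dict):
    """Plot each training curve next to its 'val_' counterpart, if present."""
    # Keys without the 'val_' prefix are the training-set quantities.
    names = [k for k in history_dict if not k.startswith('val_')]
    fig, axes = plt.subplots(1, len(names), figsize=(5 * len(names), 4),
                             squeeze=False)
    for ax, name in zip(axes[0], names):
        ax.plot(history_dict[name], label='train')
        if 'val_' + name in history_dict:
            ax.plot(history_dict['val_' + name], label='validation')
        ax.set_xlabel('epoch')
        ax.set_ylabel(name)
        ax.legend()
    fig.tight_layout()
    return fig, axes[0]

# Usage, once training has finished:
# fig, axes = plot_history(history.history)
```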

In [None]:
### Train the model for 20 epochs, include the test data for validation
### Store the losses and metrics as detailed above


In [None]:
### Plot the losses and metrics for the training and test data


# Exercise 4

Up until now we have used the default settings for the Adam optimiser, with a learning rate of 0.001. Test a range of new learning rates on the model by loading the initial weights, then recompiling the model with the Adam optimiser. Train for the same number of epochs each time, and then produce a plot that compares the evolution of the loss function for each value of the learning rate.

Learning rates to test:
```
0.01, 1e-4, 1e-5
```
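
The reset, recompile and train loop could be organised roughly as follows. This is only a sketch: it uses a small stand-in model (`sweep_model`) and random placeholder data (`x_demo`, `y_demo`) so that it runs on its own, and it trains for just two epochs. In the exercise, the model, data and epoch count are the ones from the earlier cells.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Placeholder random data so the sketch runs on its own; in the exercise
# these would be the flattened MNIST arrays.
rng = np.random.default_rng(0)
x_demo = rng.random((64, 20)).astype('float32')
y_demo = rng.integers(0, 3, size=64)

# Small stand-in for the MLP_Model instance.
sweep_model = models.Sequential([
    layers.Dense(16, activation='relu'),
    layers.Dense(3, activation='softmax'),
])
sweep_model(x_demo[:1])                       # create the weights
start_weights = sweep_model.get_weights()     # snapshot to restart from

histories = {}
for lr in [0.01, 1e-4, 1e-5]:
    sweep_model.set_weights(start_weights)    # same starting point each run
    # Recompiling attaches a fresh optimizer with the new learning rate.
    sweep_model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'],
    )
    run = sweep_model.fit(x_demo, y_demo, epochs=2, batch_size=32,
                          validation_data=(x_demo, y_demo), verbose=0)
    histories[lr] = run.history

# histories[lr]['loss'] can now be plotted on shared axes, one line per lr.
```

Resetting the weights before each run means that any difference between the curves comes from the learning rate rather than from a different random initialisation.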

In [None]:
### Recreate the MLP_Model and compile it with a range of learning rate values


In [None]:
### Recreate the MLP_Model and compile it with a range of learning rate values


In [None]:
### Recreate the MLP_Model and compile it with a range of learning rate values


In [None]:
### Create a plot that compares the losses and accuracies of the MLP_Model
### trained with different learning rates.



# Exercise 5

Experiment with changing any of the settings of the model, except for the number of units and the activation of the final Dense layer. Try changing the number of units in the first three Dense layers, or their [activation functions](https://www.tensorflow.org/api_docs/python/tf/keras/activations). Try a different [optimiser algorithm](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers).
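
As a starting point, the sketch below shows how an alternative activation and alternative optimisers might be specified. The particular choices (`'tanh'`, SGD with momentum, RMSprop) and their settings are arbitrary examples, not recommendations.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Activations can be given by name; 'tanh' here is just one alternative.
layer = layers.Dense(4, activation='tanh')
out = layer(tf.ones((1, 3)))          # tanh keeps outputs within [-1, 1]

# Optimisers can be passed as objects, which exposes their settings.
sgd = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)
rmsprop = tf.keras.optimizers.RMSprop(learning_rate=0.001)
# e.g. model.compile(optimizer=sgd, loss=..., metrics=[...])
```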