# PRAC 7
## MLPs and Convolutional Neural Networks (CNNs) using TensorFlow 2.x

A lot of this is borrowed from the TensorFlow tutorials found [here](https://www.tensorflow.org/tutorials/).

You've already covered multilayer perceptrons in last week's prac. CNNs are possibly part of the reason you're interested in this course, thanks to their strength in image classification.

[This link](https://cs231n.github.io/convolutional-networks/) provides a good overview of CNNs and would be useful to read. 

Some useful imports 

In [None]:
import tensorflow as tf
%load_ext tensorboard
from tensorflow.keras.layers import Dense, Flatten, Conv2D
from tensorflow.keras import Model
import datetime
# Clear any logs left over from previous runs
!rm -rf ./logs/
# Double-check that Colab is running the TensorFlow version we want
print(tf.__version__)

Import the MNIST dataset and normalise

In [None]:
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Add a channels dimension
x_train = x_train[..., tf.newaxis]
x_test = x_test[..., tf.newaxis]

Use `tf.data` to batch and shuffle the dataset:

In [None]:
train_ds = tf.data.Dataset.from_tensor_slices(
    (x_train, y_train)).shuffle(10000).batch(32)

test_ds = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)
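
To get a feel for what batching does, here is a small standard-library sketch of how many batches each dataset yields per epoch. It assumes the standard MNIST split of 60,000 training and 10,000 test images, and that the last partial batch is kept (the default for `.batch()`). The helper name `batches_per_epoch` is made up for this illustration.

```python
import math

# Standard MNIST split sizes.
TRAIN_EXAMPLES = 60_000
TEST_EXAMPLES = 10_000
BATCH_SIZE = 32  # same value passed to .batch() above


def batches_per_epoch(num_examples, batch_size):
    """Number of batches when the final partial batch is kept."""
    return math.ceil(num_examples / batch_size)


train_batches = batches_per_epoch(TRAIN_EXAMPLES, BATCH_SIZE)
test_batches = batches_per_epoch(TEST_EXAMPLES, BATCH_SIZE)
print(train_batches, test_batches)  # 1875 313
```

These should match the step counts in the progress bar that `model.fit` prints for each epoch.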

# MLPs
We can easily create basic [sequential](https://www.tensorflow.org/api_docs/python/tf/keras/Sequential) models (the most common kind) in TensorFlow using the [Keras](https://www.tensorflow.org/guide/keras) API.

Here we'll create an MLP with a single hidden layer that uses the ReLU activation.

In [None]:
model = tf.keras.models.Sequential([
    # Inputs are 28x28 images plus the channels dimension added earlier
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
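
As a rough check on the size of this network, the sketch below counts its trainable parameters by hand. The helper `dense_params` is a made-up name for this illustration; it assumes each `Dense` layer has one weight per input-unit pair plus one bias per unit.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit in a fully connected layer."""
    return n_inputs * n_units + n_units


flattened = 28 * 28  # Flatten has no trainable parameters of its own
hidden = dense_params(flattened, 512)
output = dense_params(512, 10)
total = hidden + output
print(hidden, output, total)  # 401920 5130 407050
```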

Then all we need to do is add an [optimiser](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/) and a [loss](https://www.tensorflow.org/api_docs/python/tf/keras/losses)  function.




In [None]:
model.compile(optimizer='sgd',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
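
To see what the sparse categorical cross-entropy loss measures, here is a simplified standard-library sketch for a single example. "Sparse" means the label is an integer class index rather than a one-hot vector. The function name `sparse_ce` and the probabilities are made up for illustration, and the real Keras loss also clips probabilities and averages over the batch.

```python
import math


def sparse_ce(label, probabilities):
    """Loss for one example: -log of the probability given to the true class."""
    return -math.log(probabilities[label])


# Hypothetical softmax output for a 3-class problem (sums to 1).
probs = [0.1, 0.7, 0.2]

loss_when_right = sparse_ce(1, probs)  # true class was given probability 0.7
loss_when_wrong = sparse_ce(0, probs)  # true class was given probability 0.1
print(round(loss_when_right, 4), round(loss_when_wrong, 4))  # 0.3567 2.3026
```

The loss is small when the model puts high probability on the correct class and grows quickly as that probability shrinks.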


We can then fit our model to the training data and evaluate it on the test data, all in one simple step. Maybe we also want to be able to check out some sweet graphage. Luckily for us, TensorFlow comes with a fantastic visualisation tool called TensorBoard.

In [None]:
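# Timestamped log directory so each run gets its own TensorBoard entry
log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
# Callback that writes metrics (and weight histograms every epoch) to log_dir
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)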

model.fit(train_ds, 
          epochs=5, 
          validation_data=test_ds, callbacks=[tensorboard_callback])



This didn't work in my Firefox without enabling cookies. Some googling informed me that it would require a level of cookies that I was unwilling to accept. If you feel the same way, you can just download this notebook and run it locally :)

In [None]:
%tensorboard --logdir logs/fit

# CNNs

CNNs are neural networks whose convolutional layers slide small filters over the input, which lets them take advantage of the spatial structure inherent in images. They are just as easy to make in Keras as MLPs! :o

In [None]:
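# Fresh timestamped log directory so this run is separate from the MLP run
log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
# Callback that writes metrics (and weight histograms every epoch) to log_dir
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)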

model = tf.keras.models.Sequential([
    # 32 filters, each 3x3
    tf.keras.layers.Conv2D(32, 3),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='sgd',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(train_ds, 
          epochs=5, 
          validation_data=test_ds, callbacks=[tensorboard_callback])
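
It is worth counting how many parameters this CNN has. The sketch below does so by hand, assuming the `Conv2D` defaults of stride 1 and `'valid'` padding and a 28x28x1 input. The helper names are made up for this illustration.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """One kernel_size x kernel_size x in_channels kernel plus a bias per filter."""
    return kernel_size * kernel_size * in_channels * filters + filters


def conv2d_output_size(input_size, kernel_size):
    """Spatial size after a stride-1 convolution with 'valid' padding."""
    return input_size - kernel_size + 1


def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit in a fully connected layer."""
    return n_inputs * n_units + n_units


side = conv2d_output_size(28, 3)        # 26
conv = conv2d_params(3, 1, 32)          # 320
flattened = side * side * 32            # 21632 values go into the Dense layer
hidden = dense_params(flattened, 512)   # 11076096
output = dense_params(512, 10)          # 5130
total = conv + hidden + output
print(total)  # 11081546
```

Almost all of the parameters sit in the first `Dense` layer, because the flattened convolution output is so large.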


In [None]:
%tensorboard --logdir logs/fit

# Tasks for you

Now that you've seen how easy it is to create an MLP/CNN in TensorFlow using Keras, experiment with the parameters.

Change:
*   the activation functions from relu to tanh or sigmoid.
*   the parameters of the optimiser. Hint: use the actual optimiser object instead of a string.
*   the optimiser from SGD to Adam or AdaGrad.
*   the model to have more or fewer layers.

To do this, it might be useful to design some kind of experiment where you sweep some, all, or more than the parameters mentioned above.
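
One possible way to organise such a sweep is a simple grid over configurations. The sketch below only generates the configurations; building and fitting a model for each one is left to you. The search space values and the helper name `grid` are placeholders for illustration, not a recommended setup.

```python
import itertools

# Hypothetical search space; adapt the values to your own experiment.
search_space = {
    "activation": ["relu", "tanh", "sigmoid"],
    "optimiser": ["sgd", "adam", "adagrad"],
    "hidden_layers": [(512,), (256, 256)],
}


def grid(space):
    """Yield one dict per combination of the values in `space`."""
    keys = list(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


configs = list(grid(search_space))
print(len(configs))  # 18
print(configs[0])    # {'activation': 'relu', 'optimiser': 'sgd', 'hidden_layers': (512,)}
```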

Try a dataset other than MNIST.

Hint: CNNs take some time to train, so have a think about how many parameters are in a CNN and the best way to reduce them.


# Extra

Sometimes we want to create models at a lower level of abstraction. We can create a similar MLP and fit it using the subsequent cells. Keep in mind that this still uses Keras, which is a high-level API. You can get even more control by doing all the operations yourself. The level of abstraction you want will depend on your application. In this prac we'll just stick with Keras.

Define a basic MLP using the [model subclassing API](https://www.tensorflow.org/guide/keras#model_subclassing)

In [None]:
class MLP(Model):
  def __init__(self, hidden_layers=(512,), activation='relu', output_dimensions=10):
    super().__init__()
    self._inp = Flatten() #ensure the input is flattened
    # One Dense layer per entry in hidden_layers
    self._densebois = []
    for h in hidden_layers:
      self._densebois.append(Dense(h, activation=activation))
    # No activation here: the model outputs raw logits
    self._out = Dense(output_dimensions)

  def call(self, x):
    x = self._inp(x)
    for layer in self._densebois:
      x = layer(x)
    return self._out(x)

# Create an instance of the model
model = MLP()

Pick an arbitrary [optimiser](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/) for training and define the model's [loss](https://www.tensorflow.org/api_docs/python/tf/keras/losses) function.

In [None]:
loss_object = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

optimizer = tf.keras.optimizers.SGD()

train_loss = tf.keras.metrics.Mean(name='train_loss')
train_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='train_accuracy')

test_loss = tf.keras.metrics.Mean(name='test_loss')
test_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='test_accuracy')
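
Note that the subclassed `MLP` ends in a `Dense` layer with no activation, so it outputs raw logits, which is why the loss is created with `from_logits=True`. The standard-library sketch below illustrates, for one example, that applying softmax and then taking the negative log gives the same value as computing the loss directly from the logits. The function names and logit values are made up for illustration.

```python
import math


def softmax(logits):
    """Turn logits into probabilities (max subtracted for numerical stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def loss_from_probs(label, probs):
    """Cross-entropy when the model already outputs probabilities."""
    return -math.log(probs[label])


def loss_from_logits(label, logits):
    """Same loss computed directly as logsumexp(logits) - logits[label]."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


logits = [2.0, 1.0, 0.1]  # hypothetical raw outputs of a final Dense layer
label = 0
via_probs = loss_from_probs(label, softmax(logits))
via_logits = loss_from_logits(label, logits)
print(math.isclose(via_probs, via_logits))  # True
```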

TensorFlow runs much faster if we compile our training and validation steps into TensorFlow graphs. We do this in Python using the `@tf.function` decorator.



In [None]:
@tf.function
def train_step(images, labels):
  with tf.GradientTape() as tape:
    # training=True is only needed if there are layers with different
    # behavior during training versus inference (e.g. Dropout).
    predictions = model(images, training=True)
    loss = loss_object(labels, predictions)
  gradients = tape.gradient(loss, model.trainable_variables)
  optimizer.apply_gradients(zip(gradients, model.trainable_variables))

  train_loss(loss)
  train_accuracy(labels, predictions)
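
To make the `apply_gradients` step less magical, here is a toy standard-library sketch of what plain SGD does with a gradient: it moves each weight a small step against it. It uses a single weight and a made-up loss, (w - 3)^2, and assumes a learning rate of 0.01, which is the default for `tf.keras.optimizers.SGD`. The function names are hypothetical.

```python
LEARNING_RATE = 0.01  # default learning rate of tf.keras.optimizers.SGD


def gradient(w):
    """Derivative of the toy loss (w - 3)^2 with respect to w."""
    return 2.0 * (w - 3.0)


def sgd_step(w, grad, learning_rate=LEARNING_RATE):
    """Plain SGD update: step against the gradient."""
    return w - learning_rate * grad


w = 0.0
first = sgd_step(w, gradient(w))  # one step moves w from 0.0 towards 3.0
for _ in range(1000):
    w = sgd_step(w, gradient(w))
print(round(first, 2), round(w, 4))  # 0.06 3.0
```

In `train_step` the same idea is applied to every trainable variable at once, with the gradients supplied by `tf.GradientTape`.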

In [None]:
@tf.function
def test_step(images, labels):
  # training=False is only needed if there are layers with different
  # behavior during training versus inference (e.g. Dropout).
  predictions = model(images, training=False)
  t_loss = loss_object(labels, predictions)

  test_loss(t_loss)
  test_accuracy(labels, predictions)

We can then simply train our model for a number of epochs.

In [None]:
EPOCHS = 5

for epoch in range(EPOCHS):
  # Reset the metrics at the start of each epoch
  # (newer TensorFlow releases name this method reset_state())
  train_loss.reset_states()
  train_accuracy.reset_states()
  test_loss.reset_states()
  test_accuracy.reset_states()

  for images, labels in train_ds:
    train_step(images, labels)

  for test_images, test_labels in test_ds:
    test_step(test_images, test_labels)

  template = 'Epoch {}, Loss: {}, Accuracy: {}, Test Loss: {}, Test Accuracy: {}'
  print(template.format(epoch + 1,
                        train_loss.result(),
                        train_accuracy.result() * 100,
                        test_loss.result(),
                        test_accuracy.result() * 100))

This training/validation loop looks pretty useful, but we want to be able to apply it to many models. So let's make a class for that :)

In [None]:
class ModelTester:
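  def __init__(self, model, optimiser=None, loss_object=None):
    self._model = model
    # Create the defaults per instance: default arguments are evaluated once,
    # so an optimiser built in the signature would be shared by every tester
    self._optimiser = optimiser if optimiser is not None else tf.keras.optimizers.SGD()
    self._loss_object = (loss_object if loss_object is not None
                         else tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))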
    self._train_loss = tf.keras.metrics.Mean(name='train_loss')
    self._train_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='train_accuracy')

    self._test_loss = tf.keras.metrics.Mean(name='test_loss')
    self._test_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='test_accuracy')
  
  @tf.function
  def train_step(self, images, labels):
    with tf.GradientTape() as tape:
      # training=True is only needed if there are layers with different
      # behavior during training versus inference (e.g. Dropout).
      predictions = self._model(images, training=True)
      loss = self._loss_object(labels, predictions)
    gradients = tape.gradient(loss, self._model.trainable_variables)
    self._optimiser.apply_gradients(zip(gradients, self._model.trainable_variables))

    self._train_loss(loss)
    self._train_accuracy(labels, predictions) 

  @tf.function
  def test_step(self, images, labels):
    # training=False is only needed if there are layers with different
    # behavior during training versus inference (e.g. Dropout).
    predictions = self._model(images, training=False)
    t_loss = self._loss_object(labels, predictions)

    self._test_loss(t_loss)
    self._test_accuracy(labels, predictions)

  def train(self, train_ds, test_ds, epochs):
    for epoch in range(epochs):
      # Reset the metrics at the start of each epoch
      # (newer TensorFlow releases name this method reset_state())
      self._train_loss.reset_states()
      self._train_accuracy.reset_states()
      self._test_loss.reset_states()
      self._test_accuracy.reset_states()

      for images, labels in train_ds:
        self.train_step(images, labels)

      for test_images, test_labels in test_ds:
        self.test_step(test_images, test_labels)

      template = 'Epoch {}, Loss: {}, Accuracy: {}, Test Loss: {}, Test Accuracy: {}'
      print(template.format(epoch + 1,
                            self._train_loss.result(),
                            self._train_accuracy.result() * 100,
                            self._test_loss.result(),
                            self._test_accuracy.result() * 100))

In [None]:
tester = ModelTester(model)

In [None]:
tester.train(train_ds, test_ds, 5)