# Deep learning with Colab

We can use Colab to run Jupyter notebooks, which look slightly different from the standard interface, with access to GPUs and TPUs for training deep neural networks. The notebook can be saved to your Google Drive as a standard Jupyter notebook.

First, configure the notebook to run TensorFlow on a GPU. Click on the 'Runtime' heading in the menu at the top of the page, and choose 'Change runtime type'. Then select 'GPU' as the hardware accelerator. *Note: if you change the hardware accelerator later, your notebook will be reset!*

In [7]:
# Colab-only line magic: remove this line when running in a plain Jupyter/IPython session.
%tensorflow_version 2.x
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, Dropout
from tensorflow.keras.regularizers import l1,l2
from tensorflow.keras.utils import to_categorical, normalize

import matplotlib.pyplot as plt



## CIFAR-10 data

We will use the CIFAR-10 data set provided with Keras. It contains a set of small labelled images of 10 different types of object. You can read more about it here: http://www.cs.toronto.edu/~kriz/cifar.html

In [9]:
(x_train,y_train),(x_test,y_test) = keras.datasets.cifar10.load_data()


Use the shape attribute to show the dimensions of the x_train and y_train arrays.

The first axis of the training data array corresponds to the training images, while the remaining dimensions are the x and y axes of the image, and the three colour channels (red, green and blue). We can use matplotlib to examine the images - try using plt.imshow on some of the images in the training data.
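
As a minimal sketch of this layout, the block below builds a small random array with the same axis order as x_train and picks out one image and one colour channel. The names `x_demo` and `y_demo` are hypothetical stand-ins, not the real data.

```python
import numpy as np

# Hypothetical stand-in for x_train / y_train: 5 random "images" and labels.
rng = np.random.default_rng(0)
x_demo = rng.integers(0, 256, size=(5, 32, 32, 3)).astype(np.uint8)
y_demo = rng.integers(0, 10, size=(5, 1)).astype(np.uint8)

print(x_demo.shape)  # axes: (image, row, column, colour channel)
print(y_demo.shape)  # axes: (image, 1), one integer label per image

first_image = x_demo[0]             # a single 32x32 RGB image
red_channel = first_image[:, :, 0]  # the red channel of that image

# In the notebook, plt.imshow(x_train[0]) would display a real training image.
```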

Before we start training, we will convert the target labels for the images (y_train and y_test) to use a categorical or 'one-hot' encoding. This replaces a one-dimensional integer vector with an array where each column corresponds to a possible value of the integer. All values in a row are zero, except in the column corresponding to the value we want to represent, which is one.

In [0]:
y_train_cat = to_categorical(y_train)
y_test_cat = to_categorical(y_test)

Try printing out y_train and y_train_cat to see how this has changed the labelling.
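
To make the encoding concrete, here is a small NumPy sketch of the same idea. The helper `one_hot` is a hypothetical illustration, not the Keras implementation of to_categorical.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return an array with one row per label and a single 1 in each row."""
    labels = np.asarray(labels).ravel()
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# Three example labels, stored as a column like y_train.
y_demo = np.array([[3], [0], [9]])
y_demo_cat = one_hot(y_demo, num_classes=10)
print(y_demo_cat)
```

Each row contains a single one, in the column given by the original integer label.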


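We also need to normalise the image data, as it is currently stored as integers in the range 0-255. We can use the Keras normalize function to do this. With its default arguments it rescales the values along the last axis (the three colour channels of each pixel) to unit L2 norm: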

```python
x_train = normalize(x_train)
x_test = normalize(x_test)
```
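
The sketch below compares two ways of normalising, using NumPy only. It assumes the default arguments of normalize (axis=-1, order=2), and both helper functions are hypothetical re-implementations for illustration. Dividing by 255 is a common alternative that keeps the relative brightness of pixels.

```python
import numpy as np

def l2_normalize_last_axis(x):
    """Scale each vector along the last axis to unit L2 norm."""
    x = np.asarray(x, dtype=np.float64)
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid dividing by zero for black pixels
    return x / norms

def scale_to_unit_range(x):
    """Map integer pixel values 0-255 to floats in the range 0-1."""
    return np.asarray(x, dtype=np.float32) / 255.0

# A 1x1 "image" with a single RGB pixel.
pixel = np.array([[[3, 4, 0]]], dtype=np.uint8)
unit_norm = l2_normalize_last_axis(pixel)
scaled = scale_to_unit_range(pixel)
print(unit_norm)
print(scaled)
```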

Now we can construct a Keras model for a deep neural network to use as a classifier. We will use the Sequential model class, and add layers one by one in order, from input to output.

First create an empty model using the Sequential() constructor:

```python
model = Sequential()
```

To start with, we will ignore the fact that the input is an image and treat it as a large vector of inputs. *This is not a sensible approach!*

Keras is able to infer the dimensions and connections between each layer, provided it is given the shape of the inputs. In our data, each data point (image) is a 32x32x3 array of values.

```python
model.add(Flatten(input_shape=(32, 32, 3)))
```
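
As a rough NumPy equivalent of what the Flatten layer does, each 32x32x3 image becomes a single vector of 3072 values. The `batch` array is a hypothetical placeholder for a few images.

```python
import numpy as np

# Hypothetical batch of 4 images with the CIFAR-10 image shape.
batch = np.zeros((4, 32, 32, 3), dtype=np.float32)

# Keep the batch axis and merge all remaining axes into one.
flat = batch.reshape(batch.shape[0], -1)
print(flat.shape)
```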

We can then add fully connected (dense) layers of hidden units to the model, by specifying the number of hidden units to use. The activation function to use can also be specified.

```python
model.add(Dense(256, activation='relu'))
```
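
A dense layer with a ReLU activation can be sketched in NumPy as a matrix product plus a bias, followed by clipping negative values to zero. The weights below are random placeholders, and `dense_relu` is a hypothetical helper rather than Keras code.

```python
import numpy as np

def dense_relu(inputs, weights, bias):
    """Fully connected layer: every output unit sees every input value."""
    return np.maximum(inputs @ weights + bias, 0.0)

rng = np.random.default_rng(0)
inputs = rng.normal(size=(4, 3072))                 # 4 flattened images
weights = rng.normal(scale=0.01, size=(3072, 256))  # one column per hidden unit
bias = np.zeros(256)

hidden = dense_relu(inputs, weights, bias)
print(hidden.shape)
```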

Try adding two more dense layers to your model, of decreasing size. If you use too many hidden units, training may be slow, so try starting with 256.

Finally we need an output layer. This is constructed just like any other dense layer, but we need to specify a sensible activation to use. For binary classification, we could use a sigmoid activation. For multi-class classification, we can use softmax (see lecture 5).

```python
model.add(Dense(10, activation='softmax'))
```

We need to specify the dimension of the output. Since we have used one-hot encoding for the labels in y_train_cat, we need to have one output for each category.
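
The softmax activation turns the raw outputs of the layer into probabilities that sum to one across the categories. A minimal NumPy sketch, with made-up input values:

```python
import numpy as np

def softmax(logits):
    """Convert raw scores to probabilities along the last axis."""
    # Subtracting the maximum does not change the result but avoids overflow.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Two hypothetical sets of raw scores for a 3-class problem.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
probs = softmax(logits)
print(probs)
```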

Now we need to compile our model, specifying the optimizer to use, the loss, and any metrics we wish to calculate. We will use the categorical_crossentropy loss, which is the negative log-likelihood for a categorical variable.

```python
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
```
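
For one-hot labels, the categorical cross-entropy of a single example is minus the log of the probability assigned to the true class. The NumPy sketch below illustrates this with made-up predictions. The function is a hypothetical re-implementation, and the clipping constant `eps` is an assumption.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-example loss: -sum over classes of y_true * log(y_pred)."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # guard against log(0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

# Two examples: true classes 0 and 1, with hypothetical predicted probabilities.
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1]])

losses = categorical_crossentropy(y_true, y_pred)
print(losses.mean())
```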

Now use

```python
model.summary()
```

to see the structure of your neural network.
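
The parameter counts in the summary can be checked by hand. The sketch below does this for the smallest version of the model, assuming only the one 256-unit hidden layer and the 10-unit output layer. Any extra dense layers you add will contribute further terms.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

flat_size = 32 * 32 * 3                      # output size of the Flatten layer
hidden_params = dense_params(flat_size, 256)
output_params = dense_params(256, 10)
total_params = hidden_params + output_params

print(flat_size, hidden_params, output_params, total_params)
```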

We can fit the model using the fit method, specifying the minibatch size and the total number of passes over the data (epochs).

```python
model.fit(x_train, y_train_cat, batch_size=32, epochs=10)
```
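
To see what the batch size and epochs mean in practice, the sketch below splits the indices of the 50,000 CIFAR-10 training images into shuffled minibatches of 32. The generator `minibatch_indices` is a hypothetical illustration of one epoch, not how Keras is implemented.

```python
import math
import numpy as np

def minibatch_indices(n_samples, batch_size, rng):
    """Yield shuffled index arrays that together cover the data once (one epoch)."""
    order = rng.permutation(n_samples)
    for start in range(0, n_samples, batch_size):
        yield order[start:start + batch_size]

n_samples, batch_size = 50000, 32
steps_per_epoch = math.ceil(n_samples / batch_size)

batches = list(minibatch_indices(n_samples, batch_size, np.random.default_rng(0)))
print(steps_per_epoch, len(batches), len(batches[-1]))
```

The last minibatch is smaller than the others because 50,000 is not a multiple of 32.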

You may find that the performance differs noticeably between training runs, because the weights are initialised randomly and the training data are shuffled. After the training has completed, if the fit() method is called again, training will continue from the weights reached at the end of the previous call to fit().



Now we can evaluate our model on the test data:

```python
model.evaluate(x_test, y_test_cat)
```
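
Because the model was compiled with the accuracy metric, the evaluation reports the accuracy alongside the loss. As a sketch of what that metric measures for one-hot labels, the NumPy block below counts how often the most probable predicted class matches the true class. The predictions are made-up values.

```python
import numpy as np

def accuracy(y_true_cat, y_prob):
    """Fraction of examples whose most probable class is the true class."""
    true_class = np.argmax(y_true_cat, axis=-1)
    predicted_class = np.argmax(y_prob, axis=-1)
    return np.mean(true_class == predicted_class)

# Hypothetical one-hot labels and predicted probabilities for 4 examples.
y_true_cat = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]])
y_prob = np.array([[0.8, 0.1, 0.1],
                   [0.2, 0.7, 0.1],
                   [0.5, 0.3, 0.2],
                   [0.1, 0.6, 0.3]])

acc = accuracy(y_true_cat, y_prob)
print(acc)
```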

Now we will construct a convolutional deep neural network, which is more sensible for working with image data. This uses convolutional layers, which connect each unit to only a subset of the units in the previous layer, but do so with several different sets of weights. More on this next lecture.

For a convolutional layer, we specify the depth of the layer (the number of filters) and the size of the patch of units in the previous layer that each unit considers (the kernel size).

```python
conv_model = Sequential()
conv_model.add(Conv2D(64,3,input_shape=(32,32,3),activation='relu'))
conv_model.add(Conv2D(32,3,activation='relu'))
conv_model.add(Conv2D(16,3,activation='relu'))
```
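
The sketch below shows, for a single channel and a single filter, how a 3x3 kernel is slid over an image. It assumes the Conv2D defaults of no padding and a stride of 1, so each layer shrinks the image by 2 pixels in each direction. The function `conv2d_valid` is a hypothetical NumPy illustration. A real Conv2D layer also sums over input channels, adds a bias, and applies many filters at once.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide one kernel over a single-channel image (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Each output unit only sees a small patch of the input.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.ones((32, 32))   # placeholder single-channel image
kernel = np.ones((3, 3))    # placeholder 3x3 set of weights
feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)

# Spatial size after the three stacked 3x3 convolutional layers.
size = 32
for _ in range(3):
    size = size - 3 + 1
print(size)
```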

It is common to then add fully connected (dense) layers at the end:

```python
conv_model.add(Flatten())
conv_model.add(Dense(128,activation='relu'))
conv_model.add(Dense(10,activation='softmax'))
```

Construct and compile this model, and look at the model summary.
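
The numbers in the summary can be checked by hand. The sketch below assumes the Conv2D defaults (no padding, stride 1), so the feature maps entering the Flatten layer are 26x26 with 16 channels. The helper functions are hypothetical.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """One kernel_size x kernel_size x in_channels kernel plus a bias per filter."""
    return kernel_size * kernel_size * in_channels * filters + filters

def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

conv_counts = [
    conv2d_params(3, 3, 64),   # first layer sees the 3 colour channels
    conv2d_params(3, 64, 32),
    conv2d_params(3, 32, 16),
]
flat_size = 26 * 26 * 16       # size of the flattened feature maps
dense_counts = [dense_params(flat_size, 128), dense_params(128, 10)]
total_params = sum(conv_counts) + sum(dense_counts)

print(conv_counts, flat_size, dense_counts, total_params)
```

Most of the parameters sit in the first dense layer rather than in the convolutional layers.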


We can specify the validation data when training, and Keras will calculate the validation accuracy at the end of each epoch:

```python
# Use the one-hot encoded labels (y_train_cat) defined earlier.
conv_model.fit(x_train, y_train_cat, batch_size=32, epochs=20,
               validation_data=(x_test, y_test_cat))
```

Try this for the conv_model neural network. You can try changing the sizes of the layers to see how this affects the performance, but you will need to re-construct and re-compile the model each time.

You should see the training accuracy increase to a very high value, but the validation accuracy peak at a much lower value, or even begin to decrease. This is a sign that the model is overfitting the training data.

Now we can try adding regularisation to see how this improves the training. Try using this model first:

```python
conv_model_reg = Sequential()
conv_model_reg.add(Conv2D(64,3,input_shape=(32,32,3),activation='relu',kernel_regularizer=l2(0.01)))
conv_model_reg.add(Conv2D(32,3,activation='relu',kernel_regularizer=l2(0.01)))
conv_model_reg.add(Conv2D(16,3,activation='relu',kernel_regularizer=l2(0.01)))
conv_model_reg.add(Flatten())
conv_model_reg.add(Dense(128,activation='relu',kernel_regularizer=l2(0.01)))
conv_model_reg.add(Dense(10,activation='softmax'))
```
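
An L2 kernel regulariser adds a penalty to the loss that grows with the squared size of the weights, here with a strength of 0.01. A minimal NumPy sketch of that penalty is shown below. The weight matrix and the data loss are made-up values, and `l2_penalty` is a hypothetical helper.

```python
import numpy as np

def l2_penalty(weights, strength=0.01):
    """Penalty added to the loss: strength times the sum of squared weights."""
    return strength * np.sum(np.square(weights))

weights = np.array([[1.0, -2.0],
                    [0.5, 0.0]])   # hypothetical layer weights
data_loss = 0.3                    # hypothetical cross-entropy value

penalty = l2_penalty(weights)
total_loss = data_loss + penalty
print(penalty, total_loss)
```

Large weights are penalised most heavily, which pushes the optimiser towards smaller weights.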

Compile the model, look at the summary, and try fitting it. You might want to explore how changing the regularisation parameter changes the performance.

## TensorBoard

We can visualise the training and validation metrics over training epochs using TensorBoard. This can be used in Jupyter notebooks with the tensorboard extension:

```python
%load_ext tensorboard
```

We can run a fit of the model and log the metrics at each epoch using:

```python
# Write the metrics for each epoch to the given log directory.
tensorboard_callback = keras.callbacks.TensorBoard(log_dir="logs/scalarsConv")
conv_model.fit(x_train, y_train_cat, batch_size=32, epochs=20,
               validation_data=(x_test, y_test_cat),
               callbacks=[tensorboard_callback])
```

Try this for both the un-regularised and regularised models. You should specify a different value for the log_dir argument for each model, e.g. "logs/scalarsConv" and "logs/scalarsConvReg".

Now display the output using TensorBoard with:

```python
%tensorboard --logdir logs
```

## Further things to try

You could also look at:

 - Changing the numbers of layers in the convolutional neural network.
 - Adding L1 regularisation.
 - Using a Dropout layer to add dropout for the hidden layers.
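
As background for the last item, the sketch below shows in NumPy what a dropout layer does during training. A random fraction of the hidden units (the rate) is set to zero and the remaining units are scaled up by 1 / (1 - rate). At test time the activations pass through unchanged. The function `dropout` is a hypothetical illustration, not the Keras layer itself.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Randomly zero a fraction `rate` of units and rescale the rest."""
    if not training:
        return activations
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
hidden = np.ones((2, 8))  # placeholder hidden-layer activations

dropped = dropout(hidden, rate=0.5, rng=rng)
unchanged = dropout(hidden, rate=0.5, rng=rng, training=False)
print(dropped)
```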