<a href="https://colab.research.google.com/github/berthine/SIAM-Summer-School/blob/main/Pretrained_Networks.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Practical: Pretraining networks - transfer learning
(20/July/2021)

### 2021 Gene Golub SIAM Summer School 
https://sites.google.com/aims.ac.za/g2s3/home 

Instructor

<font color="green">***Dr. Emmanuel Dufourq*** 

www.emmanueldufourq.com

edufourq (['@']) gmail.com

***African Institute for Mathematical Sciences***

***Stellenbosch University***

***2021***


material adapted from: https://neptune.ai/blog/transfer-learning-guide-examples-for-images-and-text-in-keras


![](https://storage.googleapis.com/kaggle-competitions/kaggle/3362/media/woof_meow.jpg)

***Deep Blue beat Kasparov at chess in 1997.***

***Watson beat the brightest trivia minds at Jeopardy in 2011.***

***Google DeepMind's AlphaGo outperformed European Go champion Fan Hui in 2015.***

***Can your neural network tell Fido from Mittens in 2021?***

## <font color="green"> Learning outcomes:

* How to use ```image_dataset_from_directory``` to read data from a folder

* The ```GlobalAveragePooling2D``` layer

* How to implement a pre-trained network, in this case MobileNetV2

* How to implement ```preprocess_input``` from one of the ```tensorflow.keras.applications```

* Downloading data from somewhere and using it in Google Colab

## <font color="green">Data information:

* Features: (150x150x3) images

* Output: 2 classes

## <font color="green">Tasks for participants (boolean)?

* No, follow along and make your own modifications and make sure you understand


In [None]:
import matplotlib.pyplot as plt
import numpy as np
import os
import tensorflow as tf
from tensorflow.keras.preprocessing import image_dataset_from_directory
from tensorflow.keras.layers import Input, GlobalAveragePooling2D, Dense, Dropout
from tensorflow.keras import Model
from sklearn.metrics import confusion_matrix, accuracy_score
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
from tensorflow.keras import metrics

import zipfile

## First, let's download the cats vs dogs dataset.

Dataset https://www.kaggle.com/c/dogs-vs-cats 

"In this competition, you'll write an algorithm to classify whether images contain either a dog or a cat.  This is easy for humans, dogs, and cats. Your computer will find it a bit more difficult."


In [None]:
!wget --no-check-certificate \
    https://namespace.co.ke/ml/dataset.zip \
    -O /content/catsdogs.zip

--2021-07-14 10:31:06--  https://namespace.co.ke/ml/dataset.zip
Resolving namespace.co.ke (namespace.co.ke)... 109.106.250.14
Connecting to namespace.co.ke (namespace.co.ke)|109.106.250.14|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 228894139 (218M) [application/zip]
Saving to: ‘/content/catsdogs.zip’


2021-07-14 10:32:28 (2.71 MB/s) - ‘/content/catsdogs.zip’ saved [228894139/228894139]



Now we extract the data. Check the "Files" panel on the left. You should see the .zip file and a new folder cats_dogs, which contains the training and test folders.

In [None]:
with zipfile.ZipFile('/content/catsdogs.zip', 'r') as zip_ref:
    zip_ref.extractall('/content/cats_dogs')
base_dir = '/content/cats_dogs/dataset'
train_dir = os.path.join(base_dir, 'training_set')
validation_dir = os.path.join(base_dir, 'test_set')

TensorFlow has a function called ```image_dataset_from_directory``` that reads images from a folder and creates a dataset. Its argument ```label_mode='categorical'``` gives us one-hot encoded targets! API: https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image_dataset_from_directory

Since the training directory has two subfolders 'cats' and 'dogs' those will be the two classes.
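As a rough illustration of what one-hot encoded targets look like for these two classes, here is a small NumPy sketch. It does not call TensorFlow; the helper `one_hot` and the variable names are made up for this example, and it assumes the class order comes from sorting the subfolder names.

```python
import numpy as np

# Class order is assumed to follow the sorted subfolder names.
class_names = sorted(["dogs", "cats"])
class_index = {name: i for i, name in enumerate(class_names)}

def one_hot(name):
    """Hypothetical helper: one-hot vector for a class name."""
    vec = np.zeros(len(class_names), dtype=np.float32)
    vec[class_index[name]] = 1.0
    return vec

print(class_names)
print("cats ->", one_hot("cats"))
print("dogs ->", one_hot("dogs"))
```

With ```label_mode='categorical'``` each label in a batch has this form, which is why the labels of a batch of 32 images have shape (32, 2).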

Let's create the training data

In [None]:
training_set = image_dataset_from_directory(train_dir,
                                             shuffle=True,
                                             batch_size=32,
                                             label_mode='categorical',
                                             image_size=(150, 150))

Found 8000 files belonging to 2 classes.


Now the testing data.

Since the testing directory has two subfolders 'cats' and 'dogs' those will be the two classes.

In [None]:
test_dataset = image_dataset_from_directory(validation_dir,
                                                  shuffle=True,
                                                  batch_size=32,
                                                  label_mode='categorical',
                                                  image_size=(150, 150))

Found 2000 files belonging to 2 classes.


## Download a pre-trained model.

Here we download MobileNetV2, API here: https://www.tensorflow.org/api_docs/python/tf/keras/applications/mobilenet_v2

The ```include_top``` argument is important here. Formally: "Boolean, whether to include the fully-connected layer at the top of the network." Since we want to add our own output softmax layer, we set this argument to False. When it is False, we essentially get the 'feature extractor' and no 'classifier' part of the network. MobileNetV2 was pre-trained on a lot of classes (1000) and our problem only has two classes, so we need neither the original softmax layer nor any other fully connected layers in the 'classifier' part.

We also tell it to load the 'imagenet' weights, i.e. the 1000 class problem it was originally trained on. This means the model won't be initialised with random weights, but with the weights obtained from pre-training on ImageNet.
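To inspect what the headless network returns without downloading anything, one option is to build the same architecture with ```weights=None```. That choice is an assumption made only for this sketch: it gives random weights, which are fine for checking shapes and useless for transfer learning. The variable names are illustrative.

```python
from tensorflow.keras.applications import MobileNetV2

# weights=None: random initialisation, used here only to inspect the architecture.
headless = MobileNetV2(weights=None,
                       input_shape=(150, 150, 3),
                       include_top=False)

feature_shape = headless.output_shape   # shape of the feature map per image
n_params = headless.count_params()      # parameters in the feature extractor

print(feature_shape)
print(n_params)
```

The output is a spatial feature map rather than 1000 class scores, so we still need to pool it and attach our own classifier.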

There are a number of models that we can download, API here: https://www.tensorflow.org/api_docs/python/tf/keras/applications

The approach we take in building this model is similar to the Functional API we examined earlier.

In [None]:
base_model = MobileNetV2(
    weights='imagenet',  
    input_shape=(150, 150, 3),
    include_top=False) 

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_224_no_top.h5


Since many pre-trained models have `tf.keras.layers.BatchNormalization` layers, it’s important to freeze those layers. Otherwise, the layers' moving mean and variance will be updated during training, which can destroy what the model has already learned.

Let’s freeze all the layers in this case. You'll notice when we eventually print the model.summary() there will be a bunch of non-trainable weights. In other words, when setting .trainable to False, we now ensure that the weights loaded do not update.
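The effect of ```.trainable = False``` can be checked on a toy model. The sketch below uses a tiny stand-in network (`toy_base`, a hypothetical name) instead of MobileNetV2 so that it runs quickly; the same attributes exist on any Keras model.

```python
import tensorflow as tf

# Hypothetical stand-in for a pre-trained base: one conv layer plus batch norm.
toy_base = tf.keras.Sequential([
    tf.keras.Input(shape=(8, 8, 3)),
    tf.keras.layers.Conv2D(4, 3),
    tf.keras.layers.BatchNormalization(),
])

n_before = len(toy_base.trainable_weights)      # conv kernel/bias, BN gamma/beta
toy_base.trainable = False                      # freeze every layer
n_after = len(toy_base.trainable_weights)       # nothing left to update
n_frozen = len(toy_base.non_trainable_weights)  # includes BN moving statistics

print(n_before, n_after, n_frozen)
```

After freezing, every weight is reported as non-trainable, which is what shows up later as non-trainable parameters in ```model.summary()```.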

In [None]:
base_model.trainable = False

Let's create a new input for our dataset. Our dataset has colour images of shape 150 x 150.

In [None]:
inputs = Input(shape=(150, 150, 3))

Note: each Keras Application expects a specific kind of input preprocessing. For MobileNetV2, call ```tf.keras.applications.mobilenet_v2.preprocess_input``` on your inputs before passing them to the model. For another model, you will have to use that model's own ```preprocess_input```!

```mobilenet_v2.preprocess_input``` will scale input pixels between -1 and 1. API for mobilenetv2 preprocess: 

https://www.tensorflow.org/api_docs/python/tf/keras/applications/mobilenet_v2/preprocess_input

Normally, we would scale the values between 0 and 1 by dividing by 255, but in this case we don't.
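The scaling amounts to dividing by 127.5 and subtracting 1, which matches the `truediv` and `subtract` layers that appear in the model summary. A NumPy sketch of the same arithmetic (the helper name `scale_to_minus_one_one` is made up for illustration and is not part of Keras):

```python
import numpy as np

def scale_to_minus_one_one(pixels):
    """Hypothetical helper: map pixel values in [0, 255] to [-1, 1]."""
    return pixels / 127.5 - 1.0

pixels = np.array([0.0, 127.5, 255.0])
scaled = scale_to_minus_one_one(pixels)
print(scaled)
```

Black pixels map to -1, white pixels to 1 and mid-grey to 0.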

In [None]:
x = preprocess_input(inputs)

We ensure that the base model runs in inference mode by passing `training=False`, so that the batch normalization layers are not updated during the fine-tuning stage. Note that we are freezing the feature extractor part of the pre-trained network. We will add on new layers and those layers will be updated via normal backpropagation.

Next, we take the output feature map of the pre-trained network and pass it to a new ```GlobalAveragePooling2D``` layer, followed by dropout and a new fully connected layer. Of these new layers, only the fully connected layer has weights; they are randomly initialised and will be updated during training.
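To make the shapes concrete, here is a NumPy sketch of what this new head computes at inference time, where dropout does nothing. The random `features` array stands in for the base model's output, and `kernel` and `bias` stand in for the weights of the fully connected layer; all names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the base model output: 4 images, 5x5 feature map, 1280 channels.
features = rng.normal(size=(4, 5, 5, 1280))

# Global average pooling: average over the two spatial axes.
pooled = features.mean(axis=(1, 2))            # shape (4, 1280)

# Fully connected layer with 2 outputs, followed by softmax.
kernel = rng.normal(scale=0.01, size=(1280, 2))
bias = np.zeros(2)
logits = pooled @ kernel + bias
exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
probs = exp / exp.sum(axis=-1, keepdims=True)  # shape (4, 2), rows sum to 1

n_head_params = kernel.size + bias.size        # 1280 * 2 + 2
print(pooled.shape, probs.shape, n_head_params)
```

The parameter count of the head, 1280 * 2 + 2 = 2562, is the number reported for the dense layer in the model summary.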

Finally, we create the model which has 1 input branch and 1 output branch.

In [None]:
x = base_model(x, training=False)
x = GlobalAveragePooling2D()(x)
x = Dropout(0.2)(x)  
outputs = Dense(2, activation='softmax')(x)
model = Model(inputs, outputs)

In [None]:
model.summary()

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         [(None, 150, 150, 3)]     0         
_________________________________________________________________
tf.math.truediv (TFOpLambda) (None, 150, 150, 3)       0         
_________________________________________________________________
tf.math.subtract (TFOpLambda (None, 150, 150, 3)       0         
_________________________________________________________________
mobilenetv2_1.00_224 (Functi (None, 5, 5, 1280)        2257984   
_________________________________________________________________
global_average_pooling2d (Gl (None, 1280)              0         
_________________________________________________________________
dropout (Dropout)            (None, 1280)              0         
_________________________________________________________________
dense (Dense)                (None, 2)                 2562  

Compile as normal

In [None]:
model.compile(optimizer='adam', loss=tf.keras.losses.CategoricalCrossentropy(), metrics=['accuracy'])

Train as normal, except that our training features and labels now come from a single tf.data.Dataset object because of the ```image_dataset_from_directory``` call we used earlier.

In [None]:
model.fit(training_set, epochs=3)

Epoch 1/3
Epoch 2/3
Epoch 3/3


<tensorflow.python.keras.callbacks.History at 0x7fe0419fab10>

Let's create two arrays to store the predictions and the true labels for the test data.

In [None]:
predictions = np.array([])
labels =  np.array([])

Now let's iterate on the test data (a tf.data.Dataset object) and predict along with obtaining the true values.
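Both the softmax outputs and the one-hot labels are turned into integer class IDs with ```np.argmax```. A small sketch on made-up values (the arrays below are invented for illustration, not taken from the dataset):

```python
import numpy as np

# Invented softmax outputs for a batch of three images.
softmax_batch = np.array([[0.9, 0.1],
                          [0.2, 0.8],
                          [0.4, 0.6]])
# Invented one-hot labels for the same three images.
one_hot_labels = np.array([[1, 0],
                           [0, 1],
                           [1, 0]])

predicted = np.argmax(softmax_batch, axis=-1)   # index of the largest probability
true = np.argmax(one_hot_labels, axis=-1)       # index of the 1 in each row
batch_accuracy = float(np.mean(predicted == true))

print(predicted, true, batch_accuracy)
```

Concatenating these per-batch integer arrays, as in the loop that follows, gives one prediction and one label per test image.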

In [None]:
for x, y in test_dataset:
  predictions = np.concatenate([predictions, np.argmax(model.predict(x),axis=-1)])
  labels = np.concatenate([labels, np.argmax(y.numpy(), axis=-1)])

In [None]:
labels

array([1., 0., 0., ..., 1., 0., 1.])

In [None]:
predictions

array([1., 0., 0., ..., 1., 0., 1.])

Finally, we end with the confusion matrix and the accuracy.
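The accuracy can also be read directly off a confusion matrix: it is the sum of the diagonal divided by the total count. The sketch below uses the matrix values from this notebook's run and assumes scikit-learn's convention that rows are true classes and columns are predicted classes.

```python
import numpy as np

# Confusion matrix values from this notebook's run.
cm = np.array([[971,  29],
               [ 25, 975]])

accuracy = np.trace(cm) / cm.sum()                 # correct predictions / all predictions
recall_per_class = np.diag(cm) / cm.sum(axis=1)    # fraction of each true class found

print(accuracy)
print(recall_per_class)
```

Here (971 + 975) / 2000 = 0.973, which agrees with ```accuracy_score```.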

In [None]:
confusion_matrix(labels, predictions)

array([[971,  29],
       [ 25, 975]])

In [None]:
accuracy_score(labels,predictions)

0.973

## Predict on images downloaded from Google Images.

Download some images of cats and some of dogs from the Internet and upload them for prediction.

Note that the class encoding is obtained by sorting the subfolder names in order (see https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image_dataset_from_directory), so cat comes first and dog second. Cat is therefore encoded as [1 0] and dog as [0 1]; if you use np.argmax(), cat is the integer 0 and dog the integer 1.

In [None]:
import numpy as np
from google.colab import files
from tensorflow.keras.preprocessing import image

uploaded = files.upload()

for fn in uploaded.keys():

  # Load each uploaded image at the size the model expects
  img = image.load_img(fn, target_size=(150, 150))
  x = image.img_to_array(img)
  # Add a batch dimension: (150, 150, 3) -> (1, 150, 150, 3)
  x = np.expand_dims(x, axis=0)

  # No manual scaling here: preprocess_input is already part of the model
  classes = model.predict(x)
  print('file:', fn)
  print('softmax output', classes)
  print('integer output', np.argmax(classes, axis=-1))

## Task

* Re-run the code but this time **do not train the model**. Instead, predict directly on the test data and compare the performance. The model should do badly because the new fully connected layer was initialised with random values and not trained.

* Try to modify the classifier part of the network. In this example we used ```GlobalAveragePooling2D``` followed by ```Dropout```, but perhaps you could add another fully connected layer?