# CS4243 - Lab 11
Computer Vision & Pattern Recognition

Week 11

Author: Dr. Amirhassan MONAJEMI. Modified by: Soo Han

In [2]:
# Libraries used in this lab

from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error
import tensorflow as tf
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from numpy import asarray
from matplotlib import pyplot as plt
import math as m
import random as r
import pandas as pd

# CS4243, Deep Learning Image Classification Example

In this lab, you will work through a general example of a deep learning model for image classification, similar to your mini-project. This example uses a dataset of cat and dog images to solve a binary classification task.

This example is provided by Prof Amir, with additional notes and pointers added to this notebook by Soo Han.

## Part 1: Setting up

We first check whether a GPU is available for faster training. If your machine has a GPU but your TensorFlow installation does not support it, install the GPU-enabled build with the commands below:

```
python3 -m pip install tensorflow[and-cuda]
# Verify the installation:
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```

In [3]:
# We will do a simple check to see if we have GPU for training. Please use GPU to accelerate your training.

if tf.test.gpu_device_name(): 
    print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))
else:
    print("Please install GPU version of TF")

Please install GPU version of TF


### Downloading our dataset
Download the zip file `ann_images1.zip` from our Google Drive. Alternatively, you can download the file from Canvas/Files/Python notebooks/set6. Unzip the file to obtain two directories containing the training and test images. The `pets_very_small` folder will be used to train our model, and `pets_tiny_test` will be used to test it. The list of test images can be found in the `flst.txt` file.

If you are using Google Colab, modify and run the additional code block provided for you to mount the dataset.

In [None]:
# Set the training and test dataset directories. Change the paths below to your own.
train_path = '/home/soohan/4243/sample_images/pets_very_small'
test_path = '/home/soohan/4243/sample_images/pets_tiny_test'
imglist_path = '/home/soohan/4243/sample_images/flst.txt'

In [None]:
''' 
To run this code on Google Colab:

add: import os

add: 
from google.colab import drive
drive.mount('/content/gdrive')
!ls

set the directory, e.g.:
"/content/gdrive/MyDrive/ANN/pets_very_small"
"/content/gdrive/MyDrive/ANN/flst.txt"

The flst.txt file should be modified too, so that its image paths match the new location.

'''

## Part 2: Train our model

We first load the data using the `image_dataset_from_directory` utility. We split the images in the training directory into two subsets: one for training the model and one for validating its performance during training (note that the validation subset is not used for testing the final performance of the model).

You can then view some of the images from the datasets we loaded.
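
To make the split concrete, here is a minimal sketch of the idea behind `validation_split=0.2` combined with a fixed `seed`: shuffle the file list deterministically, then hold out 20% of it. The helper `split_files` and the file names are hypothetical, and the internal procedure Keras uses may differ in detail. The point is only that the same seed gives two disjoint, reproducible subsets.

```python
import random

def split_files(files, validation_split=0.2, seed=110):
    """Shuffle deterministically, then hold out a fraction for validation."""
    shuffled = list(files)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * validation_split)
    n_train = len(shuffled) - n_val
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical file names, for illustration only
files = [f"img_{i:03d}.jpg" for i in range(100)]
train_files, val_files = split_files(files)
print(len(train_files), len(val_files))
```

Calling `split_files` twice with the same seed returns the same partition, which is why both loading calls in the next cell pass the same `seed` value.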

In [None]:
image_size = (256,256)
batch_size = 16

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    train_path,
    validation_split=0.2,
    subset="training",
    seed=110,
    image_size=image_size,
    batch_size=batch_size,
)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    train_path,
    validation_split=0.2,
    subset="validation",
    seed=110,
    image_size=image_size,
    batch_size=batch_size,
)

In [None]:
# Show four images from the first training batch, each titled with its integer label
#
plt.figure(figsize=(10, 10))
for images, labels in train_ds.take(1):
    for i in range(4):
        ax = plt.subplot(2, 2, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(int(labels[i]))
        plt.axis("off") 

We add some data augmentation, which provides several benefits when training Convolutional Neural Networks (CNNs), among them:
- Model invariance: flipping an image horizontally helps the model recognize objects irrespective of their orientation (left or right). Similarly, rotations help the model recognize objects that are tilted in various ways.
- Better generalization: slight rotations, zooming, or changes in lighting conditions make the model more robust to such natural variations when making predictions on new, unseen data.

You can look up the details of these augmentations in the Keras documentation.
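
As a rough illustration of what a horizontal flip does to the pixel grid, the NumPy sketch below mirrors a tiny array along its width axis. This is not the Keras augmentation layer, only the underlying array operation, and `flip_horizontal` is a hypothetical helper.

```python
import numpy as np

def flip_horizontal(image):
    """Mirror an image of shape (height, width, channels) left to right."""
    return image[:, ::-1, :]

# A tiny 2x3 single-channel "image" so the effect is easy to read
image = np.arange(6).reshape(2, 3, 1)
flipped = flip_horizontal(image)

print(image[..., 0])    # rows [0 1 2] and [3 4 5]
print(flipped[..., 0])  # rows [2 1 0] and [5 4 3]
```

The label is unchanged by the flip: a mirrored cat is still a cat, which is what makes this augmentation safe for the task.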

In [None]:
# Data augmentation using random horizontal flips and random rotations.
# A factor of 0.1 rotates each image by a random angle in [-0.1 * 2pi, 0.1 * 2pi].
#
data_augmentation = keras.Sequential(
    [
        layers.RandomFlip("horizontal"),  # the experimental.preprocessing path is gone from newer TF releases
        layers.RandomRotation(0.1),
    ]
)

In [None]:
# Show four random augmentations (flip and rotation) of the first image in a batch.
# Augmentation is applied on the fly; no images are added to the dataset.
#
plt.figure(figsize=(10, 10))
for images, _ in train_ds.take(1):
    for i in range(4):
        augmented_images = data_augmentation(images)
        ax = plt.subplot(2,2, i + 1)
        plt.imshow(augmented_images[0].numpy().astype("uint8"))
        plt.axis("off")

We define our model here, which is essentially a Convolutional Neural Network. If you are not familiar with CNNs, I recommend reading this <a href="https://aigents.co/data-science-blog/publication/introduction-to-convolutional-neural-networks-cnns">article</a> and trying this fun <a href="https://setosa.io/ev/image-kernels/">playground</a> (full credit to their respective authors).

Some key points to help you understand the components below:
- `Input()` is used to instantiate a Keras tensor. It is symbolic rather than a mathematical operation: it defines what the input data to the model should look like.
- Batch normalization normalizes the activations of a layer to have zero mean and unit variance, helping to stabilize and accelerate training by reducing internal covariate shift.
- Residual refers to residual connections which allow gradients to "skip" layers by adding the original input to the output, helping to mitigate the vanishing gradient problem and enabling deeper networks to be trained more effectively. 
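
To make the batch normalization point concrete, here is a minimal NumPy sketch of the normalization step alone. It assumes training-time batch statistics and leaves out the learnable scale and shift and the moving averages that the Keras layer also maintains. `batch_norm` and the random activations are hypothetical.

```python
import numpy as np

def batch_norm(x, eps=1e-3):
    """Normalize each feature over the batch axis (batch statistics only)."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
# Hypothetical activations: a batch of 16 samples with 4 features, far from zero mean
activations = rng.normal(loc=5.0, scale=3.0, size=(16, 4))
normalized = batch_norm(activations)

print(normalized.mean(axis=0).round(6))  # close to 0 for each feature
print(normalized.std(axis=0).round(3))   # close to 1 for each feature
```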

FYI: Prof Amir uses separable convolution layers (`SeparableConv2D`) here in place of some of the regular `Conv2D` layers you might be used to. These are simply a variant of convolution layers, and you may read up more about them in your own time. One resource is this <a href="https://towardsdatascience.com/a-basic-introduction-to-separable-convolutions-b99ec3102728">link</a>. Understanding this specific layer is not the objective of today's lab.
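
One quick way to see why separable convolutions are attractive is to count parameters. The sketch below compares a regular 3x3 convolution with a depthwise separable one for 64 input and 128 output channels, which matches the first block of the loop in `make_model`. The two helper functions are hypothetical and assume a depth multiplier of 1 and one bias per output channel.

```python
def conv2d_params(kernel, c_in, c_out):
    """Weights plus biases of a standard convolution."""
    return kernel * kernel * c_in * c_out + c_out

def separable_conv2d_params(kernel, c_in, c_out):
    """One depthwise filter per input channel, then a 1x1 pointwise mix, plus biases."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise + c_out

standard = conv2d_params(3, 64, 128)
separable = separable_conv2d_params(3, 64, 128)
print(standard, separable)
```

Under these assumptions the separable layer needs roughly an eighth of the parameters of the regular one.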

In [None]:
def make_model(input_shape):
    inputs = keras.Input(shape=input_shape)
    # Image augmentation block
    x = data_augmentation(inputs)

    # Entry block
    x = layers.Rescaling(1.0 / 255)(x)  # scale pixel values from [0, 255] to [0, 1]
    x = layers.Conv2D(32, 3, strides=2, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)

    x = layers.Conv2D(64, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)

    previous_block_activation = x  # Set aside residual
    for size in [128, 256, 512, 728]:
        
        x = layers.SeparableConv2D(size, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)
        
        x = layers.SeparableConv2D(size, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)
        
        x = layers.MaxPooling2D(3, strides=2, padding="same")(x)

        # Project residual
        residual = layers.Conv2D(size, 1, strides=2, padding="same")(
            previous_block_activation
        )
        x = layers.add([x, residual])  # Add back residual
        previous_block_activation = x  # Set aside next residual

    x = layers.SeparableConv2D(1024, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)

    x = layers.GlobalAveragePooling2D()(x)
    activation = "sigmoid"
    units = 1
   
    x = layers.Dropout(0.5)(x)
    x = layers.Dense(25, activation='relu')(x)
    outputs = layers.Dense(units, activation=activation)(x)
    return keras.Model(inputs, outputs)
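
The feature-map sizes inside `make_model` can be traced by hand, which also shows why the residual branch uses a 1x1 convolution with `strides=2`: it has to match both the halved spatial size and the new channel count before `layers.add`. The sketch below is a plain-Python walk-through, not Keras code. `same_padding_out` and `trace_shapes` are hypothetical helpers that rely on the rule that a layer with `padding="same"` outputs `ceil(size / stride)`.

```python
import math

def same_padding_out(size, stride):
    """Output length of a 'same'-padded convolution or pooling layer."""
    return math.ceil(size / stride)

def trace_shapes(input_size=256):
    """Follow (spatial size, channels) through the layers of make_model."""
    size = same_padding_out(input_size, 2)    # entry Conv2D(32, strides=2)
    shapes = [(size, 32), (size, 64)]         # Conv2D(64) keeps the size
    for channels in [128, 256, 512, 728]:
        main = same_padding_out(size, 2)      # MaxPooling2D(3, strides=2)
        residual = same_padding_out(size, 2)  # Conv2D(channels, 1, strides=2)
        assert main == residual               # shapes must match for layers.add
        size = main
        shapes.append((size, channels))
    shapes.append((size, 1024))               # final SeparableConv2D(1024)
    return shapes

for size, channels in trace_shapes():
    print(f"{size} x {size} x {channels}")
```

`GlobalAveragePooling2D` then averages each of the 1024 channels over the remaining spatial grid, giving one 1024-element vector per image.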

As we learnt in the week 10 lab session, we define our model, call `compile` to configure our ANN's learning process, and use the `fit` method to start training the model.

In [None]:
model = make_model(input_shape=image_size + (3,) )

In [None]:
# Set the number of epochs and compile the model (training starts in the fit cell)

epochs = 80
model.compile(
    optimizer=keras.optimizers.Adam(1e-3),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
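
For intuition about the `binary_crossentropy` loss chosen above, here is a small stdlib-only sketch of the formula. It is not the Keras implementation, and the labels and probabilities are hypothetical (0 = cat, 1 = dog, as in the test cells later in this notebook).

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Hypothetical labels and sigmoid outputs
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(round(confident_right, 4), round(confident_wrong, 4))
```

Confident wrong predictions are penalized far more heavily than confident correct ones, which is the behaviour the optimizer exploits during training.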

In [None]:
# This is a handy function in Keras that lets you look at your model which you have just compiled. 
# Personally, looking at the output shape is particularly useful when you do CV

model.summary()

In [None]:
history = model.fit(
    train_ds, epochs=epochs, validation_data=val_ds,
)

In [None]:
# Accuracy curve
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()

In [None]:
# Loss curve
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()

In [None]:
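# Build the ground-truth labels for the test images listed in flst.txt
flst = np.loadtxt(imglist_path, dtype=bytes)  # file paths as byte strings, decoded later
ddmm = len(flst)             # number of test images
tags = np.zeros( (1,ddmm) )
tags[:,27:ddmm]= 1           # first 27 entries are labelled 0 (cat), the rest 1 (dog)
tags = np.int8( tags.T )     # column vector of shape (ddmm, 1)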

In [None]:
#
# The list of test files is in flst.txt; you may need to change the paths.
# predict() is used; each test image is preprocessed the same way as the training images.
#
predct = []
for i in flst:
    i = i.decode('utf-8')
    ### If your image directories are not in the same place as your notebook, you can use this:
    # i = i.replace('../pets_tiny_test', test_path)
    ###
    img = keras.preprocessing.image.load_img( i , target_size=image_size )
    img_array = keras.preprocessing.image.img_to_array(img)
    img_array = tf.expand_dims(img_array, 0)  # Create batch axis

    predictions = model.predict(img_array, verbose=0)
    score = predictions[0]  # array of shape (1,) holding the sigmoid output
    print( i, 
        " is %.2f percent cat and %.2f percent dog."
        % (100 * (1 - score[0]), 100 * score[0])
    )
    predct.append( np.round(score) )  # 0 = cat, 1 = dog

In [None]:
predct = np.int8( np.array(predct) )   # shape (ddmm, 1), matching tags
sscc = np.sum(abs(tags-predct))        # number of misclassified images
print('Number of correct classifications =' , ddmm-sscc , ' out of ', ddmm , ', i.e. accuracy =', round((ddmm-sscc)/ddmm,3) )
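
A single accuracy figure hides which class is being confused. As a sketch, the helper below breaks predictions down into the four cat/dog outcomes. `confusion_counts` is a hypothetical helper and the toy labels are for illustration only, not results from this lab.

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    """Count the four outcomes of a binary task where 0 = cat and 1 = dog."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    return {
        "cat_as_cat": int(np.sum((y_true == 0) & (y_pred == 0))),
        "cat_as_dog": int(np.sum((y_true == 0) & (y_pred == 1))),
        "dog_as_cat": int(np.sum((y_true == 1) & (y_pred == 0))),
        "dog_as_dog": int(np.sum((y_true == 1) & (y_pred == 1))),
    }

# Toy labels for illustration only
toy_true = [0, 0, 0, 1, 1, 1]
toy_pred = [0, 1, 0, 1, 1, 0]
counts = confusion_counts(toy_true, toy_pred)
accuracy = (counts["cat_as_cat"] + counts["dog_as_dog"]) / len(toy_true)
print(counts, round(accuracy, 3))
```

In the notebook, the same helper could be called as `confusion_counts(tags, predct)` once both arrays have been built.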


In [None]:
# To save your model, uncomment the line below; you can load it later using 'load_model'
#model.save('path_to_model')

## Part 3: Try these suggested changes to your model and compare the validation performance!

 1. Set the base model parameters to image size = 256x256, epochs = 100, dropout = 0.5.
 2. Train and test the model. Note the training, validation, and testing accuracy.
 3. Go to the "make_model" function cell.
 4. Change all the 'relu' activation functions to 'sigmoid'.
 5. Run the program, train and test it, and check the performance. Better? Same? Worse?
 6. Change the activation functions back to 'relu'.
 7. At the end of the "make_model" function, find these lines and comment out the `Dense(25)` line:
    ```
    x = layers.Dropout(0.5)(x)
    x = layers.Dense(25, activation='relu')(x)
    outputs = layers.Dense(units, activation=activation)(x)
    return keras.Model(inputs, outputs)
    ```
    Change to:
    ```
    x = layers.Dropout(0.5)(x)
    # x = layers.Dense(25, activation='relu')(x)
    outputs = layers.Dense(units, activation=activation)(x)
    return keras.Model(inputs, outputs)
    ```
     This means that we keep just the 1 output neuron and remove the hidden fully-connected classification layer.
 8. Run the model and check the performance. Compared to the base model, is it better? Same? Worse?
 9. Revert all the modifications.
 10. In the "make_model" function, find these lines and comment out the first convolution module inside the loop:
    ```
    previous_block_activation = x  # Set aside residual
    for size in [128, 256, 512, 728]:

        x = layers.SeparableConv2D(size, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)
    ```
    Change to:
    ```
    previous_block_activation = x  # Set aside residual
    for size in [128, 256, 512, 728]:

        #x = layers.SeparableConv2D(size, 3, padding="same")(x)
        #x = layers.BatchNormalization()(x)
        #x = layers.Activation("relu")(x)
    ```

    This means that we are going to remove one of the two convolution modules in each block.
 11. Run the model and check the performance. Compared to the base model, is it better? Same? Worse?


You may run the training multiple times to be sure of your results!
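
When comparing variants across repeated runs, it helps to look at the mean and the spread rather than a single number. The sketch below uses only the standard library. `summarize_runs` is a hypothetical helper, and the accuracies are placeholders to show the call pattern, not measured results.

```python
from statistics import mean, stdev

def summarize_runs(accuracies):
    """Mean and sample standard deviation of accuracies from repeated runs."""
    spread = stdev(accuracies) if len(accuracies) > 1 else 0.0
    return mean(accuracies), spread

# Placeholder accuracies for two hypothetical variants, not measured results
base_runs = [0.80, 0.84, 0.82]
variant_runs = [0.70, 0.90, 0.80]
for name, runs in [("base", base_runs), ("variant", variant_runs)]:
    avg, spread = summarize_runs(runs)
    print(f"{name}: {avg:.3f} +/- {spread:.3f}")
```

If the spreads of two variants overlap heavily, the difference between them may simply be run-to-run noise.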