
# TP CNN
### Diane LINGRAND 

diane.lingrand@univ-cotedazur.fr   
Polytech - SI4 - 2021

## Introduction

In [None]:
from IPython.display import Image
import tensorflow as tf
print(tf.__version__)
import tensorflow.keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, Activation
import matplotlib.pyplot as plt


**The GPU**



To enable the GPU backend in Google Colab for your notebook:

1.   Runtime (top left corner) -> Change runtime type
2.   Put GPU as "Hardware accelerator"
3.   Save.

Then run the next cell to check that a GPU is available:

In [None]:
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
    raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))


## Convolutional Neural Networks (CNN)

Derived from the MLP, a convolutional neural network (CNN) is a type of artificial neural network that is specifically designed to process **pixel data**. The layers of a CNN consist of an **input layer**, an **output layer** and **hidden layers** that can include **convolutional layers**, **pooling layers**, **fully connected layers** and **normalization layers**. Many techniques exist to improve the training of CNNs, for example dropout.
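
To make the notion of a convolutional layer concrete, here is a minimal NumPy sketch of what a single filter computes. The toy image and the edge kernel are hypothetical values chosen for illustration; a trained layer learns its own kernel weights. As deep-learning libraries typically do, the kernel is applied without flipping (cross-correlation).

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide a 2D kernel over a 2D image without padding ('valid' mode)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            # each output value is the weighted sum of one image patch
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# toy 5x5 image: dark on the left, bright on the right (hypothetical data)
toy_image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)
# hand-made vertical-edge kernel (hypothetical weights)
edge_kernel = np.array([[-1, 0, 1]] * 3, dtype=float)

response = conv2d_valid(toy_image, edge_kernel)
print(response.shape)  # (3, 3)
print(response[0])     # strong response where the patch covers the edge, 0 on the flat area
```

The response is large where the patch straddles the dark-to-bright transition and zero where the patch is uniform, which is the sense in which a filter "detects" a pattern.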

### Loading the dataset
In this part, we will use photographs of animals from the Kaggle dataset [animals-10](https://www.kaggle.com/alessiocorrado99/animals10). Please connect to their site before loading the dataset from this [zip file](http://www.i3s.unice.fr/~lingrand/raw-img.zip). Decompress the zip file on your disk.

If you are using Google Colab, there is no need to download the dataset because I have a copy on my drive. You just need to add this shared folder to your drive: https://drive.google.com/drive/folders/15cB1Ky-7OTUqfcQDZZyzc5HArt0GA6Sm?usp=sharing
Click on the link, then on "Add shortcut to Drive", and select "My Drive".

In [None]:
from google.colab import drive
drive.mount('/content/drive/')

To feed the data to a CNN, we need to shape it as required by Keras. As input, a 2D convolutional layer needs a **4D tensor** with shape **(batch, rows, cols, channels)**. Therefore, we need to specify the "channels" axis, which holds the color components of each pixel: 3 channels (red, green, blue) in our case. We will fix the dimensions of the images to those expected by the VGG-16 network: (224, 224).
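
As a quick illustration of this 4D layout, the sketch below builds batches from a stand-in image. The array holds random values, not a real photograph, and the variable names are only for illustration.

```python
import numpy as np

# stand-in for one decoded RGB image of size 224x224 (random values, not a real photo)
single_image = np.random.rand(224, 224, 3)

# add a leading batch axis: (rows, cols, channels) -> (batch, rows, cols, channels)
batch_of_one = np.expand_dims(single_image, axis=0)

# several images of identical size can be stacked along the batch axis
batch_of_four = np.stack([single_image] * 4, axis=0)

print(batch_of_one.shape)   # (1, 224, 224, 3)
print(batch_of_four.shape)  # (4, 224, 224, 3)
```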


In [None]:
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.vgg16 import preprocess_input
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D, MaxPooling2D, Flatten
from sklearn.metrics import confusion_matrix, f1_score  # plot_confusion_matrix was removed in scikit-learn 1.2
import tensorflow.keras
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np
import glob
# when processing time is long, it's nice to see the progress bar
#!pip install tqdm
from tqdm import tqdm

### loading train data

Please read the code before running any of the cells!

In [None]:
#datasetRoot='/home/lingrand/Ens/MachineLearning/animals/raw-img/'
#datasetRoot='/whereYouPutTheImages/'
datasetRoot='/content/drive/My Drive/raw-img/'
# I suggest reducing the number of classes for a first trial.
# If you finish this notebook before the end of the course, you can add more classes (and images per class).
classes = ['mucca', 'elefante', 'gatto', 'cavallo', 'scoiattolo', 'ragno', 'pecora', 'farfalla', 'gallina', 'cane']
nbClasses = len(classes)

#training data

rootTrain = datasetRoot+'train/'
classLabel = 0
reducedSizePerClass = 200 #in order to reduce the number of images per class
totalImg = nbClasses * reducedSizePerClass
xTrain = np.empty(shape=(totalImg,224,224,3))
yTrain = []
first = True
i= 0
for cl in classes:
    listImages = glob.glob(rootTrain+cl+'/*')
    yTrain += [classLabel]*reducedSizePerClass #len(listImages) # note that here ...
    for pathImg in tqdm(listImages[:reducedSizePerClass]): # and here, we have reduced the data to be loaded (only reducedSizePerClass images per class)
        img = image.load_img(pathImg, target_size=(224,224))
        im = image.img_to_array(img)
        im = np.expand_dims(im, axis=0)
        im = preprocess_input(im)
        xTrain[i,:,:,:] = im
        i += 1
    classLabel += 1
print(len(yTrain))
print(xTrain.shape)
yTrain = tensorflow.keras.utils.to_categorical(yTrain, nbClasses)
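
The last line of the cell above turns the integer labels into one-hot rows. As a hedged illustration of that encoding (a NumPy equivalent on a hypothetical 3-class example, not the Keras implementation itself):

```python
import numpy as np

n_classes = 3                     # hypothetical small example
labels = np.array([0, 2, 1, 2])   # integer class indices

# row k of the identity matrix is the one-hot vector of class k
one_hot = np.eye(n_classes)[labels]
print(one_hot)

# argmax along the class axis recovers the integer labels
recovered = np.argmax(one_hot, axis=1)
print(recovered)
```

This is why `np.argmax(..., axis=1)` is used later in the notebook to go back from one-hot targets or softmax outputs to class indices.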


**[TO DO - Students] What is the dimension of xTrain ? What do those dimensions represent ?**


**[TO DO - Students] Complete the following code to plot a few training images**


In [None]:
import matplotlib.pyplot as plt

square = 8
ix = 1
fig, axs = plt.subplots(square, square, figsize=(20, 20))
for i in range(square):
    for j in range(square):
        # select the subplot and turn off the axis ticks
        ax = axs[i,j]
        ax.set_xticks([])
        ax.set_yticks([])
        im = xTrain[ix][:,:,...]
        ax.imshow(...)
        ix += 1

To reduce the time spent on this part of the lab, you may have noticed that we reduced the number of images per class (and you can also reduce the number of classes). You can change these few lines of code if you want to work on the whole dataset.

### loading test data

In [None]:
# you need to use the same classes for the test dataset as for the train dataset
rootTest = datasetRoot+'test/'
classLabel = 0

totalTestImg = 0
for cl in classes:
    totalTestImg += len(glob.glob(rootTest+cl+'/*'))

print("There are ",totalTestImg, " images in test dataset.")
xTest = np.empty(shape=(totalTestImg,224,224,3))
yTest = []
i = 0

for cl in classes:
    listImages = glob.glob(rootTest+cl+'/*')
    yTest += [classLabel]*len(listImages)
    for pathImg in listImages:
        img = image.load_img(pathImg, target_size=(224, 224))
        im = image.img_to_array(img)
        im = np.expand_dims(im, axis=0)
        im = preprocess_input(im)
        xTest[i,:,:,:] = im
        i += 1  # move to the next slot, otherwise every image overwrites xTest[0]
    classLabel += 1
print(len(yTest))
print(xTest.shape)
yTest = tensorflow.keras.utils.to_categorical(yTest, nbClasses)

## Build your own CNN network

**[TO DO - Students] Start with the simplest CNN: 1 conv2D layer + 1 pooling + 1 dense layer. Fill the gaps and explain the parameters of the MaxPooling2D layer**

In [None]:
model = Sequential()
model.add(Conv2D(32,(3,3),padding='same',activation='relu', input_shape=...))
model.add(MaxPooling2D(pool_size=(4, 4), strides=4, padding='same'))
model.add(Flatten())
model.add(Dense(..., activation=...))
model.compile(optimizer='rmsprop',loss=..., metrics=['accuracy'])

Let's look at the dimension of all inputs and outputs:

In [None]:
model.summary()
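
The shapes and parameter counts reported by the summary can be cross-checked by hand. The sketch below assumes the 224×224×3 images loaded earlier as input; the helper names are hypothetical and only encode the usual rules for `padding='same'` (output size is the input size divided by the stride, rounded up) and for counting Conv2D parameters.

```python
import math

def same_padding_output(size, stride):
    """Spatial output size of a layer using padding='same'."""
    return math.ceil(size / stride)

def conv2d_param_count(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters

conv_out = same_padding_output(224, 1)       # Conv2D with its default stride of 1
pool_out = same_padding_output(conv_out, 4)  # MaxPooling2D with strides=4
flat_size = pool_out * pool_out * 32         # Flatten over the 32 feature maps

print(conv_out, pool_out, flat_size)    # 224 56 100352
print(conv2d_param_count(3, 3, 3, 32))  # 896
```

The size of the flattened vector explains why the dense layer holds the vast majority of the parameters in this small network.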

**[TO DO - Students] Train and test this network.**

In [None]:
## Your code here

**[TO DO - Students] Plot the training metrics (loss and accuracy). Test the model on the test data and compare the confusion matrix on the test data and train data**

In [None]:
# Plot history
f, (ax1, ax2) = plt.subplots(1,2)
ax1.plot(history.history['loss'], label='train')
ax1.plot(history.history['val_loss'], label='val')
ax1.legend()
ax2.plot(history.history['accuracy'], label='train')
ax2.plot(history.history['val_accuracy'], label='val')
ax2.legend()
plt.show()

In [None]:
# for you !
score = model.evaluate(xTest,yTest)
print("%s: %.2f%%" % (model.metrics_names[1], score[1]*100))

ypred = np.argmax(model.predict(xTest), axis=1)
print("F1 score: ", f1_score(np.argmax(yTest,axis=1), ypred, average='micro'))  # f1_score expects (y_true, y_pred)
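
When each image has exactly one label, the micro-averaged F1 score coincides with the accuracy, so the two printed numbers should agree. A small NumPy check on hypothetical labels and predictions (not results from this notebook):

```python
import numpy as np

y_true = np.array([0, 1, 2, 2, 1, 0])  # hypothetical ground-truth labels
y_hat = np.array([0, 2, 2, 2, 1, 1])   # hypothetical predictions
n_classes = 3

# accumulate true positives, false positives and false negatives over all classes
tp = fp = fn = 0
for k in range(n_classes):
    tp += int(np.sum((y_hat == k) & (y_true == k)))
    fp += int(np.sum((y_hat == k) & (y_true != k)))
    fn += int(np.sum((y_hat != k) & (y_true == k)))

micro_f1 = 2 * tp / (2 * tp + fp + fn)
accuracy = float(np.mean(y_hat == y_true))
print(micro_f1, accuracy)
```

Every misclassified sample counts once as a false positive and once as a false negative, which is why the two quantities match. Use `average='macro'` if you want a score that weights all classes equally.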

Visualize the confusion matrix on the training dataset for this model, then repeat with `xTest` and `yTest` to compare:

In [None]:
y_pred = model.predict(xTrain)


In [None]:
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix

# Labels were assigned as indices into `classes` (0 is classes[0], 1 is classes[1], ...),
# so we compare the integer labels directly. A LabelEncoder fitted on `classes` sorts the
# names alphabetically and would attach the wrong names to the rows and columns.
cm = confusion_matrix(np.argmax(yTrain, axis=1),
                      np.argmax(y_pred, axis=1),
                      labels=list(range(nbClasses)))
disp = ConfusionMatrixDisplay(confusion_matrix=cm,
                              display_labels=classes)
disp.plot()

How is the accuracy or F1-measure on the test dataset?

Are you satisfied with the performance?

Try to modify the architecture (add layers) and some of the parameters.

### About Dropout 

*Study this part only if you have time for it. It concerns the previous network, but it is better to study part II first and come back here afterwards.*

Simply put, dropout means ignoring a randomly chosen set of units (i.e. neurons) during the training phase. By "ignoring", I mean these units are not considered during a particular forward or backward pass.

Why use dropout? Fully connected layers hold most of the parameters, so during training their neurons develop co-dependencies that curb the individual power of each neuron and lead to overfitting of the training data.
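
The mechanism can be sketched in a few lines of NumPy. This is an illustrative version of "inverted" dropout, the variant in which kept units are rescaled by 1/(1 - rate) during training so that nothing needs to change at inference time; the function name and the toy activations are hypothetical.

```python
import numpy as np

def dropout_forward(activations, rate, rng, training=True):
    """Zero a random fraction `rate` of the units and rescale the others."""
    if not training or rate == 0.0:
        return activations  # at inference time, all units are kept unchanged
    keep_prob = 1.0 - rate
    # a fresh random mask is drawn at every call, i.e. at every training step
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
acts = np.ones((1000, 200))  # toy activations of a 200-unit dense layer
dropped = dropout_forward(acts, rate=0.2, rng=rng)

print(np.unique(dropped))  # units are either dropped (0) or rescaled
print(dropped.mean())      # close to 1: the expected activation is preserved
```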

**Let's add dropout and activation functions to the network!**

In [None]:
from tensorflow.keras.layers import Dropout

# a single model definition; the name avoids spaces, which some TensorFlow versions reject
model = Sequential(name='cnn_with_dropout')
model.add(Conv2D(256,(3,3),activation='relu',input_shape=(224,224,3)))
model.add(GlobalAveragePooling2D())
model.add(Dense(200,activation='relu'))
# adding dropout to the previous layer
model.add(Dropout(0.2))

model.add(Dense(nbClasses, activation='softmax'))

model.compile(optimizer='rmsprop',loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()

**[TO DO - Students] Plot the training metrics (loss and accuracy). Test the model on the test data and compare the confusion matrix on the test data and train data**

## Using a pre-learned network

### loading VGG-16 description part and adding layers to build our own classification network

In [None]:
VGGmodel = VGG16(weights='imagenet', include_top=False)
#features = VGGmodel.predict(xTrain)
#print(features.shape)
VGGmodel.summary()

**[TO DO - Students] What is the purpose of the include_top=False parameter? Adapt the model to our classification problem by filling the gaps in the following cell.**

In [None]:
# we will add layers to this feature extraction part of VGG network
m = VGGmodel.output
# we start with a global average pooling
m = GlobalAveragePooling2D()(m)
# and add a fully-connected layer
m = Dense(1024, activation='relu')(m)
# finally, the softmax layer for predictions (we have nbClasses classes)
predictions = Dense(..., activation=...)(m)

# global network
model = Model(inputs=VGGmodel.input, outputs=predictions)

Can you display the architecture of this entire network?

In [None]:
model.summary()

**[TO DO - Students] What would happen if we ran model.fit now ? Make it so that the training will only train the new layers and train the model.**

In [None]:
## Your code here

Some classes are never predicted because we did not shuffle the data: `validation_split` takes the last samples of the training arrays, so every sample of the last classes ends up in the validation set and none in the training set.
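
The effect is easy to reproduce with labels only. The sketch below uses the same sizes as the reduced training set (10 classes, 200 images each) and mimics the split by slicing off the last 20% of the samples; the variable names are hypothetical.

```python
import numpy as np

n_classes, per_class = 10, 200
# class-ordered integer labels, like yTrain before the one-hot encoding
labels = np.repeat(np.arange(n_classes), per_class)

# a 20% validation split keeps the LAST fifth of the samples, without shuffling
n_val = len(labels) // 5
train_labels = labels[:-n_val]
val_labels = labels[-n_val:]
print(np.unique(train_labels))  # classes 0 to 7 only
print(np.unique(val_labels))    # classes 8 and 9 only

# applying one random permutation before splitting mixes the classes
rng = np.random.default_rng(0)
perm = rng.permutation(len(labels))
shuffled = labels[perm]
print(len(np.unique(shuffled[-n_val:])))  # all classes appear in the validation part
```

In the notebook, the same permutation would have to be applied to both arrays (`xTrain[perm]` and `yTrain[perm]`) before calling `fit`, so that images and labels stay aligned.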

### fine-tune the network

Fine-tune the entire network if you have enough computing resources; otherwise, carefully choose the layers you want to fine-tune.

In [None]:
for i, layer in enumerate(VGGmodel.layers):
   print(i, layer.name)
model.summary()

In this example, we will fine-tune the last convolution block, which starts at layer number 15 (block5_conv1).

In [None]:
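from tensorflow.keras.optimizers import RMSprop
# freeze every layer before block5_conv1 (index 15) and train the remaining ones
for layer in model.layers[:15]:
    layer.trainable = False
for layer in model.layers[15:]:
    layer.trainable = True
# the network must be recompiled for the new trainable flags to take effect
model.compile(optimizer=RMSprop(learning_rate=0.0001), loss='categorical_crossentropy',metrics=['accuracy'])
# and train again; ourCallback is assumed to be the callback (e.g. EarlyStopping)
# that you defined in your own training cell above
model.fit(xTrain, yTrain, epochs=20, batch_size=128, validation_split=0.2, callbacks=[ourCallback],verbose=1)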

You already know how to evaluate the performance on the test dataset and display the confusion matrix. You can also modify the code that loads the test dataset in order to reduce its size. Let's do it!

In [None]:
#enter here your code for evaluation of performances

You are now free to experiment with changes to the network:
* add a dense layer
* modify the number of neurons in the dense layer(s)
* change the global average pooling
* add classes and data
* experiment with other optimizers (SGD, Adam, ...)


...

## Visualizing the convolution filters

In this part, we'll visualize the convolution filters and their effect on the input for our previously trained model

**[TO DO - Students] What is the following code plotting ?**

In [None]:
layer_id = 1
# retrieve the weights of the layer at index layer_id
filters, biases = model.layers[layer_id].get_weights()
# normalize filter values to 0-1 so we can visualize them
f_min, f_max = filters.min(), filters.max()
filters = (filters - f_min) / (f_max - f_min)
# plot first few filters
n_filters, ix = 6, 1
for i in range(n_filters):
    # get the filter
    f = filters[:, :, :, i]
    # plot each channel separately
    for j in range(3):
        # select the subplot and turn off the axis ticks
        # (matplotlib.pyplot was imported as plt at the top of the notebook)
        ax = plt.subplot(n_filters, 3, ix)
        ax.set_xticks([])
        ax.set_yticks([])
        # plot filter channel in grayscale
        plt.imshow(f[:, :, j], cmap='gray')
        ix += 1
# show the figure
plt.show()

**[TO DO - Students] Now, let's visualize the feature maps of various depths. Fill the gaps to do so.**

In [None]:
# pick one training image and give it a batch axis; a sub-model is then built for each listed layer
img = xTrain[...].reshape(...)

for layer in ['block1_conv2', 'block2_conv2', 'block3_conv1', 'block4_conv1', 'block5_conv1']:
    model_fm = Model(inputs=model.inputs, outputs=...)
    feature_maps = model_fm.predict(img)
    # plot square*square feature maps in a square x square grid
    square = 6
    ix = 1
    fig, axs = plt.subplots(square, square, figsize=(10, 10))
    for i in range(square):
      for j in range(square):
        # select the subplot and turn off the axis ticks
        ax = axs[i,j]
        ax.set_xticks([])
        ax.set_yticks([])
        # plot filter channel in grayscale
        ax.imshow(feature_maps[0, :, :, ...], cmap='gray')
        ix += 1
    # show the figure
    fig.suptitle(layer, fontsize=20)
    plt.show()


## Activation maximization

Another way to interpret the inner mechanisms of the network is Activation Maximization. This method computes the input that maximizes the value of a particular activation. Applied to the classification layer, it gives an idea of the patterns the network looks for when predicting a particular class.

To do that we'll use the tf_keras_vis module.
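
Before using the library, the principle can be illustrated on a toy problem: gradient ascent on the input of a differentiable score. Everything in this sketch is hypothetical (the linear "unit", its weights `w`, the penalty `lam`); it is not how tf_keras_vis is implemented, only the idea of optimizing the input while the weights stay fixed.

```python
import numpy as np

# toy "unit": score(x) = w.x - 0.5 * lam * ||x||^2
# the quadratic penalty keeps the optimal input bounded (hypothetical regularizer)
w = np.array([1.0, -2.0, 0.5])
lam = 0.1

def score(x):
    return w @ x - 0.5 * lam * (x @ x)

def score_gradient(x):
    # gradient of the score with respect to the INPUT, not the weights
    return w - lam * x

x = np.zeros(3)  # starting input (the library starts from random noise instead)
step = 0.5
for _ in range(500):
    x = x + step * score_gradient(x)  # gradient ascent on the input

print(x)        # converges towards w / lam, the input that maximizes the score
print(score(x))
```

For a real network, the gradient with respect to the input image is obtained by backpropagation, and the resulting image shows what the chosen output unit responds to most strongly.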

In [None]:
! pip install tf_keras_vis
from tf_keras_vis.activation_maximization import ActivationMaximization

In [None]:
def loss(output):
  return (output[0][0], output[1][1], output[2][2])

def model_modifier(m):
    m.layers[-1].activation = tensorflow.keras.activations.linear

visualize_activation = ActivationMaximization(model, model_modifier)

In [None]:
seed_input = tensorflow.random.uniform((3, 224, 224, 3), 0, 255)
activations = visualize_activation(loss, seed_input=seed_input, steps=512)
images = [activation.astype(np.float32) for activation in activations]

In [None]:
fig, axs = plt.subplots(1, 3, figsize=(20, 20))
for i in range(0, len(images)):
  ax = axs[i]
  visualization = images[i].reshape(224,224,3)
  visualization = (visualization - visualization.min())/(visualization.max()-visualization.min())
  visualization = visualization[:,:,[2,1,0]]
  ax.imshow(visualization)
