# Session 5

In this session we use Convolutional Neural Networks (CNNs) to train a model on an image dataset. You will see that, on image data, this model typically reaches a much higher level of accuracy than the classical neural networks used so far.

## Learning Objectives:
* Understand Convolution 
* Apply Convolutional Neural Networks to extract more features
* Use the Google Colab service to train networks in the cloud
* Bonus: Face detection example


---
## Understanding Convolution

Convolution is an operation that extracts additional features from the data without manual feature engineering. It applies small matrices of weights, called kernels, to a sliding window of pixels (or data points), and each kernel captures a particular relationship between neighbouring data points, such as an edge or a corner.

To visualize this process in Python, the [Convolution Notebook](./Python/Convolution.ipynb) contains a sample program that loads a random image from the dataset folder, applies several hand-written 3x3 kernels to it, and displays the results.

During training, the model learns these kernels automatically, and a typical model contains hundreds of them.
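
To make the idea concrete, here is a minimal NumPy sketch of one hand-written 3x3 kernel sliding over a toy image. It is not the code from the Convolution Notebook: the helper name `convolve2d`, the toy image, and the edge kernel are assumptions made for illustration. Like Keras' Conv2D, the sketch does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the response map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the window by the kernel element-wise and sum the result.
            window = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(window * kernel)
    return out

# Toy 5x5 grayscale image (illustrative): dark left part (0), bright right part (1).
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)

# Hand-written kernel that responds to dark-to-bright vertical edges.
vertical_edge = np.array([[-1, 0, 1],
                          [-1, 0, 1],
                          [-1, 0, 1]], dtype=float)

response = convolve2d(image, vertical_edge)
print(response)
```

The response is 3 wherever the window straddles the dark-to-bright edge and 0 where the window covers a uniform region, which is the kind of "relationship between neighbouring pixels" a kernel picks up.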


---

## Applying CNN 

The basic way of building the model doesn't differ much from how you built models previously. There are three main new layers:
* Conv2D: Applies convolution to 2D data (such as images).
* MaxPooling2D: Reduces the output size of the previous layer. Mainly used after Conv2D layers.
* Flatten: Used just before adding Dense layers to the model. It converts the 3D output of the convolutional and max-pooling layers into a 1D vector. For example, a (28,28,3) output is flattened into a 1D vector of size 28x28x3=2352.
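
The effect of the pooling and flatten steps can be sketched in plain NumPy. This is an illustration under assumptions, not Keras' implementation: `max_pool_2x2` is a hypothetical helper that mimics a 2x2 pooling window with stride 2 and no padding, and the 4x4 feature map is made up.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on an (H, W, C) array; odd trailing rows/columns are dropped."""
    h, w, c = feature_map.shape
    h2, w2 = h // 2, w // 2
    trimmed = feature_map[:h2 * 2, :w2 * 2, :]
    # Split each spatial axis into (blocks, 2) and take the maximum inside every 2x2 block.
    blocks = trimmed.reshape(h2, 2, w2, 2, c)
    return blocks.max(axis=(1, 3))

# Made-up 4x4 feature map with a single channel, holding the values 0..15.
feature_map = np.arange(16, dtype=float).reshape(4, 4, 1)

pooled = max_pool_2x2(feature_map)   # shape (2, 2, 1)
flat = pooled.reshape(-1)            # what Flatten does for a single sample

print(pooled[:, :, 0])
print(flat)

# The example from the text: a (28, 28, 3) array flattens to 2352 values.
print(np.zeros((28, 28, 3)).reshape(-1).shape)
```

Pooling keeps only the strongest response in each 2x2 block, so width and height are halved while the number of channels stays the same.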

The input to a CNN differs from that of a classical neural network: it takes a 3D input instead of a 1D input. Since we are using images, the input shape expected by Keras is (H, W, D):
* H: Height of the input images
* W: Width of the input images
* D: Depth of the input images (number of channels). For example, D=1 for grayscale images, or D=3 for RGB images.


    from tensorflow.keras import models
    from tensorflow.keras import layers

    IMAGE_WIDTH = 64
    IMAGE_HEIGHT = 64
    IMAGE_DEPTH = 3  # RGB

    model = models.Sequential()
    model.add(layers.Conv2D(filters=32,  # number of kernels to be learned by this layer
                            kernel_size=(3, 3),  # kernel size, commonly 3x3
                            input_shape=(IMAGE_HEIGHT, IMAGE_WIDTH, IMAGE_DEPTH),  # Keras expects (height, width, channels)
                            name="Input_Conv",
                            activation='relu'))  # as usual, the activation function
    model.add(layers.MaxPool2D())  # MaxPool2D is an alias of MaxPooling2D

    # The following layers don't need input_shape; it is inferred automatically.
    model.add(layers.Conv2D(filters=64,
                            kernel_size=(3, 3),
                            name="Hidden_Conv_1",
                            activation='relu'))
    model.add(layers.Conv2D(filters=64,
                            kernel_size=(3, 3),
                            name="Hidden_Conv_2",
                            activation='relu'))
    model.add(layers.MaxPool2D())

    model.add(layers.Conv2D(filters=128,
                            kernel_size=(3, 3),
                            name="Hidden_Conv_3",
                            activation='relu'))
    model.add(layers.MaxPool2D())

    model.add(layers.Conv2D(filters=128,
                            kernel_size=(3, 3),
                            name="Hidden_Conv_4",
                            activation='relu'))
    model.add(layers.MaxPool2D())

    model.add(layers.Flatten())  # flattens the 3D output of the previous layers into a 1D vector
    model.add(layers.Dense(128, activation='relu', name="Dense_1"))  # classical NN layer
    model.add(layers.Dense(64, activation='relu', name="Dense_2"))   # classical NN layer
    # Two one-hot encoded classes: softmax is the activation that pairs with categorical_crossentropy.
    model.add(layers.Dense(2, activation='softmax', name="Output"))  # output layer of the model

    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['acc'])
    model.summary()

The summary output (truncated here) looks like this:

    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    Input_Conv (Conv2D)          (None, 62, 62, 32)        896       
    _________________________________________________________________
    max_pooling2d_4 (MaxPooling2 (None, 31, 31, 32)        0         
    _________________________________________________________________
    Hidden_Conv_1 (Conv2D)       (None, 29, 29, 64)        18496     
    _________________________________________________________________
    Hidden_Conv_2 (Conv2D)       (None, 27, 27, 64)        36928     
    _________________________________________________________________
    max_pooling2d_5 (MaxPooling2 (None, 13, 13, 64)        0         
    _________________________________________________________________
    Hidden_Conv_3 (Conv2D)       (None, 11, 11, 128)       73856     
    _________________________________________________________________
    max_pooling2d_6 (MaxPooling2 (None, 5, 5, 128)         0         
    __________
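
The output shapes and parameter counts in the summary can be reproduced by hand. The sketch below assumes the defaults used in the model: 3x3 kernels with no padding and stride 1, and 2x2 max pooling. The helper names are made up for illustration. The rows for the last convolution, the last pooling layer, and Flatten are derived with the same rules; they are not copied from the summary shown here.

```python
def conv_size(size, kernel=3):
    """Output width/height of a Conv2D layer with no padding and stride 1."""
    return size - kernel + 1

def pool_size(size, pool=2):
    """Output width/height of a 2x2 max-pooling layer (any remainder is dropped)."""
    return size // pool

def conv_params(in_channels, filters, kernel=3):
    """Weights (kernel * kernel * in_channels per filter) plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters

# Same layer order as the model above: (kind, number of filters).
stack = [("conv", 32), ("pool", None),
         ("conv", 64), ("conv", 64), ("pool", None),
         ("conv", 128), ("pool", None),
         ("conv", 128), ("pool", None)]

size, channels = 64, 3  # 64x64 RGB input
trace = []
for kind, filters in stack:
    if kind == "conv":
        params = conv_params(channels, filters)
        size, channels = conv_size(size), filters
    else:
        params = 0  # pooling has no trainable parameters
        size = pool_size(size)
    trace.append((kind, size, channels, params))
    print(f"{kind:4s} -> ({size}, {size}, {channels})  params={params}")

flatten_size = size * size * channels
print("Flatten ->", flatten_size)
```

For example, the first layer has (3x3x3 + 1) x 32 = 896 parameters and shrinks the 64x64 image to 62x62, matching the first row of the summary.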

    # This exports a plot of the model to an image file.
    # To run it, you need to install both pydot and graphviz:
    #   conda install -c anaconda graphviz
    #   conda install -c anaconda pydot

    from tensorflow.keras.utils import plot_model
    plot_model(model, to_file='images/model.png', show_shapes=True)



The resulting model architecture looks like this:
<img src="./Images/model.png" width="30%">


An example of using convolution is the [MNIST handwritten digits](http://localhost:8888/notebooks/5.%20CNN/Python/CNN_MNIST.ipynb) notebook. It achieves a training and validation accuracy of 99-100% using a CNN!

One important point you will notice when training a CNN is how slow the process is. To address that, you either need a good GPU (such as a GTX 1060 or above) with TensorFlow configured to use it, or a cloud-based training service such as Google Colab.


---
## Google Colab

Google provides a free service (https://colab.research.google.com/notebook) that allows you to run Python in the cloud. The TensorFlow and Keras frameworks are already preinstalled for you to use.

One important thing to know is that each time you open a Google Colab notebook, it creates a new virtual session, so all computed results are lost if you refresh or close the page. It's important to save your trained model to a file and download it to your PC.


This shared notebook contains an example of training a model on a dataset using Convolutional Neural Networks:
https://drive.google.com/open?id=1a1opqViWFcu6r9WOQaznDOgMBZgfoVfF

After the model is trained and exported to a file __\[\*\]__ , you can load it as in the [CatsDogs_Load example](./Python/CatsDogs_Load.ipynb).

__\[\*\] Important Note:__ Currently, the TensorFlow version used by Google Colab differs slightly from the one we are using, so the exported file (yaml) needs to be edited manually as follows:
* Open the yaml file in a text editor (such as Atom).
* Remove the keyword __layers:__ on the fourth line.
* Remove the second-to-last line, which usually looks like: __name: sequential_3__
* Replace every occurrence of __GlorotUniform__ in the file with __VarianceScaling__.
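
If you prefer not to edit the file by hand, the same steps can be scripted. The following is only a sketch: `patch_colab_yaml` is a hypothetical helper, the sample text is a made-up fragment rather than a real export, and the function assumes the file has exactly the layout described in the list above. It leaves a line alone when it does not look as expected.

```python
def patch_colab_yaml(text):
    """Apply the manual edits described above to the contents of an exported yaml file."""
    lines = text.splitlines()
    # Remove the "layers:" keyword expected on the fourth line.
    if len(lines) >= 4 and lines[3].strip() == "layers:":
        del lines[3]
    # Remove the second-to-last line if it is the model-level "name:" entry.
    if len(lines) >= 2 and lines[-2].strip().startswith("name:"):
        del lines[-2]
    # Replace the initializer name everywhere in the file.
    patched = "\n".join(lines).replace("GlorotUniform", "VarianceScaling")
    return patched + "\n"

# Made-up fragment for illustration only; a real exported file is much longer.
sample = (
    "backend: tensorflow\n"
    "class_name: Sequential\n"
    "config:\n"
    "  layers:\n"
    "  - class_name: Dense\n"
    "    config:\n"
    "      kernel_initializer:\n"
    "        class_name: GlorotUniform\n"
    "  name: sequential_3\n"
    "keras_version: 2.2.4-tf\n"
)

patched = patch_colab_yaml(sample)
print(patched)
```

To use it on a real file, read the file contents into a string, pass the string to the function, and write the result back. Keep a backup of the original export in case its layout differs from what is assumed here.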

## Bonus: Face Detection

Here is an example of using the face detection module provided by OpenCV. It can be used for building datasets of faces:

https://drive.google.com/open?id=1yBeXoyu22Ne8gipNG_pmkODBq5iPfdIN