## Convolutional Neural Network

* Used mainly for image data (eg. image classification).

* How it works:
    * Input Image -> CNN -> Output Label (Image Class)

* How computer reads images:
    * Digital representation
    * B/W (2D Array) - 0 is a completely black pixel & 255 is a completely white pixel; shades of gray lie in between.
    * Coloured (3D Array) - Each pixel has 3 values assigned to it, each between 0 & 255
        * The colour is found by combining the three values
        * Red Channel, Green Channel, Blue Channel
        
<img src='../../resources/deep_learning/cnn/cnn1.png' />
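
To make the array representation concrete, here is a minimal NumPy sketch. The pixel values and the tiny 2x3 image size are made up for illustration; real images are simply larger arrays of the same form.

```python
import numpy as np

# Grayscale image: a 2D array (height, width), values from 0 (black) to 255 (white).
gray = np.array([[0, 128, 255],
                 [64, 192, 32]], dtype=np.uint8)

# Coloured image: a 3D array (height, width, 3); the last axis holds R, G, B.
colour = np.zeros((2, 3, 3), dtype=np.uint8)
colour[0, 0] = [255, 0, 0]      # pure red pixel
colour[1, 2] = [255, 255, 0]    # red + green combine into yellow

print(gray.shape)                # (2, 3)
print(colour.shape)              # (2, 3, 3)
print(colour[1, 2].tolist())     # [255, 255, 0]
```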

* A very simple example (assume 0 is white & 1 is black):
<img src='../../resources/deep_learning/cnn/cnn2.png' />

* Steps:
1. Convolution
2. Max Pooling
3. Flattening
4. Full Connection

### 1. Convolution

* Function:
    * Mathematically, a convolution combines two functions by integrating the product of one with a flipped, shifted copy of the other
<img src='../../resources/deep_learning/cnn/cnn3.png' />

* In simplified terms, with pixels of only 0 & 1 and a 3x3 feature detector (it doesn't have to be 3x3):
    * The Feature Detector is also called a Kernel or Filter.
    * The convolution operation is denoted by an X in a circle (⊗).
    * Place the filter on a patch of the image, multiply each pair of overlapping values (element-wise) & add them up to get one cell of the Feature Map; slide the filter across the image to fill in the rest.
    * The Feature Map is also known as a Convolved Feature or Activation Map.
    
<img src='../../resources/deep_learning/cnn/cnn4.png' />
<img src='../../resources/deep_learning/cnn/cnn5.png' />
<img src='../../resources/deep_learning/cnn/cnn6.png' />
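
The sliding multiply-and-add described above can be sketched in NumPy as follows. The 7x7 image and 3x3 filter values are hypothetical, and `convolve2d_valid` is a helper name chosen for this sketch. Like most CNN libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w), dtype=image.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            # Element-wise multiply, then add up: one cell of the feature map.
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map

image = np.array([[0, 0, 0, 0, 0, 0, 0],
                  [0, 1, 0, 0, 0, 1, 0],
                  [0, 0, 0, 0, 0, 0, 0],
                  [0, 0, 0, 1, 0, 0, 0],
                  [0, 1, 0, 0, 0, 1, 0],
                  [0, 0, 1, 1, 1, 0, 0],
                  [0, 0, 0, 0, 0, 0, 0]])

kernel = np.array([[0, 0, 1],
                   [1, 0, 0],
                   [0, 1, 1]])

feature_map = convolve2d_valid(image, kernel)
print(feature_map.shape)  # (5, 5): a 7x7 image and a 3x3 filter give a 5x5 map
```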

* The feature map is smaller than the input image, which makes later processing faster.
* Convolutional Layer: We create multiple feature maps to obtain the first convolutional layer
    * We apply several different filters, so the network looks for many features & not just one.
    * To get feature A, we use feature detector A; to get feature B, we use another detector.

<img src='../../resources/deep_learning/cnn/cnn7.png' />

* In photo editing, different feature detectors (eg. sharpen) produce different feature maps of the image (eg. a sharpened image).

### 1(B) ReLU Layer

* On the Convolutional Layer, we apply the Rectifier function to increase non-linearity.
    * This is because images themselves are highly non-linear.
    * But a mathematical operation like convolution is linear, so we risk introducing linearity & need to break it up.
<img src='../../resources/deep_learning/cnn/cnn8.png' />

* For example: take a feature map shown in black & white, then let the rectifier remove the black (negative) parts.
    * Going from white to gray, the next step would be black - a linear progression.
    * Removing the black breaks that progression & introduces non-linearity.
<img src='../../resources/deep_learning/cnn/cnn11.png' />
<img src='../../resources/deep_learning/cnn/cnn9.png' />
<img src='../../resources/deep_learning/cnn/cnn10.png' />
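
As a small sketch of what the rectifier does to a feature map: every negative value becomes 0 and positive values pass through unchanged. The feature map values here are made up.

```python
import numpy as np

def relu(x):
    """Rectified Linear Unit: max(0, x), applied element-wise."""
    return np.maximum(0, x)

feature_map = np.array([[-3, 0, 2],
                        [1, -1, 4]])

rectified = relu(feature_map)
print(rectified.tolist())  # [[0, 0, 2], [1, 0, 4]]
```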

### 2. Max Pooling

* What we want the Neural Network to recognize:
    * The network should recognize the object from ALL angles - eg. cheetahs at all angles & in all lighting
    * Pooling gives the Neural Network the flexibility to cope with these variations.
    
<img src='../../resources/deep_learning/cnn/cnn12.png' />

* We take a box of 2x2 (it doesn't have to be), find the maximum value in that box, and record that value.
    * We still preserve the feature, account for possible distortion & reduce the size, thus reducing the number of parameters -> helping to prevent overfitting.
    * Because we take the maximum, a feature such as an eye that shifts slightly in position still produces the same pooled value.
    
<img src='../../resources/deep_learning/cnn/cnn13.png' />
<img src='../../resources/deep_learning/cnn/cnn15.png' />
<img src='../../resources/deep_learning/cnn/cnn14.png' />

<img src='../../resources/deep_learning/cnn/cnn16.png' />
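
A minimal NumPy sketch of 2x2 max pooling with stride 2, assuming a made-up 4x4 feature map. `max_pool2d` is a helper name chosen for this sketch; it ignores any edge rows or columns that do not fill a whole window. The second part moves a strong response by one pixel to show that the pooled result stays the same.

```python
import numpy as np

def max_pool2d(feature_map, pool_size=2, stride=2):
    """Keep the maximum of each pool_size x pool_size window."""
    h, w = feature_map.shape
    out_h = (h - pool_size) // stride + 1
    out_w = (w - pool_size) // stride + 1
    pooled = np.zeros((out_h, out_w), dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            pooled[i, j] = feature_map[r:r + pool_size, c:c + pool_size].max()
    return pooled

feature_map = np.array([[0, 1, 0, 0],
                        [0, 4, 2, 1],
                        [1, 0, 0, 3],
                        [0, 2, 1, 0]])

pooled = max_pool2d(feature_map)
print(pooled.tolist())  # [[4, 2], [2, 3]]

# Shift the strong response (the 4) by one pixel inside its window.
shifted = feature_map.copy()
shifted[1, 1] = 0
shifted[0, 0] = 4
print(max_pool2d(shifted).tolist())  # [[4, 2], [2, 3]] - unchanged
```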

### 3. Flattening

* Flattening turns the pooled feature maps into one long vector, which then becomes the input layer of the Artificial Neural Network.
<img src='../../resources/deep_learning/cnn/cnn17.png' />
<img src='../../resources/deep_learning/cnn/cnn18.png' />
<img src='../../resources/deep_learning/cnn/cnn19.png' />
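
In array terms, flattening is just a reshape. A small sketch, assuming three made-up 2x2 pooled feature maps stored as (height, width, number of maps):

```python
import numpy as np

# Three hypothetical 2x2 pooled feature maps, stacked along the last axis.
pooled_maps = np.arange(12).reshape(2, 2, 3)

# Flatten everything into one long vector for the input layer.
flat = pooled_maps.flatten()

print(pooled_maps.shape)  # (2, 2, 3)
print(flat.shape)         # (12,)
```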

### 4. Full Connection

* Adding a whole Artificial Neural Network to Convolutional Neural Network.

<img src='../../resources/deep_learning/cnn/cnn20.png' />

* The Hidden Layer is called a Fully Connected Layer here, as it is a more specific type of hidden layer.
    * In an Artificial Neural Network, hidden layers don't have to be fully connected.
    * In a Convolutional Neural Network, we use fully connected hidden layers.
    
* After flattening, the features can already do a reasonable job of classifying the image, but we add the ANN to combine them & improve the prediction.

<img src='../../resources/deep_learning/cnn/cnn21.png' />

* Weights (Synapses, the blue lines) are adjusted during back propagation.

* How do the two neurons in the output layer work?
    * The Dog neuron learns which neurons in the previous layer are important for the Dog class.
    * The same applies to the Cat neuron.
    * Which output neuron fires depends on the input image.
    
<img src='../../resources/deep_learning/cnn/cnn22.png' />
<img src='../../resources/deep_learning/cnn/cnn23.png' />
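
A rough sketch of the forward pass through one fully connected hidden layer and a two-neuron output layer. The layer sizes are arbitrary and the weights are random placeholders standing in for values that back propagation would learn, so the predicted label here is meaningless; only the mechanics matter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Flattened features coming out of the convolution/pooling stages (hypothetical size 8).
x = rng.random(8)

# Every hidden neuron is connected to every input: a full weight matrix.
W_hidden = rng.normal(size=(4, 8))
b_hidden = np.zeros(4)

# Two output neurons, one per class.
W_out = rng.normal(size=(2, 4))
b_out = np.zeros(2)

hidden = np.maximum(0, W_hidden @ x + b_hidden)   # ReLU activation
scores = W_out @ hidden + b_out                   # one score per class

classes = ["dog", "cat"]
predicted = classes[int(np.argmax(scores))]       # the neuron that fires hardest wins
print(scores.shape, predicted)
```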

**Note:** A single output neuron is used when predicting a numerical value. In this case, we have 2 outputs, one per class.

### Summary

1. Input Image
2. Apply multiple feature detectors to get multiple feature maps (the convolutional layer).
3. Apply ReLU (Rectified Linear Unit) to the convolutional layer to break up any linearity.
4. Apply a pooling layer to the convolutional layer (gives flexibility in detecting the image, reduces the size, helps avoid overfitting & preserves the main features).
5. Flatten all the pooled feature maps into a single vector.
6. Feed the vector into an Artificial Neural Network.

<img src='../../resources/deep_learning/cnn/cnn24.png' />

### Softmax & Cross-Entropy

* Normally, the outputs don't add up to 1, but once the Softmax function is applied they do, so they can be read as probabilities.
<img src='../../resources/deep_learning/cnn/cnn25.png' />

* Softmax comes hand in hand with the Cross-Entropy function:
    * There are different versions of the cross-entropy formula, but they give the same result.
<img src='../../resources/deep_learning/cnn/cnn26.png' />
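
A small NumPy sketch of both functions, using made-up raw scores for two classes. Subtracting the maximum before exponentiating and adding a tiny `eps` inside the logarithm are common numerical-stability tricks added here; they are not part of the notes above.

```python
import numpy as np

def softmax(z):
    """Turn raw scores into probabilities that add up to 1."""
    z = z - np.max(z)          # stability trick; does not change the result
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(p_true, q_pred, eps=1e-12):
    """Cross-entropy between the true distribution and the predicted one."""
    return -np.sum(p_true * np.log(q_pred + eps))

scores = np.array([2.0, 1.0])        # hypothetical raw outputs for (dog, cat)
probs = softmax(scores)
loss = cross_entropy(np.array([1.0, 0.0]), probs)   # true class: dog

print(probs.round(3).tolist())  # [0.731, 0.269]
print(round(float(loss), 3))    # 0.313
```

The closer the predicted probability of the true class is to 1, the closer the loss is to 0.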

### Building Convolutional Neural Network

In [1]:
import tensorflow as tf
from keras.preprocessing.image import ImageDataGenerator

In [2]:
tf.__version__

'2.11.0'

### Data Preparation

**Preprocessing on training sets**

* Apply transformations to the images to avoid overfitting
* Technical term = Image Augmentation

In [5]:
train_datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)
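
To give a feel for two of these settings, the sketch below reproduces rescaling and a horizontal flip by hand on a tiny made-up 2x2 image. It is only an illustration: the generator applies its flips, shears and zooms randomly per image, which this sketch does not attempt to reproduce.

```python
import numpy as np

# Hypothetical 2x2 colour image, shape (height, width, channels).
image = np.array([[[0, 0, 0], [255, 255, 255]],
                  [[128, 128, 128], [64, 64, 64]]], dtype=np.uint8)

# rescale=1./255 maps pixel values from [0, 255] into [0, 1].
rescaled = image.astype(np.float32) / 255.0

# A horizontal flip mirrors the image left to right (reverse the width axis).
flipped = rescaled[:, ::-1, :]

print(rescaled.min(), rescaled.max())   # 0.0 1.0
print(flipped[0, 0].tolist())           # [1.0, 1.0, 1.0] - the white pixel moved to the left
```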

In [6]:
train_set = train_datagen.flow_from_directory( # Connects Image Augmentation tool to the training set
        '../../codes_datasets/cnn_dataset/training_set',
        target_size=(64, 64),
        batch_size=32,
        class_mode='binary')

Found 8000 images belonging to 2 classes.


**Preprocessing on testing sets**

* Only rescale the test images, so that their scaling matches the training set (no augmentation is applied)

In [7]:
test_datagen = ImageDataGenerator(rescale=1./255)

In [8]:
test_set = test_datagen.flow_from_directory(
        '../../codes_datasets/cnn_dataset/test_set',
        target_size=(64, 64),
        batch_size=32,
        class_mode='binary')

Found 2000 images belonging to 2 classes.


### Building the CNN

**Initializing the CNN**

In [9]:
cnn = tf.keras.models.Sequential() # Initialize cnn as a sequence of layers instead of a computational graph

**Step 1: Convolution**

In [10]:
# Conv2D class for the convolutional layer
# filters specifies the number of feature detectors we want; 32 is a classic choice
# kernel_size specifies the number of rows & columns of each filter/ feature detector
# As we resized our images to 64x64, input_shape must match
#  coloured images = (x, x, 3), grayscale images = (x, x, 1)

cnn.add(tf.keras.layers.Conv2D(filters=32,
                               kernel_size=3,
                               activation='relu',
                               input_shape=[64, 64, 3])) 

**Step 2: Pooling**

In [11]:
# We apply max pooling in this case
# 2 by 2 pooling window; strides=2 shifts the window by two pixels at each step
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2))

**Adding second convolutional layer**

In [12]:
cnn.add(tf.keras.layers.Conv2D(filters=32,
                               kernel_size=3,
                               activation='relu')) # input_shape is only needed on the first layer, where the images enter the network

cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2))

**Step 3: Flattening**

In [13]:
cnn.add(tf.keras.layers.Flatten())
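
To see how large the flattened vector is, the shapes can be worked out by hand. This sketch assumes the Keras defaults of no padding (`padding='valid'`) and a convolution stride of 1, which the layers above do not override; `conv_output_size` is a helper name chosen here.

```python
def conv_output_size(size, window, stride=1):
    """Output width/height of a convolution or pooling step without padding."""
    return (size - window) // stride + 1

size = 64                               # input images are 64x64
size = conv_output_size(size, 3)        # Conv2D, kernel 3         -> 62
size = conv_output_size(size, 2, 2)     # MaxPool2D, pool 2 / stride 2 -> 31
size = conv_output_size(size, 3)        # second Conv2D            -> 29
size = conv_output_size(size, 2, 2)     # second MaxPool2D         -> 14

flattened = size * size * 32            # 32 feature maps of 14x14
print(size, flattened)                  # 14 6272
```

So the Flatten layer hands a vector of 6272 values to the fully connected layer.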

**Step 4: Full Connection**

In [14]:
# units = number of hidden neurons
# relu is the recommended activation for the layers before the output layer
cnn.add(tf.keras.layers.Dense(units=128, activation='relu'))

**Step 5: Output Layer**

In [15]:
# Output neurons = 1, as this is binary classification
# Sigmoid gives the probability of the class labelled 1
cnn.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))

### Training the CNN

**Compiling the CNN**

In [16]:
# 'adam' optimizer, a variant of Stochastic Gradient Descent
# Binary cross-entropy as the loss function, as we are doing binary classification

cnn.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
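
For intuition about the loss, here is a hand-rolled sketch of binary cross-entropy on three made-up predictions. It is not the Keras implementation; the clipping constant `eps` is an assumption added to avoid `log(0)`.

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over a batch of sigmoid outputs."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    losses = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return float(np.mean(losses))

y_true = np.array([1.0, 0.0, 1.0])   # 1 = dog, 0 = cat
y_pred = np.array([0.9, 0.2, 0.6])   # hypothetical sigmoid outputs

loss = binary_cross_entropy(y_true, y_pred)
print(round(loss, 4))  # 0.2798
```

A confident wrong prediction (eg. 0.1 for a true dog) is penalised far more heavily than a hesitant one.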

**Training & Evaluating**

In [18]:
# Trains on the training set & evaluates on the test set after every epoch
cnn.fit(x=train_set, validation_data=test_set, epochs=25)

Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25


<keras.callbacks.History at 0x1dc4f239900>

### Making a single prediction

In [22]:
import numpy as np
from tensorflow.keras.utils import load_img, img_to_array

# Needs to be the same size as the training images
test_image = load_img('../../codes_datasets/cnn_dataset/single_prediction/cat_or_dog_1.jpg', target_size=(64, 64))

# Convert PIL image into a 3D array of shape (64, 64, 3), then rescale like the training data
test_image = img_to_array(test_image) / 255.0

# We trained our cnn with batches, so the input also needs a batch dimension
# axis=0 adds the new dimension at the front -> shape (1, 64, 64, 3)
test_image = np.expand_dims(test_image, axis=0)

result = cnn.predict(test_image)
train_set.class_indices # Get the right class indices, dog=1, cat=0

# result[0][0] = first batch, first image; the sigmoid output is a probability, so threshold at 0.5
if result[0][0] > 0.5:
    prediction = 'dog'
else:
    prediction = 'cat'



In [23]:
print(prediction)

dog
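
The mapping from sigmoid output to label can also be derived from the class indices instead of being hard-coded. A minimal sketch: the `class_indices` dictionary below is an assumed example of what `train_set.class_indices` might return, and `label_from_probability` is a helper name chosen here.

```python
# Assumed example; the real mapping comes from train_set.class_indices.
class_indices = {"cats": 0, "dogs": 1}

# Invert the mapping so that an index looks up its label.
index_to_label = {index: label for label, index in class_indices.items()}

def label_from_probability(probability, threshold=0.5):
    """Map a sigmoid output to a class label using a decision threshold."""
    return index_to_label[int(probability > threshold)]

print(label_from_probability(0.93))  # dogs
print(label_from_probability(0.12))  # cats
```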
