# Convolutional Neural Network

## Dataset

### Layout

* Images:
    * Dog
    * Cat
* 1000s of images
    * Training set
        * 4000 images each for dogs and cats
    * Test set
        * 1000 images each for dogs and cats

### Goals

* Build a CNN model to classify each image as a dog or a cat

---

## Import Libraries

In [18]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

In [19]:
tf.__version__

'2.20.0'

---

## Data Preprocessing

### Preprocessing Training Set

#### Image Augmentation

* Transformations will be applied to images only in the training set
    * This is to avoid overfitting
    * Otherwise, there will be a huge difference in accuracy of the training and test sets:
        * Close to $98\%$ accuracy on the training set
        * Much lower accuracy on the test set
* Geometric transformations are applied to the training set images:
    * For example, zoom, rotations, etc. on the images
    * First, transvections (shear mappings) to shift pixels
    * Next, rotations with horizontal flips and zoom-in-and-out
* These transformations are called **image augmentation**
* The goal is to augment the diversity of the training set images
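
The idea behind two of these transformations can be sketched with plain NumPy. This is only an illustration on a made-up 2x3 array, not what `ImageDataGenerator` does internally; the names `image`, `flipped`, and `shifted` are hypothetical.

```python
import numpy as np

# Hypothetical 2x3 single-channel "image"; real images here are (height, width, channels).
image = np.array([[1, 2, 3],
                  [4, 5, 6]])

# Horizontal flip: reverse the order of the columns.
flipped = image[:, ::-1]

# Shift every pixel one column to the right, filling the vacated column with zeros.
shifted = np.zeros_like(image)
shifted[:, 1:] = image[:, :-1]

print(flipped)
print(shifted)
```

Each transformed copy shows the same content in a slightly different position or orientation, which is how augmentation adds diversity without new photos.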

#### Image Generation

* The `ImageDataGenerator` class from the `preprocessing.image` module of the Keras library is used to generate images and perform image augmentation
    * Parameters
        * `rescale` applies feature scaling to each pixel
            * Each pixel has a value between $0$ and $255$
            * Each pixel value is divided by $255$
            * This will normalize the pixel values in images
        * `shear_range` sets the intensity range for random shear transformations, which slant an image by shifting its pixels
        * `zoom_range` sets the range for random zooming in and out of images
        * `horizontal_flip` randomly flips images horizontally when set to `True`
* The `train_datagen` variable defines an instance of the `ImageDataGenerator` class
* The training dataset will be imported from the dataset directory containing the directory of training set images
    * The `flow_from_directory` method on the `ImageDataGenerator` class recursively loads images from a specified directory
        * Parameters
            * `directory` is the root directory of images to process
            * `target_size` performs image resizing by specifying the target pixel size of the images
                * This makes the image processing less computationally intensive
            * `batch_size` indicates the number of images to process per batch
            * `class_mode` specifies the type of labels to generate, such as `binary` or `categorical`
* The `training_set` variable is the training set of the images
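
As a small illustration of what `rescale=1. / 255` does to pixel values, the sketch below applies the same division to a few made-up 8-bit values with NumPy. The array names are hypothetical.

```python
import numpy as np

# Made-up 8-bit pixel intensities spanning the full 0..255 range.
pixels = np.array([0, 64, 128, 255], dtype=np.uint8)

# Dividing by 255 maps every value into the interval [0, 1].
scaled = pixels / 255.0

print(scaled.min(), scaled.max())  # 0.0 1.0
```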

In [20]:
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
)

In [21]:
training_set = train_datagen.flow_from_directory(
    directory='dataset/training_set',
    target_size=(64, 64),
    batch_size=32,
    class_mode='binary',
)

Found 8000 images belonging to 2 classes.


### Preprocessing Test Set

* Image augmentation is not applied to test set images, so those parameters are omitted
    * Only `rescale` is kept so that test pixels are scaled the same way as training pixels

In [22]:
test_datagen = ImageDataGenerator(rescale=1. / 255)
test_set = test_datagen.flow_from_directory(
    directory='dataset/test_set',
    target_size=(64, 64),
    batch_size=32,
    class_mode='binary',
)

Found 2000 images belonging to 2 classes.


---

## Build CNN

### Initialize CNN

* The `Sequential` class is from the `models` module of the Keras library and allows one to construct a neural network made of a sequence of layers
* The `cnn` variable is an object instance of the `Sequential` class

In [23]:
cnn = tf.keras.models.Sequential()

### Step 1 - Convolution

* The `Conv2D` class is from the `layers` module of the Keras library and creates a convolutional layer
    * Parameters
        * `filters` is the number of feature detectors used to perform convolution
        * `kernel_size` is the size (rows and columns) of the matrix used as a feature detector
        * `strides` is the number of pixels the feature detector moves at each step, across and up/down an image. The default value of $1$ is used.
        * `activation` is the activation function to apply after convolution
        * `input_shape` is the shape of an input image, given as (height, width, channels):
            * `height` = $64$ pixels
            * `width` = $64$ pixels
            * `channels` = $3$ for color (RGB) images. A grayscale image would have $1$ channel.
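
To make the effect of these parameters concrete, the sketch below works out the output size and weight count of this first layer. It assumes no padding (the `padding` argument is not set in this notebook, and the Keras default is `'valid'`); the helper names are hypothetical.

```python
def conv_output_size(input_size: int, kernel_size: int, stride: int = 1) -> int:
    """Output length along one axis for a convolution without padding."""
    return (input_size - kernel_size) // stride + 1


def conv_param_count(kernel_h: int, kernel_w: int, in_channels: int, filters: int) -> int:
    """Trainable weights: one kernel per input channel plus one bias, per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


side = conv_output_size(64, 3)          # 62
params = conv_param_count(3, 3, 3, 32)  # 896

print(side, params)
```

Under these assumptions, a 64x64 RGB image becomes 32 feature maps of 62x62 each.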

In [24]:
cnn.add(
    tf.keras.layers.Conv2D(
        filters=32,
        kernel_size=(3, 3),
        activation='relu',
        input_shape=(64, 64, 3)
    )
)

### Step 2 - Pooling

* The `MaxPooling2D` class is from the `layers` module of the Keras library and creates a pooling layer using max pooling
    * Parameters
        * `pool_size` is the size (rows and columns) of the pool window (frame)
        * `strides` is the number of pixels to move the pool window across and up/down an image
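
A minimal NumPy sketch of max pooling with a 2x2 window and a stride of 2 is shown below, on a made-up 4x4 feature map. It illustrates the operation only; `max_pool_2x2` is a hypothetical helper, not a Keras function.

```python
import numpy as np


def max_pool_2x2(feature_map: np.ndarray) -> np.ndarray:
    """Max pooling with a 2x2 window and stride 2 on a 2D array."""
    h, w = feature_map.shape
    h, w = h - h % 2, w - w % 2  # drop a trailing odd row/column, as unpadded pooling would
    # Split the map into non-overlapping 2x2 blocks, then take the max of each block.
    blocks = feature_map[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))


feature_map = np.array([[1, 3, 2, 0],
                        [4, 2, 1, 5],
                        [0, 1, 7, 2],
                        [3, 2, 4, 6]])

pooled = max_pool_2x2(feature_map)
print(pooled)
# [[4 5]
#  [3 7]]
```

Each output value keeps only the strongest response in its window, halving the width and height of the feature map.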

In [25]:
cnn.add(
    tf.keras.layers.MaxPooling2D(
        pool_size=(2, 2),
        strides=(2, 2)
    )
)

### Add 2nd Convolutional Layer

* The `input_shape` parameter is omitted because the input shape is only defined on the first layer of the CNN

In [26]:
cnn.add(
    tf.keras.layers.Conv2D(
        filters=32,
        kernel_size=(3, 3),
        activation='relu',
    )
)

In [27]:
cnn.add(
    tf.keras.layers.MaxPooling2D(
        pool_size=(2, 2),
        strides=(2, 2)
    )
)

### Step 3 - Flattening

* The `Flatten` class is from the `layers` module of the Keras library and creates a flattening layer
    * No parameters are required for an instance of this class
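
The sketch below traces the image size through the two convolution and pooling stages and then flattens the result with NumPy. It assumes no padding in any layer (the Keras default) and uses hypothetical names throughout.

```python
import numpy as np


def valid_output_side(side: int, window: int, stride: int) -> int:
    """Output length along one axis for an unpadded convolution or pooling window."""
    return (side - window) // stride + 1


side = 64
# (window, stride) for: conv 3x3, pool 2x2, conv 3x3, pool 2x2
for window, stride in [(3, 1), (2, 2), (3, 1), (2, 2)]:
    side = valid_output_side(side, window, stride)

# 32 feature maps of side x side, flattened into a single vector.
feature_maps = np.zeros((side, side, 32))
flat = feature_maps.reshape(-1)

print(side, flat.shape)  # 14 (6272,)
```

Under these assumptions, the flattening layer hands a vector of 6272 values to the fully connected layer.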

In [28]:
cnn.add(
    tf.keras.layers.Flatten()
)

### Step 4 - Full Connection

* The `Dense` class is from the `layers` module of the Keras library and adds a fully connected layer to a neural network
    * An object instance of this class is used to construct a fully connected layer
    * Parameters
        * `units` defines the number of neurons in the layer
            * A relatively large layer of $128$ neurons is used here, aiming for a higher level of accuracy
        * `activation` defines the activation function
            * Rectifier activation function (`relu`) will be used for hidden layers
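
What a fully connected layer with a rectifier activation computes can be sketched as a matrix product plus a bias, followed by `max(0, x)`. The tiny weights below are made up for illustration; `dense_relu` is a hypothetical helper, not the Keras implementation.

```python
import numpy as np


def dense_relu(inputs: np.ndarray, weights: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """Fully connected layer followed by ReLU: max(0, inputs @ weights + bias)."""
    return np.maximum(0.0, inputs @ weights + bias)


inputs = np.array([[1.0, -2.0, 0.5]])  # batch of 1 sample with 3 features
weights = np.array([[0.5, -1.0],
                    [0.25, 0.5],
                    [1.0, 2.0]])       # 3 inputs connected to 2 neurons
bias = np.array([0.1, -0.2])

outputs = dense_relu(inputs, weights, bias)
print(outputs)  # approximately [[0.6 0. ]]
```

The second neuron's pre-activation is negative, so the rectifier sets it to zero.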

In [29]:
cnn.add(
    tf.keras.layers.Dense(
        units=128,
        activation='relu'
    )
)

### Step 5 - Output Layer

* The same `Dense` class is used to construct the output layer except it:
    * Has $1$ neuron
        * Since there is only $1$ dependent variable
    * Has Sigmoid activation function
        * Used when making predictions that are binary for classification
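
For reference, the sigmoid function squashes any real number into the interval (0, 1), which is why a single output neuron can be read as a probability. A minimal sketch:

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


print(sigmoid(0.0))   # 0.5
print(sigmoid(2.0))   # above 0.5
print(sigmoid(-2.0))  # below 0.5
```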

In [30]:
cnn.add(
    tf.keras.layers.Dense(
        units=1,
        activation='sigmoid'
    )
)

---

## Training CNN

### Compile CNN

* The `compile` method of the `Sequential` class configures the neural network for training
    * Parameters
        * `optimizer` defines the algorithm used to minimize loss
            * Calculates the gradient of the loss function
            * Updates the weights by moving in the direction of the negative gradient
            * The optimizer aims to converge on a minimum of the loss function, ideally one with an acceptable level of error; reaching the global minimum is not guaranteed
            * `adam` is a widely used optimizer that extends stochastic gradient descent (SGD) with adaptive learning rates
        * `loss` defines the loss function, which measures the difference between predictions and actual values
            * When making binary predictions, for classification, use the `binary_crossentropy` loss function
            * When making predictions across more than two categories, for classification, use the `categorical_crossentropy` loss function
            * When making continuous number predictions, for regression, use the `mean_squared_error` loss function
        * `metrics` defines the list of metrics to be evaluated by the model during training and testing
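
As a rough illustration of what the binary cross-entropy loss measures, the NumPy sketch below computes it for made-up labels and predictions. The function is a hypothetical re-implementation for clarity, not the Keras code, and the clipping constant `eps` is an assumption.

```python
import math

import numpy as np


def binary_crossentropy(y_true: np.ndarray, y_pred: np.ndarray, eps: float = 1e-7) -> float:
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    losses = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return float(np.mean(losses))


y_true = np.array([1.0, 0.0, 1.0, 0.0])
confident = np.array([0.9, 0.1, 0.8, 0.2])  # mostly correct, fairly sure
unsure = np.array([0.5, 0.5, 0.5, 0.5])     # always predicts 0.5

loss_confident = binary_crossentropy(y_true, confident)
loss_unsure = binary_crossentropy(y_true, unsure)

print(loss_confident, loss_unsure)
```

A model that always outputs $0.5$ scores $\ln 2 \approx 0.693$, so a loss near that value suggests the network has not yet learned to separate the classes.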

In [31]:
cnn.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)

### Train CNN on Training Set and Evaluate on Test Set

* The `fit` method of the `Sequential` class trains the neural network on the training set and evaluates it on the test set after each epoch
    * Parameters
        * `x` defines the input data for training; here it is the `training_set` generator, which yields batches of images with their labels
        * `validation_data` defines the dataset used for validation of the model
            * The validation dataset is used to monitor training through the `val_loss` and `val_accuracy` values reported after each epoch
            * A growing gap between the training and validation metrics is a sign of overfitting
            * The validation dataset is not used to update the weights
        * `epochs` defines the number of full passes over the training set
            * A neural network must be trained over $n$ number of epochs to improve its accuracy over time
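
The number of batches processed per epoch follows from the dataset size and the batch size. The short sketch below uses the image counts reported by `flow_from_directory` and the batch size of 32; `steps_per_epoch` is a hypothetical helper.

```python
import math


def steps_per_epoch(num_images: int, batch_size: int) -> int:
    """Number of batches needed to see every image once (the last batch may be partial)."""
    return math.ceil(num_images / batch_size)


train_steps = steps_per_epoch(8000, 32)  # 250
val_steps = steps_per_epoch(2000, 32)    # 63

print(train_steps, val_steps)
```

The 250 training steps correspond to the `250/250` counter in the training log.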

In [32]:
cnn.fit(
    x=training_set,
    validation_data=test_set,
    epochs=25
)

Epoch 1/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 10s 36ms/step - accuracy: 0.5444 - loss: 0.6932 - val_accuracy: 0.6275 - val_loss: 0.6466
Epoch 2/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 9s 37ms/step - accuracy: 0.6548 - loss: 0.6297 - val_accuracy: 0.6960 - val_loss: 0.5921
Epoch 3/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 9s 37ms/step - accuracy: 0.6801 - loss: 0.5909 - val_accuracy: 0.7200 - val_loss: 0.5658
Epoch 4/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 9s 37ms/step - accuracy: 0.7165 - loss: 0.5567 - val_accuracy: 0.7180 - val_loss: 0.5536
Epoch 5/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 9s 37ms/step - accuracy: 0.7339 - loss: 0.5263 - val_accuracy: 0.7495 - val_loss: 0.5226
Epoch 6/25
250/250 ━━━━━━━━━━━━━━━━━━━━ 9s 36ms/step - accuracy: 0.7641 - loss: 0.4934 - val_accuracy: 0.7620 - val_loss: 0.5006
Epoch 7/25
...

<keras.src.callbacks.history.History at 0x11f755f70>

---

## Make Predictions

### Import Image Preprocessing Module

In [33]:
from keras.preprocessing import image

### Make Single Prediction

* The `load_img` function of the `image` module of the Keras library loads an image from a specified file path
    * Parameters
        * `path` is the file path to the image to load
        * `target_size` performs image resizing by specifying the target pixel size of the image

#### Load Image

In [40]:
test_image = image.load_img(
    path='dataset/single_prediction/cat_or_dog_1.png',
    target_size=(64, 64)
)

#### Convert PIL Image Format to Image Array

* A "PIL image" refers to an image object created and manipulated using the Python Imaging Library (PIL), or more commonly, its actively maintained fork, Pillow
* The `img_to_array` function of the `image` module of the Keras library converts an image to a NumPy array
    * Parameters
        * `img` is the image object to convert
    * The function returns a 3D NumPy array of shape (height, width, channels)

In [41]:
test_image = image.img_to_array(img=test_image)

#### Add Batch Dimension to Image Array

* The CNN model was trained using batches of images
* To make a single prediction, the image must be placed in a batch containing only that image
    * This is done by adding a batch dimension to the image array
* The `expand_dims` function of the NumPy library adds a new dimension to a NumPy array
    * Parameters
        * `a` is the image array to which the batch dimension is added
        * `axis` is the index at which to add the batch dimension
            * It is the first dimension since batches are first in the ordering of objects used to train the CNN model
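
The shape change can be checked on a stand-in array, as sketched below. The sketch also divides by 255, on the assumption that a single image should be scaled the same way as the generator output (`rescale=1. / 255`); the original cells do not apply this step, so treat it as a suggestion rather than part of the notebook.

```python
import numpy as np

# Stand-in for the output of img_to_array: a 64x64 RGB image with all pixels at 255.
image_array = np.full((64, 64, 3), 255.0, dtype=np.float32)

# Assumption: scale pixels to [0, 1] to mirror the training generator's rescale setting.
scaled = image_array / 255.0

# Add the batch dimension at the front: (64, 64, 3) -> (1, 64, 64, 3).
batch = np.expand_dims(scaled, axis=0)

print(batch.shape)  # (1, 64, 64, 3)
```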

In [42]:
test_image = np.expand_dims(a=test_image, axis=0)

#### Make Prediction

In [43]:
result = cnn.predict(test_image)

[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 33ms/step


#### Display Classes

In [45]:
training_set.class_indices

{'cats': 0, 'dogs': 1}

#### Display Result

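* The `result` object is a 2D array of shape (1, 1) containing
    * The batch dimension at index $0$
    * Then the predicted value at index $0$, which is the sigmoid output: the estimated probability of class $1$ (`dogs`)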
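
One way to turn the sigmoid output into a label is to threshold it at $0.5$ and look the class index up in `class_indices`. The sketch below uses a made-up prediction value; `label_from_probability` is a hypothetical helper, and the $0.5$ threshold is a common convention rather than something fixed by Keras.

```python
import numpy as np

# Mapping as reported by training_set.class_indices in this notebook.
class_indices = {'cats': 0, 'dogs': 1}


def label_from_probability(probability: float, class_indices: dict, threshold: float = 0.5) -> str:
    """Map a sigmoid output to a class name by thresholding it."""
    index_to_label = {index: label for label, index in class_indices.items()}
    return index_to_label[int(probability > threshold)]


# Made-up prediction with the same (1, 1) shape that cnn.predict returns for one image.
result = np.array([[0.93]])

label = label_from_probability(result[0][0], class_indices)
print(label)  # dogs
```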

In [48]:
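# The sigmoid output is a probability, so threshold it at 0.5 rather than testing for exactly 1
if result[0][0] > 0.5:
    prediction = 'Dog'
else:
    prediction = 'Cat'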

In [49]:
print(prediction)

Dog
