## Introduction

***Objective:*** Binary classification of image data: classify an image as belonging to one of 2 classes, Fake or Real.

***Concepts:*** Convolutional Neural Network (CNN/ConvNet) implemented using the Keras API of TensorFlow

***Level***: Beginner Friendly. It consolidates all my learning from various sources. Please feel free to Upvote if this notebook helps you. Such small little things do motivate us :) I will be adding more functionality to this notebook in the future.

## Table of Contents

- [Libraries and Utilities](#lib) 
- [Define Constants](#cons)
- [Load Data](#load)
- [Data Exploration](#exp)
- [Model Training](#train)
- [Callbacks](#call)
- [Data Augmentation](#aug)
- [Fit Model](#fit)
- [Model Evaluation](#eval)
- [Misclassified Images Analysis](#miss)

## Import Libraries
<a id= "lib"> </a>

In [None]:
import numpy as np
import pandas as pd 
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
from sklearn.utils import shuffle
import matplotlib.pyplot as plt
import seaborn as sns
import random
from tqdm import tqdm #to create progress bar
import cv2 
import os
print(os.listdir("../input"))

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator, load_img
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense, Activation, BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

## Define Constants
<a id= "cons"> </a>

In [None]:
FAST_RUN = False
IMAGE_WIDTH=224
IMAGE_HEIGHT=224
IMAGE_SIZE=(IMAGE_WIDTH, IMAGE_HEIGHT)
IMAGE_CHANNELS=3 # 3 for Colored Images and 1 for Grayscale Images
BATCH_SIZE=32
EPOCHS= 5

if FAST_RUN:
    EPOCHS= 5

In [None]:
class_names = ['Fake',"Real"]
class_names_label = {class_name:i for i, class_name in enumerate(class_names)}

nb_classes = len(class_names)

In [None]:
class_names_label

## Load Data
<a id= "load"> </a>

In [None]:
train_data_dir = '/kaggle/input/deepfake-and-real-images/Dataset/Train'
test_data_dir = '/kaggle/input/deepfake-and-real-images/Dataset/Test'

In [None]:
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=15,
    shear_range=0.1,
    zoom_range=0.2,
    horizontal_flip=True,
    width_shift_range=0.1,
    height_shift_range=0.1
)

test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(IMAGE_WIDTH, IMAGE_HEIGHT),
    batch_size=BATCH_SIZE,
    class_mode='categorical'
)
test_generator = test_datagen.flow_from_directory(
    test_data_dir,
    target_size=(IMAGE_WIDTH, IMAGE_HEIGHT),
    batch_size=BATCH_SIZE,
    class_mode='categorical'
)

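On disk, the images are grouped by class, with all images of one class followed by all images of the other, so they need to be shuffled before training. `flow_from_directory()` does this by default (`shuffle=True`).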

## Explore Data
<a id= "exp"> </a>

Let us get the length of each dataset, counts across all categories etc.

### Plot Sample Image

### Some data-format related changes to make datasets compatible with Keras

The `to_categorical()` function converts `train_labels`, which stores each category as an integer (1 for Real, 0 for Fake), into a one-hot matrix (e.g. [0,1] for Real and [1,0] for Fake).<br>
Output: 
This function returns a matrix of binary values (either ‘1’ or ‘0’). It has as many rows as the input vector has entries and as many columns as there are classes.
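
To make the shape of that output concrete, here is a minimal NumPy-only sketch that mimics what `to_categorical()` returns for two classes. The `labels` array is made-up illustration data, and the `np.eye` indexing trick is a stand-in for the Keras utility, not its actual implementation.

```python
import numpy as np

# Hypothetical integer labels for five images (0 and 1 are the two class indices).
labels = np.array([0, 1, 1, 0, 1])
num_classes = 2

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing with the label array yields one row per label.
one_hot = np.eye(num_classes, dtype="float32")[labels]

print(one_hot.shape)  # (number of labels, number of classes)
print(one_hot)
```

Each row contains exactly one `1`, in the column of the class that image belongs to.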

`train_images` is already in the format Keras accepts (number_of_images, image height, image width, channels), so we do not need to reshape it again. <br>
p.s. If the number of images is unknown when reshaping, pass `-1` for the number_of_images dimension and it will be inferred from the other dimensions.
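
The `-1` trick can be sketched with plain NumPy. The tiny 4x4 image size and the flat buffer below are hypothetical values chosen only to keep the example small.

```python
import numpy as np

# Hypothetical flat buffer holding six tiny 4x4 RGB images (values are arbitrary).
flat = np.arange(6 * 4 * 4 * 3, dtype="float32")

# -1 asks NumPy to infer the number of images from the remaining dimensions.
images = flat.reshape(-1, 4, 4, 3)

print(images.shape)  # (6, 4, 4, 3)
```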

## Model Training
<a id= "train"> </a>

**Layers**<br>
In Keras Sequential API, we add one layer at a time, starting from the input layer.
- **Conv2D:** 2D convolution layer. When this is the first layer in a model, provide the keyword argument `input_shape`, e.g. `input_shape=(128, 128, 3)` for 128x128 RGB pictures in `data_format="channels_last"`. You can use `None` when a dimension has variable size.<br>
Other key arguments are `filters` and `kernel_size`. `filters` is the number of feature maps the layer produces. `kernel_size` is the size of the convolution window; it can be an integer or a tuple/list of 2 integers (e.g. 5 and (5,5) mean the same thing). This layer extracts features from an image (e.g. ear shape or eye shape in the case of a cat image). 
- **MaxPooling2D:** Downsamples the input along its spatial dimensions (height and width) by taking the maximum value over an input window (of size defined by pool_size) for each channel of the input. The window is shifted by `strides` along each dimension.
- **Flatten:** Flattens the input, that is, converts the final feature maps into a single 1D vector. Does not affect the batch size.
- **Dense:** Just your regular densely-connected NN layer.
- **Dropout:** This layer is used for Regularization. It randomly sets input units to 0 with a frequency of `rate` at each step during training time, which helps prevent overfitting. <br>
rate: Float between 0 and 1. Fraction of the input units to drop.
- **BatchNormalization:** Layer that normalizes its inputs. Batch normalization applies a transformation that maintains the mean output close to 0 and the output standard deviation close to 1.
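
To see how these layers change the tensor shape, the following sketch traces the spatial size through four Conv2D (3x3) + MaxPooling2D (2x2) stages. It assumes the Keras defaults (`padding="valid"`, convolution stride 1, pooling stride equal to `pool_size`) and uses the 224x224 input and the filter counts 32, 32, 64, 128 from this notebook's model; the helper names are made up for illustration.

```python
def conv_out(size, kernel=3):
    """Output size of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of max pooling whose stride equals the pool size."""
    return size // pool

size = 224                    # input height and width
filters = [32, 32, 64, 128]   # feature maps produced by each Conv2D layer
shapes = []
for n_filters in filters:
    size = pool_out(conv_out(size))
    shapes.append((size, size, n_filters))

# Flatten turns the last (height, width, channels) block into one vector.
flattened = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]

print(shapes)     # [(111, 111, 32), (54, 54, 32), (26, 26, 64), (12, 12, 128)]
print(flattened)  # 18432
```

These numbers should match the output shapes reported by `model.summary()` if the defaults above hold.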

**Activation Functions:**

- **ReLU:** `'relu'` is the rectifier activation function: given a value x, it returns max(x, 0). It is used to add non-linearity to the network.

- **Softmax:** The output layer has 2 neurons, each giving the probability that the image belongs to one of the two classes. Softmax converts a vector of values to a probability distribution. The elements of the output vector are in range (0, 1) and sum to 1. Softmax is often used as the activation for the last layer of a classification network because the result can be interpreted as a probability distribution.
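
Both activations are easy to reproduce in NumPy. This is a minimal sketch for intuition, not the Keras implementation; the `logits` values are arbitrary.

```python
import numpy as np

def relu(x):
    """Element-wise max(x, 0)."""
    return np.maximum(x, 0.0)

def softmax(x):
    """Row-wise softmax; subtracting the row max keeps exp() numerically stable."""
    shifted = x - np.max(x, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

# Two hypothetical samples with two raw scores each.
logits = np.array([[2.0, -1.0], [0.0, 0.0]])

print(relu(logits))         # negative entries become 0
probs = softmax(logits)
print(probs)                # each row is a probability distribution
print(probs.sum(axis=1))    # each row sums to 1
```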

**Loss and Optimizer**<br>
- Binary Cross Entropy computes the cross-entropy loss between true labels and predicted labels. Use this cross-entropy loss for binary (0 or 1) classification applications. Since we have only 2 classes, we used `binary_crossentropy`
- CategoricalCrossentropy: Use this crossentropy loss function when there are two or more label classes. The labels are expected to be provided in a one_hot representation. If you want to provide labels as integers, please use SparseCategoricalCrossentropy loss.
- RMSprop: The gist of RMSprop is to maintain a moving (discounted) average of the square of gradients and divide the gradient by the root of this average. The model below is compiled with the `adam` optimizer; feel free to try `rmsprop` or other optimizers and see if they improve the performance of the model.
- adam: Another popular optimizer is adam. Adam optimization is a stochastic gradient descent method that is based on adaptive estimation of first-order and second-order moments.
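
For a 2-unit softmax output with one-hot labels, the two losses give the same value, which may explain why either one works in this setting. The sketch below re-implements both in NumPy purely for illustration (it is not the Keras code), and the label and prediction values are made up.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over output units of the per-unit binary cross-entropy."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    per_unit = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return per_unit.mean(axis=-1)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Negative log-probability assigned to the true class."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -(y_true * np.log(y_pred)).sum(axis=-1)

# Hypothetical one-hot labels and softmax outputs for two images.
y_true = np.array([[1.0, 0.0], [0.0, 1.0]])
y_pred = np.array([[0.9, 0.1], [0.3, 0.7]])

bce = binary_crossentropy(y_true, y_pred)
cce = categorical_crossentropy(y_true, y_pred)
print(bce)
print(cce)
```

The equality relies on the two predicted probabilities summing to 1; with more than two classes the values differ and `categorical_crossentropy` is the usual choice.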

In [None]:
model = Sequential()

model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(IMAGE_WIDTH, IMAGE_HEIGHT, IMAGE_CHANNELS)))
#model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))


model.add(Conv2D(32, (3, 3), activation='relu'))
#model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3), activation='relu'))
#model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))


model.add(Conv2D(128, (3, 3), activation='relu'))
#model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))


model.add(Flatten())
model.add(Dense(256, activation='relu'))
#model.add(BatchNormalization())
model.add(Dropout(0.2))
model.add(Dense(2, activation='softmax')) # 2 output units, one per class (Fake, Real)

model.compile(loss='binary_crossentropy', optimizer="adam", metrics=['accuracy'])

model.summary()

## Callbacks
<a id= "call"> </a>

**Early Stopping**

To prevent overfitting, we stop training once `val_loss` has not decreased for 10 consecutive epochs. `EarlyStopping` monitors `val_loss` by default.

In [None]:
earlystop = EarlyStopping(patience=10)
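
The patience rule can be sketched in plain Python. This is a simplified model of the idea (it ignores options such as `min_delta`), the function name is hypothetical, and the loss values are made up.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None if it never stops."""
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Hypothetical validation losses: improvement stops after epoch 3.
val_losses = [0.70, 0.62, 0.60, 0.61, 0.63, 0.64, 0.66]

print(early_stopping_epoch(val_losses, patience=3))       # 6
print(early_stopping_epoch(val_losses[:5], patience=10))  # None
```

The second call illustrates that a patience of 10 cannot trigger in a run that lasts only 5 epochs.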

**Learning Rate Reduction** <br>
tl;dr: We will reduce the learning rate when validation accuracy does not improve for 2 epochs. <br>
The Learning Rate (LR) is the step size with which the optimizer walks through the 'loss landscape'. The higher the LR, the bigger the steps and the quicker the convergence. However, with a high LR the landscape is sampled very coarsely, and the optimizer can overshoot good minima or settle in a poor local minimum.

It is better to decrease the learning rate during training so that the optimizer converges efficiently towards a minimum of the loss function. To keep the advantage of fast computation with a high LR, we decrease the LR dynamically, only when it is necessary (when accuracy stops improving).

With the `ReduceLROnPlateau` callback from `keras.callbacks`, we multiply the LR by `factor` (0.1 in this case) if the accuracy has not improved after 2 epochs.

In [None]:
learning_rate_reduction = ReduceLROnPlateau(monitor='val_accuracy',  # TF2 logs this metric as 'val_accuracy', not 'val_acc'
                                            patience=2, 
                                            verbose=1, 
                                            factor=0.1, 
                                            min_lr=0.00001)
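
To get a feel for how `factor`, `patience` and `min_lr` interact, here is a simplified plain-Python simulation. It is a sketch of the idea only (it ignores `cooldown` and `min_delta`), the function name is hypothetical, and the accuracy values and starting LR are made up.

```python
def simulate_reduce_lr(val_accuracies, lr, factor=0.1, patience=2, min_lr=1e-5):
    """Return the learning rate in effect after each epoch."""
    best = float("-inf")
    wait = 0  # epochs since the last improvement
    schedule = []
    for acc in val_accuracies:
        if acc > best:
            best = acc
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)  # never go below min_lr
                wait = 0
        schedule.append(lr)
    return schedule

# Hypothetical validation accuracies with two plateaus.
val_accuracies = [0.80, 0.85, 0.85, 0.84, 0.86, 0.86, 0.85]
schedule = simulate_reduce_lr(val_accuracies, lr=1e-3)
print(schedule)
```

With these numbers the LR drops after the 4th epoch and again after the 7th, where it reaches `min_lr`.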

In [None]:
callbacks = [earlystop, learning_rate_reduction]

## Data Augmentation to Prevent Overfitting
<a id= "aug"> </a>

Data augmentation helps prevent overfitting by exposing the model to randomly altered versions of the images (zooms, flips, rotations etc.). This is what happens during augmentation:
- A batch of images is taken for training. 
- The generator applies random transformations to each image in the batch.
- The original batch is replaced with the new, randomly transformed batch.
- During each epoch, a different random variation of each image is used for training.<br><br>
Key Points:
- The total number of samples does not increase or decrease due to data augmentation.
- Augmentation helps the model generalise. Instead of memorising a fixed image, the model learns from a differently transformed version of it in each epoch.
- `flow()` , `flow_from_directory()`, `flow_from_dataframe()` : One of these three functions can be used to transform images. When there are separate subfolders for each category (e.g. cat images in a Cat folder and dog images in a Dog folder), we use `flow_from_directory()`. If a single folder contains all the images, we can use `flow_from_dataframe()` together with a DataFrame that lists each filename and its label. 
- `fit()` (of the data generator, not of the model): Fits the data generator to some sample data. This computes the internal data stats related to the data-dependent transformations, based on an array of sample data. Only required if featurewise_center or featurewise_std_normalization or zca_whitening are set to True. When rescale is set to a value, rescaling is applied to sample data before computing the internal data stats.
- Data augmentation is applied only to the data we train the model on (train_images); we do not apply it to the test dataset (test_images) or to the validation dataset (in case there are 3 sets).
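
The first two key points can be illustrated with a rough NumPy stand-in for the generator. The function below is hypothetical and only covers a flip and a horizontal shift; it uses `np.roll`, which wraps pixels around the edge, whereas Keras fills the gap according to `fill_mode`.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def random_flip_and_shift(batch, max_shift_frac=0.1):
    """Return a randomly transformed copy of a batch shaped (N, H, W, C)."""
    out = np.empty_like(batch)
    width = batch.shape[2]
    max_dx = int(width * max_shift_frac)
    for i, image in enumerate(batch):
        if rng.random() < 0.5:
            image = image[:, ::-1, :]            # horizontal flip
        dx = int(rng.integers(-max_dx, max_dx + 1))
        out[i] = np.roll(image, dx, axis=1)      # shift along the width axis
    return out

# Hypothetical batch of four 20x20 RGB images with random pixel values.
batch = rng.random((4, 20, 20, 3)).astype("float32")
epoch_1 = random_flip_and_shift(batch)
epoch_2 = random_flip_and_shift(batch)

print(batch.shape, epoch_1.shape, epoch_2.shape)  # the batch size never changes
```

Every epoch sees the same number of images, but usually a different variant of each one.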

**Training Generator**

For training images, we apply the following augmentations:

- Randomly rotate training images by up to 15 degrees
- Randomly shear training images (`shear_range=0.1`)
- Randomly zoom training images by up to 20%
- Randomly flip images horizontally
- Randomly shift images horizontally by up to 10% of the width
- Randomly shift images vertically by up to 10% of the height

**Validation Generator**

For Validation images, we will only apply Normalization. <br>
Normalization ensures that each input parameter (pixel, in this case) has a similar data distribution. This makes convergence faster while training the network. <br>
A good read to understand the importance of Image Normalization:<br>
https://stats.stackexchange.com/questions/211436/why-normalize-images-by-subtracting-datasets-image-mean-instead-of-the-current
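
As a small illustration, `rescale=1./255` amounts to multiplying every pixel by 1/255 so that values land in the range [0, 1]. The pixel values below are made up.

```python
import numpy as np

# Hypothetical 8-bit pixel values covering the full range.
pixels = np.array([[0, 128, 255]], dtype="uint8")

# Same arithmetic as rescale=1./255: convert to float, then scale.
scaled = pixels.astype("float32") * (1.0 / 255)

print(scaled.min(), scaled.max())
```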

## Check Sample Image after augmentation

## Fit Model
<a id= "fit"> </a>

- When using a generator for inputs, we do not need to specify (train_X, train_Y) separately. The model automatically takes train_Y from the generator object. This applies to both the training and validation datasets. If we are not using a generator or DataFrame, then we need to supply the input in the form (train_X, train_Y).
- steps_per_epoch: Integer or None. Total number of steps (batches of samples) before declaring one epoch finished and starting the next epoch. When training with input tensors such as TensorFlow data tensors, the default None is equal to the number of samples in your dataset divided by the batch size, or 1 if that cannot be determined.
- If the input to fit() is a generator, we cannot use validation_split. Therefore, we need separate train and validation sets for the fit() method, which we can get either by reading the train and test datasets from separate directories (as in our case) or by using the sklearn train_test_split method.
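
The number of steps in one epoch is simply the number of batches needed to cover the dataset once. A minimal sketch, using a hypothetical sample count of 1000:

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Batches needed to see every sample once; the last batch may be smaller."""
    return math.ceil(num_samples / batch_size)

print(steps_per_epoch(1000, 32))  # 32: 31 full batches plus one batch of 8
```

For the generators used in this notebook, `len(train_generator)` should report the same kind of count, so `steps_per_epoch` can usually be left unset.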

In [None]:
hist = model.fit(
    train_generator,
    epochs=EPOCHS,  # constant defined above (currently 5)
    validation_data=test_generator,
    callbacks=callbacks
)

### Save the Model

In [None]:
model.save('deepdetectv4.h5')

## Model Evaluation
<a id= "eval"> </a>

In [None]:
from tensorflow import keras 
model3 = keras.models.load_model("/kaggle/working/deepdetectv4.h5")  # the file written by model.save() above

Plotting Loss and Accuracy graphs for Training and Validation datasets
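
One possible way to draw these curves is sketched below. The `plot_history` helper is hypothetical, and the numbers in `example_history` are made up purely so the snippet runs on its own; in the notebook you would pass `hist.history`, which is assumed to contain the keys `loss`, `val_loss`, `accuracy` and `val_accuracy` (the names TF2 uses with `metrics=['accuracy']`).

```python
import matplotlib.pyplot as plt

def plot_history(history):
    """Plot training/validation loss and accuracy from a Keras-style history dict."""
    epochs = range(1, len(history["loss"]) + 1)
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(12, 4))

    ax_loss.plot(epochs, history["loss"], label="Training loss")
    ax_loss.plot(epochs, history["val_loss"], label="Validation loss")
    ax_loss.set_xlabel("Epoch")
    ax_loss.set_ylabel("Loss")
    ax_loss.legend()

    ax_acc.plot(epochs, history["accuracy"], label="Training accuracy")
    ax_acc.plot(epochs, history["val_accuracy"], label="Validation accuracy")
    ax_acc.set_xlabel("Epoch")
    ax_acc.set_ylabel("Accuracy")
    ax_acc.legend()

    fig.tight_layout()
    return fig

# Made-up numbers for illustration only; in the notebook, pass hist.history instead.
example_history = {
    "loss": [0.69, 0.55, 0.48],
    "val_loss": [0.66, 0.58, 0.54],
    "accuracy": [0.55, 0.72, 0.78],
    "val_accuracy": [0.60, 0.70, 0.74],
}
fig = plot_history(example_history)
```

A widening gap between the training and validation curves is a common sign of overfitting.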

**Some of the resources I learned CNNs from, which made this notebook possible:**<br>
https://www.kaggle.com/code/yassineghouzam/introduction-to-cnn-keras-0-997-top-6 <br>
https://www.kaggle.com/code/uysimty/keras-cnn-dog-or-cat-classification <br>
https://www.kaggle.com/code/rajmehra03/flower-recognition-cnn-keras <br>
https://www.kaggle.com/code/vincee/intel-image-classification-cnn-keras/notebook <br>
https://www.geeksforgeeks.org/ <br>

In [None]:
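img = np.array(load_img("/kaggle/input/deepfake-and-real-images/Dataset/Validation/Real/real_10.jpg", target_size=IMAGE_SIZE)) / 255.0  # predict() needs a rescaled pixel array, not a file path
model3.predict(img[np.newaxis, ...])  # add the batch axis -> shape (1, 224, 224, 3)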
