<a href="https://www.kaggle.com/code/rautaishwarya/cat-vs-dog-classification-using-cnn?scriptVersionId=138466876" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Cat vs Dog Classification

This notebook demonstrates the steps involved in building a cat vs dog image classification model using a Convolutional Neural Network (CNN).

### Step 1: Dataset Preparation

The first step is to prepare the dataset for training and testing the model.

1. Download the dataset containing images of cats and dogs.
2. Extract the contents of the dataset to a directory on your local machine.
3. Split the labelled images into training and validation sets; the unlabelled test images are kept for prediction.

### Step 2: Model Architecture

Next, we define the architecture of the CNN model that will be used for cat vs dog classification.

1. Create a sequential model.
2. Add convolutional layers with batch normalization, pooling, and dropout.
3. Flatten the feature maps and add fully connected layers.
4. Compile the model with appropriate loss function, optimizer, and evaluation metric.

### Step 3: Data Augmentation

To improve the model's performance and generalization, data augmentation techniques are applied to the training set.

1. Define an image data generator for data augmentation.
2. Specify the augmentation options, such as rotation, rescaling, shearing, and flipping.
3. Generate augmented training data using the data generator.

### Step 4: Training the Model

In this step, we train the CNN model using the augmented training data.

1. Set the number of training epochs and batch size.
2. Fit the model on the training data and validate it on the validation data.
3. Monitor the training progress and evaluate the model's performance.

### Step 5: Evaluating the Model

After training the model, we evaluate its performance on the testing set.

1. Prepare the testing data (rescaling only, no augmentation).
2. Use the trained model to make predictions on the testing data.
3. Evaluate the model's accuracy and other performance metrics.

### Step 6: Predictions

Finally, we can use the trained model to make predictions on new, unseen images.

1. Load and preprocess the new image(s).
2. Feed the image(s) to the model for prediction.
3. Interpret the model's output and determine if the image contains a cat or a dog.



In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

# Import Libraries

In [None]:
import os
import shutil
import zipfile
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, BatchNormalization, MaxPooling2D, Dropout, Flatten, Dense
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from PIL import Image

# Dataset Preparation

## Setting Image Dimensions and Channels

In this step, we define the dimensions and channels of the input images that will be used for cat vs dog classification.

### Image Width and Height

The image width and height determine the size of the input images that the model will process. In this case, we set the dimensions to 128x128 pixels. Adjusting the image dimensions can impact the model's performance, as larger images may require more computational resources and training time.

### Image Channels

Images can have different color channels, such as grayscale (1 channel) or RGB (3 channels). Here, we specify that the input images have 3 channels, which correspond to the red, green, and blue color channels. This is commonly used in color image classification tasks.

### Image Size

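The image size is a tuple containing the width and height values, representing the final dimensions of the images. In this case, the image size is set to (128, 128) pixels. Keras' `target_size` argument expects `(height, width)`, but because the images are square the order makes no difference here.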

Defining the image dimensions and channels is an important step to ensure consistency in the input data for the model. It enables proper processing and analysis of the images during training and inference.
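
As a rough illustration of why image size affects cost, the sketch below counts the values in one input image and the memory used by one batch. The batch size of 25 and the 4-byte `float32` storage are assumptions made for this example, and the `example_` names are hypothetical.

```python
# Illustrative arithmetic only; batch size and float32 storage are assumptions.
example_width, example_height, example_channels = 128, 128, 3
example_batch_size = 25
bytes_per_value = 4  # float32

# Number of values the network receives for a single image
values_per_image = example_width * example_height * example_channels

# Memory needed to hold one batch of input images
batch_bytes = example_batch_size * values_per_image * bytes_per_value

print(values_per_image)
print(batch_bytes / 1e6, "MB per batch")
```

Doubling both sides to 256x256 would quadruple both numbers, which is why larger inputs need more memory and training time.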

In [None]:
# Set image dimensions and channels
image_width = 128
image_height = 128
image_channels = 3
image_size = (image_width, image_height)

Extract the data from the zip files.

In [None]:
# Function to create a directory
def make_dir(dir_path):
    if os.path.exists(dir_path):
        shutil.rmtree(dir_path)
    os.makedirs(dir_path)

In [None]:
# Set dataset path
dataset_path = '../output/dogs-vs-cats'

# Create dataset directory
make_dir(dataset_path)

# Extract train.zip to dataset_path
with zipfile.ZipFile("../input/dogs-vs-cats/train.zip", "r") as zip_ref:
    zip_ref.extractall(dataset_path)

# Extract test1.zip to dataset_path
with zipfile.ZipFile("../input/dogs-vs-cats/test1.zip", "r") as zip_ref:
    zip_ref.extractall(dataset_path)

# Get paths to train and test directories
train_data = os.path.join(dataset_path, "train")
test_data = os.path.join(dataset_path, "test1")

# List the files once so that filenames and labels stay aligned
train_files = os.listdir(train_data)
test_files = os.listdir(test_data)

In [None]:
# Create a list to store categories
categories = []

# Iterate over train files and assign categories (1 = dog, 0 = cat)
for filename in train_files:
    category = filename.split(".")[0]
    if category == "dog":
        categories.append(1)
    else:
        categories.append(0)

# Create a DataFrame to store filenames and categories
df = pd.DataFrame({"filename": train_files, "category": categories})

Display some images

In [None]:
Image.open(os.path.join(train_data, train_files[10]))

In [None]:
Image.open(os.path.join(test_data, test_files[30]))
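
As a quick sanity check on the labelling step above, the filename rule can be tried in isolation. This is a minimal sketch that assumes training files are named like `cat.0.jpg` and `dog.0.jpg`. The helper `label_from_filename` is hypothetical, and unlike the loop above it raises an error for unexpected names instead of labelling them as cats.

```python
def label_from_filename(filename):
    """Return 1 for 'dog.*' files and 0 for 'cat.*' files; raise on anything else."""
    prefix = filename.split(".")[0]
    if prefix == "dog":
        return 1
    if prefix == "cat":
        return 0
    raise ValueError(f"unexpected filename: {filename!r}")

# Hypothetical filenames, assumed to follow the '<class>.<number>.jpg' pattern
example_names = ["cat.0.jpg", "dog.0.jpg", "dog.12499.jpg"]
example_labels = [label_from_filename(name) for name in example_names]
print(example_labels)
```

Failing loudly on an unknown prefix makes stray files visible early, at the cost of stopping the run.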

# Visualization

In [None]:
# Map the numeric labels to names so that each pie slice is labelled correctly
df["category"].map({0: "Cat", 1: "Dog"}).value_counts().plot.pie(autopct='%1.2f%%')
plt.show()

## Creating the CNN Model

In this step, we create a Convolutional Neural Network (CNN) model for cat vs dog classification.

A CNN is a deep learning architecture designed for image processing tasks. It consists of multiple layers that extract features from images and make predictions based on those features.

### Model Architecture

The CNN model is created using the Sequential API from the Keras library. Here is the architecture of the model:

1. **Convolutional Layers:** The model has three convolutional blocks. Each starts with a `Conv2D` layer with a kernel size of (3, 3) and the ReLU activation function; the layers use 32, 64, and 128 filters, respectively. Each layer convolves its filters over its input, capturing local patterns and features.

2. **Batch Normalization:** After each convolutional layer, a `BatchNormalization` layer is added. Batch normalization normalizes the activations of the previous layer, making the training process more stable and accelerating convergence.

3. **Max Pooling Layers:** Following each batch normalization layer, a `MaxPooling2D` layer is included with a pool size of (2, 2). Max pooling reduces the spatial dimensions of the feature maps, focusing on the most important information while reducing the computational complexity.

4. **Dropout Layers:** To prevent overfitting, `Dropout` layers are added after each max pooling layer (with rates of 0.1, 0.25, and 0.1). Dropout randomly sets a fraction of the input units to 0 during training, which helps to prevent the model from relying too much on specific features.

5. **Flatten Layer:** After the last dropout layer, a `Flatten` layer is used to flatten the 2D feature maps into a 1D vector. This prepares the data for the fully connected layers.

6. **Dense Layers:** Two fully connected `Dense` layers are added. The first has 512 units and uses the ReLU activation function; it is followed by batch normalization and dropout with a rate of 0.5. The second has 2 units (corresponding to the cat and dog classes) and uses the softmax activation function to output the probabilities for each class.
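
To see how the layers above shrink the feature maps, the sketch below traces the spatial size through three convolution and pooling blocks. It assumes the Keras defaults (`padding="valid"` and stride 1 for `Conv2D`, and a pooling stride equal to the pool size); the helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    # 'valid' padding, stride 1: the kernel must fit entirely inside the input
    return size - kernel + 1

def pool_out(size, pool=2):
    # non-overlapping pooling drops any leftover row or column
    return size // pool

size = 128
block_shapes = []
for filters in (32, 64, 128):
    size = pool_out(conv_out(size))
    block_shapes.append((size, size, filters))

# Length of the vector produced by Flatten, and parameters of the 512-unit Dense layer
flatten_units = block_shapes[-1][0] * block_shapes[-1][1] * block_shapes[-1][2]
dense_params = flatten_units * 512 + 512  # weights plus biases

print(block_shapes)
print(flatten_units, dense_params)
```

Under these assumptions most of the model's parameters sit in the first dense layer, which is one reason dropout is strongest there.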

### Model Compilation

Before training the model, we need to compile it by specifying the loss function, optimizer, and evaluation metrics. Here are the compilation settings:

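- **Loss Function:** The `categorical_crossentropy` loss function is used because the labels are one-hot encoded over two classes (cat and dog) and the output layer is a 2-unit softmax. This is a binary classification problem expressed in multi-class form.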

- **Optimizer:** The Adam optimizer is used, an efficient optimization algorithm that adapts the step size of each parameter during training.

- **Metrics:** We specify `accuracy` as the evaluation metric to monitor the model's performance during training and evaluation.

Compiling the model prepares it for training by configuring the necessary settings for the optimization process.
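
To make the loss concrete, here is a minimal pure-Python sketch of softmax followed by categorical cross-entropy for one sample. The logits are made up, and the index order (0 = cat, 1 = dog) is an assumption for illustration; Keras computes the same quantity over whole batches.

```python
import math

def softmax(logits):
    # Subtract the maximum for numerical stability before exponentiating
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot, probs):
    # Only the true class contributes: -log(probability assigned to it)
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)

probs = softmax([2.0, 0.0])                             # model leans towards class 0
loss_if_cat = categorical_crossentropy([1, 0], probs)   # true class 0
loss_if_dog = categorical_crossentropy([0, 1], probs)   # true class 1

print(probs, loss_if_cat, loss_if_dog)
```

The loss is small when the model puts high probability on the true class and grows as that probability shrinks.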

In [None]:
# Create the CNN model
model = Sequential()
model.add(Conv2D(32, (3, 3), activation="relu", input_shape=(image_width, image_height, image_channels)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.1))

model.add(Conv2D(64, (3, 3), activation="relu"))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(128, (3, 3), activation="relu"))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.1))

model.add(Flatten())
model.add(Dense(512, activation="relu"))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(2, activation="softmax"))

# Compile the model
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=['accuracy'])

In [None]:
# Print model summary
model.summary()

## Early Stopping and Learning Rate Reduction Callbacks

In this step, we set up two callbacks to monitor the training progress and make adjustments during the training process: Early Stopping and Learning Rate Reduction.

### Early Stopping

Early stopping is a technique used to prevent overfitting and find the optimal number of training epochs. It monitors a specified metric (here the validation loss, which is the `EarlyStopping` default because no `monitor` argument is passed) and stops the training process if the metric doesn't improve for a certain number of epochs.

Here's how we set up the Early Stopping callback:

In [None]:
early_stop = EarlyStopping(patience=10)

- `patience=10` indicates that if the monitored metric (the validation loss, `val_loss`, by default) doesn't improve for 10 consecutive epochs, the training process will be stopped. This helps prevent overfitting by avoiding unnecessary training epochs when the model's performance plateaus.
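
The patience rule can be sketched in a few lines of plain Python. This is a simplified illustration rather than the Keras implementation: it ignores options such as `min_delta` and `restore_best_weights`, the function name is hypothetical, and the loss values are made up.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None if it never stops."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss   # improvement: remember it and reset the counter
            wait = 0
        else:
            wait += 1     # no improvement this epoch
            if wait >= patience:
                return epoch
    return None

# Made-up validation losses, for illustration only
example_losses = [0.70, 0.62, 0.58, 0.59, 0.60, 0.61]
print(early_stopping_epoch(example_losses, patience=3))
print(early_stopping_epoch(example_losses, patience=10))
```

A patience that is as large as the number of epochs can never trigger, so the two settings need to be chosen together.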

### Learning Rate Reduction

Learning rate reduction is a technique used to adjust the learning rate during training, which can help the model converge faster and improve generalization. It reduces the learning rate if the validation metric (in this case, validation accuracy) doesn't improve for a certain number of epochs.

Here's how we set up the Learning Rate Reduction callback:

In [None]:
learning_rate_reduction = ReduceLROnPlateau(monitor="val_accuracy", patience=10)

- `monitor="val_accuracy"` specifies that the callback should monitor the validation accuracy to decide when to reduce the learning rate.

- `patience=10` indicates that if the validation accuracy doesn't improve for 10 consecutive epochs, the learning rate will be reduced (by the default `factor` of 0.1).

By using these callbacks, we can automatically stop the training process when it's no longer improving and adjust the learning rate to facilitate better convergence. These techniques help us achieve better model performance and avoid overfitting.
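
The learning rate reduction rule can be illustrated the same way. The sketch below is a simplification, not the Keras code: it ignores `min_delta`, `cooldown`, and `min_lr`, and it assumes a starting rate of 0.001 (Adam's default in Keras) and a `factor` of 0.1. The function name and accuracy values are hypothetical.

```python
def lr_schedule_on_plateau(val_accuracies, initial_lr=0.001, factor=0.1, patience=10):
    """Return the learning rate in effect after each epoch."""
    lr = initial_lr
    best = float("-inf")
    wait = 0
    rates = []
    for accuracy in val_accuracies:
        if accuracy > best:
            best = accuracy
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                lr *= factor  # plateau detected: shrink the learning rate
                wait = 0
        rates.append(lr)
    return rates

# Made-up validation accuracies, for illustration only
example_accuracies = [0.60, 0.65, 0.65, 0.64, 0.66]
example_rates = lr_schedule_on_plateau(example_accuracies, patience=2)
print(example_rates)
```

With a short patience the rate drops after two flat epochs; with the notebook's larger patience the same history would leave it unchanged.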

## Preparing Data for Training and Validation

In this section, we perform the following steps to prepare our data for training and validation:

### Replace Category Values

The original category values in the DataFrame `df` are represented as integers: 0 for "cat" and 1 for "dog". To make it more readable and intuitive, we replace the integer values with their corresponding labels:

In [None]:
df["category"] = df["category"].replace({0: 'cat', 1: 'dog'})

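This step ensures that the category values are represented as strings. That helps during data exploration, and it is also required later: `flow_from_dataframe` with `class_mode="categorical"` expects string labels in the `y_col` column.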

### Split DataFrame into Train and Validation Sets

Next, we split the DataFrame `df` into training and validation sets using the `train_test_split` function from the `sklearn.model_selection` module. The splitting is performed based on a specified test size (in this case, 20% of the data) and a random seed (random_state) for reproducibility:

In [None]:
train_df, validate_df = train_test_split(df, test_size=0.2, random_state=7)

The resulting `train_df` and `validate_df` DataFrames contain the training and validation samples, respectively.
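
To show what a seeded split means, here is a small standard-library sketch. It is not scikit-learn's algorithm and will not reproduce the same rows as `train_test_split`; it only illustrates that a fixed seed yields the same split on every run. The function name is hypothetical.

```python
import random

def split_indices(n_samples, test_fraction, seed):
    """Shuffle indices 0..n_samples-1 with a fixed seed and cut off a validation portion."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

train_idx, val_idx = split_indices(10, 0.2, seed=7)
repeat_train, repeat_val = split_indices(10, 0.2, seed=7)

print(len(train_idx), len(val_idx))
print(train_idx == repeat_train and val_idx == repeat_val)
```

Changing the seed changes which samples land in each set, but the proportions stay the same.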

### Reset Index for Train and Validation DataFrames

After splitting the DataFrame, the index values may not be sequential. To ensure that the index is reset and starts from 0 for both the `train_df` and `validate_df` DataFrames, we use the `reset_index` method with the `drop=True` parameter:

In [None]:
train_df = train_df.reset_index(drop=True)
validate_df = validate_df.reset_index(drop=True)

This step ensures that the index values reflect the new order of the samples after the split.

### Set Total Number of Training and Validation Samples

Finally, we calculate and store the total number of training and validation samples for later use. The `shape[0]` attribute of a DataFrame returns the number of rows (samples), so we assign these values to the variables `total_train` and `total_validate`, respectively:

In [None]:
total_train = train_df.shape[0]
total_validate = validate_df.shape[0]

These variables will be used to specify the steps per epoch and validation steps during the model training process.
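
As a worked example of how these counts turn into steps, the sketch below assumes 25,000 labelled images, the 80/20 split described above, and a batch size of 25. The image count is an assumption, not a value read from the data, and the `example_` names are hypothetical.

```python
# Assumed figures for illustration: 25,000 labelled images, 80/20 split, batch size 25
example_total = 25_000
example_validate = example_total * 20 // 100
example_train = example_total - example_validate
example_batch = 25

# Integer division: any incomplete final batch is skipped
steps_per_epoch = example_train // example_batch
validation_steps = example_validate // example_batch
leftover_train = example_train % example_batch

print(example_train, example_validate)
print(steps_per_epoch, validation_steps, leftover_train)
```

When the sample count is not a multiple of the batch size, integer division leaves a few samples unused in each epoch.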

## Data Augmentation for Training Images

We set the batch size to 25, which determines the number of samples processed in each training batch. Then, we create an `ImageDataGenerator` instance called `train_datagen` to perform data augmentation on the training images. This includes transformations such as rotation, shearing, zooming, horizontal flipping, and shifting, together with rescaling the pixel values by 1/255. Finally, we use the `flow_from_dataframe` method to generate augmented training data by providing the training DataFrame, directory path of the training images, column names for filenames and categories, target image size, class mode, and batch size. Data augmentation helps increase training data diversity and reduces the risk of overfitting.

In [None]:
# Set batch size
batch_size = 25

# Set up data augmentation for training images
train_datagen = ImageDataGenerator(
    rotation_range=15,
    rescale=1./255,
    shear_range=0.1,
    zoom_range=0.2,
    horizontal_flip=True,
    width_shift_range=0.1,
    height_shift_range=0.1
)

# Generate augmented training data
train_generator = train_datagen.flow_from_dataframe(
    train_df,
    train_data,
    x_col='filename',
    y_col='category',
    target_size=image_size,
    class_mode="categorical",
    batch_size=batch_size
)
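
To give a feel for two of these transformations, here is a toy sketch of rescaling and horizontal flipping on a tiny grayscale image stored as nested lists. It only illustrates the idea: `ImageDataGenerator` applies such transformations randomly to real image arrays. The helper names and pixel values are hypothetical.

```python
def rescale(image, factor=1 / 255):
    # Multiply every pixel by the factor, mapping 0..255 to roughly 0..1
    return [[pixel * factor for pixel in row] for row in image]

def horizontal_flip(image):
    # Mirror each row left to right
    return [row[::-1] for row in image]

# A made-up 2x3 grayscale image
tiny_image = [[0, 51, 255],
              [102, 204, 0]]

flipped = horizontal_flip(tiny_image)
scaled = rescale(tiny_image)

print(flipped)
print(scaled)
```

A flipped cat is still a cat, which is why such label-preserving changes can add variety to the training data.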

We create a second `ImageDataGenerator` instance called `validation_datagen` for the validation images. It applies no augmentation: it only rescales the pixel values by 1/255, the same normalization used for the training images, so that validation measures performance on unmodified images. We then call `flow_from_dataframe` with the validation DataFrame, the directory path of the training images (the validation images were split off from that folder), the column names for filenames and categories, the target image size, the class mode, and the batch size.

In [None]:
# Only rescale validation images (no augmentation)
validation_datagen = ImageDataGenerator(rescale=1./255)

# Generate validation data
validation_generator = validation_datagen.flow_from_dataframe(
    validate_df,
    train_data,
    x_col='filename',
    y_col='category',
    target_size=image_size,
    class_mode='categorical',
    batch_size=batch_size
)

We create an `ImageDataGenerator` instance called `test_datagen` that rescales the pixel values of the testing images by 1/255, so they are normalized the same way as the training and validation images.

Then we call `flow_from_directory` with the target image size, the batch size, `class_mode=None` (the test images have no labels), and `shuffle=False` so the images keep their order. Because `flow_from_directory` only looks for images inside subdirectories of the directory it is given, we pass the dataset directory and select the `test1` subfolder through `classes`.

No augmentation is applied here; the goal is only to keep preprocessing consistent between the training, validation, and testing images, making the predictions on the testing data more reliable.

In [None]:
# Only rescale testing images (no augmentation)
test_datagen = ImageDataGenerator(rescale=1./255)

# Generate testing data; images must sit in a subdirectory of the given path
test_generator = test_datagen.flow_from_directory(
    dataset_path,
    classes=["test1"],
    target_size=image_size,
    batch_size=batch_size,
    class_mode=None,
    shuffle=False
)

In [None]:
# Re-create both callbacks with patience=5; these replace the patience=10
# versions defined earlier and are the ones passed to model.fit
early_stop = EarlyStopping(patience=5)
learning_rate_reduction = ReduceLROnPlateau(monitor="val_accuracy", patience=5)
callbacks = [early_stop, learning_rate_reduction] 

We set the number of epochs to 10. Then we use the `fit` method of the model to train the model using the training data generated by `train_generator`.

We pass the following parameters to the `fit` method:
- `epochs`: The number of epochs to train the model.
- `validation_data`: The validation data generated by `validation_generator`.
- `validation_steps`: The number of steps to validate the model per epoch. In this case, we divide the total number of validation samples by the batch size.
- `steps_per_epoch`: The number of steps to take per epoch during training. In this case, we divide the total number of training samples by the batch size.
- `callbacks`: The list of callbacks to be applied during training. This includes the early stopping and learning rate reduction callbacks defined earlier.

The training process updates the model's weights, monitors the validation loss to decide whether early stopping should be applied, and monitors the validation accuracy to decide whether the learning rate should be reduced. The training progress and evaluation metrics will be stored in the `history` object.

In [None]:
epochs = 10
history = model.fit(
    train_generator, 
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=total_validate // batch_size,
    steps_per_epoch=total_train // batch_size,
    callbacks=callbacks
)
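
Once training finishes, `history.history` is a dictionary of per-epoch lists. The sketch below shows one way to read the best epoch from such a dictionary. The numbers are made up to show the structure and are not results from this notebook, and the helper `best_epoch` is hypothetical.

```python
def best_epoch(history_dict, metric="val_accuracy"):
    """Return (1-based epoch, value) for the highest value of the given metric."""
    values = history_dict[metric]
    best_index = max(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]

# Made-up numbers in the shape of `history.history`; not results from this notebook
example_history = {
    "accuracy": [0.61, 0.70, 0.75],
    "val_accuracy": [0.64, 0.72, 0.71],
    "loss": [0.68, 0.57, 0.50],
    "val_loss": [0.63, 0.55, 0.56],
}

epoch, value = best_epoch(example_history)
print(epoch, value)
```

With a real run, the same call would be `best_epoch(history.history)`; the same lists can also be passed to `plt.plot` to draw training curves.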

In [None]:
model.save("model1_catsVSdogs.h5")

# Model Evaluation Plot

The code retrieves the filenames of the test images, creates a DataFrame to store the filenames, and calculates the number of test samples or images in the test dataset.

In [None]:
test_filenames = os.listdir(test_data)
test_df = pd.DataFrame({
    'filename': test_filenames
})
samples = test_df.shape[0]

In [None]:
test_gen = ImageDataGenerator(rescale=1./255)
test_generator = test_gen.flow_from_dataframe(
    test_df,
    test_data, 
    x_col='filename',
    y_col=None,
    class_mode=None,
    target_size=image_size,
    batch_size=batch_size,
    shuffle=False
)

Predict the classes of the test images by passing the test generator to the model.predict() function.

In [None]:
predict = model.predict(test_generator, steps=int(np.ceil(samples / batch_size)))

In [None]:
# Add predicted classes to test_df
test_df["predict"]=np.argmax(predict,axis=-1)
test_df

In [None]:
# Replace classes
test_df["predict"]=test_df["predict"].replace({1:"Dog",0:"Cat"})
test_df
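
The two cells above pick the most probable class and map it to a name. A minimal pure-Python sketch of that decoding step follows, assuming index 0 corresponds to cat and index 1 to dog, as in the mapping above. The probability rows are made up and the names `CLASS_NAMES` and `decode` are hypothetical.

```python
CLASS_NAMES = ["Cat", "Dog"]  # assumed order: index 0 = cat, index 1 = dog

def decode(probabilities):
    # Index of the largest probability; ties go to the first index, like np.argmax
    index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return CLASS_NAMES[index]

# Made-up softmax outputs, one row per image
example_outputs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
decoded = [decode(row) for row in example_outputs]
print(decoded)
```

The last row shows that a 50/50 output still receives a label, so the raw probabilities are worth keeping when confidence matters.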

In [None]:
test_df["predict"].value_counts().plot(kind="bar")

In [None]:
submission_df = test_df.copy()
submission_df['id'] = submission_df['filename'].str.split('.').str[0]
submission_df['label'] = submission_df['predict']
submission_df.drop(['filename', 'predict'], axis=1, inplace=True)
submission_df.to_csv('submission.csv', index=False)
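
The submission step can be checked without pandas. The sketch below builds the same `id` and `label` columns using only the standard library, assuming test files are named like `1.jpg`. The inputs are hypothetical and the helper `submission_rows` is not part of the notebook.

```python
import csv
import io

def submission_rows(filenames, labels):
    # The id is the part of the filename before the first dot
    return [(name.split(".")[0], label) for name, label in zip(filenames, labels)]

# Hypothetical filenames and predictions
rows = submission_rows(["1.jpg", "2.jpg"], ["Dog", "Cat"])

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["id", "label"])
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Writing to an in-memory buffer makes it easy to inspect the output format before creating the real file.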

In summary, this notebook builds a CNN model for cat vs dog classification. It sets the image dimensions, creates and compiles the model architecture, sets up callbacks for early stopping and learning rate reduction, prepares the training and validation data with augmentation for the training images, trains the model, predicts labels for the test images, and writes the predictions to a submission file.

## Thank You!!!