# Cats vs Dogs

### 1. Introduction & Objectives

In this project, we will build a Convolutional Neural Network (CNN) to classify images of cats and dogs. The goal is a model that can accurately distinguish between the two. The full dataset contains 25,000 images: 12,500 of cats and 12,500 of dogs. We will work with a 5,000-image portion of it, divided into training, validation, and test sets so that the model is tuned and evaluated on images it was not trained on.

We will use Keras, with TensorFlow as the backend, to implement and train the CNN. After training, we will evaluate the model’s performance on the separate test set to measure how well it generalizes to unseen images.

### 2. Data Understanding

The dataset consists of 25,000 images, with an equal split between cats and dogs (12,500 images of each). To facilitate model training and evaluation, we will use the following subsets:

- Training set: 1,000 images of cats and 1,000 images of dogs.
- Validation set: 500 images of cats and 500 images of dogs.
- Test set: 1,000 images of cats and 1,000 images of dogs.

We will use a Python script to create these subsets from the original dataset, as shown below:

In [5]:
import os
import pathlib
import shutil

original_dir = pathlib.Path("../Inputs/dogs-vs-cats-orig/train")
new_base_dir = pathlib.Path("../Inputs/cats_vs_dogs_data")


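def make_subset(subset_name, start_index, end_index):
    # Copy images with indices in [start_index, end_index) for each category.
    for category in ("cat", "dog"):
        # Avoid shadowing the built-in `dir`.
        subset_dir = new_base_dir / subset_name / category
        # exist_ok=True lets the cell be re-run without raising FileExistsError.
        os.makedirs(subset_dir, exist_ok=True)
        fnames = [f"{category}.{i}.jpg"
                  for i in range(start_index, end_index)]
        for fname in fnames:
            shutil.copyfile(src=original_dir / fname,
                            dst=subset_dir / fname)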


make_subset("train", start_index=0, end_index=1000)
make_subset("validation", start_index=1000, end_index=1500)
make_subset("test", start_index=1500, end_index=2500)

This ensures the dataset is structured for training, validation, and testing phases, with a balanced distribution of cat and dog images in each subset.
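As a quick sanity check on the copy step, something like the following sketch could count the files in each subset folder. The helper name `count_subset_images` and the `expected` mapping are illustrative and not part of the original notebook; the sketch assumes the `<subset>/<category>/*.jpg` layout that `make_subset` creates.

```python
import pathlib


def count_subset_images(base_dir,
                        subsets=("train", "validation", "test"),
                        categories=("cat", "dog")):
    """Return {(subset, category): number of .jpg files} under base_dir."""
    base_dir = pathlib.Path(base_dir)
    counts = {}
    for subset in subsets:
        for category in categories:
            folder = base_dir / subset / category
            # A missing folder is reported as zero images instead of raising.
            n = len(list(folder.glob("*.jpg"))) if folder.is_dir() else 0
            counts[(subset, category)] = n
    return counts


# Images expected per category, derived from the index ranges given to
# make_subset: [0, 1000), [1000, 1500), [1500, 2500).
expected = {"train": 1000, "validation": 500, "test": 1000}
```

Calling `count_subset_images(new_base_dir)` after the copy should then report 1,000, 500, and 1,000 images per category for the three subsets; any other number would suggest a missing source file or an interrupted copy.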

#### 2.1 Importing Required Libraries and Loading the Data

We will begin by importing the necessary libraries, including Keras and TensorFlow, and using the `image_dataset_from_directory` function to load our datasets. The training, validation, and test datasets will be loaded from their respective directories, with each image being resized to 180x180 pixels and packed into batches of 32 for efficient processing.

In [6]:
import keras
from keras import layers
from keras.utils import image_dataset_from_directory
import tensorflow as tf

In [7]:
train_dataset = image_dataset_from_directory(new_base_dir / 'train', image_size=(180, 180), batch_size=32)
validation_dataset = image_dataset_from_directory(new_base_dir / 'validation', image_size=(180, 180), batch_size=32)
test_dataset = image_dataset_from_directory(new_base_dir / 'test', image_size=(180, 180), batch_size=32)

Found 2000 files belonging to 2 classes.
Found 1000 files belonging to 2 classes.
Found 2000 files belonging to 2 classes.


The output confirms that the data was loaded from the specified directories: 2,000 training images, 1,000 validation images, and 2,000 test images, each resized to 180x180 pixels and grouped into batches of 32. We can now proceed with building and training the model for our image classification task.
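To make the batching concrete, the sketch below works out how many batches each dataset yields and which integer label each class receives. It relies on two assumptions about the default behaviour of `image_dataset_from_directory`: the final, smaller batch is kept rather than dropped, and class indices follow the alphanumeric order of the subdirectory names. The helper names are illustrative only.

```python
import math


def num_batches(num_images, batch_size=32):
    """Number of batches when the final partial batch is kept."""
    return math.ceil(num_images / batch_size)


def last_batch_size(num_images, batch_size=32):
    """Size of the final batch (a full batch if the count divides evenly)."""
    remainder = num_images % batch_size
    return remainder if remainder else batch_size


# File counts reported by the loading cell above.
dataset_sizes = {"train": 2000, "validation": 1000, "test": 2000}
for name, n in dataset_sizes.items():
    print(name, num_batches(n), last_batch_size(n))
# train 63 16
# validation 32 8
# test 63 16

# Assumed label mapping: subdirectory names sorted alphanumerically.
class_indices = {name: i for i, name in enumerate(sorted(["dog", "cat"]))}
print(class_indices)  # {'cat': 0, 'dog': 1}
```

Under these assumptions a label of 1 means "dog", which is why a single sigmoid output can be read as the probability that the image shows a dog.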

### 3. Creating the model

We will build a CNN model using Keras to classify images of cats and dogs. The model starts with an input layer for images of shape (180, 180, 3), and the pixel values are rescaled to the [0, 1] range by dividing by 255. The model then applies five convolutional layers with 32, 64, 128, 256, and 256 filters respectively, all using a 3x3 kernel and ReLU activation. Each of the first four convolutional layers is followed by a max-pooling layer that downsamples the feature maps; the fifth feeds directly into a flatten layer. The output is a dense layer with a single neuron and a sigmoid activation function, which gives the probability of the image being a dog.

In [None]:
inputs = keras.Input(shape=(180, 180, 3))

x = layers.Rescaling(1. / 255)(inputs)
x = layers.Conv2D(filters=32, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=128, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x)
x = layers.Flatten()(x)

outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs=inputs, outputs=outputs)

The model stacks five convolutional layers and four max-pooling layers to extract features from the input images, followed by a single dense layer for binary classification. The next step is to compile the model and fit it to the training data for the cat vs. dog classification task.
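To see how the feature maps shrink through this architecture, the following sketch traces the spatial size and counts the trainable parameters by hand. It assumes the Keras defaults for the layers used above: "valid" padding and stride 1 for `Conv2D`, and a stride equal to the pool size for `MaxPooling2D`. The helper functions are illustrative; on a built model, `model.summary()` reports the same information.

```python
def conv_out(size, kernel=3):
    """Output width/height of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Output width/height of max pooling with stride equal to the pool size."""
    return size // pool


def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * out_channels


size, channels, total_params = 180, 3, 0
filters = [32, 64, 128, 256, 256]

for i, n_filters in enumerate(filters):
    total_params += conv_params(channels, n_filters)
    size = conv_out(size)
    channels = n_filters
    # Only the first four convolutions are followed by max pooling.
    if i < len(filters) - 1:
        size = pool_out(size)

flat_features = size * size * channels
total_params += flat_features + 1  # Dense(1): one weight per feature + bias

print(size, flat_features, total_params)  # 7 12544 991041
```

The spatial size goes 180 → 178 → 89 → 87 → 43 → 41 → 20 → 18 → 9 → 7, so the flatten layer passes 7 x 7 x 256 = 12,544 features to the final dense layer.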