<a href="https://colab.research.google.com/github/Benj-admin/Dogs_vs_cats/blob/main/Dogs_vs_cats.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#  CNN Project: Cats vs. Dogs Classification with PyTorch 🐈🐕

This project aims to build a Convolutional Neural Network (CNN) using the **PyTorch** framework to accurately distinguish between images of cats and dogs.

---



## Step 1: Kaggle Connection and Data Acquisition

This initial step configures your Google Colab environment to interact with the Kaggle API, allowing you to download the large **"Dogs vs. Cats"** dataset directly. We'll install the necessary library, securely upload your API token, and download/unzip the raw image data.

In [None]:
# Install the Kaggle API tool quietly
!pip install -q kaggle

# Create the hidden directory for Kaggle configuration
!mkdir -p ~/.kaggle

# ----------------------------------------------------------------------
# MANUAL STEP: Upload your 'kaggle.json' file to the Colab file system (left panel).
# ----------------------------------------------------------------------

# Securely copy the API token and set restrictive permissions
# 'kaggle.json' must be in the Colab root directory
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
print("Kaggle API setup complete. Starting download...")

# Download the competition dataset (contains train.zip and test1.zip)
!kaggle competitions download -c dogs-vs-cats

# Unzip the main download file
!unzip -q dogs-vs-cats.zip

# Unzip the training data file (25,000 labellées images)
!unzip -q train.zip
print("Training data downloaded and extracted into the 'train/' folder.")

# Verify the file structure
!ls train | head -n 5

Kaggle API setup complete. Starting download...
dogs-vs-cats.zip: Skipping, found more recently modified local copy (use --force to force download)
replace sampleSubmission.csv? [y]es, [n]o, [A]ll, [N]one, [r]ename: replace train/cat.0.jpg? [y]es, [n]o, [A]ll, [N]one, [r]ename: 

## Step 2: Data Organization and Preprocessing

The raw images are currently mixed in the `train/` folder. PyTorch's built-in `ImageFolder` class requires data to be organized into class-specific subfolders (here, `train_set/cats` and `train_set/dogs`).

In this step, we will:
1.  Create the necessary directory structure (`train_set` and `val_set`).
2.  Split the data (e.g., 80% for training, 20% for validation) and move the images accordingly.
3.  Define **data augmentation** techniques for the training set and standard transformations for the validation set using `torchvision.transforms`.
4.  Create the PyTorch **`DataLoader`** instances for batch-wise data access.

In [None]:
import os
import shutil
import random
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
import torch

In [None]:
# --- 1. Define Paths and Parameters ---

# Source directory containing all images
SOURCE_DIR = 'train'
# Base directory for the organized data
BASE_DIR = 'data'
TRAIN_DIR = os.path.join(BASE_DIR, 'train_set')
VAL_DIR = os.path.join(BASE_DIR, 'val_set')

# Hyperparameters for preprocessing
IMAGE_SIZE = 224 # Standard input size
BATCH_SIZE = 32
SPLIT_RATIO = 0.8 # 80% for training, 20% for validation

# Standard Normalization values derived from ImageNet
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

In [None]:

# --- 2. Create Directory Structure and Split Data ---

# Create the main directories
os.makedirs(os.path.join(TRAIN_DIR, 'cats'), exist_ok=True)
os.makedirs(os.path.join(TRAIN_DIR, 'dogs'), exist_ok=True)
os.makedirs(os.path.join(VAL_DIR, 'cats'), exist_ok=True)
os.makedirs(os.path.join(VAL_DIR, 'dogs'), exist_ok=True)

# List and shuffle all files by class
all_files = os.listdir(SOURCE_DIR)
cat_files = [f for f in all_files if f.startswith('cat')]
dog_files = [f for f in all_files if f.startswith('dog')]
random.shuffle(cat_files)
random.shuffle(dog_files)

# Split and copy utility function
def split_and_copy(files, class_name, split_ratio):
    train_split = int(len(files) * split_ratio)
    train_files = files[:train_split]
    val_files = files[train_split:]

    for f in train_files:
        shutil.copy(os.path.join(SOURCE_DIR, f), os.path.join(TRAIN_DIR, class_name, f))
    for f in val_files:
        shutil.copy(os.path.join(SOURCE_DIR, f), os.path.join(VAL_DIR, class_name, f))

# Apply the split and copy function
split_and_copy(cat_files, 'cats', SPLIT_RATIO)
split_and_copy(dog_files, 'dogs', SPLIT_RATIO)



In [None]:

print("Data split into Train and Validation sets.")
print(f"Total training images: {len(os.listdir(os.path.join(TRAIN_DIR, 'cats'))) + len(os.listdir(os.path.join(TRAIN_DIR, 'dogs')))}")
print(f"Total validation images: {len(os.listdir(os.path.join(VAL_DIR, 'cats'))) + len(os.listdir(os.path.join(VAL_DIR, 'dogs')))}")

In [None]:
# --- 3. Define Transformations and DataLoaders ---

# Transformations for training (including Data Augmentation)
train_transforms = transforms.Compose([
    transforms.Resize((IMAGE_SIZE, IMAGE_SIZE)),
    transforms.RandomRotation(15),             # Simple data augmentation, helping the model become invariant to the rotation
    transforms.RandomHorizontalFlip(),         # Simple data augmentation, helping the model become invariant to the symmetry
    transforms.ToTensor(),                     # Convert image to a PyTorch Tensor
    transforms.Normalize(mean=MEAN, std=STD)   # Normalize pixel values
])

# Transformations for validation (only resizing, ToTensor, and Normalization)
val_transforms = transforms.Compose([
    transforms.Resize((IMAGE_SIZE, IMAGE_SIZE)),
    transforms.ToTensor(),
    transforms.Normalize(mean=MEAN, std=STD)
])

# Use ImageFolder to load the dataset structure
train_data = datasets.ImageFolder(root=TRAIN_DIR, transform=train_transforms)
val_data = datasets.ImageFolder(root=VAL_DIR, transform=val_transforms)

# Create DataLoaders to iterate over batches
# num_workers > 0 speeds up loading by using multiple subprocesses
train_loader = DataLoader(train_data, batch_size=BATCH_SIZE, shuffle=True, num_workers=2)
val_loader = DataLoader(val_data, batch_size=BATCH_SIZE, shuffle=False, num_workers=2)

print("\nPyTorch DataLoaders created successfully.")
print(f"Classes: {train_data.classes}")
print(f"Number of training batches (steps per epoch): {len(train_loader)}")

##  Step 3: Building and Training the Baseline Model

In this phase, we design a simple Convolutional Neural Network (CNN) from scratch using PyTorch's `torch.nn.Module`. This initial model, often called a **baseline**, will serve as a starting point to assess the difficulty of the task and establish a benchmark performance before implementing more advanced techniques.

We will:
1.  Define the CNN architecture using convolutional layers, ReLU activation, and max-pooling.
2.  Choose the **Loss Function** (`CrossEntropyLoss`) and the **Optimizer** (`Adam`).
3.  Implement the training loop over a few epochs.

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
import time

In [None]:
# --- 1. Model Definition: Simple CNN Architecture ---

class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()

        # First Convolutional Block
        # Input: 3x224x224 (RGB image)
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3, padding=1)

        # Second Convolutional Block
        # Input: 32x112x112 (after first pooling)
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1)

        # Third Convolutional Block
        # Input: 64x56x56 (after second pooling)
        self.conv3 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, padding=1)

        #  Pooling Layer
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        # Flatten Layer
        self.flatten = nn.Flatten()

        # Linear layers for classification
        # Input: 128 * 28 * 28 (after third pooling and flatten layer)
        self.fc1 = nn.Linear(128 * 28 * 28, 512)
        # Output layer: 2 classes (cat or dog)
        self.fc2 = nn.Linear(512, 2)

    def forward(self, x):
        # 3 block: Conv -> ReLU -> Pool
        x = self.pool(nn.functional.relu(self.conv1(x))) # Size: 32x112x112
        x = self.pool(nn.functional.relu(self.conv2(x))) # Size: 64x56x56
        x = self.pool(nn.functional.relu(self.conv3(x))) # Size: 128x28x28

        # Flatten the feature maps for the dense layers
        x = self.flatten(x)

        # Dense layers
        x = nn.functional.relu(self.fc1(x))
        x = self.fc2(x) # Output logits for 2 classes
        return x

In [27]:
# --- 2. Setup Device, Model, Loss, and Optimizer ---

# Check for GPU availability and select device
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Instantiate the model and move it to the selected device
model = SimpleCNN().to(device)

# Define Loss Function
criterion = nn.CrossEntropyLoss()

# Define Optimizer
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Number of epochs to train for the baseline
NUM_EPOCHS = 5


Using device: cpu


In [None]:
# --- 3. Training Loop ---

def train_model(model, criterion, optimizer, num_epochs=NUM_EPOCHS):
    start_time = time.time()

    for epoch in range(num_epochs):
        print(f'Epoch {epoch+1}/{num_epochs}')
        print('-' * 10)

        model.train()
        running_loss = 0.0
        running_acc = 0

        # Iterate over data
        for inputs, labels in train_loader:
            inputs = inputs.to(device)
            labels = labels.to(device)

            # Zero the parameter gradients
            optimizer.zero_grad()

            # Forward pass
            outputs = model(inputs)
            _, preds = torch.max(outputs, 1) # Get predicted class
            loss = criterion(outputs, labels)

            # Backward pass and optimize
            loss.backward()
            optimizer.step()

            # Statistics
            running_loss += loss.item() * inputs.size(0)
            running_acc+= torch.sum(preds == labels.data)

        epoch_loss = running_loss / len(train_loader.dataset)
        epoch_acc = running_acc.double() / len(train_loader.dataset)

        print(f'Train Loss: {epoch_loss:.4f} Acc: {epoch_acc:.4f}')

        # VALIDATION Phase (Optional, but highly recommended)
        model.eval() # Set model to evaluation mode
        val_running_loss = 0.0
        val_running_acc = 0

        with torch.no_grad(): # Disable gradient calculations
            for inputs, labels in val_loader:
                inputs = inputs.to(device)
                labels = labels.to(device)

                outputs = model(inputs)
                _, preds = torch.max(outputs, 1)
                loss = criterion(outputs, labels)

                val_running_loss += loss.item() * inputs.size(0)
                val_running_acc += torch.sum(preds == labels.data)

        val_loss = val_running_loss / len(val_loader.dataset)
        val_acc = val_running_acc.double() / len(val_loader.dataset)

        print(f'Val Loss: {val_loss:.4f} Val Acc: {val_acc:.4f}')

    time_elapsed = time.time() - start_time
    print(f'Training complete in {time_elapsed // 60:.0f}m {time_elapsed % 60:.0f}s')

# Execute the training function
train_model(model, criterion, optimizer)

Epoch 1/5
----------
