## Dog and Cat Image Classification with Transfer Learning

This notebook demonstrates how to build an image classification model to distinguish between dogs and cats using a pre-trained ResNet50 model and transfer learning. We will cover data acquisition, preparation, model building, training, and evaluation.

### Data Acquisition

We use the 'Dog and Cat Classification Dataset' from Kaggle, fetched with the `kagglehub` package. It contains the dog and cat images used to train the model. The cells below install the required packages and import the libraries for image processing, model building, and plotting.

In [None]:
%pip install kagglehub opencv-python Pillow tensorflow matplotlib scikit-learn

In [None]:
import numpy as np
import kagglehub
import cv2
from PIL import Image
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.applications.resnet50 import preprocess_input
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
import os

### Data Configuration

We define key parameters for our image processing and model training:

- `data_dir`: Specifies the path to our dataset.
- `IMG_SIZE`: Sets the target dimensions for all images to ensure consistency.
- `BATCH_SIZE`: Determines the number of samples per gradient update during training.
- `SEED`: Ensures reproducibility of our dataset splits.

In [None]:
# Download the dataset; if the download fails, fall back to the Kaggle input path.
try:
    base_dataset_path = kagglehub.dataset_download("bhavikjikadara/dog-and-cat-classification-dataset")
except Exception:
    base_dataset_path = "/kaggle/input/dog-and-cat-classification-dataset"
# Images live in PetImages/Cat and PetImages/Dog.
data_dir = os.path.join(base_dataset_path, "PetImages")

IMG_SIZE = (224, 224)
BATCH_SIZE = 32
SEED = 42

### Explore the Dataset

We load the dataset with `tf.keras.utils.image_dataset_from_directory`, which infers class labels from the folder names (sorted alphanumerically, so `Cat` is label 0 and `Dog` is label 1). We then display five images from the first batch to confirm that images and labels load correctly.

In [None]:
dataset = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    shuffle=True,
    seed=SEED,
    image_size=IMG_SIZE,
    batch_size=BATCH_SIZE
)

class_names = dataset.class_names

plt.figure(figsize=(10, 5))

for images, labels in dataset.take(1):
    for i in range(5):
        ax = plt.subplot(1, 5, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])
        plt.axis("off")

plt.tight_layout()
plt.show()
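
The label inference described above can be illustrated without TensorFlow. The following is a minimal sketch, assuming a directory with one subfolder per class; the helper `infer_class_names` and the temporary directory are hypothetical stand-ins for illustration, not part of the Keras API.

```python
import os
import tempfile


def infer_class_names(root):
    """Return subdirectory names in sorted order; the position is the label index."""
    return sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )


# Build a throwaway directory tree shaped like PetImages/ (hypothetical stand-in).
with tempfile.TemporaryDirectory() as root:
    for folder in ("Dog", "Cat"):
        os.makedirs(os.path.join(root, folder))
    demo_class_names = infer_class_names(root)

# Map each class name to the integer label it would receive.
label_of = {name: index for index, name in enumerate(demo_class_names)}
print(demo_class_names, label_of)
```

Because the order is alphabetical rather than creation order, `Dog` maps to label 1 even though its folder was created first.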

### Split Dataset into Training and Validation Sets

The dataset is split into training (80%) and validation (20%) sets by calling `tf.keras.utils.image_dataset_from_directory` twice with the same `validation_split` and `seed`, so both subsets come from the same shuffle and do not overlap. We then apply `ignore_errors` so that images that fail to decode are skipped rather than interrupting training.

In [None]:
train_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=SEED,
    image_size=IMG_SIZE,
    batch_size=BATCH_SIZE
)

val_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    seed=SEED,
    image_size=IMG_SIZE,
    batch_size=BATCH_SIZE
)

train_ds = train_ds.apply(tf.data.experimental.ignore_errors())
val_ds = val_ds.apply(tf.data.experimental.ignore_errors())
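
The reason both calls share one `seed` can be seen in a simplified sketch of a seeded split. This is an illustration of the idea only, not Keras's actual implementation; `split_paths` and the file names are hypothetical.

```python
import random


def split_paths(paths, validation_split, seed, subset):
    """Shuffle deterministically, then reserve the tail of the list for validation."""
    shuffled = list(paths)
    random.Random(seed).shuffle(shuffled)
    num_val = int(validation_split * len(shuffled))
    cut = len(shuffled) - num_val
    if subset == "training":
        return shuffled[:cut]
    if subset == "validation":
        return shuffled[cut:]
    raise ValueError("subset must be 'training' or 'validation'")


paths = [f"img_{i:03d}.jpg" for i in range(10)]  # hypothetical file names
train_paths = split_paths(paths, 0.2, seed=42, subset="training")
val_paths = split_paths(paths, 0.2, seed=42, subset="validation")

print(len(train_paths), len(val_paths))
print(set(train_paths) & set(val_paths))  # empty: same seed, same shuffle
```

If the two calls used different seeds, each would shuffle differently and the same file could land in both subsets.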

### Optimize Data Pipelining

`tf.data.AUTOTUNE` lets the TensorFlow runtime choose the prefetch buffer size dynamically. Prefetching overlaps data loading with model execution: while the model trains on the current batch, the pipeline prepares the next one, which reduces the time the GPU spends waiting for input.

In [None]:
AUTOTUNE = tf.data.AUTOTUNE

train_ds = train_ds.prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.prefetch(buffer_size=AUTOTUNE)
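
The overlap that prefetching provides can be sketched conceptually with a background thread and a bounded queue. This is only an illustration of the idea, not how `tf.data` is implemented; the `prefetch` generator below is a hypothetical helper.

```python
import queue
import threading


def prefetch(iterable, buffer_size=2):
    """Yield items from `iterable` while a background thread fills a bounded buffer."""
    buffer = queue.Queue(maxsize=buffer_size)
    sentinel = object()  # marks the end of the stream

    def producer():
        for item in iterable:
            buffer.put(item)  # blocks when the buffer is full
        buffer.put(sentinel)

    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = buffer.get()
        if item is sentinel:
            return
        yield item


# The consumer sees the same items in the same order; only the timing changes.
batches = list(prefetch(range(5), buffer_size=2))
print(batches)
```

The producer stays at most `buffer_size` items ahead of the consumer, which bounds memory use while still hiding loading latency.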

### Apply Data Augmentation

To prevent overfitting and improve model generalization, we apply data augmentation techniques:

- `layers.RandomFlip("horizontal")`: Horizontally flips images randomly.
- `layers.RandomRotation(0.1)`: Rotates images by a random angle of up to ±10% of a full turn (about ±36°).
- `layers.RandomZoom(0.1)`: Randomly zooms images in or out by up to 10%.

These transformations increase the diversity of our training data without collecting new images. The layers are active only during training; at inference time they pass images through unchanged.

In [None]:
data_augmentation = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
])
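
The factors passed to the augmentation layers are fractions, not degrees or pixels. The arithmetic below is a small sketch of how a factor of `0.1` translates into concrete ranges; the helper names are hypothetical and exist only for this illustration.

```python
import math


def rotation_range_degrees(factor):
    """A rotation factor is a fraction of a full turn (360 degrees) in each direction."""
    limit = factor * 360.0
    return (-limit, limit)


def rotation_range_radians(factor):
    """The same range expressed in radians (a full turn is 2*pi)."""
    limit = factor * 2.0 * math.pi
    return (-limit, limit)


def zoom_range_percent(factor):
    """A zoom factor is a fraction of the image size in each direction."""
    return (-factor * 100.0, factor * 100.0)


print(rotation_range_degrees(0.1))
print(rotation_range_radians(0.1))
print(zoom_range_percent(0.1))
```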

### Transfer Learning with ResNet50

We leverage a pre-trained `ResNet50` model, which has been trained on the large ImageNet dataset.

- By setting `include_top=False`, we remove its original classification head, allowing us to add our own custom layers.
- `weights='imagenet'` initializes the model with weights learned from ImageNet, providing a strong starting point.
- `base_model.trainable = False` freezes these pre-trained layers to retain their powerful feature extraction capabilities, while only training our new classification head.

In [None]:
base_model = ResNet50(
    weights='imagenet',
    include_top=False,
    input_shape=(224, 224, 3)
)

base_model.trainable = False

### Define Model Architecture

Here, we construct our complete model by stacking several layers:

- **Input Layer:** Defines the expected shape of our input images (224x224 pixels with 3 color channels).
- **Data Augmentation:** Applies the previously defined augmentation transformations to increase data diversity.
- **ResNet Preprocessing:** `preprocess_input` converts images from RGB to BGR and subtracts the ImageNet per-channel mean from each pixel, without rescaling, which matches how the ResNet50 weights were trained.
- **Base Model (ResNet50):** The frozen ResNet50 acts as a feature extractor, providing rich representations of the input images.
- **GlobalAveragePooling2D:** Reduces the spatial dimensions of the feature maps, effectively summarizing the features for each image into a single vector.
- **Dropout:** A regularization technique that randomly sets a fraction of input units to 0 at each update during training (here, 30%), which helps prevent overfitting.
- **Dense Layer:** The final classification layer with a single neuron and 'sigmoid' activation, suitable for binary classification (outputting a probability between 0 and 1 for 'Dog').
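
The ResNet preprocessing step in the list above can be approximated in NumPy. This is a minimal sketch of the channel reordering and mean subtraction, assuming channels-last RGB input with values in 0-255; `resnet_preprocess_sketch` is a hypothetical name, and the notebook itself uses the real `preprocess_input`.

```python
import numpy as np

# ImageNet per-channel means in BGR order.
IMAGENET_BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def resnet_preprocess_sketch(rgb_batch):
    """Convert RGB to BGR and zero-center each channel; no rescaling is applied."""
    x = np.asarray(rgb_batch, dtype=np.float32)
    bgr = x[..., ::-1]  # reverse the channel axis: RGB -> BGR
    return bgr - IMAGENET_BGR_MEANS


# A 1x2x2 batch of pure red pixels (R=255, G=0, B=0).
red_batch = np.zeros((1, 2, 2, 3), dtype=np.uint8)
red_batch[..., 0] = 255

processed = resnet_preprocess_sketch(red_batch)
print(processed.shape)
print(processed[0, 0, 0])  # values ordered as (B, G, R)
```

Note that the output is not confined to a fixed range such as 0 to 1; values are centered around zero and can be negative.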

In [None]:
model = models.Sequential([

    layers.Input(shape=(224, 224, 3)),

    data_augmentation,

    layers.Lambda(preprocess_input),

    base_model,

    layers.GlobalAveragePooling2D(),

    layers.Dropout(0.3),

    layers.Dense(1, activation='sigmoid')
])

model.summary()
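
To make the classification head concrete, the following NumPy sketch traces the shapes from the ResNet50 feature map to the final probability. The feature map and weights are random stand-ins (assumptions for illustration), so the printed probability is meaningless; only the shapes and the parameter count carry over. Dropout is omitted because it is inactive at inference.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the ResNet50 output on one 224x224 image: a 7x7 grid of 2048 features.
features = rng.normal(size=(1, 7, 7, 2048)).astype(np.float32)

# GlobalAveragePooling2D: average over the two spatial axes.
pooled = features.mean(axis=(1, 2))  # shape (1, 2048)

# Dense(1, activation='sigmoid'): one weight per feature plus a bias.
weights = rng.normal(scale=0.01, size=(2048, 1)).astype(np.float32)
bias = np.zeros(1, dtype=np.float32)
logit = pooled @ weights + bias
prob_dog = 1.0 / (1.0 + np.exp(-logit))  # shape (1, 1)

# With the base model frozen, these are the only trainable parameters.
trainable_params = weights.size + bias.size

print(pooled.shape, prob_dog.shape, trainable_params)
```

The small trainable parameter count is why training the head on a frozen base is fast compared with training the whole network.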

### Compile the Model

The model is compiled with the following settings:

- **Optimizer:** `tf.keras.optimizers.Adam(learning_rate=1e-4)` for efficient gradient descent with a small learning rate suitable for transfer learning.
- **Loss Function:** `'binary_crossentropy'` is used, which is appropriate for binary classification tasks where the output is a probability.
- **Metrics:** `'accuracy'` is chosen to monitor the proportion of correctly classified images during training and evaluation.

In [None]:
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss='binary_crossentropy',
    metrics=['accuracy']
)

model.summary()
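
For intuition about the loss and metric chosen above, here is a plain-Python sketch of binary cross-entropy and thresholded accuracy. It is a simplified illustration rather than the Keras implementation; the clipping constant `eps` and the example values are assumptions.

```python
import math


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)], with p clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    correct = sum(int(p > threshold) == y for y, p in zip(y_true, y_prob))
    return correct / len(y_true)


labels = [1, 0, 1, 0]            # 1 = Dog, 0 = Cat
probs = [0.9, 0.2, 0.4, 0.6]     # hypothetical model outputs

print(binary_crossentropy(labels, probs))
print(accuracy(labels, probs))
```

The loss penalizes confident mistakes more heavily than hesitant ones, whereas accuracy only checks which side of the threshold each prediction falls on.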

### Define Callbacks

Callbacks are functions that are applied at certain stages of the training procedure. We use:

- **`EarlyStopping`**: Monitors validation loss (`val_loss`, the default) and stops training if it has not improved for three consecutive epochs (`patience=3`). With `restore_best_weights=True`, the weights from the best epoch are restored when training stops early.
- **`ModelCheckpoint`**: Saves the full model to `'best_model.keras'`. With `save_best_only=True`, the file is overwritten only when the monitored metric (`val_loss` by default) improves.

In [None]:
model_filename = "best_model.keras"

callbacks = [
    EarlyStopping(patience=3, restore_best_weights=True),
    ModelCheckpoint(model_filename, save_best_only=True)
]
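
The patience logic can be simulated in a few lines. This is a simplified sketch of the idea behind early stopping, not the Keras implementation; `simulate_early_stopping` and the loss values are hypothetical.

```python
def simulate_early_stopping(val_losses, patience):
    """Return the epoch at which training would stop and the best epoch seen so far."""
    best = float("inf")
    best_epoch = None
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return {"stopped_epoch": epoch, "best_epoch": best_epoch}
    return {"stopped_epoch": None, "best_epoch": best_epoch}


# Loss improves until epoch 2, then stalls: training stops three epochs later.
long_run = simulate_early_stopping([0.50, 0.40, 0.45, 0.46, 0.47, 0.30], patience=3)

# A 3-epoch run can accumulate at most two non-improving epochs, so it never stops early.
short_run = simulate_early_stopping([0.50, 0.60, 0.70], patience=3)

print(long_run)
print(short_run)
```

In the first run the later improvement at epoch 6 is never reached, which is the trade-off of a small `patience`.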

### Train the Model

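The model is trained with the `fit` method on `train_ds` and validated on `val_ds` for 3 epochs. The `callbacks` list is passed in so that the best model is checkpointed during training. Note that with `patience=3`, early stopping cannot trigger within a 3-epoch run; it only takes effect if `epochs` is increased.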

In [None]:
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=3,
    callbacks=callbacks
)

### Load and Recompile the Best Model

After initial training, we load the best-performing model saved by `ModelCheckpoint`. Because it is loaded with `compile=False`, we recompile it before evaluating. The learning rate of `1e-5` has no effect on evaluation; it only matters if training is resumed, for example to fine-tune with a lower learning rate than the initial `1e-4`.

In [None]:
import tensorflow as tf
from tensorflow.keras.applications.resnet50 import preprocess_input
import os

if 'model_filename' not in locals():
    model_filename = "best_model.keras"

model_path = os.path.join(os.getcwd(), model_filename)

model = tf.keras.models.load_model(
    model_path,
    compile=False,
    custom_objects={
        "preprocess_input": preprocess_input
    }
)

model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-5),
    loss='binary_crossentropy',
    metrics=['accuracy']
)

model.summary()

### Evaluate Model Performance

We evaluate the model on the validation dataset (`val_ds`), which was not used to update the weights. The `evaluate` method returns the `loss` and `accuracy` on this set. Because the same set was used to select the best checkpoint, the result is a somewhat optimistic estimate of performance on truly unseen data.

In [None]:
loss, accuracy = model.evaluate(val_ds)

print(f"\nValidation Loss: {loss:.4f}")
print(f"Validation Accuracy: {accuracy*100:.2f}%")

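### Make Predictions on New Images

This section demonstrates how to use the trained model to classify a single image. The image is drawn at random from the dataset directory, so it may have been part of the training split: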

1.  **Select Random Image:** A random image path is chosen from either the 'Cat' or 'Dog' directory.
2.  **Load and Preprocess:** The image is loaded, converted to RGB, resized to `IMG_SIZE`, and converted to a `float32` NumPy array, and a batch dimension is added with `np.expand_dims`. `preprocess_input` is not applied here because the model already applies it internally through its `Lambda` layer; applying it a second time would distort the input.
3.  **Make Prediction:** The image batch is passed to the `model.predict()` method.
4.  **Interpret Prediction:** The model's output (a probability) is interpreted to determine the predicted class ('Cat' or 'Dog') and a confidence score.
5.  **Display Results:** The original image is displayed along with its predicted label, confidence, and actual class for visual verification.

In [None]:
import random
import os

cat_dir = os.path.join(data_dir, 'Cat')
dog_dir = os.path.join(data_dir, 'Dog')

def get_random_image(directory):
    all_images = os.listdir(directory)
    image_files = [f for f in all_images if f.lower().endswith(('.png', '.jpg', '.jpeg'))]
    if not image_files:
        raise ValueError(f"No image files found in {directory}")
    return os.path.join(directory, random.choice(image_files))

chosen_dir = random.choice([cat_dir, dog_dir])
random_image_path = get_random_image(chosen_dir)

img = Image.open(random_image_path).convert('RGB')
img = img.resize(IMG_SIZE)
# Keep raw 0-255 pixel values: the model's Lambda layer already applies preprocess_input.
img_array = np.array(img, dtype=np.float32)
img_array = np.expand_dims(img_array, axis=0)

prediction = model.predict(img_array)

predicted_class_index = (prediction > 0.5).astype(int)[0][0]
predicted_label = class_names[predicted_class_index]
confidence = prediction[0][0] if predicted_class_index == 1 else (1 - prediction[0][0])

plt.figure(figsize=(6, 6))
plt.imshow(img)
plt.title(f"Predicted: {predicted_label} ({confidence*100:.2f}%)\nActual: {os.path.basename(chosen_dir)}")
plt.axis('off')
plt.show()
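
The interpretation step can be isolated into a small helper that is easy to check by hand. This is a sketch under the assumption that the class order is `("Cat", "Dog")`, as inferred from the folder names; `interpret_sigmoid` is a hypothetical name.

```python
def interpret_sigmoid(prob, class_names=("Cat", "Dog"), threshold=0.5):
    """Map a sigmoid output to a (label, confidence) pair.

    The output is the probability of class index 1, so the confidence
    in class index 0 is its complement.
    """
    index = 1 if prob > threshold else 0
    confidence = prob if index == 1 else 1.0 - prob
    return class_names[index], confidence


for p in (0.92, 0.10, 0.50):
    label, confidence = interpret_sigmoid(p)
    print(f"p={p:.2f} -> {label} ({confidence * 100:.1f}%)")
```

A confidence near 50% means the output sits close to the decision threshold, so such predictions deserve less trust than ones near 0 or 1.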

## Conclusion

This notebook has walked through the process of building and training an image classification model using transfer learning with ResNet50. We've covered dataset loading, augmentation, model definition, training with callbacks, and finally, making predictions on new images. This approach is highly effective for image-related tasks, especially when dealing with limited datasets, by leveraging the powerful features learned by models on much larger datasets.