## Emilie Dubief

# Introduction

Image classification is a technique that assigns a label to an image based on its pixels. It has many applications, including the one used in this notebook: binary classification, which consists of sorting images into two categories.
Several methods can be used for this kind of classification, such as a CNN (Convolutional Neural Network). A CNN is a deep learning model that learns patterns from training data and applies them to new data to make predictions.
Another technique used for image classification is transfer learning, which consists of reusing an already trained CNN to improve another model.

Before going into the details of this notebook, let's recap the objectives.

For the assignment, the objective was to create a CNN model that performs binary classification between two categories: cats and dogs.
To build the model, we were given an existing dataset of pictures that former students took of their pets.
Looking through the dataset, I noticed a wide variety in the pictures: many were unusual shots of the cats and dogs, mixed in with more "classic" ones. This already gives a good representation of what cat and dog pictures can look like.

In this notebook, I will use this dataset to build a CNN model. First, you will see the methodology I used to build the model, then the results I obtained throughout the process. Finally, you will find the conclusions of this experiment along with the references I used in this notebook.

# Methodology

In this section, you can find the final model built to perform binary classification between cat and dog pictures. If you want to run this notebook, be aware that it takes more than 2 hours on Kaggle.

## Basic imports

In [None]:
import os
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

from tensorflow import data as tf_data
import keras

from tensorflow.keras import layers

seed = 42
keras.utils.set_random_seed(seed)

## Read in the training data

In this section, I read the dataset consisting of all the pictures of cats and dogs taken by former students.

I augmented this dataset by randomly flipping and rotating the training images.
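
As a rough illustration of what the two augmentations do, the sketch below uses plain Python lists rather than Keras layers. `flip_horizontal` is a hypothetical helper that mirrors a tiny 2x3 "image", and `rotation_range_degrees` converts a `RandomRotation` factor (documented by Keras as a fraction of a full turn) into degrees. Neither function is part of the notebook or of Keras.

```python
def flip_horizontal(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [list(reversed(row)) for row in image]


def rotation_range_degrees(factor):
    """Convert a rotation factor (fraction of a full turn) into a +/- range in degrees."""
    return (-factor * 360.0, factor * 360.0)


# A made-up 2x3 "image" with one value per pixel.
tiny_image = [
    [1, 2, 3],
    [4, 5, 6],
]

flipped = flip_horizontal(tiny_image)
low, high = rotation_range_degrees(0.1)

print(flipped)    # [[3, 2, 1], [6, 5, 4]]
print(low, high)  # -36.0 36.0
```

With a factor of 0.1, each training image is therefore rotated by a random angle between -36 and +36 degrees. Because the transformations are random and applied on the fly, the model sees a different variant of each image at every epoch rather than a fixed, larger set of files.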

In [None]:
image_size = (256, 256)

# when working with 20_000 files for training this
# will lead to exactly 160 mini-batches per epoch
batch_size = 125

# https://keras.io/api/data_loading/image/#imagedatasetfromdirectory-function
train_ds, val_ds = keras.utils.image_dataset_from_directory(
    #"PetImages",
    "/kaggle/input/u-tad-dogs-vs-cats-2025/train/train",
    validation_split=0.2,
    subset="both",
    seed=seed,
    image_size=image_size,
    batch_size=batch_size,
    labels="inferred",
    label_mode="categorical",
)

# Augment the data by applying random transformations
augmentation_layers = [
    layers.RandomFlip("horizontal"),  # random horizontal flip
    layers.RandomRotation(0.1),  # random rotation of up to 10% of a full turn
]

def data_augmentation(x):
    for layer in augmentation_layers:
        x = layer(x)
    return x

# Transform the training data
train_ds = train_ds.map(lambda x, y: (data_augmentation(x), y))
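
The comment about 160 mini-batches can be checked with a little arithmetic. The sketch below assumes a total of 25,000 labeled images, a figure that is not stated in this notebook and is chosen only because it makes the 80% training split equal to the 20,000 files mentioned in the comment. `batches_per_epoch` is a hypothetical helper.

```python
import math


def batches_per_epoch(num_files, validation_split, batch_size):
    """Return (train_batches, val_batches) for a simple hold-out split."""
    num_val = int(validation_split * num_files)
    num_train = num_files - num_val
    # The last batch may be smaller than batch_size, hence the ceiling.
    train_batches = math.ceil(num_train / batch_size)
    val_batches = math.ceil(num_val / batch_size)
    return train_batches, val_batches


# Assumed total of 25_000 files (not stated in the notebook).
train_batches, val_batches = batches_per_epoch(25_000, 0.2, 125)
print(train_batches, val_batches)  # 160 40
```

If the real dataset has a different size, the number of mini-batches per epoch changes accordingly, and the last batch may contain fewer than 125 images.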

## Transfer learning

This section implements transfer learning. The idea is to reuse an already trained CNN to improve my model's results.

This implementation follows the Keras documentation listed in the References section.

First, I created a base model, Xception, with weights pre-trained on ImageNet.

Then, I froze this base model so that its weights are not updated while my own layers are trained.

Finally, I created a new model on top of the base model.

In [None]:
from keras.models import Sequential

base_model = keras.applications.Xception(
    weights="imagenet",  # Load weights pre-trained on ImageNet.
    input_shape=(256, 256, 3),  # image_size plus 3 color channels
    include_top=False,
)

# Freeze the base_model
base_model.trainable = False

# Create new model on top
model = Sequential([
    base_model,
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(512, activation='relu'),
    keras.layers.Dense(2, activation='softmax')
])

model.summary()
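
One detail worth checking: the Keras transfer learning guide listed in the references rescales pixel values from [0, 255] to [-1, 1] before feeding them to Xception, whereas the model above receives the raw pixel values. The sketch below shows only the arithmetic of that rescaling in plain Python. `rescale_pixel` is a hypothetical helper, and whether adding such a step would change the scores reported in this notebook was not tested.

```python
def rescale_pixel(value, scale=1 / 127.5, offset=-1.0):
    """Map a pixel value from the range [0, 255] to the range [-1, 1]."""
    return value * scale + offset


# Darkest, middle, and brightest pixel values.
rescaled = [rescale_pixel(v) for v in (0, 127.5, 255)]
for original, new in zip((0, 127.5, 255), rescaled):
    print(original, "->", round(new, 3))
```

In Keras, the same mapping is what `keras.layers.Rescaling(scale=1 / 127.5, offset=-1)` computes, which is the layer used in the guide.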

## Compile and train (fit)

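In this section, I compiled the model with the Adam optimizer and trained it on the training data, using the validation split to monitor progress. The base model is still frozen at this stage, so only the layers I added on top are trained.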

In [None]:
%%time

model.compile(optimizer=keras.optimizers.Adam(1e-5),  # Very low learning rate
              loss='categorical_crossentropy',
              metrics=['accuracy'])

epochs = 40

print("Training the top layer")
history = model.fit(train_ds,
                    validation_data = val_ds,
                    epochs = epochs,)
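
To make the output layer and the loss more concrete, here is a minimal pure-Python sketch of what a two-unit softmax and the categorical cross-entropy compute for a single image. The raw scores and the class order (index 0 for cats, index 1 for dogs) are made-up assumptions for illustration; Keras performs the equivalent computation on whole batches.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot_label, probabilities):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probabilities))


# Hypothetical raw scores for one image, assumed order: [cat, dog].
scores = [0.5, 2.0]
label = [0.0, 1.0]  # one-hot label for "dog", as with label_mode="categorical"

probs = softmax(scores)
loss = categorical_crossentropy(label, probs)
predicted_class = probs.index(max(probs))

print([round(p, 3) for p in probs])  # [0.182, 0.818]
print(round(loss, 3))                # 0.201
print(predicted_class)               # 1
```

The lower the loss, the more probability the model puts on the correct class. The `np.argmax` call in the prediction section plays the role of `predicted_class` here.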

## Fine-tuning

In this section, I fine-tuned the model I trained previously.

The first step is to unfreeze the base model so that it is trained along with my own layers.

Then, I recompiled the model so that the change to the trainable weights takes effect.

Finally, I fit the whole model end to end, base model included.

In [None]:
# Unfreeze the base_model so that its weights are also updated.
# Note that, unlike in the Keras guide, the base model is not called
# with `training=False` here, so its batchnorm layers will also update
# their batch statistics during fine-tuning.
base_model.trainable = True
model.summary(show_trainable=True)

# Recompile so that the change takes effect. The loss and metric match
# the first stage: the model outputs softmax probabilities, not logits,
# so `from_logits=True` would be incorrect here.
model.compile(
    optimizer=keras.optimizers.Adam(1e-5),  # Low learning rate
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)

epochs = 1
print("Fitting the end-to-end model")
model.fit(train_ds, validation_data=val_ds, epochs=epochs,)


## Plot the learning curves

This section plots the learning curves of the first training stage (the layers added on top of the frozen base model), since `history` is recorded there; you need to run the notebook to see them.

In [None]:
logs = pd.DataFrame(history.history)

plt.figure(figsize=(14, 4))
plt.subplot(1, 2, 1)
plt.plot(logs.loc[1:,"loss"], lw=2, label='training loss')
plt.plot(logs.loc[1:,"val_loss"], lw=2, label='validation loss')
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.legend()
plt.subplot(1, 2, 2)
plt.plot(logs.loc[1:,"accuracy"], lw=2, label='training accuracy')
plt.plot(logs.loc[1:,"val_accuracy"], lw=2, label='validation accuracy')
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.legend(loc='lower right')
plt.show()
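
Besides plotting, the same history can be summarized numerically. The sketch below finds the epoch with the highest validation accuracy. The `example_history` values are invented for illustration and are not results from this notebook, and `best_epoch` is a hypothetical helper.

```python
def best_epoch(history, metric="val_accuracy"):
    """Return (epoch_number, value) for the highest value of `metric`, counting epochs from 1."""
    values = history[metric]
    best_index = max(range(len(values)), key=lambda i: values[i])
    return best_index + 1, values[best_index]


# Invented values with the same keys as `history.history`.
example_history = {
    "loss": [0.60, 0.45, 0.38, 0.35],
    "val_loss": [0.55, 0.44, 0.41, 0.43],
    "accuracy": [0.70, 0.80, 0.84, 0.86],
    "val_accuracy": [0.72, 0.79, 0.83, 0.82],
}

epoch, value = best_epoch(example_history)
print(epoch, value)  # 3 0.83
```

With the real run, `best_epoch(history.history)` would point to the epoch worth looking at. A validation accuracy that peaks and then drops while the training accuracy keeps rising is a typical sign of overfitting.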

## Save the trained model

In [None]:
model.save("model.keras")

## Evaluate model performance using the `supplementary_data`

In [None]:
supplementary_ds = keras.utils.image_dataset_from_directory(
    #"PetImages",
    "/kaggle/input/u-tad-dogs-vs-cats-2025/supplementary_data/supplementary_data",
    image_size=image_size,
    batch_size=batch_size,
    labels="inferred",
    label_mode="categorical",
)

model.evaluate(supplementary_ds,
               return_dict=True,
               verbose=1)

# Results

## Details of trials

To arrive at this model, I tried several changes, detailed below along with the accuracy score of each trial.

The first step was to launch the template model for the competition, consisting of a basic CNN model, resulting in an accuracy score of 0.65.

My first change was to use another optimizer: Adam instead of RMSprop. In my tests on the example notebook shown in class, it was one of the best optimizers. Yet this change actually lowered my accuracy score to 0.61. I decided to keep it to see how it would behave with my next updates.

Then, I followed the Keras documentation to add transfer learning and fine-tuning to my model. The idea was to use an already trained model to improve mine. I used a model pre-trained on ImageNet as a base and built my own layers on top of it. I first froze the base model so that it was not updated during training. After that, I unfroze it and fine-tuned the whole model. This increased my score a lot, to an accuracy of 0.78.

To train my model longer, I added some epochs, not many at first because the model already took a long time to run. I added 5 epochs, which added 0.01 to my accuracy score.

Still, my priority was to adjust the other settings of the model first, and only then add more epochs to train it as much as possible.

The next step was to add more data by transforming the data I already had. The dataset is small, so more data was needed to hope for a better accuracy. I started by randomly flipping the training images horizontally. This added 0.02 to my accuracy score, which was then 0.81.

Thanks to the Keras documentation and the course, I saw that I could also rotate the images of my dataset to augment it further, so I added a random rotation with a factor of 0.1. This update increased my accuracy score to 0.89.

Finally, I started adding epochs to train my model even more. I had not done much of this before my other changes because a run already took 25 minutes. I added epochs incrementally (5 at a time) and ended up with 40 epochs and an accuracy of 0.91.
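
The trials above can be summarized by computing the change in accuracy at each step. The scores are the ones reported in this section; the short step names are only shorthand labels.

```python
# Accuracy scores reported above, in the order the changes were made.
trials = [
    ("template CNN", 0.65),
    ("Adam optimizer", 0.61),
    ("transfer learning + fine-tuning", 0.78),
    ("+5 epochs", 0.79),
    ("horizontal flip", 0.81),
    ("random rotation (0.1)", 0.89),
    ("40 epochs", 0.91),
]

# Pair each trial with its change relative to the previous one.
deltas = []
previous = None
for name, accuracy in trials:
    delta = None if previous is None else round(accuracy - previous, 2)
    deltas.append((name, accuracy, delta))
    previous = accuracy

for name, accuracy, delta in deltas:
    change = "" if delta is None else f"{delta:+.2f}"
    print(f"{name:<34}{accuracy:.2f}  {change}")
```

The two largest gains come from transfer learning with fine-tuning (+0.17) and from random rotation (+0.08).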

## Create predictions for all of the test images
(Do not modify this section)

In [None]:
%%time

folder_path = "/kaggle/input/u-tad-dogs-vs-cats-2025/test/test"

predictions_dict = {}

for img in os.listdir(folder_path):
    img = os.path.join(folder_path, img)
    
    # save the image name
    file_name = img.split('/')[-1]
    file_no_extension = file_name.split('.')[0]
    
    img = keras.utils.load_img(img, target_size=image_size)
    img_array = keras.utils.img_to_array(img)
    img_array = keras.ops.expand_dims(img_array, 0)
    prediction = model.predict(img_array, verbose=None)
    label = np.argmax(prediction)

    # save the predictions to a dictionary
    predictions_dict[int(file_no_extension)] = label

## Save your predictions to a competition submission file

In [None]:
submission = pd.DataFrame(predictions_dict.items(), columns=["id", "label"]).sort_values(by='id', ascending=True)
submission.to_csv('submission.csv',index=False)

# print numbers of each class label
submission["label"].value_counts()

# Conclusions

This experiment improved my understanding of image classification and deep learning. It is really interesting to see how technical it is for a machine to do something that is so simple for most humans.

My final model combines deep learning with transfer learning. I think I could have improved it further, perhaps by adding more epochs or by researching the parameters of the functions I used in more depth. Still, I am really satisfied with the accuracy score I ended with (0.91).

The dataset we had was also really small, so in the end it was hard to improve the score significantly. I think transfer learning was really the key, and perhaps a pre-training dataset other than the one I used (ImageNet) would have worked even better. Thus, with a larger initial dataset and a different one for transfer learning, I think the accuracy score could be improved even further.

This assignment was really interesting for understanding the mechanisms of image classification and how complex even binary classification is. I can only imagine how complex it becomes for multi-class classification with more complex images.

# References

Keras documentation about transfer learning and fine-tuning: 
https://keras.io/guides/transfer_learning/ 

GeeksforGeeks article about image classification:
https://www.geeksforgeeks.org/computer-vision/what-is-image-classification/