# Kaggle Dogs vs Cats

Applying deep learning models to the Kaggle Dogs vs Cats dataset and comparing the performance of a model adapted from the [Keras tutorial](https://keras.io/examples/vision/image_classification_from_scratch/) with a transfer learning model built on VGG16.

<a id='toc'></a>
<h2> Table of Contents</h2>
<div class='alert alert-box alert-info'>
    <ol>
        <li><a href='#lib'> Import libraries </a></li>
        <li><a href='#data'> Extracting data </a></li>
        <li><a href='#ds'> Generating dataset </a></li>
        <li><a href='#viz'> Visualise the data </a></li>
        <li><a href='#aug'> Visualising image augmentation </a></li>
        <li><a href='#model'> Building the models </a></li>
        <li><a href='#train'> Training the models </a></li>
        <li><a href='#pred'> Predictions </a></li>        
        <li><a href='#compare'> Comparison </a></li>
        <li><a href='#ref'> References </a></li>
    </ol>
</div>

<a id='lib'></a>
<h2> Import libraries </h2>


In [32]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

import os
from tqdm import tqdm
from zipfile import ZipFile

In [2]:
#from kaggle.api.kaggle_api_extended import KaggleApi

In [3]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import Sequential
from tensorflow.keras import Input
from tensorflow.keras import layers
from tensorflow.keras.preprocessing import image_dataset_from_directory
from tensorflow.keras.preprocessing import image
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.callbacks import ReduceLROnPlateau
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.applications import VGG16

[Back to top](#toc)

<a id='data'></a>
<h2> Extracting data </h2>

We have 2 ways to access the data on Kaggle. 

If we are running the notebook on Kaggle itself, we can use the `os` and `zipfile` libraries to extract the files. Otherwise, if we are running the notebook on a local machine, we have to use the Kaggle API to programmatically download the dataset from Kaggle. 

The instructions for getting Kaggle's API token can be found [here](https://www.kaggle.com/docs/api). Additionally, this excellent [article](https://python.plainenglish.io/how-to-use-the-kaggle-api-in-python-4d4c812c39c7) by <em>Python in Plain English</em> explains clearly how you can use the Kaggle API in Python.

In [4]:
run_on_kaggle = True

In [5]:
if run_on_kaggle:
    data_path = '../input/dogs-vs-cats'
    work_path = '../work'
    
    with ZipFile(os.path.join(data_path, 'train.zip'), 'r') as z:
        z.extractall(work_path)
    with ZipFile(os.path.join(data_path, 'test1.zip'), 'r') as z:
        z.extractall(work_path)
    
else:
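    # initialise the API (imported here so it is only needed off Kaggle)
    from kaggle.api.kaggle_api_extended import KaggleApi
    kag = KaggleApi()
    kag.authenticate()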
    
    # downloading the files
    comp_name = 'dogs-vs-cats'
    dl_path = './'
    kag.competition_download_files(competition=comp_name, path=dl_path)
    
    # unzip the files
    with ZipFile('dogs-vs-cats.zip', 'r') as z:
        z.extractall()
    with ZipFile('train.zip', 'r') as z:
        z.extractall()
    with ZipFile('test1.zip', 'r') as z:
        z.extractall()

In [6]:
if run_on_kaggle:
    train_path = os.path.join(work_path, 'train')
    test_path = os.path.join(work_path, 'test1')
else:
    train_path = './train'
    test_path = './test1'

<h3> Tidying up the data </h3>

In [7]:
train_df = pd.DataFrame({'image_name': os.listdir(train_path)})
# the label is the part of the file name before the first dot
train_df['label'] = train_df['image_name'].apply(lambda x: x.split('.')[0])
train_df

We will move the training images into their respective folders, i.e. cat images into a `cat` folder and dog images into a `dog` folder. This is done using `os.mkdir` (creating new folders) and `os.rename` (moving the files).

In [8]:
cat_path = os.path.join(train_path, 'cat')
os.mkdir(cat_path)
cat_df = train_df[train_df.label=='cat']
for n in tqdm(cat_df.image_name):
    os.rename((os.path.join(train_path, n)), (os.path.join(cat_path, n)))

In [9]:
dog_path = os.path.join(train_path, 'dog')
os.mkdir(dog_path)
dog_df = train_df[train_df.label=='dog']
for n in tqdm(dog_df.image_name):
    os.rename((os.path.join(train_path, n)), (os.path.join(dog_path, n)))
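
The same pattern can be tried safely on a throwaway directory. The sketch below is illustrative only: the helper name `sort_into_class_folders` and the placeholder file names are assumptions, and it uses `os.makedirs(..., exist_ok=True)` rather than `os.mkdir` so that it can be re-run without raising an error.

```python
import os
import tempfile


def sort_into_class_folders(root):
    """Move files named '<label>.<id>.jpg' in root into root/<label>/."""
    counts = {}
    for name in sorted(os.listdir(root)):
        src = os.path.join(root, name)
        if not os.path.isfile(src):
            continue  # skip class folders left over from an earlier run
        label = name.split('.')[0]
        class_dir = os.path.join(root, label)
        os.makedirs(class_dir, exist_ok=True)  # no error if it already exists
        os.rename(src, os.path.join(class_dir, name))
        counts[label] = counts.get(label, 0) + 1
    return counts


# Demo on a temporary directory with empty placeholder files
with tempfile.TemporaryDirectory() as tmp:
    for name in ['cat.0.jpg', 'cat.1.jpg', 'dog.0.jpg']:
        open(os.path.join(tmp, name), 'w').close()
    counts = sort_into_class_folders(tmp)
    cat_files = sorted(os.listdir(os.path.join(tmp, 'cat')))

print(counts)
```

Skipping anything that is not a regular file means a second run ignores the class folders instead of trying to move them.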

[Back to top](#toc)

<a id='ds'></a>
<h2> Generating dataset </h2>

We will prepare TensorFlow training and validation datasets using the Keras image data processing function [`image_dataset_from_directory`](https://keras.io/api/preprocessing/image/).

In [10]:
image_size = (128, 128)
batch_size = 32
rand_seed = 42
val_split = 0.2

In [11]:
train_ds = image_dataset_from_directory(
    directory=train_path,
    class_names=['cat', 'dog'], 
    batch_size=batch_size,
    image_size=image_size,
    seed=rand_seed, 
    validation_split=val_split,
    subset='training'
)

In [12]:
val_ds = image_dataset_from_directory(
    directory=train_path,
    class_names=['cat', 'dog'],
    batch_size=batch_size,
    image_size=image_size,
    seed=rand_seed,
    validation_split=val_split,
    subset='validation'
)
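
As a rough check of what the two calls above produce, the arithmetic can be sketched as follows. The figure of 25,000 training images is an assumption about the competition's `train.zip` (it is not stated in this notebook), and `split_sizes` is a hypothetical helper, not a Keras function.

```python
import math


def split_sizes(n_images, val_split, batch_size):
    """Return (train images, val images, train batches, val batches)."""
    n_val = int(val_split * n_images)
    n_train = n_images - n_val
    return (n_train, n_val,
            math.ceil(n_train / batch_size),
            math.ceil(n_val / batch_size))


# Assumed dataset size: 25,000 labelled images
n_train, n_val, train_batches, val_batches = split_sizes(25000, 0.2, 32)
print(n_train, n_val, train_batches, val_batches)
```

Under that assumption, the last validation batch is smaller than the others because the validation count is not a multiple of the batch size.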

[Back to top](#toc)

<a id='viz'></a>
<h2> Visualise the data </h2>

To visualise the images, we use the `take(n)` method of the TensorFlow dataset. It returns the first n batches, each holding up to `batch_size` images and their labels. 

The images are tensors, which need to be converted to NumPy arrays and cast to unsigned integers before being passed to `axes.imshow`.

The labels are 0 for cat and 1 for dog. This order is set by the `class_names` argument in the earlier `image_dataset_from_directory` calls.

In [13]:
plt.figure(figsize=(10,10))
for images, labels in train_ds.take(1):
    for i in range(16):
        ax = plt.subplot(4, 4, i+1)
        ax.imshow(images[i].numpy().astype('uint8'))
        ax.set_title(int(labels[i]))
        ax.axis('off')
plt.show()
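
The conversion and label lookup can also be illustrated without TensorFlow. In this sketch a small random NumPy array stands in for one batch (an assumption for illustration; real batches come from `train_ds`), and the integer labels are mapped back to the names passed as `class_names`.

```python
import numpy as np

class_names = ['cat', 'dog']  # index 0 -> cat, index 1 -> dog

# Stand-in for one batch: 4 float32 "images" with pixel values in [0, 255)
rng = np.random.default_rng(0)
images = rng.uniform(0, 255, size=(4, 8, 8, 3)).astype('float32')
labels = np.array([0, 1, 1, 0])

# imshow expects float RGB data in [0, 1], so cast 0-255 values to uint8 first
display_images = images.astype('uint8')
titles = [class_names[int(label)] for label in labels]

print(display_images.dtype, titles)
```

Passing `titles[i]` to `ax.set_title` instead of the raw integer would make the grid easier to read.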

[Back to top](#toc)

<a id='aug'></a>
<h2> Visualising image augmentation </h2>

When we don't have a large image dataset, it is good practice to artificially introduce image diversity by applying random yet realistic transformations to the training images. This exposes the model to different aspects of the training data and helps reduce overfitting. 

We have chosen to use random horizontal flipping, small random rotations, and small random zooms. The figures below show the effect of these transformations compared with the original image.

In [14]:
data_augmentation = Sequential(
    [layers.RandomFlip('horizontal'),
     layers.RandomRotation((-0.1, 0.1)),
     layers.RandomZoom((-0.2, 0.2))
    ]
)

Note that `data_augmentation` is called with `training=True` in the cell below. This is usually not a problem when running the notebook in sequence, but when I circle back to this cell after building the models, the images are no longer augmented unless `training` is set to True. 

This is because augmentation layers are active only during training and inactive during inference. Thanks to this [solution](https://stackoverflow.com/questions/71164259/tensorflow-augmentation-layers-not-working-after-importing-from-tf-keras-applica/71469695#71469695?newreg=fc8166485de44a8b8e70cac0f6f965c5) from Stack Overflow, which helped me debug the issue.

In [15]:
for images, _ in train_ds.take(1):
    # Show the original image in its own small figure
    plt.figure(figsize=(3, 3))
    plt.imshow(images[0].numpy().astype('uint8'))
    plt.title('original')
    plt.axis('off')
    plt.show()

    # Show 16 random augmentations of the same image
    fig = plt.figure(figsize=(10, 10))
    for i in range(16):
        augmented_image = data_augmentation(images[0], training=True) # training must be set to True
        ax = plt.subplot(4, 4, i+1)
        ax.imshow(augmented_image.numpy().astype('uint8'))
        ax.axis('off')
    fig.suptitle('Augmented images')
    plt.show()
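
The training-versus-inference behaviour can be mimicked with a toy layer. This is only a sketch of the idea, not the Keras implementation: `ToyRandomFlip` is a hypothetical class that may flip an image left to right when `training=True` and passes it through unchanged otherwise.

```python
import numpy as np


class ToyRandomFlip:
    """Toy stand-in for an augmentation layer (not the Keras code)."""

    def __init__(self, seed=0):
        self.rng = np.random.default_rng(seed)

    def __call__(self, image, training=False):
        if not training:
            return image  # inference mode: augmentation is switched off
        if self.rng.random() < 0.5:
            return image[:, ::-1]  # flip along the width axis
        return image


image = np.arange(12).reshape(2, 2, 3)  # tiny 2x2 RGB "image"
layer = ToyRandomFlip()

inference_out = layer(image, training=False)
training_outs = [layer(image, training=True) for _ in range(20)]
n_flipped = sum(not np.array_equal(out, image) for out in training_outs)
print(n_flipped, 'of 20 training calls flipped the image')
```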

[Back to top](#toc)

<a id='model'></a>
<h2> Building the models </h2>

We will build 2 models for comparison. 
1. A base model modified from the Keras tutorial
2. A transfer learning model based on VGG16

<h3> Base model </h3>

In [16]:
# Define augmentation layer
augmentation_layer = Sequential(
    [layers.RandomFlip('horizontal'),
     layers.RandomRotation((-0.1, 0.1)),
     layers.RandomZoom((-0.2, 0.2))
    ]
)

A little information regarding batch normalisation. 

It is a technique designed to automatically standardise the inputs to a layer in a deep neural network. It can dramatically accelerate training and, in some cases, improve the performance of the model through a modest regularisation effect. 

<em>Reference: [BatchNormalization layer](https://keras.io/api/layers/normalization_layers/batch_normalization/) and [How to Accelerate Learning of DNN with Batch Normalisation](https://machinelearningmastery.com/how-to-accelerate-learning-of-deep-neural-networks-with-batch-normalization/) </em>
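
To make the idea concrete, here is a minimal NumPy sketch of the training-time forward pass of batch normalisation. It is a simplification: the function name `batch_norm_forward` is hypothetical, `gamma` and `beta` stand for the learned scale and offset, and the moving averages that Keras keeps for inference are left out.

```python
import numpy as np


def batch_norm_forward(x, gamma, beta, eps=1e-3):
    """Normalise each feature over the batch axis, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)  # eps avoids division by zero
    return gamma * x_hat + beta


rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(32, 4))  # batch of 32, 4 features

out = batch_norm_forward(x, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0).round(3), out.std(axis=0).round(3))
```

With `gamma` set to ones and `beta` to zeros, each output feature has a mean close to 0 and a standard deviation close to 1, whatever the scale of the input.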

In [17]:
basemodel = Sequential()
basemodel.add(Input(shape=image_size+(3,)))
basemodel.add(augmentation_layer)
basemodel.add(layers.Rescaling(1.0/255))

for size in [32, 64]:
    basemodel.add(layers.Conv2D(size, 3, padding='same', activation='relu'))
    basemodel.add(layers.BatchNormalization())
    basemodel.add(layers.MaxPooling2D(pool_size=2))
    basemodel.add(layers.Dropout(0.2))

for size in [128, 256, 512, 728]:
    basemodel.add(layers.SeparableConv2D(size, 3, padding='same', activation='relu'))
    basemodel.add(layers.BatchNormalization())
    basemodel.add(layers.SeparableConv2D(size, 3, padding='same', activation='relu'))
    basemodel.add(layers.BatchNormalization())
    basemodel.add(layers.MaxPooling2D(pool_size=2))
    basemodel.add(layers.Dropout(0.2))
    
# output layer
basemodel.add(layers.Flatten())
basemodel.add(layers.Dense(512, activation='relu'))
#basemodel.add(Dense(2, activation='softmax'))
basemodel.add(layers.Dense(1, activation='sigmoid'))

In [18]:
basemodel.summary()
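
The second group of blocks uses `SeparableConv2D` rather than `Conv2D`. As a rough illustration of why that keeps the model small, the sketch below counts the weights of a single 3x3 layer going from 64 to 128 channels, as in the first separable layer above. The helper names are hypothetical, and the formulas assume one bias per output channel and a depth multiplier of 1.

```python
def conv2d_params(c_in, c_out, k=3):
    """Weights in a standard k x k convolution, plus biases."""
    return k * k * c_in * c_out + c_out


def separable_conv2d_params(c_in, c_out, k=3):
    """One k x k depthwise kernel per input channel, then a 1x1 pointwise mix."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise + c_out


standard = conv2d_params(64, 128)
separable = separable_conv2d_params(64, 128)
print(standard, separable)
```

These counts can be compared against the `Param #` column printed by `basemodel.summary()`.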

<h3> Transfer learning model </h3>

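Next, we implement a transfer learning model with VGG16. The convolutional base is loaded with ImageNet weights and frozen, so only the new dense layers on top are trained.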

In [19]:
vgg16layer = VGG16(
    weights='imagenet', 
    include_top=False, 
)
vgg16layer.trainable = False

In [20]:
vgg16model = keras.Sequential()
vgg16model.add(Input(shape=image_size+(3,)))
vgg16model.add(augmentation_layer)
vgg16model.add(layers.Rescaling(1.0/255))
vgg16model.add(vgg16layer)

vgg16model.add(layers.Flatten())
vgg16model.add(layers.Dense(512, activation='relu'))
#vgg16model.add(layers.Dense(2, activation='softmax'))
vgg16model.add(layers.Dense(1, activation='sigmoid'))

In [21]:
vgg16model.summary()

[Back to top](#toc)

<a id='train'></a>
<h2> Training the models </h2>

In [22]:
if run_on_kaggle:
    epochs = 50
else:
    epochs = 1

In [23]:
early_stop = EarlyStopping(patience=10)

lr_reduction = ReduceLROnPlateau(
    monitor='val_accuracy',
    patience=2,
    verbose=1,
    factor=0.5,
    min_lr=0.00001
)

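# Note: both models below share this callback list, so checkpoints written
# while training the second model overwrite those of the first for the
# same epoch numbers.
model_chkpt = ModelCheckpoint('save_at_{epoch}.h5')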

callbacks = [
    early_stop,
    lr_reduction,
    model_chkpt
]
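
To get a feel for the `ReduceLROnPlateau` settings above, the sketch below traces the learning rate over successive reductions. It assumes training starts from Adam's default learning rate of 0.001 and that every reduction is actually triggered; `lr_schedule` is a hypothetical helper, and the update rule `max(lr * factor, min_lr)` is a simplified reading of the callback.

```python
def lr_schedule(initial_lr, factor, min_lr, n_reductions):
    """Learning rate after each of n_reductions plateau-triggered cuts."""
    rates = [initial_lr]
    for _ in range(n_reductions):
        rates.append(max(rates[-1] * factor, min_lr))
    return rates


rates = lr_schedule(initial_lr=1e-3, factor=0.5, min_lr=1e-5, n_reductions=8)
for step, lr in enumerate(rates):
    print(step, lr)
```

Under these assumptions the floor of `min_lr` is reached at the seventh reduction, after which further plateaus no longer change the learning rate.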

<b> Training base model </b>

In [24]:
basemodel.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)

In [25]:
base_hist = basemodel.fit(
    train_ds, epochs=epochs, callbacks=callbacks, validation_data=val_ds
)

<b> Training transfer learning model </b>

In [26]:
vgg16model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)

In [27]:
vgg16_hist = vgg16model.fit(
    train_ds, epochs=epochs, callbacks=callbacks, validation_data=val_ds
)

In [40]:
basemodel_df = pd.DataFrame.from_dict(base_hist.history)
print(basemodel_df.head())
print(basemodel_df.tail())

In [41]:
vgg16model_df = pd.DataFrame.from_dict(vgg16_hist.history)
print(vgg16model_df.head())
print(vgg16model_df.tail())

In [47]:
plt.figure(figsize=(10, 10))
plt.subplot(221)
sns.lineplot(x=basemodel_df.index, y='loss', data=basemodel_df, label='loss')
sns.lineplot(x=basemodel_df.index, y='val_loss', data=basemodel_df, label='val loss')
plt.title('loss and val loss of base model')

plt.subplot(222)
sns.lineplot(x=vgg16model_df.index, y='loss', data=vgg16model_df, label='loss')
sns.lineplot(x=vgg16model_df.index, y='val_loss', data=vgg16model_df, label ='val loss')
plt.title('loss and val loss of trf learning model')


plt.subplot(223)
sns.lineplot(x=basemodel_df.index, y='accuracy', data=basemodel_df, label='acc')
sns.lineplot(x=basemodel_df.index, y='val_accuracy', data=basemodel_df, label='val acc')
plt.title('acc and val acc of base model')

plt.subplot(224)
sns.lineplot(x=vgg16model_df.index, y='accuracy', data=vgg16model_df, label='acc')
sns.lineplot(x=vgg16model_df.index, y='val_accuracy', data=vgg16model_df, label='val acc')
plt.title('acc and val acc of trf learning model')

plt.tight_layout()
plt.show()

[Back to top](#toc)

<a id='pred'></a>
<h2> Predictions </h2>

In [43]:
img_num = str(np.random.randint(1, 12501))
sample_img = os.path.join(test_path, img_num+'.jpg')

img = keras.preprocessing.image.load_img(
    sample_img, target_size=image_size
)
img_array = keras.preprocessing.image.img_to_array(img)
img_array = tf.expand_dims(img_array, 0)  # Create batch axis

predictions = basemodel.predict(img_array)
score = float(predictions[0][0])  # sigmoid output: probability of class 1 (dog)
plt.imshow(img_array[0].numpy().astype('uint8'))
plt.axis('off')
plt.show()
print(
    "This image is %.2f percent cat and %.2f percent dog."
    % (100 * (1 - score), 100 * score)
)

In [44]:
test_ds = image_dataset_from_directory(
    test_path,
    label_mode=None,
    image_size=image_size,
    shuffle=False
)

In [52]:
# os.path.basename works with both '/' and '\\' path separators
test_filenames = [os.path.basename(f) for f in test_ds.file_paths]

In [53]:
base_pred = basemodel.predict(test_ds)

In [55]:
vgg16_pred = vgg16model.predict(test_ds)

In [56]:
base_pred

In [57]:
pred_df = pd.DataFrame(
    {'filename':test_filenames,
     'base_score':base_pred.ravel(),
     'vgg16_score':vgg16_pred.ravel()
    }
)

pred_df['base_prediction'] = (pred_df['base_score'] >= 0.5).astype('int')
pred_df['vgg16_prediction'] = (pred_df['vgg16_score'] >= 0.5).astype('int')

pred_df.head()
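
The thresholding step can be illustrated on a handful of made-up scores. The numbers below are hypothetical, not outputs of either model; the sketch turns sigmoid scores into class indices, maps them to names, and counts how often the two sets of predictions disagree.

```python
import numpy as np

class_names = ['cat', 'dog']

# Made-up sigmoid scores for five images (not real model outputs)
base_score = np.array([0.10, 0.80, 0.50, 0.49, 0.95])
vgg16_score = np.array([0.20, 0.30, 0.70, 0.60, 0.99])

# A score of 0.5 or more counts as class 1 (dog)
base_prediction = (base_score >= 0.5).astype('int')
vgg16_prediction = (vgg16_score >= 0.5).astype('int')

base_names = [class_names[i] for i in base_prediction]
disagree = base_prediction != vgg16_prediction
print(base_names, int(disagree.sum()))
```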

[Back to top](#toc)

<a id='compare'></a>
<h2> Comparison </h2>

In [64]:
n = 16

sample_img_df = pred_df.sample(n)
plt.figure(figsize=(10, 10))
for i in range(n):
    plt.subplot(4, 4, i+1)
    sample_img = os.path.join(test_path, sample_img_df.iloc[i].filename)
    img = image.load_img(
        sample_img, target_size=image_size
    )
    plt.imshow(img)
    plt.title(f'base pred:{sample_img_df.iloc[i].base_prediction}\n vgg16 pred:{sample_img_df.iloc[i].vgg16_prediction}'
    )
    plt.axis('off')
plt.tight_layout()
plt.show()

In [69]:
pred_df['diff'] = pred_df['base_prediction'] != pred_df['vgg16_prediction']
diff_df = pred_df[pred_df['diff']]

In [70]:
n = min(16, len(diff_df))  # the models may disagree on fewer than 16 images

sample_img_df = diff_df.sample(n)
plt.figure(figsize=(10, 10))
for i in range(n):
    plt.subplot(4, 4, i+1)
    sample_img = os.path.join(test_path, sample_img_df.iloc[i].filename)
    img = image.load_img(
        sample_img, target_size=image_size
    )
    plt.imshow(img)
    plt.title(f'base pred:{sample_img_df.iloc[i].base_prediction}\n vgg16 pred:{sample_img_df.iloc[i].vgg16_prediction}'
    )
    plt.axis('off')
plt.tight_layout()
plt.show()

<a id='ref'></a>
<h2> References </h2>
<ol>
    <li><a href='https://keras.io/examples/vision/image_classification_from_scratch/'> Image classification from scratch </a></li>
    <li><a href='https://www.kaggle.com/docs/api'> Kaggle Public API </a></li>
    <li><a href='https://python.plainenglish.io/how-to-use-the-kaggle-api-in-python-4d4c812c39c7'> How to Use Kaggle API in Python </a></li>
    <li><a href='https://www.kaggle.com/code/uysimty/keras-cnn-dog-or-cat-classification/notebook'> Keras CNN Dog or Cat Classification by <em>UYSIM</em> </a></li>
    <li><a href='https://towardsdatascience.com/transfer-learning-with-vgg16-and-keras-50ea161580b4'> Transfer Learning with VGG16 and Keras </a></li>    
    <li><a href='https://towardsdatascience.com/a-basic-introduction-to-separable-convolutions-b99ec3102728'> A Basic Introduction to Separable Convolutions </a></li>

</ol>


[Back to top](#toc)