
In [None]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Objective
The objective of this code is to create a Convolutional Neural Net (CNN) that takes a 2D image and classifies it as a 'Car' or a 'Truck', depending on its features.

First, we will import and set up a few important libraries:

In [None]:
import matplotlib.pyplot as plt
from matplotlib import gridspec

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.preprocessing import image_dataset_from_directory

In [None]:
# Set Matplotlib defaults
plt.rc('figure', autolayout=True)
plt.rc('axes', labelweight='bold', labelsize='large',
       titleweight='bold', titlesize=18, titlepad=10)
plt.rc('image', cmap='magma')
# warnings.filterwarnings("ignore") # to clean up output cells

Now, we are going to load our data, which consists of one folder of training images (training set) and one folder of validation images (validation set).

We will also build an input pipeline that converts each image to `float32`, caches the dataset, and prefetches batches with an automatically tuned buffer size.

In [None]:
# Load training and validation sets
ds_train_ = image_dataset_from_directory(
    '../input/car-or-truck/train',
    labels='inferred',
    label_mode='binary',
    image_size=[128, 128],
    interpolation='nearest',
    batch_size=64,
    shuffle=True,
)
ds_valid_ = image_dataset_from_directory(
    '../input/car-or-truck/valid',
    labels='inferred',
    label_mode='binary',
    image_size=[128, 128],
    interpolation='nearest',
    batch_size=64,
    shuffle=False,
)

# Data Pipeline
def convert_to_float(image, label):
    image = tf.image.convert_image_dtype(image, dtype=tf.float32)
    return image, label

AUTOTUNE = tf.data.experimental.AUTOTUNE
ds_train = (
    ds_train_
    .map(convert_to_float)
    .cache()
    .prefetch(buffer_size=AUTOTUNE)
)
ds_valid = (
    ds_valid_
    .map(convert_to_float)
    .cache()
    .prefetch(buffer_size=AUTOTUNE)
)
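
To make the effect of `convert_to_float` concrete: for `uint8` inputs, `tf.image.convert_image_dtype` rescales pixel values from [0, 255] to [0, 1]. The plain-Python sketch below mimics that rescaling on a hypothetical 2x2 single-channel image. `scale_uint8_to_unit` is an illustrative helper, not part of TensorFlow.

```python
def scale_uint8_to_unit(pixels):
    """Rescale nested lists of 0-255 integer pixel values to floats in [0, 1]."""
    return [[value / 255.0 for value in row] for row in pixels]

# Hypothetical 2x2 grayscale image with uint8-style values
tiny_image = [[0, 51], [102, 255]]
scaled = scale_uint8_to_unit(tiny_image)
print(scaled)
```

Keeping inputs in a small, fixed range like this generally makes training better behaved than feeding raw 0-255 values.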

Time to create our model.

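For a single convolutional block, we use batch normalization, one or two 3x3 convolutions with 'relu' activation and 'same' padding, and a max-pooling layer. The number of filters doubles from block to block (64, 128, 256), and the head flattens the features into a small dense layer followed by a single sigmoid unit for the binary output.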

In [None]:
model = keras.Sequential([
    layers.InputLayer(input_shape=[128, 128, 3]),

    # Block One
    layers.BatchNormalization(renorm=True),
    layers.Conv2D(filters=64, kernel_size=3, activation='relu', padding='same'),
    layers.MaxPool2D(),

    # Block Two
    layers.BatchNormalization(renorm=True),
    layers.Conv2D(filters=128, kernel_size=3, activation='relu', padding='same'),
    layers.MaxPool2D(),

    # Block Three
    layers.BatchNormalization(renorm=True),
    layers.Conv2D(filters=256, kernel_size=3, activation='relu', padding='same'),
    layers.Conv2D(filters=256, kernel_size=3, activation='relu', padding='same'),
    layers.MaxPool2D(),

    # Head
    layers.BatchNormalization(renorm=True),
    layers.Flatten(),
    layers.Dense(8, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])
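
As a sanity check on the architecture above, the arithmetic below traces the feature-map size through the three blocks and counts the convolution and dense weights. It assumes the Keras defaults used above (`MaxPool2D` with a 2x2 window and stride 2, and `padding='same'` convolutions that preserve height and width). The helper names are illustrative, and batch-normalization parameters are left out.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights plus biases of a Conv2D layer with a square kernel."""
    return kernel_size * kernel_size * in_channels * filters + filters

def dense_params(in_features, units):
    """Weights plus biases of a Dense layer."""
    return in_features * units + units

size, channels = 128, 3
conv_total = 0
# (filters, number of conv layers) for blocks one to three
for filters, n_convs in [(64, 1), (128, 1), (256, 2)]:
    for _ in range(n_convs):
        conv_total += conv2d_params(3, channels, filters)
        channels = filters  # 'same' padding keeps height and width
    size //= 2  # default 2x2 max pooling halves height and width

flat_features = size * size * channels
head_total = dense_params(flat_features, 8) + dense_params(8, 1)

print(size, channels)  # 16 256
print(flat_features)   # 65536
print(conv_total)      # 960896
print(head_total)      # 524305
```

The flattened 16x16x256 feature map feeds 65,536 values into `Dense(8)`, so that one layer alone holds over half a million weights.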

In [None]:
model.compile(
    optimizer = tf.keras.optimizers.Adam(epsilon=0.01),
    loss = 'binary_crossentropy',
    metrics = ['binary_accuracy']
)

In [None]:
history = model.fit(
    ds_train,
    validation_data = ds_valid,
    epochs = 50
)

history_frame = pd.DataFrame(history.history)
history_frame.loc[:, ['loss', 'val_loss']].plot()
history_frame.loc[:, ['binary_accuracy', 'val_binary_accuracy']].plot();

As we can see, after about 10 epochs the training loss keeps falling sharply while the validation loss lags behind. Similarly, the validation accuracy plateaus at roughly 90% from around epoch 30, while the training accuracy flatlines at almost 100%.

This suggests the model is overfitting, and we need to apply some regularization techniques.
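
One way to quantify what the curves show is to compare training and validation loss per epoch. The sketch below works on a dictionary shaped like `history.history`. The numbers are made up for illustration and are not results from this notebook, and `overfitting_summary` is a hypothetical helper.

```python
def overfitting_summary(history):
    """Return the best validation epoch and the final train/validation loss gap."""
    val_loss = history["val_loss"]
    best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
    final_gap = val_loss[-1] - history["loss"][-1]
    return best_epoch, final_gap

# Hypothetical loss values, NOT taken from the training run above
fake_history = {
    "loss":     [0.70, 0.55, 0.40, 0.25, 0.12, 0.05],
    "val_loss": [0.69, 0.58, 0.47, 0.42, 0.45, 0.50],
}
best_epoch, final_gap = overfitting_summary(fake_history)
print(best_epoch)           # 3 (zero-based index of the lowest validation loss)
print(round(final_gap, 2))  # 0.45
```

A validation loss that bottoms out early while the gap to the training loss keeps widening is the overfitting pattern described above.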

## Dropout
Dropout can be applied to the output of dense, convolutional and pooling layers by placing the following line directly after the layer (`rate` is a hyperparameter, set to 0.3 here):

In [None]:
layers.Dropout(rate = 0.3)

In the convolutional part of a CNN, it is usually placed only after the pooling layers.
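
During training, Keras `Dropout` zeroes a random fraction `rate` of its inputs and scales the rest by `1 / (1 - rate)` so that the expected sum is unchanged; at inference it passes inputs through untouched. Below is a plain-Python sketch of the training-time behaviour. The helper `inverted_dropout` is illustrative and is not the Keras implementation.

```python
import random

def inverted_dropout(values, rate, rng):
    """Zero each value with probability `rate`; scale survivors by 1 / (1 - rate)."""
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
activations = [1.0] * 10_000
dropped = inverted_dropout(activations, rate=0.3, rng=rng)

zero_fraction = dropped.count(0.0) / len(dropped)
mean_after = sum(dropped) / len(dropped)
print(round(zero_fraction, 2), round(mean_after, 2))
```

Roughly 30% of the values end up at zero, while the mean stays close to the original value of 1.0 because of the rescaling.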

## Early Stopping
Early stopping prevents overfitting by halting training once the validation loss stops decreasing. It is defined as a `callbacks.EarlyStopping` object and passed in a list to the `callbacks` argument of `model.fit()`:

In [None]:
from tensorflow.keras import callbacks

early_stopping = callbacks.EarlyStopping(
    min_delta = 0.001,
    patience = 5,
    restore_best_weights = True
)

# Example usage with the datasets defined above:
# history = model.fit(
#     ds_train,
#     validation_data = ds_valid,
#     epochs = 50,
#     callbacks = [early_stopping]
# )
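
To show how `min_delta` and `patience` interact, here is a simplified plain-Python sketch of the stopping rule, assuming the monitored quantity is the validation loss. It is not the Keras implementation; `early_stopping_epoch` and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, min_delta=0.001, patience=5):
    """Return (stop_epoch, best_epoch) as zero-based indices.

    stop_epoch is None if training would run through every epoch.
    """
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:  # must beat the best by at least min_delta
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return None, best_epoch

# Hypothetical validation losses, not from the run above
losses = [0.60, 0.50, 0.45, 0.46, 0.4495, 0.47, 0.46, 0.48, 0.44]
stop_epoch, best_epoch = early_stopping_epoch(losses)
print(stop_epoch, best_epoch)  # 7 2
```

Note that the drop to 0.4495 at index 4 does not count as an improvement because it beats the best value by less than `min_delta`. With `restore_best_weights = True`, the weights from the best epoch are the ones kept.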

## Data Augmentation
Data augmentation exposes the model to more varied inputs by applying random transformations to the training images (for this dataset, rotation, contrast changes and horizontal flips make the most sense).

In [None]:
from tensorflow.keras.layers.experimental import preprocessing
# augment = keras.Sequential([
#     # preprocessing.RandomContrast(factor=0.5),
#     preprocessing.RandomFlip(mode='horizontal'), # meaning, left-to-right
#     # preprocessing.RandomFlip(mode='vertical'), # meaning, top-to-bottom
#     preprocessing.RandomWidth(factor=0.15), # horizontal stretch
#     preprocessing.RandomRotation(factor=0.03),
#     preprocessing.RandomTranslation(height_factor=0.1, width_factor=0.1),
# ])
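
As a rough illustration of what a random horizontal flip does to an image, the plain-Python sketch below mirrors a tiny row-major image left-to-right. `random_horizontal_flip` is a hypothetical helper, not the Keras layer, and the 2x3 image is made up.

```python
import random

def random_horizontal_flip(image, rng, probability=0.5):
    """Mirror a row-major image left-to-right with the given probability."""
    if rng.random() < probability:
        return [row[::-1] for row in image]
    return [row[:] for row in image]

# Hypothetical 2x3 single-channel image
image = [[1, 2, 3],
         [4, 5, 6]]

always_flipped = random_horizontal_flip(image, random.Random(0), probability=1.0)
print(always_flipped)  # [[3, 2, 1], [6, 5, 4]]
```

A mirrored car is still a car, so the label is unchanged; this is why horizontal flips suit this dataset while vertical flips would produce unrealistic images.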

## Sample regularized model

In [None]:
model = keras.Sequential([
    layers.InputLayer(input_shape=[128, 128, 3]),
    
    # Data Augmentation
    preprocessing.RandomContrast(factor=0.10),
    preprocessing.RandomFlip(mode='horizontal'),
    preprocessing.RandomRotation(factor=0.10),

    # Block One
    layers.BatchNormalization(renorm=True),
    layers.Conv2D(filters=64, kernel_size=3, activation='relu', padding='same'),
    layers.MaxPool2D(),
    layers.Dropout(rate = 0.3),

    # Block Two
    layers.BatchNormalization(renorm=True),
    layers.Conv2D(filters=128, kernel_size=3, activation='relu', padding='same'),
    layers.MaxPool2D(),
    layers.Dropout(rate = 0.3),

    # Block Three
    layers.BatchNormalization(renorm=True),
    layers.Conv2D(filters=256, kernel_size=3, activation='relu', padding='same'),
    layers.Conv2D(filters=256, kernel_size=3, activation='relu', padding='same'),
    layers.MaxPool2D(),
    layers.Dropout(rate = 0.3),

    # Head
    layers.BatchNormalization(renorm=True),
    layers.Flatten(),
    layers.Dense(8, activation='relu'),
    layers.Dropout(rate = 0.3),
    layers.Dense(1, activation='sigmoid'),
])

In [None]:
optimizer = tf.keras.optimizers.Adam(epsilon=0.01)
model.compile(
    optimizer=optimizer,
    loss='binary_crossentropy',
    metrics=['binary_accuracy'],
)

history = model.fit(
    ds_train,
    validation_data=ds_valid,
    epochs=50,
)

history_frame = pd.DataFrame(history.history)
history_frame.loc[:, ['loss', 'val_loss']].plot()
history_frame.loc[:, ['binary_accuracy', 'val_binary_accuracy']].plot();

Here, both the training and validation loss keep decreasing, and both the training and validation accuracy keep increasing. This suggests that our changes were successful in keeping the model from overfitting.

## Result
Thus, we have successfully created a CNN that can classify Cars vs. Trucks given an image, with around 90% accuracy.