# Homework 4

**Name:**

In [1]:
Name = "Matt Burns"
assert Name != "", 'Please enter your name in the above quotation marks, thanks!'

**A-Number:**

In [2]:
A_number = ""
assert A_number != "", 'Please enter your A-number in the above quotation marks, thanks!'

**Kaggle-UserName:**

In [3]:
Kaggle_UserName = ""
assert Kaggle_UserName != "", 'Please enter your Kaggle Username in the above quotation marks, thanks!'

**Please describe your improvements here**:

* Code is adapted from the following sources: https://github.com/fastai/imagenette
* GPT-4 utilized: https://chat.openai.com/share/c964473d-ab9c-4f07-be8d-bec5bef536ac

* Implemented a MaxBlurPool2D layer
* Utilize a mish activation function
* Custom self attention layer
* Utilize a 50% dropout layer
* Image transformations
* Utilize built-in early stopping and checkpoint

In this homework, we will train a CNN model to classify big cats. The dataset consists of images of ten types of big cats, so this is a multiclass classification task.

 **Please download the dataset from the [inclass Kaggle competition](https://www.kaggle.com/t/e5a7bab3f6c543a9943b3d9970768eaa) as we split the original dataset into the train-valid-test sets.**

This notebook contains a baseline model. Please use it as a starting point. **The purpose of this homework is to design an advanced CNN model by yourself that achieves better performance. You are not allowed to import pre-trained models. In case you are interested, we provide sample code that uses a pre-trained model, ResNet50.**

Your jobs

-   Read, complete, and run the code.

-   **Make substantial improvements** to maximize the accuracy.

-   Submit the .IPYNB file to Canvas.

    - Run all cells in your notebook to make sure there are no errors by doing `Kernel -> Restart Kernel and Clear All Outputs` and then `Run -> Run All Cells`.
    
    - Notebooks with cell execution numbers out of order will have marks deducted. Notebooks without the output displayed may not be graded at all (because we need to see the output in order to grade your work).
    
    - Please keep your notebook clean and delete any throwaway code.

-   Submit the generated "pred.csv" to the [inclass Kaggle competition](https://www.kaggle.com/t/e5a7bab3f6c543a9943b3d9970768eaa).


# **Rules**

- You should finish your homework on your own.
- **You should not modify your prediction files manually.**
- Do not share code or prediction files with any living creatures.
- **Do not search or use additional data.**
- **Do not use any pre-trained models.**
    - You can ask GitHub Copilot for help.


## Hints to Improve Your Results

* Use a GPU machine to run the notebook; otherwise training will be quite slow.
* Revise the simple CNN model
* Revise the *transforms* function by using some image augmentation techniques
* Tune hyper-parameters, such as batch_size

In [4]:
%pip install tensorflow matplotlib tensorflow-addons



In [5]:
# Import necessary libraries
import numpy as np
import os
import matplotlib.pyplot as plt
import tensorflow.keras as keras
import tensorflow as tf
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator, load_img
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.applications.resnet50 import preprocess_input
from tensorflow.keras.optimizers.schedules import ExponentialDecay
from tensorflow.keras.layers import Dropout

In [6]:
# Check if GPU is available and set TensorFlow device to GPU
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Currently, memory growth needs to be the same across GPUs
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(f'{len(gpus)} Physical GPUs, {len(logical_gpus)} Logical GPUs')
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)
else:
    print("No GPU is available.")

1 Physical GPUs, 1 Logical GPUs


In [7]:
from google.colab import files
files.upload()   ## Upload your Kaggle token file.

Saving kaggle.json to kaggle (2).json


{'kaggle (2).json': b'{"username":"","key":""}'}

In [8]:
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
!kaggle competitions download -c fall2023-cs5665-hw4   ## You need to join the competition first.

fall2023-cs5665-hw4.zip: Skipping, found more recently modified local copy (use --force to force download)


In [9]:
import zipfile

local_zip = 'fall2023-cs5665-hw4.zip'
# Extract the competition archive into the working directory
with zipfile.ZipFile(local_zip, 'r') as zip_ref:
    zip_ref.extractall('./')

In [10]:
# Define MaxBlurPool2D layer (mimicking the FastAI MaxBlurPool technique)
class MaxBlurPool2D(layers.Layer):
    def __init__(self, pool_size=(2, 2), strides=(2, 2), padding='valid', **kwargs):
        super(MaxBlurPool2D, self).__init__(**kwargs)
        self.pool_size = pool_size
        self.strides = strides
        self.padding = padding
        self.max_pool = layers.MaxPooling2D(pool_size=self.pool_size, strides=self.strides, padding=self.padding)

    def build(self, input_shape):
        # Create a 3x3 gaussian blur kernel
        kernel_vals = np.array([1, 2, 1], dtype=np.float32)
        kernel_vals = kernel_vals[:, np.newaxis] * kernel_vals[np.newaxis, :]
        kernel_vals /= np.sum(kernel_vals)
        blur_kernel = np.tile(kernel_vals[:, :, np.newaxis, np.newaxis], (1, 1, input_shape[-1], 1))
        self.blur_kernel = tf.constant(blur_kernel, dtype=tf.float32)
        super(MaxBlurPool2D, self).build(input_shape)

    def call(self, inputs):
        x = self.max_pool(inputs)
        # Add blur effect using depthwise_conv2d
        return tf.nn.depthwise_conv2d(input=x, filter=self.blur_kernel, strides=[1, 1, 1, 1], padding='SAME')
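
The pooling-then-blur idea in the layer above can be checked on a tiny array. The following is a minimal NumPy sketch, not the Keras layer itself: the helper names `max_pool_2x2` and `blur_same` and the 4x4 `feature_map` are made up for illustration, and a single channel is assumed.

```python
import numpy as np

# 1-D binomial weights; their outer product is the 3x3 blur kernel
row = np.array([1.0, 2.0, 1.0])
kernel = np.outer(row, row)
kernel /= kernel.sum()  # normalise so the weights sum to 1


def max_pool_2x2(x):
    """2x2 max pooling with stride 2 and 'valid' padding on a 2-D array."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    blocks = x[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))


def blur_same(x, k):
    """3x3 filtering with zero padding, i.e. 'SAME' padding at stride 1."""
    padded = np.pad(x, 1)
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * k)
    return out


feature_map = np.arange(16, dtype=float).reshape(4, 4)  # hypothetical input
pooled = max_pool_2x2(feature_map)   # shape (2, 2)
blurred = blur_same(pooled, kernel)  # shape (2, 2)
print(kernel)
print(pooled)
print(blurred)
```

Because the kernel is normalised, the blur only smooths the pooled responses; values near the border shrink because the zero padding is averaged in.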

In [11]:
# Define a custom Mish activation function
def mish(x):
    return x * tf.math.tanh(tf.math.softplus(x))
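
To get a feel for the shape of Mish, the same formula can be evaluated for a few scalars with the standard library. This is only an illustrative sketch; `mish_scalar` is a hypothetical helper and is not used by the model.

```python
import math


def mish_scalar(x):
    """Mish for a single float: x * tanh(softplus(x))."""
    softplus = math.log1p(math.exp(x))  # ln(1 + e^x)
    return x * math.tanh(softplus)


for value in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"mish({value:+.1f}) = {mish_scalar(value):+.4f}")
```

The function is close to the identity for large positive inputs and dips slightly below zero for negative inputs instead of clamping them to zero as ReLU does.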

In [12]:
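# Define a custom Self-Attention Layer
class SelfAttention(layers.Layer):
    def __init__(self, channels, **kwargs):
        super(SelfAttention, self).__init__(**kwargs)
        self.channels = channels
        self.query = layers.Dense(channels)
        self.key = layers.Dense(channels)
        self.value = layers.Dense(channels)
        self.gamma = self.add_weight(name='gamma', shape=[1], initializer='zeros', trainable=True)

    def call(self, inputs):
        shape = tf.shape(inputs)  # [bs, h, w, c]
        # Flatten the spatial grid so attention spans all h*w positions
        # (a matmul on the 4-D tensor would only attend along the width axis)
        flat = tf.reshape(inputs, [shape[0], -1, self.channels])  # [bs, h*w, c]
        f = self.query(flat)  # [bs, h*w, c]
        g = self.key(flat)    # [bs, h*w, c]
        h = self.value(flat)  # [bs, h*w, c]

        s = tf.matmul(g, f, transpose_b=True)  # [bs, h*w, h*w]
        beta = tf.nn.softmax(s, axis=-1)  # attention map

        o = tf.matmul(beta, h)  # [bs, h*w, c]
        o = tf.reshape(o, shape=shape)  # [bs, h, w, c]
        x = self.gamma * o + inputs
        return x

    def get_config(self):
        # Record `channels` so the layer can be rebuilt when the model is loaded
        config = super(SelfAttention, self).get_config()
        config.update({'channels': self.channels})
        return config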
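
The shape comments in the layer above describe attention over all `h*w` spatial positions. A minimal NumPy sketch of that computation for a single feature map is given here; the toy sizes and the random matrices standing in for the learned Dense kernels are assumptions, and biases are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
h, w, c = 3, 3, 4                       # toy sizes (assumed)
x = rng.normal(size=(h, w, c))          # one feature map, no batch axis

# Stand-ins for the learned Dense kernels (biases omitted)
w_query, w_key, w_value = (rng.normal(size=(c, c)) for _ in range(3))

flat = x.reshape(h * w, c)              # [h*w, c]
f = flat @ w_query
g = flat @ w_key
v = flat @ w_value

scores = g @ f.T                        # [h*w, h*w]
scores -= scores.max(axis=-1, keepdims=True)        # numerical stability
attention = np.exp(scores)
attention /= attention.sum(axis=-1, keepdims=True)  # row-wise softmax

gamma = 0.0                             # the layer initialises gamma to zero
out = gamma * (attention @ v).reshape(h, w, c) + x
print(attention.shape, out.shape)
```

With `gamma` at its initial value of zero the block is an identity mapping, so the attention branch only starts to contribute once training moves `gamma` away from zero.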


In [13]:
# Set up the image data generator with preprocessing
train_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
    vertical_flip=True,  # Add vertical flip
    channel_shift_range=20,  # Shift the channels by up to 20 values
    fill_mode='reflect'  # Use 'reflect' mode for filling in new pixels
    # fill_mode='nearest'
)

In [14]:
valid_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
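
Both generators pass images through `preprocess_input` from `tensorflow.keras.applications.resnet50`. To my understanding its default "caffe" mode flips RGB to BGR and subtracts the ImageNet channel means without rescaling; the sketch below is a rough NumPy equivalent under that assumption, with `caffe_style_preprocess` as a hypothetical name. Since the backbone here is built with `weights=None`, this acts purely as a normalisation choice.

```python
import numpy as np

# ImageNet per-channel means in BGR order, as used by the "caffe" mode
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68])


def caffe_style_preprocess(rgb_image):
    """Flip RGB -> BGR, then subtract the ImageNet channel means (no scaling)."""
    bgr = rgb_image[..., ::-1].astype(np.float64)
    return bgr - IMAGENET_BGR_MEAN


pixel = np.array([[[255.0, 128.0, 0.0]]])  # one RGB pixel, shape (1, 1, 3)
print(caffe_style_preprocess(pixel))
```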

In [15]:
# Set up train and validation generators
train_generator = train_datagen.flow_from_directory(
    './Dataset/train',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical',
    shuffle=True
)

valid_generator = valid_datagen.flow_from_directory(
    './Dataset/val',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical'
)

test_generator = valid_datagen.flow_from_directory(
    './Dataset/test',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical'
)

Found 2111 images belonging to 10 classes.
Found 50 images belonging to 10 classes.
Found 278 images belonging to 1 classes.


In [16]:
base_model = ResNet50(weights=None, include_top=False, input_shape=(224, 224, 3))

In [17]:
from tensorflow.keras import models

# Create a new model on top with MaxBlurPool
with tf.device('/GPU:0'):
    model = models.Sequential([
        base_model,
        MaxBlurPool2D(),
        SelfAttention(base_model.output_shape[-1]),
        layers.GlobalAveragePooling2D(),
        layers.Dense(1024),
        layers.Lambda(mish),
        Dropout(0.5),
        layers.Dense(train_generator.num_classes, activation='softmax')
    ])
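
It can help to trace the tensor sizes through this stack by hand. The arithmetic below is a sketch that assumes the standard ResNet50 figures (total stride of 32 and 2048 output channels) rather than reading them from the model; `pooled_size` is a hypothetical helper.

```python
def pooled_size(size, pool=2, stride=2):
    """Output length of a 'valid' pooling window along one axis."""
    return (size - pool) // stride + 1


input_size = 224
backbone_stride = 32                          # ResNet50 total downsampling
backbone_hw = input_size // backbone_stride   # spatial size after the backbone
backbone_channels = 2048                      # ResNet50 final channel count

pooled_hw = pooled_size(backbone_hw)          # after MaxBlurPool2D
attention_positions = pooled_hw * pooled_hw   # positions in the pooled grid
dense_params = backbone_channels * 1024 + 1024  # first Dense layer (kernel + bias)

print(backbone_hw, pooled_hw, attention_positions, dense_params)
```

Under these assumptions the pooled grid is only 3x3, so the attention layer works over a handful of positions; `model.summary()` would confirm the actual shapes.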

In [18]:
# Compile the model
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

In [19]:
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.callbacks import ModelCheckpoint
from google.colab import drive

# Define the early stopping callback
early_stopping = EarlyStopping(
    monitor='val_loss',  # Monitor the validation loss
    patience=25,          # Number of epochs with no improvement after which training will be stopped
    verbose=1,           # Verbosity mode
    restore_best_weights=True  # Whether to restore model weights from the epoch with the best value of the monitored quantity
)

drive.mount('/content/drive')

# Define the checkpoint callback
checkpoint = ModelCheckpoint(
    '/content/drive/My Drive/model_ranger.h5',  # Path where to save the model
    monitor='val_accuracy',  # Metric to monitor
    verbose=1,  # Logging level
    save_best_only=True,  # Only save a model if `val_accuracy` has improved
    mode='max'  # `max` means that `val_accuracy` should be maximized
)

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
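
The interaction between `patience` and the number of epochs is easy to overlook. The following is a simplified pure-Python sketch of the patience rule, not the Keras implementation; the loss values are invented and `early_stop_epoch` is a hypothetical helper.

```python
def early_stop_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Hypothetical validation losses: improve for 3 epochs, then plateau
losses = [2.3, 2.1, 1.9] + [2.0] * 17   # 20 epochs in total

print(early_stop_epoch(losses, patience=25))  # patience longer than the run
print(early_stop_epoch(losses, patience=5))
```

With `patience=25` and a 20-epoch run the stopping condition can never be met, so in that setting only the checkpoint callback has any effect; a smaller patience would be needed for early stopping to trigger.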


In [20]:
# Add the checkpoint to your list of callbacks
callbacks_list = [early_stopping, checkpoint]
# callbacks_list = [checkpoint]

#custom_objects = {'SelfAttention': SelfAttention, 'MaxBlurPool2D': MaxBlurPool2D}

#model = tf.keras.models.load_model('best_model_2', custom_objects=custom_objects)

# Start training
model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // train_generator.batch_size,
    validation_data=valid_generator,
    validation_steps=valid_generator.samples // valid_generator.batch_size,
    epochs=20,
    callbacks=callbacks_list
)

# Save the model
model.save('model.h5')

# Save the model to Google Drive
model.save('/content/drive/My Drive/model.h5')

Epoch 1/20
Epoch 1: val_accuracy improved from -inf to 0.09375, saving model to /content/drive/My Drive/model_ranger.h5


  saving_api.save_model(


Epoch 2/20
Epoch 2: val_accuracy improved from 0.09375 to 0.12500, saving model to /content/drive/My Drive/model_ranger.h5
Epoch 3/20
Epoch 3: val_accuracy improved from 0.12500 to 0.18750, saving model to /content/drive/My Drive/model_ranger.h5
Epoch 4/20
Epoch 4: val_accuracy improved from 0.18750 to 0.40625, saving model to /content/drive/My Drive/model_ranger.h5
Epoch 5/20
Epoch 5: val_accuracy did not improve from 0.40625
Epoch 6/20
Epoch 6: val_accuracy did not improve from 0.40625
Epoch 7/20
Epoch 7: val_accuracy improved from 0.40625 to 0.53125, saving model to /content/drive/My Drive/model_ranger.h5
Epoch 8/20
Epoch 8: val_accuracy did not improve from 0.53125
Epoch 9/20
Epoch 9: val_accuracy did not improve from 0.53125
Epoch 10/20
Epoch 10: val_accuracy did not improve from 0.53125
Epoch 11/20
Epoch 11: val_accuracy did not improve from 0.53125
Epoch 12/20
Epoch 12: val_accuracy did not improve from 0.53125
Epoch 13/20
Epoch 13: val_accuracy did not improve from 0.53125
Epoc
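
The logged validation accuracies are all multiples of 1/32, which is consistent with the floor division used for the step counts: with 50 validation images and a batch size of 32, one validation step covers only 32 images per epoch. A short worked check, using the sample counts printed by the generators:

```python
import math

train_samples, val_samples, batch_size = 2111, 50, 32

train_steps = train_samples // batch_size             # floor division
val_steps_floor = val_samples // batch_size           # floor division
val_images_seen = val_steps_floor * batch_size        # images scored per epoch
val_steps_ceil = math.ceil(val_samples / batch_size)  # steps needed for all 50

# Logged accuracies expressed as counts of correct images out of 32
logged = [0.09375, 0.125, 0.1875, 0.40625, 0.53125]
correct_counts = [acc * val_images_seen for acc in logged]

print(train_steps, val_steps_floor, val_images_seen, val_steps_ceil)
print(correct_counts)
```

One possible adjustment, offered as a suggestion rather than something this run did, is to use the ceiling value or to omit `validation_steps` so that all 50 validation images are scored each epoch.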

In [21]:
from tensorflow.keras.utils import custom_object_scope
import pandas as pd

def make_predictions_and_export(model_path, test_data_path, output_csv_path):
    """
    Loads a trained model, makes predictions on the test dataset, and exports the predictions to a CSV file.

    :param model_path: Path to the trained model file.
    :param test_data_path: Path to the test dataset directory or file.
    :param output_csv_path: Path where the output CSV file will be saved.
    """
    # Load the model
    with custom_object_scope({'SelfAttention': SelfAttention, 'MaxBlurPool2D': MaxBlurPool2D}):
        model = tf.keras.models.load_model(model_path)


    # Prepare the test dataset
    # Assuming the test data is in a directory and organized in a way that can be used with ImageDataGenerator
    test_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
    test_generator = test_datagen.flow_from_directory(
        test_data_path,
        target_size=(224, 224),  # Assuming the model expects images of this size
        batch_size=32,
        class_mode=None,  # Since we're predicting, we don't need labels
        shuffle=False  # Keep predictions in the same order as the test files
    )

    # Make predictions
    predictions = model.predict(test_generator, verbose=1)

    # Assuming the predictions are categorical, get the class with the highest probability
    predicted_classes = np.argmax(predictions, axis=1)

    # Create a DataFrame with the required structure
    ids = range(len(predicted_classes))  # Assuming IDs should be a range starting from 0
    results_df = pd.DataFrame({'id': ids, 'label': predicted_classes})

    # Export to CSV
    results_df.to_csv(output_csv_path, index=False)
    print(f'Predictions are exported to {output_csv_path}')
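
The function above assigns ids as `0, 1, 2, ...` in the order the generator yields files. As far as I know, `flow_from_directory` with `shuffle=False` walks files in sorted-name order, which is lexicographic rather than numeric. The sketch below illustrates the difference with invented file names; the real test file names and the id convention expected by the competition are not shown in this notebook, so treat this as a check to run, not a confirmed problem.

```python
import re

# Hypothetical test file names; the real names are not shown in this notebook
filenames = ["test/10.jpg", "test/2.jpg", "test/1.jpg", "test/21.jpg"]

lexicographic = sorted(filenames)


def numeric_key(path):
    """Sort key that reads the digits in the file name as an integer."""
    return int(re.search(r"(\d+)\.\w+$", path).group(1))


numeric = sorted(filenames, key=numeric_key)
print(lexicographic)
print(numeric)
```

If the two orders differ for the real data, deriving the `id` column from the generator's `filenames` attribute would be safer than using a plain range.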

In [22]:
make_predictions_and_export(
    model_path='model.h5',
    test_data_path='./Dataset/test',
    output_csv_path='pred.csv')

Found 278 images belonging to 1 classes.
Predictions are exported to pred.csv
