##### ARTI 560 - Computer Vision  
## Image Classification using Transfer Learning - Exercise 

### Objective

In this exercise, you will:

1. Select another pretrained model (e.g., VGG16, MobileNetV2, or EfficientNet) and fine-tune it for CIFAR-10 classification.  
You'll find the pretrained models in the [TensorFlow Keras Applications module](https://www.tensorflow.org/api_docs/python/tf/keras/applications).

2. Before training, inspect the architecture using `model.summary()` and observe:
- Network depth
- Number of parameters
- Trainable vs. frozen layers

3. Then compare its performance with ResNet and the custom CNN.
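
The inspection in step 2 can also be done programmatically. The following is a minimal sketch, not part of the exercise code: `summarize_layers` is a hypothetical helper that relies only on the `name`, `trainable`, and `count_params()` members that Keras layers expose, and `StubLayer` is a hypothetical stand-in so the snippet runs without TensorFlow or a weight download. With a real model you would pass `model.layers` (or the backbone's `layers`) instead of the toy list.

```python
from dataclasses import dataclass


@dataclass
class StubLayer:
    """Hypothetical stand-in mimicking the parts of a Keras layer used below."""
    name: str
    n_params: int
    trainable: bool = True

    def count_params(self) -> int:
        return self.n_params


def summarize_layers(layers):
    """Return depth and parameter totals, split by each layer's `trainable` flag."""
    with_weights = [layer for layer in layers if layer.count_params() > 0]
    frozen = [layer for layer in with_weights if not layer.trainable]
    return {
        "total_layers": len(layers),
        "layers_with_weights": len(with_weights),
        "frozen_layers_with_weights": len(frozen),
        "total_params": sum(layer.count_params() for layer in with_weights),
        "params_in_frozen_layers": sum(layer.count_params() for layer in frozen),
    }


# Toy example: a frozen conv + batch-norm pair, a weight-free ReLU, and a trainable head.
toy = [
    StubLayer("Conv1", 864, trainable=False),
    StubLayer("bn_Conv1", 128, trainable=False),
    StubLayer("Conv1_relu", 0, trainable=False),
    StubLayer("dense", 12810, trainable=True),
]
summary = summarize_layers(toy)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Note that this groups parameters by the layer-level `trainable` flag. A trainable BatchNormalization layer still holds non-trainable moving statistics, so `model.summary()` remains the reference for exact trainable and non-trainable totals.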

### Questions:

- Which model achieved the highest accuracy?
- Which model trained faster?
- How might the architecture explain the differences?

### Answers:
1. The fine-tuned ResNet50V2 achieved the highest test accuracy at 91.62%, slightly ahead of the custom CNN (91.45%). MobileNetV2 reached 85.15% after fine-tuning (81.71% with the backbone frozen).
2. The custom CNN trained the fastest, followed by MobileNetV2, while ResNet50V2 took the longest to train because of its depth and complexity.
3. ResNet50V2 is a deep architecture whose residual connections let it learn complex representations, which leads to higher accuracy but slower training. MobileNetV2 uses depthwise separable convolutions to reduce computation, which makes it faster but, in this experiment, less accurate. The custom CNN is shallow and computationally simple, which explains its fast training and its slightly lower accuracy compared to the fine-tuned ResNet50V2.
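
The saving from depthwise separable convolutions mentioned in answer 3 can be checked with simple arithmetic. The sketch below uses hypothetical helper names and takes its channel sizes (96 in, 24 out, 3x3 kernel) from MobileNetV2's `block_1`, whose depthwise and project layers appear in the layer listing later in this notebook with 864 and 2304 parameters. Biases are left out, which is consistent with those counts.

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv (no bias)."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 conv that mixes channels
    return depthwise + pointwise


k, c_in, c_out = 3, 96, 24
standard = conv_params(k, c_in, c_out)
separable = depthwise_separable_params(k, c_in, c_out)
print("standard 3x3 conv   :", standard)
print("depthwise separable :", separable)
print(f"reduction factor    : {standard / separable:.1f}x")
```

For these sizes the separable version needs 3,168 weights instead of 20,736, roughly a 6.5x reduction for a single layer pair.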

In [2]:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input

# -----------------------------
# 1) Load CIFAR-10
# -----------------------------
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()

class_names = [
    "airplane","automobile","bird","cat","deer",
    "dog","frog","horse","ship","truck"
]

# Keep labels as integers (SparseCategoricalCrossentropy)
y_train = y_train.squeeze().astype("int64")
y_test  = y_test.squeeze().astype("int64")

# Convert images to float32
x_train = x_train.astype("float32")
x_test  = x_test.astype("float32")

# -----------------------------
# 2) Data augmentation
# -----------------------------
data_augmentation = keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.05),
    layers.RandomZoom(0.1),
], name="augmentation")

# -----------------------------
# 3) Build MobileNetV2 backbone (pretrained)
# -----------------------------
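# NOTE: the names `resnet_base` and `resnet_model` are carried over from the
# ResNet notebook; in this exercise they hold a MobileNetV2 backbone and model.
resnet_base = MobileNetV2(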
    include_top=False,
    weights="imagenet",
    input_shape=(224, 224, 3)
)
resnet_base.trainable = False  # freeze first (feature extractor)

# -----------------------------
# 4) Full model (preprocess inside model)
# -----------------------------
resnet_model = keras.Sequential([
    layers.Input(shape=(32, 32, 3)),
    data_augmentation,
    layers.Resizing(224, 224, interpolation="bilinear"),
    layers.Lambda(preprocess_input),          # scales pixels from [0, 255] to [-1, 1], as MobileNetV2 expects
    resnet_base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(10)                          # raw logits for the 10 CIFAR-10 classes
], name="cifar10_mobilenetv2")

resnet_model.summary()

# -----------------------------
# 5) Compile + Train (frozen backbone)
# -----------------------------
resnet_model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

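callbacks = [
    # Note: with epochs=3 below, patience=3 can never trigger an early stop.
    keras.callbacks.EarlyStopping(monitor="val_accuracy", patience=3, restore_best_weights=True),
    # Halve the learning rate when val_loss fails to improve for one epoch.
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=1),
]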

history = resnet_model.fit(
    x_train, y_train,
    validation_split=0.1,
    epochs=3,
    batch_size=64,
    callbacks=callbacks,
    verbose=1
)

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 6s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_224_no_top.h5
9406464/9406464 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step


Epoch 1/3
704/704 ━━━━━━━━━━━━━━━━━━━━ 83s 105ms/step - accuracy: 0.6055 - loss: 1.1498 - val_accuracy: 0.8062 - val_loss: 0.5751 - learning_rate: 0.0010
Epoch 2/3
704/704 ━━━━━━━━━━━━━━━━━━━━ 76s 108ms/step - accuracy: 0.7407 - loss: 0.7423 - val_accuracy: 0.8306 - val_loss: 0.5099 - learning_rate: 0.0010
Epoch 3/3
704/704 ━━━━━━━━━━━━━━━━━━━━ 76s 108ms/step - accuracy: 0.7591 - loss: 0.6938 - val_accuracy: 0.8226 - val_loss: 0.5065 - learning_rate: 0.0010


In [3]:

# -----------------------------
# 6) Test / Evaluate
# -----------------------------
test_loss, test_acc_r = resnet_model.evaluate(x_test, y_test, verbose=0)
print("MobileNetV2 (frozen) test accuracy:", test_acc_r)
print("MobileNetV2 (frozen) test loss    :", test_loss)


MobileNetV2 (frozen) test accuracy: 0.8170999884605408
MobileNetV2 (frozen) test loss    : 0.5334654450416565
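
Because the final `Dense(10)` layer outputs raw logits (the loss is configured with `from_logits=True`), a softmax is needed to turn model outputs into class probabilities. The NumPy sketch below illustrates that step. The logit values are made up for illustration and are not model output; with the trained model they would come from something like `resnet_model.predict(x_test[:2])`.

```python
import numpy as np

class_names = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# Made-up logits for two images (illustrative values, not model output).
logits = np.array([
    [0.1, 0.3, -1.2, 4.0, 0.0, 2.5, -0.7, 0.2, -2.0, 0.4],
    [3.1, 0.2, 0.5, -1.0, 0.0, -0.3, -2.2, 0.1, 2.9, 0.6],
])

probs = softmax(logits)              # each row now sums to 1
pred_ids = probs.argmax(axis=-1)     # index of the most likely class per image
for i, idx in enumerate(pred_ids):
    print(f"image {i}: {class_names[idx]} (p={probs[i, idx]:.2f})")
```

The predicted class is the same whether `argmax` is taken over logits or probabilities; the softmax only matters when the confidence values themselves are needed.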


In [4]:
# Print the total number of layers inside the MobileNetV2 backbone
print("Total layers in MobileNetV2 backbone:", len(resnet_base.layers))

# Keep only layers that hold weights (count_params() > 0): convolutions and BatchNormalization.
# Despite its name, this list does not depend on the `trainable` flag; the backbone is still frozen here.
trainable_layers = [layer for layer in resnet_base.layers if layer.count_params() > 0]

# Number of weight-holding layers, used here as a rough measure of model depth.
# The classification head is not counted because the backbone was built with include_top=False.
print("Layers with learnable parameters (depth):", len(trainable_layers))


Total layers in MobileNetV2 backbone: 154
Layers with learnable parameters (depth): 104
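
The figure of 104 can be reconstructed from MobileNetV2's block structure. The sketch below assumes the layout used by the Keras implementation, which matches the layer names printed in this notebook: one stem convolution, a first bottleneck without an expansion convolution, 16 regular bottlenecks with expand, depthwise, and project convolutions, and a final 1x1 convolution, each followed by a BatchNormalization layer.

```python
stem_convs = 1                 # Conv1
first_block_convs = 2          # expanded_conv: depthwise + project (no expand conv)
regular_blocks = 16            # block_1 ... block_16
convs_per_regular_block = 3    # expand, depthwise, project
final_convs = 1                # Conv_1 (1x1 conv at the end of the backbone)

conv_layers = (
    stem_convs
    + first_block_convs
    + regular_blocks * convs_per_regular_block
    + final_convs
)
batchnorm_layers = conv_layers  # every convolution is followed by batch normalization
layers_with_weights = conv_layers + batchnorm_layers

print("convolution layers :", conv_layers)
print("batch-norm layers  :", batchnorm_layers)
print("layers with weights:", layers_with_weights)
```

The remaining 50 of the 154 backbone layers (input, activations, padding, and residual additions) carry no weights.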


In [5]:
# Listing all layers that have learnable parameters (trainable_layers)
# Each layer will be printed with:
# (index in the filtered list, layer name, number of parameters)
for i, layer in enumerate(trainable_layers):
    print(i, layer.name, layer.count_params())

0 Conv1 864
1 bn_Conv1 128
2 expanded_conv_depthwise 288
3 expanded_conv_depthwise_BN 128
4 expanded_conv_project 512
5 expanded_conv_project_BN 64
6 block_1_expand 1536
7 block_1_expand_BN 384
8 block_1_depthwise 864
9 block_1_depthwise_BN 384
10 block_1_project 2304
11 block_1_project_BN 96
12 block_2_expand 3456
13 block_2_expand_BN 576
14 block_2_depthwise 1296
15 block_2_depthwise_BN 576
16 block_2_project 3456
17 block_2_project_BN 96
18 block_3_expand 3456
19 block_3_expand_BN 576
20 block_3_depthwise 1296
21 block_3_depthwise_BN 576
22 block_3_project 4608
23 block_3_project_BN 128
24 block_4_expand 6144
25 block_4_expand_BN 768
26 block_4_depthwise 1728
27 block_4_depthwise_BN 768
28 block_4_project 6144
29 block_4_project_BN 128
30 block_5_expand 6144
31 block_5_expand_BN 768
32 block_5_depthwise 1728
33 block_5_depthwise_BN 768
34 block_5_project 6144
35 block_5_project_BN 128
36 block_6_expand 6144
37 block_6_expand_BN 768
38 block_6_depthwise 1728
39 block_6_depthwise_BN 7
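
The per-layer counts in this listing follow directly from the layer shapes. The sketch below reproduces the first few entries; the helper names are hypothetical, and the channel widths (32 after the stem, 16 after the first block, 96 after a 6x expansion, 24 after `block_1`) are inferred from the printed counts. Convolutions carry no bias here, and each BatchNormalization layer stores four values per channel (gamma, beta, moving mean, moving variance), of which only gamma and beta are updated by gradient descent.

```python
def conv2d_params(k, c_in, c_out):
    """Weights of a k x k convolution without bias."""
    return k * k * c_in * c_out


def depthwise_params(k, channels):
    """Weights of a k x k depthwise convolution without bias (one filter per channel)."""
    return k * k * channels


def batchnorm_params(channels):
    """gamma, beta, moving mean, moving variance: four values per channel."""
    return 4 * channels


expected = {
    "Conv1": conv2d_params(3, 3, 32),               # RGB input -> 32 channels
    "bn_Conv1": batchnorm_params(32),
    "expanded_conv_depthwise": depthwise_params(3, 32),
    "expanded_conv_project": conv2d_params(1, 32, 16),
    "block_1_expand": conv2d_params(1, 16, 96),     # expansion factor 6
    "block_1_expand_BN": batchnorm_params(96),
    "block_1_depthwise": depthwise_params(3, 96),
    "block_1_project": conv2d_params(1, 96, 24),
    "block_1_project_BN": batchnorm_params(24),
}
for name, count in expected.items():
    print(f"{name:26s} {count}")
```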

In [6]:
# -----------------------------
# 7) Fine-tune the last 30 backbone layers (all earlier layers stay frozen)
# -----------------------------
resnet_base.trainable = True
for layer in resnet_base.layers[:-30]:
    layer.trainable = False

print("Trainable layers in backbone:", sum(l.trainable for l in resnet_base.layers), "/", len(resnet_base.layers))

resnet_model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-5),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

history_ft = resnet_model.fit(
    x_train, y_train,
    validation_split=0.1,
    epochs=3,
    batch_size=64,
    verbose=1
)

test_loss_ft, test_acc_ft = resnet_model.evaluate(x_test, y_test, verbose=0)
print("MobileNetV2 (fine-tuned) test accuracy:", test_acc_ft)
print("MobileNetV2 (fine-tuned) test loss    :", test_loss_ft)

Trainable layers in backbone: 30 / 154
Epoch 1/3
704/704 ━━━━━━━━━━━━━━━━━━━━ 110s 142ms/step - accuracy: 0.6725 - loss: 0.9588 - val_accuracy: 0.8260 - val_loss: 0.5176
Epoch 2/3
704/704 ━━━━━━━━━━━━━━━━━━━━ 98s 139ms/step - accuracy: 0.7699 - loss: 0.6599 - val_accuracy: 0.8354 - val_loss: 0.4656
Epoch 3/3
704/704 ━━━━━━━━━━━━━━━━━━━━ 98s 140ms/step - accuracy: 0.7982 - loss: 0.5786 - val_accuracy: 0.8496 - val_loss: 0.4163
MobileNetV2 (fine-tuned) test accuracy: 0.8514999747276306
MobileNetV2 (fine-tuned) test loss    : 0.4299635887145996
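
The numbers in the two training logs can be cross-checked with a few lines of arithmetic. The sketch below copies the epoch durations and test accuracies from the outputs above (accuracies rounded to four decimals) and assumes the standard 50,000-image CIFAR-10 training set with `validation_split=0.1` and `batch_size=64`, as used in the `fit` calls.

```python
import math

# Steps per epoch: 10% of the 50,000 training images are held out for validation.
train_images = 50_000
val_images = train_images // 10          # validation_split=0.1
batch_size = 64
fit_images = train_images - val_images
steps_per_epoch = math.ceil(fit_images / batch_size)

# Epoch durations in seconds, copied from the training logs.
frozen_epochs = [83, 76, 76]
finetune_epochs = [110, 98, 98]

# Test accuracies copied from the evaluation output.
acc_frozen = 0.8171
acc_finetuned = 0.8515

print("steps per epoch      :", steps_per_epoch)
print("frozen stage time    :", sum(frozen_epochs), "s")
print("fine-tune stage time :", sum(finetune_epochs), "s")
print(f"accuracy gain        : {100 * (acc_finetuned - acc_frozen):.2f} points")
```

This reproduces the 704 steps shown in the progress bars. Fine-tuning the last 30 layers took about 30% longer per stage (306 s versus 235 s) and raised test accuracy by roughly 3.4 percentage points.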
