##### ARTI 560 - Computer Vision  
## Image Classification using Transfer Learning - Exercise 

### Objective

In this exercise, you will:

1. Select another pretrained model (e.g., VGG16, MobileNetV2, or EfficientNet) and fine-tune it for CIFAR-10 classification.  
You'll find the pretrained models in the [TensorFlow Keras Applications module](https://www.tensorflow.org/api_docs/python/tf/keras/applications).

2. Before training, inspect the architecture using `model.summary()` and observe:
- Network depth
- Number of parameters
- Trainable vs Frozen layers

3. Then compare its performance with ResNet and the custom CNN.

### Questions:

- Which model achieved the highest accuracy?
- Which model trained faster?
- How might the architecture explain the differences?

## All the answers are below the code

In [18]:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input

In [19]:
# -----------------------------
# 1) Load CIFAR-10
# -----------------------------
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()

class_names = [
    "airplane","automobile","bird","cat","deer",
    "dog","frog","horse","ship","truck"
]

# Keep labels as integers (SparseCategoricalCrossentropy)
y_train = y_train.squeeze().astype("int64")
y_test  = y_test.squeeze().astype("int64")

# Convert images to float32
x_train = x_train.astype("float32")
x_test  = x_test.astype("float32")

In [20]:
# -----------------------------
# 2) Data augmentation
# -----------------------------
data_augmentation = keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.02),
    layers.RandomZoom(0.05),
], name="augmentation")

In [21]:
# -----------------------------
# 3) Build MobileNetV2 backbone 
# -----------------------------
mobilenet_base = MobileNetV2(
    include_top=False,
    weights="imagenet",
    input_shape=(224, 224, 3)
)
mobilenet_base.trainable = False  

In [22]:
# -----------------------------
# 4) Full model 
# -----------------------------
mobilenet_model = keras.Sequential([
    layers.Input(shape=(32, 32, 3)),
    data_augmentation,
    layers.Resizing(224, 224, interpolation="bilinear"),
    layers.Lambda(preprocess_input),          
    mobilenet_base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(10)
], name="cifar10_mobilenetv2")

mobilenet_model.summary()
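
The `preprocess_input` step in the model above is what puts CIFAR-10 pixels on the scale the ImageNet weights expect. For MobileNetV2, the Keras documentation describes it as scaling pixels to the range -1 to 1. The sketch below reproduces that arithmetic in NumPy so the effect is visible on a few values. The helper name `scale_to_minus_one_one` is hypothetical and is only an illustration, not the library function itself.

```python
import numpy as np

def scale_to_minus_one_one(pixels):
    """Hypothetical stand-in: map pixel values from [0, 255] to [-1, 1]."""
    pixels = np.asarray(pixels, dtype="float32")
    return pixels / 127.5 - 1.0

# Darkest, mid-grey, and brightest pixel values
sample = np.array([0.0, 127.5, 255.0], dtype="float32")
scaled = scale_to_minus_one_one(sample)
print(scaled)
```

This is also why the images are kept as `float32` in the 0 to 255 range when loading the data: dividing by 255 beforehand would feed the backbone values squeezed close to -1.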

In [23]:
# -----------------------------
# 5) Compile + Train (frozen backbone)
# -----------------------------
mobilenet_model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_accuracy", patience=3, restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=1),
]

history = mobilenet_model.fit(
    x_train, y_train,
    validation_split=0.1,
    epochs=5,
    batch_size=128,
    callbacks=callbacks,
    verbose=1
)

Epoch 1/5
352/352 ━━━━━━━━━━━━━━━━━━━━ 83s 222ms/step - accuracy: 0.6042 - loss: 1.1502 - val_accuracy: 0.8158 - val_loss: 0.5250 - learning_rate: 0.0010
Epoch 2/5
352/352 ━━━━━━━━━━━━━━━━━━━━ 76s 216ms/step - accuracy: 0.7779 - loss: 0.6445 - val_accuracy: 0.8298 - val_loss: 0.4800 - learning_rate: 0.0010
Epoch 3/5
352/352 ━━━━━━━━━━━━━━━━━━━━ 76s 215ms/step - accuracy: 0.7896 - loss: 0.6042 - val_accuracy: 0.8458 - val_loss: 0.4521 - learning_rate: 0.0010
Epoch 4/5
352/352 ━━━━━━━━━━━━━━━━━━━━ 76s 215ms/step - accuracy: 0.8021 - loss: 0.5666 - val_accuracy: 0.8380 - val_loss: 0.4643 - learning_rate: 0.0010
Epoch 5/5
352/352 ━━━━━━━━━━━━━━━━━━━━ 76s 215ms/step - accuracy: 0.8083 - loss: 0.5525 - val_accuracy: 0.8478 - val_loss: 0.4403 - learning_rate: 5.0000e-04

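The `352/352` in the training log follows from the `fit` arguments: `validation_split=0.1` holds out 10% of the 50,000 CIFAR-10 training images, and the rest is split into batches of 128. A minimal sketch of that arithmetic is below. The helper `steps_per_epoch` is hypothetical, and it assumes the training subset size is obtained by truncating to an integer.

```python
import math

def steps_per_epoch(n_samples, validation_split, batch_size):
    """Hypothetical helper: training subset size and number of batches per epoch."""
    n_train = int(n_samples * (1.0 - validation_split))
    # The last batch may be smaller than batch_size, hence the ceiling
    return n_train, math.ceil(n_train / batch_size)

n_train, n_steps = steps_per_epoch(50_000, 0.1, 128)
print("training images:", n_train)
print("steps per epoch:", n_steps)
```

With roughly 215 ms per step, 352 steps also accounts for the reported time of about 76 s per epoch.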

In [24]:
# -----------------------------
# 6) Test / Evaluate
# -----------------------------
test_loss, test_acc_m = mobilenet_model.evaluate(x_test, y_test, verbose=0)
print("MobileNetV2 (frozen) test accuracy:", test_acc_m)
print("MobileNetV2 (frozen) test loss    :", test_loss)

MobileNetV2 (frozen) test accuracy: 0.840499997138977
MobileNetV2 (frozen) test loss    : 0.46480634808540344


In [25]:
# Print the total number of layers inside the MobileNetV2 backbone
print("Total layers in MobileNetV2 backbone:", len(mobilenet_base.layers))

# Keep only layers that own weights (conv kernels, batch-norm parameters).
# Note: this ignores the `trainable` flag, so frozen layers are counted too.
trainable_layers = [layer for layer in mobilenet_base.layers if layer.count_params() > 0]

# Use the number of parameterized layers as a rough measure of the model's depth
print("Layers with learnable parameters (depth):", len(trainable_layers))


Total layers in MobileNetV2 backbone: 154
Layers with learnable parameters (depth): 104
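
The layer counts above describe depth, but the "trainable vs frozen" part of the objective is easier to read as parameter counts. The sketch below is one way to get them from `trainable_weights` and `non_trainable_weights`. The helpers `count_params` and `param_breakdown` are hypothetical, and a tiny toy model stands in for MobileNetV2 so the example runs without downloading weights.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def count_params(weights):
    """Total number of scalar values in a list of weight variables."""
    return sum(int(np.prod(tuple(w.shape))) for w in weights)

def param_breakdown(model):
    """Hypothetical helper: split a model's parameters into trainable and frozen."""
    trainable = count_params(model.trainable_weights)
    frozen = count_params(model.non_trainable_weights)
    return {"trainable": trainable, "frozen": frozen, "total": trainable + frozen}

# Toy stand-in for a pretrained backbone (assumption: not the real MobileNetV2)
toy_backbone = keras.Sequential([
    keras.Input(shape=(8,)),
    layers.Dense(16),
    layers.Dense(16),
], name="toy_backbone")
toy_backbone.trainable = False  # freeze it, as done for mobilenet_base

toy_model = keras.Sequential([
    keras.Input(shape=(8,)),
    toy_backbone,
    layers.Dense(10),  # new classification head stays trainable
], name="toy_model")

breakdown = param_breakdown(toy_model)
print(breakdown)
```

Calling `param_breakdown(mobilenet_model)` before and after the fine-tuning cell should show parameters moving from the frozen to the trainable column.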


In [26]:
# Listing all layers that have learnable parameters (trainable_layers)
# Each layer will be printed with:
# (index in the filtered list, layer name, number of parameters)
for i, layer in enumerate(trainable_layers):
    print(i, layer.name, layer.count_params())

0 Conv1 864
1 bn_Conv1 128
2 expanded_conv_depthwise 288
3 expanded_conv_depthwise_BN 128
4 expanded_conv_project 512
5 expanded_conv_project_BN 64
6 block_1_expand 1536
7 block_1_expand_BN 384
8 block_1_depthwise 864
9 block_1_depthwise_BN 384
10 block_1_project 2304
11 block_1_project_BN 96
12 block_2_expand 3456
13 block_2_expand_BN 576
14 block_2_depthwise 1296
15 block_2_depthwise_BN 576
16 block_2_project 3456
17 block_2_project_BN 96
18 block_3_expand 3456
19 block_3_expand_BN 576
20 block_3_depthwise 1296
21 block_3_depthwise_BN 576
22 block_3_project 4608
23 block_3_project_BN 128
24 block_4_expand 6144
25 block_4_expand_BN 768
26 block_4_depthwise 1728
27 block_4_depthwise_BN 768
28 block_4_project 6144
29 block_4_project_BN 128
30 block_5_expand 6144
31 block_5_expand_BN 768
32 block_5_depthwise 1728
33 block_5_depthwise_BN 768
34 block_5_project 6144
35 block_5_project_BN 128
36 block_6_expand 6144
37 block_6_expand_BN 768
38 block_6_depthwise 1728
39 block_6_depthwise_BN 7

In [27]:
# -----------------------------
# Fine-tune last layers
# -----------------------------
mobilenet_base.trainable = True

for layer in mobilenet_base.layers[:-30]:
    layer.trainable = False

print("Trainable layers in backbone:", sum(l.trainable for l in mobilenet_base.layers), "/", len(mobilenet_base.layers))

mobilenet_model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-5),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

history_ft = mobilenet_model.fit(
    x_train, y_train,
    validation_split=0.1,
    epochs=5,
    batch_size=128,
    verbose=1
)

Trainable layers in backbone: 30 / 154
Epoch 1/5
352/352 ━━━━━━━━━━━━━━━━━━━━ 107s 280ms/step - accuracy: 0.6961 - loss: 0.8807 - val_accuracy: 0.8400 - val_loss: 0.4837
Epoch 2/5
352/352 ━━━━━━━━━━━━━━━━━━━━ 97s 275ms/step - accuracy: 0.8027 - loss: 0.5695 - val_accuracy: 0.8418 - val_loss: 0.4729
Epoch 3/5
352/352 ━━━━━━━━━━━━━━━━━━━━ 96s 274ms/step - accuracy: 0.8216 - loss: 0.5118 - val_accuracy: 0.8460 - val_loss: 0.4403
Epoch 4/5
352/352 ━━━━━━━━━━━━━━━━━━━━ 97s 275ms/step - accuracy: 0.8334 - loss: 0.4737 - val_accuracy: 0.8576 - val_loss: 0.4062
Epoch 5/5
352/352 ━━━━━━━━━━━━━━━━━━━━ 97s 275ms/step - accuracy: 0.8529 - loss: 0.4218 - val_accuracy: 0.8680 - val_loss: 0.3765

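One detail of the fine-tuning cell is worth a second look: unfreezing the last 30 layers also unfreezes the `BatchNormalization` layers among them, so their statistics are updated on CIFAR-10 batches. The Keras documentation notes that a `BatchNormalization` layer with `trainable = False` runs in inference mode, and keeping these layers frozen is a common variant when fine-tuning. The sketch below shows that variant under stated assumptions: `unfreeze_top_layers` is a hypothetical helper, and a small toy stack stands in for `mobilenet_base`. It is an alternative to try, not a claim that it would improve the results reported here.

```python
from tensorflow import keras
from tensorflow.keras import layers

def unfreeze_top_layers(base, n_last):
    """Hypothetical helper: unfreeze the last n_last layers, keeping batch norm frozen."""
    if n_last <= 0:
        raise ValueError("n_last must be positive")
    base.trainable = True
    for layer in base.layers[:-n_last]:
        layer.trainable = False
    for layer in base.layers[-n_last:]:
        # Frozen BatchNormalization keeps using its pretrained moving statistics
        if isinstance(layer, layers.BatchNormalization):
            layer.trainable = False
    return [layer.name for layer in base.layers if layer.trainable]

# Toy stand-in for a convolutional backbone (assumption: not the real MobileNetV2)
toy_base = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),
    layers.Conv2D(8, 3, use_bias=False, name="conv_a"),
    layers.BatchNormalization(name="bn_a"),
    layers.ReLU(name="relu_a"),
    layers.Conv2D(8, 3, use_bias=False, name="conv_b"),
    layers.BatchNormalization(name="bn_b"),
    layers.ReLU(name="relu_b"),
], name="toy_base")

trainable_names = unfreeze_top_layers(toy_base, n_last=3)
print(trainable_names)
```

As in the notebook, the model has to be compiled again after changing `trainable` flags for the change to take effect during `fit`.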

In [28]:
test_loss_ft, test_acc_ft = mobilenet_model.evaluate(x_test, y_test, verbose=0)
print("MobileNetV2 (fine-tuned) test accuracy:", test_acc_ft)
print("MobileNetV2 (fine-tuned) test loss    :", test_loss_ft)

MobileNetV2 (fine-tuned) test accuracy: 0.8662999868392944
MobileNetV2 (fine-tuned) test loss    : 0.3934086561203003
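
To put the two MobileNetV2 runs side by side, the sketch below collects the per-epoch times and test accuracies printed in the logs above and derives total training time and the accuracy gain. The numbers are copied from this notebook's output (test accuracies rounded to four decimals). The dictionary layout and variable names are just one convenient arrangement.

```python
# Values copied from the training and evaluation output of this notebook
runs = {
    "frozen": {"epoch_seconds": [83, 76, 76, 76, 76], "test_acc": 0.8405},
    "fine-tuned": {"epoch_seconds": [107, 97, 96, 97, 97], "test_acc": 0.8663},
}

for name, run in runs.items():
    total = sum(run["epoch_seconds"])
    mean = total / len(run["epoch_seconds"])
    print(f"{name:>10}: {total} s total, {mean:.1f} s/epoch, test acc {run['test_acc']:.4f}")

frozen_total = sum(runs["frozen"]["epoch_seconds"])
ft_total = sum(runs["fine-tuned"]["epoch_seconds"])
acc_gain = runs["fine-tuned"]["test_acc"] - runs["frozen"]["test_acc"]
print(f"fine-tuning gain: {acc_gain * 100:.2f} percentage points")
```

Fine-tuning costs more time per epoch because gradients now flow through the unfrozen backbone layers, in exchange for a gain of about 2.6 percentage points in test accuracy.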


# Answers:

### - Which model achieved the highest accuracy?

ResNet50V2 achieved the highest accuracy (≈ 91.6%), followed by the fine-tuned MobileNetV2 (≈ 86.6% test accuracy, up from ≈ 84.0% with the frozen backbone), while the custom CNN had the lowest validation performance (≈ 69–70%).

### - Which model trained faster?

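The custom CNN trained the fastest, followed by MobileNetV2 (about 76 s per epoch with the backbone frozen and about 97 s per epoch during fine-tuning in this notebook), while ResNet50V2 was the slowest due to its deeper architecture and larger number of parameters.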

### - How might the architecture explain the differences?

- ResNet50V2 is a deep network with residual (skip) connections, which let it learn complex features effectively and lead to higher accuracy.
- MobileNetV2 is a lightweight model built on depthwise separable convolutions, making it faster but slightly less accurate.
- The custom CNN is simpler and trained from scratch, so it lacks pretrained features and tends to overfit, resulting in lower validation accuracy.
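
The "lightweight" claim about depthwise separable convolutions can be checked against the per-layer parameter counts printed earlier. The sketch below recomputes the counts for `block_1` and compares its depthwise-plus-projection pair with a standard 3x3 convolution over the same channels. The channel sizes (16, 96, 24) are inferred from the printed counts rather than stated in the notebook, and the helper names are hypothetical.

```python
def conv_params(kernel_size, in_channels, out_channels):
    """Weights of a standard convolution without bias."""
    return kernel_size * kernel_size * in_channels * out_channels

def depthwise_params(kernel_size, channels):
    """Weights of a depthwise convolution without bias: one k x k filter per channel."""
    return kernel_size * kernel_size * channels

def batchnorm_params(channels):
    """gamma, beta, moving mean and moving variance: four values per channel."""
    return 4 * channels

# block_1, channel sizes inferred from the listing: 16 -> 96 -> 96 -> 24
expand = conv_params(1, 16, 96)      # matches block_1_expand
depthwise = depthwise_params(3, 96)  # matches block_1_depthwise
project = conv_params(1, 96, 24)     # matches block_1_project
expand_bn = batchnorm_params(96)     # matches block_1_expand_BN

# Depthwise 3x3 + pointwise 1x1 versus a single standard 3x3 convolution (96 -> 24)
separable = depthwise + project
standard = conv_params(3, 96, 24)

print("expand / depthwise / project:", expand, depthwise, project)
print("separable:", separable, "standard 3x3:", standard)
print(f"reduction factor: {standard / separable:.1f}x")
```

For this block the separable pair needs about 6.5 times fewer weights than the standard convolution it replaces, which is the main reason MobileNetV2 is smaller and cheaper per step than ResNet50V2.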