##### ARTI 560 - Computer Vision  
## Image Classification using Transfer Learning - Exercise 

### Objective

In this exercise, you will:

1. Select another pretrained model (e.g., VGG16, MobileNetV2, or EfficientNet) and fine-tune it for CIFAR-10 classification.  
You'll find the pretrained models in the [TensorFlow Keras Applications module](https://www.tensorflow.org/api_docs/python/tf/keras/applications).

2. Before training, inspect the architecture using model.summary() and observe:
- Network depth
- Number of parameters
- Trainable vs Frozen layers

3. Then compare its performance with ResNet and the custom CNN.

### Questions:

- Which model achieved the highest accuracy?
- Which model trained faster?
- How might the architecture explain the differences?

In [1]:
import time
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input as mobilenet_preprocess

# -----------------------------
# 1) Load CIFAR-10
# -----------------------------
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()

y_train = y_train.squeeze().astype("int64")
y_test  = y_test.squeeze().astype("int64")

x_train = x_train.astype("float32")
x_test  = x_test.astype("float32")

# -----------------------------
# 2) Data augmentation (reuse yours if you want)
# -----------------------------
data_augmentation = keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.05),
    layers.RandomZoom(0.1),
], name="augmentation")

# -----------------------------
# 3) Build MobileNetV2 backbone (pretrained)
# -----------------------------
mobilenet_base = MobileNetV2(
    include_top=False,
    weights="imagenet",
    input_shape=(224, 224, 3)
)
mobilenet_base.trainable = False

# -----------------------------
# 4) Full model (preprocess inside model)
# -----------------------------
mobilenet_model = keras.Sequential([
    layers.Input(shape=(32, 32, 3)),
    data_augmentation,
    layers.Resizing(224, 224, interpolation="bilinear"),
    layers.Lambda(mobilenet_preprocess),   # IMPORTANT: MobileNetV2 preprocessing
    mobilenet_base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(10)                       # logits
], name="cifar10_mobilenetv2")

# -----------------------------
# 5) Inspect architecture
# -----------------------------
mobilenet_model.summary()

print("\nBackbone depth (layers):", len(mobilenet_base.layers))
print("Backbone params:", mobilenet_base.count_params())
print("Trainable layers (backbone):", sum(l.trainable for l in mobilenet_base.layers), "/", len(mobilenet_base.layers))

# -----------------------------
# 6) Compile + Train (frozen)
# -----------------------------
mobilenet_model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_accuracy", patience=3, restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=1),
]

t0 = time.time()
history_mn_frozen = mobilenet_model.fit(
    x_train, y_train,
    validation_split=0.1,
    epochs=3,
    batch_size=64,
    callbacks=callbacks,
    verbose=1
)
frozen_time = time.time() - t0

test_loss_mn, test_acc_mn = mobilenet_model.evaluate(x_test, y_test, verbose=0)
print("\nMobileNetV2 (frozen) test accuracy:", test_acc_mn)
print("MobileNetV2 (frozen) test loss    :", test_loss_mn)
print("MobileNetV2 (frozen) training time:", frozen_time, "sec")

# -----------------------------
# 7) Fine-tune last layers
# -----------------------------
mobilenet_base.trainable = True

# Freeze most layers, unfreeze last N layers (tune this value)
N = 30
for layer in mobilenet_base.layers[:-N]:
    layer.trainable = False

print("\nAfter unfreezing last", N, "layers:")
print("Trainable layers (backbone):", sum(l.trainable for l in mobilenet_base.layers), "/", len(mobilenet_base.layers))

mobilenet_model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-5),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

t1 = time.time()
history_mn_ft = mobilenet_model.fit(
    x_train, y_train,
    validation_split=0.1,
    epochs=3,
    batch_size=64,
    callbacks=callbacks,
    verbose=1
)
ft_time = time.time() - t1

test_loss_mn_ft, test_acc_mn_ft = mobilenet_model.evaluate(x_test, y_test, verbose=0)
print("\nMobileNetV2 (fine-tuned) test accuracy:", test_acc_mn_ft)
print("MobileNetV2 (fine-tuned) test loss    :", test_loss_mn_ft)
print("MobileNetV2 (fine-tuned) training time:", ft_time, "sec")

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 13s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_224_no_top.h5
9406464/9406464 ━━━━━━━━━━━━━━━━━━━━ 2s 0us/step



Backbone depth (layers): 154
Backbone params: 2257984
Trainable layers (backbone): 0 / 154
Epoch 1/3
704/704 ━━━━━━━━━━━━━━━━━━━━ 80s 102ms/step - accuracy: 0.5907 - loss: 1.1673 - val_accuracy: 0.8200 - val_loss: 0.5271 - learning_rate: 0.0010
Epoch 2/3
704/704 ━━━━━━━━━━━━━━━━━━━━ 72s 102ms/step - accuracy: 0.7402 - loss: 0.7454 - val_accuracy: 0.8190 - val_loss: 0.5306 - learning_rate: 0.0010
Epoch 3/3
704/704 ━━━━━━━━━━━━━━━━━━━━ 73s 104ms/step - accuracy: 0.7616 - loss: 0.6815 - val_accuracy: 0.8370 - val_loss: 0.4741 - learning_rate: 5.0000e-04

MobileNetV2 (frozen) test accuracy: 0.8300999999046326
MobileNetV2 (frozen) test loss    : 0.4937940835952759
MobileNetV2 (frozen) training time: 226.50286436080933 sec

After unfreezing last 30 layers:
Trainable layers (backbone): 30 / 154
Epoch 1/3
704/704 ━━━━━━━━━━━━━━━━━━━━ 107s 138ms/step - accurac

### Results

- Custom CNN
  - Test accuracy: 70.28%
  - Training time: ~110 sec (10 epochs)
  - Architecture: 2 Conv layers + Dense
- ResNet50V2
  - Frozen: test accuracy 87.42%, test loss 0.3587
  - Fine-tuned: test accuracy 91.62%, test loss 0.2423
  - Parameters: 23.5M
  - Backbone depth: 190 layers
- MobileNetV2
  - Frozen: test accuracy 83.01%, training time 226 sec
  - Fine-tuned: test accuracy 85.30%, training time 299 sec
  - Parameters: 2.27M
  - Depth: 154 layers
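
The reported numbers can be cross-checked with a little arithmetic. The sketch below is not part of the original notebook. It copies values from the results and training log above and makes two assumptions: the MobileNetV2 backbone outputs 1280 pooled features (its standard width at `alpha=1.0`), and each MobileNetV2 stage ran for the full 3 epochs requested in `fit()`. The helper names (`fine_tune_gain`, `seconds_per_epoch`) are illustrative.

```python
import math

# --- Sanity checks on the training log (values copied from the notebook) ---
train_images = 50_000                                # CIFAR-10 training set size
fit_images = train_images - train_images // 10       # validation_split=0.1 holds out 10%
steps_per_epoch = math.ceil(fit_images / 64)         # batch_size=64
print("steps per epoch:", steps_per_epoch)           # the log shows 704 steps

backbone_params = 2_257_984                          # "Backbone params" printed above
head_params = 1280 * 10 + 10                         # Dense(10): weights + biases (assumes 1280 features)
total_params = backbone_params + head_params
print("total params:", total_params, f"(~{total_params / 1e6:.2f}M)")

# --- Comparison metrics derived from the reported results ---
results = {
    "Custom CNN": {"test_acc": 70.28, "train_sec": 110, "epochs": 10},
    "MobileNetV2 (frozen)": {"test_acc": 83.01, "train_sec": 226, "epochs": 3},
    "MobileNetV2 (fine-tuned)": {"test_acc": 85.30, "train_sec": 299, "epochs": 3},
    "ResNet50V2 (frozen)": {"test_acc": 87.42},
    "ResNet50V2 (fine-tuned)": {"test_acc": 91.62},
}

def fine_tune_gain(model_name):
    """Accuracy gain (percentage points) from fine-tuning over the frozen backbone."""
    frozen = results[f"{model_name} (frozen)"]["test_acc"]
    tuned = results[f"{model_name} (fine-tuned)"]["test_acc"]
    return round(tuned - frozen, 2)

def seconds_per_epoch(entry_name):
    """Average wall-clock seconds per epoch for one results entry."""
    entry = results[entry_name]
    return round(entry["train_sec"] / entry["epochs"], 1)

for name in ("ResNet50V2", "MobileNetV2"):
    print(f"{name}: fine-tuning gain = {fine_tune_gain(name):+.2f} points")
for name in ("Custom CNN", "MobileNetV2 (frozen)", "MobileNetV2 (fine-tuned)"):
    print(f"{name}: {seconds_per_epoch(name)} sec/epoch")
```

Normalizing by epoch matters here because the custom CNN ran for 10 epochs while each MobileNetV2 stage ran for 3, so total training times are not directly comparable.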

- Which model achieved the highest accuracy?
ResNet50V2 (fine-tuned), with a test accuracy of 91.62%, ahead of fine-tuned MobileNetV2 (85.30%) and the custom CNN (70.28%).

- Which model trained faster?
The custom CNN trained fastest overall (~110 sec for 10 epochs). Among the pretrained models, MobileNetV2 trained faster than ResNet50V2.

- How might the architecture explain the differences?
The custom CNN is shallow (2 Conv layers + Dense) and was trained from scratch, so it lacks the rich feature representations that pretrained models learn from large-scale datasets such as ImageNet.

MobileNetV2 uses depthwise separable convolutions and inverted residual blocks. This significantly reduces computation while maintaining strong feature extraction capability, leading to good accuracy with efficient training.
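
To make the savings from depthwise separable convolutions concrete, here is a minimal sketch comparing weight counts for one layer. The function names and the layer sizes (3x3 kernel, 128 input and output channels) are illustrative assumptions, not values taken from MobileNetV2 itself, and biases are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a k x k standard convolution (bias omitted)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in     # one k x k filter per input channel
    pointwise = c_in * c_out     # 1 x 1 conv that mixes channels
    return depthwise + pointwise

k, c_in, c_out = 3, 128, 128     # illustrative sizes only
std = standard_conv_params(k, c_in, c_out)
sep = depthwise_separable_params(k, c_in, c_out)
print("standard conv weights:      ", std)
print("depthwise separable weights:", sep)
print("reduction factor:           ", round(std / sep, 1))
```

For these sizes the separable version needs roughly 8 times fewer weights. The ratio of separable to standard cost is `1/c_out + 1/k**2`, so the saving grows with kernel size and output width.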

ResNet50V2 uses residual connections, which let gradients flow through identity shortcuts so that very deep networks can train effectively with far less risk of vanishing gradients. Its greater depth (190 vs. 154 layers) and much higher parameter count (23.5M vs. 2.27M) give it more capacity for feature learning, which likely explains its highest accuracy, at the cost of longer training time.
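
The effect of a residual connection on gradient flow can be illustrated with a toy scalar model. This is a deliberately simplified sketch, not how ResNet50V2 is implemented: it assumes every layer has the same small local derivative `local_grad`, whereas real networks involve Jacobian matrices, normalization, and nonlinearities. The function names are hypothetical.

```python
def plain_gradient(local_grad, depth):
    """Gradient through `depth` stacked layers y = F(x): product of F'(x)."""
    grad = 1.0
    for _ in range(depth):
        grad *= local_grad
    return grad

def residual_gradient(local_grad, depth):
    """Gradient through `depth` residual blocks y = x + F(x): product of (1 + F'(x))."""
    grad = 1.0
    for _ in range(depth):
        grad *= 1.0 + local_grad   # the identity shortcut contributes the constant 1
    return grad

depth, local_grad = 50, 0.01       # illustrative values only
print("plain stack:   ", plain_gradient(local_grad, depth))     # vanishingly small
print("residual stack:", residual_gradient(local_grad, depth))  # stays on the order of 1
```

In the plain stack the gradient shrinks geometrically with depth, while the identity term in each residual block keeps it from collapsing toward zero.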