##### ARTI 560 - Computer Vision  
## Image Classification using Transfer Learning - Exercise 

### Objective

In this exercise, you will:

1. Select another pretrained model (e.g., VGG16, MobileNetV2, or EfficientNet) and fine-tune it for CIFAR-10 classification.  
You'll find the pretrained models in the [TensorFlow Keras Applications module](https://www.tensorflow.org/api_docs/python/tf/keras/applications).

2. Before training, inspect the architecture using `model.summary()` and observe:
- Network depth
- Number of parameters
- Trainable vs Frozen layers

3. Then compare its performance with ResNet and the custom CNN.
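
The observations in step 2 can also be read off programmatically instead of from the printed summary. The following is a minimal sketch, assuming TensorFlow 2.x with Keras is installed. `weights=None` is used here only to skip the ImageNet download (the architecture and parameter counts are unchanged), and the helper name `count_params` is hypothetical.

```python
import numpy as np
from tensorflow import keras

# Assumption: weights=None avoids downloading ImageNet weights; layer and
# parameter counts are the same as for the pretrained backbone.
base = keras.applications.MobileNetV2(
    include_top=False, weights=None, input_shape=(224, 224, 3)
)
base.trainable = False  # freeze every layer in the backbone


def count_params(weights):
    """Hypothetical helper: total number of scalars in a list of weight tensors."""
    return sum(int(np.prod(tuple(w.shape))) for w in weights)


n_layers = len(base.layers)
n_frozen = sum(not layer.trainable for layer in base.layers)
trainable_params = count_params(base.trainable_weights)
frozen_params = count_params(base.non_trainable_weights)

print("Layers in backbone   :", n_layers)
print("Frozen layers        :", n_frozen)
print("Trainable parameters :", trainable_params)
print("Frozen parameters    :", frozen_params)
```

Running the same counts again after unfreezing some layers shows how many parameters the fine-tuning phase actually updates.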

### Questions:

- Which model achieved the highest accuracy?
- Which model trained faster?
- How might the architecture explain the differences?

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input

# --------------------------
# 1) Load CIFAR-10 Dataset
# --------------------------
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()

class_names = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck"
]

# Preprocess labels and convert images to float32
y_train = y_train.squeeze().astype("int64")
y_test  = y_test.squeeze().astype("int64")
x_train = x_train.astype("float32")
x_test  = x_test.astype("float32")

# --------------------------
# 2) Data Augmentation
# --------------------------
data_augmentation = keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.05),
    layers.RandomZoom(0.1),
], name="augmentation")

# ----------------------------
# 3) Build MobileNetV2 Backbone (Pretrained)
# ----------------------------

mobilenet_base = MobileNetV2(
    include_top=False,
    weights="imagenet",
    input_shape=(224, 224, 3)
)


mobilenet_base.trainable = False  # freeze the backbone; only the new head is trained in this phase


# ----------------------------
# 4) Construct the Full Model
# ----------------------------

model = keras.Sequential([
    layers.Input(shape=(32, 32, 3)),
    data_augmentation,
    layers.Resizing(224, 224, interpolation="bilinear"),
    layers.Lambda(preprocess_input, name="preprocessing"),          
    mobilenet_base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(512, activation='relu'),
    layers.BatchNormalization(),
    layers.Dropout(0.3), 
    layers.Dense(10)  # raw logits; softmax is applied inside the loss (from_logits=True)
], name="cifar10_mobilenetv2")


model.summary()

# ------------------------
# 5) Compile and Train
# ------------------------
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

# Callbacks for efficient training
callbacks = [
    keras.callbacks.EarlyStopping(
        monitor="val_accuracy", 
        patience=3, 
        restore_best_weights=True
    ),
    keras.callbacks.ReduceLROnPlateau(
        monitor="val_loss", 
        factor=0.5, 
        patience=1
    ),
]

print("\nStarting training...")
history = model.fit(
    x_train, y_train,
    validation_split=0.1,
    epochs=6,
    batch_size=64,
    callbacks=callbacks,
    verbose=1
)

# -----------------
# 6) Evaluate on Test Set
# -----------------
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print(f"\nTest Accuracy: {test_acc:.4f}")


Starting training...
Epoch 1/6
704/704 ━━━━━━━━━━━━━━━━━━━━ 91s 116ms/step - accuracy: 0.6598 - loss: 1.0260 - val_accuracy: 0.8230 - val_loss: 0.5390 - learning_rate: 0.0010
Epoch 2/6
704/704 ━━━━━━━━━━━━━━━━━━━━ 79s 113ms/step - accuracy: 0.7406 - loss: 0.7415 - val_accuracy: 0.8240 - val_loss: 0.5191 - learning_rate: 0.0010
Epoch 3/6
704/704 ━━━━━━━━━━━━━━━━━━━━ 79s 113ms/step - accuracy: 0.7518 - loss: 0.7096 - val_accuracy: 0.8126 - val_loss: 0.5569 - learning_rate: 0.0010
Epoch 4/6
704/704 ━━━━━━━━━━━━━━━━━━━━ 79s 112ms/step - accuracy: 0.7641 - loss: 0.6730 - val_accuracy: 0.8444 - val_loss: 0.4565 - learning_rate: 5.0000e-04
Epoch 5/6
704/704 ━━━━━━━━━━━━━━━━━━━━ 79s 112ms/step - accuracy: 0.7748 - loss: 0.6468 - val_accuracy: 0.8270 - val_loss: 0.5119 - learning_rate: 5.0000e-04
Epoch 6/6
704/704

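The learning-rate column in the log above can be reproduced by hand. The sketch below is a simplified re-implementation of the plateau rule, not the Keras source. It assumes the callback's default `min_delta=1e-4` and no cooldown, and the function name `simulate_reduce_lr` is hypothetical. It replays the five logged `val_loss` values with `factor=0.5` and `patience=1`.

```python
def simulate_reduce_lr(val_losses, lr=1e-3, factor=0.5, patience=1, min_delta=1e-4):
    """Return the learning rate in effect during each epoch, plus the next one."""
    best = float("inf")
    wait = 0
    lr_per_epoch = []
    for loss in val_losses:
        lr_per_epoch.append(lr)        # rate used while this epoch trained
        if loss < best - min_delta:    # validation loss improved
            best = loss
            wait = 0
        else:                          # no improvement this epoch
            wait += 1
            if wait >= patience:
                lr *= factor           # takes effect from the next epoch
                wait = 0
    return lr_per_epoch, lr


# val_loss values copied from epochs 1-5 of the log above
logged_val_loss = [0.5390, 0.5191, 0.5569, 0.4565, 0.5119]
lr_per_epoch, next_lr = simulate_reduce_lr(logged_val_loss)
print(lr_per_epoch)
print(next_lr)
```

Under these assumptions the rate is halved after epoch 3, where `val_loss` rose from 0.5191 to 0.5569, which matches the `5.0000e-04` logged for epoch 4. It is halved again after epoch 5.
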
In [4]:
# -----------------------------
# 7) Fine-tune: unfreeze only the last 30 backbone layers
# -----------------------------
mobilenet_base.trainable = True
for layer in mobilenet_base.layers[:-30]:
    layer.trainable = False


print("Trainable layers in backbone:", sum(l.trainable for l in mobilenet_base.layers), "/", len(mobilenet_base.layers))

# Re-compile so the changed trainable flags and the lower learning rate take effect
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-5),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)


history_ft = model.fit(
    x_train, y_train,
    validation_split=0.1,
    epochs=6,
    batch_size=64,
    callbacks=callbacks,
    verbose=1
)

# Final evaluation on the test set after fine-tuning
test_loss_ft, test_acc_ft = model.evaluate(x_test, y_test, verbose=0)
print("MobileNetV2 (fine-tuned) test accuracy:", test_acc_ft)
print("MobileNetV2 (fine-tuned) test loss    :", test_loss_ft)

Trainable layers in backbone: 30 / 154
Epoch 1/6
704/704 ━━━━━━━━━━━━━━━━━━━━ 113s 147ms/step - accuracy: 0.8152 - loss: 0.5294 - val_accuracy: 0.8684 - val_loss: 0.3694 - learning_rate: 1.0000e-05
Epoch 2/6
704/704 ━━━━━━━━━━━━━━━━━━━━ 101s 144ms/step - accuracy: 0.8297 - loss: 0.4873 - val_accuracy: 0.8752 - val_loss: 0.3478 - learning_rate: 1.0000e-05
Epoch 3/6
704/704 ━━━━━━━━━━━━━━━━━━━━ 101s 144ms/step - accuracy: 0.8426 - loss: 0.4556 - val_accuracy: 0.8794 - val_loss: 0.3344 - learning_rate: 1.0000e-05
Epoch 4/6
704/704 ━━━━━━━━━━━━━━━━━━━━ 101s 144ms/step - accuracy: 0.8446 - loss: 0.4480 - val_accuracy: 0.8818 - val_loss: 0.3308 - learning_rate: 1.0000e-05
Epoch 5/6
704/704 ━━━━━━━━━━━━━━━━━━━━ 101s 144ms/step - accuracy: 0.8557 - loss: 0.4204 - val_accuracy: 0.8824 - val_loss: 0.3276 - learning_rate: 1.0000e

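The `704/704` step count and the per-epoch times in both logs above follow from simple arithmetic. A short sketch, assuming the standard CIFAR-10 training set of 50,000 images. The per-step timings are copied from the logs, and the variable names are illustrative.

```python
import math

n_train_total = 50_000    # CIFAR-10 training images
validation_percent = 10   # validation_split=0.1 in model.fit
batch_size = 64

n_val = n_train_total * validation_percent // 100
n_fit = n_train_total - n_val                     # images actually trained on
steps_per_epoch = math.ceil(n_fit / batch_size)   # last batch is partial

# Per-step times read from the logs (milliseconds)
frozen_ms_per_step = 113
finetune_ms_per_step = 144

frozen_epoch_s = steps_per_epoch * frozen_ms_per_step / 1000
finetune_epoch_s = steps_per_epoch * finetune_ms_per_step / 1000
slowdown = finetune_ms_per_step / frozen_ms_per_step

print(n_fit, steps_per_epoch)
print(round(frozen_epoch_s, 1), round(finetune_epoch_s, 1), round(slowdown, 2))
```

The estimates land close to the logged 79 s and 101 s per epoch. Unfreezing 30 backbone layers makes each step roughly a quarter slower, because gradients must now be computed and applied for those layers as well.
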
**Solution to the exercise questions:**

1. Before training, inspect the architecture using model.summary() and observe:
- Network depth  
- Number of parameters
- Trainable vs Frozen layers
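***

| Model | Depth | Trainable parameters | Non-trainable parameters | Accuracy |
|---|---|---|---|---|
| Custom CNN | ~10 | 315,722 | 0 | 0.7 |
| ResNet | 103 | 20,490 | 23,564,800 | 0.916 |
| MobileNetV2 | 105 | 12,810 | 2,257,984 | 0.886 |

***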
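
The trainable counts in the table can be sanity-checked by hand, since a dense layer with `n_in` inputs and `n_out` outputs has `n_in * n_out + n_out` parameters. The sketch below assumes the pooled backbone output widths (1280 features for MobileNetV2, 2048 for a ResNet50-style backbone) and uses a hypothetical helper, `dense_params`. It also counts the larger head defined in the MobileNetV2 code cell above (`Dense(512)`, `BatchNormalization`, `Dense(10)`). That head comes out well above 12,810, which suggests the table value corresponds to a single `Dense(10)` head.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


# Single Dense(10) classifier on top of the pooled backbone features
mobilenet_single_head = dense_params(1280, 10)
resnet_single_head = dense_params(2048, 10)

# Head from the code cell: Dense(512) -> BatchNormalization -> Dropout -> Dense(10)
bn_trainable = 2 * 512       # gamma and beta
bn_non_trainable = 2 * 512   # moving mean and moving variance
mobilenet_full_head = dense_params(1280, 512) + bn_trainable + dense_params(512, 10)

print(mobilenet_single_head)  # 12810
print(resnet_single_head)     # 20490
print(mobilenet_full_head)    # 662026
```

Dropout adds no parameters, and the batch-normalization moving statistics count as non-trainable even while the head is being trained.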


Then compare its performance with ResNet and the custom CNN.

### Questions:

***1- Which model achieved the highest accuracy?***
- ResNet achieved the highest accuracy (91.6%), but it was also the slowest to train.

***2- Which model trained faster?***
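- The custom CNN trained fastest because it is by far the smallest network (roughly 315k parameters in total, compared with millions in the others). The pretrained models have fewer trainable parameters while their backbones are frozen, but every batch must still pass through the full backbone in the forward pass.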

***3- How might the architecture explain the differences?***
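- ResNet: Its high accuracy comes from its depth combined with residual (skip) connections. The identity shortcuts let gradients flow past each block, so a very deep stack of layers remains trainable and can learn rich features.

- MobileNetV2: It is built from inverted residual blocks. Each block expands a narrow bottleneck with a 1x1 convolution, filters it with a cheap depthwise convolution, and then projects it back to a narrow bottleneck. This makes it much more memory-efficient than ResNet (about 2.3M backbone parameters versus about 23.6M), while skip connections between the bottlenecks help it maintain high accuracy.

- CNN: It is simple and has no pretrained knowledge, so it must learn every feature from CIFAR-10 alone and would need much more data and training time to reach the accuracy of the other two.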
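
Much of MobileNetV2's parameter saving comes from replacing standard convolutions with depthwise separable ones. A minimal sketch of the count for a single layer, ignoring biases and batch normalization. The channel sizes are illustrative assumptions, not taken from either network, and the function names are hypothetical.

```python
def standard_conv_params(k, c_in, c_out):
    """k x k convolution that mixes all input channels into each output channel."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """One k x k filter per input channel, followed by a 1 x 1 pointwise mix."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


k, c_in, c_out = 3, 128, 128   # illustrative sizes (assumption)
standard = standard_conv_params(k, c_in, c_out)
separable = depthwise_separable_params(k, c_in, c_out)
print(standard, separable, round(standard / separable, 1))
```

For these sizes the separable version needs roughly an eighth of the weights. Savings of this kind are consistent with the much smaller parameter count listed for MobileNetV2 in the table.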