##### ARTI 560 - Computer Vision  
## Image Classification using Transfer Learning - Exercise 

### Objective

In this exercise, you will:

1. Select another pretrained model (e.g., VGG16, MobileNetV2, or EfficientNet) and fine-tune it for CIFAR-10 classification.  
You'll find the pretrained models in the [TensorFlow Keras Applications module](https://www.tensorflow.org/api_docs/python/tf/keras/applications).

2. Before training, inspect the architecture using model.summary() and observe:
- Network depth
- Number of parameters
- Trainable vs Frozen layers

3. Then compare its performance with ResNet and the custom CNN.

### Questions:

- Which model achieved the highest accuracy?
- Which model trained faster?
- How might the architecture explain the differences?

In [None]:
from tensorflow import keras

(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()

y_train = y_train.squeeze()
y_test = y_test.squeeze()

print("x_train:", x_train.shape)
print("y_train:", y_train.shape)
print("x_test :", x_test.shape)
print("y_test :", y_test.shape)

x_train: (50000, 32, 32, 3)
y_train: (50000,)
x_test : (10000, 32, 32, 3)
y_test : (10000,)


In [None]:
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input

# augmentation
data_augmentation = keras.Sequential(
    [
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.1),
    ],
    name="data_augmentation"
)

# Backbone
base_model = MobileNetV2(
    include_top=False,
    weights="imagenet",
    input_shape=(224, 224, 3)
)
base_model.trainable = False

# Model
model = keras.Sequential([
    layers.Input(shape=(32, 32, 3)),
    data_augmentation,
    layers.Resizing(224, 224),
    layers.Lambda(preprocess_input),
    base_model,
    layers.GlobalAveragePooling2D(),
    layers.Dense(10, activation="softmax")
], name="mobilenetv2_basic")

model.summary()

# Exercise 2
print("=== Network Depth ===")
print("Full model depth (layers):", len(model.layers))
print("MobileNetV2 backbone depth (layers):", len(base_model.layers))


Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_224_no_top.h5
9406464/9406464 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step


=== Network Depth ===
Full model depth (layers): 6
MobileNetV2 backbone depth (layers): 154
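
The layer counts above do not show how many weights are actually updated during training. A minimal sketch of a parameter tally follows, assuming the model exposes the standard Keras `trainable_weights` and `non_trainable_weights` lists. The helper name `count_parameters` is hypothetical and not part of the original notebook.

```python
import math


def count_parameters(model):
    """Return (trainable, frozen) parameter counts for a Keras-style model.

    Assumes `model.trainable_weights` and `model.non_trainable_weights`
    are lists of variables that each expose a fully defined `.shape`.
    """
    def total(weights):
        # Each variable contributes the product of its dimensions.
        return sum(math.prod(tuple(w.shape)) for w in weights)

    return total(model.trainable_weights), total(model.non_trainable_weights)
```

Calling `count_parameters(model)` while the backbone is frozen should report only the `Dense` head as trainable, with every MobileNetV2 weight counted as frozen. The same numbers appear at the bottom of `model.summary()`, so this can serve as a cross-check.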


In [None]:
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss=keras.losses.SparseCategoricalCrossentropy(),
    metrics=["accuracy"]
)

history_frozen = model.fit(
    x_train, y_train,
    validation_split=0.1,
    epochs=10,
    batch_size=64,
    verbose=1
)

test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)

print("\n=== Frozen MobileNetV2 Results ===")
print("Final Train Accuracy:", history_frozen.history["accuracy"][-1])
print("Final Val Accuracy  :", history_frozen.history["val_accuracy"][-1])
print("Test Accuracy       :", test_acc)
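# Exercise 2: count top-level layers only, so the whole MobileNetV2 backbone counts as one frozen layer
trainable_layers = sum(layer.trainable for layer in model.layers)
frozen_layers = len(model.layers) - trainable_layers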

print("\n=== Trainable vs Frozen Layers ===")
print("Trainable layers:", trainable_layers)
print("Frozen layers   :", frozen_layers)

Epoch 1/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 84s 113ms/step - accuracy: 0.5828 - loss: 1.1949 - val_accuracy: 0.8110 - val_loss: 0.5422
Epoch 2/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 78s 111ms/step - accuracy: 0.7365 - loss: 0.7608 - val_accuracy: 0.8312 - val_loss: 0.4867
Epoch 3/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 78s 110ms/step - accuracy: 0.7513 - loss: 0.7146 - val_accuracy: 0.8366 - val_loss: 0.4688
Epoch 4/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 78s 110ms/step - accuracy: 0.7595 - loss: 0.6929 - val_accuracy: 0.8418 - val_loss: 0.4597
Epoch 5/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 78s 111ms/step - accuracy: 0.7631 - loss: 0.6829 - val_accuracy: 0.8374 - val_loss: 0.4614
Epoch 6/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 78s 111ms/step - accuracy: 0.7673 - loss: 0.6766 - val_accuracy: 0.8424 - val_loss: 0.4558
Epoch 7/10

In [None]:
# Unfreeze the last 30 backbone layers; keep all earlier layers frozen
base_model.trainable = True

for layer in base_model.layers[:-30]:
    layer.trainable = False

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-5),
    loss=keras.losses.SparseCategoricalCrossentropy(),
    metrics=["accuracy"]
)

history_ft = model.fit(
    x_train, y_train,
    validation_split=0.1,
    epochs=10,
    batch_size=64,
    verbose=1
)

final_test_loss, final_test_acc = model.evaluate(x_test, y_test, verbose=0)

print("\n=== FINAL MobileNetV2 Results ===")
print("Final Train Accuracy:", history_ft.history["accuracy"][-1])
print("Final Val Accuracy  :", history_ft.history["val_accuracy"][-1])
print("FINAL Test Accuracy :", final_test_acc)


Epoch 1/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 111s 146ms/step - accuracy: 0.6652 - loss: 1.0187 - val_accuracy: 0.8344 - val_loss: 0.4842
Epoch 2/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 101s 143ms/step - accuracy: 0.7658 - loss: 0.6904 - val_accuracy: 0.8462 - val_loss: 0.4491
Epoch 3/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 101s 143ms/step - accuracy: 0.7913 - loss: 0.6148 - val_accuracy: 0.8528 - val_loss: 0.4219
Epoch 4/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 101s 143ms/step - accuracy: 0.8047 - loss: 0.5665 - val_accuracy: 0.8622 - val_loss: 0.3903
Epoch 5/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 101s 143ms/step - accuracy: 0.8132 - loss: 0.5329 - val_accuracy: 0.8672 - val_loss: 0.3756
Epoch 6/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 101s 143ms/step - accuracy: 0.8274 - loss: 0.4987 - val_accuracy: 0.8756 - val_loss: 0.3589
Epoc

### Exercise 3

Custom CNN test acc: 0.8741999864578247

ResNet fine-tuned test acc: 0.9161999821662903

MobileNetV2 test acc : 0.8860999941825867
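
To make the comparison explicit, the three test accuracies can be ranked in a few lines. This is only a sketch: the accuracy values are copied from the results above, while the dictionary name `test_accuracy` and the model labels are illustrative choices.

```python
# Test accuracies copied from the results reported above.
test_accuracy = {
    "Custom CNN": 0.8741999864578247,
    "ResNet (fine-tuned)": 0.9161999821662903,
    "MobileNetV2": 0.8860999941825867,
}

# Rank models from highest to lowest test accuracy.
ranking = sorted(test_accuracy.items(), key=lambda item: item[1], reverse=True)
best_name, best_acc = ranking[0]

for name, acc in ranking:
    # Gap to the best model, in percentage points.
    gap = (best_acc - acc) * 100
    print(f"{name:<20} {acc:.4f}  (-{gap:.2f} pp)")
```

The gaps are small enough (roughly 3 and 4 percentage points) that a single run per model may not separate them reliably; repeating each run with different seeds would give a firmer ranking.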

### Questions:

- Which model achieved the highest accuracy?
- Which model trained faster?
- How might the architecture explain the differences?

### Answer:
- ResNet achieved the highest test accuracy (91.62%), ahead of MobileNetV2 (88.61%) and the custom CNN (87.42%). Its residual connections mitigate vanishing gradients, which makes a deep network easier to train.

- The custom CNN trained fastest, followed by MobileNetV2; ResNet was the slowest.

- The custom CNN has a simpler architecture with fewer parameters, which lets it train faster but limits its accuracy.

  MobileNetV2 uses depthwise separable convolutions, an efficient technique that splits a standard convolution into two smaller steps: a depthwise convolution (spatial filtering per channel) and a pointwise convolution (1x1 channel mixing). This sharply reduces the parameter count and computational cost, giving a balanced trade-off between accuracy and speed.

  ResNet is a deeper network with a large number of parameters and strong pretrained features. Its skip connections enable stable training, which yields higher accuracy at the cost of slower training.
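
The parameter saving from depthwise separable convolutions can be checked with simple arithmetic. The sketch below compares weight counts for one layer, ignoring biases; the function names and the layer sizes (3x3 kernel, 32 input channels, 64 output channels) are illustrative assumptions, not values taken from MobileNetV2.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (bias omitted)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depthwise separable convolution (bias omitted)."""
    depthwise = kernel * kernel * c_in  # one spatial filter per input channel
    pointwise = c_in * c_out            # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Illustrative layer sizes, not taken from MobileNetV2 itself.
kernel, c_in, c_out = 3, 32, 64
standard = standard_conv_params(kernel, c_in, c_out)
separable = separable_conv_params(kernel, c_in, c_out)
print("standard :", standard)
print("separable:", separable)
print("reduction: %.2fx" % (standard / separable))
```

For these sizes the separable version needs roughly 8 times fewer weights, and the ratio grows as the number of output channels increases.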