##### ARTI 560 - Computer Vision  
## Image Classification using Transfer Learning - Exercise 

### Objective

In this exercise, you will:

1. Select another pretrained model (e.g., VGG16, MobileNetV2, or EfficientNet) and fine-tune it for CIFAR-10 classification.  
You'll find the pretrained models in [Tensorflow Keras Applications Module](https://www.tensorflow.org/api_docs/python/tf/keras/applications).

2. Before training, inspect the architecture using model.summary() and observe:
- Network depth
- Number of parameters
- Trainable vs Frozen layers

3. Then compare its performance with ResNet and the custom CNN.

### Questions:

- Which model achieved the highest accuracy?
- Which model trained faster?
- How might the architecture explain the differences?

In [2]:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import vgg19
from tensorflow.keras.applications.vgg19 import preprocess_input

# -----------------------------
# 1) Load CIFAR-10
# -----------------------------
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()

class_names = [
    "airplane","automobile","bird","cat","deer",
    "dog","frog","horse","ship","truck"
]

# Keep labels as integers (SparseCategoricalCrossentropy)
y_train = y_train.squeeze().astype("int64")
y_test  = y_test.squeeze().astype("int64")

# Convert images to float32
x_train = x_train.astype("float32")
x_test  = x_test.astype("float32")

# -----------------------------
# 2) Data augmentation
# -----------------------------
data_augmentation = keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.05),
    layers.RandomZoom(0.1),
], name="augmentation")

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
[1m170498071/170498071[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 0us/step


In [13]:

# -----------------------------
# 3) Build / backbone (pretrained)
# -----------------------------
VGG19_base = keras.applications.VGG19(
    include_top=False,
    weights="imagenet",
    input_shape=(224, 224, 3)
)
VGG19_base.trainable = False  # freeze first (feature extractor)

# -----------------------------
# 4) Full model (preprocess inside model)
# -----------------------------
VGG19_model = keras.Sequential([
    layers.Input(shape=(32, 32, 3)),
    data_augmentation,
    layers.Resizing(224, 224, interpolation="bilinear"),
    layers.Lambda(preprocess_input),          # IMPORTANT: correct for ResNet50V2
    VGG19_base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(10)                          # logits
], name="cifar10_VGG19")

VGG19_model.summary()



Observation:

- Network depth

The VGG19 backbone consists of 16 convolutional layers and 3 fully connected layers in its original form. integrated into a 6-block pipeline with a total of 22 trainable internal layers.

- Number of parameters

The model has a total of 20,029,514 parameters (~76.41 MB).

Trainable params: 5,130

Non-trainable params: 20,024,384

- Trainable vs Frozen layers

20,024,384 non-trainable parameters (the parameter count of the vgg19 layer exactly); the VGG19 backbone is completely frozen.
Only the final Dense layer (5,130 parameters) is being trained to adapt the features to the 10 classes of CIFAR-10.

In [14]:
# -----------------------------
# 5) Compile + Train (frozen backbone)
# -----------------------------
VGG19_model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_accuracy", patience=3, restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=1),
]

history = VGG19_model.fit(
    x_train, y_train,
    validation_split=0.1,
    epochs=3,
    batch_size=64,
    callbacks=callbacks,
    verbose=1
)

Epoch 1/3
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m307s[0m 434ms/step - accuracy: 0.5301 - loss: 1.5130 - val_accuracy: 0.8148 - val_loss: 0.5651 - learning_rate: 0.0010
Epoch 2/3
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m326s[0m 463ms/step - accuracy: 0.7650 - loss: 0.6793 - val_accuracy: 0.8440 - val_loss: 0.4752 - learning_rate: 0.0010
Epoch 3/3
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m326s[0m 463ms/step - accuracy: 0.7856 - loss: 0.6242 - val_accuracy: 0.8580 - val_loss: 0.4383 - learning_rate: 0.0010


In [15]:

# -----------------------------
# 6) Test / Evaluate
# -----------------------------
test_loss, test_acc_r = VGG19_model.evaluate(x_test, y_test, verbose=0)
print("VGG19 (frozen) test accuracy:", test_acc_r)
print("VGG19_base (frozen) test loss    :", test_loss)


VGG19 (frozen) test accuracy: 0.8475000262260437
VGG19_base (frozen) test loss    : 0.4665752947330475


In [18]:
# -----------------------------
#Fine-tune last layers
# -----------------------------
VGG19_base.trainable = True #unfreeze
for layer in VGG19_base.layers[:-150]: #freeze all except the last 150 layer 
    layer.trainable = False

print("Trainable layers in backbone:", sum(l.trainable for l in VGG19_base.layers), "/", len(VGG19_base.layers))

VGG19_model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-5),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

history_ft = VGG19_model.fit(
    x_train, y_train,
    validation_split=0.1,
    epochs=3,
    batch_size=64,
    verbose=1,
    callbacks=callbacks,
)



Trainable layers in backbone: 22 / 22
Epoch 1/3
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m948s[0m 1s/step - accuracy: 0.8338 - loss: 0.4826 - val_accuracy: 0.9146 - val_loss: 0.2706 - learning_rate: 1.0000e-05
Epoch 2/3
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m957s[0m 1s/step - accuracy: 0.8979 - loss: 0.3039 - val_accuracy: 0.9234 - val_loss: 0.2344 - learning_rate: 1.0000e-05
Epoch 3/3
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m960s[0m 1s/step - accuracy: 0.9151 - loss: 0.2442 - val_accuracy: 0.9262 - val_loss: 0.2202 - learning_rate: 1.0000e-05


In [None]:
test_loss_ft, test_acc_ft = VGG19_model.evaluate(x_test, y_test, verbose=0)
print("VGG19_ (fine-tuned) test accuracy:", test_acc_ft)
print("VGG19_ (fine-tuned) test loss    :", test_loss_ft)

VGG19_ (fine-tuned) test accuracy: 0.9243999719619751
VGG19_ (fine-tuned) test loss    : 0.22588831186294556


In [21]:
# Collect and compare accuracies (update if you rename variables)
results = {
    "VGG19_ frozen test acc": float(test_acc_r) if 'test_acc_r' in globals() else None,
    "VGG19_ fine-tuned test acc": float(test_acc_ft) if 'test_acc_ft' in globals() else None,
}
for k,v in results.items():
    print(f"{k}: {v}")

VGG19_ frozen test acc: 0.8475000262260437
VGG19_ fine-tuned test acc: 0.9243999719619751


In [None]:
test_acc_ft = 0.9243999719619751
test_acc_r = 0.8475000262260437

# Custom CNN

In [5]:
# -----------------------------
# 3) Build Custom CNN Model
# ----------------------------- 


custom_cnn_model = keras.Sequential([
    layers.Input(shape=(32, 32, 3)),
    data_augmentation,
    
    # Rescaling pixels to [0, 1] (Custom CNNs need manual scaling)
    layers.Rescaling(1./255),
    
    # Feature Extraction Layers
    layers.Conv2D(32, (3, 3), activation='relu', padding='same'),
    layers.MaxPooling2D((2, 2)),
    
    layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    layers.MaxPooling2D((2, 2)),
    
    # Classification Head
    layers.Dense(128, activation='relu'),
    layers.Flatten(),
    layers.Dropout(0.2),
    layers.Dense(10) # Logits
], name="cifar10_custom_cnn")

custom_cnn_model.summary()

# -----------------------------
# 4) Compile + Train
# -----------------------------

callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_accuracy", patience=3, restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=1),
]

custom_cnn_model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

history = custom_cnn_model.fit(
    x_train, y_train,
    validation_split=0.1,
    epochs=10, # Custom CNNs usually need more epochs than transfer learning
    batch_size=64,
    callbacks=callbacks,
    verbose=1
)

Epoch 1/10
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m12s[0m 8ms/step - accuracy: 0.3685 - loss: 1.7541 - val_accuracy: 0.5604 - val_loss: 1.2684 - learning_rate: 0.0010
Epoch 2/10
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m6s[0m 8ms/step - accuracy: 0.5405 - loss: 1.2900 - val_accuracy: 0.5768 - val_loss: 1.1848 - learning_rate: 0.0010
Epoch 3/10
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 7ms/step - accuracy: 0.5938 - loss: 1.1444 - val_accuracy: 0.6318 - val_loss: 1.0409 - learning_rate: 0.0010
Epoch 4/10
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 7ms/step - accuracy: 0.6228 - loss: 1.0768 - val_accuracy: 0.6602 - val_loss: 0.9873 - learning_rate: 0.0010
Epoch 5/10
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m10s[0m 7ms/step - accuracy: 0.6440 - loss: 1.0212 - val_accuracy: 0.6712 - val_loss: 0.9561 - learning_rate: 0.0010
Epoch 6/10
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [

In [6]:
# -----------------------------
# 6) Test / Evaluate
# -----------------------------

test_loss_cnn, test_acc_cnn = custom_cnn_model.evaluate(x_test, y_test, verbose=0)

print("Custom CNN Test Accuracy:", test_acc_cnn)
print("Custom CNN Test Loss    :",test_loss_cnn)

Custom CNN Test Accuracy: 0.7128999829292297
Custom CNN Test Loss    : 0.8473742008209229


Resneet

In [7]:
from tensorflow.keras.applications import ResNet50V2
from tensorflow.keras.applications.resnet_v2 import preprocess_input

# -----------------------------
# 3) Build ResNet50V2 backbone (pretrained)
# -----------------------------
resnet_base = ResNet50V2(
    include_top=False,
    weights="imagenet",
    input_shape=(224, 224, 3)
)
resnet_base.trainable = False  # freeze first (feature extractor)

# -----------------------------
# 4) Full model (preprocess inside model)
# -----------------------------
resnet_model = keras.Sequential([
    layers.Input(shape=(32, 32, 3)),
    data_augmentation,
    layers.Resizing(224, 224, interpolation="bilinear"),
    layers.Lambda(preprocess_input),     # IMPORTANT: correct for ResNet50V2 without it result in  bad accuarcy
    resnet_base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(10)                          # logits
], name="cifar10_resnet50v2")

resnet_model.summary()

# -----------------------------
# 5) Compile + Train (frozen backbone)
# -----------------------------
resnet_model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_accuracy", patience=3, restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=1),
]

history = resnet_model.fit(
    x_train, y_train,
    validation_split=0.1,
    epochs=3,
    batch_size=64,
    callbacks=callbacks,
    verbose=1
)



Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50v2_weights_tf_dim_ordering_tf_kernels_notop.h5
[1m94668760/94668760[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 0us/step


Epoch 1/3
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m176s[0m 239ms/step - accuracy: 0.6768 - loss: 0.9333 - val_accuracy: 0.8708 - val_loss: 0.3686 - learning_rate: 0.0010
Epoch 2/3
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m170s[0m 242ms/step - accuracy: 0.8001 - loss: 0.5682 - val_accuracy: 0.8738 - val_loss: 0.3543 - learning_rate: 0.0010
Epoch 3/3
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m169s[0m 240ms/step - accuracy: 0.8216 - loss: 0.5172 - val_accuracy: 0.8858 - val_loss: 0.3366 - learning_rate: 0.0010


In [8]:
# -----------------------------
# 6) Test / Evaluate
# -----------------------------
test_loss_res, test_acc_res = resnet_model.evaluate(x_test, y_test, verbose=0)
print("ResNet50V2 (frozen) test accuracy:", test_acc_res)
print("ResNet50V2 (frozen) test loss    :", test_loss_res)


ResNet50V2 (frozen) test accuracy: 0.8798999786376953
ResNet50V2 (frozen) test loss    : 0.34890511631965637


In [9]:
# -----------------------------
#Fine-tune last layers
# -----------------------------
resnet_base.trainable = True #unfreeze
for layer in resnet_base.layers[:-30]: #freeze all except the last 30 layer (i guess)
    layer.trainable = False

print("Trainable layers in backbone:", sum(l.trainable for l in resnet_base.layers), "/", len(resnet_base.layers))

resnet_model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-5),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

history_ft = resnet_model.fit(
    x_train, y_train,
    validation_split=0.1,
    epochs=3,
    batch_size=64,
    verbose=1
)


Trainable layers in backbone: 30 / 190
Epoch 1/3
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m238s[0m 322ms/step - accuracy: 0.8008 - loss: 0.5775 - val_accuracy: 0.9008 - val_loss: 0.2852
Epoch 2/3
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m224s[0m 319ms/step - accuracy: 0.8665 - loss: 0.3855 - val_accuracy: 0.9136 - val_loss: 0.2471
Epoch 3/3
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m224s[0m 319ms/step - accuracy: 0.8855 - loss: 0.3312 - val_accuracy: 0.9206 - val_loss: 0.2276


In [10]:

test_loss_r_ft, test_acc_r_ft = resnet_model.evaluate(x_test, y_test, verbose=0)
print("ResNet50V2 (fine-tuned) test accuracy:", test_acc_r_ft)
print("ResNet50V2 (fine-tuned) test loss    :", test_loss_r_ft)

ResNet50V2 (fine-tuned) test accuracy: 0.9193000197410583
ResNet50V2 (fine-tuned) test loss    : 0.23613256216049194


# 3. Then compare its performance with ResNet and the custom CNN.

In [20]:
print("Custom CNN Test Accuracy:", test_acc_cnn)
print(" ")
print("ResNet50V2 (frozen) test accuracy:", test_acc_res)
print("ResNet50V2 (fine-tuned) test accuracy:", test_acc_r_ft)
print(" ")
print("VGG19 (frozen) test accuracy:", test_acc_r)
print("VGG19_ (fine-tuned) test accuracy:", test_acc_ft)


Custom CNN Test Accuracy: 0.7128999829292297
 
ResNet50V2 (frozen) test accuracy: 0.8798999786376953
ResNet50V2 (fine-tuned) test accuracy: 0.9193000197410583
 
VGG19 (frozen) test accuracy: 0.8475000262260437
VGG19_ (fine-tuned) test accuracy: 0.9243999719619751



### Questions:

#### - Which model achieved the highest accuracy?
VGG19 (Fine-tuned) achieved the highest test accuracy at 92.44%.
#### - Which model trained faster?
The custom CNN was the fastest trained model even with more epoc.
#### - How might the architecture explain the differences?

Custom CNN: lightweight and fast but lacks the deep "feature extraction" capabilities of  larger models, lower accuracy of 71.29%.

ResNet50V2: Uses Residual (Skip) Connections to train effectivly. good performance even when frozen (87.99%) because its pre-trained ImageNet filters are highly robust.

VGG19: stack of 3×3 convolutions. its high parameter count (20M+), and its simplicity allows it to be very effective when the entire backbone is unfrozen for fine-tuning.