##### ARTI 560 - Computer Vision  
## Image Classification using Transfer Learning - Exercise 

### Objective

In this exercise, you will:

1. Select another pretrained model (e.g., VGG16, MobileNetV2, or EfficientNet) and fine-tune it for CIFAR-10 classification.  
You'll find the pretrained models in the [TensorFlow Keras Applications module](https://www.tensorflow.org/api_docs/python/tf/keras/applications).

2. Before training, inspect the architecture using `model.summary()` and observe:
- Network depth
- Number of parameters
- Trainable vs Frozen layers

3. Then compare its performance with ResNet and the custom CNN.

### Questions:

- Which model achieved the highest accuracy?
- Which model trained faster?
- How might the architecture explain the differences?

1. Selected pretrained model: EfficientNetB0

In [2]:
# imports
import time
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import EfficientNetB0
from tensorflow.keras.applications.efficientnet import preprocess_input

In [3]:
# loading cifar 10
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()

class_names = [
    "airplane","automobile","bird","cat","deer",
    "dog","frog","horse","ship","truck"
]

y_train = y_train.squeeze().astype("int64")
y_test  = y_test.squeeze().astype("int64")

x_train = x_train.astype("float32")
x_test  = x_test.astype("float32")

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 19s 0us/step


In [4]:
# data augmentation
data_augmentation = keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.05),
    layers.RandomZoom(0.1),
], name="augmentation")

In [5]:
# load efficientnetb0 and freeze
eff_base = EfficientNetB0(
    include_top=False,
    weights="imagenet",
    input_shape=(224, 224, 3)
)
eff_base.trainable = False

Downloading data from https://storage.googleapis.com/keras-applications/efficientnetb0_notop.h5
16705208/16705208 ━━━━━━━━━━━━━━━━━━━━ 2s 0us/step


In [6]:
eff_model = keras.Sequential([
    layers.Input(shape=(32, 32, 3)),
    data_augmentation,
    layers.Resizing(224, 224, interpolation="bilinear"),
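    # For EfficientNet, preprocess_input is a pass-through: the backbone rescales
    # internally and expects float pixel values in the [0, 255] range.
    layers.Lambda(preprocess_input),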
    eff_base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.2),
    layers.Dense(10)   # logits
], name="cifar10_efficientnetb0")

eff_model.summary()
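
The final `Dense(10)` layer has no activation, so the model outputs raw logits, which is why the loss is configured with `from_logits=True`. As a minimal, framework-free sketch of what that means (the logit values below are made up for illustration and are not model output), softmax turns logits into class probabilities, and the sparse cross-entropy is the negative log-probability of the true class:

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_ce_from_logits(logits, label):
    """Cross-entropy for one example whose label is an integer class index."""
    return -math.log(softmax(logits)[label])

# Hypothetical logits for one image over the 10 CIFAR-10 classes
example_logits = [0.1, -1.2, 0.3, 4.0, 0.0, 2.5, -0.7, 0.2, -0.3, 0.4]

probs = softmax(example_logits)
predicted = max(range(len(probs)), key=lambda i: probs[i])

print("predicted class index:", predicted)
print("loss if the true label is 3:", sparse_ce_from_logits(example_logits, 3))
print("loss if the true label is 5:", sparse_ce_from_logits(example_logits, 5))
```

The predicted class is simply the index of the largest logit, so accuracy can be computed from logits directly; softmax only matters when probabilities or the loss are needed.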

In [7]:
# compile and train
eff_model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

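# note: with epochs=3, patience=3 can never trigger early stopping;
# it only matters if the run is extended to more epochs
callbacks = [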
    keras.callbacks.EarlyStopping(monitor="val_accuracy", patience=3, restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=1),
]

history = eff_model.fit(
    x_train, y_train,
    validation_split=0.1,
    epochs=3,
    batch_size=64,
    callbacks=callbacks,
    verbose=1
)

Epoch 1/3
704/704 ━━━━━━━━━━━━━━━━━━━━ 118s 148ms/step - accuracy: 0.6370 - loss: 1.0943 - val_accuracy: 0.8736 - val_loss: 0.3734 - learning_rate: 0.0010
Epoch 2/3
704/704 ━━━━━━━━━━━━━━━━━━━━ 105s 149ms/step - accuracy: 0.7798 - loss: 0.6440 - val_accuracy: 0.8884 - val_loss: 0.3274 - learning_rate: 0.0010
Epoch 3/3
704/704 ━━━━━━━━━━━━━━━━━━━━ 105s 150ms/step - accuracy: 0.7924 - loss: 0.5995 - val_accuracy: 0.8908 - val_loss: 0.3227 - learning_rate: 0.0010


In [8]:
# test/eval
test_loss_eff, test_acc_eff = eff_model.evaluate(x_test, y_test, verbose=0)
print("EfficientNetB0 (frozen) test accuracy:", test_acc_eff)
print("EfficientNetB0 (frozen) test loss    :", test_loss_eff)

EfficientNetB0 (frozen) test accuracy: 0.8840000033378601
EfficientNetB0 (frozen) test loss    : 0.3379216492176056
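
The `704/704` step count in the training log above follows from the data split and the batch size. A quick check, assuming the standard 50,000-image CIFAR-10 training set and that the last partial batch counts as a step:

```python
import math

n_train_total = 50_000     # CIFAR-10 training images (standard split)
validation_split = 0.1     # as passed to fit()
batch_size = 64            # as passed to fit()

n_val = round(n_train_total * validation_split)   # images held out for validation
n_fit = n_train_total - n_val                      # images actually trained on
steps_per_epoch = math.ceil(n_fit / batch_size)    # last partial batch still counts

print(n_fit, "training images ->", steps_per_epoch, "steps per epoch")

# Rough epoch time from the logged ~149 ms/step of the frozen phase
approx_epoch_seconds = steps_per_epoch * 0.149
print(f"about {approx_epoch_seconds:.0f} s per epoch")
```

This gives 45,000 training images and 704 steps, matching the log, and roughly 105 s per epoch at the logged step time.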


In [9]:
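# fine-tuning: unfreeze the backbone, then re-freeze everything except the last 30 layers
eff_base.trainable = True

for layer in eff_base.layers[:-30]:
    layer.trainable = False

print("trainable layers in backbone:",
      sum(l.trainable for l in eff_base.layers), "/", len(eff_base.layers))

# recompile so the change in trainable layers takes effect,
# with a much lower learning rate to avoid wrecking the pretrained weights
eff_model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-5),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)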

history_ft = eff_model.fit(
    x_train, y_train,
    validation_split=0.1,
    epochs=3,
    batch_size=64,
    verbose=1
)

test_loss_ft, test_acc_ft = eff_model.evaluate(x_test, y_test, verbose=0)
print("EfficientNetB0 (fine-tuned) test accuracy:", test_acc_ft)
print("EfficientNetB0 (fine-tuned) test loss    :", test_loss_ft)

trainable layers in backbone: 30 / 238
Epoch 1/3
704/704 ━━━━━━━━━━━━━━━━━━━━ 139s 179ms/step - accuracy: 0.7577 - loss: 0.7292 - val_accuracy: 0.8800 - val_loss: 0.3625
Epoch 2/3
704/704 ━━━━━━━━━━━━━━━━━━━━ 122s 174ms/step - accuracy: 0.7992 - loss: 0.5869 - val_accuracy: 0.8896 - val_loss: 0.3225
Epoch 3/3
704/704 ━━━━━━━━━━━━━━━━━━━━ 123s 174ms/step - accuracy: 0.8126 - loss: 0.5408 - val_accuracy: 0.8980 - val_loss: 0.2974
EfficientNetB0 (fine-tuned) test accuracy: 0.8985000252723694
EfficientNetB0 (fine-tuned) test loss    : 0.30153918266296387
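
The fine-tuning cell above unfreezes every layer among the last 30, including any `BatchNormalization` layers. Keras fine-tuning guidance generally recommends keeping batch-normalization layers frozen so that their moving statistics are not disturbed. The sketch below shows that variant. It is not what was run here, the helper name `unfreeze_top_layers` is hypothetical, and the two stand-in classes exist only so the example runs without TensorFlow.

```python
def unfreeze_top_layers(layers, n_top, keep_frozen=("BatchNormalization",)):
    """Freeze all layers, then unfreeze the last n_top, skipping keep_frozen types.

    Works on any objects that expose a boolean `trainable` attribute.
    Returns the number of layers left trainable.
    """
    for layer in layers:
        layer.trainable = False
    if n_top > 0:
        for layer in layers[-n_top:]:
            # Match on the class name so no framework import is needed here
            if type(layer).__name__ not in keep_frozen:
                layer.trainable = True
    return sum(layer.trainable for layer in layers)


# Stand-in layer classes (assumption: used only for this demonstration)
class Conv2D:
    trainable = True

class BatchNormalization:
    trainable = True

toy_backbone = [Conv2D(), BatchNormalization(), Conv2D(),
                BatchNormalization(), Conv2D(), BatchNormalization()]

n_trainable = unfreeze_top_layers(toy_backbone, n_top=4)
print("trainable layers:", n_trainable, "/", len(toy_backbone))
```

With the real model, the equivalent call would presumably be `unfreeze_top_layers(eff_base.layers, 30)` after `eff_base.trainable = True`, followed by recompiling. The trainable count would then be lower than the 30 reported above, and whether accuracy improves would need to be checked by rerunning.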


In [10]:
results = {
    "EfficientNet frozen test acc": float(test_acc_eff),
    "EfficientNet fine-tuned test acc": float(test_acc_ft),
}

for k, v in results.items():
    print(f"{k}: {v}")

EfficientNet frozen test acc: 0.8840000033378601
EfficientNet fine-tuned test acc: 0.8985000252723694


In [11]:
# Print the total number of layers inside the EfficientNetB0 backbone
print("Total layers in EfficientNetB0 backbone:", len(eff_base.layers))

# Keep only layers that hold parameters. count_params() also includes non-trainable
# values (e.g. batch-norm moving statistics) and ignores the trainable flag.
param_layers = [layer for layer in eff_base.layers if layer.count_params() > 0]

# Print the number of parameterized layers (used here as the "depth" of the model)
print("Layers with learnable parameters (depth):", len(param_layers))

# List every parameterized layer as:
# (index in the filtered list, layer name, number of parameters)
for i, layer in enumerate(param_layers):
    print(i, layer.name, layer.count_params())

Total layers in EfficientNetB0 backbone: 238
Layers with learnable parameters (depth): 131
0 normalization 7
1 stem_conv 864
2 stem_bn 128
3 block1a_dwconv 288
4 block1a_bn 128
5 block1a_se_reduce 264
6 block1a_se_expand 288
7 block1a_project_conv 512
8 block1a_project_bn 64
9 block2a_expand_conv 1536
10 block2a_expand_bn 384
11 block2a_dwconv 864
12 block2a_bn 384
13 block2a_se_reduce 388
14 block2a_se_expand 480
15 block2a_project_conv 2304
16 block2a_project_bn 96
17 block2b_expand_conv 3456
18 block2b_expand_bn 576
19 block2b_dwconv 1296
20 block2b_bn 576
21 block2b_se_reduce 870
22 block2b_se_expand 1008
23 block2b_project_conv 3456
24 block2b_project_bn 96
25 block3a_expand_conv 3456
26 block3a_expand_bn 576
27 block3a_dwconv 3600
28 block3a_bn 576
29 block3a_se_reduce 870
30 block3a_se_expand 1008
31 block3a_project_conv 5760
32 block3a_project_bn 160
33 block3b_expand_conv 9600
34 block3b_expand_bn 960
35 block3b_dwconv 6000
36 block3b_bn 960
37 block3b_se_reduce 2410
38 block3
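
The per-layer counts listed above can be reproduced by hand. The sketch below assumes the usual EfficientNetB0 layout (a 3x3 stem with 32 filters, convolutions without bias, and squeeze-and-excitation 1x1 convolutions with bias); the channel sizes are inferred from the printed numbers rather than read from the model. It also shows why `count_params()` overstates the trainable count: each batch-norm channel carries four values, two of which (moving mean and variance) are not trained.

```python
def conv_params(kernel, c_in, c_out, bias=False):
    """Standard convolution: one kernel x kernel x c_in filter per output channel."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)

def depthwise_params(kernel, channels):
    """Depthwise convolution: one kernel x kernel filter per channel, no bias."""
    return kernel * kernel * channels

def batchnorm_params(channels):
    """gamma and beta (trained) plus moving mean and variance (not trained)."""
    return 4 * channels

counts = {
    "stem_conv": conv_params(3, 3, 32),                        # RGB input -> 32 filters
    "stem_bn": batchnorm_params(32),
    "block1a_dwconv": depthwise_params(3, 32),
    "block1a_se_reduce": conv_params(1, 32, 8, bias=True),     # squeeze 32 -> 8
    "block1a_se_expand": conv_params(1, 8, 32, bias=True),     # excite 8 -> 32
    "block1a_project_conv": conv_params(1, 32, 16),
    "block1a_project_bn": batchnorm_params(16),
    "block2a_expand_conv": conv_params(1, 16, 96),
    "block3a_dwconv": depthwise_params(5, 144),                # 5x5 kernel from block 3 on
}

for name, n in counts.items():
    print(name, n)
```

The depthwise and 1x1 convolutions are what keep these counts small: a standard 3x3 convolution from 32 to 32 channels would need 9,216 weights, versus 288 for the depthwise layer here.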

### Questions:

- Which model achieved the highest accuracy? The custom CNN, with EfficientNetB0 close behind (0.8985 test accuracy after fine-tuning).
- Which model trained faster? The custom CNN.
- How might the architecture explain the differences? The custom CNN is much smaller and less complex than ResNet and EfficientNet. It works on the images at their native 32x32 size, whereas the other two architectures resize them to 224x224, which requires far more computation per image. ResNet and EfficientNet also have millions of parameters, while the custom CNN has fewer parameters and is lighter weight.
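
To put the resizing cost from the last answer into numbers: convolutional work grows roughly in proportion to the number of spatial positions, so the ratio of pixel counts gives a first-order estimate of the extra compute in the early layers. This is a back-of-the-envelope sketch, not a measured benchmark; the epoch times are the ones logged above for EfficientNetB0.

```python
native_side = 32      # CIFAR-10 image side, used as-is by the custom CNN
resized_side = 224    # side after layers.Resizing for the pretrained backbone

pixel_ratio = (resized_side ** 2) / (native_side ** 2)
print(f"{pixel_ratio:.0f}x more pixels per image after resizing")

# Logged epoch times for EfficientNetB0 in seconds (second epoch of each phase)
frozen_epoch_s = 105
finetune_epoch_s = 122

overhead = finetune_epoch_s / frozen_epoch_s - 1
print(f"unfreezing the last 30 layers added about {overhead:.0%} per epoch")
```

The comparatively small fine-tuning overhead suggests that most of the epoch time goes into the forward pass over the resized images rather than into updating the unfrozen layers.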