##### ARTI 560 - Computer Vision  
## Image Classification using Transfer Learning

### Overview

**Transfer learning** is a machine learning technique where a model developed for one task is reused as the starting point for a different but related task. Instead of training a model from scratch, which can be time-consuming and require large datasets, transfer learning leverages the knowledge a pretrained model has already learned.

#### Key Concepts

1. **Pretrained Models**  
   - Models trained on large benchmark datasets (e.g., ImageNet) can capture general features such as edges, textures, and shapes in images.
   - Common pretrained models for image classification include **VGG16**, **ResNet**, **MobileNetV2**, and **EfficientNet**.

2. **Feature Extraction**  
   - The pretrained model is used as a fixed feature extractor.
   - Only the final layers (classifier) are replaced and trained on the new dataset.

3. **Fine-Tuning**  
   - A more advanced approach where some of the pretrained layers are "unfrozen" and trained on the new dataset.
   - Helps the model adapt more closely to the specific features of the new task.

![image.png](attachment:image.png)

### Objective

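In this lab, we will apply transfer learning with a ResNet50V2 pretrained on ImageNet to classify CIFAR-10: first train a new classification head on the frozen backbone, then fine-tune the last backbone layers, examining the model's architecture and trainable layers along the way.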

### Tools & Libraries
- Python  
- NumPy  
- TensorFlow
- Matplotlib  


### Transfer Learning using a Pre-trained ResNet

#### Steps:
1. Load a pretrained ResNet50V2 model from `keras.applications` with ImageNet weights, excluding the top classification layer (`include_top=False`).
2. Resize the 32x32 CIFAR-10 images to the input size the backbone is built with (224x224 here).
3. Build a new Keras `Sequential` model from the `data_augmentation` block, a `Resizing` layer, the ResNet50V2 `preprocess_input` step, the frozen ResNet base, a `GlobalAveragePooling2D` layer, and a new `Dense` classification head with 10 output units (one per CIFAR-10 class).
4. Compile the model with the Adam optimizer, sparse categorical cross-entropy loss, and the accuracy metric, then train it on the CIFAR-10 training data for a few epochs with `EarlyStopping` and `ReduceLROnPlateau` callbacks.


#### Important: Pretrained model preprocessing

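Pretrained models (ImageNet weights) expect inputs to be preprocessed the same way as during their original ImageNet training.
For ResNet50V2 in Keras, use:

`tf.keras.applications.resnet_v2.preprocess_input(...)`

Note that `tf.keras.applications.resnet50.preprocess_input` belongs to the original ResNet50 (v1) and applies a different transformation, so it should not be used with ResNet50V2.

If you skip this step or use the wrong function, accuracy can drop sharply, in the worst case toward random guessing.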
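To make the difference between preprocessing functions concrete, here is a minimal NumPy sketch that mimics what the two Keras functions do to raw RGB pixels. The helper names `preprocess_tf_mode` and `preprocess_caffe_mode` are hypothetical and only approximate the library behaviour: `resnet_v2.preprocess_input` scales pixels to [-1, 1], while `resnet50.preprocess_input` converts RGB to BGR and subtracts the ImageNet channel means without scaling.

```python
import numpy as np


def preprocess_tf_mode(rgb):
    """Approximates resnet_v2.preprocess_input: scale [0, 255] pixels to [-1, 1]."""
    return rgb.astype("float32") / 127.5 - 1.0


def preprocess_caffe_mode(rgb):
    """Approximates resnet50.preprocess_input (ResNet v1):
    flip RGB to BGR, then subtract the ImageNet per-channel means (no scaling)."""
    bgr = rgb.astype("float32")[..., ::-1]
    imagenet_bgr_means = np.array([103.939, 116.779, 123.68], dtype="float32")
    return bgr - imagenet_bgr_means


# A tiny 1x2 RGB "image": one black pixel and one white pixel.
pixels = np.array([[[0, 0, 0], [255, 255, 255]]], dtype="uint8")

v2_style = preprocess_tf_mode(pixels)
v1_style = preprocess_caffe_mode(pixels)

print("V2-style range:", v2_style.min(), v2_style.max())
print("V1-style range:", v1_style.min(), v1_style.max())
```

The two value ranges are very different, which is why feeding a V2 backbone with V1-style inputs (or with raw 0 to 255 pixels) gives it activations it never saw during pretraining.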


In [1]:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import ResNet50V2
from tensorflow.keras.applications.resnet_v2 import preprocess_input

# -----------------------------
# 1) Load CIFAR-10
# -----------------------------
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()

class_names = [
    "airplane","automobile","bird","cat","deer",
    "dog","frog","horse","ship","truck"
]

# Keep labels as integers (SparseCategoricalCrossentropy)
y_train = y_train.squeeze().astype("int64")
y_test  = y_test.squeeze().astype("int64")

# Convert images to float32
x_train = x_train.astype("float32")
x_test  = x_test.astype("float32")

# -----------------------------
# 2) Data augmentation
# -----------------------------
data_augmentation = keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.05),
    layers.RandomZoom(0.1),
], name="augmentation")

# -----------------------------
# 3) Build ResNet50V2 backbone (pretrained)
# -----------------------------
resnet_base = ResNet50V2(
    include_top=False,
    weights="imagenet",
    input_shape=(224, 224, 3)
)
resnet_base.trainable = False  # freeze first (feature extractor)

# -----------------------------
# 4) Full model (preprocess inside model)
# -----------------------------
resnet_model = keras.Sequential([
    layers.Input(shape=(32, 32, 3)),
    data_augmentation,
    layers.Resizing(224, 224, interpolation="bilinear"),
    layers.Lambda(preprocess_input),          # IMPORTANT: correct for ResNet50V2
    resnet_base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(10)                          # logits
], name="cifar10_resnet50v2")

resnet_model.summary()

# -----------------------------
# 5) Compile + Train (frozen backbone)
# -----------------------------
resnet_model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_accuracy", patience=3, restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=1),
]

history = resnet_model.fit(
    x_train, y_train,
    validation_split=0.1,
    epochs=5,
    batch_size=34,
    callbacks=callbacks,
    verbose=1
)



Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 19s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50v2_weights_tf_dim_ordering_tf_kernels_notop.h5
94668760/94668760 ━━━━━━━━━━━━━━━━━━━━ 5s 0us/step


Epoch 1/5
1324/1324 ━━━━━━━━━━━━━━━━━━━━ 189s 134ms/step - accuracy: 0.6969 - loss: 0.8628 - val_accuracy: 0.8700 - val_loss: 0.3664 - learning_rate: 0.0010
Epoch 2/5
1324/1324 ━━━━━━━━━━━━━━━━━━━━ 176s 133ms/step - accuracy: 0.8033 - loss: 0.5673 - val_accuracy: 0.8752 - val_loss: 0.3570 - learning_rate: 0.0010
Epoch 3/5
1324/1324 ━━━━━━━━━━━━━━━━━━━━ 176s 133ms/step - accuracy: 0.8215 - loss: 0.5200 - val_accuracy: 0.8768 - val_loss: 0.3506 - learning_rate: 0.0010
Epoch 4/5
1324/1324 ━━━━━━━━━━━━━━━━━━━━ 176s 133ms/step - accuracy: 0.8264 - loss: 0.4980 - val_accuracy: 0.8770 - val_loss: 0.3645 - learning_rate: 0.0010
Epoch 5/5
1324/1324 ━━━━━━━━━━━━━━━━━━━━ 202s 133ms/step - accuracy: 0.8367 - loss: 0.4666 - val_accuracy: 0.8858 - val_loss: 0.3346 - learning_rate: 5.0000e-04

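Two details of the training log above can be reproduced with a few lines of plain Python: the `1324` steps per epoch, and the learning rate dropping to `5.0000e-04` in epoch 5. The sketch below is a simplified re-implementation of the plateau rule, not the Keras source; the function name `simulate_reduce_lr` is hypothetical, and cooldown and minimum learning rate are ignored.

```python
import math

# Values taken from the fit() call and the log above.
num_train_images = 50_000   # size of the CIFAR-10 training set
validation_split = 0.1
batch_size = 34

# Keras holds out the validation fraction first, then batches the remainder.
num_fit_images = int(num_train_images * (1 - validation_split))
steps_per_epoch = math.ceil(num_fit_images / batch_size)
print("steps per epoch:", steps_per_epoch)


def simulate_reduce_lr(val_losses, lr, factor=0.5, patience=1, min_delta=1e-4):
    """Simplified plateau rule: multiply lr by `factor` once val_loss has
    failed to improve by more than `min_delta` for `patience` epochs."""
    best = float("inf")
    wait = 0
    logged = []
    for loss in val_losses:
        logged.append(lr)  # the log reports the rate used during this epoch
        if loss < best - min_delta:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                lr *= factor
                wait = 0
    return logged


val_losses = [0.3664, 0.3570, 0.3506, 0.3645, 0.3346]  # from the log above
lr_per_epoch = simulate_reduce_lr(val_losses, lr=1e-3)
print("learning rate per epoch:", lr_per_epoch)
```

The validation loss rises in epoch 4 (0.3645 after 0.3506), so with `patience=1` the rate is halved before epoch 5 starts, which agrees with the log.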

Now let's evaluate the frozen-backbone model on the CIFAR-10 test set.

In [2]:

# -----------------------------
# 6) Test / Evaluate
# -----------------------------
test_loss, test_acc_r = resnet_model.evaluate(x_test, y_test, verbose=0)
print("ResNet50V2 (frozen) test accuracy:", test_acc_r)
print("ResNet50V2 (frozen) test loss    :", test_loss)


ResNet50V2 (frozen) test accuracy: 0.8841999769210815
ResNet50V2 (frozen) test loss    : 0.33952605724334717


### Fine-tune ResNet

In this step, we fine-tune the pretrained network by unfreezing the last 30 layers of the backbone and training with a much smaller learning rate (1e-5 instead of 1e-3). The small learning rate lets the model adapt to CIFAR-10 without overwriting the useful pretrained features.
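The freezing logic can be sketched as a small reusable helper. This is an illustrative sketch, not part of the lab code: `unfreeze_last_n` and its `frozen_types` parameter are hypothetical names. The optional `frozen_types` argument reflects a common recommendation to keep `BatchNormalization` layers frozen during fine-tuning; the lab cell itself unfreezes the last layers regardless of type. The stand-in classes exist only so the example runs without TensorFlow.

```python
def unfreeze_last_n(base, n, frozen_types=()):
    """Make only the last `n` layers of `base` trainable.

    Layers whose type appears in `frozen_types` stay frozen even if they
    fall inside the last `n`. Returns the number of trainable layers.
    """
    base.trainable = True  # in Keras this marks every sub-layer trainable
    first_trainable = len(base.layers) - n
    for index, layer in enumerate(base.layers):
        in_window = index >= first_trainable
        layer.trainable = in_window and not isinstance(layer, frozen_types)
    return sum(layer.trainable for layer in base.layers)


# --- Stand-ins for Keras objects, for demonstration only ---
class FakeLayer:
    def __init__(self):
        self.trainable = True


class FakeBatchNorm(FakeLayer):
    pass


class FakeBase:
    def __init__(self, layers):
        self.layers = layers
        self.trainable = True


demo = FakeBase([FakeLayer(), FakeLayer(), FakeBatchNorm(), FakeLayer()])
count_all = unfreeze_last_n(demo, 2)
count_no_bn = unfreeze_last_n(demo, 2, frozen_types=(FakeBatchNorm,))
print(count_all, count_no_bn)

# Assumed usage with the notebook's objects (requires TensorFlow):
# unfreeze_last_n(resnet_base, 30)
# unfreeze_last_n(resnet_base, 30, frozen_types=(keras.layers.BatchNormalization,))
```

Whichever variant is used, the model has to be compiled again afterwards for the change in `trainable` to take effect, as the lab code does.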

In [3]:
# -----------------------------
# 7) Fine-tune the last 30 backbone layers
# -----------------------------
resnet_base.trainable = True
for layer in resnet_base.layers[:-30]:
    layer.trainable = False

print("Trainable layers in backbone:", sum(l.trainable for l in resnet_base.layers), "/", len(resnet_base.layers))

resnet_model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-5),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

history_ft = resnet_model.fit(
    x_train, y_train,
    validation_split=0.1,
    epochs=5,
    batch_size=34,
    verbose=1
)

test_loss_ft, test_acc_ft = resnet_model.evaluate(x_test, y_test, verbose=0)
print("ResNet50V2 (fine-tuned) test accuracy:", test_acc_ft)
print("ResNet50V2 (fine-tuned) test loss    :", test_loss_ft)

Trainable layers in backbone: 30 / 190
Epoch 1/5
1324/1324 ━━━━━━━━━━━━━━━━━━━━ 260s 188ms/step - accuracy: 0.8027 - loss: 0.5721 - val_accuracy: 0.9054 - val_loss: 0.2696
Epoch 2/5
1324/1324 ━━━━━━━━━━━━━━━━━━━━ 246s 186ms/step - accuracy: 0.8689 - loss: 0.3797 - val_accuracy: 0.9122 - val_loss: 0.2430
Epoch 3/5
1324/1324 ━━━━━━━━━━━━━━━━━━━━ 246s 186ms/step - accuracy: 0.8954 - loss: 0.3022 - val_accuracy: 0.9180 - val_loss: 0.2304
Epoch 4/5
1324/1324 ━━━━━━━━━━━━━━━━━━━━ 246s 186ms/step - accuracy: 0.9097 - loss: 0.2574 - val_accuracy: 0.9290 - val_loss: 0.2137
Epoch 5/5
1324/1324 ━━━━━━━━━━━━━━━━━━━━ 246s 186ms/step - accuracy: 0.9269 - loss: 0.2177 - val_accuracy: 0.9312 - val_loss: 0.1979
ResNet50V2 (fine-tuned) test accuracy: 0.9283999800682068
ResNet50V2 (fine-tuned) test loss    : 0.21251121163368225


### Compare the two models

In [4]:
# Collect and compare accuracies (update if you rename variables)
results = {
    "ResNet frozen test acc": float(test_acc_r) if 'test_acc_r' in globals() else None,
    "ResNet fine-tuned test acc": float(test_acc_ft) if 'test_acc_ft' in globals() else None,
}
for k,v in results.items():
    print(f"{k}: {v}")


ResNet frozen test acc: 0.8841999769210815
ResNet fine-tuned test acc: 0.9283999800682068
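
To put the two test accuracies above in perspective, it can help to look at the error rate as well as the raw gain. The short sketch below uses the reported values rounded to four decimals; the variable names are illustrative.

```python
# Test accuracies reported above, rounded to four decimals.
frozen_acc = 0.8842
fine_tuned_acc = 0.9284

# Absolute gain in accuracy, and the share of the frozen model's errors
# that fine-tuning removed.
absolute_gain = fine_tuned_acc - frozen_acc
relative_error_reduction = absolute_gain / (1.0 - frozen_acc)

print(f"Absolute gain: {absolute_gain * 100:.2f} percentage points")
print(f"Relative error reduction: {relative_error_reduction * 100:.1f}%")
```

On this single run, fine-tuning adds about 4.4 percentage points of accuracy, which corresponds to removing roughly 38% of the frozen model's test errors. Results from a single training run will vary somewhat between runs.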
