# Lecture 66: Transfer Learning in Deep Learning

This notebook demonstrates **transfer learning** using a pre-trained VGG16 model for image classification on the CIFAR-10 dataset. Transfer learning involves using a model pre-trained on a large dataset (e.g., ImageNet) and adapting it for a new task. We'll cover:

- Loading and preprocessing the CIFAR-10 dataset
- Using a pre-trained VGG16 model (excluding top layers)
- Adding custom layers for the new task
- Freezing and fine-tuning the model
- Training and evaluating the model
- Visualizing predictions

VGG16, pre-trained on ImageNet, will be used as the base model, with fine-tuning to adapt it to CIFAR-10's 10 classes.

## Setup and Imports

Let's import the necessary libraries and set up the environment for reproducibility.

In [1]:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D, Dropout
from tensorflow.keras.datasets import cifar10
import matplotlib.pyplot as plt

# Set random seed for reproducibility
np.random.seed(42)
tf.random.set_seed(42)
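
As a quick check on what seeding provides, the sketch below re-seeds NumPy's global generator and confirms that the same draws come back. It covers NumPy only; `draw_sample` is a hypothetical helper name, and some GPU operations can remain nondeterministic even with fixed seeds.

```python
import numpy as np

def draw_sample(seed, n=5):
    """Seed NumPy's global generator, then draw n uniform numbers."""
    np.random.seed(seed)
    return np.random.rand(n)

first = draw_sample(42)
second = draw_sample(42)
other = draw_sample(7)

print(np.array_equal(first, second))  # compares two runs with the same seed
print(np.array_equal(first, other))   # compares runs with different seeds
```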

## Loading and Preprocessing the CIFAR-10 Dataset

CIFAR-10 contains 60,000 32x32 RGB images across 10 classes (50,000 for training and 10,000 for testing). We'll resize the images to 224x224, the input size VGG16 was originally trained with, apply VGG16-specific preprocessing (RGB-to-BGR conversion and ImageNet mean subtraction), and convert the labels to one-hot encoding.
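
To make the VGG16-specific preprocessing concrete, here is a minimal NumPy sketch of what it amounts to for this model: reorder the channels from RGB to BGR, then subtract a fixed per-channel mean. The helper names `to_vgg_input` and `from_vgg_input` are hypothetical; the mean values are the same constants that this notebook's visualization code adds back.

```python
import numpy as np

# Per-channel means in BGR order (the constants used when undoing preprocessing).
BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def to_vgg_input(rgb_images):
    """RGB -> BGR, then zero-center each channel. Returns a new float32 array."""
    bgr = rgb_images[..., ::-1].astype(np.float32)
    return bgr - BGR_MEANS

def from_vgg_input(preprocessed):
    """Invert to_vgg_input: add the means back, then BGR -> RGB."""
    bgr = preprocessed + BGR_MEANS
    # Round before casting so float error cannot truncate 254.99998 to 254.
    return np.clip(np.rint(bgr[..., ::-1]), 0, 255).astype(np.uint8)

# Toy batch standing in for CIFAR-10 images: 2 images, 4x4 pixels, RGB.
rng = np.random.default_rng(0)
toy_rgb = rng.integers(0, 256, size=(2, 4, 4, 3), dtype=np.uint8)

preprocessed = to_vgg_input(toy_rgb)
restored = from_vgg_input(preprocessed)
print(preprocessed.dtype, preprocessed.shape)
print(np.array_equal(restored, toy_rgb))
```

Unlike scaling to [0, 1], this preprocessing leaves values on the 0-255 scale and merely centers them, which is why the inverse only needs to add the means back and flip the channel order.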

In [None]:
# Load CIFAR-10 dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# Class names for CIFAR-10
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

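# Resize images to 224x224 for VGG16.
# Note: the full resized training set needs about 30 GB even as float32
# (50000 x 224 x 224 x 3 x 4 bytes); subsample first if memory is limited.
def resize_images(images, target_size=(224, 224), batch_size=500):
    # float32 halves the footprint compared with NumPy's default float64
    resized_images = np.zeros((images.shape[0], *target_size, 3), dtype=np.float32)
    # Resize in batches rather than one image at a time
    for start in range(0, images.shape[0], batch_size):
        end = start + batch_size
        resized_images[start:end] = tf.image.resize(images[start:end], target_size).numpy()
    return resized_images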

X_train = resize_images(X_train)
X_test = resize_images(X_test)

# Apply VGG16 preprocessing (RGB -> BGR, subtract ImageNet channel means)
X_train = keras.applications.vgg16.preprocess_input(X_train)
X_test = keras.applications.vgg16.preprocess_input(X_test)

# Convert labels to one-hot encoding
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

print(f"Training data shape: {X_train.shape}")
print(f"Test data shape: {X_test.shape}")

# Visualize some sample images
plt.figure(figsize=(10, 2))
for i in range(5):
    plt.subplot(1, 5, i+1)
    # Undo VGG16 preprocessing for visualization
    img = X_train[i].copy()
    img[:, :, 0] += 103.939
    img[:, :, 1] += 116.779
    img[:, :, 2] += 123.68
    img = img[:, :, ::-1]  # BGR to RGB
    img = np.clip(img, 0, 255).astype('uint8')
    plt.imshow(img)
    plt.title(class_names[np.argmax(y_train[i])])
    plt.axis('off')
plt.show()

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 13s 0us/step


## Loading the Pre-trained VGG16 Model

We'll load VGG16 pre-trained on ImageNet, excluding its top (fully connected) layers, and add custom layers for CIFAR-10 classification. We'll initially freeze the pre-trained layers to use their learned features.
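
To see what freezing means in practice, it helps to count parameters by hand. The sketch below assumes the standard VGG16 convolutional layout (thirteen 3x3 convolutions in five blocks) and the head used in this notebook (global average pooling over 512 channels, then `Dense(256)` and `Dense(10)`); `conv_params` and `dense_params` are hypothetical helper names.

```python
def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus biases of one 2D convolution layer."""
    return kernel * kernel * in_channels * out_channels + out_channels

def dense_params(in_units, out_units):
    """Weights plus biases of one fully connected layer."""
    return in_units * out_units + out_units

# (input channels, output channels) for each VGG16 convolution, block by block.
VGG16_CONVS = {
    "block1": [(3, 64), (64, 64)],
    "block2": [(64, 128), (128, 128)],
    "block3": [(128, 256), (256, 256), (256, 256)],
    "block4": [(256, 512), (512, 512), (512, 512)],
    "block5": [(512, 512), (512, 512), (512, 512)],
}

base_total = sum(
    conv_params(i, o) for convs in VGG16_CONVS.values() for i, o in convs
)

# Global average pooling has no weights; it reduces each of the 512
# feature maps to a single value.
head_total = dense_params(512, 256) + dense_params(256, 10)

print(f"Frozen base parameters:    {base_total:,}")
print(f"Trainable head parameters: {head_total:,}")
```

With the base frozen, roughly 1% of the model's weights are updated during training, which is why this first phase is comparatively cheap per step.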

In [None]:
# Load VGG16 without top layers
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Freeze the base model layers
base_model.trainable = False

# Add custom layers
inputs = keras.Input(shape=(224, 224, 3))
x = base_model(inputs, training=False)
x = GlobalAveragePooling2D()(x)
x = Dense(256, activation='relu')(x)
x = Dropout(0.5)(x)
outputs = Dense(10, activation='softmax')(x)

# Create the model
model = Model(inputs, outputs)

# Compile the model
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

model.summary()

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
58889256/58889256 ━━━━━━━━━━━━━━━━━━━━ 4s 0us/step


## Training the Model (Feature Extraction)

First, we'll train the model with the pre-trained layers frozen, only updating the custom layers. This is known as **feature extraction**.

In [None]:
# Define early stopping
early_stopping = keras.callbacks.EarlyStopping(
    monitor='val_loss',
    patience=5,
    restore_best_weights=True
)

# Train the model (feature extraction)
history = model.fit(X_train, y_train,
                    epochs=20,
                    batch_size=32,
                    validation_split=0.2,
                    callbacks=[early_stopping],
                    verbose=1)

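
The early-stopping behaviour configured above can be sketched in plain Python. This is a simplified stand-in for the Keras callback, not its implementation: `simulate_early_stopping` is a hypothetical helper, and the loss values are made up for illustration.

```python
def simulate_early_stopping(val_losses, patience=5):
    """Return (stop_epoch, best_epoch), both 0-indexed.

    Training stops once `patience` consecutive epochs fail to improve on
    the best validation loss seen so far. With restore_best_weights=True,
    the weights from best_epoch are the ones kept.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Ran out of epochs before patience was exhausted.
    return len(val_losses) - 1, best_epoch

# Made-up validation losses: improvement stalls after epoch index 3.
toy_losses = [0.90, 0.70, 0.62, 0.60, 0.61, 0.63, 0.60, 0.64, 0.65, 0.66]
stop_epoch, best_epoch = simulate_early_stopping(toy_losses, patience=5)
print(f"stopped at epoch index {stop_epoch}, best epoch index {best_epoch}")
```

Note that a loss merely equal to the previous best does not count as an improvement here, so the tie at index 6 does not reset the counter.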

## Fine-Tuning the Model

To improve performance, we'll unfreeze the last convolutional block of VGG16 (block5) and fine-tune it with a much lower learning rate. The small steps let the model adapt the pre-trained features to CIFAR-10 without overwriting them.

In [None]:
# Unfreeze the base model
base_model.trainable = True

# Keep only the last convolutional block (block5) trainable; freeze the rest
for layer in base_model.layers:
    layer.trainable = 'block5' in layer.name

# Recompile the model with a lower learning rate
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-5),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Fine-tune the model
fine_tune_history = model.fit(X_train, y_train,
                              epochs=10,
                              batch_size=32,
                              validation_split=0.2,
                              callbacks=[early_stopping],
                              verbose=1)
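
The name-based layer selection used above can be checked without loading any weights. This sketch assumes the standard Keras names for VGG16's convolution and pooling layers (`block1_conv1` through `block5_pool`); the input layer is left out because its name varies between versions, and `select_trainable` is a hypothetical helper.

```python
# Standard Keras layer names for VGG16 without the top (input layer omitted).
CONVS_PER_BLOCK = {1: 2, 2: 2, 3: 3, 4: 3, 5: 3}
layer_names = []
for block, n_convs in CONVS_PER_BLOCK.items():
    layer_names += [f"block{block}_conv{i}" for i in range(1, n_convs + 1)]
    layer_names.append(f"block{block}_pool")

def select_trainable(names, marker="block5"):
    """Map each layer name to True if it should be fine-tuned."""
    return {name: marker in name for name in names}

trainable = select_trainable(layer_names)
unfrozen = [name for name, flag in trainable.items() if flag]
print(unfrozen)
print(f"{len(unfrozen)} of {len(layer_names)} layers unfrozen")
```

Only the three block5 convolutions carry weights, 3 x (3·3·512·512 + 512) = 7,079,424 parameters in total; the pooling layer has none, so marking it trainable has no effect.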

## Evaluating the Model

We'll evaluate the fine-tuned model on the test set and visualize the training and validation accuracy/loss curves.

In [None]:
# Evaluate on test set
test_loss, test_accuracy = model.evaluate(X_test, y_test, verbose=0)
print(f"Test accuracy: {test_accuracy:.4f}")
print(f"Test loss: {test_loss:.4f}")

# Plot training history (combine feature extraction and fine-tuning)
plt.figure(figsize=(12, 4))

# Plot accuracy
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'] + fine_tune_history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'] + fine_tune_history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Model Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()

# Plot loss
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'] + fine_tune_history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'] + fine_tune_history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

plt.tight_layout()
plt.savefig('transfer_learning_history.png')
plt.show()
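
When the two training phases share one axis, it can help to know where fine-tuning begins. A minimal sketch, assuming history dictionaries shaped like Keras's `History.history`; `combine_histories` is a hypothetical helper, and the numbers are placeholders rather than results from this notebook.

```python
def combine_histories(first, second):
    """Concatenate two {metric: [value per epoch]} dicts.

    Returns the merged dict and the 0-indexed epoch at which the
    second phase starts.
    """
    boundary = len(next(iter(first.values())))
    merged = {key: list(first[key]) + list(second[key]) for key in first}
    return merged, boundary

# Placeholder values for illustration only.
phase_one = {"accuracy": [0.50, 0.60, 0.65], "val_accuracy": [0.48, 0.58, 0.61]}
phase_two = {"accuracy": [0.70, 0.74], "val_accuracy": [0.66, 0.69]}

merged, boundary = combine_histories(phase_one, phase_two)
print(merged["accuracy"])
print(f"fine-tuning starts at epoch index {boundary}")
```

Passing `boundary` to `plt.axvline` would mark the switch from feature extraction to fine-tuning on each plot.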

## Visualizing Predictions

Let's make predictions on the test set and visualize some examples to assess the model's performance qualitatively.

In [None]:
# Make predictions
predictions = model.predict(X_test)
predicted_classes = np.argmax(predictions, axis=1)
true_classes = np.argmax(y_test, axis=1)

# Visualize some predictions
plt.figure(figsize=(12, 4))
for i in range(5):
    plt.subplot(1, 5, i+1)
    # Undo VGG16 preprocessing for visualization
    img = X_test[i].copy()
    img[:, :, 0] += 103.939
    img[:, :, 1] += 116.779
    img[:, :, 2] += 123.68
    img = img[:, :, ::-1]  # BGR to RGB
    img = np.clip(img, 0, 255).astype('uint8')
    plt.imshow(img)
    plt.title(f"Pred: {class_names[predicted_classes[i]]}\nTrue: {class_names[true_classes[i]]}")
    plt.axis('off')
plt.tight_layout()
plt.show()
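
Five images give only a glimpse of the model's behaviour; a confusion matrix summarizes every prediction. The sketch below uses NumPy only, with a toy three-class example standing in for `true_classes` and `predicted_classes`. The `confusion_matrix` function here is a hypothetical helper, not the scikit-learn function of the same name.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Rows index the true class, columns the predicted class."""
    matrix = np.zeros((num_classes, num_classes), dtype=np.int64)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(matrix, (true_labels, predicted_labels), 1)
    return matrix

# Toy labels for illustration (3 classes, 6 samples).
toy_true = np.array([0, 0, 1, 1, 2, 2])
toy_pred = np.array([0, 1, 1, 1, 2, 0])

cm = confusion_matrix(toy_true, toy_pred, num_classes=3)
# Fraction of each true class that was predicted correctly.
per_class_recall = cm.diagonal() / cm.sum(axis=1)
print(cm)
print(per_class_recall)
```

For this notebook the call would use `num_classes=10`, and the off-diagonal entries would show which CIFAR-10 classes get confused with each other.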

## Explanation

- **Transfer Learning**: We used VGG16 pre-trained on ImageNet, leveraging its learned features for CIFAR-10 classification.
- **Feature Extraction**: Initially, we froze the VGG16 layers and trained only the custom dense layers to adapt to the new task.
- **Fine-Tuning**: We unfroze the last convolutional block (block5) and fine-tuned it with a low learning rate to improve performance.
- **Preprocessing**: Images were resized to 224x224 and preprocessed to match VGG16's requirements (RGB-to-BGR conversion and subtraction of the ImageNet per-channel means).
- **Model Architecture**: Added global average pooling, a dense layer with ReLU, dropout for regularization, and a softmax output for 10 classes.
- **Evaluation**: Assessed performance with test accuracy/loss and visualized training history to monitor convergence.
- **Predictions**: Visualized sample predictions to qualitatively evaluate the model.

To extend this work, consider:
- Using other pre-trained models like ResNet50 or EfficientNet
- Applying data augmentation (e.g., random flips, rotations) to improve robustness
- Fine-tuning more layers or adjusting the learning rate schedule
- Experimenting with different custom top layers or regularization techniques
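
As one example of the augmentation idea, a random horizontal flip can be sketched in NumPy. This assumes images in (batch, height, width, channels) layout, as in this notebook; `random_horizontal_flip` is a hypothetical helper, and in practice a Keras preprocessing layer or a `tf.image` function would likely be used instead.

```python
import numpy as np

def random_horizontal_flip(images, rng, probability=0.5):
    """Flip each image left-to-right with the given probability.

    Returns the augmented copy and the boolean mask of flipped images.
    """
    flip_mask = rng.random(images.shape[0]) < probability
    flipped = images.copy()
    # Axis 2 is width, so reversing it mirrors the selected images.
    flipped[flip_mask] = images[flip_mask][:, :, ::-1]
    return flipped, flip_mask

rng = np.random.default_rng(42)
# Toy batch: 4 images of 2x3 pixels with 1 channel.
toy_batch = np.arange(4 * 2 * 3 * 1).reshape(4, 2, 3, 1)
augmented, flip_mask = random_horizontal_flip(toy_batch, rng)
print(flip_mask)
print(augmented.shape)
```

Horizontal flips suit CIFAR-10 because a mirrored truck or bird is still a plausible image; vertical flips would be a less natural choice for these classes.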