# Transfer Learning on the CIFAR-10 Dataset

Transfer learning uses a pre-trained model as the starting point for a new, related task. It reuses the features the pre-trained model has already learned, which typically improves performance and reduces the amount of data and computation required.

Let's modify the CIFAR-10 architecture we built previously.

# 1. Import Libraries

Importing libraries lets us reuse pre-written, tested code, so we can perform complex tasks without reinventing the wheel.

In [1]:
# Main imports needed
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.applications.resnet50 import ResNet50
import tensorflow_datasets as tfds

print("Tensorflow version:", tf.__version__)

Tensorflow version: 2.16.1


# 2. Load Data

We load CIFAR-10 directly from Keras. This is the new dataset on which the pre-trained model will be adapted: 50,000 training images and 10,000 test images, each 32x32 pixels with 3 color channels.

In [2]:
# Using keras

(x_train_full, y_train_full), (x_test, y_test) = keras.datasets.cifar10.load_data()

print("Training data shape", x_train_full.shape)
print("Test data shape", x_test.shape)

Training data shape (50000, 32, 32, 3)
Test data shape (10000, 32, 32, 3)


# 3. Visualize Data

Let's get some insight into the dataset before adapting the model. Visualizing the data helps identify patterns, anomalies, and distributions, and lets us check that the pre-trained model's assumptions align with the new dataset.

In [None]:
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

# Create a new figure
plt.figure(figsize=(12, 8))

# Loop over the first 24 images
for i in range(24):
    # Create a subplot for each image
    plt.subplot(4, 6, i + 1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)

    # Display the image
    plt.imshow(x_train_full[i])

    # Set the label as the title
    plt.title(class_names[y_train_full[i][0]], fontsize=12)

# Display the figure
plt.show()

# 4. Build Transfer Learning Model

## 4.1 Import Necessary Libraries

In [4]:
from keras.utils import to_categorical
from keras.applications.resnet50 import preprocess_input
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D, Input, Flatten, UpSampling2D
from keras.optimizers import SGD, Adam
from keras.callbacks import EarlyStopping

## 4.2 Preprocess Input


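Preprocessing makes the new dataset compatible with what the pre-trained model expects, leading to more accurate and efficient learning. For `ResNet50`, `preprocess_input` converts the images from RGB to BGR and zero-centers each color channel using the `ImageNet` channel means, without scaling.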

In [5]:
x_train_full = x_train_full.astype('float32')
x_test = x_test.astype('float32')

# Assuming x_train_full and x_test are already loaded as numpy arrays
x_train_full = preprocess_input(x_train_full)
x_test = preprocess_input(x_test)

print("Training data shape:", x_train_full.shape)
print("Test data shape:", x_test.shape)

Training data shape: (50000, 32, 32, 3)
Test data shape: (10000, 32, 32, 3)
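To make the preprocessing step less of a black box, here is a minimal NumPy sketch of the transformation described above. The function name `caffe_style_preprocess` and the demo arrays are hypothetical, and the channel means are assumed to match the ones Keras uses for `ResNet50`; in the notebook itself, `preprocess_input` remains the call to use.

```python
import numpy as np

# ImageNet per-channel means in BGR order (assumed to match the values
# Keras applies for ResNet50-style preprocessing).
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(images_rgb):
    """Hypothetical stand-in for preprocess_input: RGB -> BGR, then zero-center."""
    # Reverse the channel axis to go from RGB to BGR
    images_bgr = images_rgb[..., ::-1].astype(np.float32)
    # Subtract the per-channel mean; pixel values are not rescaled
    return images_bgr - IMAGENET_BGR_MEAN

# Two all-white 32x32 RGB images as a toy batch
demo_batch = np.full((2, 32, 32, 3), 255.0, dtype=np.float32)
demo_out = caffe_style_preprocess(demo_batch)

print(demo_out.shape)     # the shape is unchanged by preprocessing
print(demo_out[0, 0, 0])  # one pixel after centering, in BGR order
```

Note that the output values are no longer in the `[0, 255]` range, which is why the preprocessed arrays should not be passed to `plt.imshow` directly.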


## 4.3 Train, Validation Split

Let's divide the full training set into separate training and validation sets using slicing. `x_train_full` is split into two parts: one containing the majority of the data (45,000 samples) and a smaller one (the last 5,000 samples) that is used for validation during model training.
The corresponding labels are split the same way into `y_train` and `y_valid`.

Additionally, we **convert the class labels from integer format to categorical format** using the `to_categorical` function. This is needed for multi-class classification tasks like CIFAR-10, where each image is assigned exactly one of ten possible categories.

Converting the labels to categorical format represents them as **one-hot vectors**, which is the label format expected by the `categorical_crossentropy` loss used when the model is compiled.

In [6]:
x_train, x_valid = x_train_full[:-5000], x_train_full[-5000:]
y_train, y_valid = y_train_full[:-5000], y_train_full[-5000:]

y_train = to_categorical(y_train, 10)
y_valid = to_categorical(y_valid, 10)
y_test = to_categorical(y_test, 10)

print("Training data shape", x_train.shape)
print("Test data shape", x_test.shape)
print("Valid data shape", x_valid.shape)

Training data shape (45000, 32, 32, 3)
Test data shape (10000, 32, 32, 3)
Valid data shape (5000, 32, 32, 3)
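As an illustration of what one-hot encoding produces, the following NumPy sketch encodes three example labels. The helper `one_hot` and the demo labels are hypothetical and only mimic the effect of `to_categorical` for integer labels of shape `(n, 1)`.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Hypothetical helper: turn integer labels into one-hot row vectors."""
    flat = np.asarray(labels).reshape(-1)  # (n, 1) -> (n,)
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(num_classes, dtype=np.float32)[flat]

# Labels shaped like y_train_full: one integer class id per row
demo_labels = np.array([[6], [9], [0]])
demo_one_hot = one_hot(demo_labels, 10)

print(demo_one_hot.shape)  # (3, 10)
print(demo_one_hot[0])     # a single 1.0 at index 6, zeros elsewhere
```

Taking `argmax` along the last axis recovers the original integer labels, which is handy when comparing predictions with ground truth.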


## 4.4 Define Feature Extractor and Classifier

The purpose of a feature extractor is to leverage the learned representations from a pre-trained model to extract relevant features from input images.

The `feature_extractor` function **takes input tensors representing images and returns the output feature maps** generated by the `ResNet50` model. By setting `include_top=False`, we exclude the fully connected layers at the top of the `ResNet50` architecture, retaining only the convolutional layers. This allows us to **use `ResNet50` as a feature extractor while excluding its classification layers**, which are specific to the `ImageNet` task.

Additionally, we freeze the layers of the base `ResNet50` model by setting `layer.trainable = False` for each layer. **Freezing the layers prevents their weights from being updated during training**, ensuring that only the weights of the additional layers we add on top of the base model will be trained. 


In [7]:
# Define the feature extractor using ResNet50
def feature_extractor(inputs):
    base_model = tf.keras.applications.ResNet50(
        input_shape=(224, 224, 3), include_top=False, weights='imagenet')

    # Freeze every layer of the base model so its weights are not updated
    for layer in base_model.layers:
        layer.trainable = False

    # Return only after the loop has frozen all layers
    return base_model(inputs)

Now let's define a classifier function that builds the classification layers on top of the features extracted by the `ResNet50` model. The classifier is responsible for mapping the extracted features to the corresponding class probabilities for the given task.

The classifier function **takes the output feature maps from the feature extractor as input and adds several dense layers to perform classification**. First, we *apply a global average pooling layer, which averages each feature map over its spatial dimensions and leaves one value per channel*. The pooled output is already a `1D` vector per image, so the `Flatten` layer that follows leaves it unchanged before it is fed into the fully connected layers.

Next, **we add two densely connected layers with `ReLU` activation functions**, which introduce non-linearity to the model and allow it to learn complex patterns in the data. These layers progressively reduce the dimensionality of the feature space, capturing increasingly abstract representations of the input data.

Finally, **we add a dense output layer with `softmax` activation**, consisting of `10` units corresponding to the `10` classes in the `CIFAR-10` dataset. The softmax function normalizes the outputs so that they sum to 1 and can be read as the predicted probability of each class. The layer is named "classification" for easy identification.
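The two non-dense operations in the classifier can be sketched in plain NumPy. This is only an illustration under stated assumptions: the helper names are hypothetical, the inputs are random, and the `(7, 7, 2048)` feature-map shape is what `ResNet50` without its top produces for `224x224` inputs.

```python
import numpy as np

def global_average_pool(feature_maps):
    """Average over height and width: (batch, h, w, c) -> (batch, c)."""
    return feature_maps.mean(axis=(1, 2))

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)

# Random stand-in for the ResNet50 feature maps of 4 images
demo_maps = rng.normal(size=(4, 7, 7, 2048))
pooled = global_average_pool(demo_maps)

# Random stand-in for the 10 raw scores of the output layer
demo_probs = softmax(rng.normal(size=(4, 10)))

print(pooled.shape)             # one value per channel for each image
print(demo_probs.sum(axis=-1))  # each row sums to 1 (up to rounding)
```

Because pooling already returns a flat vector per image, no further reshaping is needed before a dense layer.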

In [8]:
def classifier(inputs):
    x = GlobalAveragePooling2D()(inputs)
    x = Flatten()(x)
    x = Dense(1024, activation='relu')(x)
    x = Dense(512, activation='relu')(x)
    x = Dense(10, activation='softmax', name="classification")(x)
    return x
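Since the base model is frozen, only this classification head is trained. A quick back-of-the-envelope count of its parameters can be sketched as follows, assuming the pooled `ResNet50` feature vector has 2048 entries; the helper `dense_params` is hypothetical.

```python
def dense_params(n_in, n_out):
    """Parameters of a dense layer: one weight per input-output pair plus biases."""
    return n_in * n_out + n_out

# Assumed width of the pooled ResNet50 feature vector
RESNET50_FEATURES = 2048

# Input width followed by the three Dense layer widths from the classifier
layer_sizes = [RESNET50_FEATURES, 1024, 512, 10]

head_params = sum(
    dense_params(n_in, n_out)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
)
print(head_params)  # 2628106
```

This figure should correspond to the trainable parameter count reported by `model.summary()`, which is a useful check that the base model really is frozen.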

## Defining the Final Model
In this step, we define the final model by integrating the feature extraction and classification components. The final model takes input tensors representing images and produces output predictions for the given task.

The `final_model` function begins by **upsampling the input images using the `UpSampling2D` layer**. With `size=(7,7)`, each `32x32` image is enlarged by a factor of 7 along both spatial axes, giving `32 x 7 = 224` pixels per side. Resizing the images to `(224, 224)` matches the input shape that the pre-trained `ResNet50` feature extractor is configured with.

Next, **the resized images are passed through the feature extractor**, which extracts relevant features from the input images using the pre-trained `ResNet50` model. The feature extractor leverages the learned representations from the `ResNet50` architecture to capture meaningful patterns and characteristics present in the images.

**The extracted features are then fed into the classifier**, which consists of several densely connected layers followed by a softmax output layer. The classifier processes the extracted features and generates class probabilities for each input image, indicating the likelihood of belonging to each of the predefined classes.


In [9]:
def final_model(inputs):
    resize = UpSampling2D(size=(7,7))(inputs)
    resnet_fe = feature_extractor(resize)
    classification_output = classifier(resnet_fe)
    
    return classification_output
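By default, `UpSampling2D` uses nearest-neighbor interpolation, which simply repeats each pixel. A rough NumPy equivalent is sketched below; the helper `nearest_upsample` and the demo arrays are hypothetical and serve only to show the shape arithmetic.

```python
import numpy as np

def nearest_upsample(images, factor):
    """Repeat every pixel `factor` times along height and width."""
    upsampled = np.repeat(images, factor, axis=1)  # stretch the rows
    return np.repeat(upsampled, factor, axis=2)    # stretch the columns

# Toy batch of two 32x32 RGB images with distinct pixel values
demo_images = np.arange(2 * 32 * 32 * 3, dtype=np.float32).reshape(2, 32, 32, 3)
demo_resized = nearest_upsample(demo_images, 7)

print(demo_resized.shape)  # (2, 224, 224, 3)
```

Each original pixel becomes a 7x7 block of identical values, so no new detail is created; the step only makes the images large enough for the base model.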

In [10]:
def compile_model():
    inputs = Input(shape=(32, 32,3))
    classification_output = final_model(inputs)
    model = Model(inputs=inputs, outputs=classification_output)
    model.compile(optimizer=Adam(learning_rate=0.001),
                 loss='categorical_crossentropy',
                 metrics=['accuracy'])
    return model

## Creating and Summarizing the Model

In this step, we create the neural network by calling `compile_model`, which builds the architecture and compiles it with the chosen optimizer, loss function, and evaluation metric. Once the model is created, we call its `summary` method to print a concise overview of the architecture: the type and output shape of each layer and the number of parameters.

In [11]:
model = compile_model()
model.summary()

## Training the Model with Early Stopping

Here, we employ the early stopping technique by defining an early stopping callback, which monitors the validation loss during training and halts the training process if the validation loss does not improve for a specified number of epochs (patience). The `restore_best_weights=True` argument ensures that the model's weights are reverted to the configuration yielding the lowest validation loss when training concludes.
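The stopping rule can be illustrated with a simplified, framework-free simulation. This is a sketch of the idea rather than the Keras implementation: the function `simulate_early_stopping` and the validation losses are made up for the example.

```python
def simulate_early_stopping(val_losses, patience=3):
    """Return (epoch at which training stops, epoch with the best loss)."""
    best_loss = float("inf")
    best_epoch = None
    wait = 0  # epochs since the last improvement

    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Improvement: remember it and reset the counter
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                # No improvement for `patience` epochs: stop here
                return epoch, best_epoch

    # Ran through all epochs without triggering the stop
    return len(val_losses) - 1, best_epoch

# Made-up validation losses, one per epoch (0-indexed)
demo_val_losses = [1.00, 0.80, 0.70, 0.72, 0.71, 0.75, 0.60]
stopped_epoch, best_epoch = simulate_early_stopping(demo_val_losses, patience=3)
print(stopped_epoch, best_epoch)  # 5 2
```

In this toy run, training stops at epoch 5 and never reaches the better loss at epoch 6, which shows the trade-off behind the choice of `patience`. With `restore_best_weights=True`, the weights from epoch 2 would be the ones kept.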

In [None]:
# Early stopping callback
early_stopping = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

# Train the model
history = model.fit(x_train, y_train, epochs=20, batch_size=32, validation_data=(x_valid, y_valid), callbacks=[early_stopping])

Epoch 1/20
   6/1407 ━━━━━━━━━━━━━━━━━━━━ 3:37:11 9s/step - accuracy: 0.1647 - loss: 3.2323