<a href="https://colab.research.google.com/github/luiscunhacsc/udemy-ai-en/blob/main/part1b_vision/part1b_example02_vgg16_cifar10.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction

In this example, we will learn how to use pre-trained models for image classification. We will use the "CIFAR-10" dataset, which consists of 60,000 32x32 color images divided into 10 different classes.

We will start by evaluating a pre-trained model with a new, untrained classification head to get a baseline. Then, we will freeze the pre-trained layers and train only the head (feature extraction), and afterwards unfreeze part of the base model and fine-tune it on our dataset. Finally, we will add data augmentation.

In [None]:
# Import necessary libraries
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical
import numpy as np
import matplotlib.pyplot as plt

# Load the CIFAR-10 dataset

The CIFAR-10 dataset contains 60,000 32x32 color images divided into 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images.

In [None]:
# Load the dataset
(train_images, train_labels), (test_images, test_labels) = cifar10.load_data()

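# Scale pixel values to [0, 1]; cast to float32 to halve memory versus float64.
# Note: VGG16's ImageNet weights were trained with
# keras.applications.vgg16.preprocess_input rather than /255 scaling.
train_images = train_images.astype("float32") / 255.0
test_images = test_images.astype("float32") / 255.0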

# Convert labels to one-hot encoding
train_labels = to_categorical(train_labels, 10)
test_labels = to_categorical(test_labels, 10)
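
To make the two preprocessing steps concrete, here is a small NumPy sketch of what scaling and one-hot encoding do. The arrays `fake_pixels` and `fake_labels` and the helper `one_hot` are hypothetical stand-ins for illustration, not CIFAR-10 data or the Keras implementation; the class names follow the standard CIFAR-10 label order.

```python
import numpy as np

# Standard CIFAR-10 class names, indexed by integer label (0-9).
CLASS_NAMES = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]


def one_hot(labels, num_classes):
    """Return an (n, num_classes) float32 array with a single 1 per row."""
    labels = np.asarray(labels).reshape(-1)  # CIFAR-10 labels have shape (n, 1)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


# Hypothetical stand-ins for real data: two 8-bit pixel values and three labels.
fake_pixels = np.array([0, 255], dtype="uint8")
fake_labels = np.array([[3], [0], [9]])

scaled = fake_pixels.astype("float32") / 255.0  # maps [0, 255] to [0.0, 1.0]
encoded = one_hot(fake_labels, len(CLASS_NAMES))
decoded = [CLASS_NAMES[i] for i in encoded.argmax(axis=1)]
```

Taking `argmax` over a one-hot row recovers the integer label, which is how a predicted probability vector is mapped back to a class name.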

# Using a pre-trained model

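Let's start with the VGG16 model pre-trained on the ImageNet dataset. Because CIFAR-10 has different classes and a 32x32 input size, we keep only the convolutional base (`include_top=False`) and add a new classification head on top. That head is randomly initialized, so evaluating the model before any training should give an accuracy close to chance (about 10% for 10 classes). This number serves as a baseline for the later stages.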

In [None]:
# Use the pre-trained VGG16 model
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(32, 32, 3))

# Add custom classification layers
x = base_model.output
x = layers.Flatten()(x)
x = layers.Dense(512, activation='relu')(x)
predictions = layers.Dense(10, activation='softmax')(x)

# Build the final model
model = Model(inputs=base_model.input, outputs=predictions)

# Compile the model
model.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])

# Evaluate the performance of the pre-trained model on the test set
evaluation = model.evaluate(test_images, test_labels)
print(f"Accuracy of the pre-trained model on the test set: {evaluation[1]}")
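
A quick back-of-the-envelope calculation shows how large the new head is. This sketch assumes the standard VGG16 layout (five 2x2 max-pooling stages and 512 channels in the last block); `model.summary()` remains the authoritative check. The helper `dense_params` is a hypothetical name introduced for illustration.

```python
# Assumption: standard VGG16 convolutional base
# (5 max-pooling stages, 512 channels in the final block).
input_size = 32
num_pool_stages = 5
final_channels = 512

# Each pooling stage halves height and width: 32 -> 16 -> 8 -> 4 -> 2 -> 1.
spatial = input_size // (2 ** num_pool_stages)
flattened = spatial * spatial * final_channels


def dense_params(inputs, units):
    """Number of weights plus biases in a fully connected layer."""
    return inputs * units + units


hidden_params = dense_params(flattened, 512)  # Dense(512)
output_params = dense_params(512, 10)         # Dense(10)
head_params = hidden_params + output_params

print(flattened, hidden_params, output_params, head_params)
```

Under these assumptions the head has 267,786 parameters, all randomly initialized, which is why the first evaluation cannot do better than guessing.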

# Fine-tuning the pre-trained model

Now, let's perform the fine-tuning of the pre-trained model. Initially, we will freeze all the layers of the base model (VGG16) and train only the top layers that we added. This is called "feature extraction". Then, we will unfreeze some of the layers of the base model and perform fine-tuning together with the added layers.

In [None]:
# Feature extraction - freeze the layers of the base model
for layer in base_model.layers:
    layer.trainable = False

# Compile the model
model.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
history_feature_extraction = model.fit(train_images, train_labels, batch_size=128, epochs=5, validation_data=(test_images, test_labels))
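
The training call above processes the 50,000 training images in batches of 128 for 5 epochs. The short calculation below shows how many weight updates that implies; the numbers come from the dataset description and the `fit` arguments, and the variable names are chosen for illustration only.

```python
import math

num_train_images = 50_000
batch_size = 128
epochs = 5

# The last batch of each epoch is smaller because 50,000 is not a multiple of 128.
steps_per_epoch = math.ceil(num_train_images / batch_size)
last_batch_size = num_train_images - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch_size, total_updates)
```

This gives 391 steps per epoch (the last one with 80 images) and 1,955 updates in total, which matches the step counter Keras shows in its progress bar.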

# Unfreeze some layers and perform fine-tuning

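Now let's unfreeze the last four layers of the base model (the final convolutional block of VGG16 and its pooling layer) and fine-tune them together with the added layers. The model must be recompiled after changing `trainable` for the change to take effect.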

In [None]:
# Unfreeze layers and perform fine-tuning
for layer in base_model.layers[-4:]:
    layer.trainable = True

# Compile the model
model.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
history_fine_tuning = model.fit(train_images, train_labels, batch_size=128, epochs=5, validation_data=(test_images, test_labels))
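
To see what `base_model.layers[-4:]` selects, the sketch below reproduces the slice on a plain list of layer names. The list is an assumption based on the usual Keras VGG16 naming (the input layer's name varies between Keras versions), so compare it with `[layer.name for layer in base_model.layers]` before relying on it.

```python
# Assumption: layer names of Keras VGG16 with include_top=False, in order.
# The name of the input layer differs between Keras versions.
VGG16_BASE_LAYERS = [
    "input_layer",
    "block1_conv1", "block1_conv2", "block1_pool",
    "block2_conv1", "block2_conv2", "block2_pool",
    "block3_conv1", "block3_conv2", "block3_conv3", "block3_pool",
    "block4_conv1", "block4_conv2", "block4_conv3", "block4_pool",
    "block5_conv1", "block5_conv2", "block5_conv3", "block5_pool",
]

unfrozen = VGG16_BASE_LAYERS[-4:]


def conv3x3_params(in_channels, out_channels):
    """Weights plus biases of a 3x3 convolution."""
    return 3 * 3 * in_channels * out_channels + out_channels


# Pooling layers have no weights; each block5 convolution maps 512 -> 512 channels.
newly_trainable = sum(
    conv3x3_params(512, 512) for name in unfrozen if "conv" in name
)

print(unfrozen, newly_trainable)
```

Under this assumption the slice covers exactly block 5, adding about 7.1 million trainable parameters on top of the head, so the fine-tuning stage updates far more weights than feature extraction did.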


# Data Augmentation

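Data augmentation applies random transformations to the training images, here horizontal flips, rotations, and zooms, so the model sees a slightly different version of each image in every epoch. The dataset itself does not grow; the variation is generated on the fly during training, and the augmentation layers are inactive at inference time. This can help the model generalize better.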

In [None]:
# Use Data Augmentation
# The preprocessing layers live directly under `layers` since TensorFlow 2.6;
# the older `layers.experimental.preprocessing` path is deprecated.
data_augmentation = keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.2)
])

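inputs = keras.Input(shape=(32, 32, 3))
x = data_augmentation(inputs)
# Reuse the base model from the previous cells (it keeps its fine-tuned weights);
# training=False runs it in inference mode.
x = base_model(x, training=False)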
x = layers.Flatten()(x)
x = layers.Dense(512, activation='relu')(x)
outputs = layers.Dense(10, activation='softmax')(x)

# Build the final model with Data Augmentation
model_with_augmentation = Model(inputs=inputs, outputs=outputs)

# Compile the model
model_with_augmentation.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
history_with_augmentation = model_with_augmentation.fit(train_images, train_labels, batch_size=128, epochs=5, validation_data=(test_images, test_labels))
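
The idea of on-the-fly augmentation can be illustrated without Keras. The following NumPy sketch flips each image of a batch left-to-right with a given probability; `random_horizontal_flip` is a hypothetical helper written for this illustration and is not how `RandomFlip` is implemented internally.

```python
import numpy as np


def random_horizontal_flip(batch, rng, probability=0.5):
    """Flip each image of an (n, height, width, channels) batch left-to-right
    with the given probability. Returns a new array of the same shape."""
    flip_mask = rng.random(batch.shape[0]) < probability
    flipped = batch.copy()
    flipped[flip_mask] = batch[flip_mask, :, ::-1, :]  # reverse the width axis
    return flipped


rng = np.random.default_rng(seed=0)

# A tiny made-up batch: 2 images of 2x3 pixels with 1 channel.
batch = np.arange(2 * 2 * 3 * 1, dtype="float32").reshape(2, 2, 3, 1)

augmented = random_horizontal_flip(batch, rng)
always_flipped = random_horizontal_flip(batch, rng, probability=1.0)
```

The batch keeps its shape and size; only the pixel layout of some images changes, and a different subset is flipped each time the function is called.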

# Visualizing the results

Now let's compare the validation accuracy and loss recorded during the three training stages: feature extraction, fine-tuning, and training with data augmentation. The untrained baseline was only evaluated once, so it does not appear in the plots.

In [None]:
# Visualization of results
plt.figure(figsize=(12, 4))

plt.subplot(1, 2, 1)
plt.plot(history_feature_extraction.history['val_accuracy'], label='Feature Extraction')
plt.plot(history_fine_tuning.history['val_accuracy'], label='Fine-tuning')
plt.plot(history_with_augmentation.history['val_accuracy'], label='Data Augmentation')
plt.legend()
plt.xlabel('Epochs')
plt.ylabel('Validation Accuracy')
plt.title('Validation Accuracy by Stage')

plt.subplot(1, 2, 2)
plt.plot(history_feature_extraction.history['val_loss'], label='Feature Extraction')
plt.plot(history_fine_tuning.history['val_loss'], label='Fine-tuning')
plt.plot(history_with_augmentation.history['val_loss'], label='Data Augmentation')
plt.legend()
plt.xlabel('Epochs')
plt.ylabel('Validation Loss')
plt.title('Validation Loss by Stage')

plt.show()
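
Alongside the plots, it can be handy to read off the best epoch of each stage as a number. The helper below is a minimal sketch that works on any dict of metric lists, such as the `.history` attribute returned by `model.fit`; the name `best_epoch` is hypothetical.

```python
def best_epoch(history, metric="val_accuracy"):
    """Return (1-based epoch, value) of the highest value recorded for `metric`.

    `history` is a dict mapping metric names to per-epoch lists of values.
    """
    values = history[metric]
    if not values:
        raise ValueError(f"no values recorded for {metric!r}")
    # On ties, max() keeps the first (earliest) epoch.
    best_index = max(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]
```

For example, `best_epoch(history_fine_tuning.history)` would return the epoch with the highest validation accuracy during fine-tuning. For a loss metric the lowest value is the one of interest, so this helper would need `min` instead of `max`.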


# Conclusion

In this example, we explored different techniques to leverage pre-trained models for image classification. We used the CIFAR-10 dataset and the pre-trained VGG16 model.

We started by evaluating the model with an untrained classification head as a baseline, then performed feature extraction, followed by fine-tuning, and finally added data augmentation.

The best techniques and hyperparameters depend on the dataset and the specific problem, so experimenting with and evaluating different approaches is an essential part of developing deep learning models.