
#Introduction & Data Preparation

**Transfer Learning** is the process of taking a pre-trained, well-performing model and applying it to our own related problem. For example, we can take a model that was trained on millions of varied images and adjust it to classify different kinds of food.

Transfer learning is an important, efficient, and popular technique in machine learning because it lets us train deep neural networks faster and with comparatively little data, and it usually results in better models.

For example, one of the best CV models, **EfficientNet**, can be repurposed for the food classification problem we're already familiar with. We've seen that a manually built TinyVGG does not perform very well on 10 types of food, so our next course of action is to use a pretrained **EfficientNet** model to solve our problem.

There are **two** main types of transfer learning:


1.   **Feature Extraction** - in feature extraction we use the representations (weights) learned by a previous network to extract meaningful features from new data. We simply replace the output layer to match the problem we're working with and freeze all the other layers. This means there is no need to retrain the entire model, only the new output layer.
2.   **Fine-Tuning** - in fine-tuning we unfreeze a few of the layers of a pretrained model and train those unfrozen layers together with the new output layer. This allows us to "fine-tune" the higher-order feature representations in the base model in order to make them more relevant for the specific task.
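
The difference between the two approaches comes down to which layers are allowed to update during training. The framework-free sketch below illustrates this with a toy stand-in for a pretrained network. The layer names and parameter counts are made up for illustration; in Keras the same idea is expressed through each layer's `trainable` attribute.

```python
# Toy stand-in for a pretrained base network: (layer name, parameter count).
# Names and counts are hypothetical, chosen only for illustration.
BASE_LAYERS = [("stem", 1_000), ("block_1", 5_000), ("block_2", 20_000), ("block_3", 80_000)]
NEW_HEAD = ("output_layer", 2_570)  # the replaced output layer, always trained


def trainable_plan(num_unfrozen):
    """Mark the last `num_unfrozen` base layers, plus the new head, as trainable."""
    first_unfrozen = len(BASE_LAYERS) - num_unfrozen
    plan = [(name, params, index >= first_unfrozen)
            for index, (name, params) in enumerate(BASE_LAYERS)]
    plan.append((NEW_HEAD[0], NEW_HEAD[1], True))
    return plan


def trainable_params(plan):
    """Sum the parameter counts of the layers that will be updated."""
    return sum(params for _, params, trainable in plan if trainable)


feature_extraction = trainable_plan(num_unfrozen=0)  # whole base frozen
fine_tuning = trainable_plan(num_unfrozen=2)         # top two base layers unfrozen

print("Feature extraction trains", trainable_params(feature_extraction), "parameters")
print("Fine-tuning trains", trainable_params(fine_tuning), "parameters")
```

In both cases the new output layer is trained; fine-tuning additionally updates the topmost layers of the base network, which is why it usually needs more data and compute.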



For the data, we'll be using a subset of the Food101 dataset that contains only 10 types of food and 10% of the training images.

We're starting with a tiny subset because in deep learning it is a good idea to start small, check that the model works well on the small subset, and then add more data as needed. Working with a small subset also highlights a key strength of transfer learning: it can perform well with only a little data.

In [1]:
# Downloading and unzipping the data
import zipfile

!wget https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_10_percent.zip

with zipfile.ZipFile('10_food_classes_10_percent.zip', 'r') as zipref:
  zipref.extractall()

--2024-06-28 18:08:31--  https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_10_percent.zip
Resolving storage.googleapis.com (storage.googleapis.com)... 172.253.118.207, 74.125.200.207, 74.125.130.207, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|172.253.118.207|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 168546183 (161M) [application/zip]
Saving to: ‘10_food_classes_10_percent.zip’


2024-06-28 18:08:41 (18.0 MB/s) - ‘10_food_classes_10_percent.zip’ saved [168546183/168546183]



In [2]:
# Walking through the data directories
import os

for dirpath, dirnames, filenames in os.walk('10_food_classes_10_percent'):
  print(f"There are {len(dirnames)} directories and {len(filenames)} files in {dirpath}.")

There are 2 directories and 0 files in 10_food_classes_10_percent.
There are 10 directories and 0 files in 10_food_classes_10_percent/train.
There are 0 directories and 75 files in 10_food_classes_10_percent/train/chicken_curry.
There are 0 directories and 75 files in 10_food_classes_10_percent/train/hamburger.
There are 0 directories and 75 files in 10_food_classes_10_percent/train/pizza.
There are 0 directories and 75 files in 10_food_classes_10_percent/train/sushi.
There are 0 directories and 75 files in 10_food_classes_10_percent/train/ice_cream.
There are 0 directories and 75 files in 10_food_classes_10_percent/train/ramen.
There are 0 directories and 75 files in 10_food_classes_10_percent/train/fried_rice.
There are 0 directories and 75 files in 10_food_classes_10_percent/train/chicken_wings.
There are 0 directories and 75 files in 10_food_classes_10_percent/train/grilled_salmon.
There are 0 directories and 75 files in 10_food_classes_10_percent/train/steak.
There are 10 director

In [3]:
# Turning our data into dataset
import tensorflow as tf
from tensorflow.keras.preprocessing import image_dataset_from_directory

BATCH_SIZE = 32
IMAGE_SIZE = (224,224)

train_dir = '10_food_classes_10_percent/train/'
test_dir = '10_food_classes_10_percent/test/'

train_data = image_dataset_from_directory(directory = train_dir,
                                          batch_size = BATCH_SIZE,
                                          image_size = IMAGE_SIZE,
                                          labels = 'inferred',
                                          label_mode = 'categorical',
                                          shuffle = True,
                                          seed = 42)

test_data = image_dataset_from_directory(directory = test_dir,
                                          batch_size = BATCH_SIZE,
                                          image_size = IMAGE_SIZE,
                                          labels = 'inferred',
                                          label_mode = 'categorical',
                                          shuffle = False,
                                          seed = 42)

Found 750 files belonging to 10 classes.
Found 2500 files belonging to 10 classes.
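
These two counts also determine how many batches each dataset yields per epoch, which is the value `len(train_data)` and `len(test_data)` return. A quick check of the arithmetic, assuming the final, smaller batch is kept rather than dropped (the helper `batch_summary` is hypothetical):

```python
import math

BATCH_SIZE = 32


def batch_summary(num_images, batch_size=BATCH_SIZE):
    """Return (number of batches, size of the final batch)."""
    num_batches = math.ceil(num_images / batch_size)
    last_batch = num_images - (num_batches - 1) * batch_size
    return num_batches, last_batch


train_batches = batch_summary(750)   # 75 images x 10 classes
test_batches = batch_summary(2500)

print("train:", train_batches)
print("test:", test_batches)
```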


In [4]:
# Getting the class names
class_names = train_data.class_names
class_names

['chicken_curry',
 'chicken_wings',
 'fried_rice',
 'grilled_salmon',
 'hamburger',
 'ice_cream',
 'pizza',
 'ramen',
 'steak',
 'sushi']
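
Because we set `label_mode = 'categorical'`, each label is delivered as a one-hot vector whose positions follow the order of `class_names`. A small plain-Python sketch of that encoding (the helper `one_hot` is hypothetical, written only for illustration):

```python
class_names = ['chicken_curry', 'chicken_wings', 'fried_rice', 'grilled_salmon',
               'hamburger', 'ice_cream', 'pizza', 'ramen', 'steak', 'sushi']


def one_hot(label, names=class_names):
    """Return a one-hot list with 1.0 at the index of `label` and 0.0 elsewhere."""
    index = names.index(label)
    return [1.0 if i == index else 0.0 for i in range(len(names))]


pizza_vector = one_hot('pizza')
print(pizza_vector)
```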

#Setting up Callbacks

**Callbacks** are extra pieces of functionality that can be attached to a model to run during or after the fitting process.

Previously, we got familiar with the LearningRateScheduler callback, which helped us find the ideal learning rate for our model.

In the future, we will be using more of them in order to make our work easier and more efficient. Some of the callbacks we'll be using are:


1.   **TensorBoard** - logs the performance of multiple models, and helps compare said models visually in a single, easy-to-use interface. Can be accessed with `tf.keras.callbacks.TensorBoard()`.
2.   **Model Checkpointing** - saves our model as it trains, even before it finishes training fully, so we can resume training at a later time. Helpful when training a model takes a long time. Can be accessed with `tf.keras.callbacks.ModelCheckpoint()`.
3.   **Early Stopping** - stops training the model when it stops improving. Helpful when the number of epochs can't be decided in advance, or when we have a large dataset and don't know how long the training will take. Can be accessed with `tf.keras.callbacks.EarlyStopping()`.
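
The idea behind early stopping can be sketched in plain Python. This is a simplified illustration rather than the Keras implementation: the function name is hypothetical, the validation losses are made up, and `patience` and `min_delta` are named after the corresponding `tf.keras.callbacks.EarlyStopping` arguments.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None if it never stops early."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            # The monitored value improved, so reset the counter
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Made-up validation losses: improvement stalls after epoch 3
losses = [1.20, 0.95, 0.90, 0.91, 0.92, 0.93, 0.80]

print(early_stopping_epoch(losses, patience=3))
print(early_stopping_epoch(losses, patience=5))
```

With a patience of 3 the run stops before the late improvement in the final epoch is reached, while a patience of 5 lets training continue, which is the trade-off the `patience` setting controls.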



Mainly, we will be using the TensorBoard callback to easily compare the results of different models.

For this, we will write a function that creates a TensorBoard callback for each model we train and saves all of their logs under one folder.

In [5]:
import datetime

def create_tb_callback(dir_name, experiment_name):
  '''
  Creates a TensorBoard callback for a model that writes its logs to the
  dir_name/experiment_name/CURRENT_DATETIME directory.
  '''
  log_dir = dir_name + '/' + experiment_name + '/' + datetime.datetime.now().strftime('%Y%m%d-%H%M%S')
  tb_callback = tf.keras.callbacks.TensorBoard(log_dir = log_dir)
  print(f'Creating a TensorBoard log in {log_dir}')
  return tb_callback
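
To see why every training run ends up in its own folder, here is a framework-free sketch of how the log path is assembled. The helper `build_log_dir` and its `now` parameter are hypothetical, added only so the timestamp can be pinned for the example, and the second experiment name is made up.

```python
import datetime


def build_log_dir(dir_name, experiment_name, now=None):
    """Build dir_name/experiment_name/TIMESTAMP, using the current time by default."""
    now = now or datetime.datetime.now()
    return '/'.join([dir_name, experiment_name, now.strftime('%Y%m%d-%H%M%S')])


# Pinning the time makes the example reproducible
fixed_time = datetime.datetime(2024, 6, 28, 18, 12, 23)

resnet_dir = build_log_dir('tf_hub', 'resnet50V2', now=fixed_time)
other_dir = build_log_dir('tf_hub', 'another_experiment', now=fixed_time)

print(resnet_dir)
print(other_dir)
```

Since all experiments share the same top-level folder, pointing TensorBoard at that folder lets us compare their runs side by side.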

#Using Transfer Learning & Creating a Feature Extractor

Now comes the exciting part of using a pretrained model for our image classification problem.

TensorFlow offers a huge number of pretrained models for various applications that are ready to be applied and fine-tuned for our custom problems. All of them can be found on [TensorFlow Hub](https://www.kaggle.com/models?tfhub-redirect=true).

For our problem, we will be using two of the most popular and well-performing architectures:

1.   [ResNetV2](https://arxiv.org/abs/1603.05027) - a state of the art computer vision model architecture from 2016. Can be found [here](https://www.kaggle.com/models/google/resnet-v2) on TensorFlow Hub.
2.   [EfficientNetV2](https://arxiv.org/abs/2104.00298) - a state of the art computer vision architecture from 2021, the successor to the original [EfficientNet](https://arxiv.org/abs/1905.11946) from 2019. Can be found [here](https://www.kaggle.com/models/google/efficientnet-v2) on TensorFlow Hub.

How do we know which model performs the best for a certain type of application? Luckily, there is a great website called [paperswithcode.com/sota](https://paperswithcode.com/sota) that contains many benchmarks for various models and applications.

It is important to remember that the best-performing model is not always the best choice for every problem. Things like the model's size and complexity and the size of the dataset, among others, need to be taken into consideration when choosing a model.

For example, choosing an overly complex and huge model for a relatively small dataset will most likely yield bad results. Likewise, if we want a model that trains and predicts fast, a smaller model that is not the top performer may be the ideal choice.

First we need to download the models (feature extractors) from the hub.

Luckily, the hub gives us the needed code to instantly download any model we need, and gives us the path to access the model.

We will write a function that takes in the URL of the model and creates a Keras feature extractor out of it.

In [12]:
# Saving models' urls in variables
effnetb0_url = "https://www.kaggle.com/models/google/efficientnet-v2/TensorFlow2/imagenet1k-b0-feature-vector/2"
resnetv2_url = "https://www.kaggle.com/models/google/resnet-v2/TensorFlow2/50-feature-vector/2"

In [13]:
import tensorflow_hub as hub

def create_model(model_url, num_classes = 10):
  '''
  Creates a Keras sequential feature extractor by providing a model_url, and the
  number of classes of the output layer.

  Args:
    model_url: the URL of the feature extractor.
    num_classes: number of classes in the output classifier layer. Default = 10.

  Returns:
    An uncompiled Keras sequential feature extractor.
  '''
  # Downloading the pre-trained model and saving it as a keras layer
  feature_extraction_layer = hub.KerasLayer(model_url,
                                           trainable = False, # We don't need to retrain the weights in the model, hence we freeze them by setting trainable to False
                                           name = 'feature_extraction_layer',
                                           input_shape = IMAGE_SIZE + (3,)) # Adding the color channel dimension in the input shape

  # Creating the model with the feature extractor as the backbone
  model = tf.keras.Sequential([
      tf.keras.layers.Rescaling(1./255),
      feature_extraction_layer,
      tf.keras.layers.Dense(num_classes, activation = 'softmax', name = 'output_layer')
  ])

  return model

Generally, a model comes in several variations, which are reflected by the number that follows the model's name.

For example, in our case, we are working with ResNetV2_50, but there are also ResNetV2_101, ResNetV2_152, etc. The number represents the number of layers in the ResNet architecture: the higher the number, the more layers there are and the bigger and more complex the model is.

For EfficientNetV2, there are also different variations. We are working with an EfficientNetV2 that was trained on [ImageNet](https://www.image-net.org/)1k (1k means it was trained on 1000 classes of images; there is also a 21k version) with a size of B0, but there are also B1, B2, B3, etc. that contain more layers.

All of these variations can be accessed through the TensorFlow Hub.

In [16]:
# Creating a ResNetV2 model using the function created above
resnet_model = create_model(resnetv2_url,
                            num_classes = len(class_names))

In [25]:
# Getting a summary of the model
resnet_model.summary()

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 rescaling_1 (Rescaling)     (None, 224, 224, 3)       0         
                                                                 
 feature_extraction_layer (  (None, 2048)              23564800  
 KerasLayer)                                                     
                                                                 
 output_layer (Dense)        (None, 10)                20490     
                                                                 
Total params: 23585290 (89.97 MB)
Trainable params: 20490 (80.04 KB)
Non-trainable params: 23564800 (89.89 MB)
_________________________________________________________________
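
The numbers in the summary can be checked by hand. The sketch below assumes each parameter is stored as a 32-bit float (4 bytes), which is consistent with the sizes reported above:

```python
FEATURE_DIM = 2048          # output size of the feature extraction layer
NUM_CLASSES = 10
FROZEN_PARAMS = 23_564_800  # parameters in the frozen ResNetV2 backbone
BYTES_PER_PARAM = 4         # assumption: float32 weights

# Dense layer: one weight per (feature, class) pair plus one bias per class
dense_params = FEATURE_DIM * NUM_CLASSES + NUM_CLASSES
total_params = FROZEN_PARAMS + dense_params

dense_kb = dense_params * BYTES_PER_PARAM / 1024
total_mb = total_params * BYTES_PER_PARAM / 1024 ** 2
trainable_share = 100 * dense_params / total_params

print(dense_params, total_params)
print(f"{dense_kb:.2f} KB trainable, {total_mb:.2f} MB total, {trainable_share:.3f}% trainable")
```

Only a tiny fraction of the model's parameters is updated during training, which is why feature extraction trains so quickly.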


In [18]:
# Compiling the model
resnet_model.compile(loss = 'categorical_crossentropy',
                     optimizer = tf.keras.optimizers.Adam(),
                     metrics = ['accuracy'])
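
The `softmax` output layer and the `categorical_crossentropy` loss fit together as follows. This is a plain-Python sketch of the formulas for a single example, not the Keras implementation, and the example scores are made up.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot_label, probs):
    """Negative log of the probability assigned to the true class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs) if t > 0)


label = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]  # one-hot label for class index 6

uniform_probs = softmax([0.0] * 10)                       # undecided: equal scores for all classes
confident_probs = softmax([0.0] * 6 + [5.0] + [0.0] * 3)  # high score for the true class

uniform_loss = categorical_crossentropy(label, uniform_probs)
confident_loss = categorical_crossentropy(label, confident_probs)

print(f"undecided: {uniform_loss:.4f}, confident and correct: {confident_loss:.4f}")
```

A 10-class model that spreads its predictions evenly has a loss of ln(10), roughly 2.30, so losses well below that value suggest the model is picking up useful signal.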

In [19]:
# Fitting the model and attaching a TensorBoard callback to it
resnet_hitsory = resnet_model.fit(train_data,
                                  epochs = 5,
                                  steps_per_epoch = len(train_data),
                                  validation_data = test_data,
                                  validation_steps = len(test_data),
                                  callbacks = [create_tb_callback(dir_name = 'tf_hub',
                                                                  experiment_name = 'resnet50V2')])

Creating a TensorBoard log in tf_hub/resnet50V2/20240628-181223
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


Already we can see a huge improvement over any model we've built previously! We managed to get 78.16% validation accuracy, as opposed to the ~30% accuracy we got from TinyVGG, all while training on only 10% of the data and for far less time (since we're only training the output layer).