# Transfer Learning with TensorFlow Part 1: Feature Extraction

Transfer learning means taking a working model's existing architecture and learned patterns and applying them to our own problem.

There are two main benefits:
1. We can leverage an existing neural network architecture proven to work on problems similar to our own.
2. We can leverage a working neural network architecture which has already learned patterns on data similar to our own, then adapt those patterns to our own data.

Video N°143: Downloading and preparing data for our first transfer learning model

## Downloading and becoming one with the data

In [1]:
from MachineLearningUtils.data_acquisition.data_downloader import download_data

In [2]:
# Get data (10% of 10 food classes from Food101) - https://www.kaggle.com/datasets/dansbecker/food-101
url = "https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_10_percent.zip"
file_path = "10_food_classes_10_percent.zip"
download_data(url=url, file_path=file_path, extract=True)

The file 10_food_classes_10_percent.zip already exists.
Extracting 10_food_classes_10_percent.zip as ZIP...
10_food_classes_10_percent.zip has been extracted to current directory.


In [3]:
# How many images in each folder?
import os

# Walk through 10 percent data directory and list number of files
for dirpath, dirnames, filenames in os.walk("10_food_classes_10_percent"):
    print(f"There are {len(dirnames)} directories and {len(filenames)} images in '{dirpath}'.")

There are 2 directories and 0 images in '10_food_classes_10_percent'.
There are 10 directories and 0 images in '10_food_classes_10_percent/test'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/steak'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/pizza'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/sushi'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/fried_rice'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/chicken_curry'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/grilled_salmon'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/ice_cream'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/hamburger'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/ramen'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/ch

## Creating data loaders (preparing the data)

We'll use the `ImageDataGenerator` class to load in our images in batches. 
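
To make "in batches" concrete, here is a minimal sketch of the arithmetic behind it. The count of 2,500 test images comes from the directory listing above (10 classes with 250 images each), and the batch size of 32 is the value used in the next cell. The helper name `num_batches` is hypothetical.

```python
import math


def num_batches(num_images: int, batch_size: int = 32) -> int:
    """Number of batches needed to see every image once (the last batch may be smaller)."""
    return math.ceil(num_images / batch_size)


test_images = 10 * 250  # 10 classes x 250 test images each, per the listing above
print(num_batches(test_images))  # 79 (78 full batches of 32 plus one batch of 4)
```

This should be the same number that `len(test_data)` reports for a directory iterator, which is why that value is later passed as `validation_steps`.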

In [4]:
# Setup data inputs
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_dir = "10_food_classes_10_percent/train/"
test_dir = "10_food_classes_10_percent/test/"

flow_from_dir_args = {
    "target_size": (224, 224),
    "color_mode": "rgb",
    "batch_size": 32,
    "class_mode": "categorical",
    "shuffle": True
}

train_datagen = ImageDataGenerator(rescale=1/255.)
test_datagen = ImageDataGenerator(rescale=1/255.)

print("Training images:")
train_data_10_percent = train_datagen.flow_from_directory(directory=train_dir, **flow_from_dir_args)

print("Testing images:")
test_data = test_datagen.flow_from_directory(directory=test_dir, **flow_from_dir_args)

2024-04-02 10:23:18.715890: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


ImportError: cannot import name 'cast' from partially initialized module 'keras.src.backend' (most likely due to a circular import) (/home/wm18vw/miniconda3/envs/ML-env/lib/python3.9/site-packages/keras/src/backend/__init__.py)

Video N°144: Introducing Callbacks in TensorFlow and making a callback to track our models

## Setting up callbacks (things to run whilst our model trains)

Callbacks are extra functionality you can add to your models to be performed during or after training. Some of the most popular callbacks:
* Tracking experiments with the TensorBoard callback
* Model checkpoint with the ModelCheckpoint callback
* Stopping a model from training (before it trains too long and overfits) with the EarlyStopping callback
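
To give a feel for what the EarlyStopping callback does, here is a simplified, framework-free sketch of the "patience" idea. It is not Keras's implementation. The function name and the loss values are made up purely for illustration.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the zero-based epoch at which training would stop, or None if it never stops.

    Training stops once the validation loss has failed to improve on the best
    value seen so far for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss   # new best value, reset the counter
            wait = 0
        else:
            wait += 1     # no improvement this epoch
            if wait >= patience:
                return epoch
    return None


# Illustrative validation losses (not real results)
losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.73, 0.74]
print(early_stopping_epoch(losses, patience=3))  # 5
```

In Keras the equivalent behaviour is configured on `tf.keras.callbacks.EarlyStopping` through its `monitor` and `patience` arguments.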

In [None]:
# Create TensorBoard callback (functionized because we need to create a new one for each model)
from MachineLearningUtils.training_utilities.model_callbacks import create_tensorboard_callback

🔑 **Note:** You can customize the directory where your TensorBoard logs (model training metrics) get saved to whatever you like. The `log_dir` used by the callback helper above is only one option.
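
The callback helper imported above is not shown in this notebook, so as a rough illustration of how a `log_dir` might be assembled, here is a minimal sketch. The names `make_log_dir` and `create_tensorboard_callback_sketch` are hypothetical, and the directory layout (directory, experiment name, timestamp) is an assumption rather than the helper's actual behaviour.

```python
import datetime
import os


def make_log_dir(dir_name, experiment_name, now=None):
    """Build a unique log directory path: <dir_name>/<experiment_name>/<timestamp>."""
    now = now or datetime.datetime.now()
    return os.path.join(dir_name, experiment_name, now.strftime("%Y%m%d-%H%M%S"))


def create_tensorboard_callback_sketch(dir_name, experiment_name):
    """Assumed shape of a TensorBoard callback factory (not the notebook's real helper)."""
    import tensorflow as tf  # imported lazily so make_log_dir works without TensorFlow

    log_dir = make_log_dir(dir_name, experiment_name)
    print(f"Saving TensorBoard log files to: {log_dir}")
    return tf.keras.callbacks.TensorBoard(log_dir=log_dir)


# Only the path helper is exercised here
print(make_log_dir("tensorflow_hub", "resnet50V2"))
```

Timestamping each run keeps the logs of repeated experiments from overwriting each other.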

Video N°145: Exploring the TensorFlow Hub website for pretrained models

## Creating models using TensorFlow Hub

In the past we've used TensorFlow to create our own models layer by layer from scratch. Now we're going to follow a similar process, except the majority of our model's layers are going to come from TensorFlow Hub.

We can access pretrained models at https://tfhub.dev/. Browsing the TensorFlow Hub page and filtering for image classification, we found the following feature vector model:
https://tfhub.dev/tensorflow/efficientnet/b0/feature-vector/1

Video N°146: Building and compiling a TensorFlow feature extraction model

In [None]:
# Let's compare the following two models
resnet_ulr = "https://tfhub.dev/google/imagenet/resnet_v2_50/feature_vector/4"

efficientnet_url = "https://tfhub.dev/tensorflow/efficientnet/b0/feature-vector/1"

# New: EfficientNetB0 feature vector (version 2)
# efficientnet_url = "https://tfhub.dev/google/imagenet/efficientnet_v2_imagenet1k_b0/feature_vector/2"

In [None]:
# Import dependencies
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras import layers

In [None]:
# Import a helper that builds a complete model from a TensorFlow Hub URL
from MachineLearningUtils.training_utilities.transfer_learning import build_complete_model_from_url
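
The `build_complete_model_from_url` helper imported above isn't shown in this notebook. As a rough idea of what such a function might do, here is a minimal sketch: a frozen TensorFlow Hub feature extractor followed by a new `Dense` output layer. The function name `build_model_sketch`, the layer names and the default of 10 classes are assumptions. It also assumes a TensorFlow 2.x setup where `hub.KerasLayer` can be used inside `tf.keras.Sequential`; newer Keras versions may need adjustments.

```python
def build_model_sketch(model_url, num_classes=10, input_shape=(224, 224, 3)):
    """Assumed structure of a feature extraction model built from a TensorFlow Hub URL."""
    # Imported lazily so that defining the function does not require TensorFlow
    import tensorflow as tf
    import tensorflow_hub as hub

    # Download the pretrained model and freeze its already-learned patterns
    feature_extractor = hub.KerasLayer(model_url,
                                       trainable=False,
                                       name="feature_extraction_layer",
                                       input_shape=input_shape)

    # Only the new output layer is trained on our own classes
    return tf.keras.Sequential([
        feature_extractor,
        tf.keras.layers.Dense(num_classes, activation="softmax", name="output_layer"),
    ])
```

Calling it with one of the URLs defined earlier would download the pretrained weights, so it needs network access on first use.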

### Creating and testing ResNet TensorFlow Hub Feature Extraction model

In [None]:
# Create Resnet model
resnet_model = build_complete_model_from_url(model_url=resnet_ulr,
                                             num_classes=train_data_10_percent.num_classes,
                                             input_shape=(224, 224, 3))

In [None]:
resnet_model.summary()

In [None]:
train_data_10_percent.num_classes

In [None]:
# Compile our resnet model
resnet_model.compile(loss="categorical_crossentropy",
                     optimizer=tf.keras.optimizers.Adam(),
                     metrics=["accuracy"])

Video N°147: Blowing our previous models out of the water with transfer learning

In [None]:
# Let's fit our ResNet model to the data (10 percent of 10 classes)
resnet_history = resnet_model.fit(train_data_10_percent,
                                  epochs=5,
                                  steps_per_epoch=len(train_data_10_percent),
                                  validation_data=test_data,
                                  validation_steps=len(test_data),
                                  callbacks=[create_tensorboard_callback(dir_name="tensorflow_hub",
                                                                         experiment_name="resnet50V2"
                                                                         )])

Wow!

That. Is. Incredible. Our transfer learning feature extractor model outperformed ALL of the previous models we built by hand (substantially), in a quicker training time AND with only 10% of the training examples.

Video N°148: Plotting the loss curves of our ResNet feature extraction model

In [None]:
# Import a helper to plot our loss curves, then plot the ResNet training history
from MachineLearningUtils.data_visualization.model_learning_curves import plot_loss_curves
plot_loss_curves(history=resnet_history)
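
The imported `plot_loss_curves` helper isn't shown here. A minimal sketch of what such a function might look like follows, assuming the `History` object returned by `fit()` exposes a `history` dictionary with `loss`, `val_loss`, `accuracy` and `val_accuracy` keys. The names `extract_curves` and `plot_loss_curves_sketch` are hypothetical, and the numbers in the usage line are dummy values, not results.

```python
def extract_curves(history_dict):
    """Return (epochs, curves) where curves maps metric name -> list of per-epoch values."""
    epochs = list(range(1, len(history_dict["loss"]) + 1))
    wanted = ("loss", "val_loss", "accuracy", "val_accuracy")
    curves = {name: history_dict[name] for name in wanted if name in history_dict}
    return epochs, curves


def plot_loss_curves_sketch(history):
    """Plot loss and accuracy curves side by side from a Keras History object."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    epochs, curves = extract_curves(history.history)
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(12, 4))
    for name, values in curves.items():
        axis = ax_loss if "loss" in name else ax_acc
        axis.plot(epochs, values, label=name)
    for axis, title in ((ax_loss, "Loss"), (ax_acc, "Accuracy")):
        axis.set_title(title)
        axis.set_xlabel("Epochs")
        axis.legend()
    plt.show()


# Dummy values purely to show the data layout
print(extract_curves({"loss": [1.0, 0.5], "val_loss": [1.1, 0.6]}))
```

Keeping the training and validation curves on the same axes makes it easier to spot overfitting, where the validation loss rises while the training loss keeps falling.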

Video N°149: Building and training a pre-trained EfficientNet model on our data

### Creating and testing EfficientNetB0 TensorFlow Hub Feature Extraction model
About EfficientNet: [https://blog.research.google/2019/05/efficientnet-improving-accuracy-and.html](https://blog.research.google/2019/05/efficientnet-improving-accuracy-and.html)

In [None]:
# Create EfficientNetB0 feature extractor model
efficientnet_model = build_complete_model_from_url(model_url=efficientnet_url,
                                                   num_classes=train_data_10_percent.num_classes,
                                                   input_shape=(224, 224, 3))

# Compile EfficientNet model
efficientnet_model.compile(loss="categorical_crossentropy",
                           optimizer=tf.keras.optimizers.Adam(),
                           metrics=['accuracy'])

# Fit EfficientNet model to 10% of training data
efficientnet_history = efficientnet_model.fit(train_data_10_percent,
                                              epochs=5,
                                              steps_per_epoch=len(train_data_10_percent),
                                              validation_data=test_data,
                                              validation_steps=len(test_data),
                                              callbacks=[create_tensorboard_callback(dir_name="tensorflow_hub",
                                                                                     experiment_name="efficientnetb0")])


In [5]:
plot_loss_curves(history=efficientnet_history)

NameError: name 'plot_loss_curves' is not defined

Video N°150: Different Types of Transfer Learning

In [None]:
efficientnet_model.summary()

In [None]:
resnet_model.summary()

In [None]:
# How many weight tensors does our EfficientNetB0 feature extractor layer have? (.weights counts tensors, not layers)
len(efficientnet_model.layers[0].weights)

## Different types of transfer learning

* **"As is" transfer learning** - using an existing model with no changes whatsoever (e.g. using an ImageNet model on the 1000 ImageNet classes, none of your own)
* **"Feature extraction" transfer learning** - use the prelearned patterns of an existing model (e.g. EfficientNetB0 trained on ImageNet) and adjust the output layer for your own problem (e.g. 1000 classes → 10 classes of food)
* **"Fine-tuning" transfer learning** - use the prelearned patterns of an existing model and "fine-tune" many or all of the underlying layers (including new output layers)
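
One way to compare the three approaches is by which parts of the model are updated during training. The sketch below encodes that as plain Python flags. The function name `trainable_plan`, the strategy strings and the `unfreeze_last` parameter are all hypothetical and only meant to illustrate the idea.

```python
def trainable_plan(strategy, num_base_layers, unfreeze_last=0):
    """Return (base_layer_flags, train_output_layer) for a transfer learning strategy.

    base_layer_flags[i] is True when pretrained layer i would be updated during training.
    """
    if strategy == "as_is":
        # Use the pretrained model unchanged: nothing is trained
        return [False] * num_base_layers, False
    if strategy == "feature_extraction":
        # Freeze every pretrained layer, train only the new output layer
        return [False] * num_base_layers, True
    if strategy == "fine_tuning":
        # Unfreeze the last `unfreeze_last` pretrained layers as well as the output layer
        k = min(unfreeze_last, num_base_layers)
        return [False] * (num_base_layers - k) + [True] * k, True
    raise ValueError(f"Unknown strategy: {strategy}")


print(trainable_plan("feature_extraction", 4))           # ([False, False, False, False], True)
print(trainable_plan("fine_tuning", 4, unfreeze_last=2))  # ([False, False, True, True], True)
```

In Keras these flags would correspond to setting each layer's `trainable` attribute before compiling the model.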

Video N°151: Comparing Our Model's Results

## Comparing our models' results using TensorBoard

> 🔑 **Note:** When you upload things to TensorBoard.dev, your experiments are public. So if you're running private experiments (things you don't want others to see), do not upload them to TensorBoard.dev.

> **Update!!!:** TensorBoard.dev has been shut down as of January 1, 2024!

```python
# Upload TensorBoard dev records
!tensorboard dev upload --logdir "./tensorflow_hub/" --name "EfficientNetB0 vs. ResNet50V2" --description "Comparing two different TF Hub feature extraction model architectures using 10% of the training data" --one_shot
```

Our TensorBoard experiments were uploaded publicly: https://tensorboard.dev/experiment/dQBrpdwlRgS2ql0Andv8Yg/

In [None]:
# Check out what TensorBoard experiments you have
# !tensorboard dev list

In [None]:
# Delete an experiment
# !tensorboard dev delete --experiment_id dQBrpdwlRgS2ql0Andv8Yg
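
Since TensorBoard.dev is no longer available, the same logs can be inspected locally instead. The sketch below starts a local TensorBoard server from Python, assuming the `tensorboard` package is installed and the logs live in `./tensorflow_hub/`. The function names are hypothetical.

```python
def tensorboard_argv(logdir, port=None):
    """Build the argument list passed to TensorBoard (the first entry is a placeholder program name)."""
    argv = [None, "--logdir", logdir]
    if port is not None:
        argv += ["--port", str(port)]
    return argv


def launch_local_tensorboard(logdir="./tensorflow_hub/", port=None):
    """Start a local TensorBoard server and return its URL."""
    from tensorboard import program  # imported lazily; requires the tensorboard package

    tb = program.TensorBoard()
    tb.configure(argv=tensorboard_argv(logdir, port))
    return tb.launch()


print(tensorboard_argv("./tensorflow_hub/"))  # [None, '--logdir', './tensorflow_hub/']
```

Unlike the old upload command, nothing leaves your machine with this approach, so private experiments stay private.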