# Transfer Learning

**Transfer Learning:**

A research problem in machine learning that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem.

Two main benefits:
* We can reuse an existing neural network architecture that is proven to work on problems similar to our own.
* We can reuse a network that has already learned patterns on data similar to our own, which often gives good results with less data.

**Transfer learning use cases:**
* Computer Vision
  - [ImageNet](https://www.image-net.org/): an image database organized according to the WordNet hierarchy, in which each node of the hierarchy is depicted by hundreds and thousands of images.
  - EfficientNet is a strong architecture family for this kind of task (it reported state-of-the-art ImageNet accuracy when it was published in 2019).
* Natural Language Processing
  - A subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data.

## Feature Extraction

In [None]:
import tensorflow as tf

# List the GPUs TensorFlow can see (an empty list means training will run on the CPU)
physical_devices = tf.config.list_physical_devices('GPU')
print(f'Devices found: {physical_devices}')

In [None]:
# Get Data (10% of 10 food classes from Food101)
import zipfile

# Download data
!wget -nc -P ../Downloads/ https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_10_percent.zip

# Unzip (the context manager closes the archive automatically)
with zipfile.ZipFile('../Downloads/10_food_classes_10_percent.zip') as zip_ref:
    zip_ref.extractall(path='../Downloads/')
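
The download step above relies on the `wget` shell command, which is not available on every system. As a rough alternative, the sketch below does the same thing with the Python standard library only. The helper name `download_and_extract` is hypothetical (it is not part of the original notebook), and the file is saved under the last path component of the URL.

```python
import urllib.request
import zipfile
from pathlib import Path


def download_and_extract(url, download_dir):
    """Download a zip archive (unless it already exists) and extract it.

    Hypothetical helper: a standard-library stand-in for `wget -nc` + unzip.
    """
    download_dir = Path(download_dir)
    download_dir.mkdir(parents=True, exist_ok=True)

    # Name the local file after the last component of the URL
    zip_path = download_dir / url.rsplit("/", 1)[-1]

    # Mirror `wget -nc`: skip the download if the file is already there
    if not zip_path.exists():
        urllib.request.urlretrieve(url, str(zip_path))

    # The context manager closes the archive even if extraction fails
    with zipfile.ZipFile(zip_path) as zip_ref:
        zip_ref.extractall(path=download_dir)

    return zip_path


# Example call (commented out because it downloads data):
# download_and_extract(
#     "https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_10_percent.zip",
#     "../Downloads/",
# )
```

Skipping the download when the file exists keeps the cell safe to re-run, which is the same reason the original command uses `-nc`.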


In [None]:
import os

# Walk through 10 percent data directory and list number of files
for dirpath, dirnames, filenames in os.walk('../Downloads/10_food_classes_10_percent'):
  print(f"There are {len(dirnames)} directories and {len(filenames)} images in '{dirpath}'.")
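
The `os.walk` loop above prints counts for every directory it visits. If a per-class summary is more convenient, something like the following sketch could be used. The function name `count_images_per_class` and the list of accepted file extensions are assumptions made for illustration; it expects the usual layout of one sub-directory per class.

```python
from pathlib import Path


def count_images_per_class(split_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Return {class_name: image_count} for each sub-directory of split_dir."""
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if class_dir.is_dir():
            # Count only regular files whose extension looks like an image
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in extensions
            )
    return counts


# Example call (commented out because it needs the dataset on disk):
# print(count_images_per_class("../Downloads/10_food_classes_10_percent/train"))
```

A quick look at these counts helps confirm that the classes are balanced before training.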

### Create data loaders (preparing the data using `ImageDataGenerator`)

In [None]:
# Setup data inputs
from tensorflow.keras.preprocessing.image import ImageDataGenerator

IMAGE_SHAPE = (224, 224)
BATCH_SIZE = 32

train_dir = '../Downloads/10_food_classes_10_percent/train/'
test_dir = '../Downloads/10_food_classes_10_percent/test/'

train_datagen = ImageDataGenerator(rescale=1/255.)
test_datagen = ImageDataGenerator(rescale=1/255.)

print('Training images:')
train_data_10_percent = train_datagen.flow_from_directory(train_dir,
                                                          target_size=IMAGE_SHAPE,
                                                          batch_size=BATCH_SIZE,
                                                          class_mode='categorical')

print('Testing images:')
test_data = test_datagen.flow_from_directory(test_dir,
                                             target_size=IMAGE_SHAPE,
                                             batch_size=BATCH_SIZE,
                                             class_mode='categorical')
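
Two details of the loaders above are easy to check by hand: `rescale=1/255.` multiplies every pixel value so it lands in the range 0 to 1, and the number of batches per epoch is the image count divided by the batch size, rounded up. The sketch below shows both calculations in plain Python. The function names are hypothetical and the figure of 750 images is only an illustrative assumption, not a number taken from the notebook output.

```python
import math


def rescale_pixel(value, rescale=1 / 255.0):
    """Apply the same multiplication that `rescale=1/255.` applies to each pixel."""
    return value * rescale


def batches_per_epoch(num_images, batch_size=32):
    """Number of batches needed to see every image once (last batch may be smaller)."""
    return math.ceil(num_images / batch_size)


print(rescale_pixel(0), rescale_pixel(255))   # smallest and largest 8-bit pixel values
print(batches_per_epoch(750, batch_size=32))  # 750 is an assumed image count
```

This batch count is what `len(train_data_10_percent)` reports, which is why it can be passed as `steps_per_epoch` later on.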

### Setting up callbacks

Callbacks are extra functionality you can add to your models, run during or after training. Some of the most popular callbacks:
* Tracking experiments with the `TensorBoard()` callback
  - Log the performance of multiple models and then view and compare these models in a visual way on TensorBoard (a dashboard for inspecting neural network parameters). Helpful to compare the results of different models on your data.
* Model checkpoint with the `ModelCheckpoint()` callback
  - Save your model as it trains so you can stop training if needed and come back to continue where you left off. Helpful if training takes a long time and can't be done in one sitting.
* Stopping a model from training (before it trains too long and overfits) with the `EarlyStopping()` callback
  - Leave your model training for an arbitrary amount of time and have it stop training automatically when it ceases to improve. Helpful when you've got a large dataset and don't know how long training will take.

All of these are available in `tf.keras.callbacks`.
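
To make the idea behind early stopping concrete, here is a simplified, framework-free sketch of patience-based stopping. It is not the Keras implementation: the function name `early_stopping_epoch` is hypothetical, the validation losses are made up, and real callbacks offer extra options (such as a minimum improvement threshold) that are left out here.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the 0-indexed epoch at which training would stop, or None.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Made-up validation losses: improvement stalls after the third epoch
losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71]
print(early_stopping_epoch(losses, patience=2))
```

A larger `patience` tolerates more noisy epochs before stopping, at the cost of extra training time.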

In [None]:
# Create a TensorBoard callback (wrapped in a function because each model needs a new one)
import datetime
def create_tensorboard_callback(dir_name, experiment_name):
  log_dir = dir_name + "/" + experiment_name + "/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
  tensorboard_callback = tf.keras.callbacks.TensorBoard(
      log_dir=log_dir
  )
  print(f"Saving TensorBoard log files to: {log_dir}")
  return tensorboard_callback
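
The log directory built above follows the pattern `dir_name/experiment_name/timestamp`, so every run gets its own folder and TensorBoard can show runs side by side. The sketch below isolates just that path logic. `build_log_dir` is a hypothetical helper, and it uses `os.path.join` instead of string concatenation, which is an assumption about what is preferable rather than what the notebook does.

```python
import datetime
import os


def build_log_dir(dir_name, experiment_name, now=None):
    """Build a unique log directory path: dir_name/experiment_name/YYYYmmdd-HHMMSS."""
    # Allow a fixed timestamp to be passed in, which makes the function easy to test
    now = now or datetime.datetime.now()
    return os.path.join(dir_name, experiment_name, now.strftime("%Y%m%d-%H%M%S"))


print(build_log_dir("../tensorflow_hub", "resnet50v2"))
```

Because the timestamp changes on every call, re-running an experiment never overwrites an earlier run's logs.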

### Creating models using TensorFlow Hub

TensorFlow Hub is a repository of trained machine learning models.

We can access pretrained models at: https://tfhub.dev/

**ResNet**
- An architecture introduced in the paper "Deep Residual Learning for Image Recognition".


In [None]:
# Compare the following two models
resnet_url = 'https://tfhub.dev/google/imagenet/resnet_v2_50/feature_vector/5'
efficientnet_url = 'https://tfhub.dev/tensorflow/efficientnet/b0/feature-vector/1'



In [None]:
# Import dependencies
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras import layers

In [None]:
# make a create_model() function to create a model from a URL
def create_model(model_url, num_classes=10):
    """
    Takes a TensorFlow Hub URL and creates a Keras Sequential model with it.

    Args:
        model_url (str): A TensorFlow Hub feature extraction URL.
        num_classes (int): Number of output neurons in the output layer,
            should be equal to number of target classes, default 10.
    Returns:
        An uncompiled Keras Sequential model with model_url as feature extractor
        layer and Dense output layer with num_classes output neurons.
    """
    # Create the model
    model = tf.keras.Sequential()
    # Create a frozen feature extraction layer from the TensorFlow Hub URL
    feature_extractor_layer = hub.KerasLayer(model_url,
            trainable=False,  # freeze the already learned patterns
            name="feature_extraction_layer",
            input_shape=IMAGE_SHAPE + (3,)  # (height, width, color channels)
        )
    # Create the output layer (one neuron per target class)
    output_layer = layers.Dense(num_classes, activation='softmax', name='output_layer')
    
    # Add layers to the model
    model.add(feature_extractor_layer)
    model.add(output_layer)

    return model
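
The model above is just a frozen feature extractor followed by one `Dense` layer with a softmax activation. To show what that final layer computes, here is a small pure-Python sketch. The function names, the tiny feature vector, and the weights are all made up for illustration; a real feature vector from a Hub model is much longer.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dense_softmax(features, weights, biases):
    """Compute softmax(W . features + b), with one row of `weights` per class."""
    logits = [
        sum(f * w for f, w in zip(features, row)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


# Made-up example: a 2-value feature vector and 3 classes
features = [1.0, 2.0]
weights = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
biases = [0.0, 0.0, 0.0]
probabilities = dense_softmax(features, weights, biases)
print(probabilities)
```

In feature extraction, only `weights` and `biases` of this last layer are learned; the values in `features` come from the frozen pretrained network.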


### Creating and testing ResNet TensorFlow Hub Feature Extraction model

In [None]:
# Create ResNet model
resnet_model = create_model(resnet_url, num_classes=train_data_10_percent.num_classes)
resnet_model.summary()

In [None]:
# Compile our resnet model
resnet_model.compile(loss="categorical_crossentropy", optimizer=tf.keras.optimizers.Adam(), metrics=["accuracy"])

In [None]:
# Fit the model
resnet_history = resnet_model.fit(train_data_10_percent,
                                  epochs=5,
                                  steps_per_epoch=len(train_data_10_percent),
                                  validation_data=test_data,
                                  validation_steps=len(test_data),
                                  # Add TensorBoard callback to model (callbacks parameter takes a list)
                                  callbacks=[create_tensorboard_callback(dir_name="../tensorflow_hub", # save experiment logs here
                                                                         experiment_name="resnet50v2")]) # name of log files

In [None]:
# Create a function to plot our loss curves...
# Note: you could put this in a script and import it when needed.

import matplotlib.pyplot as plt

# Plot the validation and training curves
def plot_accuracy_loss_curves(history):
    """
    Plots separate loss and accuracy curves for training and validation data.

    Args:
        history: TensorFlow History object (as returned by model.fit()).

    Returns:
        None. Draws one figure for loss and one for accuracy.
    """

    loss = history.history["loss"]
    val_loss = history.history["val_loss"]

    accuracy = history.history["accuracy"]
    val_accuracy = history.history["val_accuracy"]

    epochs = range(len(history.history["loss"]))

    # Plot loss (start a new figure so repeated calls do not draw on an old one)
    plt.figure()
    plt.plot(epochs, loss, label="training_loss")
    plt.plot(epochs, val_loss, label="val_loss")
    plt.title("Loss")
    plt.xlabel("Epochs")
    plt.legend()

    # Plot accuracy
    plt.figure()
    plt.plot(epochs, accuracy, label="training_accuracy")
    plt.plot(epochs, val_accuracy, label="val_accuracy")
    plt.title("Accuracy")
    plt.xlabel("Epochs")
    plt.legend()


In [None]:
plot_accuracy_loss_curves(history=resnet_history)
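
Alongside the plots, it can help to pull a few numbers out of the training history, such as the epoch with the lowest validation loss and the final gap between training and validation accuracy (a large gap hints at overfitting). The sketch below works on a plain dictionary shaped like `history.history`. The helper name `summarize_history` and the example values are hypothetical.

```python
def summarize_history(history_dict):
    """Summarize a dict shaped like `history.history` from model.fit()."""
    val_loss = history_dict["val_loss"]
    # Index (0-based epoch) of the lowest validation loss
    best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
    return {
        "best_epoch": best_epoch,
        "best_val_loss": val_loss[best_epoch],
        # Positive gap: the model does better on training data than on validation data
        "final_accuracy_gap": history_dict["accuracy"][-1] - history_dict["val_accuracy"][-1],
    }


# Made-up values, only to show the shape of the input
example_history = {
    "loss": [1.5, 0.9, 0.6],
    "val_loss": [1.2, 0.8, 0.9],
    "accuracy": [0.5, 0.8, 0.9],
    "val_accuracy": [0.4, 0.7, 0.75],
}
summary = summarize_history(example_history)
print(summary)
```

With a real run, the call would be `summarize_history(resnet_history.history)`.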


### Creating and testing EfficientNetB0 TensorFlow Hub Feature Extraction model

In [None]:
train_data_10_percent.num_classes

In [None]:
# Create EfficientNetB0 feature extractor model
efficientnet_model = create_model(model_url=efficientnet_url,
                                  num_classes=train_data_10_percent.num_classes)

# Compile
efficientnet_model.compile(loss="categorical_crossentropy",
                            optimizer=tf.keras.optimizers.Adam(),
                            metrics=["accuracy"])

# Fit to 10% of training data
efficientnet_history = efficientnet_model.fit(train_data_10_percent,
                                                epochs=5,
                                                steps_per_epoch=len(train_data_10_percent),
                                                validation_data=test_data,
                                                validation_steps=len(test_data),
                                                callbacks=[
                                                    create_tensorboard_callback(dir_name="../tensorflow_hub",
                                                        experiment_name="efficientnetB0")
                                                ]
                                            )


In [None]:
plot_accuracy_loss_curves(history=efficientnet_history)

In [None]:
resnet_model.summary()

In [None]:
efficientnet_model.summary()

In [None]:
# How many layers does our EfficientNet model have? (the whole Hub feature extractor counts as one layer)
print(efficientnet_model.layers)

# Number of weight tensors in the feature extraction layer
print(len(efficientnet_model.layers[0].weights))

# Number of weight tensors in the output layer (kernel and bias)
print(len(efficientnet_model.layers[1].weights))
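
Since the feature extractor is frozen, the only trainable parameters are in the `Dense` output layer: one weight per (feature, class) pair plus one bias per class. The sketch below computes that count. The feature vector lengths used (2048 for ResNet50V2 and 1280 for EfficientNetB0) are assumptions about the two Hub models and should be checked against the `summary()` output above.

```python
def dense_param_count(num_features, num_classes):
    """Trainable parameters in a Dense layer: a kernel plus one bias per output."""
    return num_features * num_classes + num_classes


# Assumed feature vector lengths; compare with model.summary()
print("ResNet50V2 head:", dense_param_count(2048, 10))
print("EfficientNetB0 head:", dense_param_count(1280, 10))
```

This explains why a model with millions of total parameters can still train quickly in a feature extraction setup: only a small head is being updated.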

## Types of Transfer Learning

* **"As is"** transfer learning - Use an existing model with no changes (e.g. using an ImageNet model on the 1000 ImageNet classes)
* **"Feature extraction"** transfer learning - Use the prelearned patterns of an existing model (e.g. EfficientNetB0 trained on ImageNet) and adjust the output layer for your own problem (e.g. 1000 classes -> 10 classes of food)
* **"Fine-tuning"** transfer learning - Use the prelearned patterns of an existing model and "fine-tune" many or all of the underlying layers (including the new output layer)
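
One way to picture the three types is by which layers are allowed to change during training. The sketch below models layers as simple objects with a `trainable` flag. It is a conceptual illustration only: `Layer`, `build_transfer_model`, and the layer names are hypothetical and do not come from TensorFlow.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    trainable: bool = True


def build_transfer_model(pretrained_layer_names, mode, unfreeze_last=None):
    """Return layers with trainable flags set for the given transfer learning type.

    The last entry of `pretrained_layer_names` is treated as the original output layer.
    """
    layers = [Layer(name, trainable=False) for name in pretrained_layer_names]
    if mode == "as_is":
        # Keep everything, including the original output layer, and train nothing
        return layers
    if mode not in ("feature_extraction", "fine_tuning"):
        raise ValueError(f"Unknown mode: {mode}")

    base = layers[:-1]  # drop the original output layer
    if mode == "fine_tuning":
        # Unfreeze the last `unfreeze_last` base layers (all of them if not given)
        count = len(base) if unfreeze_last is None else unfreeze_last
        for layer in base[len(base) - count:]:
            layer.trainable = True
    return base + [Layer("new_output_layer", trainable=True)]


names = ["block1", "block2", "block3", "original_output"]
for mode in ("as_is", "feature_extraction", "fine_tuning"):
    model = build_transfer_model(names, mode, unfreeze_last=2)
    print(mode, [(layer.name, layer.trainable) for layer in model])
```

In this picture, feature extraction trains only the new output layer, while fine-tuning also updates some or all of the pretrained layers.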

## Comparing our models' results using TensorBoard

**Note:** When you upload things to TensorBoard.dev, your experiments are public. If you're running private experiments, do not upload them to TensorBoard.dev.

## Upload TensorBoard dev records via terminal

`
tensorboard dev upload --logdir ./tensorflow_hub/ \
    --name "efficientnetB0 vs. resnet50v2" \
    --description "Comparing two different TF Hub feature extraction model architectures using 10% of the training data" \
    --one_shot
`

Our TensorBoard experiments are uploaded publicly: https://tensorboard.dev/experiment/DpVXdXhbS1u6lpJv5DBBIQ/

In [None]:
# Check out what TensorBoard experiments we have.
!tensorboard dev list

### Delete an experiment from TensorBoard

`tensorboard dev delete --experiment_id DpVXdXhbS1u6lpJv5DBBIQ`

Confirm deletion by re-checking what experiments are left.

In [None]:
!tensorboard dev delete --experiment_id DpVXdXhbS1u6lpJv5DBBIQ

In [None]:
!tensorboard dev list