# Transfer Learning - Part 1
## Feature Extraction

Transfer learning is a powerful deep learning technique: it takes an existing model that was trained on data similar to the problem at hand and builds a new model on top of it. This has two main benefits:

1. We can leverage a neural network architecture that is already proven to work on problems similar to ours.
2. We can leverage a network that has already learned patterns from data similar to ours, then adapt those patterns to our own data.

For this part, we are going to use only 10% of the training images from the same food image dataset used in the convolutional neural networks (CNN) notebook.

In [None]:
import os
import pathlib
import random
import sys
from typing import Tuple

module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)

import datetime as dt
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import tensorflow_hub as hub

from src import utils

## Helpers

In [None]:
def summarize_image_directory(data_directory: pathlib.Path):
    """Print the number of image files found in each directory under data_directory."""
    image_extensions = ('.jpg', '.jpeg', '.png')
    for dirpath, dirnames, filenames in os.walk(data_directory):
        # Match extensions case-insensitively so files such as 'IMG.JPG' are counted too
        images = [file for file in filenames if file.lower().endswith(image_extensions)]
        if images:
            print(f'Directory: {dirpath} Total Images: {len(images)}')

In [None]:
def get_classnames_from_directory(data_directory: pathlib.Path):
    all_class_names = [
        item.name for item in data_directory.iterdir() if item.is_dir() and not item.name.startswith('.')
    ]
    class_names = np.array(sorted(all_class_names))
    return class_names

## Step-0 Looking at the Data

In [None]:
# Image dataset location
data_directory = pathlib.Path('./data/food-101/10_food_classes_10_percent')
test_directory = data_directory / 'test'
train_directory = data_directory / 'train'

In [None]:
summarize_image_directory(data_directory)

In [None]:
# Getting the class names
class_names = get_classnames_from_directory(train_directory)
class_names

### Dataset Findings

There are 10 image classes, but each class has only 75 training images instead of the 750 per class used in the CNN notebook. The test set is the same size as the one in the CNN notebook, which allows a one-to-one comparison against the models built there.
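
To make these sizes concrete, the short calculation below works through the counts quoted above. The batch size of 32 is the value used when the data is loaded later in this notebook; treat it as an assumption if you change that setting.

```python
import math

num_classes = 10
train_images_per_class = 75         # this notebook (10% subset)
full_train_images_per_class = 750   # CNN notebook
batch_size = 32                     # assumption: matches the loading cell later on

# Total number of training images across all classes
total_train_images = num_classes * train_images_per_class
# Share of the full training set that is used here
fraction_of_full = train_images_per_class / full_train_images_per_class
# Number of batches the generator yields per epoch (last batch may be partial)
batches_per_epoch = math.ceil(total_train_images / batch_size)

print(f'Total training images: {total_train_images}')            # 750
print(f'Fraction of full training set: {fraction_of_full:.0%}')  # 10%
print(f'Batches per epoch: {batches_per_epoch}')                 # 24
```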

## Initial Pass - Loading the Dataset

In [None]:
# Scaling values
scale = 1. / 255
img_size = 224
batch_size = 32

# Loading in the data
train_data_generator = ImageDataGenerator(rescale=scale)
test_data_generator = ImageDataGenerator(rescale=scale)

# Create the data
train_data = train_data_generator.flow_from_directory(str(train_directory),
                                                      target_size=(img_size, img_size),
                                                      batch_size=batch_size,
                                                      class_mode='categorical')
test_data = test_data_generator.flow_from_directory(str(test_directory),
                                                    target_size=(img_size, img_size),
                                                    batch_size=batch_size,
                                                    class_mode='categorical')

## Setting up Callbacks

Callbacks are extra functionality that can run before, during, or after training. Some of the most popular callbacks:

* Tracking experiments with the TensorBoard callback.
* Saving model checkpoints with the ModelCheckpoint callback.
* Stopping training before the model begins to overfit with the EarlyStopping callback.
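
To illustrate the idea behind early stopping, here is a minimal plain-Python sketch of patience-based stopping on a validation loss. It is not Keras's implementation; the function name `early_stopping_epoch` and the sample losses are made up for illustration. In Keras, comparable behavior is configured through `tf.keras.callbacks.EarlyStopping` and its `monitor` and `patience` arguments.

```python
from typing import Optional, Sequence


def early_stopping_epoch(val_losses: Sequence[float], patience: int = 2) -> Optional[int]:
    """Return the 0-based epoch at which training would stop, or None if it never stops.

    Hypothetical helper: training stops once the validation loss has failed to
    improve on its best value for `patience` consecutive epochs.
    """
    best_loss = float('inf')
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best value: remember it and reset the counter
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Made-up validation losses: improvement stalls after the third epoch
sample_losses = [0.90, 0.70, 0.60, 0.65, 0.66, 0.64]
stop_epoch = early_stopping_epoch(sample_losses, patience=2)
print(stop_epoch)  # 4
```

With a patience of 2, the run stops at epoch index 4 because epochs 3 and 4 both fail to beat the best loss of 0.60.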

In [None]:
# TensorBoard callback for tracking the performance of the model.
# Wrapped in a function so that each experiment run gets its own log folder.
def create_tensorboard_callback(dir_name: str, experiment_name: str):
    log_dir = f"{dir_name}/{experiment_name}/{dt.datetime.now().strftime('%Y%m%d-%H%M%S')}"
    
    tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir)
    print('Saving TensorBoard log files to: ', log_dir)
    
    return tensorboard_callback
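
As a quick check of the naming scheme, the snippet below builds the same log path with a fixed timestamp instead of the current time, without touching TensorFlow. The helper name `build_log_dir` and the example date are illustrative only.

```python
import datetime as dt


def build_log_dir(dir_name: str, experiment_name: str, now: dt.datetime) -> str:
    """Hypothetical helper mirroring the path format used by create_tensorboard_callback."""
    return f"{dir_name}/{experiment_name}/{now.strftime('%Y%m%d-%H%M%S')}"


example_dir = build_log_dir('logs', 'resnet_50_v2', dt.datetime(2024, 1, 31, 9, 5, 7))
print(example_dir)  # logs/resnet_50_v2/20240131-090507
```

Because the timestamp is part of the path, repeated runs of the same experiment land in separate folders rather than overwriting each other.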


## Transfer Learning Model-1

For this model, the majority of the layers come from an existing model on TensorFlow Hub, an open source repository of pretrained models for use in transfer learning.

* https://www.tensorflow.org/hub 

**NOTE** When narrowing down which models to try, a useful resource is Papers with Code, which collects published research papers together with the architectures and code they use, so you can look up architectures that have worked on similar problems.

* https://paperswithcode.com/

For this project, after browsing TensorFlow Hub, the following two feature vector models are going to be used and compared against each other:

* https://tfhub.dev/google/efficientnet/b0/feature-vector/1
* https://tfhub.dev/google/imagenet/resnet_v2_50/feature_vector/5

In [None]:
# Comparing the two models
resnet_url = 'https://tfhub.dev/google/imagenet/resnet_v2_50/feature_vector/5'
efficientnet_url = 'https://tfhub.dev/google/efficientnet/b0/feature-vector/1'

In [None]:
# Wrap model creation from a TensorFlow Hub URL in a reusable function

def create_model(transfer_learning_url: str, img_shape: Tuple[int, int], num_classes: int = 10):
    """ Takes a TensorFlow Hub model url and creates a Sequential model with a frozen feature extractor """
    feature_extractor_layer = hub.KerasLayer(transfer_learning_url,
                                             trainable=False,
                                             input_shape=img_shape + (3,),
                                             name='FeatureExtractionLayer')

    model = tf.keras.Sequential([
        feature_extractor_layer,
        tf.keras.layers.Dense(num_classes, activation='softmax', name='OutputLayer')
    ])

    return model

### ResNet Model

In [None]:
resnet_model = create_model(resnet_url, img_shape=(img_size, img_size))

In [None]:
resnet_model.summary()

In [None]:
# Compiling model
resnet_model.compile(loss='categorical_crossentropy',
                     optimizer=tf.keras.optimizers.legacy.Adam(),
                     metrics=['accuracy'])

# Training model
resnet_history = resnet_model.fit(train_data,
                                  epochs=5,
                                  steps_per_epoch=len(train_data),
                                  validation_data=test_data,
                                  validation_steps=len(test_data),
                                  callbacks=[create_tensorboard_callback('logs',
                                                                         experiment_name='resnet_50_v2')])

In [None]:
utils.plot.plot_history(resnet_history, metric='loss')
utils.plot.plot_history(resnet_history, metric='accuracy')

#### Findings

The ResNet model built with transfer learning outperformed every CNN built in the CNN notebook by a wide margin, trained faster, and used only 10% of the training data.

The validation accuracy ended up at around 77%.

### EfficientNet Model

In [None]:
efficientnet_model = create_model(efficientnet_url, img_shape=(img_size, img_size))
efficientnet_model.summary()

In [None]:
# Compiling model
efficientnet_model.compile(loss='categorical_crossentropy',
                           optimizer=tf.keras.optimizers.legacy.Adam(),
                           metrics=['accuracy'])

# Training model
efficientnet_model_history = efficientnet_model.fit(train_data,
                                                    epochs=5,
                                                    steps_per_epoch=len(train_data),
                                                    validation_data=test_data,
                                                    validation_steps=len(test_data),
                                                    callbacks=[create_tensorboard_callback('logs',
                                                                                           experiment_name='efficientnet_b0')])

In [None]:
utils.plot.plot_history(efficientnet_model_history, metric='loss')
utils.plot.plot_history(efficientnet_model_history, metric='accuracy')

#### Findings

The EfficientNet model built with transfer learning also outperformed every CNN built in the CNN notebook by a wide margin, trained faster, and used only 10% of the training data.

The validation accuracy finished at around 86%.

Comparing this model to the ResNet model, the EfficientNet model came out ahead by roughly 9 percentage points of validation accuracy (about 86% versus 77%), even though it has fewer trainable parameters.
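
The parameter claim can be checked by hand. With the feature extractor frozen, only the `OutputLayer` Dense layer is trainable, so the trainable count is `features * classes + classes`. The sketch below assumes feature-vector widths of 2048 for ResNet V2 50 and 1280 for EfficientNet B0; confirm them against the output of `model.summary()`. It also spells out the accuracy gap using the approximate validation accuracies reported above.

```python
def dense_param_count(num_features: int, num_classes: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return num_features * num_classes + num_classes


num_classes = 10
resnet_features = 2048        # assumption: ResNet V2 50 feature-vector width
efficientnet_features = 1280  # assumption: EfficientNet B0 feature-vector width

resnet_trainable = dense_param_count(resnet_features, num_classes)
efficientnet_trainable = dense_param_count(efficientnet_features, num_classes)
print(resnet_trainable)        # 20490
print(efficientnet_trainable)  # 12810

# Approximate validation accuracies taken from the findings above
resnet_val_acc = 0.77
efficientnet_val_acc = 0.86
gap_points = (efficientnet_val_acc - resnet_val_acc) * 100          # absolute gap
relative_gain = (efficientnet_val_acc / resnet_val_acc - 1) * 100   # relative gap
print(f'{gap_points:.0f} percentage points, {relative_gain:.1f}% relative')
```

The absolute gap (percentage points) and the relative gain are different numbers, so it helps to state which one is meant when comparing models.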

## Types of Transfer Learning

* "As Is" uses an existing model with no changes whatsoever.
* "Feature Extraction" uses the prelearned patterns of an existing model and adjusts the output layer to your own problem.
* "Fine Tuning" uses the prelearned patterns of an existing model and fine-tunes all or many of the underlying layers.
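
One way to picture the difference between the three approaches is which parts of the network are allowed to update during training. The sketch below is a framework-free illustration; the function `trainable_flags`, the strategy names, and the layer counts are hypothetical. In Keras the same idea is expressed through each layer's `trainable` attribute, as with `trainable=False` on the `hub.KerasLayer` used in this notebook.

```python
from typing import Dict, List


def trainable_flags(strategy: str, num_base_layers: int, num_unfrozen: int = 0) -> Dict[str, List[bool]]:
    """Hypothetical helper: mark which pretrained layers and which output layer get trained."""
    if strategy == 'as_is':
        # Use the pretrained model unchanged: nothing is trained
        return {'base': [False] * num_base_layers, 'output': [False]}
    if strategy == 'feature_extraction':
        # Freeze every pretrained layer and train only a new output layer
        return {'base': [False] * num_base_layers, 'output': [True]}
    if strategy == 'fine_tuning':
        # Unfreeze the last `num_unfrozen` pretrained layers and train them with the output layer
        num_unfrozen = min(num_unfrozen, num_base_layers)
        frozen = num_base_layers - num_unfrozen
        return {'base': [False] * frozen + [True] * num_unfrozen, 'output': [True]}
    raise ValueError(f'Unknown strategy: {strategy}')


for name in ('as_is', 'feature_extraction', 'fine_tuning'):
    print(name, trainable_flags(name, num_base_layers=5, num_unfrozen=2))
```

The two models trained in this notebook correspond to the `feature_extraction` case: the pretrained layers stay frozen and only the new output layer learns.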

## Comparing EfficientNet Vs. ResNet

We can now compare the logs generated for each run. The `tensorboard dev upload` command below uploads the logs to TensorBoard, where they can be viewed through the generated link.

**!!NOTE!!** All logs uploaded to TensorBoard are made public, so do NOT upload logs that should remain private!

In [None]:
# !tensorboard dev upload --logdir ./logs \
#    --name "Efficientnet B0 vs. Resnet 50 V2" \
#    --description "Comparing two different TensorFlow Hub NN architectures." \
#    --one_shot