# 04-Transfer learning with TensorFlow Part 1: Feature Extraction
We've built a bunch of convolutional neural networks from scratch, and they all seem to be learning. However, there is still plenty of room for improvement.

To improve our model(s), we could spend a while trying different configurations, adding more layers, changing the learning rate, adjusting the number of neurons per layer and more.

However, doing this is very time consuming.

Luckily, there's a technique we can use to save time.

It's called transfer learning: taking the patterns (also called weights) another model has learned from another problem and using them for our own problem.

There are two main benefits to using transfer learning:

1. We can leverage an existing neural network architecture proven to work on problems similar to our own.
2. We can leverage a working neural network architecture which has already learned patterns on similar data to our own. This often results in achieving great results with less custom data.

What this means is, instead of hand-crafting our own neural network architectures or building them from scratch, we can utilise models which have worked for others.

And instead of training our own models from scratch on our own datasets, we can take the patterns a model has learned from datasets such as ImageNet (millions of images of different objects) and use them as the foundation of our own. Doing this often leads to getting great results with less data.

Over the next few notebooks, we'll see the power of transfer learning in action.



### What we're going to cover
We're going to go through the following with TensorFlow:

* Introduce transfer learning (a way to beat all of our old self-built models)
* Use a smaller dataset to experiment faster (10% of the training samples of 10 classes of food)
* Build a transfer learning feature extraction model using TensorFlow Hub
* Introduce the TensorBoard callback to track model training results
* Compare model results using TensorBoard


In [1]:
# Are we using a GPU?
!nvidia-smi


Mon May  2 14:20:33 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
| N/A   27C    P8    10W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+---------------------------------------------------------------------------

### Downloading and becoming one with data

In [2]:
# Get data (10% of the training data from the 10 food classes dataset)

import zipfile

# Download data
!wget https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_10_percent.zip

# Unzip the downloaded file
zip_ref = zipfile.ZipFile("10_food_classes_10_percent.zip", "r")
zip_ref.extractall()
zip_ref.close()


--2022-05-02 14:20:34--  https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_10_percent.zip
Resolving storage.googleapis.com (storage.googleapis.com)... 172.217.4.208, 142.250.190.112, 142.251.32.16, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|172.217.4.208|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 168546183 (161M) [application/zip]
Saving to: ‘10_food_classes_10_percent.zip.6’


2022-05-02 14:20:36 (102 MB/s) - ‘10_food_classes_10_percent.zip.6’ saved [168546183/168546183]
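The unzip step above opens and closes the archive by hand. A minimal sketch of the same idea using a `with` block, so the archive gets closed even if extraction fails part-way. The `extract_zip` helper and the tiny demo archive are made up for illustration; they are not part of the course code or the real dataset.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_zip(zip_path, target_dir="."):
    """Extract every member of zip_path into target_dir and return the member names."""
    # The with-statement closes the archive even if extraction raises an error.
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        zip_ref.extractall(target_dir)
        return zip_ref.namelist()


# Self-contained demo on a throwaway archive (hypothetical file, not the food dataset).
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / "demo.zip"
    with zipfile.ZipFile(demo_zip, "w") as zf:
        zf.writestr("demo/train/pizza/example.txt", "placeholder")

    extracted_names = extract_zip(demo_zip, Path(tmp) / "out")
    extracted_file_exists = (Path(tmp) / "out" / "demo" / "train" / "pizza" / "example.txt").exists()

print(extracted_names, extracted_file_exists)
```

For the notebook itself, the call would presumably look like `extract_zip("10_food_classes_10_percent.zip")`.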



In [3]:
# How many directories and files are there under each label?
import os
for dirpath, dirnames, filenames in os.walk("10_food_classes_10_percent"):
    print(f"There are {len(dirnames)} of directory and {len(filenames)} image in '{dirpath}' ")

There are 2 of directory and 0 image in '10_food_classes_10_percent' 
There are 10 of directory and 0 image in '10_food_classes_10_percent/test' 
There are 0 of directory and 250 image in '10_food_classes_10_percent/test/sushi' 
There are 0 of directory and 250 image in '10_food_classes_10_percent/test/ramen' 
There are 0 of directory and 250 image in '10_food_classes_10_percent/test/hamburger' 
There are 0 of directory and 250 image in '10_food_classes_10_percent/test/pizza' 
There are 0 of directory and 250 image in '10_food_classes_10_percent/test/chicken_curry' 
There are 0 of directory and 250 image in '10_food_classes_10_percent/test/ice_cream' 
There are 0 of directory and 250 image in '10_food_classes_10_percent/test/steak' 
There are 0 of directory and 250 image in '10_food_classes_10_percent/test/chicken_wings' 
There are 0 of directory and 250 image in '10_food_classes_10_percent/test/grilled_salmon' 
There are 0 of directory and 250 image in '10_food_classes_10_percent/test
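The walk above prints one line per folder. If we'd rather have the counts as a dictionary (handy for checking class balance), here is a small sketch. The `count_images_per_class` helper is hypothetical, and the demo runs on a throwaway folder tree with made-up file counts rather than the real dataset.

```python
import tempfile
from pathlib import Path


def count_images_per_class(split_dir):
    """Return {class_name: number_of_files} for each sub-directory of split_dir."""
    split_dir = Path(split_dir)
    return {
        class_dir.name: sum(1 for item in class_dir.iterdir() if item.is_file())
        for class_dir in sorted(split_dir.iterdir())
        if class_dir.is_dir()
    }


# Demo on a temporary tree that mimics the <split>/<class_name>/<image> layout.
with tempfile.TemporaryDirectory() as tmp:
    for class_name, n_files in [("pizza", 3), ("sushi", 2)]:
        class_dir = Path(tmp) / "train" / class_name
        class_dir.mkdir(parents=True)
        for i in range(n_files):
            (class_dir / f"{i}.jpg").touch()  # empty placeholder files

    demo_counts = count_images_per_class(Path(tmp) / "train")

print(demo_counts)
```

On the real data the call would be along the lines of `count_images_per_class("10_food_classes_10_percent/train")`.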

### Creating data loaders (preparing the data)
Now we've downloaded the data, let's use the ImageDataGenerator class along with the flow_from_directory method to load in our images.



In [4]:
# set up inputs
from tensorflow.keras.preprocessing.image import ImageDataGenerator
train_dir = "10_food_classes_10_percent/train/"
test_dir = "10_food_classes_10_percent/test/"

IMAGE_SHAPE = (224,224)
BATCH_SIZE = 32
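EPOCHS = 5  # used later as `EPOCHS` when fitting the EfficientNet model

# Initialize the ImageDataGenerator instances (rescale pixel values to between 0 and 1)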
train_datagen_10_percent = ImageDataGenerator(rescale=1/255.)
test_datagen_10_percent = ImageDataGenerator(rescale=1/255.)

print("Training Image: ")
train_data_10_percent = train_datagen_10_percent.flow_from_directory(directory=train_dir,
                                                                    target_size=IMAGE_SHAPE,
                                                                    batch_size=BATCH_SIZE,
                                                                    class_mode='categorical')

print("Testing Image: ")
test_data_10_percent = test_datagen_10_percent.flow_from_directory(directory=test_dir,
                                                                   target_size=IMAGE_SHAPE,
                                                                   batch_size=BATCH_SIZE,
                                                                   class_mode='categorical')

Training Image: 
Found 750 images belonging to 10 classes.
Testing Image: 
Found 2500 images belonging to 10 classes.
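With 750 training images and 2500 test images in batches of 32, it helps to know how many batches make up one pass over the data, since later cells pass `len(train_data_10_percent)` as `steps_per_epoch`. A quick sketch of that arithmetic, assuming the length of a data loader is the number of samples divided by the batch size, rounded up (the `batches_per_epoch` helper is made up for illustration):

```python
import math


def batches_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once (the last batch may be smaller)."""
    return math.ceil(num_images / batch_size)


train_steps = batches_per_epoch(750, 32)   # 750 training images found above
test_steps = batches_per_epoch(2500, 32)   # 2500 test images found above
print(train_steps, test_steps)             # 24 79
```

So one epoch is 24 training steps, and a full validation pass is 79 steps.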


### Setting up callbacks (things to run whilst our model trains)
Before we build a model, there's an important concept we're going to get familiar with because it's going to play a key role in our future model building experiments.

And that concept is callbacks.

Callbacks are extra functionality you can add to your models to be performed during or after training. Some of the most popular callbacks include:

* **Experiment tracking with TensorBoard -** log the performance of multiple models and then view and compare these models in a visual way on TensorBoard (a dashboard for inspecting neural network parameters). Helpful to compare the results of different models on your data.
* **Model checkpointing -** save your model as it trains so you can stop training if needed and come back to pick up where you left off. Helpful if training takes a long time and can't be done in one sitting.
* **Early stopping -** leave your model training for an arbitrary amount of time and have it stop training automatically when it ceases to improve. Helpful when you've got a large dataset and don't know how long training will take.
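To make the early stopping idea concrete, here is a plain-Python sketch of the logic: stop once the validation loss has gone a set number of epochs (the "patience") without beating its best value. Keras ships this behaviour as `tf.keras.callbacks.EarlyStopping`; the function below is not its implementation, just the idea, and the validation losses are made-up numbers.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the 0-based epoch at which training would stop, or None if it never stops.

    Training stops once the validation loss has failed to improve on its best
    value for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Made-up validation losses, purely for illustration.
made_up_val_losses = [1.00, 0.80, 0.75, 0.76, 0.78, 0.70]
stop_epoch = early_stopping_epoch(made_up_val_losses, patience=2)
print(stop_epoch)  # 4
```

Note the trade-off: with `patience=2` training stops at epoch 4 and never reaches the better loss at epoch 5, whereas `patience=3` would keep going.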


In [5]:
# Create TensorBoard callback (functionized because we need to create a new one for each model)
import datetime
import tensorflow as tf
def create_tensorboard_callback(dir_name, experiment_name):
    log_dir = dir_name + "/"+ experiment_name + "/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    tensorboard_callback = tf.keras.callbacks.TensorBoard(
        log_dir = log_dir
    )
    print(f"Saving TensorBoard log files to : {log_dir}")
    return tensorboard_callback
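The important part of the function above is the log directory: every run gets its own timestamped folder, so experiments don't overwrite each other and can be compared side by side in TensorBoard. A small sketch of just the path-building step (the `build_log_dir` helper and the fixed timestamp are illustrative assumptions so the result is reproducible):

```python
import datetime


def build_log_dir(dir_name, experiment_name, now=None):
    """Build '<dir_name>/<experiment_name>/<YYYYmmdd-HHMMSS>' for one training run."""
    now = now or datetime.datetime.now()
    return "/".join([dir_name, experiment_name, now.strftime("%Y%m%d-%H%M%S")])


# A fixed example timestamp; in practice `now` is left as None to use the current time.
fixed_time = datetime.datetime(2022, 5, 2, 14, 20, 41)
example_log_dir = build_log_dir("tensorflow_hub", "resnet50V2", now=fixed_time)
print(example_log_dir)  # tensorflow_hub/resnet50V2/20220502-142041
```

Two runs of the same experiment started at different times end up in different sub-folders of `tensorflow_hub/resnet50V2/`.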

### Creating models using TensorFlow Hub
In the past we've used TensorFlow to create our own models layer by layer from scratch.

Now we're going to do a similar process, except the majority of our model's layers are going to come from TensorFlow Hub.

We can access pretrained models on: https://tfhub.dev/

Browsing the TensorFlow Hub page and sorting for image classification, we found the following feature vector model link: https://tfhub.dev/tensorflow/efficientnet/b0/feature-vector/1



In fact, we're going to use two models from TensorFlow Hub:

* ResNetV2 - a state-of-the-art computer vision model architecture from 2016.
* EfficientNet - a state-of-the-art computer vision architecture from 2019.


In [6]:
# Let's compare the following two models
resnet_url = "https://tfhub.dev/google/imagenet/resnet_v2_50/feature_vector/5"
efficientnet_url = "https://tfhub.dev/tensorflow/efficientnet/b0/feature-vector/1"

In [7]:
import tensorflow as tf


In [8]:
print(tf.__version__)

2.8.0


In [9]:
# Import important dependencies
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras import layers 


In [10]:
# Let's make a create_model() function to create a model from a URL
def create_model(model_url, num_classes=10):
  """
  Takes a TensorFlow Hub URL and creates a Keras Sequential model with it.

  Args:
    model_url (str): A TensorFlow Hub feature extraction URL.
    num_classes (int): Number of output neurons in the output layer,
      should be equal to number of target classes, default 10.
  
  Returns:
    An uncompiled Keras Sequential model with model_url as feature extractor
    layer and Dense output layer with num_classes output neurons.
  """
  # Download the pretrained model and save it as a Keras layer
  feature_extractor_layer = hub.KerasLayer(model_url,
                                           trainable=False, # freeze the already learned patterns
                                           name="feature_extraction_layer",
                                           input_shape=IMAGE_SHAPE+(3,)) 

  # Create our own model
  model = tf.keras.Sequential([
      feature_extractor_layer,
      layers.Dense(num_classes, activation="softmax", name="output_layer")
  ])
    
  return model  
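Because the feature extractor is created with `trainable=False`, the only weights that get updated during training are those of the `Dense` output layer. A rough sketch of how small that trainable part is. The feature vector sizes below (2048 for ResNet50 V2, 1280 for EfficientNetB0) are assumptions not stated in this notebook, so it's worth confirming them with `model.summary()`.

```python
def dense_layer_params(num_inputs, num_outputs):
    """Weights plus biases of a fully connected (Dense) layer."""
    return num_inputs * num_outputs + num_outputs


# ASSUMED feature vector sizes per backbone (check with model.summary()).
ASSUMED_FEATURE_SIZES = {"resnet_v2_50": 2048, "efficientnet_b0": 1280}

# Trainable parameters of a 10-class output layer on top of each frozen backbone.
head_params = {
    name: dense_layer_params(size, num_outputs=10)
    for name, size in ASSUMED_FEATURE_SIZES.items()
}
print(head_params)  # {'resnet_v2_50': 20490, 'efficientnet_b0': 12810}
```

Under these assumptions, we'd be training only a few thousand weights on top of a frozen network with millions, which is part of why feature extraction can work with little data.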


### Creating and testing ResNet TensorFlow Hub Feature Extraction model

In [11]:
resnet_model = create_model(model_url=resnet_url,
                            num_classes=train_data_10_percent.num_classes)

2022-05-02 14:20:39.723444: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-05-02 14:20:39.732773: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
2022-05-02 14:20:39.732800: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1850] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2022-05-02 14:20:39.733305: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow bin

In [12]:
# compile the model
resnet_model.compile(loss=tf.keras.losses.CategoricalCrossentropy(),
                    optimizer=tf.keras.optimizers.Adam(),
                    metrics=["accuracy"])
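Since our data loaders use `class_mode='categorical'`, the labels are one-hot encoded, which is why categorical cross-entropy is the loss here. A plain-Python sketch of what that loss measures for a single sample. The probabilities are made up for illustration, and the clipping value `eps` is an assumption to avoid `log(0)`, not necessarily what Keras uses internally.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot label and predicted class probabilities."""
    # Only the term for the true class is non-zero, because y_true is one-hot.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


one_hot_label = [0.0, 1.0, 0.0]  # the true class is index 1

# Made-up softmax outputs: one confident and right, one confident and wrong.
confident_right = categorical_crossentropy(one_hot_label, [0.05, 0.90, 0.05])
confident_wrong = categorical_crossentropy(one_hot_label, [0.90, 0.05, 0.05])
print(round(confident_right, 4), round(confident_wrong, 4))
```

The loss is small when the model puts high probability on the true class and grows quickly when it doesn't.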

In [13]:
# fit the model 
res_net_history = resnet_model.fit(train_data_10_percent,
                                  epochs=5,
                                  steps_per_epoch=len(train_data_10_percent),
                                  validation_data=test_data_10_percent,
                                  validation_steps=len(test_data_10_percent),
                                  callbacks=[create_tensorboard_callback(dir_name="tensorflow_hub",
                                                                        experiment_name="resnet50V2")])

Saving TensorBoard log files to : tensorflow_hub/resnet50V2/20220502-142041
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5

KeyboardInterrupt: 

Wow!

That. Is. Incredible. Our transfer learning feature extractor model outperformed ALL of the previous models we built by hand (substantially), in a quicker training time AND with only 10% of the training examples.


In [None]:
# Let's create a function to plot our loss curves...
# Tidbit: you could put a function like this into a script called "helper.py" and import it when you need it...
import matplotlib.pyplot as plt
def plot_loss_curves(history):
    """
    Plots separate loss curves for training and validation metrics.
    
    Args:
      history: TensorFlow History object (as returned by model.fit())
    Returns: 
      Plots of training/validation loss and accuracy metrics
      
    """
    loss = history.history["loss"]
    val_loss = history.history["val_loss"]
    
    accuracy = history.history["accuracy"]
    val_accuracy = history.history["val_accuracy"]
    
    epochs = range(len(history.history["loss"]))
    
    # Plot loss
    plt.plot(epochs, loss, label="training loss")
    plt.plot(epochs, val_loss, label="val loss")
    plt.title("Loss")
    plt.xlabel("Epochs")
    plt.legend()
    
    # Plot accuracy (on a separate figure)
    plt.figure()
    plt.plot(epochs, accuracy, label="training accuracy")
    plt.plot(epochs, val_accuracy, label="val accuracy")
    plt.title("Accuracy")
    plt.xlabel("Epochs")
    plt.legend()
    
    

In [None]:
plot_loss_curves(res_net_history)

### Creating and testing EfficientNetB0 TensorFlow Hub Feature Extraction model

In [None]:
efficientnet_model = create_model(model_url=efficientnet_url,
                                 num_classes=train_data_10_percent.num_classes)

In [None]:
# Compile the model
efficientnet_model.compile(loss=tf.keras.losses.CategoricalCrossentropy(),
                          optimizer=tf.keras.optimizers.Adam(),
                          metrics=["accuracy"])

In [None]:
# fit the model
efficientnet_model.fit(train_data_10_percent,
                      epochs=EPOCHS,
                      steps_per_epoch = len(train_data_10_percent),
                      validation_data = test_data_10_percent,
                      validation_steps = len(test_data_10_percent),
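                      # Note: the experiment_name below is an assumed label for this run
                      callbacks=[create_tensorboard_callback(dir_name="tensorflow_hub",
                                                             experiment_name="efficientnetb0")])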