# Transfer Learning - Fine Tuning

<font color='steelblue'>

<font size = 4>
    
Use a *`pre-trained`* model from the *`tf.keras.applications`* module, and introduce an example of using the *`Keras Functional API`*.
</font>

</font>

<font size = 3>
    
Use the same `EfficientNetB0` model, which classifies images. Here, however, we replace its input layer with our own input layer sized for the images we have<br>
    
**The following is included here:**<br>
    
- `Load & Prepare` training and test images<br>
- Use `Functional API` to `build layers` for the model
- `Train & Evaluate` model
- `Explore` model performance
</font>

In [None]:
import os
import shutil

In [None]:
# define location of data
dpath = "../datasets/FoodClasses/"

In [None]:
# list the directories in dataset
for dirpath, dirnames, filenames in os.walk(dpath):
    print(f"{len(dirnames)} directories and {len(filenames)} images in '{dirpath}'")

<font color = 'slategrey'>
<font size = 4>
    <b>Note:</b><br><br>
- There are more images in the test directories than in the train directories<br>
- This is key to showing that transfer learning can perform well with fewer training images<br>
- We therefore train on less data than we evaluate on<br>
</font>
</font>


## Loading and Preparing data<br>

<font size = "3">
    
- Load images from the appropriate directories using `image_dataset_from_directory`<br>
- It works in much the same way as the `ImageDataGenerator.flow_from_directory` method<br>
- The benefit of this method is that we get a `tf.data.Dataset` object rather than a generator object
- `tf.data.Dataset` is generally a more efficient API than the `ImageDataGenerator` API, since it supports steps such as caching and prefetching
    
[`tf.data.Dataset`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset)
</font>
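
As a rough illustration of where the efficiency of `tf.data.Dataset` can come from, the sketch below builds a small pipeline on synthetic tensors and adds `cache()` and `prefetch()`. The random images, the `normalize` helper and all sizes are assumptions made for this example and are not part of the food dataset used in this notebook.

```python
import tensorflow as tf

# Synthetic stand-in data (assumption): 10 random "images" with integer labels
images = tf.random.uniform((10, 8, 8, 3))
labels = tf.range(10)

def normalize(image, label):
    # Hypothetical preprocessing step: centre pixel values around zero
    return image - 0.5, label

demoData = (tf.data.Dataset.from_tensor_slices((images, labels))
            .map(normalize, num_parallel_calls=tf.data.AUTOTUNE)
            .batch(4)
            .cache()                      # keep preprocessed batches in memory after the first pass
            .prefetch(tf.data.AUTOTUNE))  # prepare upcoming batches while the current one is in use

# Collect the shape of every batch of images
batchShapes = [tuple(batchImages.shape) for batchImages, _ in demoData]
print(batchShapes)
```

The same `cache()` and `prefetch()` calls could be chained onto a dataset returned by `image_dataset_from_directory`, although whether they help depends on the dataset size and available memory.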

In [None]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

In [None]:
IMAGE_SHAPE = (224, 224)
BATCH_SIZE = 32
IMAGE_SHAPE_COLOR = (224, 224, 3)

In [None]:
trainDir = dpath + "train"
testDir = dpath + "test"
trainDir, testDir

<font size = 4>


Parameters to use from the `image_dataset_from_directory()`:
- `directory`    - the file path of the target directory for images
- `image_size`   - the target image size that we want in our dataset
- `batch_size`   - how many images to load at a time; the default is `32`, i.e. load 32 images and their labels per batch

[`tf.keras.preprocessing` Documentation](https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image_dataset_from_directory)
    
</font>

In [None]:
print("Training data")
trainData = tf.keras.preprocessing.image_dataset_from_directory(directory = trainDir,
                                                  image_size = IMAGE_SHAPE,
                                                  batch_size = BATCH_SIZE,
                                                  label_mode = "categorical")

In [None]:
trainData.class_names

In [None]:
trainData

<font size = 3>
    
`BatchDataset`:
    
- `(None, 224, 224, 3)` is the shape of the image tensor: `None` is the batch size, `224` is the height and width of the image, and `3` is the number of color channels
- `(None, 10)` is the shape of the label tensor: `None` is the batch size and `10` is the number of classes in the dataset
- Both images and labels are of type `tf.float32`
    
The `Batch Size` is shown as `None` because the batch dimension is not fixed in the dataset's element spec: most batches hold `BATCH_SIZE` images, but the final batch may hold fewer
    
</font>
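
To see why the batch dimension is reported as `None`, here is a minimal sketch on a toy dataset. The 70 dummy elements and the batch size of 32 are numbers chosen only for illustration.

```python
import tensorflow as tf

# 70 dummy elements batched in groups of 32 (numbers chosen for illustration)
demoData = tf.data.Dataset.range(70).batch(32)

# The static element spec leaves the batch dimension open
print(demoData.element_spec)

# The actual batches have concrete sizes, and the last one is smaller
batchSizes = [int(batch.shape[0]) for batch in demoData]
print(batchSizes)
```

Because the last batch holds only the leftover elements, the dataset cannot promise a fixed batch size up front, so it reports `None`.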

In [None]:
# example of batch data (taking one batch - size is 32)
for images, labels in trainData.take(1):
    print(labels, images)

Each label above is `one hot encoded`, e.g. `[0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]` is `hamburger`
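
A one-hot label can be mapped back to its class name by taking the index of the `1`. The sketch below uses a hypothetical list of class names that was made up for illustration so that `hamburger` sits at index 4. In the notebook, the real list comes from `trainData.class_names`.

```python
import numpy as np

# Hypothetical class list (assumption); in the notebook use trainData.class_names
classNames = ["chicken_curry", "chicken_wings", "fried_rice", "grilled_salmon",
              "hamburger", "ice_cream", "pizza", "ramen", "steak", "sushi"]

# One-hot encoded label as printed in the batch above
oneHotLabel = np.array([0., 0., 0., 0., 1., 0., 0., 0., 0., 0.])

# The position of the largest value is the class index
classIndex = int(np.argmax(oneHotLabel))
print(classIndex, classNames[classIndex])
```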

In [None]:
print("Test data")
testData = tf.keras.preprocessing.image_dataset_from_directory(directory = testDir,
                                                               image_size = IMAGE_SHAPE,
                                                               batch_size = BATCH_SIZE,
                                                               label_mode = "categorical")

In [None]:
import matplotlib.pyplot as plt

# Plot the loss and accuracy curves in separate figures
def plot_loss_curves(history):
  """
  Plots separate loss and accuracy curves for training and validation metrics.
  """
  loss = history.history['loss']
  val_loss = history.history['val_loss']

  accuracy = history.history['accuracy']
  val_accuracy = history.history['val_accuracy']

  epochs = range(len(history.history['loss']))

  # Plot loss
  plt.plot(epochs, loss, label='training_loss')
  plt.plot(epochs, val_loss, label='val_loss')
  plt.title('Loss')
  plt.xlabel('Epochs')
  plt.legend()

  # Plot accuracy
  plt.figure()
  plt.plot(epochs, accuracy, label='training_accuracy')
  plt.plot(epochs, val_accuracy, label='val_accuracy')
  plt.title('Accuracy')
  plt.xlabel('Epochs')
  plt.legend();

## Transfer Learning - Fine Tuning <br>

<font size = 3>

- Take the underlying `patterns (weights)` from the pre-trained model and `fine tune` them to our dataset
- Usually this means training some, many or all of the layers of the pre-trained model
- Useful when the dataset has a `large number of classes` and the data is slightly different from the data that the pre-trained model was trained on
- An example would be that we want to define our own input shape for the data (`unfreeze` that layer)
    
</font>
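
The idea of training only some layers can be sketched on a toy model. The small `Sequential` network below is a hypothetical stand-in for a pre-trained base (no weights are downloaded), and unfreezing only the last layer is an arbitrary choice for illustration.

```python
import tensorflow as tf

# Hypothetical stand-in for a pre-trained base model (assumption)
toyBase = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu", name="block1"),
    tf.keras.layers.Dense(16, activation="relu", name="block2"),
    tf.keras.layers.Dense(16, activation="relu", name="block3"),
], name="toyBase")

# Make the whole model trainable, then freeze everything except the last layer
toyBase.trainable = True
for layer in toyBase.layers[:-1]:
    layer.trainable = False

trainableNames = [layer.name for layer in toyBase.layers if layer.trainable]
print(trainableNames)
print(len(toyBase.trainable_weights))   # kernel and bias of the unfrozen layer
```

After changing `trainable`, a model needs to be compiled again before training for the change to take effect. A smaller learning rate is a common choice at that point.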

## Transfer Learning using Keras Functional API<br>

<font size = 3>
    
- To use our `predefined model`, use [`tf.keras.applications`](https://www.tensorflow.org/api_docs/python/tf/keras/applications)
- This applications module is already set up for using [`Keras Functional API`](https://keras.io/guides/functional_api/)
    
Perform the following steps:
1. Instantiate a pre-trained base model object [`EfficientNetB0`](https://www.tensorflow.org/api_docs/python/tf/keras/applications/EfficientNetB0) from `tf.keras.applications`, setting the `include_top` parameter to `False` (we do this because we're going to create our own top, which are the output layers for the model).
2. Set base model to `un-trainable`, so that all the weights in the predefined model are frozen
3. Create the `input layer` for the model (set the `shape to our image size`)
4. Pass the `input layer` to the `base model` created in the steps above
5. Pool the outputs of the base model into a shape compatible with the output activation layer (turn base model output tensors into same shape as label tensors). This can be done using [`tf.keras.layers.GlobalAveragePooling2D()`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/GlobalAveragePooling2D) or [`tf.keras.layers.GlobalMaxPooling2D()`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/GlobalMaxPool2D?hl=en) though the former is more common in practice.
6. Create the `output activation layer` with the appropriate `number of neurons` using `tf.keras.layers.Dense()`
7. Combine the inputs and the outputs into a model using [`tf.keras.Model`](https://www.tensorflow.org/api_docs/python/tf/keras/Model)
</font>

In [None]:
#import tensorflow as tf
#import tensorflow_hub as hub
from tensorflow.keras import layers

In [None]:
# 1. Create base model
baseModel = tf.keras.applications.EfficientNetB0(include_top = False)

In [None]:
# 2. Freeze the base model
baseModel.trainable = False

In [None]:
# 3. inputs into the base model
inputs = tf.keras.layers.Input(shape = IMAGE_SHAPE_COLOR, name = "InputLayer")

In [None]:
# 4. Pass input layer to base model
x = baseModel(inputs)

In [None]:
x.shape

In [None]:
# 5. Average pool the outputs of the base model
x = tf.keras.layers.GlobalAveragePooling2D(name = "GlobalAvgPooling") (x)
print(f"After average pooling shape is: {x.shape}")

In [None]:
# 6. Create output activation layer
outputs = tf.keras.layers.Dense(len(trainData.class_names), 
                                activation = "softmax",
                                name = "OutputLayer") (x)

In [None]:
# 7. Combine the inputs and outputs into a Model
effNetModel = tf.keras.Model(inputs, outputs, name = "FineTuning")

## Model Training

In [None]:
# compile model
effNetModel.compile(loss = "categorical_crossentropy",
                   optimizer = tf.keras.optimizers.Adam(),
                   metrics = ["accuracy"])

In [None]:
%%time
# train model (takes about 25+ minutes)

tf.random.set_seed(2345)
effNetHistory = effNetModel.fit(trainData,
                               epochs = 5,
                               steps_per_epoch = len(trainData),
                               validation_data = testData,
                               validation_steps = len(testData),
                               verbose = 1)

## Model Performance

In [None]:
plot_loss_curves(effNetHistory)

In [None]:
# check out the layers in the base model
for layerNum, layer in enumerate(baseModel.layers):
    print(layerNum, layer.name)

<font size = 3>

- There are a lot of layers here; coding them by hand would take a fairly long time
- Reusing them is one of the benefits of transfer learning
    
</font>

In [None]:
# Summary of base model
baseModel.summary()

In [None]:
# model summary
effNetModel.summary()

<font size = 3>
  
- The `InputLayer` has a shape of `(None, 224, 224, 3)`; the `None` is a placeholder for the `batch size`
- The EfficientNetB0 has 236 layers (check its type: it is `Functional`)
- In the output layer `(None, 10)`, 10 is the number of classes and `None` is the batch size
- There are more than `4M parameters`, of which only `12,810 are trainable`
- *(`1280 * 10` weights in the output layer + `10` biases, one per output neuron = **12,810**)*
    
    
</font>
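
The trainable parameter count can be checked with a few lines of arithmetic. The sketch assumes, as in the summary above, 1280 pooled features feeding a dense output layer of 10 neurons.

```python
# Assumed sizes: 1280 pooled features feeding 10 output neurons
numFeatures = 1280
numClasses = 10

weights = numFeatures * numClasses   # one weight per (feature, neuron) pair
biases = numClasses                  # one bias per output neuron

trainableParams = weights + biases
print(trainableParams)
```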

### GlobalAveragePooling2D<br>

<font size = 3>

- The 4D tensors coming from the `EfficientNetB0` layer need to be converted into 2D tensors, since the output layer expects one feature vector per image to produce its 10 class scores
- Apply `tf.keras.layers.GlobalAveragePooling2D()` to average over the inner (height and width) axes of the tensors
- Here is an example:    
</font>

In [None]:
# define a small tensor with the same number of dimensions (4D) as the output of EfficientNetB0
inputShape = (1, 4, 4, 3)

# create a random tensor
tf.random.set_seed(2345)          # make repeatable

inTensor = tf.random.normal(inputShape)
print(f"Random input tensor:\n {inTensor}")
print(f"Number of dimensions: {inTensor.ndim}")

In [None]:
# Pass the tensor through global average pooling 2D layer
gblAvgPoolTensor = tf.keras.layers.GlobalAveragePooling2D() (inTensor)
print(f"Tensor after Global Average Pooling:\n {gblAvgPoolTensor}")

In [None]:
# explore input shape of tensor
inTensor.shape, inTensor.ndim

In [None]:
# explore output shape of tensor
gblAvgPoolTensor.shape, gblAvgPoolTensor.ndim

In [None]:
# the same result can be computed with reduce_mean
# average over the middle (height and width) axes
tf.reduce_mean(inTensor, axis = [1, 2])
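
The same pooling can also be reproduced without TensorFlow, which may help to see what is being averaged. The sketch below uses NumPy on a random array of the same toy shape. The values will differ from the TensorFlow tensor above because a different random generator is used.

```python
import numpy as np

# Random array with the toy shape (batch, height, width, channels)
rng = np.random.default_rng(2345)
inArray = rng.normal(size=(1, 4, 4, 3))

# Global average pooling: mean over the height and width axes
pooled = inArray.mean(axis=(1, 2))
print(pooled.shape)

# Manual check for channel 0: sum the 4 x 4 = 16 values and divide by 16
manualChannel0 = inArray[0, :, :, 0].sum() / 16
print(np.isclose(pooled[0, 0], manualChannel0))
```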