<a href="https://colab.research.google.com/github/SaketMunda/transfer-learning-with-tensorflow/blob/master/fine_tuning_transfer_learning_with_tensorflow.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Transfer Learning with TensorFlow : Fine Tuning

In the previous section, we saw how we could leverage feature extraction transfer learning to get far better results on our Food Vision project (using only 10% of the original dataset) than building our own model from scratch.

Now, we're going to cover another type of transfer learning: fine-tuning.

In **fine-tuning transfer learning**, the pre-trained weights from another model are unfrozen and tweaked during training to better suit your own data.

*Feature extraction transfer learning vs. fine-tuning transfer learning. The main difference between the two is that in fine-tuning, more layers of the pre-trained model get unfrozen and tuned on custom data. This fine-tuning usually takes more data than feature extraction to be effective.*
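
To make the distinction concrete, here is a minimal sketch using a tiny stand-in network rather than a real pre-trained model. The `toy_base` model and its layer names are hypothetical, and unfreezing only the last layer is an arbitrary choice for illustration. The point is simply how the `trainable` attribute controls which weights get updated.

```python
import tensorflow as tf

# Hypothetical stand-in for a pre-trained base model (not a real pretrained network)
toy_base = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu", name="block_1"),
    tf.keras.layers.Dense(16, activation="relu", name="block_2"),
    tf.keras.layers.Dense(16, activation="relu", name="block_3"),
], name="toy_base")

# Feature extraction: freeze every layer so no base weights are updated
toy_base.trainable = False
n_trainable_feature_extraction = len(toy_base.trainable_weights)

# Fine-tuning: unfreeze the base, then re-freeze everything except the last layer
toy_base.trainable = True
for layer in toy_base.layers[:-1]:
    layer.trainable = False
n_trainable_fine_tuning = len(toy_base.trainable_weights)

print(n_trainable_feature_extraction)  # no trainable weight tensors when frozen
print(n_trainable_fine_tuning)         # kernel and bias of block_3 only
```

A model needs to be recompiled after changing `trainable` for the change to take effect in training.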

> 💡 **This time we will use the helper functions to speed up our steps in learning**

Import the Helper functions

In [1]:
# Get helper_functions.py script from Github
!wget https://raw.githubusercontent.com/SaketMunda/ml-helpers/master/helper_functions.py

# Import helper functions we're going to use
from helper_functions import create_tensorboard_callback, unzip_data, walk_through_dir

--2022-12-16 05:53:59--  https://raw.githubusercontent.com/SaketMunda/ml-helpers/master/helper_functions.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1937 (1.9K) [text/plain]
Saving to: ‘helper_functions.py’


2022-12-16 05:53:59 (31.9 MB/s) - ‘helper_functions.py’ saved [1937/1937]



## 10 Food Classes : Working with Less data

In the previous notebook, Transfer Learning: Feature Extraction, we got good results using only 10% of the training data with transfer learning from TensorFlow Hub.

In this notebook, we're going to continue to work with smaller subsets of the data, except this time we'll have a look at how we can use in-built pretrained models within the `tf.keras.applications` module as well as how to fine-tune them to our own custom dataset.

We'll also practice using a new dataloader function, similar to what we've used before: `image_dataset_from_directory()`, which is part of the `tf.keras.preprocessing` module.

Finally, we'll practice using the [Keras Functional API](https://keras.io/guides/functional_api/) for building deep learning models. The Functional API is a more flexible way to create models than the `tf.keras.Sequential` API.
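
As a rough illustration of that flexibility, the sketch below builds the same small stack both ways, then adds a skip connection that a plain `Sequential` stack cannot express. The layer sizes and variable names are arbitrary choices for the example and are not taken from this notebook.

```python
import tensorflow as tf

# Sequential: a plain stack of layers with one input and one output
sequential_model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# Functional: layers are called on tensors, so the same stack is written explicitly
inputs = tf.keras.Input(shape=(4,), name="input_layer")
x = tf.keras.layers.Dense(8, activation="relu")(inputs)
outputs = tf.keras.layers.Dense(3, activation="softmax")(x)
functional_model = tf.keras.Model(inputs, outputs)

# Functional only: a skip connection that feeds the raw inputs forward again
skip_inputs = tf.keras.Input(shape=(4,))
hidden = tf.keras.layers.Dense(4, activation="relu")(skip_inputs)
merged = tf.keras.layers.Concatenate()([skip_inputs, hidden])
skip_outputs = tf.keras.layers.Dense(3, activation="softmax")(merged)
skip_model = tf.keras.Model(skip_inputs, skip_outputs)

print(sequential_model.count_params(), functional_model.count_params())
print(skip_model.output_shape)
```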

Let's start by downloading the data.

In [2]:
# Get 10% of the data of the 10 classes
!wget https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_10_percent.zip

unzip_data('10_food_classes_10_percent.zip')

--2022-12-16 05:54:05--  https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_10_percent.zip
Resolving storage.googleapis.com (storage.googleapis.com)... 74.125.20.128, 108.177.98.128, 74.125.197.128, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|74.125.20.128|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 168546183 (161M) [application/zip]
Saving to: ‘10_food_classes_10_percent.zip’


2022-12-16 05:54:06 (176 MB/s) - ‘10_food_classes_10_percent.zip’ saved [168546183/168546183]



In [3]:
# Walk through the directory and list number of files
walk_through_dir('10_food_classes_10_percent')

There are 2 directories and 0 files in '10_food_classes_10_percent'
There are 10 directories and 0 files in '10_food_classes_10_percent/train'
There are 0 directories and 75 files in '10_food_classes_10_percent/train/chicken_wings'
There are 0 directories and 75 files in '10_food_classes_10_percent/train/steak'
There are 0 directories and 75 files in '10_food_classes_10_percent/train/ice_cream'
There are 0 directories and 75 files in '10_food_classes_10_percent/train/pizza'
There are 0 directories and 75 files in '10_food_classes_10_percent/train/hamburger'
There are 0 directories and 75 files in '10_food_classes_10_percent/train/fried_rice'
There are 0 directories and 75 files in '10_food_classes_10_percent/train/grilled_salmon'
There are 0 directories and 75 files in '10_food_classes_10_percent/train/ramen'
There are 0 directories and 75 files in '10_food_classes_10_percent/train/chicken_curry'
There are 0 directories and 75 files in '10_food_classes_10_percent/train/sushi'
There are

It's the same number of files and classes we used in the previous notebook.

In [4]:
# Create the training and testing directory
train_dir = '10_food_classes_10_percent/train/'
test_dir = '10_food_classes_10_percent/test/'

Now that we've got some image data, we need a way of loading it into a TensorFlow-compatible format.

Previously, we've used the `ImageDataGenerator` class. And while this works well and is still very commonly used, this time we're going to use the `image_dataset_from_directory` function.

One of the main benefits of using `tf.keras.preprocessing.image_dataset_from_directory()` rather than `ImageDataGenerator` is that it creates a `tf.data.Dataset` object rather than a generator. The main advantage is that the `tf.data.Dataset` API is much faster and more efficient than the `ImageDataGenerator` API, which matters most for larger datasets.



In [5]:
import tensorflow as tf

IMG_SHAPE = (224, 224)
BATCH_SIZE = 32

train_data_10_percent = tf.keras.preprocessing.image_dataset_from_directory(train_dir,
                                                  image_size=IMG_SHAPE,
                                                  label_mode='categorical',
                                                  batch_size=BATCH_SIZE)

test_data_10_percent = tf.keras.preprocessing.image_dataset_from_directory(test_dir,
                                                 image_size=IMG_SHAPE,
                                                 label_mode='categorical',
                                                 batch_size=BATCH_SIZE)

Found 750 files belonging to 10 classes.
Found 2500 files belonging to 10 classes.


Wonderful! It looks like our dataloaders have found the correct number of images for each dataset.

For now, the main parameters we're concerned about in the `image_dataset_from_directory` function are:
- `directory`
- `image_size`
- `batch_size`
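
As a quick sanity check on `batch_size`, the number of batches each dataset yields per epoch can be worked out by hand. This sketch uses the file counts reported by the loader and assumes the final, smaller batch is kept rather than dropped.

```python
import math

BATCH_SIZE = 32
num_train_images = 750   # "Found 750 files belonging to 10 classes."
num_test_images = 2500   # "Found 2500 files belonging to 10 classes."

# Each epoch needs enough batches to cover every image once
train_batches = math.ceil(num_train_images / BATCH_SIZE)
test_batches = math.ceil(num_test_images / BATCH_SIZE)

# The last training batch holds whatever is left over
last_train_batch_size = num_train_images - (train_batches - 1) * BATCH_SIZE

print(train_batches, test_batches, last_train_batch_size)
```

Under that assumption, `train_batches` should match what `len(train_data_10_percent)` returns.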

In [6]:
# if we check the datatype of the preprocessed dataset
train_data_10_percent

<BatchDataset element_spec=(TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None, 10), dtype=tf.float32, name=None))>

It's a `BatchDataset`.

In [7]:
# check the classes
train_data_10_percent.class_names

['chicken_curry',
 'chicken_wings',
 'fried_rice',
 'grilled_salmon',
 'hamburger',
 'ice_cream',
 'pizza',
 'ramen',
 'steak',
 'sushi']

Or, if we wanted to see an example batch of data, we could use the `take()` method.

In [8]:
# See an example batch of data
for images, labels in train_data_10_percent.take(1):
  print(images, labels)

tf.Tensor(
[[[[7.49846954e+01 2.19846935e+01 5.98469353e+00]
   [8.14234772e+01 2.84234734e+01 1.24234715e+01]
   [7.82040787e+01 2.52040825e+01 7.20408344e+00]
   ...
   [9.23366776e+01 3.16275311e+01 8.25512695e+00]
   [1.01515282e+02 4.05152817e+01 2.15152836e+01]
   [9.60258255e+01 3.50258255e+01 1.70258255e+01]]

  [[8.78622437e+01 3.11989803e+01 1.65306129e+01]
   [9.22653122e+01 3.62653122e+01 2.12653084e+01]
   [7.88367386e+01 2.58367367e+01 7.83673620e+00]
   ...
   [9.30457840e+01 3.10457840e+01 7.61725616e+00]
   [9.23725052e+01 3.13725033e+01 1.23725033e+01]
   [9.67041702e+01 3.57041740e+01 1.77041721e+01]]

  [[9.27244949e+01 3.37244873e+01 1.97244911e+01]
   [8.71479568e+01 2.91479568e+01 1.51479568e+01]
   [8.82704086e+01 3.29132614e+01 1.56989784e+01]
   ...
   [9.55917664e+01 3.35917625e+01 1.01632347e+01]
   [1.07147827e+02 4.50916939e+01 2.42600937e+01]
   [9.37090683e+01 3.22804985e+01 1.42805004e+01]]

  ...

  [[1.29101990e+02 1.29101990e+02 1.19101997e+02]
   [1

Image arrays come out as tensors of pixel values, whereas the labels come out as one-hot encodings.
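
For reference, a one-hot label is a vector with a 1 at the index of the true class and 0 everywhere else. Here is a minimal pure-Python sketch using the class names listed earlier. The helper names `one_hot` and `decode` are hypothetical and not part of TensorFlow.

```python
class_names = ['chicken_curry', 'chicken_wings', 'fried_rice', 'grilled_salmon',
               'hamburger', 'ice_cream', 'pizza', 'ramen', 'steak', 'sushi']

def one_hot(class_name, class_names):
    """Return a one-hot list with 1.0 at the index of class_name."""
    index = class_names.index(class_name)
    return [1.0 if i == index else 0.0 for i in range(len(class_names))]

def decode(label, class_names):
    """Map a one-hot (or probability) vector back to the highest-scoring class name."""
    return class_names[label.index(max(label))]

pizza_label = one_hot('pizza', class_names)
print(pizza_label)
print(decode(pizza_label, class_names))
```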

## Model 0. Building a transfer learning model using the Keras Functional API

Alright, our data is tensor-ified, let's build a model.

To do so, we'll use the `tf.keras.applications` module, which contains a series of computer vision models already trained on ImageNet, together with the Keras Functional API to construct our model.

We're going to go through the following steps:

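1. Instantiate the pretrained model object by choosing a target model such as `EfficientNetB0` from `tf.keras.applications`, setting `include_top` to `False` (we do this because we're going to create our own top, which are the output layers for the model).
2. Set the base model's `trainable` attribute to `False` to freeze all of the weights in the pre-trained model.
3. Define an input layer for our model, for example, what shape of data should our model expect?
4. [Optional] Normalize the inputs to our model if it requires it. Some computer vision models, such as `ResNet50V2`, expect rescaled inputs rather than raw 0 to 255 pixel values; the code below shows a commented-out `Rescaling(1/255.)` layer for that case, which is not needed for `EfficientNetB0`.
5. Pass the inputs to the base model.
6. Pool the outputs of the base model with `tf.keras.layers.GlobalAveragePooling2D` so they fit the output layer.
7. Create an output activation layer using `tf.keras.layers.Dense` with softmax activation and one neuron per class (10 here).
8. Combine the inputs and outputs into a model using `tf.keras.Model`.
9. Compile the model with a loss function and an optimizer.
10. Fit the model for the desired number of epochs, with any callbacks we need (in our case, a TensorBoard callback).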

In [9]:
# 1. Create a base model with tf.keras.applications
base_model = tf.keras.applications.EfficientNetB0(include_top=False)

# 2. Freeze the base model (so the pre-trained weights remain same)
base_model.trainable = False

# 3. Create the inputs into the base model
inputs = tf.keras.layers.Input(shape=(224, 224, 3), name='input_layer')

# 4. If using ResNet50V2, add this to speed up convergence, remove for EfficientNet
# x = tf.keras.layers.Rescaling(1/255.)(inputs)  # older TF versions: tf.keras.layers.experimental.preprocessing.Rescaling

# 5. Pass the inputs to our base model
x = base_model(inputs)
# check data shape after passing into base model
print(f'Shape after base_model: {x.shape}')

# 6. Average pool the outputs of the base model (aggregate the most important information, reduce the number of computations)
x = tf.keras.layers.GlobalAveragePooling2D(name='global_average_pooling_layer')(x)
print(f'After Global Average Pooling 2D:{x.shape}')

# 7. Create the output activation layer
outputs = tf.keras.layers.Dense(10, activation='softmax', name='output_layer')(x)

# 8. Combine the inputs and outputs into our model
model_0 = tf.keras.Model(inputs, outputs)

# 9. Compile the model
model_0.compile(loss=tf.keras.losses.CategoricalCrossentropy(),
                optimizer='Adam',
                metrics=['accuracy'])

# 10. Fit the model
history_0 = model_0.fit(train_data_10_percent,
                        epochs=5,
                        steps_per_epoch=len(train_data_10_percent),
                        validation_data=test_data_10_percent,
                        validation_steps=len(test_data_10_percent),
                        callbacks=[create_tensorboard_callback('transfer_learning',
                                                               '10_percent_feature_extraction')])

Downloading data from https://storage.googleapis.com/keras-applications/efficientnetb0_notop.h5
Shape after base_model: (None, 7, 7, 1280)
After Global Average Pooling 2D:(None, 1280)
Saving Tensorboard log files to: transfer_learning/10_percent_feature_extraction/20221216-055414
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


Alright, we get 88% training accuracy and 86% validation accuracy. Not bad!
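
To see what the pooling and output layers do to the tensor shapes printed above, here is a NumPy sketch with random stand-in values. The feature maps and weights are random and untrained, purely to illustrate shapes. Global average pooling takes the mean over the two spatial axes, and the dense softmax layer maps the 1280 pooled features to 10 class probabilities.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for the base model output: a batch of 2 feature maps of shape (7, 7, 1280)
feature_maps = rng.normal(size=(2, 7, 7, 1280))

# Global average pooling: average over the height and width axes
pooled = feature_maps.mean(axis=(1, 2))

# Dense output layer with softmax (random, untrained weights for illustration)
kernel = rng.normal(scale=0.01, size=(1280, 10))
bias = np.zeros(10)
logits = pooled @ kernel + bias
exp_logits = np.exp(logits - logits.max(axis=1, keepdims=True))  # subtract max for stability
probabilities = exp_logits / exp_logits.sum(axis=1, keepdims=True)

# Parameters in the output layer: one weight per (feature, class) pair plus one bias per class
num_head_params = kernel.size + bias.size

print(feature_maps.shape, pooled.shape, probabilities.shape)
print(num_head_params)
```

With the base model frozen, the parameters of the output layer are the only ones updated during feature extraction, since the pooling layer has no weights of its own.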

Next, let's plot the loss curves using a helper function.