<a href="https://colab.research.google.com/github/ShahZebYousafzai/Deep-Learning-Basics/blob/main/6_Transfer_Learning(FineTuning).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Transfer Learning (Fine-Tuning)

In the previous notebook, we covered feature extraction transfer learning. Now it's time to learn about a different kind of transfer learning: fine-tuning.

In [1]:
# Check if we're using a GPU
!nvidia-smi

Sat Oct 16 14:11:05 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.74       Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   36C    P8    28W / 149W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

## Creating helper functions

In previous notebooks, we created a bunch of helper functions. We could rewrite them all here, but that would be tedious.

So, it's a good idea to put functions you'll want to use again in a script you can download and import into your notebooks (or elsewhere).

We've done this for some of the functions we've used previously here: https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/extras/helper_functions.py

In [2]:
!wget https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/extras/helper_functions.py

--2021-10-16 14:13:59--  https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/extras/helper_functions.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 10246 (10K) [text/plain]
Saving to: ‘helper_functions.py’


2021-10-16 14:13:59 (68.1 MB/s) - ‘helper_functions.py’ saved [10246/10246]



In [3]:
# Import helper functions we're going to use in this notebook
from helper_functions import create_tensorboard_callback, plot_loss_curves, unzip_data, walk_through_dir

> 🔑 **Note:** If you're running this notebook in Google Colab, Colab deletes `helper_functions.py` when the runtime times out, so you'll have to download it again to get access to your helper functions.
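
Since the file can disappear between sessions, one option is to download it only when it's missing. The snippet below is a minimal sketch of that idea using only the standard library; `ensure_file` and `HELPER_URL` are hypothetical names introduced here for illustration, not part of the helper script.

```python
import os
import urllib.request

# URL of the helper script used in this notebook
HELPER_URL = "https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/extras/helper_functions.py"

def ensure_file(url, filename):
    """Download `url` to `filename` only if `filename` doesn't exist yet.

    Returns True if a download happened, False if the file was already there.
    (Hypothetical helper, not part of helper_functions.py.)
    """
    if os.path.exists(filename):
        return False
    urllib.request.urlretrieve(url, filename)
    return True
```

Calling `ensure_file(HELPER_URL, "helper_functions.py")` at the top of the notebook should then be safe to re-run after a timeout: it only hits the network when the file is gone.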

## Let's get some data

This time we're going to see how we can use the pretrained models from `tf.keras.applications` and apply them to our own problem (recognizing images of food).

In [4]:
# Get 10% of the data of the 10 classes
!wget https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_10_percent.zip 

unzip_data("10_food_classes_10_percent.zip")

--2021-10-16 14:22:08--  https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_10_percent.zip
Resolving storage.googleapis.com (storage.googleapis.com)... 108.177.120.128, 142.250.128.128, 142.251.6.128, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|108.177.120.128|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 168546183 (161M) [application/zip]
Saving to: ‘10_food_classes_10_percent.zip’


2021-10-16 14:22:09 (192 MB/s) - ‘10_food_classes_10_percent.zip’ saved [168546183/168546183]



In [5]:
# Check out how many images and subdirectories are in our dataset
walk_through_dir("10_food_classes_10_percent")

There are 2 directories and 0 images in '10_food_classes_10_percent'.
There are 10 directories and 0 images in '10_food_classes_10_percent/train'.
There are 0 directories and 75 images in '10_food_classes_10_percent/train/grilled_salmon'.
There are 0 directories and 75 images in '10_food_classes_10_percent/train/sushi'.
There are 0 directories and 75 images in '10_food_classes_10_percent/train/hamburger'.
There are 0 directories and 75 images in '10_food_classes_10_percent/train/chicken_curry'.
There are 0 directories and 75 images in '10_food_classes_10_percent/train/pizza'.
There are 0 directories and 75 images in '10_food_classes_10_percent/train/chicken_wings'.
There are 0 directories and 75 images in '10_food_classes_10_percent/train/steak'.
There are 0 directories and 75 images in '10_food_classes_10_percent/train/ramen'.
There are 0 directories and 75 images in '10_food_classes_10_percent/train/ice_cream'.
There are 0 directories and 75 images in '10_food_classes_10_percent/trai
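
To see roughly how a summary like the one above can be produced, here's a small sketch built on `os.walk`. It's an assumption about the general approach, not the actual source of `walk_through_dir`; `summarize_dir` is a hypothetical name, and it counts every file in a folder, not only images.

```python
import os

def summarize_dir(dir_path):
    """Return a list of (folder, n_subdirectories, n_files) tuples.

    Hypothetical stand-in for illustration; counts all files, not just images.
    """
    summary = []
    for dirpath, dirnames, filenames in os.walk(dir_path):
        summary.append((dirpath, len(dirnames), len(filenames)))
    return summary

# Prints nothing if the folder doesn't exist (os.walk yields no entries)
for dirpath, n_dirs, n_files in summarize_dir("10_food_classes_10_percent"):
    print(f"There are {n_dirs} directories and {n_files} files in '{dirpath}'.")
```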

In [6]:
# Create train and test directory paths
train_dir = "/content/10_food_classes_10_percent/train"
test_dir = "/content/10_food_classes_10_percent/test"

In [7]:
import tensorflow as tf

IMG_SIZE = (224, 224)
BATCH_SIZE = 32

train_data_10_percent = tf.keras.preprocessing.image_dataset_from_directory(directory=train_dir,
                                                                            image_size=IMG_SIZE,
                                                                            label_mode="categorical",
                                                                            batch_size=BATCH_SIZE)

test_data_10_percent = tf.keras.preprocessing.image_dataset_from_directory(directory=test_dir,
                                                                            image_size=IMG_SIZE,
                                                                            label_mode="categorical",
                                                                            batch_size=BATCH_SIZE)

Found 750 files belonging to 10 classes.
Found 2500 files belonging to 10 classes.


In [8]:
train_data_10_percent

<BatchDataset shapes: ((None, 224, 224, 3), (None, 10)), types: (tf.float32, tf.float32)>
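
A quick word on those shapes. The leading `None` is the batch dimension, which is left unspecified because the final batch can be smaller than `BATCH_SIZE`. The label shape `(None, 10)` comes from `label_mode="categorical"`, which one-hot encodes each label across our 10 classes. The arithmetic below is a small sketch using the file count reported above.

```python
import math

n_train_images = 750  # from "Found 750 files belonging to 10 classes."
batch_size = 32       # same value as BATCH_SIZE

# Number of batches per epoch (the last one may be only partially full)
n_batches = math.ceil(n_train_images / batch_size)

# Size of that final, smaller batch
last_batch_size = n_train_images - (n_batches - 1) * batch_size

print(n_batches)        # 24
print(last_batch_size)  # 14
```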

In [9]:
# Check out class names of our dataset
train_data_10_percent.class_names

['chicken_curry',
 'chicken_wings',
 'fried_rice',
 'grilled_salmon',
 'hamburger',
 'ice_cream',
 'pizza',
 'ramen',
 'steak',
 'sushi']
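
The class names are sorted alphanumerically, and the position of each name matches the position of the `1` in a one-hot label. As a minimal sketch (plain Python lists rather than tensors, and `decode_one_hot` is a hypothetical helper), here's how a label maps back to its class name:

```python
class_names = ['chicken_curry', 'chicken_wings', 'fried_rice', 'grilled_salmon',
               'hamburger', 'ice_cream', 'pizza', 'ramen', 'steak', 'sushi']

def decode_one_hot(label, class_names):
    """Return the class name at the position of the largest value in `label`."""
    index = max(range(len(label)), key=lambda i: label[i])
    return class_names[index]

# A made-up one-hot label with the 1 at index 6
example_label = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
print(decode_one_hot(example_label, class_names))  # pizza
```

With a real batch from `train_data_10_percent`, the same lookup would typically use `tf.argmax` on the label tensor instead of the pure-Python `max`.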

In [10]:
# See an example of a batch of data
for image, label in train_data_10_percent.take(1):
  print(image, label)

tf.Tensor(
[[[[  1.           0.          31.        ]
   [  1.           0.          31.        ]
   [  1.           0.          31.        ]
   ...
   [  1.           0.          31.        ]
   [  1.           0.          31.        ]
   [  1.           0.          31.        ]]

  [[  1.           0.          31.        ]
   [  1.           0.          31.        ]
   [  1.           0.          31.        ]
   ...
   [  1.           0.          31.        ]
   [  1.           0.          31.        ]
   [  1.           0.          31.        ]]

  [[  1.           0.          31.        ]
   [  1.           0.          31.        ]
   [  1.           0.          31.        ]
   ...
   [  1.           0.          31.        ]
   [  1.           0.          31.        ]
   [  1.           0.          31.        ]]

  ...

  [[107.50011     94.928635    89.08168   ]
   [115.71439    101.28578     94.15825   ]
   [117.97455    104.18881     95.18881   ]
   ...
   [118.61738     29.   

In this notebook, we'll run a series of modelling experiments. They're summarized in the table below.

<style type="text/css">
.tg  {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-qxxr{background-color:#FFF;border-color:inherit;font-size:16px;text-align:center;vertical-align:top}
.tg .tg-bn4o{font-size:18px;font-weight:bold;text-align:center;vertical-align:top}
.tg .tg-qv16{font-size:16px;font-weight:bold;text-align:center;vertical-align:top}
.tg .tg-gmla{border-color:inherit;font-size:16px;text-align:center;vertical-align:top}
.tg .tg-a41q{background-color:#FFF;border-color:inherit;font-size:18px;font-weight:bold;text-align:center;vertical-align:top}
.tg .tg-2xbj{border-color:inherit;font-size:18px;font-weight:bold;text-align:center;vertical-align:top}
.tg .tg-lvth{font-size:16px;text-align:center;vertical-align:top}
.tg .tg-rwiu{background-color:#FFF;font-size:16px;text-align:center;vertical-align:top}
</style>
<table class="tg">
<thead>
  <tr>
    <th class="tg-a41q">Experiment</th>
    <th class="tg-a41q">Data</th>
    <th class="tg-2xbj">Preprocessing</th>
    <th class="tg-bn4o">Model</th>
  </tr>
</thead>
<tbody>
  <tr>
    <td class="tg-qxxr">Model 0 (baseline)</td>
    <td class="tg-qxxr">10 classes of Food101 (<span style="font-weight:bold">random 10%</span><br><span style="font-weight:normal">training data only)</span></td>
    <td class="tg-gmla">None</td>
    <td class="tg-lvth"><span style="font-weight:bold">Feature Extractor: </span>EfficientNetB0<br>(pre-trained on ImageNet, all layers<br>frozen) with no top</td>
  </tr>
  <tr>
    <td class="tg-gmla">Model 1</td>
    <td class="tg-qxxr">10 classes of Food101 (<span style="font-weight:bold">random 1%</span><br><span style="font-weight:normal">training data only)</span></td>
    <td class="tg-gmla">Random Flip, Rotation,<br>Zoom, Height, Width,<br>data augmentation</td>
    <td class="tg-lvth">Same as Model 0</td>
  </tr>
  <tr>
    <td class="tg-qxxr">Model 2</td>
    <td class="tg-qxxr">Same as Model 0</td>
    <td class="tg-gmla">Same as Model 1</td>
    <td class="tg-lvth">Same as Model 0</td>
  </tr>
  <tr>
    <td class="tg-rwiu">Model 3</td>
    <td class="tg-lvth">Same as Model 0</td>
    <td class="tg-lvth">Same as Model 1</td>
    <td class="tg-qv16">Fine Tuning: <span style="font-weight:normal">Model 2 (EfficientNetB0</span><br><span style="font-weight:normal">pre-trained on ImageNet) </span><span style="font-weight:bold">with top</span><br><span style="font-weight:bold">layer trained on custom data, top 10</span><br><span style="font-weight:bold">layers unfrozen </span></td>
  </tr>
  <tr>
    <td class="tg-rwiu">Model 4</td>
    <td class="tg-lvth">10 classes of Food101 data (<span style="font-weight:bold">100%</span><span style="font-weight:normal"> training</span><br><span style="font-weight:normal">data)</span></td>
    <td class="tg-lvth">Same as Model 1</td>
    <td class="tg-lvth">Same as Model 3</td>
  </tr>
</tbody>
</table>
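
The main difference between the feature extraction models (0 to 2) and the fine-tuning models (3 and 4) is which layers are allowed to update during training. Here's a minimal sketch of the "top 10 layers unfrozen" idea, where "top" means the layers closest to the output. It's written against any object with a `layers` list and `trainable` flags, which is how Keras models expose them; the toy model below is a stand-in so the example runs without downloading weights, and `unfreeze_top_layers` is a hypothetical name.

```python
from types import SimpleNamespace

def unfreeze_top_layers(base_model, n_unfrozen=10):
    """Make only the last `n_unfrozen` layers trainable, freeze the rest."""
    base_model.trainable = True
    cutoff = len(base_model.layers) - n_unfrozen
    for i, layer in enumerate(base_model.layers):
        # Layers at or after the cutoff are the "top" layers
        layer.trainable = i >= cutoff
    return base_model

# Toy stand-in for a frozen base model with 25 layers (not a real network)
toy_layers = [SimpleNamespace(name=f"layer_{i}", trainable=False) for i in range(25)]
toy_model = SimpleNamespace(layers=toy_layers, trainable=False)

unfreeze_top_layers(toy_model, n_unfrozen=10)
n_trainable = sum(layer.trainable for layer in toy_model.layers)
print(n_trainable)  # 10
```

With a real Keras base model, remember to recompile the full model after changing `trainable` flags so the change takes effect during training.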