<a href="https://colab.research.google.com/github/danchaud-vincent/tensorflow-deep-learning/blob/main/06_transfer_learning_in_tensorflow_part_3_scaling_up.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 06 - Transfer Learning with TensorFlow Part 3: Scaling up (Food Vision mini)

In the previous two notebooks (**transfer learning part 1: feature extraction** and **part 2: fine-tuning**) we've seen the power of transfer learning.

Now that we know our smaller modelling experiments are working, it's time to step things up a notch with more data.

This is a common practice in machine learning and deep learning: get a model working on a small amount of data before scaling it up to a larger amount of data.

It's time to get closer to our Food Vision project coming to life. In this notebook we're going to scale up from using 10 classes of the Food101 data to using all of the classes in the Food101 dataset.

Our goal is to **beat the original Food101 paper's results with 10% of data**.

![](https://raw.githubusercontent.com/danchaud-vincent/tensorflow-deep-learning/main/images/06-ml-serial-experimentation.png)
*Machine learning practitioners are serial experimenters. Start small, get a model working, see if your experiments work, then gradually scale them up to where you want to go (we're going to be looking at scaling up throughout this notebook).*

## What we're going to cover

We're going to go through the following with TensorFlow:
- Downloading and preparing 10% of the Food101 data (10% of training data).
- Training a feature extraction transfer learning model on 10% of the Food101 training data.
- Fine-tuning our feature extraction model.
- Saving and loading our trained model.
- Evaluating the performance of our Food Vision model trained on 10% of the training data.
  - Finding our model's most wrong predictions.
- Making predictions with our Food Vision model on custom images of food.

In [1]:
# Are we using a GPU?
!nvidia-smi

Mon Oct 17 17:39:45 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   70C    P8    11W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               

## Creating Helper Functions


We've created a series of helper functions throughout the previous notebooks. Instead of rewriting them (tedious), we'll import the `helper_functions.py` file from the GitHub repo.


In [2]:
# Get helper functions file
!wget https://raw.githubusercontent.com/danchaud-vincent/tensorflow-deep-learning/main/utils/helper_functions.py

--2022-10-17 17:39:45--  https://raw.githubusercontent.com/danchaud-vincent/tensorflow-deep-learning/main/utils/helper_functions.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4405 (4.3K) [text/plain]
Saving to: ‘helper_functions.py’


2022-10-17 17:39:46 (60.1 MB/s) - ‘helper_functions.py’ saved [4405/4405]



In [3]:
# Import a series of helper functions for the notebook
from helper_functions import plot_loss_curves, unzip_data, compare_historys, walk_through_dir, create_tensorboard_callback

## 101 Food Classes: Working with less data

So far we've confirmed the transfer learning models we've been using work pretty well with the 10 Food Classes dataset. Now it's time to step it up and see how they go with the full 101 Food Classes.

In the original Food101 dataset there are 1,000 images per class (750 of each class in the training set and 250 of each class in the test set), totalling 101,000 images.

We could start modelling straight away on this large dataset, but in the spirit of continually experimenting, we're going to see how our previously working models perform with 10% of the training data.

This means for each of the 101 food classes we'll be building a model on 75 training images and evaluating it on 250 test images.
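As a quick sanity check on those numbers, the split sizes can be worked out by hand. The short sketch below uses only the figures stated above (101 classes, 750 training and 250 test images per class, 10% of the training data); the variable names are purely illustrative.

```python
# Dataset sizes implied by the figures above (variable names are illustrative)
NUM_CLASSES = 101
TRAIN_IMAGES_PER_CLASS = 750
TEST_IMAGES_PER_CLASS = 250
TRAIN_PERCENT = 10

# Full Food101 dataset: every class has training + test images
full_dataset_size = NUM_CLASSES * (TRAIN_IMAGES_PER_CLASS + TEST_IMAGES_PER_CLASS)

# 10% subset of the training images; the test set is left untouched
train_per_class_10_percent = TRAIN_IMAGES_PER_CLASS * TRAIN_PERCENT // 100
train_size_10_percent = NUM_CLASSES * train_per_class_10_percent
test_size = NUM_CLASSES * TEST_IMAGES_PER_CLASS

print(full_dataset_size, train_per_class_10_percent, train_size_10_percent, test_size)
# 101000 75 7575 25250
```

The last two totals are the file counts to expect when the training and test directories are loaded.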

## Downloading and preprocessing the data

Just as before, we'll download a subset of the Food101 dataset which has been extracted from the original dataset (to see the preprocessing of the data, check out the [Food Vision preprocessing notebook](https://github.com/mrdbourke/tensorflow-deep-learning/blob/main/extras/image_data_modification.ipynb)).

We download the data as a zip file, so we'll use our `unzip_data()` function to unzip it.

In [4]:
# Download the data
!wget https://storage.googleapis.com/ztm_tf_course/food_vision/101_food_classes_10_percent.zip

--2022-10-17 17:39:48--  https://storage.googleapis.com/ztm_tf_course/food_vision/101_food_classes_10_percent.zip
Resolving storage.googleapis.com (storage.googleapis.com)... 74.125.68.128, 142.250.4.128, 74.125.24.128, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|74.125.68.128|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1625420029 (1.5G) [application/zip]
Saving to: ‘101_food_classes_10_percent.zip’


2022-10-17 17:40:13 (65.3 MB/s) - ‘101_food_classes_10_percent.zip’ saved [1625420029/1625420029]



In [5]:
# unzip the data
unzip_data("101_food_classes_10_percent.zip")
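`unzip_data()` lives in `helper_functions.py`, so its body isn't shown in this notebook. As a rough idea of what such a helper might look like, here is a minimal sketch using Python's standard `zipfile` module. `unzip_archive` is a hypothetical name, and the actual helper may differ.

```python
import zipfile

def unzip_archive(filename, extract_to="."):
    """Extract every file in the zip archive `filename` into `extract_to`.

    Hypothetical stand-in for a helper like unzip_data(); not the helper's actual source.
    """
    with zipfile.ZipFile(filename, "r") as zip_ref:
        zip_ref.extractall(extract_to)

# Example usage (assumes the zip file has already been downloaded):
# unzip_archive("101_food_classes_10_percent.zip")
```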

In [6]:
# train and test directories
train_dir = "101_food_classes_10_percent/train/"
test_dir = "101_food_classes_10_percent/test/"

In [7]:
# Walk through our dir
walk_through_dir("101_food_classes_10_percent")

101_food_classes_10_percent: There are 2 directories and 0 files
101_food_classes_10_percent/test: There are 101 directories and 0 files
101_food_classes_10_percent/test/bread_pudding: There are 0 directories and 250 files
101_food_classes_10_percent/test/donuts: There are 0 directories and 250 files
101_food_classes_10_percent/test/beignets: There are 0 directories and 250 files
101_food_classes_10_percent/test/prime_rib: There are 0 directories and 250 files
101_food_classes_10_percent/test/ice_cream: There are 0 directories and 250 files
101_food_classes_10_percent/test/risotto: There are 0 directories and 250 files
101_food_classes_10_percent/test/pork_chop: There are 0 directories and 250 files
101_food_classes_10_percent/test/cup_cakes: There are 0 directories and 250 files
101_food_classes_10_percent/test/poutine: There are 0 directories and 250 files
101_food_classes_10_percent/test/foie_gras: There are 0 directories and 250 files
101_food_classes_10_percent/test/creme_brulee: 
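
`walk_through_dir()` is also imported from `helper_functions.py`. Output of this shape can be produced with `os.walk`; the following is a sketch of that idea rather than the helper's actual source, and `summarize_dir` is a hypothetical name.

```python
import os

def summarize_dir(dir_path):
    """Return one summary line per directory found under `dir_path` (hypothetical helper)."""
    lines = []
    # os.walk visits dir_path first, then each subdirectory in turn
    for dirpath, dirnames, filenames in os.walk(dir_path):
        lines.append(
            f"{dirpath}: There are {len(dirnames)} directories and {len(filenames)} files"
        )
    return lines

# Example usage (assumes the dataset has been unzipped into the working directory):
# for line in summarize_dir("101_food_classes_10_percent"):
#     print(line)
```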

Let's use the `image_dataset_from_directory()` function to turn our images and labels into a `tf.data.Dataset`, a TensorFlow datatype which allows us to pass it directly to our model.

For the test dataset, we're going to set `shuffle=False` so we can perform repeatable evaluation and visualization on it later.

In [8]:
# Setup data inputs
import tensorflow as tf
IMG_SIZE = (224,224)

train_data = tf.keras.preprocessing.image_dataset_from_directory(train_dir,
                                                                label_mode="categorical",
                                                                image_size=IMG_SIZE)

test_data = tf.keras.preprocessing.image_dataset_from_directory(test_dir,
                                                                label_mode="categorical",
                                                                image_size=IMG_SIZE,
                                                                shuffle=False)

Found 7575 files belonging to 101 classes.
Found 25250 files belonging to 101 classes.
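
With `shuffle=False`, `image_dataset_from_directory()` yields files in sorted (alphanumeric) order, one class directory after another, so the test labels come out in a single predictable sequence. The sketch below mimics that ordering in plain Python for three of the class names listed earlier, using 250 images per class. It only illustrates the ordering and is not TensorFlow code.

```python
# Illustration of the label order produced by an unshuffled, directory-based dataset
class_names = sorted(["donuts", "beignets", "bread_pudding"])  # class indices follow sorted directory names
images_per_class = 250

# One integer label per image: all of class 0, then all of class 1, and so on
labels_in_order = [idx for idx, _ in enumerate(class_names) for _ in range(images_per_class)]

print(class_names)
# ['beignets', 'bread_pudding', 'donuts']
print(labels_in_order[:3], labels_in_order[250:253], len(labels_in_order))
# [0, 0, 0] [1, 1, 1] 750
```

Because this sequence is the same on every pass, predictions made on `test_data` can be lined up against the true labels index by index.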


## Train a big dog model with transfer learning on 10% of 101 food classes

Our food image data has been imported into TensorFlow, so it's time to model it.

To keep our experiments swift, we're going to start by using feature extraction transfer learning with a pre-trained model for a few epochs and then fine-tune for a few more epochs.

More specifically, our goal will be to see if we can beat the baseline from the original **Food101 paper** (50.76% accuracy on 101 classes) with 10% of the training data and the following modelling setup:

- A `ModelCheckpoint` callback to save our progress during training. This means we could experiment with further training later without having to train from scratch every time.
- Data augmentation built right into the model.
- A headless (no top layers) `EfficientNetB0` architecture from `tf.keras.applications` as our base model.
- A `Dense` layer with 101 output neurons (same as the number of food classes) and softmax activation as the output layer.
- Categorical crossentropy as the loss function since we're dealing with more than two classes.
- The Adam optimizer with the default settings.
- Fitting for 5 full passes on the training data while evaluating on 15% of the test data.
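
To make the last point concrete: `image_dataset_from_directory()` batches images 32 at a time by default, so "evaluating on 15% of the test data" comes down to validating on a fraction of the test batches each epoch. The arithmetic below is a sketch of one way to get that number (suitable for passing as `validation_steps` to `fit()`), using the file counts reported when the data was loaded. The variable names are illustrative.

```python
import math

BATCH_SIZE = 32  # default batch_size of image_dataset_from_directory()
num_train_images = 7575
num_test_images = 25250

train_batches = math.ceil(num_train_images / BATCH_SIZE)  # batches seen per training epoch
test_batches = math.ceil(num_test_images / BATCH_SIZE)    # equals len(test_data)
validation_steps = int(0.15 * test_batches)               # 15% of the test batches

print(train_batches, test_batches, validation_steps)
# 237 790 118
```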

It seems like a lot but these are all things we've covered before in the **Transfer Learning in TensorFlow Part 2: Fine-tuning notebook**.

Let's start by creating the `ModelCheckpoint` callback.

In [9]:
# Create a ModelCheckpoint callback
checkpoint_path = "101_classes_10_percent_data_model_checkpoint"

checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(checkpoint_path,
                                                         save_weights_only=True,
                                                         monitor='val_accuracy',
                                                         save_best_only=True)
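
With `save_best_only=True` and `monitor='val_accuracy'`, a checkpoint is only written on epochs where the monitored metric improves on the best value seen so far. The plain-Python sketch below imitates that rule on made-up validation accuracies. It illustrates the behaviour and is not Keras's implementation; both the function name and the numbers are hypothetical.

```python
def epochs_that_save(val_accuracies):
    """Return the (1-indexed) epochs on which a best-only checkpoint would be written."""
    best = float("-inf")
    saved_epochs = []
    for epoch, val_acc in enumerate(val_accuracies, start=1):
        if val_acc > best:  # strictly better than anything seen so far
            best = val_acc
            saved_epochs.append(epoch)
    return saved_epochs

# Made-up validation accuracies for five epochs
print(epochs_that_save([0.40, 0.48, 0.47, 0.52, 0.51]))  # [1, 2, 4]
```

Under this rule, the weights stored at `checkpoint_path` after training are those from the best epoch, which is not necessarily the last one.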

In [17]:
# Create a data augmentation layer
from tensorflow.keras import layers
from tensorflow.keras.layers.experimental import preprocessing
from tensorflow.keras.models import Sequential

# Setup data augmentation
data_augmentation = Sequential([
    preprocessing.RandomFlip("horizontal"),
    preprocessing.RandomHeight(0.2),
    preprocessing.RandomWidth(0.2),
    preprocessing.RandomRotation(0.2),
    preprocessing.RandomZoom(0.2),
    # preprocessing.Rescaling(1/255.) # rescale image inputs to between 0 and 1 (not needed for EfficientNet models)
], name="data_augmentation_layer")