# Transfer Learning with TensorFlow Part 3: Scaling up (Food Vision mini)

We have seen the power of transfer learning with feature extraction and fine-tuning. Now it's time to scale up to all of the classes in Food101 (101 total classes of food).

Our goal is to beat the original Food101 paper using only 10% of the training data (leveraging the power of deep learning).

Original Food101 paper: https://data.vision.ee.ethz.ch/cvl/datasets_extra/food-101/static/bossard_eccv14_food-101.pdf

Our baseline to beat is 50.76% accuracy across 101 classes.

In [None]:
# Check to see if we are using a GPU
!nvidia-smi

Sun Oct 17 07:04:00 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.74       Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   33C    P8    29W / 149W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

## Creating helper functions

In previous notebooks we created a series of helper functions. Let's download them.

In [None]:
!wget https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/extras/helper_functions.py

--2021-10-17 11:36:49--  https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/extras/helper_functions.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 10246 (10K) [text/plain]
Saving to: ‘helper_functions.py’


2021-10-17 11:36:49 (64.7 MB/s) - ‘helper_functions.py’ saved [10246/10246]



In [None]:
# Import a series of helper functions for our notebook
from helper_functions import create_tensorboard_callback, plot_loss_curves, unzip_data, compare_historys, walk_through_dir

## 101 food classes: working with less data

Our goal is to beat the original Food101 paper with 10% of the original training data, so let's download it.

The data we are downloading comes from the original Food101 dataset but has been preprocessed using the image_data_mode


In [None]:
!wget https://storage.googleapis.com/ztm_tf_course/food_vision/101_food_classes_10_percent.zip
unzip_data('101_food_classes_10_percent.zip')

train_dir = '101_food_classes_10_percent/train'
test_dir = '101_food_classes_10_percent/test'

--2021-10-17 11:39:22--  https://storage.googleapis.com/ztm_tf_course/food_vision/101_food_classes_10_percent.zip
Resolving storage.googleapis.com (storage.googleapis.com)... 74.125.199.128, 74.125.142.128, 74.125.195.128, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|74.125.199.128|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1625420029 (1.5G) [application/zip]
Saving to: ‘101_food_classes_10_percent.zip’


2021-10-17 11:39:32 (153 MB/s) - ‘101_food_classes_10_percent.zip’ saved [1625420029/1625420029]



In [None]:
# How many images/classes are there?
walk_through_dir('101_food_classes_10_percent')

There are 2 directories and 0 images in '101_food_classes_10_percent'.
There are 101 directories and 0 images in '101_food_classes_10_percent/train'.
There are 0 directories and 75 images in '101_food_classes_10_percent/train/prime_rib'.
There are 0 directories and 75 images in '101_food_classes_10_percent/train/huevos_rancheros'.
There are 0 directories and 75 images in '101_food_classes_10_percent/train/beignets'.
There are 0 directories and 75 images in '101_food_classes_10_percent/train/breakfast_burrito'.
There are 0 directories and 75 images in '101_food_classes_10_percent/train/tuna_tartare'.
There are 0 directories and 75 images in '101_food_classes_10_percent/train/deviled_eggs'.
There are 0 directories and 75 images in '101_food_classes_10_percent/train/french_onion_soup'.
There are 0 directories and 75 images in '101_food_classes_10_percent/train/grilled_salmon'.
There are 0 directories and 75 images in '101_food_classes_10_percent/train/pho'.
There are 0 directories and 75 

In [None]:
# Setup data inputs
import tensorflow as tf
IMG_SIZE = (224, 224)
train_data_all_10_percent = tf.keras.preprocessing.image_dataset_from_directory(train_dir,
                                                                                label_mode='categorical',
                                                                                image_size=IMG_SIZE)

test_data = tf.keras.preprocessing.image_dataset_from_directory(test_dir,
                                                                label_mode='categorical',
                                                                image_size=IMG_SIZE,
                                                                shuffle=False) # don't shuffle test data for prediction analysis

Found 7575 files belonging to 101 classes.
Found 25250 files belonging to 101 classes.


## Train a big dog model with transfer learning on 10% of 101 food classes

Here are the steps we are going to take:
* Create a ModelCheckpoint callback
* Create a data augmentation layer to build data augmentation right into the model.
* Build a headless (no top layers) functional EfficientNetB0-backboned model (we will create our own output layer)

In [None]:
# Create checkpoint callback
checkpoint_path = '101_classes_10_percent_data_model_checkpoint'
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(checkpoint_path,
                                                         save_weights_only=True,
                                                         monitor='val_accuracy',
                                                         save_best_only=True)
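
To make the effect of `save_best_only=True` with `monitor='val_accuracy'` concrete, here is a minimal plain-Python sketch of the idea: a checkpoint is written only when the monitored metric improves. It is a simplified illustration, not Keras's actual implementation, and the accuracy values and the function name `epochs_that_would_save` are made up for demonstration.

```python
# Simplified illustration of "save best only" (NOT Keras's actual implementation).
# The validation accuracies below are made-up values for demonstration.
def epochs_that_would_save(val_accuracies):
    """Return the (1-indexed) epochs at which a checkpoint would be written."""
    best = float("-inf")
    saved_epochs = []
    for epoch, val_acc in enumerate(val_accuracies, start=1):
        if val_acc > best:  # only overwrite the checkpoint on improvement
            best = val_acc
            saved_epochs.append(epoch)
    return saved_epochs

hypothetical_val_accuracy = [0.42, 0.50, 0.49, 0.53, 0.52]
saved = epochs_that_would_save(hypothetical_val_accuracy)
print(saved)  # [1, 2, 4]
```

In this sketch the checkpoint left on disk after training would hold the weights from epoch 4, not from the final epoch.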

In [None]:
# Create data augmentation layer to incorporate it right into the model
from tensorflow.keras import layers
from tensorflow.keras.layers.experimental import preprocessing
from tensorflow.keras.models import Sequential

In [None]:
# Setup data augmentation
data_augmentation = Sequential([
  preprocessing.RandomFlip('horizontal'),
  preprocessing.RandomRotation(0.2),
  preprocessing.RandomHeight(0.2),
  preprocessing.RandomWidth(0.2),
  preprocessing.RandomZoom(0.2),
  # preprocessing.Rescaling(1/255.) # Rescale inputs of images to between 0 and 1, required for models like ResNet50
], name='data_augmentation')

In [None]:
# Setup the base model and freeze its layers (this will extract features)
base_model = tf.keras.applications.EfficientNetB0(include_top=False)
base_model.trainable = False

# Setup model architecture with trainable top layers
inputs = layers.Input(shape=(224, 224, 3), name='input_layers')
x = data_augmentation(inputs) # augment images (only happens during training phase)
x = base_model(x, training=False) # put the base model in inference mode so weights that need to stay frozen stay frozen
x = layers.GlobalAveragePooling2D(name='global_avg_pool_layer')(x)
outputs = layers.Dense(len(train_data_all_10_percent.class_names), activation='softmax', name='output_layer')(x)
model = tf.keras.Model(inputs, outputs)

Downloading data from https://storage.googleapis.com/keras-applications/efficientnetb0_notop.h5


In [None]:
# Get a summary of the model we created
model.summary()

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_layers (InputLayer)    [(None, 224, 224, 3)]     0         
_________________________________________________________________
data_augmentation (Sequentia (None, None, None, 3)     0         
_________________________________________________________________
efficientnetb0 (Functional)  (None, None, None, 1280)  4049571   
_________________________________________________________________
global_avg_pool_layer (Globa (None, 1280)              0         
_________________________________________________________________
output_layer (Dense)         (None, 101)               129381    
Total params: 4,178,952
Trainable params: 129,381
Non-trainable params: 4,049,571
_________________________________________________________________
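
The parameter counts in the summary can be checked by hand. The short calculation below is only a sanity check using the numbers printed above; the variable names are illustrative.

```python
# Reproduce the parameter counts reported by model.summary() above.
pooled_features = 1280   # output size of the global average pooling layer
num_classes = 101        # one output unit per Food101 class

# A Dense layer has one weight per (input, output) pair plus one bias per output
dense_params = pooled_features * num_classes + num_classes

base_model_params = 4_049_571  # frozen EfficientNetB0 parameters (from the summary)
total_params = base_model_params + dense_params

print(dense_params)  # 129381
print(total_params)  # 4178952
```

Only the 129,381 parameters of the output layer are trainable at this stage, which is why feature extraction trains quickly.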


In [None]:
# Compile
model.compile(loss='categorical_crossentropy',
              optimizer=tf.keras.optimizers.Adam(),
              metrics=['accuracy'])

# Fit
history_all_classes_10_percent = model.fit(train_data_all_10_percent,
                                           epochs=5,
                                           validation_data=test_data,
                                           validation_steps=int(0.15 * len(test_data)),
                                           callbacks=[checkpoint_callback])

Epoch 1/5
Epoch 2/5
Epoch 3/5


Epoch 4/5
Epoch 5/5
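
The `validation_steps=int(0.15 * len(test_data))` argument limits how much of the test set is used for validation after each epoch. The arithmetic below is a sketch that assumes `len(test_data)` is 790 batches (the count shown in the evaluation progress bar) and that the default batch size of 32 is in use.

```python
num_test_batches = 790  # assumed value of len(test_data)
batch_size = 32         # default batch_size of image_dataset_from_directory

validation_steps = int(0.15 * num_test_batches)
images_per_validation = validation_steps * batch_size

print(validation_steps)       # 118
print(images_per_validation)  # 3776
```

Because `test_data` was built with `shuffle=False`, these batches are probably the same leading slice of the test set every epoch, so the validation accuracy seen during training is only a rough estimate. The full evaluation with `model.evaluate(test_data)` is the number to trust.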


In [None]:
# Evaluate on the whole test dataset
feature_extraction_results = model.evaluate(test_data)
feature_extraction_results

 48/790 [>.............................] - ETA: 1:39 - loss: 2.1747 - accuracy: 0.4447

In [None]:
plot_loss_curves(history_all_classes_10_percent)

> **Question:** What do these curves suggest? Hint: ideally, the two curves should be very similar to each other. If they are not, it may suggest that our model is overfitting (performing too well on the training data and not generalizing to unseen data).
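
One rough way to put a number on "the curves are drifting apart" is to look at the gap between validation loss and training loss per epoch. The sketch below uses made-up loss values (not results from this notebook) and is only one possible heuristic.

```python
# Made-up loss values for illustration only (not results from this notebook).
hypothetical_train_loss = [3.4, 2.1, 1.7, 1.4, 1.2]
hypothetical_val_loss = [2.6, 2.0, 1.9, 1.9, 2.0]

# Gap between validation and training loss at each epoch
gaps = [val - train for train, val in zip(hypothetical_train_loss, hypothetical_val_loss)]

# A gap that keeps growing while training loss still falls is a typical sign of overfitting
gap_is_widening = all(later > earlier for earlier, later in zip(gaps, gaps[1:]))

print([round(g, 2) for g in gaps])  # [-0.8, -0.1, 0.2, 0.5, 0.8]
print(gap_is_widening)              # True
```

With real training runs the same lists can be read from `history.history['loss']` and `history.history['val_loss']`.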

## Fine-tuning

In [None]:
# Unfreeze all of the layers in the base model
base_model.trainable = True

# Refreeze every layer except the last 5
for layer in base_model.layers[:-5]:
  layer.trainable = False
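
The `[:-5]` slice selects every layer except the last five. The sketch below shows the same pattern on a hypothetical `FakeLayer` class (a stand-in so the logic runs without TensorFlow); it is not part of Keras.

```python
# Stand-in for Keras layers: a tiny class with just a name and a trainable flag.
# (Hypothetical; used only so the slicing logic runs without TensorFlow.)
class FakeLayer:
    def __init__(self, name):
        self.name = name
        self.trainable = True  # mimics the state after base_model.trainable = True

fake_layers = [FakeLayer(f"layer_{i}") for i in range(8)]

# Same pattern as above: freeze everything except the last 5 layers
for layer in fake_layers[:-5]:
    layer.trainable = False

trainable_names = [layer.name for layer in fake_layers if layer.trainable]
print(trainable_names)  # ['layer_3', 'layer_4', 'layer_5', 'layer_6', 'layer_7']
```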

In [None]:
# Recompile model with lower learning rate (it's typically best practice to lower the lr when fine-tuning)
model.compile(loss='categorical_crossentropy',
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001), # `lr` is a deprecated alias of `learning_rate`
              metrics=['accuracy'])

In [None]:
# What layers in model are trainable?
for layer in model.layers:
  print(layer.name, layer.trainable)

In [None]:
# Check which layers are trainable in our base model
for layer_number, layer in enumerate(model.layers[2].layers):
  print(layer_number, layer.name, layer.trainable)

In [None]:
# Fine-tune for 5 more epochs
fine_tune_epochs = 10 # model has already done 5 epochs, this is the total number of epochs we are after

# Fine-tune our model
history_all_classes_10_percent_fine_tune = model.fit(train_data_all_10_percent,
                                                     epochs=fine_tune_epochs,
                                                     validation_data=test_data,
                                                     validation_steps=int(0.15 * len(test_data)),
                                                     initial_epoch=history_all_classes_10_percent.epoch[-1])
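
It can help to trace how `initial_epoch` and `epochs` interact. The sketch below assumes the first `fit` call completed all 5 epochs, so that `history.epoch` holds the zero-based indices 0 to 4; the variable names are illustrative.

```python
initial_training_epochs = 5
fine_tune_epochs = 10  # total epoch count passed as `epochs`

# history.epoch stores zero-based epoch indices, so 5 epochs give [0, 1, 2, 3, 4]
recorded_epochs = list(range(initial_training_epochs))
initial_epoch = recorded_epochs[-1]  # 4

# fit() trains epoch indices initial_epoch .. epochs - 1
epochs_run = list(range(initial_epoch, fine_tune_epochs))
first_progress_line = f"Epoch {epochs_run[0] + 1}/{fine_tune_epochs}"

print(initial_epoch)        # 4
print(len(epochs_run))      # 6
print(first_progress_line)  # Epoch 5/10
```

Under these assumptions the fine-tuning call runs six epochs and its log starts at "Epoch 5/10". Passing `history_all_classes_10_percent.epoch[-1] + 1` instead would give exactly five additional epochs.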

In [None]:
# Evaluate on the whole test data
all_classes_10_percent_fine_tune_results = model.evaluate(test_data)
all_classes_10_percent_fine_tune_results

In [None]:
plot_loss_curves(history_all_classes_10_percent_fine_tune)

In [None]:
# Compare the histories of feature extraction model and fine-tune model
compare_historys(original_history=history_all_classes_10_percent,
                 new_history=history_all_classes_10_percent_fine_tune,
                 initial_epochs=5)

## Saving and Loading our Model

To use our model in an external application, we will need to save it and export it somewhere.

In [None]:
model.save("drive/MyDrive/tensorflow_course/101_food_classes_10_percent_saved_big_dog_model")

In [None]:
# Load and evaluate saved model
loaded_model = tf.keras.models.load_model('drive/MyDrive/tensorflow_course/101_food_classes_10_percent_saved_big_dog_model')



In [None]:
# Evaluate loaded model and compare performance to pre-saved model
loaded_model_results = loaded_model.evaluate(test_data)
loaded_model_results



[1.6223490238189697, 0.5783366560935974]

## Evaluating the performance of the big dog model across all different classes

Let's make some predictions, visualize them and then later find out which predictions were the most wrong.

In [None]:
import tensorflow as tf

In [None]:
!wget https://storage.googleapis.com/ztm_tf_course/food_vision/06_101_food_class_10_percent_saved_big_dog_model.zip

--2021-10-17 11:37:14--  https://storage.googleapis.com/ztm_tf_course/food_vision/06_101_food_class_10_percent_saved_big_dog_model.zip
Resolving storage.googleapis.com (storage.googleapis.com)... 74.125.197.128, 74.125.142.128, 74.125.195.128, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|74.125.197.128|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 46760742 (45M) [application/zip]
Saving to: ‘06_101_food_class_10_percent_saved_big_dog_model.zip’


2021-10-17 11:37:15 (144 MB/s) - ‘06_101_food_class_10_percent_saved_big_dog_model.zip’ saved [46760742/46760742]



In [None]:
unzip_data('/content/06_101_food_class_10_percent_saved_big_dog_model.zip')

In [None]:
# Load in saved model
model = tf.keras.models.load_model('/content/06_101_food_class_10_percent_saved_big_dog_model')





In [None]:
# Evaluate loaded model
results_downloaded_model = model.evaluate(test_data)
results_downloaded_model



[1.802719235420227, 0.6077623963356018]

## Making predictions with our trained model

In [None]:
# Make predictions with our model
preds_probs = model.predict(test_data, verbose=1) # set verbosity to see how long is left

In [None]:
len(test_data)

In [None]:
790 * 32
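
The product `790 * 32` is slightly larger than the number of test images because the final batch is not full. A quick check, assuming the default batch size of 32:

```python
num_batches = 790
batch_size = 32
num_test_images = 25250  # "Found 25250 files" when loading test_dir

full_batches, last_batch_size = divmod(num_test_images, batch_size)

print(num_batches * batch_size)  # 25280
print(full_batches)              # 789
print(last_batch_size)           # 2
```

So we should expect 25250 predictions (789 full batches plus one batch of 2 images), not 25280.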

In [None]:
# How many predictions are there?
len(preds_probs)

In [None]:
# What's the shape of our predictions?
preds_probs.shape

In [None]:
# Let's see what the first 10 predictions look like
preds_probs[:10]

In [None]:
# What does the first prediction probability array look like?
preds_probs[0], len(preds_probs[0]), sum(preds_probs[0])

Our model outputs a prediction probability array (with N values, where N is the number of classes) for each sample passed to the predict method.
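
The output layer uses a softmax activation, which is why each row of prediction probabilities sums to (approximately) 1. The plain-Python sketch below shows the idea on made-up scores for a 4-class problem; the real model produces 101 values per sample.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = [x - max(logits) for x in logits]  # subtract the max for numerical stability
    exps = [math.exp(x) for x in shifted]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up raw scores for a 4-class problem (the notebook's model has 101 classes)
hypothetical_logits = [1.0, 3.0, 0.5, 2.0]
probs = softmax(hypothetical_logits)

predicted_class = probs.index(max(probs))  # plain-Python equivalent of argmax

print(len(probs))            # 4
print(round(sum(probs), 6))  # 1.0
print(predicted_class)       # 1
```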

In [None]:
# We get one prediction probability per class (in our case there are 101 prediction probabilities)
print(f'Number of prediction probabilities for sample 0: {len(preds_probs[0])}')
print(f'What prediction probability sample 0 looks like: {preds_probs[0]}')
print(f'The class with the highest prediction probability by the model for sample 0: {preds_probs[0].argmax()}')

In [None]:
test_data.class_names[52]

In [None]:
# Get the predicted class of each sample
pred_classes = preds_probs.argmax(axis=1)

# How do they look?
pred_classes[:10]

In [None]:
# How many pred classes do we have?
len(pred_classes)

Now we have an array of all of our model's predictions. To evaluate them, we need to compare them to the original test dataset labels.

In [None]:
# To get our test labels we need to unravel our test_data BatchDataset
y_labels = []
for images, labels in test_data.unbatch():
  y_labels.append(labels.numpy().argmax()) # labels are one-hot, e.g. [0, 0, 0, 1, ..., 0, 0], so argmax gives the class index

y_labels[:10] # look at the first 10
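
The conversion from one-hot labels to integer class indices can be seen on a tiny example. The labels below are made up for a 4-class problem and the code uses plain Python lists rather than tensors.

```python
# Made-up one-hot labels for a 4-class problem (label_mode='categorical' produces this format)
hypothetical_one_hot_labels = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]

# The position of the 1.0 is the class index, which is what .argmax() returns
integer_labels = [row.index(max(row)) for row in hypothetical_one_hot_labels]
print(integer_labels)  # [0, 2, 1]
```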

## Evaluating our model's predictions
One way to check that our model's prediction array is in the same order as our test labels array is to find the accuracy score.
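
The sketch below shows why this works as an ordering check: when predictions and labels are aligned the accuracy is high, and when the order is scrambled it collapses. The `accuracy` helper and the data are hypothetical stand-ins written in plain Python.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction matches the label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)

# Made-up labels and predictions for illustration
hypothetical_y_labels = [0, 0, 1, 1, 2, 2, 3, 3]
hypothetical_pred_classes = [0, 0, 1, 2, 2, 2, 3, 0]

aligned = accuracy(hypothetical_y_labels, hypothetical_pred_classes)
# Same predictions in reversed order (simulating a mismatch in ordering)
misaligned = accuracy(hypothetical_y_labels, hypothetical_pred_classes[::-1])

print(aligned)     # 0.75
print(misaligned)  # 0.25
```

With the real arrays, `accuracy(y_labels, pred_classes)` should land close to the accuracy reported by `model.evaluate(test_data)` if the two are in the same order. scikit-learn's `accuracy_score(y_true, y_pred)` computes the same quantity.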