# Milestone Project 1 - Food Vision Big

## What we're going to cover

* Using TensorFlow Datasets to download and explore all of Food101
* Creating a preprocessing function for our data
* Batching and preparing datasets for modelling (making them run fast)
* Setting up mixed precision training (faster model training)
* Building and training a feature extraction model
* Fine-tuning your feature extraction model to beat the DeepFood paper
* Evaluating your model results on TensorBoard
* Evaluating your model results by making and plotting predictions


see: https://github.com/mrdbourke/tensorflow-deep-learning/blob/main/07_food_vision_milestone_project_1.ipynb

In [None]:
# check GPU
!nvidia-smi -L

## Check GPU

Google Colab offers free GPUs (thank you Google); however, not all of them are compatible with mixed precision training.

Google Colab offers:
* K80 (not compatible)
* P100 (not compatible)
* Tesla T4 (compatible)

Knowing this, in order to use mixed precision training we need access to a Tesla T4 (from within Google Colab) or, if we're using our own hardware, our GPU needs a compute capability score of 7.0+ (see here: https://developer.nvidia.com/cuda-gpus).
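
As a rough illustration of the rule above, the compatibility check comes down to comparing a GPU's compute capability against 7.0. The sketch below is plain Python: the lookup table and the function name `supports_mixed_precision` are illustrative assumptions (not part of TensorFlow or any NVIDIA tool), and the capability values are the ones listed on NVIDIA's CUDA GPUs page for these three cards.

```python
# Hypothetical helper: decide whether a GPU meets the mixed precision threshold.
# The table only covers the three Colab GPUs mentioned above.
COMPUTE_CAPABILITY = {
    "Tesla K80": (3, 7),
    "Tesla P100": (6, 0),
    "Tesla T4": (7, 5),
}

MIN_CAPABILITY = (7, 0)  # the 7.0+ threshold quoted above


def supports_mixed_precision(gpu_name):
    """Return True if the named GPU has compute capability 7.0 or higher."""
    capability = COMPUTE_CAPABILITY.get(gpu_name)
    if capability is None:
        raise KeyError(f"Unknown GPU: {gpu_name!r}; look it up on the CUDA GPUs page")
    # Tuples compare element by element, so (7, 5) >= (7, 0) is True
    return capability >= MIN_CAPABILITY


for name in COMPUTE_CAPABILITY:
    status = "compatible" if supports_mixed_precision(name) else "not compatible"
    print(f"{name}: {status}")
```

On your own hardware you would replace the table entry with the value listed for your card.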

## Helper Functions


In past modules, we've created a bunch of helper functions to do small tasks required for our notebooks. Rather than rewrite all of these, we can import a script and load them in from there. The script we've got available can be found on GitHub: https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/extras/helper_functions.py

In [24]:
# import urllib.request as ur
# uncomment this line below and run it to download helper_functions file
# ur.urlretrieve('https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/extras/helper_functions.py', filename='helper_functions.py')
from helper_functions import create_tensorboard_callback, plot_loss_curves, walk_through_dir, compare_historys

## TensorFlow Datasets


- see: https://www.tensorflow.org/datasets?hl=pt-br
- see: https://www.tensorflow.org/datasets/catalog/overview?hl=pt-br
- see: https://data.vision.ee.ethz.ch/cvl/datasets_extra/food-101/

In [1]:
import os
import tensorflow as tf
import tensorflow_datasets as tfds

print(tf.__version__)


2.9.3


In [None]:
# list all datasets available in TFDS
# dataset_list = tfds.list_builders()

In [None]:
# is our target dataset food101 in the list of TFDS datasets?
# print("food101" in dataset_list)

In [None]:
# load in the data (takes 5-6 minutes in Google Colab)
# (train_data, test_data), ds_info = tfds.load(name='food101', 
#                                              split=['train', 'validation'],
#                                              shuffle_files=True,
#                                              as_supervised=True, # data gets returned in tuple format (data, label)
#                                              with_info=True) 

What we know about our data:
* It's in `uint8` datatype
* It's comprised of tensors of different sizes (different-sized images)
* It's not scaled (pixel values are between 0 & 255)

What we know models like:

* Data in `float32` dtype (or, for mixed precision, `float16` and `float32`)
* For batches, TensorFlow likes all of the tensors within a batch to be the same size
* Scaled data (values between 0 & 1), also called normalized tensors, which generally perform better

With these points in mind, we've got a few things we can tackle with a preprocessing function. Since we're going to be using an EfficientNet pretrained model from `tf.keras.applications`, we don't need to rescale our data (these architectures have rescaling built in).

This means our function needs to:
1. Reshape our images to all the same size.
2. Convert the dtype of our image tensors from `uint8` to `float32`.

In [None]:
# make a function for preprocessing images
def preprocessing_img(image, label, img_shape=224):
    """
    Converts image datatype from 'uint8' to 'float32' and reshapes image to [img_shape, img_shape, color_channels]
    Args:
        image (uint8 tensor) required
        label (int) required
        img_shape (int) optional, target height and width
    Returns:
        (float32_image, label)
    """

    image = tf.image.resize(image, [img_shape, img_shape]) # resize target image
    return tf.cast(image, tf.float32), label

## Batch and Prepare Datasets

We're now going to make our data input pipeline run really fast. For more resources, follow the guide: https://www.tensorflow.org/guide/data_performance?hl=pt-br

In [None]:
# map preprocessing function to training data and parallelize
# train_data = train_data.map(map_func=preprocessing_img, num_parallel_calls=tf.data.AUTOTUNE)

# shuffle train_data, turn it into batches and prefetch it (load it faster)
# train_data = train_data.shuffle(buffer_size=len(train_data)).batch(batch_size=32).prefetch(buffer_size=tf.data.AUTOTUNE)

# map preprocessing function to test data, then batch and prefetch it
# test_data = test_data.map(map_func=preprocessing_img, num_parallel_calls=tf.data.AUTOTUNE).batch(32).prefetch(tf.data.AUTOTUNE).cache()

> "Hey TensorFlow, map this preprocessing function (`preprocessing_img`) across our training dataset, then shuffle a number of elements, then batch them, and finally make sure you prepare new batches (prefetch) whilst the model is looking through (finding patterns in) the current batch."
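
To make the order of operations concrete, here is a minimal plain-Python sketch of the map, shuffle and batch steps. It is not how `tf.data` is implemented: `shuffle_buffer`, `batch` and `preprocess` are hypothetical stand-ins written for illustration, and prefetching (overlapping data preparation with training) is left out because it has no simple single-threaded equivalent.

```python
import random
from itertools import islice


def preprocess(x):
    """Hypothetical stand-in for preprocessing_img: cast each element to float."""
    return float(x)


def shuffle_buffer(items, buffer_size, seed=0):
    """Yield items in pseudo-random order using a fixed-size buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        buffer.append(item)
        if len(buffer) >= buffer_size:
            # Buffer is full: emit one randomly chosen element
            yield buffer.pop(rng.randrange(len(buffer)))
    # Input exhausted: drain whatever is left in random order
    while buffer:
        yield buffer.pop(rng.randrange(len(buffer)))


def batch(items, batch_size):
    """Group consecutive items into lists of at most batch_size elements."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


# map -> shuffle -> batch, in the same order as the pipeline described above
pipeline = batch(shuffle_buffer(map(preprocess, range(10)), buffer_size=4), batch_size=3)
batches = list(pipeline)
for b in batches:
    print(b)
```

A small buffer only mixes nearby elements, which is why a larger `buffer_size` gives a more thorough shuffle at the cost of more memory.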

In [5]:
STORAGE = os.path.join('..', 'storage')
ZIP_PATH = f'{STORAGE}/zip'
TRANSFER_LEARNING_PATH = f'{STORAGE}/transfer_learning'

In [3]:
filename = '101_food_classes_10_percent.zip'
folder = filename.split('.')[0]
url = f'https://storage.googleapis.com/ztm_tf_course/food_vision/{filename}'

In [6]:
train_dir = f'{TRANSFER_LEARNING_PATH}/{folder}/train'
test_dir = f'{TRANSFER_LEARNING_PATH}/{folder}/test'

In [7]:
# setup data inputs
IMG_SIZE = (224, 224)
train_data = tf.keras.preprocessing.image_dataset_from_directory(directory=train_dir,
                                                                 label_mode='categorical', # 101 classes
                                                                 image_size=IMG_SIZE)
test_data = tf.keras.preprocessing.image_dataset_from_directory(directory=test_dir,
                                                                label_mode='categorical',
                                                                image_size=IMG_SIZE,
                                                                shuffle=False) # don't shuffle test data for prediction analysis
train_data

Found 7575 files belonging to 101 classes.
Found 25250 files belonging to 101 classes.


<BatchDataset element_spec=(TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None, 101), dtype=tf.float32, name=None))>

## Create modelling callbacks

What are callbacks?

* Callbacks are tools which can add helpful functionality to your model during training, evaluation or inference.
* Some popular callbacks include:
    * TensorBoard (`tf.keras.callbacks.TensorBoard()`) - log the performance of multiple models and then view and compare them in a visual way.
    * ModelCheckpoint (`tf.keras.callbacks.ModelCheckpoint()`) - save your model as it trains so you can stop training if needed and come back to continue where you left off.
    * EarlyStopping (`tf.keras.callbacks.EarlyStopping()`) - leave your model training for an arbitrary amount of time and have it stop training automatically once it stops improving.
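
The idea behind early stopping can be sketched in a few lines of plain Python. This is only an illustration of the patience logic, not the Keras implementation: the function name `early_stopping_epoch` is hypothetical and the validation losses are made-up numbers.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the 1-based epoch at which training would stop, or None if it never stops."""
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # New best value: remember it and reset the counter
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


val_losses = [2.1, 1.8, 1.6, 1.65, 1.7, 1.5]  # made-up validation losses
stop_epoch = early_stopping_epoch(val_losses, patience=2)
print(stop_epoch)  # 5: epochs 4 and 5 both failed to beat the best loss from epoch 3
```

Note that with `patience=2` the run stops before reaching the better loss at epoch 6, which is the usual trade-off when choosing a patience value.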

In [8]:
# set model checkpoint path
checkpoint_dir = f'{TRANSFER_LEARNING_PATH}/tensorflow_hub/milestone1_model_checkpoint_weight'
os.makedirs(checkpoint_dir, exist_ok=True)
checkpoint_path = f'{checkpoint_dir}/checkpoint.ckpt'

# create a model checkpoint callback that saves the model's weights only
model_checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                                               save_weights_only=True,
                                                               save_best_only=True,
                                                               save_freq='epoch', # save every epoch
                                                               monitor='val_accuracy', 
                                                               verbose=0) # don't print whether or not the model is being saved

## Set up mixed precision training

First and foremost, for a deep understanding of mixed precision training, check out the TensorFlow guide: https://www.tensorflow.org/guide/mixed_precision

Mixed precision uses a combination of float32 and float16 data types to speed up model training (float16 for faster computation, float32 where numerical stability matters).

see: https://www.cherryservers.com/blog/introduction-to-gpu-programming-with-cuda-and-python  
see: https://en.wikipedia.org/wiki/Half-precision_floating-point_format
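
To get a feel for what float16 gives up in exchange for speed, the sketch below rounds a few Python floats to half precision using the standard library's `struct` module (format code `"e"`). The helper name `to_float16` is an assumption made for this example; no TensorFlow is involved.

```python
import struct


def to_float16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", value))[0]


print(to_float16(0.1))     # close to, but not exactly, 0.1
print(to_float16(2049.0))  # not every integer above 2048 is representable
print(to_float16(1e-8))    # very small values underflow to zero

try:
    to_float16(70000.0)    # larger than the biggest float16 value (65504)
except OverflowError as error:
    print("overflow:", error)
```

The reduced range and precision are one reason numerically sensitive parts of a model, such as the final softmax output, are typically kept in float32.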

In [9]:
# turn on mixed precision training
from tensorflow.keras import mixed_precision

mixed_precision.set_global_policy("mixed_float16") # set global data policy to mixed precision

INFO:tensorflow:Mixed precision compatibility check (mixed_float16): OK
Your GPU will likely run quickly with dtype policy mixed_float16 as it has compute capability of at least 7.0. Your GPU: NVIDIA GeForce GTX 1650 Ti, compute capability 7.5


## Build feature extraction model

see: https://www.tensorflow.org/api_docs/python/tf/keras/losses/SparseCategoricalCrossentropy  
see: https://storage.googleapis.com/tensorflow/keras-applications/efficientnet_v2/efficientnetv2-b0_notop.h5

In [13]:
# create a base model
input_shape=(224, 224, 3)
base_model = tf.keras.applications.EfficientNetV2B0(include_top=False)
base_model.trainable = False

# create functional model
inputs = tf.keras.layers.Input(shape=input_shape, name='input_shape')
# Note: EfficientNetBX models have rescaling built-in, but if your model doesn't you can add a layer like the one below
# x = tf.keras.layers.experimental.preprocessing.Rescaling(1./255)(inputs)
x = base_model(inputs, training=False) # make sure layers which should be in inference mode only stay that way
x = tf.keras.layers.GlobalAveragePooling2D(name='global_average_pooling_2D')(x)
x = tf.keras.layers.Dense(len(test_data.class_names), name='dense_layer')(x)
outputs = tf.keras.layers.Activation('softmax', dtype=tf.float32, name='softmax_float32')(x)
model = tf.keras.Model(inputs, outputs)

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/efficientnet_v2/efficientnetv2-b0_notop.h5


In [18]:
model.summary()

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_shape (InputLayer)    [(None, 224, 224, 3)]     0         
                                                                 
 efficientnetv2-b0 (Function  (None, None, None, 1280)  5919312  
 al)                                                             
                                                                 
 global_average_pooling_2D (  (None, 1280)             0         
 GlobalAveragePooling2D)                                         
                                                                 
 dense_layer (Dense)         (None, 101)               129381    
                                                                 
 softmax_float32 (Activation  (None, 101)              0         
 )                                                               
                                                             

## Checking layer dtype policies (are we using mixed precision?)

In [21]:
# check the dtype_policies attributes of layers in the model
for layer in model.layers:
    print(f'layer: {layer.name},\ntrainable: {layer.trainable},\npolicy: {layer.dtype_policy},\ndtype: {layer.dtype}\n')

layer: input_shape,
trainable: True,
policy: <Policy "float32">,
dtype: float32

layer: efficientnetv2-b0,
trainable: False,
policy: <Policy "mixed_float16">,
dtype: float32

layer: global_average_pooling_2D,
trainable: True,
policy: <Policy "mixed_float16">,
dtype: float32

layer: dense_layer,
trainable: True,
policy: <Policy "mixed_float16">,
dtype: float32

layer: softmax_float32,
trainable: True,
policy: <Policy "float32">,
dtype: float32



Going through the above we see:
* `layer.name`: the human-readable name of a particular layer
* `layer.trainable`: is the layer trainable or not? (if `False`, the weights are frozen)
* `layer.dtype`: the data type a layer stores its variables in
* `layer.dtype_policy`: the data type policy a layer uses for its computations (under `mixed_float16`, computations run in `float16` while variables stay in `float32`)

In [22]:
# check the dtype_policy attributes of the layers inside the base model
for layer in model.layers[1].layers:
    print(f'layer: {layer.name},\ntrainable: {layer.trainable},\npolicy: {layer.dtype_policy},\ndtype: {layer.dtype}\n')

layer: input_4,
trainable: False,
policy: <Policy "float32">,
dtype: float32

layer: rescaling_3,
trainable: False,
policy: <Policy "mixed_float16">,
dtype: float32

layer: normalization_3,
trainable: False,
policy: <Policy "mixed_float16">,
dtype: float32

layer: stem_conv,
trainable: False,
policy: <Policy "mixed_float16">,
dtype: float32

layer: stem_bn,
trainable: False,
policy: <Policy "mixed_float16">,
dtype: float32

layer: stem_activation,
trainable: False,
policy: <Policy "mixed_float16">,
dtype: float32

layer: block1a_project_conv,
trainable: False,
policy: <Policy "mixed_float16">,
dtype: float32

layer: block1a_project_bn,
trainable: False,
policy: <Policy "mixed_float16">,
dtype: float32

layer: block1a_project_activation,
trainable: False,
policy: <Policy "mixed_float16">,
dtype: float32

layer: block2a_expand_conv,
trainable: False,
policy: <Policy "mixed_float16">,
dtype: float32

layer: block2a_expand_bn,
trainable: False,
policy: <Policy "mixed_float16">,
dtype: floa

## Fit the feature extraction model

If our goal is to fine-tune a pretrained model, the general order of doing things is:
1. Build a feature extraction model (train a couple of output layers with the base model layers frozen)
2. Fine-tune some of the frozen layers

In [26]:
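# sparse version (for integer labels, e.g. the TFDS pipeline above):
# model.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(),
#               optimizer=tf.keras.optimizers.Adam(),
#               metrics=['accuracy'])

# our labels are one-hot encoded (label_mode='categorical'), so use CategoricalCrossentropy
model.compile(loss=tf.keras.losses.CategoricalCrossentropy(),
              optimizer=tf.keras.optimizers.Adam(),
              metrics=['accuracy'])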

In [27]:

history_101_food_classes_feature_extraction = model.fit(train_data, # batched training dataset
                                                        epochs=3, # number of passes over the training data
                                                        steps_per_epoch=len(train_data), # one step per training batch
                                                        validation_data=test_data, # validate on the test dataset
                                                        validation_steps=int(0.15 * len(test_data)), # validate on only 15% of the test batches to save time
                                                        callbacks=[model_checkpoint_callback, 
                                                                   create_tensorboard_callback(
                                                                       dir_name=f'{TRANSFER_LEARNING_PATH}/tensorflow_hub',
                                                                       experiment_name='milestone1_feature_extraction')]) # save best weights and log to TensorBoard

Saving TensorBoard log files to: ..\storage/transfer_learning/tensorflow_hub/milestone1_feature_extraction/20240406-112028
Epoch 1/3
Epoch 2/3
Epoch 3/3
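
For a sense of scale, the step counts used in the `fit()` call can be worked out by hand. The sketch below assumes the default batch size of 32 for `image_dataset_from_directory` and uses the file counts reported when the datasets were created; the variable names are illustrative.

```python
import math

BATCH_SIZE = 32           # default batch size of image_dataset_from_directory
NUM_TRAIN_IMAGES = 7575   # from the "Found ... files" output above
NUM_TEST_IMAGES = 25250

# len(dataset) counts batches, with a smaller final batch if needed
train_batches = math.ceil(NUM_TRAIN_IMAGES / BATCH_SIZE)
test_batches = math.ceil(NUM_TEST_IMAGES / BATCH_SIZE)

# same expression as validation_steps in the fit() call
validation_steps = int(0.15 * test_batches)
images_validated = validation_steps * BATCH_SIZE

print(train_batches)     # 237 steps per epoch
print(test_batches)      # 790 test batches in total
print(validation_steps)  # 118 validation steps per epoch
print(images_validated)  # 3776 test images seen during each validation pass
```

Because `test_data` is built with `shuffle=False`, those validation batches are likely drawn from the first classes in directory order, so the validation accuracy shown during training is only a rough signal; the full `model.evaluate(test_data)` call gives the more reliable number.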


In [28]:
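# evaluate model on the whole test data
feature_extraction_result = model.evaluate(test_data)
feature_extraction_result

# underfitting - when the model performs poorly on both training and validation data (it hasn't learned the patterns)
# overfitting - when the model performs much better on training data than on validation data (it has memorised the training set)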



[1.579253911972046, 0.6041584014892578]
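
The two numbers returned are the loss and the accuracy, and they are on different scales, so they are best compared against a baseline rather than against each other. As a hedged reference point, the sketch below computes the categorical cross-entropy and accuracy a model would get by guessing uniformly across all 101 classes. The function is a plain-Python version of the standard formula written for this example, not the Keras loss class.

```python
import math

NUM_CLASSES = 101


def categorical_crossentropy(true_one_hot, predicted_probs):
    """Cross-entropy for a single example: -sum(y_true * log(y_pred))."""
    return -sum(t * math.log(p) for t, p in zip(true_one_hot, predicted_probs) if t > 0)


# One-hot label for an arbitrary class (index 3 is just an example)
true_label = [0.0] * NUM_CLASSES
true_label[3] = 1.0

# A model that knows nothing assigns equal probability to every class
uniform_guess = [1.0 / NUM_CLASSES] * NUM_CLASSES

baseline_loss = categorical_crossentropy(true_label, uniform_guess)
baseline_accuracy = 1.0 / NUM_CLASSES

print(f"baseline loss: {baseline_loss:.4f}")          # ln(101), roughly 4.62
print(f"baseline accuracy: {baseline_accuracy:.4f}")  # roughly 0.0099
```

Against that baseline, a loss of about 1.58 and an accuracy of about 0.60 after three epochs show the feature extraction model has learned a good deal, with room left for fine-tuning.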