<a href="https://colab.research.google.com/github/LivingstonTardzenyuy/Deep-Learning-with-TensorFlow/blob/main/07_milestone_project_1_Food_vision.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Milestone project 1: Food Vision Big.

## Check GPU

* Google Colab offers free GPUs; however, not all of them are compatible with mixed precision training. One that is compatible is the Tesla T4.

Hence we will use a Tesla T4 in Google Colab.

In [1]:
!nvidia-smi -L

GPU 0: Tesla T4 (UUID: GPU-97b0d480-cb30-89e5-3804-365e09d69304)
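
The compatibility check can also be done in code. TensorFlow's mixed precision guide states that the speedup is expected on NVIDIA GPUs with compute capability 7.0 or higher. The sketch below assumes that threshold; the helper names `supports_fast_mixed_precision` and `report_gpus` are hypothetical and not part of TensorFlow or of the helper script used later.

```python
def supports_fast_mixed_precision(compute_capability):
    """Return True if a (major, minor) compute capability is 7.0 or higher."""
    return tuple(compute_capability) >= (7, 0)


def report_gpus():
    """Print each GPU visible to TensorFlow with its compute capability."""
    import tensorflow as tf  # imported lazily so the helper above has no dependencies

    gpus = tf.config.list_physical_devices("GPU")
    if not gpus:
        print("No GPU visible to TensorFlow.")
    for gpu in gpus:
        details = tf.config.experimental.get_device_details(gpu)
        name = details.get("device_name", gpu.name)
        capability = details.get("compute_capability")
        if capability is None:
            print(f"{name}: compute capability unknown")
        else:
            ok = supports_fast_mixed_precision(capability)
            print(f"{name}: compute capability {capability}, fast mixed precision: {ok}")


# Published compute capabilities: Tesla T4 is 7.5, Tesla K80 is 3.7.
print(supports_fast_mixed_precision((7, 5)))  # True
print(supports_fast_mixed_precision((3, 7)))  # False
```

Calling `report_gpus()` in a Colab runtime prints one line per GPU, which avoids having to look up the card name from `nvidia-smi` by hand.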


### Set up mixed precision

In [2]:
import tensorflow as tf

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras import mixed_precision



policy = mixed_precision.Policy('mixed_float16')
mixed_precision.set_global_policy(policy)

print('Compute dtype: %s' % policy.compute_dtype)
print('Variable dtype: %s' % policy.variable_dtype)

Compute dtype: float16
Variable dtype: float32
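
The split between the two dtypes printed above is deliberate: computations run in float16 for speed, while variables stay in float32 for numeric stability. A minimal NumPy sketch of why that matters follows. It is an illustration with a hypothetical update size, not TensorFlow's internal mechanism.

```python
import numpy as np

update = 1e-4  # a small, hypothetical weight update

# Near 1.0, neighbouring float16 values are about 0.001 apart, so the update is rounded away.
w16 = np.float16(1.0) + np.float16(update)
# float32 has enough precision to keep the update.
w32 = np.float32(1.0) + np.float32(update)

print(w16 == np.float16(1.0))  # True: the update vanished
print(w32 == np.float32(1.0))  # False: the update survived
```

If the weights themselves were stored in float16, many small gradient steps like this one would have no effect at all.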


## Getting helper functions

We will reuse the helper functions we created in previous modules.
The script is available here: https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/refs/heads/main/extras/helper_functions.py

In [3]:
# Download helper functions script.
!wget https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/refs/heads/main/extras/helper_functions.py

--2024-12-31 20:37:58--  https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/refs/heads/main/extras/helper_functions.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 10246 (10K) [text/plain]
Saving to: ‘helper_functions.py’


2024-12-31 20:37:58 (18.1 MB/s) - ‘helper_functions.py’ saved [10246/10246]



In [4]:
# Import a series of helper functions for the notebook.
from helper_functions import create_tensorboard_callback, plot_loss_curves, compare_historys

### Use TensorFlow Datasets (TFDS) to download data

In [5]:
# Get TensorFlow Datasets.
import tensorflow_datasets as tfds

In [6]:
# List all available datasets in TFDS and check that Food101 is among them.
datasets_list = tfds.list_builders()
print("food101" in datasets_list)

True


In [None]:
# Load in the data (takes about 7 minutes)
(train_data, test_data), ds_info = tfds.load(name="food101",
                                             split=["train", "validation"],
                                             shuffle_files=True,
                                             as_supervised=True,  # return (image, label) tuples
                                             with_info=True)      # also return dataset metadata (ds_info)

Downloading and preparing dataset 4.65 GiB (download: 4.65 GiB, generated: Unknown size, total: 4.65 GiB) to /root/tensorflow_datasets/food101/2.0.0...



In [None]:
train_data

In [None]:
# Features of Food101 from TFDS

ds_info.features

In [None]:
# Get the class names.
class_names = ds_info.features['label'].names
class_names[:10]

## Exploring the Food101 data from TensorFlow Datasets

To become one with the data, we want to find:
* Class names
* The shape of our input data
* The datatype of our input data
* What the labels look like (e.g. are they one-hot encoded or label encoded?)
* Whether the labels match up with the class names
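
The difference between the two label formats in the list above can be illustrated with a small NumPy sketch. The label values and the reduced class count are hypothetical and chosen only to keep the output readable. A label-encoded target is a single integer class index per sample, while a one-hot target is a vector with one entry per class.

```python
import numpy as np

num_classes = 5                   # Food101 has 101 classes; 5 keeps the output readable
label_encoded = np.array([3, 0])  # hypothetical labels: one integer class index per sample

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(num_classes, dtype=np.int64)[label_encoded]

print(label_encoded.shape)  # (2,)
print(one_hot.shape)        # (2, 5)
print(one_hot)
```

Which format the data uses decides the loss function later on: integer labels pair with sparse categorical cross-entropy, one-hot labels with categorical cross-entropy.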

In [None]:
# Take one sample of our train data.
train_one_sample = train_data.take(1)
train_one_sample

In [None]:
# Output info about our training sample.
for image, label in train_one_sample:
  print(f"""
    The shape of our image tensor is: {image.shape}
    The datatype of our image tensor is: {image.dtype}
    The shape of our label tensor is: {label.shape}
    The datatype of our label tensor is: {label.dtype}
    Target class from Food101 (tensor form): {label}
    Class name (str form): {class_names[label.numpy()]}
  """)

In [None]:
# What does our image tensor from TFDS's Food101 look like?
image

In [None]:
# What are the min and max values of our image tensor?
tf.reduce_min(image), tf.reduce_max(image)

### Plot an image from TensorFlow Datasets

In [None]:
import matplotlib.pyplot as plt
plt.imshow(image)
plt.title(class_names[label.numpy()])  # Add title to image to verify the label is associated with right image.
plt.axis(False);

## Create preprocessing functions for our data

Neural networks perform best when the data is in a certain format (e.g. batched, normalized, etc.). Our raw data is not in that format yet, so we need a preprocessing function.

What we know about our data:

* It is in 'uint8' datatype
* It is comprised of tensors of different sizes (different sized images)
* It is not scaled (the pixel values are between 0 and 255)

What we know models like:
* Data in 'float32' dtype (or for mixed precision 'float16' and 'float32')
* For batches, TensorFlow likes all of the tensors within a batch to be of the same size
* Scaled (values between 0 & 1), also called normalized, tensors generally perform better
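
The same-size requirement for batches can be seen without TensorFlow. In the NumPy sketch below, the two zero-filled arrays are hypothetical stand-ins for photos of different heights. Stacking them into one batch tensor fails until both have the same shape. Cropping is used here only for brevity; the notebook itself resizes.

```python
import numpy as np

# Two hypothetical images with different heights, as in a dataset of mixed-size photos.
image_a = np.zeros((512, 512, 3), dtype=np.uint8)
image_b = np.zeros((384, 512, 3), dtype=np.uint8)

try:
    np.stack([image_a, image_b])
    stacked_ok = True
except ValueError as error:
    stacked_ok = False
    print(f"Cannot batch mixed sizes: {error}")

# After bringing both to the same size, batching works.
same_size = [image[:384, :512] for image in (image_a, image_b)]
batch = np.stack(same_size)
print(batch.shape)  # (2, 384, 512, 3)
```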

With these points in mind, our preprocessing function has a few things to tackle.

Since we're going to be using an EfficientNetBX pretrained model from tf.keras.applications, we don't need to rescale our data (these architectures have rescaling built in).

Hence our function will:
1. Resize our images so they are all the same size.
2. Convert the dtype of our image tensors from uint8 to float32.

In [None]:
# Make a function for preprocessing images
def preprocess_image(image, label, image_size=224):
  """
    Resizes image to 'image_size' x 'image_size' and converts
    its datatype from 'uint8' -> 'float32'.
  """
  image = tf.image.resize(image, [image_size, image_size])
  # image = image/255.  # Scaling pixel values to [0, 1] is not required for EfficientNetBX since rescaling is built in.
  return tf.cast(image, tf.float32), label

In [None]:
# Preprocess a single image and check the output
preprocess_image(image, label)
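
The two things to check in that output are the shape and the dtype. The sketch below runs the same resize step on a synthetic image. The random tensor is a hypothetical stand-in for a Food101 sample. With the default bilinear method, `tf.image.resize` already returns float32, so the `tf.cast` in `preprocess_image` acts as a safeguard rather than doing the conversion itself.

```python
import tensorflow as tf

# A synthetic uint8 image standing in for a Food101 sample (values are random).
fake_image = tf.random.uniform([300, 400, 3], minval=0, maxval=256, dtype=tf.int32)
fake_image = tf.cast(fake_image, tf.uint8)

# Same resize call as in preprocess_image, with the default bilinear method.
resized = tf.image.resize(fake_image, [224, 224])

print(resized.shape)  # (224, 224, 3)
print(resized.dtype)  # <dtype: 'float32'>
```

The pixel values are still on the 0 to 255 scale after resizing, which is what the EfficientNet models in `tf.keras.applications` expect.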

## Batch and prepare datasets

We're now going to make our data input pipeline run really fast. For more info, see: https://www.tensorflow.org/guide/data_performance

In [21]:
# Run this cell only once: re-running it batches the already batched data and adds an extra dimension.
# Map preprocessing function to training data (and parallelize)
train_data = train_data.map(map_func=preprocess_image, num_parallel_calls=tf.data.AUTOTUNE)

# Shuffle train_data, turn it into batches and prefetch it (load it faster)
train_data = train_data.shuffle(buffer_size=1000).batch(batch_size=32).prefetch(buffer_size=tf.data.AUTOTUNE)

# Map preprocessing function to test data, then batch and prefetch it (no need to shuffle).
# Prefetching prepares the next batches on the CPU while the model works on the current batch.
test_data = test_data.map(preprocess_image, num_parallel_calls=tf.data.AUTOTUNE).batch(batch_size=32).prefetch(buffer_size=tf.data.AUTOTUNE)

train_data, test_data

(<_PrefetchDataset element_spec=(TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.int64, name=None))>,
 <_PrefetchDataset element_spec=(TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.int64, name=None))>)

In [20]:
train_data

<_PrefetchDataset element_spec=(TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.int64, name=None))>
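
The element spec is a quick way to confirm that the pipeline was applied exactly once: each image batch should have shape `(None, 224, 224, 3)`. As a sketch on toy data, applying shuffle, batch and prefetch a second time to an already batched dataset adds an extra leading dimension. The tensors below are hypothetical stand-ins, not Food101.

```python
import tensorflow as tf

# Toy stand-in for the image dataset: 10 "images" of shape (4, 4, 3) with integer labels.
images = tf.zeros([10, 4, 4, 3], dtype=tf.float32)
labels = tf.range(10, dtype=tf.int64)
toy_data = tf.data.Dataset.from_tensor_slices((images, labels))

# Applied once: a single (unknown-size) batch dimension in front of the image shape.
batched_once = toy_data.shuffle(buffer_size=10).batch(batch_size=4).prefetch(tf.data.AUTOTUNE)
print(batched_once.element_spec[0].shape)   # (None, 4, 4, 3)

# Applied again to the already batched dataset: a second batch dimension appears.
batched_twice = batched_once.shuffle(buffer_size=10).batch(batch_size=4).prefetch(tf.data.AUTOTUNE)
print(batched_twice.element_spec[0].shape)  # (None, None, 4, 4, 3)
```

If the spec ever shows two leading `None` dimensions, reloading the dataset and running the batching cell once restores the expected shape.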