# Food 101 Challenge

Author: Felipe C. de Pauli
Date: 20/11/2023

This challenge was completed in four steps:

1. Walk through the data and build a binary classifier for the pizza and steak classes;
2. Build a multiclass classifier with 10 classes and tune its hyperparameters;
3. Create the final notebook with all 101 classes and test its execution;
4. Create the final Python program to run on a computer with a GPU.

# Part 1: Binary Classification

Let's build a convolutional neural network (CNN) to find patterns in our images. More specifically, we need to:

1. Load and learn about our images
2. Preprocess the images
3. Build a CNN to find patterns in our images
4. Compile our CNN
5. Fit the CNN to our training data
6. Evaluate the model and restart the process

In [None]:
import numpy as np
import os
import tensorflow as tf

Let's check whether a GPU is available (if you are using Colab, you have to enable the GPU in the notebook settings).

In [None]:
print("Available GPUs: ", tf.config.list_physical_devices('GPU'))

At the moment, I'm using Colab in an environment with a GPU, so it shows up in the list above. For now, we have a GPU!

# 1. Load and learn about our images

## Get the data

First of all, we have to get the data and prepare it.

P.S.: I found another place to get the data without needing an API key.

In [None]:
# Get the data from ZTM (a deep learning course platform)
import zipfile

!wget https://storage.googleapis.com/ztm_tf_course/food_vision/pizza_steak.zip

# Extract the archive; the context manager closes the file for us
with zipfile.ZipFile("pizza_steak.zip") as zip_fd:
    zip_fd.extractall()

## Inspect the Data (become one with it)

In [None]:
!ls pizza_steak

In [None]:
!ls pizza_steak/train

In [None]:
!ls pizza_steak/test/pizza

Inside the pizza_steak directory, we have the Food-101 images of pizzas and steaks. We will begin with this simpler case.

We have the following image directories:
* pizza_steak/train/pizza
* pizza_steak/test/pizza
* pizza_steak/train/steak
* pizza_steak/test/steak

In [None]:
import os

# Walk through pizza_steak and count the files in every directory that contains any
for dirpath, dirnames, filenames in os.walk("pizza_steak"):
    if len(filenames) == 0:
        continue
    print(">>", dirpath)
    print("   images: ", len(filenames))

We need the class names. That is trivial with two classes, but getting them programmatically will be very useful once we work with all 101.

In [None]:
import pathlib
import numpy as np

# Generate the Path object data_dir
data_dir = pathlib.Path("pizza_steak/train")

# Get the name of each entry inside this directory (one sub-directory per class)
classes = np.array(sorted([item.name for item in data_dir.glob("*")]))
print(classes)
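
With 101 classes, the training folder is more likely to pick up stray files (for example a hidden `.DS_Store`), and `glob("*")` would list those as classes too. The sketch below is one possible way to guard against that by keeping directories only. The helper name `list_class_names` and the throwaway directory tree are assumptions made for illustration, not part of the original notebook.

```python
import pathlib
import tempfile

def list_class_names(train_dir):
    """Return the sorted names of the sub-directories of train_dir."""
    root = pathlib.Path(train_dir)
    # Keep only directories so stray files are not treated as classes
    return sorted(item.name for item in root.iterdir() if item.is_dir())

# Build a throwaway directory tree that mimics pizza_steak/train
with tempfile.TemporaryDirectory() as demo_root:
    for name in ("steak", "pizza"):
        (pathlib.Path(demo_root) / name).mkdir()
    (pathlib.Path(demo_root) / ".DS_Store").touch()  # stray file, should be ignored
    demo_classes = list_class_names(demo_root)

print(demo_classes)
```

On the real data, the same helper could be called as list_class_names("pizza_steak/train").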

In [None]:
# Let's visualize our images
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import random

def view_random_image(target_dir, target_class):
  # Setup the target directory (we'll view images from here)
  target_folder = target_dir+target_class

  # Get a random image path
  random_image = random.sample(os.listdir(target_folder), 1)
  print(random_image)

  # Read in the image and plot it using matplotlib
  img = mpimg.imread(target_folder + "/" + random_image[0])
  plt.imshow(img)
  plt.title(target_class)
  plt.axis("off");

  print(f"Image shape: {img.shape}") # show the shape of the image

  return img

In [None]:
# View a random image from the training dataset
img = view_random_image(target_dir="pizza_steak/train/",
                        target_class="pizza")

In [None]:
# OK. We've used mpimg to read an image from disk and plotted it with matplotlib.
# Matplotlib plots RGB images, so it expects an array of shape
# (rows, cols, channels), i.e. (height, width, colour channels).
print(type(img))
print(img.shape)

# To work with TensorFlow, we need a tensor: an n-dimensional array of numbers.
# A vector is a 1-D tensor and a matrix is a 2-D tensor; a colour image needs
# a 3-D tensor (height, width, channels).
# Let's turn img into a tensor.
tf.constant(img)

The image is a large tensor with 3 channels per pixel. The number of rows and columns varies from image to image, which is a problem because a neural network expects inputs of a fixed size.

In [None]:
img.shape

# 2. Preprocess images

In [None]:
train_data_path = "./pizza_steak/train"
test_data_path  = "./pizza_steak/test"

In [None]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_data_gen = ImageDataGenerator(rescale=1/255.)
test_data_gen  = ImageDataGenerator(rescale=1/255.)

train_data = train_data_gen.flow_from_directory(
    directory   = train_data_path,
    batch_size  = 32,
    target_size = (224, 224),
    seed        = 42,
    class_mode  = "binary"
)

test_data  = test_data_gen.flow_from_directory(
    directory   = test_data_path,
    batch_size  = 32,
    target_size = (224, 224),
    seed        = 42,
    class_mode  = "binary"
)

# 3. Build a CNN to find patterns in our images

The train_data and test_data generators are ready to feed the CNN. Now let's create it!

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dense, Flatten

In [None]:
# Create the architecture
model_1 = Sequential([
    Conv2D(
        filters     = 10,
        kernel_size = (3, 3),
        activation  = "relu",
        input_shape = (224, 224, 3)
    ),
    Conv2D(10, 3, activation="relu"),
    MaxPool2D(
        pool_size   = 2,
        padding     = "valid"
    ),

    Conv2D(10, 3, activation="relu"),
    Conv2D(10, 3, activation="relu"),
    MaxPool2D(),

    Flatten(),

    Dense(1, activation = "sigmoid")
])
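
To see how the image shrinks as it moves through this architecture, the sketch below redoes the shape arithmetic by hand in plain Python. It assumes "valid" padding and a stride of 1 for the convolutions, and a pool size of 2 for the pooling layers, as in the cell above; the helper names `conv_out` and `pool_out` are made up for this illustration.

```python
def conv_out(size, kernel=3, stride=1):
    """Output side length of a 'valid' convolution."""
    return (size - kernel) // stride + 1

def pool_out(size, pool=2):
    """Output side length of a 'valid' max-pool whose stride equals the pool size."""
    return (size - pool) // pool + 1

side = 224
shapes = []
for layer in ["conv", "conv", "pool", "conv", "conv", "pool"]:
    side = conv_out(side) if layer == "conv" else pool_out(side)
    shapes.append((layer, side))

filters = 10
flattened = side * side * filters  # number of values that reach the Dense layer

# Trainable parameters per conv layer: (kernel_h * kernel_w * in_channels + 1 bias) * filters
conv_params = (3 * 3 * 3 + 1) * filters + 3 * (3 * 3 * filters + 1) * filters
dense_params = flattened + 1  # one weight per input plus one bias
total_params = conv_params + dense_params

print(shapes)
print(flattened, total_params)
```

The totals can be compared with what model_1.summary() reports.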

# 4. Compile our CNN

In [None]:
# Compile it with important hyperparameters
model_1.compile(
    optimizer   = tf.keras.optimizers.Adam(),
    metrics     = ["accuracy"],
    loss        = "binary_crossentropy"
)
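
As a rough illustration of what the binary cross-entropy loss measures, here is a plain-Python sketch of the formula. It is not Keras's implementation; the function name and the clipping constant `eps` are choices made for this example only.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over paired labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Labels: 1 = steak, 0 = pizza. The probabilities are made-up model outputs.
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])

print(confident_right, confident_wrong)
```

Confident correct predictions give a small loss, while confident mistakes are penalised heavily, which is what pushes the sigmoid output towards the right class during training.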

# 5. Fit the CNN to our training data

In [None]:
# Time to train
history_1 = model_1.fit(
    train_data,
    epochs           = 10,
    steps_per_epoch  = len(train_data),
    validation_data  = test_data,
    validation_steps = len(test_data)
)

The accuracy is good on both the training and the test sets.

# 6. Evaluate the model

In [None]:
# Create a function to import an image and resize it so it can be used with our model
def load_and_prep_image(filename=None, url=None, img_shape=224):
  """
  Reads an image from filename (or downloads it from url), turns it into a
  tensor and resizes it to (img_shape, img_shape, colour_channels).
  """

  if url is not None:
    filename = tf.keras.utils.get_file(origin=url, fname=url.split('/')[-1], cache_dir='.', cache_subdir='')

  # Read in the image (we won't plot it, so let's use TensorFlow instead of matplotlib)
  img = tf.io.read_file(filename)

  # Decode the file into a 3-channel tensor (expand_animations=False keeps GIFs 3-D)
  img = tf.image.decode_image(img, channels=3, expand_animations=False)

  # Resize the image (to the shape of the CNN's input layer)
  img = tf.image.resize(img, size=[img_shape, img_shape])

  # Rescale the image (get all values between 0 and 1)
  img = img/255.
  return img

In [None]:
img = load_and_prep_image("pizza_steak/test/pizza/1001116.jpg")
img

In [None]:
pred = model_1.predict(tf.expand_dims(img, axis=0))
pred

Zero is pizza and one is steak: flow_from_directory assigns the class indices in alphabetical order of the folder names.
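
The mapping from the sigmoid output to a class name can be sketched in plain Python. The helpers `build_class_indices` and `label_from_probability` below are hypothetical and only mimic the alphabetical ordering; in the notebook itself the actual mapping can be read from train_data.class_indices.

```python
def build_class_indices(class_names):
    """Mimic the alphabetical name -> index mapping of the class folders."""
    return {name: index for index, name in enumerate(sorted(class_names))}

def label_from_probability(prob, class_indices, threshold=0.5):
    """Turn a sigmoid output into a class name using a fixed threshold."""
    index_to_name = {index: name for name, index in class_indices.items()}
    return index_to_name[int(prob > threshold)]

class_indices = build_class_indices(["steak", "pizza"])
print(class_indices)

# Made-up sigmoid outputs, for illustration only
print(label_from_probability(0.93, class_indices))
print(label_from_probability(0.08, class_indices))
```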

In [None]:
def predict_steak_or_pizza(img_path=None, url=None):
    img = load_and_prep_image(img_path, url)
    pred = model_1.predict(tf.expand_dims(img, axis=0))
    if pred > 0.5:
        print("You got a steak")
    else:
        print("You got a pizza")

In [None]:
predict_steak_or_pizza("pizza_steak/test/pizza/1032754.jpg")
predict_steak_or_pizza("pizza_steak/test/pizza/103708.jpg")
predict_steak_or_pizza("pizza_steak/test/pizza/1060407.jpg")
predict_steak_or_pizza("pizza_steak/test/pizza/121960.jpg")
predict_steak_or_pizza("pizza_steak/test/pizza/138961.jpg")



In [None]:
predict_steak_or_pizza("pizza_steak/test/steak/100274.jpg")
predict_steak_or_pizza("pizza_steak/test/steak/1012080.jpg")
predict_steak_or_pizza("pizza_steak/test/steak/108310.jpg")
predict_steak_or_pizza("pizza_steak/test/steak/13023.jpg")
predict_steak_or_pizza("pizza_steak/test/steak/13719.jpg")

This looks very good! Shall we try it with our own images?

In [None]:
# Only pizzas
predict_steak_or_pizza(url="https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcSYc4FDO-ZSCqCtWfpb7AX4RBYUvWXwGd1_aFAEjoyODVmMv0syOjNFoHSy0g6j5uU7Jes&usqp=CAU")
predict_steak_or_pizza(url="https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcRnZMTxbiq-6Rk6w5wajajLa3eSApBkTHioMobQ54DBz_cnQliOe3OXYc_5dQof7qLZn3Q&usqp=CAU")
predict_steak_or_pizza(url="https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcT4hrnTXz5Dr4aVHN6xkb3hg85Q5F4z5Nxiboi8o176skOSZjlTHh99NkaDt8e-SqznwCs&usqp=CAU")
predict_steak_or_pizza(url="https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQLxpXTxE5QO4Y5S6FHQYiQ6-uf1Qwe6FHb28YDggeuamrPOUsIdP2Nt1OlY6sCZgJSYFI&usqp=CAU")
predict_steak_or_pizza(url="https://media-cdn.tripadvisor.com/media/photo-s/17/98/96/31/photo0jpg.jpg")
predict_steak_or_pizza(url="https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTyo8-Fg7Pa_8RQdKuLrWY6A5MQDIQgQxPuVTzA4Po8On3rMWl9I9NOY24WLIpHMOqUyss&usqp=CAU")

Well... this is not as good as it could be. We got 3 right predictions and 3 mistakes, only 50%! Maybe our model is not generalizing very well.

# Binary Classification, a new beginning

1. Get the data (we have already gotten it)
2. Inspect the data (visualize, visualize, visualize - become one with it)
3. Preprocess the data (prepare it for our model)
4. Create a model
5. Fit the model
6. Evaluate the model
7. Adjust different parameters and improve the model
8. Repeat until satisfied

## 1. Become one with the data

We already have a function to show the images, view_random_image. As we have two classes, let's show both in a single figure.

In [None]:
plt.figure()
plt.subplot(1, 2, 1)
dump = view_random_image("pizza_steak/train/", "steak")
plt.subplot(1, 2, 2)
dump = view_random_image("pizza_steak/train/", "pizza")

## 2. Preprocess the data (prepare it for a model)

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_data_gen = ImageDataGenerator(rescale=1/255.)
test_data_gen  = ImageDataGenerator(rescale=1/255.)

train_data = train_data_gen.flow_from_directory(
    directory       = train_data_path,
    batch_size      = 32,
    target_size     = (224, 224),
    seed            = 42,
    class_mode      = "binary"
)

test_data  = test_data_gen.flow_from_directory(
    directory       = test_data_path,
    batch_size      = 1,
    target_size     = (224, 224),
    seed            = 42,
    class_mode      = "binary"
)

The train_data and test_data objects are generators, so to get a batch of images (32 for training and 1 for testing) we call next() on them.
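
The batching behaviour can be illustrated with an ordinary Python generator. This is a simplified stand-in, not the Keras iterator itself: the name `batch_generator` and the count of 100 fake images are assumptions for the example.

```python
import math

def batch_generator(items, batch_size):
    """Yield consecutive batches forever, wrapping around at the end of the data."""
    while True:
        for start in range(0, len(items), batch_size):
            yield items[start:start + batch_size]

fake_images = list(range(100))  # stand-ins for 100 images (hypothetical count)
train_batches = batch_generator(fake_images, batch_size=32)
test_batches = batch_generator(fake_images, batch_size=1)

first_train_batch = next(train_batches)
first_test_batch = next(test_batches)

# Number of batches needed to see every image once; the last batch may be smaller
steps_per_epoch = math.ceil(len(fake_images) / 32)

print(len(first_train_batch), len(first_test_batch), steps_per_epoch)
```

This is also why len(train_data) is passed as steps_per_epoch: it is the number of batches needed to cover the data once.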

## 3. Create a CNN model (start with a baseline)

In [None]:
model_2 = Sequential([
    Conv2D(
        filters     = 10,
        kernel_size = (3,3),
        strides     = (1, 1),
        padding     = "valid",
        input_shape = (224, 224, 3),
        activation  = "relu"
    ),
    Conv2D(10, 3, 1, padding="valid", activation="relu"),
    MaxPool2D(),

    Conv2D(10, 3, 1, padding="valid", activation="relu"),
    Conv2D(10, 3, 1, padding="valid", activation="relu"),
    MaxPool2D(),

    Flatten(),

    Dense(1, activation="sigmoid")
])

In [None]:
model_2.compile(
    optimizer = tf.keras.optimizers.Adam(),
    metrics = ["accuracy"],
    loss = "binary_crossentropy"
)

In [None]:
model_2.summary()

In [None]:
history_2 = model_2.fit(
    train_data,
    epochs           = 10,
    steps_per_epoch  = len(train_data),
    validation_data  = test_data,
    validation_steps = len(test_data)
)

## 5. Evaluating our model

In [None]:
print(type(history_2.history))
print(history_2.history.keys())

In [None]:
import pandas as pd
pd.DataFrame(history_2.history).plot(figsize=(10,7))

In [None]:
# Plot the validation and training curves separately
def plot_loss_curves(history):
  """
  Plots separate loss and accuracy curves for training and validation.
  """
  loss = history.history["loss"]
  val_loss = history.history["val_loss"]

  accuracy = history.history["accuracy"]
  val_accuracy = history.history["val_accuracy"]

  epochs = range(len(history.history["loss"])) # how many epochs did we run for?

  # Plot loss
  plt.plot(epochs, loss, label="training_loss")
  plt.plot(epochs, val_loss, label="val_loss")
  plt.title("loss")
  plt.xlabel("epochs")
  plt.legend()

  # Plot accuracy
  plt.figure()
  plt.plot(epochs, accuracy, label="training_accuracy")
  plt.plot(epochs, val_accuracy, label="val_accuracy")
  plt.title("accuracy")
  plt.xlabel("epochs")
  plt.legend();

In [None]:
plot_loss_curves(history_2)

It seems the model starts overfitting as training progresses. We can use data augmentation to try to counter that.
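
One way to make "overfitting" more concrete is to look for the epoch where the validation loss stops improving while the training loss keeps falling. The sketch below does that on made-up curves; the numbers are invented for illustration and are not results from this notebook, and the helper names are hypothetical.

```python
def best_epoch(val_loss):
    """Index of the epoch with the lowest validation loss."""
    return min(range(len(val_loss)), key=val_loss.__getitem__)

def generalization_gap(loss, val_loss):
    """Per-epoch difference between validation and training loss."""
    return [v - t for t, v in zip(loss, val_loss)]

# Made-up curves: training loss keeps falling, validation loss turns upwards
loss = [0.60, 0.45, 0.38, 0.30, 0.22, 0.15]
val_loss = [0.58, 0.44, 0.40, 0.42, 0.47, 0.55]

turning_point = best_epoch(val_loss)
gaps = generalization_gap(loss, val_loss)

print(turning_point)
print(gaps)
```

With a real run, history.history["loss"] and history.history["val_loss"] could be passed in place of the made-up lists.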

In [None]:
train_datagen_augmented = ImageDataGenerator(
    rescale             = 1/255.,
    rotation_range      = 0.2,  # in degrees, so this is a very small rotation
    shear_range         = 0.2,
    zoom_range          = 0.2,
    width_shift_range   = 0.2,
    height_shift_range  = 0.2,
    horizontal_flip     = True
)

# Augment the training images only; the test images stay untouched for validation
train_data_augmented = train_datagen_augmented.flow_from_directory(
    directory       = train_data_path,
    batch_size      = 16,
    target_size     = (224, 224),
    seed            = 42,
    class_mode      = "binary",
)
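
To give an intuition for what two of these transformations do to the pixels, here is a toy sketch on a tiny 2x3 "image" stored as nested lists. It is a simplification and not how ImageDataGenerator is implemented: the shift below pads with a constant value, and both helper names are made up for the example.

```python
def horizontal_flip(image):
    """Mirror each row of a 2-D image (a list of rows)."""
    return [row[::-1] for row in image]

def shift_width(image, shift, fill=0):
    """Shift columns right by `shift` pixels (left if negative), padding with `fill`."""
    width = len(image[0])
    shifted = []
    for row in image:
        if shift >= 0:
            new_row = [fill] * shift + row[:width - shift]
        else:
            new_row = row[-shift:] + [fill] * (-shift)
        shifted.append(new_row)
    return shifted

tiny = [[1, 2, 3],
        [4, 5, 6]]

flipped = horizontal_flip(tiny)
shifted = shift_width(tiny, 1)

print(flipped)
print(shifted)
```

The label of the image does not change under these transformations, which is why they can be used to show the model new variations of the same training examples.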

In [None]:
history_2 = model_2.fit(
    train_data_augmented,
    epochs              = 10,
    steps_per_epoch     = len(train_data_augmented),
    validation_data     = test_data,
    validation_steps    = len(test_data)
)

In [None]:
plot_loss_curves(history_2)

It seems to be better. Let's set the shuffle argument explicitly (it defaults to True) and go back to a batch size of 32.

In [None]:
train_data_augmented = train_datagen_augmented.flow_from_directory(
    directory       = train_data_path,
    batch_size      = 32,
    target_size     = (224, 224),
    seed            = 42,
    class_mode      = "binary",
    shuffle         = True
)

In [None]:
history_2 = model_2.fit(
    train_data_augmented,
    epochs              = 10,
    steps_per_epoch     = len(train_data_augmented),
    validation_data     = test_data,
    validation_steps    = len(test_data)
)

In [None]:
plot_loss_curves(history_2)

In [None]:
def predict_steak_or_pizza(img_path=None, url=None):
    img = load_and_prep_image(img_path, url)
    pred = model_2.predict(tf.expand_dims(img, axis=0))
    if pred > 0.5:
        print("You got a steak")
    else:
        print("You got a pizza")

In [None]:
# Only pizzas
predict_steak_or_pizza(url="https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcSYc4FDO-ZSCqCtWfpb7AX4RBYUvWXwGd1_aFAEjoyODVmMv0syOjNFoHSy0g6j5uU7Jes&usqp=CAU")
predict_steak_or_pizza(url="https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcRnZMTxbiq-6Rk6w5wajajLa3eSApBkTHioMobQ54DBz_cnQliOe3OXYc_5dQof7qLZn3Q&usqp=CAU")
predict_steak_or_pizza(url="https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcT4hrnTXz5Dr4aVHN6xkb3hg85Q5F4z5Nxiboi8o176skOSZjlTHh99NkaDt8e-SqznwCs&usqp=CAU")
predict_steak_or_pizza(url="https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQLxpXTxE5QO4Y5S6FHQYiQ6-uf1Qwe6FHb28YDggeuamrPOUsIdP2Nt1OlY6sCZgJSYFI&usqp=CAU")
predict_steak_or_pizza(url="https://media-cdn.tripadvisor.com/media/photo-s/17/98/96/31/photo0jpg.jpg")
predict_steak_or_pizza(url="https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTyo8-Fg7Pa_8RQdKuLrWY6A5MQDIQgQxPuVTzA4Po8On3rMWl9I9NOY24WLIpHMOqUyss&usqp=CAU")

Now we got 4 right and 2 wrong. It's better! With simple data augmentation, we improved the quality of our model.