<h1 align="left"><font color=#0a1f89>Description:</font></h1>  

   The project aims to develop a deep learning model capable of accurately recognizing food items from images and estimating their calorie content. With the prevalence of smartphones and the increasing interest in health and nutrition, such a model could empower users to track their dietary intake more effectively, make informed food choices, and achieve their health and fitness goals. By leveraging advanced machine learning techniques, including convolutional neural networks (CNNs) for image recognition and regression algorithms for calorie estimation, the model seeks to provide users with a convenient and accessible tool for managing their nutrition.



<h1 align="left"><font color=#0a1f89>Objectives:</font></h1>    
    
    
- Download and extract the Food 101 dataset.
- Understand the dataset structure and files.
- Visualize a random image from each of the 101 classes.
- Split the image data into train and test sets using train.txt and test.txt.
- Create a subset of the data with a few classes (3), train_mini and test_mini, for experimenting.
- Fine-tune a pretrained InceptionV3 model on the Food 101 dataset.
- Visualize accuracy and loss plots.
- Predict classes for new images from the internet.
- Scale up and fine-tune the InceptionV3 model with 11 classes of data.
- Model Explainability.
- Summary of the things I tried.
- Further improvements.
- Feedback.



<h1 align="left"><font color=#0a1f89>Applications </font></h1>
    
- **Dietary Tracking**: Users can track their daily food intake by simply taking photos of their meals, enabling them to monitor their calorie consumption and make adjustments to their diet as needed.
    
- **Health and Fitness Monitoring**: The model can provide valuable insights into users' nutritional habits, helping them make healthier food choices and achieve their fitness goals.

- **Weight Management**: By accurately estimating the calorie content of meals, the model can assist individuals in managing their weight more effectively and maintaining a healthy lifestyle.

- **Nutritional Education**: The model can serve as an educational tool by providing users with information about the nutritional content of various foods, fostering greater awareness of dietary choices and their impact on health.
    
- **Personalized Recommendations**: Over time, the model can learn users' preferences and dietary patterns, providing personalized recommendations for balanced meals and optimal nutrition.


<h2 align="left"><font color=#0c741c>Let's get started:</font></h2>

<p style="background-color: #742d0c; font-family:calibri; color:white; font-size:140%; font-family:Verdana; text-align:center; border-radius:15px 50px;">Step 1 | Setup and Initialization</p>


 <b><span style='color:black'>Step 1.1 |</span><span style='color:#742d0c '> Importing Libraries</span></b>


In [51]:
import tensorflow as tf  # Import TensorFlow library for deep learning tasks
import matplotlib.image as img  # Import matplotlib for image reading
import numpy as np  # Import NumPy for numerical operations
from collections import defaultdict  # Import defaultdict for creating dictionaries with default values
import collections  # Import collections module for collection data types
from shutil import copy  # Import shutil for high-level file operations
from shutil import copytree, rmtree  # Import shutil for directory copying and removal
import tensorflow.keras.backend as K  # Import Keras backend functions
from tensorflow.keras.models import load_model  # Import Keras function for loading pre-trained models
from tensorflow.keras.preprocessing import image  # Import Keras for image preprocessing
import matplotlib.pyplot as plt  # Import matplotlib for visualization
import os  # Import os module for operating system functions
import random  # Import random module for generating random numbers
import cv2  # Import OpenCV for image processing
from tensorflow.keras import regularizers  # Import regularizers for regularization techniques
from tensorflow.keras.applications.inception_v3 import InceptionV3  # Import pre-trained InceptionV3 model
from tensorflow.keras.models import Sequential, Model  # Import Sequential and Model for building neural network models
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten  # Import layers for building neural network architectures
from tensorflow.keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D, GlobalAveragePooling2D, AveragePooling2D  # Import layers for building convolutional neural networks
from tensorflow.keras.preprocessing.image import ImageDataGenerator  # Import ImageDataGenerator for data augmentation
from tensorflow.keras.callbacks import ModelCheckpoint, CSVLogger  # Import callbacks for model saving and logging
from tensorflow.keras.optimizers import SGD  # Import SGD optimizer for training models
from tensorflow.keras.regularizers import l2  # Import L2 regularization
from tensorflow import keras  # Import Keras for deep learning tasks
from tensorflow.keras import models  # Import Keras for building neural network models
import zipfile  # Import zipfile for reading and writing ZIP archives


 <b><span style='color:black'>Step 1.2 |</span><span style='color:#742d0c '> Checking State of GPU</span></b>


In [52]:
# Check TensorFlow version
print(tf.__version__)

# Check GPU device name
print(tf.test.gpu_device_name())


2.15.0
/device:GPU:0


 <b><span style='color:black'>Step 1.3 |</span><span style='color:#742d0c '> Changing Directory</span></b>


In [53]:
%cd /kaggle/input/food-101/

/kaggle/input/food-101


<b><span style='color:black'>Step 1.4 |</span><span style='color:#742d0c '> Downloading and Extracting the Dataset</span></b>


In [54]:
def get_data_extract():
    """
    Check if the dataset exists, and if not, download and extract it.
    """
    if "food-101" in os.listdir():
        print("Dataset already exists")
    else:
        print("Downloading the data...")
        !wget http://data.vision.ee.ethz.ch/cvl/food-101.tar.gz
        print("Dataset downloaded!")
        print("Extracting data..")
        !tar xzvf food-101.tar.gz
        print("Extraction done!")


In [55]:
# Download the data and extract it to a folder
# (the function skips the download when the food-101 folder already exists)

get_data_extract()

Dataset already exists


<p style="background-color: #742d0c; font-family:calibri; color:white; font-size:140%; font-family:Verdana; text-align:center; border-radius:15px 50px;">Step 2 | Data Exploration and Verification</p>


In [56]:
# Check the extracted dataset folder
!ls food-101/

__MACOSX  food-101




- The dataset being used is **[Food 101](https://www.vision.ee.ethz.ch/datasets_extra/food-101/)**
- It contains 101,000 images in total, spread across 101 food categories (a multiclass problem)
- Each type of food has 750 training samples and 250 test samples
- A note from the dataset's webpage:
    - On purpose, the training images were not cleaned, and thus still contain some amount of noise. This comes mostly in the form of intense colors and sometimes wrong labels.
    - All images were rescaled to have a maximum side length of 512 pixels.
- The entire dataset is 5 GB in size
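
The split sizes quoted above can be checked with a little arithmetic. This is only an illustrative sketch: the variable names are my own, and the per-class counts are the ones stated in the list above.

```python
# Per-class counts as stated in the dataset description
n_classes = 101
train_per_class = 750
test_per_class = 250

# Totals implied by those counts
train_total = n_classes * train_per_class
test_total = n_classes * test_per_class
total_images = train_total + test_total

print(train_total, test_total, total_images)  # 75750 25250 101000
```

These totals are what the file counts on the train and test folders should come to after the split, assuming every image listed in train.txt and test.txt is copied.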

 <b><span style='color:black'>Step 2.1 |</span><span style='color:#742d0c '> Understanding the Structure of Image Data</span></b>



In [None]:
os.listdir('food-101/images')

 <b><span style='color:black'>Step 2.2 |</span><span style='color:#742d0c '> Understanding the Metadata or Additional Information</span></b>


In [None]:
os.listdir('food-101/meta')

In [None]:
!head food-101/meta/train.txt


In [None]:
!head food-101/meta/classes.txt

<p style="background-color: #742d0c; font-family:calibri; color:white; font-size:140%; font-family:Verdana; text-align:center; border-radius:15px 50px;">Step 3 | Data Visualization</p>


In [None]:
# Define the number of rows and columns for the subplot grid
rows = 17
cols = 6

# Create a subplot grid with specified size
fig, ax = plt.subplots(rows, cols, figsize=(25,25))

# Set the title of the plot
fig.suptitle("Showing one random image from each class", y=1.05, fontsize=24)

# Define the directory containing the image data
data_dir = "food-101/images/"

# Get a sorted list of food class names
foods_sorted = sorted(os.listdir(data_dir))

# Initialize food_id variable
food_id = 0

# Loop through rows and columns to display images
for i in range(rows):
    for j in range(cols):
        # The grid has 17 x 6 = 102 cells, so stop once the list is exhausted
        if food_id >= len(foods_sorted):
            break
        food_selected = foods_sorted[food_id]
        food_id += 1
        if food_selected == '.DS_Store':
            continue
        # Get a list of images for the current food class
        food_selected_images = os.listdir(os.path.join(data_dir, food_selected))
        # Select a random image from the list
        food_selected_random = np.random.choice(food_selected_images)
        # Read and display the image (named food_img so it does not shadow the `img` module import)
        food_img = plt.imread(os.path.join(data_dir, food_selected, food_selected_random))
        ax[i][j].imshow(food_img)
        ax[i][j].set_title(food_selected, pad=10)  # Set the title of the subplot
        
# Remove x and y ticks from all subplots
plt.setp(ax, xticks=[], yticks=[])

# Adjust the layout of subplots to fit the figure
plt.tight_layout()


 <p style="background-color: #742d0c; font-family:calibri; color:white; font-size:140%; font-family:Verdana; text-align:center; border-radius:15px 50px;">Step 4 | Data Preprocessing</p>


 <b><span style='color:black'>Step 4.1 |</span><span style='color:#742d0c '> Data Splitting for Training and Test</span></b>


In [None]:
# Helper method to split dataset into train and test folders
def prepare_data(filepath, src, dest):
    # Create a dictionary to store image paths for each class
    classes_images = defaultdict(list)
    
    # Read the filepath and extract image paths
    with open(filepath, 'r') as txt:
        paths = [read.strip() for read in txt.readlines()]
        for p in paths:
            food = p.split('/')
            classes_images[food[0]].append(food[1] + '.jpg')

    # Iterate over classes and copy images to destination folder
    for food in classes_images.keys():
        print("\nCopying images into ", food)
        if not os.path.exists(os.path.join(dest, food)):
            os.makedirs(os.path.join(dest, food))
        for i in classes_images[food]:
            copy(os.path.join(src, food, i), os.path.join(dest, food, i))
    print("Copying Done!")
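
The grouping step inside `prepare_data` can be tried on its own, without copying any files. The sketch below assumes each line of train.txt has the form `class_name/image_id`, which is what the splitting code above relies on. The function name `group_by_class` and the sample entries are hypothetical, made up for illustration.

```python
from collections import defaultdict

def group_by_class(lines):
    """Group 'class_name/image_id' entries into {class_name: [file names]}."""
    classes_images = defaultdict(list)
    for line in lines:
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        class_name, image_id = entry.split('/', 1)
        classes_images[class_name].append(image_id + '.jpg')
    return dict(classes_images)

# Hypothetical sample entries in the same 'class/id' shape as train.txt
sample = ["apple_pie/1005649\n", "apple_pie/1014775\n", "pizza/1008104\n"]
grouped = group_by_class(sample)
print(grouped)
```

Checking the grouping on a few lines first is a cheap way to catch a malformed entry before starting a long copy.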


 <b><span style='color:black'>Step 4.2 |</span><span style='color:#742d0c '> Prepares the Train Dataset </span></b>



In [None]:
# Change current directory to the root directory
%cd /

# Print message indicating the start of creating train data
print("Creating train data...")

# Call prepare_data function to copy images from train.txt to train directory
prepare_data('/kaggle/input/food-101/food-101/meta/train.txt', '/kaggle/input/food-101/food-101/images', 'train')


<b><span style='color:black'>Step 4.3 |</span><span style='color:#742d0c '> Creating the Test Data </span></b>


In [None]:
# Print message indicating the start of creating test data
print("Creating test data...")

# Call prepare_data function to copy images from test.txt to test directory
prepare_data('/kaggle/input/food-101/food-101/meta/test.txt', '/kaggle/input/food-101/food-101/images', 'test')


<b><span style='color:black'>Step 4.4 |</span><span style='color:#742d0c '> Counting the Files in the "Train" Folder </span></b>


In [None]:
# Print message indicating the total number of samples in the train folder
print("Total number of samples in train folder")

# Execute the find command to count the image files in the train folder
# -type d: Matches directories; nothing is printed for them
# -or: Logical OR operator (binds more loosely than the implicit AND before -printf)
# -type f: Matches regular files
# -printf '.': Print a single character for each regular file found
# wc -c: Count the characters, which equals the number of files (directories are not counted)
!find train -type d -or -type f -printf '.' | wc -c
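
For environments where the shell command is not available, a rough Python equivalent is sketched below. `count_files` is a hypothetical helper, not part of the original notebook, and the demo folder layout is made up. It counts every non-directory entry that `os.walk` reports, which matches `find -type f` for ordinary image folders.

```python
import os
import tempfile

def count_files(root):
    """Count files under root, recursively (directories are not counted)."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total

# Small self-contained demo on a temporary folder (hypothetical layout)
with tempfile.TemporaryDirectory() as tmp:
    for class_name in ("apple_pie", "pizza"):
        os.makedirs(os.path.join(tmp, class_name))
        for i in range(3):
            open(os.path.join(tmp, class_name, f"{i}.jpg"), "w").close()
    demo_count = count_files(tmp)

print(demo_count)  # 6 files; the 2 class folders are not counted
```

In the notebook, `count_files('train')` would be the call that corresponds to the command above.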


<b><span style='color:black'>Step 4.5 |</span><span style='color:#742d0c '> Counting the Files in the "Test" Folder </span></b>


In [None]:
# Print message indicating the total number of samples in the test folder
print("Total number of samples in test folder")

# Execute the find command to count the image files in the test folder
# -type d: Matches directories; nothing is printed for them
# -or: Logical OR operator (binds more loosely than the implicit AND before -printf)
# -type f: Matches regular files
# -printf '.': Print a single character for each regular file found
# wc -c: Count the characters, which equals the number of files (directories are not counted)
!find test -type d -or -type f -printf '.' | wc -c




- We now have the train and test data ready  
- But experimenting with different architectures on the whole dataset of 101 classes takes a lot of time and computation  
- To proceed with further experiments, I am creating train_mini and test_mini, limiting the dataset to 3 classes  
- The original problem is multiclass classification, where key architectural decisions differ from those of binary classification, so starting with 3 classes is better than starting with 2

<b><span style='color:black'>Step 4.6 |</span><span style='color:#742d0c '> Removing the .DS_Store Entry </span></b>


In [None]:
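# List of all 101 types of foods (sorted alphabetically)
# Remove the .DS_Store entry by name, so the cell is safe to re-run and does not depend on its position
foods_sorted = [food for food in foods_sorted if food != '.DS_Store']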

In [None]:
print(foods_sorted)


 <b><span style='color:black'>Step 4.7 |</span><span style='color:#742d0c '> Creating Subset </span></b>


In [None]:
# Helper method to create train_mini and test_mini data samples
def dataset_mini(food_list, src, dest):
    # Check if the destination directory exists
    if os.path.exists(dest):
        # If it exists, remove it to ensure a clean slate
        rmtree(dest)  # Removing dataset_mini (if it already exists) folders
    # Create the destination directory
    os.makedirs(dest)
    
    # Iterate over each food item in the provided list
    for food_item in food_list:
        print("Copying images into", food_item)
        # Recursively copy the images from the source directory to the destination directory for each food item
        copytree(os.path.join(src, food_item), os.path.join(dest, food_item))


In [None]:
# List of food items for creating mini datasets
food_list = ['apple_pie', 'pizza', 'omelette']

# Source and destination directories for train and test datasets
src_train = 'train'
dest_train = 'train_mini'
src_test = 'test'
dest_test = 'test_mini'

# Create train_mini dataset
dataset_mini(food_list, src_train, dest_train)

# Create test_mini dataset
dataset_mini(food_list, src_test, dest_test)


In [None]:
# Print message indicating the creation of the train data folder with new classes
print("Creating train data folder with new classes")

# Create train_mini dataset with specified food classes
dataset_mini(food_list, src_train, dest_train)


In [None]:
# Print message indicating the total number of samples in the train folder
print("Total number of samples in train folder")

# Execute the find command to count the image files in the train_mini folder
# -type d: Matches directories; nothing is printed for them
# -or: Logical OR operator (binds more loosely than the implicit AND before -printf)
# -type f: Matches regular files
# -printf '.': Print a single character for each regular file found
# wc -c: Count the characters, which equals the number of files (directories are not counted)
!find train_mini -type d -or -type f -printf '.' | wc -c


In [None]:
# Print message indicating the creation of the test data folder with new classes
print("Creating test data folder with new classes")

# Create test_mini dataset with specified food classes
dataset_mini(food_list, src_test, dest_test)


In [None]:
# Print message indicating the total number of samples in the test folder
print("Total number of samples in test folder")

# Execute the find command to count the image files in the test_mini folder
# -type d: Matches directories; nothing is printed for them
# -or: Logical OR operator (binds more loosely than the implicit AND before -printf)
# -type f: Matches regular files
# -printf '.': Print a single character for each regular file found
# wc -c: Count the characters, which equals the number of files (directories are not counted)
!find test_mini -type d -or -type f -printf '.' | wc -c


 <p style="background-color: #742d0c; font-family:calibri; color:white; font-size:140%; font-family:Verdana; text-align:center; border-radius:15px 50px;">Step 5 | Training Neural Network Model for Image Classification</p>




- Keras and other deep learning libraries provide pretrained models  
- These are deep neural networks with efficient architectures (like VGG, Inception, ResNet) that are already trained on datasets like ImageNet  
- With these pretrained models, we can reuse the already learned weights and add a few layers on top to fine-tune the model on our new data  
- This helps convergence happen faster and saves time and computation compared to training models from scratch


- We currently have a subset of the dataset with 3 classes: apple_pie, pizza and omelette  
- Use the code below to fine-tune the pretrained InceptionV3 model

 <p style="background-color: #742d0c; font-family:calibri; color:white; font-size:170%; font-family:Verdana; text-align:center; border-radius:15px 50px;">Fine tune Inception Pretrained model using Food 101 dataset</p>


In [None]:
# Clear Keras session to release resources
K.clear_session()

# Number of classes in the dataset
n_classes = 3

# Image dimensions
img_width, img_height = 299, 299

# Directories for training and validation data
train_data_dir = 'train_mini'
validation_data_dir = 'test_mini'

# Number of samples in the training and validation sets
nb_train_samples = 2250  # Number of training samples
nb_validation_samples = 750  # Number of validation samples

# Batch size for training
batch_size = 16

# Data augmentation and normalization for training images
train_datagen = ImageDataGenerator(
    rescale=1. / 255,  # Normalize pixel values to the range [0,1]
    shear_range=0.2,   # Shear transformation
    zoom_range=0.2,    # Random zoom
    horizontal_flip=True)  # Horizontal flip

# Normalization for validation images
test_datagen = ImageDataGenerator(rescale=1. / 255)  # Normalize pixel values to the range [0,1]

# Generate batches of training data
train_generator = train_datagen.flow_from_directory(
    train_data_dir,  # Path to the training data directory
    target_size=(img_height, img_width),  # Resize images to match the input size of the model
    batch_size=batch_size,  # Number of samples per batch
    class_mode='categorical')  # Use categorical labels for multi-class classification

# Generate batches of validation data
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,  # Path to the validation data directory
    target_size=(img_height, img_width),  # Resize images to match the input size of the model
    batch_size=batch_size,  # Number of samples per batch
    class_mode='categorical')  # Use categorical labels for multi-class classification

# Load the pre-trained InceptionV3 model without the top layers
inception = InceptionV3(weights='imagenet', include_top=False)

# Add custom top layers for fine-tuning
x = inception.output  # Output tensor of the InceptionV3 model
x = GlobalAveragePooling2D()(x)  # Global average pooling layer
x = Dense(128, activation='relu')(x)  # Fully connected layer with ReLU activation
x = Dropout(0.2)(x)  # Dropout layer for regularization

# Predictions layer with softmax activation for class probabilities
predictions = Dense(n_classes, kernel_regularizer=regularizers.l2(0.005), activation='softmax')(x)

# Create the final model with InceptionV3 as the base and custom top layers
model = Model(inputs=inception.input, outputs=predictions)

# Compile the model with SGD optimizer, categorical cross-entropy loss, and accuracy metric
# (`learning_rate` replaces the deprecated `lr` argument)
model.compile(optimizer=SGD(learning_rate=0.0001, momentum=0.9), loss='categorical_crossentropy', metrics=['accuracy'])

# Define callbacks for saving the best model and logging training history
checkpointer = ModelCheckpoint(filepath='best_model_3class.hdf5', verbose=1, save_best_only=True)
csv_logger = CSVLogger('history_3class.log')

# Train the model using the training and validation generators
# (`fit` accepts generators directly; `fit_generator` is deprecated)
history = model.fit(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,  # Number of batches per epoch
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size,  # Number of validation batches per epoch
    epochs=30,  # Number of training epochs
    verbose=1,  # Verbosity mode (0=silent, 1=progress bar, 2=one line per epoch)
    callbacks=[csv_logger, checkpointer])  # List of callbacks for training

# Save the trained model
model.save('model_trained_3class.hdf5')  # Save the model to HDF5 file format
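
The step counts passed to training come from floor division, so it is worth checking what they work out to. The sketch below reuses the sample counts and batch size from the cell above. The `*_leftover` and `*_ceil` names are my own, and rounding up is shown only as an alternative, not as what the notebook does.

```python
import math

nb_train_samples = 2250
nb_validation_samples = 750
batch_size = 16

# Floor division, as used in the training cell
train_steps = nb_train_samples // batch_size
val_steps = nb_validation_samples // batch_size

# Samples not covered by that whole number of full batches
train_leftover = nb_train_samples - train_steps * batch_size
val_leftover = nb_validation_samples - val_steps * batch_size

# Rounding up instead covers every sample, with a smaller final batch
train_steps_ceil = math.ceil(nb_train_samples / batch_size)
val_steps_ceil = math.ceil(nb_validation_samples / batch_size)

print(train_steps, val_steps)            # 140 46
print(train_leftover, val_leftover)      # 10 14
print(train_steps_ceil, val_steps_ceil)  # 141 47
```

With floor division a handful of images fall outside each pass, which is usually harmless for training but means the validation metrics are computed on slightly fewer than 750 images.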


<b><span style='color:black'>Step 5.1 |</span><span style='color:#742d0c '> Obtaining the Class Indices Mapping </span></b>


In [None]:
# Get the class indices mapping for the labels in the training data generator
class_map_3 = train_generator.class_indices

# Display the class indices mapping
class_map_3
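
The mapping goes from label to index, while a prediction yields an index, so inverting it is handy. The sketch below uses a hypothetical dictionary in the shape that `class_indices` is expected to return for these three classes (indices follow the alphanumeric order of the folder names), together with made-up probabilities.

```python
# Hypothetical mapping in the shape returned by `train_generator.class_indices`
class_map = {'apple_pie': 0, 'omelette': 1, 'pizza': 2}

# Invert it so a predicted index can be turned back into a label
index_to_label = {index: label for label, index in class_map.items()}

# Made-up softmax output for a single image
probabilities = [0.05, 0.15, 0.80]

# Index of the largest probability (plain-Python argmax)
predicted_index = max(range(len(probabilities)), key=probabilities.__getitem__)
predicted_label = index_to_label[predicted_index]

print(predicted_label)  # pizza
```

Looking labels up through the inverted mapping avoids depending on a separately sorted list of class names staying in sync with the generator.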


<b><span style='color:black'>Step 5.2 |</span><span style='color:#742d0c '> Visualizing the Training and Validation Accuracy </span></b>



In [None]:
def plot_accuracy(history, title):
    """
    Plot training and validation accuracy over epochs.
    
    Args:
    - history: Training history obtained from model training
    - title: Title of the plot
    
    Returns:
    - None
    """
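    plt.title(title)
    # With metrics=['accuracy'], TensorFlow 2.x stores the history under 'accuracy' and 'val_accuracy'
    plt.plot(history.history['accuracy'])
    plt.plot(history.history['val_accuracy'])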
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train_accuracy', 'validation_accuracy'], loc='best')
    plt.show()

def plot_loss(history, title):
    """
    Plot training and validation loss over epochs.
    
    Args:
    - history: Training history obtained from model training
    - title: Title of the plot
    
    Returns:
    - None
    """
    plt.title(title)
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train_loss', 'validation_loss'], loc='best')
    plt.show()


 <b><span style='color:black'>Step 5.3 |</span><span style='color:#742d0c '> Model Evaluation </span></b>

In [None]:
plot_accuracy(history, 'FOOD101-Inceptionv3')  # Plot training and validation accuracy
plot_loss(history, 'FOOD101-Inceptionv3')      # Plot training and validation loss


  
- The plots show that the accuracy of the model increased with epochs while the loss decreased
- Validation accuracy was higher than training accuracy for many epochs
- This could be for several reasons:
    - We used a pretrained model trained on ImageNet, which contains data from a wide variety of classes
    - Dropout is active only during training and is switched off during validation, which can lead to a higher validation accuracy
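
One way to see the dropout effect: during training a fraction of activations is zeroed at random, while at evaluation time every unit is kept, so the network is measured at full capacity on the validation set. The sketch below is a plain-Python illustration of inverted dropout, not the Keras implementation. The `dropout` function and the sample activations are hypothetical.

```python
import random

def dropout(values, rate, training, rng=random):
    """Inverted dropout: zero each value with probability `rate` while training."""
    if not training or rate == 0.0:
        return list(values)  # evaluation: all units are kept unchanged
    keep_scale = 1.0 / (1.0 - rate)  # rescale survivors to keep the expected sum
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]

rng = random.Random(0)  # seeded generator so the demo is repeatable
activations = [1.0, 2.0, 3.0, 4.0, 5.0]

train_out = dropout(activations, rate=0.2, training=True, rng=rng)
eval_out = dropout(activations, rate=0.2, training=False, rng=rng)

print(train_out)  # some entries may be zeroed, the rest scaled by 1 / 0.8
print(eval_out)   # identical to the input
```

The rate of 0.2 mirrors the `Dropout(0.2)` layer used in the model. Training accuracy is computed with this noise applied and validation accuracy without it.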

In [None]:
%%time
# Loading the best saved model to make predictions
K.clear_session()  # Clear Keras session
model_best = load_model('best_model_3class.hdf5', compile=False)  # Load the best saved model


  
- Setting compile=False and clearing the session leads to faster loading of the saved model
- Without the above additions, model loading was taking more than a minute!

 <b><span style='color:black'>Step 5.4 |</span><span style='color:#742d0c '> Predicting Class Label </span></b>


In [None]:
def predict_class(model, images, show=True):
    """
    Predict the class label for each image in the given list of image paths.
    
    Args:
    - model: Trained model for making predictions
    - images: List of image paths
    - show: Boolean flag to control image display
    
    Returns:
    - None
    """
    # flow_from_directory assigns class indices in alphanumeric order of the folder names,
    # so a sorted copy of food_list maps a predicted index back to its label
    class_names = sorted(food_list)
    for img_path in images:
        img = image.load_img(img_path, target_size=(299, 299))  # Load image and resize to model's input size
        img = image.img_to_array(img)                     # Convert image to numpy array
        img = np.expand_dims(img, axis=0)                 # Add batch dimension
        img /= 255.                                       # Normalize pixel values

        pred = model.predict(img)                         # Make prediction
        index = np.argmax(pred)                           # Get the index of the class with the highest probability
        pred_value = class_names[index]                   # Get the predicted class label
        
        if show:
            plt.imshow(img[0])                           # Display the image
            plt.axis('off')
            plt.title(pred_value)                        # Set title as the predicted class label
            plt.show()


 <b><span style='color:black'>Step 5.5 |</span><span style='color:#742d0c '> Downloading Images from the Internet</span></b>



In [None]:
# Downloading images from internet using the URLs
!wget -O samosa.jpg http://veggiefoodrecipes.com/wp-content/uploads/2016/05/lentil-samosa-recipe-01.jpg
!wget -O applepie.jpg https://acleanbake.com/wp-content/uploads/2017/10/Paleo-Apple-Pie-with-Crumb-Topping-gluten-free-grain-free-dairy-free-15.jpg
!wget -O pizza.jpg http://104.130.3.186/assets/itemimages/400/400/3/default_9b4106b8f65359684b3836096b4524c8_pizza%20dreamstimesmall_94940296.jpg
!wget -O omelette.jpg https://www.incredibleegg.org/wp-content/uploads/basic-french-omelet-930x550.jpg


In [None]:
# Make a list of downloaded images
images = ['applepie.jpg', 'pizza.jpg', 'omelette.jpg']

# Test the trained model
predict_class(model_best, images, True)



<h3 align="left"><font>Yes!!! The model got them all right!!</font></h3>


 <p style="background-color: #742d0c; font-family:calibri; color:white; font-size:170%; font-family:Verdana; text-align:center; border-radius:15px 50px;">Fine tune Inceptionv3 model with 11 classes of data</p>



  
- We trained a model on 3 classes and tested it using new data
- The model was able to predict the classes of all three test images correctly
- Will it be able to perform at the same level of accuracy for more classes?
- The FOOD-101 dataset has 101 classes of data
- Even when fine-tuning a pretrained model, each epoch took more than an hour when all 101 classes of data were used (I tried this on both Colab and a Deep Learning VM instance with a P100 GPU on GCP)
- To check how the model performs when more classes are included, I'm fine-tuning the same model on 11 randomly chosen classes

In [None]:
def pick_n_random_classes(n):
    """
    Select n random food classes from the sorted list of food items.
    
    Args:
    - n: Number of random food classes to select
    
    Returns:
    - List of n randomly selected food classes
    """
    food_list = []
    random_food_indices = random.sample(range(len(foods_sorted)), n)  # Sample n random indices
    for i in random_food_indices:
        food_list.append(foods_sorted[i])  # Retrieve corresponding food items
    food_list.sort()  # Sort the list of randomly selected food classes
    return food_list


In [None]:
n = 11  # Number of random food classes to select
food_list = pick_n_random_classes(n)  # Select n random food classes
# The random pick above is then overridden with a fixed list of 11 classes, so every run trains on the same classes
food_list = ['apple_pie', 'beef_carpaccio', 'bibimbap', 'cup_cakes', 'foie_gras', 'french_fries', 'garlic_bread', 'pizza', 'spring_rolls', 'spaghetti_carbonara', 'strawberry_shortcake']
print("These are the randomly picked food classes we will be training the model on...\n", food_list)
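
If the goal is a random but repeatable choice of classes, seeding a local random generator is one option instead of hard-coding the list. This is a sketch under that assumption: `pick_n_random_classes_seeded`, the seed value, and the short class list are all hypothetical and not part of the original notebook.

```python
import random

def pick_n_random_classes_seeded(class_names, n, seed):
    """Pick n class names reproducibly; the same seed returns the same picks."""
    rng = random.Random(seed)  # local generator, global random state is untouched
    return sorted(rng.sample(class_names, n))

# Hypothetical stand-in for `foods_sorted`
demo_classes = ['apple_pie', 'bibimbap', 'cup_cakes', 'garlic_bread', 'pizza', 'spring_rolls']

first = pick_n_random_classes_seeded(demo_classes, 3, seed=42)
second = pick_n_random_classes_seeded(demo_classes, 3, seed=42)

print(first == second)  # True
```

With the real class list, the call would take `foods_sorted` and `n = 11`. The picks would then differ from the fixed list used in this notebook.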


In [None]:
# Print a message indicating the start of the process
print("Creating training data folder with new classes...")

# Call the dataset_mini function to create a new data subset with the selected food classes for training
dataset_mini(food_list, src_train, dest_train)


In [None]:
# Print the total number of samples in the train folder
print("Total number of samples in train folder")

# Count the files in the train_mini folder (directories are not counted) and print the count
!find train_mini -type d -or -type f -printf '.' | wc -c


In [None]:
# Print a message indicating the start of the process
print("Creating test data folder with new classes")

# Call the dataset_mini function to create a new data subset with the selected food classes for testing
dataset_mini(food_list, src_test, dest_test)


In [None]:
# Print the total number of samples in the test folder
print("Total number of samples in test folder")

# Count the files in the test_mini folder (directories are not counted) and print the count
!find test_mini -type d -or -type f -printf '.' | wc -c


In [None]:
# Clear any previous sessions to free up memory
K.clear_session()

# Set the number of classes
n_classes = n

# Set the dimensions of input images
img_width, img_height = 299, 299

# Set the paths for the training and validation data directories
train_data_dir = 'train_mini'
validation_data_dir = 'test_mini'

# Set the number of training and validation samples
nb_train_samples = 8250
nb_validation_samples = 2750

# Set the batch size for training
batch_size = 16

# Define data augmentation for training images
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

# Only rescale the validation images (no augmentation is applied to them)
test_datagen = ImageDataGenerator(rescale=1. / 255)

# Generate batches of augmented training and validation data
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical')

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical')

# Load the InceptionV3 model pretrained on ImageNet without the top layer
inception = InceptionV3(weights='imagenet', include_top=False)

# Add custom top layers for classification
x = inception.output
x = GlobalAveragePooling2D()(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.2)(x)
predictions = Dense(n, kernel_regularizer=regularizers.l2(0.005), activation='softmax')(x)

# Create the final model
model = Model(inputs=inception.input, outputs=predictions)

# Compile the model (`learning_rate` replaces the deprecated `lr` argument)
model.compile(optimizer=SGD(learning_rate=0.0001, momentum=0.9), loss='categorical_crossentropy', metrics=['accuracy'])

# Define callbacks for model checkpointing and logging
checkpointer = ModelCheckpoint(filepath='best_model_11class.hdf5', verbose=1, save_best_only=True)
csv_logger = CSVLogger('history_11class.log')

# Train the model (`fit` accepts generators directly; `fit_generator` is deprecated)
history_11class = model.fit(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size,
    epochs=30,
    verbose=1,
    callbacks=[csv_logger, checkpointer])

# Save the trained model
model.save('model_trained_11class.hdf5')


In [None]:
# Get the class indices for the 11 food classes from the training data generator
class_map_11 = train_generator.class_indices
class_map_11


In [None]:
# Plot the accuracy over epochs for the training and validation sets
plot_accuracy(history_11class, 'FOOD101-Inceptionv3')

# Plot the loss over epochs for the training and validation sets
plot_loss(history_11class, 'FOOD101-Inceptionv3')


  
- The plots show that the accuracy of the model increased with epochs while the loss decreased
- Validation accuracy was higher than training accuracy for many epochs
- This could be for several reasons:
    - We used a pretrained model trained on ImageNet, which contains data from a wide variety of classes
    - Dropout is active only during training and is switched off during validation, which can lead to a higher validation accuracy


In [None]:
%%time
# Clear any previous session and load the best saved model for predictions
K.clear_session()
model_best = load_model('best_model_11class.hdf5', compile=False)


In [None]:
# Downloading images from internet using the URLs
!wget -O cupcakes.jpg https://www.publicdomainpictures.net/pictures/110000/nahled/halloween-witch-cupcakes.jpg
!wget -O springrolls.jpg https://upload.wikimedia.org/wikipedia/commons/6/6f/Vietnamese_spring_rolls.jpg
!wget -O pizza.jpg http://104.130.3.186/assets/itemimages/400/400/3/default_9b4106b8f65359684b3836096b4524c8_pizza%20dreamstimesmall_94940296.jpg
!wget -O garlicbread.jpg https://c1.staticflickr.com/1/84/262952165_7ba3466108_z.jpg?zz=1

# If you have an image in your local computer and want to try it, uncomment the below code to upload the image files


# from google.colab import files
# image = files.upload()

In [None]:
# Make a list of downloaded images and test the trained model
images = []
images.append('cupcakes.jpg')
images.append('pizza.jpg')
images.append('springrolls.jpg')
images.append('garlicbread.jpg')
predict_class(model_best, images, True)

  
- The model did well even when the number of classes was increased to 11
- Training the model on all 101 classes takes considerably longer
- One epoch took more than an hour when the full dataset was used for fine-tuning

 <p style="background-color: #742d0c; font-family:calibri; color:white; font-size:140%; font-family:Verdana; text-align:center; border-radius:15px 50px;">Step 6 | Model Explainability</p>


 
- Human lives and technology are blending together more and more
- The rapid advancements in technology over the past few years can be attributed to how neural networks have evolved
- Neural networks and deep learning are now used in many fields and industries: healthcare, finance, retail, automotive, etc.
- Deep learning libraries now let us build applications and models in a few lines of code, something that a decade ago required a lot of expertise and research
- All of this calls for understanding what neural networks do and how they do it
- This has led to an active area of research: neural network model interpretability and explainability


- Neural Networks learn incrementally
- How does a neural network know what is in the image, and how does it conclude that it's a dog?
- The best analogy to understand the incremental learning of the model here is to think about how we would hand sketch the dog
- You can't start right away by drawing eyes, nose, snout etc
- To have any of those dogly features, you need a lot of edges and curves
- You start with edges/lines, put many of them together
- Use edges with curves to sketch patterns
- Patterns with finer details then help us draw the visible features of a dog like eyes, ears, snout etc
- Neural networks adopt a very similar process when they are busy detecting what's in the provided data examples
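
The "start with edges" idea can be sketched with a single hand-written filter. This is an illustration only: the Sobel kernel below is a classic hand-designed edge detector, not a filter taken from the trained Inception model, and `conv2d_valid` is a hypothetical helper written for this example.

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Naive 2D cross-correlation with no padding (what a conv layer computes)."""
    kh, kw = kernel.shape
    out_h = img.shape[0] - kh + 1
    out_w = img.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# Toy image: dark left half, bright right half -> one vertical edge
img = np.zeros((8, 8))
img[:, 4:] = 1.0

# Sobel kernel that responds to horizontal changes in brightness
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

response = conv2d_valid(img, sobel_x)
print(response)
```

The response is non-zero only in the columns where the window straddles the dark-to-bright boundary, which is the kind of low-level signal that deeper layers combine into curves and patterns.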

![](https://images.deepai.org/publication-preview/visualizing-and-understanding-convolutional-networks-page-4-medium.jpg)

  
* The above image is taken from the paper [Visualizing and Understanding Convolutional Networks](https://arxiv.org/abs/1311.2901)
* The image shows the features of a trained model along with the kind of objects they detect
* In the first row, the first column has a grid of edge-detecting features from layer 1 and the second column has some curve detectors from layer 2
* The last column in the first row shows the kind of objects that get detected using those curvy features
* With layer 3, in the second row, the model starts looking for patterns made of edges and curves
* The second column in the second row contains examples of patterns that are detected in layer 3 of the model
* With layer 4, the model starts detecting parts of object-specific features, and in layer 5 the model knows what's in the image


* Using feature visualization, we can know what a neural network layer and its features are looking for
* Using attribution, we can understand how the features impact the output and what regions in the image led the model to the generated output

<p style="background-color: #742d0c; font-family:calibri; color:white; font-size:140%; font-family:Verdana; text-align:center; border-radius:15px 50px;">Step 7 | Evaluating the Model</p>


 <b><span style='color:black'>Step 7.1 |</span><span style='color:#742d0c '> Loading the Saved Model and a Test Image</span></b>



In [None]:
# Load the saved model trained with 3 classes
K.clear_session()
print("Loading the model..")
model = load_model('best_model_3class.hdf5', compile=False)
print("Done!")


 <b><span style='color:black'>Step 7.2 |</span><span style='color:#742d0c '> Summary of the Model </span></b>



In [None]:
model.summary()

<b><span style='color:black'>Step 7.3 |</span><span style='color:#742d0c '> Defining Helper Functions</span></b>



In [None]:
def deprocess_image(x):
    # Normalize tensor: center on 0., ensure standard deviation is 0.1
    x -= x.mean()
    x /= (x.std() + 1e-5)
    x *= 0.1

    # Clip values to [0, 1]
    x += 0.5
    x = np.clip(x, 0, 1)

    # Convert to RGB array
    x *= 255
    x = np.clip(x, 0, 255).astype('uint8')
    return x


<b><span style='color:black'>Step 7.4 |</span><span style='color:#742d0c '> Generating Pattern Function
</span></b>


In [None]:
def generate_pattern(layer_name, filter_index, size=150):
    # Build a loss function that maximizes the activation
    # of the nth filter of the layer considered.
    layer_output = model.get_layer(layer_name).output
    loss = K.mean(layer_output[:, :, :, filter_index])

    # Compute the gradient of the input picture wrt this loss
    grads = K.gradients(loss, model.input)[0]

    # Normalization trick: we normalize the gradient
    grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-5)

    # This function returns the loss and grads given the input picture
    iterate = K.function([model.input], [loss, grads])
    
    # We start from a gray image with some noise
    input_img_data = np.random.random((1, size, size, 3)) * 20 + 128.

    # Run gradient ascent for 40 steps
    step = 1.
    for i in range(40):
        loss_value, grads_value = iterate([input_img_data])
        input_img_data += grads_value * step
        
    img = input_img_data[0]
    return deprocess_image(img)
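
The gradient ascent loop in `generate_pattern` can be seen in isolation on a toy problem. The sketch below is a stand-in built on stated assumptions: instead of a Keras layer it uses a single hypothetical linear filter `w`, whose activation is `sum(w * x)` and whose gradient with respect to the input is simply `w`. It keeps the same gradient normalization and step size.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=(5, 5))          # hypothetical filter weights
x = rng.random((5, 5)) * 20 + 128.0  # gray image with some noise

def activation(x):
    """Response of the toy linear filter to the input x."""
    return float(np.sum(w * x))

start = activation(x)
step = 1.0
for _ in range(40):
    grads = w.copy()  # d(activation)/dx for a linear filter
    # Same normalization trick: divide by the RMS of the gradient
    grads /= np.sqrt(np.mean(np.square(grads))) + 1e-5
    x += grads * step

print(f"activation before: {start:.2f}, after: {activation(x):.2f}")
```

Each step moves the input in the direction that increases the filter's response, so the final input is the pattern the filter "likes" most.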


 <b><span style='color:black'>Step 7.5 |</span><span style='color:#742d0c '> Getting Activations Function
</span></b>


In [None]:
def get_activations(img, model_activations):
    """
    Get activations of a model for a given image.

    Args:
    img (str): Path to the image file.
    model_activations (Model): Model object to get activations from.

    Returns:
    numpy.ndarray: Activations produced by the model.
    """
    # Load and preprocess the image
    img = image.load_img(img, target_size=(299, 299))
    img = image.img_to_array(img)                    
    img = np.expand_dims(img, axis=0)         
    img /= 255. 
    # Visualize the image
    plt.imshow(img[0])
    plt.show()
    # Get activations
    return model_activations.predict(img)


 <b><span style='color:black'>Step 7.6 |</span><span style='color:#742d0c '> Showing Activations Function
</span></b>


In [None]:
def show_activations(activations, layer_names):
    """
    Display feature maps for each layer.

    Args:
    activations (list of numpy.ndarray): List of activation tensors for each layer.
    layer_names (list of str): Names of the layers.

    Returns:
    None
    """
    images_per_row = 16

    # Loop through each layer
    for layer_name, layer_activation in zip(layer_names, activations):
        # Number of features in the feature map
        n_features = layer_activation.shape[-1]

        # Size of the feature map
        size = layer_activation.shape[1]

        # Number of columns for visualization
        n_cols = n_features // images_per_row
        display_grid = np.zeros((size * n_cols, images_per_row * size))

        # Tile each filter into a big horizontal grid
        for col in range(n_cols):
            for row in range(images_per_row):
                # Copy the channel so the in-place normalization below does not modify `activations`
                channel_image = layer_activation[0, :, :, col * images_per_row + row].copy()
                # Post-process the feature to make it visually palatable
                channel_image -= channel_image.mean()
                # The small epsilon avoids 0/0 = NaN for channels that are all zeros
                channel_image /= (channel_image.std() + 1e-5)
                channel_image *= 64
                channel_image += 128
                channel_image = np.clip(channel_image, 0, 255).astype('uint8')
                display_grid[col * size : (col + 1) * size, row * size : (row + 1) * size] = channel_image

        # Display the grid
        scale = 1. / size
        plt.figure(figsize=(scale * display_grid.shape[1], scale * display_grid.shape[0]))
        plt.title(layer_name)
        plt.grid(False)
        plt.imshow(display_grid, aspect='auto', cmap='viridis')

    plt.show()



Check how many layers are in the trained model (this includes the 1st input layer as well)


In [None]:
len(model.layers)


  
> * Can we visualize the outputs of all the layers?
> * Yes, we can, but that gets too tedious
> * So, let's choose a few layers to visualize

<b><span style='color:black'>Step 7.7 |</span><span style='color:#742d0c '> Extracting Intermediate Layer Activations</span></b>


In [None]:
# We start with index 1 instead of 0, as input layer is at index 0
layers = [layer.output for layer in model.layers[1:11]]
# We now initialize a model which takes an input and outputs the above chosen layers
activations_output = models.Model(inputs=model.input, outputs=layers)


As seen below, the 10 chosen layers contain 3 convolution, 3 batch normalization, 3 activation and 1 max pooling layers


In [None]:
layers

  
>  * Get the names of all the selected layers

 <b><span style='color:black'>Step 7.8 |</span><span style='color:#742d0c '> Extracting Layer Names for Intermediate Activations
</span></b>


In [None]:
# Initialize an empty list to store the names of the layers
layer_names = []

# Iterate through the layers of the model starting from index 1 and ending at index 10 (inclusive)
for layer in model.layers[1:11]:
    # Append the name of each layer to the list
    layer_names.append(layer.name)

# Print the list of layer names
print(layer_names)


  
Provide an input to the model and get the activations of all the 10 chosen layers


In [None]:
# Define the filename of the image
food = 'applepie.jpg'

# Get the activations of the model for the specified image using the defined activations_output model
activations = get_activations(food, activations_output)



**activations** contains the outputs of all 10 layers, which can be plotted and visualized

Visualize the activations of intermediate layers from layer 1 to 10

In [None]:
# Visualize the activations of the model for the specified image using the defined layer names
show_activations(activations, layer_names)



* What we see in the above plots are the activations, or outputs, of each of the 10 layers we chose
* The activations from the 1st layer (conv2d_1) don't lose much information from the original input
* They are the results of applying several edge-detecting filters to the input image
* With each added layer, the activations lose visual/input information and keep building up class/output information
* As the depth increases, the layer activations become less visually interpretable and more abstract
* By doing so, they learn to detect more specific features of the class rather than just edges and curves
* We plotted just 10 out of 314 intermediate layers. Even in these few layers we already have activations that are blank/sparse (for example, the 2 blank activations in the layer activation_1)
* These blank/sparse activations occur when a filter in that layer doesn't find a matching pattern in the input given to it
* By plotting more layers (especially those towards the end of the network), we can observe more of these sparse activations and how the layers get more abstract
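
To quantify the blank feature maps instead of spotting them by eye, one option is to count the channels whose values are all zero. The helper `blank_channel_fraction` below is hypothetical and is shown on a synthetic array; with the notebook's variables it could be applied to each entry of a freshly computed `activations` list.

```python
import numpy as np

def blank_channel_fraction(layer_activation):
    """Fraction of channels in a (1, H, W, C) activation that are entirely zero."""
    # A channel is blank when no spatial position has a non-zero value
    blank = ~np.any(layer_activation[0] != 0, axis=(0, 1))
    return blank.mean()

# Synthetic stand-in for one layer's output: 32 channels, two of them blank
rng = np.random.default_rng(0)
fake_activation = rng.random((1, 10, 10, 32))
fake_activation[..., 5] = 0.0
fake_activation[..., 13] = 0.0

print(blank_channel_fraction(fake_activation))
```

Note that NaN values count as non-zero in this check, so it only works on activations that have not been altered after `predict`.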


 <b><span style='color:black'>Step 7.9 |</span><span style='color:#742d0c '> Getting the Activations for a Different Input / Food
</span></b>



In [None]:
# Switch to a different input image (pizza.jpg was downloaded earlier in the notebook)
food = 'pizza.jpg'
# Get the activations of the chosen layers for the new image
activations = get_activations(food, activations_output)


In [None]:
# Display the activations for the specified image using the defined layer names
show_activations(activations, layer_names)



* The feature maps in the above activations are for a different input image
* We see the same patterns discussed for the previous input image
* It is interesting to see the blank/sparse activations in the same layer(activation_1) and for same filters when a different image is passed to the network
* Remember we used a pretrained Inceptionv3 model. All the filters that are used in different layers come from this pretrained model

<b><span style='color:black'>Step 7.10 |</span><span style='color:#742d0c '> Look into the Sparse Activations in the Layer Activation_1
</span></b>



* We have two blank/sparse activations in the layer activation_1
* The cell below extracts one of them

In [None]:
# Get the index of activation_1 layer which has sparse activations
ind = layer_names.index('activation_1')
sparse_activation = activations[ind]
# Select the activation values of a specific filter
a = sparse_activation[0, :, :, 13]


In [None]:
# Check whether the activation map is blank, i.e. every value is NaN or zero
np.all(np.isnan(a) | (a == 0))


* The activation map is blank: in the Kaggle run its values showed up as NaN, while the same map was all zeros when executed outside Kaggle
* NaN values can appear if a channel is normalized in place after it has been fetched: an all-zero map has zero standard deviation, so dividing by it gives 0/0 = NaN. This is a likely explanation rather than a confirmed one
 
* To see why this activation is all zero/NaN, let's look at the activation at the same index 13 in the previous layer

In [None]:
# Get the index of batch_normalization_1, the layer that feeds activation_1
ind = layer_names.index('batch_normalization_1')
# Extract the activations of that layer
sparse_activation = activations[ind]
# Select the activation map at index 13 (the 14th filter)
b = sparse_activation[0, :, :, 13]
# Display the values of the activation map
b


* All the values in the above activation map from the layer batch_normalization_1 are negative
* This activation in batch_normalization_1 is passed to the next layer, activation_1, as input
* As the name says, activation_1 is an activation layer, and ReLU is the activation function used
* ReLU takes an input value and returns 0 if it is negative, or the value itself otherwise
* Since the input to the activation layer contains only negative values at this index, the layer fills the corresponding activation map with zeros
* Now we know why we have those 2 sparse activations in the activation_1 layer
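
The chain of events can be reproduced with plain NumPy. This is a sketch on a made-up 3x3 map, not the real batch_normalization_1 output. It also shows one possible explanation (an assumption, not something verified in the original run) for NaN values appearing in a blank map: normalizing an all-zero array in place divides zero by a zero standard deviation.

```python
import numpy as np

def relu(x):
    """ReLU: negative values become 0, others pass through."""
    return np.maximum(x, 0)

# Made-up batch-normalization output in which every value is negative
bn_map = np.array([[-0.3, -1.2, -0.5],
                   [-0.8, -0.1, -2.0],
                   [-0.4, -0.9, -0.6]], dtype=np.float32)

act_map = relu(bn_map)
print(act_map)  # every entry is 0

# Normalizing the blank map in place, as a plotting helper might do
with np.errstate(invalid="ignore"):
    act_map -= act_map.mean()
    act_map /= act_map.std()  # 0 / 0
print(np.isnan(act_map).all())
```

Adding a small epsilon to the standard deviation, or normalizing a copy instead of the original array, avoids the NaN values.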

<b><span style='color:black'>Step 7.11 |</span><span style='color:#742d0c '> Visualization of Convolutional Layer Activations

</span></b>


In [None]:
# Extract activations for the first three convolutional layers
first_convlayer_activation = activations[0]
second_convlayer_activation = activations[3]
third_convlayer_activation = activations[6]

# Visualize the activations for each layer
f, ax = plt.subplots(1, 3, figsize=(10, 10))

# Plot activations for the first convolutional layer
ax[0].imshow(first_convlayer_activation[0, :, :, 3], cmap='viridis')
ax[0].axis('OFF')
ax[0].set_title('Conv2d_1')

# Plot activations for the second convolutional layer
ax[1].imshow(second_convlayer_activation[0, :, :, 3], cmap='viridis')
ax[1].axis('OFF')
ax[1].set_title('Conv2d_2')

# Plot activations for the third convolutional layer
ax[2].imshow(third_convlayer_activation[0, :, :, 3], cmap='viridis')
ax[2].axis('OFF')
ax[2].set_title('Conv2d_3')


<b><span style='color:black'>Step 7.12 |</span><span style='color:#742d0c '> Generating Class Activation Map (CAM)

</span></b>


In [None]:
def get_attribution(food):
    """
    Generate class activation map for a given food image using Grad-CAM technique.

    Args:
    food (str): Path to the food image.

    Returns:
    numpy.ndarray: Predictions made by the model for the input image.
    """
    # Load and preprocess the input image
    img = image.load_img(food, target_size=(299, 299))
    img = image.img_to_array(img) 
    img /= 255. 

    # Display the input image
    f, ax = plt.subplots(1, 3, figsize=(15, 15))
    ax[0].imshow(img)
    ax[0].set_title("Input Image")

    # Expand the dimensions and predict the class probabilities
    img = np.expand_dims(img, axis=0) 
    preds = model.predict(img)
    class_id = np.argmax(preds[0])

    # Get the class output and last convolutional layer
    class_output = model.output[:, class_id]
    last_conv_layer = model.get_layer("mixed10")
    
    # Calculate gradients and pooled gradients
    grads = K.gradients(class_output, last_conv_layer.output)[0]
    pooled_grads = K.mean(grads, axis=(0, 1, 2))
    iterate = K.function([model.input], [pooled_grads, last_conv_layer.output[0]])
    pooled_grads_value, conv_layer_output_value = iterate([img])

    # Generate heatmap
    for i in range(2048):
        conv_layer_output_value[:, :, i] *= pooled_grads_value[i]
    heatmap = np.mean(conv_layer_output_value, axis=-1)
    heatmap = np.maximum(heatmap, 0)
    heatmap /= np.max(heatmap)
    ax[1].imshow(heatmap)
    ax[1].set_title("Heat map")
    
    # Overlay heatmap on the original image
    act_img = cv2.imread(food)
    heatmap = cv2.resize(heatmap, (act_img.shape[1], act_img.shape[0]))
    heatmap = np.uint8(255 * heatmap)
    heatmap = cv2.applyColorMap(heatmap, cv2.COLORMAP_JET)
    superimposed = cv2.addWeighted(act_img, 0.6, heatmap, 0.4, 0)
    cv2.imwrite('classactivation.png', superimposed)
    img_act = image.load_img('classactivation.png', target_size=(299, 299))
    ax[2].imshow(img_act)
    ax[2].set_title("Class Activation")
    plt.show()
    return preds
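
The heat map step inside `get_attribution` can be looked at on its own. The sketch below uses random arrays in place of the real `mixed10` output and pooled gradients (both are assumptions for illustration) and replaces the per-channel loop with NumPy broadcasting; the `grad_cam_heatmap` name is hypothetical.

```python
import numpy as np

def grad_cam_heatmap(conv_output, pooled_grads):
    """Weight each channel by its pooled gradient, average, apply ReLU, scale to [0, 1]."""
    weighted = conv_output * pooled_grads  # broadcasts over the channel axis
    heatmap = weighted.mean(axis=-1)
    heatmap = np.maximum(heatmap, 0)
    peak = heatmap.max()
    # Guard against an all-zero map to avoid dividing by zero
    return heatmap / peak if peak > 0 else heatmap

rng = np.random.default_rng(0)
conv_output = rng.random((8, 8, 2048))  # stand-in for the 8x8x2048 mixed10 output
pooled_grads = rng.normal(size=2048)    # stand-in for the pooled gradients

heatmap = grad_cam_heatmap(conv_output, pooled_grads)
print(heatmap.shape, heatmap.min(), heatmap.max())
```

The guard on `peak` is a small addition over the loop version: if no region contributes positively to the class, the map stays at zero instead of turning into NaN.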


In [None]:
print("Showing the class map..")
print(class_map_3)


<b><span style='color:black'>Step 7.13 |</span><span style='color:#742d0c '> Getting Attribution and Display Softmax Predictions

</span></b>


In [None]:
# Get attribution for the image 'applepie.jpg' and display the softmax predictions
pred = get_attribution('applepie.jpg')
print("Here are softmax predictions:", pred)


In [None]:
# Get attribution and display softmax predictions for the pizza image
pred = get_attribution('pizza.jpg')
print("Here are softmax predictions:", pred)


 
* We can see that the heat map is different for a different image, i.e. the model looks at entirely different features/regions when it classifies an image as pizza
* Let's see if we can break the model, or at least see what it does when we surprise it with different data!
* We trained our model to perform multi-class classification and it seems to be doing well, with >95% accuracy
* What will the model do when we give it an image containing more than one object that the model is trained to classify?
 <b><span style='color:black'>Step 7.14 |</span><span style='color:#742d0c '> Downloading Images from URLs

</span></b>


In [None]:
# Downloading images from internet using the URLs
!wget -O piepizza.jpg https://raw.githubusercontent.com/theimgclist/PracticeGround/master/Food101/piepizza.jpg
!wget -O piepizzas.png https://raw.githubusercontent.com/theimgclist/PracticeGround/master/Food101/piepizzas.png
!wget -O pizzapie.jpg https://raw.githubusercontent.com/theimgclist/PracticeGround/master/Food101/pizzapie.jpg
!wget -O pizzapies.png https://raw.githubusercontent.com/theimgclist/PracticeGround/master/Food101/pizzapies.png


<b><span style='color:black'>Step 7.15 |</span><span style='color:#742d0c '> Loading and Retrieving Activations

</span></b>


In [None]:
food = 'piepizza.jpg'
activations = get_activations(food,activations_output)

In [None]:
show_activations(activations, layer_names)

In [None]:
pred = get_attribution('piepizza.jpg')
print("Here are softmax predictions..",pred)


* Given an image with both pizza and apple pie, the model thinks it's a pizza with 75.4% confidence and an apple pie with 18% confidence
* Now let's flip the image vertically and see what the model does

In [None]:
food = 'pizzapie.jpg'
activations = get_activations(food,activations_output)

In [None]:
pred = get_attribution('pizzapie.jpg')
print("Here are softmax predictions..",pred)


* Well, the model flipped its output too!
* The model now thinks it's an apple pie with 49.7% confidence and a pizza with 31.9%

<h2 align="left"><font>More surprise data to the model...</font></h2>


In [None]:
food = 'pizzapies.png'
activations = get_activations(food,activations_output)

In [None]:
pred = get_attribution('pizzapies.png')
print("Here are softmax predictions..",pred)


* This time it's apple pie with 73% and pizza with 19% confidence
* Let's try one last horizontal flip, this is the last really!

In [None]:
food = 'piepizzas.png'
activations = get_activations(food,activations_output)

In [None]:
pred = get_attribution('piepizzas.png')
print("Here are softmax predictions..",pred)


* No surprise from the model this time. We flipped the image but the model didn't flip its output
* It's an apple pie again with 52% confidence
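
Reading class names and confidences off the raw softmax array is error prone. A small helper can pair each probability with its label. The sketch below is hypothetical: `describe_prediction` is not part of the notebook, and the `class_map` and `preds` values are made-up examples in the format of `class_indices` and `model.predict`.

```python
import numpy as np

def describe_prediction(preds, class_map):
    """Return (class name, probability) pairs sorted from most to least likely."""
    index_to_name = {index: name for name, index in class_map.items()}
    probs = np.asarray(preds)[0]
    order = np.argsort(probs)[::-1]
    return [(index_to_name[int(i)], float(probs[i])) for i in order]

# Made-up values for illustration only
class_map = {"apple_pie": 0, "third_class": 1, "pizza": 2}
preds = np.array([[0.18, 0.066, 0.754]])

for name, prob in describe_prediction(preds, class_map):
    print(f"{name}: {prob:.1%}")
```

With the notebook's variables, the call would take the array returned by `get_attribution` and the `class_map_3` dictionary.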