# Challenge 1: Classification
In this challenge, you're given a food classification dataset which has 101 classes. You need to analyze and preprocess the dataset as well as build deep learning models for performing food classification. 
<br>
Three models are to be trained for this task: a light, a medium, and a heavy model. <br>
Examples: <br>
Light model - MobileNetV2 <br>
Medium model - ResNet50 <br>
Heavy model - VGG19 <br>
<br>
The models listed above are only examples. You are free to choose any deep learning model to train. 

**Main Objective**:
You are supposed to use both TensorFlow and PyTorch for this task. You need to train one model for each framework. (You can use one of the frameworks again for the third model)

## Summary 

Create a table for your train and test accuracy as well as speed for each model (mention the framework used for training)

# Analyze the dataset
## Objectives
1. Upload the dataset provided (Google Drive link). 
2. Extract the dataset. 
3. Re-arrange dataset into training and testing folders. 
4. List number of samples in training and testing folders. 
5. Plot sample images from training and testing datasets. 
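
For objective 4, a per-class count is often more informative than a single total, since it also reveals class imbalance. The following is a minimal sketch, assuming the `train`/`test` folders contain one sub-folder per class; `count_images_per_class` and `summarize` are hypothetical helper names, not part of the original notebook.

```python
import os
from collections import Counter


def count_images_per_class(root):
    """Return a Counter mapping each class folder under `root` to its file count."""
    counts = Counter()
    for entry in sorted(os.listdir(root)):
        class_dir = os.path.join(root, entry)
        if os.path.isdir(class_dir):  # skip stray files next to the class folders
            counts[entry] = len(os.listdir(class_dir))
    return counts


def summarize(counts):
    """Return (number of classes, total images, smallest class, largest class)."""
    sizes = list(counts.values())
    return len(sizes), sum(sizes), min(sizes), max(sizes)
```

Calling `summarize(count_images_per_class('/content/train'))` after the folders have been created would give the class count, the total, and the smallest and largest class sizes in one tuple.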

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
import os
import glob
from collections import defaultdict
import collections
from shutil import copy
from shutil import copytree, rmtree

### Your Response/Notes

You can summarize your work for this section here/give any explanations if required. 


In [None]:
# Extract the dataset
!unzip /content/drive/MyDrive/food-101.zip

**Re-arranging dataset into folders**

In [None]:
# Function to re-arrange the dataset
def prepare_data(filepath, src, dest):
  classes_images = defaultdict(list)
  with open(filepath, 'r') as txt:
      paths = [read.strip() for read in txt.readlines()]
      for p in paths:
        food = p.split('/')
        classes_images[food[0]].append(food[1] + '.jpg')

  for food in classes_images.keys():
    print("\nCopying images into ",food)
    if not os.path.exists(os.path.join(dest,food)):
      os.makedirs(os.path.join(dest,food))
    for i in classes_images[food]:
      copy(os.path.join(src,food,i), os.path.join(dest,food,i))
  print("Copying Done!")

In [None]:
# create training data
prepare_data('/content/food-101/meta/train.txt','/content/food-101/images','/content/train')

In [None]:
n_samples = [len(os.listdir(os.path.join('/content/train', folder))) for folder in os.listdir('/content/train')]
print("Total number of samples in train folder:", sum(n_samples))

Total number of samples in train folder: 75750


In [None]:
# create testing data
prepare_data('/content/food-101/meta/test.txt','/content/food-101/images','/content/test')

In [None]:
n_samples = [len(os.listdir(os.path.join('/content/test', folder))) for folder in os.listdir('/content/test')]
print("Total number of samples in test folder:", sum(n_samples))

# Pre-process Images
## Objectives
1. Implement preprocessing codes for each model. 
2. Augment the dataset. 
3. Preview the preprocessed dataset. 
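
The models in this notebook expect differently scaled inputs. The Keras `preprocess_input` functions for MobileNetV2 and InceptionResNetV2 map pixel values from [0, 255] to [-1, 1], while the torchvision pipeline first scales to [0, 1] and then standardizes each channel with the ImageNet mean and standard deviation. The plain-Python sketch below illustrates both conventions on a single RGB pixel; the function names are hypothetical and the code is only an illustration of the arithmetic, not the library implementations.

```python
# ImageNet channel statistics, as used by the torchvision transforms in this notebook
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def scale_to_unit_range(pixel):
    """Map one RGB pixel from [0, 255] to [-1, 1]."""
    return tuple(channel / 127.5 - 1.0 for channel in pixel)


def normalize_imagenet(pixel):
    """Scale one RGB pixel to [0, 1], then standardize each channel."""
    return tuple(
        (channel / 255.0 - mean) / std
        for channel, mean, std in zip(pixel, IMAGENET_MEAN, IMAGENET_STD)
    )


print(scale_to_unit_range((0, 127.5, 255)))  # (-1.0, 0.0, 1.0)
print(normalize_imagenet((255, 255, 255)))
```

Feeding a model inputs scaled with the wrong convention usually degrades accuracy, so each model should be paired with its own preprocessing.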

### Your Response/Notes

You can summarize your work for this section here/give any explanations if required. 


### Preprocessing steps for light model


In [None]:
import matplotlib.pyplot as plt
import numpy as np
import os
import tensorflow as tf

from tensorflow.keras.preprocessing import image_dataset_from_directory

In [None]:
train_dir = '/content/train'
test_dir = '/content/test'

BATCH_SIZE = 64
IMG_SIZE = (160, 160)

In [None]:
train_dataset = image_dataset_from_directory(train_dir,
                                             shuffle=True,
                                             batch_size=BATCH_SIZE,
                                             image_size=IMG_SIZE)

Found 75750 files belonging to 101 classes.


In [None]:
test_dataset = image_dataset_from_directory(test_dir,
                                            shuffle=True,
                                            batch_size=BATCH_SIZE,
                                            image_size=IMG_SIZE)

Found 25250 files belonging to 101 classes.


In [None]:
print("Number of train batches: %d" % tf.data.experimental.cardinality(train_dataset))
print("Number of test batches: %d" % tf.data.experimental.cardinality(test_dataset))

Number of train batches: 1184
Number of test batches: 395
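
These batch counts follow directly from the dataset sizes and the batch size: the last batch is simply smaller than the others. A short sketch of the arithmetic (the helper names are hypothetical):

```python
import math


def num_batches(num_samples, batch_size):
    """Number of batches when the final, partially filled batch is kept."""
    return math.ceil(num_samples / batch_size)


def last_batch_size(num_samples, batch_size):
    """Size of the final batch (equals batch_size when the split is exact)."""
    return num_samples % batch_size or batch_size


print(num_batches(75750, 64), last_batch_size(75750, 64))  # 1184 38
print(num_batches(25250, 64), last_batch_size(25250, 64))  # 395 34
```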


In [None]:
AUTOTUNE = tf.data.AUTOTUNE

train_dataset = train_dataset.prefetch(buffer_size=AUTOTUNE)
test_dataset = test_dataset.prefetch(buffer_size=AUTOTUNE)

In [None]:
data_augmentation = tf.keras.Sequential([
  #tf.keras.layers.experimental.preprocessing.Rescaling(1./127.5),
  tf.keras.layers.experimental.preprocessing.RandomFlip('horizontal_and_vertical'),
  tf.keras.layers.experimental.preprocessing.RandomRotation((-0.2,0.2)),
  tf.keras.layers.experimental.preprocessing.RandomZoom(.2, .2),
])

In [None]:
# Create the base model from the pre-trained model MobileNet V2
IMG_SHAPE = IMG_SIZE + (3,)
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
                                               include_top=False,
                                               weights='imagenet')

In [None]:
image_batch, label_batch = next(iter(train_dataset))
feature_batch = base_model(image_batch)
print(feature_batch.shape)
print(image_batch.shape)
print(label_batch.shape)

### Preprocessing steps for medium model

### Preprocessing steps for heavy model

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import os
import tensorflow as tf

from tensorflow.keras.preprocessing import image_dataset_from_directory

In [None]:
train_dir = '/content/train'
test_dir = '/content/test'

BATCH_SIZE = 64
IMG_SIZE = (299, 299)

In [None]:
train_dataset = image_dataset_from_directory(train_dir,
                                             shuffle=True,
                                             batch_size=BATCH_SIZE,
                                             image_size=IMG_SIZE)

Found 75750 files belonging to 101 classes.


In [None]:
test_dataset = image_dataset_from_directory(test_dir,
                                            shuffle=True,
                                            batch_size=BATCH_SIZE,
                                            image_size=IMG_SIZE)

Found 25250 files belonging to 101 classes.


In [None]:
print("Number of train batches: %d" % tf.data.experimental.cardinality(train_dataset))
print("Number of test batches: %d" % tf.data.experimental.cardinality(test_dataset))

Number of train batches: 1184
Number of test batches: 395


In [None]:
AUTOTUNE = tf.data.AUTOTUNE

train_dataset = train_dataset.prefetch(buffer_size=AUTOTUNE)
test_dataset = test_dataset.prefetch(buffer_size=AUTOTUNE)

In [None]:
data_augmentation = tf.keras.Sequential([
  #tf.keras.layers.experimental.preprocessing.Rescaling(1./127.5),
  tf.keras.layers.experimental.preprocessing.RandomFlip('horizontal'),
  tf.keras.layers.experimental.preprocessing.RandomRotation((-0.2,0.2)),
  tf.keras.layers.experimental.preprocessing.RandomZoom(.2, .2),
])

In [None]:
# Create the base model from the pre-trained model InceptionResNetV2
IMG_SHAPE = IMG_SIZE + (3,)
base_model = tf.keras.applications.InceptionResNetV2(input_shape=IMG_SHAPE,
                                               include_top=False,
                                               weights='imagenet')

In [None]:
base_model.summary()

In [None]:
image_batch, label_batch = next(iter(train_dataset))
feature_batch = base_model(image_batch)
print(feature_batch.shape)
print(image_batch.shape)
print(label_batch.shape)

# Training different models
## Objectives
1. Obtain 90% accuracy in all the models trained. 
2. You're free to use any techniques for training such as transfer learning, knowledge transfer, etc. 
3. The models should not overfit the training dataset. 
4. Measure the performance in terms of accuracy and speed of each model. 
5. Visualize the training and testing performance using TensorBoard. 

#### Optional:
1. Apply weight quantization to increase the speed of the models. 
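
To make the optional objective concrete: weight quantization stores each float weight as a small integer plus a shared scale factor. The sketch below shows symmetric per-tensor int8 quantization in plain Python. It is only an illustration of the idea under simplifying assumptions (one scale per tensor, no zero point); the function names are hypothetical, and in practice a framework tool such as TensorFlow Lite post-training quantization would be used instead.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: float weights -> (int8 values, scale)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # round to the nearest integer step and clamp to the int8 range used here
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.27]
q, scale = quantize_int8(weights)
print(q)  # [50, -127, 0, 127]
print(dequantize(q, scale))
```

The rounding error per weight is at most half a quantization step (`scale / 2`), which is why accuracy usually drops only slightly while the stored weights shrink from 32 to 8 bits.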

### Your Response/Notes

1. Light Model - Fine-tuned MobileNetV2: unfroze the layers from index 100 onwards and trained the network.

2. Medium Model - Trained by transfer learning on a ResNet50 base model for the first 25 epochs. After this, unfroze all the layers of the base model, loaded the weights saved in the transfer learning step, and continued fine-tuning the model.

3. Heavy Model - Followed the same method: transfer learning first on the base model (InceptionResNetV2), then fine-tuning starting from the weights of the transfer learning step.


## Train Light model

In [None]:
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping, CSVLogger, ReduceLROnPlateau
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
from tensorflow.keras.models import load_model

In [None]:
global_average_layer = tf.keras.layers.GlobalAveragePooling2D()
feature_batch_average = global_average_layer(feature_batch)
print(feature_batch_average.shape)

(64, 1280)


In [None]:
prediction_layer = tf.keras.layers.Dense(101, activation='softmax')
prediction_batch = prediction_layer(feature_batch_average)
print(prediction_batch.shape)

(64, 101)


In [None]:
base_model.trainable = True

In [None]:
# Let's take a look to see how many layers are in the base model
print("Number of layers in the base model: ", len(base_model.layers))

# Fine-tune from this layer onwards
fine_tune_at = 100

# Freeze all the layers before the `fine_tune_at` layer
for layer in base_model.layers[:fine_tune_at]:
  layer.trainable =  False

Number of layers in the base model:  154


In [None]:
inputs = tf.keras.Input(shape=(160, 160, 3))
x = data_augmentation(inputs)
x = preprocess_input(x)
x = base_model(x, training=False)
x = global_average_layer(x)
x = tf.keras.layers.Dropout(0.2)(x)
outputs = prediction_layer(x)
model = tf.keras.Model(inputs, outputs)

In [None]:
base_learning_rate = 0.0001
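# prediction_layer already applies softmax, so the loss receives probabilities, not logits
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=base_learning_rate),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])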
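
Because `prediction_layer` ends in a softmax, the model outputs probabilities rather than logits, and the loss's `from_logits` flag has to agree with that. The plain-Python sketch below illustrates what goes wrong when probabilities are treated as logits and pushed through a second softmax; all names are hypothetical and this is an illustration of the arithmetic, not the Keras implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_from_probs(probs, label):
    """Cross-entropy when the inputs are already probabilities."""
    return -math.log(probs[label])


def cross_entropy_from_logits(logits, label):
    """Cross-entropy when the inputs are raw scores: softmax is applied first."""
    return cross_entropy_from_probs(softmax(logits), label)


logits = [4.0, 1.0, 0.0]
probs = softmax(logits)
loss_matched = cross_entropy_from_probs(probs, 0)
loss_double_softmax = cross_entropy_from_logits(probs, 0)  # probabilities mistaken for logits
print(loss_matched, loss_double_softmax)
```

The second softmax flattens the distribution, so the mismatched loss stays high even for a confident, correct prediction.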

In [None]:
model.summary()

In [None]:
# Loading weights from Transfer Learning model, to get pretrained weights of the last FC layer

In [None]:
model.load_weights('/content/drive/MyDrive/model_light_tl.h5')

In [None]:
log_addr = '/content/drive/MyDrive/'
csv_logger = CSVLogger(log_addr+"model_light_logs_ft.csv", append=True, separator=';')
model_checkpoint = ModelCheckpoint(log_addr+'model_light_ft.h5', monitor='val_loss', verbose=1, save_best_only=True, save_weights_only=False)
early_stopping = EarlyStopping(monitor='val_loss', patience=10, mode='min')
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, verbose=1, mode="min")
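
The `patience` argument controls how many consecutive epochs without improvement are tolerated. The sketch below is a simplified model of that rule in plain Python, assuming any strict decrease in validation loss counts as an improvement; `early_stopping_epoch` is a hypothetical helper and not the Keras source.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None if it never stops.

    Simplified patience rule: stop once `patience` consecutive epochs
    have failed to beat the best validation loss seen so far.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


losses = [1.0, 0.9, 0.95, 0.96, 0.97]
print(early_stopping_epoch(losses, patience=3))   # 5
print(early_stopping_epoch(losses, patience=10))  # None
```

Since `ReduceLROnPlateau` uses a shorter patience than `EarlyStopping` here, the learning rate is lowered at least once before training is abandoned.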

In [None]:
from datetime import datetime
logs = "logs/" + datetime.now().strftime("%Y%m%d-%H%M%S")

tboard_callback = tf.keras.callbacks.TensorBoard(log_dir = logs,
                                                 histogram_freq = 1)

In [None]:
history = model.fit(train_dataset,
                    epochs=60,
                    validation_data=test_dataset,
                    callbacks=[csv_logger,model_checkpoint,early_stopping,reduce_lr,tboard_callback])

In [None]:
model.save('/content/drive/MyDrive/model_light.h5')

In [None]:
%load_ext tensorboard

In [None]:
%tensorboard --logdir=logs

In [None]:
#### Inference
import time
from tensorflow.keras.models import load_model
model = load_model('/content/drive/MyDrive/model_light.h5')
start = time.time()
model.predict(test_dataset, steps=395)
end = time.time()
print("Time taken: ", end-start)

Time taken:  48.252076148986816
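
The figure above is a single wall-clock measurement, and the first call to a model can include one-off setup cost. A hedged sketch of a reusable timing helper follows; `time_inference`, `predict_fn`, and `warmup` are hypothetical names, and plain Python lists stand in for the dataset batches.

```python
import time


def time_inference(predict_fn, batches, warmup=1):
    """Time `predict_fn` over `batches`; return (elapsed_seconds, items_per_second)."""
    batches = list(batches)
    for batch in batches[:warmup]:
        predict_fn(batch)  # untimed warm-up calls absorb one-off setup cost
    n_items = 0
    start = time.perf_counter()
    for batch in batches:
        predict_fn(batch)
        n_items += len(batch)
    elapsed = time.perf_counter() - start
    rate = n_items / elapsed if elapsed > 0 else float("inf")
    return elapsed, rate


# Stand-in "model" that just sums each batch
elapsed, rate = time_inference(sum, [[1, 2, 3], [4, 5]])
print(f"{elapsed:.6f} s, {rate:.0f} items/s")
```

Reporting items per second alongside total time makes results comparable across models evaluated on different numbers of images.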


## Train Medium model

In [None]:
#### Inference

In [None]:
# Import necessary PyTorch libraries
import torch
from torch import nn
from torch import optim
import torch.nn.functional as F
from torchvision import datasets, transforms, models
import numpy as np
import time

In [None]:
train_transforms = transforms.Compose([transforms.RandomResizedCrop(224),
                                       transforms.RandomHorizontalFlip(),
                                       transforms.RandomVerticalFlip(),
                                       transforms.RandomRotation(45),
                                       transforms.RandomAffine(45),
                                       transforms.ColorJitter(),
                                       transforms.ToTensor(),
                                       transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                                            std=[0.229, 0.224, 0.225])])

# Plain resize for testing; the commented-out lines below sketch optional 10-crop Test Time Augmentation
test_transforms = transforms.Compose([transforms.Resize((224,224)),
                                      transforms.ToTensor(),
                                      transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                                            std=[0.229, 0.224, 0.225])])
                                      #transforms.TenCrop(224),
                                      #transforms.Lambda(lambda crops: torch.stack([transforms.ToTensor()(crop) for crop in crops])),
                                      #transforms.Lambda(lambda crops: torch.stack([transforms.Normalize(mean=[0.485, 0.456, 0.406],std=[0.229, 0.224, 0.225])(crop) for crop in crops]))])

# Load the datasets with ImageFolder
train_data = datasets.ImageFolder("/content/train", transform=train_transforms)
test_data = datasets.ImageFolder("/content/test", transform=test_transforms)

# Using the image datasets and the transforms, define the dataloaders
train_loader = torch.utils.data.DataLoader(train_data, batch_size = 64, shuffle = True)
test_loader = torch.utils.data.DataLoader(test_data, batch_size= 64, shuffle = True)
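
The commented-out `TenCrop` lines refer to test-time augmentation: the model scores several crops of the same image and the scores are averaged before picking a class. In PyTorch this usually means reshaping the output to `(batch, n_crops, n_classes)` and taking the mean over the crop dimension. The sketch below shows the same averaging for one image in plain Python; the function names are hypothetical.

```python
def average_crop_scores(crop_scores):
    """Average class scores over crops.

    crop_scores: list with one entry per crop, each a list of per-class scores.
    """
    n_crops = len(crop_scores)
    n_classes = len(crop_scores[0])
    return [sum(crop[c] for crop in crop_scores) / n_crops for c in range(n_classes)]


def predict_with_tta(crop_scores):
    """Return the index of the class with the highest averaged score."""
    mean_scores = average_crop_scores(crop_scores)
    return max(range(len(mean_scores)), key=mean_scores.__getitem__)


# Three crops of one image, two classes: the first crop alone would vote for class 0
crops = [[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]
print(predict_with_tta(crops))  # 1
```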

In [None]:
# Load the Final saved model
checkpoint = torch.load("/content/drive/My Drive/resnet50model.pth", map_location='cpu')

model = models.resnet50(pretrained=False)

classifier = nn.Linear(2048, 101)
model.fc = classifier

model.load_state_dict(checkpoint['model_state'], strict=False)

# specify loss function (categorical cross-entropy) same as used earlier
criterion = nn.CrossEntropyLoss()

In [None]:
train_on_gpu = torch.cuda.is_available()

In [None]:

# Create list of class names
with open('food-101/meta/classes.txt', 'r') as txt:
    classes = [l.strip() for l in txt.readlines()]

# track test loss
test_loss = 0.0
class_correct = list(0. for i in range(len(classes)))
class_total = list(0. for i in range(len(classes)))

start = time.time()
# move the model to the GPU only when one is available
if train_on_gpu:
  model.cuda()
model.eval()

# iterate over test data

with torch.no_grad():
  for data, target in test_loader:
    # move tensors to GPU if CUDA is available
    if train_on_gpu:
      data, target = data.cuda(), target.cuda()

    # forward pass: compute predicted outputs by passing inputs to the model
    # (for 10-crop testing, `data` would first be reshaped to (batch * crops, c, h, w))
    output = model(data)
    # calculate the batch loss
    #loss = criterion(output, target)
    # update average test loss
    #test_loss += loss.item()*data.size(0)
    # convert output scores to predicted class
    _, pred = torch.max(output, 1)
    # compare predictions to true label
    correct_tensor = pred.eq(target.view_as(pred))
    correct = correct_tensor.cpu().numpy()
    # accumulate per-class counts of correct predictions and samples
    for i in range(len(target)):
      label = target[i].item()
      class_correct[label] += correct[i].item()
      class_total[label] += 1
    
# average test loss
#test_loss = test_loss/len(test_loader.dataset)
#print('Test Loss: {:.6f}\n'.format(test_loss))
'''
for i in range(len(classes)):
    if class_total[i] > 0:
        print('Test Accuracy of %5s: %.2f%% (%2d/%2d)' % (classes[i], 100 * class_correct[i] / class_total[i],
                                                         np.sum(class_correct[i]), np.sum(class_total[i])))
    else:
        print('Test Accuracy of %5s: N/A (no training examples)' % (classes[i]))
'''
print('\nTest Accuracy (Overall): %.2f%% (%2d/%2d)' % (100. * np.sum(class_correct) / np.sum(class_total),
                                                      np.sum(class_correct), np.sum(class_total)))
end=time.time()
print('Time taken: ', end-start)
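
The per-class bookkeeping in the loop above can be checked independently of PyTorch. The following is a minimal sketch that computes per-class and overall top-1 accuracy from plain lists of integer labels; `per_class_accuracy` is a hypothetical helper name.

```python
from collections import defaultdict


def per_class_accuracy(targets, predictions):
    """Return ({label: accuracy}, overall_accuracy) for two equal-length label lists."""
    if not targets:
        raise ValueError("targets must not be empty")
    correct = defaultdict(int)
    total = defaultdict(int)
    for target, predicted in zip(targets, predictions):
        total[target] += 1
        correct[target] += int(target == predicted)
    per_class = {label: correct[label] / total[label] for label in total}
    overall = sum(correct.values()) / sum(total.values())
    return per_class, overall


per_class, overall = per_class_accuracy([0, 0, 1, 1, 1], [0, 1, 1, 1, 0])
print(per_class[0], overall)  # 0.5 0.6
```

Sorting `per_class` by value is a quick way to find the food categories the model confuses most.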

## Train Heavy model

In [None]:
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping, CSVLogger, ReduceLROnPlateau
from tensorflow.keras.applications.inception_resnet_v2 import preprocess_input
from tensorflow.keras.models import load_model

In [None]:
global_average_layer = tf.keras.layers.GlobalAveragePooling2D()
feature_batch_average = global_average_layer(feature_batch)
print(feature_batch_average.shape)

(64, 1536)


In [None]:
fc_layer = tf.keras.layers.Dense(512, activation='relu')
prediction_layer = tf.keras.layers.Dense(101, activation='softmax')
prediction_batch = prediction_layer(feature_batch_average)
print(prediction_batch.shape)

(64, 101)


In [None]:
base_model.trainable = True

In [None]:
inputs = tf.keras.Input(shape=(299, 299, 3))
x = data_augmentation(inputs)
x = preprocess_input(x)
x = base_model(x, training=False)
x = global_average_layer(x)
x = tf.keras.layers.Dropout(0.2)(x)
#x = fc_layer(x)
#x = tf.keras.layers.Dropout(0.2)(x)
outputs = prediction_layer(x)
model = tf.keras.Model(inputs, outputs)

In [None]:
# Using the pre-trained weight for last FC layer from Transfer Learning
model.load_weights('/content/drive/MyDrive/model_heavy.h5')

In [None]:
log_addr = '/content/drive/MyDrive/'
csv_logger = CSVLogger(log_addr+"model_heavy_logs_ft3.csv", append=True, separator=';')
model_checkpoint = ModelCheckpoint(log_addr+'model_heavy_ft3.h5', monitor='val_loss', verbose=1, save_best_only=True, save_weights_only=False)
early_stopping = EarlyStopping(monitor='val_loss', patience=5, mode='min')
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=3, verbose=1, mode="min")

In [None]:
from datetime import datetime
logs = "logs/" + datetime.now().strftime("%Y%m%d-%H%M%S")

tboard_callback = tf.keras.callbacks.TensorBoard(log_dir = logs,
                                                 histogram_freq = 1)

In [None]:
base_learning_rate = 0.00005
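# prediction_layer already applies softmax, so the loss receives probabilities, not logits
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=base_learning_rate),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])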

In [None]:
model.summary()

In [None]:
history_lr2 = model.fit(train_dataset,
                    epochs=60,
                    validation_data=test_dataset,
                    callbacks = [model_checkpoint, early_stopping, csv_logger, reduce_lr, tboard_callback])

In [None]:
%load_ext tensorboard

The tensorboard extension is already loaded. To reload it, use:
  %reload_ext tensorboard


In [None]:
%tensorboard --logdir=logs

In [None]:
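# Save under a heavy-model filename so the light model's file is not overwritten
# (the filename is a suggestion; adjust as needed)
model.save('/content/drive/MyDrive/model_heavy_final.h5')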

In [None]:
#### Inference
import time
start = time.time()
y_pred = model.predict(test_dataset, steps=395)
end = time.time()
print('Time taken:', end-start)

Time taken: 131.5124113559723


### Summary

| Type   | Model             | Framework  | Training Acc | Training Loss | Test Acc | Test Loss | Speed    |
|--------|-------------------|------------|--------------|---------------|----------|-----------|----------|
| Light  | MobileNetV2       | TensorFlow | 0.7819       | 0.7617        | 0.7204   | 1.0571    | 48.25 s  |
| Medium | ResNet50          | PyTorch    | 0.8462       | 0.5685        | 0.8615   | 0.5290    | 198.30 s |
| Heavy  | InceptionResNetV2 | TensorFlow | 0.9131       | 0.2933        | 0.8516   | 0.5660    | 131.51 s |

Please note:
1. In the column "Speed", we have calculated the inference time on the test dataset, i.e., the time taken to predict for 25250 images.
2. The Heavy model shows a faster inference time than the Medium model because of tf.data pipelining, which reduces GPU idle time significantly. A ResNet50 model trained using TensorFlow would be expected to have a faster inference time than the Heavy model.
3. The training notebook for the Medium model, i.e., ResNet50 implemented in PyTorch, is separate. In this notebook we have only run inference for the PyTorch-trained model.
4. The Heavy model could be tuned further, but it takes a long time to train and I did not have any GPU available.
5. The outputs in the cells do not correspond to the best results achieved.
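
Since the "Speed" column reports total time over the 25250 test images, it can be converted to throughput and per-image latency for easier comparison. A small sketch using the values from the table (the variable and function names are hypothetical):

```python
NUM_TEST_IMAGES = 25250

# Total inference time in seconds, taken from the summary table
inference_seconds = {
    "MobileNetV2 (TensorFlow)": 48.25,
    "ResNet50 (PyTorch)": 198.30,
    "InceptionResNetV2 (TensorFlow)": 131.51,
}


def throughput(num_images, seconds):
    """Images processed per second."""
    return num_images / seconds


def latency_ms(num_images, seconds):
    """Average time per image in milliseconds."""
    return 1000.0 * seconds / num_images


for name, seconds in inference_seconds.items():
    print(f"{name}: {throughput(NUM_TEST_IMAGES, seconds):.1f} images/s, "
          f"{latency_ms(NUM_TEST_IMAGES, seconds):.2f} ms/image")
```

These figures include data loading and batching overhead, so they describe the whole pipeline rather than the model alone.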