## Front Matter
I want this notebook to be as informative as possible, but creating and training a model follows a standard procedure that I do not want to repeat here. If you can, please read the `PROLOGUE/Routine.ipynb` notebook first.

# Paper Implementation - VGG16
Hello, this is my first milestone project: an implementation of the VGG16 architecture from the paper ["Very Deep Convolutional Networks for Large-Scale Image Recognition"](https://arxiv.org/abs/1409.1556). The paper studied the effect of increasing network depth using very small $3 \times 3$ convolution filters, which we previously explored in the `TinyCNN` notebook (that is also why that architecture is called TinyVGG). The architecture was the runner-up in the ImageNet 2014 Challenge for classification.

The 16 in VGG16 stands for 16 weight layers (13 convolutional and 3 fully connected). The network is built from two basic units: convolution with filter size $3 \times 3$, stride $1$, padding $1$, and max-pooling with window size $2 \times 2$, stride $2$. The table below, taken from the paper above, shows the architecture of each VGG configuration. In this notebook, we will implement configuration D (VGG16).
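
To see why these two units are convenient, here is a small sketch that applies the standard output-size formula $\lfloor (n + 2p - k)/s \rfloor + 1$ to a $224 \times 224$ input. The helper name `out_size` is my own, and the list of convolutions per block (2, 2, 3, 3, 3) is how I read configuration D in the table; treat it as an illustration rather than code from the paper.

```python
def out_size(n, kernel, stride, padding=0):
    """Spatial size after a convolution or pooling layer (standard formula)."""
    return (n + 2 * padding - kernel) // stride + 1

size = 224
convs_per_block = [2, 2, 3, 3, 3]  # assumed layout of configuration D
sizes = []
for n_convs in convs_per_block:
    for _ in range(n_convs):
        # 3x3 convolution, stride 1, padding 1: spatial size is unchanged
        size = out_size(size, kernel=3, stride=1, padding=1)
    # 2x2 max-pooling, stride 2: spatial size is halved
    size = out_size(size, kernel=2, stride=2)
    sizes.append(size)

print(sizes)  # [112, 56, 28, 14, 7]
```

The convolutions never change the height and width, so only the five pooling layers shrink the image, from 224 down to 7.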

![](Screenshot 2022-12-23 at 14-16-30 () - 10.48550_arxiv.1409.1556.pdf.png)

In this first notebook, we will focus on getting and transforming the data.

## Downloading and extracting data
The task for our model will be classification, using a bigger dataset called [Food101](https://pytorch.org/vision/main/generated/torchvision.datasets.Food101.html). This is a built-in torchvision dataset, so the processing could be fairly straightforward. However, that would not be much fun, so let's take the [Kaggle version](https://www.kaggle.com/datasets/kmader/food41?select=food_c101_n1000_r384x384x3.h5) and process it into the form we want.

First, we download the data from Kaggle. The easy way: download the zip file (~6 GB), upload it to Google Drive, and then mount Google Drive in Colab. The slightly harder way: sign up for Kaggle, obtain an API token, and then use the `kaggle` module to download the data. Let's do that.

In [None]:
!pip install torchmetrics

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting torchmetrics
  Downloading torchmetrics-0.11.0-py3-none-any.whl (512 kB)
     |████████████████████████████████| 512 kB 4.3 MB/s 
Installing collected packages: torchmetrics
Successfully installed torchmetrics-0.11.0


In [None]:
!pip install --upgrade mlxtend kaggle

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting mlxtend
  Downloading mlxtend-0.21.0-py2.py3-none-any.whl (1.3 MB)
     |████████████████████████████████| 1.3 MB 4.1 MB/s 
Installing collected packages: mlxtend
  Attempting uninstall: mlxtend
    Found existing installation: mlxtend 0.14.0
    Uninstalling mlxtend-0.14.0:
      Successfully uninstalled mlxtend-0.14.0
Successfully installed mlxtend-0.21.0


First, we need a variable that records which environment we are in, because running the notebook directly on Kaggle differs from running it anywhere else (a tip I learned from a Kaggle Grandmaster).

In [None]:
# Import modules
import os
from pathlib import Path

# Non-empty only when the notebook runs inside a Kaggle kernel
iskaggle = os.environ.get('KAGGLE_KERNEL_RUN_TYPE', '')

Next, based on the [docs](https://github.com/Kaggle/kaggle-api), we need to create a `~/.kaggle/kaggle.json` file that holds the API token. You can create the folder and file by hand in your file explorer, or do it in code. I will do it in code.

In [None]:
# Paste your API token (the contents of kaggle.json) here (I ran the cell and then deleted mine)
creds = ''

In [None]:
cred_path = Path('~/.kaggle/kaggle.json').expanduser()
if not cred_path.exists():
    cred_path.parent.mkdir(exist_ok=True)
    cred_path.write_text(creds)
    cred_path.chmod(0o600)
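
For reference, the token that Kaggle lets you download is a small JSON object with a `username` and a `key` field. The sketch below wraps the same logic as the cell above in a hypothetical helper, `write_kaggle_creds` (my name, not part of the `kaggle` package), and runs it against a temporary folder so that it does not touch the real home directory. The credential values are placeholders.

```python
import json
import tempfile
from pathlib import Path

def write_kaggle_creds(creds, cred_path):
    """Write the token to cred_path unless a file is already there."""
    cred_path = Path(cred_path)
    if cred_path.exists():
        return False
    cred_path.parent.mkdir(parents=True, exist_ok=True)
    cred_path.write_text(creds)
    cred_path.chmod(0o600)  # readable and writable by the owner only
    return True

# Placeholder values, not real credentials
example_creds = json.dumps({"username": "your_username", "key": "your_api_key"})

# Demonstrate in a temporary folder instead of the real ~/.kaggle
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / ".kaggle" / "kaggle.json"
    first = write_kaggle_creds(example_creds, target)   # file is created
    second = write_kaggle_creds(example_creds, target)  # existing file is kept
    stored = json.loads(target.read_text())
```

Restricting the file permissions matters because anyone who can read the file can act on Kaggle as you.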

Next, let's use the `dataset_download_cli` method of `kaggle.api` to download the data as a zip file.

In [None]:
# Sanity check
!kaggle datasets list

ref                                                             title                                             size  lastUpdated          downloadCount  voteCount  usabilityRating  
--------------------------------------------------------------  -----------------------------------------------  -----  -------------------  -------------  ---------  ---------------  
meirnizri/covid19-dataset                                       COVID-19 Dataset                                   5MB  2022-11-13 15:47:17          11624        344  1.0              
michals22/coffee-dataset                                        Coffee dataset                                    24KB  2022-12-15 20:02:12           2316         63  1.0              
thedevastator/jobs-dataset-from-glassdoor                       Salary Prediction                                  3MB  2022-11-16 13:52:31           7233        155  1.0              
thedevastator/unlock-profits-with-e-commerce-sales-data         E-Commerce 

In [None]:
path = Path('kmader/food41')

In [None]:
if not iskaggle and not path.exists():
    import kaggle
    kaggle.api.dataset_download_cli(str(path))

Downloading food41.zip to /content


100%|██████████| 5.30G/5.30G [02:58<00:00, 31.9MB/s]







We now have the data in a zip file. All that is left is to extract it.

In [None]:
folder_path = Path('food41')
# exist_ok=True lets the cell be re-run without raising FileExistsError
folder_path.mkdir(exist_ok=True)

In [None]:
import zipfile
# The context manager closes the archive once extraction finishes
with zipfile.ZipFile(f'{folder_path}.zip') as archive:
    archive.extractall(folder_path)

## Food-101
The dataset was introduced in this [paper](https://data.vision.ee.ethz.ch/cvl/datasets_extra/food-101/) and consists of 101 classes, each with 750 training and 250 test images, for 1,000 images per class. It comes with a metadata folder that says which image goes into which subset, which is great! The data is already split into training and test examples, but we also need a *validation set*. According to the dataset authors, the test images were manually reviewed, while the training images were deliberately left with some noise. We will therefore leave the test set untouched and split the training set further to create a validation set. Creating a good validation set is [an art](https://www.fast.ai/posts/2017-11-13-validation-sets.html), but here we will just use good ol' random splitting.

First, we need to sort the images in the `images` folder into `train` and `test` folders. Next, we will load the data. The process is much the same as before; what is new this time is that we randomly split the training data into a training set and a validation set, and apply more transformations.
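
As a minimal sketch of what random splitting means, the snippet below shuffles image indices with a fixed seed and holds out a fraction of them for validation. The helper `split_indices`, the 20% validation fraction, and the seed are my assumptions for illustration. For PyTorch `Dataset` objects, `torch.utils.data.random_split` does the equivalent job.

```python
import random

def split_indices(n_items, valid_fraction=0.2, seed=42):
    """Shuffle indices reproducibly and hold out a fraction for validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # local generator, fixed seed
    n_valid = round(n_items * valid_fraction)
    return indices[n_valid:], indices[:n_valid]

n_train_images = 101 * 750  # 101 classes with 750 training images each
train_idx, valid_idx = split_indices(n_train_images)
print(len(train_idx), len(valid_idx))  # 60600 15150
```

Fixing the seed means the same images land in the validation set on every run, which keeps validation scores comparable between experiments.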

In [None]:
# Generic torch process
from torch import nn
import torch
from torch.utils.data import DataLoader

# Specifically for computer vision
import torchvision
from torchvision import datasets, transforms

# Other module(s)
import matplotlib.pyplot as plt
import gc
import json
import shutil
import itertools 

In [None]:
with open('/content/food41/meta/meta/train.json', 'r') as fp:
    train_dict = json.load(fp)
with open('/content/food41/meta/meta/test.json', 'r') as fp:
    test_dict = json.load(fp)
print(len(train_dict['apple_pie']), train_dict['apple_pie'][-10:])
print(len(test_dict['apple_pie']), test_dict['apple_pie'][-10:])

750 ['apple_pie/960233', 'apple_pie/960669', 'apple_pie/962315', 'apple_pie/966595', 'apple_pie/973088', 'apple_pie/973428', 'apple_pie/98352', 'apple_pie/98449', 'apple_pie/987860', 'apple_pie/997124']
250 ['apple_pie/885848', 'apple_pie/886793', 'apple_pie/904832', 'apple_pie/908367', 'apple_pie/963140', 'apple_pie/981895', 'apple_pie/984571', 'apple_pie/986844', 'apple_pie/99556', 'apple_pie/997950']


Each key of the dictionary is a class name, and its value is a list of image paths without the file extension. We will use these lists to copy the images into the proper folders.
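
To make the mapping concrete, here is a small sketch that turns one metadata entry into a source path and a destination path. The helper `plan_copy` is hypothetical, and the `.jpg` extension is an assumption about how the image files are named. Nothing is copied here; the function only builds the paths.

```python
from pathlib import Path

def plan_copy(entry, split, src_root=Path('food41/images'),
              dst_root=Path('data'), ext='.jpg'):
    """Map a metadata entry such as 'apple_pie/960233' to (source, destination)."""
    class_name, image_id = entry.split('/')
    filename = image_id + ext  # assumed extension
    source = src_root / class_name / filename
    destination = dst_root / split / class_name / filename
    return source, destination

src, dst = plan_copy('apple_pie/960233', 'train')
print(src)
print(dst)
```

The destination layout, with one subfolder per class inside `train` and `test`, is the layout that `datasets.ImageFolder` expects.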

In [None]:
# exist_ok=True lets the cell be re-run without raising FileExistsError
os.makedirs('data', exist_ok=True)

In [None]:
new_data_path = Path('data')
original_data_path = Path('food41/images')
# Map each new subfolder to the metadata dictionary that describes it
splits = {'train': train_dict, 'test': test_dict}
for folder, split_dict in splits.items():
    for key, value in split_dict.items():
        # A set makes the membership test below fast
        value_set = set(value)
        # makedirs also creates the missing parent folder (data/train or data/test)
        os.makedirs(new_data_path/folder/key, exist_ok=True)
        for image in os.listdir(original_data_path/key):
            # Metadata entries look like '<class>/<image id>', without the extension
            image_path = key + '/' + image.split('.')[0]
            if image_path in value_set:
                shutil.copy(original_data_path/key/image, new_data_path/folder/key/image)

And we are done! Now we can load data as we like!

But first, let's write some transformations for the images to perform data augmentation.

## Data Augmentation
Data augmentation is a technique for generating more training data by applying operations to the original data (for images, operations such as flipping, shearing, and rotating). The augmented examples keep the same label as the originals but look different, so hopefully the model is encouraged to learn the general pattern of the data instead of overfitting. Data augmentation is usually applied during training, but there is also a technique called "test-time augmentation" that has been passed down among Kaggle Grandmasters and is implemented in the [fastai](https://docs.fast.ai/learner.html#tta) library.

For the transformations, first we have the staples: `ToTensor()`, which turns images into `torch.Tensor` objects, and `Normalize()`, whose parameters may look arbitrary (the first is a list of means for each color channel and the second is a list of standard deviations for each color channel). These are the normalization statistics of the [ImageNet](https://image-net.org/index.php) dataset, and torchvision models pre-trained on ImageNet expect their inputs to be normalized this way. They are not necessarily the true statistics of our data, but they work well enough. The pre-trained models also expect images of at least $224 \times 224$ pixels, so we resize (or randomly crop) the images to that size. The other transformations do what their names say, with parameters for angle, probability, etc. (Explore more transformations in the PyTorch [docs](https://pytorch.org/vision/stable/transforms.html).) Finally, we call `Compose` to chain these transformations together.
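
As a worked example of what `Normalize()` computes, each channel value $x$ is mapped to $(x - \text{mean}) / \text{std}$ using that channel's mean and standard deviation. The sketch below does this in plain Python for a single mid-grey pixel. The helper names `normalize` and `denormalize` are mine, and the inverse is included because it is handy when plotting normalized images.

```python
mean = [0.485, 0.456, 0.406]  # per-channel ImageNet means (R, G, B)
std = [0.229, 0.224, 0.225]   # per-channel ImageNet standard deviations

def normalize(pixel):
    """Apply (x - mean) / std to each channel of one RGB pixel."""
    return [(x - m) / s for x, m, s in zip(pixel, mean, std)]

def denormalize(pixel):
    """Invert the normalization, e.g. before displaying an image."""
    return [x * s + m for x, m, s in zip(pixel, mean, std)]

grey = [0.5, 0.5, 0.5]  # a pixel after ToTensor(), which scales values to [0, 1]
normalized = normalize(grey)
restored = denormalize(normalized)
print([round(v, 3) for v in normalized])  # [0.066, 0.196, 0.418]
```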



In [None]:
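# Training: random crops, rotations, and flips for augmentation
train_transforms = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomRotation(35),
    transforms.RandomVerticalFlip(0.27),
    transforms.RandomHorizontalFlip(0.27),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

# Validation and test: no augmentation. Resize((224, 224)) forces a fixed size;
# Resize(224) only scales the shorter side, so non-square images would differ in shape
valid_n_test_transforms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])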

In [None]:
data_dir = Path('data')
train_dir = data_dir/'train'
test_dir = data_dir/'test'

In [None]:
train_dataset = datasets.ImageFolder(train_dir, transform = train_transforms)