## Introduction

This tutorial introduces PyTorch, a popular deep learning package for Python. PyTorch provides tensor operations (similar to NumPy arrays) with GPU support, along with functions for building and optimizing neural networks. The tutorial also uses the convenience functions of inferno, a library built around PyTorch. Inferno is relatively new (it was released in 2017) and still under development, but because it makes PyTorch a lot simpler to use, this tutorial recommends it.

### Tutorial content

In this tutorial, we will show how to train a simple convolutional neural network and make predictions on the categories of food images. 

Since the professor will cover the basics of neural networks in future lectures, this tutorial won't explain the concepts in depth. In machine learning, a neural network is built from layers of connected units (nodes). Each unit calculates a weighted sum of the outputs of the previous layer and applies a non-linear function to the result. Convolutional neural networks (CNNs) are a class of deep neural networks commonly used in image recognition because of their shift-invariant (space-invariant) properties: two images that differ by a slight shift should still be recognized as the same class by the model.
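
To make the description of a single unit concrete, here is a minimal NumPy sketch. The function name `unit_output`, the ReLU non-linearity, and all the numbers are illustrative assumptions, not part of the model trained later.

```python
import numpy as np

def unit_output(inputs, weights, bias):
    """One unit: weighted sum of the previous layer's outputs, then ReLU."""
    weighted_sum = np.dot(weights, inputs) + bias
    return np.maximum(weighted_sum, 0.0)  # ReLU: negative sums become 0

previous_layer = np.array([0.5, -1.0, 2.0])  # outputs of the previous layer
weights = np.array([0.2, 0.4, 0.5])          # illustrative values, not learned
print(unit_output(previous_layer, weights, bias=0.1))
```

A full layer simply computes many such units at once, each with its own weights.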

We will cover the following topics in this tutorial:
- [Installing the libraries](#Installing-the-libraries)
- [Preprocessing the Data](#Preprocessing-the-Data)
- [Building the Model](#Building-the-Model)
- [Training the Model](#Training-the-Model)
- [Putting it Together](#Putting-it-Together)
- [Testing the Model](#Testing-the-Model)
- [Visualization](#Visualization)

## Installing the libraries

Before getting started, you'll need to install the various libraries that we will use.  

First, install PyTorch for your Python version and, if you have a GPU, your CUDA version. The corresponding command can be found at http://pytorch.org/.
For example, to install on a Linux machine without GPU support using `pip`:

    $ pip3 install http://download.pytorch.org/whl/cpu/torch-0.3.1-cp36-cp36m-linux_x86_64.whl 
    
    $ pip3 install torchvision
    
Then install inferno following this page: https://pytorch-inferno.readthedocs.io/en/latest/installation.html
It seems that conda doesn't support inferno. If none of the commands in the documentation work, use the following:
    
    $ pip install inferno-pytorch
    
Note that you have to install PyTorch before installing inferno! Also, inferno only works with Python 3.x, not Python 2.x.
(Installing packages can be a hassle, so please double-check that everything is installed before continuing.)

For the visualization later, we also need to install TensorFlow, which comes with TensorBoard. Please check https://www.tensorflow.org/install for the correct installation command for your machine. For example, on a Linux machine with GPU support:
    
    $ pip3 install --upgrade tensorflow-gpu

In [22]:
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Dataset
from inferno.extensions.layers.reshape import Flatten
from inferno.trainers.basic import Trainer
from inferno.trainers.callbacks.logging.tensorboard import TensorboardLogger

## Preprocessing the Data

Now that we've installed and loaded the libraries, let's take a look at our dataset.

The dataset originally comes from a Kaggle competition to label color food images in 101 categories from apple pies to waffles: https://www.kaggle.com/kmader/food41

The training data `food_c101_n10099_r64x64x3.h5` contains three fields: ['category', 'category_names', 'images']. The `images` field contains 10099 images of size 64x64x3 in HDF5 format. The `category` field is a 10099x101 boolean array that is `True` at the correct category for each training sample. The `category_names` field maps the indices `0,1,2...` to the actual category names `'apple_pie', 'baby_back_ribs', 'baklava'`. To use the data in Python, we convert it from HDF5 into NumPy arrays.

The testing data `food_test_c101_n1000_r64x64x3.h5` has the same format as training. It contains 1000 testing samples.
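
The conversion from a one-hot boolean array to integer labels can be sketched on a toy array. The names `one_hot` and `labels` and the 4x3 size are made up for illustration; the real array is 10099x101.

```python
import numpy as np

# Toy one-hot array: 4 samples, 3 categories.
one_hot = np.array([
    [False, True,  False],
    [True,  False, False],
    [False, False, True],
    [False, True,  False],
])

# The column index of the single True entry in each row is the integer label.
labels = np.argmax(one_hot, axis=1)
print(labels)  # [1 0 2 1]

# np.where returns (row indices, column indices); the columns are the same labels.
_, labels_from_where = np.where(one_hot)
print(np.array_equal(labels, labels_from_where))  # True
```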


Train data download: https://www.kaggle.com/kmader/food41/downloads/food_c101_n10099_r64x64x3.h5/4

Test data download: https://www.kaggle.com/kmader/food41/downloads/food_test_c101_n1000_r64x64x3.h5/4

In [24]:
import h5py

# Open read-only: we never modify the HDF5 files
trainf = h5py.File('food_c101_n10099_r64x64x3.h5', 'r')
testf = h5py.File('food_test_c101_n1000_r64x64x3.h5', 'r')
print(list(trainf.keys()))  # display fields: ['category', 'category_names', 'images']

### Train Data
train_images = trainf['images'][()]  # [()] reads the whole dataset into a numpy array
category_names = trainf['category_names'][()]
# Decode category names from bytes to str
category_names = np.array([n.decode('utf-8') if isinstance(n, bytes) else str(n)
                           for n in category_names])
train_labels = trainf['category'][()]
# Transform category from one-hot boolean array to label
_, train_labels = np.where(train_labels)
print("train data size:")
print(train_images.shape)

### Same for Test Data
test_images = testf['images'][()]
test_labels = testf['category'][()]
# Transform category from one-hot boolean array to label
_, test_labels = np.where(test_labels)
print("test data size:")
print(test_images.shape)

### Save to numpy file for future use
np.save('food_train', train_images)
np.save('food_train_labels', train_labels)
np.save('food_test', test_images)
np.save('food_test_labels', test_labels)

['category', 'category_names', 'images']
train data size:
(10099, 64, 64, 3)
test data size:
(1000, 64, 64, 3)


## Building the Model

PyTorch provides many built-in neural network modules for different types of layers. For example, it has `Conv1d` and `Conv2d` for convolution, `MaxPool1d` or `AvgPool1d` for pooling, `Linear` for linear functions, and `ReLU` or `Sigmoid` for non-linearity. Look here for more: http://pytorch.org/docs/master/nn.html

Here we construct a sequence of `Conv1d` and `MaxPool1d` layers, followed by a final linear layer.

In [25]:
def model_fn():
    return nn.Sequential(
        nn.Conv1d(3, 192, 3, 1, 1),  # in_channel, out_channel, kernel, stride, padding
        nn.MaxPool1d(2, 2),  # kernel, stride (2 shrinks the size to half)
        nn.Conv1d(192, 192, 3, 1, 1),
        nn.MaxPool1d(2, 2),
        Flatten(),
        nn.Linear(in_features=1024*192, out_features=101),
    )
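
The `in_features=1024*192` of the final linear layer follows from how each layer changes the sequence length. The sketch below traces that arithmetic in plain Python; the helper `output_length` is a hypothetical name and assumes the standard output-size formula for 1D convolution and pooling with dilation 1.

```python
def output_length(length, kernel, stride, padding=0):
    """Output length of a 1D convolution or pooling layer (dilation 1)."""
    return (length + 2 * padding - kernel) // stride + 1

length = 64 * 64  # each image is flattened to 4096 positions with 3 channels
length = output_length(length, kernel=3, stride=1, padding=1)  # Conv1d    -> 4096
length = output_length(length, kernel=2, stride=2)             # MaxPool1d -> 2048
length = output_length(length, kernel=3, stride=1, padding=1)  # Conv1d    -> 2048
length = output_length(length, kernel=2, stride=2)             # MaxPool1d -> 1024

flattened_features = length * 192  # 192 output channels per position
print(length, flattened_features)  # 1024 196608
```

If you change the kernel size, stride, padding, or number of channels, `in_features` has to be recomputed the same way.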

## Training the Model

With the inferno library, the trainer automates the training process once you have the data and have set all the parameters correctly.

First, let's prepare the dataset. The inferno Trainer takes a DataLoader, which requires a Dataset class. It is common to subclass the abstract Dataset class so that you can load and prepare the data in `__init__()`. One of the functions you are required to implement is `__getitem__(i)`. It takes an index parameter that specifies which data sample is wanted. For example, if `i` is 1, you may return `data[1]`. Notice that the returned data must be a torch tensor and that the corresponding label needs to be returned as well. The `__len__` function indicates how many samples there are in the dataset.

In [26]:
class FoodDataset(Dataset):
    """
    Dataset yields features and labels
    """
    def __init__(self, name, test=False):
        super(FoodDataset, self).__init__()
        self.name = name
        self.data = np.load('food_{}.npy'.format(name))
        # Flatten data
        size = self.data.shape[0]
        self.data = np.reshape(self.data, (size, -1 , 3))
        self.data = self.data.transpose(0, 2, 1)
        if test:  # dummy labels, one per sample, never used for evaluation
            self.labels = np.zeros(size, dtype=np.int64)
        else:
            self.labels = np.load('food_{}_labels.npy'.format(name))  
        self.len = len(self.labels)

    def __getitem__(self, item):  # return feature and label
        return torch.from_numpy(self.data[item]).float(), self.labels[item]
        
    def __len__(self):
        return self.len
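
The reshape and transpose in `__init__` turn each image into the `(channels, length)` layout that `Conv1d` expects. A small NumPy sketch, using two fake 4x4x3 images in place of the real 64x64x3 data (the array names are made up for illustration), shows what happens to the pixels:

```python
import numpy as np

# Two fake "images" of size 4x4x3 stand in for the real 64x64x3 data.
images = np.arange(2 * 4 * 4 * 3).reshape(2, 4, 4, 3)

flat = np.reshape(images, (images.shape[0], -1, 3))  # (2, 16, 3): pixels in row-major order
channels_first = flat.transpose(0, 2, 1)             # (2, 3, 16): channels before positions

print(flat.shape, channels_first.shape)
# Pixel (row 1, col 2) of image 0 ends up at position 1 * 4 + 2 = 6.
print(np.array_equal(channels_first[0, :, 6], images[0, 1, 2]))  # True
```

Note that flattening discards the 2D neighborhood structure: pixels that are vertically adjacent in the image end up a full row apart in the sequence.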


Now that we have a FoodDataset class, the training process initializes a DataLoader with a chosen batch size and binds it to the trainer. The trainer then performs the forward pass and backpropagation as many times as specified. Below are some important settings:
* build_criterion: the loss function used, here we use 'CrossEntropyLoss' built in PyTorch
* build_metric: used for visualization later
* build_optimizer: optimization function, here we use 'Adam' built in PyTorch. There are also other methods such as 'SGD', 'Adagrad'...
* save_every: how often to save our trained model
* save_to_directory: directory to save our trained model
* set_max_num_epochs: number of times to run in terms of 'epochs' or 'iterations'
* build_logger: used for visualization

Notice that some parameters use keywords from PyTorch. Inferno simplifies the use of PyTorch functions. For example, in PyTorch you create the optimizer and perform backpropagation explicitly, whereas in inferno everything is handled for you.

PyTorch:

    optimizer = optim.Adam([var1, var2], lr=0.0001)
    for i in range(epochs):
        ....
        optimizer.step()
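
For comparison, a complete explicit training loop in plain PyTorch might look like the sketch below. It assumes PyTorch 0.4 or newer (where tensors no longer need to be wrapped in `Variable`), and the toy data, the single linear layer, and the learning rate are all made up for illustration.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Toy data: 32 samples, 10 features, 3 classes (hypothetical sizes).
features = torch.randn(32, 10)
targets = torch.randint(0, 3, (32,))

model = nn.Linear(10, 3)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

losses = []
for epoch in range(20):
    optimizer.zero_grad()                       # clear gradients from the previous step
    loss = criterion(model(features), targets)  # forward pass and loss
    loss.backward()                             # backpropagation: compute gradients
    optimizer.step()                            # update the weights
    losses.append(loss.item())

print(losses[0], losses[-1])  # the loss should go down over the 20 steps
```

The inferno Trainer takes care of these same steps once the criterion and optimizer have been set with `build_criterion` and `build_optimizer`.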


In [17]:
def train_model():
    model = model_fn()
    kwargs = {'num_workers': args['num_workers'], 'pin_memory': True} if args['cuda'] else {}
    train_loader = DataLoader(
        FoodDataset('train'), shuffle=True, batch_size=args['batch_size'], **kwargs)

    # Build trainer
    trainer = Trainer(model) \
        .build_criterion('CrossEntropyLoss') \
        .build_metric('CategoricalError') \
        .build_optimizer('Adam') \
        .save_every((1, 'epochs')) \
        .save_to_directory(args['save_directory']) \
        .set_max_num_epochs(20) \
        .build_logger(TensorboardLogger(log_scalars_every=(1, 'iteration'),
                                        log_images_every='never'),
                      log_directory=args['save_directory'])

    # Bind loaders
    trainer \
        .bind_loader('train', train_loader)

    if args['cuda']:
        trainer.cuda()  # Move the model to the GPU if available

    # Go!
    trainer.fit()  # Start training!!
    trainer.save()

## Putting it Together

Let's start training!

*If you only have a CPU, this could be really slow.

In [27]:
args = {}
args['batch_size'] = 64
args['save_directory'] = 'output-log'
args['cuda'] = torch.cuda.is_available()
args['num_workers'] = 4 
train_model()

The output should look something like this, showing your progress:

[+][2018-04-02 02:14:03.100711] Training iteration 0 (batch 1 of epoch 4).  
[+][2018-04-02 02:14:03.234169] Training iteration 1 (batch 2 of epoch 4).  
[+][2018-04-02 02:14:03.372486] Training iteration 2 (batch 3 of epoch 4).  
[+][2018-04-02 02:14:03.516519] Training iteration 3 (batch 4 of epoch 4).  
[+][2018-04-02 02:14:03.665878] Training iteration 4 (batch 5 of epoch 4).  
[+][2018-04-02 02:14:03.817522] Training iteration 5 (batch 6 of epoch 4).  
[+][2018-04-02 02:14:03.965428] Training iteration 6 (batch 7 of epoch 4).  
[+][2018-04-02 02:14:04.112429] Training iteration 7 (batch 8 of epoch 4).  
[+][2018-04-02 02:14:04.258638] Training iteration 8 (batch 9 of epoch 4).  
[+][2018-04-02 02:14:04.404254] Training iteration 9 (batch 10 of epoch 4).  
[+][2018-04-02 02:14:04.550624] Training iteration 10 (batch 11 of epoch 4).  
.....
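
To get a feel for how long the log will be, you can estimate the number of batches per epoch. This is a rough sketch that assumes the DataLoader keeps the final, smaller batch (its default behavior) and that the trainer logs one iteration per batch.

```python
import math

num_samples = 10099  # training images
batch_size = 64      # args['batch_size']
num_epochs = 20      # set_max_num_epochs(20)

batches_per_epoch = math.ceil(num_samples / batch_size)
last_batch_size = num_samples - (batches_per_epoch - 1) * batch_size
total_iterations = batches_per_epoch * num_epochs

print(batches_per_epoch, last_batch_size, total_iterations)  # 158 51 3160
```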

## Testing the Model

Now it's time to see how well our model was trained! To test, we repeat similar steps. First, we initialize a FoodDataset with `test=True` so that the labels are not used. Then we loop through the DataLoader and feed each sample into the model. The output for a single sample is a vector of 101 scores, and the index of the maximum value is the predicted category.

Remember to set `shuffle=False` in the DataLoader because we want to preserve the sample order for comparison with the ground truth.
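
The prediction and accuracy steps can be sketched with NumPy on made-up scores. The arrays `scores` and `ground_truth` are hypothetical and use 5 categories instead of 101.

```python
import numpy as np

# Hypothetical model outputs: 4 samples, 5 categories.
scores = np.array([
    [0.1, 2.0, 0.3, 0.0, -1.0],
    [1.5, 0.2, 0.1, 0.0, 0.3],
    [0.0, 0.1, 0.2, 3.0, 0.4],
    [0.2, 0.1, 0.9, 0.3, 0.0],
])
ground_truth = np.array([1, 0, 2, 2])

predictions = scores.argmax(axis=1)              # index of the largest score per sample
accuracy = np.mean(predictions == ground_truth)  # fraction of correct predictions
chance_level = 1 / 101                           # uniform guessing over 101 categories

print(predictions, accuracy)  # [1 0 3 2] 0.75
```

When reading the test accuracy, keep the chance level in mind: guessing uniformly among 101 categories would be right about 1% of the time.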

In [20]:
from torch.autograd import Variable

# Model setup, similar to training
model = Trainer().load('output-log').model
model.eval()  # Very important!!
kwargs = {'num_workers': args['num_workers'], 'pin_memory': True} if args['cuda'] else {}
test_loader = DataLoader(
    FoodDataset('test', test=True), shuffle=False,
    batch_size=1, **kwargs)

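# Make predictions
if args['cuda']:
    model.cuda()  # move the model to the GPU once, before the loop
result = []
for sample, _ in test_loader:
    if args['cuda']:
        sample = sample.cuda()
    output = model(Variable(sample))
    pred = torch.max(output, 1)[1]  # index of the max score = predicted category
    result.append(int(pred.data[0]))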

# Calculate accuracy
label = np.load('food_test_labels.npy')
count = 0
for pred, truth in zip(result, label):
    if pred == truth:
        count += 1
print(count / len(result))


0.035


## Visualization

As you can see, there isn't much output during the training. It would be a waste of time if the only way to verify our model is by testing in the end. Luckily, inferno can be used with tensorboard to log information and track the current loss.

Start TensorBoard in a bash shell:

    $ tensorboard --logdir <directory>

The directory should be the same as `save_to_directory` in the Trainer. In our case, this is `args['save_directory'] = 'output-log'`.

Then open your browser and go to port 6006 on localhost. You will see plots of the training loss and error.

    http://localhost:6006
    
<img src="tensorboard.png">

From the visualization, we can speculate about the reason behind the low test accuracy: it is probable that the model has overfitted.

## Summary and references

This tutorial gives an example of how to use the inferno package to train deep neural networks in Python. There is a lot more you can do with neural networks! Furthermore, it usually requires a lot of tuning and optimization to achieve the best results.
For more complicated usage, you can visit the documentation pages below.

1. inferno documentation: https://pytorch-inferno.readthedocs.io/en/latest/
2. PyTorch documentation: http://pytorch.org/docs/0.3.1/