# ASU Hackathon Mascot Detection notebook

In this notebook, we will train the model on the dataset collected with the ``classification_task/data_collection.ipynb`` notebook on the Jetbot.

We will use the PyTorch deep learning framework to train the model. We start from a pre-trained ResNet-18 model, which is well suited to our purposes. The ResNet-18 provided by PyTorch is pre-trained on ImageNet, which has 1000 classes, whereas the dataset we collected has 5 classes (or labels), so we will replace the last layer of ResNet-18 in this notebook.

We will train on the Jetson Nano in this notebook; however, server-grade GPUs are recommended for training. Remember that the number of GPU cores has a large effect on execution time. The execution time of training a neural network depends on the following:
1. Neural network architecture (layers, depth, etc.),
2. Number of epochs,
3. Number of training samples,
4. Batch size
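
As a rough illustration of how the last three factors interact, the sketch below counts the optimizer steps a training run performs. The helper name ``training_steps`` and the sample counts are hypothetical, chosen only for illustration; real execution time also depends on the hardware and the architecture.

```python
import math

def training_steps(num_samples, batch_size, epochs):
    """Return (batches per epoch, total optimizer steps) for a training run."""
    # The last batch of an epoch may be smaller, so round up.
    batches_per_epoch = math.ceil(num_samples / batch_size)
    return batches_per_epoch, batches_per_epoch * epochs

# Hypothetical numbers: 400 training images, 100 epochs.
for batch_size in (4, 16, 64):
    per_epoch, total = training_steps(400, batch_size, 100)
    print(f"batch_size={batch_size}: {per_epoch} batches/epoch, {total} steps in total")
```

Smaller batches mean more (but cheaper) steps per epoch, which is one reason batch size affects both training time and accuracy.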

More details:
1. [ResNet Image Recognition Architecture](https://arxiv.org/abs/1512.03385)
2. [PyTorch Pre-Trained Networks](https://pytorch.org/docs/stable/torchvision/models.html)

So let's get started on training a model for the classification task!

We will use the ``autotime`` library to time each notebook cell, so that you can see which cells take the most time.

In [None]:
!pip install torch==1.1.0 -f https://download.pytorch.org/whl/cu100/stable # CUDA 10.0 build
!pip install torchvision==0.3.0

In [None]:
!pip install git+https://github.com/cpcloud/ipython-autotime

In [None]:
%load_ext autotime

### Import Libraries

As discussed, we will be using the PyTorch deep learning framework. To learn more about this framework, please have a look at the [PyTorch talk from GTC On-Demand](https://on-demand-gtc.gputechconf.com/gtcnew/sessionview.php?sessionName=s8817-pytorch%3a+a+fast+and+flexible+deep+learning+framework+%28presented+by+facebook%29).

In [None]:
import torch
from torch import nn
import torch.optim as optim
import torch.nn.functional as F
import torchvision
import torchvision.datasets as datasets
import torchvision.models as models
import torchvision.transforms as transforms

### Upload and extract dataset (if you are using Server Grade GPU)

Before you start, you need the dataset created with the ``data_collection.ipynb`` notebook on the robot. You can upload the ``dataset.zip`` file that you created there, or use the prepared ``asu_mascots_dataset.zip`` archive.

The cells below download the prepared archive and then extract it.

In [None]:
!wget https://asu-hackathon.s3.amazonaws.com/dataset/asu_mascots_dataset.zip

In [1]:
!unzip -o -q asu_mascots_dataset.zip

You should see the extracted dataset folder appear in the file browser. The next cell expects it to be named ``asu_dataset``.

### Create dataset instance

Now we use the ``ImageFolder`` dataset class available with the ``torchvision.datasets`` package.  We attach transforms from the ``torchvision.transforms`` package to prepare the data for training.  

In [None]:
image_width = 224
image_height = 224
dataset = datasets.ImageFolder(
    'asu_dataset',
    transforms.Compose([
        transforms.ColorJitter(0.1, 0.1, 0.1, 0.1),
        transforms.Resize((image_height, image_width)),  # Resize expects (height, width)
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])
)
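
To make the last two transforms concrete, here is a minimal plain-Python sketch of what ``ToTensor`` followed by ``Normalize`` does to a single RGB pixel: scale each channel to the range [0, 1], then subtract the channel mean and divide by the channel standard deviation. The helper ``normalize_pixel`` is hypothetical and only mirrors the arithmetic; the real transforms operate on whole image tensors.

```python
# Per-channel statistics passed to transforms.Normalize in the cell above.
CHANNEL_MEAN = [0.485, 0.456, 0.406]
CHANNEL_STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb_uint8):
    """Mimic ToTensor (scale to [0, 1]) followed by Normalize for one RGB pixel."""
    scaled = [value / 255.0 for value in rgb_uint8]
    return [(s - m) / d for s, m, d in zip(scaled, CHANNEL_MEAN, CHANNEL_STD)]

print(normalize_pixel((0, 0, 0)))        # a black pixel maps to negative values
print(normalize_pixel((255, 255, 255)))  # a white pixel maps to positive values
```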

### Split dataset into train and test sets

Next, we split the dataset into *training* and *test* sets.  The test set will be used to verify the accuracy of the model we train.

In [None]:
test_percent = 0.2
num_test = int(test_percent * len(dataset))
# Use the computed test-set size instead of a hard-coded count
train_dataset, test_dataset = torch.utils.data.random_split(dataset, [len(dataset) - num_test, num_test])
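
The idea behind a random split can be sketched without PyTorch: shuffle the sample indices once, then cut them into two disjoint groups. The function ``random_split_indices`` and the dataset size of 500 are assumptions for illustration, not part of ``torch.utils.data``.

```python
import random

def random_split_indices(num_samples, test_percent, seed=0):
    """Shuffle sample indices and cut them into disjoint train and test lists."""
    num_test = int(test_percent * num_samples)
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)  # seeded for a reproducible split
    return indices[num_test:], indices[:num_test]

# Hypothetical dataset of 500 images with 20% held out for testing.
train_idx, test_idx = random_split_indices(500, 0.2)
print(len(train_idx), len(test_idx))
```

Because every index lands in exactly one of the two lists, no test image is ever seen during training.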

### Create data loaders to load data in batches

We'll create two ``DataLoader`` instances, which provide utilities for shuffling data, producing *batches* of images, and loading the samples in parallel with multiple workers.

In [None]:
train_loader = torch.utils.data.DataLoader(
    train_dataset,
    batch_size=64,
    shuffle=True,
    num_workers=4
)

test_loader = torch.utils.data.DataLoader(
    test_dataset,
    batch_size=64,
    shuffle=True,
    num_workers=4
)
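
Conceptually, a shuffling data loader reorders the sample indices at the start of each epoch and then yields them in chunks of ``batch_size``. The generator below is a simplified, single-process sketch of that behavior; ``iter_batches`` is a hypothetical name, and the real ``DataLoader`` additionally collates samples into tensors and can use worker processes.

```python
import random

def iter_batches(indices, batch_size, shuffle=True, seed=0):
    """Yield lists of sample indices, batch_size at a time."""
    order = list(indices)
    if shuffle:
        random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        # The final batch is smaller when batch_size does not divide the total.
        yield order[start:start + batch_size]

# Hypothetical training set of 400 samples with the batch size used above.
batch_lengths = [len(batch) for batch in iter_batches(range(400), batch_size=64)]
print(batch_lengths)
```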

### Define the neural network

Now, we define the neural network we'll be training.  The *torchvision* package provides a collection of pre-trained models that we can use.

In a process called *[transfer learning](https://youtu.be/yofjFQddwHE)*, we can repurpose a pre-trained model (trained on millions of images) for a new task that has possibly much less data available.

Important features that were learned in the original training of the pre-trained model are re-usable for the new task.  We'll use the ``resnet-18`` model.

In [None]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = models.resnet18(pretrained=True)

The ``resnet-18`` model was originally trained for a dataset that had 1000 class labels, but our dataset only has five class labels!  We'll replace
the final layer with a new, untrained layer that has only five outputs, one per class.  

In [None]:
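num_classes = len(dataset.classes)  # one output per class folder in the dataset
model.fc = torch.nn.Linear(512, num_classes)  # replaces the 1000-class ImageNet head
model.to(device)

# CrossEntropyLoss applies log-softmax internally, so the new layer can output raw scores;
# NLLLoss would require log-probabilities as input instead.
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.0005)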
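
A note on the loss: ``nn.CrossEntropyLoss`` combines a log-softmax with the negative log-likelihood that ``nn.NLLLoss`` computes, so ``nn.NLLLoss`` on its own expects log-probabilities rather than raw scores. The plain-Python sketch below shows this relationship for a single sample; the function names and the example scores are hypothetical.

```python
import math

def log_softmax(logits):
    """Convert raw scores into log-probabilities (numerically stable form)."""
    highest = max(logits)
    log_sum = highest + math.log(sum(math.exp(z - highest) for z in logits))
    return [z - log_sum for z in logits]

def cross_entropy(logits, target):
    """Negative log-likelihood of the target class under the softmax distribution."""
    return -log_softmax(logits)[target]

logits = [2.0, 0.5, -1.0, 0.0, 1.0]  # hypothetical raw scores for five classes
print(sum(math.exp(lp) for lp in log_softmax(logits)))  # probabilities sum to ~1
print(cross_entropy(logits, target=0))  # small loss: class 0 has the highest score
print(cross_entropy(logits, target=2))  # larger loss: class 2 has the lowest score
```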

The ``model.to(device)`` call transfers our model to the GPU for execution when one is available.

If you are using the Jetson Nano, training may take somewhat longer than on server-grade GPUs, so we will use an ``applause`` sound clip to alert you when the Jetson Nano has completed the training. 
With 100 images per class (500 images in total), transfer learning on ResNet-18 takes about 80 seconds per epoch. 

> Note: If you are using a server-grade GPU, feel free to comment out the following cell and the last three lines of the ``Train the Neural Network`` cell. 

### Load the dataset in batches and train the neural network

The data loaders created above use a single batch size of 64. To compare batch sizes, you can repeat the training with different values, for example from a minimum of 4 to a maximum of 16. For every batch size, create two ``DataLoader`` instances, which provide utilities for shuffling data, producing *batches* of images, and loading the samples in parallel with multiple workers, and store the best model based on test accuracy as ``best_model_<BATCH_SIZE>.pth``.
> Different batch sizes let us analyze batch size vs. training time and batch size vs. accuracy. Once we are confident which batch size is best, we no longer need to loop over different batch sizes.

Using the code below, we will train the neural network for 100 epochs, evaluating on the test set every 10 training steps and saving the best-performing model.

> An epoch is a full run through our data.

In [None]:
epochs = 100
steps = 0
running_loss = 0
print_every = 10  # evaluate on the test set every 10 training steps
best_accuracy = 0.0
train_losses, test_losses = [], []

for epoch in range(epochs):
    for inputs, labels in train_loader:
        steps += 1
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()

        if steps % print_every == 0:
            test_loss = 0
            accuracy = 0
            model.eval()  # use inference behavior for batch norm while evaluating
            with torch.no_grad():
                for test_inputs, test_labels in test_loader:
                    test_inputs, test_labels = test_inputs.to(device), test_labels.to(device)
                    outputs = model(test_inputs)
                    test_loss += criterion(outputs, test_labels).item()

                    # Top-1 accuracy: the class with the highest score is the prediction
                    top_class = outputs.argmax(dim=1)
                    accuracy += (top_class == test_labels).float().mean().item()

            # running_loss covers the last print_every steps, so average over those
            train_losses.append(running_loss / print_every)
            test_losses.append(test_loss / len(test_loader))
            test_accuracy = accuracy / len(test_loader)
            print(f"Epoch {epoch+1}/{epochs}.. "
                  f"Train loss: {running_loss/print_every:.3f}.. "
                  f"Test loss: {test_loss/len(test_loader):.3f}.. "
                  f"Test accuracy: {test_accuracy:.3f}")
            # Keep only the model with the highest test accuracy seen so far
            if test_accuracy > best_accuracy:
                best_accuracy = test_accuracy
                torch.save(model, 'asu_best_model.pth')
            running_loss = 0
            model.train()
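
The test accuracy reported by the training loop is top-1 accuracy: for each image, the class with the highest score counts as the prediction, and accuracy is the fraction of predictions that match the labels. A minimal plain-Python sketch of that metric follows; ``top1_accuracy`` and the example scores are hypothetical.

```python
def top1_accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class equals the true label."""
    correct = 0
    for row, label in zip(scores, labels):
        predicted = max(range(len(row)), key=row.__getitem__)  # argmax over classes
        correct += int(predicted == label)
    return correct / len(labels)

# Three hypothetical samples with two classes each; the last one is misclassified.
scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
labels = [1, 0, 0]
print(top1_accuracy(scores, labels))
```

Note that the training loop averages the per-batch accuracies, which matches this per-sample figure exactly only when all test batches have the same size.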


Now we have completed training the model on our own dataset. 
To summarize, we:
1. Split the dataset into 80% training data and 20% test data.
2. Loaded the data in batches using ``DataLoader`` instances with a batch size of 64.
3. Used cross-entropy to calculate the loss and Adam as the optimizer.
4. Used the ResNet-18 architecture, changing the output layer size to accommodate our project requirements so that only 5 classes are used. 
5. Saved the best model, the one with the highest test accuracy.

Once that is finished, you should see a file ``asu_best_model.pth`` in the Jupyter Lab file browser.  

> Note: Select ``Right click`` -> ``Download`` to download the model to your workstation