# Road Follower - Train Model

In this notebook we will train a neural network that takes an input image and outputs a class label for it.

We will use the PyTorch deep learning framework to fine-tune a pre-trained torchvision model for the road follower application. The code below uses AlexNet, with a ResNet-50 alternative left commented out.

In [12]:
import torch
import torch.optim as optim
import torch.nn.functional as F
import torch.nn as nn
import torchvision
import torchvision.datasets as datasets
import torchvision.models as models
from torchvision.transforms import ToTensor
import glob
import PIL.Image
import os
import numpy as np

### Download and extract data

Before you start, you should upload the ``road_following_<Date&Time>.zip`` file that you created in the ``data_collection.ipynb`` notebook on the robot. 

> If you're training on the JetBot you collected data on, you can skip this!

You should then extract this dataset by calling the command below:

In [13]:
!unzip -q road_following.zip

'unzip' is not recognized as an internal or external command,
operable program or batch file.


You should see a folder named ``dataset_all`` appear in the file browser.
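
If ``unzip`` is not available in your shell, as in the output above, the archive can also be extracted from Python. The following is a minimal sketch using the standard ``zipfile`` module; the helper name ``extract_dataset`` and the ``ZIP_PATH`` value are assumptions for illustration, not part of the original notebook.

```python
import os
import zipfile


def extract_dataset(zip_path, dest_dir="."):
    """Extract every member of zip_path into dest_dir and return the member names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


# Assumed archive name, matching the unzip command above; adjust to your upload.
ZIP_PATH = "road_following.zip"
if os.path.exists(ZIP_PATH):
    names = extract_dataset(ZIP_PATH)
    print(f"Extracted {len(names)} entries")
else:
    print(f"{ZIP_PATH} not found; upload it first")
```

Because this runs inside the notebook process, it behaves the same on Windows, Linux, and the JetBot.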

### Create Dataset Instance

Here we create a custom ``torch.utils.data.Dataset`` implementation, which implements the ``__len__`` and ``__getitem__`` functions.  This class
is responsible for loading images and assigning each one an integer class label based on the folder it was found in.  Because we implement the ``torch.utils.data.Dataset`` class,
we can use all of the torch data utilities :)

The only transformation applied in this version of the dataset is ``ToTensor``; no color jitter or random horizontal flips are applied.  If it doesn't matter whether your robot follows some convention
(like a road where we need to 'stay right'), you could add random horizontal flips to augment the dataset.

In [14]:
import glob

In [15]:
class XYDataset(torch.utils.data.Dataset):
    
    def __init__(self):
        # One glob pattern per class; the position in this list is the integer label
        class_patterns = [
            './iit delhi/person/*',     # label 0
            './iit delhi/animal/*',     # label 1
            './iit delhi/roadCones/*',  # label 2
            './iit delhi/zebra/*',      # label 3
            './iit delhi/notZebra/*',   # label 4
        ]

        self.filenames = []
        self.labels = []

        for label, pattern in enumerate(class_patterns):
            for img in glob.glob(pattern):
                self.filenames.append(img)
                self.labels.append(label)
    
    def __len__(self):
        return len(self.filenames)
    
    def __getitem__(self, idx):
        image_path = self.filenames[idx]
        # Force three channels so grayscale or RGBA files still match the model input
        image = PIL.Image.open(image_path).convert('RGB')
        image = ToTensor()(image)
        
        return image, self.labels[idx]

    def printOut(self):
        return set(self.labels)
    
dataset = XYDataset()

In [16]:
dataset.printOut(), len(dataset)

({0, 1, 2}, 5537)
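
The set above shows which labels occur, but not how many images each class has. A small sketch of how that could be checked, assuming a list of integer labels such as ``dataset.labels``; the helper name ``class_counts`` is hypothetical.

```python
from collections import Counter


def class_counts(labels):
    """Return a dict mapping each label to the number of samples that carry it."""
    return dict(sorted(Counter(labels).items()))


# Toy labels for illustration only; in the notebook you would pass dataset.labels.
example_labels = [0, 0, 1, 2, 2, 2]
print(class_counts(example_labels))  # {0: 2, 1: 1, 2: 3}
```

A strongly imbalanced count is worth knowing about before interpreting the accuracy reported later.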

### Split dataset into train and test sets
Once the dataset has been read, we split it into train and test sets. In this example we use a 90%-10% train-test split. The test set will be used to verify the accuracy of the model we train.

In [17]:
test_percent = 0.1
num_test = int(test_percent * len(dataset))
train_dataset, test_dataset = torch.utils.data.random_split(dataset, [len(dataset) - num_test, num_test])
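
As a worked example of the arithmetic, assuming the 5537 images reported above: the test set size is truncated to an integer and the training set receives the remainder, so the two sizes always sum to the dataset size. The helper name ``split_sizes`` is hypothetical.

```python
def split_sizes(num_samples, test_percent=0.1):
    """Return (num_train, num_test) using the same truncation as the cell above."""
    num_test = int(test_percent * num_samples)
    return num_samples - num_test, num_test


num_train, num_test = split_sizes(5537)
print(num_train, num_test)  # 4984 553
```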

### Create data loaders to load data in batches

We use the ``DataLoader`` class to load data in batches, shuffle it, and optionally load it with multiple worker subprocesses. In this example we use a batch size of 128. Choose the batch size based on the memory available on your GPU; it can also affect the accuracy of the model.

In [18]:
train_loader = torch.utils.data.DataLoader(
    train_dataset,
    batch_size=128,
    shuffle=True,
    num_workers=0
)

test_loader = torch.utils.data.DataLoader(
    test_dataset,
    batch_size=128,
    shuffle=True,
    num_workers=0
)
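
The number of steps per epoch follows from the batch size: with the ``DataLoader`` default of ``drop_last=False``, the last batch holds whatever remains. A sketch of that arithmetic, assuming 4984 training and 553 test images (a 90%-10% split of the 5537 images reported earlier); the helper name ``num_batches`` is hypothetical.

```python
import math


def num_batches(num_samples, batch_size):
    """Batches per epoch when the final partial batch is kept (drop_last=False)."""
    return math.ceil(num_samples / batch_size)


print(num_batches(4984, 128))  # 39
print(num_batches(553, 128))   # 5
```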

### Define Neural Network Model 

We use a pre-trained model available from PyTorch TorchVision. The cell below loads AlexNet; a ResNet-50 alternative is left commented out.

In a process called transfer learning, we can repurpose a pre-trained model (trained on millions of images) for a new task that has possibly much less data available.


More details on ResNet-18: https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py

More details on transfer learning: https://www.youtube.com/watch?v=yofjFQddwHE

In [19]:
# model = models.resnet50(pretrained=True)
model = models.alexnet(pretrained=True)
for param in model.parameters():
    param.requires_grad = False

# for param in model.layer4.parameters():
#     param.requires_grad = True
# model.layer4.requires_grad_ = True

In [20]:
from torchsummary import summary
summary(torch.nn.Sequential(*(list(model.children())[:-1])), (3, 224, 224))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 64, 55, 55]          23,296
              ReLU-2           [-1, 64, 55, 55]               0
         MaxPool2d-3           [-1, 64, 27, 27]               0
            Conv2d-4          [-1, 192, 27, 27]         307,392
              ReLU-5          [-1, 192, 27, 27]               0
         MaxPool2d-6          [-1, 192, 13, 13]               0
            Conv2d-7          [-1, 384, 13, 13]         663,936
              ReLU-8          [-1, 384, 13, 13]               0
            Conv2d-9          [-1, 256, 13, 13]         884,992
             ReLU-10          [-1, 256, 13, 13]               0
           Conv2d-11          [-1, 256, 13, 13]         590,080
             ReLU-12          [-1, 256, 13, 13]               0
        MaxPool2d-13            [-1, 256, 6, 6]               0
AdaptiveAvgPool2d-14            [-1, 25

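The pre-trained AlexNet classifier is replaced with a new one. Its first fully connected layer takes 256 * 6 * 6 = 9216 ``in_features`` (the flattened output of the feature extractor), and because we are training a classifier, the final layer's ``out_features`` is the number of classes (2 in the cell below). For the commented-out ResNet alternative, the final ``fc`` layer would be replaced instead.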
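
The 6 x 6 spatial size at the end of the feature extractor in the summary above can be reproduced with the standard output-size formula for convolution and pooling layers, floor((n + 2p - k) / s) + 1. The sketch below assumes torchvision's AlexNet kernel, stride, and padding values and a 224 x 224 input; the helper name ``out_size`` is hypothetical.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer for an n x n input."""
    return (n + 2 * padding - kernel) // stride + 1


# (kernel, stride, padding) per layer, as assumed for torchvision's AlexNet
layers = [
    (11, 4, 2),  # Conv2d-1
    (3, 2, 0),   # MaxPool2d-3
    (5, 1, 2),   # Conv2d-4
    (3, 2, 0),   # MaxPool2d-6
    (3, 1, 1),   # Conv2d-7
    (3, 1, 1),   # Conv2d-9
    (3, 1, 1),   # Conv2d-11
    (3, 2, 0),   # MaxPool2d-13
]

size = 224
sizes = []
for kernel, stride, padding in layers:
    size = out_size(size, kernel, stride, padding)
    sizes.append(size)

print(sizes)            # [55, 27, 27, 13, 13, 13, 13, 6]
print(256 * size ** 2)  # 9216 flattened features
```

The intermediate sizes match the ``Output Shape`` column of the summary, and the final product is the ``256 * 6 * 6`` input size of the first ``nn.Linear`` layer in the classifier.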

Finally, we move the model to the device it will run on. The cell below uses the CPU; use ``torch.device('cuda')`` instead to train on a GPU.

In [21]:
# Resnet
# model.fc = torch.nn.Linear(2048,3)

# Alexnet
model.classifier = nn.Sequential(
            nn.Dropout(p=0.5),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, 2)  # out_features must equal the number of classes in the dataset
        )
device = torch.device('cpu')
model = model.to(device)
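
Since the pre-trained layers were frozen earlier, only the new classifier is trained. As a rough check of how many parameters that is, each ``nn.Linear(in, out)`` layer holds ``in * out`` weights plus ``out`` biases. The helper name ``linear_params`` is hypothetical.

```python
def linear_params(in_features, out_features):
    """Parameter count of a fully connected layer: weights plus biases."""
    return in_features * out_features + out_features


# Layer sizes copied from the classifier defined above
classifier_layers = [(256 * 6 * 6, 4096), (4096, 4096), (4096, 2)]
counts = [linear_params(i, o) for i, o in classifier_layers]

print(counts)       # [37752832, 16781312, 8194]
print(sum(counts))  # 54542338
```

Almost all of the trainable weights sit in the first two fully connected layers, which is why training is slow on a CPU even with the convolutional layers frozen.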

### Train Classifier:

We train for ``NUM_EPOCHS`` epochs (4 in the cell below) and save the model whenever the test loss improves on the best value so far.

In [24]:
# for param in model.parameters():
#     param.requires_grad = True

NUM_EPOCHS = 4
BEST_MODEL_PATH = 'best_steering_model_xy.pth'
best_loss = 1e9

optimizer = optim.Adam(model.parameters())

for epoch in range(NUM_EPOCHS):
    
    model.train()
    train_loss = 0.0
    for i, (images, labels) in enumerate(iter(train_loader)):
        images = images.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = model(images)
        loss = F.cross_entropy(outputs, labels)
        train_loss += float(loss)
        loss.backward()
        optimizer.step()
        print(f'Step: {i+1},  Epoch: {epoch+1}/{NUM_EPOCHS},  Loss: {loss.item()}')
    train_loss /= len(train_loader)
    
    model.eval()
    test_loss = 0.0
    # Gradients are not needed for evaluation, so disable autograd to save memory
    with torch.no_grad():
        for images, labels in test_loader:
            images = images.to(device)
            labels = labels.to(device)
            outputs = model(images)
            loss = F.cross_entropy(outputs, labels)
            test_loss += float(loss)
    test_loss /= len(test_loader)
    
    print('%f, %f' % (train_loss, test_loss))
    if test_loss < best_loss:
        torch.save(model.state_dict(), BEST_MODEL_PATH)
        best_loss = test_loss

Step: 1,  Epoch: 1/4,  Loss: 0.00035020645009353757
Step: 2,  Epoch: 1/4,  Loss: 0.008379651233553886
Step: 3,  Epoch: 1/4,  Loss: 3.609030272855307e-06
Step: 4,  Epoch: 1/4,  Loss: 6.365256558638066e-05
Step: 5,  Epoch: 1/4,  Loss: 0.003242322476580739
Step: 6,  Epoch: 1/4,  Loss: 0.0002571040531620383
Step: 7,  Epoch: 1/4,  Loss: 6.51925624595151e-09
Step: 8,  Epoch: 1/4,  Loss: 1.221592174260877e-05
Step: 9,  Epoch: 1/4,  Loss: 3.7252898543727042e-09
Step: 10,  Epoch: 1/4,  Loss: 8.381876881458084e-08
Step: 11,  Epoch: 1/4,  Loss: 0.01873539201915264
Step: 12,  Epoch: 1/4,  Loss: 0.01912238821387291
Step: 13,  Epoch: 1/4,  Loss: 0.00026010500732809305
Step: 14,  Epoch: 1/4,  Loss: 0.0004089302965439856
Step: 15,  Epoch: 1/4,  Loss: 3.6786383361686603e-07
Step: 16,  Epoch: 1/4,  Loss: 0.007668440230190754
Step: 17,  Epoch: 1/4,  Loss: 5.3004077926743776e-05
Step: 18,  Epoch: 1/4,  Loss: 3.9487727576670295e-07
Step: 19,  Epoch: 1/4,  Loss: 0.0013589721638709307
Step: 20,  Epoch: 1/4, 
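
For reference, the cross-entropy loss for a single sample is the negative log of the softmax probability assigned to the correct class, and ``F.cross_entropy`` averages this over the batch by default. The pure-Python sketch below only illustrates the formula; it is not how PyTorch implements it. Values near zero, like several of the steps above, mean the model put almost all of its probability on the correct class.

```python
import math


def cross_entropy(logits, target):
    """Negative log-softmax of the target class for one sample."""
    # Subtract the maximum logit for numerical stability before exponentiating
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


print(cross_entropy([0.0, 0.0], 0))      # log(2), about 0.693: an uninformed two-class guess
print(cross_entropy([10.0, -10.0], 0))   # close to 0: confident and correct
print(cross_entropy([10.0, -10.0], 1))   # about 20: confident and wrong
```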

In [25]:
model.eval()
correct = 0
# Gradients are not needed for evaluation
with torch.no_grad():
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        # The predicted class is the index of the largest logit; softmax does not change the argmax
        predictions = torch.argmax(outputs, dim=1)
        print(predictions.shape, labels.shape)
        print((predictions == labels).float())
        correct += (predictions == labels).sum().item()
# Divide by the actual test set size instead of approximating it as 10% of the dataset
print("Epoch {}/{}, Accuracy: {:.3f}".format(epoch+1, NUM_EPOCHS, correct / len(test_dataset)))



torch.Size([128]) torch.Size([128])
tensor([0., 1., 1., 1., 0., 1., 1., 1., 1., 0., 0., 1., 1., 0., 1., 1., 1., 1.,
        1., 1., 0., 1., 0., 1., 1., 1., 1., 0., 0., 1., 0., 1., 0., 1., 0., 0.,
        1., 0., 0., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 0., 1., 0., 0., 1.,
        0., 1., 1., 1., 1., 1., 0., 1., 1., 0., 0., 0., 1., 1., 1., 1., 1., 0.,
        0., 0., 1., 1., 0., 0., 0., 0., 0., 1., 0., 1., 0., 1., 1., 1., 1., 0.,
        1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 0., 0., 0., 1., 1., 1., 1., 1.,
        1., 0., 0., 0., 1., 0., 1., 1., 1., 0., 0., 0., 1., 0., 0., 0., 1., 1.,
        1., 0.])
torch.Size([128]) torch.Size([128])
tensor([1., 0., 0., 1., 1., 1., 0., 0., 0., 1., 1., 0., 0., 0., 1., 0., 0., 0.,
        1., 0., 1., 1., 1., 0., 0., 0., 1., 0., 0., 0., 0., 0., 1., 1., 0., 1.,
        0., 1., 0., 1., 1., 0., 0., 0., 1., 0., 0., 1., 0., 0., 1., 1., 0., 1.,
        1., 1., 1., 1., 1., 1., 0., 0., 1., 0., 0., 0., 0., 1., 0., 1., 1., 0.,
        1., 1., 0., 0., 1., 0.,
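
Overall accuracy can hide a class that the model rarely predicts correctly. A sketch of per-class accuracy in plain Python, assuming the predictions and labels have been collected as lists of integers (for example via ``.tolist()`` on the tensors); the helper name ``per_class_accuracy`` is hypothetical.

```python
from collections import defaultdict


def per_class_accuracy(predictions, labels):
    """Return {label: fraction of samples with that label predicted correctly}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for pred, label in zip(predictions, labels):
        total[label] += 1
        if pred == label:
            correct[label] += 1
    return {label: correct[label] / total[label] for label in sorted(total)}


# Toy values for illustration only, not results from this notebook
print(per_class_accuracy([0, 1, 1, 0], [0, 1, 0, 0]))
```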

Once the model is trained, the ``best_steering_model_xy.pth`` file will have been saved; you can use it for inference in the live demo notebook.

If you trained on a machine other than the JetBot, you'll need to upload this file to the ``road_following`` example folder on the JetBot.