# Road Follower - Train Model

In this notebook we will train a neural network to take an input image, and output a set of x, y values corresponding to a target.

We will use the PyTorch deep learning framework to train a ResNet-18 neural network model for the road follower application.

In [42]:
import torch
import torch.optim as optim
import torch.nn.functional as F
import torch.nn as nn
import torchvision
import torchvision.datasets as datasets
import torchvision.models as models
from torchvision.transforms import ToTensor
import glob
import PIL.Image
import os
import numpy as np

### Download and extract data

Before you start, you should upload the ``road_following_<Date&Time>.zip`` file that you created in the ``data_collection.ipynb`` notebook on the robot. 

> If you're training on the JetBot you collected data on, you can skip this!

You should then extract this dataset by calling the command below:

In [43]:
!unzip -q road_following.zip

'unzip' is not recognized as an internal or external command,
operable program or batch file.


You should see a folder named ``dataset_all`` appear in the file browser.
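
The output above shows that ``unzip`` is not available in every shell (the message is the one a Windows command prompt prints). As a possible alternative, Python's standard-library ``zipfile`` module can extract the archive on any platform. This is a minimal sketch: ``extract_archive`` is a hypothetical helper name, and the archive name is simply the one used in the cell above.

```python
import os
import zipfile


def extract_archive(zip_path, dest_dir="."):
    """Extract every member of zip_path into dest_dir and return the member names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


# Archive name taken from the cell above; adjust it to match your upload.
ARCHIVE = "road_following.zip"
if os.path.exists(ARCHIVE):
    names = extract_archive(ARCHIVE)
    print(f"Extracted {len(names)} entries")
else:
    print(f"{ARCHIVE} not found in {os.getcwd()}")
```

The existence check keeps the cell from raising when the archive has not been uploaded yet.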

### Create Dataset Instance

Here we create a custom ``torch.utils.data.Dataset`` implementation, which implements the ``__len__`` and ``__getitem__`` functions. This class is responsible for loading images and pairing each one with its label. In the cell below the label comes from the folder an image is stored in (images under ``yes`` get label 0, images under ``no`` get label 1) rather than from x, y values parsed from the filename. Because we implement the ``torch.utils.data.Dataset`` class, we can use all of the torch data utilities :)

The only transformation applied in the cell below is ``ToTensor``; augmentations such as color jitter are not hard-coded into this version. Random horizontal flips should stay optional (in case you want to follow a non-symmetric path, like a road where we need to 'stay right'). If it doesn't matter whether your robot follows some convention, you could enable flips to augment the dataset.

In [44]:
import glob

In [45]:
class XYDataset(torch.utils.data.Dataset):
    """Image dataset labelled by folder: 'yes' images get 0, 'no' images get 1."""

    def __init__(self):
        path_yes = './dataset/zebra/yes/*'
        path_no = './dataset/zebra/no/*'

        self.filenames = []
        self.labels = []

        for img in glob.glob(path_yes):
            self.filenames.append(img)
            self.labels.append(0)
        for img in glob.glob(path_no):
            self.filenames.append(img)
            self.labels.append(1)

    def __len__(self):
        return len(self.filenames)

    def __getitem__(self, idx):
        image_path = self.filenames[idx]
        # Force 3 channels so grayscale or RGBA files still match the model input
        image = PIL.Image.open(image_path).convert('RGB')
        image = ToTensor()(image)

        return image, self.labels[idx]

    def printOut(self):
        # Distinct label values present in the dataset
        return set(self.labels)

dataset = XYDataset()

In [46]:
dataset.printOut(), len(dataset)

({0, 1}, 1000)
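
The set above shows which labels exist but not how many images carry each one, and a strongly unbalanced dataset makes accuracy harder to interpret. A small sketch of a per-class count follows. ``class_counts`` is a hypothetical helper, and the example list is a stand-in; in the notebook you would pass ``dataset.labels``.

```python
from collections import Counter


def class_counts(labels):
    """Return a {label: count} dict sorted by label."""
    return dict(sorted(Counter(labels).items()))


# Stand-in labels for illustration only; in the notebook pass dataset.labels.
example_labels = [0, 0, 0, 1, 1]
print(class_counts(example_labels))
```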

### Split dataset into train and test sets
Once the dataset is loaded, we split it into train and test sets. In this example the split is 90% train and 10% test. The test set is used to verify the accuracy of the model we train.

In [47]:
test_percent = 0.1
num_test = int(test_percent * len(dataset))
train_dataset, test_dataset = torch.utils.data.random_split(dataset, [len(dataset) - num_test, num_test])
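
The split sizes follow directly from the arithmetic in the cell above. The sketch below reproduces it in plain Python (``split_sizes`` is a hypothetical helper name) to show that ``int()`` truncates, so any fractional remainder goes to the training set.

```python
def split_sizes(num_samples, test_percent):
    """Mirror the split arithmetic above: int() truncates the test size."""
    num_test = int(test_percent * num_samples)
    return num_samples - num_test, num_test


# 1000 samples at 10% gives 900 for training and 100 for testing.
print(split_sizes(1000, 0.1))
```

Note that ``random_split`` draws a different partition on every run unless the random number generator is seeded first, for example with ``torch.manual_seed``.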

### Create data loaders to load data in batches

We use the ``DataLoader`` class to load data in batches, shuffle it, and optionally load it with multiple worker subprocesses. In this example we use a batch size of 128. Choose the batch size based on the memory available on your GPU; it can also affect the accuracy of the model.

In [48]:
train_loader = torch.utils.data.DataLoader(
    train_dataset,
    batch_size=128,
    shuffle=True,
    num_workers=0
)

test_loader = torch.utils.data.DataLoader(
    test_dataset,
    batch_size=128,
    shuffle=True,
    num_workers=0
)
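
With 900 training samples and a batch size of 128, each epoch has 8 steps, which matches the ``Step: 1`` to ``Step: 8`` lines in the training log further down. The sketch below works this out in plain Python, assuming the loader keeps the final partial batch (the ``DataLoader`` default, ``drop_last=False``). ``batch_sizes`` is a hypothetical helper name.

```python
import math


def batch_sizes(num_samples, batch_size):
    """List the size of every batch a loader yields when drop_last is False."""
    num_batches = math.ceil(num_samples / batch_size)
    sizes = [batch_size] * num_batches
    remainder = num_samples % batch_size
    if remainder:
        sizes[-1] = remainder
    return sizes


train_batches = batch_sizes(900, 128)
# Number of steps per epoch and the size of the last, partial batch.
print(len(train_batches), train_batches[-1])
```

The last batch here holds only 4 images. Because the training loop averages the loss per batch, that small batch counts as much as a full one in the reported epoch loss.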

### Define Neural Network Model 

We use a pre-trained model available from PyTorch TorchVision. The cell below loads AlexNet; the commented-out line shows ResNet-50 as an alternative.

In a process called transfer learning, we can repurpose a pre-trained model (trained on millions of images) for a new task that has possibly much less data available.


More details on ResNet-18: https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py

More details on transfer learning: https://www.youtube.com/watch?v=yofjFQddwHE

In [49]:
# model = models.resnet50(pretrained=True)
model = models.alexnet(pretrained=True)
for param in model.parameters():
    param.requires_grad = False

# for param in model.layer4.parameters():
#     param.requires_grad = True
# model.layer4.requires_grad_ = True

In [50]:
from torchsummary import summary
summary(torch.nn.Sequential(*(list(model.children())[:-1])), (3, 224, 224))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 64, 55, 55]          23,296
              ReLU-2           [-1, 64, 55, 55]               0
         MaxPool2d-3           [-1, 64, 27, 27]               0
            Conv2d-4          [-1, 192, 27, 27]         307,392
              ReLU-5          [-1, 192, 27, 27]               0
         MaxPool2d-6          [-1, 192, 13, 13]               0
            Conv2d-7          [-1, 384, 13, 13]         663,936
              ReLU-8          [-1, 384, 13, 13]               0
            Conv2d-9          [-1, 256, 13, 13]         884,992
             ReLU-10          [-1, 256, 13, 13]               0
           Conv2d-11          [-1, 256, 13, 13]         590,080
             ReLU-12          [-1, 256, 13, 13]               0
        MaxPool2d-13            [-1, 256, 6, 6]               0
AdaptiveAvgPool2d-14            [-1, 25

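A ResNet-18 model has a fully connected (fc) final layer with 512 ``in_features``. The cell below uses AlexNet instead, so we replace its ``classifier`` with a new head whose last layer has 2 ``out_features``, one per class.

Finally, we move the model to the device it will run on (the CPU in the cell below; use ``'cuda'`` to run on a GPU).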

In [51]:
# Resnet
# model.fc = torch.nn.Linear(2048,3)

# Alexnet
model.classifier = nn.Sequential(
            nn.Dropout(p=0.5),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, 2)
        )
device = torch.device('cpu')
model = model.to(device)
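
Because the pre-trained parameters were frozen with ``requires_grad = False`` and the replacement classifier layers are newly created (and therefore trainable by default), only the new head is updated during training. The rough count below, in plain Python, shows how the parameters divide. The convolution figures are copied from the ``torchsummary`` output above, and ``linear_params`` is a hypothetical helper.

```python
def linear_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


# New classifier head from the cell above (Dropout and ReLU have no parameters).
head = [(256 * 6 * 6, 4096), (4096, 4096), (4096, 2)]
trainable = sum(linear_params(i, o) for i, o in head)

# Convolution parameter counts copied from the torchsummary output above.
frozen = sum([23_296, 307_392, 663_936, 884_992, 590_080])

print(f"trainable: {trainable:,}  frozen: {frozen:,}")
```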

### Train Classifier:

We train for ``NUM_EPOCHS`` epochs (2 in the cell below) and save the model whenever the test loss improves.

In [52]:
# for param in model.parameters():
#     param.requires_grad = True

NUM_EPOCHS = 2
BEST_MODEL_PATH = 'best_steering_model_xy.pth'
best_loss = 1e9

optimizer = optim.Adam(model.parameters())

for epoch in range(NUM_EPOCHS):
    
    model.train()
    train_loss = 0.0
    for i, (images, labels) in enumerate(iter(train_loader)):
        images = images.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = model(images)
        loss = F.cross_entropy(outputs, labels)
        train_loss += float(loss)
        loss.backward()
        optimizer.step()
        print(f'Step: {i+1},  Epoch: {epoch+1}/{NUM_EPOCHS},  Loss: {loss.item()}')
    train_loss /= len(train_loader)
    
    model.eval()
    test_loss = 0.0
    # Gradients are not needed for evaluation, so skip tracking them.
    with torch.no_grad():
        for images, labels in test_loader:
            images = images.to(device)
            labels = labels.to(device)
            outputs = model(images)
            loss = F.cross_entropy(outputs, labels)
            test_loss += float(loss)
    test_loss /= len(test_loader)
    
    print('%f, %f' % (train_loss, test_loss))
    if test_loss < best_loss:
        torch.save(model.state_dict(), BEST_MODEL_PATH)
        best_loss = test_loss

Step: 1,  Epoch: 1/2,  Loss: 0.7090116143226624
Step: 2,  Epoch: 1/2,  Loss: 6.853683948516846
Step: 3,  Epoch: 1/2,  Loss: 30.74087905883789
Step: 4,  Epoch: 1/2,  Loss: 13.608169555664062
Step: 5,  Epoch: 1/2,  Loss: 1.7158637046813965
Step: 6,  Epoch: 1/2,  Loss: 0.4060060977935791
Step: 7,  Epoch: 1/2,  Loss: 0.8044204115867615
Step: 8,  Epoch: 1/2,  Loss: 1.0331655740737915
6.983900, 0.451650
Step: 1,  Epoch: 2/2,  Loss: 0.6505378484725952
Step: 2,  Epoch: 2/2,  Loss: 0.32486963272094727
Step: 3,  Epoch: 2/2,  Loss: 0.4739901125431061
Step: 4,  Epoch: 2/2,  Loss: 0.38344818353652954
Step: 5,  Epoch: 2/2,  Loss: 0.20350608229637146
Step: 6,  Epoch: 2/2,  Loss: 0.1551542431116104
Step: 7,  Epoch: 2/2,  Loss: 0.08076869696378708
Step: 8,  Epoch: 2/2,  Loss: 0.044496044516563416
0.289596, 0.014793
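
``F.cross_entropy`` takes the raw model outputs (logits) together with integer class labels and, by default, averages the per-sample loss over the batch. To make the logged numbers easier to read, the sketch below computes the per-sample value in plain Python. ``cross_entropy`` here is a hypothetical stand-alone function, not the PyTorch one.

```python
import math


def cross_entropy(logits, label):
    """Negative log-probability of the true class for one sample."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]


# Equal logits mean a 50/50 guess, so the loss is ln(2).
print(cross_entropy([0.0, 0.0], 0))
# A confident, correct prediction gives a loss close to zero.
print(cross_entropy([-3.0, 3.0], 1))
```

A two-class model that is guessing scores about ln 2 ≈ 0.693, which is close to the first logged step (0.709). Values well below that indicate the model is separating the classes.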


In [53]:
model.eval()
correct = 0
with torch.no_grad():
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        # argmax over the class dimension gives the predicted label (0 or 1)
        predictions = torch.argmax(outputs, dim=1)
        print(predictions.shape, labels.shape)
        print((predictions == labels).float())
        correct += (predictions == labels).sum().item()
# Divide by the real test set size rather than an approximation of it
print("Epoch {}/{}, Accuracy: {:.3f}".format(epoch+1, NUM_EPOCHS, correct / len(test_dataset)))



torch.Size([100]) torch.Size([100])
tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
        1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
        1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
Epoch 2/2, Accuracy: 0.990
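
Accuracy is the fraction of test images whose predicted class matches the label. Softmax preserves the ordering of the scores, so taking the argmax of the raw outputs yields the same prediction as taking it after softmax. The plain-Python sketch below illustrates the calculation; ``argmax`` and ``accuracy`` are hypothetical helpers and the logits are made up for illustration.

```python
def argmax(row):
    """Index of the largest value in a list."""
    return max(range(len(row)), key=row.__getitem__)


def accuracy(logits, labels):
    """Fraction of rows whose highest-scoring class equals the label."""
    correct = sum(argmax(row) == label for row, label in zip(logits, labels))
    return correct / len(labels)


# Made-up logits for illustration; three of four predictions match.
example_logits = [[2.0, -1.0], [0.1, 0.3], [-4.0, 4.0], [1.5, 0.5]]
example_labels = [0, 1, 1, 1]
print(accuracy(example_logits, example_labels))
```

Dividing by the number of labels actually evaluated keeps the result correct even when the test set size is not an exact fraction of the dataset.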




In [56]:
model.eval()
with torch.no_grad():
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        # dim=1 normalizes across the two class scores of each image
        print(F.softmax(outputs, dim=1))
        break



tensor([[4.0519e-03, 9.9595e-01],
        [1.5054e-02, 9.8495e-01],
        [1.1087e-02, 9.8891e-01],
        [2.8821e-02, 9.7118e-01],
        [1.7663e-02, 9.8234e-01],
        [9.9917e-01, 8.3488e-04],
        [8.5589e-02, 9.1441e-01],
        [9.9522e-01, 4.7823e-03],
        [9.9963e-01, 3.6904e-04],
        [1.0000e+00, 3.8045e-06],
        [9.9998e-01, 1.9092e-05],
        [9.9998e-01, 2.0525e-05],
        [3.5637e-03, 9.9644e-01],
        [3.9051e-03, 9.9610e-01],
        [4.8817e-03, 9.9512e-01],
        [1.4104e-02, 9.8590e-01],
        [9.9996e-01, 3.5865e-05],
        [9.9999e-01, 1.4594e-05],
        [1.7659e-02, 9.8234e-01],
        [9.9993e-01, 6.7247e-05],
        [1.1369e-02, 9.8863e-01],
        [9.9991e-01, 8.6879e-05],
        [5.8903e-03, 9.9411e-01],
        [9.9946e-01, 5.4237e-04],
        [3.2175e-03, 9.9678e-01],
        [1.4019e-02, 9.8598e-01],
        [9.9999e-01, 1.2301e-05],
        [1.0336e-02, 9.8966e-01],
        [9.7574e-01, 2.4261e-02],
        [9.997



Once the model is trained, the best weights are saved to ``best_steering_model_xy.pth``, which you can use for inference in the live demo notebook.

If you trained on a machine other than the JetBot, you'll need to upload this file to the ``road_following`` example folder on the JetBot.