# Convolutional Neural Network - Train Model

In this notebook you'll train an image classifier to detect four classes:
```free```, ```left```, ```right```, and ```blocked```.  For this, we'll use *PyTorch* due to memory concerns. 

Once you've connected to Colab, the first thing you need to do is change your runtime to a GPU.


Through the menu at the top, click ```Runtime``` => ```Change runtime type``` => set the hardware accelerator to ```GPU``` => ```Save```.

<img src='https://ifh.cc/g/Nt2cGh.jpg' width='1200'>
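
To confirm that the runtime change took effect, a quick check like the one below can help. It is only a sketch: the fallback to the CPU is an assumption added here so the snippet also runs without a GPU, whereas the rest of this notebook expects ```cuda```.

```python
import torch

# True only when a CUDA-capable GPU is visible to PyTorch.
gpu_available = torch.cuda.is_available()

# Fall back to the CPU so this check never raises (assumption for illustration).
device = torch.device('cuda' if gpu_available else 'cpu')
print('GPU available:', gpu_available, '| using device:', device)
```

If this prints ```False```, revisit the runtime setting before training.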

In [0]:
import torch
import torch.optim as optim
import torch.nn.functional as F
import torchvision
import torchvision.datasets as datasets
import torchvision.models as models
import torchvision.transforms as transforms

### Upload and extract dataset

If you are training a model in Colab, you should follow this sequence:



1.   Upload your zip file to Google Drive
2.   Connect Google Drive from Colab
3.   Unzip your zip file (you must enter the path to your own file)




In [2]:
from google.colab import drive
drive.mount('/content/drive')

Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly

Enter your authorization code:
··········
Mounted at /content/drive



Before you start, you should upload the *dataset.zip* file that you created in the *2_Data_collection.ipynb* notebook on the robot.

You should then extract this dataset by running the command below.

In [0]:
# Usage: !unzip path_to_file.zip -d path_to_directory (enter your own paths)
# Example:
!unzip "/content/drive/My Drive/practice/dataset.zip" -d "/content/drive/My Drive/practice"
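
If you prefer to stay in Python, the standard-library `zipfile` module can do the same extraction. This is a sketch rather than the notebook's own method: `extract_dataset` is a hypothetical helper, and the demo builds a tiny throwaway archive in a temporary directory so that it runs anywhere. On Colab you would pass your own Drive paths instead.

```python
import os
import tempfile
import zipfile


def extract_dataset(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the sorted top-level names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
    return sorted(os.listdir(dest_dir))


# Demo with a throwaway archive (placeholder bytes, not real images).
work_dir = tempfile.mkdtemp()
demo_zip = os.path.join(work_dir, 'dataset.zip')
with zipfile.ZipFile(demo_zip, 'w') as archive:
    archive.writestr('dataset/free/0001.jpg', b'placeholder')
    archive.writestr('dataset/blocked/0001.jpg', b'placeholder')

out_dir = os.path.join(work_dir, 'extracted')
extracted = extract_dataset(demo_zip, out_dir)
print(extracted)  # ['dataset']
```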

You can easily find the path through the **Files** menu on the left.


<img src='https://ifh.cc/g/lBOE2d.jpg'>

You should see a folder named *dataset* appear in the file browser.
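
The ```ImageFolder``` class used in the next step expects one subfolder per class and assigns label indices to the class names in sorted order. The sketch below mimics that mapping with the standard library only. `class_to_index` is a hypothetical helper, and the temporary folder stands in for your extracted *dataset* folder (the subfolder names are assumed to match the four classes).

```python
import os
import tempfile


def class_to_index(dataset_dir):
    """Map each class subfolder name to an index, in sorted order."""
    classes = sorted(
        entry.name for entry in os.scandir(dataset_dir) if entry.is_dir()
    )
    return {name: index for index, name in enumerate(classes)}


# Stand-in for the extracted dataset folder (assumed layout: one folder per class).
dataset_dir = tempfile.mkdtemp()
for name in ['free', 'left', 'right', 'blocked']:
    os.makedirs(os.path.join(dataset_dir, name))

mapping = class_to_index(dataset_dir)
print(mapping)  # {'blocked': 0, 'free': 1, 'left': 2, 'right': 3}
```

Knowing this order matters later, because the model's four outputs follow the same indices.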

### Create a dataset instance

Now use the ``ImageFolder`` dataset class from the ``torchvision.datasets`` package and attach transforms from the ``torchvision.transforms`` package to prepare the data for training.  

In [0]:
# Your dataset consists of image files (jpg), so each image must be converted into a torch tensor for training.
dataset = datasets.ImageFolder(
    '/content/drive/My Drive/practice/dataset', # set your path
    transforms.Compose([ # chains the transforms below into one pipeline
        transforms.ColorJitter(0.1, 0.1, 0.1, 0.1), # randomly changes the brightness, contrast, saturation and hue of an image
        transforms.Resize((224, 224)), # resizes each saved image to 224x224 pixels
        transforms.ToTensor(),  # converts a PIL image to a float tensor (C x H x W) with values in [0, 1]; the data stays on the CPU
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]) # normalizes each channel with the given mean and standard deviation
    ])
)

# For details about each transform, see: https://pytorch.org/docs/stable/torchvision/transforms.html
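
To make the last transform concrete: ```Normalize``` computes `(x - mean) / std` separately for each channel. The sketch below reproduces that arithmetic with plain tensor operations on a made-up 2x2 image. The image values are invented for illustration; only the mean and standard deviation come from the cell above.

```python
import torch

# Per-channel statistics from the transform above, shaped to broadcast over H x W.
mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

# Made-up image in C x H x W layout with values in [0, 1], like ToTensor output.
image = torch.full((3, 2, 2), 0.5)

# Same arithmetic that Normalize applies: subtract the mean, divide by the std.
normalized = (image - mean) / std
print(normalized[:, 0, 0])  # one value per channel for the top-left pixel
```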

### Split the dataset into train and test sets

Next, split the dataset into *training* and *test* sets.  The test set will be used to verify the accuracy of the model we train. We use 70 percent of the dataset as the training set and the rest as the test set.

<img src='https://miro.medium.com/max/1400/1*-8_kogvwmL1H6ooN1A1tsQ.png' width='500'>


In [0]:
training_size = int(len(dataset) * 0.7)
test_size = len(dataset) - training_size
train_dataset, test_dataset = torch.utils.data.random_split(dataset, [training_size, test_size]) 
# torch.utils.data.random_split(dataset, lengths)
# Randomly splits a dataset into non-overlapping new datasets of the given lengths.
# https://pytorch.org/docs/stable/data.html
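
Because the split is random, two runs can produce different training and test sets. If you want a repeatable split, `random_split` accepts a `generator` argument. The sketch below shows this on a toy dataset of 100 numbers; the seed value 0 and the toy data are assumptions for illustration only.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Toy stand-in for the image dataset: 100 samples.
toy_dataset = TensorDataset(torch.arange(100))

training_size = int(len(toy_dataset) * 0.7)
test_size = len(toy_dataset) - training_size

# A seeded generator makes the random split reproducible.
generator = torch.Generator().manual_seed(0)
train_part, test_part = random_split(
    toy_dataset, [training_size, test_size], generator=generator
)

# The two subsets never share a sample.
overlap = set(train_part.indices) & set(test_part.indices)
print(len(train_part), len(test_part), len(overlap))  # 70 30 0
```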

### Create data loaders to load data in batches

Create two ``DataLoader`` instances, which provide utilities for shuffling data, producing *batches* of images, and loading the samples in parallel with multiple workers.

In [0]:
train_loader = torch.utils.data.DataLoader(
    train_dataset,
    batch_size=16,
    shuffle=True,
    num_workers=4
)

test_loader = torch.utils.data.DataLoader(
    test_dataset,
    batch_size=16,
    shuffle=True,
    num_workers=4
)
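
To see what a loader yields, the sketch below iterates over a toy dataset of 40 fake images. The image size (8x8) and `num_workers=0` are assumptions chosen to keep the example small; the batch size matches the cell above.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 40 fake RGB images of 8x8 pixels with labels cycling through the 4 classes.
images = torch.zeros(40, 3, 8, 8)
labels = torch.arange(40) % 4

toy_loader = DataLoader(
    TensorDataset(images, labels),
    batch_size=16,
    shuffle=True,
    num_workers=0,  # load in the main process for this small example
)

# Each iteration yields one batch; the last batch holds the remainder.
batch_shapes = [tuple(batch_images.shape) for batch_images, _ in toy_loader]
print(batch_shapes)  # [(16, 3, 8, 8), (16, 3, 8, 8), (8, 3, 8, 8)]
```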

### Define the neural network (CNN)


We need to define a Convolutional Neural Network (CNN) model to process the image from the camera. There are two ways to define the model:

1. Create and train your own model
2. Reuse an already trained model (transfer learning)


You can use either approach, but we recommend transfer learning because it is more efficient.

#### 1. User-created model


Design the model as a class. The example below is a model consisting of two convolutional layers and one fully connected layer. You can add layers for better performance.

<img src='https://ifh.cc/g/yPYuSj.png'>

In [0]:
class CNN(torch.nn.Module):

    def __init__(self):
        super(CNN, self).__init__()
        # First layer (PyTorch tensors are laid out as batch, channels, height, width)
        # ImgIn shape=(?, 3, 224, 224)
        #    Conv     -> (?, 32, 224, 224)
        #    Pool     -> (?, 32, 112, 112)
        self.layer1 = torch.nn.Sequential(
            torch.nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2, stride=2))

        # Second layer
        # ImgIn shape=(?, 32, 112, 112)
        #    Conv      ->(?, 64, 112, 112)
        #    Pool      ->(?, 64, 56, 56)
        self.layer2 = torch.nn.Sequential(
            torch.nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2, stride=2))

        # Fully connected layer with 56x56x64 inputs -> 4 outputs (one per class)
        self.fc = torch.nn.Linear(56 * 56 * 64, 4, bias=True)

        # initialize the fully connected layer's weights
        torch.nn.init.xavier_uniform_(self.fc.weight)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = out.view(out.size(0), -1)  # flatten to (batch, 56 * 56 * 64)
        out = self.fc(out)
        return out
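
The input size of the fully connected layer follows from the standard output-size formula for convolution and pooling layers, `floor((size + 2 * padding - kernel_size) / stride) + 1`. The sketch below traces a 224x224 image through the two layers; `conv_output_size` is a hypothetical helper written for this illustration.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer along one dimension."""
    return (size + 2 * padding - kernel_size) // stride + 1


size = 224
size = conv_output_size(size, kernel_size=3, stride=1, padding=1)  # conv 1 -> 224
size = conv_output_size(size, kernel_size=2, stride=2)             # pool 1 -> 112
size = conv_output_size(size, kernel_size=3, stride=1, padding=1)  # conv 2 -> 112
size = conv_output_size(size, kernel_size=2, stride=2)             # pool 2 -> 56

# 64 channels of 56x56 feature maps are flattened before the linear layer.
flattened = 64 * size * size
print(size, flattened)  # 56 200704
```

If you add or remove layers, repeat this calculation so the linear layer's input size still matches the flattened feature map.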

Create a model instance from the CNN class and move it to the GPU.

In [0]:
device = torch.device('cuda')  # the GPU runtime selected at the start of the notebook
model = CNN().to(device)

#### 2. Transfer learning
The *torchvision* package provides a collection of pre-trained models that we can use.

In a process called *transfer learning*, we can repurpose a pre-trained model (trained on millions of images) for a new task that has possibly much less data available.

Important features that were learned in the original training of the pre-trained model are re-usable for the new task.  We'll use the AlexNet model, but you can of course use other pre-trained models. 

In [0]:
model = models.alexnet(pretrained=True)

Downloading: "https://download.pytorch.org/models/alexnet-owt-4df8aa71.pth" to /root/.cache/torch/checkpoints/alexnet-owt-4df8aa71.pth






The AlexNet model was originally trained on a dataset that had 1000 class labels, but our dataset has only four class labels! You'll replace
the final layer with a new, untrained layer that has only four outputs (i.e., free, left, right and blocked).  

<img src='https://www.researchgate.net/profile/Huafeng_Wang4/publication/300412100/figure/fig1/AS:388811231121412@1469711229450/AlexNet-Architecture-To-be-noted-is-copied-2_W640.jpg' >

Using the summary provided by ```torchsummary```, you can examine the detailed structure of AlexNet. As the summary shows, AlexNet consists of convolutional layers followed by classification layers. To create our model, we change the number of outputs of the last layer to the number of our classes (four: free, left, right, and blocked).

In [0]:
from torchsummary import summary
summary(model, input_size=(3, 224, 224), device='cpu')  # the model is still on the CPU at this point

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 64, 55, 55]          23,296
              ReLU-2           [-1, 64, 55, 55]               0
         MaxPool2d-3           [-1, 64, 27, 27]               0
            Conv2d-4          [-1, 192, 27, 27]         307,392
              ReLU-5          [-1, 192, 27, 27]               0
         MaxPool2d-6          [-1, 192, 13, 13]               0
            Conv2d-7          [-1, 384, 13, 13]         663,936
              ReLU-8          [-1, 384, 13, 13]               0
            Conv2d-9          [-1, 256, 13, 13]         884,992
             ReLU-10          [-1, 256, 13, 13]               0
           Conv2d-11          [-1, 256, 13, 13]         590,080
             ReLU-12          [-1, 256, 13, 13]               0
        MaxPool2d-13            [-1, 256, 6, 6]               0
AdaptiveAvgPool2d-14            [-1, 25

In [0]:
# As the summary above shows, AlexNet consists of convolutional layers followed by classification layers.
# model.classifier[6] is the last fully connected layer (Dropout-15 in the summary is the first layer of the classifier).

model.classifier[6] = torch.nn.Linear(model.classifier[6].in_features, 4) # 4 is the number of outputs (class labels)
# You can also write model.classifier[-1] instead of model.classifier[6].

Move the model to the GPU for training.

In [0]:
device = torch.device('cuda')
model = model.to(device)

## Train the neural network

In deep learning, **backpropagation** is a widely used algorithm for training feedforward neural networks in supervised learning. Generalizations of backpropagation exist for other artificial neural networks (ANNs) and for functions generally; this class of algorithms is referred to generically as "backpropagation".

In fitting a neural network, **backpropagation** computes the gradient of the loss function with respect to the weights of the network for a single input-output example, and does so efficiently, unlike a naive direct computation of the gradient with respect to each weight individually. This efficiency makes it feasible to use gradient methods for training multilayer networks, updating weights to minimize loss; gradient descent, or variants such as stochastic gradient descent, are commonly used.

The backpropagation algorithm works by computing the gradient of the loss function with respect to each weight by the **chain rule**, computing the gradient one layer at a time and iterating backward from the last layer to avoid redundant calculations of intermediate terms in the chain rule; this is an example of dynamic programming.

<img src='https://miro.medium.com/max/1400/1*q1M7LGiDTirwU-4LcFq7_Q.png'>
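
The chain rule can be checked by hand on a one-weight example. The sketch below uses a made-up model `prediction = w * x` with a squared-error loss (chosen only for easy arithmetic, not the loss used later in this notebook) and compares the gradient from PyTorch's autograd with the manual result.

```python
import torch

x = torch.tensor(2.0)
target = torch.tensor(1.0)
w = torch.tensor(3.0, requires_grad=True)

# Forward pass: prediction = 6.0, loss = (6.0 - 1.0)^2 = 25.0
prediction = w * x
loss = (prediction - target) ** 2

# Backward pass: autograd applies the chain rule and stores d(loss)/dw in w.grad.
loss.backward()

# Chain rule by hand: d(loss)/dw = 2 * (w * x - target) * x = 2 * 5 * 2 = 20
manual_grad = 2 * (w.item() * x.item() - target.item()) * x.item()
print(w.grad.item(), manual_grad)  # 20.0 20.0
```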


### Deep learning with PyTorch

PyTorch enables the learning process described above. It involves the following steps.

1. Send data to the GPU to speed up processing.
2. Calculate the loss with the forward pass.
3. Compute the gradients with the backward pass and update the weights.

Using the code below you'll train the neural network for 50 epochs, saving the model after each epoch in which its test accuracy improves.

<img src='https://ifh.cc/g/BOLgNx.png'>

In [0]:
NUM_EPOCHS = 50
BEST_MODEL_PATH = '/content/drive/My Drive/practice/best_model.pth' # set your own path (this one is an example from the TA)
best_accuracy = 0.0  # best test accuracy seen so far


optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

for epoch in range(NUM_EPOCHS):

    model.train()  # training mode (enables dropout layers such as those in AlexNet)
    for images, labels in iter(train_loader):
        # move the batch from the CPU to the device (GPU)
        images = images.to(device)
        labels = labels.to(device)

        # zero the gradients of the parameters
        # PyTorch accumulates gradients across backward passes, so they must be reset before each backpropagation step.
        optimizer.zero_grad()

        # execute model to get outputs
        outputs = model(images)
        # compute the loss; multi-class cross entropy = -log(softmax(outputs)[label]), averaged over the batch
        loss = F.cross_entropy(outputs, labels)

        # run backpropagation to compute the gradients
        loss.backward()

        # step optimizer to adjust parameters
        optimizer.step()


    # After completing one epoch of training, verify the model on the test set.
    model.eval()  # evaluation mode (disables dropout)
    test_error_count = 0.0
    with torch.no_grad():  # gradients are not needed for evaluation
        for images, labels in iter(test_loader):
            # move the batch from the CPU to the device (GPU)
            images = images.to(device)
            labels = labels.to(device)
            # execute model to get outputs
            outputs = model(images)
            # argmax gives the label predicted by the model;
            # count every prediction that differs from the actual label as one error
            test_error_count += float(torch.sum(labels != outputs.argmax(1)))

    test_accuracy = 1.0 - float(test_error_count) / float(len(test_dataset))
    print('%d: %f' % (epoch, test_accuracy))


    # Save the model to your path whenever it reaches the best test accuracy so far.
    if test_accuracy > best_accuracy:
        torch.save(model.state_dict(), BEST_MODEL_PATH)
        best_accuracy = test_accuracy
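
With four classes, each wrong prediction should count as exactly one error, whatever the numeric gap between the label indices. The sketch below works this through on four made-up output rows. Summing the absolute label differences instead would give 6 here, because mistaking class 0 for class 3 would count three times.

```python
import torch

labels = torch.tensor([0, 1, 2, 3])
outputs = torch.tensor([
    [0.1, 0.2, 0.3, 2.0],  # predicts 3, actual 0 -> wrong
    [0.1, 2.0, 0.3, 0.2],  # predicts 1, actual 1 -> correct
    [0.1, 0.2, 2.0, 0.3],  # predicts 2, actual 2 -> correct
    [2.0, 0.2, 0.3, 0.1],  # predicts 0, actual 3 -> wrong
])

# argmax over the class dimension gives the predicted label for each row.
predictions = outputs.argmax(1)

# Count mismatches: every wrong prediction contributes exactly one error.
error_count = int((predictions != labels).sum())
accuracy = 1.0 - error_count / len(labels)
print(error_count, accuracy)  # 2 0.5
```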

Once that is finished, you should see a file **best_model.pth** in your predefined directory.  Right-click the file and select Download to download the model to your workstation.
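
The saved file holds only the model's weights (its `state_dict`), so to use it later you rebuild the same architecture and load the weights into it. The sketch below shows the round trip on a tiny stand-in layer and a temporary file; both are assumptions for illustration. In practice you would rebuild the same model as in this notebook and pass your own path. `map_location='cpu'` lets the file load on a machine without a GPU.

```python
import os
import tempfile

import torch

# Tiny stand-in for the trained model (8 inputs, 4 class outputs).
trained = torch.nn.Linear(8, 4)
demo_path = os.path.join(tempfile.mkdtemp(), 'best_model.pth')
torch.save(trained.state_dict(), demo_path)

# Later: rebuild the same architecture, then load the saved weights into it.
restored = torch.nn.Linear(8, 4)
restored.load_state_dict(torch.load(demo_path, map_location='cpu'))
restored.eval()  # switch to evaluation mode before running predictions

weights_match = torch.equal(trained.weight, restored.weight)
print(weights_match)  # True
```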