# <p style="text-align: center;"> Mini Project Three: Autonomous Vehicle Driving with CNN</p>

![title](Images\car-running.gif)

In [1]:
from IPython.display import HTML
from IPython.display import Image
HTML('''<script>
code_show=true; 
function code_toggle() {
 if (code_show){
 $('div.input').hide();
 } else {
 $('div.input').show();
 }
 code_show = !code_show
} 
$( document ).ready(code_toggle);
</script>
The raw code for this IPython notebook is by default hidden for easier reading.
To toggle on/off the raw code, click <a href="javascript:code_toggle()">here</a>.''')

# <p style="text-align: center;"> Table of Contents </p>
- ## 1. [Introduction](#Introduction)
   - ### 1.1 [Abstract](#abstract)
   - ### 1.2 [Importing Libraries](#importing_libraries)
   - ### 1.3 [Setting up the Environment](#Environment)

- ## 2.0 [Creation of the Dataset](#Creation_Dataset)
- ## 3.0 [Training](#Training)
- ## 4.0 [Driving](#Driving)  
- ## 5. [Conclusion](#Conclusion)
- ## 6. [Contribution](#Contribution)
- ## 7. [Citation](#Citation)
- ## 8. [License](#License)

# <p style="text-align: center;"> 1.0 Introduction </p> <a id='Introduction'></a>

#   1.1 Abstract  <a id='abstract'></a>

Welcome to Imitation Learning.

[Back to top](#Introduction)

#   1.2 Importing Libraries  <a id='importing_libraries'></a>

Every data science or machine learning project starts by importing libraries. A Python library is a reusable collection of code that you can include in your programs and projects.
In this step we import the libraries our program requires. The main ones are NumPy, Gym, PyTorch, PIL and Pyglet.

[Back to top](#Introduction)

In [2]:
# Using Numpy for working with arrays
import numpy as np

# The gym library is a collection of test problems — environments
import gym
from gym import envs

# Imageio for creating and saving the image
import imageio

# The os module provides functions for interacting with the operating system.
import os

# The sys module provides functions used to manipulate different parts of the Python runtime environment.
import sys

# Pyglet is a library for building windowed, visually rich applications; here it provides the keyboard key constants
import pyglet
from pyglet.window import key

# We use copy module for shallow and deep copy operations.
import copy 

# PyTorch is a Python package that provides two high-level features:
#- Tensor computation (like NumPy) with strong GPU acceleration
#- Deep neural networks built on a tape-based autograd system

import torch
from torch.autograd import Variable
from torch import optim, nn
from torch.nn import Softmax

# Python Imaging Library for simple image processing such as resizing (scaling), rotation, and trimming (partial cutout)
import PIL

# Importing our Python scripts
from model import CustomModel
from data import transform_driving_image, LEFT, RIGHT, GO, ACTIONS, CustomDataset, get_dataloader

# For making a table in results
from astropy.table import Table, Column

# To ignore warnings
import warnings; warnings.simplefilter('ignore')

#   1.3 Setting up the Environment  <a id='Environment'></a>

Before we set up our environment, we need to install a few packages that make our game and neural network work.

### 1) Gym facility
Install OpenAI Gym on the machine

Follow the instructions at https://github.com/openai/gym#installation for a complete guide.

**Summary of instructions:**
- Install Python 3.5+
- Clone the gym repo: git clone https://github.com/openai/gym.git
- cd gym
- Gym installation, with the box2d environments: pip install -e '.[box2d]'

Follow these steps to play the Car Racing game:
- cd gym/envs/box2d
- python car_racing.py

### 2) Pytorch
PyTorch is the deep learning framework that we will be using. It makes building neural networks straightforward.

Follow the instructions on http://pytorch.org/ for a complete guide.

**Summary of instructions:**
- Install Python 3.5+
- It is recommended to manage PyTorch with Anaconda. Please install Anaconda
- Install PyTorch following instructions at https://pytorch.org/get-started/locally/
![title](Images\Pytorch_Installation.png)

For example, this is the setup for my computer:
> pip install torch==1.7.0+cpu torchvision==0.8.1+cpu torchaudio===0.7.0 -f https://download.pytorch.org/whl/torch_stable.html

## The Environment

For this tutorial, we will use the gym library developed by OpenAI. It provides environments (simple games) to develop reinforcement learning algorithms.

The environment we will be using is CarRacing-v0 ( https://gym.openai.com/envs/CarRacing-v0/ ). It is about driving a car on a circuit, the objective being to move forward while staying on the track, which contains many turns. The input to the algorithm (the state provided by the environment) is only the image displayed by the environment: we see the car, and the terrain around it.
![title](Images\car-racing.png)

The idea is to drive the car by analyzing this image.

We are going to use this library in a roundabout way. It is designed for reinforcement learning, where the objective is in principle to use the rewards provided by the environment to learn the optimal strategy without user intervention. Here we will not be using these rewards.

In addition, we will be doing end-to-end learning, which means that the neural network directly gives us the commands to steer the car. It is not a road detection module whose output is then analyzed by another program (most real autonomous driving systems are built that way). Here, the neural network takes the image of the terrain as input and issues a command to be executed (turn left, turn right, continue straight ahead), without any intermediate program.

To use the environment, you need to import it like this:

>import gym

>env = gym.make('CarRacing-v0').env

You can then access several useful functions:

- **env.reset() :** Allows you to restart the environment
- **env.step(action) :** Allows you to perform the action `action`. This function returns a tuple `state`, `reward`, `done`, `info`: `state` is the state of the game after the action, `reward` is the reward obtained, `done` indicates whether the game is finished, and `info` contains debug data.
- **env.render() :** Displays the game window.

Here, the `state` returned by `env.step(action)` is the image displayed on the screen (the pixel matrix). It is this data that we will use to steer our car.
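The interaction loop built on `reset` and `step` can be sketched without installing gym. The `StubEnv` class below is a hypothetical stand-in written only for this illustration: it mimics the interface described above but returns tiny fake images, and the action layout `[steer, gas, brake]` is inferred from the keyboard handlers used later in this notebook.

```python
import random


class StubEnv:
    """Hypothetical stand-in that mimics the reset/step interface described above.

    It is NOT gym: states are tiny nested lists instead of real game images.
    """

    def __init__(self, episode_length=5, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.t = 0

    def _observe(self):
        # A fake 2x2 "image" with one gray value per pixel.
        return [[self.rng.randint(0, 255) for _ in range(2)] for _ in range(2)]

    def reset(self):
        self.t = 0
        return self._observe()

    def step(self, action):
        self.t += 1
        state = self._observe()
        reward = -0.1                      # placeholder reward, unused here
        done = self.t >= self.episode_length
        info = {"t": self.t}
        return state, reward, done, info


def run_episode(env, policy):
    """Run one episode and return the list of actions taken."""
    state = env.reset()
    actions, done = [], False
    while not done:
        action = policy(state)
        state, reward, done, info = env.step(action)
        actions.append(action)
    return actions


# A trivial policy: always go straight, i.e. [steer, gas, brake] = [0, 0, 0].
actions = run_episode(StubEnv(episode_length=5), lambda state: [0.0, 0.0, 0.0])
print(len(actions))
```

With the real environment, `policy` would be either the keyboard (when recording the dataset) or the trained network (when driving).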

[Back to top](#Introduction)

# <p style="text-align: center;"> 2.0 Creation of the dataset </p> <a id='Creation_Dataset'></a>

![title](Images\Racing_start.gif)

The car can be controlled with the arrow keys of the keyboard.

From here on, we want to train a neural network that takes the game image as input and returns the command to send (left, right, straight) as output. We focus first on controlling the direction; the speed still has to be controlled with the up and down keys.

The first step in training our neural network is to create a dataset, that is, to record a set of images together with their labels. We represent the possible actions with integers:

- 0 to indicate to go left
- 1 to indicate go right
- 2 to indicate to go straight

Thus, we will save a set of 3000 images in a folder, accompanied by a file labels.txt indicating on each line <image path> label. We have 3 labels, so we save 1000 images of each label for the training set. For the testing set, we will save 600.
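As a small illustration of the label file format, the sketch below parses lines of the form `<image path> label` and counts the samples per class. The function name `parse_labels` and the four example lines are made up for this example; a real file would hold 1000 lines per class.

```python
from collections import Counter


def parse_labels(lines):
    """Parse lines of the form '<image path> <label>' into (path, label) pairs."""
    samples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the last space so that paths containing spaces still work.
        path, label = line.rsplit(" ", 1)
        samples.append((path, int(label)))
    return samples


# Hypothetical file contents, for illustration only.
example_lines = [
    "img-0.jpg 2",
    "img-1.jpg 2",
    "img-2.jpg 0",
    "img-3.jpg 1",
]

samples = parse_labels(example_lines)
class_counts = Counter(label for _, label in samples)
print(samples[2])
print(sorted(class_counts.items()))
```

Checking the per-class counts this way is a quick test that the recorded dataset is balanced across the three actions.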

We have created a training set, which is used to train the network, and a test set, which is used to evaluate its performance during training and to know when to stop. Indeed, given the relatively small number of images we use (3000), there is a risk of overfitting: the network loses its ability to generalize because it becomes too specialized on the particular cases of the training set. We want to avoid this situation, since we will later use our model in situations it has not seen. The technique of stopping training before convergence is called early stopping.
    
>As the epochs go by, the algorithm learns and its error on the training set naturally goes down, and so does its error on the validation set. However, after a while, the validation error stops decreasing and actually starts to go back up. This indicates that the model has started to overfit the training data. With Early Stopping, you just stop training as soon as the validation error reaches the minimum.
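The early stopping rule quoted above can be sketched in a few lines. This is only an illustration under stated assumptions: the `patience` parameter and the list of validation errors are invented for the example. The training code in this notebook takes a simpler route: it runs every epoch and keeps the weights with the best validation accuracy.

```python
def early_stopping_epoch(val_errors, patience=3):
    """Return the index of the epoch whose weights should be kept.

    Training is considered stopped once `patience` consecutive epochs
    fail to improve on the best validation error seen so far.
    """
    best_epoch, best_error = 0, float("inf")
    epochs_without_improvement = 0
    for epoch, error in enumerate(val_errors):
        if error < best_error:
            best_epoch, best_error = epoch, error
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # stop: the validation error has stopped improving
    return best_epoch


# Hypothetical validation errors: they fall, then rise again (overfitting).
val_errors = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61, 0.40]
best = early_stopping_epoch(val_errors, patience=3)
print(best)
```

Note that the last value (0.40) is never reached, because training stops after three epochs without improvement. That is the usual trade-off of a small patience.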
    
We run this piece of code twice as we want to save different datasets for training and testing.
    
[Back to top](#Introduction)

In [25]:
# %%capture
# record_dataset.py

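samples_each_classes = 1000  # number of images to record per action class


def action_to_id(a):
    """Map the action vector [steer, gas, brake] to a class label."""
    if all(a == [-1, 0, 0]): return LEFT
    elif all(a == [1, 0, 0]): return RIGHT
    else:
        return GO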

# is_pressed_esc   = False

if __name__=='__main__':
    quit=False
    if len(sys.argv) < 2:
        sys.exit("Usage : python record_dataset.py path")
    
    env = gym.make('CarRacing-v0').env
    
    envs.box2d.car_racing.WINDOW_H = 750
    envs.box2d.car_racing.WINDOW_W = 1200
    env.reset()

    folder = sys.argv[1]
    images = os.path.join(folder, "train_images") 
    labels = os.path.join(folder, "train_labels.txt")
    os.makedirs(images, exist_ok=True)

    a = np.array([0.0, 0.0, 0.0])
    
    def key_press(k, mod):
        global quit
        if k == key.ESCAPE: quit = True  # Esc (key code 65307) stops the recording
        if k==key.LEFT:  a[0] = -1.0
        if k==key.RIGHT: a[0] = +1.0
        if k==key.UP:    a[1] = +1.0
        if k==key.DOWN:  a[2] = +0.8   # set 1.0 for wheels to block to zero rotation
#         if k==65307 : is_pressed_esc = True    
    def key_release(k, mod):
        if k==key.LEFT  and a[0]==-1.0: a[0] = 0
        if k==key.RIGHT and a[0]==+1.0: a[0] = 0
        if k==key.UP:    a[1] = 0
        if k==key.DOWN:  a[2] = 0

    env.viewer.window.on_key_press = key_press
    env.viewer.window.on_key_release = key_release
    env.reset()
    for i in range(100):
        env.step([0, 0, 0])
        env.render()

    file_labels = open(labels, 'w')
    samples_saved = {a: 0 for a in ACTIONS}

    i = 0
#     while not is_pressed_esc:
    while not quit :  
        env.render(close = False)
        s, r, done, info = env.step(a)
        action_id = action_to_id(a)
        if samples_saved[action_id] < samples_each_classes:
            samples_saved[action_id] += 1
            # Save the frame and append "<image name> <label>" to the labels file
            imageio.imwrite(os.path.join(folder, 'train_images', 'img-%s.jpg' % i ), s)
            file_labels.write('%s %s\n' % ('img-%s.jpg' % i, action_id))
            file_labels.flush()
            i += 1
            print(samples_saved)
    file_labels.close()
    env.render(close=True)

[2020-12-06 01:09:20,051] Making new env: CarRacing-v0


Track generation: 1042..1312 -> 270-tiles track
Track generation: 1111..1393 -> 282-tiles track
{0: 0, 1: 0, 2: 1}
{0: 0, 1: 0, 2: 2}
{0: 0, 1: 0, 2: 3}
...
{0: 1000, 1: 998, 2: 1000}
{0: 1000, 1: 999, 2: 1000}
{0: 1000, 1: 1000, 2: 1000}


# <p style="text-align: center;"> 3.0 Training </p> <a id='Training'></a>

## Model training with PyTorch

PyTorch is a Python library for tensor computation and deep learning. On the one hand, it offers an equivalent of NumPy that runs on both CPU and GPU. On the other hand, it can compute the gradient of every operation performed on the data, which is what the backpropagation algorithm at the heart of neural network training requires. PyTorch also provides a set of modules that can be assembled to build neural networks simply.

In PyTorch, the basic building block is the module. Each module is a function, or an assembly of PyTorch functions, that takes tensors (multi-dimensional arrays of data) as input and returns another tensor. All operations performed in a module are recorded, because the operation graph is needed for backpropagation.

![title](Images\Custom_Model.png)

**The `__init__` function:** here we define the network architecture. Our network is made up of two parts: `self.convnet` and `self.classifier`. The `convnet` part is the convolutional part: it is responsible for analyzing the image and recognizing shapes. It is made up of two convolution layers (pattern recognition), followed by a non-linearity (ReLU) and a pooling layer (which makes the output less sensitive to small translations).

The second part is the `classifier`: it takes the output of the convolutional part and produces a vector of size `num_classes = 3`, which holds the score of each possible action.

The call to `nn.Sequential` chains the layers: the input passes through them in order, the output of one layer being the input of the next.


**The `forward` function:** PyTorch calls this function whenever our module is called. Between the `convnet` and `classifier` parts, the feature maps of each image are flattened into a vector by `input = input.view(input.size(0), -1)` (the first dimension being the number of images in the batch). It is a shortcut for `input = input.view(input.size(0), input.size(1) * input.size(2) * input.size(3))`.

The input does indeed have 4 dimensions. In PyTorch they are ordered as (batch, channels, height, width): the first is the number of images in the batch, the second is the number of channels, and the last two are the y and x dimensions of the image. There are 3 channels (the 3 colors) when entering the network; each convolution then creates new channels while reducing the x and y sizes. Thus, as the layers progress, the first dimension stays fixed, the second (channels) increases, and the last two decrease.
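To make the shrinking of the spatial dimensions concrete, here is a minimal sketch of the standard output-size formula for a convolution or pooling layer without dilation. The kernel sizes, strides and channel count below are assumptions chosen for illustration; they are not read from `CustomModel`, whose exact layers are only shown in the figure above.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


# Hypothetical layer settings, chosen only for illustration.
layers = [
    ("conv", 5, 1),   # 5x5 convolution, stride 1
    ("pool", 2, 2),   # 2x2 max pooling, stride 2
    ("conv", 5, 1),
    ("pool", 2, 2),
]

size = 72             # side of the cropped input image
sizes = [size]
for name, kernel, stride in layers:
    size = conv_output_size(size, kernel, stride)
    sizes.append(size)

channels = 16         # hypothetical number of channels after the last convolution
flattened = channels * size * size   # length of the vector fed to the classifier
print(sizes)
print(flattened)
```

The value of `flattened` is what `input.view(input.size(0), -1)` produces per image, so it must match the input size of the first linear layer of the classifier.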


## Data Preparation

We are going to create a `Dataset` class, which PyTorch will use to load our dataset into memory through its `DataLoader` class.

First of all, we will define the transformations that will be used to preprocess the images, in order to give them as input to the neural network.

`from torchvision import transforms

transform_driving_image = transforms.Compose([
    transforms.CenterCrop(72),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])`
This transformation performs the following actions:

**Crop the image :** transforms.CenterCrop(72)

This keeps only a square of 72 × 72 pixels, centered on the center of the original image. Indeed, the image that we get from the environment looks like this:

![title](Images\Crop_Image.png)

We can see that the screen shows an indicator bar with the speed, the steering direction and the acceleration controls. If we do not hide it, the CNN may learn to associate the commands we give it with these indicators (they are indeed the best cue on the screen for deducing the command).

After cropping, the resulting image is shown below. The CNN is forced to analyze the road and the position of the car.

![title](Images\Crop_Image_2.png)

We notice that the images provided to the CNN are of much lower quality than those displayed by the environment during the game: they are only 96 × 96 pixels before cropping. This is enough for the neural network to analyze the shapes, and it makes training much faster (because far fewer neurons are needed).
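The arithmetic of the center crop can be checked with a short sketch. The helper `center_crop_box` is a hypothetical name written for this example; it only computes the crop rectangle and does not touch any image.

```python
def center_crop_box(width, height, crop_size):
    """Return (left, top, right, bottom) of a centered square crop."""
    left = int(round((width - crop_size) / 2.0))
    top = int(round((height - crop_size) / 2.0))
    return left, top, left + crop_size, top + crop_size


box = center_crop_box(96, 96, 72)
print(box)
```

For a 96 × 96 frame, 12 pixels are removed on every side, so the bottom rows of the frame, where the indicator bar is drawn, are no longer visible to the network.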

**Transform the matrix into a PyTorch tensor:** transforms.ToTensor()

The tensor is the base object used by PyTorch to store data. It is the analogue of a NumPy array, except that it can be stored on the CPU or on the GPU. We need to transform our image into a PyTorch tensor before giving it to the neural network as input.

We could also use the function `torch.from_numpy(numpy_array)` to transform a NumPy array into a tensor.

**Standardization:** transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))

After `ToTensor`, the pixel values lie between 0 and 1. Here we subtract 0.5 and divide by 0.5 so that the data lies between -1 and 1, which is better suited to training a neural network (data centered on 0 with a spread close to 1).
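The effect of the two steps on a single pixel value can be verified by hand. The sketch below uses plain Python numbers rather than tensors; the three sample pixel values are arbitrary.

```python
def normalize(value, mean=0.5, std=0.5):
    """Apply (value - mean) / std, the per-channel rule used by Normalize."""
    return (value - mean) / std


pixels_0_255 = [0, 128, 255]                      # raw 8-bit pixel values
scaled = [p / 255.0 for p in pixels_0_255]        # scaling to [0, 1], as ToTensor does
normalized = [normalize(v) for v in scaled]       # now in [-1, 1]
print(normalized[0], normalized[-1])
```

Black maps to -1, white maps to 1, and mid-gray lands very close to 0.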

**The `__len__` function** should return the length of the dataset. Here it is the total number of images.

**The `__getitem__(self, index)` function** should return the item at position `index`. Here, we load the image corresponding to this index, apply the transformations to it, then return the resulting tensor together with its label (also as a tensor).

**Directions** We have encoded the directions in three variables `LEFT`, `RIGHT` and `GO`, which are used in the different modules.
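The two methods form a small protocol that a loader relies on to build batches. The sketch below illustrates it in plain Python: `InMemoryDataset` and `iterate_batches` are hypothetical names, the items are file names rather than real images, and no shuffling is done.

```python
class InMemoryDataset:
    """Hypothetical dataset following the __len__/__getitem__ protocol.

    It stores (image_name, label) pairs instead of loading real images.
    """

    def __init__(self, samples):
        self.samples = list(samples)

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        name, label = self.samples[index]
        # A real dataset would open the image and apply the transform here.
        return name, label


def iterate_batches(dataset, batch_size):
    """Yield lists of items, walking the dataset in order."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        yield [dataset[i] for i in range(start, stop)]


dataset = InMemoryDataset([("img-%d.jpg" % i, i % 3) for i in range(7)])
batches = list(iterate_batches(dataset, batch_size=3))
print(len(dataset), [len(b) for b in batches])
```

The last batch is smaller than the others whenever the dataset size is not a multiple of the batch size, which matters when averaging statistics per batch.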

# Code for training the neural network

This code is taken from the tutorial http://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html .

The general idea is as follows:

Each epoch consists of an evaluation pass on the test dataset and a training pass over the entire training dataset. The data is loaded using a `DataLoader`, provided by PyTorch (we pass it the `Dataset` object we created previously).

Some important steps:

**Wrapping the `Tensors` in `Variables`:** In PyTorch, the step `data = Variable(tensor)` was necessary because the `Variable` object keeps in memory the gradient of this variable with respect to the final loss. A `Variable` is in fact a combination of two tensors: the data and the gradients. Since PyTorch 0.4, `Tensor` and `Variable` are merged, so in recent versions the wrapping is kept only for backward compatibility.

**Backpropagation**

To perform the backpropagation in pytorch, the following steps are necessary:

- **optimizer.zero_grad() :** called at each iteration of the loop. It resets the gradients of every parameter to zero.
- **loss.backward() :** computes the gradient of the loss with respect to each parameter by backpropagation and stores it with that parameter.
- **optimizer.step() :** updates each parameter of our model (the network weights) in order to reduce the loss.
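The three steps above can be mimicked on a single weight without PyTorch. This is a toy analogue, not the library's implementation: `ScalarParameter` and the three helper functions are hypothetical names, the loss is a simple squared error, and the gradient is written by hand instead of being computed by autograd.

```python
class ScalarParameter:
    """Toy stand-in for a trainable weight: a value plus its gradient."""

    def __init__(self, value):
        self.value = value
        self.grad = 0.0


def zero_grad(param):
    param.grad = 0.0                      # analogue of optimizer.zero_grad()


def backward(param, target):
    # Loss = (value - target)^2, so d(loss)/d(value) = 2 * (value - target).
    # Gradients accumulate, which is why zero_grad is needed every iteration.
    param.grad += 2.0 * (param.value - target)


def step(param, learning_rate):
    param.value -= learning_rate * param.grad   # analogue of optimizer.step()


w = ScalarParameter(0.0)
for _ in range(50):
    zero_grad(w)
    backward(w, target=3.0)
    step(w, learning_rate=0.1)

print(round(w.value, 4))
```

If the call to `zero_grad` is removed, the gradients of all previous iterations add up and the weight overshoots, which is the reason the reset appears at the top of every training iteration.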

We have explained the intricate workings of CNN in the following notebook:

[CNN](./INFO7390_Assignment_3_Mini_Project_Basics_of_Convolutional_Neural_Network.ipynb)

[Back to top](#Introduction)

In [3]:


# %%capture
# train.py

def train(model, criterion, train_loader, test_loader, max_epochs=50, 
          learning_rate=0.001):
    
    dataloaders = {
        "train":train_loader, "val": test_loader
    }

    optimizer = optim.SGD(model.parameters(), lr=learning_rate, momentum=0.9)

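    os.makedirs("models2", exist_ok=True)  # folder where the checkpoints are saved
    best_acc = 0
    best_model_wts = copy.deepcopy(model.state_dict())  # fallback if accuracy never improves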
    for epoch in range(max_epochs):
        print('Epoch {}/{}'.format(epoch, max_epochs - 1))
        print('-' * 10)
        # Each epoch has a training and validation phase
        for phase in ['val', 'train']:
            if phase == 'train':
                model.train(True)  # Set model to training mode
            else:
                model.train(False)  # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            for data in dataloaders[phase]:
                # get the inputs
                inputs, labels = data
                labels = labels.view(labels.size(0))

                inputs, labels = Variable(inputs), Variable(labels)
                optimizer.zero_grad()

                outputs = model(inputs)
                _, preds = torch.max(outputs.data, 1)
                loss = criterion(outputs, labels)

                # backward + optimize only if in training phase
                if phase == 'train':
                    loss.backward()
                    optimizer.step()
                # statistics
                running_loss += loss.data * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)

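            # len(dataloaders[phase]) is the number of batches, so these are
            # per-batch averages: "Acc" is the mean number of correct predictions
            # per batch (at most the batch size), not a fraction. Divide by
            # len(dataloaders[phase].dataset) instead to get per-sample values.
            epoch_loss = running_loss / len(dataloaders[phase])
            epoch_acc = running_corrects / len(dataloaders[phase])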

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                phase, epoch_loss, epoch_acc))

            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())
                torch.save(best_model_wts, "models2/model-%s.weights" % epoch)

    print('Training complete')
    print('Best val Acc: {:.4f}'.format(best_acc))
    return best_model_wts


if __name__=='__main__': 
    num_classes = 3
    model = CustomModel()

    train_path = "train_images"
    test_path = "test_images"
    train_loader = get_dataloader(train_path, batch_size=8)
    test_loader = get_dataloader(test_path, batch_size=30)

    loss = nn.CrossEntropyLoss()
    x=train(model, loss, train_loader, test_loader)

Epoch 0/49
----------
val Loss: 33.0593 Acc: 10.0000
train Loss: 7.5771 Acc: 4.5227
Epoch 1/49
----------
val Loss: 14.7585 Acc: 25.7000
train Loss: 4.4853 Acc: 6.1653
Epoch 2/49
----------
val Loss: 13.9112 Acc: 22.0500
train Loss: 4.1538 Acc: 6.3280
Epoch 3/49
----------
val Loss: 11.4187 Acc: 26.7500
train Loss: 3.9667 Acc: 6.3493
Epoch 4/49
----------
val Loss: 10.9690 Acc: 26.1000
train Loss: 3.9197 Acc: 6.4187
Epoch 5/49
----------
val Loss: 11.9713 Acc: 25.9000
train Loss: 3.8055 Acc: 6.4587
Epoch 6/49
----------
val Loss: 10.2514 Acc: 27.0000
train Loss: 3.7262 Acc: 6.5120
Epoch 7/49
----------
val Loss: 10.5059 Acc: 26.8000
train Loss: 3.6343 Acc: 6.5333
Epoch 8/49
----------
val Loss: 10.9338 Acc: 26.8500
train Loss: 3.5882 Acc: 6.6320
Epoch 9/49
----------
val Loss: 10.3310 Acc: 26.6500
train Loss: 3.4162 Acc: 6.6027
Epoch 10/49
----------
val Loss: 9.8062 Acc: 26.5000
train Loss: 3.3022 Acc: 6.6640
Epoch 11/49
----------
val Loss: 9.2858 Acc: 26.7500
train Loss: 3.2339 Acc:

# <p style="text-align: center;"> 4.0 Driving  </p> <a id='Driving'></a>

## Driving the car using our model

Our model is now trained. We will use it to automate the steering of the car.

Let's take a closer look at what happens in the loop:

`s, r, done, info = env.step(a)
s = s.copy()        
s  = PIL.Image.fromarray(s)  `

We get the pixel matrix, and we read it using PIL (so that it is in the same format as the images read by the dataloader during the training)

`input = transform_driving_image(s)`

We apply the same transformations as in the dataset (cropping of the sides of the image, transformation into Tensor, and normalization between -1 and 1.)

`input = Variable(input[None, :], volatile=True)`

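We wrap the `Tensor` in a `Variable` to give it as input to the neural network; `input[None, :]` adds the batch dimension the network expects. The argument `volatile=True` saves memory by telling PyTorch not to record the operations performed (useful when you do not want to train the model on these examples). In PyTorch 0.4 and later, this flag no longer has any effect and `torch.no_grad()` plays that role.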

`output = Softmax()(model(input))
_, index = output.max(1)  # index is a tensor
index = index.data[0]  # get the integer inside the tensor`

We give the image to the network, we get the output. It is a tensor of size 3, each entry corresponds to the score of each action (left, right or straight). The action to choose will be the one with the highest score (we pass it in a Softmax to have an output between 0 and 1). We get the action with the function `max` which returns the max value, and its index.
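The softmax and the choice of the best action can be reproduced with the standard library alone. The scores below are invented for the example; the order of the entries follows the label encoding of the dataset section (0 = left, 1 = right, 2 = straight).

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


scores = [0.2, 2.5, -1.0]      # hypothetical network output for (left, right, straight)
probs = softmax(scores)
index = argmax(probs)
print(index, round(sum(probs), 6))
```

Softmax does not change which action has the highest score; it only rescales the scores into probabilities, which the driving loop then uses to modulate the steering strength.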

`a[0] = id_to_steer[index] * output.data[0, index] * 0.3  # lateral acceleration
env.render()`

`a[0]` is the lateral acceleration. Its base value is 0, 1 or -1 depending on the action chosen by the neural network. We multiply it by a coefficient of 0.3 to avoid overly abrupt steering, and by the probability the network assigns to the action (so the steering is stronger when the network is confident and weaker when it hesitates).
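The scaling can be checked in isolation with a few hand-picked probabilities. In the sketch below, the integer values given to `LEFT`, `RIGHT` and `GO` and the helper name `steering_command` are assumptions for the example; the notebook defines the class constants elsewhere.

```python
# Class indices are assumed for this sketch.
LEFT, RIGHT, GO = 0, 1, 2
id_to_steer = {LEFT: -1, RIGHT: 1, GO: 0}

def steering_command(index, prob, gain=0.3):
    """Direction (-1, 0 or 1) scaled by the network's confidence and a damping gain."""
    return id_to_steer[index] * prob * gain

print(steering_command(LEFT, 0.9))    # confident left turn: close to -0.3
print(steering_command(RIGHT, 0.4))   # hesitant right turn: much smaller magnitude
print(steering_command(GO, 0.99))     # straight: always zero
```

With a gain of 0.3, the steering value never exceeds 0.3 in magnitude, whatever the confidence of the network.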

**After launching**, you have to control the speed of the car with the up and down keys of the keyboard. The direction will be chosen by the neural network.

![title](Images\car-running-auto.gif)

[Back to top](#Introduction)

In [5]:
# %%capture
# drive.py

id_to_steer = {
    LEFT: -1,
    RIGHT: 1,
    GO: 0,
}

if __name__=='__main__':
    quit=False
    if len(sys.argv) < 2:
        sys.exit("Usage : python drive.py path/to/weights")
    # load the model
    #model_weights = "models2/model-1.weights"
    #model_weights = sys.argv[1]
    model = CustomModel()
    model.load_state_dict(x)

    env = gym.make('CarRacing-v0').env
    env.reset()

    a = np.array([0.0, 0.0, 0.0])

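    def key_press(k, mod):
        # Keyboard handler: Escape quits, the arrow keys set steering, gas and brake.
        global quit
        if k == 65307: quit = True       # 65307 is the Escape key code
        if k==key.LEFT:  a[0] = -1.0     # steer left
        if k==key.RIGHT: a[0] = +1.0     # steer right
        if k==key.UP:    a[1] = +1.0     # gas
        if k==key.DOWN:  a[2] = +0.8   # set 1.0 for wheels to block to zero rotation
    def key_release(k, mod):
        # Reset the matching entry of the action vector when a key is released.
        if k==key.LEFT  and a[0]==-1.0: a[0] = 0
        if k==key.RIGHT and a[0]==+1.0: a[0] = 0
        if k==key.UP:    a[1] = 0
        if k==key.DOWN:  a[2] = 0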

    env.viewer.window.on_key_press = key_press
    env.viewer.window.on_key_release = key_release
    env.reset()
    
    # initialisation
    for i in range(50):
        env.step([0, 0, 0])
        env.render()
    
    i = 0
    while not quit :  
        env.render(close = False)
        s, r, done, info = env.step(a)
        s = s.copy()
        # We transform our numpy array to PIL image
        # because our transformation takes an image as input
        s  = PIL.Image.fromarray(s)  
        input = transform_driving_image(s)
        input = Variable(input[None, :], volatile=True)
        output = Softmax()(model(input))
        _, index = output.max(1)
        index = index.data[0].item()
        print(id_to_steer[index])
        a[0] = id_to_steer[index] * output.data[0, index] * 0.3  # lateral acceleration
#         env.render()
    env.render(close=True)
#     env.close()

[2020-12-12 16:05:04,564] Making new env: CarRacing-v0


Track generation: 1001..1255 -> 254-tiles track
Track generation: 1205..1510 -> 305-tiles track
[Output truncated: the loop prints one steering value per frame. The run shown is mostly 0 (straight), with several stretches of -1 (left) and a short stretch of 1 (right) near the end.]

# <p style="text-align: center;">Conclusion</p><a id='Conclusion'></a>
    
Our network recognizes the shapes to keep the car on the desired path. It's a sort of classifier that just indicates whether the car is in the right position, too far to the right or too far to the left. We then send this command to the simulator. All of this is done in real time.
    
Behavioural cloning has a few disadvantages, though, and we can see them in this notebook.
- We need to accelerate and decelerate manually, and only up to a certain limit: beyond it, the car spins out of control and ends up on the grass.
- Since we never leave the track while recording the training data, the car has no way of returning to the road once it is on the grass.
- We only have a training set of 3000 samples and a validation set of 600. We tried increasing both by an order of magnitude (30,000 and 6,000), but with so much more data to record, the errors made while generating the dataset also shot up, which made it a very bad dataset for our neural net.
- Also, because we stayed well within the track, the car has no data on cases in which it goes off by accident.
- A possible remedy is to preprocess the data so that the dataset contains images of the car coming back onto the track, but not of it leaving.
    

[Back to top](#Introduction)

# Going further
## Acceleration control
The control of the car is not total here: the network only controls the lateral acceleration (the left / right direction) of the car, not the longitudinal acceleration (and therefore not the speed). The problem is that the speed of the car cannot be guessed from a single image, so the network cannot control the acceleration to maintain a suitable speed. Several approaches are possible:

- Use the speed bar under the image (the one that we have hidden). The direction bar should stay hidden, however, since it misleads the direction classifier.

- Give the network several successive images instead of just one, so that it can deduce the speed of the car.

- Ask the network to control only the speed, not the acceleration (a feedback system must then be coded to maintain the requested speed). This approach is not really end-to-end, but it can be simpler if we have correct external data on the current speed (one could modify the environment to provide it in addition to the state).
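As an illustration of the last option, the feedback system could be as simple as a proportional controller. The sketch below is an assumption, not part of the project: the function name `speed_to_pedals`, the gain of 0.1 and the speed units are all hypothetical, and it presumes the environment has been modified to expose the current speed.

```python
def speed_to_pedals(target_speed, current_speed, gain=0.1):
    """Proportional controller returning (gas, brake), each in [0, 1].

    A positive speed error presses the gas, a negative one presses the brake.
    """
    error = target_speed - current_speed
    command = max(-1.0, min(1.0, gain * error))   # clip to the valid pedal range
    gas = max(command, 0.0)
    brake = max(-command, 0.0)
    return gas, brake

# Too slow: full gas. Too fast: partial braking.
print(speed_to_pedals(30.0, 20.0))
print(speed_to_pedals(30.0, 35.0))
```

The two values would fill `a[1]` (gas) and `a[2]` (brake) of the action vector, while the classifier keeps writing `a[0]`.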

## Increasing the amount of data

A reliable way to improve the performance of a classifier is to increase the amount of data. Here that is time-consuming, because the data has to be recorded while playing the game manually. One way to increase the amount of data artificially is called data augmentation: applying transformations to the images that leave the labels unchanged (or change them in a known way).

One can, for example, mirror the image about its vertical axis. The left / right labels are then swapped, and the amount of data is doubled immediately. Other possible transformations are to distort the image a little or to modify the colors slightly (the colors are fixed in this environment, so this will likely be less effective here than on real images).
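The mirroring idea can be sketched with NumPy as follows. The integer class indices and the helper names `flip_example` and `augment` are assumptions made for this example; the project's own dataset code may store images and labels differently.

```python
import numpy as np

# Class indices are assumed for this sketch.
LEFT, RIGHT, GO = 0, 1, 2
FLIPPED_LABEL = {LEFT: RIGHT, RIGHT: LEFT, GO: GO}

def flip_example(image, label):
    """Mirror a height x width x channel image left-to-right and swap the label."""
    flipped = image[:, ::-1, :].copy()    # reverse the width axis
    return flipped, FLIPPED_LABEL[label]

def augment(images, labels):
    """Return the original examples followed by their mirrored copies."""
    out_images, out_labels = list(images), list(labels)
    for img, lab in zip(images, labels):
        f_img, f_lab = flip_example(img, lab)
        out_images.append(f_img)
        out_labels.append(f_lab)
    return out_images, out_labels

images = [np.random.randint(0, 256, size=(96, 96, 3), dtype=np.uint8) for _ in range(4)]
labels = [LEFT, GO, RIGHT, GO]
aug_images, aug_labels = augment(images, labels)
print(len(aug_images), aug_labels)
```

Flipping twice returns the original image and label, which is a quick sanity check that the label mapping is consistent.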

[Back to top](#Introduction)

# <p style="text-align: center;">Contribution</p><a id='Contribution'></a>

This was a fun project in which we explored the idea of Reinforcement Learning.
       
- Code by self : 35%
- Code from external Sources : 65%

[Back to top](#Introduction)

# <p style="text-align: center;">Citation</p><a id='Citation'></a>
- https://github.com/openai/gym
- https://gym.openai.com/envs/CarRacing-v0/
- https://cdancette.fr/2018/04/09/self-driving-CNN/
- https://www.jessicayung.com/explaining-tensorflow-code-for-a-convolutional-neural-network/

    
[Back to top](#Introduction)

# <p style="text-align: center;">License</p><a id='License'></a>

Copyright (c) 2020 Manali Sharma, Rushabh Nisher

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

[Back to top](#Introduction)