[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/Barchid/car-steering/blob/master/notebook.ipynb)

# Train Car Steering Prediction on Google Colab


This notebook can be used on Google Colab to launch experiments with GPUs using the current git repository. It is implemented with PyTorch and PyTorch Lightning.

## Setup

First, configure your Google Colab session by enabling a GPU runtime. In the Colab menu bar (top of the screen), go to `Runtime` ▶ `Change runtime type`, select "GPU" under `Hardware accelerator`, then click `Save`.
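
To confirm that the runtime actually exposes a GPU, a quick check along these lines can help. It is a minimal sketch that only relies on the `nvidia-smi` tool being present on GPU runtimes; `gpu_runtime_available` is a hypothetical helper name, not part of the project.

```python
import shutil
import subprocess


def gpu_runtime_available() -> bool:
    """Return True if `nvidia-smi` is installed and lists at least one GPU."""
    # Without the NVIDIA tooling on the PATH, assume no GPU runtime is active.
    if shutil.which("nvidia-smi") is None:
        return False
    # `nvidia-smi -L` prints one line per visible GPU.
    result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU runtime detected:", gpu_runtime_available())
```

If this prints `False`, the runtime type was probably not saved and the steps above should be repeated.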

Once that is done, let's clone the project repository and install the required dependencies.

In [None]:
!pip install spikingjelly pytorch-lightning torchmetrics tonic
!git clone https://github.com/Barchid/car-steering

# Move to the cloned project
%cd car-steering

In [None]:
# Pull the latest changes if you pushed commits after cloning the project
!git pull

## Program arguments
The project provides many arguments to tailor the training to your needs. Most of them come from pytorch-lightning, and some are specific to the CNN model.

In [None]:
# Print out the available parameters.
!python main.py --help

Let's take a look at the **most interesting** ones for our example:

- **Related to `pytorch-lightning`:**
  - `--default_root_dir`: path of the directory where all the tensorboard logs, checkpoints, etc. will be stored. Prefer a path that starts with `experiments/....` to keep things tidy.
  - `--gpus` and `--auto_select_gpus`: let you choose which GPU runs your CNN. **On Colab, always use `--gpus=1 --auto_select_gpus`** since you have access to one GPU.
  - `--max_epochs`: number of epochs to run your training experiment.

- **Related to the model**:
  - `--learning_rate`: learning rate for your training.
  - `--batch_size`: batch size for your experiment.

- **Related to the program**:
  - `--height`: height of the input images for the CNN.
  - `--width`: width of the input images for the CNN.
  - `--ckpt_path`: path of a checkpoint file containing a trained CNN. It can be used to test your network's performance after training.
  - `--mode`: mode of your program. Several values are possible, including `--mode="train"` to train your CNN and `--mode="validate"` to run a validation/test of your CNN (mostly used together with `--ckpt_path`).
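
To make the shape of these options concrete, here is a minimal sketch of how the model- and program-specific arguments listed above could be declared with `argparse`. This is an assumption for illustration only: the actual `main.py` may define them differently, and every default value below (in particular the image size) is a placeholder, not the project's real default.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the arguments described above."""
    parser = argparse.ArgumentParser(description="Sketch of the program arguments")
    # Model-related arguments (defaults are placeholders).
    parser.add_argument("--learning_rate", type=float, default=1e-3)
    parser.add_argument("--batch_size", type=int, default=16)
    # Program-related arguments (image size defaults are placeholders).
    parser.add_argument("--height", type=int, default=224)
    parser.add_argument("--width", type=int, default=224)
    parser.add_argument("--ckpt_path", type=str, default=None)
    parser.add_argument(
        "--mode",
        type=str,
        choices=["train", "validate", "lr_find"],
        default="train",
    )
    return parser


# Parse an explicit argument list instead of sys.argv so this runs in a notebook.
args = build_parser().parse_args(["--mode", "validate", "--ckpt_path", "model.ckpt"])
print(args.mode, args.ckpt_path, args.batch_size)
```

Restricting `--mode` with `choices` makes a typo fail immediately with a readable error instead of silently selecting no mode.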






## Training
Now that we know a little about the available arguments, we can launch a typical training run with the standard arguments for a Google Colab session.

In [None]:
!python main.py --default_root_dir="experiments/typical_training" --gpus=1 --auto_select_gpus --learning_rate=1e-3 --max_epochs=2 --mode="train" --batch_size=16

## Validation

After training, we obtain a checkpoint of the best-performing model. We can evaluate this checkpoint by running a validation pass of the trained CNN with
`--mode="validate"` and `--ckpt_path="..."`.

In [None]:
!python main.py --default_root_dir="experiments/typical_training" --mode="validate" --gpus=1 --auto_select_gpus --ckpt_path="SELECT YOUR CHECKPOINT FILE PATH HERE"
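
Locating the checkpoint file by hand can be tedious. The sketch below searches the experiment directory for `.ckpt` files, the extension pytorch-lightning uses for checkpoints, and returns the most recently modified one. `latest_checkpoint` is a hypothetical helper, and "most recent" is not necessarily "best performing", so check the file name before passing it to `--ckpt_path`.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(root_dir: str) -> Optional[Path]:
    """Return the most recently modified .ckpt file under root_dir, or None."""
    root = Path(root_dir)
    if not root.is_dir():
        return None
    # Search recursively, since checkpoints are stored in nested log folders.
    checkpoints = list(root.rglob("*.ckpt"))
    if not checkpoints:
        return None
    return max(checkpoints, key=lambda path: path.stat().st_mtime)


print(latest_checkpoint("experiments/typical_training"))
```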

## Useful features

`pytorch-lightning` comes with many features that can help you build and optimize your network. I strongly recommend taking a look at the documentation (https://pytorch-lightning.readthedocs.io/en/latest/) to learn more about it. For now, here is a quick crash course on some features you can try with the project using the built-in arguments:

### Learning Rate Finder

Finding a good learning rate can be difficult. The following argument can help you find one automatically: `--mode="lr_find"`.

In [None]:
!python main.py --default_root_dir="experiments/typical_training" --gpus=1 --auto_select_gpus --mode="lr_find" --batch_size=16
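
To give an intuition of what a learning rate finder does, here is a toy learning rate range test in plain Python. It is only a sketch on the one-parameter loss `L(w) = w**2`, not the procedure pytorch-lightning runs internally: the learning rate grows exponentially at each step, the loss is recorded, and the rate with the largest loss drop is suggested. The names `lr_range_test` and `suggest_lr` are hypothetical.

```python
def lr_range_test(min_lr=1e-4, max_lr=1.0, num_steps=40, w0=5.0):
    """Take one gradient step per learning rate, from min_lr up to max_lr."""
    # Multiplicative factor so that the rates are evenly spaced on a log scale.
    ratio = (max_lr / min_lr) ** (1.0 / (num_steps - 1))
    w = w0
    lrs, losses = [], []
    for step in range(num_steps):
        lr = min_lr * ratio ** step
        grad = 2.0 * w          # derivative of w**2
        w -= lr * grad          # plain gradient descent update
        lrs.append(lr)
        losses.append(w ** 2)   # loss after the update
    return lrs, losses


def suggest_lr(lrs, losses):
    """Return the learning rate whose step produced the largest loss drop."""
    drops = [losses[i] - losses[i + 1] for i in range(len(losses) - 1)]
    best = max(range(len(drops)), key=drops.__getitem__)
    return lrs[best + 1]


lrs, losses = lr_range_test()
print("suggested learning rate:", suggest_lr(lrs, losses))
```

On a real network the loss curve is noisy, so practical finders smooth it before picking a value; treat any suggestion as a starting point rather than a final choice.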

### Debugging your model
Like any program, a neural network is not immune to bugs in its implementation. However, unlike ordinary programs, it is hard to find such bugs, or even to detect that something is wrong: no exception will be raised to warn you.

Fortunately, a simple debugging strategy can be used to check whether your neural network actually learns something. It consists of overfitting a very small number of batches and checking that the loss quickly goes to 0 (or very close to 0). If it does not, your model has a problem that prevents it from learning anything.

Here, we will use this technique with the flag `--overfit_batches=4` to overfit our model on 4 batches.



In [None]:
!python main.py --default_root_dir="experiments/typical_training" --gpus=1 --auto_select_gpus --learning_rate=1e-3 --max_epochs=50 --overfit_batches=4 --mode="train" --batch_size=16
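
The idea behind this sanity check can be illustrated without any deep learning library. The sketch below fits a tiny linear model `y = w * x + b` on four fixed samples with gradient descent and records the mean squared error; since the model can represent the data exactly, the loss should collapse toward 0. The data, the learning rate, and the function name `overfit_tiny_dataset` are illustrative assumptions, unrelated to the project's CNN.

```python
def overfit_tiny_dataset(lr=0.1, steps=500):
    """Overfit a linear model on four samples and return the loss history."""
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [1.0, 3.0, 5.0, 7.0]  # generated by y = 2x + 1
    w, b = 0.0, 0.0
    n = len(xs)
    losses = []
    for _ in range(steps):
        # Prediction errors on the whole (tiny) dataset.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        losses.append(sum(e * e for e in errors) / n)
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return losses


losses = overfit_tiny_dataset()
print("first loss:", losses[0], "last loss:", losses[-1])
```

If the last loss stayed close to the first one here, the update rule itself would be suspect. The same reasoning applies to the `--overfit_batches` run above.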

### Don't Forget
After your experiments are done, don't forget to download the checkpoint files and log folders so that you can analyze the loss curves of your model with `tensorboard`.
To do so, once you have downloaded the experiment folders (i.e. the path given in the `--default_root_dir` argument), run the following command on your local machine.

In [None]:
tensorboard --logdir="PATH OF YOUR LOG DIRECTORY"
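
Downloading a folder file by file is slow, so it can help to pack the experiment directory into a single archive on the Colab side first. The following is a minimal sketch using only the standard library; `archive_experiment` and the archive name are hypothetical, and the path is the `--default_root_dir` used in the examples above.

```python
import shutil
from pathlib import Path


def archive_experiment(root_dir: str, archive_name: str = "experiment_logs") -> Path:
    """Zip the content of root_dir and return the path of the created archive."""
    # make_archive appends the ".zip" extension to archive_name itself.
    archive_path = shutil.make_archive(archive_name, "zip", root_dir=root_dir)
    return Path(archive_path)


# Only archive when the experiment folder actually exists.
if Path("experiments/typical_training").is_dir():
    print(archive_experiment("experiments/typical_training"))
```

The resulting `.zip` file can then be downloaded from the Colab file browser and extracted locally before pointing `tensorboard` at it.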