# SageMaker Donkey training

Create and train a donkey car using sample data.

The training script provided by the [Donkey](https://github.com/wroscoe/donkey) library is not designed to run in a distributed way. At best, it can be accelerated by running on a GPU, if one is available. In this tutorial, we'll start by installing and using the library as it comes out of the box.

## Install

If you already installed the [Donkey](https://github.com/wroscoe/donkey) library in the [previous chapter](./donkey-tools.ipynb), you can [skip](#Train) this step.

Otherwise, go ahead:

In [None]:
# Make sure we're in SageMaker root
%cd ~/SageMaker

# Remove any old versions of the library
!rm -rf ~/SageMaker/donkey

In [None]:
# Clone the Donkey library repository
!git clone https://github.com/wroscoe/donkey.git

In [None]:
# Keras is pinned to version 2.0.8 in the Donkey requirements. Change this to allow a newer version
!sed -i -e 's/keras==2.0.8/keras>=2.1/g' donkey/setup.py

# Install
!pip install -e donkey
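
For readers less familiar with `sed`, the substitution above is a literal find-and-replace on the version pin. The sketch below shows the same replacement in Python. The sample requirements line is hypothetical; only the `keras==2.0.8` pin and its replacement come from the command above.

```python
# Hypothetical line from setup.py; only the keras pin is taken from the sed command
requirements_line = "install_requires=['numpy', 'keras==2.0.8', 'h5py'],"

# Replace the exact pin with a minimum-version constraint
relaxed_line = requirements_line.replace("keras==2.0.8", "keras>=2.1")

print(relaxed_line)
```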

In [None]:
# Create a new car using the library CLI
!donkey createcar --path ~/d2

## Train

### Create a Donkey car application (again)

If you haven't already, create a new *Donkey car application* using the default parameters. See [Donkey library tools - createcar](./donkey-tools.ipynb#createcar).

In [None]:
!donkey createcar

### Download data (again)

Download the sample data again:

In [None]:
from sagemaker import get_execution_role

# Bucket location to get training data
sample_data_location = 's3://jayway-robocar-raw-data/samples'

In [None]:
!aws s3 cp {sample_data_location}/ore.zip .

In [None]:
!unzip -o ore.zip

### Train a model using the Donkey library

Start training using the `Donkey car` created in `~/d2/`.

The main entrypoint is the `~/d2/manage.py` script, which allows you to start *drive* and *train* sessions. More documentation can be found here:

- [http://docs.donkeycar.com/guide/train_autopilot/](http://docs.donkeycar.com/guide/train_autopilot/)

**Note!** This training is not accelerated and will take a long time, approximately **30 minutes**.

In [None]:
%%time

# Make sure we're in Donkey car root
%cd ~/d2

# Start the training
!python manage.py train --tub='../SageMaker/tub_8_18-02-09' --model './models/my-first-model'

Some interesting information is printed during training:

---

`joining the tubs 13630 records together. This could take 0 minutes.`

Training input is created by joining all records in the *Tub* together.

---

`train: 10904, validation: 2726`

The input is split into two sets, a *training set* and a *validation set*, with an 80/20 split.
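
The split sizes in the log line can be reproduced with a little arithmetic. This is a minimal sketch, assuming a plain 80/20 split of the record count. The variable names are illustrative and are not taken from the Donkey code.

```python
# Total number of records reported when the tubs were joined
total_records = 13630

# Assumed 80/20 split, using integer arithmetic to avoid rounding surprises
train_records = total_records * 80 // 100
validation_records = total_records - train_records

print(f"train: {train_records}, validation: {validation_records}")
```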

---

`steps_per_epoch 85`

The number of steps per epoch. One **epoch** is one full pass over the training set. Each epoch is divided into **batches**, and the **weights** of the model are updated after each batch. One **step** processes one batch.

`steps_per_epoch` is calculated as:

$$
s = \frac{t}{b} = \frac{10904}{128} \approx 85
$$

where $s$ is the number of steps per epoch, $t$ is the size of the training set, and $b$ is the batch size. The batch size is configurable in the `config.py` file as `BATCH_SIZE`.
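
The same calculation can be sketched in Python. It assumes a batch size of 128, the value used in the formula, and it assumes that only full batches are counted, so the division is rounded down. Whether the library rounds down or to the nearest integer is an assumption here; both give 85 for these numbers.

```python
train_records = 10904  # size of the training set, from the log output
batch_size = 128       # assumed value of BATCH_SIZE in config.py

# Assumption: only full batches count, so the division is rounded down
steps_per_epoch = train_records // batch_size
leftover_records = train_records - steps_per_epoch * batch_size

print(f"exact ratio: {train_records / batch_size}")
print(f"steps_per_epoch: {steps_per_epoch}")
print(f"records not filling a full batch: {leftover_records}")
```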

---

```bash
Epoch 1/100
...
85/85 [==============================] - 141s 2s/step - loss: 3.1792 - angle_out_loss: 3.5319 - throttle_out_loss: 0.4266 - val_loss: 2.1385 - val_angle_out_loss: 2.3759 - val_throttle_out_loss: 0.2332
...
```

This is the result of one epoch of training. A short summary:

| Metric                | Value       | Description |
| :-                    | :-          | :-          |
| Total epochs          | 100         | The total number of epochs in the training. This is configurable, but not easily. |
| Time spent            | 141 seconds | The time spent training this epoch. Since this training is neither accelerated (GPU) nor distributed (more than one training instance), it is very slow. |
| Number of steps       | 85          | 85 steps per epoch, each step processing a batch of 128 records. |
| loss                  | 3.1792      | The total training loss after the first epoch. This should decrease for each epoch. |
| angle_out_loss        | 3.5319      | Training loss for angle. |
| throttle_out_loss     | 0.4266      | Training loss for throttle. |
| val_loss              | 2.1385      | The total validation loss after the first epoch. This should decrease for each epoch. |
| val_angle_out_loss    | 2.3759      | Validation loss for angle. |
| val_throttle_out_loss | 0.2332      | Validation loss for throttle. |
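
To compare these values across epochs, it can help to pull them out of the log programmatically. The following is a minimal sketch that parses the summary line shown above with a regular expression. It assumes each metric is printed as `name: value` with a decimal value, which holds for this line but is not a guaranteed log format.

```python
import re

# The summary line for the first epoch, copied from the training output above
log_line = (
    "85/85 [==============================] - 141s 2s/step - loss: 3.1792 "
    "- angle_out_loss: 3.5319 - throttle_out_loss: 0.4266 - val_loss: 2.1385 "
    "- val_angle_out_loss: 2.3759 - val_throttle_out_loss: 0.2332"
)

# Each metric is printed as "<name>: <decimal number>"
pairs = re.findall(r"(\w+): (\d+\.\d+)", log_line)
metrics = {name: float(value) for name, value in pairs}

for name, value in metrics.items():
    print(f"{name:<24}{value}")
```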


### Test the new model

The easiest way to test the model is to use the simulator on your local computer; see the [previous chapter](./donkey-tools.ipynb#simulator).

To download the model, first locate it. It is written to `./models`:

In [None]:
!ls -la ./models

To download the model to your local computer, either use an S3 bucket you've already created in the account, or use the default SageMaker bucket:

In [None]:
import sagemaker
dl_bucket = sagemaker.Session().default_bucket()

In [None]:
# push the model to your bucket
!aws s3 cp ./models/my-first-model s3://{dl_bucket}/models/my-first-model

Next, download the model to your computer either manually using the AWS console, or using the CLI if you have it configured:
```bash
aws s3 cp s3://<bucketname>/models/my-first-model .
```

If you've installed the donkey library properly, you should be able to start the model server using:
```bash
donkey sim --model=./models/my-first-model --config ~/d2/config.py
```

If you've installed the simulator properly, you should be able to start it using (on Ubuntu):
```bash
./donkey_sim.x86_64
```

Finally, click the **NN Steering over Websockets** button, and the car should start driving using the model (though not particularly well).

![simulator button](./simulator-connect.png)

## Next

[Digging into the neural network](./donkey-nn.ipynb)