# DeepRacer Dev Log
[_Congrats Richard U., you've managed to nerd snipe me._](https://xkcd.com/356/)

Hi, this is a dev log for the _redacted_ 2024 DeepRacer competition.

DeepRacer is an AWS platform used to introduce developers to Reinforcement Learning.
Roughly speaking, reinforcement learning is like supervised learning with automatically generated training data: the agent acts, the environment scores the outcome, and that experience is fed back into the training process.
In the case of DeepRacer, we are training a neural network to race a car through an arbitrary track using video image data.
This is done by using a "reward function" to "grade" the successive "states" that occur due to the neural network's actions.

For example, in the case of DeepRacer, we are given a "params" dictionary object that contains information about the car's current state.
This information includes the position, speed, and orientation of the car, along with its position relative to the track (is it on or off the track).
As the goal of this project is to get the car through the track in the least amount of time possible without going off track or crashing,
our reward function should score states associated with fast race completion times higher than states associated with slower times or crashes.
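
A minimal sketch of what such a reward function could look like, assuming the `params` keys listed in the AWS documentation (`is_offtrack`, `is_crashed`, `all_wheels_on_track`, `progress`, `steps`). The base reward of 1.0 and the `progress / steps` pace term are my own illustrative choices, not tuned values and not the function used later in this log.

```python
def reward_function(params):
    """Sketch: near-zero reward off track, otherwise reward fast progress."""
    # Leaving the track or crashing ends the useful part of the episode,
    # so those states get a near-zero reward.
    if params["is_offtrack"] or params["is_crashed"] or not params["all_wheels_on_track"]:
        return 1e-3

    # Average progress (percent of track completed) per step so far.
    # A higher value means the car is on pace for a faster lap.
    steps = max(params["steps"], 1)  # avoid dividing by zero on the first step
    pace = params["progress"] / steps

    # Base reward for staying on track plus a bonus for pace.
    return float(1.0 + pace)
```

The pace term ties the reward to lap time directly instead of to raw speed, so a car that is fast but wanders should score lower than one that covers the track efficiently.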

Information about the properties supplied by the "params" dictionary object can be found [here](https://docs.aws.amazon.com/deepracer/latest/developerguide/deepracer-reward-function-input.html).
Simple sample reward functions can be found [here](https://docs.aws.amazon.com/deepracer/latest/developerguide/deepracer-reward-function-examples.html).

Other useful information that I have found:
 - We are using a single-camera car (the original DeepRacer)
 - The camera captures images at a rate of 15 frames per second
 - We can use a restricted set of Python libraries (math, random, numpy, scipy, shapely)

## Initial thoughts
Based on prior experience racing go-karts, the neural net should probably try to get the car to drive racing lines.

## Project Plan and Tasks:
 - [x] Read through documentation

## Open questions:
 - How do we know if the car has flipped in the simulator? (`params["is_crashed"]`?)
 - How can we detect skidding in the simulator?

## Proximal Policy Optimization (PPO) Notes

PPO was published by OpenAI in 2017 and appears to be the current standard for reinforcement learning.
It appears to be an incremental improvement on standard policy gradient methods, with the addition of "clipping".
"Clipping" limits how far the policy can change in a single update, which prevents performance collapse.
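
To make the clipping idea concrete, here is a small sketch of the per-sample clipped surrogate objective from the PPO paper, `min(r * A, clip(r, 1 - eps, 1 + eps) * A)`, where `r` is the probability ratio between the new and old policy and `A` is the advantage estimate. The function name is my own, and `epsilon=0.2` is the value suggested in the paper, not necessarily what DeepRacer uses.

```python
def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """Per-sample PPO clipped objective (to be maximized)."""
    # Restrict the probability ratio to [1 - epsilon, 1 + epsilon].
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    # Taking the minimum makes the objective a pessimistic bound: the policy
    # gains nothing from moving the ratio further than the clip range allows.
    return min(ratio * advantage, clipped_ratio * advantage)


# Positive advantage: a ratio of 1.5 is treated as if it were 1.2.
print(clipped_surrogate(1.5, 2.0))
# Negative advantage: a ratio of 0.5 is treated as if it were 0.8.
print(clipped_surrogate(0.5, -1.0))
```

In a real training loop this value would be averaged over a batch and maximized by gradient ascent; the sketch only shows why a large policy change stops paying off.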

Resources:
 - https://huggingface.co/learn/deep-rl-course/unit8/introduction 
 - arXiv paper: https://arxiv.org/abs/1707.06347
 - OpenAI docs:
    - https://spinningup.openai.com/en/latest/algorithms/ppo.html 
    - https://openai.com/index/openai-baselines-ppo/ 
 - wiki: https://en.wikipedia.org/wiki/Proximal_policy_optimization
 - Simulator notes:
    - https://openai.com/index/roboschool/ (defunct); see https://github.com/Farama-Foundation/Gymnasium and https://farama.org/projects
    - joint simulation
        - MuJoCo
            - https://gymnasium.farama.org/environments/mujoco/ 
            - https://github.com/google-deepmind/mujoco 
            - https://mujoco.readthedocs.io/en/stable/overview.html



## DeepRacer notes

Utilities to process DeepRacer logs: https://github.com/aws-deepracer-community/deepracer-utils
 - useful for extracting route information

Blog on how to train with racing lines:
 - https://mickqg.github.io/DeepracerBlog/ 
 - https://mickqg.github.io/DeepracerBlog/part2.html
 - https://github.com/MickQG/deepracer-analysis/blob/master/README.md

AWS stuff:
 - https://github.com/aws-solutions-library-samples/guidance-for-training-an-aws-deepracer-model-using-amazon-sagemaker 
 - https://docs.aws.amazon.com/deepracer/latest/student-userguide/reward-function.html
 - https://github.com/matrousseau/AWS-Deepracer-Optimal-Path-Generator 


## Racing Lines:
 - https://www.reddit.com/r/coolguides/comments/vw0k49/ideal_racing_lines/

## Shapely
 - https://shapely.readthedocs.io/en/stable/manual.html 
 - https://pypi.org/project/shapely/ 
 - https://github.com/shapely/shapely
 
Spatial analysis library for Python, useful for route manipulation.
Additional work in [shapelyExp.ipynb](shapelyExp.ipynb)

In [16]:

    %load_ext autoreload
    %autoreload 2
    import json

    # Load a sample `params` dictionary so the reward function can be exercised offline.
    with open("./params.json", "r") as f:
        params = json.load(f)
    print(params)

Output:

    The autoreload extension is already loaded. To reload it, use:
      %reload_ext autoreload
    {'all_wheels_on_track': True, 'x': 3.0, 'y': 3.0, 'closest_objects': [3, 4], 'closest_waypoints': [0, 1], 'distance_from_center': 0.1, 'is_crashed': False, 'is_left_of_center': True, 'is_offtrack': False, 'is_reversed': False, 'heading': 0.1, 'objects_distance': [0.3, 0.1], 'objects_heading': [0.1, 0.2], 'objects_left_of_center': [True, False], 'objects_location': [], 'objects_speed': [3.0, 5.0], 'progress': 0.1, 'speed': 0.1, 'steering_angle': 0, 'steps': 0, 'track_length': 3, 'track_width': 0.1, 'waypoints': [[0, 0], [0, 2]]}


## Reward function component testing

In [36]:

    import reward_function

    # A slice that runs past the end of the list should wrap back to the start.
    al = [0, 1, 2, 3, 4, 5]
    assert [1, 2, 3, 4, 5, 0, 1] == reward_function.spliceWithLoop(al, 1, 8)

    # Car at (10, 0), heading 90 degrees, no steering, speed 5.
    params['speed'] = 5
    params['heading'] = 90
    params["steering_angle"] = 0
    params['x'] = 10
    params['y'] = 0
    reward_function.get_projected_car_path(params).xy

Output:

    (array('d', [10.0, 10.0]), array('d', [0.0, 0.3333333333333333]))
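
`spliceWithLoop` and `get_projected_car_path` live in `reward_function.py` and are not reproduced in this log. As a rough sketch of behavior consistent with the results above, the helpers below wrap a slice around a closed track and project the car one camera frame ahead (speed 5 at 15 frames per second gives 5 / 15 ≈ 0.333 of travel). `splice_with_loop` and `project_position` are hypothetical stand-ins, not the actual implementations: this version ignores the steering angle and returns a plain tuple instead of a Shapely geometry.

```python
import math

FRAMES_PER_SECOND = 15  # camera frame rate noted earlier in this log


def splice_with_loop(items, start, end):
    """Return items[start:end], wrapping past the end of the list (closed track)."""
    return [items[i % len(items)] for i in range(start, end)]


def project_position(x, y, heading_deg, speed, frames=1):
    """Point reached after `frames` camera frames at constant speed and heading."""
    distance = speed * frames / FRAMES_PER_SECOND
    heading = math.radians(heading_deg)
    return (x + distance * math.cos(heading), y + distance * math.sin(heading))


print(splice_with_loop([0, 1, 2, 3, 4, 5], 1, 8))
print(project_position(10, 0, 90, 5))
```

Wrapping by index modulo the list length matters for waypoints, because the lookahead window near the finish line has to continue into the start of the next lap.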

In [42]:

    from deepracer.logs import (AnalysisUtils, DeepRacerLog)

    drl = DeepRacerLog(model_folder='./deepracerlogs/joes-test-model-42-training_job_ngORYSXKStmsCby370Xdtw_logs')
    drl.load_training_trace()
    training_df = drl.dataframe()

Output:

    TypeError: expected str, bytes or os.PathLike object, not NoneType