During our year of study at the University of Kent, we had to complete a project in a group of four. We chose a Reinforcement Learning project whose goal was to train a car to drive around a track using Reinforcement Learning algorithms. We had to implement the algorithms and the models from scratch, along with a way to visualize the training and its results.
Algorithm | Description | File | Applicability | Implemented ? | Responsible |
---|---|---|---|---|---|
Classic Genetic Algorithm | | Genetic.py | ✅ | ✅ | Gabriel |
Deep Q Neural Network (DQN) | | DQN.py | ✅ | ✅ | Nathan |
Neat Algorithm | | NEAT.py | ✅ | ✅ | Tom |
DDPG Algorithm | | DDPG.py | ✅ | 🔧 | Gabriel |
PPO Algorithm | | PPO.py | ✅ | 🚧 | Hugo |
Actor Critic Method | | | ✅ | 🔴 | Hugo |
VPG Algorithm | | | ✅ | 🚧 | Maxime |
Q-learning or value-iteration methods | | | | 🔴 | |
Q-Learning | | | | 🔴 | |
🔴 : Not implemented
🚧 : In progress
✅ : Implemented
Model | Description | File | Implemented ? | Responsible |
---|---|---|---|---|
CNN | Classic CNN whose outputs are the values of the simulation parameters | CNN.py | ✅ | Nathan |
Fully Connected | Dense layers only, whose outputs are the values of the simulation parameters | FullyConnected.py | ✅ | Gabriel |
Selective CNN | CNN whose output selects which move to perform (the moves are predefined) | SelectiveCNN.py | ✅ | Nathan |
Selective Fully Connected + Kmeans | Dense layers only, whose output selects which move to perform (the moves are predefined); the input is not the image but a single line of majority classes selected by K-means | SelectiveKMNN.py | ✅ | Nathan |
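
The difference between the two output styles in the table above can be sketched as follows. This is a minimal illustration, not the project's code: the `(steering, gas, brake)` action format, the `PREDEFINED_MOVES` list, and both function names are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

# Hypothetical action format (steering, gas, brake); assumed for illustration only.
Action = Tuple[float, float, float]

# Hypothetical list of predefined moves used by the "selective" models.
PREDEFINED_MOVES: List[Action] = [
    (0.0, 1.0, 0.0),   # accelerate straight ahead
    (-1.0, 0.0, 0.0),  # steer left
    (1.0, 0.0, 0.0),   # steer right
    (0.0, 0.0, 0.8),   # brake
]


def direct_action(outputs: Sequence[float]) -> Action:
    """Non-selective models: the outputs are used directly as simulation parameters."""
    steering, gas, brake = outputs
    return (float(steering), float(gas), float(brake))


def selective_action(outputs: Sequence[float]) -> Action:
    """Selective models: one score per predefined move; the highest score wins."""
    best_index = max(range(len(outputs)), key=lambda i: outputs[i])
    return PREDEFINED_MOVES[best_index]
```

With these assumptions, a selective model only has to rank a handful of moves, whereas a non-selective model must produce every parameter value itself.
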

[Figures, one per model: CNN, Fully Connected, Selective CNN, Selective Fully Connected + Kmeans]

[Figures, one per combination: Genetic Algorithm + FullyConnected, DQN + SelectiveKMNN, NEAT, DDPG + CNN]

pip install -r requirements.txt
python carRacing.py <MODEL_NAME> <ALGORITHM_NAME>
python visualize.py <MODEL_NAME> <ALGORITHM_NAME>
# or
python visualize.py <MODEL_NAME> <ALGORITHM_NAME> <SEED> # to run a specific seed
python saves/stats.py <CSV_FILE>
python saves/stats.py SPECIFIC <FILTER> <TYPE_OF_STATS>
python saves/stats.py ALL <TYPE_OF_STATS>
flowchart TD
A[Start] --> B[Load Brain and Estimator \nfrom entry arguments]
B --> C{Save file\nexists ?}
C --> |Yes| D[Load weights into brain]
D --> E
C --> |No| E[Loop of all simulations]
E --> G[Simulation reset]
G --> J[Current simulation loop]
J --> K[Brain predicts next move]
K --> L[Estimator memorizes current\nmove and state if needed]
L --> M[Calculate current score\nfrom simulation]
M --> N{Simulation\ndone ?}
N --> |No| J
N --> |Yes| O{New best\nscore ?}
O --> |Yes| P[Save weights\nin file]
O --> |No| Q
P --> Q[Estimator updates brain's weights]
Q --> R[Brain trains its network\nwith new weights]
R --> E
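
The loop in the flowchart above can be sketched in Python. This is a simplified illustration under stated assumptions, not the project's implementation: `ToySimulation`, `ToyBrain`, `ToyEstimator`, and `train` are hypothetical names, the estimator simply perturbs the weights at random, and the best weights are kept in memory instead of being written to a save file.

```python
import random
from typing import List, Optional, Tuple


class ToySimulation:
    """Hypothetical stand-in for the car simulation: ends after max_steps moves."""

    def __init__(self, max_steps: int = 5) -> None:
        self.max_steps = max_steps
        self.steps = 0
        self.score = 0.0

    def reset(self) -> float:
        self.steps = 0
        self.score = 0.0
        return 0.0  # initial state

    def step(self, move: float) -> Tuple[float, float, bool]:
        self.steps += 1
        self.score += move  # toy score: sum of the moves played
        return float(self.steps), self.score, self.steps >= self.max_steps


class ToyBrain:
    """Hypothetical model with a single weight, used as the predicted move."""

    def __init__(self, weights: Optional[List[float]] = None) -> None:
        # Mirrors "load weights into brain" when saved weights are provided.
        self.weights = list(weights) if weights is not None else [0.5]

    def predict(self, state: float) -> float:
        return self.weights[0]

    def train(self, new_weights: List[float]) -> None:
        self.weights = list(new_weights)


class ToyEstimator:
    """Hypothetical algorithm: memorizes transitions, then perturbs the weights."""

    def __init__(self, rng: random.Random) -> None:
        self.rng = rng
        self.memory: List[Tuple[float, float]] = []

    def memorize(self, state: float, move: float) -> None:
        self.memory.append((state, move))

    def update(self, brain: ToyBrain) -> List[float]:
        self.memory.clear()
        return [w + self.rng.uniform(-0.1, 0.1) for w in brain.weights]


def train(brain, estimator, simulation, n_simulations: int):
    best_score, best_weights = None, None
    for _ in range(n_simulations):            # loop over all simulations
        state = simulation.reset()            # simulation reset
        done, score = False, 0.0
        while not done:                       # current simulation loop
            move = brain.predict(state)       # brain predicts next move
            estimator.memorize(state, move)   # estimator memorizes state and move
            state, score, done = simulation.step(move)
        if best_score is None or score > best_score:
            best_score = score
            best_weights = list(brain.weights)  # the project saves these to a file
        brain.train(estimator.update(brain))  # estimator updates, brain trains
    return best_score, best_weights
```

A call such as `train(ToyBrain(), ToyEstimator(random.Random(0)), ToySimulation(), 3)` runs three toy simulations and returns the best score together with the weights that produced it.
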