Author: Joshua Sia
Date: 2020-11-10
Reinforcement learning is used to solve a randomly generated maze. Some examples of mazes are shown below. The red circle represents the agent's starting state and the blue circle represents the goal.
Random maze 1 | Random maze 2 | Random maze 3 |
---|---|---|
Deep Q-learning (DQL) was implemented to train an agent to reach the goal. Training ran for 10 minutes, after which the agent's policy was executed greedily to determine whether the agent could reach the goal.
For more details on the rules of the problem, please see here.
For more details on the implementation of DQL, please see here.
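As a rough illustration of the two ideas mentioned above (the Q-learning update target and greedy execution of a learned policy), here is a minimal, framework-free sketch. It is not the code in this repository: `q_learning_target`, `greedy_action`, `run_greedy_episode`, the `q_function` and `step_function` callables, and the discount factor `gamma=0.9` are all hypothetical names and values chosen for the example.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

S = TypeVar("S")  # state type; whatever the environment uses (assumption)


def q_learning_target(reward: float, next_q_values: Sequence[float],
                      done: bool, gamma: float = 0.9) -> float:
    """Standard Q-learning target: r + gamma * max_a' Q(s', a').

    No bootstrapping is applied when the transition ends the episode.
    gamma=0.9 is an illustrative default, not a value from this project.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def greedy_action(q_values: Sequence[float]) -> int:
    """Index of the largest Q-value (ties go to the lowest index)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def run_greedy_episode(
    q_function: Callable[[S], Sequence[float]],
    step_function: Callable[[S, int], Tuple[S, bool]],
    start_state: S,
    max_steps: int,
) -> Tuple[List[S], bool]:
    """Follow the greedy policy until the goal is reached or max_steps is hit.

    q_function maps a state to one Q-value per action (e.g. a trained network).
    step_function maps (state, action) to (next_state, reached_goal).
    Returns the visited states and whether the goal was reached.
    """
    state = start_state
    path = [state]
    for _ in range(max_steps):
        action = greedy_action(q_function(state))
        state, reached_goal = step_function(state, action)
        path.append(state)
        if reached_goal:
            return path, True
    return path, False
```

In a DQL setup, `q_function` would typically wrap a forward pass of the trained Q-network, and the returned path corresponds to the kind of trajectory plotted in the result figures.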
Examples of the agent's learned policy after training are shown below. The purple line represents the path the agent took under its learned policy.
Reached goal in 69 steps | Reached goal in 85 steps | Reached goal in 99 steps |
---|---|---|
It is recommended to run the scripts in a virtual environment. To get started, create a virtual environment named randommaze by running the following command at the command line:
conda create --name randommaze python=3.8.2 -y
Next, activate the virtual environment:
conda activate randommaze
Then, install the dependencies listed below using:
pip3 install numpy==1.19.3 matplotlib==3.3.0 opencv-python==4.5.1.48
To install PyTorch, visit the PyTorch website and look for version 1.7.0. Copy the installation command for your platform and run it at the command line. For instance, the installation command for Mac is:
conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 -c pytorch -y
Finally, run the Python script to train the agent using:
python train_and_test.py
- numpy=1.19.3
- matplotlib=3.3.0
- opencv-python=4.5.1.48
- torch=1.7.0
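To confirm that the active environment matches the pinned versions listed above, a small check along these lines may help. This is only a sketch and not part of the repository: `PINNED` and `check_versions` are hypothetical names, and it relies on `importlib.metadata`, which is in the standard library from Python 3.8.

```python
from importlib import metadata  # standard library from Python 3.8

# Pinned versions copied from the dependency list above.
PINNED = {
    "numpy": "1.19.3",
    "matplotlib": "3.3.0",
    "opencv-python": "4.5.1.48",
    "torch": "1.7.0",
}


def check_versions(pinned):
    """Map each package name to 'ok', 'missing', or 'found <installed version>'."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if installed == wanted else f"found {installed}"
    return report


if __name__ == "__main__":
    for name, status in check_versions(PINNED).items():
        print(f"{name}: {status}")
```

Some PyTorch builds report a version with a local suffix (for example a `+cpu` tag), which this exact-match comparison would flag as a mismatch even though the base version is the pinned one.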
The code for the random environment was provided by Dr Edward Johns at Imperial College London as part of the Reinforcement Learning module.