
Reinforcement Learning - Obstacle Tower Challenge


In this project we had to create an agent to tackle the Obstacle Tower Challenge. The agent must ascend a tower, proceeding through as many floors/levels as possible.

Team

  • Nishai Kooverjee (135477)
  • Kenan Karavoussanos (1348582)
  • Angus Mackenzie (1106817)
  • Africa Khoza (1137682)

Setup

To run this code, you need the requisite packages installed and the environment set up.

Packages

To install the packages, run the following command:

conda env create -f environment.yml

Then activate the environment by running:

conda activate proj

Environment

This project required an offshoot of the Obstacle Tower environment. The environment is too large for GitHub, so we saved it on Google Drive. Download the ObstacleTower.zip file from Google Drive, then unzip it into the repository's directory. You will likely need to change the permissions to make the binary executable; you can do this by running the following in the repository directory:

chmod -R 755 ./ObstacleTower/obstacletower.x86_64

Environment Configuration

The following configuration was laid out for us in the course:

starting-floor:        0
total-floors:          9
dense-reward:          1
lighting-type:         0
visual-theme:          0
default-theme:         0
agent-perspective:     1
allowed-rooms:         0
allowed-modules:       0
allowed-floors:        0
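
For reference, a minimal sketch of how this configuration might be passed to the environment, assuming the obstacle_tower_env Python package (the exact entry point used by this repo's scripts may differ):

# Sketch only: assumes the obstacle_tower_env package and the binary path below.
from obstacle_tower_env import ObstacleTowerEnv

config = {
    'starting-floor': 0,
    'total-floors': 9,
    'dense-reward': 1,
    'lighting-type': 0,
    'visual-theme': 0,
    'default-theme': 0,
    'agent-perspective': 1,
    'allowed-rooms': 0,
    'allowed-modules': 0,
    'allowed-floors': 0,
}

# The config dict is applied as reset parameters when the environment starts.
env = ObstacleTowerEnv('./ObstacleTower/obstacletower.x86_64',
                       retro=True, config=config)
obs = env.reset()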

Evaluating

To get an estimate of the score the agent would obtain during marking, do the following.

Before attempting an evaluation, ensure that the __init__ method in MyAgent.py has the path to load the weights from. An example follows:

self.policy_network.load_state_dict(torch.load("checkpoints/40000.pth", map_location=torch.device(device)))

Where "checkpoints/40000.pth" is the location of our model's weights.

Then to run the evaluation script:

python evaluation.py --realtime

This will run the evaluation.py script on 5 different seeds and return the score gained across those runs. The --realtime flag renders the environment so you can watch the trial as it happens. If you do not want to watch the trial and just want the results as fast as possible, run the command without the --realtime flag.
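
Conceptually, the evaluation amounts to something like the sketch below. It is not the repo's actual evaluation.py: the seed values, the MyAgent import, and the agent.act() method are illustrative assumptions.

# Illustrative sketch of the evaluation loop, not the repo's evaluation.py.
from obstacle_tower_env import ObstacleTowerEnv
from MyAgent import MyAgent  # assumed interface: agent.act(obs) -> action

def evaluate(seeds=(1, 2, 3, 4, 5), realtime=False):
    env = ObstacleTowerEnv('./ObstacleTower/obstacletower.x86_64',
                           retro=True, realtime_mode=realtime)
    agent = MyAgent()
    total = 0.0
    for seed in seeds:                      # 5 different seeds
        env.seed(seed)
        obs, done, episode_reward = env.reset(), False, 0.0
        while not done:
            action = agent.act(obs)         # action from the loaded policy network
            obs, reward, done, info = env.step(action)
            episode_reward += reward
        total += episode_reward
    env.close()
    return total                            # score gained across the runs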

Training

To train a new agent simply run:

python train_atari.py --checkpoint checkpoints/40000.pth

You can omit the --checkpoint flag if you want to train a model from scratch without any pretrained weights.

The above command will create a new folder, results/experiment_1, and will store the rewards attained as well as checkpoints in that folder. Each new run of train_atari.py creates a new experiment_<n> folder.
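
The per-run folder numbering presumably works along these lines (an assumption about train_atari.py's behaviour, not its actual code):

import os

def next_experiment_dir(root='results'):
    """Find the first unused results/experiment_<n> folder and create it."""
    n = 1
    while os.path.exists(os.path.join(root, f'experiment_{n}')):
        n += 1
    path = os.path.join(root, f'experiment_{n}')
    os.makedirs(path)
    return path  # e.g. 'results/experiment_1' on the first run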

Approach

We used a Deep Q Network as the backbone of our agent. The code was largely based on one of our previous assignments. We used minimal wrappers and simply trained a number of models over the course of a few weeks, often using one pretrained model's weights to initialise the next and changing hyperparameters along the way. We reached level 5 in the tower and achieved a score of 40000. Considering the aim was to beat an agent with a score of 8000, we did notably well. This assignment has a leaderboard so that students can track how their agents compare against others, and some students achieved truly remarkable performance.
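
For reference, the standard Atari-style DQN that this kind of agent builds on looks roughly like the sketch below (the repo's exact network may differ in input shape or layer sizes):

import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Atari-style DQN (Mnih et al., 2015): three conv layers over a stack
    of 84x84 frames, followed by two fully connected layers mapping to
    one Q-value per action."""
    def __init__(self, num_actions, in_channels=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            nn.Linear(512, num_actions),
        )

    def forward(self, x):
        # Scale uint8 pixel values to [0, 1] before the conv stack.
        return self.head(self.features(x / 255.0))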

Alternative Methods

We tried a few different methods before our DQN model achieved good results, such as:

  • Tensorflow based PPO
  • PyTorch PPO
  • Random Agent

You can view them in the alt_methods directory.
