This repository contains the code for a Capita Selecta on Reinforcement Learning.
The report (and code) cover the following subjects:
- Traditional RL (Monte Carlo, SARSA-λ)
- Linear value function approximation (Linear function approx SARSA-λ)
- (Deep) Neural Network based function approximation inspired by DQN (Deep Q learning, Deep SARSA-λ)
- An Actor Critic algorithm (Using A2C with SARSA-λ as value function approximator)
- SAC-Q (https://deepmind.com/blog/learning-playing/)
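To give a flavor of the tabular methods listed above, here is a minimal sketch of a single accumulating-trace SARSA(λ) update. This is an illustrative example, not the repository's implementation; the function name and hyperparameter values are placeholders:

```python
import numpy as np

def sarsa_lambda_update(Q, E, s, a, r, s2, a2, alpha=0.1, gamma=0.99, lam=0.9):
    """One accumulating-trace SARSA(lambda) update on tabular Q-values and traces E."""
    delta = r + gamma * Q[s2, a2] - Q[s, a]  # TD error for the observed transition
    E[s, a] += 1.0                           # accumulate the trace for the visited pair
    Q += alpha * delta * E                   # credit all recently visited pairs
    E *= gamma * lam                         # decay every trace
    return Q, E

# Tiny usage example: two states, two actions, one transition.
Q = np.zeros((2, 2))
E = np.zeros((2, 2))
Q, E = sarsa_lambda_update(Q, E, s=0, a=1, r=1.0, s2=1, a2=0)
```

The eligibility traces are what distinguish SARSA-λ from one-step SARSA: a reward updates not only the current state-action pair but every pair visited recently, weighted by its decayed trace.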
These algorithms were evaluated in multiple experiments.
The experiments used in the paper are all in the cluster_experiments folder, apart from the SAC-Q experiment, which resides in sacx/experiments/mountaincar.py.
If you wish to gain a more in-depth understanding of these algorithms, we invite you to read the report or peruse the associated slides:
Cart Pole:
Snake:
In order to run the experiments, a number of Python packages are required, including TensorFlow, Keras, NumPy, Matplotlib, h5py, and pandas. Alternatively, an environment that contains all dependencies can be created using Conda:
conda create --name rl --file requirements.txt
The environment containing the dependencies can then be activated using:
source activate rl
You may find that you need to install the environments separately, depending on which experiments you plan on running. The environments used within the experiments here are either Gym environments, PLE, or the snake environment.
The Gym and PLE environments can be installed with your favorite package manager, and the snake environment can be found here.
If you wish to run the agents against a different environment, this is relatively straightforward: write a wrapper like those found in the /environments folder and supply the state and action spaces.
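As a rough illustration of the idea, the sketch below shows what such a wrapper could look like. The class name, constructor parameters, and the dummy environment are all hypothetical; the actual interface expected by the agents is defined by the wrappers in /environments:

```python
import numpy as np

class EnvWrapper:
    """Hypothetical environment wrapper: adapts an underlying environment
    and exposes the state and action spaces the agents need."""

    def __init__(self, env, state_shape, n_actions):
        self.env = env
        self.state_shape = state_shape  # shape of observations fed to the agent
        self.n_actions = n_actions      # size of the discrete action space

    def reset(self):
        # Return the initial observation as a float array.
        return np.asarray(self.env.reset(), dtype=np.float32)

    def step(self, action):
        # Forward the action and normalize the returned observation.
        obs, reward, done = self.env.step(action)
        return np.asarray(obs, dtype=np.float32), reward, done

# Minimal stand-in environment to show the wrapper in use.
class _DummyEnv:
    def reset(self):
        return [0.0, 0.0]
    def step(self, action):
        return [float(action), 0.0], 1.0, True

wrapped = EnvWrapper(_DummyEnv(), state_shape=(2,), n_actions=2)
state = wrapped.reset()
obs, reward, done = wrapped.step(1)
```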
Experiments can be run as follows:
python -m cluster_experiments.cartpole_sarsa_lambda
Results and configuration are logged to results/<filename>.h5, where the filename depends on the experiment. Depending on how logging is used in the experiment, the log file contains results for multiple runs as well as the parameters of the experiment. For an example of how the log files can be read, see the scripts in /plots.
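For readers unfamiliar with HDF5, the snippet below sketches the general pattern of writing and reading such a file with h5py. The dataset name, attribute name, and values are hypothetical; the actual layout depends on the experiment's logging, so consult the scripts in /plots for the real structure:

```python
import h5py
import numpy as np

# Write a stand-in log file: per-run reward datasets plus experiment
# parameters stored as file attributes (layout is illustrative only).
with h5py.File("results_demo.h5", "w") as f:
    f.create_dataset("run_0/rewards", data=np.array([12.0, 30.0, 9.0]))
    f.attrs["alpha"] = 0.1

# Read it back, as a plotting script might.
with h5py.File("results_demo.h5", "r") as f:
    rewards = f["run_0/rewards"][:]
    alpha = float(f.attrs["alpha"])
```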