SchutteJan/RLProject

Reinforcement Learning: N-step Bootstrapping in Actor Critic Methods

This repository contains the code to run all our experiments.

Davide Barbieri, Jan Schutte, Hinrik Snær Guðmundsson, Zi Long Zhu

Requirements

Use the supplied environment.yml file for conda or the requirements.txt file for pip to install the Python dependencies.

Python version should be 3.7 or higher.

Training

The entry point of the training code is the run.py file; its arguments specify which experiment to run. You can pass these arguments directly on the command line, but we strongly advise using one of the config files in the config/ folder. To get a description of all parameters, run:

python run.py -h
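As an illustration of how a --load_from option can coexist with ordinary command-line flags, here is a minimal sketch using argparse. It is not taken from run.py: the arguments --env_name and --n_step are hypothetical, and the precedence rule (explicit flags override values from the config file) is an assumption about sensible behaviour, not a description of the repository's code.

```python
import argparse
import json


def build_parser():
    # Hypothetical arguments for illustration only;
    # run `python run.py -h` to see the real ones.
    parser = argparse.ArgumentParser(description="Sketch of config-file loading")
    parser.add_argument("--load_from", type=str, default=None,
                        help="path to a JSON config file")
    parser.add_argument("--env_name", type=str, default="CartPole-v1")
    parser.add_argument("--n_step", type=int, default=1)
    return parser


def parse_args(argv=None):
    parser = build_parser()
    # First pass: only find out whether a config file was given.
    known, _ = parser.parse_known_args(argv)
    if known.load_from is not None:
        with open(known.load_from) as f:
            # Values from the file become the new defaults ...
            parser.set_defaults(**json.load(f))
    # ... so flags given explicitly on the command line still win.
    return parser.parse_args(argv)
```

With this layout, a config file fixes the bulk of an experiment while a single flag can still be overridden for a quick variation.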

Each experiment writes results files, by default to the results/ folder. These files are required for generating plots and calculating metrics.

Examples

Learn the CartPole environment with REINFORCE:

python run.py --load_from config/examples/cartpole/test_actor_reinforce.json

Learn the Acrobot environment with GAE:

python run.py --load_from config/examples/acrobot/test_GAE.json

Plotting

The entry point for plotting is the plots.py file, which takes the results files generated by run.py. It can generate two plots: episode versus return and episode versus episode length. For more information on the arguments, run python plots.py -h.

Examples

If you ran both examples from the Training section, you can generate an episode return plot:

python plots.py --results_files \
    results/exp_testgae_<timestamp>/output_1.json \
    results/exp_test_actor_REINFORCE_<timestamp>/output_1.json \
    --labels "Acrobot GAE" "CartPole REINFORCE" \
    --plot e_return  \
    --title "my title" \
    --show 

If you ran only a single experiment, only specify that results file and one label.

Analyse

The analyse.py file contains functions for measuring the mean and standard deviation across multiple runs of an experiment. The aggregate function combines multiple results files into a single one that holds the mean and standard deviation instead of the original values. The auc function computes the area under the curve (AUC) and the asymptote of results files.

Examples

Aggregate all files that match glob pattern output_*.json:

python analyse.py --input_name "results/Acrobot_AE/output_*.json" \
    --output_name aggregate_acrobot_ae.json  \
    --function aggregate \
    --targets return
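To make the aggregation step concrete, the following sketch computes a per-episode mean and standard deviation over several runs. It is an illustration only: the input layout (one list of per-episode values per run) and the use of the population standard deviation are assumptions, and analyse.py may read the results files and define the statistics differently.

```python
from statistics import mean, pstdev


def aggregate(runs):
    """Combine per-episode values from several runs into mean and std.

    `runs` is a list of equally long lists, one per run, for example the
    return of every episode (an assumed layout, not the repository's format).
    """
    if not runs:
        raise ValueError("need at least one run")
    n_episodes = len(runs[0])
    if any(len(run) != n_episodes for run in runs):
        raise ValueError("all runs must have the same number of episodes")
    # zip(*runs) groups the values of the same episode across all runs.
    per_episode = list(zip(*runs))
    return {
        "mean": [mean(values) for values in per_episode],
        "std": [pstdev(values) for values in per_episode],
    }
```

The result has one mean and one standard deviation per episode, which is the shape needed to draw a mean curve with a shaded error band.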

Calculate AUC for results files (can also be a glob pattern):

python analyse.py --input_name "results/Acrobot_AE/output_1.json" \
    --output_name metrics.json  \
    --function auc \
    --targets return
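The two metrics can be sketched in a few lines. The source does not say how analyse.py integrates the curve or how it defines the asymptote, so both choices here are assumptions: the area uses the trapezoidal rule with a spacing of one episode, and the asymptote is taken as the mean over a hypothetical window of the final episodes.

```python
def area_under_curve(values):
    """Trapezoidal area under a per-episode curve with unit spacing."""
    if len(values) < 2:
        return 0.0
    return sum((a + b) / 2.0 for a, b in zip(values, values[1:]))


def asymptote(values, window=10):
    """Mean of the last `window` episodes (an assumed definition)."""
    if not values:
        raise ValueError("need at least one value")
    tail = values[-window:]
    return sum(tail) / len(tail)
```

A larger area indicates faster learning over the whole run, while the asymptote summarises the performance reached at the end of training.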

About

RL Reproducible Research Project
