ml-jku/align-rudder

September 29, 2020 13:07

Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution

Vihang P. Patil¹, Markus Hofmarcher¹, Markus-Constantin Dinu¹, Matthias Dorfer³, Patrick M. Blies³, Johannes Brandstetter¹, Jose A. Arjona-Medina¹, Sepp Hochreiter¹,²

¹ ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria
² Institute of Advanced Research in Artificial Intelligence (IARAI)
³ enliteAI, Vienna, Austria


A detailed blog post on this paper is available at this link, and a video showcasing the Minecraft agent is available at this link.

The full paper is available at https://arxiv.org/abs/2009.14108

Implementation of Align-RUDDER

This package contains an implementation of Align-RUDDER together with code to reproduce the results of artificial tasks I and II as stated in the paper. To keep the run time manageable, the default settings use only 10 seeds per experiment instead of the 100 used for the results in the paper.

Dependencies

To reproduce all results, we provide an environment.yml file to set up a conda environment with the required packages. Run the following commands to create and activate the environment and install the package:

conda env create --file environment.yml
conda activate align-rudder
pip install -e .

Usage

To recreate the results from the paper, run the included run scripts for the FourRooms and EightRooms environments with the respective method.

Align-RUDDER

python align_rudder/run_four_alignrudder.py
python align_rudder/run_eight_alignrudder.py

Behavioral Cloning + Q-Learning

python align_rudder/run_four_bc.py
python align_rudder/run_eight_bc.py

DQFD (Deep Q-Learning from Demonstrations)

python align_rudder/run_four_dqfd.py
python align_rudder/run_eight_dqfd.py

RUDDER (LSTM)

python align_rudder/run_four_rudder_lstm.py
python align_rudder/run_eight_rudder_lstm.py
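
To run every method listed above in one go, a small driver script can loop over the run scripts. The following is a minimal sketch and not part of the repository: the helper names `build_commands` and `run_all` are hypothetical, and it assumes the script is started from the repository root with the conda environment active. The script paths are the ones listed in this section.

```python
import subprocess
import sys

# Script name parts taken from the Usage section; the grouping is ours.
ENVS = ("four", "eight")
METHODS = ("alignrudder", "bc", "dqfd", "rudder_lstm")


def build_commands(envs=ENVS, methods=METHODS):
    """Return one command (as an argument list) per method/environment pair."""
    return [
        [sys.executable, f"align_rudder/run_{env}_{method}.py"]
        for method in methods
        for env in envs
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the batch at the first failing experiment.
            subprocess.run(cmd, check=True)


# Dry run: only lists the eight commands without starting any experiment.
run_all(build_commands())
```

Calling `run_all(build_commands(), dry_run=False)` would execute the experiments one after another. Running them sequentially is a deliberate simplification; the experiments are independent, so they could also be started in parallel if resources allow.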

Results

Once you have run all the experiments you are interested in, run the following script to get a summary of the results. By default, plots for all available environments are generated.

python align_rudder/plot_results.py [--env "FourRooms"|"EightRooms"|"all"]
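
The bracketed usage line above describes an optional `--env` flag with three allowed values. As an illustration of how such a flag is commonly handled with `argparse`, here is a minimal sketch. It is not the actual code of `plot_results.py`: the function name `parse_envs` is hypothetical, and the default of "all" is an assumption based on the statement that all environments are plotted by default.

```python
import argparse

# Environment names as they appear in the usage line.
ALL_ENVS = ["FourRooms", "EightRooms"]


def parse_envs(argv=None):
    """Return the list of environments selected by an optional --env flag."""
    parser = argparse.ArgumentParser(description="Select environments to plot.")
    parser.add_argument(
        "--env",
        choices=ALL_ENVS + ["all"],
        default="all",  # assumed default, matching "all available environments"
        help="environment to plot, or 'all' for every environment",
    )
    args = parser.parse_args(argv)
    # Expand "all" into the concrete environment names.
    return list(ALL_ENVS) if args.env == "all" else [args.env]


print(parse_envs([]))
print(parse_envs(["--env", "FourRooms"]))
```

With a parser of this shape, a value outside the three listed choices is rejected with a usage message instead of producing an empty plot.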

LICENSE

MIT LICENSE

About

Code to reproduce results on toy tasks and companion blog for the paper.
