Skip to content
Hybrid Reward Architecture
Branch: master
Clone or download
fatemi Merge pull request #2 from Maluuba/mini_env
fix possible_fruits issue
Latest commit f052da0 May 3, 2018
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
dqn add dqn + temp folder for results Dec 1, 2017
environment fix possible_fruits issue May 2, 2018
results add dqn + temp folder for results Dec 1, 2017
tabular files added Nov 27, 2017
.gitignore add dqn + temp folder for results Dec 1, 2017
LICENSE.txt files added Nov 27, 2017
README.md fixes website address Jan 24, 2018
__init__.py files added Nov 27, 2017
run.sh files added Nov 27, 2017
utils.py

README.md

Hybrid Reward Architecture

This repository hosts the code published along with the following NIPS article (Experiment 4.1: Fruit Collection Task):

For more information about this article, see the following blog posts:

Dependencies

We strongly suggest to use Anaconda distribution.

  • Python 3.5 or higher
  • pygame 1.9.2+ (pip install pygame)
  • click (pip install click)
  • numpy (pip install numpy -- or install Anaconda distribution)
  • Keras 1.2.0+, but less than 2.0 (pip install keras==1.2)
  • Theano or Tensorflow. The code is fully tested on Theano. (pip install theano)

Usage

While any run is going on, the results as well as the AI models will be saved in the ./results subfolder. For a complete run, five experiments for each method, use the following command (may take several hours depending on your machine):

./run.sh
  • NOTE: Because the state-shape is relatively small, the deep RL methods of this code run faster on CPU.

Alternatively, for a single run use the following commands:

  • Tabular GVF:
ipython ./tabular/train.py -- -o use_gvf True -o folder_name tabular_gvf_ -o nb_experiments 1
  • Tabular no-GVF:
ipython ./tabular/train.py -- -o use_gvf False -o folder_name tabular_no-gvf_ -o nb_experiments 1
  • DQN:
THEANO_FLAG="device=cpu" ipython ./dqn/train.py -- --mode hra+1 -o nb_experiments 1
  • --mode can be either of dqn, dqn+1, hra, hra+1, or all.

Demo

We have also provided the code to demo Tabular GVF/NO-GVF methods. You first need to train the model using one of the above commands (Tabular GVF or no-GVF) and then run the demo. For example,

ipython ./tabular/train.py -- -o use_gvf True -o folder_name tabular_gvf_ -o nb_experiments 1
ipython ./tabular/train.py -- --demo -o folder_name tabular_gvf_

If you would like to save the results, use the --save option:

ipython ./tabular/train.py -- --demo --save -o folder_name tabular_gvf_

The rendered images will be saved in ./render directory by default.

License

Please refer to LICENSE.txt.

You can’t perform that action at this time.