Code for the paper "Generative Adversarial Imitation Learning"
Latest commit 8a2ed90 Nov 22, 2018

Files

Name              Latest commit message             Commit time
environments      Release                           Jun 10, 2016
expert_policies   Release                           Jun 10, 2016
pipelines         Release                           Jun 10, 2016
policyopt         Release                           Jun 10, 2016
results           Release                           Jun 10, 2016
scripts           Release                           Jun 10, 2016
LICENSE           Release                           Jun 10, 2016
README.rst        update README with repo status    Nov 21, 2018

README.rst

Status: Archive (code is provided as-is, no updates expected)

Generative Adversarial Imitation Learning

Jonathan Ho and Stefano Ermon

The repository also contains an implementation of Trust Region Policy Optimization (Schulman et al., 2015).

Dependencies:

  • OpenAI Gym >= 0.1.0, mujoco_py >= 0.4.0
  • numpy >= 1.10.4, scipy >= 0.17.0, theano >= 0.8.2
  • h5py, pytables, pandas, matplotlib
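The version floors above can be checked before running anything. The following is a minimal sketch, not part of the repository: the helper names (`version_tuple`, `meets_minimum`, `find_problems`) are hypothetical, and the dictionary keys are assumed pip distribution names rather than names given in this README. It only compares version strings that you supply.

```python
import re

# Minimum versions copied from the dependency list above.
# The keys are ASSUMED pip distribution names, not taken from the README.
MINIMUM_VERSIONS = {
    "gym": "0.1.0",
    "mujoco_py": "0.4.0",
    "numpy": "1.10.4",
    "scipy": "0.17.0",
    "Theano": "0.8.2",
}


def version_tuple(version):
    """Turn '1.10.4' into (1, 10, 4); stop at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, required):
    """Return True if the installed version is at least the required one."""
    a, b = version_tuple(installed), version_tuple(required)
    width = max(len(a), len(b))
    # Pad with zeros so that '0.17' compares equal to '0.17.0'.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def find_problems(installed_versions, minimums=MINIMUM_VERSIONS):
    """List packages that are missing or older than the stated minimum."""
    problems = []
    for name, required in sorted(minimums.items()):
        installed = installed_versions.get(name)
        if installed is None:
            problems.append("%s: not installed (need >= %s)" % (name, required))
        elif not meets_minimum(installed, required):
            problems.append(
                "%s: found %s (need >= %s)" % (name, installed, required)
            )
    return problems
```

`find_problems` takes a plain mapping from package name to installed version string, so the comparison can be run without importing the packages themselves. On Python 3.8 or later, one way to build that mapping is `importlib.metadata.version(name)`; an empty result list means every listed floor is met.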

Provided files:

  • expert_policies/* are the expert policies, trained by TRPO (scripts/run_rl_mj.py) on the true costs
  • scripts/im_pipeline.py is the main training and evaluation pipeline. This script is responsible for sampling data from experts to generate training data, running the training code (scripts/imitate_mj.py), and evaluating the resulting policies.
  • pipelines/* are the experiment specifications provided to scripts/im_pipeline.py
  • results/* contain evaluation data for the learned policies