[ICML 2019] Implementation of "Imitation Learning from Imperfect Demonstration"
Python
demonstrations
.gitignore remove cache Jun 18, 2019
2IWIL.py initial commit May 11, 2019
IC_GAIL.py init May 15, 2019
README.md initial commit May 11, 2019
conjugate_gradients.py initial commit May 11, 2019
loss.py initial commit May 11, 2019
models.py initial commit May 11, 2019
replay_memory.py initial commit May 11, 2019
running_state.py initial commit May 11, 2019
trpo.py initial commit May 11, 2019
utils.py save model May 14, 2019

README.md

Imitation Learning from Imperfect Demonstration

The TRPO part is largely based on: https://github.com/ikostrikov/pytorch-trpo

Requirements

  • Python 3.6
  • PyTorch 0.4.1
  • gym
  • mujoco
  • numpy
  • scipy

Execute

The .py scripts take trajectories and confidence data as input (from the demonstrations folder) and record the accumulated reward at each update in the log folder. Use the commands below to run our methods and the baselines. The --traj-size option corresponds to $n_c+n_u$ in the paper, and --num-epochs sets the maximum number of update iterations.

  • IC_GAIL
python IC_GAIL.py --env Ant-v2 --num-epochs 5000 --traj-size 600 
  • 2IWIL
python 2IWIL.py --env Ant-v2 --num-epochs 5000 --traj-size 600 --weight
  • GAIL (U+C)
python 2IWIL.py --env Ant-v2 --num-epochs 5000 --traj-size 600
  • GAIL (C)
python 2IWIL.py --env Ant-v2 --num-epochs 5000 --traj-size 600 --weight --only --noconf
  • GAIL (Reweight)
python 2IWIL.py --env Ant-v2 --num-epochs 5000 --traj-size 600 --weight --only
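
The five commands above differ only in the script name and the trailing flags, so they can be generated from a small table. The following is a minimal sketch, not part of the repository: the `METHODS` dictionary and the `build_command` helper are hypothetical names introduced here for illustration, and the flag combinations are copied verbatim from the commands listed above.

```python
# Hypothetical helper (not part of this repository): builds the argument list
# for each method, using the script names and flags listed in this README.
METHODS = {
    "IC_GAIL": ("IC_GAIL.py", []),
    "2IWIL": ("2IWIL.py", ["--weight"]),
    "GAIL (U+C)": ("2IWIL.py", []),
    "GAIL (C)": ("2IWIL.py", ["--weight", "--only", "--noconf"]),
    "GAIL (Reweight)": ("2IWIL.py", ["--weight", "--only"]),
}


def build_command(method, env="Ant-v2", num_epochs=5000, traj_size=600):
    """Return the command for `method` as a list of arguments."""
    script, extra_flags = METHODS[method]
    return [
        "python", script,
        "--env", env,
        "--num-epochs", str(num_epochs),
        "--traj-size", str(traj_size),
        *extra_flags,
    ]


# Print every command so it can be copied into a shell.
for name in METHODS:
    print(f"{name}: {' '.join(build_command(name))}")
```

Each returned list can be passed to `subprocess.run` from the repository root to launch a run, assuming the requirements above are installed.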