Inverse reinforcement learning experiments from variational discriminator bottleneck (VDB) paper
VDB IRL experiments

Implementation of Adversarial IRL (AIRL) with an information bottleneck. Used in the Variational Discriminator Bottleneck (VDB) paper at ICLR.

Getting Set Up

  • Install rllab if you have not already. Try qxcv/rllab on the minor-fixes branch (which adds some missing hooks and addresses some bugs in the original rllab).
  • Add the folders multiple_irl, inverse_rl, and scripts to your Python path (to double-check that this works, try importing multiple_irl.envs from a Python shell). The easiest way to do this is to run pip install -e . from this directory.
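
The import check described above can also be scripted. The following is a minimal sketch, not part of the repository: the helper names `is_importable` and `report_missing` are hypothetical, and it assumes that inverse_rl and scripts are importable as top-level names once the path is set up.

```python
import importlib.util


def is_importable(module_name):
    """Return True if `module_name` can be located on the current Python path."""
    try:
        # For dotted names, find_spec imports the parent package first.
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # A missing parent package (e.g. no `multiple_irl`) raises ImportError.
        return False


def report_missing(module_names):
    """Return the subset of `module_names` that cannot be located."""
    return [name for name in module_names if not is_importable(name)]


# Names taken from the setup steps; treating `inverse_rl` and `scripts`
# as importable modules is an assumption.
missing = report_missing(["multiple_irl.envs", "inverse_rl", "scripts"])
if missing:
    print("Not found on the Python path:", ", ".join(missing))
else:
    print("All expected modules were found.")
```

If anything is reported as missing, re-running the pip install step from this directory is the first thing to try.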

When running scripts, you should run them directly from the root folder of the git repository.
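
A quick guard can catch the case where a script is launched from the wrong directory. This is a hedged sketch rather than something the repository provides: `looks_like_repo_root` is a hypothetical helper, and it assumes the three folders named in the setup steps sit directly under the repository root.

```python
from pathlib import Path

# Folder names come from the setup steps; that all three live directly
# under the repository root is an assumption.
EXPECTED_DIRS = ("multiple_irl", "inverse_rl", "scripts")


def looks_like_repo_root(path="."):
    """Return True if `path` contains every folder in EXPECTED_DIRS."""
    root = Path(path)
    return all((root / name).is_dir() for name in EXPECTED_DIRS)


if not looks_like_repo_root():
    print("Warning: the current directory does not look like the repository root.")
```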


The core algorithm is in `multiple_irl/models/

All the environments are in multiple_irl/envs; a comprehensive list of environments also appears below in this README.

All the scripts are in scripts/scripts_generic. The scripts do the following:

  • This script collects expert trajectories.
  • This script trains an AIRL reward function on a single task.
  • This script takes a trained AIRL reward function and uses it to train a new policy from scratch in an environment.
