Implementation of Inverse Reinforcement Learning (IRL) algorithms in Python/TensorFlow: Deep MaxEnt, MaxEnt, and LPIRL.

irl-imitation

Implementation of selected Inverse Reinforcement Learning (IRL) algorithms in Python/TensorFlow.

$ python demo.py

Algorithms implemented
  • Linear inverse reinforcement learning (Ng & Russell 2000)
  • Maximum entropy inverse reinforcement learning (Ziebart et al. 2008)
  • Maximum entropy deep inverse reinforcement learning (Wulfmeier et al. 2015)
MDPs & solver implemented
  • gridworld 2D
  • gridworld 1D
  • value iteration
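
As a rough illustration of the solver listed above, here is a minimal value-iteration sketch on a deterministic 1D gridworld. It is not the repository's code: the function name `value_iteration`, the three-move action set (left, stay, right), and the reward vector are assumptions made for this example.

```python
def value_iteration(rewards, gamma=0.9, tol=1e-6):
    """Value iteration on a deterministic 1D gridworld (illustrative only).

    rewards[s] is the reward for being in state s. Actions move left,
    stay, or move right; moving off the grid leaves the state unchanged.
    """
    n = len(rewards)
    moves = (-1, 0, 1)  # assumed action set: left, stay, right
    values = [0.0] * n
    while True:
        new_values = []
        for s in range(n):
            # Bellman optimality backup: best next-state value over actions.
            best = max(values[min(max(s + m, 0), n - 1)] for m in moves)
            new_values.append(rewards[s] + gamma * best)
        delta = max(abs(a - b) for a, b in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    # Greedy policy (as a move per state) w.r.t. the converged values.
    policy = [max(moves, key=lambda m: values[min(max(s + m, 0), n - 1)])
              for s in range(n)]
    return values, policy


# Hypothetical 5-cell world with a reward only in the rightmost cell.
values, policy = value_iteration([0.0, 0.0, 0.0, 0.0, 1.0], gamma=0.9)
print(policy)
```

The returned policy stores one move per state, so a value of 1 means "step right" and 0 means "stay".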

Dependencies

  • Python 2.7
  • cvxopt
  • TensorFlow 0.12.1
  • matplotlib

Linear Inverse Reinforcement Learning

$ python linear_irl_gridworld.py --act_random=0.3 --gamma=0.5 --l1=10 --r_max=10
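
For orientation, the linear program of Ng & Russell (2000) is built around one condition: a policy π is optimal for a state reward R when, in every state, following π's action gives at least as much expected value V^π as any other action, with V^π = R + γ P_π V^π. The sketch below only checks that condition for a candidate reward on a toy MDP. It does not solve the LP and is not the repository's `lp_irl.py`; the names `policy_value` and `is_policy_optimal` and the two-state transition matrices are hypothetical.

```python
def policy_value(P, policy, rewards, gamma, n_sweeps=1000):
    """Iterative policy evaluation: V = R + gamma * P_pi V.

    P[a][s][t] is the probability of moving from state s to t under action a.
    """
    n = len(rewards)
    V = [0.0] * n
    for _ in range(n_sweeps):
        V = [rewards[s]
             + gamma * sum(P[policy[s]][s][t] * V[t] for t in range(n))
             for s in range(n)]
    return V


def is_policy_optimal(P, policy, rewards, gamma, eps=1e-9):
    """True if no single action beats the policy's action in any state."""
    n = len(rewards)
    V = policy_value(P, policy, rewards, gamma)
    for s in range(n):
        q_pi = sum(P[policy[s]][s][t] * V[t] for t in range(n))
        for a in range(len(P)):
            q_a = sum(P[a][s][t] * V[t] for t in range(n))
            if q_pi < q_a - eps:
                return False
    return True


# Hypothetical 2-state MDP: action 0 stays put, action 1 switches state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
policy = [1, 0]  # go to state 1, then stay there

print(is_policy_optimal(P, policy, [0.0, 1.0], gamma=0.5))
print(is_policy_optimal(P, policy, [1.0, 0.0], gamma=0.5))
```

Note that the all-zero reward also passes this check, which is why the paper adds a margin-maximizing objective and an L1 penalty on R to pick among the consistent rewards.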

Maximum Entropy Inverse Reinforcement Learning

(This implementation is largely influenced by Matthew Alger's MaxEnt implementation.)

$ python maxent_irl_gridworld.py --height=10 --width=10 --gamma=0.8 --n_trajs=100 --l_traj=50 --no-rand_start --learning_rate=0.01 --n_iters=20

$ python maxent_irl_gridworld.py --gamma=0.8 --n_trajs=400 --l_traj=50 --rand_start --learning_rate=0.01 --n_iters=20
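
To make the update behind these runs more concrete: in maximum entropy IRL (Ziebart et al. 2008) the gradient of the demonstration log-likelihood is the expert's empirical feature counts minus the feature counts expected under the current reward. The sketch below is one way to compute that on a deterministic 1D gridworld with one-hot state features, so the reward of state s is simply `theta[s]`. It is not taken from `maxent_irl.py`: the discounted soft value iteration, the action set, the start distribution estimated from the demonstrations, and all function names are assumptions of this example.

```python
import math

MOVES = (-1, 0, 1)  # assumed action set: left, stay, right


def step(s, move, n):
    """Deterministic transition; moving off the grid keeps the state."""
    return min(max(s + move, 0), n - 1)


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def soft_policy(theta, gamma=0.8, n_sweeps=100):
    """Stochastic policy from soft (log-sum-exp) value iteration."""
    n = len(theta)
    V = [0.0] * n
    for _ in range(n_sweeps):
        Q = [[theta[s] + gamma * V[step(s, m, n)] for m in MOVES]
             for s in range(n)]
        # Soft maximum over actions, computed in a numerically stable way.
        V = [max(q) + math.log(sum(math.exp(x - max(q)) for x in q))
             for q in Q]
    Q = [[theta[s] + gamma * V[step(s, m, n)] for m in MOVES]
         for s in range(n)]
    return [softmax(q) for q in Q]


def expected_svf(policy, demos):
    """Expected state visitation counts over the demo horizon."""
    n = len(policy)
    horizon = len(demos[0])
    d = [0.0] * n
    for traj in demos:  # start distribution estimated from the demos
        d[traj[0]] += 1.0 / len(demos)
    svf = list(d)
    for _ in range(horizon - 1):
        nxt = [0.0] * n
        for s in range(n):
            for a, m in enumerate(MOVES):
                nxt[step(s, m, n)] += d[s] * policy[s][a]
        d = nxt
        svf = [x + y for x, y in zip(svf, d)]
    return svf


def maxent_gradient(theta, demos, gamma=0.8):
    """Empirical visitation counts minus expected visitation counts."""
    mu = [0.0] * len(theta)
    for traj in demos:
        for s in traj:
            mu[s] += 1.0 / len(demos)
    svf = expected_svf(soft_policy(theta, gamma), demos)
    return [m - e for m, e in zip(mu, svf)]


# Hypothetical expert: walk right to the last cell and stay there.
demos = [[0, 1, 2, 3, 4, 4, 4, 4]]
theta = [0.0] * 5
for _ in range(20):
    grad = maxent_gradient(theta, demos)
    theta = [t + 0.1 * g for t, g in zip(theta, grad)]  # gradient ascent
print([round(t, 2) for t in theta])
```

The step size of 0.1 and the 20 iterations are arbitrary choices for this toy problem, not recommended settings.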

Maximum Entropy Deep Inverse Reinforcement Learning

  • Follows Wulfmeier et al. (2015), "Maximum Entropy Deep Inverse Reinforcement Learning". Only the fully connected (FC) version is implemented. The implementation does not exactly follow the model proposed in the paper: it applies some tweaks, including ELU activations, gradient clipping, and L2 regularization.
  • Run $ python deep_maxent_irl_gridworld.py --help for descriptions of the options.
$ python deep_maxent_irl_gridworld.py --learning_rate=0.02 --n_trajs=200 --n_iters=20
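
The three tweaks named above can be illustrated without TensorFlow. The following is a plain-Python sketch of what each one does, not the code in `deep_maxent_irl.py`: the function names, the choice of clipping by global norm (rather than by value), the order in which the L2 term and the clipping are applied, and the hyperparameter values are all assumptions.

```python
import math


def elu(x, alpha=1.0):
    """Exponential linear unit: identity for x > 0, smooth saturation below."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def clip_by_global_norm(grads, max_norm):
    """Rescale the gradient vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


def l2_penalty(weights, coeff):
    """L2 regularization term 0.5 * coeff * ||w||^2 added to the loss."""
    return 0.5 * coeff * sum(w * w for w in weights)


def sgd_step(weights, grads, lr=0.02, max_norm=1.0, l2=0.01):
    """One descent step: add the L2 gradient, clip, then update."""
    total = [g + l2 * w for g, w in zip(grads, weights)]
    clipped = clip_by_global_norm(total, max_norm)
    return [w - lr * g for w, g in zip(weights, clipped)]


# Hypothetical weights and gradient, only to exercise the helpers.
weights = [0.5, -0.5]
grads = [3.0, 4.0]
print(sgd_step(weights, grads))
```

Clipping keeps a single large gradient from destabilizing the reward network, while the L2 term discourages large weights.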

MIT License
