C implementation of RL and IRL algorithms
asterix
data
matrix2latex
tutorials
.gitignore
CSI_asterix_theta_C.mat
CSI_asterix_thetas.mat
DP.py
Exp1.ipynb
Exp1.py
Exp10.ipynb
Exp10.py
Exp11.ipynb
Exp11.pdf
Exp11.py
Exp12.ipynb
Exp12.py
Exp13.ipynb
Exp13.py
Exp14.ipynb
Exp14.pdf
Exp14.py
Exp14_zoom.pdf
Exp16.ipynb
Exp16.py
Exp17.ipynb
Exp17.pdf
Exp17.py
Exp18.ipynb
Exp18.py
Exp2.ipynb
Exp2.py
Exp3.ipynb
Exp3.py
Exp4.ipynb
Exp4.py
Exp5.ipynb
Exp5.py
Exp6.ipynb
Exp6.py
Exp7.ipynb
Exp7.py
Exp8.ipynb
Exp8.py
Exp9.ipynb
Exp9.py
Highway.ipynb
Highway.py
Highway_P.mat
Highway_R.mat
Highway_perturbed_P.mat
Makefile
OldCode.py
Plot.py
Plot15.ipynb
Plot15.py
README.org
inverted_pendulum_expert_omega.mat
mountain_car_batch_data.mat
mountain_car_boubou_trajs.mat
mountain_car_expert_omega.mat
pendulum.py
rl.py
stuff.py

README.org

Source code for Inverse Reinforcement Learning

You’ll find in this repo the code for three reinforcement learning algorithms:

  • LSTD$μ$
  • SCIRL
  • CSI

the description of which can be found on my research page.
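
As a rough orientation, LSTD$μ$ and SCIRL both revolve around the expert's discounted feature expectation, the expected discounted sum of state features along the expert's trajectories. The sketch below is a plain Monte Carlo estimate of that quantity and is not taken from this repo. The function name, the trajectory format (a list of state lists) and the feature map `phi` are all assumptions made for illustration.

```python
def discounted_feature_expectation(trajectories, phi, gamma=0.9):
    """Monte Carlo estimate of mu = E[sum_t gamma**t * phi(s_t)].

    trajectories: non-empty list of state sequences (hypothetical format).
    phi: function mapping a state to a list of floats (hypothetical).
    """
    if not trajectories:
        raise ValueError("need at least one trajectory")
    dim = len(phi(trajectories[0][0]))
    total = [0.0] * dim
    for states in trajectories:
        discount = 1.0
        for s in states:
            features = phi(s)
            for i in range(dim):
                total[i] += discount * features[i]
            discount *= gamma
    # Average the discounted sums over all trajectories.
    return [x / len(trajectories) for x in total]


def phi(state):
    # Hypothetical features: a bias term and the raw scalar state.
    return [1.0, float(state)]


mu = discounted_feature_expectation([[0, 1, 2], [2, 2]], phi, gamma=0.5)
```

LSTD$μ$ replaces this simple average with a least-squares temporal-difference estimate, which is not shown here.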

Only SCIRL has a reasonably clean, heavily commented implementation. It can be found in tutorials/Exp7.py.

I intend to implement these algorithms properly as part of a well-known machine learning library. Once that is done, I will delete this repo.

In the meantime, feel free to try to make sense of all this. Please don’t hesitate to contact me if you have any questions.

Exp1.py : CSI on the inverted pendulum, parameters can be played with.

Exp2.py : Finding out in which areas of the state space of the inverted pendulum the expert is good.

Exp3.py : Trying to find what to plot with CSI on the inverted pendulum.

Exp4.py : Testing different parameters for LSPI and CSI on the Mountain Car.

Exp5.py : Running all algos (SCIRL x2, CSI, Classif, RE) on the Mountain Car.

Exp6.py : Evaluating the policies found in Exp5.

Exp7.py : Running SCIRL on the Mountain Car.

Exp8.py : Relative Entropy on the Mountain Car.

Exp9.py : Cascading on the data from Asterix.

Exp10.py : Relative Entropy on the Highway.

Exp11.py : Plotting the results of different IRL algos on the Mountain Car.

Exp12.py : Running SCIRL on the Highway.

Exp13.py : Running CSI on the Highway.
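
Several of the experiments above run on the Mountain Car. For readers who have not met that benchmark, here is a minimal sketch of one transition using the classic textbook dynamics. It is an assumption that the code in this repo uses the same constants and bounds; the function name and return format are hypothetical.

```python
import math


def mountain_car_step(position, velocity, action):
    """One step of the textbook Mountain Car (assumed constants).

    action is -1 (push left), 0 (no push) or 1 (push right).
    Returns (position, velocity, done).
    """
    # Engine force plus gravity along the slope.
    velocity += 0.001 * action - 0.0025 * math.cos(3 * position)
    velocity = max(-0.07, min(0.07, velocity))
    position += velocity
    # Hitting the left wall stops the car.
    if position < -1.2:
        position, velocity = -1.2, 0.0
    done = position >= 0.5
    return position, velocity, done


# Push right once from the bottom of the valley.
x, v, done = mountain_car_step(-0.5, 0.0, 1)
```

The engine is too weak to climb the right-hand hill directly from the valley floor, which is why the car has to swing back and forth to build momentum.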