
Random MDP experiments on the true online TD(lambda) algorithm

This project contains random MDP experiments comparing true online TD(lambda) (TOTD) by van Seijen and Sutton (2014) with TD(lambda) with accumulating traces (TD) and TD(lambda) with replacing traces (TDR). The experiments were conducted as part of a forthcoming work by van Seijen, Sutton, Mahmood, Pilarski and Machado (2015).
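To make the comparison concrete, here is a minimal sketch of how the three updates differ for linear value prediction. It is an illustration only and is not taken from this repository's code: the function name `td_episode`, its arguments, and the transition format are hypothetical, binary features are assumed, and the TOTD branch uses the dutch-trace form of the update as it is commonly presented, which may differ in notation from the formulation in the cited papers.

```python
# Illustrative sketch only; NOT this repository's implementation.
# Assumes linear value estimates v(s) = w . x(s) and binary feature vectors.

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def td_episode(transitions, n, alpha, gamma, lam, trace="accumulating"):
    """Run one episode of linear TD(lambda) prediction and return the weights.

    transitions: list of (x, reward, x_next) tuples, where x and x_next are
                 feature lists of length n; x_next is all zeros at termination.
    trace: "accumulating" (TD), "replacing" (TDR), or "true_online" (TOTD).
    """
    w = [0.0] * n      # weight vector
    z = [0.0] * n      # eligibility trace
    v_old = 0.0        # previous next-state value, used only by TOTD
    for x, r, x_next in transitions:
        v = dot(w, x)
        v_next = dot(w, x_next)
        delta = r + gamma * v_next - v  # TD error
        if trace == "accumulating":
            # Trace decays, then the current features are added.
            z = [gamma * lam * zi + xi for zi, xi in zip(z, x)]
            w = [wi + alpha * delta * zi for wi, zi in zip(w, z)]
        elif trace == "replacing":
            # Active (binary) features reset their trace to 1; others decay.
            z = [1.0 if xi else gamma * lam * zi for zi, xi in zip(z, x)]
            w = [wi + alpha * delta * zi for wi, zi in zip(w, z)]
        elif trace == "true_online":
            # Dutch trace plus a correction term in the weight update.
            zx = dot(z, x)  # uses the trace from before this step
            z = [gamma * lam * zi + (1.0 - alpha * gamma * lam * zx) * xi
                 for zi, xi in zip(z, x)]
            w = [wi + alpha * (delta + v - v_old) * zi - alpha * (v - v_old) * xi
                 for wi, zi, xi in zip(w, z, x)]
            v_old = v_next
        else:
            raise ValueError("unknown trace type: %s" % trace)
    return w
```

With lambda set to 0 and binary features, all three variants reduce to the same one-step TD(0) update, which gives a simple sanity check for a sketch like this.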

The project can be imported as an Eclipse PyDev project.

Read or execute runtotd-rndmdp-experiments.sh for an example of how to run the experiments and plot the figures with Python.

References

van Seijen, H., Sutton, R.S. (2014). True online TD(lambda). In Proceedings of the 31st International Conference on Machine Learning. JMLR W&CP 32(1):692-700.

van Seijen, H., Sutton, R.S., Mahmood, A.R., Pilarski, P.M., Machado, M.C. (2015). An empirical evaluation of true-online TD(lambda). (forthcoming)
