ContinuousQLearning

A (possibly working) implementation of the first part of this paper: https://arxiv.org/pdf/1603.00748.pdf, tested on the OpenAI Pendulum task. The code is not very well documented or organized at present. Ideally I'll be able to make it robust enough to work across many tasks with minimal tuning, which may require implementing other features described in the paper. I also plan to try integrating the algorithm into some recurrent attention models (e.g. https://github.com/jlindsey15/RAM and possibly a modified version of https://github.com/jlindsey15/DRAM).
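For orientation, the first part of the linked paper describes normalized advantage functions (NAF), where the Q-function is written as Q(x, u) = V(x) + A(x, u) with a quadratic advantage A(x, u) = -1/2 (u - mu(x))^T P(x) (u - mu(x)) and P(x) = L(x) L(x)^T. The snippet below is only a rough, dependency-free sketch of that formula, not code taken from continuousqlearning.py. The function name `naf_q_value` and its arguments are hypothetical, and it assumes the network outputs `v`, `mu`, and the flat lower-triangular entries `l_entries` are already available as plain Python numbers and lists.

```python
import math

def naf_q_value(v, mu, l_entries, u):
    """Return Q(x, u) = V(x) - 0.5 * (u - mu)^T P (u - mu), with P = L L^T."""
    n = len(mu)
    # Unpack the flat entries row by row into a lower-triangular matrix L,
    # exponentiating the diagonal so that P = L L^T is positive definite.
    L = [[0.0] * n for _ in range(n)]
    entries = iter(l_entries)
    for i in range(n):
        for j in range(i + 1):
            value = next(entries)
            L[i][j] = math.exp(value) if i == j else value
    diff = [ui - mi for ui, mi in zip(u, mu)]
    # (u - mu)^T L L^T (u - mu) equals the squared norm of L^T (u - mu).
    Lt_diff = [sum(L[i][j] * diff[i] for i in range(n)) for j in range(n)]
    advantage = -0.5 * sum(x * x for x in Lt_diff)
    return v + advantage

# One-dimensional action, as in Pendulum: the greedy action is mu itself.
print(naf_q_value(1.0, [0.5], [0.0], [0.5]))  # 1.0
print(naf_q_value(1.0, [0.5], [0.0], [1.5]))  # 0.5
```

Because the advantage term is never positive, Q is maximized at u = mu, which is what lets the greedy action be read directly off the network output without a separate optimization over actions.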

The following were/are helpful as references. At the moment I don't think my code offers any significant functionality beyond these, but more to come!

- https://gist.github.com/tambetm/78227e1a15c52fbbcaeef7715dd079f0#file-pendulum-v0-md
- https://github.com/carpedm20/NAF-tensorflow
