"Continuous Deep Q-Learning with Model-based Acceleration" in TensorFlow
Switch branches/tags
Clone or download
carpedm20 Merge pull request #9 from jackokaiser/master
Update to tensorflow 1.9.0
Latest commit 5754bd4 Jul 21, 2018
Type Name Latest commit message Commit time
Failed to load latest commit information.
assets add more results and commands Jul 14, 2016
src Update to tensorflow 1.9.0 Jul 20, 2018
.gitignore initial commit Jul 7, 2016
LICENSE add LICENSE Aug 10, 2016
README.md Update to tensorflow 1.9.0 Jul 20, 2018
main.py Update to tensorflow 1.9.0 Jul 20, 2018
run_mujoco.sh fix run_mujoco Jul 14, 2016
utils.py Update to tensorflow 1.9.0 Jul 20, 2018


Normalized Advantage Functions (NAF) in TensorFlow

TensorFlow implementation of Continuous Deep Q-Learning with Model-based Acceleration.
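The core idea of NAF is to decompose the Q-function as Q(s, a) = V(s) + A(s, a), where the advantage A(s, a) = -1/2 (a - mu(s))^T P(s) (a - mu(s)) is quadratic in the action, so the greedy continuous action is simply a = mu(s). The sketch below illustrates this decomposition with plain NumPy; it is not the repository's code, and the function name and argument layout are illustrative assumptions.

```python
# Illustrative sketch of the NAF Q-decomposition (not the repo's TF code):
#   Q(s, a) = V(s) - 1/2 (a - mu(s))^T P(s) (a - mu(s)),  P(s) = L(s) L(s)^T
# Because P is positive definite, Q is maximized analytically at a = mu(s).
import numpy as np

def naf_q_value(a, mu, L_entries, v, action_dim):
    """Compute Q(s, a) from per-state network outputs (hypothetical layout).

    a          -- action vector, shape (action_dim,)
    mu         -- greedy action mu(s) predicted by the policy head
    L_entries  -- flat lower-triangular entries of L(s), so that P = L L^T
    v          -- scalar state value V(s)
    """
    # Rebuild the lower-triangular matrix L; the diagonal is exponentiated
    # so that P = L L^T is guaranteed positive definite.
    L = np.zeros((action_dim, action_dim))
    L[np.tril_indices(action_dim)] = L_entries
    L[np.diag_indices(action_dim)] = np.exp(np.diag(L))
    P = L @ L.T

    diff = a - mu
    advantage = -0.5 * diff @ P @ diff   # quadratic advantage, <= 0
    return v + advantage                 # Q(s, a) = V(s) + A(s, a)
```

At a = mu(s) the advantage term vanishes, so Q(s, mu(s)) = V(s); any other action yields a strictly lower Q-value, which is what makes the greedy action available in closed form for continuous control.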




First, install prerequisites with:

$ pip install tqdm gym[all]

To train a model for an environment with a continuous action space (add --display=True to render the environment during training):

$ python main.py --env_name=Pendulum-v0 --is_train=True
$ python main.py --env_name=Pendulum-v0 --is_train=True --display=True

To test and record the screens with gym:

$ python main.py --env_name=Pendulum-v0 --is_train=False
$ python main.py --env_name=Pendulum-v0 --is_train=False --display=True


Training results on Pendulum-v0 with different hyperparameters; the colors refer to the curves in the results plot (see assets).

$ python main.py --env_name=Pendulum-v0 # dark green
$ python main.py --env_name=Pendulum-v0 --action_fn=tanh # light green
$ python main.py --env_name=Pendulum-v0 --use_batch_norm=True # yellow
$ python main.py --env_name=Pendulum-v0 --use_seperate_networks=True # green




Taehoon Kim / @carpedm20