"Continuous Deep Q-Learning with Model-based Acceleration" in TensorFlow
Switch branches/tags
Nothing to show
Clone or download
carpedm20 Merge pull request #9 from jackokaiser/master
Update to tensorflow 1.9.0
Latest commit 5754bd4 Jul 20, 2018
Permalink
Failed to load latest commit information.
assets add more results and commands Jul 14, 2016
src Update to tensorflow 1.9.0 Jul 20, 2018
.gitignore initial commit Jul 7, 2016
LICENSE add LICENSE Aug 10, 2016
README.md Update to tensorflow 1.9.0 Jul 20, 2018
main.py Update to tensorflow 1.9.0 Jul 20, 2018
run_mujoco.sh fix run_mujoco Jul 14, 2016
utils.py Update to tensorflow 1.9.0 Jul 20, 2018

README.md

Normalized Advantage Functions (NAF) in TensorFlow

TensorFlow implementation of Continuous Deep Q-Learning with Model-based Acceleration (Gu et al., 2016), which learns a Normalized Advantage Function (NAF) for environments with continuous action spaces.

[Figure: the NAF algorithm]
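
At its core, NAF restricts the Q-function to Q(s, a) = V(s) + A(s, a), where the advantage A is quadratic in the action around a learned mean action mu(s); the greedy action is then simply mu(s), which is what makes Q-learning tractable with continuous actions. Below is a minimal NumPy sketch of that decomposition, for illustration only; the function and variable names are not taken from this repository.

```python
import numpy as np

def naf_q_value(v, mu, L, action):
    """Q(s, a) = V(s) + A(s, a), with a quadratic advantage term.

    v      : scalar state value V(s) predicted by the network
    mu     : (dim_a,) greedy action mu(s) predicted by the network
    L      : (dim_a, dim_a) lower-triangular matrix predicted by the network
             (diagonal kept positive, e.g. via exp), so P(s) = L L^T is
             positive definite
    action : (dim_a,) action whose Q-value we want
    """
    P = L @ L.T                          # positive-definite matrix P(s)
    diff = action - mu
    advantage = -0.5 * diff @ P @ diff   # A(s, a) <= 0, maximized at a = mu(s)
    return v + advantage

# Example with a 2-D action space: the greedy action a = mu(s) gives Q = V.
L = np.tril(np.random.randn(2, 2))
L[np.diag_indices(2)] = np.exp(L[np.diag_indices(2)])  # keep diagonal positive
mu = np.array([0.3, -0.1])
print(naf_q_value(v=1.0, mu=mu, L=L, action=mu))  # -> 1.0
```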

Requirements

- TensorFlow 1.9.0
- gym[all]
- tqdm

Usage

First, install prerequisites with:

$ pip install tqdm gym[all]

To train a model for an environment with a continuous action space:

$ python main.py --env_name=Pendulum-v0 --is_train=True
$ python main.py --env_name=Pendulum-v0 --is_train=True --display=True

To test a trained model and record screens with gym:

$ python main.py --env_name=Pendulum-v0 --is_train=False
$ python main.py --env_name=Pendulum-v0 --is_train=False --display=True
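
For reference, gym versions from this period ship a Monitor wrapper that records videos and episode statistics to a directory. The following is a minimal sketch of using it directly, independent of main.py; the ./recordings directory is arbitrary and the random-action policy is only a stand-in for the trained NAF agent.

```python
import gym
from gym.wrappers import Monitor

# Wrap the environment so episodes are recorded (video + stats) under
# ./recordings; force=True clears any previous recordings there.
env = Monitor(gym.make('Pendulum-v0'), directory='./recordings', force=True)

observation = env.reset()
done = False
while not done:
    action = env.action_space.sample()   # stand-in for the trained NAF policy
    observation, reward, done, info = env.step(action)
env.close()
```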

Results

Training curves for Pendulum-v0 with different hyperparameters (the colors refer to the plot below):

$ python main.py --env_name=Pendulum-v0 # dark green
$ python main.py --env_name=Pendulum-v0 --action_fn=tanh # light green
$ python main.py --env_name=Pendulum-v0 --use_batch_norm=True # yellow
$ python main.py --env_name=Pendulum-v0 --use_seperate_networks=True # green

[Plot: Pendulum-v0 training curves, 2016-07-15]
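
The flags above select network variants; their exact behaviour is defined in src. As an illustration of the --action_fn=tanh option, a tanh output activation is commonly used to squash the predicted mean action into the environment's action bounds, as in this TensorFlow 1.x sketch (layer sizes and names are illustrative, not taken from this repository):

```python
import tensorflow as tf

def mu_head(state, action_dim, action_bound, action_fn=tf.nn.tanh):
    """Predict the greedy action mu(s), squashed into the action bounds.

    action_fn=tanh maps the raw output into (-1, 1); scaling by action_bound
    then keeps mu(s) inside the environment's continuous action space
    (e.g. torque in [-2, 2] for Pendulum-v0).
    """
    hidden = tf.layers.dense(state, 200, activation=tf.nn.relu, name='hidden1')
    hidden = tf.layers.dense(hidden, 200, activation=tf.nn.relu, name='hidden2')
    raw_mu = tf.layers.dense(hidden, action_dim, activation=None, name='mu_raw')
    return action_bound * action_fn(raw_mu)

state_ph = tf.placeholder(tf.float32, [None, 3])        # Pendulum-v0 observations are 3-D
mu = mu_head(state_ph, action_dim=1, action_bound=2.0)  # Pendulum-v0 torque range is [-2, 2]
```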

References

- Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, and Sergey Levine. "Continuous Deep Q-Learning with Model-based Acceleration." ICML 2016. arXiv:1603.00748

Author

Taehoon Kim / @carpedm20