
Dueling Double Deep Q-Network with Tensorflow

You need a powerful GPU for training. After 12 hours of training per game on a GTX-1070, I got this result:

Dueling DDQN in action

Setup

You need anaconda, tensorflow, gym, and tflearn to run this project.

Once you install anaconda, you might find the following commands helpful for creating and managing your conda virtual environment.

$ export PATH=~/anaconda3/bin/:$PATH # Add it to your path.
$ conda -V # You can check conda version
$ conda update conda # You can update your conda
$ conda search "^python$" # See python versions
$ conda create -n <env-name> python=<version> anaconda # Create a virtual env
$ source activate <env-name> # Activate your virtual env
$ conda info -e # List your virtual envs
$ conda install -n <env-name> [package] # Install more packages
$ source deactivate # Deactivate current virtual env
$ conda remove -n <env-name> --all # Delete your environment <env-name>
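For example, to create an environment for this project (the environment name and Python version below are only illustrative choices, not requirements of the repository):

$ conda create -n dddqn python=3.5 anaconda # Create a virtual env for this project
$ source activate dddqn # Work inside it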

Follow the tensorflow installation instructions for anaconda. If you want to use your GPU, follow the Cuda installation and gpu-enabled tensorflow installation guides. If you are installing gpu-enabled tensorflow, you might find this video helpful. If you are building gpu-enabled tensorflow from scratch, follow this.
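As a rough sketch, a pip-based install inside the activated environment looks like this; pick the CPU or GPU package depending on your machine (no particular version is pinned here, since the repository does not state one):

$ pip install tensorflow # CPU-only build
$ pip install tensorflow-gpu # GPU build, requires CUDA and cuDNN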

TFLearn setup is as easy as running the following command.

pip install tflearn
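A quick sanity check that the installation works (assuming tflearn imports cleanly in the active environment):

$ python -c "import tflearn; print(tflearn.__version__)"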

The following commands install gym in your virtual environment.

$ apt-get install -y python-dev cmake zlib1g-dev libjpeg-dev xvfb libav-tools xorg-dev python-opengl libboost-all-dev libsdl2-dev swig

$ pip install gym[atari]

I had to run the following commands to fix the problems that appeared after gym installation.

$ conda install -f numpy
$ conda install libgcc

How to train

We have custom games and gym atari games. The custom games take considerably less time to train.

Here is a training command.

python dddqn.py train my-Catch \
--experiment=catch1 \
--num_random_steps=5000 \
--num_training_steps=2500 \
--num_validation_steps=1250 \
--epsilon_annealing_steps=50000 \
--experience_buffer_size=225000 \
--summary_dir=/tmp/summaries \
--checkpoint_dir=/tmp/checkpoints \
--target_update_frequency=5000 \
--tau=0.0 \
--alpha=0.00025

See dddqn_args.py for all options.

I created the custom games using the code provided by this blog post.

  • my-Catch
  • my-Avoid

Popular atari games are Breakout-v0, Pong-v0, SpaceInvaders-v0, etc. See gym atari environments for the full list of atari games.
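Training on an atari game uses the same command shape as the my-Catch example above; for instance (the experiment name is arbitrary, and only flags shown earlier are used here):

python dddqn.py train Breakout-v0 \
--experiment=breakout1 \
--summary_dir=/tmp/summaries \
--checkpoint_dir=/tmp/checkpoints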

You can use tensorboard to follow the training progress.

tensorboard --logdir=/tmp/summaries/catch1

Go to http://127.0.1.1:6006.

How to test

python dddqn.py test my-Catch /tmp/checkpoints/catch1.ckpt-XXXX --eval_dir=/tmp/catch1

How to plot

python plot.py /tmp/summaries/catch1/plot.csv --x_axis=epoch --y_axis=reward
python plot.py /tmp/summaries/catch1/plot.csv --x_axis=epoch --y_axis=maxq
python plot.py /tmp/summaries/catch1/plot.csv --x_axis=epoch --y_axis=epsilon

Resources
