Deep Q Network

Reproduces the performance (rewards) of the Deep Q-Network (DQN) reinforcement learning method using TensorFlow.

This is an easy-to-understand and easy-to-modify DQN implementation with a memory-efficient replay buffer that can store 1M transitions in ~8 GB of memory. The model architecture used in this code differs slightly from the one described in the original DQN paper. I have tested it with Pong, Breakout, and MsPacman so far.
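The memory savings typically come from storing each preprocessed frame once as `uint8` and reconstructing the 4-frame stacked states only at sample time (1M frames × 84 × 84 bytes ≈ 7 GB, versus roughly 8× that for pre-stacked `float32` states). The sketch below illustrates that idea; it is an assumption about the technique, not this repo's actual buffer, and all class and method names are hypothetical:

```python
import numpy as np

class ReplayBuffer:
    """Hypothetical memory-efficient replay buffer: stores single uint8
    frames and rebuilds stacked states when sampling."""

    def __init__(self, capacity=1_000_000, frame_shape=(84, 84), stack=4):
        self.capacity = capacity
        self.stack = stack
        self.frames = np.zeros((capacity, *frame_shape), dtype=np.uint8)
        self.actions = np.zeros(capacity, dtype=np.int32)
        self.rewards = np.zeros(capacity, dtype=np.float32)
        self.dones = np.zeros(capacity, dtype=np.bool_)
        self.next_idx = 0
        self.size = 0

    def add(self, frame, action, reward, done):
        i = self.next_idx
        self.frames[i] = frame
        self.actions[i] = action
        self.rewards[i] = reward
        self.dones[i] = done
        self.next_idx = (i + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def _stacked(self, idx):
        # Gather the `stack` frames ending at idx; zero-pad when crossing
        # an episode boundary or the start of the buffer.
        out = np.zeros((self.stack, *self.frames.shape[1:]), dtype=np.uint8)
        for k in range(self.stack):
            j = idx - k
            if j < 0 or (k > 0 and self.dones[j]):
                break
            out[self.stack - 1 - k] = self.frames[j]
        return out

    def sample(self, batch_size):
        # Sample indices that leave room for a valid stack and next state.
        idxs = np.random.randint(self.stack, self.size - 1, size=batch_size)
        states = np.array([self._stacked(i) for i in idxs])
        next_states = np.array([self._stacked(i + 1) for i in idxs])
        return (states, self.actions[idxs], self.rewards[idxs],
                self.dones[idxs], next_states)
```

Stacking at sample time trades a little CPU per minibatch for a roughly 8× reduction in replay-memory footprint.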

It took ~25 hours of training on a single GTX 1080 to reach its first 400-point reward on Breakout evaluation.

Environment I used

  • Python 3.6
  • TensorFlow 1.10
  • OpenAI Gym 0.10.5
  • OpenCV 3.4.2
  • mpi4py 3.0.0

How to run

Set the hyper-parameters in config.py, then run:

python train.py
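For orientation, a DQN config typically covers the knobs below. These names and values are illustrative assumptions, not necessarily what this repo's config.py uses:

```python
# Hypothetical config.py sketch -- names/values are assumptions, check the
# actual file in this repo before editing.
ENV_NAME = 'BreakoutNoFrameskip-v4'    # Atari environment id
REPLAY_CAPACITY = 1_000_000            # transitions kept in replay memory
BATCH_SIZE = 32
LEARNING_RATE = 2.5e-4
GAMMA = 0.99                           # discount factor
EPSILON_START, EPSILON_END = 1.0, 0.1  # linear exploration schedule
TARGET_UPDATE_FREQ = 10_000            # steps between target-network syncs
```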
