Asynchronous-Methods-for-Deep-Reinforcement-Learning

Based on the Google DeepMind paper "Asynchronous Methods for Deep Reinforcement Learning" (http://arxiv.org/pdf/1602.01783v1.pdf), I've developed a new version of DQN that uses multiple exploration threads instead of replay memory. I followed the paper's one-step Q-learning pseudocode. With this approach, Pong can be trained in less than 20 hours without any GPU or network distribution.
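To make the idea concrete, here is a minimal sketch of asynchronous one-step Q-learning, not the code in this repository. It is written under several assumptions: a Q-table stands in for the deep network, a toy 6-state chain stands in for Pong, and all names (`env_step`, `train`, `I_TARGET`, `I_ASYNC`) are hypothetical. For simplicity a lock guards the shared counter and the updates, whereas the paper applies updates without locking. Each thread explores with its own epsilon, accumulates one-step TD errors locally, and periodically applies them to the shared table, while a separate target table is refreshed every `I_TARGET` global steps.

```python
import random
import threading

N_STATES = 6      # toy chain: states 0..5, reaching state 5 ends the episode
GAMMA = 0.9       # discount factor
LR = 0.1          # learning rate
I_TARGET = 100    # global steps between target-table refreshes
I_ASYNC = 5       # local steps between asynchronous updates


def env_step(state, action):
    """Action 0 moves left, action 1 moves right; reward 1.0 on reaching the end."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(n_threads=4, t_max=20000, seed=0):
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # shared parameters (theta)
    q_target = [row[:] for row in q]            # target parameters (theta^-)
    lock = threading.Lock()
    counter = {"T": 0}                          # shared global step counter

    def worker(epsilon, rng):
        state, t, grads = 0, 0, {}
        while True:
            # Epsilon-greedy action from the shared table, ties broken at random.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])
            nxt, reward, done = env_step(state, action)
            # One-step target computed from the target table.
            y = reward if done else reward + GAMMA * max(q_target[nxt])
            # Accumulate the TD error locally instead of storing a replay sample.
            key = (state, action)
            grads[key] = grads.get(key, 0.0) + (y - q[state][action])
            state = 0 if done else nxt
            t += 1
            with lock:
                counter["T"] += 1
                finished = counter["T"] >= t_max
                if counter["T"] % I_TARGET == 0:
                    for s in range(N_STATES):
                        q_target[s][:] = q[s]
                # Apply accumulated updates every I_ASYNC steps or at episode end.
                if t % I_ASYNC == 0 or done or finished:
                    for (s, a), g in grads.items():
                        q[s][a] += LR * g
                    grads.clear()
            if finished:
                return

    base = random.Random(seed)
    threads = []
    for i in range(n_threads):
        # Assumption: epsilons spread over 0.1..0.5 so threads explore differently.
        epsilon = 0.1 + 0.4 * i / max(1, n_threads - 1)
        rng = random.Random(base.randrange(2**32))
        threads.append(threading.Thread(target=worker, args=(epsilon, rng)))
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return q
```

Calling `train()` returns the learned table. On this toy chain the greedy action should end up being "right" (index 1) in every non-terminal state. The point of running several threads with different exploration rates is that their experience is decorrelated, which is the role replay memory plays in the original DQN.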

Folders and files (8 commits):
Wrapped-Game-Code
save_networks_asyn
.gitignore
README.md
asynchronous_one_step_Q_learning.py
asynchronous_one_step_Q_learning_play.py

tshahpuri/Asynchronous-Methods-for-Deep-Reinforcement-Learning
