
Asynchronous-Methods-for-Deep-Reinforcement-Learning

The project provides a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent to optimize deep neural network controllers. It is an implementation of the research paper of the same name (Mnih et al., 2016), using the Flappy Bird game as the test environment.

Installation Dependencies:

  • Python 2.7 or 3
  • TensorFlow
  • Pygame
  • OpenCV-Python

How to run?

git clone https://github.com/Nuclearstar/Asynchronous-Methods-for-Deep-Reinforcement-Learning.git
cd Asynchronous-Methods-for-Deep-Reinforcement-Learning
python deep_q_network_actual.py

Objectives:

We present asynchronous variants of three standard reinforcement learning algorithms:

  1. One-step Q-learning
  2. Advantage actor-critic
  3. n-step Q-learning
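The difference between the one-step and n-step Q-learning variants is the target each one regresses toward. A minimal numpy sketch with toy values (all numbers and variable names are illustrative, not taken from the repository):

```python
import numpy as np

gamma = 0.99  # discount factor

# Toy Q-values for the next three states (2 actions each) and rewards.
q_next = np.array([[0.5, 1.0],
                   [0.2, 0.8],
                   [0.9, 0.3]])
rewards = np.array([1.0, 0.0, 1.0])

# One-step Q-learning target: r_t + gamma * max_a Q(s_{t+1}, a)
one_step_target = rewards[0] + gamma * q_next[0].max()

# n-step Q-learning target (n = 3): the discounted sum of the next n
# rewards plus a bootstrapped value from the state reached after n steps.
n = 3
n_step_target = sum(gamma**k * rewards[k] for k in range(n))
n_step_target += gamma**n * q_next[-1].max()

print(one_step_target)  # 1.99
```

The n-step target propagates a reward n steps back in a single update, which speeds up credit assignment at the cost of higher variance.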

Proposed solution methods

  • We present multi-threaded asynchronous n-step Q-learning and advantage actor-critic.

  • First, we use asynchronous actor-learners, similarly to the Gorila framework, but instead of using separate machines and a parameter server, we use multiple CPU threads on a single machine.

  • Second, we make the observation that multiple actor-learners running in parallel are likely to explore different parts of the environment, which decorrelates the training data and stabilizes learning without the need for experience replay.
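The multi-threaded idea above can be sketched with Python's threading module: several workers update a single shared parameter vector without locks, Hogwild!-style. This is a toy quadratic-loss illustration of the pattern, not the repository's training code:

```python
import threading
import numpy as np

# Shared parameters, updated lock-free by all actor-learner threads.
params = np.array([0.0])

def worker(steps=200, lr=0.05):
    """Each actor-learner computes its own gradient of the toy loss
    (w - 3)^2 and applies it directly to the shared parameters."""
    for _ in range(steps):
        grad = 2.0 * (params[0] - 3.0)
        params[0] -= lr * grad  # asynchronous, unsynchronized update

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(params[0])  # converges to ~3.0, the minimizer of the loss
```

Even though the threads race on `params`, every update pulls the parameters toward the minimizer, so training still converges; the same tolerance for stale, unsynchronized updates is what lets the asynchronous methods use plain CPU threads instead of a parameter server.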

System design

1. Deep Q-Network

  • A Deep Q-Network is a convolutional neural network trained with a variant of Q-learning.

  • Instead of approximating the Q-function with a network that takes a state-action pair as input and outputs a single Q-value, the network takes only the state as input and outputs one Q-value per action.

  • A single forward pass through the network therefore yields the Q-values of all actions at once, from which the greedy action can be selected.
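The "all Q-values in one forward pass" point can be shown with a tiny fully connected stand-in for the convolutional network (the layer sizes and weights here are arbitrary, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny stand-in for the Q-network: state in, one Q-value per action out.
n_state, n_hidden, n_actions = 8, 16, 2   # Flappy Bird: flap / do nothing
W1 = rng.normal(size=(n_state, n_hidden))
W2 = rng.normal(size=(n_hidden, n_actions))

def q_values(state):
    """One forward pass returns Q-values for ALL actions at once."""
    hidden = np.maximum(0.0, state @ W1)  # ReLU hidden layer
    return hidden @ W2

state = rng.normal(size=n_state)
q = q_values(state)
greedy_action = int(np.argmax(q))  # pick the action with the largest Q-value

print(q.shape)  # (2,)
```

Because the network emits a vector of Q-values, acting greedily costs one forward pass per step rather than one pass per action.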

2. Environment

[Figure: the Flappy Bird game environment]

3. Architecture of the network

[Figure: architecture of the network]

Implementation

The Flappy Bird game (implemented with Pygame) is used as the environment; raw game frames are preprocessed with OpenCV-Python before being fed to the network.

[Figure: Flappy Bird gameplay]
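Since the dependencies list OpenCV-Python, the game frames are presumably grayscaled, resized, and stacked before reaching the network. A numpy-only sketch of that pipeline (the frame size, downsampling factor, and stack depth are assumptions, not values read from the repository's code):

```python
import numpy as np

def preprocess(frame):
    """Grayscale an RGB frame and downsample it (numpy stand-in for the
    OpenCV resize/threshold steps the dependencies suggest)."""
    gray = frame @ np.array([0.299, 0.587, 0.114])  # RGB -> luminance
    return gray[::4, ::4]  # crude 4x downsampling by striding

# Fake 320x320 RGB game frame; the real screen size is an assumption.
frame = np.zeros((320, 320, 3))
small = preprocess(frame)

# Stack the last 4 frames so the network can perceive motion.
stacked = np.stack([small] * 4, axis=-1)
print(stacked.shape)  # (80, 80, 4)
```

Stacking consecutive frames is the standard way to give a feed-forward Q-network enough history to infer the bird's velocity from still images.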

Conclusion and future work

  • We have presented asynchronous versions of three standard reinforcement learning algorithms and shown that they can train neural network controllers on a variety of domains in a stable manner.

  • Combining other existing reinforcement learning methods or recent advances in deep reinforcement learning with our asynchronous framework presents many possibilities for immediate improvements to the methods we presented.
