
Asynchronous-Methods-for-Deep-Reinforcement-Learning

The project provides a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent to optimize deep neural network controllers. It is an implementation of the research paper of the same name (Mnih et al., 2016), using the Flappy Bird game as the test environment.

Installation Dependencies:

  • Python 2.7 or 3
  • TensorFlow
  • Pygame
  • OpenCV-Python

How to run?

git clone https://github.com/Nuclearstar/Asynchronous-Methods-for-Deep-Reinforcement-Learning.git
cd Asynchronous-Methods-for-Deep-Reinforcement-Learning
python deep_q_network_actual.py

Objectives:

We present asynchronous variants of three standard reinforcement learning algorithms:

  1. One-step Q-learning
  2. Advantage actor-critic
  3. n-step Q-learning
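The difference between the one-step and n-step Q-learning variants is the target each one regresses toward. A minimal numpy sketch with toy values (all numbers and variable names are illustrative, not taken from the repository):

```python
import numpy as np

gamma = 0.99  # discount factor

# Toy Q-values for the next three states (2 actions each) and rewards.
q_next = np.array([[0.5, 1.0],
                   [0.2, 0.8],
                   [0.9, 0.3]])
rewards = np.array([1.0, 0.0, 1.0])

# One-step Q-learning target: r_t + gamma * max_a Q(s_{t+1}, a)
one_step_target = rewards[0] + gamma * q_next[0].max()

# n-step Q-learning target (n = 3): the discounted sum of the next n
# rewards plus a bootstrapped value from the state reached after n steps.
n = 3
n_step_target = sum(gamma**k * rewards[k] for k in range(n))
n_step_target += gamma**n * q_next[-1].max()

print(one_step_target)  # 1.99
```

The n-step target propagates a reward n steps back in a single update, which speeds up credit assignment at the cost of higher variance.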

Proposed solution methods

  • We present multi-threaded asynchronous n-step Q-learning and advantage actor-critic.

  • First, we use asynchronous actor-learners, similarly to the Gorila framework, but instead of using separate machines and a parameter server, we use multiple CPU threads on a single machine.

  • Second, we make the observation that multiple actor-learners running in parallel are likely to explore different parts of the environment, which decorrelates the training data and stabilizes learning without the need for experience replay.
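The multi-threaded idea above can be sketched with Python's threading module: several workers update a single shared parameter vector without locks, Hogwild!-style. This is a toy quadratic-loss illustration of the pattern, not the repository's training code:

```python
import threading
import numpy as np

# Shared parameters, updated lock-free by all actor-learner threads.
params = np.array([0.0])

def worker(steps=200, lr=0.05):
    """Each actor-learner computes its own gradient of the toy loss
    (w - 3)^2 and applies it directly to the shared parameters."""
    for _ in range(steps):
        grad = 2.0 * (params[0] - 3.0)
        params[0] -= lr * grad  # asynchronous, unsynchronized update

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(params[0])  # converges to ~3.0, the minimizer of the loss
```

Even though the threads race on `params`, every update pulls the parameters toward the minimizer, so training still converges; the same tolerance for stale, unsynchronized updates is what lets the asynchronous methods use plain CPU threads instead of a parameter server.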

System design

1. Deep Q-Network

  • A Deep Q-Network is a convolutional neural network trained with a variant of Q-learning.

  • Instead of approximating the Q-function with a network that takes a state-action pair as input and outputs a single Q-value, the network takes only the state as input and outputs one Q-value per action.

  • A single forward pass through the network therefore yields the Q-values of all actions at once, from which the greedy action can be selected.
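The "all Q-values in one forward pass" point can be shown with a tiny fully connected stand-in for the convolutional network (the layer sizes and weights here are arbitrary, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny stand-in for the Q-network: state in, one Q-value per action out.
n_state, n_hidden, n_actions = 8, 16, 2   # Flappy Bird: flap / do nothing
W1 = rng.normal(size=(n_state, n_hidden))
W2 = rng.normal(size=(n_hidden, n_actions))

def q_values(state):
    """One forward pass returns Q-values for ALL actions at once."""
    hidden = np.maximum(0.0, state @ W1)  # ReLU hidden layer
    return hidden @ W2

state = rng.normal(size=n_state)
q = q_values(state)
greedy_action = int(np.argmax(q))  # pick the action with the largest Q-value

print(q.shape)  # (2,)
```

Because the network emits a vector of Q-values, acting greedily costs one forward pass per step rather than one pass per action.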

2. Environment

[Figure: the Flappy Bird game environment]

3. Architecture of the network

[Figure: architecture of the network]

Implementation

The Flappy Bird game (implemented with Pygame) is used as the environment; raw game frames are preprocessed with OpenCV-Python before being fed to the network.

[Figure: Flappy Bird gameplay]
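Since the dependencies list OpenCV-Python, the game frames are presumably grayscaled, resized, and stacked before reaching the network. A numpy-only sketch of that pipeline (the frame size, downsampling factor, and stack depth are assumptions, not values read from the repository's code):

```python
import numpy as np

def preprocess(frame):
    """Grayscale an RGB frame and downsample it (numpy stand-in for the
    OpenCV resize/threshold steps the dependencies suggest)."""
    gray = frame @ np.array([0.299, 0.587, 0.114])  # RGB -> luminance
    return gray[::4, ::4]  # crude 4x downsampling by striding

# Fake 320x320 RGB game frame; the real screen size is an assumption.
frame = np.zeros((320, 320, 3))
small = preprocess(frame)

# Stack the last 4 frames so the network can perceive motion.
stacked = np.stack([small] * 4, axis=-1)
print(stacked.shape)  # (80, 80, 4)
```

Stacking consecutive frames is the standard way to give a feed-forward Q-network enough history to infer the bird's velocity from still images.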

Conclusion and future work

  • We have presented asynchronous versions of three standard reinforcement learning algorithms and shown that they can train neural network controllers on a variety of domains in a stable manner.

  • Combining other existing reinforcement learning methods or recent advances in deep reinforcement learning with our asynchronous framework presents many possibilities for immediate improvements to the methods we presented.
