d3rezz/flappybird-RL

Playing Flappy Bird with Reinforcement Learning (DQN)

A Reinforcement Learning Bot that learns how to play the game of Flappy Bird.

Implemented in TensorFlow 2.0. Uses a modified version of an existing Python implementation of Flappy Bird to simulate the game.

The code in this repository serves as a reference for one of my recent blog posts.

Installation

Clone this repository and run flappy.py.

After about 500 episodes, the agent should start achieving a reasonable score (>25).

Implementation details

Based on the original DQN paper by DeepMind [1].

State encoding

Each frame of the game is considered a state for the Q-Learning algorithm. The vector encoding each state contains 5 values:

  • Horizontal distance between the player and the next pipe
  • Vertical distance between the player and the lower pipe
  • Vertical distance between the player and the upper pipe
  • Distance from the player to the top of the map
  • Distance from the player to the base of the map
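As an illustration, the five values listed above could be computed as in the sketch below. The function name, argument names, and sign conventions are assumptions made for this example (it assumes screen coordinates where y grows downward and the top of the map is at y = 0); the actual code in flappy.py may differ.

```python
def encode_state(player_x, player_y, pipe_x, lower_pipe_y, upper_pipe_y, base_y):
    """Build the 5-value state vector from raw game coordinates.

    Hypothetical helper: all argument names are assumptions. y is assumed
    to grow downward, with the top of the map at y = 0.
    """
    return [
        pipe_x - player_x,        # horizontal distance to the next pipe
        lower_pipe_y - player_y,  # vertical distance to the lower pipe
        player_y - upper_pipe_y,  # vertical distance to the upper pipe
        player_y,                 # distance to the top of the map
        base_y - player_y,        # distance to the base of the map
    ]
```

For example, encode_state(50, 200, 150, 300, 180, 400) gives [100, 100, 20, 200, 200].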

After each action, the transition to a new state is stored in a Replay Memory buffer as the tuple

(state, action, next_state, reward)

The Replay Memory was implemented using a ring buffer.
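A minimal sketch of a ring-buffer replay memory is shown below. The class and method names are assumptions for this example, not necessarily those used in the repository. Once the buffer is full, each new transition overwrites the oldest one.

```python
import random


class ReplayMemory:
    """Fixed-capacity ring buffer of (state, action, next_state, reward) tuples."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = [None] * capacity
        self.index = 0  # position of the next write
        self.size = 0   # number of valid entries stored so far

    def push(self, state, action, next_state, reward):
        # Write at the current position, then advance and wrap around.
        self.buffer[self.index] = (state, action, next_state, reward)
        self.index = (self.index + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def sample(self, batch_size):
        # Uniformly sample distinct transitions from the filled part.
        return random.sample(self.buffer[:self.size], batch_size)

    def __len__(self):
        return self.size
```

Sampling uniformly from the buffer, rather than training on consecutive frames, reduces the correlation between the transitions in a training batch.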

The policy network is implemented as a Multilayer Perceptron with two hidden layers of 10 neurons each. This network predicts the expected return of taking each of the available actions given the current state. Similarly to [1], a second network that is only updated every few episodes is used for computing Q(s_{t+1}). This increases the stability of the training.

Exploration is ensured by an ε-greedy policy with an annealed epsilon value.

A reward of +1 is returned at every transition to a non-final state, and a reward of -1000 when transitioning to a final state (when the bird crashes).
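The role of the second (target) network can be sketched as follows. In the DQN update from [1], the training target for a transition is the reward plus the discounted maximum Q-value of the next state, as estimated by the target network. The function name, the `dones` flags, and the discount factor `gamma=0.99` are assumptions for this example; the repository may mark final states differently.

```python
def td_targets(rewards, next_q_values, dones, gamma=0.99):
    """Compute DQN training targets for a batch of transitions.

    next_q_values[i] holds the target network's Q-values for every action
    in the next state of transition i. dones[i] is True when that next
    state is final (hypothetical flag, assumed for this sketch).
    """
    targets = []
    for reward, q_next, done in zip(rewards, next_q_values, dones):
        if done:
            # No future return after a crash: the target is the reward alone.
            targets.append(float(reward))
        else:
            targets.append(reward + gamma * max(q_next))
    return targets
```

The policy network is then trained so that its prediction for the action actually taken moves toward these targets. Because the target network's weights change only occasionally, the targets stay fixed between updates.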
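One possible form of this policy is sketched below with a linear annealing schedule. The schedule shape, the start and end values, and the number of decay steps are all assumptions for this example; the values used in the repository are not stated here.

```python
import random


def annealed_epsilon(step, eps_start=1.0, eps_end=0.01, decay_steps=10_000):
    """Linearly decay epsilon from eps_start to eps_end (assumed schedule)."""
    fraction = min(step / decay_steps, 1.0)
    return eps_start + fraction * (eps_end - eps_start)


def select_action(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Greedy choice: index of the largest predicted Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Early in training epsilon is close to 1, so the agent acts mostly at random. As epsilon shrinks, the agent relies more and more on the network's predictions.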
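The reward scheme is small enough to write out directly. The constant and function names below are assumptions for this example. The second helper shows the undiscounted return of an episode in which the bird makes a given number of non-final transitions before crashing.

```python
STEP_REWARD = 1       # reward for a transition to a non-final state
CRASH_REWARD = -1000  # reward for a transition to a final state


def reward_for(crashed):
    """Reward for a single transition (hypothetical helper)."""
    return CRASH_REWARD if crashed else STEP_REWARD


def episode_return(frames_survived):
    """Undiscounted return: +1 per surviving transition, then the crash."""
    return frames_survived * STEP_REWARD + CRASH_REWARD
```

With these values, the return of an episode only becomes positive once the bird makes more than 1000 non-final transitions, so the crash penalty dominates the learning signal in short episodes.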

References

[1] Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529-533.
