yanyongyu/FlappyBird

Flappy Bird

Overview

  • python ~= 3.7
  • pygame ~= 1.9.6
  • tensorflow ~= 1.13
  • opencv ~= 3.4.2
  • numpy ~= 1.14.5
  • sqlite ~= 3.27.2
  • pillow ~= 6.0.0
  • pywin32 ~= 223 [Optional]

To install the dependencies with pip, run:

pip install -r requirements.txt

If you are using conda, you can install the same dependencies into the active environment with:

conda install --yes --file requirements.txt

How to Play?

git clone https://github.com/yanyongyu/FlappyBird.git
cd FlappyBird
python main.py

Press Space or the Up arrow key, or click the left mouse button, to make the bird fly.

Press Escape or P, or click the pause button, to pause the game.

Screenshots

  • homepage

  • setting

  • rank

  • game

  • share

What is Deep Q Network?

It is a convolutional neural network, trained with a variant of Q-learning, whose input is raw pixels and whose output is a value function estimating future rewards.
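As a rough illustration of the Q-learning part, the value the network is trained to predict for a transition is usually the one-step target: the immediate reward plus the discounted best Q value of the next state. The sketch below is not taken from this repository; the function name `q_learning_target`, the discount factor `gamma=0.99`, and the sample rewards are assumptions chosen for illustration.

```python
def q_learning_target(reward, next_q_values, terminal, gamma=0.99):
    """One-step Q-learning target for a single transition.

    reward        -- reward received after taking the action
    next_q_values -- Q(s', a') for every action a' in the next state
    terminal      -- True if the episode ended (for example, the bird crashed)
    gamma         -- discount factor (0.99 is an assumed value)
    """
    if terminal:
        # No future reward is possible after the episode ends.
        return reward
    # Otherwise bootstrap from the best action available in the next state.
    return reward + gamma * max(next_q_values)


# Flappy Bird has two actions: do nothing or flap.
print(q_learning_target(1.0, [0.5, 2.0], terminal=False))
print(q_learning_target(-1.0, [0.5, 2.0], terminal=True))
```

The network's loss is then typically the squared difference between this target and its current estimate of Q(s, a).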

Experiments

Environment

Because the deep Q-network is trained on the raw pixel values observed from the game screen at each time step, Kevin Chen found that removing the background that appears in the original game makes it converge faster. The following figure visualizes this process:
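One plausible reason a blank background helps is that the frame can then be reduced to a simple foreground mask. The source does not describe the exact preprocessing, so the following is only a minimal sketch under the assumption that the background is drawn as black; `binarize` and its `threshold` parameter are hypothetical names. It is written in plain Python so it runs without dependencies; a real pipeline would more likely use OpenCV calls such as `cv2.cvtColor` and `cv2.threshold`.

```python
def binarize(frame, threshold=1):
    """Convert an RGB frame (rows of (r, g, b) tuples) to a 0/255 mask.

    Assumes the background is black, so every pixel whose gray level is
    at or above `threshold` belongs to the bird, a pipe, or the ground.
    """
    mask = []
    for row in frame:
        mask_row = []
        for r, g, b in row:
            # Plain average as the gray level; libraries usually use a weighted sum.
            gray = (r + g + b) // 3
            mask_row.append(255 if gray >= threshold else 0)
        mask.append(mask_row)
    return mask


# A tiny 2x2 frame: black background on the left, colored objects on the right.
frame = [
    [(0, 0, 0), (255, 200, 0)],
    [(0, 0, 0), (0, 180, 0)],
]
print(binarize(frame))
```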

Network Architecture

How to Train?

python dqn.py

The model is saved in ./dqn_model/

The log files are saved in ./dqn_logs/

Run the following command to view the graph and loss in TensorBoard:

tensorboard --logdir dqn_logs

Process

0 steps

3 million steps (300w, where w = 10,000)

Result

After about 3 million steps (300w, where w = 10,000), the bird performs well. The score can easily reach 100.

Improve

Is that enough? No!

Double DQN

Split the network into two parts: target_net and eval_net.

  • Use the eval network to select the best action for the next state (the action with the highest Q value).
  • Use the target network to calculate the target Q value of taking that action in the next state.
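The two steps above can be sketched as a single target computation. This is an illustrative sketch rather than the repository's code: `double_dqn_target`, the argument names, and `gamma=0.99` are assumptions, and the Q values are passed in as plain lists instead of being produced by real networks.

```python
def double_dqn_target(reward, next_q_eval, next_q_target, terminal, gamma=0.99):
    """Double DQN target for a single transition.

    next_q_eval   -- Q values of the next state from eval_net
    next_q_target -- Q values of the next state from target_net
    gamma         -- discount factor (0.99 is an assumed value)
    """
    if terminal:
        # No bootstrapping once the episode has ended.
        return reward
    # Step 1: eval_net selects the action with the highest Q value.
    best_action = max(range(len(next_q_eval)), key=lambda a: next_q_eval[a])
    # Step 2: target_net supplies the Q value of that selected action.
    return reward + gamma * next_q_target[best_action]


# eval_net prefers action 1, so the target uses target_net's value for action 1
# (2.0) rather than target_net's own maximum (2.5).
print(double_dqn_target(0.1, [1.0, 3.0], [2.5, 2.0], terminal=False))
```

Separating action selection from action evaluation is what reduces the overestimation that occurs when one network both picks and scores the maximizing action.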

The model converges much faster. At about 1.75 million steps (175w, where w = 10,000), the score rises dramatically.

About

Flappy Bird reinforcement learning based on Pygame, OpenCV, and TensorFlow
