- Implement the 2013 NIPS paper's DQN (with a replay buffer)
- Implement the 2015 Nature paper's DQN (adds a target network)
- Implement double DQN
- Implement dueling DQN
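The main difference between these variants is how the bootstrap target is computed. As a rough illustration only (a toy numpy sketch with hypothetical function names, not the repo's actual TensorFlow code):

```python
import numpy as np

def dqn_target(q_next_target, reward, gamma=0.99):
    # 2013 NIPS / 2015 Nature DQN: bootstrap from the max Q-value
    # for the next state (in the Nature version, from the target network).
    return reward + gamma * np.max(q_next_target)

def double_dqn_target(q_next_online, q_next_target, reward, gamma=0.99):
    # Double DQN: the online network selects the action,
    # the target network evaluates it, reducing overestimation.
    a = np.argmax(q_next_online)
    return reward + gamma * q_next_target[a]

# Hypothetical Q-values for one next state:
q_online = np.array([1.0, 3.0, 2.0])   # online network
q_target = np.array([2.5, 0.5, 1.5])   # target network
print(dqn_target(q_target, reward=1.0))                # 1 + 0.99 * 2.5 = 3.475
print(double_dqn_target(q_online, q_target, 1.0))      # 1 + 0.99 * 0.5 = 1.495
```

Dueling DQN instead changes the network architecture, splitting the head into a state-value stream and an advantage stream; the target computation itself is unchanged.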
Each approach was trained for about 500k frames on my computer.
| approach | mean score | highest score |
|---|---|---|
| 2013 NIPS | 20.534 | 468 |
| 2015 Nature | 25.121 | 931 |
| double_DQN | 31.530 | 870 |
| dueling_DQN | 30.520 | 797 |
These scores do not strictly reflect the relative merits of the algorithms.
You can download the trained models and logs.
Mean Q value over training:

Score over training:
- tensorflow
- pygame
- opencv-python
pip install tensorflow pygame opencv-python
git clone https://github.com/FLming/DQN_flappy_bird.git
cd DQN_flappy_bird
python flappybird.py
To switch approaches, change the import in flappybird.py, e.g. replace
from double_DQN import DeepQNetworks
with
from DQN_Nature import DeepQNetworks
To view the network architecture and the evolution of variables and scores, run:
tensorboard --logdir logs
To speed up training, comment out FPSCLOCK.tick(FPS) in wrapped_flappy_bird.py or increase FPS.
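For context, FPSCLOCK.tick(FPS) caps the frame rate by sleeping between frames, which is why removing it (or raising FPS) makes training run faster. A toy stand-in using only the standard library (run_frames is a hypothetical name, not from the repo):

```python
import time

def run_frames(n_frames, fps=None):
    # Toy game loop: with fps set, sleep to cap the frame rate
    # (roughly what pygame's Clock.tick(FPS) does); with fps=None,
    # frames run as fast as possible.
    start = time.time()
    for _ in range(n_frames):
        pass  # one game step would go here
        if fps is not None:
            time.sleep(1.0 / fps)
    return time.time() - start
```

With the cap removed, the loop is limited only by how fast the game logic and network updates run.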