A Flappy Bird learning environment in Unity for Reinforcement Learning.

Sookeyy-12/FlappyBird-Unity-MLAgents

Flappy Bird AI with Unity and ML-Agents

Overview

This project trains an artificial intelligence (AI) agent to play the mobile game "Flappy Bird" using Unity and the ML-Agents library. The agent learns through reinforcement learning with the Proximal Policy Optimization (PPO) algorithm and navigates the Flappy Bird environment on its own, without human input.

Features

  • Reinforcement Learning: The agent learns to play Flappy Bird from rewards and penalties rather than from hand-written rules.
  • ML-Agents Library: The ML-Agents library provides a flexible framework for training agents in Unity environments.
  • Proximal Policy Optimization (PPO): PPO is the training algorithm used to optimize the agent's policy.
  • Unity Environment: The Flappy Bird game is built in Unity, which serves as the interactive training environment.

Configurations

# .yaml
behaviors:
  FlappyBird:
    trainer_type: ppo
    hyperparameters:
      batch_size: 64
      buffer_size: 2048
      learning_rate: 3e-4
      beta: 5e-3
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 3
      learning_rate_schedule: linear
      beta_schedule: constant
      epsilon_schedule: linear
    network_settings:
      normalize: false
      hidden_units: 128
      num_layers: 2
    reward_signals:
      extrinsic:
        strength: 1.0
        gamma: 0.99
      curiosity:
        strength: 0.01
        gamma: 0.99
        encoding_size: 64
    max_steps: 5000000
    time_horizon: 64
    summary_freq: 10000
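
To make a few of these numbers concrete, the following is a minimal sketch that derives how many gradient steps one full buffer produces and how a linearly decayed learning rate would evolve. It assumes that each of the `num_epoch` passes splits the buffer into `buffer_size / batch_size` minibatches and that the `linear` schedule decays to zero at `max_steps`; the exact floor and update logic used by ML-Agents may differ. The function names are illustrative and not part of the ML-Agents API.

```python
# Values copied from the configuration above.
BATCH_SIZE = 64
BUFFER_SIZE = 2048
NUM_EPOCH = 3
LEARNING_RATE = 3e-4
MAX_STEPS = 5_000_000


def minibatches_per_epoch(buffer_size: int = BUFFER_SIZE,
                          batch_size: int = BATCH_SIZE) -> int:
    """Number of minibatches that one pass over a full buffer is split into."""
    return buffer_size // batch_size


def gradient_steps_per_update(num_epoch: int = NUM_EPOCH,
                              buffer_size: int = BUFFER_SIZE,
                              batch_size: int = BATCH_SIZE) -> int:
    """Gradient steps taken each time the buffer fills (assumed PPO update loop)."""
    return num_epoch * minibatches_per_epoch(buffer_size, batch_size)


def linear_schedule(initial: float, step: int, max_steps: int = MAX_STEPS) -> float:
    """Assumed linear decay: `initial` at step 0, falling to 0 at `max_steps`."""
    progress = min(max(step / max_steps, 0.0), 1.0)
    return initial * (1.0 - progress)


if __name__ == "__main__":
    print("minibatches per epoch:", minibatches_per_epoch())
    print("gradient steps per buffer:", gradient_steps_per_update())
    for step in (0, 1_000_000, 2_500_000, 5_000_000):
        print(step, linear_schedule(LEARNING_RATE, step))
```

Under these assumptions, the learning rate is halved by the midpoint of training, which is worth keeping in mind when comparing early and late training curves.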

Observation Space

The Flappy Bird AI agent uses ray perception sensors as its "eyes". Each sensor casts rays into the scene and reports what they hit, which lets the agent detect:

  • the lower pipe
  • the upper pipe
  • the ground
  • the ceiling
  • the pipes ahead

These observations give the agent the information it needs to decide when to flap during gameplay.
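
As an illustration of what such a sensor feeds the network, here is a hedged sketch of a per-ray encoding: a one-hot vector over the detectable tags, a flag for rays that hit nothing, and the normalized hit distance. The tag names below are hypothetical, since the README does not list the project's actual Unity tags, and the helper functions are illustrative rather than part of ML-Agents.

```python
from typing import Iterable, List, Optional, Sequence, Tuple

# Hypothetical tag names; the project's actual Unity tags are not given here.
DETECTABLE_TAGS = ("LowerPipe", "UpperPipe", "Ground", "Ceiling")


def encode_ray(hit_tag: Optional[str], hit_fraction: float,
               tags: Sequence[str] = DETECTABLE_TAGS) -> List[float]:
    """Encode one ray as [one-hot tag..., miss flag, normalized hit distance]."""
    encoding = [0.0] * (len(tags) + 2)
    if hit_tag is None:
        encoding[len(tags)] = 1.0      # the ray hit nothing
        encoding[len(tags) + 1] = 1.0  # distance reported as the full ray length
        return encoding
    if hit_tag in tags:
        encoding[list(tags).index(hit_tag)] = 1.0
    # Clamp the distance to [0, 1], where 1 means the end of the ray.
    encoding[len(tags) + 1] = min(max(hit_fraction, 0.0), 1.0)
    return encoding


def encode_rays(rays: Iterable[Tuple[Optional[str], float]],
                tags: Sequence[str] = DETECTABLE_TAGS) -> List[float]:
    """Concatenate the encodings of all rays into one flat observation vector."""
    observation: List[float] = []
    for hit_tag, hit_fraction in rays:
        observation.extend(encode_ray(hit_tag, hit_fraction, tags))
    return observation


if __name__ == "__main__":
    # One ray hits the ground halfway along its length; another hits nothing.
    print(encode_rays([("Ground", 0.5), (None, 1.0)]))
```

With four tags, each ray contributes six values, so the observation size grows linearly with the number of rays the sensor casts.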

Reward System

The reward system encourages specific behaviors:

  • Staying Alive: The agent receives a small reward of +0.01 for each time step it remains alive.
  • Passing a Pair of Pipes: Successfully navigating through a pair of pipes yields a +1 reward.
  • Crashing/Dying: Collisions result in a -5 penalty.

Together, these rewards guide the agent toward surviving longer, passing pipes, and avoiding collisions.
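
The reward scheme above can be sketched as a small function, together with the discounted return the trainer optimizes (using the `gamma: 0.99` from the configuration). This is a minimal sketch, not the project's actual Unity code: it assumes the survival reward is not granted on the step where the bird crashes, and the function names are illustrative.

```python
from typing import Sequence

# Reward values taken from the list above.
STEP_REWARD = 0.01
PIPE_REWARD = 1.0
CRASH_PENALTY = -5.0


def step_reward(passed_pipe: bool, crashed: bool) -> float:
    """Reward for a single time step.

    Assumption: a crash ends the episode and yields only the penalty.
    """
    if crashed:
        return CRASH_PENALTY
    reward = STEP_REWARD
    if passed_pipe:
        reward += PIPE_REWARD
    return reward


def discounted_return(rewards: Sequence[float], gamma: float = 0.99) -> float:
    """Sum of rewards, each discounted by gamma per step into the future."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


if __name__ == "__main__":
    # Hypothetical episode: 50 steps of survival, one pipe passed, then a crash.
    episode = [step_reward(False, False)] * 50
    episode.append(step_reward(True, False))
    episode.append(step_reward(False, True))
    print("undiscounted:", sum(episode))
    print("discounted:", discounted_return(episode))
```

Because the crash penalty outweighs the reward for a single pipe, an agent under this scheme has a strong incentive to avoid collisions rather than gamble on one more pipe.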

Action Space

The action space mirrors the original game: the agent's sole action is to flap its wings.

  • Flapping: Flapping pushes the bird upward, which lets the agent control its height in response to the game environment.

Keeping the action space this small lets the agent focus on the core mechanic of the original Flappy Bird game.
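
One common way to model such an action space is a single discrete choice per step: 0 for "do nothing" and 1 for "flap". The sketch below assumes that mapping and uses made-up physics constants, since the README does not state the project's actual values or how the action is encoded in Unity.

```python
from typing import Tuple

# Hypothetical physics constants, chosen only for illustration.
GRAVITY = -9.8        # downward acceleration, units per second squared
FLAP_VELOCITY = 5.0   # upward velocity set by a flap, units per second
DT = 0.02             # simulated time per decision step, seconds

# Assumed discrete action encoding.
ACTION_NONE = 0
ACTION_FLAP = 1


def apply_action(y: float, velocity: float, action: int) -> Tuple[float, float]:
    """Advance the bird's height by one step given the chosen action."""
    if action not in (ACTION_NONE, ACTION_FLAP):
        raise ValueError(f"unknown action: {action}")
    if action == ACTION_FLAP:
        velocity = FLAP_VELOCITY  # a flap replaces the current vertical velocity
    velocity += GRAVITY * DT      # gravity always applies
    y += velocity * DT
    return y, velocity


if __name__ == "__main__":
    y, v = 0.0, 0.0
    for action in (ACTION_FLAP, ACTION_NONE, ACTION_NONE):
        y, v = apply_action(y, v, action)
        print(f"action={action} y={y:.4f} velocity={v:.4f}")
```

With only two choices per step, the policy reduces to a timing problem: deciding at each step whether flapping now keeps the bird inside the next gap.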
