Flappy Bird AI with Unity and ML-Agents

Overview

This project trains an artificial intelligence (AI) agent to play the popular mobile game "Flappy Bird" using Unity and the ML-Agents library. The agent learns through reinforcement learning with the Proximal Policy Optimization (PPO) algorithm, and I've trained it to navigate the Flappy Bird environment autonomously.

Features

Reinforcement Learning: The agent learns to play Flappy Bird by trial and error, guided by rewards.
ML-Agents Library: The ML-Agents library provides a flexible framework for training agents in Unity environments.
Proximal Policy Optimization (PPO): PPO is the training algorithm used to optimize the agent's policy.
Unity Environment: The Flappy Bird game is built in Unity and serves as the interactive training environment.

Configurations

# .yaml
behaviors:
  FlappyBird:
    trainer_type: ppo
    hyperparameters:
      batch_size: 64
      buffer_size: 2048
      learning_rate: 3e-4
      beta: 5e-3
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 3
      learning_rate_schedule: linear
      beta_schedule: constant
      epsilon_schedule: linear
    network_settings:
      normalize: false
      hidden_units: 128
      num_layers: 2
    reward_signals:
      extrinsic:
        strength: 1.0
        gamma: 0.99
      curiosity:
        strength: 0.01
        gamma: 0.99
        encoding_size: 64
    max_steps: 5000000
    time_horizon: 64
    summary_freq: 10000
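
To make a few of these hyperparameters concrete, the Python sketch below works out how many gradient steps follow each filled buffer and how a linearly scheduled value shrinks over training. It assumes the behavior described in the ML-Agents documentation: the trainer collects buffer_size experiences, then makes num_epoch passes over them in minibatches of batch_size. The function names and the floor parameter are hypothetical and exist only for this illustration; ML-Agents applies its own minimum values internally.

```python
# Values copied from the configuration above.
BATCH_SIZE = 64
BUFFER_SIZE = 2048
NUM_EPOCH = 3
LEARNING_RATE = 3e-4
MAX_STEPS = 5_000_000


def updates_per_buffer(buffer_size: int, batch_size: int, num_epoch: int) -> int:
    """Gradient steps taken each time the experience buffer fills."""
    minibatches = buffer_size // batch_size
    return minibatches * num_epoch


def linear_decay(initial: float, step: int, max_steps: int, floor: float = 0.0) -> float:
    """Interpolate linearly from `initial` at step 0 down to `floor` at max_steps.

    `floor` is a hypothetical parameter for this sketch, not an ML-Agents setting.
    """
    progress = min(max(step / max_steps, 0.0), 1.0)
    return floor + (initial - floor) * (1.0 - progress)


if __name__ == "__main__":
    print("gradient steps per buffer:", updates_per_buffer(BUFFER_SIZE, BATCH_SIZE, NUM_EPOCH))
    print("learning rate at the halfway point:", linear_decay(LEARNING_RATE, MAX_STEPS // 2, MAX_STEPS))
```

With the values above, each filled buffer yields 32 minibatches per pass, so 96 gradient steps in total, and the learning rate has dropped to roughly half its initial value midway through training.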

Observation Space

The Flappy Bird AI agent relies on ray perception sensors as its "eyes". These sensors cast rays into the scene and report which of the following objects each ray hits:

lower pipe
upper pipe
ground
ceiling
pipes ahead

These observations tell the agent where the obstacles are, so it can decide when to flap. They are the only information the agent trains on.
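
As a rough illustration of what a ray sensor hands to the network, the Python sketch below encodes a single ray the way the ML-Agents documentation describes: a one-hot vector over the detectable tags, a flag for "hit nothing", and the normalized distance to the hit. The tag names and the ray count in the usage note are assumptions, since the project's actual sensor settings are not listed here.

```python
from typing import List, Optional, Sequence

# Hypothetical tag names based on the objects listed above;
# the project's real Unity tags may differ.
DETECTABLE_TAGS = ("LowerPipe", "UpperPipe", "Ground", "Ceiling")


def encode_ray(
    hit_tag: Optional[str],
    hit_fraction: float,
    tags: Sequence[str] = DETECTABLE_TAGS,
) -> List[float]:
    """Encode one ray as [one-hot tag..., miss flag, normalized hit distance]."""
    encoding = [0.0] * (len(tags) + 2)
    if hit_tag in tags:
        encoding[list(tags).index(hit_tag)] = 1.0
    # Miss flag: 1.0 only when the ray hit nothing at all.
    encoding[len(tags)] = 1.0 if hit_tag is None else 0.0
    # Fraction of the ray length travelled before the hit (1.0 on a miss).
    encoding[len(tags) + 1] = 1.0 if hit_tag is None else hit_fraction
    return encoding


def observation_size(rays_per_direction: int, num_tags: int, stacks: int = 1) -> int:
    """Total ray observation size: stacks * (1 + 2 * rays per direction) * (tags + 2)."""
    return stacks * (1 + 2 * rays_per_direction) * (num_tags + 2)
```

For example, under the assumed four tags and a hypothetical 3 rays per direction, `observation_size(3, 4)` gives 7 rays of 6 values each, or 42 floats per observation.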

Reward System

The reward system in the Flappy Bird AI project encourages specific behaviors:

Staying Alive: The agent receives a small reward of +0.01 for each time step it remains alive.
Passing a Pair of Pipes: Successfully navigating through a pair of pipes yields a +1 reward.
Crashing/Dying: Collisions result in a -5 penalty.

These rewards guide the AI agent's learning process, promoting survival, skillful pipe passage, and avoidance of collisions for optimal gameplay.
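
To see how these three rewards combine over an episode, here is a small Python sketch that builds the per-step reward sequence and its discounted return, using the gamma of 0.99 from the configuration. The function names are hypothetical, and the sketch assumes the crash happens on the final step and that this step earns only the penalty; the project's actual C# agent may order things differently.

```python
from typing import Iterable, List, Set

# Reward values from the list above; gamma from the training configuration.
STEP_REWARD = 0.01
PIPE_REWARD = 1.0
CRASH_PENALTY = -5.0
GAMMA = 0.99


def episode_rewards(steps_survived: int, pipe_steps: Set[int]) -> List[float]:
    """Per-step rewards for an episode that ends in a crash.

    `pipe_steps` holds the step indices at which a pair of pipes is passed.
    """
    rewards = []
    for step in range(steps_survived):
        reward = STEP_REWARD
        if step in pipe_steps:
            reward += PIPE_REWARD
        rewards.append(reward)
    # Assumption: the crash step itself carries only the penalty.
    rewards.append(CRASH_PENALTY)
    return rewards


def discounted_return(rewards: Iterable[float], gamma: float = GAMMA) -> float:
    """Sum of rewards, each weighted by gamma raised to its step index."""
    total = 0.0
    for reward in reversed(list(rewards)):
        total = reward + gamma * total
    return total
```

Calling `discounted_return(episode_rewards(200, {60, 140}))` scores an episode that passes two pipe pairs before crashing. Because of discounting, a crash 100 steps in the future is weighted by about 0.37, so the penalty weighs most heavily on the decisions made shortly before the collision.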

Action Space

The action space mirrors the original game: the agent's sole action is to flap its wings.

Flapping: Flapping pushes the bird upward. When the agent does not flap, the bird falls.

This minimal action space keeps the agent focused on the core mechanic of the original Flappy Bird game, which is timing its flaps.
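
One common way to model a single "flap" action is a discrete choice between two values: do nothing or flap. The Python sketch below shows that idea with a toy physics step. It is an assumption about how such an action could be wired up, not the project's actual Unity code, and the physics constants are made up for illustration.

```python
from typing import Tuple

# Assumed discrete action values: one branch with two choices.
NO_FLAP = 0
FLAP = 1

# Hypothetical physics constants, chosen only for illustration.
GRAVITY = -9.8        # downward acceleration
FLAP_VELOCITY = 4.0   # upward velocity set by a flap
DT = 0.02             # seconds per simulation step


def apply_action(height: float, velocity: float, action: int) -> Tuple[float, float]:
    """Advance the bird by one step and return its new (height, velocity)."""
    if action not in (NO_FLAP, FLAP):
        raise ValueError(f"unknown action: {action}")
    if action == FLAP:
        # A flap replaces the current velocity with a fixed upward one.
        velocity = FLAP_VELOCITY
    else:
        # Without a flap, gravity keeps pulling the bird down.
        velocity = velocity + GRAVITY * DT
    return height + velocity * DT, velocity
```

Here `apply_action(0.0, 0.0, FLAP)` moves the bird up, while `apply_action(0.0, 0.0, NO_FLAP)` lets it start falling.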

Sookeyy-12/FlappyBird-Unity-MLAgents
