Creating an AI to Play a Bevy Snake Game

This example shows how to expose a Bevy app as an entity-gym environment, use enn-trainer to train a neural network to play snake, and then run the resulting neural network as part of a Bevy game. The snake implementation is lightly modified from Marcus Buffett's snake clone.

Overview

The majority of the code is unchanged from the original implementation:

  • The main.rs file has been renamed to lib.rs, with the new entry point moved to bin/main.rs.
  • The new AI controller lives in src/ai.rs.
  • The additional code required for training is in src/python.rs, which defines a PyO3 Python API. train.py is a simple script that runs training, train.ron defines some hyperparameters, and pyproject.toml/poetry.lock define required Python dependencies using the Poetry package manager.

Usage

Clone the repo and move to the examples/bevy_snake directory:

git clone https://github.com/entity-neural-network/entity-gym-rs.git
cd entity-gym-rs/examples/bevy_snake

Running the game with random actions:

cargo run --bin main

Run with a trained neural network (download link):

cargo run -- --agent-path bevy_snake1m.roguenet

Training a new agent with enn-trainer (requires Poetry, only tested on Linux, Nvidia GPU recommended):

poetry install
# Replace "cu113" with "cpu" to train on CPU.
poetry run pip install torch==1.12.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
poetry run pip install torch-scatter -f https://data.pyg.org/whl/torch-1.12.0+cu113.html
poetry run maturin develop --release --features=python
poetry run python train.py --config=train.ron --checkpoint-dir=checkpoints

How it works

This guide will walk you through the steps required to create an AI for the Bevy snake game.

The first step is to add a new resource to the Bevy app which stores the AI player. The resource is defined in src/ai.rs and holds a Box<dyn Agent>:

pub struct Player(pub Box<dyn Agent>);

The Agent trait abstracts over different AI implementations provided by entity-gym-rs.

Depending on how the game is configured, we instantiate the Player resource in src/lib.rs as either a neural network loaded from a file or an agent that takes random actions. The resource is added with insert_non_send_resource because the boxed agent is not required to be Send:

        .insert_non_send_resource(match agent_path {
            Some(path) => Player(agent::load(path)),
            None => Player(agent::random()),
        })

The actual integration of the AI player with the game happens inside the snake_movement_agent system. This system runs on every tick, obtains actions from the AI, and applies them to the game. The first step is to construct an Obs structure which collects all the parts of the game state that are visible to the AI:

let obs = Obs::new(segments_res.len() as f32)
    .entities(food.iter().map(|(_, p)| Food { x: p.x, y: p.y }))
    .entities([head_pos].iter().map(|p| Head { x: p.x, y: p.y }))
    .entities(segment.iter().map(|(_, p)| SnakeSegment { x: p.x, y: p.y }));

The argument to Obs::new is the current score of the agent, which is the quantity that the training process will maximize. Since we want the agent to grow the snake as long as possible, we use the number of segments as the score: a snake with five segments yields a score of 5.0.

The entities method makes different entities visible to the AI. Its argument is an iterator of Featurizable items; Featurizable is a trait that converts structs into a representation the neural network can process. It can be derived automatically for enums with unit variants and most fixed-size structs:

#[derive(Featurizable)]
pub struct Head {
    x: i32,
    y: i32,
}

#[derive(Featurizable)]
pub struct SnakeSegment {
    x: i32,
    y: i32,
}

#[derive(Featurizable)]
pub struct Food {
    x: i32,
    y: i32,
}
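
To give an intuition for what the derive generates: each numeric field becomes one f32 feature, in declaration order (enums with unit variants are encoded categorically). The sketch below is purely illustrative; featurize is a hypothetical stand-in for the real generated code, which implements additional trait methods:

// Hand-written sketch of what #[derive(Featurizable)] amounts to for Head.
// The featurize function here is illustrative only, not part of entity-gym-rs.
struct Head {
    x: i32,
    y: i32,
}

fn featurize(h: &Head) -> Vec<f32> {
    // Each field is converted to one f32 feature, in declaration order.
    vec![h.x as f32, h.y as f32]
}

fn main() {
    assert_eq!(featurize(&Head { x: 3, y: 5 }), vec![3.0, 5.0]);
}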

With the observation constructed, we simply call the act method on the neural network to obtain an Option<Direction>:

let action = player.0.act::<Direction>(&obs);

The type we use for the action (here, Direction) must implement the Action trait. The Action trait can be derived automatically for any enum that consists only of unit variants:

#[derive(PartialEq, Copy, Clone, Debug, Action)]
enum Direction {
    Left,
    Up,
    Right,
    Down,
}
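
Roughly speaking, the derived Action impl numbers the unit variants so that the network can select one of them by discrete index. The mapping below is a hypothetical illustration of that idea, not the actual generated code:

// Hypothetical sketch of the variant-to-index mapping behind derive(Action),
// using the Direction enum defined above.
fn action_from_index(i: u64) -> Option<Direction> {
    match i {
        0 => Some(Direction::Left),
        1 => Some(Direction::Up),
        2 => Some(Direction::Right),
        3 => Some(Direction::Down),
        _ => None, // out of range: no valid action
    }
}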

Due to a limitation in the current implementation, the act method can return None, which indicates that we should exit the game. If the result is not None, we simply apply the action to the game the same way we would with human input:

match action {
    Some(dir) => {
        if dir != head.direction.opposite() {
            head.direction = dir;
        }
    }
    None => exit.send(AppExit),
}

Training

Training neural network agents requires a version of the game that can interface with Python and run many headless game instances in parallel.

Headless runner

If you look at src/lib.rs, you will see that the original main function has been split into three functions:

  • base_app defines all the systems which we want to run both in the game and in the headless runner.
  • run adds all the systems which we want when running the game normally, such as creating a window and handling user input.
  • run_headless, used during training, omits the window and user input, uses the MinimalPlugins plugin set, and sets up a run loop with 0 wait_duration (see the sketch below).
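
A minimal sketch of such a headless setup, assuming the Bevy APIs of the version used here (the real run_headless also wires up the Config, the TrainAgent, and a random seed):

use std::time::Duration;

use bevy::app::ScheduleRunnerSettings;
use bevy::prelude::*;

fn main() {
    App::new()
        // Tick the schedule in a tight loop with no waiting between frames.
        .insert_resource(ScheduleRunnerSettings::run_loop(Duration::ZERO))
        // MinimalPlugins: no window, no rendering, no input handling.
        .add_plugins(MinimalPlugins)
        .run();
}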

Another difference is that the original snake implementation used an event timer to spawn a food every second. This doesn't work when running without a fixed framerate, so we instead use a FoodTimer resource that keeps track of the number of ticks since the last food was spawned and spawns a new food on every 7th tick.
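
A sketch of this tick-counting approach (field and system names are illustrative; the actual code lives in src/lib.rs):

use bevy::prelude::*;

#[derive(Component)]
struct Food;

// Counts ticks since the last food spawn. On newer Bevy versions this would
// also need #[derive(Resource)].
#[derive(Default)]
struct FoodTimer(u32);

fn food_spawner(mut timer: ResMut<FoodTimer>, mut commands: Commands) {
    timer.0 += 1;
    if timer.0 % 7 == 0 {
        timer.0 = 0;
        // Position and sprite components elided for brevity.
        commands.spawn().insert(Food);
    }
}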

Python API

We use PyO3 to export the game as a Python module. There is currently an issue that causes long compile times when using PyO3 as a dependency. For this reason, we gate all the Python-specific code and the PyO3 dependency behind a "python" feature flag. We also need to build the crate as a cdylib:

[lib]
crate-type = ["cdylib", "rlib"]
name = "bevy_snake_enn"

[dependencies]
pyo3 = { version = "0.15", features = ["extension-module"], optional = true }

[features]
python = ["pyo3", "entity-gym-rs/python"]

All the code that is required to define the Python API is in src/python.rs. It defines a Config struct that allows us to pass in game settings from Python (not actually used for anything in this case).

#[derive(Clone)]
#[pyclass]
struct Config;

#[pymethods]
impl Config {
    #[new]
    fn new() -> Self {
        Config
    }
}
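
If the game did have settings worth passing in from Python, they could be exposed as fields on this class. A hypothetical variant (board_size is made up for illustration and does not exist in this example):

use pyo3::prelude::*;

// Hypothetical Config carrying an actual setting, readable and writable
// from Python via the generated getter/setter.
#[derive(Clone)]
#[pyclass]
struct Config {
    #[pyo3(get, set)]
    board_size: u32,
}

#[pymethods]
impl Config {
    #[new]
    fn new(board_size: u32) -> Self {
        Config { board_size }
    }
}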

The create_env function uses the TrainEnvBuilder to construct a PyVecEnv, which runs multiple instances of the game in parallel and is used directly by the Python training framework in train.py. The TrainEnvBuilder requires us to declare the types of all the entities and actions used in the game via the entity and action methods. When we pass the run_headless function to build, the TrainEnvBuilder spawns one thread per environment, each of which calls run_headless with a clone of the Config, a TrainAgent that connects the game to the Python training framework, and a random seed. The num_envs, threads, and first_env_index parameters are simply forwarded from Python and allow the training framework to control the number of worker threads and game instances.

#[pyfunction]
fn create_env(config: Config, num_envs: usize, threads: usize, first_env_index: u64) -> PyVecEnv {
    TrainEnvBuilder::default()
        .entity::<ai::Head>()
        .entity::<ai::SnakeSegment>()
        .entity::<ai::Food>()
        .action::<Direction>()
        .build(
            config,
            super::run_headless,
            num_envs,
            threads,
            first_env_index,
        )
}

Finally, the #[pymodule] macro constructs the Python module and registers the Config type and the create_env function.

#[pymodule]
fn bevy_snake_ai(_py: Python, m: &PyModule) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(create_env, m)?)?;
    m.add_class::<Config>()?;
    Ok(())
}

With this in place, we can run maturin develop --release --features=python to build and install the crate as a Python package, which is then imported by the training script.