# Predator-Prey-Grass MARL

A Predator-Prey-Grass multi-agent gridworld environment featuring dynamic spawning and deletion of agents and partial observability.

https://github.com/doesburg11/PredPreyGrass



## Environment dynamics
The learning agents, Predators (red) and Prey (blue), move in sequence; each move costs energy, which they replenish by eating. Prey eat Grass (green), and Predators eat Prey when they end up on the same grid cell. An agent obtains all the energy of the resource it eats. Predators die of starvation when their energy runs out; Prey die either of starvation or by being eaten by a Predator. Both types of learning agent reproduce asexually when their energy level, raised by eating, exceeds a certain threshold. In the base configuration, newly created agents are placed at random positions anywhere in the gridworld. Learning agents learn to move based on their partial observations (transparent red and blue squares) of the environment.
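The energy rules above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code used in the repository: the `Agent` class, the `step_energy` function, and the values of `MOVE_COST` and `REPRODUCTION_THRESHOLD` are all hypothetical, and what happens to the parent's energy after reproduction is not modeled.

```python
from dataclasses import dataclass

# Hypothetical values, chosen only for illustration.
MOVE_COST = 0.5
REPRODUCTION_THRESHOLD = 10.0


@dataclass
class Agent:
    kind: str          # "predator" or "prey"
    energy: float
    alive: bool = True


def step_energy(agent, eaten_energy=0.0):
    """Apply one move: pay the movement cost and gain all energy of what was eaten.

    Returns True if the agent is alive and its energy exceeds the
    reproduction threshold after the move.
    """
    agent.energy -= MOVE_COST
    agent.energy += eaten_energy
    if agent.energy <= 0:
        agent.alive = False  # starvation
    return agent.alive and agent.energy > REPRODUCTION_THRESHOLD


predator = Agent(kind="predator", energy=5.0)
# The predator lands on a prey holding 6.0 energy and receives all of it.
print(step_energy(predator, eaten_energy=6.0), predator.energy)
```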

## Step 1: Clone the PredPreyGrass repository from GitHub

In [1]:
!git clone https://github.com/doesburg11/PredPreyGrass.git # > /dev/null 2>&1
%cd PredPreyGrass

Cloning into 'PredPreyGrass'...
remote: Enumerating objects: 24266, done.
remote: Counting objects: 100% (555/555), done.
remote: Compressing objects: 100% (41/41), done.
remote: Total 24266 (delta 515), reused 529 (delta 506), pack-reused 23711 (from 3)
Receiving objects: 100% (24266/24266), 989.34 MiB | 23.97 MiB/s, done.
Resolving deltas: 100% (14579/14579), done.
Updating files: 100% (7664/7664), done.
/content/PredPreyGrass


## Step 2: Install the PredPreyGrass package

In [2]:
!pip install -e . # > /dev/null 2>&1

Obtaining file:///content/PredPreyGrass
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Preparing editable metadata (pyproject.toml) ... done
Collecting pettingzoo==1.24.3 (from predpreygrass==0.1.0)
  Downloading pettingzoo-1.24.3-py3-none-any.whl.metadata (8.5 kB)
Collecting stable-baselines3==2.6.0 (from stable-baselines3[extra]==2.6.0->predpreygrass==0.1.0)
  Downloading stable_baselines3-2.6.0-py3-none-any.whl.metadata (4.8 kB)
Collecting SuperSuit==3.9.3 (from predpreygrass==0.1.0)
  Downloading SuperSuit-3.9.3-py3-none-any.whl.metadata (3.2 kB)
Collecting ray==2.44.1 (from ray[rllib]==2.44.1->predpreygrass==0.1.0)
  Downloading ray-2.44.1-cp311-cp311-manylinux2014_x86_64.whl.metadata (19 kB)
Collecting tensorboard==2.19.0 (from predpreygrass==0.1.0)
  Downloading tensorboard-2.19.0-py3-none-any.whl.metadata (1.8 kB)
Collecting 

## Predator-Prey-Grass MARL with SB3 PPO
The MARL environment is implemented using PettingZoo, and the agents are trained using Stable Baselines3 (SB3) PPO. Essentially this solution demonstrates how SB3 can be adapted for MARL using parallel environments and centralized training.
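As a rough picture of what centralized training means here, the toy sketch below feeds the observations of all agents to a single policy and collects one action per agent. It deliberately uses no SB3, PettingZoo, or SuperSuit calls: `random_shared_policy` and `act_for_all_agents` are hypothetical names, the policy is a random placeholder, and `N_ACTIONS = 25` is simply taken from the length of the motion range printed in Step 3.

```python
import random

N_ACTIONS = 25  # assumption: matches len(motion_range) reported in Step 3


def random_shared_policy(observation):
    """Placeholder for one policy shared by every agent (here: random actions)."""
    return random.randrange(N_ACTIONS)


def act_for_all_agents(observations_by_agent, policy):
    """Treat each agent as one slot in a batch and act with the same policy."""
    agent_ids = sorted(observations_by_agent)
    batch = [observations_by_agent[agent_id] for agent_id in agent_ids]
    actions = [policy(observation) for observation in batch]
    return dict(zip(agent_ids, actions))


# Hypothetical, tiny observations; real observations are grid patches.
observations = {"predator_0": [1, 0], "prey_0": [0, 1], "prey_1": [1, 1]}
print(act_for_all_agents(observations, random_shared_policy))
```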

## Step 3: Evaluate the environment with a random policy

The configuration of the environment can be adjusted in:  /content/PredPreyGrass/predpreygrass/single_objective/config/config_predpreygrass.py

In [None]:
%env XDG_RUNTIME_DIR=/tmp/runtime-dir
!mkdir -p /tmp/runtime-dir
!python src/predpreygrass/pettingzoo/eval/evaluate_random_policy.py

error: XDG_RUNTIME_DIR not set in the environment.
self.motion_range: [[-2, -2], [-2, -1], [-2, 0], [-2, 1], [-2, 2], [-1, -2], [-1, -1], [-1, 0], [-1, 1], [-1, 2], [0, -2], [0, -1], [0, 0], [0, 1], [0, 2], [1, -2], [1, -1], [1, 0], [1, 1], [1, 2], [2, -2], [2, -1], [2, 0], [2, 1], [2, 2]]
len(self.motion_range): 25
self.action_range: 5
seed: 42 set in predpreygrass_aec.py seed
Seed set to 42
agent: prey_44, reward: 10.0
agent: prey_40, reward: 10.0
agent: prey_42, reward: 10.0
agent: prey_40, reward: 10.0
agent: prey_97, reward: 10.0
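The `motion_range` printed above is consistent with a square grid of moves: an `action_range` of 5 gives offsets from -2 to 2 on each axis, so 5 x 5 = 25 possible moves, including staying put at `[0, 0]`. The sketch below reproduces that list; `build_motion_range` is a hypothetical helper, not necessarily how the repository builds it.

```python
def build_motion_range(action_range):
    """Return all [dx, dy] offsets in a square of side `action_range` centered on [0, 0]."""
    half = (action_range - 1) // 2
    return [
        [dx, dy]
        for dx in range(-half, half + 1)
        for dy in range(-half, half + 1)
    ]


motion_range = build_motion_range(5)
print(len(motion_range))  # 25
print(motion_range[0], motion_range[12], motion_range[-1])  # [-2, -2] [0, 0] [2, 2]
```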


## Step 4: Train model and save to file

In [None]:
!python src/predpreygrass/pettingzoo/train/train_sb3_ppo_parallel_wrapped_aec_env.py

## Step 5: Evaluate the trained model from file

The trained model is now saved to a zip file on Colab, along with the file structure and configuration. To evaluate this model and the associated configuration, manually fill in the correct timestamp below.

In [None]:
# adjust time stamp accordingly
timestamp="2025-03-16_18:38:46"
evaluation_script = "/content/PredPreyGrass/output/"+timestamp+"/eval/evaluate_ppo_from_file_aec_env.py"

In [None]:
!python {evaluation_script}

python3: can't open file '/content/PredPreyGrass/{evaluation_script}': [Errno 2] No such file or directory
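Typing the timestamp by hand is easy to get wrong. The sketch below picks the most recent run automatically, assuming that every subdirectory of the output folder is named with a timestamp in the `YYYY-MM-DD_HH:MM:SS` format shown above, so that the lexicographically largest name is the latest run. `latest_run_timestamp` is a hypothetical helper and is not part of the repository.

```python
from pathlib import Path


def latest_run_timestamp(output_root):
    """Return the name of the most recent timestamped run directory.

    Assumes all subdirectories are named YYYY-MM-DD_HH:MM:SS, which sorts
    chronologically when compared as plain strings.
    """
    run_names = [path.name for path in Path(output_root).iterdir() if path.is_dir()]
    if not run_names:
        raise FileNotFoundError(f"No run directories found in {output_root}")
    return max(run_names)
```

In the notebook this could replace the manual assignment, for example with `timestamp = latest_run_timestamp("/content/PredPreyGrass/output")`, after which `evaluation_script` is built as before.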


## Step 6: Save project to Google Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
import shutil

# Source directory in Colab
source_dir = '/content/PredPreyGrass'

# Destination directory in Google Drive
destination_dir = '/content/drive/My Drive/PredPreyGrass'

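# Copy the directory; dirs_exist_ok=True (Python 3.8+) lets this cell be
# re-run without raising FileExistsError when the destination already exists
shutil.copytree(source_dir, destination_dir, dirs_exist_ok=True)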

print(f"Directory copied to {destination_dir}")


Directory copied to /content/drive/My Drive/PredPreyGrass


## Step 7: Display population developments per episode

The population developments per episode are saved to Google Drive as PDF files:

/content/drive/MyDrive/PredPreyGrass/output/{timestamp}/output/population_charts/PredPreyPopulation_episode_{episode number}.pdf
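For a given run and episode, the path can be assembled as in the sketch below. `population_chart_path` is a hypothetical helper, and it assumes the episode number appears in the file name as a plain integer without padding.

```python
def population_chart_path(timestamp, episode,
                          drive_root="/content/drive/MyDrive/PredPreyGrass"):
    """Build the path of the population chart PDF for one episode of one run."""
    return (
        f"{drive_root}/output/{timestamp}/output/population_charts/"
        f"PredPreyPopulation_episode_{episode}.pdf"
    )


print(population_chart_path("2025-03-16_18:38:46", 0))
```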

## The RLlib solution with decentralized training

In [3]:
!python src/predpreygrass/rllib/v4_select_coef_HBP/train_rllib_ppo_multiagentenv.py

E0000 00:00:1744579110.754545    2346 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1744579110.819649    2346 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-04-13 21:18:50,440	INFO worker.py:1852 -- Started a local Ray instance.
 === Start a new experiment === 
Saved config to: /root/Dropbox/02_marl_results/predpreygrass_results/ray_results/PPO_2025-04-13_21-18-52/PPO_PredPreyGrass_00000/run_config.json
╭────────────────────────────────────────────────────────────╮
│ Configuration for experiment     PPO_2025-04-13_21-18-52   │
├────────────────────────────────────────────────────────────┤
│ Search algorithm                 BasicVariantGenerator     │
│ Scheduler                        FIFOScheduler             │
│ Number of trials                 1                         │
╰─────────────