Quick Start
This section describes the installation methods.
The library was tested on the following OS versions:
- Windows 11
- Ubuntu 22.04
- macOS 12 Monterey
Minimal working hardware parameters:
- CPU: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
- 8GB RAM
Python 3.9 or higher is required.
List of dependency versions can be found here: https://github.com/jbr-ai-labs/marlben/blob/main/requirements.txt
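The Python version requirement above can be checked before installing. The following is a minimal sketch that uses only the standard library; the helper name `meets_minimum_python` is illustrative and is not part of MARLBEN.

```python
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in this section


def meets_minimum_python(version_info=None, minimum=MIN_PYTHON):
    """Return True if a (major, minor, ...) version satisfies the minimum."""
    if version_info is None:
        # Default to the interpreter that is running this script.
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if meets_minimum_python():
        print("Python version is sufficient.")
    else:
        print("Python 3.9 or higher is required.")
```

Running the script with the interpreter of the target environment (for example, inside a fresh conda environment) shows whether that environment is suitable.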
To use GPU accelerators during training, specify the number of available GPUs in the config:
EnvConfig = get_config("Corridor")
EnvConfig.NUM_GPUS = 1
run_tune_experiment(EnvConfig(), 'Corridor', rllib_wrapper.PPOCustom)
Minimal working GPU:
- NVIDIA RTX 2060
To install from PyPI:
- [Optional] If you are using conda or another Python environment manager, you may create a separate environment first.
- Run
pip install marlben
in your terminal. Wait for the installation to complete.
- [Optional] Install the RLLib integration:
pip install marlben[rllib]
To install from source:
- [Optional] If you are using conda or another Python environment manager, you may create a separate environment first.
- Clone the repository:
git clone https://github.com/jbr-ai-labs/marlben
- Move to the root folder of the package:
cd marlben
- Run
pip install -r requirements.txt
in your terminal. Wait for the installation to complete.
- Run
pip install .
in your terminal. Wait for the installation to complete.
An RLLib integration example can be seen in train.py. To run train.py, the full installation is required:
pip install marlben[rllib]
If you encounter the following error:
undefined symbol: cublasLtHSHMatmulAlgoInit, version libcublasLt.so.11
run
pip3 uninstall nvidia_cublas_cu11
All environment classes inherit from the base Env class, which implements the ParallelEnv API from PettingZoo.
To use the wandb integration for logging agents' performance, create a wandb_api_key file in the project root folder that contains your wandb token.
- The wandb token can be found here: https://wandb.ai/authorize
- You will need to create a wandb account to use it: https://docs.wandb.ai/quickstart
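Creating the key file can also be scripted. The sketch below writes a token to a file named wandb_api_key in a given folder. The helper name `write_wandb_key` is hypothetical, and the exact file format MARLBEN expects (for instance, whether a trailing newline matters) is an assumption; here the token is written as plain text without surrounding whitespace.

```python
from pathlib import Path


def write_wandb_key(token, project_root="."):
    """Write the wandb token to <project_root>/wandb_api_key and return the path."""
    token = token.strip()
    if not token:
        raise ValueError("wandb token is empty")
    key_path = Path(project_root) / "wandb_api_key"
    # Assumption: the file holds only the token, as plain text.
    key_path.write_text(token, encoding="utf-8")
    return key_path
```

For example, `write_wandb_key(os.environ["WANDB_API_KEY"])` would copy a token from the environment into the current folder. Since the file contains a secret, it is usually best kept out of version control.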
All environments are compatible with the OpenAI Gym interface. Hence, you may create an environment instance using the gym.make() method. This is a simple way to get started with MARLBEN, as it does not require you to specify any additional parameters for the environment.
A simple example:
import gymnasium as gym
import marlben
env = gym.make("MARLBEN-BossFight-v1")
Alternatively, you may create an environment instance by directly calling the constructor of the corresponding class. This way, you can create an environment with a customized configuration. Use this method if you want to simplify a task or make it harder to solve.
A simple example:
from marlben.envs import BossFight, BossFightConfig
env = BossFight(BossFightConfig())
For the full list of available environments, their descriptions, and additional configuration suggestions, please refer to the list of environments page.
The basic API of all environments implements the OpenAI Gym environment API:
- env.reset() - resets the environment to its initial state. Returns a dictionary that maps an agent ID to its observation, a reward and a done flag for each agent.
- env.step(action) - takes as an argument a dictionary that maps an agent ID to its action.
Note that MARLBEN environments do not support the default env.render() method. If you want to visualize an environment, see the "Rendering" section below.
An observation for a single agent is a dictionary that contains information about entities ("Entity" key) and about the visible part of the map ("Tile" key).
For entities, a number of continuous ("Continuous" key) and discrete ("Discrete" key) features are available. By default, up to 100 agents may be represented, with the current agent always listed first.
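Because the current agent always comes first, its own features can be separated from those of the other visible entities by position. The following is a minimal sketch; the helper name `split_self_and_others` is illustrative, and it assumes the entity features are an indexable sequence with one row per entity.

```python
def split_self_and_others(entity_rows):
    """Split per-entity feature rows into (current agent's row, remaining rows)."""
    if len(entity_rows) == 0:
        raise ValueError("observation contains no entity rows")
    # The first row belongs to the observing agent; the rest are other entities.
    return entity_rows[0], entity_rows[1:]
```

Assuming the nesting of keys described above, a call might look like `me, others = split_self_and_others(obs["Entity"]["Continuous"])`, where `obs` is the observation of a single agent.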
For tiles, a number of continuous ("Continuous" key) and discrete ("Discrete" key) features are also available. For simplicity, the visible part of the map is flattened to a vector and has a dimensionality of n_tiles X n_features. Feel free to reshape it back if you need to.
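Reshaping the flattened tile features back into a grid can be sketched as follows. The helper name `unflatten_tiles` is hypothetical. The sketch assumes that tiles are stored in row-major order and that the caller knows the height and width of the visible window; neither is stated in this section.

```python
def unflatten_tiles(flat_tiles, height, width):
    """Regroup n_tiles feature rows into `height` rows of `width` tiles each."""
    if len(flat_tiles) != height * width:
        raise ValueError(
            f"expected {height * width} tiles, got {len(flat_tiles)}"
        )
    # Assumption: row-major order, so each consecutive slice is one map row.
    return [flat_tiles[r * width:(r + 1) * width] for r in range(height)]
```

If the features are stored in a NumPy array of shape (n_tiles, n_features), `flat_tiles.reshape(height, width, -1)` produces the equivalent layout under the same ordering assumption.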
The simplest way to get specific information from an observation is to use the marlben.scripting.Observation class, which lets you extract the required information from the observation dictionary using the Observation.attribute method and the fields of the marlben.io.stimulus.Serialized.Entity and marlben.io.stimulus.Serialized.Tile classes.
Depending on the environment type, different subsets of actions may be available. Please refer to the list of environments for more information.
In MARLBEN, each agent's action is represented as a dictionary. In this dictionary, an agent may declare multiple types of actions:
- marlben.io.action.Move - allows an agent to move within the map. Enabled for all environments by default.
- marlben.io.action.Attack - allows an agent to attack another entity with a specified attack type. Requires the Combat system to be enabled.
- marlben.io.action.Build - allows an agent to build an impassable rock at its previous position. Requires the Building system to be enabled.
- marlben.io.action.Plant - allows an agent to plant a Food resource by spending a Water resource. Requires the Planting system to be enabled.
- marlben.io.action.Share - allows an agent to share a given amount of a specified resource with another entity. Requires the Sharing system to be enabled.
Note that an agent does not need to perform every available action type at each turn, so some of these actions may be absent from the action dictionary. For each desired action type, the agent must specify a dictionary of that action type's parameters. For example, the Move action requires a Direction to be specified.
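An example action dictionary for a single agent:
from marlben.io import action

agent_id = 1  # placeholder: replace with the ID of the acting agent

env.step(
    {
        agent_id: {
            # Move in the north direction.
            action.Move: {action.Direction: action.North.index},
            # Ranged attack on the entity given by action.Target.
            action.Attack: {action.Style: action.Range.index, action.Target: 1},
        }
    }
)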
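Since action types may be left out of the dictionary, it can be convenient to assemble an agent's action from optional parts. The following is a minimal, library-independent sketch; `compose_action` is a hypothetical helper and not part of MARLBEN. It takes pairs of an action type and its parameter dictionary, and skips any pair whose parameters are None.

```python
def compose_action(candidates):
    """Build one agent's action dictionary from (action_type, params) pairs.

    Pairs whose params are None are skipped, so that action type is simply
    absent from the resulting dictionary.
    """
    return {
        action_type: dict(params)
        for action_type, params in candidates
        if params is not None
    }


# With MARLBEN installed, the keys would be the action classes, e.g.:
#   compose_action([
#       (action.Move, {action.Direction: action.North.index}),
#       (action.Attack, None),  # no attack this turn
#   ])
```

The result for each agent can then be placed under that agent's ID in the dictionary passed to env.step().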
Heuristic baselines are implemented for most of the provided environments. These baselines may be used to measure the efficiency of your own agents.
For the Boss Fight and Raid environments, it is highly recommended to use BossFightTankAgent, BossRaidFighterAgent and BossRaidHealerAgent as scripted baselines. The definitions of these scripted agents can be found here, and an example of their usage can be found here.
For the Gathering, Exploring, Spying and Colors environments, it is highly recommended to use ObscuredAndExclusiveGatheringAgent, which is defined here. An example of its usage can be found here.
You can also use the basic NMMO scripted agents. However, these agents may provide less solid baseline solutions for the environments listed above because of their general-purpose policies.