GitHub - PBarde/moma-ppo: Rough (uncleaned) code for the paper "A Model-Based Solution to the Offline Multi-Agent Reinforcement Learning Coordination Problem". Aimed at implementation transparency and reproducibility.

Disclaimer

This repository is not meant to be a clean, off-the-shelf codebase. Rather, we share the (rough) code we used for the paper "A Model-Based Solution to the Offline Multi-Agent Reinforcement Learning Coordination Problem", in which we present the MOMA-PPO algorithm. The aim is transparency and reproducibility: the code is provided as is and will not be supported, extended, or cleaned by us. The goal is to let others look at our implementation and possibly reuse parts of it in their own codebases. If you still want to install the code and its associated libraries, follow the instructions in the Installation section.

Get the datasets

Download the reacher datasets from https://huggingface.co/datasets/pbarde1/moma-ppo/tree/main
The other datasets will be automatically downloaded from D4RL servers.

Get pretrained world-models

You can download the world-models we trained and used in the paper at https://huggingface.co/pbarde1/moma-ppo/tree/main

Organise the directory structure

offline_marl/
├── alfred_omarl
│   ├── alfred
│   └── alfred.egg-info
├── moma-ppo
│   ├── dzsc
│   └── offline_marl
└── scratch
    ├── ma_d4rl_generated_datasets
    └── world_models
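
If you want to double-check the layout before running anything, here is a minimal sketch. The helper names (`missing_dirs`, `make_scratch`) are made up for this example and are not part of the repository; it only assumes the folder names shown in the tree above.

```python
from pathlib import Path

# Sub-directories taken from the tree above, relative to the offline_marl/ root.
LAYOUT = [
    "alfred_omarl/alfred",
    "moma-ppo/dzsc",
    "moma-ppo/offline_marl",
    "scratch/ma_d4rl_generated_datasets",
    "scratch/world_models",
]


def missing_dirs(root):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [d for d in LAYOUT if not (root / d).is_dir()]


def make_scratch(root):
    """Create only the scratch/ folders; the code folders come from the repo."""
    for d in LAYOUT:
        if d.startswith("scratch/"):
            (Path(root) / d).mkdir(parents=True, exist_ok=True)
```

Calling `make_scratch("offline_marl")` and then `missing_dirs("offline_marl")` should list only the code folders that still need to be put in place.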

Update pointers to correct paths

In dzsc/ma_d4rl/generated_datasets/dataset_path.py

ROOT = "path_to_scratch/ma_d4rl_generated_datasets"

In offline_marl/load_worldmodel.py l.39

code_root = "path_to_scratch"
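
To avoid typos when filling in these two values, you can derive both from a single scratch path. This is just a sketch: `scratch_pointers` is a hypothetical helper, and you still have to paste the printed values into the two files by hand.

```python
from pathlib import Path


def scratch_pointers(scratch_dir):
    """Derive the two values to paste into the files named above."""
    scratch = Path(scratch_dir).expanduser().resolve()
    return {
        # goes into dzsc/ma_d4rl/generated_datasets/dataset_path.py
        "ROOT": str(scratch / "ma_d4rl_generated_datasets"),
        # goes into offline_marl/load_worldmodel.py
        "code_root": str(scratch),
    }


if __name__ == "__main__":
    # "offline_marl/scratch" is a placeholder; use your own scratch location.
    for name, value in scratch_pointers("offline_marl/scratch").items():
        print(f'{name} = "{value}"')
```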

Run experiments

Go under moma-ppo/offline_marl/offline_marl to run experiments.
train_worldmodel.py to train world models.
train_agents.py to train IQL, MAIQL, TD3+BC, CQL, or OMAR. For instance, for MAIQL on the Two-Agent Reacher leader-only task:

python train_agents.py --alg_name ma-iql --task_name reacher-expert-mix-v0_2x1first_0

train_rl_model_based.py to train MOMA-PPO:

python train_rl_model_based.py --task_name reacher-expert-mix-v0_2x1first_0

train_rl.py to train purely RL algos like MA-PPO (to generate expert datasets for instance).
train_model_based_agent.py is like train_agents.py but uses the world models to generate additional data and trains the agents on a mixture of true and synthetic data.
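
If you want to queue several runs from Python, a rough sketch could look like the following. It only uses the two flags and the task name shown in the commands above; `build_command` is a hypothetical helper, and the actual launch is left commented out.

```python
import shlex
import sys


def build_command(script, task_name, alg_name=None):
    """Build the argument list for one training run."""
    cmd = [sys.executable, script]
    if alg_name is not None:
        cmd += ["--alg_name", alg_name]
    cmd += ["--task_name", task_name]
    return cmd


TASK = "reacher-expert-mix-v0_2x1first_0"
commands = [
    build_command("train_agents.py", TASK, alg_name="ma-iql"),
    build_command("train_rl_model_based.py", TASK),
]

for cmd in commands:
    print(shlex.join(cmd))
    # To actually launch the run (from moma-ppo/offline_marl/offline_marl):
    # import subprocess; subprocess.run(cmd, check=True)
```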

Licence

https://creativecommons.org/licenses/by-nc/4.0/

Installation

As you are about to find out, the installation is a real headache (due to dependencies on mujoco, d4rl, dm_control, etc.).

If you still want to go through it, you can find in the repository the outputs of the following commands:

pip freeze > pip_freeze.txt
conda list -e > conda_list.txt
conda env export --no-builds | grep -v "prefix" > environment.yml

You can try and see if you are lucky (I doubt it) with:

conda env create -n omarl -f environment.yml
conda activate omarl

Manual installation

Below are the steps I followed to recreate a conda env that was able to run the code.

Set up initial environment

conda create -y -n omarl python=3.8.12 pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia -c conda-forge

conda activate omarl

conda install numpy==1.22.1
conda install gym==0.21.0

Install MuJoCo
- Download the MuJoCo version 2.1 binaries for Linux or OSX.
- Extract the downloaded mujoco210 directory into ~/.mujoco/mujoco210. If you want to specify a nonstandard location for the package, use the env variable MUJOCO_PY_MUJOCO_PATH.
- Add export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:path_to_mujoco/mujoco210/bin to .bashrc
- Install the remaining dependencies:
```
conda install -c conda-forge glew
conda install -c conda-forge mesalib
conda install -c menpo glfw3
pip install pip==24.0
pip install mujoco-py==2.1.2.14
```
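
Before going further, it can help to check that the MuJoCo folder and LD_LIBRARY_PATH are set as described above. The sketch below is not part of the repository; `check_mujoco` is a hypothetical helper that only looks at the directory and the environment variable, it does not test that mujoco_py actually builds.

```python
import os
from pathlib import Path


def check_mujoco(env=None, home=None):
    """Return a list of problems; an empty list means the basics look right."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    # MUJOCO_PY_MUJOCO_PATH overrides the default location, as noted above.
    mujoco_dir = Path(env.get("MUJOCO_PY_MUJOCO_PATH", home / ".mujoco" / "mujoco210"))
    problems = []
    if not mujoco_dir.is_dir():
        problems.append(f"MuJoCo directory not found: {mujoco_dir}")
    bin_dir = str(mujoco_dir / "bin")
    raw = env.get("LD_LIBRARY_PATH", "")
    ld_paths = [p.rstrip("/") for p in raw.split(os.pathsep) if p]
    if bin_dir not in ld_paths:
        problems.append(f"{bin_dir} is not on LD_LIBRARY_PATH")
    return problems


if __name__ == "__main__":
    for problem in check_mujoco():
        print("WARNING:", problem)
```
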
Install D4RL
- git clone git@github.com:rail-berkeley/d4rl.git
- cd d4rl
- edit setup.py file: from dm_control >= 1.0.3 to dm_control @ git+https://github.com/deepmind/dm_control@4f1a9944bf74066b1ffe982632f20e6c687d45f1
- edit setup.py file: from mjrl @ git+git://github.com/aravindr93/mjrl@master#egg=mjrl to mjrl @ git+https://github.com/aravindr93/mjrl@master#egg=mjrl
```
pip install -e .
pip install dm_control
pip install -e .
pip install "Cython<3"
export CPATH=$CONDA_PREFIX/include
pip install patchelf
```
Then building mujoco_py by importing it in Python should go through.
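
The two setup.py edits listed above can also be scripted. This is only a sketch: `patch_setup` is a hypothetical helper, and the exact spelling of the requirement strings in your d4rl checkout is an assumption (the regex tolerates optional spaces around `>=`), so check the result before installing.

```python
import re

# Pinned dm_control commit, copied from the instructions above.
DM_CONTROL_PIN = (
    "dm_control @ git+https://github.com/deepmind/dm_control"
    "@4f1a9944bf74066b1ffe982632f20e6c687d45f1"
)


def patch_setup(text):
    """Apply the two dependency edits to the contents of d4rl's setup.py."""
    # Replace the version requirement with the pinned commit.
    text = re.sub(r"dm_control\s*>=\s*1\.0\.3", DM_CONTROL_PIN, text)
    # Switch the mjrl dependency from git:// to https://.
    return text.replace(
        "git+git://github.com/aravindr93/mjrl",
        "git+https://github.com/aravindr93/mjrl",
    )
```

From inside the d4rl folder it could be used as `Path("setup.py").write_text(patch_setup(Path("setup.py").read_text()))` (with `from pathlib import Path`). Running it twice is harmless since neither pattern matches the patched text.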

Install wandb

pip install wandb
wandb login

You will have to modify the wandb.init line for your own project and entity. Look for:

wandb.init(id=self.config.wandb_run_id, project='mila_omarl', reinit=True, resume="allow", entity='paul-b-barde', config=self.config)
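
One way to avoid hard-coding your project and entity is to read them from the WANDB_PROJECT and WANDB_ENTITY environment variables. The sketch below is not what the repository does; `wandb_init_kwargs` is a hypothetical helper that just rebuilds the keyword arguments of the line above.

```python
import os


def wandb_init_kwargs(run_id, config, env=None):
    """Build the keyword arguments for wandb.init from environment variables."""
    env = os.environ if env is None else env
    return dict(
        id=run_id,
        # Falls back to the project name used in the original line.
        project=env.get("WANDB_PROJECT", "mila_omarl"),
        reinit=True,
        resume="allow",
        # None lets wandb use the default entity of the logged-in account.
        entity=env.get("WANDB_ENTITY"),
        config=config,
    )
```

The original call would then become `wandb.init(**wandb_init_kwargs(self.config.wandb_run_id, self.config))`.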

Install stuff so wandb can record videos

pip install moviepy imageio

Install alfred_omarl (an open source library to monitor experiments that we modified) by following its README. We recommend installing it in editable mode:

pip install -e .

Install submitit

pip install submitit

Misc
```
pip install nop
pip install readchar
```
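
Once everything is installed, a quick way to see what is still missing is to look up each package without importing it. This is a sketch under assumptions: the import names in `REQUIRED` are guessed from the package names above (in particular `alfred` and `nop`), and finding a module does not guarantee that it imports cleanly (mujoco_py still has to build on first import).

```python
from importlib.util import find_spec

# Import names assumed from the packages installed in the steps above.
REQUIRED = [
    "torch", "numpy", "gym", "mujoco_py", "d4rl", "dm_control",
    "wandb", "moviepy", "imageio", "submitit", "nop", "readchar", "alfred",
]


def missing_modules(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("missing:", ", ".join(missing) if missing else "none")
```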

Note

Note that for me, after this install, env.render('rgb_array') does not work (probably a problem with the MuJoCo installation), so set --render_gif to False when training.
