This is the implementation of the ICLR 2023 paper "Discovering Generalizable Multi-agent Coordination Skills from Multi-task Offline Data".
Set up StarCraft II and SMAC:
bash install_sc2.sh
This will download SC2.4.10 into the 3rdparty folder and copy the maps needed for the experiments. You may also need to persist the environment variable SC2PATH (e.g., by appending this command to your .bashrc):
export SC2PATH=[Your SC2 folder like /abc/xyz/3rdparty/StarCraftII]
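For example (a minimal sketch; replace the path with your own installation location):
echo 'export SC2PATH=/abc/xyz/3rdparty/StarCraftII' >> ~/.bashrc
source ~/.bashrc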
Install Python environment with conda:
conda create -n odis python=3.10 -y
conda activate odis
pip install -r requirements.txt
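You can sanity-check the environment afterwards; assuming requirements.txt pulls in PyTorch (which this codebase depends on), the import should succeed:
python -c "import torch; print(torch.__version__)"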
We extend the original SMAC package with additional maps for multi-task evaluation. Here is a simple script that applies the modifications to smac and copies the additional maps into the StarCraft II installation. Please make sure that you have set SC2PATH correctly.
git clone https://github.com/oxwhirl/smac.git
pip install -e smac/
bash install_smac_patch.sh
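To confirm that SMAC can locate your StarCraft II installation, you can launch and close a throwaway environment (a quick sanity check; the 3m map here is just an example):
python -c "from smac.env import StarCraft2Env; env = StarCraft2Env(map_name='3m'); env.reset(); env.close()"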
You can execute the following command to run ODIS with a toy task config, which will perform training on a small batch of data:
python src/main.py --mto --config=odis --env-config=sc2_offline --task-config=toy --seed=1
The --task-config flag accepts any existing config name in the src/config/tasks/ directory, and any other config item named xx can be passed with --xx=value.
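For example, you could shorten the toy run by overriding the training horizon (t_max is the same option used in the cooperative navigation commands below):
python src/main.py --mto --config=odis --env-config=sc2_offline --task-config=toy --t_max=30000 --seed=1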
As the full dataset is large, the default codebase only includes a toy task config with 3m medium data in the dataset folder. We therefore provide the full dataset at this Google Drive URL, with which you can substitute the original data. After placing the full dataset in the dataset folder, you can run experiments on our pre-defined task sets, for example:
python src/main.py --mto --config=odis --env-config=sc2_offline --task-config=marine-hard-expert --seed=1
All results will be stored in the results folder, where you can find the console output, configs, and TensorBoard logs in the corresponding directory.
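For instance, you can inspect the training curves with TensorBoard (assuming the standard tensorboard CLI is available in your Python environment):
tensorboard --logdir results/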
Alternatively, you can use the following command to run baselines, including BC-t, BC-r, UPDeT-m, and UPDeT-l:
python src/main.py --baseline_run --config=updet-m --env-config=sc2_offline --task-config=toy --seed=1
# config=[updet-m/updet-l/bc-t/bc-r]
We also provide datasets for the cooperative navigation tasks. Similarly, you can run ODIS with the following command:
python src/main.py --mto --config=odis --env-config=cn_offline --task-config=cn-expert --entity_embed_dim=64 --t_max=30000 --seed=1
Use the following command to run the baselines:
python src/main.py --baseline_run --config=bc-t --env-config=cn_offline --task-config=cn-expert --entity_embed_dim=64 --t_max=30000 --seed=1
# config=[updet-m/updet-l/bc-t/bc-r]
We also provide our scripts for collecting data in the style of the ODIS datasets. You can run the following command to collect data:
python src/main.py --data_collect --config=qmix --env-config=sc2_collect --offline_data_quality=expert --num_episodes_collected=2000 --map_name=5m_vs_6m --save_replay_buffer=False
To collect medium data, you should specify a stop_winrate; the policy will begin collecting data once its test win rate reaches this threshold.
python src/main.py --data_collect --config=qmix --env-config=sc2_collect --offline_data_quality=medium --num_episodes_collected=2000 --map_name=5m_vs_6m --save_replay_buffer=False --stop_winrate=0.5
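To collect several datasets in one go, you can loop over maps in a small shell script (a sketch; the map names are the ones used in this README, and only the flags shown above are used):
for map in 3m 5m_vs_6m; do
    python src/main.py --data_collect --config=qmix --env-config=sc2_collect \
        --offline_data_quality=expert --num_episodes_collected=2000 \
        --map_name=$map --save_replay_buffer=False
done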
Code licensed under the Apache License v2.0.