Code for the Diffusion Soft Policy Iteration with Complete Division (Dspic) paper, accepted as a regular paper at ICML 2026.
Towards Complete Multi-Agent Coordination Policy Learning via Denoising Maximum Entropy Optimization
Learning curves are available in paper_plots.
Make sure your Python version is >= 3.11, then install our repository's dependencies with:
pip install -r requirement.txt

To install SMAC, SMACv2, or LBF, please follow each environment's official installation instructions. To install MaMuJoCo, please follow the instructions at https://github.com/openai/mujoco-py, https://www.roboti.us/, and https://github.com/deepmind/mujoco to download the version of MuJoCo you need (mujoco210 is suggested).
Then create the directory with mkdir ~/.mujoco, move the downloaded .tar.gz or .zip archive into ~/.mujoco, and extract it there (unzip zipname for a .zip archive, or tar -xzf for a .tar.gz archive). Finally, add the library path to ~/.bashrc with

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/<user>/.mujoco/<folder-name>/bin

After installation is finished, activate the conda environment and run the code with
python examples/train.py

You can modify the algorithm and environment parameters in src/configs; our paper also lists the parameters we used.
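Before launching training, the installation steps above can be sanity-checked with a small script. The following is only a minimal sketch and is not part of this repository: the helper names python_ok and mujoco_bin_on_path are hypothetical, and the check assumes the ~/.mujoco/<folder-name>/bin layout described in the MuJoCo instructions.

```python
import os
import sys
from pathlib import Path

MIN_PYTHON = (3, 11)  # minimum Python version stated in this README


def python_ok(version_info=sys.version_info):
    """Return True if the given version meets the README's minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def mujoco_bin_on_path(ld_library_path, mujoco_root):
    """Return True if some LD_LIBRARY_PATH entry looks like <mujoco_root>/<folder-name>/bin.

    Assumption: MuJoCo was extracted one level below mujoco_root, as in
    the installation steps above. Nothing is checked on disk.
    """
    root = Path(mujoco_root).expanduser()
    for entry in ld_library_path.split(":"):
        if not entry:
            continue
        candidate = Path(entry).expanduser()
        if candidate.name == "bin" and candidate.parent.parent == root:
            return True
    return False


if __name__ == "__main__":
    print("Python >= 3.11:", python_ok())
    print(
        "MuJoCo bin on LD_LIBRARY_PATH:",
        mujoco_bin_on_path(os.environ.get("LD_LIBRARY_PATH", ""), "~/.mujoco"),
    )
```

The script only inspects the interpreter version and the environment variable; it does not verify that MuJoCo itself loads, so a passing check is a necessary condition rather than a guarantee.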
You can choose the algorithm to run (currently only dspic is supported), the test environment, and the experiment name by running, for example,

python examples/train.py --algo dspic --env smac/smacv2/mamujoco/lbf --exp_name test1

where --env takes one of smac, smacv2, mamujoco, or lbf.

Portions of the project are adapted from other repositories:
- https://github.com/PKU-MARL/HARL is licensed under MIT,
- https://github.com/ALRhub/DIME is licensed under MIT.
If you have any questions, you can ask them on GitHub or send an email to ghli04@smail.nju.edu.cn. (Sending emails is recommended🤗.)