- Known dependencies: Python (3.5.4), OpenAI gym (0.10.5), tensorflow (1.14.0), numpy (1.18.2)
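Assuming these correspond to the standard PyPI packages, the pinned versions can be installed in a Python 3.5 environment with, for example:

```
pip install gym==0.10.5 tensorflow==1.14.0 numpy==1.18.2
```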
-
- `--scenario`: defines which environment in the MPE is to be used (default: `"cn"`)
- `--max-episode-len`: maximum length of each episode for the environment (default: `25`)
- `--num-episodes`: total number of training episodes (default: `60000`)
- `--num-adversaries`: number of adversaries in the environment (default: `0`)
- `--lr`: learning rate (default: `1e-2`)
- `--gamma`: discount factor (default: `0.95`)
- `--batch-size`: batch size (default: `800`)
- `--num-units`: number of units in the MLP (default: `128`)
- `--prior-buffer-size`: prior network training buffer size
- `--prior-num-iter`: number of prior network training iterations
- `--prior-training-rate`: prior network training rate
- `--prior-training-percentile`: percentile threshold on the KL value used to generate labels
- `--exp-name`: name of the experiment, used as the file name to save all results (default: `None`)
- `--save-dir`: directory where intermediate training results and the model are saved (default: `"/tmp/policy/"`)
- `--save-rate`: the model is saved every time this number of episodes has been completed (default: `1000`)
- `--load-dir`: directory from which the training state and model are loaded (default: `""`)
- `--plots-dir`: directory where training curves are saved (default: `"./learning_curves/"`)
- `--restore_all`: whether to restore an existing I2C network
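For reference, here is a minimal sketch of how options like these are typically wired up with Python's `argparse`; the actual definitions in `train.py` may differ, and defaults for the prior-network flags are not documented above:

```python
import argparse

def parse_args():
    parser = argparse.ArgumentParser("I2C experiments (illustrative sketch)")
    # Environment
    parser.add_argument("--scenario", type=str, default="cn",
                        help="which MPE environment to use")
    parser.add_argument("--max-episode-len", type=int, default=25,
                        help="maximum length of each episode")
    parser.add_argument("--num-episodes", type=int, default=60000,
                        help="total number of training episodes")
    parser.add_argument("--num-adversaries", type=int, default=0,
                        help="number of adversaries in the environment")
    # Core training parameters
    parser.add_argument("--lr", type=float, default=1e-2, help="learning rate")
    parser.add_argument("--gamma", type=float, default=0.95, help="discount factor")
    parser.add_argument("--batch-size", type=int, default=800, help="batch size")
    parser.add_argument("--num-units", type=int, default=128,
                        help="number of units in the MLP")
    # Prior network (default not documented in this README)
    parser.add_argument("--prior-training-percentile", type=float,
                        help="percentile threshold on KL values used to generate labels")
    # Checkpointing
    parser.add_argument("--save-dir", type=str, default="/tmp/policy/",
                        help="directory for intermediate results and the model")
    parser.add_argument("--save-rate", type=int, default=1000,
                        help="save the model every this many episodes")
    return parser.parse_args()

if __name__ == "__main__":
    print(parse_args())
```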
I2C can be learned end-to-end or in a two-phase manner. This code implements the end-to-end manner, which may take more training time than the two-phase one.
For Cooperative Navigation:

```
python3 train.py --scenario 'cn' --prior-training-percentile 60 --lr 1e-2
```

For Predator Prey:

```
python3 train.py --scenario 'pp' --prior-training-percentile 40 --lr 1e-3
```
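To illustrate what `--prior-training-percentile` plausibly controls: the flag description suggests that labels for prior-network training come from thresholding KL values at a percentile. The following numpy sketch is a hypothetical illustration of that idea, not this repository's API (the function name and details are assumptions):

```python
import numpy as np

def kl_to_comm_labels(kl_values, percentile):
    """Hypothetical sketch: binarize KL divergences into communication labels.

    kl_values: 1-D array of KL values collected in the prior-network buffer.
    percentile: e.g. 60 for 'cn', 40 for 'pp' (as in the commands above).
    Entries whose KL exceeds the percentile threshold are labeled 1
    (communication useful), the rest 0.
    """
    threshold = np.percentile(kl_values, percentile)
    return (kl_values > threshold).astype(np.int32)

# Example: with percentile=60, roughly the largest 40% of KL values get label 1.
print(kl_to_comm_labels(np.array([0.10, 0.50, 0.02, 0.90, 0.30]), 60))
# -> [0 1 0 1 0]
```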
If you use this code, please cite our paper:

```
@inproceedings{ding2020learning,
  title={Learning Individually Inferred Communication for Multi-Agent Cooperation},
  author={Ding, Ziluo and Huang, Tiejun and Lu, Zongqing},
  booktitle={NeurIPS},
  year={2020}
}
```
This code is developed on top of the source code of MADDPG by Ryan Lowe.