An algorithm that applies SAC (Soft Actor-Critic) to QMIX for multi-agent reinforcement learning. Watch the demo here.
- SMAC
- PyTorch (GPU support recommended for training)
- TensorBoard
- StarCraft II

For the installation of SMAC and StarCraft II, refer to the SMAC repository.
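A minimal installation sketch, assuming a pip-based setup (the exact PyTorch build depends on your CUDA version; see the SMAC repository for the StarCraft II setup):

```
# PyTorch and TensorBoard (pick the PyTorch build matching your CUDA version)
pip install torch tensorboard

# SMAC, installed from its GitHub repository
pip install git+https://github.com/oxwhirl/smac.git
```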
Train a model with the following command:

```
python main.py
```

Configurations and parameters of the training are specified in `config.json`. Models will be saved at `./models`.
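Since TensorBoard is listed as a dependency, training progress can presumably be monitored with it; the log directory below is a placeholder, not a path confirmed by this repository:

```
tensorboard --logdir <log_dir>
```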
Test a trained model with the following command:

```
python test_model.py
```

Configurations and parameters of the testing are specified in `test_config.json`. Match the `run_name` items in `config.json` and `test_config.json`.
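For example, if training was launched with the entry below in `config.json`, the same value must appear in `test_config.json` (illustrative excerpt only; the value is a placeholder and all other keys are omitted):

```json
{
  "run_name": "example_run"
}
```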
Note that a_i is equivalent to \mu_i and s_i is equivalent to o_i in the architecture diagram above.
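For reference, a generic QMIX-style monotonic mixing network can be sketched in PyTorch as below. This is an illustrative sketch of the standard QMIX mixer, not the code used in this repository; the class and parameter names (`QMixMixer`, `embed_dim`, etc.) are assumptions.

```python
import torch
import torch.nn as nn


class QMixMixer(nn.Module):
    """Generic QMIX-style monotonic mixing network (illustrative only).

    Hypernetworks conditioned on the global state produce non-negative
    weights, so the mixed value is monotonic in each agent's value.
    """

    def __init__(self, n_agents: int, state_dim: int, embed_dim: int = 32):
        super().__init__()
        self.n_agents = n_agents
        self.embed_dim = embed_dim
        # Hypernetworks: map the global state to mixing weights and biases.
        self.hyper_w1 = nn.Linear(state_dim, n_agents * embed_dim)
        self.hyper_b1 = nn.Linear(state_dim, embed_dim)
        self.hyper_w2 = nn.Linear(state_dim, embed_dim)
        self.hyper_b2 = nn.Sequential(
            nn.Linear(state_dim, embed_dim), nn.ReLU(), nn.Linear(embed_dim, 1)
        )

    def forward(self, agent_values: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        # agent_values: (batch, n_agents), state: (batch, state_dim)
        bs = agent_values.size(0)
        w1 = torch.abs(self.hyper_w1(state)).view(bs, self.n_agents, self.embed_dim)
        b1 = self.hyper_b1(state).view(bs, 1, self.embed_dim)
        hidden = torch.relu(torch.bmm(agent_values.unsqueeze(1), w1) + b1)
        w2 = torch.abs(self.hyper_w2(state)).view(bs, self.embed_dim, 1)
        b2 = self.hyper_b2(state).view(bs, 1, 1)
        total = torch.bmm(hidden, w2) + b2  # (batch, 1, 1)
        return total.view(bs, 1)
```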
Training objective: policies that maximize
Q-values computed by the networks:
Individual state-value functions:
Total state values (alpha is the entropy temperature):
Q-values expressed with the Bellman equation:
Critic networks update: minimize
Actor networks update: maximize
Entropy temperatures update: minimize
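As a rough guide to the quantities named above, the standard discrete-action SAC formulation combined with a QMIX-style mixer can be sketched as follows. This is a sketch under those assumptions, not the repository's exact equations; symbols such as f_mix, the target value \bar{V}_{tot}, and the target entropy \bar{\mathcal{H}} are assumptions.

```latex
% Sketch assuming discrete-action SAC per agent plus a QMIX-style mixer;
% not the repository's exact notation.
\begin{align*}
% Training objective: maximize return plus entropy
J(\pi) &= \mathbb{E}\Big[\textstyle\sum_t r_t + \alpha\,\mathcal{H}\big(\pi(\cdot \mid s_t)\big)\Big] \\
% Q-values computed by the networks: per-agent critics mixed into a joint value
Q_{tot}(s, \mathbf{a}) &= f_{\mathrm{mix}}\big(Q_1(o_1, a_1), \dots, Q_n(o_n, a_n); s\big) \\
% Individual state-value functions (soft values under the discrete policy)
V_i(o_i) &= \sum_{a_i} \pi_i(a_i \mid o_i)\,\big[Q_i(o_i, a_i) - \alpha \log \pi_i(a_i \mid o_i)\big] \\
% Total state value via the mixing network
V_{tot}(s) &= f_{\mathrm{mix}}\big(V_1(o_1), \dots, V_n(o_n); s\big) \\
% Q-values expressed with the Bellman equation
Q_{tot}(s, \mathbf{a}) &= r + \gamma\, \mathbb{E}\big[V_{tot}(s')\big] \\
% Critic update: minimize the squared Bellman error (bar denotes target networks)
J_Q &= \mathbb{E}\Big[\big(Q_{tot}(s, \mathbf{a}) - r - \gamma\, \bar{V}_{tot}(s')\big)^2\Big] \\
% Actor update: maximize the per-agent soft value
J_{\pi_i} &= \mathbb{E}\Big[\sum_{a_i} \pi_i(a_i \mid o_i)\,\big[Q_i(o_i, a_i) - \alpha \log \pi_i(a_i \mid o_i)\big]\Big] \\
% Temperature update: minimize, pushing policy entropy toward the target
J(\alpha) &= \mathbb{E}\big[-\alpha\,\big(\log \pi_i(a_i \mid o_i) + \bar{\mathcal{H}}\big)\big]
\end{align*}
```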
Note that the results for the other algorithms are taken from the SMAC paper; the evaluation method is therefore kept the same as in the SMAC paper (StarCraft II version: SC2.4.6.2.69232).
(Win rate %, mean of 5 independent runs)
Scenario | IQL | VDN | QMIX | SAC-QMIX |
---|---|---|---|---|
2s_vs_1sc | 100 | 100 | 100 | 100 |
2s3z | 75 | 97 | 99 | 100 |
3s5z | 10 | 84 | 97 | 97 |
1c3s5z | 21 | 91 | 97 | 100 |
10m_vs_11m | 34 | 97 | 97 | 100 |
2c_vs_64zg | 7 | 21 | 58 | 56 |
bane_vs_bane | 99 | 94 | 85 | 100 |
5m_vs_6m | 49 | 70 | 70 | 90 |
3s_vs_5z | 45 | 91 | 87 | 100 |
3s5z_vs_3s6z | 0 | 2 | 2 | 85 |
6h_vs_8z | 0 | 0 | 3 | 82 |
27m_vs_30m | 0 | 0 | 49 | 100 |
MMM2 | 0 | 1 | 69 | 95 |
corridor | 0 | 0 | 1 | 0 |