Info & References
With this, the three easiest mini-games (MoveToBeacon, CollectMineralShards and DefeatRoaches) can be "solved" quickly.
| Map | Episodes | Avg score | Max score | Deepmind avg | Deepmind max |
** Performance on CollectMineralShards and DefeatRoaches was still improving slightly.
- Avg and max are from the last n_envs*100 episodes.
- All maps were run with the parameters seen in the repo, except n_envs=32 (48 for DefeatRoaches).
- Episodes is the total number of episodes played over all environments.
Deepmind scores are shown for comparison. They are the FullyConv scores reported in the release paper.
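The averaging window from the notes above can be sketched as follows. This is a minimal illustration and not the repo's own code: the function name `recent_score_stats` and the `window_per_env` parameter are hypothetical, and it assumes the scores of finished episodes from all environments are collected in one list in completion order.

```python
from collections import deque


def recent_score_stats(episode_scores, n_envs, window_per_env=100):
    """Return (average, maximum) over the last n_envs * window_per_env scores.

    episode_scores: scores of finished episodes from all environments,
    oldest first (an assumption made for this sketch).
    """
    window = n_envs * window_per_env
    # A deque with maxlen keeps only the most recent `window` entries.
    recent = deque(episode_scores, maxlen=window)
    if not recent:
        raise ValueError("no finished episodes yet")
    return sum(recent) / len(recent), max(recent)


if __name__ == "__main__":
    # Toy scores only; with n_envs=32 the real window would be 3200 episodes.
    avg, best = recent_score_stats([10, 20, 30, 40], n_envs=1, window_per_env=2)
    print(avg, best)
```

With `n_envs=32` the window covers the last 3200 episodes, and 4800 for the DefeatRoaches run with `n_envs=48`.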
How to run
Install the requirements listed below (Baselines etc.), clone the repo and run:
python run_sc2_a2c.py --map_name MoveToBeacon --n_envs 32
This won't save any files. Some results are printed to stdout.
Requirements
- Python 3 (will NOT work with Python 2)
- OpenAI's baselines (tested with 0.1.4). You can also skip the installation and dump the baselines folder inside this repo; most of the dependencies in baselines are not really needed if you use only a2c.
- pysc2 (tested with v1.2)
- TensorFlow (tested with 1.3.0)
- Other standard Python packages like numpy etc.
Here we use only the screen-player-relative observation from the original observation space. The action space is limited to a single action: Select army followed by Attack Move (same for the author when he plays sc2).
With this slice of the observation/action space the agent can learn the 3 mini-games mentioned above. However, for anything more complicated it's not enough.
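To make the reduced action space concrete, here is a minimal sketch of how a single policy output could be turned into the "Select army, then Attack Move" sequence. Everything in it is an assumption made for illustration: the labels `SELECT_ARMY` and `ATTACK_MOVE` are plain strings rather than pysc2 identifiers, the 32x32 screen size is a guess, and the row-major flattening of screen positions is only one possible choice. It does not reproduce the repo's implementation.

```python
# Hypothetical labels for this sketch; they are not pysc2 identifiers.
SELECT_ARMY = "select_army"
ATTACK_MOVE = "attack_move"


def index_to_target(flat_index, screen_size):
    """Map a flat policy output index to an (x, y) screen coordinate.

    Assumes row-major flattening: index = y * screen_size + x.
    """
    if not 0 <= flat_index < screen_size * screen_size:
        raise ValueError("index outside the screen")
    y, x = divmod(flat_index, screen_size)
    return x, y


def build_action_sequence(flat_index, screen_size=32):
    """Return the two-step sequence: select the army, then attack-move."""
    target = index_to_target(flat_index, screen_size)
    return [(SELECT_ARMY, None), (ATTACK_MOVE, target)]


if __name__ == "__main__":
    print(build_action_sequence(35))
```

Under these assumptions the policy only has to choose a screen position, which is why the problem becomes so much smaller than the full pysc2 action space.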
The action/obs-space limitation makes the problem much easier and faster to learn, but also less general and less interesting. Because of this, and because of the differences in the network and hyperparameters, the scores are not directly comparable with those in the release paper.
The scores achieved here are considerably lower than the Deepmind results, which suggests that the limited action space is not enough to achieve optimal performance (e.g. micro against roaches or using the two marines separately in shards).