Training requires an Nvidia GPU with CUDA support. It's fine to develop on a machine without a GPU, but training will likely be so slow as to be impractical.
- Install Java JDK (Ubuntu 20.04's default JDK is 11, which is fine) and swig
- Install poetry.
- Install git-lfs and
git lfs pull
to download the jar files. - Install rl_algo_impls with
poetry install -E microrts
. - Run
poetry run wandb login
and follow the instructions to add the Weights & Biases API key to the environment. - Train using
./scripts/microrts.sh -a {[ppo], appo, dppo} -e {NAME FROM ENVIRONMENT YAML}
.
For example, my setup for using APPO to train Microrts-small-net-bw8a-40m-ent5-lr3c-05wb2lwr-vf50-nga-a10
on a Lambda Cloud A10 instance is:
git clone https://github.com/sgoodfriend/rl-algo-impls.git
cd rl-algo-impls
# git checkout BRANCH_NAME if running on non-main branch
./scripts/setup.sh --microrts # Handles steps 1-5
./scripts/microrts.sh -a appo -e Microrts-small-net-bw8a-40m-ent5-lr3c-05wb2lwr-vf50-nga-a10
Technical details in technical-description.md
- Java (tested 11+) and Python 3.8+ must be installed
- Download the RAISocketAI archive. For the CoG2023 MicroRTS competition this can be downloaded from https://github.com/sgoodfriend/rl-algo-impls/releases/download/v0.0.38-bugfix/RAISocketAI-0.0.38-bugfix.zip
- Unzip the archive:
unzip -j RAISocketAI-0.0.38-bugfix.zip
- Upgrade and install Python depdendencies:
python -m pip install --upgrade pip
python -m pip install setuptools==65.5.0 wheel==0.38.4
python -m pip install --upgrade torch
- Install the
.whl
file:
python -m pip install --upgrade rl_algo_impls-0.0.38-py3-none-any.whl
The above steps makes rai_microrts
callable within the terminal. RAISocketAI.java
uses this to start a Python child process, which is used to compute actions.
To see this demonstrated in Google Colab running a small command-line tournament, open colab_microrts_demo.ipynb in Google Colab (ideally High-RAM Runtime shape, no Hardware accelator).
RAISocketAI regularly beats prior competition winners and baselines on 7 of 8 competition public maps. The exception is the largest map (64x64). Each cell below represents the average result of RAISocketAI against the opponent AI for 100 matches (50 each as player 1 and player 2). A win is +1, loss is -1, and draw is 0. Same number of wins and losses would average to a score of 0. A score of 0.9 corresponds to winning 95% of games (assuming no draws).
map | POWorkerRush | POLightRush | CoacAI | Mayari | Map Total |
---|---|---|---|---|---|
basesWorkers8x8A | 0.91 | 1 | 0.98 | 1 | 0.97 |
FourBasesWorkers8x8 | 1 | 1 | 1 | 0.97 | 0.99 |
NoWhereToRun9x8 | 1 | 1 | 0.93 | 0.97 | 0.98 |
basesWorkers16x16A | 1 | 1 | 0.78 | 0.97 | 0.94 |
TwoBasesBarracks16x16 | 1 | 0.78 | 0.98 | 1 | 0.94 |
DoubleGame24x24 | 1 | 1 | 0.85 | 1 | 0.96 |
BWDistantResources32x32 | 1 | 0.84 | 0.82 | 0.97 | 0.91 |
(4)BloodBath.scmB | 0.96 | -1 | -1 | -1 | -0.51 |
AI Total | 0.98 | 0.7 | 0.67 | 0.74 | 0.77 |
Mayari was the 2021 COG winner (prior competition), and CoacAI was the 2020 COG winner. POWorkerRush and POLightRush are baseline AIs. POWorkerRush, POLightRush, and CoacAI use the default AStarPathFinding.
The round-robin tournamnet was run on a 2018 Mac Mini with Intel i7-8700B CPU (6-core, 3.2 GHz) with PyTorch limited to 6 threads. The avearge execution time per turn varied by map-size with the shortest being NoWhereToRun9x8 (9 milliseconds) and longest BWDistantResources32x32 and BloodBath (22 ms). The tournament enforces 100 ms per turn. RAISocketAI exceeded the 100 ms limit and needed to skip its turn on less than 0.001% of turns but did lose by timeout in 5 matches (4 BloodBath [1% of BloodBath matches], 1 BWDistantResources [0.25%])
In the videos below, RAISocketAI is the blue player (units outlined in blue, opponent units outlined in red). RAISocketAI wins in all the videos, except (4)BloodBath.scmB. RAISocketAI is the left or up player, except in (4)BloodBath.scmB (down-right).
map | Video |
---|---|
basesWorkers8x8A | basesWorkers8x8A-RAISocketAI-Mayari.mp4 |
FourBasesWorkers8x8 | FourBasesWorkers8x8-RAISocketAI-Mayari.mp4 |
NoWhereToRun9x8 | NoWhereToRun9x8-RAISocketAI-Mayari.mp4 |
basesWorkers16x16A | basesWorkers16x16A-RAISocketAI-Mayari.mp4 |
TwoBasesBarracks16x16 | TwoBasesBarracks16x16-RAISocketAI-Mayari.mp4 |
DoubleGame24x24 | DoubleGame24x24-RAISocketAI-Mayari.mp4 |
BWDistantResources32x32 | BWDistantResources32x32-RAISocketAI-Mayari.mp4 |
(4)BloodBath.scmB | BloodBath-RAISocketAI-Mayari.mp4 |
The bugfix build only has a change in the Java-side RAISocketAI where the thread pool is
allowed to empty to 0 threads. This fixes an issue where scripts would not terminate
because of threads awaiting tasks in RAISocketAI. The bug does not affect gameplay at all.
The fix only changes the program end behavior. The Python .whl
file is exactly the same
between v0.0.38 and v0.0.38-bugfix.
If you want to continue using
v0.0.38
and want to fix the script termination behavior, add System.exit(0);
at the end of the
main
function.
RAISocketAIBestModels.java
is a subclass of RAISocketAI
, which always uses the best
model for the given map. This still respects the timeBudget
passed into it, so if the
model takes over the timeBudget
it'll instead return an empty PlayerAction.
RAISocketAI
has 3 "general" models for 3 different sets of map sizes:
- Maps whose longest dimension is 16
- Maps whose longest dimension is 17-32
- Maps whose longest dimension is 33-64
Additionally, there are models finetuned on specific maps:
- maps/NoWhereToRun9x8.xml
- maps/DoubleGame24x24.xml
- maps/BWDistantResources32x32.xml (2 models: one larger, the other faster)
During the preGameAnalysis
step, RAISocketAI
computes the next action to both warm
up data structures and to estimate how long each model is likely to take.
It will pick the largest model that will likely take less than 75% of the timeBudget
,
which on slower machines will be the smaller, faster model. RAISocketAIBestModels
will
always pick the larger model.
If you want to run the agent with the best models, either make sure the machine is fast
enough to generally complete turns in under 50 milliseconds (Apple M1 Max, Intel Core
i7-8700B, Intel Xeon 8358 are all examples that are easily fast enough) OR increase the
timeBudget sufficiently high. RAISocketAIBestModels
shouldn't be necessary, but can be
used to ensure it's always used.