
Artificial Intelligence


Introduction

This page describes the different AIs that come built into microRTS. All of these AIs (except for RandomAI and RandomBiasedAI) assume fully observable settings. microRTS does not yet include any AI that can handle partial observability.

microRTS comes with several built-in families of AI strategies: hard-coded strategies, minimax alpha-beta search strategies, Monte Carlo search strategies, and Monte Carlo Tree Search strategies. An additional "AI" (gui/MouseController.java) is also provided that basically lets a human play the game.

Let us describe each one of them in detail.

Hard-Coded Strategies

  • RandomAI: executes moves at random.

  • RandomBiasedAI: executes moves at random, but with a strong bias towards attacking, harvesting, and returning resources (these moves are 5 times more likely).

  • WorkerRush: The WorkerRush AI uses the abstraction layer implemented in the package "ai.abstraction" to implement a very simple rush strategy: train workers continuously, make one of them gather resources, and send all the others to immediately attack the closest enemy unit.

  • LightRush: The LightRush AI also uses the abstraction layer implemented in the package "ai.abstraction", and implements a rush strategy that consists of: training one worker and making it gather resources; once there are enough resources, building a barracks; and from that point on, constantly training Light units and sending them to immediately attack the nearest enemy unit.

  • HeavyRush: The HeavyRush AI also uses the abstraction layer implemented in the package "ai.abstraction", and implements the same rush strategy as LightRush, except that it constantly trains Heavy units and sends them to immediately attack the nearest enemy unit.

  • RangedRush: The RangedRush AI also uses the abstraction layer implemented in the package "ai.abstraction", and implements the same rush strategy as LightRush, except that it constantly trains Ranged units and sends them to immediately attack the nearest enemy unit.

There are versions of these strategies called "POWorkerRush", "POLightRush", etc. These are specifically designed to work in partially observable settings. They are not particularly good at it, but they are useful as baselines.
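The four rush strategies above all follow the same decision loop. As an illustration, here is a minimal, self-contained sketch of that loop; the Unit and Order types and the "closest resource"/"closest enemy" targets are hypothetical placeholders, not the actual microRTS or ai.abstraction classes:

```java
// Hypothetical sketch of the shared rush decision loop; all types and names here
// are illustrative, not the actual microRTS API.
import java.util.ArrayList;
import java.util.List;

class RushSketch {
    record Unit(int id, boolean isBase, boolean isBarracks, boolean canHarvest, boolean mine) {}
    record Order(int unitId, String action, String target) {}

    // One decision pass per game frame: keep training, keep one harvester,
    // and send everything else to attack.
    static List<Order> decide(List<Unit> units, int resources, int combatUnitCost) {
        List<Order> orders = new ArrayList<>();
        boolean harvesterAssigned = false;
        for (Unit u : units) {
            if (!u.mine()) continue;
            if (u.isBase()) {
                orders.add(new Order(u.id(), "train", "Worker"));
            } else if (u.isBarracks() && resources >= combatUnitCost) {
                orders.add(new Order(u.id(), "train", "Light")); // Heavy/Ranged in the other rushes
            } else if (u.canHarvest() && !harvesterAssigned) {
                orders.add(new Order(u.id(), "harvest", "closest resource"));
                harvesterAssigned = true;
            } else {
                orders.add(new Order(u.id(), "attack", "closest enemy"));
            }
        }
        return orders;
    }
}
```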

Portfolio Search

An approach that has been shown to perform very well in practice for playing RTS games is to have a collection of predefined scripts or strategies, and then use search to decide which of them to apply in each situation. microRTS contains the following two versions of this approach:

  • PortfolioAI: This AI receives as input parameters a set of hard-coded strategies. At run time, it tries them all, and determines which is the best one to use.

  • PGSAI (Portfolio Greedy Search): This AI receives as input parameters a set of scripts that can be used to control single units in the game. It performs greedy search to determine which script each of the units should be using. This is an extended version of Churchill and Buro's algorithm (see "Portfolio Greedy Search and Simulation for Large-Scale Combat in StarCraft" by Churchill and Buro); their algorithm was designed only for combat scenarios where no new units can be created, whereas the version in microRTS also handles the latter case and can be applied to the full game.

The main difference between the two is that PortfolioAI works at the player level (it selects a single script for the whole bot), while PGSAI works at the unit level (it selects the best script for each individual unit).
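To make this concrete, the following is a minimal sketch of the player-level variant: evaluate each script in the portfolio by forward simulation against an assumed opponent script, and keep the best one. The Script and Simulator interfaces and the frame limit are assumptions for illustration, not the actual microRTS classes:

```java
// Illustrative sketch of player-level portfolio search; Script, Simulator and the
// frame limit are placeholders, not the microRTS PortfolioAI classes.
import java.util.List;

class PortfolioSketch {
    interface Script { /* a hard-coded strategy such as WorkerRush or LightRush */ }
    interface Simulator { double playout(Script mine, Script theirs, int maxFrames); }

    // Forward-simulate each candidate script against an assumed opponent script
    // and return the one with the best simulated outcome.
    static Script pickScript(List<Script> portfolio, Script assumedOpponent, Simulator sim) {
        Script best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Script candidate : portfolio) {
            double score = sim.playout(candidate, assumedOpponent, 200);
            if (score > bestScore) { bestScore = score; best = candidate; }
        }
        return best;
    }
}
```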

Minimax Alpha-Beta Search Strategies

This package implements the strategies reported in this paper ( http://arxiv.org/abs/1208.1940 ), and some additional ones. Most are variants of the RTMM (Real-Time Minimax) algorithm. RTMM is designed to deal with durative actions in real-time games. Although it can handle simultaneous-action games, it does not have any particular strategy to assess the proper minimax value of a situation where two players can execute a move simultaneously, and thus it will over- or underestimate the value of some of those situations.

  • RTMinimax is the basic algorithm, which simply opens a tree up to the desired depth. The depth of the tree in RTMM is defined by time, not by number of moves as in the standard minimax algorithm for turn-based games (more details in http://arxiv.org/abs/1208.1940 ).

  • IDRTMinimax is an iterative-deepening version of RTMinimax that exploits the available time (specified in the constructor) to search in a tree as deep as possible.

  • IDRTMinimaxRandomized is a small modification of IDContinuingRTMinimax that implements the idea of randomized alpha-beta (see "Heuristic Search Applied to Abstract Combat Games" by Kovarsky and Buro, 2005) in order to better assess the value of situations where both players can execute moves simultaneously.

  • ABCD/IDABCD: the ABCD algorithm presented in "Fast Heuristic Search for RTS Game Combat Scenarios" by Churchill and Buro. It is very similar to RTMM, but uses tree alteration to minimize the over- or underestimation that results from simultaneous-action games, and also uses a play-out policy at the leaves of the tree for better lookahead.
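As an illustration of the iterative-deepening variants above, here is a minimal sketch of a time-budgeted deepening loop; the DepthLimitedSearch interface stands in for the underlying RTMM/ABCD search and is an assumption, not part of the actual microRTS API:

```java
// Illustrative time-budgeted iterative-deepening loop; DepthLimitedSearch stands
// in for the underlying RTMM/ABCD search and is not the actual microRTS API.
class IterativeDeepeningSketch {
    interface DepthLimitedSearch { int searchToDepth(int depthInGameTime); } // returns the best move found

    // Keep completing deeper searches while the per-frame budget lasts, and
    // return the result of the deepest search that finished in time.
    static int search(DepthLimitedSearch rtmm, long budgetMillis, int depthStep) {
        long deadline = System.currentTimeMillis() + budgetMillis;
        int bestMove = -1;
        int depth = depthStep;                    // depth is measured in game time, not plies
        while (System.currentTimeMillis() < deadline) {
            bestMove = rtmm.searchToDepth(depth); // a full search at this depth
            depth += depthStep;
        }
        return bestMove;
    }
}
```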

Monte Carlo Search Strategies

This package contains a collection of strategies that are based on the basic Monte Carlo search principle. For each possible move that can be executed, these strategies play a number of random games, and with those, they estimate the expected reward of each move. The main difference between these basic Monte Carlo strategies and the strategies in the following packages is that the latter build a tree, whereas the basic Monte Carlo search strategies do not.

There are two basic strategies:

  • MonteCarlo: this strategy simply considers every possible move and runs random game simulations from each of them uniformly. Each random game is played up to a certain time limit, and at the end of each random game, a heuristic function is used to estimate the value of the resulting game position. The main issue with this strategy is that the number of available moves in an RTS game is exponential in the number of units on the board, and thus, unless the game board is very simple, it is not possible to run even one random game for each available move. To address this weakness of Monte Carlo search in domains with a combinatorial number of actions, an optional parameter "MAX_ACTIONS" can be provided, which makes the strategy consider only a subset (of size MAX_ACTIONS) of all the possible actions that can be executed in a given game state. This has the shortcoming of not considering all the available moves, but the strength of at least sampling each of the moves it does consider a significant number of times.

  • LSI: this is a Monte Carlo search with Linear Side Information, contributed by Alexander Shleyfman, Antonin Komenda and Carmel Domshlak. The theory behind this AI can be found in "On Combinatorial Actions and CMABs with Linear Side Information" by Shleyfman, Komenda and Domshlak.

Moreover, notice that all the MCTS strategies described below (including UCT) can be turned into plain Monte Carlo strategies by just setting the MAX_DEPTH of their trees to 1.
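A minimal sketch of the plain Monte Carlo loop described above (sampling playouts for each candidate move, possibly restricted to a MAX_ACTIONS-sized subset, and keeping the best average) might look as follows; the Playout interface is a placeholder for a random simulation up to a time limit followed by a heuristic evaluation:

```java
// Minimal plain Monte Carlo move selection; Playout is a placeholder for a random
// simulation up to a time limit followed by a heuristic evaluation of the result.
import java.util.Random;

class MonteCarloSketch {
    interface Playout { double simulate(int moveIndex, Random rnd); }

    // Sample each candidate move the same number of times and keep the best average.
    // With a MAX_ACTIONS-style cap, numMoves would be the size of the sampled subset.
    static int chooseMove(int numMoves, int samplesPerMove, Playout playout, long seed) {
        Random rnd = new Random(seed);
        int best = 0;
        double bestAvg = Double.NEGATIVE_INFINITY;
        for (int m = 0; m < numMoves; m++) {
            double sum = 0;
            for (int s = 0; s < samplesPerMove; s++) sum += playout.simulate(m, rnd);
            double avg = sum / samplesPerMove;
            if (avg > bestAvg) { bestAvg = avg; best = m; }
        }
        return best;
    }
}
```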

UCT-based Strategies

Once one has a real-time variant of Minimax, implementing a real-time variant of UCT is fairly simple. This package contains three versions of real-time UCT (with their respective "Continuing" versions).

  • UCT: a standard real-time version of UCT based on the ideas of RTMM. This algorithm has two main problems: 1) the exponential number of moves in RTS games (the same as for MonteCarlo), and 2) it overestimates the value of positions where both players can execute moves simultaneously (the same as for RTMinimax).

  • DownsamplingUCT: only a predefined number of actions are considered at each node in the search tree (selected at random from the set of all possible actions).

  • UCTUnitActions: this version opens a search tree that is based on unit-actions rather than on player-actions. This avoids the exponential number of actions to some extent, although not completely.
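For reference, the UCB1 rule that drives the tree descent in these UCT variants balances a child's average reward against how rarely it has been visited; a minimal sketch (with illustrative arrays rather than the actual tree-node classes) is:

```java
// Standard UCB1 child selection as used during a UCT descent; the arrays are
// illustrative stand-ins for the statistics stored in the actual tree nodes.
class UCB1Sketch {
    static int selectChild(double[] totalReward, int[] visits, int parentVisits, double c) {
        int best = 0;
        double bestValue = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < visits.length; i++) {
            if (visits[i] == 0) return i; // visit unexplored children first
            double exploit = totalReward[i] / visits[i];
            double explore = c * Math.sqrt(Math.log(parentVisits) / visits[i]);
            if (exploit + explore > bestValue) {
                bestValue = exploit + explore;
                best = i;
            }
        }
        return best;
    }
}
```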

Monte Carlo Tree Search Strategies (Other than UCT)

A collection of MCTS AIs that use different bandit strategies is provided:

  • MLPSMCTS: This is an MCTS implementation using a version of the MLPS (Matching Learning with Polynomial Storage) sampling strategy (see "Learning Multiuser Channel Allocations in Cognitive Radio Networks: A Combinatorial Multi-Armed Bandit Formulation" by Gai, Krishnamachari and Jain).

Finally, we also provide an MCTS algorithm that combines the naive-sampling idea of NaiveMonteCarlo with MCTS, resulting in a new Monte Carlo Tree Search algorithm that I call "NaiveMCTS". I am still working on an in-depth description of this strategy, which will be posted here soon.
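Roughly, naive sampling treats a player-action as a combination of per-unit actions: with some probability it explores by assembling a new combination from per-unit ("local") value estimates, and otherwise it exploits the best complete combination seen so far. The sketch below is a simplified illustration of that idea, not the actual NaiveMCTS implementation; the names and bookkeeping are assumptions:

```java
// Simplified illustration of naive sampling; localValue[u][a] is an assumed
// per-unit estimate of action a's value for unit u, and bestSoFar is the best
// full combination found so far. Not the actual NaiveMCTS code.
import java.util.Random;

class NaiveSamplingSketch {
    static int[] sampleCombination(double[][] localValue, int[] bestSoFar,
                                   double epsilonGlobal, double epsilonLocal, Random rnd) {
        int numUnits = localValue.length;
        int[] combo = new int[numUnits];
        if (rnd.nextDouble() < epsilonGlobal) {
            // explore: build a new combination, choosing each unit's action
            // epsilon-greedily from that unit's local value estimates
            for (int u = 0; u < numUnits; u++) {
                if (rnd.nextDouble() < epsilonLocal) {
                    combo[u] = rnd.nextInt(localValue[u].length);
                } else {
                    int bestA = 0;
                    for (int a = 1; a < localValue[u].length; a++)
                        if (localValue[u][a] > localValue[u][bestA]) bestA = a;
                    combo[u] = bestA;
                }
            }
        } else {
            // exploit: replay the best complete combination found so far
            System.arraycopy(bestSoFar, 0, combo, 0, numUnits);
        }
        return combo;
    }
}
```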

Hierarchical Task-Network Planning

microRTS also includes a version of the AHTN (Adversarial HTN) planning algorithm, together with 5 different HTN domain definitions (in the "ahtn" folder):

  • AHTN: an implementation of the AHTN algorithm that integrates minimax alpha-beta search with HTN planning. For a description of the algorithm, see: "Adversarial Hierarchical-Task Network Planning for Complex Real-Time Games", Ontanon and Buro (IJCAI 2015).

The current implementation of AHTN is a bit inefficient, since it constantly clones large parts of the HTN plan structure. This will be fixed in the near future.

Continuing/PseudoContinuing AIs

By default, all the AIs in microRTS (except those starting with the keyword "Continuing") only perform computation in those game frames where they have an idle unit to which to assign an action. However, since microRTS is deterministic, it might be the case that at a given game frame we know that one of our units will become idle in N frames, and that during those N frames no unit of the opponent will become idle. If that is the case, then we should use all of those N frames to decide the next best move, instead of just waiting until it is time to move. Some of the AIs built into microRTS can do that. In particular, all the AIs that extend the "InterruptibleAIWithComputationBudget" class can spread their computation across multiple game frames. If you create a "ContinuingAI" and pass as a parameter any AI that extends InterruptibleAIWithComputationBudget, the resulting AI will exploit this fact to maximize the amount of computation time available.
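Conceptually, the pattern works as follows; the interface below mirrors the kind of start/step/result methods an interruptible search exposes, but the exact names and signatures are assumptions for illustration and may not match the microRTS classes:

```java
// Illustrative sketch of the "continuing" pattern; the method names mirror the kind
// of start/step/result interface an interruptible search exposes, but the exact
// microRTS signatures are assumptions, not the real API.
interface InterruptibleSearch {
    void startNewComputation(Object gameState); // begin planning the next action
    void computeDuringOneGameFrame();           // spend one frame's worth of budget on the search
    Object getBestActionSoFar();                // current best result of the search
}

class ContinuingSketch {
    private final InterruptibleSearch search;
    private boolean planningAhead = false;

    ContinuingSketch(InterruptibleSearch search) { this.search = search; }

    // Called once per game frame: think during frames where no unit needs an action,
    // and return the best action found so far when one finally does.
    Object onFrame(Object gameState, boolean myUnitNeedsAction) {
        if (myUnitNeedsAction) {
            if (!planningAhead) search.startNewComputation(gameState);
            search.computeDuringOneGameFrame();
            planningAhead = false;
            return search.getBestActionSoFar();
        }
        if (!planningAhead) {
            search.startNewComputation(gameState); // one of our units will need an action soon
            planningAhead = true;
        }
        search.computeDuringOneGameFrame();        // use this otherwise idle frame
        return null;
    }
}
```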
