# Adversarial Search and Games

From: [Artificial Intelligence: A Modern Approach](http://aima.cs.berkeley.edu/) by Stuart Russell and Peter Norvig.

Note:

- Chapter 5 in US edition.
- Chapter 6 in International (Global) edition.

## Introduction


- **Characteristics**: Environments where agents compete against each other.
- **Zero-Sum Games**: Often modeled as zero-sum games, where one agent's gain is another's loss.
- **Multi-Agent Systems**: Involves multiple agents with conflicting goals.
- **Dynamic Interaction**: Agents' actions directly affect each other.
- **Uncertainty and Strategy**: Requires anticipation of opponents' actions.

### Adversarial Search

- **Definition**: A search in a competitive environment where the outcome depends on actions of multiple agents (opponents).
- **Goal**: To find optimal strategies that lead to success in spite of adversaries' actions.
- **Minimax Algorithm**: A common approach in adversarial search to minimize the possible loss in a worst-case scenario.
- **Game Trees**: Used to represent possible moves by players.
- **Evaluation Functions**: Estimate the desirability of game positions when it's impractical to search until the end of the game.

### Examples of Games

- **Chess**: A classic two-player strategy board game with perfect information: both players see the entire board at all times.
- **Checkers (Draughts)**: Similar to chess but with simpler rules, played on an 8x8 board.
- **Go**: An ancient board game known for its complexity and rich strategy, played on a 19x19 grid.
- **Poker**: A card game characterized by imperfect information (opponents' cards are hidden) and chance.
- **Tic-Tac-Toe**: A simple two-player game often used as an introductory example of game theory and AI strategy.

These topics represent key aspects of how AI can be applied in competitive environments, showcasing the complexity and the need for advanced strategies in game playing.

## Physical Games

Physical games, as opposed to traditional tabletop games like chess or checkers, involve a complex interplay of physical skills, real-time decision making, and often team dynamics. 

Physical games have long been considered much harder to describe and search than board games, but advances in robotics and reinforcement learning have made some of them active research areas; robot soccer (for example, the RoboCup competitions) is the best-known case.

Here are some examples of physical games and an explanation of their complexity compared to tabletop games:

### Soccer (Football)

- **Dynamic Environment**: The game is played in a large, open field with continuous play, requiring constant adaptation.
- **Physical Skills**: Players need to excel in coordination, endurance, and agility.
- **Team Strategy**: Involves complex team dynamics and coordination for defense, midfield control, and attacking.
- **Real-Time Decisions**: Players make split-second decisions under physical pressure.
- **Unpredictable Play**: The ball's movement can be highly unpredictable, influenced by weather, field conditions, and players' actions.

### Basketball

- **Rapid Pace**: The game is fast-paced with frequent changes in possession and continuous movement.
- **Physical Interaction**: Close physical interaction between players, including blocking and dodging.
- **Spatial Awareness**: High level of spatial awareness required to navigate the court and anticipate opponents' and teammates' movements.
- **Skill Diversity**: Players need a diverse set of skills like shooting, dribbling, passing, and defending.

### Tennis

- **Individual Strategy**: Singles tennis requires a high level of personal strategy, adapting to opponents' playing style.
- **Physical Precision**: Precision in serving and returning, along with stamina, plays a crucial role.
- **Psychological Aspect**: Players often have to overcome psychological challenges and maintain focus throughout the match.

### American Football

- **Complex Playbooks**: Teams have extensive playbooks, requiring players to memorize and execute complex plays.
- **Physical Contact**: High level of physical contact and tackling, demanding robust physical conditioning.
- **Positional Specialization**: Each player has a specialized role that contributes differently to the team's strategy.

### Volleyball

- **Team Coordination**: Requires excellent team coordination, especially in setting up and executing attacks.
- **Reflexes and Agility**: Players must have quick reflexes to respond to fast-moving plays and spikes.
- **Spatial and Tactical Awareness**: Positioning and movement on the court are critical for both offense and defense.

### Comparing to Tabletop Games

- **Physicality**: Physical games require athletic skills, whereas tabletop games focus on mental strategy.
- **Real-Time Dynamics**: Physical games often happen in real-time, demanding immediate responses, unlike turn-based tabletop games.
- **Environmental Factors**: Factors like weather, field conditions, and physical fatigue significantly influence physical games.
- **Team Interaction**: Physical games, especially team sports, involve complex interpersonal dynamics and coordination.
- **Unpredictability**: The higher level of unpredictability in physical games arises from real-time dynamics and physical interactions.

## Real Time Strategy Games (RTS)

Real-Time Strategy (RTS) games are a genre of video games particularly suited for AI research due to their complexity, real-time decision-making requirements, and dynamic environments. Here are some examples of RTS games and an explanation of their suitability for AI research:

### Examples of RTS Games

- **StarCraft and StarCraft II**
   - **Complexity**: Features multiple factions with unique units and strategies.
   - **Economy and Resource Management**: Players must efficiently gather resources and manage economies.
   - **Fog of War**: Limited visibility of the map increases uncertainty and strategic depth.
- **Age of Empires Series**
   - **Historical Context**: Players build and manage civilizations through different historical eras.
   - **Diverse Strategies**: Offers a variety of military, economic, and technological strategies.
   - **Large Scale**: Involves managing large armies and settlements.
- **Command & Conquer Series**
   - **Asymmetrical Factions**: Different factions have distinct units and technologies.
   - **Base Building and Defense**: Emphasizes base construction and defense along with offensive strategies.
   - **Real-Time Tactics**: Requires quick tactical decisions in combat situations.
- **Warcraft III**
   - **Hero Units**: Introduces powerful hero units that can turn the tide of battles.
   - **Micro and Macro Management**: Requires both individual unit control (micro) and overall strategy and resource management (macro).
- **Company of Heroes**
   - **Realistic Tactics and Cover System**: Focuses on realistic military tactics and the use of cover.
   - **Resource Points Control**: Involves capturing specific points on the map to gain resources.

### Suitability for AI Research

- **Complex Decision-Making**: RTS games require complex, multi-layered decision-making, making them ideal for studying and developing sophisticated AI algorithms.
- **Real-Time Analysis**: The real-time nature of these games challenges AI to analyze and react to situations swiftly and efficiently.
- **Resource Management**: AI can be tested on its ability to optimally allocate and use resources, a common problem in many real-world scenarios.
- **Strategy Formulation**: AI must formulate short-term and long-term strategies based on incomplete and dynamically changing information.
- **Learning and Adaptation**: These games provide a rich environment for AI to learn and adapt to different strategies and opponents.
- **Handling Uncertainty**: With elements like fog of war and unpredictable opponents, RTS games are excellent for developing AI that can handle uncertainty and incomplete information.
- **Multi-Objective Optimization**: Balancing various objectives (e.g., economy, defense, attack) in an RTS game mirrors real-world scenarios where multiple objectives must be managed concurrently.

The complexity, pace, and strategic depth of RTS games make them an excellent platform for advancing AI research, particularly in areas like decision-making, learning, and adaptation to complex and dynamic environments.

## 5.1 Game Theory

- **Three Stances Towards Multi-Agent Environments**
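   - **Economy (Aggregate View)**: When there is a very large number of agents, treating them in aggregate as an economy, which allows predictions (for example, that rising demand raises prices) without modeling any individual agent.
   - **Agents as Part of the Environment**: Viewing adversarial agents as just a part of the environment that makes it nondeterministic, without modeling their goals.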
   - **Modeling Adversarial Agents Explicitly**: Focusing on explicitly modeling the actions and strategies of adversarial agents. This is the main focus of this chapter.
- **Modeling the Adversarial Agents Explicitly**
   - **Strategic Planning**: Developing strategies based on the anticipated actions of opponents.
   - **Predictive Modeling**: Anticipating the moves of the adversary to optimize one's own strategy.
- **Concepts in Adversarial Search and Game Theory**
   - **Pruning**: Techniques like alpha-beta pruning are used to efficiently search the game tree by eliminating branches that cannot possibly influence the final decision.
   - **Evaluation Function**: A heuristic used to estimate the desirability of a game position when it is impractical to search until the end of the game.
   - **Imperfect Information**: Games like poker, where players do not have complete information about the game state, requiring strategies that can handle uncertainty and probabilistic outcomes.
- **Emphasis on Explicit Agent Modeling**
   - **Deep Analysis of Opponents' Strategies**: Understanding and countering specific strategies used by opponents.
   - **Adaptive Tactics**: Adjusting one’s own strategy in response to the observed behavior of adversaries.
   - **Complex Decision Making**: Making decisions based on a mixture of strategic planning, predictive modeling, and real-time analysis of the game state.

This subchapter emphasizes the importance of understanding and explicitly modeling adversarial agents in games, focusing on strategic decision-making, the use of evaluation functions, and dealing with imperfect information. These concepts are crucial in developing AI that can effectively navigate and succeed in multi-agent, competitive environments.

### 5.1.1 Two-Player Zero-Sum Games

- **Definition and Characteristics**
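   - **Perfect Information**: These games are characterized by perfect information, meaning the game is fully observable: both players can see the complete current state at all times. The games studied here are also deterministic and turn-taking.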
   - **Zero-Sum Nature**: In a zero-sum game, one player's gain is exactly balanced by the other player's loss.
- **Basic Concepts: Move and Position**
   - **Move**: An action taken by a player that transitions the game from one state to another.
   - **Position**: The current state of the game, determined by the sequence of moves made up to that point.
- **Game Structure**
   - **Initial State**: The starting state of the game before any moves are made.
   - **Transition Model**: Rules that determine how a player’s move changes the state of the game.
   - **Terminal Test**: A test to determine when the game has ended (reaching a terminal state).
   - **Terminal State**: A state where the game ends, with a win, loss, or draw.
- **Representation of Game States**
   - **State Space Graph**: Represents all possible states of the game and how one can transition from one state to another.
   - **Search Tree**: A tree superimposed over part of the state space graph, rooted at the current state and branching on the legal moves, that is explored to decide which move to make.
   - **Game Tree**: The complete search tree: it follows every sequence of moves by both players from the initial state all the way to a terminal state.
- **Differences Between State Space Graph, Search Tree, and Game Tree**
   - **State Space Graph vs. Search Tree**: The state space graph contains each state once, with edges for moves; a search tree unrolls paths through that graph from a chosen root, so a state reachable by several move orders can appear in the tree more than once.
   - **Search Tree vs. Game Tree**: A search tree may cover only part of the game (for example, a few plies ahead of the current position), whereas the game tree covers every line of play to the end. For games like chess the game tree is far too large to build in full, so it serves as a theoretical construct.
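
The game-structure elements above can be collected into a small interface. The sketch below is only an illustration: the class name `TakeAwayGame`, its method names, and the toy rules (players alternately remove 1 or 2 sticks, and whoever takes the last stick wins) are assumptions chosen to keep the example small, not code from the book.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    sticks: int   # sticks left on the table
    to_move: str  # "MAX" or "MIN"


class TakeAwayGame:
    """Hypothetical toy game: remove 1 or 2 sticks; taking the last stick wins."""

    def initial_state(self, sticks=4):
        # Initial state: all sticks on the table, MAX moves first.
        return State(sticks, "MAX")

    def actions(self, state):
        # Legal moves: how many sticks may be removed in this state.
        return [n for n in (1, 2) if n <= state.sticks]

    def result(self, state, action):
        # Transition model: apply a move and pass the turn to the other player.
        other = "MIN" if state.to_move == "MAX" else "MAX"
        return State(state.sticks - action, other)

    def is_terminal(self, state):
        # Terminal test: the game ends when no sticks remain.
        return state.sticks == 0

    def utility(self, state):
        # Utility from MAX's point of view. The player who just moved took
        # the last stick and wins, so the player now "to move" has lost.
        assert self.is_terminal(state)
        return -1 if state.to_move == "MAX" else +1


game = TakeAwayGame()
s0 = game.initial_state(sticks=3)
s1 = game.result(s0, 2)  # MAX takes 2, leaving 1 stick for MIN
s2 = game.result(s1, 1)  # MIN takes the last stick and wins
```

Because the game is zero-sum, a single utility from MAX's point of view is enough: MIN's payoff is its negation.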

## 5.2 Optimal Decisions in Games

- **Concept of MAX and MIN in Games**
   - **MAX**: Represents the player who is trying to maximize the score/outcome of the game.
   - **MIN**: Represents the opposing player who is trying to minimize that same score/outcome, which in a zero-sum game is equivalent to maximizing their own score.
- **Minimax Search**
   - **Definition**: A decision rule used for minimizing the possible loss for a worst-case scenario, assuming an opponent who makes the best moves against you.
   - **Application**: Minimax search is used to determine the best move for a player by considering all possible responses of the opponent.
- **The Concept of 'Ply'**
   - **Definition**: A ply in game theory is a single turn taken by one of the players.
   - **Relevance**: The depth of analysis in a game is often measured in plies (e.g., looking three plies ahead means considering your move, your opponent's move, and your subsequent move).
- **Minimax Value**
   - **Description**: The minimax value of a node in a game tree is the utility (for MAX) of being in that state, assuming that both players play optimally from there to the end of the game. The minimax value of a terminal state is just its utility.
   - **Computation**: It is computed by recursively applying the minimax decision process through the game tree.
- **Minimax Decision**
   - **Definition**: The minimax decision is the decision at the root node that leads to the best possible outcome (maximized for MAX and minimized for MIN).
   - **Determination**: This decision is determined by evaluating the minimax values of the root's child nodes.

This subchapter lays the groundwork for understanding how optimal decisions are made in games, especially two-player zero-sum games with perfect information. The concepts of MAX and MIN, the use of the minimax search algorithm, and the understanding of game depth in terms of plies are crucial for developing strategies in such games. The minimax value and decision concepts are fundamental in determining the best possible outcomes in a game scenario.
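
To make the minimax value and the minimax decision concrete, the sketch below evaluates a small two-ply tree. The tree encoding (nested Python lists with numeric leaves) and the function names are assumptions made for illustration, and the leaf utilities are arbitrary example numbers.

```python
def minimax_value(node, maximizing):
    """Minimax value of a node; leaves are numbers, internal nodes are lists."""
    if not isinstance(node, list):  # terminal state: return its utility
        return node
    values = [minimax_value(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def minimax_decision(root):
    """Index of the root move (MAX to play) with the highest minimax value."""
    values = [minimax_value(child, maximizing=False) for child in root]
    return max(range(len(root)), key=lambda i: values[i])


# Two plies: MAX chooses among three moves, then MIN replies.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
root_value = minimax_value(tree, maximizing=True)
best_move = minimax_decision(tree)
```

Here the three MIN nodes have values 3, 2, and 2, so the root's minimax value is 3 and the minimax decision is the first move (index 0).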

### 5.2.1 The Minimax Search Algorithm

- **Purpose of Minimax Algorithm**
   - Designed for two-player zero-sum games.
   - It computes the best move by considering all possible moves of the opponent.
- **Working Principle**
   - The algorithm searches through the game tree by expanding nodes.
   - It assigns values to terminal states (win, lose, draw).
   - These values are propagated back up the tree to determine the best move.
- **Evaluation of Moves**
   - The algorithm evaluates moves by assuming that the opponent plays optimally.
   - For the maximizing player (MAX), it selects the move with the highest value.
   - For the minimizing player (MIN), it selects the move with the lowest value.
- **Depth of the Tree**
   - The depth of analysis (number of plies) can be adjusted.
   - Deeper trees provide more accurate decisions but require more computation.
- **Pseudo-code for the Minimax Algorithm**

```
function MINIMAX(node, depth, isMaximizingPlayer):
    // Stop at the depth limit or at the end of the game.
    if depth == 0 or node is a terminal node:
        return the value of the node   // utility if terminal, else a heuristic estimate

    if isMaximizingPlayer:
        // MAX keeps the highest value among its children.
        bestValue = -∞
        for each child of node:
            val = MINIMAX(child, depth - 1, false)
            bestValue = max(bestValue, val)
        return bestValue

    else:
        // MIN keeps the lowest value among its children.
        bestValue = +∞
        for each child of node:
            val = MINIMAX(child, depth - 1, true)
            bestValue = min(bestValue, val)
        return bestValue
```
- **Implementation Considerations**
   - The algorithm is typically implemented recursively.
   - It requires a well-defined evaluation function for non-terminal states when the depth limit is reached.
- **Optimization Techniques**
   - Techniques like alpha-beta pruning can be used to reduce the number of nodes evaluated.
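
A runnable Python rendering of the pseudo-code above might look like the following sketch. The `children` and `evaluate` callbacks are assumptions of this example (the pseudo-code leaves them abstract), and the toy game, in which players alternately remove 1 or 2 sticks and taking the last stick wins, is hypothetical.

```python
import math


def minimax(node, depth, is_maximizing, children, evaluate):
    """Depth-limited minimax, mirroring MINIMAX(node, depth, isMaximizingPlayer).

    children(node) returns the successor nodes (empty for terminal nodes).
    evaluate(node, is_maximizing) returns a score from MAX's point of view.
    """
    successors = children(node)
    if depth == 0 or not successors:
        return evaluate(node, is_maximizing)

    if is_maximizing:
        best = -math.inf
        for child in successors:
            best = max(best, minimax(child, depth - 1, False, children, evaluate))
        return best

    best = math.inf
    for child in successors:
        best = min(best, minimax(child, depth - 1, True, children, evaluate))
    return best


def children(sticks):
    # A node is just the number of sticks left; a move removes 1 or 2 of them.
    return [sticks - k for k in (1, 2) if sticks - k >= 0]


def evaluate(sticks, is_maximizing):
    if sticks == 0:
        # The previous player took the last stick, so the player to move lost.
        return -1 if is_maximizing else 1
    return 0  # depth limit reached in a non-terminal state: no information


win = minimax(4, 10, True, children, evaluate)     # deep enough to see the end
loss = minimax(3, 10, True, children, evaluate)
unknown = minimax(6, 2, True, children, evaluate)  # cut off after two plies
```

With a generous depth limit the search reaches terminal states and returns exact values; with a limit of two plies from six sticks it only ever sees the placeholder score 0, which is why a useful evaluation function matters once the depth is limited.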

### 5.2.2 Optimal Decisions in Multiplayer Games

- **Vector of Values for Each Node**
   - In multiplayer games, each node in the game tree represents a vector of values rather than a single value.
   - This vector accounts for the payoffs or utilities for each player, reflecting the multi-dimensional nature of the game's outcomes.
- **Complexity in Multiplayer Games**
   - Multiplayer games are more complex than two-player games due to the increased number of possible interactions and outcomes.
   - The strategy involves not only competing against multiple players but also potentially forming alliances and collaborations.
- **Concepts of Alliances and Collaboration**
   - **Alliances**: Players may form temporary alliances to achieve mutual goals, although these alliances can shift or dissolve as the game progresses.
   - **Collaboration**: Players might collaborate for mutual benefit, influencing the game dynamics and strategic decisions.
   - **Impact on Strategy**: The possibility of alliances and collaborations adds layers of strategic depth, as players must consider not only individual strategies but also group dynamics and potential betrayals.
- **Decision-Making in Multiplayer Context**
   - Decision-making becomes more complex, as each move affects multiple players differently.
   - Strategies must account for the actions and potential responses of all players, not just a single adversary.
- **Evaluation of Game States**
   - Evaluating game states involves assessing the implications for all players, not just assessing a win/lose outcome for a single player.
   - At each node, the player to move chooses the successor whose vector has the highest value in that player's own component, and that vector is backed up as the value of the node.

This section highlights the intricacies of decision-making in multiplayer games, emphasizing the need to consider a broader range of outcomes and interactions compared to two-player games. The concept of a vector of values for each node and the dynamics of alliances and collaboration significantly impact the strategies and evaluation methods in multiplayer game settings.
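
The backing-up of value vectors can be sketched as follows: at each node the player to move keeps the child vector that is best in that player's own component. The tree encoding (tuples as terminal utility vectors, lists as internal nodes), the function name `maxn_value`, the fixed turn order, and the example utilities are all assumptions of this illustration.

```python
def maxn_value(node, player, num_players):
    """Back up utility vectors through a multiplayer game tree.

    Leaves are tuples holding one utility per player; internal nodes are
    lists of children. Players move in the fixed order 0, 1, 2, ...
    """
    if isinstance(node, tuple):  # terminal state
        return node
    next_player = (player + 1) % num_players
    child_vectors = [maxn_value(c, next_player, num_players) for c in node]
    # The player to move maximizes its own component; ties keep the first child.
    return max(child_vectors, key=lambda v: v[player])


# Three players A (0), B (1), C (2): A moves at the root, then B, then C.
tree = [
    [[(1, 2, 6), (4, 2, 3)], [(6, 1, 2), (7, 4, 1)]],
    [[(5, 1, 1), (1, 5, 2)], [(7, 7, 1), (5, 4, 5)]],
]
root_vector = maxn_value(tree, player=0, num_players=3)
```

Note that the result depends on how ties are broken (here, the first child wins), which is one reason multiplayer analysis is less clear-cut than the two-player case.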

### 5.2.3 Alpha–Beta Pruning

- **Key Idea of Alpha–Beta Pruning**
   - **Efficiency Improvement**: Alpha-beta pruning is a technique used to improve the efficiency of the minimax algorithm. It reduces the number of nodes that are evaluated in the search tree, without affecting the final decision.
   - **Pruning Redundant Nodes**: The method involves skipping the evaluation of certain branches of the tree that cannot possibly influence the final decision. This is done by maintaining two values, alpha and beta, which represent the minimum score that the maximizing player is assured of (alpha) and the maximum score that the minimizing player is assured of (beta).
- **Alpha and Beta Values**
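   - **Alpha (α)**: The value of the best (highest-value) choice found so far for the maximizing player at any choice point along the path to the current node. It is a lower bound on what MAX can guarantee.
   - **Beta (β)**: The value of the best (lowest-value) choice found so far for the minimizing player at any choice point along the path to the current node. It is an upper bound on what MIN can be forced to concede.
- **Pruning Condition**
   - When α ≥ β at a node, its remaining children are skipped. At a MIN node this means the node's value has already dropped to α or below, so MAX has a better option elsewhere and will never move here; at a MAX node it means the value has reached β or above, so MIN will never allow play to reach it.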
- **Effectiveness**
   - **Depth of Search**: Alpha-beta pruning allows for deeper searches in the game tree within the same time constraints, as it significantly cuts down the number of nodes to be evaluated.
   - **Optimal Decision Retention**: Despite pruning parts of the tree, the algorithm still returns the same decision as it would have without pruning.
- **Application**
   - Commonly used in various games, especially in chess and other complex strategy games, to enhance the performance of AI players.

Alpha-beta pruning is a critical optimization technique in game playing AI. By intelligently eliminating unnecessary parts of the search tree, it enables more efficient computation without compromising the accuracy of the decision-making process. This method is particularly valuable in games with large search trees, where exhaustive exploration is computationally impractical.
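
The pruning rule can be sketched on a small explicit tree. The encoding (nested lists with numeric leaves), the function name, and the `visited` list used to record which leaves were actually evaluated are assumptions made for this illustration.

```python
import math


def alphabeta(node, alpha, beta, maximizing, visited):
    """Alpha-beta value of a node; leaves are numbers, internal nodes are lists.

    `visited` collects the leaves that were actually evaluated.
    """
    if not isinstance(node, list):
        visited.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)
            if alpha >= beta:  # MIN higher up will never let play reach here
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, value)
        if alpha >= beta:  # MAX higher up already has a better option
            break
    return value


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited = []
root_value = alphabeta(tree, -math.inf, math.inf, True, visited)
```

On this tree the search returns 3, the same value plain minimax would, but evaluates only 7 of the 9 leaves: after seeing the leaf 2 in the second subtree, that MIN node is worth at most 2, which is already worse for MAX than the 3 guaranteed by the first move, so the leaves 4 and 6 are never examined.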

### 5.2.4 Move Ordering

- **Effect on Alpha-Beta Pruning Effectiveness**
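   - Move ordering significantly affects the efficiency of alpha-beta pruning: a cutoff can only happen after a good enough move has been examined, so trying the likely-best moves first produces more prunings. With the best possible ordering, alpha-beta examines roughly O(b^(m/2)) nodes instead of the O(b^m) of plain minimax (b is the branching factor, m the search depth), so it can search about twice as deep in the same time.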
- **Iterative Deepening**
   - This is a strategy where the search is conducted repeatedly, increasing the depth limit with each iteration.
   - It ensures that the best moves from shallower searches are tried first in deeper searches, potentially improving move ordering in subsequent iterations.
- **Killer Move Heuristic**
   - A "killer move" is a move that has caused a cutoff in another branch of the search tree at the same depth.
   - These moves are given priority in the search order as they are likely to be strong moves.
- **Transpositions and Transposition Table**
   - A transposition occurs when two different sequences of moves lead to the same game position.
   - A transposition table stores the evaluations of positions already searched, so the search does not need to re-evaluate a position when it is reached again by a different move order.
- **Claude Shannon’s Type A and Type B Strategies**
   - **Type A Strategy**: Considers all possible moves to a certain depth and then applies a heuristic evaluation function to the states at that depth. It explores a wide but shallow portion of the tree and is more brute-force in nature.
   - **Type B Strategy**: Ignores moves that look bad and follows promising lines as far as possible. It explores a deep but narrow portion of the tree.
- **Importance of Move Ordering**
   - Effective move ordering can lead to significant improvements in search efficiency, allowing deeper searches within the same computational constraints.
   - It is an essential aspect of optimizing search strategies in game-playing AI, especially in conjunction with alpha-beta pruning.

Move ordering in the context of alpha-beta pruning is a crucial element in game-playing AI. By prioritizing certain moves, especially those that have proven effective in similar situations (like killer moves) or avoiding redundant evaluations (as with transposition tables), the AI can search more efficiently and effectively. The balance between breadth (Type A) and depth (Type B) strategies as described by Shannon also plays a key role in determining the overall approach to game-playing AI.
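
The effect of move ordering on pruning can be checked with a small experiment. The sketch below runs alpha-beta on the same position under two move orders and counts evaluated leaves; the tree encoding (nested lists), the function name `count_leaves`, and the example values are assumptions of this illustration.

```python
import math


def count_leaves(node, alpha=-math.inf, beta=math.inf, maximizing=True):
    """Run alpha-beta and return (value, number_of_leaves_evaluated)."""
    if not isinstance(node, list):
        return node, 1
    value = -math.inf if maximizing else math.inf
    leaves = 0
    for child in node:
        child_value, n = count_leaves(child, alpha, beta, not maximizing)
        leaves += n
        if maximizing:
            value = max(value, child_value)
            alpha = max(alpha, value)
        else:
            value = min(value, child_value)
            beta = min(beta, value)
        if alpha >= beta:  # cutoff: remaining children cannot matter
            break
    return value, leaves


# Same position, two move orders. In `good`, MAX's strongest move comes first
# and every MIN node lists its strongest (lowest) reply first.
poor = [[2, 5, 14], [6, 4, 2], [12, 8, 3]]
good = [[3, 8, 12], [2, 4, 6], [2, 5, 14]]

poor_value, poor_leaves = count_leaves(poor)
good_value, good_leaves = count_leaves(good)
```

Both orders give the same value, 3, but the poor order evaluates all 9 leaves while the good order evaluates only 5. In a real program the ideal order is not known in advance, which is why heuristics such as killer moves and the results of shallower iterative-deepening passes are used to approximate it.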


## 5.3 Heuristic Alpha–Beta Tree Search

- **Evaluation Function**
   - **Purpose**: Since it's impractical to search the entire game tree in most games, the evaluation function is used to estimate the desirability of a game position.
   - **Implementation**: It assigns a numerical value to non-terminal states, helping the algorithm decide which moves are the most promising.
   - **Design**: A good evaluation function captures important aspects of the game position (like material count in chess, control of the center, pawn structure, etc.).
- **Cutoff Test**
   - **Definition**: The cutoff test determines when to apply the evaluation function instead of searching deeper. It acts as a stopping condition for the search.
   - **Criteria**: Typically based on the depth of the search (depth limit) or other game-specific considerations (like time constraints or a specific state of the game board).
- **Heuristic Alpha-Beta Pruning**
   - **Combination with Heuristics**: The alpha-beta pruning technique is often combined with heuristics to enhance its effectiveness.
   - **Role of Heuristics**: Heuristics guide the search process, influencing which parts of the tree are pruned and the order in which nodes are explored.
   - **Improved Efficiency**: By applying heuristics, the algorithm can prune more branches and reduce the search space, leading to faster decision-making.
- **Balancing Accuracy and Efficiency**
   - **Depth vs. Breadth**: There is a trade-off between the depth of the search and the breadth. Heuristics help in balancing this by focusing the search on the most promising areas.
   - **Complexity of the Evaluation Function**: More complex evaluation functions may provide more accurate assessments but are computationally expensive.
- **Practical Considerations**
   - **Time Constraints**: In real-world applications, time constraints often dictate the depth of the search, making efficient heuristics crucial.
   - **Dynamic Adjustments**: The algorithm may dynamically adjust the depth of the search or the use of heuristics based on the state of the game.

In this section, the focus is on the practical application of alpha-beta pruning in real game scenarios, where it's not feasible to search the entire game tree. The evaluation function and the cutoff test are central to this approach, providing a balance between exploring enough of the game tree to make a good decision and doing so within a reasonable amount of time and computational resources. The use of heuristics in this context is key to enhancing the performance and effectiveness of game-playing AI.

### 5.3.1 Evaluation Functions

- **Features of the State**
   - **Definition**: Features of a game state are specific attributes that can be quantitatively assessed to evaluate the state.
   - **Examples**: In chess, these could include material balance, control of the center, pawn structure, king safety, etc.
- **Material Value in Chess**
   - **Concept**: Assigns numerical values to pieces based on their relative strength and importance.
   - **Example**: Pawns might be valued at 1, knights and bishops at 3, rooks at 5, and the queen at 9. The king is invaluable as losing it means losing the game.
- **Expected Value**
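   - **Definition**: Represents the average outcome over the possibilities being considered, each weighted by its probability. For an evaluation function, the possibilities are the states that share the same feature values, only some of which lead to a win.
   - **Usage**: Gives a single score to a category of states whose outcome is uncertain; for example, if 82% of the states in a category lead to a win (utility 1), 2% to a loss (0), and 16% to a draw (1/2), the expected value is 0.82 × 1 + 0.02 × 0 + 0.16 × 0.5 = 0.90. The same idea is used in games with elements of chance, like backgammon or card games.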
- **Weighted Linear Function**
   - **Function**: Combines various features of a state into a single numerical value.
   - **Form**: It's typically a weighted sum of feature values, where each feature is assigned a weight reflecting its relative importance.
- **Quiescence**
   - **Definition**: Refers to a state of the game where there are no immediate drastic changes expected (like captures or checks in chess).
   - **Importance**: The evaluation function should be applied only to quiescent positions; a static score taken in the middle of an exchange of captures can be badly wrong.
- **Quiescence Search**
   - **Purpose**: To avoid evaluating unstable positions (and to reduce the horizon effect), it continues to search beyond the basic depth limit in positions that are not quiescent.
   - **Implementation**: Typically looks only at moves that could result in significant material changes, like captures or major threats, until a quiescent position is reached.
- **Horizon Effect**
   - **Definition**: A limitation of fixed-depth search: events beyond the depth limit are invisible, so the program may use delaying moves to push an unavoidable loss past the 'horizon' and then believe it has been avoided.
   - **Problem**: Leads to poor evaluations, and often to pointless sacrifices, when the damaging move lies just outside the search depth.
- **Singular Extensions Strategy**
   - **Purpose**: To mitigate the horizon effect.
   - **Method**: Extends the search selectively for a singular move, one that is clearly better than all other moves in the position, rather than uniformly increasing the search depth.
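
A weighted linear evaluation has the form `EVAL(s) = w1*f1(s) + ... + wn*fn(s)`. The sketch below combines the material values listed above with two extra features; the position encoding (a string of piece letters), the feature names, the feature values for center control and king safety, and all weights are made-up assumptions for illustration.

```python
# Conventional material values: pawn 1, knight and bishop 3, rook 5, queen 9.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}


def material_balance(pieces):
    """Material feature: White's material minus Black's.

    `pieces` is a string of piece letters, uppercase for White and lowercase
    for Black. Kings ("K"/"k") carry no material value here.
    """
    score = 0
    for p in pieces:
        value = PIECE_VALUES.get(p.upper(), 0)
        score += value if p.isupper() else -value
    return score


def evaluate(features, weights):
    """Weighted linear evaluation: the sum of w_i * f_i over named features."""
    return sum(weights[name] * value for name, value in features.items())


# Hypothetical position: White has K, Q, R and 3 pawns;
# Black has k, two rooks and 4 pawns.
pieces = "KQRPPP" + "krrpppp"
features = {
    "material": material_balance(pieces),  # 17 - 14 = 3
    "center_control": 1,                   # made-up feature value
    "king_safety": -2,                     # made-up feature value
}
weights = {"material": 1.0, "center_control": 0.25, "king_safety": 0.5}  # assumed
score = evaluate(features, weights)        # 3*1.0 + 1*0.25 + (-2)*0.5 = 2.25
```

A linear form assumes each feature contributes independently of the others, which is a simplification: in practice the worth of a piece can depend on the rest of the position.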

### 5.3.2 Cutoff Tests

The next step is to modify ALPHA-BETA-SEARCH so that it will call the heuristic EVAL function when it is appropriate to cut off the search.

- **Definition**
   - **Purpose**: The cutoff test takes the place of the terminal test in the search: it returns true for terminal states and for states where the search should stop, at which point the evaluation function is applied instead of searching deeper.
   - **Criteria**: The simplest criterion is a fixed depth limit. When time is the constraint, iterative deepening is more robust: when time runs out, the program plays the move chosen by the deepest completed search.
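
A minimal sketch of alpha-beta search with a cutoff test follows. The node representation (a dict with a static `eval` score and a list of `children`), the helper names `node`, `is_cutoff`, and `h_alphabeta`, and the example scores are all assumptions made for illustration.

```python
import math


def node(eval_score, *children):
    """Hypothetical tree node: a static evaluation plus child nodes."""
    return {"eval": eval_score, "children": list(children)}


def is_cutoff(state, depth, limit):
    # Cut off at terminal states (no children) or when the depth limit is hit.
    return not state["children"] or depth >= limit


def h_alphabeta(state, limit, depth=0, alpha=-math.inf, beta=math.inf,
                maximizing=True):
    """Alpha-beta search that returns the static evaluation at cutoff states."""
    if is_cutoff(state, depth, limit):
        return state["eval"]
    value = -math.inf if maximizing else math.inf
    for child in state["children"]:
        v = h_alphabeta(child, limit, depth + 1, alpha, beta, not maximizing)
        if maximizing:
            value = max(value, v)
            alpha = max(alpha, value)
        else:
            value = min(value, v)
            beta = min(beta, value)
        if alpha >= beta:
            break
    return value


root = node(0,
            node(5, node(1), node(9)),  # looks good statically, but MIN can reach 1
            node(3, node(4), node(6)))  # looks worse statically, but guarantees 4

shallow = h_alphabeta(root, limit=1)  # relies on the static scores 5 and 3
deeper = h_alphabeta(root, limit=2)   # searches one ply further
```

With a limit of one ply the search trusts the static scores and returns 5; one ply deeper it finds that the first move is really worth 1 and returns 4 for the second move instead. The example is contrived, but it shows how the choice of cutoff depth interacts with the accuracy of the evaluation function.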



### 5.3.3 Forward Pruning

- **Concept of Forward Pruning**
   - **Definition**: Forward pruning refers to the practice of pruning moves from the search tree early on (forward in the search), based on the assumption that they appear to be poor choices.
   - **Risk**: This approach carries the risk of missing potentially good moves that only reveal their value deeper in the search tree.
- **Beam Search**
   - **Explanation**: Beam search is a type of forward pruning. It limits the number of moves (branches) explored at each level of the tree, focusing only on a predetermined number of best moves (the "beam").
   - **Application**: Often used in games where the branching factor is very high, making it impractical to explore all possible moves.
- **PROBCUT Algorithm (Buro 1995)**
   - **Overview**: PROBCUT (probabilistic cut) is a forward-pruning version of alpha-beta search that uses statistics from prior experience to lessen the chance that the best move is pruned.
   - **Functionality**: Plain alpha-beta prunes nodes that are provably outside the current (α, β) window; PROBCUT also prunes nodes that are probably outside it. It runs a shallow search to get a backed-up value for the node and uses past experience to estimate how likely a full-depth search would be to fall outside the window.
- **Late Move Reduction**
   - **Concept**: Assumes the moves have been ordered well, so that moves appearing later in the list are less likely to be good.
   - **Implementation**: Instead of pruning the later moves outright, the search examines them to a reduced depth. If a reduced search returns a value above the current α, the move is searched again at full depth.
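
Forward pruning with a beam, and the risk it carries, can be illustrated with a small sketch. The node representation (a dict with a static `eval` score and `children`), the function names, and the example scores are assumptions made for this illustration.

```python
def node(eval_score, *children):
    """Hypothetical tree node: a static evaluation plus child nodes."""
    return {"eval": eval_score, "children": list(children)}


def beam_minimax(state, width, maximizing=True):
    """Minimax that forward-prunes all but the `width` most promising moves.

    Moves are ranked by static evaluation from the point of view of the
    player to move: highest first for MAX, lowest first for MIN.
    """
    if not state["children"]:
        return state["eval"]
    ranked = sorted(state["children"], key=lambda c: c["eval"], reverse=maximizing)
    beam = ranked[:width]  # everything outside the beam is never searched
    values = [beam_minimax(c, width, not maximizing) for c in beam]
    return max(values) if maximizing else min(values)


root = node(0,
            node(5, node(1), node(9)),  # statically attractive, really worth 1
            node(3, node(4), node(6)))  # statically modest, really worth 4

full = beam_minimax(root, width=2)    # width 2 keeps every move in this tree
narrow = beam_minimax(root, width=1)  # keeps only the top-ranked move per node
```

With a beam of width 2 nothing is pruned and the result, 4, matches plain minimax. With width 1 the search keeps only the statically best-looking move at the root, never examines the second move, and settles for 1: exactly the kind of missed move that forward pruning risks.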

[Stockfish](https://stockfishchess.org/) is a free and open-source chess engine, originally developed by Tord Romstad, Marco Costalba, and Joona Kiiski. It is consistently ranked among the top chess engines in the world, competing with the likes of Komodo and Houdini. It also powers the computer analysis on the popular chess website [Lichess](https://lichess.org/), among others.

![Stockfish](https://stockfishchess.org/images/logo/icon_512x512@2x.png)

- **STOCKFISH Chess Program**
   - **Hybrid Approach**: STOCKFISH uses a hybrid approach to move pruning, combining several techniques including forward pruning.
   - **Strength**: STOCKFISH is known for its exceptional strength, partly attributed to its sophisticated and efficient pruning strategies.
   - **Adaptation**: The program dynamically adjusts its search strategies based on the specifics of the position and the depth of the search.

In summary, forward pruning in game-playing AI involves making early decisions to discard certain moves based on their perceived lack of promise. While this can greatly enhance efficiency by reducing the search space, it carries the inherent risk of overlooking potentially good moves. Beam search accepts that risk directly, while PROBCUT and late move reductions limit it: the former uses statistics to prune only moves that are probably irrelevant, and the latter searches late moves less deeply rather than discarding them. Programs like STOCKFISH demonstrate the effectiveness of these techniques, particularly when they are part of a dynamic and multifaceted search strategy.