# Adversarial Search and Games

From: [Artificial Intelligence: A Modern Approach](http://aima.cs.berkeley.edu/) by Stuart Russell and Peter Norvig.

Note:

- Chapter 5 in US edition.
- Chapter 6 in International (Global) edition.

<img src="https://github.com/ValRCS/RBS_PBM773_Introduction_to_AI/blob/main/img/ch05_adversarial_search_and_games/DALL%C2%B7E%202024-01-22%2011.06.52%20-%20A%20revised%20futuristic%20illustration%20depicting%20a%20tic%20tac%20toe%20game%20between%20two%20robots.%20The%20environment%20is%20a%20high-tech,%20sci-fi%20setting%20with%20sleek%20metallic%20.png?raw=true" width="400">

## Introduction


- **Characteristics**: Adversarial environments are those in which agents compete against each other.
- **Zero-Sum Games**: These environments are often modeled as zero-sum games, where one agent's gain is another's loss.
- **Multi-Agent Systems**: They involve multiple agents with conflicting goals.
- **Dynamic Interaction**: Agents' actions directly affect each other.
- **Uncertainty and Strategy**: Requires anticipation of opponents' actions.

### Adversarial Search

- **Definition**: A search in a competitive environment where the outcome depends on actions of multiple agents (opponents).
- **Goal**: To find optimal strategies that lead to success in spite of adversaries' actions.
- **Minimax Algorithm**: A common approach in adversarial search to minimize the possible loss in a worst-case scenario.
- **Game Trees**: Used to represent possible moves by players.
- **Evaluation Functions**: Estimate the desirability of game positions when it's impractical to search until the end of the game.

## Examples of Games

- **Chess**: A classic two-player strategy board game with perfect information.
- **Checkers (Draughts)**: A two-player perfect-information board game with simpler rules than chess; the English/American variant is played on an 8x8 board.
- **Go**: An ancient board game known for its complexity and rich strategy, played on a 19x19 grid.
- **Poker**: A card game characterized by imperfect information and uncertainty.
- **Tic-Tac-Toe**: A simple two-player game often used as an introductory example of game theory and AI strategy.

These examples span the main settings for game-playing AI: perfect versus imperfect information, and small versus very large search spaces, each calling for different search strategies.

### Physical Games

Physical games, as opposed to traditional tabletop games like chess or checkers, involve a complex interplay of physical skills, real-time decision making, and often team dynamics. 

Physical games were long considered too complex for AI. Advances in robotics and reinforcement learning have begun to enable AI agents to compete in restricted versions of physical games, such as robot soccer (for example, in the RoboCup competitions).

Here are some examples of physical games and an explanation of their complexity compared to tabletop games:

### Soccer (Football)

- **Dynamic Environment**: The game is played in a large, open field with continuous play, requiring constant adaptation.
- **Physical Skills**: Players need to excel in coordination, endurance, and agility.
- **Team Strategy**: Involves complex team dynamics and coordination for defense, midfield control, and attacking.
- **Real-Time Decisions**: Players make split-second decisions under physical pressure.
- **Unpredictable Play**: The ball's movement can be highly unpredictable, influenced by weather, field conditions, and players' actions.

### Basketball

- **Rapid Pace**: The game is fast-paced with frequent changes in possession and continuous movement.
- **Physical Interaction**: Close physical interaction between players, including blocking and dodging.
- **Spatial Awareness**: High level of spatial awareness required to navigate the court and anticipate opponents' and teammates' movements.
- **Skill Diversity**: Players need a diverse set of skills like shooting, dribbling, passing, and defending.

### Tennis

- **Individual Strategy**: Singles tennis requires a high level of personal strategy, adapting to opponents' playing style.
- **Physical Precision**: Precision in serving and returning, along with stamina, plays a crucial role.
- **Psychological Aspect**: Players often have to overcome psychological challenges and maintain focus throughout the match.

### American Football

- **Complex Playbooks**: Teams have extensive playbooks, requiring players to memorize and execute complex plays.
- **Physical Contact**: High level of physical contact and tackling, demanding robust physical conditioning.
- **Positional Specialization**: Each player has a specialized role that contributes differently to the team's strategy.

### Volleyball

- **Team Coordination**: Requires excellent team coordination, especially in setting up and executing attacks.
- **Reflexes and Agility**: Players must have quick reflexes to respond to fast-moving plays and spikes.
- **Spatial and Tactical Awareness**: Positioning and movement on the court are critical for both offense and defense.

### Comparing to Tabletop Games

- **Physicality**: Physical games require athletic skills, whereas tabletop games focus on mental strategy.
- **Real-Time Dynamics**: Physical games often happen in real-time, demanding immediate responses, unlike turn-based tabletop games.
- **Environmental Factors**: Factors like weather, field conditions, and physical fatigue significantly influence physical games.
- **Team Interaction**: Physical games, especially team sports, involve complex interpersonal dynamics and coordination.
- **Unpredictability**: The higher level of unpredictability in physical games arises from real-time dynamics and physical interactions.

## Real-Time Strategy (RTS) Games

Real-Time Strategy (RTS) games are a genre of video games particularly suited for AI research due to their complexity, real-time decision-making requirements, and dynamic environments. Here are some examples of RTS games and an explanation of their suitability for AI research:

### Examples of RTS Games

- **StarCraft and StarCraft II**
   - **Complexity**: Features multiple factions with unique units and strategies.
   - **Economy and Resource Management**: Players must efficiently gather resources and manage economies.
   - **Fog of War**: Limited visibility of the map increases uncertainty and strategic depth.
- **Age of Empires Series**
   - **Historical Context**: Players build and manage civilizations through different historical eras.
   - **Diverse Strategies**: Offers a variety of military, economic, and technological strategies.
   - **Large Scale**: Involves managing large armies and settlements.
- **Command & Conquer Series**
   - **Asymmetrical Factions**: Different factions have distinct units and technologies.
   - **Base Building and Defense**: Emphasizes base construction and defense along with offensive strategies.
   - **Real-Time Tactics**: Requires quick tactical decisions in combat situations.
- **Warcraft III**
   - **Hero Units**: Introduces powerful hero units that can turn the tide of battles.
   - **Micro and Macro Management**: Requires both individual unit control (micro) and overall strategy and resource management (macro).
- **Company of Heroes**
   - **Realistic Tactics and Cover System**: Focuses on realistic military tactics and the use of cover.
   - **Resource Points Control**: Involves capturing specific points on the map to gain resources.

### Suitability for AI Research

- **Complex Decision-Making**: RTS games require complex, multi-layered decision-making, making them ideal for studying and developing sophisticated AI algorithms.
- **Real-Time Analysis**: The real-time nature of these games challenges AI to analyze and react to situations swiftly and efficiently.
- **Resource Management**: AI can be tested on its ability to optimally allocate and use resources, a common problem in many real-world scenarios.
- **Strategy Formulation**: AI must formulate short-term and long-term strategies based on incomplete and dynamically changing information.
- **Learning and Adaptation**: These games provide a rich environment for AI to learn and adapt to different strategies and opponents.
- **Handling Uncertainty**: With elements like fog of war and unpredictable opponents, RTS games are excellent for developing AI that can handle uncertainty and incomplete information.
- **Multi-Objective Optimization**: Balancing various objectives (e.g., economy, defense, attack) in an RTS game mirrors real-world scenarios where multiple objectives must be managed concurrently.

The complexity, pace, and strategic depth of RTS games make them an excellent platform for advancing AI research, particularly in areas like decision-making, learning, and adaptation to complex and dynamic environments.

## 5.1 - Game Theory

- **Three Stances Towards Multi-Agent Environments**
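   - **Agents in Aggregate as an Economy**: When there is a very large number of agents, they can be considered in the aggregate as an economy. This allows predictions such as "rising demand will cause prices to rise" without modeling the action of any individual agent.
   - **Agents as Part of the Environment**: Viewing adversarial agents as just a part of the environment that makes it nondeterministic, without modeling their goals.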
   - **Modeling Adversarial Agents Explicitly**: Focusing on explicitly modeling the actions and strategies of adversarial agents. This is the main focus of this chapter.
- **Modeling the Adversarial Agents Explicitly**
   - **Strategic Planning**: Developing strategies based on the anticipated actions of opponents.
   - **Predictive Modeling**: Anticipating the moves of the adversary to optimize one's own strategy.
- **Concepts in Adversarial Search and Game Theory**
   - **Pruning**: Techniques like alpha-beta pruning are used to efficiently search the game tree by eliminating branches that cannot possibly influence the final decision.
   - **Evaluation Function**: A heuristic used to estimate the desirability of a game position when it is impractical to search until the end of the game.
   - **Imperfect Information**: Games like poker, where players do not have complete information about the game state, requiring strategies that can handle uncertainty and probabilistic outcomes.
- **Emphasis on Explicit Agent Modeling**
   - **Deep Analysis of Opponents' Strategies**: Understanding and countering specific strategies used by opponents.
   - **Adaptive Tactics**: Adjusting one’s own strategy in response to the observed behavior of adversaries.
   - **Complex Decision Making**: Making decisions based on a mixture of strategic planning, predictive modeling, and real-time analysis of the game state.

This subchapter emphasizes the importance of understanding and explicitly modeling adversarial agents in games, focusing on strategic decision-making, the use of evaluation functions, and dealing with imperfect information. These concepts are crucial in developing AI that can effectively navigate and succeed in multi-agent, competitive environments.

### 5.1.1 - Two-Player Zero-Sum Games

- **Definition and Characteristics**
   - **Perfect Information**: These games are fully observable: both players can see the complete current state of the game at all times.
   - **Zero-Sum Nature**: In a zero-sum game, one player's gain is exactly balanced by the other player's loss.
- **Basic Concepts: Move and Position**
   - **Move**: An action taken by a player that transitions the game from one state to another.
   - **Position**: The current state of the game, determined by the sequence of moves made up to that point.
- **Game Structure**
   - **Initial State**: The starting state of the game before any moves are made.
   - **Transition Model**: Rules that determine how a player’s move changes the state of the game.
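   - **Terminal Test**: A test to determine when the game has ended (reaching a terminal state).
   - **Terminal State**: A state where the game ends, with a win, loss, or draw.
   - **Utility Function**: Assigns a numeric value to each terminal state, for example +1, 0, or -1 for a win, draw, or loss from one player's point of view.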
- **Representation of Game States**
   - **State Space Graph**: A graph whose vertices are the states of the game and whose edges are moves. A state that can be reached by several move sequences still appears only once.
   - **Search Tree**: A tree superimposed on part of the state space graph, rooted at the current state and branching on the available moves, used to decide which move to make. A state reachable by several move sequences appears once per path.
   - **Game Tree**: The complete game tree is a search tree that follows every sequence of moves, by both players, all the way to a terminal state.
- **Differences Between State Space Graph, Search Tree, and Game Tree**
   - **State Space Graph vs. Search Tree**: The state space graph describes the game itself (every state and transition, each state once), while a search tree describes the paths explored from a particular state and may repeat states.
   - **Search Tree vs. Game Tree**: A search tree may cover only part of the game, for example to a limited depth, whereas the complete game tree extends every line of play to a terminal state. For most games it is far too large to build in full.
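
The game structure above can be sketched in code. The following is a minimal illustration for tic-tac-toe, not a reference implementation: the class name `TicTacToe`, its method names, and the tuple-based state are all assumptions made for this example. The `utility` method (+1, 0, or -1 from X's point of view) follows the standard formulation of such games.

```python
from typing import List, Optional, Tuple

# A state is a tuple of 9 cells, each "X", "O", or None (empty).
State = Tuple[Optional[str], ...]

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals


class TicTacToe:
    """Hypothetical minimal game interface; all names are illustrative."""

    def initial_state(self) -> State:
        # Initial state: an empty board.
        return (None,) * 9

    def to_move(self, state: State) -> str:
        # X moves first, so X is to move whenever both have played equally often.
        return "X" if state.count("X") == state.count("O") else "O"

    def actions(self, state: State) -> List[int]:
        # Legal moves: indices of the empty cells.
        return [i for i, cell in enumerate(state) if cell is None]

    def result(self, state: State, action: int) -> State:
        # Transition model: the player to move marks the chosen cell.
        board = list(state)
        board[action] = self.to_move(state)
        return tuple(board)

    def winner(self, state: State) -> Optional[str]:
        for a, b, c in WIN_LINES:
            if state[a] is not None and state[a] == state[b] == state[c]:
                return state[a]
        return None

    def is_terminal(self, state: State) -> bool:
        # Terminal test: someone has won, or the board is full (draw).
        return self.winner(state) is not None or None not in state

    def utility(self, state: State) -> int:
        # Value of a terminal state from X's point of view.
        return {"X": 1, "O": -1, None: 0}[self.winner(state)]
```

Keeping states immutable (tuples) means `result` never modifies the position it was given, which makes it safe to explore many branches from the same state.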

## 5.2 - Optimal Decisions in Games

- **Concept of MAX and MIN in Games**
   - **MAX**: Represents the player who is trying to maximize the score/outcome of the game.
   - **MIN**: Represents the opposing player who is trying to minimize the score/outcome, or conversely, to maximize their own score in a zero-sum context.
- **Minimax Search**
   - **Definition**: A decision rule used for minimizing the possible loss for a worst-case scenario, assuming an opponent who makes the best moves against you.
   - **Application**: Minimax search is used to determine the best move for a player by considering all possible responses of the opponent.
- **The Concept of 'Ply'**
   - **Definition**: A ply in game theory is a single turn taken by one of the players.
   - **Relevance**: The depth of analysis in a game is often measured in plies (e.g., looking three plies ahead means considering your move, your opponent's move, and your subsequent move).
- **Minimax Value**
   - **Description**: The minimax value of a node in a game tree is the best score that can be guaranteed for the player at that node, assuming optimal play by both players.
   - **Computation**: It is computed by recursively applying the minimax decision process through the game tree.
- **Minimax Decision**
   - **Definition**: The minimax decision is the move at the root node whose successor has the best minimax value: the highest value if MAX is to move, the lowest if MIN is to move.
   - **Determination**: This decision is determined by computing the minimax values of the root's child nodes and comparing them.

This subchapter lays the groundwork for understanding how optimal decisions are made in games, especially two-player zero-sum games with perfect information. The concepts of MAX and MIN, the use of the minimax search algorithm, and the understanding of game depth in terms of plies are crucial for developing strategies in such games. The minimax value and decision concepts are fundamental in determining the best possible outcomes in a game scenario.
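
The minimax value described above can be written as a recursive definition. The function names UTILITY, ACTIONS, and RESULT are notation assumed here for the utility function, the legal moves, and the transition model:

$$
\mathrm{MINIMAX}(s) =
\begin{cases}
\mathrm{UTILITY}(s) & \text{if } s \text{ is terminal} \\
\max_{a \in \mathrm{ACTIONS}(s)} \mathrm{MINIMAX}(\mathrm{RESULT}(s, a)) & \text{if MAX is to move in } s \\
\min_{a \in \mathrm{ACTIONS}(s)} \mathrm{MINIMAX}(\mathrm{RESULT}(s, a)) & \text{if MIN is to move in } s
\end{cases}
$$

As a small worked example, suppose MAX has three moves and each leads to a MIN node with leaf values $(3, 12, 8)$, $(2, 4, 6)$, and $(14, 5, 2)$. The MIN nodes have values $\min(3, 12, 8) = 3$, $\min(2, 4, 6) = 2$, and $\min(14, 5, 2) = 2$, so the root has value $\max(3, 2, 2) = 3$ and the minimax decision is the first move.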

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/6/6f/Minimax.svg/800px-Minimax.svg.png" width="600" alt="minimax">

### 5.2.1 The Minimax Search Algorithm

- **Purpose of Minimax Algorithm**
   - Designed for two-player zero-sum games.
   - It computes the best move by considering all possible moves of the opponent.
- **Working Principle**
   - The algorithm searches through the game tree by expanding nodes.
   - It assigns values to terminal states (win, lose, draw).
   - These values are propagated back up the tree to determine the best move.
- **Evaluation of Moves**
   - The algorithm evaluates moves by assuming that the opponent plays optimally.
   - For the maximizing player (MAX), it selects the move with the highest value.
   - For the minimizing player (MIN), it selects the move with the lowest value.
- **Depth of the Tree**
   - The depth of analysis (number of plies) can be adjusted.
   - Deeper trees provide more accurate decisions but require more computation.
- **Pseudo-code for the Minimax Algorithm**

```
function MINIMAX(node, depth, isMaximizingPlayer):
    # Stop at terminal nodes (exact value) or at the depth limit (heuristic estimate).
    if depth == 0 or node is a terminal node:
        return the value of the node

    if isMaximizingPlayer:
        # MAX picks the child with the highest backed-up value.
        bestValue = -∞
        for each child of node:
            val = MINIMAX(child, depth - 1, false)
            bestValue = max(bestValue, val)
        return bestValue
    else:
        # MIN picks the child with the lowest backed-up value.
        bestValue = +∞
        for each child of node:
            val = MINIMAX(child, depth - 1, true)
            bestValue = min(bestValue, val)
        return bestValue
```
- **Implementation Considerations**
   - The algorithm is typically implemented recursively.
   - It requires a well-defined evaluation function for non-terminal states when the depth limit is reached.
- **Optimization Techniques**
   - Techniques like alpha-beta pruning can be used to reduce the number of nodes evaluated.
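
A runnable version of the pseudo-code might look like the sketch below. It assumes, purely for illustration, that the game tree is given explicitly as nested Python lists whose leaves are numbers; the `evaluate` callback and the helper `best_move` are hypothetical names, not part of any library.

```python
from typing import Callable, List, Optional, Union

# Assumption for this sketch: a node is either a number (terminal value)
# or a list of child nodes.
Node = Union[int, float, List["Node"]]


def minimax(node: Node, depth: int, is_maximizing: bool,
            evaluate: Optional[Callable[[Node], float]] = None) -> float:
    if isinstance(node, (int, float)):
        return node                      # terminal node: exact value
    if depth == 0:
        if evaluate is None:
            raise ValueError("depth limit reached but no evaluation function given")
        return evaluate(node)            # depth limit: heuristic estimate
    values = [minimax(child, depth - 1, not is_maximizing, evaluate)
              for child in node]
    return max(values) if is_maximizing else min(values)


def best_move(root: List[Node], depth: int,
              evaluate: Optional[Callable[[Node], float]] = None) -> int:
    """Index of the root move with the highest minimax value (MAX to move)."""
    values = [minimax(child, depth - 1, False, evaluate) for child in root]
    return max(range(len(values)), key=values.__getitem__)


# MAX moves at the root; each child is a MIN node with three leaves.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
root_value = minimax(tree, depth=2, is_maximizing=True)
chosen = best_move(tree, depth=2)
```

Here the MIN nodes back up 3, 2, and 2, so `root_value` is 3 and `chosen` is index 0. Passing an `evaluate` function and a smaller `depth` turns the same code into a depth-limited search.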

### 5.2.2 Optimal Decisions in Multiplayer Games

- **Vector of Values for Each Node**
   - In multiplayer games, each node in the game tree represents a vector of values rather than a single value.
   - This vector accounts for the payoffs or utilities for each player, reflecting the multi-dimensional nature of the game's outcomes.
- **Complexity in Multiplayer Games**
   - Multiplayer games are more complex than two-player games due to the increased number of possible interactions and outcomes.
   - The strategy involves not only competing against multiple players but also potentially forming alliances and collaborations.
- **Concepts of Alliances and Collaboration**
   - **Alliances**: Players may form temporary alliances to achieve mutual goals, although these alliances can shift or dissolve as the game progresses.
   - **Collaboration**: Players might collaborate for mutual benefit, influencing the game dynamics and strategic decisions.
   - **Impact on Strategy**: The possibility of alliances and collaborations adds layers of strategic depth, as players must consider not only individual strategies but also group dynamics and potential betrayals.
- **Decision-Making in Multiplayer Context**
   - Decision-making becomes more complex, as each move affects multiple players differently.
   - Strategies must account for the actions and potential responses of all players, not just a single adversary.
- **Evaluation of Game States**
   - Evaluating game states involves assessing the implications for all players, not just assessing a win/lose outcome for a single player.
   - Each player selects, at their own nodes, the successor whose vector has the highest value in that player's own component; the backed-up value of a node is the vector of the chosen successor.

This section highlights the intricacies of decision-making in multiplayer games, emphasizing the need to consider a broader range of outcomes and interactions compared to two-player games. The concept of a vector of values for each node and the dynamics of alliances and collaboration significantly impact the strategies and evaluation methods in multiplayer game settings.
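
The vector-valued backup can be sketched as follows. This is a simplified illustration that ignores alliances: it assumes players move in a fixed rotation, that leaves are tuples of utilities (one entry per player), and that each player simply maximizes their own entry. The function name `maxn` and the tree encoding are assumptions for this example.

```python
from typing import List, Tuple, Union

Utilities = Tuple[float, ...]              # one utility per player
Node = Union[Utilities, List["Node"]]      # leaf vector or list of children


def maxn(node: Node, player: int, num_players: int) -> Utilities:
    """Back up utility vectors; `player` is the index of the player to move."""
    if isinstance(node, tuple):
        return node                        # terminal: vector of utilities
    next_player = (player + 1) % num_players
    child_vectors = [maxn(child, next_player, num_players) for child in node]
    # The player to move picks the vector that is best in their own component.
    return max(child_vectors, key=lambda vector: vector[player])


# Three players A (index 0), B (1), C (2) moving in that order.
tree = [
    [[(1, 2, 6), (4, 2, 3)], [(6, 1, 2), (7, 4, 1)]],
    [[(5, 1, 1), (2, 5, 2)], [(7, 7, 1), (5, 4, 5)]],
]
root_vector = maxn(tree, player=0, num_players=3)
```

In this tree, C's choices back up `(1, 2, 6)`, `(6, 1, 2)`, `(2, 5, 2)`, and `(5, 4, 5)`; B then prefers `(1, 2, 6)` and `(2, 5, 2)`; and A, comparing first components 1 and 2, ends with `(2, 5, 2)`.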

### 5.2.3 Alpha–Beta Pruning

- **Key Idea of Alpha–Beta Pruning**
   - **Efficiency Improvement**: Alpha-beta pruning is a technique used to improve the efficiency of the minimax algorithm. It reduces the number of nodes that are evaluated in the search tree, without affecting the final decision.
   - **Pruning Redundant Nodes**: The method involves skipping the evaluation of certain branches of the tree that cannot possibly influence the final decision. This is done by maintaining two values, alpha and beta, which represent the minimum score that the maximizing player is assured of (alpha) and the maximum score that the minimizing player is assured of (beta).
- **Alpha and Beta Values**
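   - **Alpha (α)**: The value of the best (highest-value) choice found so far for the maximizing player at any choice point along the path from the root to the current node.
   - **Beta (β)**: The value of the best (lowest-value) choice found so far for the minimizing player at any choice point along that path.
- **Pruning Condition**
   - When α ≥ β at a node, the remaining children of that node cannot influence the final decision and can be pruned. At a MIN node this means the value has already dropped to α or below, so MAX would never choose to come here; at a MAX node it means the value has already reached β or above, so MIN would never allow play to reach it.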
- **Effectiveness**
   - **Depth of Search**: Alpha-beta pruning allows for deeper searches in the game tree within the same time constraints, as it significantly cuts down the number of nodes to be evaluated.
   - **Optimal Decision Retention**: Despite pruning parts of the tree, the algorithm still returns the same decision as it would have without pruning.
- **Application**
   - Commonly used in various games, especially in chess and other complex strategy games, to enhance the performance of AI players.

Alpha-beta pruning is a critical optimization technique in game playing AI. By intelligently eliminating unnecessary parts of the search tree, it enables more efficient computation without compromising the accuracy of the decision-making process. This method is particularly valuable in games with large search trees, where exhaustive exploration is computationally impractical.
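
The pruning rule can be sketched as below. As an assumption for illustration, the game tree is an explicit nested list with numeric leaves; the optional `visited` list is a hypothetical addition that records which leaves were actually evaluated.

```python
import math
from typing import List, Optional, Union

Node = Union[int, float, List["Node"]]


def alphabeta(node: Node, is_maximizing: bool,
              alpha: float = -math.inf, beta: float = math.inf,
              visited: Optional[list] = None) -> float:
    if isinstance(node, (int, float)):
        if visited is not None:
            visited.append(node)          # record every leaf we evaluate
        return node
    if is_maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:
                break                     # MIN above would never allow this line
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if alpha >= beta:
            break                         # MAX above already has something better
    return value


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
leaves_seen: list = []
root_value = alphabeta(tree, is_maximizing=True, visited=leaves_seen)
```

On this tree the result is 3, the same as plain minimax, but `leaves_seen` is `[3, 12, 8, 2, 14, 5, 2]`: after seeing the leaf 2 under the second MIN node, that node is worth at most 2, which is worse for MAX than the 3 already available, so the leaves 4 and 6 are never examined.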

<img src="https://github.com/ValRCS/RBS_PBM773_Introduction_to_AI/blob/main/img/ch05_adversarial_search_and_games/alpha_beta_pruning.jpg?raw=true" width="400">

### 5.2.4 Move Ordering

- **Effect on Alpha-Beta Pruning Effectiveness**
   - Move ordering significantly affects the efficiency of alpha-beta pruning. Good move ordering can greatly increase the number of prunings, thereby reducing the search space and increasing the efficiency of the search.
- **Iterative Deepening**
   - This is a strategy where the search is conducted repeatedly, increasing the depth limit with each iteration.
   - It ensures that the best moves from shallower searches are tried first in deeper searches, potentially improving move ordering in subsequent iterations.

<img src="https://github.com/ValRCS/RBS_PBM773_Introduction_to_AI/blob/main/img/ch05_adversarial_search_and_games/DALL%C2%B7E%202024-01-22%2011.26.16%20-%20An%20illustration%20capturing%20the%20essence%20of%20a%20killer%20move%20in%20chess,%20a%20move%20that%20surprises%20the%20opponent.%20The%20scene%20features%20a%20chessboard%20with%20intricate%20pi.png?raw=true" width="400">

- **Killer Move Heuristic**
   - A "killer move" is a move that has caused a cutoff in another branch of the search tree at the same depth.
   - These moves are given priority in the search order as they are likely to be strong moves.
- **Transpositions and Transposition Table**
   - A transposition occurs when two different sequences of moves lead to the same game position.
   - A transposition table stores the evaluations of these positions, so the search does not need to re-evaluate them, improving efficiency.
- **Claude Shannon’s Type A and Type B Strategies**
   - **Type A Strategy**: Considers all possible moves to a certain depth in the search tree and then uses a heuristic evaluation function to estimate the value of the positions at that depth. It explores a wide but shallow portion of the tree and is more brute-force in nature.
   - **Type B Strategy**: Ignores moves that look bad and follows promising lines as far as possible. It explores a deep but narrow portion of the tree.
- **Importance of Move Ordering**
   - Effective move ordering can lead to significant improvements in search efficiency, allowing deeper searches within the same computational constraints.
   - It is an essential aspect of optimizing search strategies in game-playing AI, especially in conjunction with alpha-beta pruning.

Move ordering in the context of alpha-beta pruning is a crucial element in game-playing AI. By trying promising moves first (such as killer moves) and by avoiding redundant evaluations (with transposition tables), the AI can search more efficiently and effectively. The balance between breadth (Type A) and depth (Type B) strategies as described by Shannon also plays a key role in determining the overall approach to game-playing AI.
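
The transposition-table idea can be illustrated with a deliberately simple game, chosen here only as an example: players alternately remove 1, 2, or 3 counters from a pile, and whoever takes the last counter wins. Removing 1 then 2, or 2 then 1, reaches the same position, so many move sequences transpose. The function `game_value` and the `stats` dictionary are hypothetical names for this sketch.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[int, bool]   # (counters left, is MAX to move?)


def game_value(n: int, max_to_move: bool,
               table: Optional[Dict[Key, int]] = None,
               stats: Optional[Dict[str, int]] = None) -> int:
    """Minimax value (+1 if MAX wins, -1 if MIN wins) with an optional table."""
    key = (n, max_to_move)
    if table is not None and key in table:
        return table[key]                 # transposition: reuse the stored value
    if stats is not None:
        stats["expanded"] += 1            # count positions actually searched
    if n == 0:
        # The previous player took the last counter and therefore won.
        value = -1 if max_to_move else 1
    else:
        children = [game_value(n - k, not max_to_move, table, stats)
                    for k in (1, 2, 3) if k <= n]
        value = max(children) if max_to_move else min(children)
    if table is not None:
        table[key] = value
    return value


plain = {"expanded": 0}
cached = {"expanded": 0}
value_plain = game_value(12, True, table=None, stats=plain)
value_cached = game_value(12, True, table={}, stats=cached)
```

Both calls return the same value, but the version with a table searches each distinct `(counters, player)` position at most once, so `cached["expanded"]` is far smaller than `plain["expanded"]`. Real engines typically key the table on a hash of the board rather than on the position itself.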


## 5.3 Heuristic Alpha–Beta Tree Search

- **Evaluation Function**
   - **Purpose**: Since it's impractical to search the entire game tree in most games, the evaluation function is used to estimate the desirability of a game position.
   - **Implementation**: It assigns a numerical value to non-terminal states, helping the algorithm decide which moves are the most promising.
   - **Design**: A good evaluation function captures important aspects of the game position (like material count in chess, control of the center, pawn structure, etc.).
- **Cutoff Test**
   - **Definition**: The cutoff test determines when to apply the evaluation function instead of searching deeper. It acts as a stopping condition for the search.
   - **Criteria**: Typically based on the depth of the search (depth limit) or other game-specific considerations (like time constraints or a specific state of the game board).
- **Heuristic Alpha-Beta Pruning**
   - **Combination with Heuristics**: The alpha-beta pruning technique is often combined with heuristics to enhance its effectiveness.
   - **Role of Heuristics**: Heuristics guide the search process, influencing which parts of the tree are pruned and the order in which nodes are explored.
   - **Improved Efficiency**: By applying heuristics, the algorithm can prune more branches and reduce the search space, leading to faster decision-making.
- **Balancing Accuracy and Efficiency**
   - **Depth vs. Breadth**: There is a trade-off between the depth of the search and the breadth. Heuristics help in balancing this by focusing the search on the most promising areas.
   - **Complexity of the Evaluation Function**: More complex evaluation functions may provide more accurate assessments but are computationally expensive.
- **Practical Considerations**
   - **Time Constraints**: In real-world applications, time constraints often dictate the depth of the search, making efficient heuristics crucial.
   - **Dynamic Adjustments**: The algorithm may dynamically adjust the depth of the search or the use of heuristics based on the state of the game.

In this section, the focus is on the practical application of alpha-beta pruning in real game scenarios, where it's not feasible to search the entire game tree. The evaluation function and the cutoff test are central to this approach, providing a balance between exploring enough of the game tree to make a good decision and doing so within a reasonable amount of time and computational resources. The use of heuristics in this context is key to enhancing the performance and effectiveness of game-playing AI.

### 5.3.1 Evaluation Functions

- **Features of the State**
   - **Definition**: Features of a game state are specific attributes that can be quantitatively assessed to evaluate the state.
   - **Examples**: In chess, these could include material balance, control of the center, pawn structure, king safety, etc.
- **Material Value in Chess**
   - **Concept**: Assigns numerical values to pieces based on their relative strength and importance.
   - **Example**: Pawns might be valued at 1, knights and bishops at 3, rooks at 5, and the queen at 9. The king is invaluable as losing it means losing the game.
- **Expected Value**
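   - **Definition**: The average outcome over the positions that share the same feature values, with each outcome weighted by its probability. For example, if positions in a category lead to a win (value 1) 82% of the time, a loss (value 0) 2% of the time, and a draw (value 1/2) 16% of the time, the expected value is 0.82 × 1 + 0.02 × 0 + 0.16 × 0.5 = 0.90.
   - **Usage**: Gives a single number for a category of positions; expected values are also the natural backed-up values in games with elements of chance, like backgammon or card games.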
- **Weighted Linear Function**
   - **Function**: Combines various features of a state into a single numerical value.
   - **Form**: It's typically a weighted sum of feature values, where each feature is assigned a weight reflecting its relative importance.
- **Quiescence**
   - **Definition**: Refers to a state of the game where there are no immediate drastic changes expected (like captures or checks in chess).
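   - **Importance**: The evaluation function should be applied only to quiescent positions. Evaluating a position in the middle of an exchange, such as just before a recapture, can give a badly misleading score.
- **Quiescence Search**
   - **Purpose**: To reduce errors from evaluating unstable positions, it continues to search beyond the basic depth limit in positions that are not quiescent.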
   - **Implementation**: Typically looks for moves that could result in significant material changes, like captures or major threats.
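
A weighted linear evaluation has the form Eval(s) = w1·f1(s) + w2·f2(s) + ... + wn·fn(s). The sketch below illustrates it with the material values listed above. The piece-string encoding (uppercase for White, lowercase for Black), the `mobility` feature, and the weights are assumptions chosen for illustration, not tuned values.

```python
from typing import Dict

# Material values from the text; the king is given no material value here
# because it can never be exchanged.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}


def material_balance(pieces: str) -> int:
    """Material feature: White's total minus Black's total.

    `pieces` lists every piece on the board, uppercase for White and
    lowercase for Black (an encoding assumed for this sketch).
    """
    score = 0
    for ch in pieces:
        value = PIECE_VALUES.get(ch.upper(), 0)
        score += value if ch.isupper() else -value
    return score


def evaluate(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted linear function: sum of weight * feature value."""
    return sum(weights[name] * features[name] for name in weights)


# White: king, queen, rook, three pawns. Black: king, two rooks, two pawns.
features = {
    "material": material_balance("KQRPPPkrrpp"),
    "mobility": 4,        # hypothetical feature: White's extra legal moves
}
weights = {"material": 1.0, "mobility": 0.1}
score = evaluate(features, weights)
```

Here the material feature is 17 - 12 = 5, and the weighted sum is 1.0 × 5 + 0.1 × 4 = 5.4 in White's favor. A linear form assumes each feature contributes independently of the others, which is a simplification.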

<img src="https://github.com/ValRCS/RBS_PBM773_Introduction_to_AI/blob/main/img/ch05_adversarial_search_and_games/DALL%C2%B7E%202024-01-22%2011.09.44%20-%20An%20illustration%20depicting%20the%20concept%20of%20an%20event%20horizon%20in%20space.%20The%20scene%20shows%20a%20distant,%20futuristic%20spaceship%20nearing%20a%20massive%20black%20hole.%20The%20.png?raw=true" width="400">

- **Horizon Effect**
   - **Definition**: A limitation where the algorithm cannot see beyond a certain depth, potentially missing critical developments that occur just beyond this 'horizon'.
   - **Problem**: Can lead to poor evaluations if significant changes are just outside the search depth.
- **Singular Extensions Strategy**
   - **Purpose**: To mitigate the horizon effect.
   - **Method**: Extends the search selectively for a move that is clearly better than all other moves in a given position, even when the search would normally be cut off at that point, rather than uniformly increasing the search depth.

### 5.3.2 Cutoff Tests

The next step is to modify ALPHA-BETA-SEARCH so that it will call the heuristic EVAL function when it is appropriate to cut off the search.

- **Definition**
   - **Purpose**: The terminal test is replaced by a cutoff test. It returns true for terminal states and otherwise decides, based on the search depth and properties of the state, whether to stop searching and apply the evaluation function.
   - **Criteria**: The simplest criterion is a fixed depth limit, chosen so that a move is selected within the allotted time. A more robust approach is iterative deepening: when time runs out, the program returns the move selected by the deepest completed search.
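
A cutoff test can be sketched as below, using plain minimax instead of alpha-beta to keep the example short. The tree is an explicit structure of dictionaries, which is an assumption made for illustration: every node carries a `value` (exact utility at a leaf, heuristic estimate elsewhere) and an optional `quiet` flag, so the cutoff can depend on the state as well as on the depth.

```python
from typing import Any, Dict

Node = Dict[str, Any]


def is_cutoff(node: Node, depth: int, depth_limit: int) -> bool:
    if not node.get("children"):
        return True                       # terminal state: always stop
    # Stop at the depth limit, but only in positions marked as quiet.
    return depth >= depth_limit and node.get("quiet", True)


def h_minimax(node: Node, depth_limit: int,
              max_to_move: bool = True, depth: int = 0) -> float:
    if is_cutoff(node, depth, depth_limit):
        return node["value"]              # utility or heuristic estimate
    values = [h_minimax(child, depth_limit, not max_to_move, depth + 1)
              for child in node["children"]]
    return max(values) if max_to_move else min(values)


def make_tree(first_child_quiet: bool) -> Node:
    return {"value": 0, "children": [
        # Looks good statically (5), but MIN has a strong reply (-4).
        {"value": 5, "quiet": first_child_quiet,
         "children": [{"value": -4}, {"value": 6}]},
        {"value": 1, "children": [{"value": 2}, {"value": 3}]},
    ]}


shallow = h_minimax(make_tree(True), depth_limit=1)
extended = h_minimax(make_tree(False), depth_limit=1)
deeper = h_minimax(make_tree(True), depth_limit=2)
```

With a depth limit of 1 the search trusts the static estimate and returns 5. Marking the first child as not quiet makes the search continue below it, exposing MIN's reply, and the result drops to 1. A full two-ply search returns 2.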



<img src="https://github.com/ValRCS/RBS_PBM773_Introduction_to_AI/blob/main/img/ch05_adversarial_search_and_games/DALL%C2%B7E%202024-01-22%2011.32.23%20-%20An%20illustration%20depicting%20the%20concept%20of%20forward%20pruning.%20The%20scene%20features%20a%20person%20standing%20in%20a%20garden,%20wearing%20a%20headlamp%20emitting%20a%20focused%20beam.png?raw=true" width="400">

### 5.3.3 Forward Pruning

- **Concept of Forward Pruning**
   - **Definition**: Forward pruning refers to the practice of pruning moves from the search tree early on (forward in the search), based on the assumption that they appear to be poor choices.
   - **Risk**: This approach carries the risk of missing potentially good moves that only reveal their value deeper in the search tree.
- **Beam Search**
   - **Explanation**: Beam search is a type of forward pruning. It limits the number of moves (branches) explored at each level of the tree, focusing only on a predetermined number of best moves (the "beam").
   - **Application**: Often used in games where the branching factor is very high, making it impractical to explore all possible moves.
- **PROBCUT Algorithm (Buro 1995)**
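   - **Overview**: PROBCUT is a forward-pruning version of alpha-beta search that uses statistics gathered from prior experience to lessen the chance that the best move will be pruned.
   - **Functionality**: Alpha-beta search prunes any node that is provably outside the current (α, β) window; PROBCUT also prunes nodes that are probably outside it. It runs a shallow search to obtain a backed-up value and uses past experience to estimate how likely it is that a full-depth search would fall outside the window.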
- **Late Move Reduction**
   - **Concept**: This technique assumes that move ordering has been done well, so moves that appear later in the list of possible moves are less likely to be good. Instead of pruning them completely, it reduces the depth to which they are searched.
   - **Implementation**: If the reduced-depth search of a late move returns a value above the current α, the move is searched again at full depth.
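
The beam-search form of forward pruning can be sketched as below. As an assumption for illustration, each node is a dictionary with a heuristic `value` and optional `children`; at every level only the `beam_width` moves that look best to the player to move are kept.

```python
from typing import Any, Dict

Node = Dict[str, Any]


def beam_minimax(node: Node, beam_width: int, max_to_move: bool = True) -> float:
    children = node.get("children")
    if not children:
        return node["value"]
    # Rank moves by static value: highest first for MAX, lowest first for MIN.
    ranked = sorted(children, key=lambda child: child["value"],
                    reverse=max_to_move)
    kept = ranked[:beam_width]            # forward pruning: drop the rest unseen
    values = [beam_minimax(child, beam_width, not max_to_move) for child in kept]
    return max(values) if max_to_move else min(values)


def leaf(value: float) -> Node:
    return {"value": value}


tree = {"value": 0, "children": [
    {"value": 4, "children": [leaf(3), leaf(5)]},
    {"value": 2, "children": [leaf(1), leaf(2)]},
    # Statically unattractive (-1), but actually the best move for MAX.
    {"value": -1, "children": [leaf(7), leaf(8)]},
]}

full = beam_minimax(tree, beam_width=3)
narrow = beam_minimax(tree, beam_width=2)
```

With a beam wide enough to keep every move, the search finds the true value 7. With `beam_width=2` the third move is discarded because of its poor static value, and the search settles for 3, which shows the risk described above.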

[Stockfish](https://stockfishchess.org/) is a free and open-source chess engine, originally developed by Tord Romstad, Marco Costalba, and Joona Kiiski and now maintained by an open-source community. It is consistently ranked among the top chess engines in the world, competing with the likes of Komodo and Houdini. It also powers computer analysis on the popular chess website [Lichess](https://lichess.org/), among others.

![Stockfish](https://stockfishchess.org/images/logo/icon_512x512@2x.png)

- **STOCKFISH Chess Program**
   - **Hybrid Approach**: STOCKFISH uses a hybrid approach to move pruning, combining several techniques including forward pruning.
   - **Strength**: STOCKFISH is known for its exceptional strength, partly attributed to its sophisticated and efficient pruning strategies.
   - **Adaptation**: The program dynamically adjusts its search strategies based on the specifics of the position and the depth of the search.

In summary, forward pruning in game-playing AI involves making early decisions to discard certain moves based on their perceived lack of promise. While this can greatly enhance efficiency by reducing the search space, it carries the inherent risk of overlooking potentially good moves. Beam search applies the idea directly, while PROBCUT and late move reduction limit the risk by relying on statistics or reduced-depth searches rather than discarding moves blindly. Programs like STOCKFISH demonstrate the effectiveness of these techniques, particularly when they are part of a dynamic and multifaceted search strategy.

<img src="https://github.com/ValRCS/RBS_PBM773_Introduction_to_AI/blob/main/img/ch05_adversarial_search_and_games/DALL%C2%B7E%202024-01-22%2011.37.31%20-%20An%20illustration%20representing%20the%20concept%20of%20table%20lookup%20in%20games.%20The%20scene%20features%20an%20android,%20designed%20with%20a%20humanoid%20appearance,%20standing%20or%20sit.png?raw=true" width="400">

### 5.3.4 Search versus Lookup

- **Table Lookup Idea**
   - **Concept**: Table lookup involves storing precomputed evaluations of game positions in a table, which the AI can then reference instead of performing a search.
   - **Application in Chess**: In chess, this might involve storing the evaluations of various endgame positions or common opening sequences.
- **Retrograde Analysis**
   - **Definition**: Retrograde analysis is a method of working backward from known outcomes (like checkmate in chess) to determine the value of preceding positions.
   - **Usage in Chess**: This technique is particularly useful in endgames where the number of pieces on the board is reduced. It allows the program to determine the exact value of positions and the best moves leading to a known outcome.
   - **Database Creation**: Retrograde analysis is used to create extensive databases of endgame positions, which can be looked up during play to make optimal moves.
- **Search versus Lookup Balance**
   - **Search Strategy**: Typically involves dynamically exploring possible moves and counter-moves to evaluate positions.
   - **Lookup Strategy**: Relies on accessing a database of precomputed values for certain positions.
   - **Trade-off**: While lookup can provide instant evaluations, it's limited to known positions. Search, on the other hand, is more flexible but computationally expensive.
- **Integration in AI Systems**
   - **Chess AI**: Advanced chess AI systems often integrate both search and lookup strategies. They might use lookup tables for opening moves and endgames, while relying on search for mid-game play.
   - **Efficiency and Effectiveness**: Combining both strategies allows the AI to be more efficient and effective, using the strengths of each approach where they are most applicable.

In summary, search and lookup are complementary strategies in game-playing AI. Search evaluates positions flexibly but at a computational cost, while lookup gives instant access to the best move in known positions, which is especially useful in chess openings and endgames. Integrating both lets an AI system use each method where it is strongest.
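
To make retrograde analysis concrete, here is a minimal sketch on a game far smaller than chess. The game is an assumption chosen for illustration: a pile of stones, each player removes 1 to 3 stones, and whoever takes the last stone wins. The function `retrograde_table` is hypothetical. It starts from the only terminal position (0 stones, a loss for the player to move) and works backward through predecessor positions, producing a lookup table that could be consulted during play instead of searching.

```python
from collections import deque


def retrograde_table(max_stones, max_take=3):
    """Label every pile size WIN or LOSS for the player to move."""

    def successors(n):
        return [n - k for k in range(1, max_take + 1) if n - k >= 0]

    def predecessors(n):
        return [n + k for k in range(1, max_take + 1) if n + k <= max_stones]

    # Known outcome: with 0 stones the player to move has already lost.
    outcome = {0: "LOSS"}
    # Number of successors not yet shown to be WIN for the opponent.
    unresolved = {n: len(successors(n)) for n in range(1, max_stones + 1)}

    queue = deque([0])
    while queue:
        pos = queue.popleft()
        for prev in predecessors(pos):
            if prev in outcome:
                continue
            if outcome[pos] == "LOSS":
                # Some move from `prev` leaves the opponent in a lost position.
                outcome[prev] = "WIN"
                queue.append(prev)
            else:
                # This move from `prev` hands the opponent a won position.
                unresolved[prev] -= 1
                if unresolved[prev] == 0:  # every move does so
                    outcome[prev] = "LOSS"
                    queue.append(prev)
    return outcome


table = retrograde_table(12)
losing_piles = sorted(n for n, result in table.items() if result == "LOSS")
```

In this toy game the losing pile sizes turn out to be the multiples of 4. Chess endgame databases are built on the same backward-propagation idea, starting from checkmate positions, but at a vastly larger scale.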

<img src="https://github.com/ValRCS/RBS_PBM773_Introduction_to_AI/blob/main/img/ch05_adversarial_search_and_games/DALL%C2%B7E%202024-01-22%2011.11.15%20-%20An%20illustration%20representing%20the%20concept%20of%20Monte%20Carlo%20Tree%20Search.%20The%20scene%20features%20a%20person,%20portrayed%20as%20a%20thoughtful%20strategist,%20standing%20in%20fr.png?raw=true" width="400">

## 5.4 Monte Carlo Tree Search

- **Simulations, Playout, and Rollout**
   - **Simulations**: MCTS uses random simulations to evaluate game positions, making it powerful in games with a high branching factor.
   - **Playout/Rollout**: These terms refer to playing out a game to the end by selecting moves at random (or using a lightweight strategy) from a certain position.
- **Playout Policy**
   - **Definition**: The playout policy is the method used to select moves during the simulation phase of MCTS.
   - **Variation**: It can range from completely random choices to more sophisticated methods that incorporate some knowledge about the game.
- **Pure Monte Carlo Search**
   - **Concept**: In pure Monte Carlo search, decisions are made based only on the results of numerous playouts from the current state, without building a search tree.
   - **Limitation**: While simple and easy to implement, it lacks the depth and strategic foresight of more advanced methods.
- **Selection Policy: Exploration and Exploitation**
   - **Balance**: MCTS must balance between exploring new moves (exploration) and choosing moves that have previously resulted in good outcomes (exploitation).
   - **Selection Policy**: Determines how this balance is struck during the selection phase, when the algorithm descends the tree to choose which leaf to expand.
- **UCT and UCB1 Formulas**
   - **UCT (Upper Confidence bounds applied to Trees)**: A selection strategy that balances exploration and exploitation based on the UCB1 formula.
   - **UCB1 (Upper Confidence Bound 1)**: A formula used in MCTS to decide which node to explore next, considering both the average value of the node and the number of times it has been visited.
- **Early Playout Termination**
   - **Idea**: Instead of playing out to the end of the game, the playout can be terminated early, and an evaluation function can be used to estimate the outcome.
   - **Benefit**: This approach can save time and may be useful in games where completing a full playout is computationally expensive.
- **Relation to Reinforcement Learning**
   - **Later Discussion**: MCTS relates closely to reinforcement learning concepts, which are discussed later in the book and course.
   - **Learning Aspect**: The method incorporates learning from the outcomes of simulations, similar to how reinforcement learning agents learn from experiences.
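
For reference, the UCB1 score that UCT uses to rank a node $n$ can be written as follows, using the textbook's notation: $U(n)$ is the total utility of all playouts that went through $n$, $N(n)$ is the number of playouts through $n$, and $\text{Parent}(n)$ is the parent of $n$ in the tree.

$$
\mathrm{UCB1}(n) = \frac{U(n)}{N(n)} + C \sqrt{\frac{\log N(\text{Parent}(n))}{N(n)}}
$$

The first term is the exploitation term (the average utility of $n$). The second is the exploration term, which is large for nodes that have been visited rarely. The constant $C$ balances the two; there is a theoretical argument for $C = \sqrt{2}$, but in practice it is usually tuned by experiment.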

In summary, Monte Carlo Tree Search is a powerful method for making decisions in complex game environments. It employs a combination of random simulations and a strategic balance between exploring new moves and exploiting known good moves. The UCT and UCB1 formulas are central to its selection policy, guiding the algorithm in its search and expansion of the game tree. MCTS's approach of learning from simulated experiences parallels concepts in reinforcement learning, making it a relevant topic in the broader context of AI and machine learning.
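
As a concrete illustration of playouts, the sketch below implements pure Monte Carlo search (no tree, no UCB1) with a uniformly random playout policy. The game is an assumption chosen to keep the code short: a pile of stones, each player takes 1 to 3, and whoever takes the last stone wins. All function names are hypothetical, and the random generator is seeded so the run is repeatable.

```python
import random


def legal_moves(stones):
    return [k for k in (1, 2, 3) if k <= stones]


def random_playout(stones, my_turn, rng):
    """Play random moves to the end; return True if we take the last stone."""
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return my_turn  # whoever just moved took the last stone
        my_turn = not my_turn


def pure_monte_carlo(stones, playouts_per_move=2000, seed=0):
    """Estimate each move's win rate from random playouts and pick the best."""
    rng = random.Random(seed)
    win_rate = {}
    for move in legal_moves(stones):
        remaining = stones - move
        if remaining == 0:
            win_rate[move] = 1.0  # taking the last stone wins immediately
            continue
        # After our move it is the opponent's turn.
        wins = sum(
            random_playout(remaining, my_turn=False, rng=rng)
            for _ in range(playouts_per_move)
        )
        win_rate[move] = wins / playouts_per_move
    best = max(win_rate, key=win_rate.get)
    return best, win_rate


best_move, rates = pure_monte_carlo(stones=5)
```

From a pile of 5, random playouts favor taking 1 stone, which wins about two thirds of the playouts versus about half for the other moves. That move also happens to be the game-theoretically correct one here, but in general random playouts give only a statistical estimate, which is why MCTS adds a tree and a selection policy on top.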

### Examples of Monte Carlo search strategies

AlphaGo, developed by DeepMind, is a prime example of the successful application of Monte Carlo Tree Search in game-playing AI. It combined MCTS with deep neural networks for move selection and position evaluation. Its 2016 victory over Lee Sedol, one of the world's strongest Go players, marked a significant milestone in AI research and showcased the power of MCTS in complex games.

[AlphaGo](https://deepmind.com/blog/article/alphago-zero-starting-scratch)

## 5.5 Stochastic Games

- **Classical Examples of Stochastic Games**
   - **Backgammon**: A prime example of a stochastic game, where the outcome is influenced by both the players' decisions and the roll of dice.
   - **Nature of Stochastic Games**: These games combine elements of strategy and chance, making them more unpredictable and complex than deterministic games.
- **Chance Nodes**
   - **Definition**: In the game tree of a stochastic game, chance nodes represent points where a random event (like a dice roll) affects the game.
   - **Function**: These nodes branch out based on all possible outcomes of the random event, each with its own probability.
- **Expected Value**
   - **Concept**: The expected value at a chance node is calculated by considering all possible outcomes of the random event and their respective probabilities.
   - **Calculation**: It involves summing the products of each outcome's value and its probability.
- **Expectiminimax Value**
   - **Extension of Minimax**: The concept of minimax value is extended to stochastic games as the expectiminimax value.
   - **Computation**: For a chance node, the expectiminimax value is the probability-weighted average of its children's values. MAX and MIN nodes are handled exactly as in minimax.
   - **Strategy Incorporation**: The expectiminimax algorithm thus combines the deterministic decisions of the players with the probabilistic outcomes at chance nodes.

Stochastic games like backgammon pose unique challenges for AI due to the blend of strategic decision-making and the randomness introduced by elements like dice rolls. Chance nodes, expected values, and expectiminimax values are key to modeling and solving these types of games. The algorithms need to account not only for the optimal strategies of the players but also for the probabilistic nature of the game's outcomes.
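
The recursion can be sketched in a few lines. This is a minimal illustration on a hand-built tree, not a backgammon engine: the tuple encoding of nodes (`"leaf"`, `"max"`, `"min"`, `"chance"`) and all the numbers are assumptions made for the example.

```python
def expectiminimax(node):
    """Return the expectiminimax value of a node in a hand-built game tree."""
    kind, payload = node
    if kind == "leaf":
        return payload  # utility from MAX's point of view
    if kind == "max":
        return max(expectiminimax(child) for child in payload)
    if kind == "min":
        return min(expectiminimax(child) for child in payload)
    if kind == "chance":
        # payload is a list of (probability, child) pairs
        return sum(prob * expectiminimax(child) for prob, child in payload)
    raise ValueError(f"unknown node kind: {kind}")


def leaf(value):
    return ("leaf", value)


# MAX picks a move, a fair coin is flipped, then MIN replies.
move_1 = ("chance", [
    (0.5, ("min", [leaf(2), leaf(4)])),   # MIN picks 2
    (0.5, ("min", [leaf(6), leaf(8)])),   # MIN picks 6
])
move_2 = ("chance", [
    (0.5, ("min", [leaf(0), leaf(10)])),  # MIN picks 0
    (0.5, ("min", [leaf(5), leaf(7)])),   # MIN picks 5
])
root = ("max", [move_1, move_2])

value_1 = expectiminimax(move_1)  # 0.5 * 2 + 0.5 * 6
value_2 = expectiminimax(move_2)  # 0.5 * 0 + 0.5 * 5
root_value = expectiminimax(root)
```

Here the first move has expected value 4.0 and the second 2.5, so MAX prefers the first move even though the second contains the single best leaf (10).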

### 5.5.1 Evaluation Functions for Games of Chance

- **Evaluation Functions in Context of Games of Chance**
   - **Different Focus**: Unlike deterministic games, evaluation functions for games of chance need to account for the probabilistic nature of the game.
   - **Incorporating Uncertainty**: These functions must evaluate positions not just based on the current board state, but also considering the probability of various outcomes due to elements of chance.
- **Probability of Winning**
   - **Primary Metric**: The most important measure in these evaluation functions is the probability of winning from a given game state.
   - **Computation**: This involves calculating the likelihood of a player winning under optimal play, given the current state and the probabilities of various chance events (like dice rolls in backgammon).
   - **Dynamic Assessment**: The probability of winning can change significantly with each move and chance event, requiring the evaluation function to be dynamic and responsive to the state of the game.
- **Factors in Evaluation**
   - **Game-Specific Features**: The function may include various game-specific features that influence the outcome, such as board position, material advantage, or potential future moves.
   - **Weighting Uncertain Outcomes**: The function must appropriately weight different outcomes based on their probabilities.
- **Challenges**
   - **Complexity**: Accurately computing the probability of winning can be complex, especially in games with a large number of possible chance events and outcomes.
   - **Balance Between Precision and Efficiency**: The evaluation function needs to strike a balance between accurately assessing the probability of winning and being computationally efficient enough for practical use in AI systems.
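
A small calculation shows why weighting by probability makes the scale of the evaluation function matter, not just the ordering of its values. In the sketch below (illustrative numbers; the helper `best_move` is hypothetical), each move leads to a chance node with two outcomes of probability 0.9 and 0.1. The second set of leaf values ranks the leaves in exactly the same order as the first, yet the preferred move changes.

```python
def best_move(leaf_values, probs=(0.9, 0.1)):
    """Pick the move whose chance node has the highest expected value."""
    expected = {
        move: sum(p * v for p, v in zip(probs, values))
        for move, values in leaf_values.items()
    }
    return max(expected, key=expected.get), expected


# Leaf evaluations under one evaluation function ...
original = {"a1": (2, 3), "a2": (1, 4)}
# ... and under another that preserves the ordering 1 < 2 < 3 < 4.
rescaled = {"a1": (20, 30), "a2": (1, 400)}

choice_original, ev_original = best_move(original)  # a1: about 2.1 vs 1.3
choice_rescaled, ev_rescaled = best_move(rescaled)  # a2: about 40.9 vs 21
```

In a deterministic game, any order-preserving change to the evaluation function leaves the minimax decision unchanged. With chance nodes that is no longer true, which is one reason evaluation functions for games of chance are tied to the probability of winning (or, more generally, to expected utility) rather than to an arbitrary score.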


## 5.6 Partially Observable Games

- **Key Idea of Partial Observability**
   - **Definition**: In partially observable games, players do not have complete information about the current state of the game. Certain elements of the game's state are hidden from the players.
   - **Contrast with Perfect Information Games**: This differs from games like chess or checkers, where all information about the game state is visible to all players.
- **Challenges Posed by Partial Observability**
   - **Strategic Uncertainty**: Players must make decisions based on incomplete and potentially misleading information.
   - **Probability and Inference**: Players often need to rely on probabilistic reasoning and inference to make educated guesses about the hidden aspects of the game state.
   - **Dynamic Information**: The information available to a player can change dynamically as the game progresses, either through their own actions or through the actions of their opponents.
- **Examples of Partially Observable Games**
   - **Card Games**: Many card games, like poker, are partially observable because players cannot see their opponents' hands.
   - **Strategic Games with Fog of War**: Some strategic video games include a "fog of war" feature, where players cannot see certain parts of the game environment until they explore them.
- **AI Strategies in Partially Observable Games**
   - **Heuristics and Probabilistic Models**: AI systems often use advanced heuristics and probabilistic models to make decisions in these games.
   - **Opponent Modeling**: AI may also attempt to model the strategies and likely moves of opponents, based on the observable actions and outcomes in the game.

Partially observable games present a unique set of challenges, both for human players and AI systems. The uncertainty and dynamic nature of the information available require sophisticated strategies that can adapt to changing conditions and incorporate probabilistic reasoning to deal with incomplete information. These games often necessitate a deeper level of strategic thinking and prediction of opponents' actions.

<img src="https://github.com/ValRCS/RBS_PBM773_Introduction_to_AI/blob/main/img/ch05_adversarial_search_and_games/DALL%C2%B7E%202024-01-21%2023.19.34%20-%20Illustration%20of%20a%20Kriegspiel%20chess%20game,%20depicting%20a%20traditional%20chessboard%20and%20chess%20pieces,%20with%20a%20visual%20representation%20of%20'fog%20of%20war'.%20The%20fog%20pa.png?raw=true" width="400">

### 5.6.1 Kriegspiel: Partially Observable Chess

- **Belief State and State Estimation**
   - **Belief State**: In Kriegspiel, a variant of chess with partial observability, a player's belief state represents all the possible configurations the game board could be in, based on what they know and have inferred.
   - **State Estimation**: Players must continually update their estimation of the game state as new information becomes available, typically through inference and deduction from the known rules and observed outcomes.
- **Change in Strategy for Partially Observable Games**
   - **Adaptation**: Strategies in Kriegspiel differ significantly from standard chess due to the uncertainty about the opponent's pieces.
   - **Proactive and Reactive Play**: Players must balance proactive strategy (planning moves and traps) with reactive play (responding to revealed information).
- **Guaranteed Checkmate**
   - **Concept**: A situation in Kriegspiel where a player can ensure a checkmate regardless of the opponent's unknown moves.
   - **Difficulty**: Achieving a guaranteed checkmate is more challenging due to the uncertainty of the opponent's position and moves.
- **Probabilistic and Accidental Checkmate**
   - **Probabilistic Checkmate**: A scenario where a player attempts a checkmate based on a probabilistic assessment of the opponent's likely positions.
   - **Accidental Checkmate**: This occurs when a player unintentionally checkmates the opponent, having been unaware of the king's actual position.
- **Equilibrium Solution**
   - **Definition**: An equilibrium in Kriegspiel, like in other games, is a state where neither player can unilaterally improve their situation by changing their strategy.
   - **Complexity in Kriegspiel**: Finding an equilibrium solution is more complex due to the partial observability, as it involves accounting for all possible beliefs about the game state.

In summary, Kriegspiel introduces the challenges of partial observability to the structured world of chess, requiring players to maintain and continually update a belief state about the game. This uncertainty demands a significant change in strategy, where players must consider probabilistic outcomes and be prepared for unexpected developments like accidental checkmates. The concept of equilibrium becomes more nuanced in this context, involving a deeper level of strategic planning and anticipation.
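
The belief-state bookkeeping can be sketched on a drastically simplified setting. Everything here is an assumption for illustration and far simpler than real Kriegspiel: a 3x3 board, a single hidden enemy king that must move one square each turn, and a referee who tells us whether a square we probe is occupied. The belief state is the set of squares where the king could be; `predict` accounts for the opponent's unseen move and `update` filters on the referee's answer.

```python
def king_moves(square, size):
    """Squares a king can reach in one move on a size x size board."""
    row, col = square
    return {
        (row + dr, col + dc)
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
        and 0 <= row + dr < size
        and 0 <= col + dc < size
    }


def predict(belief, size):
    """The hidden king makes one unseen move: the belief state grows."""
    moved = set()
    for square in belief:
        moved |= king_moves(square, size)
    return moved


def update(belief, probed, occupied):
    """Filter the belief state on the referee's answer about one square."""
    return belief & {probed} if occupied else belief - {probed}


SIZE = 3
belief = {(0, 0)}                       # the king was last seen in a corner
after_move = predict(belief, SIZE)      # three possible squares
after_probe = update(after_move, (1, 1), occupied=False)  # centre ruled out
after_second_move = predict(after_probe, SIZE)
```

After one hidden move the king could be on three squares; ruling out the centre leaves two; after a second hidden move the belief state has grown to eight squares. This alternation of prediction and filtering is the core of state estimation.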

### 5.6.2 Card Games

- **Stochastic Partial Observability in Card Games**
   - **Characteristic**: Card games like bridge, whist, hearts, and poker are characterized by stochastic partial observability. This means that some information about the game state (like the cards held by other players) is hidden and the unknown elements are determined randomly (through card dealing).
   - **Impact on Strategy**: This randomness and hidden information fundamentally affect strategic decision-making in these games.
- **The Concept of Bluffing**
   - **Definition**: Bluffing is a strategy particularly relevant in poker, where a player acts in a way that misrepresents their actual hand to deceive opponents.
   - **Psychological Element**: Bluffing introduces a psychological dimension to the game, as it involves predicting and manipulating opponents' beliefs and reactions.
- **Abstraction for Similar Hands**
   - **Strategy**: To manage the vast number of possible hand combinations, many card game AI systems use abstraction techniques to group similar hands and treat them in a similar manner.
   - **Purpose**: This reduces the complexity of the decision-making process, making the computational task more manageable.
- **Poker Program Libratus**
   - **Notable AI**: Libratus is an advanced poker program that defeated top human professionals at heads-up no-limit Texas hold'em in 2017.
   - **Capabilities**: It employs sophisticated strategies, including bluffing, and is known for its ability to adapt to opponents' strategies.
   - **Technological Achievement**: Libratus is a testament to the advancements in AI, showcasing the ability to handle complex and uncertain environments like those in poker.

In summary, card games represent a unique challenge in the realm of AI due to their stochastic partial observability. Players must make decisions with incomplete information, and the element of chance plays a significant role. Strategies like bluffing add a layer of complexity, requiring AI systems to not only calculate probabilities but also understand and manipulate opponents' beliefs. Abstraction techniques are crucial for reducing the computational complexity, and advanced AI systems like Libratus highlight the progress in this field, successfully navigating the intricate dynamics of these games.
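
A simple form of hand abstraction can be sketched as follows. The example assumes two-card starting hands in Texas hold'em, and it is not the abstraction used by Libratus; `hand_class` is a hypothetical helper. Hands that differ only in which suits are involved are strategically equivalent before any community cards appear, so they can share one bucket labeled by the two ranks and whether the cards are suited (`s`) or offsuit (`o`).

```python
from itertools import combinations

RANKS = "23456789TJQKA"
SUITS = "cdhs"


def hand_class(card_a, card_b):
    """Map a two-card hand such as ('As', 'Kd') to a bucket label like 'AKo'."""
    (rank_a, suit_a), (rank_b, suit_b) = card_a, card_b
    high, low = sorted((rank_a, rank_b), key=RANKS.index, reverse=True)
    if high == low:
        return high + low  # pocket pair, e.g. 'QQ'
    return high + low + ("s" if suit_a == suit_b else "o")


deck = [rank + suit for rank in RANKS for suit in SUITS]

buckets = {}
for card_a, card_b in combinations(deck, 2):
    buckets.setdefault(hand_class(card_a, card_b), []).append((card_a, card_b))

total_hands = sum(len(hands) for hands in buckets.values())
```

This groups the 1,326 possible starting hands into 169 buckets. Abstractions used in practice go further and merge hands that are merely similar in strength, trading some accuracy for a much smaller game to solve.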

## 5.7 Limitations of Game Search Algorithms

- **Alpha-Beta Search and Heuristic Function Errors**
   - **Vulnerability**: The alpha-beta search algorithm's effectiveness is highly dependent on the accuracy of the heuristic evaluation function. Errors or inaccuracies in this function can lead to suboptimal decisions.
   - **Impact of Heuristic Errors**: Inaccurate evaluations can mislead the search process, causing the algorithm to overlook better moves or prioritize worse ones.
- **Limitation of Alpha-Beta and Monte Carlo Search**
   - **Design Focus**: Both algorithms are fundamentally designed to evaluate the values of legal moves within a game.
   - **Scope Limitation**: This focus limits their applicability in scenarios where broader strategic planning or understanding of game dynamics beyond individual moves is required.
- **Utility of Node Expansion**
   - **Selection of High-Utility Expansions**: A key limitation is determining which nodes to expand during the search. The idea is to prioritize node expansions that are of high utility — likely to lead to a better understanding of the game's outcome.
   - **Efficiency Concerns**: Efficiently selecting high-utility expansions is challenging but crucial for optimizing the search process.
- **Metareasoning**
   - **Definition**: Metareasoning refers to the process of reasoning about the reasoning process itself.
   - **Application**: In game search algorithms, metareasoning involves making decisions about how to allocate computational resources effectively, such as deciding when to deepen the search or when to apply certain heuristics.
- **Reasoning at the Level of Individual Moves**
   - **Limitation**: Both alpha-beta and Monte Carlo algorithms primarily operate at the level of individual moves.
   - **Strategic Depth**: This approach can overlook broader strategic elements of games that span multiple moves or require understanding of complex interactions and long-term consequences.
- **Introduction to Planning**
   - **Future Discussion**: The course will later introduce the concept of planning, which addresses some of these limitations by considering sequences of actions and their outcomes in a more holistic manner.
- **Incorporation of Machine Learning**
   - **Beyond Current Scope**: Strategies that incorporate machine learning, particularly learning from self-play, are not covered in this section but will be explored later.
   - **AlphaZero Example**: A notable example is AlphaZero, which reached superhuman strength in chess, Go, and shogi by learning through self-play, without relying on hand-crafted evaluation heuristics.

In summary, while game search algorithms like alpha-beta and Monte Carlo are powerful, they have limitations, particularly in their reliance on the accuracy of heuristic functions, their focus on individual moves, and their approach to node expansion. Future topics like planning and machine learning strategies, exemplified by systems like AlphaZero, address some of these limitations by offering a more holistic and adaptable approach to game strategy and decision-making.


## Overview of Adversarial Search and Games

- **Fundamentals of Game Theory**
   - Introduces the concept of competitive environments in AI, where multiple agents (players) have conflicting objectives. Focuses on the principles of game theory, particularly in zero-sum games where one player's gain is another's loss.
- **Adversarial Search**
   - Discusses adversarial search as a method to find optimal strategies in a competitive setting. Highlights the importance of anticipating an opponent's moves and formulating responses.
- **Minimax Algorithm**
   - Introduces the minimax algorithm, a foundational method for decision-making in two-player zero-sum games. It details how to choose moves by minimizing the potential maximum loss.
- **Alpha-Beta Pruning**
   - Explores alpha-beta pruning, an optimization technique for the minimax algorithm that significantly reduces the number of nodes evaluated in the search tree without affecting the final decision.
- **Evaluation Functions and Heuristics**
   - Discusses the role of evaluation functions in estimating the desirability of game positions and the use of heuristics to enhance the efficiency of game tree searches.
- **Monte Carlo Tree Search (MCTS)**
   - Introduces MCTS, a modern search strategy that uses random simulations for decision-making in complex games. Emphasizes its balance between exploration and exploitation.
- **Partially Observable and Stochastic Games**
   - Covers games with incomplete information (partially observable) and elements of chance (stochastic), focusing on the unique challenges they pose and the strategies employed, including probabilistic reasoning and bluffing.
- **Advanced Topics in Game Playing AI**
   - Delves into sophisticated concepts like forward pruning, move ordering, and metareasoning, highlighting the complexities in optimizing AI strategies for game playing.
- **Limitations and Future Directions**
   - Acknowledges the limitations of current game search algorithms and points towards future directions, including the integration of machine learning techniques, as exemplified by programs like AlphaZero.
- **Broad Implications in AI**
   - Concludes with a reflection on the broader implications of adversarial search and game theory in AI, underlining its significance in developing intelligent, strategic decision-making capabilities.