# Lab05: Games
Overview of the lab: 

1. Games
2. Minimax 
3. Alpha-beta pruning
4. Additional materials
----

## 1. Games
* ***Perfect Information (Knowledge of Moves)***: Every player knows every move that has been made by all players up to the current point in the game. Think of it as no hidden history.

* ***Complete Information (Knowledge of Rules/Payoffs)***: Every player knows the rules, possible strategies, and payoff structure for every player. Think of it as no hidden types/incentives; everyone knows what everyone else stands to gain or lose. 

|                          | deterministic  | chance           |
|----                      | ----           |      ----        |
| **perfect information**  | chess, go      | monopoly         |
| **complete information** | battleships    | poker, scrabble  |

### Does perfect information imply complete information?

* Perfect Information: **All past moves.** Every player knows the entire history of actions taken in the game up to the current moment (no hidden actions). Example: chess, where both players see every piece and every move made on the board.
* Complete Information: **All game rules/payoffs/types.** Every player knows the incentives (utility functions/payoffs) and strategies available to all other players (no hidden rules or goals).

NO! 

**The Game:** Imagine a sequential bargaining game where Player 1 makes a proposal, and Player 2 accepts or rejects it.

* Perfect Information? Yes. Player 2 sees Player 1's proposal (the action). If Player 2 rejects, Player 1 sees the rejection. All moves are transparent.

* Complete Information? Not necessarily. Player 1 might not know the exact minimum amount Player 2 is willing to accept (Player 2's payoff/type). Player 2's threshold is private, hidden information.

## 2. Minimax 

The core principle of Minimax is to **minimize the maximum possible loss** for the current player, assuming the opponent always plays optimally to maximize their own gain (and thus minimize your score).

The name "Minimax" comes from the recursive alternation between two roles:
1. MAX (Player 1): tries to maximize the score.
2. MIN (Player 2): tries to minimize the score.

### Game Tree 
Graph Representation: A game tree is a directed graph where:

* **Nodes** represent a possible state of the game (e.g., a board configuration).
* **Edges** represent a move from one game state to the next.
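
To make the node/edge vocabulary concrete, here is a minimal sketch for Tic-Tac-Toe. The board encoding (a tuple of nine cells holding `"X"`, `"O"`, or `" "`) and the helper name `get_children` are assumptions made for this illustration; the lab does not prescribe them.

```python
from typing import List, Tuple

Board = Tuple[str, ...]  # assumed encoding: 9 cells, row by row, " " = empty

def get_children(board: Board, player: str) -> List[Board]:
    """Return every board reachable by one move of `player` (one edge each)."""
    children = []
    for i, cell in enumerate(board):
        if cell == " ":  # an empty cell is a legal move
            children.append(board[:i] + (player,) + board[i + 1:])
    return children

empty: Board = (" ",) * 9
first_moves = get_children(empty, "X")
print(len(first_moves))  # 9 edges leave the root node
```

Each returned board is a child node, and the move that produced it is the edge from the parent.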

### Minimax summarised:

1. Minimax is a decision-making tool for two-player, turn-taking games.
2. Minimax explores a game tree by anticipating future moves.
3. Minimax uses an evaluation function to assign scores to board states.
4. Minimax chooses the optimal path through recursive exploration and backtracking.


### Evaluation functions

Tic-Tac-Toe:
* Player(X) wins -> score: **+1**
* Player(O) wins -> score: **-1**
* Draw -> score: **0**
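
The scoring rule above could be written roughly as follows. This is only a sketch: the tuple-based board encoding and the names `LINES`, `winner`, and `evaluate` are assumptions for illustration, and the function is meant to be called on finished games only.

```python
from typing import Optional, Tuple

Board = Tuple[str, ...]  # assumed encoding: 9 cells, row by row, " " = empty

# All eight winning lines: three rows, three columns, two diagonals.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board: Board) -> Optional[str]:
    """Return "X" or "O" if that player owns a full line, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def evaluate(board: Board) -> int:
    """Score a finished game from X's point of view: +1, -1, or 0."""
    return {"X": 1, "O": -1, None: 0}[winner(board)]

x_wins = ("X", "X", "X",
          "O", "O", " ",
          " ", " ", " ")
print(evaluate(x_wins))  # 1
```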

![tictactoe_game_tree](images/tictactoe.jpg)

### Backtracking through the Game Tree

![minimax_tree](images/minimax_tree.jpg)

![minimax_q](images/minimax_q.jpg)

![minimax_filled](images/minimax_filled.jpg)

```python
import numpy as np

def is_terminal(node):
    # In this toy tree a leaf is a plain integer score; a real game would
    # check for a win, loss, or draw here.
    return isinstance(node, int)

def evaluate(node):
    # For a real board this would compute a score; here the leaf is the score.
    return node

def get_children(node):
    # For a real game this would generate the successor states; here an
    # inner node is simply the list of its children.
    return node

def minimax(node, maximizingPlayer):
    if is_terminal(node):
        return evaluate(node)

    if maximizingPlayer:
        maxEval = -np.inf
        for child in get_children(node):
            value = minimax(child, False)  # the opponent (MIN) moves next
            maxEval = max(maxEval, value)
        return maxEval
    else:
        minEval = np.inf
        for child in get_children(node):
            value = minimax(child, True)   # MAX moves next
            minEval = min(minEval, value)
        return minEval

# Levels alternate MAX (root), MIN, MAX; integers are leaf scores.
game_tree = [
    [
        [-1, 3],
        [5, 1]
    ],
    [
        [-6, -4],
        [0, 9]
    ]
]

optimal_score = minimax(game_tree, True)
print(f"The Optimal Score for the starting MAX player is: {optimal_score}")
```

```
The Optimal Score for the starting MAX player is: 3
```
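
The function above returns only the backed-up score. To see the "optimal path" that backtracking selects, one option is to return the chosen child indices as well. The sketch below assumes the same nested-list tree with integer leaves; the name `minimax_path` is hypothetical.

```python
import math

def minimax_path(node, maximizing):
    """Return (value, path); path lists the child index chosen at each level."""
    if isinstance(node, int):          # leaf: the score is the node itself
        return node, []
    best_value = -math.inf if maximizing else math.inf
    best_path = []
    for index, child in enumerate(node):
        value, path = minimax_path(child, not maximizing)
        better = value > best_value if maximizing else value < best_value
        if better:                     # keep the first best child on ties
            best_value, best_path = value, [index] + path
    return best_value, best_path

tree = [[[-1, 3], [5, 1]], [[-6, -4], [0, 9]]]
value, path = minimax_path(tree, True)
print(value, path)  # 3 [0, 0, 1]
```

Here `[0, 0, 1]` means: MAX picks the first subtree, MIN answers with its first child, and MAX then takes the second leaf, which holds the value 3.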


### Properties of minimax
 
**Complete?** Yes, if the tree is finite.

----

**Optimal?** Yes, against an optimal opponent and under the conditions listed below.

### Conditions for Minimax Optimality

The Minimax algorithm guarantees the optimal move *for the worst-case scenario* against a rational opponent, provided the following conditions hold true:

* **Two-Player**
   * Only two players are involved.
   * The core $\max/\min$ structure breaks down with more players.
* **Zero-Sum** 
   * The gain of one player is exactly the loss of the other.
   * The $\max$ player's goal (maximize my score) is exactly the inverse of the $\min$ player's goal (minimize my opponent's score).
* **Perfect Information**
   * Both players know the entire state of the game (e.g., Chess, Checkers, Tic-Tac-Toe).
   * If the game involves hidden information (e.g., Poker), Minimax is no longer guaranteed to be optimal; other methods that account for the hidden state are needed.
* **Deterministic** 
   * There is no element of chance (no dice, shuffled cards, etc.).
   * If chance is involved, a variant called Expectiminimax (often shortened to Expectimax) is used, where the Minimax layers are interspersed with 'chance nodes' that calculate expected values.
* **Complete Search**
   * The algorithm must be able to search all the way to a terminal (game-ending) state.
   * In real-world games like Chess, this is impossible. The algorithm stops at a fixed depth (the search horizon) and uses a heuristic (evaluation function) to estimate the value of the non-terminal nodes, making the result only heuristically optimal (an approximation).
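
The chance-node idea mentioned under the **Deterministic** condition can be sketched as follows. The tree encoding here (tagged tuples `("max", ...)`, `("min", ...)`, `("chance", [(probability, child), ...])` with numeric leaves) is an assumption made for this illustration, not a standard format.

```python
def expectiminimax(node):
    """Value of a tree with MAX, MIN, and chance nodes (hypothetical encoding)."""
    if isinstance(node, (int, float)):      # leaf: a terminal payoff
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    if kind == "chance":                    # probability-weighted average
        return sum(p * expectiminimax(c) for p, c in children)
    raise ValueError(f"unknown node kind: {kind}")

# MAX chooses between two coin flips; after each flip MIN picks a leaf.
tree = ("max", [
    ("chance", [(0.5, ("min", [2, 4])), (0.5, ("min", [6, 8]))]),
    ("chance", [(0.5, ("min", [0, 10])), (0.5, ("min", [9, 9]))]),
])
print(expectiminimax(tree))  # 4.5
```

The first coin flip is worth 0.5 · 2 + 0.5 · 6 = 4.0 and the second 0.5 · 0 + 0.5 · 9 = 4.5, so MAX prefers the second.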


*Minimax is typically used in classical games like Chess, Checkers, and Tic-Tac-Toe.*

The moment any of the above conditions are violated, or a practical limitation is introduced, Minimax ceases to be the "optimal" strategy:

* **Against a Suboptimal Opponent:** Minimax still secures at least its guaranteed worst-case value, but it might miss an opportunity to win faster or more decisively. If you know your opponent is weak, you could choose a "trap" move that Minimax rejects (because a perfect opponent would avoid the trap), but that an imperfect opponent is likely to fall for.

* **Limited Search Depth:** As noted above, if the search is cut short by an arbitrary depth limit (which is necessary for complex games), the result is only as good as the heuristic evaluation function used at the leaves.
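
A depth-limited search can be sketched on the same kind of nested-list tree. Note the assumptions: the name `minimax_limited` is hypothetical, and the `heuristic` below (the average of the leaves under a node) is a toy stand-in chosen only so the example runs; a real evaluation function would score the position without looking ahead.

```python
def leaves(node):
    """Yield all leaf scores below a node (used only by the toy heuristic)."""
    if isinstance(node, int):
        yield node
    else:
        for child in node:
            yield from leaves(child)

def heuristic(node):
    # Toy stand-in for a real evaluation function: average of the leaves below.
    values = list(leaves(node))
    return sum(values) / len(values)

def minimax_limited(node, depth, maximizing):
    if isinstance(node, int):
        return node                      # true terminal value
    if depth == 0:
        return heuristic(node)           # cutoff: estimate instead of searching
    values = [minimax_limited(c, depth - 1, not maximizing) for c in node]
    return max(values) if maximizing else min(values)

tree = [[[-1, 3], [5, 1]], [[-6, -4], [0, 9]]]
print(minimax_limited(tree, 3, True))  # 3, the full search
print(minimax_limited(tree, 1, True))  # 2.0, an estimate from the heuristic
```

With a depth limit of 1 the search still prefers the first subtree, but the value it reports (2.0) is only the heuristic's estimate, not the true Minimax value (3).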

----

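**Time complexity** - $O(b^m)$, where $b$ is the branching factor (number of legal moves per state) and $m$ is the maximum depth of the tree.

----
**Space complexity** - $O(bm)$ for depth-first exploration, since only the current path and the siblings along it need to be kept in memory.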
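
As a quick sanity check of the $O(b^m)$ figure, the sketch below counts the leaves that plain Minimax would have to evaluate in a uniform tree. The helper `count_leaves` is hypothetical and exists only for this illustration.

```python
def count_leaves(b, m):
    """Number of leaves plain Minimax evaluates in a uniform tree
    with branching factor b and depth m."""
    if m == 0:
        return 1
    return sum(count_leaves(b, m - 1) for _ in range(b))

for b, m in [(2, 3), (3, 4), (10, 3)]:
    assert count_leaves(b, m) == b ** m
    print(f"b={b}, m={m}: {count_leaves(b, m)} leaves")
```

The recursion itself is only $m$ calls deep at any moment, which is why the memory footprint grows linearly with depth while the running time grows exponentially.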

## 3. Alpha-Beta Pruning

### Do we need to evaluate the whole game tree?

![alpha_beta_1](images/alphabeta1.jpg)

### NO
![alphabeta2](images/alphabeta2.jpg)

![alphabeta3](images/alphabeta3.jpg)

How much of the tree can be pruned depends on the order in which the moves are examined.

```python
import numpy as np

def is_terminal(node):
    # In this toy tree a leaf is a plain integer score.
    return isinstance(node, int)

def evaluate(node):
    # For a real board this would compute a score; here the leaf is the score.
    return node

def get_children(node):
    # An inner node is simply the list of its children.
    return node

def minimax(node, alpha, beta, maximizingPlayer):
    if is_terminal(node):
        return evaluate(node)

    if maximizingPlayer:
        maxEval = -np.inf
        for child in get_children(node):
            value = minimax(child, alpha, beta, False)
            maxEval = max(maxEval, value)
            alpha = max(alpha, value)
            print(f"eval: {value}, maxEval: {maxEval}, alpha: {alpha}, beta: {beta}")
            if beta <= alpha:
                break  # MIN already has a better option elsewhere: prune
        return maxEval
    else:
        minEval = np.inf
        for child in get_children(node):
            value = minimax(child, alpha, beta, True)
            minEval = min(minEval, value)
            beta = min(beta, value)
            print(f"eval: {value}, minEval: {minEval}, alpha: {alpha}, beta: {beta}")
            if beta <= alpha:
                break  # MAX already has a better option elsewhere: prune
        return minEval

game_tree = [
    [
        [-1, 3],
        [5, 1]
    ],
    [
        [-6, -4],
        [0, 9]
    ]
]

# The root is a MAX node, so the search starts with maximizingPlayer=True.
optimal_score = minimax(game_tree, -np.inf, np.inf, True)
print(f"The Optimal Score for the starting MAX player is: {optimal_score}")
```

```
eval: -1, maxEval: -1, alpha: -1, beta: inf
eval: 3, maxEval: 3, alpha: 3, beta: inf
eval: 3, minEval: 3, alpha: -inf, beta: 3
eval: 5, maxEval: 5, alpha: 5, beta: 3
eval: 5, minEval: 3, alpha: -inf, beta: 3
eval: 3, maxEval: 3, alpha: 3, beta: inf
eval: -6, maxEval: -6, alpha: 3, beta: inf
eval: -4, maxEval: -4, alpha: 3, beta: inf
eval: -4, minEval: -4, alpha: 3, beta: -4
eval: -4, maxEval: 3, alpha: 3, beta: inf
The Optimal Score for the starting MAX player is: 3
```


![ab1](images/albt1.jpg)
![ab2](images/albt2.jpg)
![ab3](images/albt3.jpg)
![ab4](images/albt4.jpg)

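$\alpha$ - the best (highest) value that MAX can guarantee so far, considering the current level and the levels above it.

$\beta$ - the best (lowest) value that MIN can guarantee so far, considering the current level and the levels above it.

**Requirement for pruning**: $\beta \leq \alpha$
 * At a MAX node this means MIN had a better option available earlier in the tree, so MIN will never let the game reach this node. At a MIN node the roles are reversed.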

### Properties of Alpha-Beta Pruning
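* Pruning does not affect the final result.
* Good move ordering improves the effectiveness of pruning. With perfect ordering, time complexity drops to $O(b^{m/2})$, where $b$ is the branching factor and $m$ is the maximum depth.

* When the tree is too large to search fully, use a cutoff test, e.g., a depth limit (perhaps with quiescence search added).

* At the cutoff, use an evaluation function, i.e., the estimated desirability of a position.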
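
The first two properties can be checked on a small example. The sketch below assumes the nested-list tree format with integer leaves; the function names `alphabeta` and `run` and the leaf counter are hypothetical additions for this illustration.

```python
import math

def alphabeta(node, alpha, beta, maximizing, counter):
    """Alpha-beta search that also counts evaluated leaves in counter[0]."""
    if isinstance(node, int):
        counter[0] += 1
        return node
    best = -math.inf if maximizing else math.inf
    for child in node:
        value = alphabeta(child, alpha, beta, not maximizing, counter)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if beta <= alpha:       # the opponent will never allow this branch
            break
    return best

def run(tree):
    counter = [0]
    value = alphabeta(tree, -math.inf, math.inf, True, counter)
    return value, counter[0]

original = [[[-1, 3], [5, 1]], [[-6, -4], [0, 9]]]
# Same game, but with the two root moves swapped (weaker move searched first).
swapped = [[[-6, -4], [0, 9]], [[-1, 3], [5, 1]]]

print(run(original))  # (3, 5)
print(run(swapped))   # (3, 6)
```

Both orderings return the Minimax value 3, but searching the stronger root move first evaluates 5 of the 8 leaves, while the swapped order evaluates 6.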

## 4. Additional Materials
1. Tic-Tac-Toe - [here](https://www.neverstopbuilding.com/blog/minimax)