# Week 1, Lab 3: Adversarial Search (Game-Playing AI)

## Welcome to Game AI!

So far, we've been solving problems where we control all the actions. But what about games where we face an **opponent** trying to beat us?

### What You'll Learn

- Game trees and game states
- Minimax algorithm
- Alpha-Beta pruning optimization
- Evaluation functions
- Building a Tic-Tac-Toe AI

### Real-World Applications

- Chess engines (Deep Blue, Stockfish)
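- Go AI (AlphaGo, which pairs tree search with neural networks)
- Video game opponents
- Poker bots (imperfect information, so they go beyond plain minimax)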
- Strategy game AI

In [None]:
# Import required libraries
import numpy as np
import matplotlib.pyplot as plt
from typing import List, Tuple, Optional
from copy import deepcopy
import time

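# 'seaborn-v0_8-darkgrid' was added in Matplotlib 3.6; skip it on older versions
if 'seaborn-v0_8-darkgrid' in plt.style.available:
    plt.style.use('seaborn-v0_8-darkgrid')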
%matplotlib inline

## 1. Game Theory Basics

### Key Concepts:

1. **Two players**: Usually called MAX and MIN
   - MAX wants to maximize the score
   - MIN wants to minimize the score

2. **Zero-sum game**: One player's gain is the other's loss

3. **Perfect information**: Both players can see everything

4. **Game tree**: All possible moves form a tree structure

### Simple Example: Number Picking Game

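- Two players take turns
- On each turn, a player picks a number from 1 to 3 and adds it to a shared running total
- The first player to bring the total to exactly 10 wins
- A player who pushes the total over 10 loses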
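
To make the rules concrete, here is a minimal sketch that solves this game by brute force. It assumes a single shared running total that starts at 0; `mover_wins` and `winning_picks` are illustrative names, not part of the lab code.

```python
from functools import lru_cache

TARGET = 10
PICKS = (1, 2, 3)

@lru_cache(maxsize=None)
def mover_wins(total: int) -> bool:
    """True if the player about to move can force a win from `total`."""
    for pick in PICKS:
        new_total = total + pick
        if new_total == TARGET:
            return True            # reaching exactly 10 wins immediately
        if new_total < TARGET and not mover_wins(new_total):
            return True            # leave the opponent in a losing position
    return False                   # every pick overshoots or hands over a win

def winning_picks(total: int) -> list:
    """All picks that keep the mover on a winning path."""
    good = []
    for pick in PICKS:
        new_total = total + pick
        if new_total == TARGET or (new_total < TARGET and not mover_wins(new_total)):
            good.append(pick)
    return good

print(mover_wins(0), winning_picks(0))                   # True [2]
print([t for t in range(TARGET) if not mover_wins(t)])   # [2, 6]
```

Under these assumptions the first player wins by picking 2, because totals of 2 and 6 are the only losing positions for whoever must move from them.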

In [None]:
def visualize_game_tree_simple():
    """
    Visualize a simple 2-ply game tree.
    
    In game terminology:
    - Ply = one player's turn
    - Depth = number of plies
    """
    fig, ax = plt.subplots(figsize=(14, 8))
    
    # Draw nodes at different levels
    # Level 0: MAX's turn
    ax.plot(5, 4, 'ro', markersize=20)
    ax.text(5, 4, 'MAX', ha='center', va='center', fontweight='bold')
    
    # Level 1: MIN's responses
    min_positions = [(2, 3), (5, 3), (8, 3)]
    for x, y in min_positions:
        ax.plot(x, y, 'bo', markersize=20)
        ax.text(x, y, 'MIN', ha='center', va='center', fontweight='bold', color='white')
        ax.plot([5, x], [4, y], 'k-', linewidth=2)
    
    # Level 2: MAX's responses (terminal states with values)
    outcomes = [
        [(0.5, 2, 3), (1.5, 2, 5), (2.5, 2, -2)],  # From first MIN
        [(4, 2, 1), (5, 2, -1), (6, 2, 4)],         # From second MIN
        [(7, 2, 2), (8, 2, 6), (9, 2, -3)]          # From third MIN
    ]
    
    for parent_idx, outcomes_group in enumerate(outcomes):
        parent_x, parent_y = min_positions[parent_idx]
        for x, y, value in outcomes_group:
            # Color based on value
            color = 'green' if value > 0 else 'red' if value < 0 else 'gray'
            ax.plot(x, y, 'o', color=color, markersize=15)
            ax.text(x, y, str(value), ha='center', va='center', fontweight='bold')
            ax.plot([parent_x, x], [parent_y, y], 'k-', linewidth=1, alpha=0.5)
    
    # Labels
    ax.text(-0.5, 4, 'Depth 0\n(MAX)', fontsize=10, ha='right')
    ax.text(-0.5, 3, 'Depth 1\n(MIN)', fontsize=10, ha='right')
    ax.text(-0.5, 2, 'Depth 2\n(Terminal)', fontsize=10, ha='right')
    
    ax.set_xlim(-1, 10)
    ax.set_ylim(1.5, 4.5)
    ax.axis('off')
    ax.set_title('Simple Game Tree\nMAX maximizes, MIN minimizes', fontsize=14, fontweight='bold')
    plt.tight_layout()
    plt.show()

visualize_game_tree_simple()

print("Game Tree Explanation:")
print("- Large red node: MAX's turn (wants high scores)")
print("- Blue nodes: MIN's turn (wants low scores)")
print("- Leaf nodes show final game values")
print("- Green leaf = good for MAX, red leaf = good for MIN")
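
The values in this figure can be backed up by hand. As a minimal sketch, the nested list below copies the nine leaf values from `outcomes`; `leaf_values` and the other names are illustrative.

```python
# Leaf values copied from the figure, grouped by their MIN parent (left to right)
leaf_values = [
    [3, 5, -2],
    [1, -1, 4],
    [2, 6, -3],
]

# Each MIN node picks the smallest value among its children
min_values = [min(group) for group in leaf_values]

# MAX at the root picks the largest of those minima
root_value = max(min_values)
best_branch = min_values.index(root_value)

print(f"MIN node values: {min_values}")                                 # [-2, -1, -3]
print(f"Root (MAX) value: {root_value}, reached via branch {best_branch}")  # -1, branch 1
```

Even though the right-hand branch contains the largest leaf (6), MAX cannot count on it, because MIN would answer with -3.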

## 2. Building Tic-Tac-Toe

Let's implement a complete Tic-Tac-Toe game as our testbed.

In [None]:
class TicTacToe:
    """Tic-Tac-Toe game implementation."""
    
    def __init__(self):
        # Board: 0 = empty, 1 = X, -1 = O
        self.board = np.zeros((3, 3), dtype=int)
        self.current_player = 1  # X starts
    
    def get_valid_moves(self) -> List[Tuple[int, int]]:
        """Return list of valid (row, col) moves."""
        return [(i, j) for i in range(3) for j in range(3) if self.board[i, j] == 0]
    
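    def make_move(self, row: int, col: int) -> bool:
        """Make a move. Returns True if valid."""
        # Reject off-board coordinates explicitly: negative indices would
        # otherwise wrap around silently in NumPy
        if not (0 <= row < 3 and 0 <= col < 3):
            return False
        if self.board[row, col] == 0:
            self.board[row, col] = self.current_player
            self.current_player *= -1  # Switch player
            return True
        return False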
    
    def check_winner(self) -> Optional[int]:
        """
        Check if there's a winner.
        Returns: 1 (X wins), -1 (O wins), 0 (draw), None (game continues)
        """
        # Check rows
        for i in range(3):
            if abs(self.board[i, :].sum()) == 3:
                return self.board[i, 0]
        
        # Check columns
        for j in range(3):
            if abs(self.board[:, j].sum()) == 3:
                return self.board[0, j]
        
        # Check diagonals
        if abs(self.board.trace()) == 3:
            return self.board[0, 0]
        if abs(np.fliplr(self.board).trace()) == 3:
            return self.board[0, 2]
        
        # Check for draw
        if len(self.get_valid_moves()) == 0:
            return 0
        
        return None  # Game continues
    
    def is_terminal(self) -> bool:
        """Check if game is over."""
        return self.check_winner() is not None
    
    def copy(self):
        """Create a deep copy of the game state."""
        new_game = TicTacToe()
        new_game.board = self.board.copy()
        new_game.current_player = self.current_player
        return new_game
    
    def display(self):
        """Visualize the board."""
        symbols = {0: ' ', 1: 'X', -1: 'O'}
        print("\n  0   1   2")
        for i in range(3):
            row_str = f"{i} "
            for j in range(3):
                row_str += f" {symbols[self.board[i, j]]} "
                if j < 2:
                    row_str += "│"
            print(row_str)
            if i < 2:
                print("  ───┼───┼───")
        print()

# Test the game
game = TicTacToe()
game.display()
game.make_move(1, 1)  # X in center
game.make_move(0, 0)  # O in corner
game.make_move(0, 2)  # X in corner
game.display()
print(f"Valid moves: {game.get_valid_moves()}")
print(f"Winner: {game.check_winner()}")
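
`check_winner` relies on the +1/-1 encoding: a line sums to +3 only if it holds three X's and to -3 only if it holds three O's. The sketch below restates that idea with plain lists so it runs without NumPy; `line_sums` and `winner_from_sums` are hypothetical helpers, not part of the `TicTacToe` class.

```python
def line_sums(board):
    """Sums of the 3 rows, 3 columns and 2 diagonals of a 3x3 board."""
    rows = [sum(row) for row in board]
    cols = [sum(board[r][c] for r in range(3)) for c in range(3)]
    diag = sum(board[i][i] for i in range(3))
    anti = sum(board[i][2 - i] for i in range(3))
    return rows + cols + [diag, anti]

def winner_from_sums(board):
    """1 if X has a full line, -1 if O has one, otherwise None."""
    for total in line_sums(board):
        if total == 3:
            return 1
        if total == -3:
            return -1
    return None

board = [
    [1, -1, 0],
    [-1, 1, 0],
    [0, 0, 1],   # X completes the main diagonal
]
print(line_sums(board))         # [0, 0, 1, 0, 0, 1, 3, 1]
print(winner_from_sums(board))  # 1
```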

## 3. Minimax Algorithm

Minimax is the foundation of game AI. It assumes both players play **optimally**.

### How Minimax Works:

1. **Build the game tree** of all possible moves
2. **At terminal nodes**, assign values (win/loss/draw)
3. **Propagate values up** the tree:
   - MAX nodes: Choose the **maximum** child value
   - MIN nodes: Choose the **minimum** child value
4. **Root node value** is the best outcome for MAX with optimal play

### Pseudocode:

```python
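def minimax(state, is_maximizing):
    # Sketch only: evaluate(), legal_moves() and result() stand in for
    # game-specific helpers; the next cell gives the runnable Tic-Tac-Toe version
    if state.is_terminal():
        return evaluate(state)

    if is_maximizing:
        best = float('-inf')
        for move in legal_moves(state):
            value = minimax(result(state, move), False)
            best = max(best, value)
        return best
    else:
        best = float('inf')
        for move in legal_moves(state):
            value = minimax(result(state, move), True)
            best = min(best, value)
        return best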
```

In [None]:
def minimax(game: TicTacToe, is_maximizing: bool, depth: int = 0) -> Tuple[int, Optional[Tuple[int, int]]]:
    """
    Minimax algorithm for Tic-Tac-Toe.
    
    Args:
        game: Current game state
        is_maximizing: True if MAX's turn (X), False if MIN's turn (O)
        depth: Current depth in tree (for tie-breaking)
    
    Returns:
        (best_value, best_move)
    """
    # Terminal state: return the value
    winner = game.check_winner()
    if winner is not None:
        if winner == 1:  # X (MAX) wins
            return (10 - depth, None)  # Prefer faster wins
        elif winner == -1:  # O (MIN) wins
            return (-10 + depth, None)  # Prefer slower losses
        else:  # Draw
            return (0, None)
    
    valid_moves = game.get_valid_moves()
    
    if is_maximizing:  # MAX's turn (X)
        best_value = -float('inf')
        best_move = None
        
        for move in valid_moves:
            # Try this move
            new_game = game.copy()
            new_game.make_move(move[0], move[1])
            
            # Recursively get the value
            value, _ = minimax(new_game, False, depth + 1)
            
            # Update best
            if value > best_value:
                best_value = value
                best_move = move
        
        return (best_value, best_move)
    
    else:  # MIN's turn (O)
        best_value = float('inf')
        best_move = None
        
        for move in valid_moves:
            new_game = game.copy()
            new_game.make_move(move[0], move[1])
            
            value, _ = minimax(new_game, True, depth + 1)
            
            if value < best_value:
                best_value = value
                best_move = move
        
        return (best_value, best_move)

# Test minimax on an empty board
game = TicTacToe()
print("Finding best move for X (MAX) on empty board...")
start_time = time.time()
value, move = minimax(game, is_maximizing=True)
elapsed = time.time() - start_time

print(f"Best move: {move}")
print(f"Expected value: {value}")
print(f"Time taken: {elapsed:.3f} seconds")
print(f"\nWith perfect play, the game should end in a {'X win' if value > 0 else 'O win' if value < 0 else 'draw'}")
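
As an independent cross-check on the value reported above, here is a compact sketch that solves Tic-Tac-Toe on a flat 9-tuple and caches positions it has already seen. It is not the lab's `minimax`: it returns only +1/0/-1 (no depth bonus, no move), and `solve`, `winner_of`, and `LINES` are illustrative names.

```python
from functools import lru_cache

# Index triples for the 8 winning lines on a board stored as a flat 9-tuple
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner_of(board):
    """1 if X has three in a row, -1 if O does, else 0."""
    for a, b, c in LINES:
        if board[a] != 0 and board[a] == board[b] == board[c]:
            return board[a]
    return 0

@lru_cache(maxsize=None)
def solve(board, player):
    """Game value from X's point of view with `player` (1 or -1) to move."""
    won = winner_of(board)
    if won != 0:
        return won
    empty = [i for i, cell in enumerate(board) if cell == 0]
    if not empty:
        return 0  # full board, nobody won: draw
    values = []
    for i in empty:
        child = board[:i] + (player,) + board[i + 1:]
        values.append(solve(child, -player))
    return max(values) if player == 1 else min(values)

empty_board = (0,) * 9
print(solve(empty_board, 1))  # 0: a draw with perfect play
```

Because many move orders reach the same position, caching on `(board, player)` lets this version finish far sooner than an uncached full search, at the cost of not distinguishing fast wins from slow ones.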

## 4. Let's Play Against Minimax!

Try to beat the AI (spoiler: against perfect play, the best you can get is a draw!).

In [None]:
def play_game_vs_ai(human_first: bool = True, use_alpha_beta: bool = False):
    """
    Play Tic-Tac-Toe against the AI.
    
    Args:
        human_first: If True, human plays X (goes first)
        use_alpha_beta: If True, use alpha-beta pruning (we'll implement this next!)
    """
    game = TicTacToe()
    human_player = 1 if human_first else -1
    ai_player = -human_player
    
    print("\n" + "="*50)
    print("TIC-TAC-TOE vs AI")
    print("="*50)
    print(f"You are {'X (first)' if human_first else 'O (second)'}")
    print("Enter moves as 'row col' (e.g., '1 1' for center)\n")
    
    moves_count = 0
    
    while not game.is_terminal():
        game.display()
        
        if game.current_player == human_player:
            # Human's turn
            valid = False
            while not valid:
                try:
                    move_input = input("Your move (row col): ").strip().split()
                    row, col = int(move_input[0]), int(move_input[1])
                    valid = game.make_move(row, col)
                    if not valid:
                        print("Invalid move! Try again.")
                except (ValueError, IndexError):
                    print("Invalid input! Use format 'row col' (0-2)")
        else:
            # AI's turn
            print("AI is thinking...")
            start = time.time()
            
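            is_maximizing = (ai_player == 1)
            # Honor use_alpha_beta; alpha_beta is defined in Section 5 and is
            # only looked up when this line runs, so run that cell first
            search = alpha_beta if use_alpha_beta else minimax
            _, move = search(game, is_maximizing)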
            
            elapsed = time.time() - start
            print(f"AI chose {move} (took {elapsed:.3f}s)")
            game.make_move(move[0], move[1])
        
        moves_count += 1
    
    # Game over
    game.display()
    winner = game.check_winner()
    
    if winner == human_player:
        print("🎉 You won! (Wait, that shouldn't be possible with perfect AI...)")
    elif winner == ai_player:
        print("🤖 AI wins! Better luck next time.")
    else:
        print("🤝 It's a draw! Well played.")
    
    print(f"\nGame finished in {moves_count} moves.")

# Uncomment to play (interactive)
# play_game_vs_ai(human_first=True)

## 5. Alpha-Beta Pruning ✂️

Minimax explores **every** possible move. But we can skip many branches that won't affect the final decision!

### The Key Insight:

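- **Alpha (α)**: The highest value MAX can already guarantee along the current path
- **Beta (β)**: The lowest value MIN can already guarantee along the current path

Once a node's value is known to fall outside the window between α and β (that is, β ≤ α), the other player would never allow play to reach it, so we can **prune** (skip) its remaining children!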

### Example:
```
MAX is choosing between branches:
- Branch A evaluated to 5
- Branch B: First child is 3, and it's MIN's turn
  → MIN will choose ≤ 3
  → No need to check other children of Branch B!
  → Branch A (5) is better than Branch B (≤3)
```
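
The same reasoning as a minimal sketch: a MIN node stops reading its children as soon as its running minimum can no longer beat what MAX already has. The values 9 and 8 for Branch B's unexamined children are made up for illustration, and `min_node_with_cutoff` is a hypothetical helper.

```python
def min_node_with_cutoff(children, alpha):
    """
    Evaluate a MIN node whose children are leaf values.

    alpha is the value MAX can already guarantee elsewhere.
    Returns (bound_or_value, number_of_children_examined).
    """
    best = float('inf')
    examined = 0
    for value in children:
        examined += 1
        best = min(best, value)
        if best <= alpha:
            break  # MAX already has something at least this good
    return best, examined

alpha = 5                      # Branch A evaluated to 5
branch_b = [3, 9, 8]           # only the first child is ever read
bound, examined = min_node_with_cutoff(branch_b, alpha)
print(f"Branch B is worth at most {bound}; examined {examined} of {len(branch_b)} children")
```

When the loop exits early, the returned number is only an upper bound on the node's value, which is all MAX needs to reject the branch.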

### Performance:
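- Minimax: Explores O(b^d) nodes (b = branching factor, d = depth)
- Alpha-Beta: Explores about O(b^(d/2)) nodes in the best case (best move tried first); with poor ordering it can still visit O(b^d)
- **Same minimax value, often much faster!**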
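
To get a feel for these growth rates, the sketch below compares the number of leaves in a uniform tree with the best-case count for alpha-beta. The best-case formula, b^⌈d/2⌉ + b^⌊d/2⌋ − 1, is the classic Knuth and Moore result for a perfectly ordered tree; it is brought in here for illustration and is not derived in this lab.

```python
def minimax_leaves(b: int, d: int) -> int:
    """Leaves in a uniform tree with branching factor b and depth d."""
    return b ** d

def alpha_beta_best_case_leaves(b: int, d: int) -> int:
    """Leaves examined by alpha-beta when the best move is always tried first."""
    return b ** ((d + 1) // 2) + b ** (d // 2) - 1

for b, d in [(3, 2), (10, 4), (10, 6)]:
    full = minimax_leaves(b, d)
    best = alpha_beta_best_case_leaves(b, d)
    print(f"b={b}, d={d}: minimax {full:,} leaves, alpha-beta best case {best:,}")
```

For b=10 and d=6 this gives 1,000,000 leaves against 1,999, which is why a perfectly ordered alpha-beta search can look roughly twice as deep in the same time.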

In [None]:
# Global counters for comparison
minimax_nodes = 0
alphabeta_nodes = 0

def minimax_with_count(game: TicTacToe, is_maximizing: bool, depth: int = 0) -> Tuple[int, Optional[Tuple[int, int]]]:
    """Minimax with node counting."""
    global minimax_nodes
    minimax_nodes += 1
    
    winner = game.check_winner()
    if winner is not None:
        if winner == 1:
            return (10 - depth, None)
        elif winner == -1:
            return (-10 + depth, None)
        else:
            return (0, None)
    
    valid_moves = game.get_valid_moves()
    best_move = None
    
    if is_maximizing:
        best_value = -float('inf')
        for move in valid_moves:
            new_game = game.copy()
            new_game.make_move(move[0], move[1])
            value, _ = minimax_with_count(new_game, False, depth + 1)
            if value > best_value:
                best_value = value
                best_move = move
        return (best_value, best_move)
    else:
        best_value = float('inf')
        for move in valid_moves:
            new_game = game.copy()
            new_game.make_move(move[0], move[1])
            value, _ = minimax_with_count(new_game, True, depth + 1)
            if value < best_value:
                best_value = value
                best_move = move
        return (best_value, best_move)


def alpha_beta(game: TicTacToe, is_maximizing: bool, 
               alpha: float = -float('inf'), beta: float = float('inf'), 
               depth: int = 0) -> Tuple[int, Optional[Tuple[int, int]]]:
    """
    Alpha-Beta Pruning - optimized minimax.
    
    Args:
        game: Current game state
        is_maximizing: True if MAX's turn
        alpha: Best value for MAX so far
        beta: Best value for MIN so far
        depth: Current depth
    
    Returns:
        (best_value, best_move)
    """
    global alphabeta_nodes
    alphabeta_nodes += 1
    
    # Terminal state
    winner = game.check_winner()
    if winner is not None:
        if winner == 1:
            return (10 - depth, None)
        elif winner == -1:
            return (-10 + depth, None)
        else:
            return (0, None)
    
    valid_moves = game.get_valid_moves()
    best_move = None
    
    if is_maximizing:  # MAX's turn
        best_value = -float('inf')
        
        for move in valid_moves:
            new_game = game.copy()
            new_game.make_move(move[0], move[1])
            
            value, _ = alpha_beta(new_game, False, alpha, beta, depth + 1)
            
            if value > best_value:
                best_value = value
                best_move = move
            
            alpha = max(alpha, best_value)
            
            # Beta cutoff: MIN won't let us get here
            if beta <= alpha:
                break  # Prune remaining branches
        
        return (best_value, best_move)
    
    else:  # MIN's turn
        best_value = float('inf')
        
        for move in valid_moves:
            new_game = game.copy()
            new_game.make_move(move[0], move[1])
            
            value, _ = alpha_beta(new_game, True, alpha, beta, depth + 1)
            
            if value < best_value:
                best_value = value
                best_move = move
            
            beta = min(beta, best_value)
            
            # Alpha cutoff: MAX won't let us get here
            if beta <= alpha:
                break  # Prune remaining branches
        
        return (best_value, best_move)

# Compare performance
game = TicTacToe()
print("Performance Comparison: Minimax vs Alpha-Beta Pruning")
print("="*70)

# Test minimax
minimax_nodes = 0
start = time.time()
value1, move1 = minimax_with_count(game, True)
time1 = time.time() - start

# Test alpha-beta
alphabeta_nodes = 0
start = time.time()
value2, move2 = alpha_beta(game, True)
time2 = time.time() - start

print(f"\nMinimax:")
print(f"  Nodes explored: {minimax_nodes:,}")
print(f"  Time: {time1:.4f}s")
print(f"  Best move: {move1}, Value: {value1}")

print(f"\nAlpha-Beta:")
print(f"  Nodes explored: {alphabeta_nodes:,}")
print(f"  Time: {time2:.4f}s")
print(f"  Best move: {move2}, Value: {value2}")

print(f"\n{'='*70}")
print(f"Improvement:")
print(f"  Nodes pruned: {minimax_nodes - alphabeta_nodes:,} ({(1 - alphabeta_nodes/minimax_nodes)*100:.1f}% reduction)")
print(f"  Speedup: {time1/time2:.2f}x faster")
same_result = (value1 == value2 and move1 == move2)
print(f"  Same result? {same_result} {'✓' if same_result else '✗'}")

## 6. Visualizing Pruning

Let's see alpha-beta pruning in action with a simplified game tree.

In [None]:
def visualize_alpha_beta_pruning():
    """
    Visualize which branches get pruned.
    """
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(18, 8))
    
    # Left: Full minimax tree
    ax1.set_title('Minimax (explores everything)', fontsize=14, fontweight='bold')
    ax1.text(5, 4, 'MAX\n?', ha='center', va='center', bbox=dict(boxstyle='round', facecolor='red', alpha=0.5))
    
    # MIN layer
    for x in [2, 5, 8]:
        ax1.plot([5, x], [4, 3], 'k-', linewidth=2)
        ax1.text(x, 3, 'MIN\n?', ha='center', va='center', bbox=dict(boxstyle='round', facecolor='blue', alpha=0.5))
    
    # Terminal nodes
    terminals = [
        [1, 1.5, '3'], [2, 1.5, '5'], [3, 1.5, '2'],  # First MIN
        [4, 1.5, '1'], [5, 1.5, '4'], [6, 1.5, '6'],  # Second MIN
        [7, 1.5, '7'], [8, 1.5, '2'], [9, 1.5, '5'],  # Third MIN
    ]
    
    parents = [2, 2, 2, 5, 5, 5, 8, 8, 8]
    for i, (x, y, val) in enumerate(terminals):
        ax1.plot([parents[i], x], [3, y], 'k-', linewidth=1, alpha=0.5)
        ax1.text(x, y, val, ha='center', va='center', 
                bbox=dict(boxstyle='round', facecolor='white'))
    
    ax1.set_xlim(0, 10)
    ax1.set_ylim(1, 4.5)
    ax1.axis('off')
    ax1.text(5, 0.5, f'Nodes explored: {len(terminals) + 4}', ha='center', fontsize=12)
    
    # Right: Alpha-beta pruned tree
    ax2.set_title('Alpha-Beta (prunes branches)', fontsize=14, fontweight='bold')
    ax2.text(5, 4, 'MAX\n2', ha='center', va='center', bbox=dict(boxstyle='round', facecolor='lightgreen', alpha=0.7))
    
    # MIN layer with values (≤ marks a bound from a node that was cut off early)
    min_values = [(2, '2'), (5, '≤1'), (8, '≤2')]
    for x, val in min_values:
        ax2.plot([5, x], [4, 3], 'k-', linewidth=2)
        ax2.text(x, 3, f'MIN\n{val}', ha='center', va='center', 
                bbox=dict(boxstyle='round', facecolor='lightblue', alpha=0.7))
    
    # Show evaluated and pruned terminals
    evaluated = [
        (1, 1.5, '3', 2, False),
        (2, 1.5, '5', 2, False),
        (3, 1.5, '2', 2, False),
        (4, 1.5, '1', 5, False),
        (5, 1.5, '4', 5, True),  # PRUNED: 1 <= alpha (2)
        (6, 1.5, '6', 5, True),  # PRUNED
        (7, 1.5, '7', 8, False),  # 7 > alpha (2), so keep reading
        (8, 1.5, '2', 8, False),  # 2 <= alpha (2), cut off after this leaf
        (9, 1.5, '5', 8, True),  # PRUNED
    ]
    
    for x, y, val, parent, pruned in evaluated:
        color = 'lightgray' if pruned else 'white'
        alpha_line = 0.2 if pruned else 1.0
        ax2.plot([parent, x], [3, y], 'k-', linewidth=1, alpha=alpha_line)
        
        text = '✂️' if pruned else val
        ax2.text(x, y, text, ha='center', va='center',
                bbox=dict(boxstyle='round', facecolor=color))
    
    ax2.set_xlim(0, 10)
    ax2.set_ylim(1, 4.5)
    ax2.axis('off')
    
    explored = sum(1 for _, _, _, _, p in evaluated if not p) + 4
    ax2.text(5, 0.5, f'Nodes explored: {explored} (saved {len(terminals) + 4 - explored})', 
            ha='center', fontsize=12)
    
    plt.tight_layout()
    plt.show()

visualize_alpha_beta_pruning()

print("\nHow Alpha-Beta Works:")
print("1. MAX explores first MIN node, gets value 2")
print("2. MAX now knows it can get at least 2")
print("3. MAX explores second MIN node")
print("4. Second MIN's first child is 1 (< 2)")
print("5. MIN will choose ≤ 1, so MAX won't pick this branch")
print("6. ✂️ PRUNE remaining children of second MIN!")
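print("7. Third MIN reads 7, then 2: it will choose ≤ 2, no better than MAX's 2")
print("8. ✂️ PRUNE its last child; MAX's final value is 2")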
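
To check a walkthrough like this one, it helps to run alpha-beta on the same tree and record which leaves are actually read. The sketch below does that for the nine leaf values in the figure; `alpha_beta_tree` is an illustrative function over nested lists, separate from the Tic-Tac-Toe `alpha_beta` above.

```python
def alpha_beta_tree(node, is_maximizing, alpha, beta, visited):
    """Alpha-beta over a nested-list tree; ints are leaves. Records leaves read."""
    if isinstance(node, int):
        visited.append(node)
        return node

    if is_maximizing:
        best = float('-inf')
        for child in node:
            best = max(best, alpha_beta_tree(child, False, alpha, beta, visited))
            alpha = max(alpha, best)
            if beta <= alpha:
                break
        return best
    else:
        best = float('inf')
        for child in node:
            best = min(best, alpha_beta_tree(child, True, alpha, beta, visited))
            beta = min(beta, best)
            if beta <= alpha:
                break
        return best

tree = [[3, 5, 2], [1, 4, 6], [7, 2, 5]]
visited = []
value = alpha_beta_tree(tree, True, float('-inf'), float('inf'), visited)
print(f"Root value: {value}")        # 2
print(f"Leaves read: {visited}")     # [3, 5, 2, 1, 7, 2]
```

Six of the nine leaves are read: 4 and 6 are skipped under the second MIN node, and 5 under the third, once that node's running minimum drops to 2 and can no longer beat the first branch.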

## 7. Move Ordering Matters!

Alpha-beta prunes more when we check better moves first.

In [None]:
def alpha_beta_with_ordering(game: TicTacToe, is_maximizing: bool,
                            alpha: float = -float('inf'), 
                            beta: float = float('inf'),
                            depth: int = 0) -> Tuple[int, Optional[Tuple[int, int]]]:
    """
    Alpha-Beta with simple move ordering heuristic.
    Tries center and corners first (usually better in Tic-Tac-Toe).
    """
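    # Count this node so the comparison below measures the ordered search
    global alphabeta_nodes
    alphabeta_nodes += 1
    
    winner = game.check_winner()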
    if winner is not None:
        if winner == 1:
            return (10 - depth, None)
        elif winner == -1:
            return (-10 + depth, None)
        else:
            return (0, None)
    
    valid_moves = game.get_valid_moves()
    
    # Move ordering: prioritize center, then corners, then edges
    def move_priority(move):
        r, c = move
        if (r, c) == (1, 1):  # Center
            return 0
        elif (r + c) % 2 == 0:  # Corners
            return 1
        else:  # Edges
            return 2
    
    valid_moves.sort(key=move_priority)
    
    best_move = None
    
    if is_maximizing:
        best_value = -float('inf')
        for move in valid_moves:
            new_game = game.copy()
            new_game.make_move(move[0], move[1])
            value, _ = alpha_beta_with_ordering(new_game, False, alpha, beta, depth + 1)
            if value > best_value:
                best_value = value
                best_move = move
            alpha = max(alpha, best_value)
            if beta <= alpha:
                break
        return (best_value, best_move)
    else:
        best_value = float('inf')
        for move in valid_moves:
            new_game = game.copy()
            new_game.make_move(move[0], move[1])
            value, _ = alpha_beta_with_ordering(new_game, True, alpha, beta, depth + 1)
            if value < best_value:
                best_value = value
                best_move = move
            beta = min(beta, best_value)
            if beta <= alpha:
                break
        return (best_value, best_move)

# Compare with and without move ordering
game = TicTacToe()

alphabeta_nodes = 0
start = time.time()
value1, move1 = alpha_beta(game, True)
time1 = time.time() - start
nodes1 = alphabeta_nodes

alphabeta_nodes = 0
start = time.time()
value2, move2 = alpha_beta_with_ordering(game, True)
time2 = time.time() - start

print("Move Ordering Impact:")
print("="*60)
print(f"Without ordering: {nodes1:,} nodes in {time1:.4f}s")
print(f"With ordering:    {alphabeta_nodes:,} nodes in {time2:.4f}s")
print(f"\nImprovement: {(1 - alphabeta_nodes/nodes1)*100:.1f}% fewer nodes")
print("\n💡 Good move ordering makes alpha-beta even more efficient!")
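
The effect of ordering is easiest to see on a tiny hand-built tree. In this sketch the same tree is searched twice, once with the strongest branch first and once with every node's children reversed; the leaf values and the names `alpha_beta_count` and `reversed_tree` are made up for illustration.

```python
def alpha_beta_count(node, is_maximizing, alpha=float('-inf'), beta=float('inf')):
    """Alpha-beta over nested lists. Returns (value, leaves_examined)."""
    if isinstance(node, int):
        return node, 1

    best = float('-inf') if is_maximizing else float('inf')
    examined = 0
    for child in node:
        value, n = alpha_beta_count(child, not is_maximizing, alpha, beta)
        examined += n
        if is_maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if beta <= alpha:
            break  # remaining children cannot change the result
    return best, examined

def reversed_tree(node):
    """Same tree with the children of every node in reverse order."""
    if isinstance(node, int):
        return node
    return [reversed_tree(child) for child in reversed(node)]

best_first = [[6, 7, 8], [3, 9, 9], [2, 9, 9]]
worst_first = reversed_tree(best_first)

print(alpha_beta_count(best_first, True))   # (6, 5)
print(alpha_beta_count(worst_first, True))  # (6, 9)
```

Both searches return 6, but the best-first order reads 5 leaves while the reversed order reads all 9, so ordering changes the work done and never the answer.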

## 8. Key Takeaways

### Minimax
- **Purpose**: Find optimal move assuming opponent plays optimally
- **How**: Recursively build game tree, maximize at MAX nodes, minimize at MIN nodes
- **Pros**: Guaranteed optimal (with perfect info)
- **Cons**: Exponential time complexity O(b^d)

### Alpha-Beta Pruning
- **Purpose**: Speed up minimax without changing result
- **How**: Prune branches that can't affect final decision
- **Performance**: About O(b^(d/2)) in the best case (ideal move ordering); still O(b^d) in the worst case
- **Result**: Same as minimax, just faster!

### Move Ordering
- Check better moves first
- More pruning = faster search
- Heuristics help (center/corners in TTT)

### Real-World Limitations
1. **Too many states**: Chess has on the order of 10^44 legal positions, and its game tree holds roughly 10^120 possible games (the Shannon number)
2. **Solution**: Depth-limited search + evaluation function
3. **Evaluation function**: Estimate game value without reaching end
   - Example: Material count in chess (queen=9, rook=5, etc.)
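
A minimal sketch of how depth-limited search and an evaluation function fit together, using Tic-Tac-Toe so it stays small. The heuristic (lines still open for X minus lines still open for O) is a common textbook choice assumed here, not something this lab defines; `WIN = 100` is an arbitrary constant, and all names are illustrative.

```python
# Board is a flat 9-tuple: 0 = empty, 1 = X (MAX), -1 = O (MIN)
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]
WIN = 100  # larger than any heuristic score, so real wins dominate estimates

def winner_of(board):
    """1 if X has three in a row, -1 if O does, else 0."""
    for a, b, c in LINES:
        if board[a] != 0 and board[a] == board[b] == board[c]:
            return board[a]
    return 0

def evaluate(board):
    """Heuristic: lines still open for X minus lines still open for O."""
    open_x = sum(1 for line in LINES if all(board[i] != -1 for i in line))
    open_o = sum(1 for line in LINES if all(board[i] != 1 for i in line))
    return open_x - open_o

def depth_limited_minimax(board, player, depth):
    """Minimax that falls back to evaluate() once `depth` plies are used up."""
    won = winner_of(board)
    if won != 0:
        return won * WIN
    moves = [i for i, cell in enumerate(board) if cell == 0]
    if not moves:
        return 0                      # draw
    if depth == 0:
        return evaluate(board)        # cutoff: estimate instead of searching on
    values = [
        depth_limited_minimax(board[:i] + (player,) + board[i + 1:], -player, depth - 1)
        for i in moves
    ]
    return max(values) if player == 1 else min(values)

after_center = (0, 0, 0, 0, 1, 0, 0, 0, 0)
print(evaluate(after_center))                                             # 4
print(depth_limited_minimax((1, 1, 0, -1, -1, 0, 0, 0, 0), 1, depth=1))   # 100
```

The structure is the same as full minimax with one extra base case; the quality of play at shallow depths then depends almost entirely on how well `evaluate` tracks the true game value.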

## Next Up

In Lab 4, we'll explore:
- Using professional libraries (NetworkX)
- Real-world graph search
- Advanced search techniques
- Performance optimization

## Practice Exercises

1. Implement Connect Four with minimax
2. Add depth-limited search to handle larger games
3. Create evaluation functions for partial game states
4. Compare different move ordering strategies
5. Implement iterative deepening (deeper search when time permits)
6. Build a chess piece value evaluator

Excellent work! You now understand the core search techniques behind classic game-playing programs! 🎮♟️