<p style="font-size: 18px;">
  This is the accompanying code for the post titled "Unlocking the Power of Heuristic Functions in Minimax Algorithms: A Step-by-Step Guide to Creating Smarter Game AI with Enhanced Decision-Making"<br>
  You can find it <a href="https://pureai.substack.com/p/unlocking-the-power-of-heuristic-functions">here</a>.<br>
  Published: September 16, 2023<br>
  <a href="https://pureai.substack.com">https://pureai.substack.com</a>
</p>

Welcome to this Jupyter notebook! If you're new to Python or don't have it installed on your system, don't worry; you can still follow along and explore the code.

Here's a quick guide to getting started:

- Using an Online Platform: You can run this notebook in a web browser using platforms like Google Colab or Binder. These services offer free access to Jupyter notebooks and don't require any installation.
- Installing Python Locally: If you'd prefer to run this notebook on your own machine, you'll need to install Python. A popular distribution for scientific computing is Anaconda, which includes Python, Jupyter, and other useful tools.
  - Download Anaconda from [here](https://www.anaconda.com/download).
  - Follow the installation instructions for your operating system.
  - Launch Jupyter Notebook from Anaconda Navigator or by typing jupyter notebook in your command line or terminal.
- Opening the Notebook: Once you have Jupyter running, navigate to the location of this notebook file (.ipynb) and click on it to open.
- Running the Code: You can run each cell in the notebook by selecting it and pressing Shift + Enter. Feel free to modify the code and experiment with it.
- Need More Help?: If you're new to Python or Jupyter notebooks, you might find these resources helpful:
  - [Python.org's Beginner's Guide](https://docs.python.org/3/tutorial/index.html)
  - [Jupyter Notebook Basics](https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Notebook%20Basics.html)

Happy coding, and enjoy exploring the fascinating world of Adversarial Search Algorithms!

In [1]:
# A small readme
from platform import python_version
print(f"The version of Python used is: {python_version()}")

import random
from typing import List, Tuple, Union, Any, Dict, Callable, Set, Optional
from numpy import ndarray

The version of Python used is: 3.10.12


## Alpha-Beta Minimax Algorithm and Tic-Tac-Toe: An In-Depth Exploration

### Introduction

Welcome to this Jupyter Notebook, where we explore not just the Minimax algorithm but also its optimized version, Alpha-Beta Pruning, in the context of game theory and adversarial search. The objectives of this notebook are:

1. To Understand Alpha-Beta Minimax: We'll dissect both the Minimax algorithm and its more efficient variant, Alpha-Beta Pruning. We'll look at their conceptual foundations, principles, and pseudocode representations.
1. To Apply Alpha-Beta Minimax: We'll implement minimax with alpha-beta pruning in Python and integrate it into a tic-tac-toe game, allowing you to play against an enhanced AI opponent.

### Why is Alpha-Beta Minimax Important?

The Minimax algorithm and its optimized counterpart, Alpha-Beta Pruning, are central to game-playing AI. They are widely used in two-player, turn-based games such as chess, checkers, and tic-tac-toe. Minimax chooses the move whose worst-case outcome is best for the player, assuming the opponent also plays optimally. Alpha-Beta Pruning returns the same result while skipping branches that cannot affect the final decision, which reduces the number of nodes evaluated in the search tree and speeds up the search.
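
To make the worst-case idea concrete, here is a minimal sketch of plain minimax on a tiny hand-built tree. The nested-list tree format, the name `toy_tree`, and the leaf scores are assumptions chosen for illustration; they are not part of the tic-tac-toe code that follows.

```python
from typing import List, Union

# A game tree as nested lists: inner lists are decision points,
# integers are final scores from the maximizing player's point of view.
Tree = Union[int, List["Tree"]]

def minimax(node: Tree, maximizing: bool) -> int:
    """Return the score the player to move can guarantee at this node."""
    if isinstance(node, int):  # leaf: the game is over
        return node
    child_scores = [minimax(child, not maximizing) for child in node]
    return max(child_scores) if maximizing else min(child_scores)

# The maximizer picks one of three moves; the minimizer then replies.
toy_tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

print(minimax(toy_tree, maximizing=True))  # 3
```

The second and third moves contain tempting leaves (6 and 14), but the minimizer would answer each with a 2, so the first move, whose worst case is 3, is the one minimax selects.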

These techniques have applications beyond just games; they are also relevant in areas such as economics, politics, and negotiation where adversarial or competitive interactions can be modeled.

### What Will You Gain from This Notebook?
- Conceptual Understanding: You'll grasp the theoretical foundations of both the Minimax and Alpha-Beta Pruning algorithms and understand their significance in game theory and AI.
- Practical Skills: By implementing these algorithms in Python, you'll be able to see the theories in action. We'll build a tic-tac-toe game where you can experience firsthand the effectiveness of these algorithms.
- AI in Gaming: Gain insights into how intelligent behavior in games can be generated through relatively simple but powerful algorithms.

Let's dive in and explore these fascinating topics!

We start by importing the one required library, NumPy.

In [2]:
import numpy as np

Next, we define a function that prints the board.

In [3]:
# Function to print the Tic-Tac-Toe board
def print_board(board):
    for row in board:
        print(" | ".join(row))

### check_winner

The check_winner function checks if a player with a given sign (usually 'X' or 'O') has won a tic-tac-toe game. The function checks for three winning conditions:

- All elements in a row are the same as the provided sign.
- All elements in a column are the same as the provided sign.
- All elements in a diagonal are the same as the provided sign.

* **board** (np.ndarray): A 2D NumPy array representing the tic-tac-toe board.
* **sign** (str): A string representing the sign of the player ('X' or 'O').

**returns** bool: True if the player with the given sign has won, otherwise False.

In [4]:
# Function to check if someone has won
def check_winner(board: np.ndarray, sign: str) -> bool:
    for row in board:
        if np.all(row == sign):
            return True
    for col in board.T:
        if np.all(col == sign):
            return True
    if np.all(np.diag(board) == sign) or np.all(np.diag(np.fliplr(board)) == sign):
        return True
    return False

### unit test: check_winner

In [5]:
# Initialize a sample board with 'X' winning by row
board1 = np.array([['X', 'X', 'X'],
                   [' ', ' ', ' '],
                   [' ', ' ', ' ']])

# Initialize another sample board with 'O' winning by diagonal
board2 = np.array([['O', ' ', ' '],
                   [' ', 'O', ' '],
                   [' ', ' ', 'O']])

# Initialize another sample board with no winner
board3 = np.array([['X', 'O', ' '],
                   [' ', 'O', 'X'],
                   [' ', ' ', 'O']])

assert check_winner(board1, 'X') == True, "Test Case 1 Failed"
assert check_winner(board2, 'O') == True, "Test Case 2 Failed"
assert check_winner(board3, 'X') == False, "Test Case 3 Failed"
assert check_winner(board3, 'O') == False, "Test Case 4 Failed"

print("All test cases pass!")

All test cases pass!
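
To see why the three checks in check_winner cover every winning line, the sketch below labels each cell of a 3x3 array with a letter and prints what each NumPy expression selects. The `demo` array is an assumption made up for illustration.

```python
import numpy as np

demo = np.array([['a', 'b', 'c'],
                 ['d', 'e', 'f'],
                 ['g', 'h', 'i']])

rows = [''.join(r) for r in demo]              # iterating a 2D array yields its rows
cols = [''.join(c) for c in demo.T]            # transposing turns columns into rows
main_diag = ''.join(np.diag(demo))             # top-left to bottom-right
anti_diag = ''.join(np.diag(np.fliplr(demo)))  # top-right to bottom-left

print(rows)       # ['abc', 'def', 'ghi']
print(cols)       # ['adg', 'beh', 'cfi']
print(main_diag)  # aei
print(anti_diag)  # ceg

# Comparing an array with a string is element-wise, so np.all asks
# "does every cell in this line hold the sign?"
print(np.all(np.array(['X', 'X', 'X']) == 'X'))  # True
print(np.all(np.array(['X', 'O', 'X']) == 'X'))  # False
```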


### evaluate_row_or_col

The evaluate_row_or_col function evaluates a row, column, or diagonal in a Tic-Tac-Toe board. It updates a dictionary with counts of different board situations, like two 'X's and an empty cell (num_x2s), two 'O's and an empty cell (num_o2s), etc. The function is used for evaluating the state of the Tic-Tac-Toe board and determining the best move for a player.

* **array** (np.ndarray): A 1-dimensional NumPy array containing the row, column, or diagonal to evaluate. Expected values in the array are 0 for empty, 1 for 'X', and 2 for 'O'.
* **counts** (dict): A dictionary that will be updated with the counts of specific situations in the Tic-Tac-Toe board. Expected keys are num_x2s, num_x1s, num_o2s, and num_o1s.
* **diagonal** (bool = False): A boolean flag indicating whether the input array represents a diagonal. The weight for counting is doubled if this is True. Default is False.

**returns** None: The function returns None as it updates the counts dictionary in-place.

In [6]:
def evaluate_row_or_col(array: np.ndarray, counts: dict, diagonal: bool = False) -> None:
    num_0s = np.sum(array == 0)
    num_1s = np.sum(array == 1)
    num_2s = np.sum(array == 2)

    weight = 2 if diagonal else 1

    if num_1s == 2 and num_0s == 1:
        counts["num_x2s"] += weight
    elif num_1s == 1 and num_0s == 2:
        counts["num_x1s"] += weight

    if num_2s == 2 and num_0s == 1:
        counts["num_o2s"] += weight
    elif num_2s == 1 and num_0s == 2:
        counts["num_o1s"] += weight

### unit test: evaluate_row_or_col

In [7]:
# Initialize counts dictionary and array
counts = {"num_x2s": 0, "num_x1s": 0, "num_o2s": 0, "num_o1s": 0}
array1 = np.array([1, 1, 0])
array2 = np.array([0, 2, 0])
array3 = np.array([2, 2, 0])

# Test the function
evaluate_row_or_col(array1, counts)
evaluate_row_or_col(array2, counts)
evaluate_row_or_col(array3, counts, diagonal=True)

# Check the results
assert counts["num_x2s"] == 1, f'Expected 1, got {counts["num_x2s"]}'
assert counts["num_o1s"] == 1, f'Expected 1, got {counts["num_o1s"]}'
assert counts["num_o2s"] == 2, f'Expected 2, got {counts["num_o2s"]}'
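
A point that is easy to miss in evaluate_row_or_col is that a line only counts when it is still open for one player: lines containing both signs, completed lines, and empty lines fall through every branch. The sketch below uses a hypothetical helper, `describe_line`, that returns the name of the counter a line would update. It mirrors the conditions above but is not part of the notebook.

```python
from collections import Counter
from typing import Sequence

def describe_line(line: Sequence[int]) -> str:
    """Hypothetical helper: name the counter a 3-cell line would update.

    Uses the same encoding as evaluate_row_or_col: 0 empty, 1 'X', 2 'O'.
    """
    cells = Counter(line)
    empties, xs, os_ = cells[0], cells[1], cells[2]
    if xs == 2 and empties == 1:
        return "num_x2s"
    if xs == 1 and empties == 2:
        return "num_x1s"
    if os_ == 2 and empties == 1:
        return "num_o2s"
    if os_ == 1 and empties == 2:
        return "num_o1s"
    return "ignored"

for line in ([1, 1, 0], [0, 2, 0], [1, 2, 0], [1, 1, 2], [1, 1, 1], [0, 0, 0]):
    print(line, "->", describe_line(line))
# [1, 1, 0] -> num_x2s
# [0, 2, 0] -> num_o1s
# [1, 2, 0] -> ignored
# [1, 1, 2] -> ignored
# [1, 1, 1] -> ignored
# [0, 0, 0] -> ignored
```

A completed line such as [1, 1, 1] is ignored here because wins are detected separately by check_winner.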

### evaluate_board_state

This function evaluates the state of a Tic-Tac-Toe board by checking for patterns in rows, columns, and diagonals. It converts the board to a numerical representation (1 for 'X', 2 for 'O', 0 for anything else) and returns a score in which each open two-in-a-row counts 3 points and each open one-in-a-row counts 1 point, with diagonals counted double; 'X' patterns add to the score and 'O' patterns subtract from it.

* **board** (np.ndarray): A 2D NumPy array representing the state of the Tic-Tac-Toe board. Each cell contains 'X', 'O', or a single space ' ' for an empty cell.

**returns** (int): A score that signifies the board state. A higher positive score implies that player 'X' is in a better position, while a negative score implies that player 'O' is in a better position.

In [8]:
def evaluate_board_state(board: np.ndarray) -> int:
    counts = {"num_x2s": 0, "num_x1s": 0, "num_o2s": 0, "num_o1s": 0}
    num_board = np.where(board == 'X', 1, np.where(board == 'O', 2, 0))
    
    diagonals = [np.diag(num_board), np.diag(np.fliplr(num_board))]
    for diag in diagonals:
        evaluate_row_or_col(diag, counts, diagonal=True)
    
    for row in num_board:
        evaluate_row_or_col(row, counts)
        
    for col in num_board.T:
        evaluate_row_or_col(col, counts)

    return 3 * counts["num_x2s"] + counts["num_x1s"] - (3 * counts["num_o2s"] + counts["num_o1s"])

### unit test: evaluate_board_state

In [9]:
# Test cases
board1 = np.array([['X', 'O', 'X'],
                   ['O', 'X', 'O'],
                   ['X', 'O', 'X']])

board2 = np.array([['X', 'O', 'X'],
                   ['O', 'X', 'O'],
                   ['O', 'X', 'O']])

board3 = np.array([['X', ' ', ' '],
                   ['O', 'X', 'O'],
                   ['O', 'O', 'X']])

assert evaluate_board_state(board1) == 0 
assert evaluate_board_state(board2) == 0
assert evaluate_board_state(board3) == 1
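
The score can also be worked out line by line. The sketch below re-derives it for a two-move opening using plain Python lists instead of NumPy; `line_value`, `score_breakdown`, and `demo_board` are hypothetical names introduced for this illustration, and the arithmetic is intended to match the return expression of evaluate_board_state.

```python
from typing import Dict, List

Board = List[List[str]]

def line_value(line: List[str], weight: int) -> int:
    """Signed contribution of one line: 3 per open pair, 1 per open single."""
    xs, os_, empties = line.count('X'), line.count('O'), line.count(' ')
    if xs == 2 and empties == 1:
        return 3 * weight
    if xs == 1 and empties == 2:
        return weight
    if os_ == 2 and empties == 1:
        return -3 * weight
    if os_ == 1 and empties == 2:
        return -weight
    return 0

def score_breakdown(board: Board) -> Dict[str, int]:
    """Map each of the eight lines to its contribution (diagonals weigh double)."""
    lines = {f"row {i}": (list(board[i]), 1) for i in range(3)}
    lines.update({f"col {j}": ([board[i][j] for i in range(3)], 1) for j in range(3)})
    lines["main diagonal"] = ([board[i][i] for i in range(3)], 2)
    lines["anti-diagonal"] = ([board[i][2 - i] for i in range(3)], 2)
    return {name: line_value(cells, w) for name, (cells, w) in lines.items()}

demo_board = [['X', ' ', ' '],
              [' ', 'O', ' '],
              [' ', ' ', ' ']]

breakdown = score_breakdown(demo_board)
for name, value in breakdown.items():
    if value:
        print(f"{name}: {value:+d}")
print("total:", sum(breakdown.values()))
# row 0: +1
# row 1: -1
# col 0: +1
# col 1: -1
# anti-diagonal: -2
# total: -2
```

The corner 'X' and the centre 'O' each open one row and one column, but the centre also opens the anti-diagonal, which counts double, so the position scores -2 in favor of 'O'. The main diagonal holds both signs and contributes nothing.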

### minimax_with_heuristic

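The minimax_with_heuristic function implements minimax with alpha-beta pruning for Tic-Tac-Toe and blends in the static heuristic from evaluate_board_state. It searches the game tree from the given position to the end of the game and returns a score for that position. Note that this is a variant of the textbook algorithm: at every non-terminal node the running best value starts at the node's heuristic score, and the heuristic is added to (when maximizing) or subtracted from (when minimizing) each child's value before the comparison, while alpha and beta are updated with the child's raw value.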

* **board** (ndarray): The game board represented as a NumPy array.
* **depth** (int): The current depth in the game tree.
* **alpha** (float): The best (highest) score the maximizing player can already guarantee on the path from the root to this node.
* **beta** (float): The best (lowest) score the minimizing player can already guarantee on the path from the root to this node.
* **maximizing** (bool): A boolean variable indicating whether the current player is maximizing or not.

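**returns** (int): an integer score for the board. Terminal positions score 10 - depth for an 'X' win, -10 + depth for an 'O' win, and 0 for a draw, so faster wins score higher. Non-terminal positions return the heuristic-adjusted value, where a positive score favors 'X' and a negative score favors 'O'.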

In [10]:
def minimax_with_heuristic(board: np.ndarray, depth: int, alpha: float, beta: float, maximizing: bool) -> int:
    if check_winner(board, 'X'):
        return 10 - depth
    if check_winner(board, 'O'):
        return -10 + depth
    if np.all(board != ' '):
        return 0

    heuristic_value = evaluate_board_state(board)
    
    if maximizing:
        max_val = heuristic_value
        for i in range(3):
            for j in range(3):
                if board[i, j] == ' ':
                    board[i, j] = 'X'
                    value = minimax_with_heuristic(board, depth + 1, alpha, beta, False)
                    board[i, j] = ' '
                    max_val = max(max_val, value + heuristic_value)
                    alpha = max(alpha, value)
                    if beta <= alpha:
                        return max_val
        return max_val
    else:
        min_val = heuristic_value
        for i in range(3):
            for j in range(3):
                if board[i, j] == ' ':
                    board[i, j] = 'O'
                    value = minimax_with_heuristic(board, depth + 1, alpha, beta, True)
                    board[i, j] = ' '
                    min_val = min(min_val, value - heuristic_value)
                    beta = min(beta, value)
                    if beta <= alpha:
                        return min_val
        return min_val
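
For comparison with the heuristic variant above, textbook alpha-beta (without the heuristic terms) can be sketched on a small hand-built tree. The nested-list tree, the function name `alphabeta`, and the `visited` list are illustrative assumptions, not part of the notebook's game code. The `visited` list records which leaves are actually evaluated, so the effect of the `beta <= alpha` cutoff is visible.

```python
import math
from typing import List, Union

Tree = Union[int, List["Tree"]]

def alphabeta(node: Tree, alpha: float, beta: float,
              maximizing: bool, visited: List[int]) -> float:
    """Textbook alpha-beta; appends every leaf it evaluates to `visited`."""
    if isinstance(node, int):
        visited.append(node)
        return node
    best = -math.inf if maximizing else math.inf
    for child in node:
        value = alphabeta(child, alpha, beta, not maximizing, visited)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if beta <= alpha:  # the other player already has a better option elsewhere
            break
    return best

toy_tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited: List[int] = []
print(alphabeta(toy_tree, -math.inf, math.inf, True, visited))  # 3
print(visited)  # [3, 12, 8, 2, 14, 5, 2]
```

After the first branch guarantees the maximizer a 3, the second branch is abandoned as soon as its first leaf (2) shows the minimizer can hold it below 3. The leaves 4 and 6 are never evaluated, yet the result is the same as a full search.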

### best_move_with_heuristic

The best_move_with_heuristic function finds a move for the AI player ('X') in a Tic-Tac-Toe game. It first checks whether the human ('O') could win with their next move and, if so, returns that cell to block it. Otherwise it scores every empty cell with minimax_with_heuristic and evaluate_board_state and returns the cell with the highest minimax value, using the heuristic value to break ties.

* **board** (np.ndarray): A 2D NumPy array representing the current state of the Tic-Tac-Toe board. The array should be of shape (3, 3) and contain either ' ', 'X', or 'O' to represent empty, AI ('X'), and human ('O') moves, respectively.

**returns** (Union[None, Tuple[int, int]]): a tuple (i, j) indicating the best move for the AI in the form of row and column indices. If there are no valid moves left (i.e., the board is full), it returns None.

In [11]:
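def best_move_with_heuristic(board: np.ndarray) -> Optional[Tuple[int, int]]:
    moves = []

    # First, block any cell where the human ('O') could win on the next move
    for i in range(3):
        for j in range(3):
            if board[i, j] == ' ':
                board[i, j] = 'O'
                human_wins = check_winner(board, 'O')
                board[i, j] = ' '  # always undo the trial move before returning
                if human_wins:
                    return (i, j)

    # Otherwise, score every empty cell with the search and the static heuristic
    for i in range(3):
        for j in range(3):
            if board[i, j] == ' ':
                board[i, j] = 'X'
                minimax_value = minimax_with_heuristic(board, 0, -np.inf, np.inf, False)
                heuristic_value = evaluate_board_state(board)
                board[i, j] = ' '
                moves.append(((i, j), minimax_value, heuristic_value))

    # Highest minimax value first; the heuristic value breaks ties
    moves.sort(key=lambda x: (x[1], x[2]), reverse=True)

    for move, _, _ in moves:
        if board[move] == ' ':
            return move

    return None  # no empty cells left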

### unit test: best_move_with_heuristic

In [12]:
test_board = np.array([
    ['X', 'O', ' '],
    [' ', 'X', 'O'],
    ['O', ' ', 'X']
])

move = best_move_with_heuristic(test_board)
assert move == (0, 2), "Test failed!"

In [13]:
test_board = np.array([
    ['X', 'O', ' '],
    [' ', 'X', 'O'],
    [' ', ' ', ' ']
])

move = best_move_with_heuristic(test_board)
assert move == (2, 2), "Test failed!"
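
The ranking step in best_move_with_heuristic relies on how Python compares tuples. The sketch below sorts a few made-up (move, minimax_value, heuristic_value) triples the same way; the numbers are hypothetical and do not come from a real game.

```python
# Hypothetical (move, minimax_value, heuristic_value) triples
candidates = [
    ((0, 0), 4, 1),
    ((1, 1), 7, 0),
    ((2, 2), 7, 3),
    ((0, 2), 7, 3),
]

# Tuples compare element by element, so the heuristic only matters
# when two moves share the same minimax value.
ranked = sorted(candidates, key=lambda m: (m[1], m[2]), reverse=True)
print([cell for cell, _, _ in ranked])
# [(2, 2), (0, 2), (1, 1), (0, 0)]
```

Python's sort is stable even with reverse=True, so moves that tie on both values, such as (2, 2) and (0, 2) here, keep the order in which they were generated.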

### main game loop

The following cell runs a simple turn-based game in which a human player ('O') competes against the AI opponent ('X'). The game loops until a winner is identified or the game ends in a draw. The board is a 2D array, and rows and columns are indexed from zero.

To play, run the cell and enter your move as row and column indices separated by a space (for example, type 0 1 to place an 'O' in the first row and second column). The game then displays the updated board and the AI's move.

In [14]:
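# Initialize the board
board = np.full((3, 3), ' ')

while True:
    # Human's turn
    row, col = map(int, input("Your move (row col): ").split())
    if board[row, col] != ' ':
        print("That cell is already taken, try again.")
        continue
    board[row, col] = 'O'
    print("Your move:")
    print_board(board)
    
    if check_winner(board, 'O'):
        print("You win!")
        break

    # The human moves first, so the ninth and last move is always the human's.
    # Check for a draw here, before asking the AI for a move on a full board.
    if np.all(board != ' '):
        print("It's a draw!")
        break

    # AI's turn
    move = best_move_with_heuristic(board)
    board[move] = 'X'
    print("AI's move:")
    print_board(board)
    
    if check_winner(board, 'X'):
        print("AI wins!")
        break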

Your move:
  |   |  
  | O |  
  |   |  
AI's move:
X |   |  
  | O |  
  |   |  
Your move:
X |   | O
  | O |  
  |   |  
AI's move:
X |   | O
  | O |  
X |   |  
Your move:
X |   | O
  | O | O
X |   |  
AI's move:
X |   | O
X | O | O
X |   |  
AI wins!
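
The game loop reads the move with a bare map(int, input(...).split()), which raises an exception on malformed input such as a single number or non-numeric text. A more defensive parser might look like the sketch below. `parse_move` and `demo_board` are hypothetical names, and the function is written against plain nested lists, although the same indexing also works on the NumPy board.

```python
from typing import List, Optional, Tuple

def parse_move(text: str, board: List[List[str]]) -> Optional[Tuple[int, int]]:
    """Return (row, col) if `text` names an empty cell on a 3x3 board, else None."""
    parts = text.split()
    if len(parts) != 2:
        return None
    try:
        row, col = int(parts[0]), int(parts[1])
    except ValueError:
        return None
    if not (0 <= row < 3 and 0 <= col < 3):
        return None
    if board[row][col] != ' ':
        return None
    return row, col

demo_board = [['X', ' ', ' '],
              [' ', 'O', ' '],
              [' ', ' ', ' ']]

print(parse_move("0 1", demo_board))       # (0, 1)
print(parse_move("1 1", demo_board))       # None: the cell is occupied
print(parse_move("3 0", demo_board))       # None: row out of range
print(parse_move("top left", demo_board))  # None: not numbers
```

In the loop, a None result could simply trigger another prompt instead of ending the game with a traceback.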
