# Tictactoe's Minimax AI - Explanation

## Background

Similar to Nim, Tictactoe can be classified as a `combinatorial adversarial game` with a `finite` sequence of moves.

Unlike a human, an AI has no intuition for strategies such as forks. To make the AI unbeatable, it can instead use adversarial search: consider every possible opponent move and pick the best counter-move to each. Minimax is a standard algorithm for this, and because the Tictactoe game tree is small enough to search completely, it is sufficient for an AI that never loses.

## Minimax

In Tictactoe, if I am the X player, I will obtain a score of 1 if I win, and thus be called the maximizing player because my goal is to maximize the score. My opponent will obtain a score of -1 if they win, and will be called the minimizing player because their objective is to minimize the score.
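
The scoring rule above can be sketched as a `utility` function. This is only a minimal sketch: it assumes the board is a 3x3 list of lists holding `"X"`, `"O"`, or `None`, the `winner` helper is hypothetical, and the score of 0 for a draw is an assumption.

```python
def winner(board):
    """Return "X" or "O" if that player has three in a row, else None (hypothetical helper)."""
    lines = []
    lines.extend(board)                                                  # rows
    lines.extend([[board[r][c] for r in range(3)] for c in range(3)])    # columns
    lines.append([board[i][i] for i in range(3)])                        # main diagonal
    lines.append([board[i][2 - i] for i in range(3)])                    # anti-diagonal
    for line in lines:
        if line[0] is not None and line[0] == line[1] == line[2]:
            return line[0]
    return None


def utility(board):
    """Score a finished board from X's point of view: 1 if X won, -1 if O won, 0 otherwise."""
    w = winner(board)
    if w == "X":
        return 1
    if w == "O":
        return -1
    return 0  # assumed score for a draw


x_wins = [["X", "X", "X"],
          ["O", "O", None],
          [None, None, None]]
o_wins = [["O", "X", "X"],
          [None, "O", "X"],
          [None, None, "O"]]
draw = [["X", "O", "X"],
        ["X", "O", "O"],
        ["O", "X", "X"]]
print(utility(x_wins), utility(o_wins), utility(draw))  # 1 -1 0
```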

In order to play optimally, the maximizing player (myself) has to take O's objective, minimizing the score, into account and pick the move that maximizes the minimum score.

Suppose this is the game board:

![endgame](images/endgame.png)

(Image courtesy of https://www.neverstopbuilding.com/blog/minimax).

If I pick the optimal move, I gain a score of 1 and win the game.

Now look at Tictactoe from my opponent's perspective. Suppose my opponent plays optimally: they will pick the moves that minimize my score, based on this decision tree:

![o's_move_tree](images/o's_move_tree.png)

Generalizing this reasoning gives the algorithm:

The `max_value` function maximizes the minimum score by calling `min_value`, which in turn calls `max_value`, and so on until the endgame is reached (either player wins or the game ends in a draw). Both functions are used by the `minimax` function.

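```python
import numpy as np

# terminal, utility, actions, result and player are helper functions
# assumed to be defined elsewhere in the project.


def max_value(board):
    """Return (value, action) for the maximizing player (X)."""
    if terminal(board):
        return utility(board), None

    value = -np.inf
    best_action = None
    for action in actions(board):
        v, _ = min_value(result(board, action))
        if v > value:
            # Remember the action that produced the best value so far.
            value = v
            best_action = action

            # A score of 1 cannot be beaten, so stop searching.
            if v == 1:
                break

    return value, best_action


def min_value(board):
    """Return (value, action) for the minimizing player (O)."""
    if terminal(board):
        return utility(board), None

    value = np.inf
    best_action = None
    for action in actions(board):
        v, _ = max_value(result(board, action))
        if v < value:
            # Remember the action that produced the best value so far.
            value = v
            best_action = action

            # A score of -1 cannot be beaten, so stop searching.
            if v == -1:
                break

    return value, best_action


def minimax(board):
    """
    Returns the optimal action for the current player on the board.
    """
    if terminal(board):
        return None

    if player(board) == "X":
        _, action = max_value(board)
    else:
        _, action = min_value(board)
    return action
```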
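
The listing above relies on helpers such as `player`, `actions`, and `result` that are not shown in this write-up. A minimal sketch of what they might look like follows, assuming a 3x3 list-of-lists board with `None` for empty cells and X always moving first. The bodies are illustrative assumptions, not the project's actual code.

```python
import copy


def player(board):
    """Return whose turn it is, assuming X always moves first."""
    xs = sum(row.count("X") for row in board)
    os_ = sum(row.count("O") for row in board)
    return "X" if xs == os_ else "O"


def actions(board):
    """Return every empty cell as a (row, column) pair."""
    return [(r, c) for r in range(3) for c in range(3) if board[r][c] is None]


def result(board, action):
    """Return a new board with the current player's mark placed at `action`."""
    r, c = action
    if board[r][c] is not None:
        raise ValueError("cell already taken")
    new_board = copy.deepcopy(board)  # never mutate the caller's board during search
    new_board[r][c] = player(board)
    return new_board


board = [["X", "O", None],
         [None, "X", None],
         [None, None, None]]
print(player(board))                 # O
print(actions(board)[:2])            # [(0, 2), (1, 0)]
print(result(board, (2, 2))[2][2])   # O
```

Returning a copy from `result` matters because the search explores many hypothetical continuations from the same position, and each one must start from an unmodified board.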

The neverstopbuilding article points out an interesting quirk of this algorithm. The author passed in a board where X would always win:

![fatalism](images/fatalism.png)

He expected that the algorithm would attempt to put up a fight and block the X player's immediate win. However, upon closer analysis, what he found was:

![x_always_wins](images/x_always_wins.png)

The algorithm had searched through the universe of moves and realized that every move in this universe resulted in a win for X. Even a world champion would lose in this case! Because every move receives the same score, the algorithm has no reason to prefer a blocking move over any other, so it behaves fatalistically and surrenders quickly.
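
A common remedy, which is not part of the code in this write-up, is to fold the search depth into the score so that quick wins are worth more and losses are postponed for as long as possible. The sketch below assumes a 3x3 list-of-lists board with `None` for empty cells; the names `score` and `best_move`, the constant 10, and the example position are all hypothetical.

```python
import copy

# Every row, column and diagonal as a list of (row, column) coordinates.
LINES = (
    [[(r, c) for c in range(3)] for r in range(3)]
    + [[(r, c) for r in range(3)] for c in range(3)]
    + [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]
)


def winner(board):
    for line in LINES:
        a, b, c = (board[r][col] for r, col in line)
        if a is not None and a == b == c:
            return a
    return None


def actions(board):
    return [(r, c) for r in range(3) for c in range(3) if board[r][c] is None]


def player(board):
    # X moves first, so X is to move whenever an odd number of cells is empty.
    return "X" if len(actions(board)) % 2 == 1 else "O"


def result(board, action):
    new_board = copy.deepcopy(board)
    new_board[action[0]][action[1]] = player(board)
    return new_board


def score(board, depth):
    """Depth-adjusted value (assumption): earlier X wins score higher, earlier O wins lower."""
    w = winner(board)
    if w == "X":
        return 10 - depth
    if w == "O":
        return depth - 10
    return 0


def best_move(board, depth=0):
    """Return (value, action) for the player to move."""
    if winner(board) is not None or not actions(board):
        return score(board, depth), None
    maximizing = player(board) == "X"
    best_value, best_action = None, None
    for action in actions(board):
        value, _ = best_move(result(board, action), depth + 1)
        if best_value is None or (value > best_value if maximizing else value < best_value):
            best_value, best_action = value, action
    return best_value, best_action


# Hypothetical position: O to move, and X wins against any defence.
board = [[None, "X", None],
         [None, None, "X"],
         ["O", "O", "X"]]
print(best_move(board))  # (6, (0, 2))
```

With plain scores of 1 and -1, all four of O's moves in this position are equally bad. With depth-adjusted scores, blocking at `(0, 2)` delays X's win from the second move to the fourth, so O chooses to put up a fight.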

## Conclusion & Further Steps

The algorithm can be further optimized by alpha-beta pruning, which will be part of the next project.
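
As a rough preview of that optimization, and not the implementation planned for the next project, alpha-beta pruning can be sketched as follows. The function name `alphabeta`, the `x_to_move` flag, and the in-place move and undo are illustrative assumptions; the board is assumed to be a 3x3 list of lists with `None` for empty cells.

```python
import math

# Every row, column and diagonal as a list of (row, column) coordinates.
LINES = (
    [[(r, c) for c in range(3)] for r in range(3)]
    + [[(r, c) for r in range(3)] for c in range(3)]
    + [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]
)


def winner(board):
    for line in LINES:
        a, b, c = (board[r][col] for r, col in line)
        if a is not None and a == b == c:
            return a
    return None


def alphabeta(board, x_to_move, alpha=-math.inf, beta=math.inf):
    """Return the minimax value of `board`: 1 if X wins, -1 if O wins, 0 for a draw.

    alpha is the best value X can already guarantee, beta the best value O can
    already guarantee. Once alpha >= beta, the remaining moves cannot change
    the outcome and are skipped.
    """
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    empty = [(r, c) for r in range(3) for c in range(3) if board[r][c] is None]
    if not empty:
        return 0

    best = -math.inf if x_to_move else math.inf
    for r, c in empty:
        board[r][c] = "X" if x_to_move else "O"   # try the move in place
        value = alphabeta(board, not x_to_move, alpha, beta)
        board[r][c] = None                         # undo it
        if x_to_move:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:
            break  # the other player already has a better option elsewhere
    return best


empty_board = [[None] * 3 for _ in range(3)]
print(alphabeta(empty_board, x_to_move=True))  # 0: perfect play from both sides is a draw
```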