In [None]:
from IPython.core.display import HTML
with open('../style.css') as f:
    css = f.read()
HTML(css)

# The Minimax Algorithm with Memoization

This notebook implements the [minimax algorithm](https://en.wikipedia.org/wiki/Minimax) with [memoization](https://en.wikipedia.org/wiki/Memoization) 
and thereby implements a program that can play various *deterministic*, *zero-sum*, *turn-taking*, *two-person* games with *perfect information*.  The implementation assumes that an external notebook defines the game and that this external notebook provides the following variables and functions:
* `gPlayers` is a list of length two.  The elements of this list are the 
  names of the players.  It is assumed that the first element in this list represents 
  the computer, while the second element is the human player.  The computer
  always starts the game.
* `gStart` is the *start state* of the game.
* `next_states(State, player)` is a function that takes two arguments:
  - `State` is a state of the game.
  - `player` is the player whose turn it is to make a move.
  The function call `next_states(State, player)` returns the list
  of all states that can be reached by any move of `player`.
* `utility(State)` takes a state as its only argument.
  If `State` is a *terminal state* (i.e. a state where the game is finished), 
  then the function returns the value that `State` has for `gPlayers[0]`.  Otherwise, the function returns `None`.
* `finished(State)` returns `True` if and only if `State` is a terminal state.
* `get_move(State)` displays the given state and asks the human player for
  her move.
* `final_msg(State)` informs the human player about the result of the game.
* `draw(State, canvas, value)` draws the given state on the given canvas and 
  informs the user about the `value` of this state.  The value is always 
  calculated from the perspective of the first player, which is the computer.
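
To make the interface above concrete, here is a minimal sketch of a game definition.  The game is a hypothetical toy example that is not part of these notebooks: two players alternately take one or two stones, and whoever takes the last stone wins.  The state layout `(stones, to_move)` and the helper `other` are assumptions made for this illustration only; the display functions `get_move`, `final_msg`, and `draw` are omitted.

```python
# Hypothetical toy game used only to illustrate the interface.
gPlayers = ['X', 'O']   # 'X' is the computer and moves first
gStart   = (4, 'X')     # 4 stones on the table, 'X' is to move

def other(player):
    "Return the opponent of player (helper assumed for this sketch)."
    return gPlayers[1] if player == gPlayers[0] else gPlayers[0]

def next_states(State, player):
    "Return the list of states reachable by taking 1 or 2 stones."
    stones, _ = State
    return [(stones - k, other(player)) for k in (1, 2) if k <= stones]

def finished(State):
    "The game is over once no stones are left."
    return State[0] == 0

def utility(State):
    "Value of a terminal state for gPlayers[0]; None if not terminal."
    if not finished(State):
        return None
    _, to_move = State
    # The player who is *not* to move took the last stone and has won.
    return 1 if to_move == gPlayers[1] else -1
```

States are tuples rather than lists so that they remain hashable, which matters once values are cached in a dictionary.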
   
---

The function `maxValue(State)` takes one argument:
- `State` is the current state of the game.

The function assumes that it is the first player's turn.  It returns the value that `State` has
if both players play their best game.  This value is an element of the set $\{-1, 0, 1\}$.  
* If the first player can force a win, then `maxValue` returns the value `1`.
* If the first player can at best force a draw, then the return value is `0`.
* If the second player can force a win, then the return value is `-1`.

Mathematically, the function `maxValue` is defined recursively:
- $\;\;\texttt{finished}(s) \rightarrow \texttt{maxValue}(s) = \texttt{utility}(s)$
- $\neg \texttt{finished}(s) \rightarrow 
   \texttt{maxValue}(s) = \max\bigl(\bigl\{ \texttt{minValue}(n) \bigm| n \in \texttt{next_states}(s, \texttt{gPlayers}[0]) \bigr\}\bigr)
  $

In [None]:
def maxValue(State):
    if finished(State):
        return utility(State)
    return max([ minValue(ns) for ns in next_states(State, gPlayers[0]) ])

The function `minValue(State)` takes one argument:
- `State` is the current state of the game.

The function assumes that it is the second player's turn.  It returns the value that `State` has
if both players play their best game.  This value is an element of the set $\{-1, 0, 1\}$.  
* If the first player can force a win, then the return value is `1`.
* If the first player can at best force a draw, then the return value is `0`.
* If the second player can force a win, then the return value is `-1`.

Mathematically, the function `minValue` is defined recursively:
- $\texttt{finished}(s) \rightarrow \texttt{minValue}(s) = \texttt{utility}(s)$
- $\neg \texttt{finished}(s) \rightarrow 
   \texttt{minValue}(s) = \min\bigl(\bigl\{ \texttt{maxValue}(n) \bigm| n \in \texttt{next_states}(s, \texttt{gPlayers}[1]) \bigr\}\bigr)
  $
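
The mutual recursion between the two definitions can be traced on a small, explicit game tree.  The following is only a sketch: the names `max_value_tree` and `min_value_tree` are hypothetical, and the tree is given directly as nested lists whose integer leaves play the role of `utility`, so no game notebook is needed.

```python
def max_value_tree(node):
    "Value of node if the maximizing player is to move."
    if isinstance(node, int):      # a leaf corresponds to a terminal state
        return node
    return max(min_value_tree(child) for child in node)

def min_value_tree(node):
    "Value of node if the minimizing player is to move."
    if isinstance(node, int):
        return node
    return min(max_value_tree(child) for child in node)

# The maximizer chooses one of three moves, then the minimizer replies.
tree = [[1, 0], [-1, 1], [0, 0]]
print(max_value_tree(tree))        # prints 0
```

The minimizer reduces the three subtrees to `0`, `-1`, and `0`, so the best the maximizer can achieve is `0`.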

In [None]:
def minValue(State):  
    if finished(State):
        return utility(State)
    return min([ maxValue(ns) for ns in next_states(State, gPlayers[1]) ])

The function `best_move` takes one argument:
- `State` is the current state of the game.

It is assumed that the first player in the list `gPlayers` is to move. 
The function `best_move` returns a pair of the form $(v, s)$ where $s$ is a state and $v$ is the value of this state.  The state $s$ is a state that is reached from `State` if the first player makes one of her optimal moves.  In order to have some variation in the game, the function chooses randomly among the optimal moves.
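
The random choice among equally good moves can be illustrated in isolation.  This is a sketch under assumptions: the helper `pick_best` and the dictionary `Values`, which maps successor states to their values, are hypothetical and do not appear elsewhere in this notebook.

```python
import random

def pick_best(Values, rng):
    "Return the best value and a randomly chosen state achieving it."
    best       = max(Values.values())
    Candidates = [s for s, v in Values.items() if v == best]
    return best, rng.choice(Candidates)

rng    = random.Random(1)              # a seeded generator makes runs repeatable
Values = {'a': 0, 'b': -1, 'c': 0}     # successors 'a' and 'c' are both optimal
best, state = pick_best(Values, rng)
print(best, state)
```

Only `'a'` or `'c'` can be returned, since `'b'` has a strictly smaller value.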

In [None]:
import random
random.seed(1)

In [None]:
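def best_move(State):
    NS        = next_states(State, gPlayers[0])
    bestVal   = maxValue(State)
    # keep every successor state that preserves the optimal value
    BestMoves = [s for s in NS if minValue(s) == bestVal]
    # choose randomly among the optimal moves to vary the game
    BestState = random.choice(BestMoves)
    return bestVal, BestState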

The module `IPython.display` is imported because the function `IPython.display.clear_output` is needed to clear the output of a cell.

In [None]:
import IPython.display 

The function `play_game` plays a game on the given `canvas`.  The game played is specified indirectly as follows:
- `gStart` is a global variable defining the start state of the game.

   This variable is defined in the notebook that defines the game that is played.
   The same holds for the other functions mentioned below.
- `next_states` is a function such that $\texttt{next_states}(s, p)$ computes the list of all states that can be reached from state $s$ if player $p$ is next to move.
- `finished` is a function such that $\texttt{finished}(s)$ is true for a state $s$ if the game is over in state $s$.
- `utility` is a function such that $\texttt{utility}(s)$ returns either `-1`, `0`, or `1` in the *terminal state* $s$.  We have that
  - $\texttt{utility}(s)= -1$ iff the game is lost for the first player in state $s$, 
  - $\texttt{utility}(s)=  0$ iff the game is drawn, and 
  - $\texttt{utility}(s)=  1$ iff the game is won for the first player in state $s$.

In [None]:
def play_game(canvas):
    State = gStart
    while True: 
        # the computer moves first and reports the value of the game
        val, State = best_move(State)
        draw(State, canvas, f'For me, the game has the value {val}.')
        if finished(State):
            final_msg(State)
            return
        IPython.display.clear_output(wait=True)
        # now it is the human player's turn
        State = get_move(State)
        draw(State, canvas, '')
        if finished(State):
            IPython.display.clear_output(wait=True)
            final_msg(State)
            return

Below, the Jupyter *magic command* `%%capture` silently discards the output that is produced by the notebook `Tic-Tac-Toe.ipynb`.

In [None]:
%%capture
%run Tic-Tac-Toe.ipynb

With the game *tic-tac-toe* represented as lists, computing the value of the start state takes about 4 seconds on my Windows PC (Processor: AMD Ryzen Threadripper PRO 3955WX with 16 Cores, 4.1 GHz).

In [None]:
%%time
val = maxValue(gStart)

The start state has the value `0` as neither player can force a win.

In [None]:
val

Let's draw the board and play a game.

In [None]:
canvas = create_canvas()
draw(gStart, canvas, f'Current value of game for "X": {val}')

Now it's time to play.  In the input window that will pop up later, enter your move in the format "row,col" with no space between row and column.

In [None]:
play_game(canvas)

## Using the BitBoard Implementation of TicTacToe

Next, we check how much the bit-board implementation speeds up the game.

In [None]:
%%capture
%run Tic-Tac-Toe-BitBoard.ipynb

On my computer, the bit-board implementation is about twice as fast as the list-based implementation.  

In [None]:
%%time
val = maxValue(gStart)

In [None]:
canvas = create_canvas()
draw(gStart, canvas, f'Current value of game for "X": {val}')

In [None]:
play_game(canvas)

## Memoization

In [None]:
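gCache = {}   # maps pairs of the form (function, arguments) to results

def memoize(f):
    "Return a version of f that stores its results in gCache."
    
    def f_memoized(*args):
        # reuse the stored result if f has already been called with args
        if (f, args) in gCache:
            return gCache[(f, args)]
        result = f(*args)
        gCache[(f, args)] = result
        return result
    
    return f_memoized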

In [None]:
maxValue = memoize(maxValue)
minValue = memoize(minValue)

* With memoization, the list-based implementation of TicTacToe takes 84 ms.
* With memoization, the bit-board based implementation of TicTacToe takes 38 ms.
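
The reason for this speed-up can be observed on a smaller function.  The sketch below applies the same caching idea to a recursive Fibonacci function and counts how often the original function body is executed.  The names `memoize_demo`, `demo_cache`, `Calls`, and `fib` are hypothetical and chosen so that they do not interfere with `gCache`.

```python
demo_cache = {}   # separate cache so that gCache is left untouched

def memoize_demo(f):
    "Same idea as memoize above, but using demo_cache."
    def f_memoized(*args):
        key = (f, args)
        if key not in demo_cache:
            demo_cache[key] = f(*args)
        return demo_cache[key]
    return f_memoized

Calls = []        # one entry is appended per execution of the original body

def fib(n):
    Calls.append(n)
    return n if n < 2 else fib(n - 1) + fib(n - 2)

# Rebinding the name makes the recursive calls go through the cache, too.
fib = memoize_demo(fib)
print(fib(30), len(Calls))   # prints 832040 31
```

Without the cache the body of `fib` would be executed more than a million times for the argument `30`; with the cache it is executed once per distinct argument.  As the arguments are used as part of a dictionary key, they have to be hashable, which is the case for tuples and integers but not for lists.  Python's standard library provides `functools.lru_cache` for the same purpose.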

In [None]:
%%time
val = maxValue(gStart)

Let us check the size of the cache.

In [None]:
len(gCache)