<p style="font-family: Arial; font-size:3.75em;color:purple; font-weight:bold"><br>
Tic-Tac-Toe
</p><br>
<strong>3x3 version of the <a href='https://en.wikipedia.org/wiki/Hex_(board_game)'>Hex game</a> solved using the minimax method</strong>

Minimax applies:
- to games for 2 players
- when one player's win is the other player's loss (zero-sum games)
- when there is complete information about the possible outcomes
- when the goal of each player is to minimize their own loss, assuming that the other player is trying to maximize their own gain

<p>Minimax can either be run as a brute-force search that calculates every possible outcome, or be sped up with alpha-beta pruning. With n leaf positions in the brute-force tree, alpha-beta pruning examines on the order of sqrt(n) of them in the best case (ideal move ordering) and roughly n^0.75 with random move ordering.</p>
<p>We calculate the likelihood of winning for the first player only.</p>
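
To get a feel for these orders of magnitude, the short sketch below plugs in n = 9!, an upper bound on the number of move sequences from an empty 3x3 board (it ignores games that end early). The variable names are illustrative, and the figures are rough estimates for uniform trees, not exact node counts for Tic-Tac-Toe.

```python
import math

# Upper bound on the number of move sequences from an empty 3x3 board
# (assumption: every game fills all 9 cells, so early wins are ignored).
n = math.factorial(9)

best_case_leaves = math.sqrt(n)    # alpha-beta with ideal move ordering
random_order_leaves = n ** 0.75    # alpha-beta with random move ordering

print(f"brute force: {n}")
print(f"alpha-beta, ideal ordering: about {best_case_leaves:.0f}")
print(f"alpha-beta, random ordering: about {random_order_leaves:.0f}")
```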

[@vbipin](https://github.com/vbipin)

A minimax implementation needs the following pieces:
* a function that generates the children of the current state (<strong>child_node</strong>); for more complex games with some heuristics, split it into action(state) and move(state,action)
* a test for whether the current state is terminal, either because the maximum depth is reached or because of a win/loss (<strong>calc_value()</strong>)
* a utility function (<strong>calc_score()</strong>) that interprets the terminal state in terms of value for the player (typically 1, -1 or 0). To assign different values to paths based on how fast they reach a terminal state, associate with each node a weight corresponding to the distance to the terminal state (a win becomes +1+depth_number, a loss -1-depth_number)
* the minimax itself, a recursive function that calculates the utility of all the children of a given state, for each state. Since the players alternate, the utility of a given state flips sign at each level when seen from the point of view of the original player.
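
The pieces listed above can be sketched on a toy game tree before moving to the real board. This is a minimal illustration, not the notebook's implementation: `generic_minimax`, `toy_children` and `toy_utility` are hypothetical names, and the tree is a nested list whose integer leaves are payoffs for the maximizing player.

```python
def generic_minimax(state, maximizing, children, utility):
    """Plain minimax: `children(state)` lists the successor states
    (empty when terminal) and `utility(state)` scores a terminal state
    from the maximizing player's point of view."""
    successors = children(state)
    if not successors:
        return utility(state)
    values = [generic_minimax(s, not maximizing, children, utility)
              for s in successors]
    return max(values) if maximizing else min(values)


# Toy tree: inner nodes are lists, leaves are integer payoffs.
toy_tree = [[3, 5], [2, 9]]

def toy_children(state):
    return state if isinstance(state, list) else []

def toy_utility(state):
    return state

# The maximizer picks the branch whose worst case is best: max(min(3, 5), min(2, 9))
result = generic_minimax(toy_tree, True, toy_children, toy_utility)
print(result)
```

The alternation between `max` and `min` is what the last bullet describes as the utility being reversed at each level.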

### Calculate the win/loss function
<a href='http://ohboyigettodomath.blogspot.com/2015/05/tic-tac-toe-as-magic-square.html'>Magic square trick</a>
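
The trick relies on numbering the cells with a 3x3 magic square, in which every row, column and diagonal sums to 15, so a player holds a complete line exactly when the numbers on one line of their cells add up to 15. A quick check of that property for the grid used in this notebook (the names `magic` and `line_sums` are illustrative):

```python
import numpy as np

magic = np.array([[2, 7, 6], [9, 5, 1], [4, 3, 8]])

# Collect the sums of the 3 columns, 3 rows and 2 diagonals.
line_sums = (
    [int(s) for s in magic.sum(axis=0)]
    + [int(s) for s in magic.sum(axis=1)]
    + [int(magic.trace()), int(np.fliplr(magic).trace())]
)
print(line_sums)
```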

In [1]:
import numpy as np
#3x3 magic square: every row, column and diagonal sums to 15
numbered_grid=np.array([[2,7,6],[9,5,1],[4,3,8]])
def calc_value(node,player):
    #keep the magic-square numbers only on the cells owned by 'player'
    owned=np.multiply(numbered_grid,node==player)
    score_cols=owned.sum(axis=0).max()
    score_rows=owned.sum(axis=1).max()
    score_diagonal1=owned.diagonal().sum()
    score_diagonal2=owned[:,::-1].diagonal().sum()
    #a value of 15 means that 'player' holds a complete line
    return int(max(score_cols,score_rows,score_diagonal1,score_diagonal2))

calc_value(np.array([[1,1,-1], [1,-1,0], [0,0,0]]),1)

11

In [2]:
def calc_score(node,player):
    if calc_value(node,player)==15:
        return 1
    elif calc_value(node,-player)==15:
        return -1
    return 0
calc_score(np.array([[1,1,-1], [1,-1,0], [0,0,0]]),-1)

0

### Create a child node

In [45]:
#returns the list of all the possible grids after the next move of 'player'
def child_node(grid,player):
    children=[]
    #replace each remaining 0 by 'player' one by one, each time on a fresh copy of the grid
    for row,col in np.argwhere(grid==0):
        child_grid=grid.copy()
        child_grid[row,col]=player
        children.append(child_grid)
    return children
        
child_node(np.array([[1,1,0],[1,-1,0],[-1,0,0]]),-1)

[array([[ 1,  1, -1],
        [ 1, -1,  0],
        [-1,  0,  0]]), array([[ 1,  1,  0],
        [ 1, -1, -1],
        [-1,  0,  0]]), array([[ 1,  1,  0],
        [ 1, -1,  0],
        [-1, -1,  0]]), array([[ 1,  1,  0],
        [ 1, -1,  0],
        [-1,  0, -1]])]

In [46]:
a_wins=np.array([[1,-1,0],[1,1,-1],[1,1,-1]])
print(calc_score(a_wins,1),calc_score(a_wins,-1))
draw=np.array([[1,-1,0],[1,1,-1],[0,1,-1]])
print(calc_score(draw,1),calc_score(draw,-1))
b_wins=np.array([[1,-1,-1],[1,1,-1],[0,1,-1]])
print(calc_score(b_wins,1),calc_score(b_wins,-1))

1 -1
0 0
-1 1


### Minimax

In [77]:
#the game being symmetrical around zero, we can apply the negamax simplification
def minimax(node,player,tree):
    #Calculate the depth: equal to the remaining number of empty spaces in the grid
    depth=np.sum(np.isin(node,0))
    #evaluate if terminal state (only possible after the first player has played at least 3 times) or end of tree
    terminal=depth<6 and calc_score(node,player)!=0
    if depth==0 or terminal:
        value=calc_score(node,player)+depth
        tree[value]=node
        return value
    else:
        if depth%2!=0:
            for child in child_node(node,player):
                value=max(depth,minimax(child,-player,tree))
                tree[value]=child
        elif depth%2==0:
            for child in child_node(node,-player):
                value=min(depth,minimax(child,player,tree))
                tree[value]=child
        return value
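
For comparison, here is a minimal sketch of the textbook negamax formulation, in which every call returns the value for the side about to move and keeps the maximum over all children. It is self-contained and independent of the cells above: `MAGIC`, `has_won` and `negamax` are hypothetical names, and scoring a win as 1 plus the number of empty cells is one assumed way of rewarding faster wins.

```python
import numpy as np

MAGIC = np.array([[2, 7, 6], [9, 5, 1], [4, 3, 8]])

def has_won(board, player):
    """True if `player` owns a full row, column or diagonal (line sum of 15)."""
    owned = MAGIC * (board == player)
    sums = [*owned.sum(axis=0), *owned.sum(axis=1),
            owned.trace(), np.fliplr(owned).trace()]
    return 15 in sums

def negamax(board, player):
    """Return (value, move) for `player`, who is about to move.
    value > 0 is a forced win, 0 a draw, value < 0 a forced loss."""
    empties = int(np.sum(board == 0))
    if has_won(board, -player):      # the previous mover completed a line
        return -(1 + empties), None
    if empties == 0:                 # full board without a winner
        return 0, None
    best_value, best_move = -np.inf, None
    for r, c in np.argwhere(board == 0):
        board[r, c] = player         # play the move
        value = -negamax(board, -player)[0]
        board[r, c] = 0              # undo it
        if value > best_value:
            best_value, best_move = value, (int(r), int(c))
    return best_value, best_move

# Small position (5 empty cells) so that the search stays fast.
position = np.array([[0, 0, 0], [0, 0, -1], [1, -1, 1]])
value, move = negamax(position, 1)
print(value, move)
```

The negation at each recursive call replaces the separate max and min branches, which is valid because the game is zero-sum. Searching from the empty board works the same way but visits several hundred thousand positions without pruning or caching.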

In [78]:
def main(state=np.full(((3,3)),0),player=1):
    tree={}
    chance=minimax(state,player,tree)
    depth=np.sum(np.isin(state,0))
    print(state)
    print('Likelihood to win: ',(chance-1)/depth)
    if chance>1:
        print('Strategy to win:')
        print(tree[chance])

In [79]:
last_but_one=np.array([[1,-1,-1],[0,1,0],[1,1,-1]])
main(last_but_one,1)

[[ 1 -1 -1]
 [ 0  1  0]
 [ 1  1 -1]]
Likelihood to win:  -0.5


In [80]:
last_but_two=np.array([[1,-1,0],[1,1,-1],[0,1,-1]])
main(last_but_two,-1)

[[ 1 -1  0]
 [ 1  1 -1]
 [ 0  1 -1]]
Likelihood to win:  -0.5


In [81]:
third=np.array([[1,-1,1],[0,0,0],[0,0,0]])
main(third,1)

[[ 1 -1  1]
 [ 0  0  0]
 [ 0  0  0]]
Likelihood to win:  0.6666666666666666
Strategy to win:
[[ 1 -1  1]
 [ 0  0  0]
 [ 0  0 -1]]


In [82]:
origin=np.full(((3,3)),0)
main(origin,1)

[[0 0 0]
 [0 0 0]
 [0 0 0]]
Likelihood to win:  0.8888888888888888
Strategy to win:
[[0 0 0]
 [0 0 0]
 [0 0 1]]


In [90]:
second=np.array([[0,0,0], [0,0,-1], [1,-1,1]])
main(second,1)

[[ 0  0  0]
 [ 0  0 -1]
 [ 1 -1  1]]
Likelihood to win:  0.8
Strategy to win:
[[ 0  0  0]
 [ 0  1 -1]
 [ 1 -1  1]]
