<p style="font-family: Arial; font-size:3.75em;color:purple; font-weight:bold"><br>
Tic-Tac-Toe
</p><br>
<strong>3x3 version of the <a href='https://en.wikipedia.org/wiki/Hex_(board_game)'>Hex game</a> solved using the minimax method</strong>

Minimax applies to games:
- for 2 players
- that are zero-sum: one player's win is the other player's loss
- with complete information on the possible outcomes
- where each player tries to minimize their own loss, assuming that the other player is trying to maximize their gain

<p>Minimax can either be the brute-force method that evaluates every possible outcome, or be improved with alpha-beta pruning. With ideal move ordering, pruning reduces the number of positions to evaluate from n (the brute-force count) to roughly sqrt(n); with random move ordering the count is closer to n^0.75.</p>
<p>We calculate the outcome from the point of view of the first player only.</p>
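
As a rough illustration of the pruning described above, the sketch below runs plain minimax and alpha-beta on a small hand-made tree and counts how many leaves each one evaluates. The tree (nested lists with numeric leaves) and the names `minimax_plain`, `alphabeta` and `toy_tree` are assumptions made for this example; they are not part of the tic-tac-toe code.

```python
import math

def minimax_plain(node, maximizing, counter):
    """Exhaustive minimax over a tree given as nested lists; leaves are numbers."""
    if not isinstance(node, list):
        counter[0] += 1          # one more leaf evaluated
        return node
    values = [minimax_plain(child, not maximizing, counter) for child in node]
    return max(values) if maximizing else min(values)

def alphabeta(node, maximizing, counter, alpha=-math.inf, beta=math.inf):
    """Same result as minimax_plain, but skips branches that cannot change it."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, counter, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:    # the minimizer above will never allow this line
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, counter, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:        # the maximizer above already has something better
            break
    return value

# Hypothetical toy tree: the root is a max node, its three children are min nodes
toy_tree = [[3, 5], [2, 9], [0, 1]]
plain_count, pruned_count = [0], [0]
best_plain = minimax_plain(toy_tree, True, plain_count)
best_pruned = alphabeta(toy_tree, True, pruned_count)
print(best_plain, best_pruned, plain_count[0], pruned_count[0])
```

On this toy tree both searches return 3, while alpha-beta looks at 4 of the 6 leaves. How much is saved in general depends on the order in which the children are visited.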

[@vbipin](https://github.com/vbipin)
The minimax should contain the following functions:
* calculate the children of the current state (<strong>child_node()</strong>); for more complex games that rely on heuristics, this can be split into action(state) and move(state,action)
* evaluate whether the current state is terminal, either because the maximum depth is reached or because of a win/loss (<strong>terminal()</strong>, which relies on <strong>calc_value()</strong> to detect a win)
* a utility function (<strong>calc_score()</strong>) that interprets the terminal state in terms of value for the player (typically 1, -1 or 0). To assign different values to paths based on how fast they reach a terminal state, associate with each terminal node a weight based on its depth (a win becomes +1+depth_number, a loss -1-depth_number)
* the minimax itself, a recursive function that calculates the utility of all the children of a given state, for each state. Since the players take turns, the utility of a state flips sign when seen from the other player's point of view.
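
The four building blocks listed above can be sketched on a game much smaller than tic-tac-toe. The example below is only an illustration under stated assumptions: the game (a pile of stones, each player takes 1 or 2, whoever takes the last stone wins) and the names `nim_children`, `nim_terminal`, `nim_utility` and `nim_minimax` are hypothetical and do not appear in the notebook. A state is a pair `(stones, player_to_move)`.

```python
def nim_children(state):
    """All states reachable in one move: take 1 or 2 stones, then switch player."""
    stones, player = state
    return [(stones - take, -player) for take in (1, 2) if take <= stones]

def nim_terminal(state):
    """The game ends when the pile is empty."""
    return state[0] == 0

def nim_utility(state, depth):
    """Score for player +1; `depth` is the number of plies left, so faster wins weigh more."""
    stones, player = state
    winner = -player          # the player who just moved took the last stone
    return winner * (1 + depth)

def nim_minimax(state, depth):
    """Minimax value of `state` for player +1, searching at most `depth` plies."""
    if nim_terminal(state):
        return nim_utility(state, depth)
    if depth == 0:
        return 0              # search horizon reached without a winner
    values = [nim_minimax(child, depth - 1) for child in nim_children(state)]
    # player +1 maximizes, player -1 minimizes
    return max(values) if state[1] == 1 else min(values)

print(nim_minimax((3, 1), 3), nim_minimax((4, 1), 4))
```

A negative value means that player -1 can force a win, a positive value that player +1 can. With 3 stones the player to move loses against best play, with 4 stones they win.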

### Calculate the win/loss function
<a href='http://ohboyigettodomath.blogspot.com/2015/05/tic-tac-toe-as-magic-square.html'>Magic square trick</a>
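
A minimal sketch of the trick, assuming the same numbering of the cells as in `calc_value` below: each cell is replaced by its magic-square number, and a player owns a line exactly when the numbers of their marks on that line add up to 15. The helper name `has_line` and the plain nested-list grid are assumptions made for this illustration.

```python
# Lo Shu magic square: every row, column and diagonal sums to 15
MAGIC = [[2, 7, 6], [9, 5, 1], [4, 3, 8]]

def has_line(grid, player):
    """True if `player` (1 or -1) owns a full row, column or diagonal of the 3x3 `grid`."""
    # magic number where the player has a mark, 0 elsewhere
    owned = [[MAGIC[r][c] if grid[r][c] == player else 0 for c in range(3)]
             for r in range(3)]
    line_sums = [sum(row) for row in owned]                               # rows
    line_sums += [sum(owned[r][c] for r in range(3)) for c in range(3)]   # columns
    line_sums.append(sum(owned[i][i] for i in range(3)))                  # main diagonal
    line_sums.append(sum(owned[i][2 - i] for i in range(3)))              # anti-diagonal
    return 15 in line_sums

four_grid = [[1, 0, -1], [1, -1, 0], [1, 0, 0]]
print(has_line(four_grid, 1), has_line(four_grid, -1))
```

Because all the numbers are positive, a line can only reach 15 when the player holds its three cells, so no separate "three in a row" test is needed.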

In [4]:
import numpy as np

In [28]:
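class State():
    def __init__(self,state):
        self.grid = state
        #depth = number of empty cells (0), i.e. the number of moves left before the grid is full
        self.depth = np.sum(self.grid==0)
        #row and column indices of the empty cells
        self.itemindex = np.where(self.grid==0)
        #player who moved last: the first player is 1, so an even number of empty cells means 1 has just played
        self.player = 1 if self.depth % 2 == 0 else -1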
    def calc_value(self):
        #magic square: every row, column and diagonal sums to 15
        numbered_grid = np.array([[2,7,6],[9,5,1],[4,3,8]])
        #1 where self.player (the player who moved last) has a mark, 0 elsewhere
        self.node_player = (self.grid==self.player).astype(int)
        owned = np.multiply(numbered_grid,self.node_player)
        self.score_cols = owned.sum(axis=0).max()
        self.score_rows = owned.sum(axis=1).max()
        #a diagonal counts only when the player holds all three of its cells (2+5+8 = 6+5+4 = 15)
        self.score_diagonal1 = 15 if self.node_player.diagonal().all() else 0
        self.score_diagonal2 = 15 if self.node_player[:,::-1].diagonal().all() else 0
        #15 means that self.player owns a complete line
        return int(max(self.score_cols,self.score_rows,self.score_diagonal1,self.score_diagonal2))
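    def calc_score(self):
        #returns 1+depth if the maxplayer (1) has just won, -1-depth if the minplayer (-1) has, 0 if nobody wins at this state
        #the depth bonus gives more weight to wins reached while more cells are still empty, i.e. faster wins
        if self.calc_value() == 15:
            if self.player == 1:
                return 1 + self.depth
            else:
                return -1 - self.depth
        return 0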
    def child_node(self):
        #contains a list of all the possible grids after the next move
        child_node = []
        #self.player moved last, so each remaining 0 is replaced in turn by -self.player
        for row,col in zip(*self.itemindex):
            child_grid = self.grid.copy()
            child_grid[row,col] = -self.player
            child_node.append(child_grid)
        return child_node
    def position_children(self):
        return [(self.itemindex[0][child],self.itemindex[1][child]) for child in range(0,self.depth)]
    def terminal(self):
        #a win is only possible once the first player has played at least 3 times, i.e. with at most 4 empty cells
        #(a full grid without a winner is caught by the depth==0 test in minimax)
        return self.depth<5 and self.calc_score()!=0

In [27]:
five=State(np.array([[1,0,-1], [1,-1,0], [0,0,0]]))
four=State(np.array([[1,0,-1], [1,-1,0], [1,0,0]]))
last_but_one=State(np.array([[1,-1,-1],[0,1,0],[1,1,-1]]))
last_but_two=State(np.array([[1,-1,0],[1,1,-1],[0,1,-1]]))
third=State(np.array([[1,-1,1],[0,0,0],[0,0,0]]))
second=State(np.array([[0,0,0], [0,0,-1], [1,-1,1]]))
full=State(np.full(((3,3)),0))
print(five.player,five.calc_value(),five.calc_score(),five.child_node(),five.position_children())
print(four.player,four.calc_value(),four.calc_score(),four.child_node(),four.position_children())

-1 6 0 [array([[ 1,  1, -1],
       [ 1, -1,  0],
       [ 0,  0,  0]]), array([[ 1,  0, -1],
       [ 1, -1,  1],
       [ 0,  0,  0]]), array([[ 1,  0, -1],
       [ 1, -1,  0],
       [ 1,  0,  0]]), array([[ 1,  0, -1],
       [ 1, -1,  0],
       [ 0,  1,  0]]), array([[ 1,  0, -1],
       [ 1, -1,  0],
       [ 0,  0,  1]])] [(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)]
1 15 5 [array([[ 1, -1, -1],
       [ 1, -1,  0],
       [ 1,  0,  0]]), array([[ 1,  0, -1],
       [ 1, -1, -1],
       [ 1,  0,  0]]), array([[ 1,  0, -1],
       [ 1, -1,  0],
       [ 1, -1,  0]]), array([[ 1,  0, -1],
       [ 1, -1,  0],
       [ 1,  0, -1]])] [(0, 1), (1, 2), (2, 1), (2, 2)]


### Minimax

In [18]:
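#the game being symmetrical around zero, the negamax simplification could also be applied;
#here the maximizing and minimizing cases are kept explicit
def minimax(state,tree):
    #returns the minimax value of state for the first player (1)
    #tree is filled with the value of each move available in state, keyed by (row,col)
    if state.depth==0 or state.terminal():
        return state.calc_score()
    #odd number of empty cells: the first player (1) moves next and maximizes; even: -1 minimizes
    best = max if state.depth%2!=0 else min
    value = None
    for position,child in zip(state.position_children(),state.child_node()):
        #each child gets its own dict so that tree only holds the moves of the current state
        child_value = minimax(State(child),{})
        tree[(int(position[0]),int(position[1]))] = child_value
        value = child_value if value is None else best(value,child_value)
    return value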
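
The negamax simplification mentioned in the comment above can be sketched as follows. This is an illustration under stated assumptions, not the notebook's own implementation: the board is a flat tuple of 9 cells instead of a `State` object, and the names `LINES`, `winner` and `negamax` are introduced only for this example. The scoring follows the same convention as `calc_score()` (1 plus the number of empty cells for a win).

```python
import math
from functools import lru_cache

# index triples of the 8 winning lines on a flat 9-cell board
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 1 or -1 if that player owns a full line, else 0."""
    for a, b, c in LINES:
        if board[a] != 0 and board[a] == board[b] == board[c]:
            return board[a]
    return 0

@lru_cache(maxsize=None)
def negamax(board, player):
    """Value of `board` for `player`, who is about to move (positive is good for `player`)."""
    empty = board.count(0)
    won = winner(board)
    if won != 0:
        # positive if `player` owns the line, negative if the opponent does
        return won * player * (1 + empty)
    if empty == 0:
        return 0                                  # full board, no winner: draw
    best = -math.inf
    for i in range(9):
        if board[i] == 0:
            child = board[:i] + (player,) + board[i + 1:]
            # the opponent's best value, negated, is our value for this move
            best = max(best, -negamax(child, -player))
    return best

empty_board = (0,) * 9
five_board = (1, 0, -1, 1, -1, 0, 0, 0, 0)        # same position as `five`, flattened
print(negamax(empty_board, 1), negamax(five_board, 1))
```

A single `max` over negated child values replaces the separate maximizing and minimizing branches. The empty board evaluates to 0 (a draw with best play), and the `five` position evaluates to 5 because player 1 can complete the first column immediately.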

In [19]:
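def main(state):
    print(state.grid)
    tree={}
    chance=minimax(state,tree)
    #minimax returns a score, not a probability:
    #>0 means the first player can force a win, <0 that the second player can, 0 a draw
    print('Minimax value for the first player: ',chance)
    if chance>0:
        print('Strategy to win:')
        print(tree)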

In [20]:
main(five)

[[ 1  0 -1]
 [ 1 -1  0]
 [ 0  0  0]]