<p style="font-family: Arial; font-size:3.75em;color:purple; font-weight:bold"><br>
Tic-Tac-Toe
</p><br>
<strong>3x3 version of the <a href='https://en.wikipedia.org/wiki/Hex_(board_game)'>Hex game</a> solved using the minimax method</strong>

Minimax applies:
- to two-player games
- when one player's win is the other player's loss (zero-sum games)
- when there is perfect information: both players can see the full state and every possible outcome
- when each player aims to minimize their own worst-case loss, assuming that the opponent plays to maximize their own gain

<p>Minimax can be run as a brute-force search that evaluates every possible outcome, or sped up with alpha-beta pruning. If the brute-force search examines n leaf positions, alpha-beta examines roughly sqrt(n) of them in the best case (ideal move ordering) and roughly n^0.75 with random move ordering.</p>
<p>We evaluate the game from the first player's point of view only.</p>
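
To make the pruning idea concrete, here is a minimal sketch that runs plain minimax and alpha-beta on a small hand-made tree and counts how many leaves each one evaluates. The tree, the function names and the counters are illustrative assumptions and are not part of the notebook's own code.

```python
import math

def minimax_count(node, maximizing, counter):
    """Plain minimax over a nested-list tree; leaves are numbers."""
    if not isinstance(node, list):
        counter[0] += 1          # one more leaf evaluated
        return node
    values = [minimax_count(child, not maximizing, counter) for child in node]
    return max(values) if maximizing else min(values)

def alphabeta_count(node, maximizing, counter, alpha=-math.inf, beta=math.inf):
    """Same search, but stops exploring a node once alpha >= beta."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta_count(child, False, counter, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:    # the min player above will never allow this branch
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta_count(child, True, counter, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:        # the max player above already has something better
            break
    return best

# Hypothetical two-level tree: the root maximizes, its children minimize.
tree = [[3, 5], [2, 9], [0, 1]]
plain, pruned = [0], [0]
v_plain = minimax_count(tree, True, plain)
v_pruned = alphabeta_count(tree, True, pruned)
print(v_plain, v_pruned, plain[0], pruned[0])  # 3 3 6 4
```

Both searches return the same value; the pruned one simply skips leaves that cannot change the result.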

[@vbipin](https://github.com/vbipin)
A minimax implementation needs the following functions:
* generate the children of the current state (<strong>child_node</strong>); for more complex games with heuristics, split this into action(state) and move(state,action)
* test whether the current state is terminal, either because the maximum depth is reached or because a player has won (<strong>terminal()</strong>, which builds on <strong>calc_value()</strong> to detect a completed line)
* a utility function (<strong>calc_score()</strong>) that scores a terminal state for the player (typically 1, -1 or 0). To favor paths that reach a terminal state faster, weight each node by its distance to the terminal state (a win becomes +1+depth_number, a loss -1-depth_number)
* the minimax itself, a recursive function that computes the utility of every child of every state. Since the players alternate, the utility is read from the original player's point of view: one level takes the maximum over its children, the next level takes the minimum.
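
The four pieces listed above can be sketched generically as follows. This is only an illustration under stated assumptions: the callback names (`children`, `is_terminal`, `utility`) and the toy stone-taking game are hypothetical and chosen to keep the example short; they are not the functions used later in this notebook.

```python
def minimax(state, maximizing, children, is_terminal, utility):
    """Generic minimax: returns the best score the max player can force."""
    if is_terminal(state):
        return utility(state, maximizing)
    scores = [minimax(child, not maximizing, children, is_terminal, utility)
              for child in children(state)]
    # Players alternate, so levels alternate between max and min.
    return max(scores) if maximizing else min(scores)

# Toy game (an assumption for illustration): a pile of stones, each player
# removes 1 or 2 stones, and whoever takes the last stone wins.
def children(stones):
    return [stones - k for k in (1, 2) if stones - k >= 0]

def is_terminal(stones):
    return stones == 0

def utility(stones, maximizing):
    # If the max player is to move on an empty pile, the min player took the last stone.
    return -1 if maximizing else 1

lose_from_3 = minimax(3, True, children, is_terminal, utility)
win_from_4 = minimax(4, True, children, is_terminal, utility)
print(lose_from_3, win_from_4)  # -1 1
```

Passing the game rules in as callbacks keeps the recursion itself independent of the game being solved.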

### Calculate the win/loss function
<a href='http://ohboyigettodomath.blogspot.com/2015/05/tic-tac-toe-as-magic-square.html'>Magic square trick</a>
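
As a rough sketch of the magic square idea: in a 3x3 magic square every row, column and diagonal sums to 15, so a player owns a full line exactly when the magic numbers on that player's cells sum to 15 along that line. The helper below is a hypothetical standalone function (`has_won` and `MAGIC` are names introduced here, not the notebook's own), and it sums each of the eight lines explicitly, including both diagonals.

```python
import numpy as np

# 3x3 magic square: all rows, columns and both diagonals sum to 15.
MAGIC = np.array([[2, 7, 6],
                  [9, 5, 1],
                  [4, 3, 8]])

def has_won(grid, player):
    """Return True if `player` owns a complete row, column or diagonal."""
    mask = (grid == player)          # boolean mask of the player's cells
    weighted = MAGIC * mask          # magic numbers on the player's cells, 0 elsewhere
    line_sums = np.concatenate([
        weighted.sum(axis=0),        # three column sums
        weighted.sum(axis=1),        # three row sums
        [np.trace(weighted), np.trace(np.fliplr(weighted))],  # two diagonals
    ])
    return bool((line_sums == 15).any())

column_win = np.array([['O', '.', 'X'], ['O', 'X', '.'], ['O', '.', '.']])
diagonal_win = np.array([['O', '.', 'X'], ['O', 'X', '.'], ['X', '.', '.']])
print(has_won(column_win, 'O'), has_won(column_win, 'X'))      # True False
print(has_won(diagonal_win, 'X'), has_won(diagonal_win, 'O'))  # True False
```

Because all magic numbers are positive and each line sums to exactly 15, a partially filled line can never reach 15.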

In [1]:
import numpy as np

In [189]:
class State():
    #one board position: '.' marks an empty cell and 'O' is the first player
    def __init__(self,state):
        self.grid = state
        #depth = number of empty cells left, i.e. the number of moves still available
        self.depth = np.sum(np.isin(self.grid,'.'))
        self.itemindex = np.where(self.grid=='.')
        self.id=''.join([str(i) for i in self.grid.flatten()])
        #by parity: player is the one who made the last move, next_player is the one who moves now
        self.player = 'X'
        self.next_player = 'O'
        if self.depth % 2 == 0:
            self.player = 'O'
            self.next_player = 'X'
    def calc_value(self):
        #magic square: every line sums to 15, so a line fully owned by self.player scores 15
        numbered_grid = np.array([[2,7,6],[9,5,1],[4,3,8]])
        self.node_player = np.array([1 if x==self.player else 0 for x in self.grid.flatten()]).reshape(-1,3)
        self.score_cols = max(sum(np.multiply(numbered_grid,self.node_player)))
        #with the transposed mask, column j of the product is complete exactly when row j of the grid is
        self.score_rows = max(sum(np.multiply(numbered_grid,self.node_player.transpose())))
        #NOTE: the diagonal scores take the max of the products, not their sum, so they never reach 15
        #as written, only complete rows and columns are detected as wins
        self.score_diagonal1 = max(np.multiply(self.node_player.diagonal(),np.array([2,5,8])))
        self.score_diagonal2 = max(np.multiply(self.node_player[:,::-1].diagonal(),np.array([2,5,8])))
        return max(self.score_cols,self.score_rows,self.score_diagonal1,self.score_diagonal2)
    def calc_score(self):
        #returns +depth if 'O' (the max player) has just won, -depth if 'X' (the min player) has just won, 0 otherwise
        #depth is the number of empty cells left, so faster wins get a larger magnitude
        if self.calc_value() == 15:
            if self.player == 'O':
                return self.depth
            else:
                return -self.depth
        return 0
    def child_node(self):
        #contains a list of all the possible grids after the next move
        child_node = []
        child_grid = self.grid.copy()
        #replace each remaining '.' by next_player one by one and append the resulting grid to the child_node list
        for child in range(0,self.depth):
            child_grid[self.itemindex[0][child]][self.itemindex[1][child]] = self.next_player
            child_node.append(child_grid)
            child_grid=self.grid.copy()
        return child_node
    def position_children(self):
        #(row, column) of every empty cell, in the same order as child_node()
        return [(self.itemindex[0][child],self.itemindex[1][child]) for child in range(0,self.depth)]
    def terminal(self):
        #a win is only possible once the first player has played at least 3 times; a full board is handled by the caller
        return self.depth<6 and self.calc_score()!=0

In [190]:
five=State(np.array([['O','.','X'], ['O','X','.'], ['.','.','.']]))
four=State(np.array([['O','.','X'], ['O','X','.'], ['O','.','.']]))
print(five.player,five.calc_value(),five.calc_score(),five.child_node(),five.position_children())
print(four.player,four.calc_value(),four.calc_score(),four.child_node(),four.position_children())

X 6 0 [array([['O', 'O', 'X'],
       ['O', 'X', '.'],
       ['.', '.', '.']], dtype='<U1'), array([['O', '.', 'X'],
       ['O', 'X', 'O'],
       ['.', '.', '.']], dtype='<U1'), array([['O', '.', 'X'],
       ['O', 'X', '.'],
       ['O', '.', '.']], dtype='<U1'), array([['O', '.', 'X'],
       ['O', 'X', '.'],
       ['.', 'O', '.']], dtype='<U1'), array([['O', '.', 'X'],
       ['O', 'X', '.'],
       ['.', '.', 'O']], dtype='<U1')] [(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)]
O 15 4 [array([['O', 'X', 'X'],
       ['O', 'X', '.'],
       ['O', '.', '.']], dtype='<U1'), array([['O', '.', 'X'],
       ['O', 'X', 'X'],
       ['O', '.', '.']], dtype='<U1'), array([['O', '.', 'X'],
       ['O', 'X', '.'],
       ['O', 'X', '.']], dtype='<U1'), array([['O', '.', 'X'],
       ['O', 'X', '.'],
       ['O', '.', 'X']], dtype='<U1')] [(0, 1), (1, 2), (2, 1), (2, 2)]


### Minimax

In [191]:
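#the game is zero-sum, so a negamax formulation would also apply; here the max and min branches are written out explicitly
def minimax(state,tree):
    #value starts at 0 (a draw), so a max node never scores below 0 and a min node never above 0
    value=0
    if state.depth==0 or state.terminal():
        tree[state.id]=(state.depth,value)
        return state.calc_score()
    else:
        if state.depth%2!=0:
            #odd number of empty cells: 'O' (the max player) moves next
            for child in state.child_node():
                node=State(child)
                value=max(value,minimax(node,tree))
                #the tree stores the running best value at the time the child was visited
                tree[node.id]=(node.depth,value)
        else:
            #even number of empty cells: 'X' (the min player) moves next
            for child in state.child_node():
                node=State(child)
                value=min(value,minimax(node,tree))
                tree[node.id]=(node.depth,value)
    return value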
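
For comparison, here is a minimal negamax sketch for the same game. It is a standalone illustration under stated assumptions, not a drop-in replacement for the function above: boards are flat 9-character strings like the tree keys, the names `LINES`, `winner` and `negamax` are introduced here, the best value starts at minus infinity rather than 0, wins are checked on all eight lines, and a win is scored as 1 plus the number of empty cells left. Results are cached with `functools.lru_cache` so the full search stays fast.

```python
from functools import lru_cache

# Index triples of the eight winning lines on a flat 3x3 board.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'O' or 'X' if that player owns a full line, else None."""
    for a, b, c in LINES:
        if board[a] != '.' and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def negamax(board, to_move):
    """Score of `board` from the point of view of the player about to move."""
    other = 'X' if to_move == 'O' else 'O'
    empties = board.count('.')
    if winner(board) == other:       # the previous mover has just won
        return -(1 + empties)        # faster losses are worse
    if empties == 0:
        return 0                     # full board, no winner: draw
    best = -float('inf')
    for i, cell in enumerate(board):
        if cell == '.':
            child = board[:i] + to_move + board[i + 1:]
            # The child's score is from the opponent's view, so negate it.
            best = max(best, -negamax(child, other))
    return best

empty_score = negamax('.' * 9, 'O')
five_score = negamax('O.XOX....', 'O')
print(empty_score, five_score)  # 0 5
```

The single `max` over negated child scores replaces the separate max and min branches, which is the simplification that a zero-sum game allows.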

In [195]:
def main(state):
    print(state.grid)
    tree={}
    chance=minimax(state,tree)
    print('Best score: ',chance)
    #only report a strategy when a win can be forced and the given state is not itself already a win
    if chance>1 and state.depth!=chance:
        print('Strategy to win:')
        #tree values are (depth, value) tuples, so max() compares depth first, then value
        best=max(tree.values())
        winning_move=next(key for key, value in tree.items() if value == best)
        print(np.array(list(winning_move)).reshape(-1,3))
        results=[x[1] for x in tree.values()]
        print('Moves resulting in a draw: ',"{:.1%}".format(results.count(0)/len(results)))
        print('Moves resulting in a loss: ',"{:.1%}".format(results.count(-1)/len(results)))
        print('Moves resulting in a victory: ',"{:.1%}".format(sum(i>0 for i in results)/len(results)))
    return tree

In [194]:
main(State(np.array([['.','.','.'],['.','O','.'],['.','.','.']])))

[['.' '.' '.']
 ['.' 'O' '.']
 ['.' '.' '.']]
Best score:  0


{'XOXOOXOXO': (0, 0),
 'XOXOOXOX.': (1, 0),
 'XOXOOXO.X': (1, -1),
 'XOXOOXO..': (2, 0),
 'XOXOOX.O.': (2, 2),
 'XOXOOXXOO': (0, 0),
 'XOXOOXX.O': (1, 0),
 'XOXOOX.XO': (1, 0),
 'XOXOOX..O': (2, 0),
 'XOXOOX...': (3, 0),
 'XOXOOOX..': (2, 2),
 'XOXOO.XO.': (2, 2),
 'XOXOOOXXO': (0, 0),
 'XOXOO.XXO': (1, 0),
 'XOXOO.X.O': (2, 0),
 'XOXOO.X..': (3, 0),
 'XOXOOO.X.': (2, 2),
 'XOXOOOOXX': (0, 0),
 'XOXOO.OXX': (1, 0),
 'XOXOO.OX.': (2, 0),
 'XOXOO..XO': (2, 0),
 'XOXOO..X.': (3, 0),
 'XOXOOO..X': (2, 2),
 'XOXOO.O.X': (2, 0),
 'XOXOO..OX': (2, 2),
 'XOXOO...X': (3, 0),
 'XOXOO....': (4, 0),
 'XOXXOOOXO': (0, 0),
 'XOXXOOOX.': (1, 0),
 'XOXXOOOOX': (0, 0),
 'XOXXOOO.X': (1, 0),
 'XOXXOOO..': (2, 0),
 'XOXXOO.O.': (2, 2),
 'XOXXOOX.O': (1, -1),
 'XOXXOO.XO': (1, 0),
 'XOXXOO..O': (2, 0),
 'XOXXOO...': (3, 0),
 'XOX.OOXO.': (2, 2),
 'XOX.OOXXO': (1, 0),
 'XOX.OOX.O': (2, 0),
 'XOX.OOX..': (3, 0),
 'XOX.OOOXX': (1, 0),
 'XOX.OOOX.': (2, 0),
 'XOX.OO.XO': (2, 0),
 'XOX.OO.X.': (3, 0),
 'XOX.OO

In [196]:
main(five)

[['O' '.' 'X']
 ['O' 'X' '.']
 ['.' '.' '.']]
Best score:  4
Strategy to win:
[['O' '.' 'X']
 ['O' 'X' '.']
 ['O' '.' '.']]
Moves resulting in a draw:  72.7%
Moves resulting in a loss:  5.2%
Moves resulting in a victory:  22.1%


{'OOXOXXO..': (2, 2),
 'OOXOXXXOO': (0, 0),
 'OOXOXXXO.': (1, 0),
 'OOXOXX.OX': (1, -1),
 'OOXOXX.O.': (2, 0),
 'OOXOXXX.O': (1, 0),
 'OOXOXXOXO': (0, 0),
 'OOXOXX.XO': (1, 0),
 'OOXOXX..O': (2, 0),
 'OOXOXX...': (3, 0),
 'OOXOXOXXO': (0, 0),
 'OOXOXOXX.': (1, 0),
 'OOXOXOXOX': (0, 0),
 'OOXOXOX.X': (1, 0),
 'OOXOXOX..': (2, 0),
 'OOXOX.XOX': (1, -1),
 'OOXOX.XO.': (2, 0),
 'OOXOX.XXO': (1, 0),
 'OOXOX.X.O': (2, 0),
 'OOXOX.X..': (3, 0),
 'OOXOXOOXX': (0, 0),
 'OOXOXO.XX': (1, 0),
 'OOXOXO.X.': (2, 0),
 'OOXOX.OX.': (2, 2),
 'OOXOX..XO': (2, 0),
 'OOXOX..X.': (3, 0),
 'OOXOXO..X': (2, 0),
 'OOXOX.O.X': (2, 2),
 'OOXOX..OX': (2, 0),
 'OOXOX...X': (3, 0),
 'OOXOX....': (4, 0),
 'OXXOXOO..': (2, 2),
 'OXXOXOXOO': (0, 0),
 'OXXOXOXO.': (1, 0),
 'OXXOXOOOX': (0, 0),
 'OXXOXO.OX': (1, 0),
 'OXXOXO.O.': (2, 0),
 'OXXOXOX.O': (1, 0),
 'OXXOXO.XO': (1, -1),
 'OXXOXO..O': (2, 0),
 'OXXOXO...': (3, 0),
 'O.XOXOXOX': (1, 0),
 'O.XOXOXO.': (2, 0),
 'O.XOXOXXO': (1, -1),
 'O.XOXOX.O': (2, 0),
 'O.XO