<p style="font-family: Arial; font-size:3.75em;color:purple; font-weight:bold"><br>
Tic-Tac-Toe
</p><br>
<strong>3x3 version of the <a href='https://en.wikipedia.org/wiki/Hex_(board_game)'>Hex game</a> solved using the minimax method</strong>

Minimax works:
- on two-player games
- when one player's win is the other player's loss (zero-sum games)
- when there is complete information about the possible outcomes
- when each player tries to minimize their own loss, assuming that the other player is trying to maximize their own gain

<p>Minimax can either be run as a brute-force search that evaluates every possible outcome, or be sped up with alpha-beta pruning. If the brute-force search visits n positions, alpha-beta pruning visits roughly sqrt(n) of them in the best case (when the best move is always examined first) and roughly n^0.75 when the moves are examined in random order.</p>
<p>We only calculate the first player's chances of winning.</p>
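
To make the effect of pruning concrete, the sketch below counts the positions visited by a plain minimax search and by an alpha-beta search, starting from the centre opening. It is a standalone illustration rather than this notebook's code: the names `winner`, `search` and `count_nodes`, the string board and the simple +1 / -1 / 0 scores (from 'O''s point of view) are all assumptions made for the example. The exact counts depend on the order in which moves are tried.

```python
import math

# The eight winning lines of the 3x3 grid (rows, columns, diagonals).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'O' or 'X' if that player owns a full line, else None."""
    for a, b, c in LINES:
        if board[a] != '.' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def search(board, to_move, alpha, beta, prune, counter):
    """Value of `board` for 'O' (+1 win, -1 loss, 0 draw).

    counter[0] is incremented once per visited position.
    """
    counter[0] += 1
    won = winner(board)
    if won is not None:
        return 1 if won == 'O' else -1
    if '.' not in board:
        return 0
    other = 'X' if to_move == 'O' else 'O'
    best = -math.inf if to_move == 'O' else math.inf
    for i, cell in enumerate(board):
        if cell != '.':
            continue
        child = board[:i] + to_move + board[i + 1:]
        value = search(child, other, alpha, beta, prune, counter)
        if to_move == 'O':
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        # Alpha-beta cut-off: the opponent already has a better option elsewhere.
        if prune and alpha >= beta:
            break
    return best

def count_nodes(board, to_move, prune):
    """Return (value, number of visited positions)."""
    counter = [0]
    value = search(board, to_move, -math.inf, math.inf, prune, counter)
    return value, counter[0]

# 'O' has opened in the centre, 'X' is to move.
full_value, full_nodes = count_nodes('....O....', 'X', prune=False)
ab_value, ab_nodes = count_nodes('....O....', 'X', prune=True)
print('brute force:', full_value, full_nodes)
print('alpha-beta :', ab_value, ab_nodes)
```

Both searches return the same value; only the number of visited positions differs.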

[@vbipin](https://github.com/vbipin)
The minimax implementation should contain the following functions:
* a function that generates the children of the current state (<strong>child_node</strong>); for more complex games that rely on heuristics, split it into action(state) and move(state,action)
* a test for whether the current state is terminal, either because the maximum depth has been reached or because a player has won (<strong>terminal()</strong>, which relies on <strong>calc_value()</strong>)
* a utility function (<strong>calc_score()</strong>) that interprets a terminal state as a value for the player (typically 1, -1 or 0). To rank paths by how fast they reach a terminal state, weight each node by its distance to the end of the game, measured here as the number of empty cells left (a win becomes +1+depth_number, a loss -1-depth_number)
* the minimax itself, a recursive function that computes the utility of every child of a given state, for each state. Since the players alternate, the utility of a state is seen from the opposite side at each level: the original player maximizes it and the opponent minimizes it.
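
The four ingredients above can be sketched on a game even smaller than Tic-Tac-Toe. The example below is an assumption made for illustration, not part of this notebook: a pile of stones from which each player removes 1, 2 or 3 stones, where whoever takes the last stone wins. The function names `children`, `is_terminal`, `utility` and `negamax` are hypothetical. It uses the negamax formulation, in which the sign flip at each level expresses the reversal of the utility between the two players.

```python
def children(stones):
    """Piles reachable in one move (take 1, 2 or 3 stones)."""
    return [stones - take for take in (1, 2, 3) if take <= stones]

def is_terminal(stones):
    """The game is over when the pile is empty."""
    return stones == 0

def utility(stones):
    """Score of a terminal state for the player about to move.

    The pile is empty, so the opponent took the last stone: a loss.
    """
    return -1

def negamax(stones):
    """Value of the position for the player about to move (+1 win, -1 loss)."""
    if is_terminal(stones):
        return utility(stones)
    # A child's value is expressed from the opponent's point of view,
    # so it is negated before taking the maximum.
    return max(-negamax(child) for child in children(stones))

values = [negamax(n) for n in range(1, 9)]
print(values)  # [1, 1, 1, -1, 1, 1, 1, -1]
```

In this toy game the player to move loses exactly when the pile size is a multiple of 4, which is what the printed values show.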

### Calculate the win/loss function
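<a href='http://ohboyigettodomath.blogspot.com/2015/05/tic-tac-toe-as-magic-square.html'>Magic square trick</a>: each cell is assigned a number from a 3x3 magic square, so a player has won exactly when a row, column or diagonal of their cells sums to 15.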
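
As a quick check of the trick, the sketch below verifies that all eight lines of the magic square sum to 15 and that no other triple of distinct numbers from 1 to 9 does. The names `MAGIC`, `LINES` and `has_won` are assumptions for this illustration; the square itself is the one used in `calc_value()`.

```python
from itertools import combinations

# Magic square laid out row by row, as in calc_value().
MAGIC = [2, 7, 6,
         9, 5, 1,
         4, 3, 8]

# Index triples of the rows, columns and diagonals.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

# Every line of the square sums to 15 ...
line_sums = [sum(MAGIC[i] for i in line) for line in LINES]

# ... and only eight triples of distinct numbers in 1..9 sum to 15,
# so a triple summing to 15 is always one of the eight lines.
triples = [c for c in combinations(range(1, 10), 3) if sum(c) == 15]

def has_won(grid, player):
    """True if `player` owns a full line of `grid` (a list of 9 symbols)."""
    owned = [MAGIC[i] if cell == player else 0 for i, cell in enumerate(grid)]
    return any(sum(owned[i] for i in line) == 15 for line in LINES)

print(line_sums)     # [15, 15, 15, 15, 15, 15, 15, 15]
print(len(triples))  # 8
print(has_won(list('O.XOX.O..'), 'O'))  # True: 'O' owns the left column
```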

In [1]:
import numpy as np

In [2]:
class State():
    
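    def __init__(self,state):
        self.grid = state
        #depth = number of empty cells left on the grid
        self.depth = np.sum(np.isin(self.grid,'.'))
        #positions of the empty cells
        self.itemindex = np.where(np.asanyarray(self.grid) == '.')[0]
        self.id = ''.join([str(i) for i in self.grid])
        #player = who made the last move, next_player = who plays next ('O' opens the game)
        self.player = 'X'
        self.next_player = 'O'
        if self.depth % 2 == 0:
            self.player = 'O'
            self.next_player = 'X'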
            
    def calc_value(self):
        #magic square: three cells are aligned exactly when their numbers sum to 15
        numbered_grid = [2, 7, 6, 9, 5, 1, 4, 3, 8]
        #keep only the numbers of the cells owned by the player who moved last
        self.nb_player = [numbered_grid[k] if x == self.player else 0 for k,x in enumerate(self.grid)]
        self.score_rows = max([sum(self.nb_player[x:x+3]) for x in [0,3,6]])
        self.score_cols = max([sum(self.nb_player[x::3]) for x in range(3)])
        self.score_diagonal1 = self.nb_player[0] + self.nb_player[4] + self.nb_player[8]
        self.score_diagonal2 = self.nb_player[2] + self.nb_player[4] + self.nb_player[6]
        return max(self.score_rows,self.score_cols,self.score_diagonal1,self.score_diagonal2)
    
    def calc_score(self):
        #returns depth+1 if 'O' (the maximizing player) has just won, -(depth+1) if 'X' has just won, 0 if nobody wins at this state
        if self.calc_value() == 15:
            if self.player == 'O':
                return self.depth + 1
            else:
                return -self.depth - 1
        return 0
    
    def child_node(self):
        #build the list of all the possible grids after the next move
        child_node = []
        child_grid = self.grid.copy()
        #replace each remaining '.' by next_player, one at a time, and append the resulting grid to the child_node list
        for child in self.itemindex:
            child_grid[child] = self.next_player
            child_node.append(State(child_grid))
            child_grid = self.grid.copy()
        return child_node
    
    def terminal(self):
        #a state is terminal when a player has won; a win needs at least 5 moves, so grids with 6 or more empty cells are skipped
        return self.depth<6 and self.calc_score()!=0

In [69]:
five = State(['O','.','X','O','X','.', '.','.','.'])
four = State(['O','.','X','O','X','.','O','.','.'])
print(five.player,five.calc_value(),five.calc_score())
print(four.player,four.calc_value(),four.calc_score())
four.child_node()

X 11 0
O 15 5


[<__main__.State at 0x1a936d7d898>,
 <__main__.State at 0x1a936d7dbe0>,
 <__main__.State at 0x1a936d7df28>,
 <__main__.State at 0x1a936d7d550>]

### Minimax

In [4]:
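def minimax(state, tree, scores):
    #value starts at 0 (a draw): the maximizer never reports less than 0 and the minimizer never more than 0
    value = 0
    if state.depth == 0 or state.terminal():
        tree[state.id]=(state.depth,value)
        return state.calc_score()
    else:
        list_scores = []
        #state.player made the last move: after 'X', 'O' plays next and maximizes
        if state.player == 'X':
            for child in state.child_node():
                value = max(value,minimax(child, tree, scores))
                tree[child.id] = (child.depth, value)
                #list_scores holds the running best value, not each child's own value
                list_scores.append(value)
        else:
            #after 'O', 'X' plays next and minimizes
            for child in state.child_node():
                value = min(value,minimax(child, tree, scores))
                tree[child.id] = (child.depth, value)
                list_scores.append(value)
        scores[state.id] = list_scores
        return value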
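
The cell above starts each search at `value = 0` and records the running best in `list_scores`. As a point of comparison, here is a minimal standalone sketch of a variant that starts from the children's own values instead, so that a position where every move loses gets a negative value. It is not the notebook's implementation: `score` and `minimax_values` are hypothetical names, the grid is a plain string, and the scoring convention (`empties + 1` for an 'O' win, the negative for an 'X' win) is copied from `calc_score()`.

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def score(grid):
    """+(empties + 1) if 'O' owns a line, -(empties + 1) if 'X' does, else 0."""
    empties = grid.count('.')
    for a, b, c in LINES:
        if grid[a] != '.' and grid[a] == grid[b] == grid[c]:
            return empties + 1 if grid[a] == 'O' else -(empties + 1)
    return 0

def minimax_values(grid, to_move, scores):
    """Return the minimax value of `grid` with `to_move` about to play.

    scores[grid] receives the value of each child, in the order of the
    empty cells, so the best move can be read off directly.
    """
    terminal_score = score(grid)
    if terminal_score != 0 or '.' not in grid:
        return terminal_score
    other = 'X' if to_move == 'O' else 'O'
    child_values = []
    for i, cell in enumerate(grid):
        if cell == '.':
            child = grid[:i] + to_move + grid[i + 1:]
            child_values.append(minimax_values(child, other, scores))
    scores[grid] = child_values
    # 'O' maximizes, 'X' minimizes; no artificial starting value of 0.
    return max(child_values) if to_move == 'O' else min(child_values)

scores = {}
best = minimax_values('O.XOX....', 'O', scores)
print(best)                 # 5
print(scores['O.XOX....'])  # [-4, -4, 5, -4, -4]
```

For this position only the move in the bottom-left corner (cell 6) wins for 'O'; every other move lets 'X' complete the diagonal.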

In [82]:
def main(state):
    print(np.array(state.grid).reshape(-1, 3))
    tree = {}
    scores = {}
    chance = minimax(state, tree, scores)
    print('Best score (if both players maximize their chance): ', chance)
    print('Next move:')
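    #only the root state has as many children as the grid has empty cells
    for branch in scores.values():
        if len(branch)==state.depth:
            for k,v in enumerate(branch):
                #branch holds running values, so every entry equal to the final best is printed
                if v==max(branch):
                    next_move=state.grid.copy()
                    next_move[state.itemindex[k]]=state.next_player
                    print(np.array(next_move).reshape(-1, 3))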

    results = [x[1] for x in tree.values()]
    
    print('Moves resulting in a draw: ',"{:.1%}".format(results.count(0) / len(results)))
    print('Moves resulting in a loss: ',"{:.1%}".format(sum(i < 0 for i in results) / len(results)))
    print('Moves resulting in a victory: ',"{:.1%}".format(sum(i > 0 for i in results) / len(results)))
    
    return scores
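
Because `minimax` stores running maxima, the `v==max(branch)` test selects every move from the first best one onwards. The small sketch below illustrates this with the branch recorded for the grid 'O.XOX....' in the outputs further down; `moves_matching_max` and `first_move_reaching_max` are hypothetical helpers, not functions of this notebook.

```python
def moves_matching_max(empty_cells, branch):
    """Cells selected by the `v == max(branch)` test used in main()."""
    return [cell for cell, v in zip(empty_cells, branch) if v == max(branch)]

def first_move_reaching_max(empty_cells, branch):
    """Cell at which a running maximum first reaches its final value."""
    return empty_cells[branch.index(max(branch))]

empty_cells = [1, 5, 6, 7, 8]   # empty cells of 'O.XOX....'
branch = [0, 0, 5, 5, 5]        # running maxima recorded for that grid

print(moves_matching_max(empty_cells, branch))       # [6, 7, 8]
print(first_move_reaching_max(empty_cells, branch))  # 6
```

Taking the first position is only meaningful when 'O' is to move and the final value is above the starting value of 0; when the branch is all zeros, the running values do not say which move produced the result.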

In [83]:
main(State(list('X..O.OOXX')))

[['X' '.' '.']
 ['O' '.' 'O']
 ['O' 'X' 'X']]
Best score (if both players maximize their chance):  3
Next move:
[['X' '.' '.']
 ['O' 'O' 'O']
 ['O' 'X' 'X']]
Moves resulting in a draw:  44.4%
Moves resulting in a loss:  22.2%
Moves resulting in a victory:  33.3%


{'XOXO.OOXX': [1],
 'XO.O.OOXX': [0, -2],
 'XXOO.OOXX': [1],
 'X.OO.OOXX': [0, -2],
 'X..O.OOXX': [0, 0, 3]}

In [74]:
#centerstart = main(State(list('....O....')))
main(State(list('....O....')))

[['.' '.' '.']
 ['.' 'O' '.']
 ['.' '.' '.']]
Best score (if both players maximize their chance):  0
Next move:
[['X' '.' '.']
 ['.' 'O' '.']
 ['.' '.' '.']]
[['.' 'X' '.']
 ['.' 'O' '.']
 ['.' '.' '.']]
[['.' '.' 'X']
 ['.' 'O' '.']
 ['.' '.' '.']]
[['.' '.' '.']
 ['X' 'O' '.']
 ['.' '.' '.']]
[['.' '.' '.']
 ['.' 'O' 'X']
 ['.' '.' '.']]
[['.' '.' '.']
 ['.' 'O' '.']
 ['X' '.' '.']]
[['.' '.' '.']
 ['.' 'O' '.']
 ['.' 'X' '.']]
[['.' '.' '.']
 ['.' 'O' '.']
 ['.' '.' 'X']]
Moves resulting in a draw:  64.0%
Moves resulting in a loss:  5.7%
Moves resulting in a victory:  30.3%


In [596]:
cornerstart = main(State(list('O........')))

[['O' '.' '.']
 ['.' '.' '.']
 ['.' '.' '.']]
Best score (if both players maximize their chance):  0
Next move:
5
Moves resulting in a draw:  72.1%
Moves resulting in a loss:  7.9%
Moves resulting in a victory:  20.0%


In [571]:
middlestart = main(State(list('.O.......')))

[['.' 'O' '.']
 ['.' '.' '.']
 ['.' '.' '.']]
Best score (if both players maximize their chance):  0
Next move:
5
Moves resulting in a draw:  75.7%
Moves resulting in a loss:  9.8%
Moves resulting in a victory:  14.4%


In [84]:
main(five)

[['O' '.' 'X']
 ['O' 'X' '.']
 ['.' '.' '.']]
Best score (if both players maximize their chance):  5
Next move:
[['O' '.' 'X']
 ['O' 'X' '.']
 ['O' '.' '.']]
[['O' '.' 'X']
 ['O' 'X' '.']
 ['.' 'O' '.']]
[['O' '.' 'X']
 ['O' 'X' '.']
 ['.' '.' 'O']]
Moves resulting in a draw:  31.3%
Moves resulting in a loss:  37.3%
Moves resulting in a victory:  31.3%


{'OOXOXX.O.': [-2, -2],
 'OOXOXX.XO': [1],
 'OOXOXX..O': [-2, -2],
 'OOXOXX...': [3, 3, 3],
 'OOXOXO.XX': [1],
 'OOXOXO.X.': [-2, -2],
 'OOXOX..XO': [0, -2],
 'OOXOX..X.': [0, 3, 3],
 'OOXOXO..X': [-2, -2],
 'OOXOX..OX': [-2, -2],
 'OOXOX...X': [0, 3, 3],
 'OOXOX....': [0, -4, -4, -4],
 'OXXOXO.OX': [1],
 'OXXOXO.O.': [-2, -2],
 'OXXOXO..O': [-2, -2],
 'OXXOXO...': [3, 3, 3],
 'O.XOXO.XO': [-2, -2],
 'O.XOXO.X.': [0, 3, 3],
 'O.XOXO.OX': [0, -2],
 'O.XOXO..X': [0, 3, 3],
 'O.XOXO...': [0, -4, -4, -4],
 'OXXOXX.OO': [1],
 'OXXOX..OO': [0, -2],
 'OXXOX..O.': [0, 3, 3],
 'O.XOXX.OO': [0, -2],
 'O.XOXX.O.': [0, 3, 3],
 'O.XOX..OX': [0, 0, 3],
 'O.XOX..O.': [0, 0, -4, -4],
 'OXXOX...O': [0, 3, 3],
 'O.XOXX..O': [0, 3, 3],
 'O.XOX..XO': [0, 0, 3],
 'O.XOX...O': [0, 0, -4, -4],
 'O.XOX....': [0, 0, 5, 5, 5]}

In [85]:
main(State(list('OX...O...')))

[['O' 'X' '.']
 ['.' '.' 'O']
 ['.' '.' '.']]
Best score (if both players maximize their chance):  0
Next move:
[['O' 'X' 'X']
 ['.' '.' 'O']
 ['.' '.' '.']]
[['O' 'X' '.']
 ['X' '.' 'O']
 ['.' '.' '.']]
[['O' 'X' '.']
 ['.' 'X' 'O']
 ['.' '.' '.']]
[['O' 'X' '.']
 ['.' '.' 'O']
 ['X' '.' '.']]
[['O' 'X' '.']
 ['.' '.' 'O']
 ['.' 'X' '.']]
[['O' 'X' '.']
 ['.' '.' 'O']
 ['.' '.' 'X']]
Moves resulting in a draw:  71.8%
Moves resulting in a loss:  8.8%
Moves resulting in a victory:  19.3%


{'OXXOXO.OX': [1],
 'OXXOXO.O.': [-2, -2],
 'OXXOXO..O': [-2, -2],
 'OXXOXO...': [3, 3, 3],
 'OXXO.OXOX': [1],
 'OXXO.OXO.': [-2, -2],
 'OXXO.OXXO': [1],
 'OXXO.OX.O': [-2, -2],
 'OXXO.OX..': [3, 3, 3],
 'OXXO.O.XO': [-2, -2],
 'OXXO.O.X.': [3, 3, 3],
 'OXXO.O.OX': [0, 0],
 'OXXO.O..X': [3, 3, 3],
 'OXXO.O...': [0, 0, 0, 0],
 'OXXXOOOX.': [1],
 'OXXXOOO.X': [0],
 'OXXXOOO..': [0, 0],
 'OXXXOOXO.': [1],
 'OXXXOO.OX': [0],
 'OXXXOO.O.': [0, 0],
 'OXXXOO...': [0, 0, 3],
 'OXX.OOXOX': [1],
 'OXX.OOXO.': [0, 0],
 'OXX.OOX..': [3, 3, 3],
 'OXX.OOOXX': [1],
 'OXX.OOOX.': [0, 0],
 'OXX.OO.X.': [3, 3, 3],
 'OXX.OOO.X': [0, 0],
 'OXX.OO.OX': [0, 0],
 'OXX.OO..X': [3, 3, 3],
 'OXX.OO...': [0, 0, 0, 0],
 'OXXXXOOO.': [1],
 'OXXX.OOOX': [0],
 'OXXX.OOO.': [0, 0],
 'OXXXXOO.O': [1],
 'OXXX.OOXO': [1],
 'OXXX.OO.O': [0, 0],
 'OXXX.OO..': [0, 0, 0],
 'OXX.XOOOX': [1],
 'OXX.XOOO.': [0, 0],
 'OXX.XOO.O': [0, -2],
 'OXX.XOO..': [3, 3, 3],
 'OXX..OOXO': [0, -2],
 'OXX..OOX.': [3, 3, 3],
 'OXX..OOOX': [0,