# Connect 4

---

Author: S. Menary [sbmenary@gmail.com]

Date  : 2023-01-03, last edit 2023-01-11

Brief : Develop a simple Connect 4 game environment and implement a bot using Monte Carlo Tree Search (MCTS)

---

### Summary

- Connect 4 is a two-player, fully-observable, zero-sum game. 
- The game states may be represented as a tree structure, so we can implement a bot using tree-search algorithms. We choose Connect 4 because it is simple, and therefore provides a launch-pad for more complex games such as checkers or chess.
- Initially we implement vanilla MCTS with no machine learning. We expect this to be limited by (i) the stochastic rollout of the tree and (ii) the simplicity of the simulation policy.
- To introduce ML, we would perform alternate steps of MCTS evaluation and simulation policy improvement. In this way, the simulated games will _hopefully_ begin to approach "good play", and the final MCTS values will reflect the behaviour of good players.
- MCTS configuration:
    + Tree-traversal policy is:
        1. From the current node, uniformly-randomly select a non-expanded child if one is available
        2. Otherwise select child with highest UCB-1 score, traverse to this node and repeat
    + Resulting node is expanded by adding all possible children and selecting one by performing a uniformly-random action
    + Simulation policy is to select a uniformly-random action
- The UCB-1 score is designed to balance exploration and exploitation for stationary multi-armed bandits. Strictly speaking, we are applying it in a non-stationary environment, because the reward distribution for each action changes as the down-stream tree evolves. This makes UCB-1 theoretically sub-optimal. However, it is often used nonetheless.
- When playing an actual move (i.e. inference time), greedily select the action with the max average score from its MCTS visits (do not use UCB-1 since we are no longer exploring).
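
The two selection rules above (UCB-1 while searching, greedy mean score when playing a move) can be sketched in a few lines. This is a minimal illustration rather than the notebook's implementation: `ucb1_score`, `select_for_search` and `select_for_play` are hypothetical helper names, each child is reduced to a `(total_score, num_visits)` tuple, the parent visit count is assumed to equal the sum of the child visit counts, and `c=2.0` mirrors the default exploration constant used by `Node` later on.

```python
import math

def ucb1_score(total_score, num_visits, parent_visits, c=2.0):
    """UCB-1 score: mean reward plus an exploration bonus (hypothetical helper)."""
    if num_visits == 0:
        return math.inf  # unvisited children are always tried first
    mean = total_score / num_visits
    return mean + c * math.sqrt(math.log(parent_visits) / num_visits)

def select_for_search(children, c=2.0):
    """During search: index of the child with the highest UCB-1 score."""
    parent_visits = sum(n for _, n in children)  # assumption: parent visits = sum of child visits
    scores = [ucb1_score(t, n, parent_visits, c) for t, n in children]
    return max(range(len(children)), key=scores.__getitem__)

def select_for_play(children):
    """At inference time: index of the child with the highest mean score."""
    means = [t / n if n > 0 else -math.inf for t, n in children]
    return max(range(len(children)), key=means.__getitem__)

# Illustrative statistics, one (total_score, num_visits) tuple per child
children = [(6.0, 10), (1.0, 2), (-3.0, 8)]
print("search selects child", select_for_search(children))
print("play selects child  ", select_for_play(children))
```

With these made-up numbers the rarely visited second child gets the largest exploration bonus and is chosen during search, while the first child has the highest mean score and is chosen when an actual move is played.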

Observations:
- Strength of decision-making depends on how many iterations of MCTS we perform:
    1. When tree is shallow, we effectively assume that future play is random, which means we will choose options with the greatest number of permutations of winning. We therefore may neglect to defend against an imminent loss, favouring a different move with many win permutations (bad behaviour).
    2. When tree is deep and UCB1 score converges towards true means, at least for the best moves, then we effectively assume that future play is optimal. As play-count goes to infinity, our scores become unbiased.
    3. For finite but sufficient run-time, we assume optimal play, but using mean scores that are biased by the fact that our early simulations used random play instead of optimal play.
- This explains why even random simulation MCTS is pretty good - we end up doing most of our simulations with pretty effective play, at least for the next few moves where our tree is sufficiently grown.
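
The difference between assuming random play and assuming optimal play can be shown with a toy example. The numbers below are invented purely for illustration: each candidate move leads to a set of opponent replies with a known outcome from our point of view. Averaging over the replies mimics a shallow tree evaluated with random rollouts, whereas taking the minimum mimics an opponent who always finds the best reply.

```python
# Outcomes from our point of view (+1 win, 0 draw, -1 loss) for every
# opponent reply to each of our candidate moves. Values are illustrative only.
replies = {
    "attack": [+1, +1, +1, +1, +1, -1],  # many winning lines, one refutation
    "defend": [0, 0, 0, 0, 0, 0],        # every line is a draw
}

def value_vs_random(outcomes):
    """Expected outcome if the opponent replies uniformly at random."""
    return sum(outcomes) / len(outcomes)

def value_vs_optimal(outcomes):
    """Outcome if the opponent picks the reply that is worst for us."""
    return min(outcomes)

best_vs_random = max(replies, key=lambda m: value_vs_random(replies[m]))
best_vs_optimal = max(replies, key=lambda m: value_vs_optimal(replies[m]))
print("Best move assuming random replies: ", best_vs_random)
print("Best move assuming optimal replies:", best_vs_optimal)
```

Under the random-reply assumption the move with many winning permutations looks best even though a single reply refutes it, which is the "bad behaviour" described in observation 1.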


## Imports

In [1]:
###
###  Required imports
###  - all imports should be placed here
###


##  Python core libs
##  - the __future__ import is placed first, as required when this code is run as a plain module
from __future__ import annotations
import sys, time
from enum import IntEnum
from abc  import ABC, abstractmethod

##  PyPI libs
import numpy as np

##  Local packages
from connect4.enums     import BinaryPlayer, DebugLevel, GameResult
from connect4.GameBoard import GameBoard


In [2]:
###
###  Print version for reproducibility
###

print(f"Python version is {sys.version}")
print(f"Numpy  version is {np.__version__}")

Python version is 3.10.8 | packaged by conda-forge | (main, Nov 22 2022, 08:25:29) [Clang 14.0.6 ]
Numpy  version is 1.23.2


##  MCTS

Implement Node class to handle the tree search.

In [3]:
class BaseNode(ABC) :
    
    def __init__(self, game_board:GameBoard, parent:Node=None, params:list=[], shallow_copy_board:bool=False, 
                 label=None) :
        """
        Class BaseNode
        
        - Used as part of MCTS algorithm. 
        - Stores total score and number of visits
        - Stores a list of children and a reference to the parent node
        - Provides methods for node selection, expansion, simulation, backpropagation
        - Abstract base class, requires derived class to implement expansion and simulation policies
        
        Inputs:
        
            > game_board, GameBoard
              state of the game at this node
              
            > parent, Node, default=None
              reference to the parent node, only equals None if this is a root node
              
            > params, list, default=[]
              hyper-parameters to be used in models
              
            > shallow_copy_board, bool, default=False
              whether to only create a shallow copy of the game board - caution: improves memory efficiency 
              but may lead to undefined behaviour if either one of the referenced objects is updated
              
            > label, str, default=None
              label for the node, used when generating summary strings
        """
                
        self.game_board  = game_board if shallow_copy_board else game_board.deep_copy()
        self.actions     = game_board.get_available_actions()
        self.player      = game_board.to_play
        self.is_terminal = len(self.actions) == 0
        self.children    = [None for a_idx in range(len(self.actions))]
        self.parent      = parent
        self.total_score = 0
        self.num_visits  = 0
        self.params      = params
        self.label       = label
        
        
    def __str__(self) -> str :
        """
        Return a string representation of the current node.
        """
        
        ##  Figure out parent / children info
        is_root              = False if self.parent else True
        num_children         = len(self.children)
        num_visited_children = len([c for c in self.children if c])
        
        ##  Begin str with node label if one provided
        ret  = f"[{self.label}] " if self.label else ""
        
        ##  Add some node information
        ret += f"N={self.num_visits}, T={self.total_score}, is_root={is_root}, is_leaf={self.is_terminal}"
        ret += f", num_children={num_children}, num_visited_children={num_visited_children}"
        
        ##  Return str
        return ret
        
        
    def get_best_action(self) -> int :
        """
        Return the optimal action based on the currently stored values.
        """
        
        ##  If this is a terminal node then no actions available
        if self.is_terminal : 
            return None
        
        ##  Find the index of the best child node, and return the corresponding action
        ##  - if no actions evaluated then argmax will return first action by default
        child_scores = [c.get_action_value() if c else -np.inf for c in self.children]
        best_a_idx   = np.argmax(child_scores)
        return self.actions[best_a_idx]
        
        
    def get_action_value(self) -> float :
        """
        Return node score.
        """
        
        ##  If node has not been visited then return -inf
        if self.num_visits == 0 :
            return -np.inf
        
        ##  Otherwise return mean reward per visit
        return self.total_score / self.num_visits
        
        
    @abstractmethod
    def get_expansion_score(self) -> float :
        """
        Returns the score used to expand nodes.
        """
        raise NotImplementedError()
        
        
    @abstractmethod
    def get_simulated_action(self, game_board:GameBoard) -> int :
        """
        Returns the action chosen by the simulation policy in the game state provided.
        """
        raise NotImplementedError()
        
        
    def select_and_expand(self, recurse:bool=False, debug_lvl:DebugLevel=DebugLevel.MUTE) -> Node :
        """
        Select from node children according to tree traversal policy. If next state is None then create a 
        new child and return this.
        
        Inputs:
        
            > recurse, bool, default=False
              whether to recursively iterate through tree until a new leaf node is found.
              
            > debug_lvl, DebugLevel, default=MUTE
              level at which to print debug statements to help understand algorithm behaviour.
        """
        
        ##  If leaf node then nothing to expand
        if self.is_terminal :
            debug_lvl.message(DebugLevel.MEDIUM, f"Leaf node found")
            return self
                
        ##  Uniformly randomly expand from un-visited children
        unvisited_children = [c_idx for c_idx, c in enumerate(self.children) if not c]
        if len(unvisited_children) > 0 :
            a_idx = np.random.choice(unvisited_children)
            new_game_board = self.game_board.deep_copy()
            node_label = f"{self.game_board.to_play}:{self.actions[a_idx]}"
            debug_lvl.message(DebugLevel.MEDIUM, f"Select unvisited action {node_label}")
            new_game_board.apply_action(self.actions[a_idx])
            self.children[a_idx] = self.__class__(new_game_board, parent=self, params=self.params, 
                                                  shallow_copy_board=True, label=node_label)
            return self.children[a_idx]
        
        ##  Otherwise best child is that with highest UCB score
        a_idx = np.argmax([c.get_expansion_score() for c in self.children])
        best_child = self.children[a_idx]
        debug_lvl.message(DebugLevel.MEDIUM, f"Select known action {self.game_board.to_play}:{self.actions[a_idx]}")
        
        ##  If recurse then also select_and_expand from the child node
        if recurse :
            debug_lvl.message(DebugLevel.MEDIUM, "... iterating to next level ...")
            return best_child.select_and_expand(recurse=recurse, debug_lvl=debug_lvl)
        
        ##  Otherwise return selected child
        return best_child
    
    
    def simulate(self, max_turns:int=-1, debug_lvl:DebugLevel=DebugLevel.MUTE) -> GameResult :
        """
        Simulate a game starting from this node.
        Both players choose actions using the simulation policy defined by get_simulated_action.
        
        Inputs:
        
            > max_turns, int, default=-1
              intended limit on the number of moves to play before declaring a drawn game
              (note: not currently enforced by the simulation loop below)
              
            > debug_lvl, DebugLevel, default=MUTE
              level at which to print debug statements to help understand algorithm behaviour.
              
        Returns:
        
            > GameResult
              the result of the simulated game, which update() converts to a score of +1 (win), -1 (loss) or 0 (draw)
        """
        
        ##  Check if game has already been won
        ##  - if so then return score
        ##  - score is -1 if target player has lost, +1 if they've won, and 0 for a draw
        result = self.game_board.get_result()
        if result :
            debug_lvl.message(DebugLevel.MEDIUM, f"Leaf node found with result {result.name}")
            return result
                
        ##  Create copy of game board to play simulation
        simulated_game = self.game_board.deep_copy()
        
        ##  Keep playing moves until one of terminating conditions is reached:
        ##  1. game is won by a player
        ##  2. no further moves are possible, game is considered a draw
        ##  - note: max_turns is not checked in this loop, so simulations always run to the end of the game
        turn_idx, is_terminal, result = 0, False, GameResult.NONE
        trajectory = []
        while not is_terminal :
            turn_idx += 1
            action = self.get_simulated_action(simulated_game)
            trajectory.append(f"{simulated_game.to_play}:{action}")
            simulated_game.apply_action(action)
            result = simulated_game.get_result()
            if result :
                is_terminal = True
                  
        ##  Debug trajectory
        debug_lvl.message(DebugLevel.MEDIUM, f"Simulation ended with result {result.name}")
        debug_lvl.message(DebugLevel.HIGH  , f"Simulated trajectory was: {' '.join(trajectory)}")
                                
        ##  Return score
        return result
    
    
    def simulate_and_backprop(self, max_turns:int=-1, 
                              debug_lvl:DebugLevel=DebugLevel.MUTE) -> None :
        """
        Simulate a game starting from this node. Backpropagate the resulting score up the whole tree.
        
        Inputs:
        
            > max_turns, int, default=-1
              if positive then determines how many moves to play before declaring a drawn game
              
            > debug_lvl, DebugLevel, default=MUTE
              level at which to print debug statements to help understand algorithm behaviour.
        """
        
        ##  Simulate game and obtain instance of GameResult
        result = self.simulate(max_turns=max_turns, debug_lvl=debug_lvl)
        
        ##  Update this node and backprop up the tree
        self.update_and_backprop(result, debug_lvl=debug_lvl)
        
        
    def tree_summary(self, indent_level:int=0) :
        """
        Return a multi-line str summarising every node in the tree.
        """
        
        ##  Summarise this node
        ret = ("     "*indent_level +
               f"> [{indent_level}{f': {self.label}' if self.label else ''}] N={self.num_visits}, T={self.total_score}, " +
               f"E={self.get_expansion_score():.3f}, Q={self.get_action_value():.3f}")
        
        ##  Recursively add the summary of each child node, iterating the indent level to reflect tree depth
        for a, c in zip(self.actions, self.children) :
            if c :
                ret += f"\n{c.tree_summary(indent_level+1)}"
            else :
                ret += "\n" + "     "*(indent_level+1) + "> None"
                
        ##  Return
        return ret
        
        
    def update(self, result:GameResult, debug_lvl:DebugLevel=DebugLevel.MUTE) -> None :
        """
        Update the score and visit counts for this node.
        """
        
        ##  Resolve score for this node given the game result
        ##  - score is from the viewpoint of the parent, since this is the one deciding whether to come here!
        ##  - if no parent exists then this is a ROOT node, and we assign a score of 0. by default
        if self.parent :
            score = result.get_game_score_for_player(self.parent.player)
        else :
            score = 0.
        debug_lvl.message(DebugLevel.MEDIUM, 
              f"Node {self.label} with parent={self.parent.player.name if self.parent else 'NONE'}, N={self.num_visits}, T={self.total_score:.2f} receiving score {score:.2f} for game ending in result {result.name}")
        
        ##  Update total score and number of visits for this node
        self.total_score += score
        self.num_visits  += 1
        
        
    def update_and_backprop(self, result:GameResult, 
                            debug_lvl:DebugLevel=DebugLevel.MUTE) -> None :
        """
        Update the score and visit counts for this node and backprop to all parents.
        """
        
        ##  Update this node
        self.update(result, debug_lvl=debug_lvl)
        
        ##  Recursively update all parent nodes
        if self.parent :
            self.parent.update_and_backprop(result, debug_lvl=debug_lvl)
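
The scoring convention used in `update` (each node is scored from the viewpoint of its parent, because the parent is the player deciding whether to move there) can be sketched on a three-node chain. `TinyNode` is a hypothetical stand-in written only for this illustration; it assumes players are labelled `+1` and `-1` and that the winner is passed as `+1`, `-1`, or `0` for a draw, rather than as a `GameResult`.

```python
class TinyNode:
    """Minimal stand-in for a tree node (hypothetical, for illustration only)."""

    def __init__(self, player, parent=None):
        self.player = player        # player to move at this node: +1 or -1
        self.parent = parent
        self.total_score = 0.0
        self.num_visits = 0

    def update_and_backprop(self, winner):
        """Record a result at this node, then pass it up to every ancestor."""
        if self.parent is None:
            score = 0.0             # root has no parent, so no viewpoint to score from
        elif winner == 0:
            score = 0.0             # draw
        else:
            # Score from the viewpoint of the player who chose to come here
            score = 1.0 if winner == self.parent.player else -1.0
        self.total_score += score
        self.num_visits += 1
        if self.parent is not None:
            self.parent.update_and_backprop(winner)

# Chain of three nodes with alternating players to move
root = TinyNode(player=-1)
child = TinyNode(player=+1, parent=root)
grandchild = TinyNode(player=-1, parent=child)

# A simulation from the grandchild ends in a win for player +1
grandchild.update_and_backprop(winner=+1)
print(grandchild.total_score, child.total_score, root.total_score)
```

The same result therefore contributes scores of opposite sign at alternating depths, which is what lets each node pick the child that is best for the player to move at that node.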
        

In [4]:
class Node(BaseNode) :
    
    def __init__(self, game_board:GameBoard, parent:Node=None, params:list=[2.], shallow_copy_board:bool=False, 
                 label=None) :
        """
        Class Node
        
        - Used as part of MCTS algorithm. 
        - Stores total score and number of visits
        - Stores a list of children and a reference to the parent node
        - Provides methods for node selection, expansion, simulation, backpropagation
        - Derived from BaseNode, implements vanilla UCB1 expansion and uniform-random simulation
        
        Inputs:
        
            > game_board, GameBoard
              state of the game at this node
              
            > parent, Node, default=None
              reference to the parent node, only equals None if this is a root node
              
            > params, list, default=[2.]
              hyper-parameter controlling strength of exploration in UCB search
              
            > shallow_copy_board, bool, default=False
              whether to only create a shallow copy of the game board - caution: improves memory efficiency 
              but may lead to undefined behaviour if either one of the referenced objects is updated
              
            > label, str, default=None
              label for the node, used when generating summary strings
        """
        
        ##  Call BaseNode initialiser with params=[UCB_c]
        super().__init__(game_board, parent, params, shallow_copy_board, label)
        
        
    def get_expansion_score(self) -> float :
        """
        Returns the UCB score of this node
        """
        
        ##  Retrieve value of UCB exploration-strength hyper-param
        UCB_c = self.params[0]
                
        ##  If node is un-visited then the UCB score is infinite
        if UCB_c != 0 and self.num_visits == 0 :
            return np.inf
        
        ##  If node has no parent then no UCB score exists
        if not self.parent :
            return np.nan
        
        ##  Calculate mean score from past games
        mean_score = self.total_score / self.num_visits
        
        ##  Otherwise calculate UCB score
        return mean_score + UCB_c * np.sqrt(np.log(self.parent.num_visits) / self.num_visits)
    
    
    def get_simulated_action(self, game_board:GameBoard) -> int :
        """
        Returns a uniformly random action from those available
        """
        return np.random.choice(game_board.get_unfilled_columns())
        

In [5]:
###
###  Methods for MCTS
###  - Implement methods which interact with the Node class to perform a number of MCTS iterations
###


def one_step_MCTS(root_node, max_turns=-1, debug_lvl=DebugLevel.MUTE) :
    """
    Perform a single MCTS iteration on the root_node provided.
    """
    
    ##  Select and expand from the root node
    leaf_node = root_node.select_and_expand(recurse=True, debug_lvl=debug_lvl)
    
    ##  Simulate and backprop from the selected child
    leaf_node.simulate_and_backprop(max_turns=max_turns, debug_lvl=debug_lvl)
    
    ##  Print updated tree if debug level is HIGH
    debug_lvl.message(DebugLevel.HIGH, f"Updated tree is:\n{root_node.tree_summary()}")
    
    
def multi_step_MCTS(root_node, num_steps, max_turns=-1, debug_lvl=DebugLevel.MUTE) :
    """
    Perform many MCTS iterations on the root_node provided.
    """
    
    ##  Call one_step_MCTS a number of times equal to num_steps
    for idx in range(num_steps) :
        debug_lvl.message(DebugLevel.MEDIUM, f"Running MCTS step {idx}")
        one_step_MCTS(root_node, max_turns=max_turns, debug_lvl=debug_lvl)
        debug_lvl.message(DebugLevel.MEDIUM, f"")
        
        
def timed_MCTS(root_node, duration, max_turns=-1, debug_lvl=DebugLevel.MUTE) :
    """
    Perform MCTS iterations on the root_node until duration (in seconds) has elapsed.
    After this time, MCTS will finish its current iteration, so total execution time is > duration.
    """
    
    ##  Keep calling one_step_MCTS until required duration has elapsed
    start_time   = time.time()
    current_time = start_time
    num_itr = 0
    while current_time - start_time < duration :
        one_step_MCTS(root_node, max_turns=max_turns, debug_lvl=debug_lvl)
        current_time = time.time()
        num_itr += 1
    return num_itr


def get_bot_action(game_board, duration=1, max_turns=-1, debug_lvl=DebugLevel.MUTE) :
    """
    Create a root_node from the current game state, and perform a timed MCTS to choose a move.
    """
    
    ##  Create root node from current game board
    root_node = Node(game_board)
    
    ##  Call timed_MCTS to update tree values 
    num_itr       = timed_MCTS(root_node, duration=duration, max_turns=max_turns, debug_lvl=debug_lvl)
    chosen_action = root_node.get_best_action()
    
    ##  Print debug info
    debug_lvl.message(DebugLevel.HIGH, 
          root_node.tree_summary())
    debug_lvl.message(DebugLevel.LOW, 
          "Action values are:  " + " ".join([f"{x.get_action_value():.2f}".ljust(6) if x else "N/A   " for x in root_node.children]))
    debug_lvl.message(DebugLevel.LOW, 
          "Visit counts are:   " + " ".join([f"{x.num_visits}".ljust(6) if x else "N/A   " for x in root_node.children]))
    debug_lvl.message(DebugLevel.LOW, 
          f"Selecting action {chosen_action}")
        
    ##  Return best action from tree evaluation, the root node, and the number of MCTS iterations executed
    return chosen_action, root_node, num_itr
    
    
def take_move(game_board, my_action, duration=1, max_turns=-1, debug_lvl=DebugLevel.MUTE) :
    """
    Apply a human move.
    Print the game board.
    Use MCTS to find a responding bot move.
    Apply the bot move.
    Print the game board.
    """
    
    ##  Apply the human move.
    print(f"Human takes move {my_action}")
    game_board.apply_action(my_action)
    print(game_board)
    print()
    
    ##  If game has ended then return
    if game_board.get_result() :
        return
    
    ##  Use timed MCTS to obtain a bot action
    bot_action, _, num_itr = get_bot_action(game_board, 
                                            duration=duration, 
                                            max_turns=max_turns, 
                                            debug_lvl=debug_lvl)
    
    ##  Apply the bot move
    print(f"Bot takes move {bot_action} ({num_itr} iterations)")
    game_board.apply_action(bot_action)
    print(game_board)


##  Test MCTS

In [6]:
###
###  Setup a small game
###  - 4x4 grid
###  - line of 3 needed to win
###

##  Create game board
game_board = GameBoard(4, 4, 3)

##  Show initial game board
print(game_board)


+---+---+---+---+
| . | . | . | . |
| . | . | . | . |
| . | . | . | . |
| . | . | . | . |
+---+---+---+---+
| 0 | 1 | 2 | 3 |
+---+---+---+---+
Game result is: NONE


In [7]:
###
###  Play a few initial moves
###  - transitions into a critical state where O player needs to be careful not to 
###    blunder a win for X
###

##  Play moves
game_board.apply_action(1)
game_board.apply_action(2)
game_board.apply_action(1)

##  Show updated game state
print(game_board)


+---+---+---+---+
| . | . | . | . |
| . | . | . | . |
| . | X | . | . |
| . | X | 0 | . |
+---+---+---+---+
| 0 | 1 | 2 | 3 |
+---+---+---+---+
Game result is: NONE


In [8]:
###
###  Perform a few MCTS steps
###  - start from the critical state set up above and run with a HIGH debug level 
###    to show how the tree grows at each iteration
###

##  Create a root node at the current game state
root_node = Node(game_board, label="ROOT")

##  Print the initial value tree (should be a ROOT node with no children)
print("Initial tree:")
print(root_node.tree_summary())
print()

##  Perform several MCTS steps with a HIGH debug level
multi_step_MCTS(root_node, num_steps=10, max_turns=-1, debug_lvl=DebugLevel.HIGH)

##  Print the updated value tree 
print("Updated tree:")
print(root_node.tree_summary())
print()


Initial tree:
> [0: ROOT] N=0, T=0, E=inf, Q=-inf
     > None
     > None
     > None
     > None

Running MCTS step 0
Select unvisited action -1:2
Simulation ended with result X
Simulated trajectory was: 1:1
Node -1:2 with parent=O, N=0, T=0.00 receiving score -1.00 for game ending in result X
Node ROOT with parent=NONE, N=0, T=0.00 receiving score 0.00 for game ending in result X
Updated tree is:
> [0: ROOT] N=1, T=0.0, E=nan, Q=0.000
     > None
     > None
     > [1: -1:2] N=1, T=-1.0, E=-1.000, Q=-1.000
          > None
          > None
          > None
          > None
     > None

Running MCTS step 1
Select unvisited action -1:0
Simulation ended with result X
Simulated trajectory was: 1:2 -1:3 1:1
Node -1:0 with parent=O, N=0, T=0.00 receiving score -1.00 for game ending in result X
Node ROOT with parent=NONE, N=1, T=0.00 receiving score 0.00 for game ending in result X
Updated tree is:
> [0: ROOT] N=2, T=0.0, E=nan, Q=0.000
     > [1: -1:0] N=1, T=-1.0, E=0.665, Q=-1.000
      

In [9]:
###
###  Use MCTS to play a move
###

##  Use MCTS to search for an optimal action
bot_action, _, num_itr = get_bot_action(game_board, 
                                        duration=1, 
                                        debug_lvl=DebugLevel.LOW)
print(f"Bot chooses action {bot_action} after {num_itr} MCTS iterations")

##  Play bot move
game_board.apply_action(bot_action)

##  Show updated game state
print(game_board)


Action values are:  -0.80  -0.61  -0.79  -0.80 
Visit counts are:   143    466    152    143   
Selecting action 1
Bot chooses action 1 after 904 MCTS iterations
+---+---+---+---+
| . | . | . | . |
| . | 0 | . | . |
| . | X | . | . |
| . | X | 0 | . |
+---+---+---+---+
| 0 | 1 | 2 | 3 |
+---+---+---+---+
Game result is: NONE


## Connect 4

Play a game of Connect 4 against our bot!

Just add new calls to `take_move(game_board, column_index, duration)` to play a move in column `column_index`. Turning up the `duration` parameter will improve the bot by allowing it to search for longer.

In [10]:
##  Create a new game

game_board = GameBoard()
print(game_board)


+---+---+---+---+---+---+---+
| . | . | . | . | . | . | . |
| . | . | . | . | . | . | . |
| . | . | . | . | . | . | . |
| . | . | . | . | . | . | . |
| . | . | . | . | . | . | . |
| . | . | . | . | . | . | . |
+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 |
+---+---+---+---+---+---+---+
Game result is: NONE


In [11]:
##  Play a move in column index 3

take_move(game_board, 3, duration=5, max_turns=30)


Human takes move 3
+---+---+---+---+---+---+---+
| . | . | . | . | . | . | . |
| . | . | . | . | . | . | . |
| . | . | . | . | . | . | . |
| . | . | . | . | . | . | . |
| . | . | . | . | . | . | . |
| . | . | . | X | . | . | . |
+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 |
+---+---+---+---+---+---+---+
Game result is: NONE

Bot takes move 2 (1005 iterations)
+---+---+---+---+---+---+---+
| . | . | . | . | . | . | . |
| . | . | . | . | . | . | . |
| . | . | . | . | . | . | . |
| . | . | . | . | . | . | . |
| . | . | . | . | . | . | . |
| . | . | 0 | X | . | . | . |
+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 |
+---+---+---+---+---+---+---+
Game result is: NONE


---

... and so on, we keep calling `take_move` until the game is complete!
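
Instead of adding one cell per move, the repeated calls could be wrapped in a loop. The sketch below is one possible way to do this and is not part of the notebook: `play_until_finished` is a hypothetical helper, and it assumes only what the code above already relies on, namely that `game_board.get_result()` is falsy while the game is in progress and that the function passed as `take_move_fn` behaves like `take_move`. The human's column choice is supplied by a callable so that the loop can be driven by `input` or by a scripted list of moves.

```python
def play_until_finished(game_board, take_move_fn, choose_column, duration=1):
    """Call take_move_fn repeatedly until the board reports a result.

    Hypothetical helper: take_move_fn is expected to behave like take_move
    in this notebook, and choose_column() returns the human's next column.
    Returns the number of rounds (human move plus bot reply) that were played.
    """
    num_rounds = 0
    while not game_board.get_result():
        take_move_fn(game_board, choose_column(), duration=duration)
        num_rounds += 1
    return num_rounds
```

In the notebook this might be called as `play_until_finished(GameBoard(), take_move, lambda: int(input("Column: ")), duration=5)`.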

---

In [12]:
##  Play a move in column index 2

take_move(game_board, 2, duration=5, max_turns=30, debug_lvl=DebugLevel.LOW)


Human takes move 2
+---+---+---+---+---+---+---+
| . | . | . | . | . | . | . |
| . | . | . | . | . | . | . |
| . | . | . | . | . | . | . |
| . | . | . | . | . | . | . |
| . | . | X | . | . | . | . |
| . | . | 0 | X | . | . | . |
+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 |
+---+---+---+---+---+---+---+
Game result is: NONE

Action values are:  -0.45  -0.58  -0.25  -0.18  -0.31  -0.25  -0.27 
Visit counts are:   76     52     165    242    122    165    148   
Selecting action 3
Bot takes move 3 (970 iterations)
+---+---+---+---+---+---+---+
| . | . | . | . | . | . | . |
| . | . | . | . | . | . | . |
| . | . | . | . | . | . | . |
| . | . | . | . | . | . | . |
| . | . | X | 0 | . | . | . |
| . | . | 0 | X | . | . | . |
+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 |
+---+---+---+---+---+---+---+
Game result is: NONE
