# Adversarial Search: Playing Connect 4


## Instructions

Total Points: Undegraduates 10, graduate students 11

Complete this notebook and submit it. The notebook needs to be a complete project report with your implementation, documentation including a short discussion of how your implementation works and your design choices, and experimental results (e.g., tables and charts with simulation results) with a short discussion of what they mean. Use the provided notebook cells and insert additional code and markdown cells as needed.

## Introduction

You will implement different versions of agents that play Connect 4:

> "Connect 4 is a two-player connection board game, in which the players choose a color and then take turns dropping colored discs into a seven-column, six-row vertically suspended grid. The pieces fall straight down, occupying the lowest available space within the column. The objective of the game is to be the first to form a horizontal, vertical, or diagonal line of four of one's own discs." (see [Connect Four on Wikipedia](https://en.wikipedia.org/wiki/Connect_Four))

## Task 1: Defining the Search Problem [1 point]

Define the components of the search problem:

* Initial state
* Actions
* Transition model
* Goal state


* Initial state
 * Empty board and it is marked as someones turn.
* Actions
 * Either player can play in either column, as long as there isn't 6 pieces in that column.
* Transition model
 * Given a state and an action, the next state is the same board with one more piece on the lowest availble space of the chosen action.
* Goal state
 * Goal States for both players is all states with 4 or more in a diagonal .

How big is the search space?

In [18]:
# Your code/ answer goes here.

__Note:__ The search space for a $6 \times 7$ board is large. You can experiment with smaller boards (the smallest is $4 times \4$) and/or changing the winning rule to connect 3 instead of 4.

## Task 2: Game Environment and Random Agent [2 point]

Use a numpy character array as the board.

Implement the board and helper functions for:

* The transition model (result).
* The utility function.
* Check for terminal states.
* A check for available actions.
* A function to visualize the board.

Make sure that all these functions work with boards of different sizes.

Implement an agent that plays randomly and let two random agents play against each other 1000 times. How often does each player win? Is the result expected? 

In [28]:
import numpy as np
import math
def empty_board(shape=(6, 7)):
    return np.full(shape=shape, fill_value=' ')

def show_board(board):
    for row in board:
        line = '|'
        for char in row:
            line += char +  '|'
        print(line)
    print('_' * (2*len(board[0]) + 1))
show_board(empty_board())

| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
_______________


Instead of colors for the players use 'x' and 'o' to represent the players. Make sure that your agent functions all have the from: `agent_type(board, player = 'x')`, where board is the current board position and player is the player whose next move it is and who the agent should play.

In [20]:
import random
def to_move(board, player, move):
    new_board = board.copy()
    for row in reversed(new_board):
        if row[move] == ' ':
            row[move] = player
            return new_board
    return board
#helper for utility functions
def diagonal_rows(arr):
    rows = []
    num_rows = len(arr)
    num_cols = len(arr[0])
    arr = np.asarray(arr)
    arr2 = arr.copy()
    arr2 = np.fliplr(arr2)
    for i in range((-1)*max(num_cols,num_rows),max(num_cols,num_rows)):
        possible_arr = np.diagonal(arr,i,axis1=0,axis2=1)
        if len(possible_arr) != 0:
            rows.append(possible_arr)
        possible_arr = np.diagonal(arr2,i,axis1=0,axis2=1)
        if len(possible_arr) != 0:
            rows.append(possible_arr)
    return rows
def switch_player(player):
    if player == 'x':
        return 'o'
    else:
        return 'x'

def utility(board, player):
    opponent = switch_player(player)
    #check if there is a horizonatal winner
    for row in board:
        for x in range(0, len(row) - 3):
            if row[x] == player and row[x + 1] == player and row[x + 2] == player and row[x + 3] == player:
                return 1
            if row[x] == opponent and row[x + 1] == opponent and row[x + 2] == opponent and row[x + 3] == opponent:
                return -1
    #check if there is a vertical winner
    vertical_board = board.copy()
    vertical_board = np.rot90(vertical_board)
    for row in vertical_board:
        for x in range(0, len(row) - 3):
            if row[x] == player and row[x + 1] == player and row[x + 2] == player and row[x + 3] == player:
                return 1
            if row[x] == opponent and row[x + 1] == opponent and row[x + 2] == opponent and row[x + 3] == opponent:
                return -1
    rows = len(board)
    columns = len(board[0])
    #check diagonal
    for row in diagonal_rows(board):
        for x in range(0, len(row) - 3):
            if row[x] == player and row[x + 1] == player and row[x + 2] == player and row[x + 3] == player:
                return 1
            if row[x] == opponent and row[x + 1] == opponent and row[x + 2] == opponent and row[x + 3] == opponent:
                return -1
    if len(actions(board))==0:
        return 0
    return None

def actions(board):
    options = []
    for i in range(0,len(board[0])):
        if board[0][i] == ' ':
            options.append(i)
    return options



In [21]:
import random
class random_agent:
    def __init__(self,character):
        self.character = character
    
    def act(self, board):
        moves = actions(board)
        move_int = random.choice(moves)
        return to_move(board, self.character,move_int)

In [22]:

arr = [[1,2,3,4],
       [4,5,6,7],
       [9,10,11,12],
       [13,14,15,16]]
diag = diagonal_rows(arr)
print(diag)


[array([13]), array([16]), array([ 9, 14]), array([12, 15]), array([ 4, 10, 15]), array([ 7, 11, 14]), array([ 1,  5, 11, 16]), array([ 4,  6, 10, 13]), array([ 2,  6, 12]), array([3, 5, 9]), array([3, 7]), array([2, 4]), array([4]), array([1])]


In [23]:
#Testing some of these functions


board = empty_board()
show_board(board)
board = to_move(board,'x',0)
show_board(board)
print(utility(board,'x'))
board = to_move(board,'x',0)
board = to_move(board,'x',0)
board2 = to_move(board,'x',0)
show_board(board2)
print(utility(board2,'x'))


board = to_move(board,'o',1)
board = to_move(board,'o',2)
board = to_move(board,'o',3)
board3 = to_move(board,'o',4)
show_board(board3)
print(utility(board3,'x'))

board = to_move(board,'x',1)
board = to_move(board,'o',2)
board = to_move(board,'x',2)
board = to_move(board,'o',3)
board = to_move(board,'o',3)
board4 = to_move(board,'x',3)
show_board(board4)
print(utility(board4,'x'))

| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
_______________
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
|x| | | | | | |
_______________
None
| | | | | | | |
| | | | | | | |
|x| | | | | | |
|x| | | | | | |
|x| | | | | | |
|x| | | | | | |
_______________
1
| | | | | | | |
| | | | | | | |
| | | | | | | |
|x| | | | | | |
|x| | | | | | |
|x|o|o|o|o| | |
_______________
-1
| | | | | | | |
| | | | | | | |
| | | |x| | | |
|x| |x|o| | | |
|x|x|o|o| | | |
|x|o|o|o| | | |
_______________
1


In [24]:
def run_randoms(n,shape=(6,7),print_boards=False):
    x_wins = 0
    o_wins = 0
    draws = 0
    for i in range(0,n):
        board = empty_board(shape=shape)
        agent1 = random_agent('x')
        agent2 = random_agent('o')
        winner = ' '
        while winner == ' ':
            board = agent1.act(board)
            if utility(board, agent1.character) == 1:
                winner = agent1.character
                x_wins = x_wins + 1
            if winner == ' ':
                board = agent2.act(board)
            if utility(board, agent2.character) == 1:
                winner = agent2.character
                o_wins = o_wins + 1
            if len(actions(board)) == 0:
                winner = 'd'
                draws = draws + 1
        if print_boards:
            print(board)
    return (x_wins, o_wins, draws)

In [25]:
run_randoms(10000)

(5526, 4449, 29)


## Task 3: Minimax Search with Alpha-Beta Pruning [4 points]

### Implement the search starting from a given board and specifying the player.

In [26]:
# global variables
DEBUG = 1 # 1 ... count nodes, 2 ... debug each node
COUNT = 0

def alpha_beta_search(board, player = 'x'):
    """start the search."""
    global DEBUG, COUNT
    COUNT = 0
    
    value, move = max_value_ab(board, player, -math.inf, +math.inf)
    
    if DEBUG >= 1: print(f"Number of nodes searched: {COUNT}") 
    
    return value, move

def max_value_ab(state, player, alpha, beta):
    """player's best move."""
    global DEBUG, COUNT
    COUNT += 1
       
    # return utility of state is a terminal state
    v = utility(state, player)
    if DEBUG >= 2: print("max: " + str(state) + str([alpha, beta, v]) ) 
    if v is not None: return v, None
        
    v, move = -math.inf, None

    # check all possible actions in the state, update alpha and return move with the largest value
    for a in actions(state):
        v2, a2 = min_value_ab(to_move(state, player, a), player, alpha, beta)
        if v2 > v:
            v, move = v2, a
            alpha = max(alpha, v)
        if v >= beta: return v, move
    
    return v, move

def min_value_ab(state, player, alpha, beta):
    """opponent's best response."""
    global DEBUG, COUNT
    COUNT += 1
    
    # return utility of state is a terminal state
    v = utility(state, player)
    if DEBUG >= 2: print("min: " + str(state) + str([alpha, beta, v]) ) 
    if v is not None: return v, None
    
    v, move = +math.inf, None

    # check all possible actions in the state, update beta and return move with the smallest value
    for a in actions(state):
        v2, a2 = max_value_ab(to_move(state, switch_player(player), a), player, alpha, beta)
        if v2 < v:
            v, move = v2, a
            beta = min(beta, v)
        if v <= alpha: return v, move
    
    return v, move

Experiment with some manually created boards (at least 5) to check if the agent spots wining opportunities.

In [29]:
%%time
board = empty_board((5,5))
board[4][0] = 'x'
board[4][1] = 'x'
board[4][2] = 'x'
board[3][0] = 'o'
board[3][1] = 'o'
board[3][2] = 'o'
board[2][0] = 'x'
board[2][1] = 'x'
board[2][2] = 'x'
board[1][0] = 'o'
board[1][1] = 'o'
board[1][2] = 'o'
board[0][0] = 'o'
board[0][1] = 'o'
board[0][2] = 'o'
board[4][4] = 'o'
board[3][4] = 'o'
board[2][4] = 'o'
board[1][4] = 'x'
move = alpha_beta_search(board, player = 'x')
print(move)
move = alpha_beta_search(board, player = 'o')
print(move)
show_board(board)

Number of nodes searched: 6
(1, 3)
Number of nodes searched: 10
(1, 3)
|o|o|o| | |
|o|o|o| |x|
|x|x|x| |o|
|o|o|o| |o|
|x|x|x| |o|
___________
Wall time: 2 ms


In [30]:
%%time
board = empty_board((5,5))
board[4][0] = 'x'
board[4][1] = 'x'
board[4][2] = 'o'
board[3][0] = 'o'
board[3][1] = 'o'
board[3][2] = 'x'
board[2][0] = 'x'
board[2][1] = 'x'
board[2][2] = 'o'
board[4][4] = 'o'
board[3][4] = 'o'
board[2][4] = 'o'
move = alpha_beta_search(board, player = 'x')
print(move)

move = alpha_beta_search(board, player = 'o')
print(move)
show_board(board)

Number of nodes searched: 171286
(0, 4)
Number of nodes searched: 24203
(1, 0)
| | | | | |
| | | | | |
|x|x|o| |o|
|o|o|x| |o|
|x|x|o| |o|
___________
Wall time: 23.8 s


Something looks off about 'o's move.  We are going to investigate this path

In [64]:
%%time
board = empty_board((5,5))
board = empty_board((5,5))
board[4][0] = 'x'
board[4][1] = 'x'
board[4][2] = 'o'
board[3][0] = 'o'
board[3][1] = 'o'
board[3][2] = 'x'
board[2][0] = 'x'
board[2][1] = 'x'
board[2][2] = 'o'
board[4][4] = 'o'
board[3][4] = 'o'
board[2][4] = 'o'
board = to_move(board, 'o',0)
show_board(board)

| | | | | |
|o| | | | |
|x|x|o| |o|
|o|o|x| |o|
|x|x|o| |o|
___________
Wall time: 0 ns


In [65]:
move = alpha_beta_search(board, player = 'x')
print(move)

Number of nodes searched: 23716
(-1, 0)


In [66]:
board = to_move(board, 'x',0)
show_board(board)
move = alpha_beta_search(board, player = 'o')
print(move)

|x| | | | |
|o| | | | |
|x|x|o| |o|
|o|o|x| |o|
|x|x|o| |o|
___________
Number of nodes searched: 1613
(1, 2)


In [67]:
board = to_move(board, 'o',1)
show_board(board)
move = alpha_beta_search(board, player = 'x')
print(move)

|x| | | | |
|o|o| | | |
|x|x|o| |o|
|o|o|x| |o|
|x|x|o| |o|
___________
Number of nodes searched: 1299
(0, 4)


In [68]:
board = to_move(board, 'x',0)
show_board(board)
move = alpha_beta_search(board, player = 'o')
print(move)

|x| | | | |
|o|o| | | |
|x|x|o| |o|
|o|o|x| |o|
|x|x|o| |o|
___________
Number of nodes searched: 85
(1, 1)


In [69]:
board = to_move(board, 'o',1)
show_board(board)
move = alpha_beta_search(board, player = 'x')
print(move)

|x|o| | | |
|o|o| | | |
|x|x|o| |o|
|o|o|x| |o|
|x|x|o| |o|
___________
Number of nodes searched: 64
(-1, 2)


In [70]:
board = to_move(board, 'x',2)
show_board(board)
move = alpha_beta_search(board, player = 'o')
print(move)

|x|o| | | |
|o|o|x| | |
|x|x|o| |o|
|o|o|x| |o|
|x|x|o| |o|
___________
Number of nodes searched: 29
(1, 2)


In [71]:
board = to_move(board, 'o',2)
show_board(board)
move = alpha_beta_search(board, player = 'x')
print(move)

|x|o|o| | |
|o|o|x| | |
|x|x|o| |o|
|o|o|x| |o|
|x|x|o| |o|
___________
Number of nodes searched: 23
(-1, 3)


In [72]:
board = to_move(board, 'x',3)
show_board(board)
move = alpha_beta_search(board, player = 'o')
print(move)

|x|o|o| | |
|o|o|x| | |
|x|x|o| |o|
|o|o|x| |o|
|x|x|o|x|o|
___________
Number of nodes searched: 3
(1, 3)


In [74]:
board = to_move(board, 'o',3)
show_board(board)

|x|o|o| | |
|o|o|x| | |
|x|x|o| |o|
|o|o|x|o|o|
|x|x|o|x|o|
___________


It still won in the end.  We are going to do some other tests.  The problem we are experiencing with MiniMax is wit is a depth first search, and does not find the fastest solution, but a solution.  

In [47]:
%%time
board = empty_board((5,5))
board = to_move(board,'x',2)
board = to_move(board,'o',2)
board = to_move(board,'x',2)
board = to_move(board,'o',2)
board = to_move(board,'x',2)
board = to_move(board,'x',0)
board = to_move(board,'o',1)
board = to_move(board,'x',1)
board = to_move(board,'o',3)
board = to_move(board,'x',3)
board = to_move(board,'o',4)
board = to_move(board,'x',4)
board = to_move(board,'o',3)
move = alpha_beta_search(board, player = 'x')
print(move)
show_board(board)

Number of nodes searched: 3122
(1, 3)
| | |x| | |
| | |o| | |
| | |x|o| |
| |x|o|x|x|
|x|o|x|o|o|
___________
Wall time: 390 ms


How long does it take to make a move? Start with a smaller board with 4 columns and make the board larger by adding columns.

In [51]:
%%time
board = empty_board((4,5))
show_board(board)
move = alpha_beta_search(board, player = 'x')
print(move)

| | | | | |
| | | | | |
| | | | | |
| | | | | |
___________


KeyboardInterrupt: 

### Move ordering

Describe and implement a simple move ordering strategy. How does this strategy influence the time it takes to 
make a move?

In [95]:
def actions_ordering1(board, player='x'):
    options = []
    win_options = []
    for i in range(0,len(board[0])):
        if board[0][i] == ' ':
            potential_board = to_move(board,player,i)
            
            if utility(potential_board, player) != None:
                win_options.append(i)
            else:
                options.append(i)
    print('win',win_options)
    print('other', options)
    return win_options + options

def alpha_beta_search_ordering1(board, player = 'x'):
    """start the search."""
    global DEBUG, COUNT
    COUNT = 0
    
    value, move = max_value_ab(board, player, -math.inf, +math.inf)
    
    if DEBUG >= 1: print(f"Number of nodes searched: {COUNT}") 
    
    return value, move

def max_value_ab_ordering1(state, player, alpha, beta):
    """player's best move."""
    global DEBUG, COUNT
    COUNT += 1
       
    # return utility of state is a terminal state
    v = utility(state, player)
    if DEBUG >= 2: print("max: " + str(state) + str([alpha, beta, v]) ) 
    if v is not None: return v, None
        
    v, move = -math.inf, None

    # check all possible actions in the state, update alpha and return move with the largest value
    for a in actions_ordering1(state, player):
        v2, a2 = min_value_ab(to_move(state, player, a), player, alpha, beta)
        if v2 > v:
            v, move = v2, a
            alpha = max(alpha, v)
        if v >= beta: return v, move
    
    return v, move

def min_value_ab_ordering1(state, player, alpha, beta):
    """opponent's best response."""
    global DEBUG, COUNT
    COUNT += 1
    
    # return utility of state is a terminal state
    v = utility(state, player)
    if DEBUG >= 2: print("min: " + str(state) + str([alpha, beta, v]) ) 
    if v is not None: return v, None
    
    v, move = +math.inf, None

    # check all possible actions in the state, update beta and return move with the smallest value
    for a in actions_ordering1(state,switch_player(player)):
        v2, a2 = max_value_ab(to_move(state, switch_player(player), a), player, alpha, beta)
        if v2 < v:
            v, move = v2, a
            beta = min(beta, v)
        if v <= alpha: return v, move
    
    return v, move

In [55]:
board = empty_board((5,5))
board = to_move(board,'x',2)
board = to_move(board,'o',2)
board = to_move(board,'x',2)
board = to_move(board,'o',2)
board = to_move(board,'x',2)
board = to_move(board,'x',0)
board = to_move(board,'o',1)
board = to_move(board,'x',1)
board = to_move(board,'o',3)
board = to_move(board,'x',3)
board = to_move(board,'o',4)
board = to_move(board,'x',4)
board = to_move(board,'o',3)
show_board(board)
move = alpha_beta_search_ordering1(board, 'x')
print(move)

| | |x| | |
| | |o| | |
| | |x|o| |
| |x|o|x|x|
|x|o|x|o|o|
___________
Number of nodes searched: 3122
(1, 3)


In [69]:
%%time
board = empty_board((5,5))
board = to_move(board,'x',2)
board = to_move(board,'o',2)
board = to_move(board,'x',2)
board = to_move(board,'o',2)
board = to_move(board,'x',2)
board = to_move(board,'o',0)
show_board(board)
move = alpha_beta_search_ordering1(board, player = 'x')
print(move)

move = alpha_beta_search(board, player = 'x')
print(move)

| | |x| | |
| | |o| | |
| | |x| | |
| | |o| | |
|o| |x| | |
___________
Number of nodes searched: 1127129
(1, 1)
Number of nodes searched: 1127129
(1, 1)
Wall time: 4min 29s


This ordering does not seem to reduce the nodes we explore.  We will try another.

In [96]:
def actions_ordering2(board):
    options = []
    win_options = []
    for i in range(0,len(board[0])):
        if board[0][i] == ' ':
            options.append(i)
    order = []
    #get the middle element
    print(options)
    if len(options) % 2 == 1:
        order.append(options[len(options)//2])
        for i in range(1,(len(options)//2)+1):
            order.append(len(options)//2 + i)
            order.append(len(options)//2 - i)
    else:
        for i in range(1,(len(options)//2)+1):
            order.append(len(options)//2 + (i-1))
            order.append(len(options)//2 - i)
    options = [options[i] for i in order]
    return options

def alpha_beta_search_ordering2(board, player = 'x'):
    """start the search."""
    global DEBUG, COUNT
    COUNT = 0
    
    value, move = max_value_ab(board, player, -math.inf, +math.inf)
    
    if DEBUG >= 1: print(f"Number of nodes searched: {COUNT}") 
    
    return value, move

def max_value_ab_ordering2(state, player, alpha, beta):
    """player's best move."""
    global DEBUG, COUNT
    COUNT += 1
       
    # return utility of state is a terminal state
    v = utility(state, player)
    if DEBUG >= 2: print("max: " + str(state) + str([alpha, beta, v]) ) 
    if v is not None: return v, None
        
    v, move = -math.inf, None

    # check all possible actions in the state, update alpha and return move with the largest value
    for a in actions_ordering2(state):
        v2, a2 = min_value_ab(to_move(state, player, a), player, alpha, beta)
        if v2 > v:
            v, move = v2, a
            alpha = max(alpha, v)
        if v >= beta: return v, move
    
    return v, move

def min_value_ab_ordering2(state, player, alpha, beta):
    """opponent's best response."""
    global DEBUG, COUNT
    COUNT += 1
    
    # return utility of state is a terminal state
    v = utility(state, player)
    if DEBUG >= 2: print("min: " + str(state) + str([alpha, beta, v]) ) 
    if v is not None: return v, None
    
    v, move = +math.inf, None

    # check all possible actions in the state, update beta and return move with the smallest value
    for a in actions_ordering2(state):
        v2, a2 = max_value_ab(to_move(state, switch_player(player), a), player, alpha, beta)
        if v2 < v:
            v, move = v2, a
            beta = min(beta, v)
        if v <= alpha: return v, move
    
    return v, move

In [97]:
%%time
board = empty_board((5,5))
board = to_move(board,'x',2)
board = to_move(board,'o',2)
board = to_move(board,'x',2)
board = to_move(board,'o',2)
board = to_move(board,'x',2)
board = to_move(board,'o',0)
show_board(board)
move = alpha_beta_search_ordering2(board, player = 'x')
print(move)


move = alpha_beta_search_ordering1(board, player = 'x')
print(move)

move = alpha_beta_search(board, player = 'x')
print(move)

| | |x| | |
| | |o| | |
| | |x| | |
| | |o| | |
|o| |x| | |
___________
Number of nodes searched: 1127129
(1, 1)
Number of nodes searched: 1127129
(1, 1)
Number of nodes searched: 1127129
(1, 1)
Wall time: 6min 44s


### Playtime

Let the Minimax Search agent play a random agent on a small board. Analyze wins, losses and draws.

In [104]:
class minimax_agent:
    def __init__(self,character):
        self.character = character
    
    def act(self, board):
        win, move_int = minimax_search(board, self.character)
        return to_move(board, self.character,move_int)
board = empty_board((5,5))
board = to_move(board,'x',2)
board = to_move(board,'o',2)
board = to_move(board,'x',2)
board = to_move(board,'o',2)
board = to_move(board,'x',2)
board = to_move(board,'x',0)
board = to_move(board,'o',1)
board = to_move(board,'x',1)
board = to_move(board,'o',3)
board = to_move(board,'x',3)
board = to_move(board,'o',4)
board = to_move(board,'x',4)
board = to_move(board,'o',3)
board = to_move(board,'x',4)
agent2 = minimax_agent('x')
board = agent2.act(board)
show_board(board)

Number of nodes searched: 25994
| | |x| | |
| | |o|x| |
| | |x|o|x|
| |x|o|x|x|
|x|o|x|o|o|
___________


In [106]:
%%time
def run_random_vs_alpha(n,shape=(6,7),print_boards=False):
    x_wins = 0
    o_wins = 0
    draws = 0
    for i in range(0,n):
        board = empty_board(shape=shape)
        agent1 = random_agent('x')
        agent2 = minimax_agent('o')
        winner = ' '
        while winner == ' ':
            board = agent1.act(board)
            if utility(board, agent1.character) == 1:
                winner = agent1.character
                x_wins = x_wins + 1
            if winner == ' ':
                board = agent2.act(board)
            if utility(board, agent2.character) == 1:
                winner = agent2.character
                o_wins = o_wins + 1
            if len(actions(board)) == 0:
                winner = 'd'
                draws = draws + 1
        if print_boards:
            print(board)
    return (x_wins, o_wins, draws)
print(run_random_vs_alpha(1,(4,4)))

Number of nodes searched: 32988322
Number of nodes searched: 2566477
Number of nodes searched: 104029
Number of nodes searched: 3299
Number of nodes searched: 410
Number of nodes searched: 31
Number of nodes searched: 4
Number of nodes searched: 2
(0, 0, 1)
Wall time: 53min 17s


## Task 4: Heuristic Alpha-Beta Tree Search [3 points] 

### Heuristic evaluation function

Define and implement a heuristic evaluation function.

In [10]:
# Your code/ answer goes here.

### Cutting off search 

Modify your Minimax Search with Alpha-Beta Pruning to cut off search at a specified depth and use the heuristic evaluation function. Experiment with different cutoff values.

In [11]:
# Your code/ answer goes here.

Experiment with the same manually created boards as above to check if the agent spots wining opportunities.

In [12]:
# Your code/ answer goes here.

How long does it take to make a move? Start with a smaller board with 4 columns and make the board larger by adding columns.

In [13]:
# Your code/ answer goes here.

### Forward Pruning

Add forward pruning to the cutoff search where you do not consider moves that have a low evaluation value after a shallow search 
(way smaller than the cuttoff value).

In [14]:
# Your code/ answer goes here.

How long does it take to make a move? Start with a smaller board with 4 columns and make the board larger by adding columns.

In [15]:
# Your code/ answer goes here.

### Playtime

Let two heuristic search agents (different cutoff depth, different heuristic evaluation function or different forward pruning) compete against each other on a reasonably sized board. Since there is no randomness, you only need to let them play once.

In [16]:
# Your code/ answer goes here.

## Challenge task [+ 1 bonus point]

Find another student and let your best agent play against the other student's best player. We will set up a class tournament on Canvas. This tournament will continue after the submission deadline.

## Graduate student advanced task: Pure Monte Carlo Search and Best First Move [1 point]

__Undergraduate students:__ This is a bonus task you can attempt if you like [+1 Bonus point].

### Pure Monte Carlos Search

Implement Pure Monte Carlo Search and investigate how this search performs on the test boards that you have used above. 

In [17]:
# Your code/ answer goes here.

### Best First Move

How would you determine what the best first move is? You can use Pure Monte Carlo Search or any algorithms 
that you have implemented above.

In [18]:
# Your code/ answer goes here.

### Utility Function 1

Player Score.
* Any streak of 1 = 1 point
* Any streak of 2 = 2 point
* Any streak of 3 = 6 point

The function returns target players score minus the opponents score

In [100]:
import sys

def unitility_function_1(board, player, move_int):
    #check if there is a horizonatal winner
    board = move(board=board,player=player,move=move_int)
    num_ones_total = 0
    num_twos_total = 0
    num_threes_total = 0
    num_fours_total = 0
    for row in board:
        num_ones, num_twos, num_threes,num_fours = score_array(row, player)
        num_ones_total += num_ones
        num_twos_total += num_twos
        num_threes_total += num_threes
        num_fours_total += num_fours
        
    #check if there is a vertical winner
    vertical_board = board.copy()
    vertical_board = np.rot90(vertical_board)
    for row in vertical_board:
        num_ones, num_twos, num_threes,num_fours = score_array(row, player)
        num_twos_total += num_twos
        num_threes_total += num_threes
        num_fours_total += num_fours
    for i in diagonal_rows(board):
        num_ones, num_twos, num_threes,num_fours = score_array(i, player)
        num_twos_total += num_twos
        num_threes_total += num_threes
        num_fours_total += num_fours
    print(num_ones_total,num_twos_total,num_threes_total)
    if num_fours_total > 0:
        return 2**32
    return num_ones_total + num_twos_total*2 + num_threes_total*6

def score_array(array, player):
    arr = array.copy()
    np.insert(arr,obj=0,values=['*'])
    np.insert(arr,obj=len(arr),values=['*'])
    num_ones = 0
    num_twos = 0
    num_threes = 0
    num_fours = 0
    current_run = 0
    for c in arr:
        if c == player:
            current_run += 1
        else: 
            if current_run == 1:
                num_ones += 1
            elif current_run == 2:
                num_twos += 1
            elif current_run == 3:
                num_threes +=1
            elif current_run >= 4:
                num_fours = 1
            current_run = 0
    return (num_ones,num_twos,num_threes,num_fours)
