# Adversarial Search: Playing Connect 4


## Instructions

Total Points: Undegraduates 10, graduate students 11

Complete this notebook and submit it. The notebook needs to be a complete project report with your implementation, documentation including a short discussion of how your implementation works and your design choices, and experimental results (e.g., tables and charts with simulation results) with a short discussion of what they mean. Use the provided notebook cells and insert additional code and markdown cells as needed.

## Introduction

You will implement different versions of agents that play Connect 4:

> "Connect 4 is a two-player connection board game, in which the players choose a color and then take turns dropping colored discs into a seven-column, six-row vertically suspended grid. The pieces fall straight down, occupying the lowest available space within the column. The objective of the game is to be the first to form a horizontal, vertical, or diagonal line of four of one's own discs." (see [Connect Four on Wikipedia](https://en.wikipedia.org/wiki/Connect_Four))

## Task 1: Defining the Search Problem [1 point]

Define the components of the search problem:

* Initial state
* Actions
* Transition model
* Goal state


* Initial state
 * Empty board and it is marked as someones turn.
* Actions
 * Either player can play in either column, as long as there isn't 6 pieces in that column.
* Transition model
 * Given a state and an action, the next state is the same board with one more piece on the lowest availble space of the chosen action.
* Goal state
 * Goal States for both players is all states with 4 or more in a diagonal .

How big is the search space?

In [40]:
# Your code/ answer goes here.

__Note:__ The search space for a $6 \times 7$ board is large. You can experiment with smaller boards (the smallest is $4 times \4$) and/or changing the winning rule to connect 3 instead of 4.

## Task 2: Game Environment and Random Agent [2 point]

Use a numpy character array as the board.

In [41]:
import numpy as np

def empty_board(shape=(6, 7)):
    return np.full(shape=shape, fill_value=' ')

print(empty_board())

[[' ' ' ' ' ' ' ' ' ' ' ' ' ']
 [' ' ' ' ' ' ' ' ' ' ' ' ' ']
 [' ' ' ' ' ' ' ' ' ' ' ' ' ']
 [' ' ' ' ' ' ' ' ' ' ' ' ' ']
 [' ' ' ' ' ' ' ' ' ' ' ' ' ']
 [' ' ' ' ' ' ' ' ' ' ' ' ' ']]


Instead of colors for the players use 'x' and 'o' to represent the players. Make sure that your agent functions all have the from: `agent_type(board, player = 'x')`, where board is the current board position and player is the player whose next move it is and who the agent should play.

Implement the board and helper functions for:

* The transition model (result).
* The utility function.
* Check for terminal states.
* A check for available actions.
* A function to visualize the board.

Make sure that all these functions work with boards of different sizes.

Implement an agent that plays randomly and let two random agents play against each other 1000 times. How often does each player win? Is the result expected? 

In [42]:
def move(board, player, move):
    new_board = board.copy()
    for row in reversed(new_board):
        if row[move] == ' ':
            row[move] = player
            return new_board
    return 0

def if_winner(board, player):
    #check if there is a horizonatal winner
    for row in board:
        for x in range(0, len(row) - 3):
            if row[x] == player and row[x + 1] == player and row[x + 2] == player and row[x + 3] == player:
                return True
    #check if there is a vertical winner
    vertical_board = board.copy()
    vertical_board = np.rot90(vertical_board)
    for row in vertical_board:
        for x in range(0, len(row) - 3):
            if row[x] == player and row[x + 1] == player and row[x + 2] == player and row[x + 3] == player:
                return True
    rows = len(board)
    columns = len(board[0])
    #check right diagonal
    for x in range(0,rows - 3):
        for y in range(0,rows - 3):
            if (board[x][y] == player and board[x + 1][y + 1] == player and 
                board[x + 2][y + 2] == player and board[x + 3][y + 3] == player):
                return True
    
    #check left diagonal
    for x in range(3,rows):
        for y in range(0,rows - 3):
            if (board[x][y] == player and board[x - 1][y + 1] == player and 
                board[x - 2][y + 2] == player and board[x - 3][y + 3] == player):
                return True
    return False

def available_move(board):
    options = []
    for i in range(0,len(board[0])):
        if board[0][i] == ' ':
            options.append(i)
    return options



### Utility Function 1

Player Score.
* Any streak of 1 = 1 point
* Any streak of 2 = 2 point
* Any streak of 3 = 6 point

The function returns target players score minus the opponents score

In [100]:
import sys

def unitility_function_1(board, player, move_int):
    #check if there is a horizonatal winner
    board = move(board=board,player=player,move=move_int)
    num_ones_total = 0
    num_twos_total = 0
    num_threes_total = 0
    num_fours_total = 0
    for row in board:
        num_ones, num_twos, num_threes,num_fours = score_array(row, player)
        num_ones_total += num_ones
        num_twos_total += num_twos
        num_threes_total += num_threes
        num_fours_total += num_fours
        
    #check if there is a vertical winner
    vertical_board = board.copy()
    vertical_board = np.rot90(vertical_board)
    for row in vertical_board:
        num_ones, num_twos, num_threes,num_fours = score_array(row, player)
        num_twos_total += num_twos
        num_threes_total += num_threes
        num_fours_total += num_fours
    for i in diagonal_rows(board):
        num_ones, num_twos, num_threes,num_fours = score_array(i, player)
        num_twos_total += num_twos
        num_threes_total += num_threes
        num_fours_total += num_fours
    print(num_ones_total,num_twos_total,num_threes_total)
    if num_fours_total > 0:
        return 2**32
    return num_ones_total + num_twos_total*2 + num_threes_total*6
def diagonal_rows(arr):
    rows = []
    num_rows = len(arr)
    num_cols = len(arr[0])
    arr = np.asarray(arr)
    arr2 = arr.copy()
    arr2 = np.rot90(arr2)
    arr2 = np.rot90(arr2)
    for i in range((-1)*num_cols*num_rows, num_cols,num_rows):
        possible_arr = np.diagonal(arr,i,axis1=0,axis2=1)
        if len(possible_arr) != 0:
            rows.append(possible_arr)
        possible_arr = np.diagonal(arr2,i,axis1=0,axis2=1)
        if len(possible_arr) != 0:
            rows.append(possible_arr)
    return rows
def score_array(array, player):
    arr = array.copy()
    np.insert(arr,obj=0,values=['*'])
    np.insert(arr,obj=len(arr),values=['*'])
    num_ones = 0
    num_twos = 0
    num_threes = 0
    num_fours = 0
    current_run = 0
    for c in arr:
        if c == player:
            current_run += 1
        else: 
            if current_run == 1:
                num_ones += 1
            elif current_run == 2:
                num_twos += 1
            elif current_run == 3:
                num_threes +=1
            elif current_run >= 4:
                num_fours = 1
            current_run = 0
    return (num_ones,num_twos,num_threes,num_fours)


In [101]:
import random
class random_agent:
    def __init__(self,character):
        self.character = character
    
    def act(self, board):
        moves = available_move(board)
        move_int = random.choice(moves)
        return move(board, self.character,move_int)

In [107]:
#Test the utility function 
arr = ['x','x','o',' ',' ','x']
score_array(arr, 'x')
board = empty_board()
agent1 = random_agent('x')
board = np.rot90(board)
for i in range(10):
    board = agent1.act(board)
    
print(board)
print(unitility_function_1(board,'x',0))

[[' ' ' ' ' ' ' ' ' ' ' ']
 [' ' ' ' ' ' ' ' ' ' ' ']
 [' ' ' ' ' ' ' ' ' ' ' ']
 ['x' ' ' ' ' ' ' ' ' ' ']
 ['x' ' ' 'x' 'x' ' ' ' ']
 ['x' ' ' 'x' 'x' ' ' ' ']
 ['x' ' ' 'x' 'x' ' ' ' ']]
5 3 0
11


In [105]:
arr = [[1,2,3,4],[5,6,8,9],[10,11,12,13]]
arr = np.asarray(arr)
arr2 = arr.copy()
arr2 = np.rot90(arr2)
arr2 = np.rot90(arr2)

print(arr)
print(np.diagonal(arr,-1111,axis1=0,axis2=1))
print(np.diagonal(arr2,0,axis1=1,axis2=0))

[[ 1  2  3  4]
 [ 5  6  8  9]
 [10 11 12 13]]
[]
[13  8  2]


In [246]:
def run_randoms(n,shape=(6,7),print_boards=False):
    x_wins = 0
    o_wins = 0
    draws = 0
    for i in range(0,n):
        board = empty_board(shape=shape)
        agent1 = random_agent('x')
        agent2 = random_agent('o')
        winner = ' '
        while winner == ' ':
            board = agent1.act(board)
            if if_winner(board, agent1.character):
                winner = agent1.character
                x_wins = x_wins + 1
            if winner == ' ':
                board = agent2.act(board)
            if if_winner(board, agent2.character):
                winner = agent2.character
                o_wins = o_wins + 1
            if len(available_move(board)) == 0:
                winner = 'd'
                draws = draws + 1
        if print_boards:
            print(board)
    return (x_wins, o_wins, draws)

In [250]:
run_randoms(10000)

(5620, 4344, 51)


## Task 3: Minimax Search with Alpha-Beta Pruning [4 points]

### Implement the search starting from a given board and specifying the player.

In [29]:
# Your code/ answer goes here.

Experiment with some manually created boards (at least 5) to check if the agent spots wining opportunities.

In [6]:
# Your code/ answer goes here.

How long does it take to make a move? Start with a smaller board with 4 columns and make the board larger by adding columns.

In [7]:
# your code/ answer goes here.

### Move ordering

Describe and implement a simple move ordering strategy. How does this strategy influence the time it takes to 
make a move?

In [8]:
# Your code/ answer goes here.

### Playtime

Let the Minimax Search agent play a random agent on a small board. Analyze wins, losses and draws.

In [9]:
# Your code/ answer goes here.

## Task 4: Heuristic Alpha-Beta Tree Search [3 points] 

### Heuristic evaluation function

Define and implement a heuristic evaluation function.

In [10]:
# Your code/ answer goes here.

### Cutting off search 

Modify your Minimax Search with Alpha-Beta Pruning to cut off search at a specified depth and use the heuristic evaluation function. Experiment with different cutoff values.

In [11]:
# Your code/ answer goes here.

Experiment with the same manually created boards as above to check if the agent spots wining opportunities.

In [12]:
# Your code/ answer goes here.

How long does it take to make a move? Start with a smaller board with 4 columns and make the board larger by adding columns.

In [13]:
# Your code/ answer goes here.

### Forward Pruning

Add forward pruning to the cutoff search where you do not consider moves that have a low evaluation value after a shallow search 
(way smaller than the cuttoff value).

In [14]:
# Your code/ answer goes here.

How long does it take to make a move? Start with a smaller board with 4 columns and make the board larger by adding columns.

In [15]:
# Your code/ answer goes here.

### Playtime

Let two heuristic search agents (different cutoff depth, different heuristic evaluation function or different forward pruning) compete against each other on a reasonably sized board. Since there is no randomness, you only need to let them play once.

In [16]:
# Your code/ answer goes here.

## Challenge task [+ 1 bonus point]

Find another student and let your best agent play against the other student's best player. We will set up a class tournament on Canvas. This tournament will continue after the submission deadline.

## Graduate student advanced task: Pure Monte Carlo Search and Best First Move [1 point]

__Undergraduate students:__ This is a bonus task you can attempt if you like [+1 Bonus point].

### Pure Monte Carlos Search

Implement Pure Monte Carlo Search and investigate how this search performs on the test boards that you have used above. 

In [17]:
# Your code/ answer goes here.

### Best First Move

How would you determine what the best first move is? You can use Pure Monte Carlo Search or any algorithms 
that you have implemented above.

In [18]:
# Your code/ answer goes here.