# Adversarial Search: Playing Connect 4


## Instructions

Total Points: Undergraduates 10, graduate students 11

Complete this notebook and submit it. The notebook needs to be a complete project report with your implementation, documentation including a short discussion of how your implementation works and your design choices, and experimental results (e.g., tables and charts with simulation results) with a short discussion of what they mean. Use the provided notebook cells and insert additional code and markdown cells as needed.

## Introduction

You will implement different versions of agents that play Connect 4:

> "Connect 4 is a two-player connection board game, in which the players choose a color and then take turns dropping colored discs into a seven-column, six-row vertically suspended grid. The pieces fall straight down, occupying the lowest available space within the column. The objective of the game is to be the first to form a horizontal, vertical, or diagonal line of four of one's own discs." (see [Connect Four on Wikipedia](https://en.wikipedia.org/wiki/Connect_Four))

## Task 1: Defining the Search Problem [1 point]

Define the components of the search problem:

* Initial state: an empty board.
* Actions: the columns that are not yet full; at the start of the game, all columns {0-6}.
* Transition model: When a player drops a piece, it falls straight down to the lowest available space in the column.
* Goal state: Four of the player's pieces are lined up vertically, horizontally, or diagonally.

Each of the 42 squares can take one of three values (empty, x, or o), which gives an upper bound of 3^42 board configurations. However, many of these configurations cannot occur in a game:
* pieces cannot float above empty squares
* the number of x's and the number of o's never differ by more than one
* both players can't have 4-in-a-row at once, etc.

So the state space contains significantly fewer states than this upper bound:

In [1]:
print(3**42)
print('{:.1e}'.format(3**42))

109418989131512359209
1.1e+20
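
As a rough illustration of how much the first constraint alone tightens this bound: a column without floating pieces is fully described by its fill height k (0 to 6) and the colors of its k pieces, so a single column has 2^0 + 2^1 + ... + 2^6 possible contents. The sketch below computes the resulting bound for seven independent columns. The variable names `columns` and `gravity_bound` are illustrative and not part of the assignment, and the bound still ignores the other two constraints, so it remains an overestimate.

```python
HEIGHT = 6
WIDTH = 7

# A column without floating pieces is a stack of k pieces (k = 0..HEIGHT),
# each of which is either x or o: 2**k possibilities per fill height.
columns = sum(2**k for k in range(HEIGHT + 1))

# Under this constraint alone, the columns are independent of each other.
gravity_bound = columns**WIDTH

print(columns)
print('{:.1e}'.format(gravity_bound))
```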


## Task 2: Game Environment and Random Agent [2 points]

Implement an agent that plays randomly and let two random agents play against each other 1000 times. How often does each player win? Is the result expected?

In [28]:
import sys
import numpy as np
# np.random.seed(0)
DEBUG = 2
HEIGHT = 6
WIDTH = 7

def empty_board(shape=(HEIGHT, WIDTH)):
    return np.full(shape=shape, fill_value=' ')

def check_win(board):
    """check the board and return one of x, o, d (draw), or n (for next move)"""
    
    # TODO: check for four-in-a-row (rows, columns, diagonals)...
    # invent solution or look for libraries

    # check for draw
    if(np.sum(board == ' ') < 1):
        return 'd'
    
    return 'n'

def get_actions(board):
    """Returns non-full columns as a list of indices"""
    top_row = board[0, :]  # use every column so narrower or wider boards work too
    actions = np.where(top_row == ' ')[0].tolist()
    if(DEBUG): print(actions)
    return actions

def result(state, player, action):
    """Add move to the board (modifies state in place and returns it)"""
    height = state.shape[0]
    for i in range(height):
        # scan the column from the bottom row upward for the first empty square
        if(state[height-i-1][action] == ' '):
            state[height-i-1][action] = player
            return state
    raise ValueError('error -> column full')

# these checks call result(), so they run only after it has been defined
if(DEBUG>1):
    board = empty_board()
    for i in range(4): result(board, 'x', 0)
    print(board)
    print('Win? ' + check_win(board))
    print()
    board = empty_board()
    print(board)
    print('Win? ' + check_win(board))
    print()

def is_terminal(state, player = 'x', draw_is_win = True):
    """returns 'win' for a win, 'draw' or None for a draw (depending on
       draw_is_win), None for a loss, and False for non-terminal states.
       Not called in random_player()"""
    if player == 'x': other = 'o'
    else: other = 'x'
    
    goal = check_win(state)        
    if goal == player: return 'win' 
    if goal == 'd': 
        if draw_is_win: return 'draw' 
        else: return None 
    if goal == other: return None  # loss is failure
    return False # continue

if(DEBUG>1):
    print(is_terminal(np.full(shape=[HEIGHT, WIDTH], fill_value='x')))
    print(is_terminal(np.full(shape=[HEIGHT, WIDTH], fill_value='o')))
    print(is_terminal(empty_board()))

def random_player(board, player = None):
    """Simple player that chooses a random unfilled column.
       Argument 'player' is unused."""
    action = np.random.choice(get_actions(board), size=None, replace=False)
    if(DEBUG): print(action)
    return action

def switch_player(player, x, o):
    if player == 'x':
        return 'o', o
    else:
        return 'x', x

def play(x, o, N = 100):
    results = {'x': 0, 'o': 0, 'd': 0}
    for i in range(N):
        board = empty_board()
        player, fun = 'x', x
        
        while True:
            a = fun(board, player)
            board = result(board, player, a)
            
            win = check_win(board)
            if win != 'n':
                results[win] += 1
                break
            
            player, fun = switch_player(player, x, o)   
    return results
    
if(DEBUG<2): # TODO: When satisfied with code, change N to 1000.
    %timeit -n 1 -r 1 display(play(random_player, random_player, N = 50))

[[' ' ' ' ' ' ' ' ' ' ' ' ' ']
 [' ' ' ' ' ' ' ' ' ' ' ' ' ']
 ['x' ' ' ' ' ' ' ' ' ' ' ' ']
 ['x' ' ' ' ' ' ' ' ' ' ' ' ']
 ['x' ' ' ' ' ' ' ' ' ' ' ' ']
 ['x' ' ' ' ' ' ' ' ' ' ' ' ']]
Win? n

[[' ' ' ' ' ' ' ' ' ' ' ' ' ']
 [' ' ' ' ' ' ' ' ' ' ' ' ' ']
 [' ' ' ' ' ' ' ' ' ' ' ' ' ']
 [' ' ' ' ' ' ' ' ' ' ' ' ' ']
 [' ' ' ' ' ' ' ' ' ' ' ' ' ']
 [' ' ' ' ' ' ' ' ' ' ' ' ' ']]
Win? n

draw
draw
False
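
Note that the first board in the output above reports `Win? n` even though it contains four x's in a column, because the four-in-a-row check in `check_win` is still a TODO. One possible way to fill it in is sketched here as a minimal example, not a reference solution. The function name `check_win_sketch` and its parameter `n` are hypothetical; the return codes follow the docstring of `check_win`. It scans from every occupied square in four directions (right, down, and the two downward diagonals), which covers every possible line exactly once.

```python
import numpy as np

def check_win_sketch(board, n=4):
    """Return 'x' or 'o' if that player has n in a row, 'd' if the board
    is full, and 'n' otherwise (same return codes as check_win)."""
    height, width = board.shape
    # directions as (row step, column step): right, down, down-right, down-left
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]
    for r in range(height):
        for c in range(width):
            player = board[r, c]
            if player == ' ':
                continue
            for dr, dc in directions:
                end_r, end_c = r + (n - 1) * dr, c + (n - 1) * dc
                # skip lines that would leave the board
                if not (0 <= end_r < height and 0 <= end_c < width):
                    continue
                if all(board[r + k * dr, c + k * dc] == player for k in range(1, n)):
                    return str(player)
    # no winner: the game is a draw only if no empty square is left
    if np.sum(board == ' ') < 1:
        return 'd'
    return 'n'

board = np.full((6, 7), ' ')
board[2:6, 0] = 'x'             # four x's stacked in column 0
print(check_win_sketch(board))  # x
```

A full scan like this is simple but does redundant work; checking only the lines through the most recently placed piece would be cheaper inside a search.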


* Player 'x' won ??? times and player 'o' won ??? times.
* I (did or did not) expect this result, because ???.

## Task 3: Minimax Search with Alpha-Beta Pruning [4 points]

### Implement the search starting from a given board and specifying the player.

In [3]:
# Your code/ answer goes here.
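
As a starting point, the search could be structured as in the following sketch. It is a minimal, self-contained example under stated assumptions rather than a reference solution: the helper names `winner`, `actions`, `drop`, `other`, and `alphabeta` are hypothetical, utilities are assumed to be +1 (win), 0 (draw), and -1 (loss) from the point of view of `player`, and `drop` copies the board instead of modifying it in place so that sibling branches do not interfere.

```python
import math
import numpy as np

def winner(board, n=4):
    """Return 'x' or 'o' for n in a row, 'd' for a full board, else 'n'."""
    h, w = board.shape
    for r in range(h):
        for c in range(w):
            p = board[r, c]
            if p == ' ':
                continue
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                er, ec = r + (n - 1) * dr, c + (n - 1) * dc
                if 0 <= er < h and 0 <= ec < w and all(
                        board[r + k * dr, c + k * dc] == p for k in range(1, n)):
                    return str(p)
    return 'n' if (board == ' ').any() else 'd'

def actions(board):
    """Columns whose top square is still empty."""
    return [c for c in range(board.shape[1]) if board[0, c] == ' ']

def drop(board, player, col):
    """Return a copy of the board with the player's piece dropped into col."""
    new = board.copy()
    row = max(r for r in range(board.shape[0]) if board[r, col] == ' ')
    new[row, col] = player
    return new

def other(player):
    return 'o' if player == 'x' else 'x'

def alphabeta(board, to_move, player, alpha=-math.inf, beta=math.inf):
    """Return (value, move) from player's point of view."""
    w = winner(board)
    if w != 'n':
        return (0 if w == 'd' else (1 if w == player else -1)), None
    maximizing = (to_move == player)
    best_value = -math.inf if maximizing else math.inf
    best_move = None
    for a in actions(board):
        v, _ = alphabeta(drop(board, to_move, a), other(to_move), player, alpha, beta)
        if (maximizing and v > best_value) or (not maximizing and v < best_value):
            best_value, best_move = v, a
        if maximizing:
            alpha = max(alpha, best_value)
        else:
            beta = min(beta, best_value)
        if alpha >= beta:  # the opponent would never allow this branch
            break
    return best_value, best_move

# Small 4x4 test position: x can complete the bottom row with column 3.
board = np.array([list('    '),
                  list('xox '),
                  list('ooo '),
                  list('xxx ')])
print(alphabeta(board, 'x', 'x'))
```

Without a depth limit this search is only practical on small or partially filled boards, which is why the example uses a 4x4 position.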

Experiment with some manually created boards (at least 5) to check if the agent spots winning opportunities.

In [4]:
# Your code/ answer goes here.

How long does it take to make a move? Start with a smaller board with 4 columns and make the board larger by adding columns.

In [5]:
# Your code/ answer goes here.

### Move ordering

Describe and implement a simple move ordering strategy. How does this strategy influence the time it takes to make a move?

In [6]:
# Your code/ answer goes here.
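
One simple candidate is to try the center columns first, on the assumption that central squares take part in more possible lines of four and therefore tend to produce good moves (and alpha-beta cutoffs) earlier. Whether this actually reduces the search time is something to verify by measurement. The function name `order_center_first` is hypothetical.

```python
def order_center_first(actions, width):
    """Sort column indices by distance from the center column (closest first)."""
    center = (width - 1) / 2
    # sorted() is stable, so columns at equal distance keep their original order
    return sorted(actions, key=lambda c: abs(c - center))

print(order_center_first([0, 1, 2, 3, 4, 5, 6], 7))  # [3, 2, 4, 1, 5, 0, 6]
```

The ordering only changes the sequence in which moves are explored, so the minimax value of a position stays the same.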

### Playtime

Let the Minimax Search agent play a random agent on a small board. Analyze wins, losses and draws.

In [7]:
# Your code/ answer goes here.

## Task 4: Heuristic Alpha-Beta Tree Search [3 points] 

### Heuristic evaluation function

Define and implement a heuristic evaluation function.

In [8]:
# Your code/ answer goes here.
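
One common style of heuristic counts, for every possible line of four squares, how many pieces a player already has in it, provided the opponent has none there. The sketch below is one such example and not a prescribed design: the names `segments` and `evaluate` are hypothetical and the weights are arbitrary choices. If terminal utilities are +1 and -1, the raw score would also need to be scaled into that range before it is mixed with them.

```python
import numpy as np

def segments(board, n=4):
    """Yield every horizontal, vertical, and diagonal run of n squares."""
    height, width = board.shape
    for r in range(height):
        for c in range(width):
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                end_r, end_c = r + (n - 1) * dr, c + (n - 1) * dc
                if 0 <= end_r < height and 0 <= end_c < width:
                    yield [str(board[r + k * dr, c + k * dc]) for k in range(n)]

def evaluate(board, player, weights=(0, 1, 4, 16, 1000)):
    """Sum weights[k] over segments that hold k of the player's pieces and none
    of the opponent's, minus the same sum for the opponent.
    The default weights are arbitrary (an assumption of this sketch)."""
    opponent = 'o' if player == 'x' else 'x'
    score = 0
    for seg in segments(board):
        mine, theirs = seg.count(player), seg.count(opponent)
        if theirs == 0:
            score += weights[mine]
        elif mine == 0:
            score -= weights[theirs]
    return score

board = np.full((6, 7), ' ')
board[5, 3] = 'x'  # a single piece in the bottom center
print(evaluate(board, 'x'), evaluate(board, 'o'))
```

Segments that contain pieces of both players are ignored, since neither player can complete them.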

### Cutting off search 

Modify your Minimax Search with Alpha-Beta Pruning to cut off search at a specified depth and use the heuristic evaluation function. Experiment with different cutoff values.

In [9]:
# Your code/ answer goes here.
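
The modification can be as small as one extra test: after the terminal check, return the heuristic value once the remaining depth reaches zero. The sketch below shows this under stated assumptions; all names (`segments`, `winner`, `heuristic`, `cutoff_search`, and the helpers) are hypothetical, terminal utilities are +1, 0, and -1, and the simple heuristic is scaled so that it always stays strictly between -1 and +1 for non-terminal boards.

```python
import math
import numpy as np

DIRECTIONS = ((0, 1), (1, 0), (1, 1), (1, -1))

def other(player):
    return 'o' if player == 'x' else 'x'

def segments(board, n=4):
    """Yield every horizontal, vertical, and diagonal run of n squares."""
    h, w = board.shape
    for r in range(h):
        for c in range(w):
            for dr, dc in DIRECTIONS:
                if 0 <= r + (n - 1) * dr < h and 0 <= c + (n - 1) * dc < w:
                    yield [str(board[r + k * dr, c + k * dc]) for k in range(n)]

def winner(board):
    for seg in segments(board):
        if seg[0] != ' ' and seg.count(seg[0]) == len(seg):
            return seg[0]
    return 'n' if (board == ' ').any() else 'd'

def heuristic(board, player):
    """Squared piece counts in uncontested segments, scaled into (-1, 1)."""
    segs = list(segments(board))
    score = 0
    for seg in segs:
        mine, theirs = seg.count(player), seg.count(other(player))
        if theirs == 0:
            score += mine ** 2
        elif mine == 0:
            score -= theirs ** 2
    return score / (16 * max(1, len(segs)))

def actions(board):
    return [c for c in range(board.shape[1]) if board[0, c] == ' ']

def drop(board, player, col):
    new = board.copy()
    row = max(r for r in range(board.shape[0]) if board[r, col] == ' ')
    new[row, col] = player
    return new

def cutoff_search(board, to_move, player, depth, alpha=-math.inf, beta=math.inf):
    """Alpha-beta search that stops after `depth` plies; returns (value, move)."""
    w = winner(board)
    if w != 'n':
        return (0 if w == 'd' else (1 if w == player else -1)), None
    if depth == 0:  # cutoff: estimate the value instead of searching deeper
        return heuristic(board, player), None
    maximizing = (to_move == player)
    best_value, best_move = (-math.inf if maximizing else math.inf), None
    for a in actions(board):
        v, _ = cutoff_search(drop(board, to_move, a), other(to_move),
                             player, depth - 1, alpha, beta)
        if (maximizing and v > best_value) or (not maximizing and v < best_value):
            best_value, best_move = v, a
        if maximizing:
            alpha = max(alpha, best_value)
        else:
            beta = min(beta, best_value)
        if alpha >= beta:
            break
    return best_value, best_move

print(cutoff_search(np.full((6, 7), ' '), 'x', 'x', depth=2))
```

Checking for terminal states before the depth test means that a win found exactly at the cutoff depth is still reported with its true utility.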

Experiment with the same manually created boards as above to check if the agent spots winning opportunities.

In [10]:
# Your code/ answer goes here.

How long does it take to make a move? Start with a smaller board with 4 columns and make the board larger by adding columns.

In [11]:
# Your code/ answer goes here.

### Forward Pruning

Add forward pruning to the cutoff search: do not consider moves that have a low evaluation value after a shallow search (with a depth much smaller than the cutoff depth).

In [12]:
# Your code/ answer goes here.

How long does it take to make a move? Start with a smaller board with 4 columns and make the board larger by adding columns.

In [13]:
# Your code/ answer goes here.

### Playtime

Let two heuristic search agents (different cutoff depth, different heuristic evaluation function or different forward pruning) compete against each other on a reasonably sized board. Since there is no randomness, you only need to let them play once.

In [14]:
# Your code/ answer goes here.

## Challenge task [+ 1 bonus point]

Find another student and let your best agent play against the other student's best agent. We will set up a class tournament on Canvas. This tournament will continue after the submission deadline.

## Graduate student advanced task: Pure Monte Carlo Search and Best First Move [1 point]

__Undergraduate students:__ This is a bonus task you can attempt if you like [+1 Bonus point].

### Pure Monte Carlo Search

Implement Pure Monte Carlo Search and investigate how this search performs on the test boards that you have used above. 

In [15]:
# Your code/ answer goes here.
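
In its simplest form, Pure Monte Carlo Search plays a fixed number of uniformly random games (playouts) after each available move and picks the move with the best average outcome. The sketch below illustrates this idea under stated assumptions: all function names are hypothetical, outcomes are scored +1, 0, and -1 for win, draw, and loss, and `n_playouts=20` is an arbitrary default that would need tuning.

```python
import numpy as np

def other(player):
    return 'o' if player == 'x' else 'x'

def winner(board, n=4):
    """Return 'x' or 'o' for n in a row, 'd' for a full board, else 'n'."""
    h, w = board.shape
    for r in range(h):
        for c in range(w):
            p = board[r, c]
            if p == ' ':
                continue
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                er, ec = r + (n - 1) * dr, c + (n - 1) * dc
                if 0 <= er < h and 0 <= ec < w and all(
                        board[r + k * dr, c + k * dc] == p for k in range(1, n)):
                    return str(p)
    return 'n' if (board == ' ').any() else 'd'

def actions(board):
    return [c for c in range(board.shape[1]) if board[0, c] == ' ']

def drop(board, player, col):
    """Return a copy of the board with the player's piece dropped into col."""
    new = board.copy()
    row = max(r for r in range(board.shape[0]) if board[r, col] == ' ')
    new[row, col] = player
    return new

def playout(board, to_move, rng):
    """Play uniformly random moves until the game ends; return 'x', 'o' or 'd'."""
    while True:
        w = winner(board)
        if w != 'n':
            return w
        board = drop(board, to_move, rng.choice(actions(board)))
        to_move = other(to_move)

def pure_monte_carlo(board, player, n_playouts=20, rng=None):
    """Return (best move, dict of average playout outcomes per move)."""
    if rng is None:
        rng = np.random.default_rng()
    scores = {}
    for a in actions(board):
        child = drop(board, player, a)
        total = 0
        for _ in range(n_playouts):
            w = playout(child, other(player), rng)
            total += 1 if w == player else (-1 if w == other(player) else 0)
        scores[a] = total / n_playouts
    return max(scores, key=scores.get), scores

board = np.array([list('    '),
                  list('xox '),
                  list('ooo '),
                  list('xxx ')])
move, scores = pure_monte_carlo(board, 'x', rng=np.random.default_rng(0))
print(move, scores)
```

Because the playouts are random, the scores are noisy estimates; more playouts per move reduce the noise at the cost of time.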

### Best First Move

How would you determine what the best first move is? You can use Pure Monte Carlo Search or any algorithm that you have implemented above.

In [16]:
# Your code/ answer goes here.