# Monte Carlo Tree Search

In this lab, we'll use the game Connect Four as a vehicle for learning Monte Carlo Tree Search.
We'll also introduce concepts, such as state, that will stay with us throughout the course.
Expect to lose to the algorithm by the end of the lab.

## Setup
You won't need to edit this section, but it is worth skimming through: this is where we declare the objects you'll be interacting with throughout the lab.

In [25]:
# imports
import random
from copy import deepcopy # world -> world model
from typing import List, Tuple

In [26]:
# world and world model
class State:
    def __init__(self, cols=7, rows=6, win_req=4):
        self.board = [['.'] * cols for _ in range(rows)]
        self.heights = [0] * cols
        self.num_moves = 0
        self.win_req = win_req
        self.winner = None
        
    def get_actions(self) -> List[int]:
        return [i for i in range(len(self.board[0])) if self.heights[i] < len(self.board)]
    
    def __str__(self):
        header  = " ".join([str(i) for i in range(len(self.board[0]))])
        line    = "".join(["-" for _ in range(len(header))])
        board   = [[e for e in row] for row in self.board]
        board   = '\n'.join([' '.join(row) for row in board])
        return  '\n' + header + '\n' + line + '\n' + board + '\n'
    
    def detect_winner(self):
        pass
        
    def get_action(self, action):
        pass
    
    def __repr__(self):
        return self.__str__()

In [27]:
# parent class for mcts, minmax, human, and any other idea for an agent you have
class Agent:
    def __init__(self, name: str):
        self.name: str = name
    
    def give_action(self, state: State):
        return random.choice(state.get_actions())
    
    def utility(self, state: State):
        pass

In [28]:
# connecting states and agents
class Game:
    def __init__(self, agents: Tuple[Agent, ...]):
        self.agents = agents
        self.state = State()

    def play(self):
        while True:
            for agent in self.agents:
                if not self.state.winner:
                    agent.give_action(self.state)
                    print(self.state)
                    break
            break

## Exercise 0: Run game
Put the `State`, `Agent`, and `Game` classes together so that a game is run.
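One possible shape for the missing pieces is sketched below. It assumes the stubbed `State.get_action` is meant to apply a move and `State.detect_winner` is meant to check for a line of `win_req` pieces; the lab does not say so explicitly. The functions `drop`, `wins`, and `play_random_game` are hypothetical stand-ins that work on a bare board rather than on `State`, so treat this as a sketch to adapt, not the intended solution.

```python
import random

ROWS, COLS, WIN_REQ = 6, 7, 4  # same defaults as State above


def drop(board, heights, col, mark):
    """Place mark in column col and return the row it landed in.

    Row 0 is the top row when the board is printed, so pieces stack
    upwards from the last row.
    """
    row = len(board) - 1 - heights[col]
    board[row][col] = mark
    heights[col] += 1
    return row


def wins(board, row, col, win_req=WIN_REQ):
    """True if the piece at (row, col) is part of a line of win_req equal marks."""
    mark = board[row][col]
    # horizontal, vertical, and the two diagonals
    for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
        count = 1
        for sign in (1, -1):  # walk both ways along the direction
            r, c = row + sign * dr, col + sign * dc
            while 0 <= r < len(board) and 0 <= c < len(board[0]) and board[r][c] == mark:
                count += 1
                r, c = r + sign * dr, c + sign * dc
        if count >= win_req:
            return True
    return False


def play_random_game(seed=0):
    """Two random players alternate until one wins or the board is full."""
    rng = random.Random(seed)
    board = [['.'] * COLS for _ in range(ROWS)]
    heights = [0] * COLS
    marks = ('O', 'X')
    for move in range(ROWS * COLS):
        mark = marks[move % 2]
        legal = [c for c in range(COLS) if heights[c] < ROWS]
        col = rng.choice(legal)
        row = drop(board, heights, col, mark)
        if wins(board, row, col):
            return mark, board
    return None, board  # draw
```

Checking only the lines through the last piece placed is enough, because any new winning line must contain that piece.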

In [16]:
agents = (Agent('O'), Agent('X'))
game = Game(agents)
game.play()


0 1 2 3 4 5 6
-------------
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .



## Exercise 1: Human Agent
Make a child class of `Agent` called `Human`, with the `give_action` method overridden to take input from you. *Hint*: use `int(input())`.
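A bare `int(input())` crashes on a typo and happily accepts a full column. A small helper like the one below, which you could call from `give_action` with `state.get_actions()`, keeps asking until the answer is legal. The name `ask_for_action` and the `read` parameter are assumptions made for this sketch; `read` exists only so the function can be tried without a keyboard.

```python
def ask_for_action(legal_actions, read=input):
    """Keep prompting until the reader returns a legal column."""
    while True:
        raw = read(f"Choose a column from {legal_actions}: ")
        try:
            action = int(raw)
        except ValueError:
            continue  # not a number, ask again
        if action in legal_actions:
            return action
        # a number, but not a legal column: fall through and ask again
```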

In [37]:
# class Human(Agent):
#    def __init__(self, name: str):
#        super().__init__(name)

## Exercise 2: Gekko
Make a child class of `Agent` called `Gekko`, with a `give_action` method that is as short-sighted as you can possibly make it. Here you'll need to edit `give_action`, `utility`, and perhaps some helper functions.
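To make "short-sighted" concrete, here is a minimal sketch of a greedy choice that looks exactly one move ahead. The utility used, the number of own pieces touching the landing cell, is an arbitrary assumption chosen for illustration, and `adjacent_own` and `greedy_action` are hypothetical helpers that work on a bare board and `heights` list like the ones in `State`.

```python
def adjacent_own(board, row, col, mark):
    """Count the cells around (row, col) that already hold mark."""
    count = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) == (0, 0):
                continue
            r, c = row + dr, col + dc
            if 0 <= r < len(board) and 0 <= c < len(board[0]) and board[r][c] == mark:
                count += 1
    return count


def greedy_action(board, heights, mark):
    """Pick the legal column whose landing cell has the highest utility."""
    best_action, best_score = None, -1
    for col in range(len(board[0])):
        if heights[col] >= len(board):
            continue  # column is full
        row = len(board) - 1 - heights[col]  # where the piece would land
        score = adjacent_own(board, row, col, mark)
        if score > best_score:  # ties go to the lowest column index
            best_action, best_score = col, score
    return best_action
```

With three `X` pieces in the bottom row at columns 0 to 2, this agent stacks a piece on column 1 instead of completing the line at column 3, which shows how a purely local utility can miss an immediate win.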

In [41]:
# class Gekko(Agent):
#    def __init__(self, name: str):
#        super().__init__(name)

## Exercise 3: MinMax
Make a MinMax agent, using the minmax heuristic. Have it play against another copy of itself.
Make a version of MinMax that uses Alpha Beta pruning. Sort the action space so as to make the pruning as useful as possible.
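The recursion is easier to see on a game much smaller than Connect Four. The sketch below runs minmax with Alpha Beta pruning on a toy game chosen for this example: players alternately take 1 to 3 stones from a pile, and whoever takes the last stone wins. The function name `alphabeta` and the scoring (+1 for a win of the maximizing player, -1 for a loss) are assumptions of the sketch.

```python
import math


def alphabeta(pile, maximizing, alpha=-math.inf, beta=math.inf):
    """Game value for the maximizing player: +1 for a win, -1 for a loss."""
    if pile == 0:
        # The previous player took the last stone, so the player to move lost.
        return -1 if maximizing else 1
    # Action ordering: larger takes are tried first (an arbitrary choice here).
    actions = [take for take in (3, 2, 1) if take <= pile]
    if maximizing:
        value = -math.inf
        for take in actions:
            value = max(value, alphabeta(pile - take, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimizer would never let play reach this branch
        return value
    value = math.inf
    for take in actions:
        value = min(value, alphabeta(pile - take, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximizer already has a better option elsewhere
    return value
```

Pruning helps most when strong moves are searched first, because the cutoffs then trigger early. For Connect Four, trying the centre columns before the edge columns is a common ordering heuristic.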

In [49]:
# class MinMax(Agent):
#    def __init__(self, name: str):
#        super().__init__(name)

## Exercise 4: MinMax & Alpha Beta pruning
Same, but for MinMax with Alpha Beta pruning. See if you can beat it.

In [50]:
# class AlphaBeta(Agent):
#    def __init__(self, name: str):
#        super().__init__(name)

## Exercise 5: MCTS
Make an agent that uses Monte Carlo Tree Search. See if you can beat it.
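Monte Carlo Tree Search is usually described as a loop of four phases: selection, expansion, simulation (a random playout), and backpropagation of the result. The lab does not prescribe a selection rule; a common choice is UCB1, which balances a child's average result against how rarely it has been tried. The sketch below covers only that selection step, and the names `ucb1` and `select_child` as well as the `(wins, visits)` bookkeeping are assumptions.

```python
import math


def ucb1(child_wins, child_visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: average result plus an exploration bonus."""
    if child_visits == 0:
        return math.inf  # always try an unvisited child first
    exploit = child_wins / child_visits
    explore = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploit + explore


def select_child(stats, parent_visits):
    """stats maps action -> (wins, visits); return the action with the best score."""
    return max(stats, key=lambda action: ucb1(*stats[action], parent_visits))
```

A larger `c` makes the search explore more; a smaller one makes it commit sooner to the moves that have looked good so far.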

In [51]:
# class MCTS(Agent):
#    def __init__(self, name: str):
#        super().__init__(name)

## Exercise 6 (optional): Dynamic Programming
Finally, use dynamic programming to make your AI more efficient, for example by reusing the results of positions you have already evaluated.
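One reading of this exercise, and it is only an assumption, is to cache the value of every position once it has been computed, so that positions reached through different move orders are solved once. The sketch below shows the idea with `functools.lru_cache` on a toy game (take 1 to 3 stones, last stone wins) rather than on Connect Four; `player_to_move_wins` is a hypothetical name.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def player_to_move_wins(pile):
    """True if the player to move can force a win from this pile size."""
    if pile == 0:
        return False  # no stones left: the previous player took the last one
    # A position is winning if some move leaves the opponent in a losing one.
    return any(
        not player_to_move_wins(pile - take)
        for take in (1, 2, 3)
        if take <= pile
    )
```

For Connect Four the cache key would have to be hashable, for instance a tuple of tuples built from `state.board`, since lists cannot be used as dictionary keys.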