# Assignment 1 - Swap Isolation

## Setup

Clone this repository:
`git clone https://github.gatech.edu/omscs6601/assignment_1.git`

The submission scripts depend on the following Python packages: `requests`, `future`, and `nelson`. Install them using the command below:

`pip install -r requirements.txt`

Read [setup.md](./setup.md) for more information on how to effectively manage your git repository and troubleshooting information.

## Overview

This assignment will cover some of the concepts discussed in the Adversarial Search lectures. You will be implementing game playing agents for a variant of the game Isolation.

We are also implementing this through Jupyter Notebook, so you may find it useful to spend some time getting familiar with this software. During the first week of classes, there was an assignment called "Assignment 0" that spends some time going through Python and Jupyter. If you are unfamiliar with either Python or Jupyter, please take some time to go through that assignment! In this assignment's notebook, the first several cells contain information about the game and assignment, followed by the places where we would like you to implement code. At the bottom of the notebook, you can find more information about the extra credit portion.

In addition, pay close attention to the text interspersed throughout the notebook. We try to give you an idea of what functions may be useful and what test cases we would expect you to pass. Also remember that we only grade your **last** submission, so be careful if your agent is not being consistent in its win-rate.

In case we haven't got this into your heads enough: **start early!!** It is entirely possible, and probably likely, that a large part of your next 2 weeks will be devoted to this assignment, but we hope that you learn a lot and that you really enjoy the assignment! Good luck!!

### The Game

The rules of Swap Isolation are simple. There are two players, each with his own game piece, and a 7-by-7 grid of squares. At the beginning of the game, the first player places his piece on any square. The second player follows suit, and places his piece on any one of the available squares. From that point on, the players alternate turns moving their piece like a queen in chess (any number of open squares vertically, horizontally, or diagonally). When the piece is moved, the square that was previously occupied is blocked, and cannot be used for the remainder of the game.

A queen cannot move through blocked squares. She can, however, move into a position occupied by another queen, in which case the other queen now occupies the position that was held by this queen. This is called a 'swap'. You cannot swap back with the other queen if a swap involving both queens' positions already happened on the previous turn. To make this clearer, let's use an example. If Q1 performed a swap with Q2 during Turn 1, where Q1 was in spot A and Q2 in spot B, then on Turn 2, Q2 **cannot** do a swap to take back spot B.

The first player who is unable to move their queen loses.
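
To make the movement rules concrete, here is a minimal sketch of queen-style move generation on a small grid. It is illustrative only and is not the representation used in `isolation.py`: the function name, the `(row, col)` tuples, the `blocked` set, and the `swap_allowed` flag are all assumptions made for this example. It also assumes that a queen cannot slide past the other queen, which the rules above do not state explicitly.

```python
# Illustrative only: this is NOT the representation used in isolation.py.
# Squares are (row, col) tuples; `blocked` is a set of unusable squares.
DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def queen_moves(position, blocked, opponent, size=7, swap_allowed=True):
    """Return the squares reachable from `position` with queen-like movement."""
    moves = []
    for d_row, d_col in DIRECTIONS:
        row, col = position[0] + d_row, position[1] + d_col
        # Slide until we leave the board or hit a blocked square.
        while 0 <= row < size and 0 <= col < size and (row, col) not in blocked:
            if (row, col) == opponent:
                # Moving onto the other queen is a swap. Assumption: we
                # cannot continue sliding past her.
                if swap_allowed:
                    moves.append((row, col))
                break
            moves.append((row, col))
            row, col = row + d_row, col + d_col
    return moves


# Queen at (0, 0), one blocked square at (0, 2), opponent at (2, 2).
print(queen_moves((0, 0), {(0, 2)}, (2, 2)))
```

Passing `swap_allowed=False` models the turn right after a swap, when swapping back is not permitted.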

### The Assignment

Your task is to create an AI that can play and win a game of Swap Isolation. Your AI will be tested against several pre-baked AIs as well as your peers’ AI systems. You will implement your AI in Python 3, using our provided code as a starting point.

In this repository, we provide:

- A class for representing the game state (isolation.py)
- A function for printing the game board (game_as_text() in isolation.py)
- A function for generating legal game states
- A file for running unit tests (player_submission_tests.py)
- A random AI (baseline test) (test_players.py)

Your goal is to implement the following parts of the AI in the class CustomPlayer:

1. Evaluation functions (`OpenMoveEvalFn()` and `CustomEvalFn()`)
2. The minimax algorithm (`minimax()`)
3. Alpha-beta pruning (`alphabeta()`)

Your agent will have a limited amount of time to act each turn (1 second). We will call these functions directly, so **don’t modify** the <u>function names</u> or the <u>parameters</u>. In addition to checking time each turn, we will penalize your agent if it takes more than a few minutes at construction time (for example, if you attempt to load the entire set of possible board states into memory). We have divided the tests into three sections (described in detail in the Grading section below). In total, your submission will be allowed to run for a maximum of <u>30 minutes</u> before being interrupted for the first section. This is increased to <u>120 minutes</u> for the second and third sections.

### Grading

A friendly reminder: please ensure that your submission is in `player_submission.ipynb`. The scripts described in the following section automatically send that file to the servers for processing.

When you have finished implementing your code in `player_submission.ipynb`, run `notebook2script.py`. This script takes your notebook and produces a file called `submission.py`, which is what the submit scripts will be looking for when sending your code to Gradescope. It only exports the cells that have `#export` at the top of them to a Python script so keep this in mind when producing `submission.py`.
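
As a rough illustration of the `#export` filtering described above, the sketch below collects the source of every code cell that begins with `#export`. It is not the actual `notebook2script.py`: the function name is hypothetical, and the only thing it relies on is the standard notebook JSON layout (a top-level `cells` list whose entries have `cell_type` and `source`).

```python
import json


def exported_source(notebook_json):
    """Join the source of code cells whose text starts with '#export'."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", [])
        # Notebook JSON stores source either as one string or a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        if text.lstrip().startswith("#export"):
            chunks.append(text)
    return "\n\n".join(chunks)


example = json.dumps({"cells": [
    {"cell_type": "code", "source": ["#export\n", "x = 1"]},
    {"cell_type": "code", "source": ["y = 2"]},  # no marker, so it is skipped
]})
print(exported_source(example))
```

The practical consequence is that helper code living in a cell without the `#export` marker will not reach `submission.py`.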

To submit your code and have it evaluated for a grade, use `python submit.py` for the first section, `python submit_a.py` for the second section, and `python submit_b.py` for the third section. To enter the tournament, ensure that you have created the required AI.txt.

You can run `python submit.py` once every 30 minutes, and the other two scripts once every 120 minutes. `submit.py` tests `OpenMoveEvalFn` and plays your agent against our RandomPlayer, `submit_a.py` tests against our minimax and alphabeta agents, and `submit_b.py` runs the remaining test cases. You can see all the cases in the rubric below.

Submission policy: Grades will be based on the last submission made per section. (We are running our largest class to date, so we reserve the right to modify these rules depending upon the load on the servers).

The grade you receive for the assignment will be determined as follows:

| Points    | Condition                                |
| --------- | ---------------------------------------- |
| 5 points | You write an evaluation function, OpenMoveEval, which returns the number of moves that the AI can make minus the number of moves the opponent can make, and your evaluation function performs correctly on some sample boards we provide. |
| 30 points | Your AI defeats a random player >= 90% of the time. |
| 20 points | Your AI defeats an agent with OpenMoveEval function that uses minimax to level 2 >= 65% of the time. |
| 20 points | Your AI defeats an agent with OpenMoveEval function that uses alphabeta to level 4 >= 65% of the time. |
| 20 points | Your AI defeats an agent with OpenMoveEval function that uses iterative deepening and alpha-beta pruning >= 65% of the time. |
| 5 points | Your AI defeats an agent with Noah's secret evaluation function that uses iterative deepening and alpha-beta pruning and optimizes various aspects of the game player >= 85% of the time. |

### Jupyter Tips

Hopefully, Assignment 0 got you pretty comfortable with Jupyter, or at the very least addressed the major things that you may run into during this project. That said, Jupyter can take some getting used to, so here is a compilation of things to watch out for, in FAQ style.

**1. My Jupyter notebook does not seem to be starting up or my kernel is not starting correctly.**<br />
Ans: This probably has to do with activating virtual environments. If you followed the setup instructions exactly, then you should activate your conda environment using `conda activate <environment_name>` from the Anaconda Prompt and start Jupyter Notebook from there.

**2. I was running cell xxx when I opened up my notebook again and something or the other seems to have broken.**<br />
Ans: This is one thing that is very different between IDEs like PyCharm and Jupyter Notebook. In Jupyter, every time you open a notebook, you should run all the cells that a cell depends on before running that cell. This goes for cells that are out of order too (if cell 5 depends on values set in cell 4 and 6, you need to run 4 and 6 before 5). Using the "Run All" command and its variants (found in the "Cell" dropdown menu above) should help you when you're in a situation like this.

**3. The value of a variable in one of my cells is not what I expected it to be. What could have happened?** <br />
Ans: You may have run a cell that modifies that variable too many times. Look at the "counter" example in assignment 0. First, try running `counter = 0` and then `counter += 1`. This way, when you print counter, you get counter = 1, right? Now try running `counter += 1` again, and now when you try to print the variable, you see a value of 2. This is similar to the issue from Question 2. The order in which you run the cells does affect the entire program, so be careful.
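
The counter scenario can be reproduced in a plain script. This is a minimal sketch in which each commented line stands in for one notebook cell being executed; the names `first_run` and `second_run` are only there to record the values.

```python
counter = 0            # "cell A": run once

counter += 1           # "cell B": first run
first_run = counter

counter += 1           # "cell B" run again without re-running "cell A"
second_run = counter

print(first_run, second_run)  # prints: 1 2
```

Re-running "cell A" resets the state, which is why "Run All" on a freshly restarted kernel is the most reliable way to get back to a known state.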

## The Code

This file is your main submission, and it is what will be graded. Do not add any classes or functions to this file that are not part of the classes that we want.

In addition, pay close attention to the text interspersed throughout the notebook. We try to give you an idea of what functions may be useful and what test cases we would expect you to pass. Also remember that we only grade your **last** submission, so be careful if your agent is not being consistent in its win-rate.

In case we haven't stressed this enough: **start early!!** It is entirely possible, and probably likely, that a large part of your next 2 weeks will be devoted to this assignment, but we hope that it is worth it and that you really enjoy the assignment! Good luck!

In [None]:
#export
# Discussed at the whiteboard level with: 
# References:
# If you have discussed this assignment at a whiteboard level with anyone or you have consulted external resources
# (not provided by the instruction staff) you should cite such people and resources. Please do so in this cell!

## Importing External Modules

In [None]:
#!/usr/bin/env python
%load_ext autoreload
%autoreload 2
import player_submission_tests as tests
from test_players import HumanPlayer, RandomPlayer
from board_vis import BoardGrid

In [None]:
#export
from isolation import Board, game_as_text
import random
from random import randint
import time
import platform
if platform.system() != 'Windows':
    import resource

## Visualization
Here, we give you a way of visualizing the board history after a game of Swap Isolation has been played. The cell below runs a game that pits you against the random player; afterwards you can look at the move history of this game and see how the game works for yourself.

In [None]:
game = Board(RandomPlayer(), HumanPlayer(), 7, 7)
winner, move_history, termination = game.play_isolation(time_limit=100000000, print_moves=False)

# Catch game timeouts
timed_out = None
if "timed out" in termination:
    timed_out = True
else:
    timed_out = False
    
bg = BoardGrid(game, move_history, timed_out, show_legal_moves=True)
bg.show_board()

# Your code goes below this cell. Hope you have a blast!

## OpenMoveEvalFn
- This is where you write your evaluation function to evaluate the state of the board.
- Test cases the below code is expected to pass: The first 5 points (as specified in the README).
- Useful functions: Python's built-in `len()` function could be of use here.
- Hints: This needs to work for when either your agent or the opponent is making a move. In other words, perspective is important.
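
The perspective hint can be illustrated without touching the `Board` API. The sketch below is an assumption-laden stand-in: `open_move_score`, `active_moves`, and `inactive_moves` are hypothetical names, and the two lists are assumed to hold the legal moves of the player whose turn it is and of the other player. How you obtain those lists from `isolation.py` is up to you.

```python
def open_move_score(active_moves, inactive_moves, my_turn=True):
    """Return (my moves) - (opponent moves) from the AI's point of view.

    `active_moves` belongs to whoever moves next; `my_turn` says whether
    that player is our AI. Both parameter names are hypothetical.
    """
    if my_turn:
        my_moves, opp_moves = active_moves, inactive_moves
    else:
        # The opponent is about to move, so the roles of the lists flip.
        my_moves, opp_moves = inactive_moves, active_moves
    return float(len(my_moves) - len(opp_moves))


print(open_move_score(["a", "b", "c"], ["x"], my_turn=True))   # 2.0
print(open_move_score(["a", "b", "c"], ["x"], my_turn=False))  # -2.0
```

The same position scores differently depending on whose turn it is, which is exactly the kind of mistake the hint warns about.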

In [None]:
#export
class OpenMoveEvalFn:

    def score(self, game, my_turn=True):
        """Score the current game state

        Evaluation function that outputs a score equal to how many
        moves are open for AI player on the board minus how many moves
        are open for Opponent's player on the board.
        Note:
            1. Be very careful while doing opponent's moves. You might end up
               reducing your own moves.
            2. If you think of a better evaluation function, do it in CustomEvalFn below.

            Args:
                game (Board): The board and game state.
                my_turn (bool): True if maximizing player is active.

            Returns:
                float: The current state's score. MyMoves-OppMoves.

            """

        # TODO: finish this function!
        raise NotImplementedError

########## DON'T WRITE ANY CODE OUTSIDE THE FUNCTION! ################
##### CODE BELOW IS USED FOR RUNNING LOCAL TEST DON'T MODIFY IT ######
tests.correctOpenEvalFn(OpenMoveEvalFn)
################ END OF LOCAL TEST CODE SECTION ######################

## About the local test above
If you want to edit the test (which you most definitely can), then edit the source code back in player_submission_tests.py. We want to keep things consistent with this notebook.

## CustomPlayer
- This is where you are given some methods to use later on in the minimax() and alphabeta() methods. You may wish to modify these, and you are free to do so with some restrictions:
    - You need to make sure that if you add any additional parameters, they are also given default values. Do not expect the server-side (i.e. Gradescope) to assign values to those parameters.
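
The default-value restriction looks like this in practice. This is a sketch using a stand-in class: `ExamplePlayer` and the extra parameter `use_iterative_deepening` are hypothetical and only demonstrate the pattern.

```python
class ExamplePlayer:
    """Stand-in for CustomPlayer; only the constructor pattern matters here."""

    def __init__(self, search_depth=3, eval_fn=None, use_iterative_deepening=False):
        # `use_iterative_deepening` is a hypothetical extra parameter. Because
        # it has a default, callers that do not know about it still work.
        self.search_depth = search_depth
        self.eval_fn = eval_fn
        self.use_iterative_deepening = use_iterative_deepening


# A caller that knows nothing about the new parameter can still build a player.
player = ExamplePlayer()
print(player.search_depth, player.use_iterative_deepening)
```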

In [None]:
#export
class CustomPlayer:
    # TODO: finish this class!
    """Player that chooses a move using your evaluation function
    and a minimax algorithm with alpha-beta pruning.
    You must finish and test this player to make sure it properly
    uses minimax and alpha-beta to return a good move."""

    def __init__(self, search_depth=3, eval_fn=OpenMoveEvalFn()):
        """Initializes your player.
        
        if you find yourself with a superior eval function, update the default
        value of `eval_fn` to `CustomEvalFn()`
        
        Args:
            search_depth (int): The depth to which your agent will search
            eval_fn (function): Utility function used by your agent
        """
        self.eval_fn = eval_fn
        self.search_depth = search_depth
    
    def move(self, game, legal_moves, time_left):

        """Called to determine one move by your agent
        
            Note:
                1. Do NOT change the name of this 'move' function. We are going to call
                this function directly.
                2. Change the name of minimax function to alphabeta function when
                required. Here we are talking about 'minimax' function call,
                NOT 'move' function name.
            Args:
                game (Board): The board and game state.
                legal_moves (list): List of legal moves
                time_left (function): Used to determine time left before timeout
                
            Returns:
                tuple: best_move
            """

        best_move, utility = minimax(self, game, time_left, depth=self.search_depth)
        return best_move

    def utility(self, game, my_turn):
        """Can be updated if desired. Not compulsory."""
        # Pass the method's own `my_turn` argument; `maximizing_player` is undefined here.
        return self.eval_fn.score(game, my_turn)

## And now for the meat of the assignment

## Minimax
- This is where you will implement the minimax algorithm. The final output of your minimax should come from this method and this is the only method that Gradescope will call when testing minimax.
- Test cases the below code is expected to pass: The first 30 points (as specified in the README).
- Useful functions: The useful methods will probably all come from isolation.py. A couple of particularly interesting ones could be forecast_move() and your score() method from OpenMoveEvalFn. Remember the double question mark trick from Assignment 0 if you feel you are flipping between files too much!
- Hints: Remember that perspective hint from before? If you did not realize what it meant and made a slight error there, you may catch it here.
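
For reference, here is textbook minimax on a toy game tree, written as a sketch that is independent of `isolation.py`. The tree is a nested list whose leaves are already-evaluated scores; `minimax_value` and `best_child` are hypothetical helper names, and the `(index, value)` return shape only loosely mirrors the `(best_move, val)` pair expected from your function. Depth limits, time checks, and the `Board` API are deliberately left out.

```python
def minimax_value(node, maximizing=True):
    """Return the minimax value of a nested-list game tree."""
    if not isinstance(node, list):
        return node  # leaf: an already-computed evaluation score
    values = [minimax_value(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def best_child(tree):
    """Return (index of best root move, its value) for the maximizing player."""
    # After the root player moves, it is the minimizing player's turn.
    scores = [minimax_value(child, maximizing=False) for child in tree]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores[best_index]


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(best_child(tree))  # (0, 3)
```

Each inner list is minimized (3, 2, 2) and the root takes the maximum, so the first move wins with value 3.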

In [None]:
#export
def minimax(player, game, time_left, depth, my_turn=True):
    """Implementation of the minimax algorithm.
    
    Args:
        player (CustomPlayer): The player instance calling this function.
        game (Board): A board and game state.
        time_left (function): Used to determine time left before timeout
        depth: Used to track how deep you are in the search tree
        my_turn (bool): True if maximizing player is active.
        
    Returns:
        (tuple, int): best_move, val
    """
    # TODO: finish this function!
    raise NotImplementedError
    return best_move, best_val

########## DON'T WRITE ANY CODE OUTSIDE THE FUNCTION! ################
##### CODE BELOW IS USED FOR RUNNING LOCAL TEST DON'T MODIFY IT ######
tests.beatRandom(CustomPlayer)
tests.minimaxTest(CustomPlayer, minimax)
################ END OF LOCAL TEST CODE SECTION ######################

## About the local test above
If you want to edit the test (which you most definitely can), then edit the source code back in player_submission_tests.py. We want to keep things consistent with this notebook.

Notice that we have not really adjusted the beatRandom() method to test a Random player against your agent. How would you adjust that test to see if your custom player can beat a random player?

## AlphaBeta
- This is where you will implement the alphabeta algorithm. The final output of your alphabeta should come from this method and this is the only method that Gradescope will call when testing alphabeta.
- Test cases the below code is expected to pass: Minimax level 2 >= 65% of the time
- Useful functions: The useful methods will probably all come from isolation.py. A couple of particularly interesting ones could be forecast_move() and your score() method from OpenMoveEvalFn. Remember the double question mark trick from Assignment 0 if you feel you are flipping between files too much!
- Hints: Remember a key point of alphabeta: your final answer will always be the same as if you implemented minimax, but you will get that answer a lot faster.
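
The "same answer, less work" property can be seen on a toy tree. The sketch below is generic alpha-beta over a nested list of leaf scores, not a solution against the `Board` API; `alphabeta_value` and the optional `visited` list (used only to count evaluated leaves) are hypothetical names.

```python
def alphabeta_value(node, alpha=float("-inf"), beta=float("inf"),
                    maximizing=True, visited=None):
    """Return the minimax value of a nested-list tree using alpha-beta pruning."""
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)  # record every leaf we actually evaluate
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta_value(child, alpha, beta, False, visited))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimizer above will never allow this branch
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta_value(child, alpha, beta, True, visited))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximizer above already has a better option
    return value


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited = []
print(alphabeta_value(tree, visited=visited), len(visited))  # 3 7
```

Plain minimax would evaluate all 9 leaves of this tree; here the second subtree is cut off after its first leaf, and the result is still 3.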

In [None]:
#export
def alphabeta(player, game, time_left, depth, alpha=float("-inf"), beta=float("inf"), my_turn=True):
    """Implementation of the alphabeta algorithm.
    
    Args:
        player (CustomPlayer): The player instance calling this function.
        game (Board): A board and game state.
        time_left (function): Used to determine time left before timeout
        depth: Used to track how deep you are in the search tree
        alpha (float): Alpha value for pruning
        beta (float): Beta value for pruning
        my_turn (bool): True if maximizing player is active.
        
    Returns:
        (tuple, int): best_move, val
    """
    # TODO: finish this function!
    raise NotImplementedError
    return best_move, val

## About the lack of a local test above
Notice that we do not have any code here. We want you to learn to write your own test cases, so feel free to get creative! You can always create the test in player_submission_tests.py and then run it over here in a manner identical to how local tests have been run so far.

## That does not cover all 100 points though!
- You're right, and that's on purpose. Each of the bullets below tries to walk you through how you may want to think about beating the remaining agents.
    - First up is the alphabeta agent. Vanilla alphabeta (that is, alphabeta with no optimization) may not do so well against this agent. However, any agent that searches deeper with the same algorithm probably has a bigger advantage. You may learn about a method that allows your algorithm to search in such a way that you can find the maximum search depth without running out of time. This will probably come up in class or you can read through the book to find out what you are looking for.
    - Next is the agent with iterative deepening (yes, I know that I just gave away the answer). This one is a little harder to think about, given that you may have used all the tools that you can think of to try to make a "better" agent. But you may have just implemented the evaluation function that was discussed in class. Maybe we can do better - like checking for winning moves and prioritizing those! Or if you are feeling really creative, you can always try editing the "CustomEvalFn" at the bottom of this notebook and come up with an awesome idea of your own.
    - Now to that agent with the evaluation function. I have nothing to say. Unfortunately, the TAs cannot say anything about what it will take to beat this agent and that is so that we can see what you all can come up with. Use everything in your toolbox and within the class rules to defeat it. And as always: good luck and have fun!
    
- Remember that you may want to edit the methods in the cell with the CustomPlayer class to try and implement some of the above. You are certainly free to, as long as you don't change the function signatures (other than that little caveat that was mentioned in the cell above that code).
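
The idea of searching as deep as the clock allows can be sketched generically. In the example below, `search` is a hypothetical callable that runs a fixed-depth search and returns `(best_move, value)`, and `time_left` is assumed to return the remaining time in milliseconds; the notebook only says it is "used to determine time left before timeout", so check the units yourself. The `max_depth` and `safety_margin_ms` values are arbitrary.

```python
def iterative_deepening(search, time_left, max_depth=50, safety_margin_ms=50):
    """Run `search(depth)` at increasing depths while time remains.

    Returns the result of the deepest search that was started with enough
    time on the clock, or None if there was never enough time.
    """
    best = None
    for depth in range(1, max_depth + 1):
        if time_left() < safety_margin_ms:
            break  # keep the answer from the previous, shallower search
        best = search(depth)
    return best


# Toy usage with a fake clock and a fake search (both hypothetical).
remaining = iter([1000, 500, 40])
result = iterative_deepening(lambda depth: ("some_move", depth),
                             lambda: next(remaining))
print(result)  # ('some_move', 2)
```

This sketch only checks the clock between depths. A single deep search can still overrun the limit, so a real agent would also need to check `time_left` inside the search itself.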

## CustomEvalFn
- Go crazy with how you want to design your own evaluation function. The typical rules about how you can and cannot edit the code we have given (namely, the function signature rules) apply here.

In [None]:
#export
class CustomEvalFn:
    def __init__(self):
        pass

    def score(self, game, maximizing_player_turn=True):
        """Score the current game state.
        
        Custom evaluation function that acts however you think it should. This
        is not required but highly encouraged if you want to build the best
        AI possible.
        
        Args:
            game (Board): The board and game state.
            maximizing_player_turn (bool): True if maximizing player is active.
            
        Returns:
            float: The current state's score, based on your own heuristic.
        """

        # TODO: finish this function!
        raise NotImplementedError

### Botfight (Extra Credit)

In addition to the basic assignment, you will have the option to compete against your peers for the glory of being the Fall 2018 AI-Game-Playing champ. We’ll set up a system to pit your AI against others, and we’ll be handing out extra credit for the top players. May the odds be ever in your favor.

If you wish to compete in the tournament, simply include a plaintext file with a description of your agent, titled ‘AI.txt’, while submitting for the third section of tests (submit_b) and your CustomPlayer instance will be enlisted.

If you compete in the AI tournament and your agent finishes in the top 10, you will receive bonus points for this assignment **(bonus points are added to the grade of this assignment, not to your final score)**:

- Best Overall:  12 bonus points added to the assignment score.
- Second Best: 10 bonus points.
- Third Best: 7 bonus points.
- Fourth to Tenth Best: 5 bonus points.