## Connect Three 
You previously saw an example of minimax search being used to write an AI that plays perfect noughts and crosses. In this ungraded assignment you have an opportunity to apply this to another game, connect three (a modified version of [connect four](https://en.wikipedia.org/wiki/Connect_Four)).

Players alternate dropping pieces of their colour into any non-full column of a grid. Each piece falls as far down as possible. The objective is to get three of your own pieces in a line: horizontal, vertical, or diagonal. In this version the board is 5 columns wide and 3 rows high. The example below shows a win for the red player.

<img src="images/connect3.png" width=200 />

### Objective
You should write or adapt the minimax algorithm from the noughts and crosses example to work for connect three. The state space is much larger because the board is bigger, so the basic brute-force version of the algorithm is likely to be too slow to play against. You will therefore need to implement some optimisations, for example:
* Alpha-beta pruning – see the noughts and crosses example.
* Cache results for board states – multiple paths can lead to the same state so you shouldn't have to solve these twice. You can use a dictionary to cache results efficiently.
* Optimise for symmetry – the board is symmetric around the centre column, so a position and its mirror image have the same value. You can cut down the number of states you need to solve by storing a single cache entry for both.
* Table lookups – like caching, but the results are saved permanently: you can precompute a certain number of end-game board positions and store the results (e.g. calculate in one cell and then use in another).
* Depth limit – a hard cap on the recursion depth will hugely improve performance at the cost of optimality. If the depth limit is reached you can simply return a default value like 0, or...
* Evaluation functions – rather than returning a default value when you hit the recursion limit, you can use a fast heuristic function to evaluate the strength of a board position for the given player. What you choose is up to you! Be creative!
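
To make the caching and symmetry ideas above more concrete, here is a minimal sketch of a memoised solver. It does not use the provided `Connect` class: the board representation (a tuple of five strings, one per column, listing pieces from the bottom up) and the names `canonical`, `has_won` and `solve` are assumptions made for illustration only. It also leaves out alpha-beta pruning, because combining pruning with a cache takes extra care (a pruned search returns bounds rather than exact values).

```python
from functools import lru_cache

ROWS, COLS, CONNECT = 3, 5, 3
DIRECTIONS = ((1, 0), (0, 1), (1, 1), (1, -1))  # (column step, row step)


def canonical(cols):
    """Pick one representative for a position and its mirror image."""
    return min(cols, cols[::-1])


def has_won(cols, player):
    """True if `player` has CONNECT pieces in a line."""
    for c in range(COLS):
        for r in range(len(cols[c])):
            for dc, dr in DIRECTIONS:
                line = [(c + i * dc, r + i * dr) for i in range(CONNECT)]
                if all(0 <= cc < COLS
                       and 0 <= rr < len(cols[cc])
                       and cols[cc][rr] == player
                       for cc, rr in line):
                    return True
    return False


@lru_cache(maxsize=None)  # the cache: each (position, player) is solved once
def solve(cols, player):
    """Value for `player`, who is about to move: +1 win, 0 draw, -1 loss."""
    opponent = 'o' if player == 'x' else 'x'
    if has_won(cols, opponent):
        return -1
    moves = [c for c in range(COLS) if len(cols[c]) < ROWS]
    if not moves:
        return 0
    best = -1
    for c in moves:
        child = cols[:c] + (cols[c] + player,) + cols[c + 1:]
        # Look up the mirrored-or-not representative so both share one entry.
        best = max(best, -solve(canonical(child), opponent))
        if best == 1:  # cannot do better than a win
            break
    return best


# 'x' has two pieces stacked in column 1 and can complete the column.
print(solve(canonical(("", "xx", "oo", "", "")), "x"))
```

Because strings and tuples are hashable, the position can be used directly as a cache key. If you work with the `numpy` grid instead, you would need to convert it to something hashable first.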

Your objective is to write an AI which is efficient and a strong or even optimal player. Reflect on your results once you are finished and share them in the forum. You can even try upping the game parameters to see if your code will work on the full connect four board! Note that this assignment is **ungraded**, so do not feel the need to implement all of the optimisations listed above.
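
As one possible starting point, the sketch below combines a depth limit, a simple evaluation function and alpha-beta pruning in negamax form. It is self-contained and does not use the `Connect` class: the grid is assumed to be a list of lists with row 0 at the bottom and `' '` for empty cells, and every name (`winning_lines`, `evaluate`, `negamax`, `best_move`, `WIN`) is hypothetical. The heuristic is deliberately crude, so treat it as a placeholder for your own ideas. The board size is taken from the grid and the line length from `connect`, so the same code can be tried on larger boards.

```python
import math

WIN = 1000  # larger than any heuristic score on the boards considered here


def other(player):
    return 'o' if player == 'x' else 'x'


def winning_lines(rows, cols, connect):
    """Every run of `connect` cells, as (row, col) pairs, that fits the board."""
    lines = []
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                cells = [(r + i * dr, c + i * dc) for i in range(connect)]
                if all(0 <= rr < rows and 0 <= cc < cols for rr, cc in cells):
                    lines.append(cells)
    return lines


def evaluate(grid, player, lines):
    """Heuristic: count pieces in lines the other side has not blocked."""
    score = 0
    for line in lines:
        cells = [grid[r][c] for r, c in line]
        if other(player) not in cells:
            score += cells.count(player)
        elif player not in cells:
            score -= cells.count(other(player))
    return score


def legal_moves(grid):
    return [c for c in range(len(grid[0])) if grid[-1][c] == ' ']


def drop_row(grid, col):
    return next(r for r in range(len(grid)) if grid[r][col] == ' ')


def negamax(grid, player, depth, alpha, beta, lines):
    """Score of `grid` for `player`, who is about to move."""
    opponent = other(player)
    if any(all(grid[r][c] == opponent for r, c in line) for line in lines):
        return -WIN  # the previous move won the game
    moves = legal_moves(grid)
    if not moves:
        return 0  # full board, draw
    if depth == 0:
        return evaluate(grid, player, lines)  # depth limit reached
    best = -math.inf
    for col in moves:
        row = drop_row(grid, col)
        grid[row][col] = player
        score = -negamax(grid, opponent, depth - 1, -beta, -alpha, lines)
        grid[row][col] = ' '  # undo the move
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:  # the opponent will avoid this branch
            break
    return best


def best_move(grid, player, depth, connect=3):
    """Column with the best depth-limited score (expects depth >= 1)."""
    lines = winning_lines(len(grid), len(grid[0]), connect)
    best_col, best_score = None, -math.inf
    for col in legal_moves(grid):
        row = drop_row(grid, col)
        grid[row][col] = player
        score = -negamax(grid, other(player), depth - 1,
                         -math.inf, -best_score, lines)
        grid[row][col] = ' '
        if score > best_score:
            best_col, best_score = col, score
    return best_col
```

For example, with `'o'` on the two leftmost bottom cells and `'x'` stacked on top of them, `best_move(grid, 'x', depth=2)` should pick column 2, the only move that stops `'o'` from completing the bottom row.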

### Code details

We provide a `Connect` class that you can use to simulate connect three games. The following cells in this section will walk you through the basic usage of this class by playing a couple of games.

We import the `connect` module and create a Connect-Three environment called `env`. The constructor takes an argument called `verbose`. If `verbose=True`, the `Connect` object will print the grid and progress messages as the game proceeds.

You will need to write all of the supporting code, including the code to allow your AI to play the game against a human (you can use the noughts and crosses example for help). The `Connect` class is designed for a broad range of AI approaches, not just minimax.

The `Connect` object uses the strings `'o'` and `'x'` instead of different disk colours in order to distinguish between the two players. We can specify who should start the game using the `starting_player` argument.

In [2]:
import connect
env = connect.Connect(starting_player='x', verbose=True)

Game has been reset.
[[' ' ' ' ' ' ' ' ' ']
 [' ' ' ' ' ' ' ' ' ']
 [' ' ' ' ' ' ' ' ' ']]



We can interact with the environment using the `act()` method. This method takes an `action` (an integer) as input and computes the response of the environment. An action is defined as the column index that a disk is dropped into. The `act()` method returns the `reward` for player `'o'` and a boolean, indicating whether the game is over (`True`) or not (`False`). 

In [3]:
reward, game_over = env.act(action=2)
print("reward =", reward)
print("game_over =", game_over)

[[' ' ' ' ' ' ' ' ' ']
 [' ' ' ' ' ' ' ' ' ']
 [' ' ' ' 'x' ' ' ' ']]
reward = 0
game_over = False


Because we set `verbose=True` when we created our environment, the grid is printed each time we call the `act()` method.

As expected, the `reward` is 0 and no one has won the game yet (`game_over` is `False`). Let us drop another disk into the same column.

In [4]:
reward, game_over = env.act(action=2)

[[' ' ' ' ' ' ' ' ' ']
 [' ' ' ' 'o' ' ' ' ']
 [' ' ' ' 'x' ' ' ' ']]


We see that the `Connect` environment automatically switches the active player.

The `grid` is stored as a two-dimensional `numpy` array in the `Connect` class and you can easily access it by calling...

In [5]:
current_grid = env.grid
print(current_grid)

[[' ' ' ' 'x' ' ' ' ']
 [' ' ' ' 'o' ' ' ' ']
 [' ' ' ' ' ' ' ' ' ']]


Note that the grid now appears to be "upside down": row 0 of the array is the bottom row of the board, and `numpy` prints row 0 first.
We can also print it the way the `Connect` class prints it by reversing the row order...

In [6]:
print(current_grid[::-1])

[[' ' ' ' ' ' ' ' ' ']
 [' ' ' ' 'o' ' ' ' ']
 [' ' ' ' 'x' ' ' ' ']]


Let's make another move.

In [7]:
reward, game_over = env.act(action=2)

[[' ' ' ' 'x' ' ' ' ']
 [' ' ' ' 'o' ' ' ' ']
 [' ' ' ' 'x' ' ' ' ']]


Let us try to put another disk in the same column with `act(action=2)`. The environment will raise an `IndexError` because that column is already full.

In [8]:
# This cell should throw an IndexError!
env.act(action=2)

IndexError: index 3 is out of bounds for axis 0 with size 3

The attribute `.available_actions` of the `Connect` class contains a `numpy` array of all columns that are not yet full. Checking it before you call `act()` should help you to avoid errors like the one we have just encountered.

In [9]:
print(env.available_actions)

[0 1 3 4]


Note that column index `2` is missing because this column is already full.
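
As a small illustration of how `available_actions` can keep a game loop safe, here is a sketch of a function that plays uniformly random legal moves until the game ends. It relies only on the `reset()`, `act()` and `available_actions` members described in this section. The function name `play_random_game` and the extra guard for a full board are assumptions: this notebook does not show how the environment reports a draw.

```python
import random


def play_random_game(env, seed=None):
    """Play random legal moves until the game ends; return the last reward."""
    rng = random.Random(seed)
    env.reset()
    reward, game_over = 0, False
    while not game_over:
        actions = list(env.available_actions)
        if not actions:
            # Assumption: stop here in case a full board is not
            # reported as game over by the environment.
            break
        action = rng.choice(actions)
        reward, game_over = env.act(action=int(action))
    return reward
```

You could call it as `play_random_game(env, seed=0)` with the `env` created earlier. A random player like this is also a handy baseline opponent for testing your AI.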

Let's keep on playing until some player wins...

In [10]:
reward, game_over = env.act(action=3)
print("reward =", reward, "game_over =", game_over) 
reward, game_over = env.act(action=1)
print("reward =", reward, "game_over =", game_over)
reward, game_over = env.act(action=3)
print("reward =", reward, "game_over =", game_over)
reward, game_over = env.act(action=1)
print("reward =", reward, "game_over =", game_over)
reward, game_over = env.act(action=3)
print("reward =", reward, "game_over =", game_over)

[[' ' ' ' 'x' ' ' ' ']
 [' ' ' ' 'o' ' ' ' ']
 [' ' ' ' 'x' 'o' ' ']]
reward = 0 game_over = False
[[' ' ' ' 'x' ' ' ' ']
 [' ' ' ' 'o' ' ' ' ']
 [' ' 'x' 'x' 'o' ' ']]
reward = 0 game_over = False
[[' ' ' ' 'x' ' ' ' ']
 [' ' ' ' 'o' 'o' ' ']
 [' ' 'x' 'x' 'o' ' ']]
reward = 0 game_over = False
[[' ' ' ' 'x' ' ' ' ']
 [' ' 'x' 'o' 'o' ' ']
 [' ' 'x' 'x' 'o' ' ']]
reward = 0 game_over = False
[[' ' ' ' 'x' 'o' ' ']
 [' ' 'x' 'o' 'o' ' ']
 [' ' 'x' 'x' 'o' ' ']]
Player ' o ' has won the game!
reward = 1 game_over = True


#### Note that the `reward` returned by the `act()` method is the reward for player `'o'`.
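
If your AI plays as `'x'`, you will probably want the reward from its own point of view. The helper below is a sketch that assumes the game is zero-sum, i.e. that a win for `'x'` is reported as `-1`. This notebook only shows the `+1` case, so check that assumption against the `connect` module before relying on it. The name `reward_for` is hypothetical.

```python
def reward_for(player, reward_for_o):
    """Convert the reward reported for 'o' into the reward for `player`."""
    if player not in ('o', 'x'):
        raise ValueError("player must be 'o' or 'x'")
    # Assumption: zero-sum rewards, so 'x' sees the negated value.
    return reward_for_o if player == 'o' else -reward_for_o


print(reward_for('o', 1), reward_for('x', 1))
```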

Finally, you can reset the game using the `reset()` method. This method clears the grid and makes sure that it is the `starting_player`'s turn, as defined earlier.

In [11]:
env.reset()
reward, game_over = env.act(1)

Game has been reset.
[[' ' ' ' ' ' ' ' ' ']
 [' ' ' ' ' ' ' ' ' ']
 [' ' ' ' ' ' ' ' ' ']]
[[' ' ' ' ' ' ' ' ' ']
 [' ' ' ' ' ' ' ' ' ']
 [' ' 'x' ' ' ' ' ' ']]


### Your Solution Here
As a reminder, feel free to modify the existing methods of the `Connect` class or add new ones.