# Misagh Mohaghegh - 810199484

Artificial Intelligence CA3: *Game*  
In this assignment, the game of Sim (a pencil game) is played by two agents.  
One agent uses minimax with alpha-beta pruning to choose its moves, while the other agent plays randomly.

## Players

The provided starter code has been extended.  
The imports come first, followed by a few type aliases.  
A line is identified by the numbers of its two endpoint vertices.

In [101]:
import math
import random
import turtle
import time
import copy
from enum import Enum


Color = str | tuple[float, float, float]
Point = tuple[float, float]
Line = tuple[int, int]  # a line is a pair of vertex numbers

A `Player` enum is defined with the invert operator (`~`) overloaded to return the other player.  
The `color` method returns the `Color` that corresponds to the player.

In [102]:
class Player(Enum):
    NIL = 0
    RED = 1
    BLU = 2

    def color(self) -> Color:
        if self == Player.RED:
            return 'red'
        if self == Player.BLU:
            return 'blue'
        return 'black'

    def __invert__(self):
        if self == Player.RED:
            return Player.BLU
        if self == Player.BLU:
            return Player.RED
        return Player.NIL
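
As a small usage sketch of the overloaded operator, the snippet below repeats a trimmed copy of the enum (without `color`) so that it runs on its own. It only illustrates how `~` is meant to be used for swapping turns.

```python
from enum import Enum


class Player(Enum):
    NIL = 0
    RED = 1
    BLU = 2

    def __invert__(self):
        # Swap RED and BLU; NIL has no opponent and maps to itself.
        if self is Player.RED:
            return Player.BLU
        if self is Player.BLU:
            return Player.RED
        return Player.NIL


print(~Player.RED)   # Player.BLU
print(~~Player.RED)  # Player.RED
print(~Player.NIL)   # Player.NIL
```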

## GUI

The `Gui` class has methods for drawing the pencil game.  

In [103]:
class Gui:
    def __init__(self, width: int, height: int, title: str = ''):
        turtle.setup(width, height)
        turtle.title(title)
        turtle.setworldcoordinates(-1.5, -1.5, 1.5, 1.5)
        turtle.tracer(0, 0)
        turtle.hideturtle()

    def draw_dot(self, dot: Point, color: Color):
        turtle.up()
        turtle.goto(dot)
        turtle.color(color)
        turtle.dot(15)

    def draw_line(self, p1: Point, p2: Point, color: Color, pensize: int):
        turtle.up()
        turtle.pensize(pensize)
        turtle.goto(p1)
        turtle.down()
        turtle.color(color)
        turtle.goto(p2)

    def update(self):
        turtle.update()

    def clear(self):
        turtle.clear()

## The Main Game Class

### Constructor

Next, the `Sim` class is defined, which runs the game.  
Its fields are described below:

- `minimax_depth`: The maximum depth to which the minimax tree is searched.
- `prune`: Whether alpha-beta pruning is applied.
- `vertices`: The number of vertices in the game.
---
- `turn`: The player whose turn it currently is.
- `reds`: All of the lines that red has drawn.
- `blues`: All of the lines that blue has drawn.
- `available_moves`: All of the lines that are yet to be drawn.
---
- `gui`: The `Gui` instance used to show the game, or `None` when the constructor's `gui` flag is false.
- `gui_dots`: List of the vertices' points in the GUI.
- `frame_delay`: How many seconds to wait after each drawn line.

In [104]:
class Sim:
    def __init__(self, minimax_depth: int, prune: bool, vertices: int, gui: bool, frame_delay: float = 1):
        self.minimax_depth: int = minimax_depth
        self.prune: bool = prune
        self.vertices: int = vertices

        self.turn: Player = Player.RED
        self.reds: list[Line] = []
        self.blues: list[Line] = []
        self.available_moves: list[Line] = []

        self.gui = Gui(600, 600, 'Game of SIM - 810199484') if gui else None
        self.gui_dots: list[Point] = []
        self.frame_delay: float = frame_delay

### GUI Methods

The methods of the `Sim` class will now be explained:

- `generate_dots`: This will generate a list of points that the `Gui` can draw.
- `draw_line`: Draws a line from a dot to another.
- `draw`: Draws the dots saved in `gui_dots` (created by `generate_dots`) and the lines drawn by each player in that player's color. It then sleeps for `frame_delay` seconds.

In [105]:
def generate_dots(self) -> list[Point]:
    # Spread the vertices evenly on the unit circle. Float division keeps the
    # number of dots exact even when `vertices` does not divide 360.
    step = 360 / self.vertices
    dots: list[Point] = []
    for i in range(self.vertices):
        radians = math.radians(i * step)
        dots.append((math.cos(radians), math.sin(radians)))
    return dots


def draw_line(self, line: Line, color: Color, pensize: int = 3):
    # A line stores two vertex numbers; map each one to its angle on the circle.
    step = 360 / self.vertices
    rad1, rad2 = (math.radians(line[0] * step), math.radians(line[1] * step))
    self.gui.draw_line((math.cos(rad1), math.sin(rad1)),
                       (math.cos(rad2), math.sin(rad2)),
                       color, pensize)


def draw(self):
    if self.gui is None:
        return

    for dot in self.gui_dots:
        self.gui.draw_dot(dot, 'dark gray')

    for red in self.reds:
        self.draw_line(red, 'red')
    for blue in self.blues:
        self.draw_line(blue, 'blue')

    self.gui.update()
    time.sleep(self.frame_delay)


Sim.generate_dots = generate_dots
Sim.draw_line = draw_line
Sim.draw = draw
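
The mapping from a vertex number to a point on the circle can be sketched on its own, without `turtle`. The helper names `vertex_point` and `line_endpoints` are hypothetical and are not part of the `Sim` class; the sketch assumes the vertices are spread evenly on the unit circle starting at angle 0, as in `generate_dots`.

```python
import math


def vertex_point(index: int, vertices: int) -> tuple[float, float]:
    """Position of vertex `index` when `vertices` points are spread on the unit circle."""
    angle = 2 * math.pi * index / vertices
    return (math.cos(angle), math.sin(angle))


def line_endpoints(line: tuple[int, int], vertices: int):
    """Map a line (a pair of vertex numbers) to its two drawing coordinates."""
    return vertex_point(line[0], vertices), vertex_point(line[1], vertices)


# Vertices 0 and 3 of a hexagon are opposite each other.
start, end = line_endpoints((0, 3), 6)
print([round(c, 3) for c in start])  # [1.0, 0.0]
print([round(c, 3) for c in end])    # [-1.0, 0.0]
```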

### Turns and Triangle Condition Methods

- `random_move`: Chooses a random line to draw.
- `swap_turn`: Used to swap the player turns.
- `check_triangle`: Checks whether a triangle is formed within `lines` and returns the 3 lines that make it, or `None` if there is none.

In [106]:
def random_move(self) -> Line:
    return random.choice(self.available_moves)

def swap_turn(self):
    self.turn = ~self.turn

def check_triangle(self, lines: list[Line]) -> tuple[Line, Line, Line] | None:
    for i in range(len(lines) - 2):
        for j in range(i + 1, len(lines) - 1):
            for k in range(j + 1, len(lines)):
                unique_dots = {*lines[i], *lines[j], *lines[k]}
                if len(unique_dots) == 3:
                    return (lines[i], lines[j], lines[k])
    return None

Sim.random_move = random_move
Sim.swap_turn = swap_turn
Sim.check_triangle = check_triangle
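
The triangle test works because three distinct lines that touch exactly three vertices must be the three sides of a triangle. A minimal standalone sketch of the same idea, using `itertools.combinations` instead of three nested index loops, is shown below. The function name `find_triangle` is hypothetical, and the sketch assumes every line appears at most once in the list.

```python
from itertools import combinations
from typing import Optional

Line = tuple[int, int]


def find_triangle(lines: list[Line]) -> Optional[tuple[Line, Line, Line]]:
    """Return the first three lines that share exactly three endpoints, if any."""
    for trio in combinations(lines, 3):
        endpoints = {vertex for line in trio for vertex in line}
        if len(endpoints) == 3:
            return trio
    return None


print(find_triangle([(0, 1), (1, 2), (0, 2)]))  # ((0, 1), (1, 2), (0, 2))
print(find_triangle([(0, 1), (1, 2), (2, 3)]))  # None (a path uses four endpoints)
print(find_triangle([(0, 1), (0, 2), (0, 3)]))  # None (a star uses four endpoints)
```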

### Game Methods

- `initialize`:  
  Initializes the game. The `reds` and `blues` line lists are cleared and `available_moves` are reset.  
  A random player is chosen as the first player and the dots are drawn on the GUI.

- `play`:  
  Calls `initialize`, and then, while there are moves left in `available_moves`, the players take turns, with red playing `minimax` and blue playing `random_move`.  
  Whichever move they choose is added to their lists and removed from the available list.  
  The GUI is updated and a check for gameover happens.  
  Returns the winner of the game after it is over.

- `gameover`:  
  Checks if the game is over (one of the players has drawn a triangle and lost), highlights the triangle on the GUI, and returns the winner (`Player.NIL` if no triangle exists yet).

In [107]:
def initialize(self):
    self.available_moves.clear()
    self.reds.clear()
    self.blues.clear()

    for i in range(self.vertices):
        for j in range(i + 1, self.vertices):
            self.available_moves.append((i, j))

    self.turn = random.choice((Player.RED, Player.BLU))

    self.gui_dots = self.generate_dots()
    if self.gui is not None:
        self.gui.clear()

    self.draw()


def play(self) -> Player:
    self.initialize()
    while self.available_moves:
        if self.turn == Player.RED:
            selection = self.minimax(self.minimax_depth, self.turn)[0]
            self.reds.append(selection)
        else:
            selection = self.random_move()
            self.blues.append(selection)

        self.available_moves.remove(selection)
        self.swap_turn()
        self.draw()

        winner = self.gameover()
        if winner != Player.NIL:
            return winner
    return Player.NIL


def gameover(self) -> Player:
    def draw_lines(lines: list[Line], color: Color, pensize: int):
        if self.gui is None:
            return None
        for line in lines:
            self.draw_line(line, color, pensize)
        self.gui.update()
        time.sleep(self.frame_delay)

    def check_triangle_creator() -> tuple[Player, tuple[Line, Line, Line] | None]:
        if len(self.reds) < 3 and len(self.blues) < 3:
            return Player.NIL, None
        three_lines = self.check_triangle(self.reds)
        if three_lines:
            return Player.RED, three_lines
        three_lines = self.check_triangle(self.blues)
        if three_lines:
            return Player.BLU, three_lines
        return Player.NIL, None

    loser, lines = check_triangle_creator()
    if not lines:
        return Player.NIL
    draw_lines(lines, loser.color(), 7)
    return ~loser


Sim.initialize = initialize
Sim.play = play
Sim.gameover = gameover

### Minimax Method

This is the main part of the alpha-beta minimax algorithm.

`minimax` is a recursive function that runs the algorithm.  
It takes 4 parameters:

- `depth`: The current depth (which decreases every time we go deeper into the tree)
- `turn`: Specifies which player's turn it is.
- `alpha`: The best (highest) score found so far for the maximizing player along the path to the current node.
- `beta`: The best (lowest) score found so far for the minimizing player along the path to the current node.

If the `prune` flag is false, alpha and beta are still updated but never cut off a branch.

At each call, it is first checked whether the player who just moved has completed a triangle. If red has, `-math.inf` is returned because red has lost; if blue has, `+math.inf` is returned because red has won.  
If the depth limit of the tree is reached and the game along that path is not over yet, the `evaluate` function scores the current state.  
Otherwise, every available move is tried in turn: it is added to the current player's line list, removed from the available moves, and minimax is called recursively for the other player. The move is then undone.  
Minimax returns a tuple of the chosen move, its score, and the depth at which that score was found.  
Red keeps the move with the maximum score and blue the one with the minimum. If 2 moves have the same score, the one whose score was found deeper in the tree takes priority. This way, if the agent is going to lose, it loses later rather than sooner; since the other agent plays randomly, red still has a chance of winning.  

In [108]:
def minimax(self, depth: int, turn: Player, alpha: float = -math.inf, beta: float = math.inf) -> tuple[Line | None, float, int]:
    if turn == Player.BLU:
        if self.check_triangle(self.reds):
            return None, -math.inf, depth
    if turn == Player.RED:
        if self.check_triangle(self.blues):
            return None, +math.inf, depth

    if depth <= 0:
        return None, self.evaluate(turn), depth

    optimal_move = None
    score_depth = -1

    if turn == Player.RED:
        score_max = -math.inf
        for move in copy.deepcopy(self.available_moves):
            self.reds.append(move)
            self.available_moves.remove(move)

            _, score, ret_depth = self.minimax(depth - 1, ~turn, alpha, beta)
            ret_depth = self.minimax_depth - ret_depth

            self.reds.remove(move)
            self.available_moves.append(move)

            if score == math.inf:
                return move, score, ret_depth

            if score > score_max:
                optimal_move = move
                score_max = score
                score_depth = ret_depth
                alpha = max(alpha, score_max)
                if self.prune and score_max >= beta:
                    break
            elif score == score_max:
                if ret_depth > score_depth:
                    optimal_move = move
                    score_depth = ret_depth

        return optimal_move, score_max, score_depth

    if turn == Player.BLU:
        score_min = math.inf
        for move in copy.deepcopy(self.available_moves):
            self.blues.append(move)
            self.available_moves.remove(move)

            _, score, ret_depth = self.minimax(depth - 1, ~turn, alpha, beta)
            ret_depth = self.minimax_depth - ret_depth

            self.blues.remove(move)
            self.available_moves.append(move)

            if score == -math.inf:
                return move, score, ret_depth

            if score < score_min:
                optimal_move = move
                score_min = score
                score_depth = ret_depth
                beta = min(beta, score_min)
                if self.prune and score_min <= alpha:
                    break
            elif score == score_min:
                if ret_depth > score_depth:
                    optimal_move = move
                    score_depth = ret_depth

        return optimal_move, score_min, score_depth

Sim.minimax = minimax
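
To isolate the pruning rule from the game logic, here is a minimal sketch of alpha-beta minimax on a hypothetical toy tree written as nested lists. It is not the `Sim.minimax` method: there are no moves, no depth limit, and no tie-breaking, and the names `alphabeta`, `tree`, and `counter` are illustrative only. It shows that pruning leaves the value at the root unchanged while evaluating fewer leaves.

```python
import math


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, prune=True, counter=None):
    """Minimax over a nested-list tree whose leaves are numbers.

    `counter` is an optional one-element list used to count evaluated leaves.
    """
    if not isinstance(node, list):
        if counter is not None:
            counter[0] += 1
        return node

    best = -math.inf if maximizing else math.inf
    for child in node:
        value = alphabeta(child, not maximizing, alpha, beta, prune, counter)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if prune and alpha >= beta:
            break  # the remaining siblings cannot change the result at the root
    return best


# Hypothetical toy tree: a MAX root with three MIN children.
tree = [[3, 5, 6], [2, 9, 8], [1, 4, 7]]

for prune in (False, True):
    visited = [0]
    value = alphabeta(tree, True, prune=prune, counter=visited)
    print(f'prune={prune}: value={value}, leaves visited={visited[0]}')
# prune=False: value=3, leaves visited=9
# prune=True: value=3, leaves visited=5
```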

### Evaluation Method

The `evaluate` function scores the current state.  
It iterates over the available moves: if a move would complete a triangle for the player to move, the score is lowered by 10; otherwise it is raised by 1. The result is negated for blue, so the score is always from red's point of view.  
Also adding score for moves with which the other player would lose makes the evaluation more precise, but it costs a lot of time and did not change the results much, so it is commented out.

In [109]:
def evaluate(self, turn: Player) -> int:
    score_multiplier = 1 if turn == Player.RED else -1
    turn_lines = self.reds if turn == Player.RED else self.blues
    other_lines = self.blues if turn == Player.RED else self.reds
    score = 0
    for x in self.available_moves:
        if self.check_triangle(turn_lines + [x]):
            score -= 10
        # elif self.check_triangle(other_lines + [x]):
        #     score += 10
        else:
            score += 1
    return score * score_multiplier

Sim.evaluate = evaluate
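
The scoring rule can be tried on a small hand-made position. The sketch below is a standalone variant, not the `evaluate` method itself: the helper names `closes_triangle` and `heuristic` are hypothetical, it looks up the two other sides of a triangle directly instead of calling `check_triangle`, and it assumes the player's existing lines do not already contain a triangle.

```python
Line = tuple[int, int]


def closes_triangle(lines: list[Line], new_line: Line) -> bool:
    """True if `new_line` completes a triangle with two lines already in `lines`."""
    a, b = new_line
    owned = {frozenset(line) for line in lines}
    candidates = {vertex for line in lines for vertex in line} - {a, b}
    return any(frozenset((a, c)) in owned and frozenset((b, c)) in owned
               for c in candidates)


def heuristic(own: list[Line], available: list[Line]) -> int:
    """-10 for every available move that would lose the game, +1 for every safe one."""
    score = 0
    for move in available:
        score += -10 if closes_triangle(own, move) else 1
    return score


# The player owns 0-1 and 1-2, so drawing 0-2 would lose; the other two moves are safe.
print(heuristic([(0, 1), (1, 2)], [(0, 2), (3, 4), (0, 3)]))  # -8
```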

### Example Run and Timing

In [113]:
DEPTH = 3
GUI = False
PRUNE = True
LOOPS = 100

game = Sim(
    minimax_depth=DEPTH,
    prune=PRUNE,
    vertices=6,
    gui=GUI,
    frame_delay=0.5
)
results = {Player.RED: 0, Player.BLU: 0, Player.NIL: 0}

time_start = time.time()
for _ in range(LOOPS):
    winner = game.play()
    results[winner] += 1
time_end = time.time()

print(f'Time taken: {time_end - time_start:.3f} seconds')
print(f'Average time per game: {(time_end - time_start) / LOOPS:.3f} seconds')
print(f'Total red wins: {results[Player.RED]}')
print(f'Total blue wins: {results[Player.BLU]}')

Time taken: 1.228 seconds
Average time per game: 0.012 seconds
Total red wins: 96
Total blue wins: 4
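
The run above prints only red and blue wins. For 6 vertices a draw cannot occur, because every way of coloring all 15 lines with two colors contains a single-colored triangle. The brute-force check below is a sketch that verifies this independently of the `Sim` class; all names in it are illustrative.

```python
from itertools import combinations, product

vertices = range(6)
edges = list(combinations(vertices, 2))      # the 15 possible lines
triangles = list(combinations(vertices, 3))  # the 20 possible triangles
edge_index = {edge: i for i, edge in enumerate(edges)}

# For each triangle, the positions of its three sides in `edges`.
triangle_edges = [
    tuple(edge_index[pair] for pair in combinations(tri, 2))
    for tri in triangles
]


def has_mono_triangle(coloring) -> bool:
    """True if some triangle has all three sides in the same color."""
    return any(coloring[i] == coloring[j] == coloring[k]
               for i, j, k in triangle_edges)


# Try all 2**15 red/blue colorings of the complete graph on 6 vertices.
draw_free = all(has_mono_triangle(c) for c in product((0, 1), repeat=len(edges)))
print(len(edges), len(triangles), draw_free)  # 15 20 True
```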


## Questions

1. *What makes a good heuristic? Why was the current heuristic chosen?*  
  A good heuristic for the evaluation function gives a score that closely tracks the real outcome of the state.  
  It should take the most important properties of the state into account and predict the result with high probability, while staying cheap enough to compute at every leaf of the search.  
  The chosen heuristic relies on the game-over condition, which is forming a triangle.  
  In an arbitrary state, some lines are yet to be drawn and are still available.  
  For each of them, if drawing it would complete a triangle for us, the score drops by 10; otherwise 1 is added to the score. Rewarding lines that would complete a triangle for the opponent was also tried, but it is disabled in the final code because of its time cost.  
  This means that among candidate states, the one that leaves us with more safe moves takes priority.

2. *The effects of the minimax depth:*  
  A larger depth means more states are actually explored, so the search reaches further down the tree and closer to the final outcomes. This makes the evaluation more precise and raises our chances of winning.  
  It also increases the running time, since the number of explored states grows rapidly with the depth.

3. *The order of exploring children when pruning:*  
  Without pruning, the search order changes nothing, because every state is visited regardless of the result of any branch.  
  With pruning, the order changes the number of visited states, although the minimax value stays the same as without pruning.  
  The number of visited states changes because one order may prune nothing, while another may trigger cutoffs after the first few nodes.  
  Fewer visited states translate directly into less running time.  
  In this implementation, no specific ordering is used.
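
The effect of child ordering in question 3 can be illustrated on a hypothetical toy tree. The sketch below is independent of the `Sim` class; `count_leaves`, `good_order`, and `bad_order` are illustrative names. Both trees hold the same subtrees and differ only in the order in which the root's children are visited.

```python
import math


def count_leaves(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Alpha-beta over a nested-list tree; returns (value, leaves evaluated)."""
    if not isinstance(node, list):
        return node, 1

    best = -math.inf if maximizing else math.inf
    visited = 0
    for child in node:
        value, seen = count_leaves(child, not maximizing, alpha, beta)
        visited += seen
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:
            break  # cutoff: the remaining children are skipped
    return best, visited


good_order = [[3, 5, 6], [2, 9, 8], [1, 4, 7]]  # best child of the MAX root first
bad_order = [[1, 4, 7], [2, 9, 8], [3, 5, 6]]   # best child of the MAX root last

print(count_leaves(good_order, True))  # (3, 5)
print(count_leaves(bad_order, True))   # (3, 9)
```

Both orders give the value 3 at the root, but the second one never triggers a cutoff and evaluates all 9 leaves.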

The results of running the algorithm are as follows.  
Each configuration is run 3 times, each time playing the game 100 times. All times are in seconds.

- Depth: 1, Prune: False

| Run | Time Taken | Average Per Game | Red | Blue |
| :-: | :--------: | :--------------: | :-: | :--: |
| 1   | 0.055      | 0.001            | 90  | 10   |
| 2   | 0.058      | 0.001            | 96  | 4    |
| 3   | 0.058      | 0.001            | 88  | 12   |

- Depth: 3, Prune: False

| Run | Time Taken | Average Per Game | Red | Blue |
| :-: | :--------: | :--------------: | :-: | :--: |
| 1   | 5.649      | 0.056            | 97  | 3    |
| 2   | 5.427      | 0.054            | 97  | 3    |
| 3   | 5.921      | 0.059            | 85  | 15   |

- Depth: 5, Prune: False

| Run | Time Taken | Average Per Game | Red | Blue |
| :-: | :--------: | :--------------: | :-: | :--: |
| 1   | 663.173    | 6.632            | 99  | 1    |
| 2   | 657.223    | 6.572            | 94  | 6    |
| 3   | 671.824    | 6.718            | 98  | 2    |

- Depth: 1, Prune: True

| Run | Time Taken | Average Per Game | Red | Blue |
| :-: | :--------: | :--------------: | :-: | :--: |
| 1   | 0.067      | 0.001            | 89  | 11   |
| 2   | 0.069      | 0.001            | 93  | 7    |
| 3   | 0.068      | 0.001            | 90  | 10   |

- Depth: 3, Prune: True

| Run | Time Taken | Average Per Game | Red | Blue |
| :-: | :--------: | :--------------: | :-: | :--: |
| 1   | 1.390      | 0.014            | 95  | 5    |
| 2   | 1.355      | 0.014            | 96  | 4    |
| 3   | 1.361      | 0.014            | 97  | 3    |

- Depth: 5, Prune: True

| Run | Time Taken | Average Per Game | Red | Blue |
| :-: | :--------: | :--------------: | :-: | :--: |
| 1   | 21.212     | 0.212            | 100 | 0    |
| 2   | 20.212     | 0.202            | 99  | 1    |
| 3   | 20.663     | 0.207            | 99  | 1    |

- Depth: 7, Prune: True

| Run | Time Taken | Average Per Game | Red | Blue |
| :-: | :--------: | :--------------: | :-: | :--: |
| 1   | 264.456    | 2.645            | 97  | 3    |
| 2   | 260.128    | 2.601            | 100 | 0    |
| 3   | 258.523    | 2.585            | 100 | 0    |
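
The speedup from pruning can be read off the tables. The short sketch below copies the total times for depths 3 and 5 from the tables above and compares their means. It also prints `math.perm(15, depth)`, which is only a rough upper bound on the number of move sequences explored for the first move on an empty 6-vertex board, ignoring early game endings.

```python
import math
from statistics import mean

# Total seconds per 100 games, copied from the tables above.
no_prune = {3: [5.649, 5.427, 5.921], 5: [663.173, 657.223, 671.824]}
pruned = {3: [1.390, 1.355, 1.361], 5: [21.212, 20.212, 20.663]}

for depth in (3, 5):
    speedup = mean(no_prune[depth]) / mean(pruned[depth])
    print(f'depth {depth}: pruning is about {speedup:.1f}x faster')
# depth 3: pruning is about 4.1x faster
# depth 5: pruning is about 32.1x faster

# Upper bound on move sequences of a given length when 15 lines are available.
for depth in (1, 3, 5, 7):
    print(depth, math.perm(15, depth))
# 1 15
# 3 2730
# 5 360360
# 7 32432400
```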
