# Dots & Boxes - Decision Theory Project

In this project, we implement the Dots & Boxes game using different techniques learned during this course. The idea is to create a Dots & Boxes environment in which a smart agent and a random agent play; the agents simulate the game. Our goal is to make the smart agent more efficient than the random agent.

## 1. Definition of game
Dots and Boxes is a game originally played with pen and paper. The aim is to own more boxes than your opponent. You start with an empty square grid of dots (the same number of dots in every row and every column), evenly spaced so that neighbouring dots line up horizontally and vertically. You and your opponent take turns joining two adjacent dots with a line. Connecting four dots into a square forms a box. A player who completes a box gets a point and also gets to make another move. The player with the most boxes wins. This is not a game of chance; strategy can help a player win.
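The two rules above (a box needs all four sides, and completing a box earns another move) can be sketched in a few lines. This is an illustration only, not the project's implementation: the helper names `box_sides`, `is_box_complete` and `next_player` are hypothetical, and dots are assumed to be labelled by a column letter and a row digit, the same labelling the environment in section 2 uses.

```python
LETTERS = "abcdefghijklmnopqrstuvwxyz"

def box_sides(col, row):
    """Return the four lines enclosing the box whose top-left dot is (col, row)."""
    top_left = f"{LETTERS[col]}{row}"
    top_right = f"{LETTERS[col + 1]}{row}"
    bottom_left = f"{LETTERS[col]}{row + 1}"
    bottom_right = f"{LETTERS[col + 1]}{row + 1}"
    return {
        (top_left, top_right),        # top side
        (bottom_left, bottom_right),  # bottom side
        (top_left, bottom_left),      # left side
        (top_right, bottom_right),    # right side
    }

def is_box_complete(drawn_lines, col, row):
    """A box is complete once all four of its sides have been drawn."""
    return box_sides(col, row) <= set(drawn_lines)

def next_player(current, other, boxes_completed):
    """Completing at least one box earns another move; otherwise the turn passes."""
    return current if boxes_completed > 0 else other

drawn = [("a0", "b0"), ("a1", "b1"), ("a0", "a1")]
print(is_box_complete(drawn, 0, 0))                   # False: only three sides drawn
print(is_box_complete(drawn + [("b0", "b1")], 0, 0))  # True: fourth side added
print(next_player("P", "C", boxes_completed=1))       # P keeps the turn
```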

## 2. Definition of the environment
### 2.1. States
This game uses a state space built on a grid of 3 by 3 dots. During a game the environment passes through a sequence of states $S = (S_0, S_1, ..., S_n)$, where $n$ is at most 12: only 12 lines can be drawn between the dots, so a full game visits 13 states when the empty starting state $S_0$ is counted. Every line a player adds results in a new state, so each state is the grid with zero or more pairs of dots connected by a line. Our class DotsAndBoxes creates this environment. This environment consists of:
- The initialisation with:
    - n as the size of the grid
    - hor_links as defining the horizontal dot connections (default with all false)
    - ver_links as defining the vertical dot connections (default all false)
    - owners to define the owners of the boxes (default all empty)
    - alphabets to create the x-labels on the grid
    - numbers to create the y-labels on the grid
    - __state as all the possible connection coordinates
    - score1 as score for player 1
    - score2 as score for player 2
    - player1 as name for player 1
    - player2 as name for player 2
    - player as the player of the current turn (default is player1)
    - gameOver to indicate if the game has finished
    - rewards as the list of rewards collected when the game ends
    - total_reward as the sum of those rewards
- The printer, which prints the current state space
- show_game, which takes no input and prints the current board through the printer
- is_game_over, which checks whether the game is done
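
The line and state counts for the 3 by 3 dot grid can be checked with a short calculation. The function name `count_lines` is an assumption made for this sketch; the formula matches the sizes the class gives to hor_links and ver_links, each of length `n * (n + 1)`.

```python
def count_lines(n):
    """Number of drawable lines on a grid of n x n boxes, i.e. (n + 1) x (n + 1) dots."""
    horizontal = n * (n + 1)  # n lines in each of the n + 1 rows of dots
    vertical = n * (n + 1)    # n lines in each of the n + 1 columns of dots
    return horizontal + vertical

n = 2                    # 2 x 2 boxes, so 3 x 3 dots
lines = count_lines(n)
print(lines)             # 12 lines in total
print(lines + 1)         # 13 states in one full game, counting the empty board
print(2 ** lines)        # 4096 distinct line configurations, ignoring box ownership
```

The last number counts every on/off combination of the 12 lines, which is an upper bound on the boards an agent can encounter before box ownership is taken into account.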

In [46]:
import random

class DotsAndBoxes:
    def __init__(self, n):
        self.n = n
        self.hor_links = [False] * (n * (n + 1))  # Defining horizontal link connections(now all False)
        self.ver_links = [False] * (n * (n + 1))  # Defining vertical link connections(now all False)
        self.owners = [' '] * (n ** 2)  # defining the owners of created boxes(now blank)
        self.alphabets = list('abcdefghijklmnopqrstuvwxyz')[0:(n + 1)]
        self.numbers = list('0123456789')[0:(n + 1)]
        self.dots = []  # List for points ID
        for num in self.numbers:
            for i in self.alphabets:
                self.dots.append(i + num)
        self.__state = self.dots
        self.score1 = 0
        self.score2 = 0
        self.player1 = 'P'
        self.player2 = 'C'
        self.prev_text = ""
        self.player = self.player1
        self.gameOver = False
        self.rewards = []
        self.total_reward = 0
        self.actions= [["a0","a1"],["a1","a2"],["b0","b1"],["b1","b2"],["c0","c1"],["c1","c2"],["a0","b0"],["b0","c0"],["a1","b1"],["b1","c1"],["a2","b2"],["b2","c2"]]

    # part of the following printer function : Helps in same line printing
    def part_print(self, new_text, end=""):
        self.prev_text = self.prev_text + new_text
        if end == "\n":
            print(self.prev_text)
            self.prev_text = ""
        else:
            self.prev_text = self.prev_text + end

    # Prints the dots and links and scores in a user friendly manner
    def printer(self, hor_links, ver_links, owners):
        new_hor_links = []
        for i in hor_links:
            if i:
                new_hor_links.append('___')
            else:
                new_hor_links.append('   ')
        new_ver_links = []
        for i in ver_links:
            if i:
                new_ver_links.append('|   ')
            else:
                new_ver_links.append('    ')
        char = '+'
        hor_index = 0
        ver_index = 0
        owner_index = 0
        row_index = 0
        print('-' * (((self.n + 1) * 4) + 8) + '\n')
        print("    a   b   c   d   e   f   g   h   i   j   "[0:((self.n + 1) * 4) + 1] + '\n')
        while True:
            print(" " + str(row_index) + ' ', end=' ')
            for i in range(self.n):
                self.part_print(char, "")
                self.part_print(new_hor_links[hor_index], "")
                hor_index += 1
            self.part_print(char, "\n")
            row_index += 1
            if (hor_index) == len(new_hor_links):
                break
            print("   ", end=' ')
            for i in range(self.n + 1):
                self.part_print(new_ver_links[ver_index], "")
                ver_index += 1
            self.part_print("", "\n")
            ver_index -= (self.n + 1)
            print("   ", end=' ')
            for i in range(self.n):
                if ver_links[ver_index]:
                    self.part_print("| " + owners[owner_index] + " ", "")
                else:
                    self.part_print("  " + owners[owner_index] + " ", "")
                owner_index += 1
                ver_index += 1
            if ver_links[ver_index]:
                self.part_print("|", "\n")
            else:
                self.part_print(" ", "\n")
            ver_index += 1
        print('\n\n' + '-' * (((self.n + 1) * 4) + 8))
        print("\nscore of player one (", self.player1, ") : " + str(self.score1))
        print("score of player two (", self.player2, ") : " + str(self.score2))
    
    def show_game(self):
        self.printer(self.hor_links, self.ver_links, self.owners)  # prints the boxes

    def play_game(self, point1, point2): #play game instead of start
        dont_change = False
        if point1 != "" and point2 != "":
            pos1 = self.dots.index(point1)
            pos2 = self.dots.index(point2)
            box_id = self.create_link(pos1, pos2, self.hor_links, self.ver_links)
            # Remove the action only after the link was accepted; actions store the lower dot index first
            self.actions.remove([self.dots[min(pos1, pos2)], self.dots[max(pos1, pos2)]])
            boxes_won = 0
            for corner in box_id:
                if self.change_owner(corner, self.owners, self.player):
                    boxes_won += 1
            dont_change = boxes_won > 0  # a single line can complete up to two boxes

            if dont_change:  # if true the current player scored and will continue the game
                if self.player == self.player1:
                    self.score1 += boxes_won
                else:
                    self.score2 += boxes_won
            else:
                if self.player2 == 'C':  # checks if computer will play
                    self.comp_play(self.hor_links, self.ver_links)
                else:
                    self.change_player()  # changes the player
            
            self.printer(self.hor_links, self.ver_links, self.owners)  # prints the boxes
        else:
            self.printer(self.hor_links, self.ver_links, self.owners)  # prints the boxes
            print("Select the points please!")
        
        if self.is_game_over():
            # Actions after game is over
            print("\nGame over!!")
            self.get_rewards()
            if self.score1 < self.score2:
                print("\nPlayer 2" + "(" + self.player2 + ")" + " has won the match with " + str(self.score2) + " points")
            elif self.score1 > self.score2:
                print("\nPlayer 1" + "(" + self.player1 + ")" + " has won the match with " + str(self.score1) + " points")
            else:
                print("\nThe game is draw!")
    
    def is_game_over(self):
        if ' ' not in self.owners:
            self.gameOver = True
        return self.gameOver
        
    def is_linked(self, pos1, pos2, hor_links, ver_links):
        if pos1 > pos2:
            pos1, pos2 = pos2, pos1
        if (pos1 + 1) % (self.n + 1) == 0 and pos2 % (self.n + 1) == 0:
            return False
        if pos2 - pos1 == self.n + 1:
            return ver_links[pos1]
        elif pos2 - pos1 == 1:
            return hor_links[pos1 - ((pos1 + 1) // (self.n + 1))]
        else:
            return False

    # Checks if the given four points are joined correctly so that a box is formed
    def is_box_completed(self, pos1, pos2, pos3, pos4, hor_links, ver_links):
        corners = sorted([pos1, pos2, pos3, pos4])  # renamed from `all` to avoid shadowing the built-in
        # every corner must be a valid dot index
        if any(i < 0 or i > (((self.n + 1) ** 2) - 1) for i in corners):
            return False
        return (self.is_linked(corners[0], corners[1], hor_links, ver_links)       # top
                and self.is_linked(corners[2], corners[3], hor_links, ver_links)   # bottom
                and self.is_linked(corners[0], corners[2], hor_links, ver_links)   # left
                and self.is_linked(corners[1], corners[3], hor_links, ver_links))  # right

    # checks if the given points are joined and returns a list of topmost left points of the box created .
    # if no box is formed, returns [].
    # raises error if the points cannot be joined !
    def create_link(self, pos1, pos2, hor_links, ver_links):
        e = ValueError("the two points are not adjacent and cannot be joined")
        if self.is_linked(pos1, pos2, hor_links, ver_links):
            raise RuntimeError("already present")
        if pos1 > pos2:
            pos1, pos2 = pos2, pos1
        if (pos1 + 1) % (self.n + 1) == 0 and pos2 % (self.n + 1) == 0:
            raise e
        if pos2 - pos1 == self.n + 1:
            ver_links[pos1] = True
            box_id = []
            check = self.is_box_completed(pos1, pos2, pos1 - 1, pos2 - 1, hor_links, ver_links)
            if check:
                box_id.append(pos1 - 1)
            check = self.is_box_completed(pos1, pos2, pos1 + 1, pos2 + 1, hor_links, ver_links)
            if check:
                box_id.append(pos1)
            return box_id
        elif pos2 - pos1 == 1:
            hor_links[pos1 - ((pos1 + 1) // (self.n + 1))] = True
            box_id = []
            check = self.is_box_completed(pos1, pos2, pos1 - (self.n + 1), pos2 - (self.n + 1), hor_links, ver_links)
            if check:
                box_id.append(pos1 - (self.n + 1))
            check = self.is_box_completed(pos1, pos2, pos1 + (self.n + 1), pos2 + (self.n + 1), hor_links, ver_links)
            if check:
                box_id.append(pos1)
            return box_id
        else:
            raise e

    # removes a link from the given points by making the joining index False in the hor_links or ver_links
    # does nothing if the link is absent
    def remove_link(self, pos1, pos2, hor_links, ver_links):
        e = ValueError("the two points are not adjacent, so there is no link to remove")
        if pos1 > pos2:
            pos1, pos2 = pos2, pos1
        if (pos1 + 1) % (self.n + 1) == 0 and pos2 % (self.n + 1) == 0:
            raise e
        if (pos2 - pos1) == self.n + 1:
            ver_links[pos1] = False
        elif (pos2 - pos1) == 1:
            hor_links[pos1 - ((pos1 + 1) // (self.n + 1))] = False
        else:
            raise e

    # receives the corner(left topmost point of the box) value and changes its ownership to player name
    def change_owner(self, corner, owners, player):
        if corner != []:
            owners[corner - ((corner + 1) // (self.n + 1))] = player
            return True
        else:
            return False

    # reverses the current player
    def change_player(self):
        if self.player == self.player1:
            self.player = self.player2
        else:
            self.player = self.player1

    # joins every possible link and checks whether a box is created; if not, the link is deleted, otherwise it is kept
    def comp_complete_box(self, virtual_hor_links, virtual_ver_links):
        link_joined = []
        box_count = 0
        for i in range((self.n + 1) ** 2):
            try:
                flag = self.create_link(i, i + 1, virtual_hor_links, virtual_ver_links)
                if flag == []:
                    self.remove_link(i, i + 1, virtual_hor_links, virtual_ver_links)
                else:
                    link_joined.append((i, i + 1))
                box_count += len(flag)
            except Exception:  # link is invalid, out of range or already present
                pass
            try:
                flag = self.create_link(i, i + self.n + 1, virtual_hor_links, virtual_ver_links)
                if flag == []:
                    self.remove_link(i, i + self.n + 1, virtual_hor_links, virtual_ver_links)
                else:
                    link_joined.append((i, i + self.n + 1))
                box_count += len(flag)
            except Exception:  # link is invalid, out of range or already present
                pass
        return link_joined, box_count, virtual_hor_links, virtual_ver_links
    # calls comp_complete_box repeatedly for as long as there is still a chance of gaining a box
    def comp_try_box(self, hor_links, ver_links):
        virtual_hor_links = list(hor_links)
        virtual_ver_links = list(ver_links)
        link_joined = []
        box_count = 0
        while True:
            prev_length = box_count
            new_links, count, virtual_hor_links, virtual_ver_links = self.comp_complete_box(virtual_hor_links,
                                                                                            virtual_ver_links)
            link_joined = link_joined + new_links
            box_count += count
            if box_count == prev_length:
                break
        return link_joined, box_count, virtual_hor_links, virtual_ver_links
    # Final turn generator: comes into play once no more boxes can be gained. It joins every line that is
    # not yet joined, one by one, counts how many boxes the opponent could gain from it, and removes the
    # trial line again. One of the lines giving the opponent the fewest boxes is picked at random and
    # appended to the link_joined list, which then holds the turns for this move.
    def get_comp_turns(self, link_joined, virtual_hor_links, virtual_ver_links):
        if (False not in virtual_hor_links) and (False not in virtual_ver_links):
            least_gainable_box_count = 0
        else:
            least_gainable_box_count = (self.n + 1) ** 2
        link_available = []
        count = 0
        for link in virtual_hor_links:
            if not link:
                virtual_hor_links[count] = True
                new_link, new_count, H, V = self.comp_try_box(virtual_hor_links, virtual_ver_links)
                for link in new_link:
                    self.remove_link(link[0], link[1], virtual_hor_links, virtual_ver_links)
                if new_count < least_gainable_box_count:
                    least_gainable_box_count = new_count
                    link_available = []
                    link_available.append(((count // self.n) + count, (count // self.n) + (count + 1)))
                elif new_count == least_gainable_box_count:
                    link_available.append(((count // self.n) + count, (count // self.n) + (count + 1)))
                virtual_hor_links[count] = False
            count += 1
        count = 0
        for link in virtual_ver_links:
            if not link:
                virtual_ver_links[count] = True
                new_link, new_count, H, V = self.comp_try_box(virtual_hor_links, virtual_ver_links)
                for link in new_link:
                    self.remove_link(link[0], link[1], virtual_hor_links, virtual_ver_links)
                if new_count < least_gainable_box_count:
                    least_gainable_box_count = new_count
                    link_available = [(count, count + self.n + 1)]
                elif new_count == least_gainable_box_count:
                    link_available.append((count, count + self.n + 1))
                virtual_ver_links[count] = False
            count += 1

        if len(link_joined) >= 3 and least_gainable_box_count >= 2:  # a special winning trick, used in special cases only
            del link_joined[-2]
            return link_joined
        else:
            if link_available != []:
                link_joined.append(random.choice(link_available))  # general case
            return link_joined

    # calls the comp_try_box and get_comp_turns one by one and joins the links(returned from get_comp_turns) and 
    # changes the ownership! 
    def comp_play(self, hor_links, ver_links):
        box_link_list, box_count, new_hor_links, new_ver_links = self.comp_try_box(hor_links, ver_links)
        turn_list = self.get_comp_turns(box_link_list, new_hor_links, new_ver_links)
        for turn in turn_list:
            box_id = self.create_link(turn[0], turn[1], hor_links, ver_links)
            flag = False
            for corner in box_id:
                flag = self.change_owner(corner, self.owners, "C")
                if flag:
                    self.score2 += 1
            self.actions.remove([self.dots[turn[0]], self.dots[turn[1]]])
            print("\nComputer play: line created between", self.dots[turn[0]], "and", self.dots[turn[1]], '\n')
            # if flag == False:
            #      break
            # else:
            #     self.printer(hor_links, ver_links, self.owners)        

    def get_rewards(self):
        if self.gameOver:
            if self.score1 < self.score2:
                self.rewards.append(-10)
            elif self.score1 > self.score2:
                self.rewards.append(10)
            else:
                self.rewards.append(0)
                
            r = self.score2 * -1
            self.rewards.append(self.score1)   
            self.rewards.append(r)          
                    
    def total_rewards_calculator(self):
        self.total_reward = sum(self.rewards)
        return self.total_reward

### 2.2. Actions
One action in this game is drawing a line between two adjacent dots. Each action by one player is followed by an action from the other player, unless the first player has completed a box and moves again. An action consists of the two coordinates between which the line is drawn. These include:

In [25]:
actions= [["a0","a1"],["a1","a2"],["b0","b1"],["b1","b2"],["c0","c1"],["c1","c2"],["a0","b0"],["b0","c0"],["a1","b1"],["b1","c1"],["a2","b2"],["b2","c2"]]
len(actions)

12
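
The list above is hard-coded for the 3 by 3 dot grid. As a sketch of how it could be produced for other grid sizes, the hypothetical helper `generate_actions` below builds the same list from `n`, using the same column-letter and row-digit labels as the class. Since the class takes its row labels from the digits 0 to 9, the sketch only makes sense for `n` up to 9.

```python
def generate_actions(n):
    """Build every line on a grid of n x n boxes as a [dot, dot] pair."""
    letters = "abcdefghijklmnopqrstuvwxyz"[: n + 1]
    actions = []
    # vertical lines: same column, two consecutive rows
    for col in letters:
        for row in range(n):
            actions.append([f"{col}{row}", f"{col}{row + 1}"])
    # horizontal lines: same row, two consecutive columns
    for row in range(n + 1):
        for i in range(n):
            actions.append([f"{letters[i]}{row}", f"{letters[i + 1]}{row}"])
    return actions

generated = generate_actions(2)
print(len(generated))  # 12, matching the hard-coded list
print(generated[:3])   # [['a0', 'a1'], ['a1', 'a2'], ['b0', 'b1']]
```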

These actions include all the possible horizontal and vertical lines. An action is played through the play_game method of DotsAndBoxes: you input the two coordinates, and it prints the new state with your new action and the algorithm's new action. All the actions by the algorithm for the computer play are implemented in the functions with comp in front of the name. These include:
- comp_try_box 
- get_comp_turns
- comp_play

### 2.3. Transitions
The transition model is $P(s' \mid s, a)$, where $s'$ is the new state, $s$ is the previous state and $a$ represents one of the actions. $s'$ is decided by the previous state in combination with the action, because the new state is the previous state plus the newly drawn lines. The probability $P$ of these transitions will depend on the strategy of the algorithm.
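
To make the transition model concrete, here is a minimal sketch of $P(s' \mid s, a)$ under strong simplifying assumptions that are not part of the project code: a state is a frozenset of drawn lines, the opponent replies uniformly at random (like the random agent from the introduction), and the extra-turn rule is ignored. The function name `transition_distribution` is hypothetical.

```python
from fractions import Fraction

ALL_ACTIONS = [
    ("a0", "a1"), ("a1", "a2"), ("b0", "b1"), ("b1", "b2"), ("c0", "c1"), ("c1", "c2"),
    ("a0", "b0"), ("b0", "c0"), ("a1", "b1"), ("b1", "c1"), ("a2", "b2"), ("b2", "c2"),
]

def transition_distribution(state, action, all_actions=ALL_ACTIONS):
    """Return {next_state: probability} after the agent draws `action`
    and a uniformly random opponent replies with one remaining line.

    Assumptions for this sketch: states are frozensets of lines and
    the extra move for completing a box is ignored.
    """
    if action in state:
        raise ValueError("line already drawn")
    after_agent = state | {action}
    remaining = [a for a in all_actions if a not in after_agent]
    if not remaining:  # the board is full, so the game has ended
        return {after_agent: Fraction(1)}
    p = Fraction(1, len(remaining))
    return {after_agent | {reply}: p for reply in remaining}

dist = transition_distribution(frozenset(), ("c0", "c1"))
print(len(dist))           # number of reachable next states
print(sum(dist.values()))  # the probabilities sum to one
```

With a smarter opponent the probabilities would no longer be uniform, which is why $P$ depends on the opponent's strategy.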

### 2.4. Rewards
The reward of a specific transition, $R(s,a,s')$, will depend on how successful the algorithm was. If the action brings the algorithm closer to creating a box, or creates one, then the reward should be higher than for actions that do the opposite.
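
For reference, the end-of-game rewards that get_rewards currently records can be written as a standalone function. The name `terminal_rewards` is hypothetical; the values mirror the class as written: plus or minus 10 for a win or loss of player 1, plus player 1's boxes, minus player 2's boxes. Note that this only rewards the final outcome, not the intermediate transitions described above.

```python
def terminal_rewards(score1, score2):
    """Mirror of DotsAndBoxes.get_rewards, seen from player 1's side."""
    if score1 < score2:
        outcome = -10  # player 1 lost
    elif score1 > score2:
        outcome = 10   # player 1 won
    else:
        outcome = 0    # draw
    return [outcome, score1, -score2]

rewards = terminal_rewards(score1=3, score2=1)
print(rewards)       # [10, 3, -1]
print(sum(rewards))  # 12
```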

### 2.5. Policy
Not yet decided. 
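
Until a policy is chosen, a uniformly random baseline in the spirit of the random agent from the introduction could look like the sketch below. It is a placeholder, not the project's policy: the function name `random_policy` is an assumption, and it only relies on a list of remaining actions shaped like the actions attribute of DotsAndBoxes.

```python
import random

def random_policy(available_actions, rng=random):
    """Pick one of the remaining actions uniformly at random."""
    if not available_actions:
        raise ValueError("no actions left to play")
    return rng.choice(available_actions)

rng = random.Random(0)  # seeded only to make the example repeatable
remaining = [["a0", "a1"], ["b0", "b1"], ["a0", "b0"]]
point1, point2 = random_policy(remaining, rng)
print(point1, point2)   # the two dots to pass to play_game
```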

In [47]:
game = DotsAndBoxes(2)
game.show_game()

--------------------

    a   b   c

 0  +   +   +
                
             
 1  +   +   +
                
             
 2  +   +   +


--------------------

score of player one ( P ) : 0
score of player two ( C ) : 0


In [48]:
print(game.actions)
game.play_game("c0", "c1")
print(game.actions)

[['a0', 'a1'], ['a1', 'a2'], ['b0', 'b1'], ['b1', 'b2'], ['c0', 'c1'], ['c1', 'c2'], ['a0', 'b0'], ['b0', 'c0'], ['a1', 'b1'], ['b1', 'c1'], ['a2', 'b2'], ['b2', 'c2']]

Computer play: line created between b1 and c1 

--------------------

    a   b   c

 0  +   +   +
            |   
            |
 1  +   +___+
                
             
 2  +   +   +


--------------------

score of player one ( P ) : 0
score of player two ( C ) : 0
[['a0', 'a1'], ['a1', 'a2'], ['b0', 'b1'], ['b1', 'b2'], ['c1', 'c2'], ['a0', 'b0'], ['b0', 'c0'], ['a1', 'b1'], ['a2', 'b2'], ['b2', 'c2']]


In [29]:
game.play_game("b0", "b1")


Computer play: line created between a1 and b1 

--------------------

    a   b   c

 0  +   +   +
        |   |   
        |   |
 1  +___+   +
                
             
 2  +___+   +


--------------------

score of player one ( P ) : 0
score of player two ( C ) : 0


In [30]:
game.play_game("b2", "c2")


Computer play: line created between c1 and c2 

--------------------

    a   b   c

 0  +   +   +
        |   |   
        |   |
 1  +___+   +
            |   
            |
 2  +___+___+


--------------------

score of player one ( P ) : 0
score of player two ( C ) : 0


In [31]:
game.play_game("a1", "a2")


Computer play: line created between b1 and b2 


Computer play: line created between b1 and c1 


Computer play: line created between b0 and c0 


Computer play: line created between a0 and a1 

--------------------

    a   b   c

 0  +   +___+
    |   |   |   
    |   | C |
 1  +___+___+
    |   |   |   
    | C | C |
 2  +___+___+


--------------------

score of player one ( P ) : 0
score of player two ( C ) : 3


In [633]:
for i in game.rewards:
    print(f"reward -> {i}")

print(f"The total rewards are {sum(game.rewards)}")
print(f"The total rewards are {game.total_rewards_calculator()}")

reward -> -10
reward -> 0
reward -> -4
The total rewards are -14
The total rewards are -14
