# **AI Assignment: Connect 4 with MCTS and ID3**

### Assignment Done by:
- David Ventura Mendes de Sá (UP202303580)
- Samuel José Sousa Ventura da Silva (UP202305647)

## 0. Contents
1. Introduction

2. Connect Four  
    **2.1.** Libraries  
    **2.2.** Game Implementation  
    **2.3.** Bitboard vs Matrix  

3. Algorithms    
    **3.1.** Monte Carlo Tree Search (MCTS)   
    **3.2.** Decision Trees (ID3)     
        **3.2.1.** Dataset Generation  

      

4. Algorithms Implementation  
    **4.1.** Libraries   


5. UI Game

6. Results
7. Conclusion

8. References

   

## **1. Introduction** ##

## **2. Connect Four** ##

Connect Four is a two-player game where players take turns dropping discs into a grid of 7 columns and 6 rows, aiming to be the first to connect four of their own discs in a line: horizontally, vertically, or diagonally. If the board fills up without a winner, the game is a draw.

TODO: write something about the libraries.

### **2.2. Bitboard Implementation** ###

#### **Bitboard Class** ####
- `player1` and `player2` are 48-bit integers that represent each player's placed pieces.
- `height` is a list of 7 integers, where `height[col]` returns the number of pieces in that column.
- `current_player` (1 or 2) indicates which player makes the next move. This will come in handy when generating datasets for ID3.

**Explaining the Board Encoding:**
- The Connect Four board is 7 columns by 6 rows (7x6 = 42 cells).
- The height array keeps track of how many pieces are in each column, making it easy to check what moves are available.
- Each bit of `player1` and `player2` corresponds to one cell, following the pattern `bit_position = col * 7 + row`, where row 0 is the bottom row:

<div style="text-align: center">
<table border="1" cellpadding="8" cellspacing="0" style="margin: 0 auto">
  <tr><td>05</td><td>12</td><td>19</td><td>26</td><td>33</td><td>40</td><td>47</td></tr>
  <tr><td>04</td><td>11</td><td>18</td><td>25</td><td>32</td><td>39</td><td>46</td></tr>
  <tr><td>03</td><td>10</td><td>17</td><td>24</td><td>31</td><td>38</td><td>45</td></tr>
  <tr><td>02</td><td>09</td><td>16</td><td>23</td><td>30</td><td>37</td><td>44</td></tr>
  <tr><td>01</td><td>08</td><td>15</td><td>22</td><td>29</td><td>36</td><td>43</td></tr>
  <tr><td>00</td><td>07</td><td>14</td><td>21</td><td>28</td><td>35</td><td>42</td></tr>
</table>
</div>
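<p style="text-align: center">(Bits 6, 13, 20, 27, 34, 41 and 48 are unused padding. They are never set, which stops shifted bit patterns from running over into the next column and so simplifies the bitwise operations.)</p>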
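
The encoding above can be sketched as a pair of small helper functions. This is only an illustration of the `col * 7 + row` pattern; the names `bit_index`, `cell_of` and `STRIDE` are hypothetical and do not appear in the class below.

```python
ROWS, COLS = 6, 7
STRIDE = ROWS + 1  # 7 bits per column: 6 playable cells plus 1 padding bit

def bit_index(col, row):
    """Bit position of the cell at (col, row); row 0 is the bottom row."""
    return col * STRIDE + row

def cell_of(bit):
    """Inverse mapping: (col, row) for a given bit position."""
    return divmod(bit, STRIDE)

# The padding bit of each column sits directly above its top playable cell.
padding_bits = [bit_index(col, ROWS) for col in range(COLS)]

print(bit_index(3, 0))  # bottom cell of the middle column
print(cell_of(47))      # top cell of the last column
print(padding_bits)
```

The padding positions computed here are the same bits the table leaves out.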

Since MCTS constantly simulates games, switching from a matrix to a bitboard improves performance by a wide margin: move generation, win detection and board evaluation become a handful of bitwise operations, and copying a state only means copying two integers and a 7-element list. This is exactly what our use case requires, since each move is computed from a large number of iterations.

#### **Methods:** ####

- `make_move(col)`
    - **Purpose:** Places a piece for the current player in the specified column.
    - **How it works:**
        - Checks if the column is full.
        - Updates the bitboard for the current player using a bitwise OR operation.
        - Increments the column height.
        - Switches the turn to the other player.

- `check_player_win(player)`
    - **Purpose:** Checks if the player has achieved four in a row (win condition).
    - **How it works:**
        - Uses bitwise operations to efficiently check for four consecutive pieces in all directions (vertical, horizontal, and both diagonals).
        - For example, a horizontal win is detected by checking if there are three consecutive bits to the right of a piece using bit shifts and AND operations.
    
- `get_legal_moves()`
    - **Purpose:** Returns a list of columns where a move is possible (i.e., not full).
    - **How it works:**
        - Checks the `height` array for columns with fewer than 6 pieces.

- `is_over()`
    - **Purpose:** Determines if the game has ended, either by a win or a draw.
    - **How it works:**
        - Calls `check_player_win` for both players and checks whether all columns are full.

**We also provide `matrix()` and `__str__` for easier debugging.** 
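
The shift-and-AND idea described for `check_player_win` can be tried on plain integers. The sketch below is a standalone illustration, assuming the same `col * 7 + row` layout; the helpers `has_four` and `bit` are hypothetical names, not part of the class. The second board shows why the padding bit matters: three pieces at the top of one column followed by a piece at the bottom of the next are not counted as a vertical four.

```python
def has_four(board, shift):
    """True if `board` contains four set bits spaced `shift` apart."""
    # Bit i of `pairs` is set when cells i and i + shift are both occupied.
    pairs = board & (board >> shift)
    # Two such pairs that are two steps apart form a line of four.
    return (pairs & (pairs >> (2 * shift))) != 0

def bit(col, row):
    return 1 << (col * 7 + row)

# Four pieces along the bottom row, columns 0 to 3.
horizontal = bit(0, 0) | bit(1, 0) | bit(2, 0) | bit(3, 0)

# Top three cells of column 0 plus the bottom cell of column 1.
no_wrap = bit(0, 3) | bit(0, 4) | bit(0, 5) | bit(1, 0)

print(has_four(horizontal, 7))  # horizontal direction uses a shift of 7
print(has_four(horizontal, 1))  # vertical direction uses a shift of 1
print(has_four(no_wrap, 1))     # the padding bit 6 breaks the run
```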

In [None]:
class Bitboard:
    def __init__(self):
        self.player1 = 0
        self.player2 = 0
        self.height = [0] * 7
        self.current_player = 1

    def make_move(self, col):
        
        if col == -1: return

        if self.height[col] >= 6:
            return False

        # Get position
        row = self.height[col]
        bit_position = col * 7 + row

        # Update bitboard
        if self.current_player == 1:
            self.player1 |= (1 << bit_position)
        else:
            self.player2 |= (1 << bit_position)

        # Update heightmap
        self.height[col] += 1

        # Switch to the other player (1 <-> 2)
        self.current_player = 3 - self.current_player
        return True

    def check_player_win(self, player):
        # Select the bitboard of the player being checked
        if player == 1:
            board = self.player1
        else:
            board = self.player2

        # Diagonal \ (shift 6: one column to the right, one row down)
        y = board & (board >> 6)
        if (y & (y >> 2 * 6)):
            return True
        
        # Horizontal
        y = board & (board >> 7)
        if (y & (y >> 2 * 7)):
            return True

        # Diagonal /
        y = board & (board >> 8)
        if (y & (y >> 2 * 8)):
            return True

        # Vertical
        y = board & (board >> 1)
        if (y & (y >> 2)):      
            return True
        return False

    def get_legal_moves(self):
        return [col for col in range(7) if self.height[col] < 6]
    
    def is_over(self):
        return self.check_player_win(1) or self.check_player_win(2) or all(h == 6 for h in self.height)

    def copy(self): # returns deep copy of self
        new_bitboard = Bitboard()
        new_bitboard.player1 = self.player1
        new_bitboard.player2 = self.player2
        new_bitboard.height = self.height.copy()
        new_bitboard.current_player = self.current_player
        return new_bitboard

    def matrix(self):
        # Build a 6x7 grid indexed as matrix[row][col]; row 0 is the bottom row
        matrix = [[0] * 7 for _ in range(6)]

        for col in range(7):
            for row in range(6):
                bit_position = col * 7 + row

                # Check if the bit is set in player1's bitboard
                if self.player1 & (1 << bit_position):
                    matrix[row][col] = 1
                # Check if the bit is set in player2's bitboard
                elif self.player2 & (1 << bit_position):
                    matrix[row][col] = 2

        return matrix

    def __str__(self):
        # Print the board in a readable format, top row first
        # (matrix() stores the bottom row at index 0, so it is reversed here)
        symbols = {0: "- ", 1: "X ", 2: "O "}
        resul = ""
        for row in reversed(self.matrix()):
            for cell in row:
                resul += symbols[cell]
            resul += "\n"
        return resul


### **2.3 Bitboard vs Matrix** ###
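
For comparison with the bitboard, here is a minimal sketch of what a matrix-based board could look like. It was written only to illustrate the contrast and is not the implementation used in this project; the names `matrix_drop` and `matrix_win` are hypothetical. Note that the win check visits every cell in four directions, whereas the bitboard check needs only a few shifts and ANDs per direction.

```python
ROWS, COLS = 6, 7

def matrix_drop(grid, col, player):
    """Place `player` in the lowest empty cell of `col`; grid[0] is the bottom row."""
    for row in range(ROWS):
        if grid[row][col] == 0:
            grid[row][col] = player
            return True
    return False  # column is full

def matrix_win(grid, player):
    """Scan every cell in four directions looking for a run of four."""
    directions = ((0, 1), (1, 0), (1, 1), (1, -1))  # right, up, up-right, up-left
    for row in range(ROWS):
        for col in range(COLS):
            for d_row, d_col in directions:
                end_row, end_col = row + 3 * d_row, col + 3 * d_col
                if not (0 <= end_row < ROWS and 0 <= end_col < COLS):
                    continue  # the run would leave the board
                if all(grid[row + i * d_row][col + i * d_col] == player
                       for i in range(4)):
                    return True
    return False

grid = [[0] * COLS for _ in range(ROWS)]
for col in range(4):
    matrix_drop(grid, col, 1)  # player 1 fills the bottom row, columns 0 to 3

print(matrix_win(grid, 1))
print(matrix_win(grid, 2))
```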

In [3]:
# TODO: example code implementing Connect 4 with a matrix or array

## **3. Algorithms** ##

### **3.1 Monte Carlo Tree Search (MCTS)** ###

MCTS is a heuristic search algorithm that combines random sampling with tree search to choose strong moves without exploring the whole game tree. It suits games like Connect Four, where searching every line of play to the end before each move is impractical. Each iteration of the algorithm goes through four phases:  
- **Selection:** Traverse the tree using the Upper Confidence Bound (UCB) score to balance exploration and exploitation.

- **Expansion:** Add a new child node for an unexplored move.

- **Simulation:** Perform random playouts from new nodes to a terminal state.

- **Backpropagation:** Update node statistics with simulation results.

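The UCB formula balances moves that are known to be good against moves that have barely been explored:

$$
UCB = \frac{U}{N} + C \sqrt{\frac{\ln(N_{parent})}{N}}
$$

Here $U$ is the node's accumulated reward (its win count), $N$ is its visit count, $N_{parent}$ is the visit count of its parent, and $C$ is the exploration weight.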
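
To see how the two terms trade off, the formula can be evaluated on made-up numbers. The sketch below is a standalone illustration: the function name `ucb`, the statistics and the value `c=1.4` are all hypothetical choices for the example, not values taken from this project.

```python
from math import log, sqrt

def ucb(wins, visits, parent_visits, c):
    """Win rate plus an exploration bonus; unvisited nodes are tried first."""
    if visits == 0:
        return float("inf")
    return wins / visits + c * sqrt(log(parent_visits) / visits)

# Hypothetical statistics for two children of a node visited 100 times.
well_explored = ucb(wins=60, visits=90, parent_visits=100, c=1.4)
rarely_tried = ucb(wins=3, visits=10, parent_visits=100, c=1.4)

# The rarely tried child has the lower win rate but the larger bonus.
print(well_explored < rarely_tried)
```

With the exploration weight set to 0 the ranking flips, because only the win rate remains.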

#### **3.1.1. Libraries** ####

In [4]:
from math import sqrt, log
import random

- We import __sqrt__ and __log__ from the math module for mathematical calculations used in the UCB formula, and __random__ for selecting random moves during the search process.

#### **3.1.2. Class Node** ####

In [5]:
class Node:
    __slots__ = ['parent', 'move', 'children', 'wins', 'visits']
        
    def __init__(self, parent, move):
        self.parent = parent  # Node
        self.move = move  # move that led to this state
        self.children = []  # child Nodes, filled in by expand()
        self.wins = 0
        self.visits = 0

    def ucb_score(self, exploration_weight=5):
        if self.visits == 0:
            return float('inf')

        return (self.wins / self.visits) + exploration_weight * sqrt(log(self.parent.visits) / self.visits)

    def expand(self, bitboard):
        # Create one child per legal move, then pick one at random to simulate from
        self.children = [Node(self, move) for move in bitboard.get_legal_moves()]
        return random.choice(self.children)


The Node class represents a single state in the search tree.

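- `__slots__` keeps each node memory-efficient by avoiding a per-instance `__dict__`, which matters because the tree can hold a very large number of nodes.

- Each node tracks its parent, the move that led to this state, its children, and statistics (wins and visits).

- The `ucb_score` method computes the Upper Confidence Bound score that selection uses to balance exploration and exploitation. Unvisited nodes score infinity, so they are always tried first.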

- The `expand` method generates all possible child nodes from the current state and returns a randomly selected child for simulation.
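
The effect of `__slots__` mentioned above can be checked directly. This is a small standalone sketch; `SlottedNode` and `PlainNode` are hypothetical classes written for the comparison, not the `Node` class itself.

```python
class SlottedNode:
    __slots__ = ("parent", "move", "visits")

    def __init__(self, parent, move):
        self.parent = parent
        self.move = move
        self.visits = 0

class PlainNode:
    def __init__(self, parent, move):
        self.parent = parent
        self.move = move
        self.visits = 0

slotted = SlottedNode(None, 3)
plain = PlainNode(None, 3)

print(hasattr(plain, "__dict__"))    # a regular instance carries its own dict
print(hasattr(slotted, "__dict__"))  # a slotted instance does not

try:
    slotted.score = 1.0  # "score" is not listed in __slots__
    rejected = False
except AttributeError:
    rejected = True
print(rejected)
```

The trade-off is that attributes can no longer be added on the fly, which is acceptable here because a node's fields are fixed.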

#### **3.1.3. Class MCTS** ####

In [6]:
class MCTS:

    def __init__(self, iterations):
        self.iterations = iterations

    def select(self, root, state):
        node = root
        while node.children: 
            node = max(node.children, key=lambda c: c.ucb_score())
            state.make_move(node.move)
        return node, state


    def simulate(self, state):
        # Play uniformly random moves until the game ends.
        # Checking is_over() first avoids adding a move to an already finished game.
        while not state.is_over():
            state.make_move(random.choice(state.get_legal_moves()))
        if state.check_player_win(1): return 1
        if state.check_player_win(2): return 2
        return 0
        

    def backpropagate(self, winner, node, state):

        reward = 0 if state.current_player == winner else 1

        while node is not None:
            node.visits += 1
            if winner == 0:
                reward = 0
            else:
                node.wins += reward
                reward = 1 - reward
            node = node.parent


    def search(self, bitboard):
        root = Node(None, None)
        root.expand(bitboard)

        for _ in range(self.iterations):

            state = bitboard.copy()

            leaf, state = self.select(root, state)
            
            # only expand if the leaf is not a terminal state
            if not state.is_over():
                leaf = leaf.expand(state)
                state.make_move(leaf.move)
            
            winner = self.simulate(state.copy())
            
            self.backpropagate(winner, leaf, state)

        # stats for the display: visits per column in arr[0:7], wins per column in arr[7:14]
        arr = [0] * 14
        for child in root.children:
            arr[child.move] = child.visits
            arr[7+child.move] = child.wins
    
        # return the child with MOST VISITS, we don't use winrate here
        return max(root.children, key=lambda c: c.visits).move, arr


The main methods of the `MCTS` class are:

- `__init__(self, iterations)` : The constructor takes a single parameter, the number of iterations the search runs for each move. More iterations grow a larger tree and give more reliable statistics, at the cost of more computation.


- `select(self, root, state)` : This method implements the __selection__ phase of MCTS
    - Starts at the root node and descends through the tree  
    - At each level, selects the child with the highest UCB score, computed by `Node.ucb_score`
    - Updates the game state as it descends  
    - Returns the selected leaf node and its corresponding state


- `simulate(self, state)` : This method performs the __simulation__ phase of MCTS
    - Executes a random playout from the current state
    - Continues making random moves until the game ends
    - Returns the result: 1 if player 1 wins, 2 if player 2 wins, 0 for a draw


- `backpropagate(self, winner, node, state)` : This method implements the __backpropagation__ phase of MCTS  
    - Updates statistics (visits and wins) on all nodes in the path back to the root
    - Alternates the reward (0/1) to handle zero-sum games
    - If the result was a draw (winner=0), no wins are added


- `search(self, bitboard)` : This is the __main__ method that manages the entire MCTS process
    - Creates a root node and expands it
    - For each iteration:  
        - Copies the current game state
        - Selects a leaf node using UCB
        - If the game isn't over, expands the leaf and applies the move of the randomly chosen child  
        - Simulates the game to completion  
        - Propagates the results back up the tree
    - Collects statistics for visualization  
    - Returns the move with the most visits (considered the best) and the statistics
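
The reward alternation described for `backpropagate` can be traced on its own. The sketch below is a simplified stand-in that returns the reward each node on the path would receive, from the leaf up to the root; the function `path_rewards` and its parameters are hypothetical and only mirror the logic of the method.

```python
def path_rewards(winner, player_to_move_at_leaf, depth):
    """Reward added to each node from the leaf up to the root."""
    if winner == 0:
        return [0] * depth  # a draw adds no wins anywhere
    # A node's wins are counted for the player who made the move INTO it,
    # which is the opponent of the player to move at that node.
    reward = 0 if player_to_move_at_leaf == winner else 1
    rewards = []
    for _ in range(depth):
        rewards.append(reward)
        reward = 1 - reward  # the parent belongs to the other player
    return rewards

# Player 1 wins and player 2 is to move at the leaf,
# so the move into the leaf was made by player 1.
print(path_rewards(winner=1, player_to_move_at_leaf=2, depth=4))  # [1, 0, 1, 0]
```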


The MCTS algorithm is powerful because it doesn't require domain-specific knowledge beyond the game rules, and naturally balances exploration of new moves with exploitation of moves known to be good.





### **3.2 Decision Trees (ID3)** ###

#### **3.2.1. Dataset Generation** ####

TODO: explain why we generated the dataset this way.

In [7]:
# TODO: dataset generation code

#### **3.2.2. ID3 Implementation** ####

TODO: add text here.

In [8]:
# TODO: ID3 code

## **4. Algorithms Implementation** ##

### **4.1. Libraries** ###


In [None]:
from os import environ
# Must be set before pygame is imported, otherwise the support prompt is still printed
environ['PYGAME_HIDE_SUPPORT_PROMPT'] = '1'

import time
import pygame
from pygame import gfxdraw
import game
import mcts

### **4.2. TBD** ###


## **5. User Interface Game** ##


### **5.1. Human vs Human** ###


### **5.2. Human vs MCTS** ###


### **5.3. Human vs ID3** ###


### **5.4. MCTS vs ID3** ###


## **6. Results** ##


## **7. Conclusion** ##
