# Programming Assignment: Domino AI with Monte Carlo Simulation

**University of Puerto Rico – Mayagüez**  
Department of Electrical and Computer Engineering  
ICOM5015 Artificial Intelligence  
Dr. J. Fernando Vega Riveros  
April 23, 2025  
 
- Miguel A. Maldonado Maldonado  
- Alejandro J. Rodríguez Burgos  


## Abstract

This assignment implements and tests an adversarial‐search agent for the classic game of Domino, using **Monte Carlo simulation**.  
We generate the full domino set, deal hands to two players (human vs. AI), and use randomized rollouts from each candidate move to estimate that move's expected outcome (win, draw, or loss).  
This approach requires minimal game modeling, has a compute cost that scales linearly with the number of simulations, and can accommodate hidden‐tile uncertainty without explicit belief‐state tracking.  


## Introduction

Domino, as played here, is a two‐player tile‐matching game in which each player seeks to exhaust their hand by matching pips on either end of a growing chain.  
Techniques such as minimax search or POMDP solvers can become complex when faced with hidden information and large branching factors.  
Monte Carlo simulation sidesteps these challenges by performing many **random rollouts** from each possible move and selecting the move with the best average outcome.  
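
The selection rule can be written compactly. As a sketch in notation introduced here purely for illustration (the symbols $m$, $N$, $r_i$, and $\hat{Q}$ do not appear in the code), let $r_i(m) \in \{+1, 0, -1\}$ be the outcome of the $i$-th rollout after playing candidate move $m$:

$$
\hat{Q}(m) = \frac{1}{N} \sum_{i=1}^{N} r_i(m), \qquad m^{*} = \arg\max_{m} \hat{Q}(m)
$$

Here $N$ plays the role of the `num_rollouts` parameter, and $m^{*}$ is the move the agent plays.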


## Information & Methods

We selected Monte Carlo simulation for its ease of implementation. Deep neural networks (DNNs) proved considerably more complicated, requiring training and licenses, and would likely be overkill for this task. Monte Carlo rollouts can also reach new and different outcomes for which we could not train a DNN. Overall, we found Monte Carlo simulation to be a practical approach that directly tackles the stochastic and partially observable nature of Dominoes without requiring hard‐to‐design heuristic functions or DNNs.

1. **Domino & Player classes**  
   - `Domino(p1, p2)`: represents one tile, supports flipping and value access.  
   - `Player(name)`: holds a hand, can play and check for legal moves.

2. **MonteCarloAgent (subclass of Player)**  
   - `legal_moves(board, hand, left_end, right_end)`: enumerate playable moves as `(index, side)` pairs.  
   - `simulate_playout(board, hands, current_idx)`: deep‐copy state, then loop random legal moves until someone wins or the game blocks; returns +1/–1/0.  
   - `choose_move(board, hands, current_idx)`: for each legal move, apply it, run N rollouts, average the results, and pick the highest‐scoring move.

3. **GameEnvironment**  
   - Builds and shuffles the 28‐tile set, deals 7 tiles each.  
   - Manages the turn loop, calling `choose_move` for the AI or prompting human input.  
   - Detects win or blocked‐game, and computes pip‐sum tiebreakers.

4. **Tunable parameters**  
   - `num_rollouts` controls the trade‐off between decision quality and compute time.
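
To make this trade‐off concrete: each rollout returns +1, 0, or -1, so the variance of a single outcome is at most 1, and the standard error of an average over N rollouts is therefore at most 1/sqrt(N). The helper below is a hypothetical illustration (the name `rollout_error_bound` is not part of the assignment code) that prints this worst‐case bound for a few rollout budgets.

```python
import math


def rollout_error_bound(num_rollouts):
    """Upper bound on the standard error of a mean of outcomes in [-1, 1]."""
    # Outcomes lie in [-1, 1], so the variance of one outcome is at most 1.
    return 1.0 / math.sqrt(num_rollouts)


for n in (50, 200, 500, 2000):
    print(f"num_rollouts={n:5d}  standard error <= {rollout_error_bound(n):.3f}")
```

Since the bound shrinks only with the square root of the budget, halving the noise costs roughly four times the compute.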


In [1]:
import random  # Needed for shuffling the tiles and for random rollouts
import copy


class Domino:
    def __init__(self, p1, p2):
        # Ensure pips1 is always the smaller number for canonical representation
        # Although not strictly necessary for this game logic, it can be good practice
        self.pips1 = min(p1, p2)
        self.pips2 = max(p1, p2)
        # Keep track of the orientation on the board
        self.is_flipped = False

    def __str__(self):
        # Display based on current orientation if played
        p1, p2 = self.get_values()
        return f"[{p1}|{p2}]"

    def __repr__(self):
        # Use the canonical constructor values for representation
        return f"Domino({min(self.pips1, self.pips2)}, {max(self.pips1, self.pips2)})"

    def is_double(self):
        return self.pips1 == self.pips2

    def get_values(self):
        # Return values based on current orientation
        return (self.pips2, self.pips1) if self.is_flipped else (self.pips1, self.pips2)

    def flip(self):
        # Toggle the flipped state
        self.is_flipped = not self.is_flipped

    def can_connect(self, value):
        """Checks if either pip matches the given value."""
        p1, p2 = self.get_values_canonical()  # Check canonical values for connection potential
        return p1 == value or p2 == value

    def get_values_canonical(self):
        """Returns the pips regardless of flipped state, useful for some checks."""
        return (self.pips1, self.pips2)

    def orient(self, value_to_match, side):
        """Orients the domino correctly for placing it."""
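        # Side 'left': the tile goes on the left of the chain, so value_to_match must end up
        # as its second (right-hand) pip, touching the current left end.
        # Side 'right': the tile goes on the right of the chain, so value_to_match must end up
        # as its first (left-hand) pip, touching the current right end.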
        p1_orig, p2_orig = self.pips1, self.pips2  # Use original values

        if side == 'left':
            if p1_orig == value_to_match:
                self.is_flipped = True  # Flip so p2_orig is now first
            elif p2_orig == value_to_match:
                self.is_flipped = False  # Don't flip, p1_orig is now first
            else:
                raise ValueError("Domino cannot connect to this value.")
        elif side == 'right':
            if p1_orig == value_to_match:
                self.is_flipped = False  # Don't flip, p2_orig is second
            elif p2_orig == value_to_match:
                self.is_flipped = True  # Flip, so p1_orig is second
            else:
                raise ValueError("Domino cannot connect to this value.")
        else:
            raise ValueError("Invalid side specified for orientation.")


class Player:
    def __init__(self, name):
        self.name = name
        self.hand = []

    def draw_domino(self, domino):
        self.hand.append(domino)

    def get_playable_moves(self, board, left_end, right_end):
        """Returns a list of playable moves as (domino_index, side_to_play)."""
        playable = []
        if not board:  # First move
            for i, dom in enumerate(self.hand):
                playable.append((i, 'right'))  # Can play any domino initially
            return playable

        for i, dom in enumerate(self.hand):
            p1, p2 = dom.get_values_canonical()  # Check canonical values
            can_play_left = left_end is not None and (p1 == left_end or p2 == left_end)
            can_play_right = right_end is not None and (p1 == right_end or p2 == right_end)

            if can_play_left:
                playable.append((i, 'left'))
            # A tile that fits both ends is listed once for each side
            if can_play_right:
                playable.append((i, 'right'))

        return playable

    def play_domino(self, domino_index, board, side, left_end, right_end):
        """
        Plays the domino at the specified index onto the specified side.
        Returns True if successful, False otherwise.
        Assumes the move is potentially valid (checked by get_playable_moves).
        """
        if domino_index < 0 or domino_index >= len(self.hand):
            print("Error: Invalid domino index.")
            return False

        domino = self.hand[domino_index]
        p1_orig, p2_orig = domino.pips1, domino.pips2  # Use original values for matching check

        if not board:  # First move of the game
            # No specific end value, just place it. Let's keep default orientation.
            self.hand.pop(domino_index)
            board.append(domino)
            return True

        if side == 'left':
            if left_end is None or not (p1_orig == left_end or p2_orig == left_end):
                # This check should ideally be redundant if get_playable_moves was used
                print(f"Error: Domino {domino} cannot connect to left end value {left_end}")
                return False
            domino.orient(left_end, 'left')
            self.hand.pop(domino_index)
            board.insert(0, domino)
            return True
        elif side == 'right':
            if right_end is None or not (p1_orig == right_end or p2_orig == right_end):
                # This check should ideally be redundant
                print(f"Error: Domino {domino} cannot connect to right end value {right_end}")
                return False
            domino.orient(right_end, 'right')
            self.hand.pop(domino_index)
            board.append(domino)
            return True
        else:
            print("Error: Invalid side specified.")
            return False

    def has_playable(self, board, left_end, right_end):
        """Checks if the player has any domino that can be played on either end."""
        if not board:
            return len(self.hand) > 0  # Can always play if hand is not empty and board is empty
        for dom in self.hand:
            p1, p2 = dom.get_values_canonical()
            if left_end is not None and (p1 == left_end or p2 == left_end):
                return True
            if right_end is not None and (p1 == right_end or p2 == right_end):
                return True
        return False


class MonteCarloAgent(Player):
    def __init__(self, name, num_rollouts=200):
        super().__init__(name)
        self.num_rollouts = num_rollouts

    def legal_moves(self, board, hand, left_end, right_end):
        """Gets legal moves identical to Player.get_playable_moves for consistency."""
        # Note: Overriding Player.get_playable_moves is also an option
        playable = []
        if not board:
            return [(i, 'right') for i in range(len(hand))]

        for i, dom in enumerate(hand):
            p1, p2 = dom.get_values_canonical()
            can_play_left = left_end is not None and (p1 == left_end or p2 == left_end)
            can_play_right = right_end is not None and (p1 == right_end or p2 == right_end)

            if can_play_left:
                playable.append((i, 'left'))
            # A tile that fits both ends is listed once for each side
            if can_play_right:
                playable.append((i, 'right'))

        return playable

    def simulate_playout(self, board_sim, hands_sim, current_idx):
        """Simulates a random playout from the current state."""
        # Create deep copies to avoid modifying original state during simulation
        board_sim = [copy.deepcopy(d) for d in board_sim]
        hands_sim = [[copy.deepcopy(d) for d in h] for h in hands_sim]

        num_players = len(hands_sim)
        agent_idx = (current_idx - 1 + num_players) % num_players  # The agent who made the move leading here
        turn_player_idx = current_idx
        passes = 0  # Track consecutive passes to detect blocked game

        while True:
            # --- Check for Winner (empty hand) ---
            if not hands_sim[turn_player_idx]:
                # Last player emptied their hand, they win
                # Need to determine who that was relative to the original agent making the choice
                return 1 if turn_player_idx == agent_idx else -1  # Simplified win/loss

            # --- Determine board ends for the current simulation state ---
            sim_left_end, sim_right_end = None, None
            if board_sim:
                sim_left_end = board_sim[0].get_values()[0]
                sim_right_end = board_sim[-1].get_values()[1]
            elif not board_sim and not any(hands_sim):  # Check edge case: board empty, all hands empty (unlikely)
                return 0  # Draw

            # --- Find legal moves for the current simulation player ---
            current_hand = hands_sim[turn_player_idx]
            sim_moves = []
            if not board_sim:  # First move of sim
                sim_moves = [(i, dom, 'right') for i, dom in enumerate(current_hand)]
            else:
                for i, dom in enumerate(current_hand):
                    p1, p2 = dom.get_values_canonical()
                    if sim_left_end is not None and (p1 == sim_left_end or p2 == sim_left_end):
                        sim_moves.append((i, dom, 'left'))
                    # A tile that fits both ends gives two distinct moves, so list both sides
                    # (this keeps the rollout move set consistent with legal_moves)
                    if sim_right_end is not None and (p1 == sim_right_end or p2 == sim_right_end):
                        sim_moves.append((i, dom, 'right'))

            # --- Play a random move or pass ---
            if sim_moves:
                passes = 0  # Reset pass counter
                move_idx, chosen_dom_copy, side_to_play = random.choice(sim_moves)

                # Get the actual domino from the hand to modify/play
                chosen_dom = current_hand[move_idx]

                # Orient and play the chosen domino
                if side_to_play == 'left':
                    chosen_dom.orient(sim_left_end, 'left')
                    board_sim.insert(0, chosen_dom)
                else:  # 'right' or first move
                    end_val = sim_right_end if board_sim else None  # Use None for first move orientation
                    if end_val is not None:
                        chosen_dom.orient(end_val, 'right')
                    # else: default orientation is fine for first move
                    board_sim.append(chosen_dom)

                # Remove the domino from the player's hand *after* potentially using its index
                hands_sim[turn_player_idx].pop(move_idx)

                # --- Check if this player just won ---
                if not hands_sim[turn_player_idx]:
                    return 1 if turn_player_idx == agent_idx else -1

            else:
                # Player has no moves, pass
                passes += 1
                # --- Check for Blocked Game ---
                if passes >= num_players:
                    # All players passed consecutively
                    hand_sums = [sum(sum(d.get_values_canonical()) for d in hand) for hand in hands_sim]
                    min_sum = min(hand_sums)
                    winners = [i for i, s in enumerate(hand_sums) if s == min_sum]

                    if len(winners) == 1:
                        # Single winner based on lowest score
                        return 1 if winners[0] == agent_idx else -1
                    else:
                        # Tie based on score
                        return 0  # Draw or neutral outcome for tied block

            # --- Advance to next player ---
            turn_player_idx = (turn_player_idx + 1) % num_players

    def choose_move(self, board, hands, current_player_index):
        """Chooses the best move using flat Monte Carlo rollouts (no search tree is built)."""
        left_end, right_end = GameEnvironment.get_board_ends_static(board)  # Use static method
        my_hand = hands[current_player_index]

        possible_moves = self.legal_moves(board, my_hand, left_end, right_end)

        if not possible_moves:
            return None  # No legal moves

        best_move, best_score = None, float('-inf')

        # Evaluate each possible move
        for move in possible_moves:
            move_idx, side = move
            total_score = 0

            # Perform rollouts for the current candidate move
            for _ in range(self.num_rollouts):
                # Create deep copies for the simulation start state
                board_copy = [copy.deepcopy(d) for d in board]
                hands_copy = [[copy.deepcopy(d) for d in h] for h in hands]

                # Simulate making the move
                agent_hand_copy = hands_copy[current_player_index]
                domino_to_play = agent_hand_copy[move_idx]  # Get the specific domino copy

                current_left, current_right = GameEnvironment.get_board_ends_static(board_copy)

                if not board_copy:  # First move case
                    board_copy.append(domino_to_play)
                elif side == 'left':
                    domino_to_play.orient(current_left, 'left')
                    board_copy.insert(0, domino_to_play)
                else:  # side == 'right'
                    domino_to_play.orient(current_right, 'right')
                    board_copy.append(domino_to_play)

                # Remove the domino from the copied hand *after* placing it
                agent_hand_copy.pop(move_idx)

                # Start the random playout simulation from the *next* player's turn
                next_player_idx = (current_player_index + 1) % len(hands_copy)
                outcome = self.simulate_playout(board_copy, hands_copy, next_player_idx)
                total_score += outcome

            # Calculate average score for this move
            average_score = total_score / self.num_rollouts

            # Update best move if this one is better
            if average_score > best_score:
                best_score = average_score
                best_move = move

        return best_move


class GameEnvironment:
    def __init__(self, players):
        self.players = players
        self.dominoes = self._create_dominoes()
        self.board = []  # List representing the chain of dominoes
        self.current_player_index = 0

    def _create_dominoes(self):
        doms = [Domino(i, j) for i in range(7) for j in range(i, 7)]
        random.shuffle(doms)
        return doms

    def deal_dominoes(self, num=7):
        for p in self.players:
            p.hand = []  # Ensure hands are empty before dealing
            for _ in range(num):
                if self.dominoes:
                    p.draw_domino(self.dominoes.pop())

    def get_current_player(self):
        return self.players[self.current_player_index]

    def display_board(self):
        if not self.board:
            print("Board: (empty)")
        else:
            print("Board:", " ".join(str(d) for d in self.board))
            left_end, right_end = self.get_board_ends()
            print(f"       Ends: Left={left_end}, Right={right_end}")

    def display_hands(self, show_all=False):
        print("--- Hands ---")
        for i, p in enumerate(self.players):
            # Show the human hand, or all hands if requested
            if not isinstance(p, MonteCarloAgent) or show_all:
                hand_str = " ".join(f"[{idx}:{dom}]" for idx, dom in enumerate(p.hand)) if p.hand else "(empty)"
                print(f"{p.name}'s hand: {hand_str}")
            else:  # Hide AI hand during normal play
                print(f"{p.name}'s hand: {len(p.hand)} dominoes")
        print("-------------")

    @staticmethod
    def get_board_ends_static(board):
        """Static version to be used by agent without needing GameEnvironment instance."""
        if not board:
            return None, None
        elif len(board) == 1:
            # For a single domino, both ends are its pips in current orientation
            p1, p2 = board[0].get_values()
            return p1, p2
        else:
            # Left end is the first pip of the first domino
            # Right end is the second pip of the last domino
            left_end = board[0].get_values()[0]
            right_end = board[-1].get_values()[1]
            return left_end, right_end

    def get_board_ends(self):
        """Instance method calling the static version."""
        return self.get_board_ends_static(self.board)

    def play_turn(self):
        player = self.get_current_player()
        print(f"\n--- {player.name}'s Turn ---")
        self.display_board()
        self.display_hands()  # Show hands (hiding AI)

        left_end, right_end = self.get_board_ends()
        playable_moves = player.get_playable_moves(self.board, left_end, right_end)

        if not playable_moves:
            print(f"{player.name} has no playable moves. Skipping turn.")
            # We don't track passes here, rely on is_game_over check later
        elif isinstance(player, MonteCarloAgent):
            print(f"{player.name} (AI) is thinking...")
            # The agent is handed every player's actual hand, so its rollouts see the opponent's tiles
            all_hands = [p.hand for p in self.players]
            chosen_move = player.choose_move(self.board, all_hands, self.current_player_index)

            if chosen_move:
                move_idx, side = chosen_move
                domino_played = player.hand[move_idx]  # Get reference before popping
                print(f"{player.name} (AI) chooses to play {domino_played} on the {side} end.")
                # Play the domino using the player's method
                success = player.play_domino(move_idx, self.board, side, left_end, right_end)
                if not success:
                    print(f"!!! AI Error: Failed to play chosen move {chosen_move}. This shouldn't happen.")
                    # Consider adding error handling or fallback
            else:
                # This case should ideally be caught by playable_moves check earlier
                print(f"{player.name} (AI) couldn't choose a move (should have skipped).")

        else:  # Human player
            print("Your playable moves:")
            if not self.board:
                print("  Board is empty, play any domino.")
                for idx, side in playable_moves:
                    print(f"  Index {idx}: {player.hand[idx]}")
            else:
                for idx, side in playable_moves:
                    print(
                        f"  Index {idx} ({player.hand[idx]}) on '{side}' end (matching {left_end if side == 'left' else right_end})")

            while True:
                action = input("Enter index and side (e.g., '3 left' or '3 l') or 's' to skip: ").lower().strip()
                if action == 's':
                    # Only allow skipping if truly no moves are possible (already checked)
                    # This input is more for potentially passing if allowed by rules, but here we force a play if possible.
                    print(f"{player.name} chose to skip (should only happen if no moves).")
                    # Re-verify just in case:
                    if playable_moves:
                        print("You have playable moves, you cannot skip.")
                        continue
                    else:  # Truly no moves
                        break  # Exit the loop, turn ends

                parts = action.split()
                if len(parts) != 2:
                    print("Invalid input format. Use 'index side' (e.g., '2 r' or '2 right').")
                    continue

                try:
                    index = int(parts[0])
                    side_input = parts[1]

                    if side_input not in ['l', 'left', 'r', 'right']:
                        print("Invalid side. Use 'left'/'l' or 'right'/'r'.")
                        continue
                    if not self.board:
                        side = 'right'  # First move: either side is accepted and recorded as 'right'
                    elif side_input in ['l', 'left']:
                        side = 'left'
                    else:
                        side = 'right'

                    # Validate if the chosen (index, side) is in the list of playable moves
                    if (index, side) not in playable_moves:
                        # Check if the domino exists but can only be played on the other side
                        if any(m[0] == index for m in playable_moves):
                            print(f"Domino {index} is playable, but not on the '{side}' end. Try the other end?")
                        else:
                            print(f"Invalid move choice: Domino index {index} is not playable or doesn't exist.")
                        continue

                    # Attempt to play the domino
                    domino_to_play = player.hand[index]  # For logging
                    success = player.play_domino(index, self.board, side, left_end, right_end)
                    if success:
                        print(f"{player.name} played {domino_to_play} on the {side} end.")
                        break  # Valid move made, exit loop
                    else:
                        # play_domino prints internal errors, but add a general message
                        print("Move failed. Please try again.")  # Should be rare now

                except ValueError:
                    print("Invalid index. Please enter a number.")
                    continue
                except IndexError:
                    print("Invalid index. That domino number doesn't exist in your hand.")
                    continue

        # Advance to the next player *after* the current player has potentially finished their move
        self.current_player_index = (self.current_player_index + 1) % len(self.players)

    def is_game_over(self):
        """Checks if the game has ended either by a player emptying their hand or a blocked board."""
        # Check for empty hand
        for p in self.players:
            if not p.hand:
                return True, f"{p.name} emptied their hand and wins!"

        # Check for blocked game (no player has any playable moves)
        left_end, right_end = self.get_board_ends()

        # Handle the very start of the game - not blocked if anyone has dominoes
        if not self.board and any(p.hand for p in self.players):
            return False, None

        can_anyone_play = False
        for p in self.players:
            if p.has_playable(self.board, left_end, right_end):
                can_anyone_play = True
                break  # Found someone who can play, game is not blocked

        if not can_anyone_play:
            # Only declare blocked if the board isn't empty OR if board is empty AND no one has cards (unlikely scenario handled above)
            if self.board:
                return True, "Game blocked! No player can make a move."
            # If board is empty AND no one can play, means hands must be empty -> already caught by first check.
            # Or if board is empty and hands are not -> first player can always play -> not blocked.

        return False, None  # Game is not over

    def calculate_score(self, player):
        """Calculates the sum of pips in a player's remaining hand."""
        # Use canonical values to ensure consistent scoring regardless of orientation
        return sum(sum(d.get_values_canonical()) for d in player.hand)

    def determine_winner_blocked(self):
        """Determines the winner(s) in a blocked game based on lowest pip sum."""
        scores = {p.name: self.calculate_score(p) for p in self.players}
        if not scores: return [], 0  # Should not happen in a real game
        min_score = min(scores.values())
        winners = [name for name, score in scores.items() if score == min_score]
        return winners, min_score


# --- Main Game Loop ---
if __name__ == "__main__":
    human = Player("Human")
    # Increase rollouts for potentially better AI play with more complex state
    ai = MonteCarloAgent("AI", num_rollouts=500)
    players = [human, ai]
    random.shuffle(players)  # Randomize who starts

    game = GameEnvironment(players)
    game.deal_dominoes(num=7)  # Standard 7 dominoes for 2 players

    turn_count = 1
    while True:
        print(f"\n========== Turn {turn_count} ==========")
        game.play_turn()

        # Check game over condition
        is_over, message = game.is_game_over()
        if is_over:
            print("\n========== Game Over ==========")
            print(message)
            game.display_board()
            game.display_hands(show_all=True)  # Show all hands at the end

            if "blocked" in message.lower():
                winners, score = game.determine_winner_blocked()
                if len(winners) == 1:
                    print(f"Winner (lowest score): {winners[0]} with {score} points.")
                else:
                    print(f"Tie between: {', '.join(winners)} with {score} points each.")
            # Else, the winner was declared in the message
            break  # Exit the loop

        turn_count += 1
        # Add a small delay or prompt to continue between turns if desired
        # input("Press Enter to continue to next turn...")

    print("\nThanks for playing!")


## Conclusion

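We have implemented a Monte Carlo–based adversarial agent for Domino. By performing randomized rollouts for each candidate move, the AI needs no hand‐designed evaluation function or explicit probability model, and its move estimates become more reliable as the number of simulations grows. In the current code the rollouts start from the actual deal, including the opponent's hand, so playing under genuinely hidden information would additionally require sampling the unseen tiles. This clean separation of game logic and search policy offers a robust, tunable baseline for further enhancements.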
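
Because `play_turn` passes every player's real tiles to `choose_move`, one possible refinement is to replace the opponent's hand with a random draw from the tiles the agent has not seen before each rollout. The sketch below illustrates that idea only; it is not part of the submitted code. It assumes tiles are plain `(low, high)` tuples rather than `Domino` objects, and the names `full_set` and `sample_opponent_hand` are hypothetical.

```python
import random
from itertools import combinations_with_replacement


def full_set(max_pip=6):
    """All tiles of a double-`max_pip` set as sorted (low, high) tuples."""
    return list(combinations_with_replacement(range(max_pip + 1), 2))


def sample_opponent_hand(my_hand, board, opponent_hand_size, rng=random):
    """Draw one plausible opponent hand from the tiles the agent cannot see."""
    # Normalize tile orientation so (6, 3) and (3, 6) count as the same tile.
    seen = {tuple(sorted(t)) for t in my_hand} | {tuple(sorted(t)) for t in board}
    # Unseen tiles are either in the opponent's hand or in the undealt pile.
    unseen = [t for t in full_set() if t not in seen]
    return rng.sample(unseen, opponent_hand_size)


# Example: the agent holds three tiles and two tiles are already on the board.
example_hand = sample_opponent_hand([(0, 0), (1, 2), (6, 3)], [(5, 5), (5, 1)], 7)
print(example_hand)
```

Inside `choose_move`, such a draw could stand in for the opponent's entry in `hands_copy` before each rollout; the conversion between tuples and `Domino` objects is omitted here.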


## Future Work

- **Parallelize** rollouts to leverage multi‐core CPUs or GPUs.  
- **Incorporate heuristics** into playouts.  
- Extend to **multiplayer** or different Domino rule variants.  
- Compare empirically against **Minimax with α–β pruning** or **deep‐learning** policies.
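
As one hedged illustration of the playout‐heuristic idea, a rollout could favor heavier tiles instead of choosing uniformly, since a blocked game is decided by the lowest remaining pip sum. The weighting below is an assumption made for this sketch, not a tuned policy, and `pick_weighted_move` is a hypothetical helper that works on `(a, b)` tuples rather than `Domino` objects.

```python
import random


def pick_weighted_move(hand, legal_moves, rng=random):
    """Pick one (index, side) move, favoring tiles with a larger pip total.

    `hand` is a list of (a, b) tuples and `legal_moves` a list of
    (index, side) pairs, mirroring the shape returned by `legal_moves`.
    """
    # Weight = pip total + 1 so that the [0|0] tile can still be chosen.
    weights = [sum(hand[idx]) + 1 for idx, _side in legal_moves]
    return rng.choices(legal_moves, weights=weights, k=1)[0]


# Example: the double six is far more likely to be played than the double blank.
hand = [(0, 0), (6, 6), (1, 2)]
moves = [(0, 'left'), (1, 'right')]
print(pick_weighted_move(hand, moves))
```

In `simulate_playout`, a call like this could stand in for `random.choice(sim_moves)`; whether it actually improves play would need to be measured.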


## Bibliography

1. S. Russell and P. Norvig, *Artificial Intelligence: A Modern Approach*, 4th ed.  
2. C. Browne et al., “A Survey of Monte Carlo Tree Search Methods,” *IEEE Trans. Comput. Intell. AI Games*, vol. 4, no. 1, 2012.
