# Halving Game with Minimax Algorithm - Student Assignment

In this assignment, you will implement the minimax algorithm for a simple game called the "Halving Game." We will provide the game mechanics and basic structure, but you will need to implement the core minimax algorithm.

## 1. Introduction to the Halving Game

The Halving Game is a two-player mathematical game with the following rules:

- The game starts with a positive integer N
- Players take turns making a move
- On each turn, a player can either:
  - Subtract 1 from the current number (`-` action)
  - Divide the current number by 2, rounding down to an integer (`/` action)
- The player who reduces the number to 0 **loses** the game

For example, if N=6:
- Player 1 could subtract 1 (resulting in 5) or divide by 2 (resulting in 3)
- If Player 1 chooses to subtract 1 (5), Player 2 could subtract 1 (4) or divide by 2 (2)
- And so on until someone is forced to reduce the number to 0 and lose
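
The two moves can be sketched in a few lines of Python. The helper name `moves` is hypothetical and is not part of the assignment code; it only lists the number each action would produce.

```python
def moves(n):
    """Return the number reached by each action from n (hypothetical helper)."""
    return {'-': n - 1, '/': n // 2}

print(moves(6))  # {'-': 5, '/': 3}
print(moves(5))  # {'-': 4, '/': 2}
print(moves(1))  # {'-': 0, '/': 0}: either move reduces the number to 0 and loses
```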

## 2. Game Implementation

Let's first understand the implementation of the Halving Game:

In [2]:
class HalvingGame(object):
    """
    A class representing the Halving Game.
    
    The game starts with a number N.
    Players take turns either subtracting 1 or dividing by 2 (integer division).
    The player who reduces the number to 0 loses.
    """
    def __init__(self, N):
        """
        Initialize the game with starting number N.
        
        Args:
            N: The starting number (positive integer)
        """
        self.N = N

    def startState(self):
        """
        Returns the initial state of the game.
        
        State is represented as a tuple: (player, number)
        - player: +1 for Player 1, -1 for Player 2
        - number: the current number in the game
        
        Returns:
            tuple: (player, number) initial state
        """
        return (+1, self.N)  # Player 1 starts with number N

    def actions(self, state):
        """
        Returns a list of valid actions from the current state.
        
        In the Halving Game, there are always two possible actions:
        - '-': Subtract 1 from the current number
        - '/': Divide the current number by 2 (integer division)
        
        Args:
            state: Current game state (player, number)
            
        Returns:
            list: Available actions ['-', '/']
        """
        player, number = state
        return ['-', '/']  # These actions are always available

    def succ(self, state, action):
        """
        Returns the successor state after taking the specified action.
        
        Args:
            state: Current game state (player, number)
            action: Either '-' (subtract 1) or '/' (divide by 2)
            
        Returns:
            tuple: New state (next player, updated number)
        """
        player, number = state
        if action == '-':
            return (-player, number - 1)  # Subtract 1 and switch players
        elif action == '/':
            return (-player, number // 2)  # Integer division by 2 and switch players
        assert False  # Should never reach here

    def isEnd(self, state):
        """
        Checks if the game has ended (number is 0).
        
        Args:
            state: Current game state (player, number)
            
        Returns:
            bool: True if the game has ended, False otherwise
        """
        player, number = state
        return number == 0  # Game ends when the number becomes 0

    def utility(self, state):
        """
        Returns the utility value of a terminal state.
        
        In this game, the player who reduces the number to 0 loses, so the
        player who faces 0 (and would move next) is the winner.
        The utility function returns:
        - +infinity if Player 1 wins (Player 1 is at number 0)
        - -infinity if Player 2 wins (Player 2 is at number 0)
        
        Args:
            state: Current game state (player, number)
            
        Returns:
            float: Utility value for the terminal state
        """
        player, number = state
        assert self.isEnd(state), "Utility can only be calculated for terminal states"
        # The player in 'state' is the one who would move next if the game wasn't over
        # The other player just reduced the number to 0, so the other player has lost
        # So the utility is positive infinity if Player 2 lost (player is +1)
        # And negative infinity if Player 1 lost (player is -1)
        return player * float('inf')

    def player(self, state):
        """
        Returns the player whose turn it is in the current state.
        
        Args:
            state: Current game state (player, number)
            
        Returns:
            int: +1 for Player 1, -1 for Player 2
        """
        player, number = state
        return player  # Extract player from the state


## 3. Understanding the State Representation

In the Halving Game, the state is represented as a tuple of:

- `player`: Whose turn it is (+1 for Player 1, -1 for Player 2)
- `number`: The current number in the game

For example:
- Initial state with N=15: `(+1, 15)` (Player 1's turn, number is 15)
- After Player 1 subtracts 1: `(-1, 14)` (Player 2's turn, number is 14)
- After Player 2 divides by 2: `(+1, 7)` (Player 1's turn, number is 7)
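
The transitions above can be reproduced with a standalone copy of the transition rule. This is a minimal sketch for illustration; the free function `succ` below mirrors `HalvingGame.succ` but is not the class method itself.

```python
def succ(state, action):
    """Standalone copy of the transition rule, for illustration only."""
    player, number = state
    if action == '-':
        return (-player, number - 1)   # subtract 1 and switch players
    if action == '/':
        return (-player, number // 2)  # halve (rounding down) and switch players
    raise ValueError(f"unknown action: {action!r}")

state = (+1, 15)
trace = [state]
for action in ['-', '/']:
    state = succ(state, action)
    trace.append(state)
print(trace)  # [(1, 15), (-1, 14), (1, 7)]
```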

## 4. Player Policies

Now let's look at the different policies (strategies) that players can use:


In [3]:
def simplePolicy(game, state):
    """
    A simple policy that always chooses to subtract 1.
    
    Args:
        game: The game object
        state: Current game state (player, number)
        
    Returns:
        str: Always returns '-' action
    """
    action = '-'
    print('simplePolicy: state {} => action {}'.format(state, action))
    return action

def humanPolicy(game, state):
    """
    Allows a human player to make a move by entering '-' or '/'.
    Adapted for reliable Jupyter Notebook input.
    """
    player_num = "1" if state[0] == 1 else "2"
    current_num = state[1]
    
    # Display current state in a clear format
    print(f"Player {player_num}'s turn (current number: {current_num})")
    
    # Use Python's built-in input function, which works in Jupyter
    action = input("Enter your move ('-' to subtract 1, '/' to divide by 2): ").strip()
    
    # Validate the input
    while action not in game.actions(state):
        print("Invalid move. Please enter '-' or '/'.")
        action = input("Enter your move ('-' to subtract 1, '/' to divide by 2): ").strip()
    
    # Confirm the action
    print(f"Player {player_num} chose: {action}")
    
    return action


## 5. Minimax Algorithm - Your Task

Now for the main part of the assignment! You need to implement the minimax algorithm for the Halving Game.

The framework for the minimax policy is provided below, but the `recurse` function is incomplete. Your task is to implement this function following the minimax algorithm principles.

In [4]:
def minimaxPolicy(game, state):
    """
    Uses the minimax algorithm to select the optimal move.
    
    Args:
        game: The game object
        state: Current game state (player, number)
        
    Returns:
        str: The optimal action determined by minimax
    """
    def recurse(state):
        """
        Recursive helper function that implements the minimax algorithm.
        
        Args:
            state: Current game state (player, number)
            
        Returns:
            tuple: (utility, action) where utility is the best achievable utility
                  from this state and action is the move that achieves it
        """
        # YOUR CODE HERE
        # 
        # Implement the minimax algorithm to determine the optimal action
        # 
        # The function should:
        # 1. Check if the state is a terminal state (game is over)
        # 2. If it is, return the utility value and None for the action
        # 3. If not, explore all possible actions and their resulting states
        # 4. Calculate the utility of each action recursively
        # 5. Return the best action and its utility based on which player is moving
        #    - For Player 1 (+1), return the action with the maximum utility
        #    - For Player 2 (-1), return the action with the minimum utility
        #
        # Hint: The algorithm will need to:
        #   - Use game.isEnd(state) to check if the game is over
        #   - Use game.utility(state) to get the utility of a terminal state
        #   - Use game.actions(state) to get valid actions
        #   - Use game.succ(state, action) to get successor states
        #   - Use game.player(state) to determine which player is moving
        #
        # The function should return a tuple: (utility, action)
        # where utility is the best achievable utility from this state
        # and action is the move that achieves it
        if game.isEnd(state):
            return game.utility(state), None
        
        player = game.player(state)
        best_action = None
        best_utility = float('-inf') if player == 1 else float('inf')
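        for action in game.actions(state):
            next_state = game.succ(state, action)
            utility, _ = recurse(next_state)
            # Compare inside the loop so that every action is considered.
            # The `best_action is None` check guarantees that an action is
            # returned even when every move leads to a loss.
            if player == 1:
                # Player 1 maximizes the utility
                if best_action is None or utility > best_utility:
                    best_utility = utility
                    best_action = action
            elif player == -1:
                # Player 2 minimizes the utility
                if best_action is None or utility < best_utility:
                    best_utility = utility
                    best_action = action
            else:
                raise ValueError('Unknown player: {}'.format(player))
        return (best_utility, best_action)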
        
    # Get the optimal action by running the minimax algorithm
    utility, action = recurse(state)
    print('minimaxPolicy: state {} => action {} with utility {}'.format(state, action, utility))
    return action


## 6. Game Execution

Finally, the code that executes the game:


In [5]:
from IPython.display import clear_output

# Create a game with starting number N=15
game = HalvingGame(N=15)

# Define players: Player 1 (+1) and Player 2 (-1)
policies = {
    +1: humanPolicy,  # Player 1 is human
    -1: humanPolicy,  # Player 2 is human (for 2-player mode)
    # Alternate option: -1: minimaxPolicy,  # Player 2 is AI using minimax
}

print("Welcome to the Halving Game!")
print("Starting number: 15")
print("Each turn, you can either subtract 1 ('-') or divide by 2 ('/')")
print("The player who reduces the number to 0 loses")
print("This is a 2-player game where both players are human")

# Game history tracking
game_history = []

# Game loop
state = game.startState()
turn_counter = 1

while not game.isEnd(state):
    # Display current game state clearly
    player = game.player(state)
    player_name = f"Player {1 if player == 1 else 2}"
    current_number = state[1]
    
    print(f"\n--- Turn {turn_counter} ---")
    print(f"Current number: {current_number}")
    print(f"{player_name}'s turn")
    
    # Ask policy to make a move
    policy = policies[player]
    action = policy(game, state)
    
    # Record move in history
    game_history.append((player_name, current_number, action))
    
    # Advance state
    state = game.succ(state, action)
    turn_counter += 1
    
    # Optional: clear previous output (uncomment if desired)
    # clear_output(wait=True)

# Game over - display result
player, number = state
final_utility = game.utility(state)

print("\n=== GAME OVER ===")

# Display game history
print("\nGame History:")
for i, (p_name, num, act) in enumerate(game_history):
    print(f"Turn {i+1}: {p_name} faced number {num} and chose '{act}'")

# Display winner
if final_utility > 0:
    print('\nPlayer 1 wins! Player 2 reduced the number to 0 and lost.')
elif final_utility < 0:
    print('\nPlayer 2 wins! Player 1 reduced the number to 0 and lost.')

print(f'Final utility of game is {game.utility(state)}')

Welcome to the Halving Game!
Starting number: 15
Each turn, you can either subtract 1 ('-') or divide by 2 ('/')
The player who reduces the number to 0 loses
This is a 2-player game where both players are human

--- Turn 1 ---
Current number: 15
Player 1's turn
Player 1's turn (current number: 15)
Player 1 chose: /

--- Turn 2 ---
Current number: 7
Player 2's turn
Player 2's turn (current number: 7)
Player 2 chose: /

--- Turn 3 ---
Current number: 3
Player 1's turn
Player 1's turn (current number: 3)
Player 1 chose: -

--- Turn 4 ---
Current number: 2
Player 2's turn
Player 2's turn (current number: 2)
Player 2 chose: -

--- Turn 5 ---
Current number: 1
Player 1's turn
Player 1's turn (current number: 1)
Player 1 chose: -

=== GAME OVER ===

Game History:
Turn 1: Player 1 faced number 15 and chose '/'
Turn 2: Player 2 faced number 7 and chose '/'
Turn 3: Player 1 faced number 3 and chose '-'
Turn 4: Player 2 faced number 2 and chose '-'
Turn 5: Player 1 faced number 1 and chose '-'

P

## 7. Assignment Task

Your task is to implement the `recurse` function in the `minimaxPolicy` to create a perfect AI player for the Halving Game. The minimax algorithm should:

1. Explore all possible game states
2. Evaluate terminal states using the utility function
3. For non-terminal states, recursively determine the best action:
   - For Player 1 (maximizing player), choose the action with the highest utility
   - For Player 2 (minimizing player), choose the action with the lowest utility

## 8. Example Minimax Trace

To help you understand how minimax works in the Halving Game, here's a trace for a small example with N=3:

1. Initial state: (+1, 3) - Player 1's turn, number is 3
   * Player 1 has two options:
     * Subtract 1: leads to (-1, 2)
     * Divide by 2: leads to (-1, 1)

2. For (-1, 2) - Player 2's turn, number is 2
   * Player 2 has two options:
     * Subtract 1: leads to (+1, 1)
     * Divide by 2: leads to (+1, 1) (same as above, since 2//2 = 1)

3. For (+1, 1) - Player 1's turn, number is 1
   * Player 1 has two options:
     * Subtract 1: leads to (-1, 0) - Player 2 wins (utility = -∞)
     * Divide by 2: leads to (-1, 0) - Player 2 wins (utility = -∞)

4. For (-1, 1) - Player 2's turn, number is 1
   * Player 2 has two options:
     * Subtract 1: leads to (+1, 0) - Player 1 wins (utility = +∞)
     * Divide by 2: leads to (+1, 0) - Player 1 wins (utility = +∞)

5. Working backwards:
   * From (-1, 1), Player 2 will choose an action leading to (+1, 0), resulting in utility +∞
   * From (+1, 1), Player 1 will choose an action leading to (-1, 0), resulting in utility -∞
   * From (-1, 2), Player 2 will choose an action leading to (+1, 1), resulting in utility -∞
   * From (+1, 3), Player 1 will choose an action leading to (-1, 1), resulting in utility +∞

6. Therefore, from the initial state (+1, 3), Player 1's optimal move is to divide by 2 (leading to state (-1, 1)), which gives utility +∞, meaning Player 1 can force a win.
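
The trace can be checked with a compact, standalone version of the recursion. This is only a sketch: the function name `minimax_value` is hypothetical, and it hard-codes the two moves rather than using the `HalvingGame` class.

```python
import math

def minimax_value(state):
    """Return (utility, action) for a (player, number) state."""
    player, number = state
    if number == 0:
        # Same convention as utility(): the player facing 0 is the winner
        return player * math.inf, None
    choices = []
    for action, nxt in (('-', number - 1), ('/', number // 2)):
        value, _ = minimax_value((-player, nxt))
        choices.append((value, action))
    # Player 1 (+1) maximizes, Player 2 (-1) minimizes; ties keep the first action
    pick = max if player == 1 else min
    return pick(choices, key=lambda c: c[0])

print(minimax_value((+1, 1)))  # (-inf, '-')
print(minimax_value((-1, 1)))  # (inf, '-')
print(minimax_value((-1, 2)))  # (-inf, '-')
print(minimax_value((+1, 3)))  # (inf, '/')
```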

## 9. Testing Your Implementation

After implementing the minimax algorithm, test it by playing against it as Player 1. The AI should play perfectly, which means:

- For certain starting numbers (like N=4), the AI will always win, no matter how you play
- For other starting numbers (like N=15, where subtracting 1 leaves the AI facing 14), you can win with optimal play

Try to determine for which values of N the first player can force a win, and for which values the second player can force a win.
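
One way to check your answer for small N is a bottom-up table instead of a full tree search. The sketch below assumes the rules as stated above; the function name `first_player_wins` is hypothetical.

```python
def first_player_wins(max_n):
    """wins[n] is True when the player to move at number n can force a win."""
    wins = [True]  # n = 0: the previous mover reduced the number to 0 and lost
    for n in range(1, max_n + 1):
        # A position is winning if some move leaves the opponent in a losing one
        wins.append(not wins[n - 1] or not wins[n // 2])
    return wins

wins = first_player_wins(15)
losing = [n for n in range(1, 16) if not wins[n]]
print(losing)  # [1, 4, 6, 10, 14]
```

Every starting number up to 15 that is not in this list is a win for the first player.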

## 10. Extension Challenges

Once you've completed the basic minimax implementation, consider these extensions:

1. Add time tracking to measure how long the algorithm takes to make decisions
2. Implement alpha-beta pruning to improve efficiency
3. Modify the game to allow additional actions (like multiplying by 3)
4. Create a visualization of the game tree for small values of N
5. Implement a depth-limited version of minimax with a heuristic evaluation function
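
As a starting point for the first extension, here is a minimal timing sketch. The helpers `timed` and `count_nodes` are hypothetical names; `count_nodes` stands in for a plain minimax search by counting the states it would visit.

```python
import time

def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

def count_nodes(n):
    """Number of states a plain (unmemoized) minimax visits from number n."""
    if n == 0:
        return 1
    return 1 + count_nodes(n - 1) + count_nodes(n // 2)

for n in (5, 15, 25):
    nodes, elapsed = timed(count_nodes, n)
    print(f"N={n}: {nodes} nodes in {elapsed:.6f} s")
```

`time.perf_counter()` is used here because it is intended for measuring short durations.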

Good luck!


In [None]:
import time
import networkx as nx
import matplotlib.pyplot as plt

class betterHG(HalvingGame):
    """
    An improved HalvingGame with:
      1. Time tracking for decision-making
      2. Alpha-beta pruning
      3. Optional multiply-by-3 ('*') and add-2 ('+') actions
      4. Game tree visualization
      5. Depth-limited minimax with heuristic evaluation
    """
    def __init__(self, N, allow_multiply=False):
        super().__init__(N)
        self.allow_multiply = allow_multiply
        self.decision_times = []  # store elapsed times for each decision

    def actions(self, state):
        # Start with the base actions, then optionally add multiply-by-3 ('*') and add-2 ('+')
        base_actions = super().actions(state)
        if self.allow_multiply:
            base_actions = base_actions + ['*', '+']
        return base_actions

    def succ(self, state, action):
        # Override the base-class transition (named `succ`) so the game loop
        # also handles the extra actions; defer '-' and '/' to the base class
        player, number = state
        if action == '*':
            return (-player, number * 3)
        elif action == '+':
            return (-player, number + 2)
        return super().succ(state, action)

    # Alias so that code calling `result` uses the same transition rule
    result = succ

    def evaluate(self, state):
        """
        Heuristic evaluation: smaller numbers are better for current player.
        Utility approximated as negative of the number times the player.
        """
        player, number = state
        return -number * player

    def minimax(self, state, depth_limit=None):
        """
        Depth-limited minimax with alpha-beta pruning and time tracking.

        Args:
            state: current game state
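            depth_limit: maximum search depth (None for unlimited; set a limit
                when allow_multiply is True, because '*' and '+' let the
                number grow and the search may then never reach a terminal state)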

        Returns:
            best_action: action chosen
        """
        start_time = time.time()

        def recurse(state, alpha, beta, depth):
            player, number = state
            # Terminal test
            if number == 0:
                return self.utility(state), None
            # Depth limit test
            if depth_limit is not None and depth >= depth_limit:
                return self.evaluate(state), None

            best_action = None
            if player == 1:
                value = -float('inf')
                for action in self.actions(state):
                    util, _ = recurse(self.result(state, action), alpha, beta, depth + 1)
                    if util > value:
                        value, best_action = util, action
                    alpha = max(alpha, value)
                    if beta <= alpha:
                        break  # beta cutoff
                return value, best_action
            else:
                value = float('inf')
                for action in self.actions(state):
                    util, _ = recurse(self.result(state, action), alpha, beta, depth + 1)
                    if util < value:
                        value, best_action = util, action
                    beta = min(beta, value)
                    if beta <= alpha:
                        break  # alpha cutoff
                return value, best_action

        utility, best_action = recurse(state, -float('inf'), float('inf'), 0)
        elapsed = time.time() - start_time
        self.decision_times.append(elapsed)
        print(f"Minimax decision time: {elapsed:.6f} seconds, utility: {utility}")
        return best_action

    def visualize(self, state, max_depth=3):
        """
        Build and plot the game tree up to max_depth from the given state.
        """
        G = nx.DiGraph()

        def build(node_state, depth):
            label = f"{node_state}"
            G.add_node(label)
            if depth >= max_depth or node_state[1] == 0:
                return
            for action in self.actions(node_state):
                child = self.result(node_state, action)
                child_label = f"{child}"
                G.add_edge(label, child_label, action=action)
                build(child, depth + 1)

        build(state, 0)
        pos = nx.spring_layout(G)
        plt.figure(figsize=(8, 6))
        nx.draw(G, pos, with_labels=True, node_size=1000, font_size=8)
        edge_labels = {(u, v): d['action'] for u, v, d in G.edges(data=True)}
        nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels, font_size=6)
        plt.title("Game Tree Visualization")
        plt.show()
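
To see what pruning buys on the base game, the following standalone sketch counts visited states with and without the cutoff. It is an illustration under the base rules only ('-' and '/'); the function name `search` and the `counter` argument are hypothetical and independent of the `betterHG` class.

```python
import math

def search(state, alpha=-math.inf, beta=math.inf, prune=True, counter=None):
    """Minimax value of a (player, number) state; counter[0] counts visited states."""
    if counter is not None:
        counter[0] += 1
    player, number = state
    if number == 0:
        return player * math.inf  # the player facing 0 is the winner
    best = -math.inf if player == 1 else math.inf
    for nxt in (number - 1, number // 2):
        value = search((-player, nxt), alpha, beta, prune, counter)
        if player == 1:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if prune and beta <= alpha:
            break  # the remaining moves cannot change the result
    return best

for n in (10, 15, 20):
    full, pruned = [0], [0]
    v_full = search((+1, n), prune=False, counter=full)
    v_pruned = search((+1, n), prune=True, counter=pruned)
    print(f"N={n}: same value={v_full == v_pruned}, "
          f"states without pruning={full[0]}, with pruning={pruned[0]}")
```

Both searches should agree on the value; only the number of visited states differs.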


In [7]:
from IPython.display import clear_output

# Create a game with starting number N=15
game = betterHG(N=15)

# Define players: Player 1 (+1) and Player 2 (-1)
policies = {
    +1: humanPolicy,  # Player 1 is human
    -1: humanPolicy,  # Player 2 is human (for 2-player mode)
    # Alternate option: -1: minimaxPolicy,  # Player 2 is AI using minimax
}

print("Welcome to the Halving Game!")
print("Starting number: 15")
print("Each turn, you can either subtract 1 ('-') or divide by 2 ('/'); with allow_multiply=True you can also add 2 ('+') or multiply by 3 ('*')")
print("The player who reduces the number to 0 loses")
print("This is a 2-player game where both players are human")

# Game history tracking
game_history = []

# Game loop
state = game.startState()
turn_counter = 1

while not game.isEnd(state):
    # Display current game state clearly
    player = game.player(state)
    player_name = f"Player {1 if player == 1 else 2}"
    current_number = state[1]
    
    print(f"\n--- Turn {turn_counter} ---")
    print(f"Current number: {current_number}")
    print(f"{player_name}'s turn")
    
    # Ask policy to make a move
    policy = policies[player]
    action = policy(game, state)
    
    # Record move in history
    game_history.append((player_name, current_number, action))
    
    # Advance state
    state = game.succ(state, action)
    turn_counter += 1
    
    # Optional: clear previous output (uncomment if desired)
    # clear_output(wait=True)

# Game over - display result
player, number = state
final_utility = game.utility(state)

print("\n=== GAME OVER ===")

# Display game history
print("\nGame History:")
for i, (p_name, num, act) in enumerate(game_history):
    print(f"Turn {i+1}: {p_name} faced number {num} and chose '{act}'")

# Display winner
if final_utility > 0:
    print('\nPlayer 1 wins! Player 2 reduced the number to 0 and lost.')
elif final_utility < 0:
    print('\nPlayer 2 wins! Player 1 reduced the number to 0 and lost.')

print(f'Final utility of game is {game.utility(state)}')

Welcome to the Halving Game!
Starting number: 15
Each turn, you can either subtract 1 ('-'), divide by 2 ('/'), sum by 2 ('+') or multiply by 2 ('*')
The player who reduces the number to 0 loses
This is a 2-player game where both players are human

--- Turn 1 ---
Current number: 15
Player 1's turn
Player 1's turn (current number: 15)
Player 1 chose: -

--- Turn 2 ---
Current number: 14
Player 2's turn
Player 2's turn (current number: 14)
Player 2 chose: -

--- Turn 3 ---
Current number: 13
Player 1's turn
Player 1's turn (current number: 13)
Player 1 chose: -

--- Turn 4 ---
Current number: 12
Player 2's turn
Player 2's turn (current number: 12)
Player 2 chose: -

--- Turn 5 ---
Current number: 11
Player 1's turn
Player 1's turn (current number: 11)
Player 1 chose: -

--- Turn 6 ---
Current number: 10
Player 2's turn
Player 2's turn (current number: 10)
Player 2 chose: -

--- Turn 7 ---
Current number: 9
Player 1's turn
Player 1's turn (current number: 9)
Player 1 chose: -

--- Turn 8 