<div style="text-align: right" align="right"><i>Peter Norvig, December 2025</i></div>

# Advent of Code 2025: The AI LLM Edition

*This notebook shows some solutions by Gemini, Claude, and ChatGPT, three AI Large Language Models (LLMs),  for the 2025 [**Advent of Code**](https://adventofcode.com/) (AoC) programming puzzles. In order to understand each puzzle, you'll have to look at the problem descriptions at [**Advent of Code**](https://adventofcode.com/2025) for each [**Day**](https://adventofcode.com/2025/day/1), and you can also look at [**my solutions**](https://github.com/norvig/pytudes/blob/main/ipynb/Advent-2025.ipynb), which I did before asking the LLMs for theirs.*

*All the code in this notebook is written by an LLM (except for the one line where I call the LLM's code for each puzzle). My comments (like this one) are in italics, and my prompts given to the LLMs are in **bold italics**. Sometimes I quote the LLM's responses; those are in* regular roman font.

*Each day I'll choose a model and give it the prompt* "***Write code to solve the following problem:***" *followed by the full text of the AoC Part 1 problem description. Then I'll prompt again with* "***There is a change to the specification:***" *followed by the AoC Part 2 description.*

# Conclusions

*Now that the 12 days are finished, here are my conclusions:*

- *Overall, the LLMs did **very well**, producing code that gives the correct answer to every puzzle.*
- *I'm beginning to think I should use an LLM as an assistant for all my coding, not just as an experiment like this.*
- *This is a huge improvement over just one year ago, when LLMs could not perform anywhere near this level.*
- *The three LLMs seemed to be roughly equal in quality.*
- *I neglected to track the time it took them to produce the code, but it was a lot faster than me–maybe 20 times faster.*
- *The LLMs knew the things you would want an experienced software engineer to know:*
    - *How to see through the story about elves and Christmas trees, etc. and get to the real programming issues*
    - *Standard Python syntax, builtin types, and basic modules (e.g. `collections`, `functools`, `typing`, `numpy`)*
    - *using the `re` module and/or `str.split` to parse input, even when it is in a somewhat tricky format*
    - *modular arithmetic*
    - *memoization / caching / dynamic programming*
    - *O*(*n*) *reasoning (such as knowing that enumerating up to √n is better than up to n)*
    - *stacks and queues*
    - *sorting and merging*
    - *when to use sets versus lists*
    - *handling a 2D grid of points with 4 or 8 directional neighbors*
    - *accumulating sums in a defaultdict or Counter*
    - *advanced esoteric data structures such as Union-Find and dancing links*
    - *computational geometry algorithms including scan lines, flood fill, and ray-casting*
    - *recognizing an integer linear programming problem and knowing how to call a package*
    - *depth-first search, meet-in-the-middle search, and recognizing search properties such as commutativity of actions*
    - *data classes*
    - *sometimes type annotations (on about 1/3 of the solutions)*
    - *sometimes good doc strings and comments (but not always, and sometimes too many comments).*
- *Problems 9.2 and 12.1 had a "trick" that allowed for a simpler, faster solution. For these I gave ChatGPT my input file, so it would have a chance of finding the trick. It didn't, but perhaps it was trying to solve the general problem over all possible inputs, whereas I understood that if AoC presents a trick input to me, they will present the same trick input to everyone.*
- *Much of the code (from all 3 LLMs) could be improved stylistically. In many cases the code was one long function that has the parsing of input, the intermediate processing, and the return of the final result all intertwined, clearly violating Robert Martin's maxim of "Functions should do one thing."*
- *The run times were reasonably fast, all under half a second. However, this was about 3 times slower than my code. (I'm not counting 12.1, which had a run time of about 2 minutes, but missed the "trick" that no search is actually required. It would be unfair to compare it to my code, which ran much faster, but would fail if a search was required.)*
- *The LLMs' code is about five times more verbose than mine. There are a few reasons:*
    - *I benefited from a few key utility functions to do things like "return a tuple of all the integers in a text string." For most problems I could parse the input in 1 or 2 lines of code, while the LLM would take 20.*
    - *The LLMs were being extra robust in doing error checking, while I recognized that within the confines of an AoC puzzle the input will always follow the prescribed format exactly.*
    - *I use a functional style; the LLMs were imperative. So I did `sum(int(max_joltage(bank)) for bank in banks)` whereas ChatGPT did:*

          total = 0
          for line in input_text.strip().splitlines():
              line = line.strip()
              if not line:
                  continue
              total += max_joltage_for_bank(line)
          return total
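
*As an illustration of the kind of utility mentioned above: the sketch below is a hypothetical helper written for this note, not the actual code in AdventUtils. The name `ints` and the `signed` flag are assumptions. It shows how a single regular expression can replace most line-by-line parsing when the input format is trusted.*

```python
import re

def ints(text: str, signed: bool = True) -> tuple[int, ...]:
    """Return a tuple of all the integers in text (hypothetical helper).

    With signed=False a '-' is treated as a separator, which suits
    inputs that write ranges as "3-5".
    """
    pattern = r'-?\d+' if signed else r'\d+'
    return tuple(int(token) for token in re.findall(pattern, text))

print(ints("move 3 from 10 to -2"))   # (3, 10, -2)
print(ints("3-5", signed=False))      # (3, 5)
```

*The trade-off is the one described in the list above: this does no validation at all, so it is only appropriate when the input is known to follow its format exactly.*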


***Note:*** *For brevity, I have removed some of the LLM output, such as:*
- *Usage examples on how to run the program on the test input*.
- *Prose analysis of the problem, descriptions of the programs, and chain of thought thinking.*
  - *In most cases these were accurate and thorough! Great job! But they were too long to include here.*
- *The "`#!/usr/bin/env python3`" and "`__main__`" idioms for command line execution.*
  - *In retrospect, I should have specified in the prompt that* "***You can get the input text as a single string with `get_text(day)`***".

# Day 0: Human

*On Day 0 I load my [**AdventUtils.ipynb**](AdventUtils.ipynb)  so I can access two of my utilities:*
- *`get_text(day)` returns the complete text of the input file for `current_year` and `day`*
- *`answer` verifies whether the LLM got the right answer to the problem, and records the run time.*

In [1]:
%run AdventUtils.ipynb
current_year = 2025

# Day 1: Gemini 3 Pro

*The [**Day 1 Part 1**](https://adventofcode.com/2025/day/1) puzzle is about turning the dial on a safe and counting how many times the pointer ends up at 0.*

*I started with the Gemini 3 Pro Fast model, which produced this  code:*

In [2]:
def solve_safe_password(input_str):
    """
    Simulates the safe dial rotations and counts how many times the dial
    lands on 0.

    Args:
        rotations: A list of strings, each representing a rotation (e.g., "L68").

    Returns:
        The total number of times the dial points at 0 after a rotation.
    """
    
    # Split the single string into individual lines
    rotations = input_str.strip().split('\n')
    
    # The dial starts at 50
    current_position = 50
    zero_count = 0
    
    # The dial has numbers 0 through 99, so all math is modulo 100
    MODULO = 100

    for rotation in rotations:
        # 1. Parse the instruction
        direction = rotation[0]
        # Distance is the rest of the string converted to an integer
        distance = int(rotation[1:])

        # 2. Determine the rotation value
        if direction == 'R':
            rotation_value = distance
        elif direction == 'L':
            # Left rotations subtract from the current position
            rotation_value = -distance
        else:
            # Should not happen based on problem description
            print(f"Error: Unknown direction '{direction}' in rotation: {rotation}")
            continue

        # 3. Calculate the new position and apply Modulo 100
        # The standard Python % operator handles negative numbers correctly
        # for a positive modulus (e.g., (-1 - 99) % 100 == 0).
        new_position = (current_position + rotation_value) % MODULO

        # 4. Check for Zero
        if new_position == 0:
            zero_count += 1
        
        # 5. Update Position
        current_position = new_position
        
        # Optional: Print step-by-step for debugging/verification
        # print(f"Rotation {rotation}: new position = {new_position}, zero_count = {zero_count}")

    return zero_count

*A bit verbose and over-commented (and under-type-annotated), but pretty nice. Below I fetch my Day 1 data with `get_text(1)`, pass it to `solve_safe_password`,  and verify that Gemini's answer is correct:*

In [3]:
rotations = get_text(1)

answer(puzzle=1.1, solution=1182, code=lambda: 
       solve_safe_password(rotations))

Puzzle  1.1:    0.7 msec, correct answer: 1182           

*For **Part 2** I gave Gemini Fast the updated instructions (that we should count a zero crossing in the middle of a rotation), and again it had a nice analysis and the following code:*

In [4]:
import math

def solve_new_safe_password(input_str):
    """
    Simulates the safe dial rotations and counts every time the dial
    points at 0, including during the rotation movement.

    Args:
        rotations: A list of strings, each representing a rotation (e.g., "L68").

    Returns:
        The total number of times the dial points at 0.
    """
    
    # Split the single string into individual lines
    rotations = input_str.strip().split('\n')
    
    # The dial starts at 50
    current_position = 50
    total_zero_crossings = 0
    MODULO = 100

    for rotation in rotations:
        direction = rotation[0]
        distance = int(rotation[1:])
        
        S = current_position # Start Position
        D = distance         # Distance
        
        crossings_for_this_rotation = 0

        if direction == 'L':
            # --- LEFT ROTATION ---
            # Dial moves from S -> S-1 -> ... -> 0 -> 99 -> ...
            
            # 1. Distance from S to the first 0: S clicks
            # If D is large enough to cover this distance, the first 0 is crossed.
            
            if D >= S:
                # The first crossing (at position 0) is achieved
                crossings_for_this_rotation += 1
                
                # 2. Remaining distance after first 0 crossing
                remaining_distance = D - S
                
                # 3. Subsequent crossings: The dial crosses 0 every 100 clicks
                # after the first one.
                crossings_for_this_rotation += remaining_distance // MODULO
                
        elif direction == 'R':
            # --- RIGHT ROTATION ---
            # Dial moves from S -> S+1 -> ... -> 99 -> 0 -> 1 -> ...
            
            # 1. Distance from S to the first 0: 100 - S clicks
            # (e.g., 50 to 0 is 50 clicks; 99 to 0 is 1 click)
            distance_to_first_zero = MODULO - S
            
            if D >= distance_to_first_zero:
                # The first crossing (at position 0) is achieved
                crossings_for_this_rotation += 1
                
                # 2. Remaining distance after first 0 crossing
                remaining_distance = D - distance_to_first_zero
                
                # 3. Subsequent crossings: The dial crosses 0 every 100 clicks
                crossings_for_this_rotation += remaining_distance // MODULO
        
        # Add the crossings for this rotation to the total
        total_zero_crossings += crossings_for_this_rotation

        # Calculate the final position to set up for the next rotation
        if direction == 'R':
            rotation_value = distance
        else: # 'L'
            rotation_value = -distance
            
        current_position = (current_position + rotation_value) % MODULO

    return total_zero_crossings

*Note that the code does `import math` but then never uses the math module. (I've done that, but I try to eventually clean things up. Maybe that's the difference between Gemini "Fast" and Gemini "Thinking" models.)*

*Here I check whether the code is correct:*

In [5]:
answer(1.2, 6907, lambda: 
       solve_new_safe_password(rotations))

Puzzle  1.2:    0.8 msec, WRONG!! answer: 7509           ; EXPECTED: 6907

*Too bad; the code has a **bug**. I switched to the Gemini 3 Pro "Thinking" model and gave the prompt "**That's not quite right**." It responded with*

    It seems I made a subtle logic error in my previous calculation for the Right (R) rotation.
    When the dial is at 0, the distance to the *next* zero is **100**, not 0.

*which is exactly the error I made on *my* first attempt! Gemini produced the following updated code:*

In [6]:
def solve_safe_password_method_2(input_str):
    """
    Simulates the safe dial rotations and counts every time the dial
    points at 0 (Method 0x434C49434B).
    
    Correctly handles cases where the dial starts at 0.
    """
        
    # Split the single string into individual lines
    rotations = input_str.strip().split('\n')
    
    current_position = 50
    total_zero_crossings = 0
    MODULO = 100

    for rotation in rotations:
        direction = rotation[0]
        distance = int(rotation[1:])
        
        # Determine distance required to hit the *next* 0
        if direction == 'L':
            if current_position == 0:
                dist_to_first_zero = 100
            else:
                dist_to_first_zero = current_position
                
        elif direction == 'R':
            if current_position == 0:
                dist_to_first_zero = 100
            else:
                dist_to_first_zero = 100 - current_position

        # Check if the rotation distance is enough to reach the first zero
        if distance >= dist_to_first_zero:
            # We hit 0 at least once
            count = 1
            
            # Remove the distance traveled to the first zero
            remaining_distance = distance - dist_to_first_zero
            
            # Add any subsequent full rotations (every 100 clicks hits 0 again)
            count += remaining_distance // 100
            
            total_zero_crossings += count

        # Update position for next step
        if direction == 'R':
            current_position = (current_position + distance) % 100
        else:
            current_position = (current_position - distance) % 100

    return total_zero_crossings

*This second try is indeed correct:*

In [7]:
answer(1.2, 6907, lambda: 
       solve_safe_password_method_2(rotations))

Puzzle  1.2:    0.8 msec, correct answer: 6907           
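
*The off-by-one that Gemini describes is easy to check in isolation. The sketch below is an illustration, not part of any model's output; `zeros_by_clicking` and `zeros_closed_form` are hypothetical names. The first function counts zeros one click at a time. The second uses the "distance to the first zero, then one more per 100 clicks" idea, treating a dial that starts on 0 as being 100 clicks away from its next zero.*

```python
def zeros_by_clicking(position: int, direction: str, distance: int) -> int:
    """Count the clicks that land on 0, moving the dial one click at a time."""
    step = 1 if direction == 'R' else -1
    hits = 0
    for _ in range(distance):
        position = (position + step) % 100
        hits += (position == 0)
    return hits

def zeros_closed_form(position: int, direction: str, distance: int) -> int:
    """Same count without a loop."""
    to_first = (100 - position) if direction == 'R' else position
    if to_first == 0:
        to_first = 100   # starting on 0 does not count as reaching 0
    if distance < to_first:
        return 0
    return 1 + (distance - to_first) // 100

# The case the first attempt got wrong: starting on 0 and turning left 5 clicks.
print(zeros_by_clicking(0, 'L', 5), zeros_closed_form(0, 'L', 5))   # 0 0
```

*Comparing a fast formula against a slow but obviously correct simulation on small cases is a cheap way to catch this kind of boundary error before submitting an answer.*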

# Day 2: Claude Opus 4.5

*For [**Day 2 Part 1**](https://adventofcode.com/2025/day/2) Claude wrote code that produces the correct answer but prints a lot of unnecessary debugging output along the way. I prompted it to "**Change the code to not print anything, just return the answer**" and got this:*

In [8]:
def find_invalid_ids_in_range(start, end):
    invalid_ids = []
    min_digits = len(str(start))
    max_digits = len(str(end))
    
    for total_digits in range(min_digits, max_digits + 1):
        if total_digits % 2 != 0:
            continue
        
        half_digits = total_digits // 2
        pattern_start = 1 if half_digits == 1 else 10 ** (half_digits - 1)
        pattern_end = 10 ** half_digits - 1
        for pattern in range(pattern_start, pattern_end + 1):
            pattern_str = str(pattern)
            invalid_id = int(pattern_str + pattern_str)
            if start <= invalid_id <= end:
                invalid_ids.append(invalid_id)
    
    return invalid_ids


def solve(input_str):
    input_str = input_str.strip().replace('\n', '').replace(' ', '').rstrip(',')
    total = 0
    for r in input_str.split(','):
        if r:
            start, end = map(int, r.split('-'))
            total += sum(find_invalid_ids_in_range(start, end))
    return total

*This code is overall rather nice, but conspicuously lacks comments and doc strings. (If you can't tell what the code is doing without comments, the goal is to sum the invalid IDs, which are numbers that consist of the same digit string repeated twice, like "100100", within some ID ranges.) It uses the more efficient "enumerate over the first half of the digit string" strategy, but is not precise in narrowing down the range it enumerates over. For example, for the range "999000-1009000", this code will enumerate the range (100, 1000), when it could enumerate just the range (999, 1000).*
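
*One way to narrow the enumeration, sketched here as an assumption and not as anyone's actual solution: a pattern `p` with `h` digits, written twice, equals `p * (10**h + 1)`. Dividing the range endpoints by that factor therefore gives the exact bounds on `p`. The function name `doubled_ids` is hypothetical.*

```python
def doubled_ids(start: int, end: int) -> list[int]:
    """IDs in [start, end] whose decimal form is some digit string written twice."""
    ids = []
    for half in range(1, len(str(end)) // 2 + 1):
        factor = 10 ** half + 1                          # p * factor == p written twice
        lo = max(10 ** (half - 1), -(-start // factor))  # ceiling division, no leading zero
        hi = min(10 ** half - 1, end // factor)
        ids.extend(p * factor for p in range(lo, hi + 1))
    return ids

print(doubled_ids(999000, 1009000))   # [999999]
print(doubled_ids(11, 22))            # [11, 22]
```

*With these bounds the inner loop only visits patterns that actually produce an ID inside the range.*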

*I verified that the code gives the correct answer:*

In [9]:
input_str = get_text(2)

answer(2.1, 23560874270, lambda:
       solve(input_str))

Puzzle  2.1:   36.7 msec, correct answer: 23560874270    

*When given the **Part 2** instructions (where invalid IDs now can be any number of repetitions, like "100100100"), Claude wrote:*

In [10]:
def find_invalid_ids_in_range(start, end):
    invalid_ids = set()
    min_digits = len(str(start))
    max_digits = len(str(end))
    
    for total_digits in range(min_digits, max_digits + 1):
        # Pattern length can be 1 to total_digits//2 (need at least 2 repetitions)
        for pattern_len in range(1, total_digits // 2 + 1):
            if total_digits % pattern_len != 0:
                continue
            
            repetitions = total_digits // pattern_len
            if repetitions < 2:
                continue
            
            pattern_start = 1 if pattern_len == 1 else 10 ** (pattern_len - 1)
            pattern_end = 10 ** pattern_len - 1
            
            for pattern in range(pattern_start, pattern_end + 1):
                pattern_str = str(pattern)
                invalid_id = int(pattern_str * repetitions)
                if start <= invalid_id <= end:
                    invalid_ids.add(invalid_id)
    
    return invalid_ids


def solve(input_str):
    input_str = input_str.strip().replace('\n', '').replace(' ', '').rstrip(',')
    total = 0
    for r in input_str.split(','):
        if r:
            start, end = map(int, r.split('-'))
            total += sum(find_invalid_ids_in_range(start, end))
    return total

*Claude's code produces the correct answer:*

In [11]:
answer(2.2, 44143124633, lambda:
       solve(input_str))

Puzzle  2.2:   38.2 msec, correct answer: 44143124633    
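
*For cross-checking an answer like this on small ranges, a regular expression with a backreference gives a direct definition of "a digit string repeated two or more times". This is a sketch only, with hypothetical names (`is_repeated`, `brute_force_sum`). It tests every number in the range, so it is far too slow for large ranges.*

```python
import re

# A digit string followed by one or more further copies of itself.
REPEATED = re.compile(r'(\d+)\1+')

def is_repeated(n: int) -> bool:
    """True if str(n) is some digit string repeated at least twice."""
    return REPEATED.fullmatch(str(n)) is not None

def brute_force_sum(start: int, end: int) -> int:
    """Sum of the repeated-pattern IDs in [start, end], checking every number."""
    return sum(n for n in range(start, end + 1) if is_repeated(n))

print(is_repeated(100100100), is_repeated(1001001))   # True False
print(brute_force_sum(95, 115))                       # 210 (that is, 99 + 111)
```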

# Day 3: ChatGPT 5.1 Auto

*The [**Day 3**](https://adventofcode.com/2025/day/3) puzzle is to pick the biggest two-digit number from a string of digits, like "89" from "8675309". The task is to compute the sum of biggest two-digit numbers over all the input digit strings.*

*For **Part 1** ChatGPT gave a very brief analysis of the problem and produced this code (conspicuously lacking comments or doc strings):*

In [12]:
def total_output_joltage(input_text: str) -> int:
    def max_joltage_for_bank(bank: str) -> int:
        max_first = -1
        max_pair = 0
        for ch in bank:
            if not ch.isdigit():
                continue
            d = int(ch)
            if max_first != -1:
                pair = 10 * max_first + d
                if pair > max_pair:
                    max_pair = pair
            if d > max_first:
                max_first = d
        return max_pair

    total = 0
    for line in input_text.strip().splitlines():
        line = line.strip()
        if line:
            total += max_joltage_for_bank(line)
    return total

*I wish it had made a separate function to parse the input, and I don't know why it made `max_joltage_for_bank` an inner function (remember: `import this` says flat is better than nested), but I verified the code is correct:*

In [13]:
input_text = get_text(3)

answer(3.1, 17085, lambda:
       total_output_joltage(input_text))

Puzzle  3.1:    1.7 msec, correct answer: 17085          
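
*To make the structural point concrete, here is a sketch of a flat layout with parsing separated from the per-bank computation. It is written for illustration and is neither ChatGPT's code nor the solution linked at the top; the function names are hypothetical, and it assumes every bank has at least two digits.*

```python
def parse_banks(text: str) -> list[str]:
    """One battery bank (a string of digits) per non-blank line."""
    return text.split()

def max_pair(bank: str) -> int:
    """Largest two-digit number formed by two digits of bank, kept in order."""
    first = max(bank[:-1])                # best tens digit; it cannot be the last digit
    rest = bank[bank.index(first) + 1:]   # digits after its leftmost occurrence
    return int(first + max(rest))

def total_joltage(text: str) -> int:
    return sum(max_pair(bank) for bank in parse_banks(text))

print(max_pair("8675309"))   # 89
```

*Taking the leftmost occurrence of the largest tens digit is safe because it leaves the most digits to choose the units digit from.*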

*For **Part 2** we want a 12-digit number, not a 2-digit one.*

*ChatGPT did well (and for some reason included comments and doc strings, which were missing the first time):*

In [14]:
def total_output_joltage(input_text: str, k: int = 12) -> int:
    """
    Given the full puzzle input as a string, return the total maximum joltage
    when, for each line (bank), you must choose exactly k digits in order to
    form the largest possible number.
    """

    def max_joltage_for_bank(bank: str, k: int) -> int:
        # Extract digits from the line
        digits = [int(ch) for ch in bank if ch.isdigit()]
        n = len(digits)
        if k > n:
            raise ValueError(f"Bank has only {n} digits, cannot pick {k}.")

        # We will remove exactly n - k digits to maximize the resulting sequence
        removals = n - k
        stack = []

        for d in digits:
            # While we can still remove digits, and the last digit in the stack
            # is smaller than the current digit, pop it to make the number larger.
            while stack and removals > 0 and stack[-1] < d:
                stack.pop()
                removals -= 1
            stack.append(d)

        # If we still have removals left, remove from the end
        if removals > 0:
            stack = stack[:-removals]

        # Now stack should have length n - original_removals == k
        chosen = stack[:k]
        return int("".join(map(str, chosen)))

    total = 0
    for line in input_text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        total += max_joltage_for_bank(line, k)

    return total

In [15]:
answer(3.2, 169408143086082, lambda:
       total_output_joltage(input_text))

Puzzle  3.2:    2.6 msec, correct answer: 169408143086082
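
*ChatGPT's stack-based method is one standard way to pick the largest `k`-digit subsequence. An alternative is a left-to-right greedy choice, sketched below under hypothetical names and checked against brute force over all combinations. At each step it takes the largest digit that still leaves enough digits after it for the remaining picks.*

```python
from itertools import combinations

def max_joltage(bank: str, k: int) -> str:
    """Greedy: repeatedly take the largest digit that leaves enough digits after it."""
    chosen = []
    start = 0
    for remaining in range(k, 0, -1):
        # The pick must leave remaining - 1 digits to its right.
        window = bank[start:len(bank) - remaining + 1]
        best = max(window)
        start += window.index(best) + 1   # continue after the leftmost best digit
        chosen.append(best)
    return ''.join(chosen)

def max_joltage_brute(bank: str, k: int) -> str:
    """Try every way of keeping k digits in order (exponential; small inputs only)."""
    return max(''.join(digits) for digits in combinations(bank, k))

print(max_joltage("8675309", 2), max_joltage("8675309", 3))   # 89 879
```

*The greedy version as written rescans a window for each pick, so it costs roughly `k` passes over the bank, whereas the stack method makes a single pass. For 12 digits per bank either is fast.*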

# Day 4: Gemini 3 Pro

*In [**Day 4**](https://adventofcode.com/2025/day/4) we are given a 2D map and asked how many squares have a "@" that is surrounded by fewer than 4 other "@" (out of the 8 orthogonal or diagonal neighbors).*

*Gemini produced a solution to **Part 1** that is straightforward and efficient, although perhaps could use better modularization (e.g. if they had a function to count neighbors, they wouldn't need the `continue` in the main loop).*

In [16]:
def count_accessible_rolls(grid_str):
    """
    Parses a grid string and counts paper rolls (@) that have
    fewer than 4 neighbors of the same type.
    """
    # Parse input into a list of lists (2D grid)
    # .strip() removes leading/trailing whitespace to handle clean input
    lines = grid_str.strip().split('\n')
    grid = [list(line) for line in lines]
    
    rows = len(grid)
    cols = len(grid[0])
    
    accessible_count = 0
    
    # Directions for 8 neighbors: (row_offset, col_offset)
    directions = [
        (-1, -1), (-1, 0), (-1, 1),
        (0, -1),           (0, 1),
        (1, -1),  (1, 0),  (1, 1)
    ]
    
    for r in range(rows):
        for c in range(cols):
            # We only care if the current spot is a paper roll
            if grid[r][c] != '@':
                continue
                
            neighbor_rolls = 0
            
            # Check all 8 neighbors
            for dr, dc in directions:
                nr, nc = r + dr, c + dc
                
                # Check boundaries
                if 0 <= nr < rows and 0 <= nc < cols:
                    if grid[nr][nc] == '@':
                        neighbor_rolls += 1
            
            # The Condition: fewer than 4 adjacent rolls
            if neighbor_rolls < 4:
                accessible_count += 1
                
    return accessible_count

*I really like that the 8 `directions` are laid out in a 2D pattern that corresponds to where the directions actually go.*

*Here I verify that the code does produce the correct answer:*

In [17]:
grid_str = get_text(4)

answer(4.1, 1569, lambda:
       count_accessible_rolls(grid_str))

Puzzle  4.1:    8.5 msec, correct answer: 1569           
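
*The modularization point can be sketched as follows. This is an illustrative variant with hypothetical function names, not Gemini's output. It stores only the positions of the "@" squares in a set, so a neighbor-counting function needs neither a boundary check nor a `continue` in the main loop.*

```python
def parse_rolls(grid_str: str) -> set[tuple[int, int]]:
    """The (row, col) positions of every '@' in the grid."""
    return {(r, c)
            for r, line in enumerate(grid_str.strip().splitlines())
            for c, ch in enumerate(line) if ch == '@'}

def count_neighbors(rolls: set[tuple[int, int]], point: tuple[int, int]) -> int:
    """How many of the 8 surrounding squares hold a roll."""
    r, c = point
    return sum((r + dr, c + dc) in rolls
               for dr in (-1, 0, 1) for dc in (-1, 0, 1)
               if (dr, dc) != (0, 0))

def count_accessible(grid_str: str) -> int:
    rolls = parse_rolls(grid_str)
    return sum(count_neighbors(rolls, p) < 4 for p in rolls)

print(count_accessible("@@@\n@@@\n@@@"))   # 4 (only the corners have fewer than 4 neighbors)
```

*Off-grid positions are simply absent from the set, which is why the bounds test disappears.*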

*Now we're asked: if we repeatedly remove every "@" with fewer than 4 neighbors, how many are removed in total?*

*For **Part 2**, Gemini chose the "repeatedly remove accessible rolls" approach rather than the more efficient (and arguably more elegant) "remove a roll and check its neighbors" approach.*

In [18]:
def count_total_removable_rolls(grid_str):
    """
    Parses a grid string and iteratively removes paper rolls (@) 
    that have fewer than 4 neighbors of the same type.
    Returns the total number of rolls removed.
    """
    # Parse input into a list of lists (2D grid)
    lines = grid_str.strip().split('\n')
    grid = [list(line) for line in lines]
    
    rows = len(grid)
    cols = len(grid[0])
    
    total_removed_count = 0
    
    # Directions for 8 neighbors
    directions = [
        (-1, -1), (-1, 0), (-1, 1),
        (0, -1),           (0, 1),
        (1, -1),  (1, 0),  (1, 1)
    ]
    
    while True:
        rolls_to_remove = []
        
        # 1. Identify all rolls that can be removed in this pass
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] != '@':
                    continue
                    
                neighbor_rolls = 0
                for dr, dc in directions:
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        if grid[nr][nc] == '@':
                            neighbor_rolls += 1
                
                # Condition: fewer than 4 adjacent rolls
                if neighbor_rolls < 4:
                    rolls_to_remove.append((r, c))
        
        # 2. If nothing to remove, we are done
        if not rolls_to_remove:
            break
            
        # 3. Remove them (update grid and count)
        total_removed_count += len(rolls_to_remove)
        for r, c in rolls_to_remove:
            grid[r][c] = '.'  # Mark as empty space
            
    return total_removed_count

In [19]:
answer(4.2, 9280, lambda:
       count_total_removable_rolls(grid_str))

Puzzle  4.2:  200.3 msec, correct answer: 9280           

*I asked Gemini "**Can you do it more efficiently?**" and it replied:*

    The previous implementation scanned the entire grid repeatedly (`O(Rows * Cols * Iterations)`). I've replaced it with a Worklist/Queue algorithm that checks neighbors only when necessary. This reduces the complexity to be roughly proportional to the number of rolls, which is significantly faster for large grids.

*This is the same approach that I used in my solution.*

In [20]:
def count_total_removable_rolls(grid_str):
    """
    Parses a grid string and iteratively removes paper rolls (@) 
    that have fewer than 4 neighbors of the same type.
    Returns the total number of rolls removed.
    
    Optimized to use a worklist (queue) approach instead of full grid scans.
    """
    # Parse input into a list of lists (2D grid)
    lines = grid_str.strip().split('\n')
    grid = [list(line) for line in lines]
    
    rows = len(grid)
    cols = len(grid[0])
    
    total_removed_count = 0
    
    # Directions for 8 neighbors
    directions = [
        (-1, -1), (-1, 0), (-1, 1),
        (0, -1),           (0, 1),
        (1, -1),  (1, 0),  (1, 1)
    ]
    
    def count_neighbors(r, c):
        count = 0
        for dr, dc in directions:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                if grid[nr][nc] == '@':
                    count += 1
        return count

    # 1. Initial scan: Find all rolls that are immediately removable
    queue = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == '@':
                if count_neighbors(r, c) < 4:
                    queue.append((r, c))
    
    # 2. Process the queue
    # Using a list as a queue. Duplicates are handled by checking grid state.
    head = 0
    while head < len(queue):
        r, c = queue[head]
        head += 1
        
        # If already removed (e.g., added to queue multiple times), skip
        if grid[r][c] != '@':
            continue
            
        # Remove the roll
        grid[r][c] = '.'
        total_removed_count += 1
        
        # Check its neighbors
        for dr, dc in directions:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # If neighbor is a roll, check if it became removable
                if grid[nr][nc] == '@':
                    if count_neighbors(nr, nc) < 4:
                        queue.append((nr, nc))
            
    return total_removed_count

In [21]:
answer(4.2, 9280, lambda:
       count_total_removable_rolls(grid_str))

Puzzle  4.2:   33.8 msec, correct answer: 9280           

# Day 5: Claude Opus 4.5

*In [**Day 5**](https://adventofcode.com/2025/day/5) we are asked how many ingredient IDs from a list of IDs are fresh, according to a list of fresh ID ranges.*

*Claude produces a straightforward program that solves **Part 1** just fine and demonstrates good use of modularization. This time it has nice doc strings; for Day 2 it had none. Go figure.*

In [22]:
"""
Solution for the Fresh Ingredients puzzle.
Parses a database with fresh ingredient ID ranges and available ingredient IDs,
then counts how many available ingredients are fresh.
"""

def parse_input(input_text):
    """Parse the input into ranges and ingredient IDs."""
    parts = input_text.strip().split('\n\n')
    
    # Parse fresh ranges
    ranges = []
    for line in parts[0].strip().split('\n'):
        start, end = map(int, line.split('-'))
        ranges.append((start, end))
    
    # Parse available ingredient IDs
    ingredient_ids = []
    for line in parts[1].strip().split('\n'):
        ingredient_ids.append(int(line))
    
    return ranges, ingredient_ids


def is_fresh(ingredient_id, ranges):
    """Check if an ingredient ID falls within any fresh range."""
    for start, end in ranges:
        if start <= ingredient_id <= end:
            return True
    return False


def count_fresh_ingredients(ranges, ingredient_ids):
    """Count how many ingredient IDs are fresh."""
    count = 0
    for ingredient_id in ingredient_ids:
        if is_fresh(ingredient_id, ranges):
            count += 1
    return count


def solve(input_text):
    """Solve the puzzle and return the count of fresh ingredients."""
    ranges, ingredient_ids = parse_input(input_text)
    return count_fresh_ingredients(ranges, ingredient_ids)

*Here I feed the input to `solve` and verify that the answer is correct:*

In [23]:
input_text = get_text(5)

answer(5.1, 635, lambda:
       solve(input_text))

Puzzle  5.1:    3.1 msec, correct answer: 635            

*For **Part 2**, we are asked how many possible fresh IDs are there, whether they are in inventory or not. Claude chose to sort ranges (good choice), then merge them, then count the total of the sizes of the ranges. That's a reasonable approach–it involves an extra step of merging ranges (instead of just counting on the fly) but it could be useful to have the merged ranges.*

In [24]:
"""
Solution for the Fresh Ingredients puzzle - Part 2.
Count all unique ingredient IDs that are considered fresh by any range.
"""

def parse_input(input_text):
    """Parse the input to get fresh ranges (ignore ingredient IDs section)."""
    parts = input_text.strip().split('\n\n')
    
    # Parse fresh ranges (only first section matters now)
    ranges = []
    for line in parts[0].strip().split('\n'):
        start, end = map(int, line.split('-'))
        ranges.append((start, end))
    
    return ranges


def merge_ranges(ranges):
    """Merge overlapping ranges to count unique IDs efficiently."""
    if not ranges:
        return []
    
    # Sort ranges by start value
    sorted_ranges = sorted(ranges)
    
    merged = [sorted_ranges[0]]
    
    for current_start, current_end in sorted_ranges[1:]:
        last_start, last_end = merged[-1]
        
        # Check if current range overlaps or is adjacent to the last merged range
        if current_start <= last_end + 1:
            # Merge by extending the end if needed
            merged[-1] = (last_start, max(last_end, current_end))
        else:
            # No overlap, add as new range
            merged.append((current_start, current_end))
    
    return merged


def count_fresh_ids(ranges):
    """Count total unique fresh ingredient IDs."""
    merged = merge_ranges(ranges)
    
    total = 0
    for start, end in merged:
        # Count is inclusive: end - start + 1
        total += end - start + 1
    
    return total


def solve(input_text):
    """Solve part 2 and return the count of all fresh ingredient IDs."""
    ranges = parse_input(input_text)
    return count_fresh_ids(ranges)

In [25]:
answer(5.2, 369761800782619, lambda:
       solve(input_text))

Puzzle  5.2:    0.1 msec, correct answer: 369761800782619
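
*The "counting on the fly" alternative mentioned above could look something like the following sketch. It is only an illustration, not Claude's code: the names `count_fresh_on_the_fly` and `covered_up_to` are made up here, and the ranges are assumed to be inclusive `(start, end)` pairs like those returned by `parse_input`.*

```python
def count_fresh_on_the_fly(ranges):
    """Count IDs covered by at least one inclusive (start, end) range,
    without building a list of merged ranges."""
    total = 0
    covered_up_to = None  # highest ID counted so far (None before the first range)
    for start, end in sorted(ranges):
        if covered_up_to is None or start > covered_up_to:
            # Range lies entirely beyond what has been counted: count all of it
            total += end - start + 1
            covered_up_to = end
        elif end > covered_up_to:
            # Range overlaps what has been counted: count only the new tail
            total += end - covered_up_to
            covered_up_to = end
        # Otherwise the range is already fully covered and adds nothing
    return total
```

*On the toy ranges `3-5`, `10-14`, `16-20`, `12-18`, this sketch and the merge-then-count approach both give 14.*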

# Day 6: ChatGPT 5.1 Auto

*For [**Day 6**](https://adventofcode.com/2025/day/6) we are asked to solve some math problems written in an unusual format (vertical instead of horizontal, with some special rules).*

*For **Part 1** ChatGPT produced a program that is correct, but has poor modularization, with one long 63-line function. (It also contains a pet peeve of mine: in lines 17–20 the pattern "`if some_boolean: True else: False`" can always be replaced with "`some_boolean`".) And it would have been easier to replace the six lines with one: `sep = {c for c in range(width) if all(grid[r][c] == ' ' for r in range(h))}` (with the later test `sep[c]` then written as `c in sep`).*

In [26]:
import re
from functools import reduce
from operator import add, mul

def solve(lines):
    if not lines:
        return 0

    # Preserve internal spaces; pad all lines to same width
    width = max(len(line) for line in lines)
    grid = [line.rstrip('\n').ljust(width) for line in lines]
    h = len(grid)

    # Identify separator columns: columns that are all spaces
    sep = []
    for c in range(width):
        if all(grid[r][c] == ' ' for r in range(h)):
            sep.append(True)
        else:
            sep.append(False)

    # Find ranges of columns that form individual problems
    problem_ranges = []
    in_problem = False
    start = 0
    for c in range(width):
        if not sep[c]:
            if not in_problem:
                in_problem = True
                start = c
        else:
            if in_problem:
                in_problem = False
                problem_ranges.append((start, c - 1))
    if in_problem:
        problem_ranges.append((start, width - 1))

    grand_total = 0

    # Process each problem
    for c_start, c_end in problem_ranges:
        numbers = []

        # All rows except the last contain numbers (potentially)
        for r in range(h - 1):
            segment = grid[r][c_start:c_end + 1]
            for match in re.finditer(r'\d+', segment):
                numbers.append(int(match.group(0)))

        # Last row contains the operator (+ or *)
        bottom_segment = grid[h - 1][c_start:c_end + 1]
        if '+' in bottom_segment:
            op = add
        elif '*' in bottom_segment:
            op = mul
        else:
            # No valid operator found; skip this malformed problem
            continue

        if not numbers:
            continue

        # Fold numbers using the chosen operator
        result = reduce(op, numbers)
        grand_total += result

    return grand_total

*Here I interface my input data to ChatGPT's function and verify it is correct:*

In [27]:
lines = get_text(6).splitlines()

answer(6.1, 5877594983578, lambda:
       solve(lines))

Puzzle  6.1:    2.9 msec, correct answer: 5877594983578  
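
*To make the point about the separator columns concrete, here is a small sketch comparing the verbose pattern with two shorter equivalents. The function names and the toy grid are invented for illustration, and the grid rows are assumed to be non-empty and already padded to the same width.*

```python
def separator_flags_verbose(grid):
    """The pattern from the cell above: append True or False explicitly."""
    flags = []
    for c in range(len(grid[0])):
        if all(row[c] == ' ' for row in grid):
            flags.append(True)
        else:
            flags.append(False)
    return flags

def separator_flags(grid):
    """Same list of booleans: the condition is already the value we want."""
    return [all(row[c] == ' ' for row in grid) for c in range(len(grid[0]))]

def separator_columns(grid):
    """Set variant: keep only the indices of the all-space columns."""
    return {c for c in range(len(grid[0])) if all(row[c] == ' ' for row in grid)}

toy_grid = ["12 3",
            " 4 5",
            "+  *"]
```

*With the list variant the rest of the function can keep testing `sep[c]`; with the set variant that test becomes `c in sep`.*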

*For **Part 2** we learn that the details of the format are even more idiosyncratic: the exact placement of spaces is crucial. (Here, as in each day's problem, see [my other notebook](Advent-2025.ipynb) or [the official AoC site](https://adventofcode.com/2025/day/6) for details.)*

*ChatGPT writes an even longer function, with even more complications. For example, the 14 lines under "Determine the operator" could be replaced with `op = (add if '+' in op_segment else mul)`. I think ChatGPT was trying to be robust against errors in the input.*

In [28]:
import re
from functools import reduce
from operator import add, mul

def solve(lines):
    if not lines:
        return 0

    # Normalize lines to a rectangle, preserving internal spaces
    width = max(len(line) for line in lines)
    grid = [line.rstrip('\n').ljust(width) for line in lines]
    h = len(grid)

    # A separator column is all spaces -> separates problems
    sep = []
    for c in range(width):
        if all(grid[r][c] == ' ' for r in range(h)):
            sep.append(True)
        else:
            sep.append(False)

    # Find contiguous ranges of non-separator columns (each is a problem)
    problem_ranges = []
    in_problem = False
    start = 0
    for c in range(width):
        if not sep[c]:
            if not in_problem:
                in_problem = True
                start = c
        else:
            if in_problem:
                in_problem = False
                problem_ranges.append((start, c - 1))
    if in_problem:
        problem_ranges.append((start, width - 1))

    grand_total = 0

    for c_start, c_end in problem_ranges:
        # Find the operator row for THIS problem slice:
        # bottom-most row with + or * in [c_start, c_end]
        op_row = None
        for r in range(h):
            segment = grid[r][c_start:c_end + 1]
            if '+' in segment or '*' in segment:
                op_row = r

        if op_row is None:
            # No operator for this vertical slice -> skip
            continue

        # Determine the operator
        op_segment = grid[op_row][c_start:c_end + 1]
        if '+' in op_segment and '*' in op_segment:
            # If both appear (shouldn't in valid input), choose the leftmost
            plus_idx = op_segment.find('+') if '+' in op_segment else float('inf')
            mul_idx = op_segment.find('*') if '*' in op_segment else float('inf')
            op = add if plus_idx < mul_idx else mul
        elif '+' in op_segment:
            op = add
        elif '*' in op_segment:
            op = mul
        else:
            # No operator actually present
            continue

        # Build numbers column-by-column (each column is one number),
        # reading digits from top to just above op_row
        numbers = []
        for c in range(c_start, c_end + 1):
            digits = []
            for r in range(op_row):
                ch = grid[r][c]
                if ch.isdigit():
                    digits.append(ch)
            if digits:
                num = int(''.join(digits))
                numbers.append(num)

        if not numbers:
            continue

        result = reduce(op, numbers)
        grand_total += result

    return grand_total

*The program is correct:*

In [29]:
answer(6.2, 11159825706149, lambda:
       solve(lines))

Puzzle  6.2:    2.3 msec, correct answer: 11159825706149 
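
*The core of the Part 2 reading, as ChatGPT's code implements it, is that each column of a problem is one number whose digits run from top to bottom. A minimal sketch of that step for a single problem follows; the name `solve_vertical_problem` and the toy input are hypothetical, and the rows are assumed to be equal-width strings with the operator on the last row.*

```python
from math import prod

def solve_vertical_problem(rows):
    """Evaluate one problem slice: every column above the operator row is a number."""
    *digit_rows, op_row = rows
    numbers = []
    for column in zip(*digit_rows):          # walk the slice column by column
        digits = ''.join(ch for ch in column if ch.isdigit())
        if digits:                           # skip columns that hold no digits
            numbers.append(int(digits))
    return sum(numbers) if '+' in op_row else prod(numbers)
```

*For the toy slice `["12", " 3", "+ "]` the columns read `1` and `23`, so the result is 24.*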

# Day 7: Gemini 3 Pro

*In [**Day 7**](https://adventofcode.com/2025/day/7) we are given a 2D grid of characters where a beam enters at the top and moves downward, but is split to both sides by a "`^`" character. We need to count how many times a beam is split on its way to the bottom.*

*Gemini's code for **Part 1** is a bit verbose, but gets the job done.*

In [30]:
def solve_tachyon_manifold(puzzle_input):
    # 1. Parse the Grid
    grid = puzzle_input.splitlines()
    grid = [line for line in grid if line] # Remove empty lines
    
    if not grid:
        return 0

    height = len(grid)
    width = max(len(row) for row in grid)
    
    # Pad rows to ensure uniform width
    grid = [row.ljust(width, '.') for row in grid]

    # 2. Find Start Position 'S'
    # We scan the first row (or all rows if S is buried, but usually it's top)
    active_beams = set()
    for y in range(height):
        if 'S' in grid[y]:
            active_beams.add(grid[y].index('S'))
            # We assume beams start flowing from this row downwards
            # so we break and start the simulation from this row
            start_row = y
            break
    else:
        return 0 # No start found

    split_count = 0

    # 3. Simulate Row by Row
    # We iterate through the rows starting from the one containing 'S'
    for y in range(start_row, height):
        next_beams = set()
        
        # Check every column that currently has a beam
        for x in active_beams:
            cell = grid[y][x]
            
            if cell == '^':
                # SPLITTER: 
                # 1. Increment total split count
                split_count += 1
                
                # 2. Create new beams left and right
                # Check boundaries to ensure we don't wrap around or error
                if x - 1 >= 0:
                    next_beams.add(x - 1)
                if x + 1 < width:
                    next_beams.add(x + 1)
                    
            else:
                # EMPTY SPACE (.) or START (S):
                # The beam continues straight down
                next_beams.add(x)
        
        # Update the active beams for the next row iteration
        active_beams = next_beams
        
        # Optimization: If no beams are left, stop early
        if not active_beams:
            break

    return split_count

*The solution is correct:*

In [31]:
puzzle_input = get_text(7)

answer(7.1, 1681, lambda:
       solve_tachyon_manifold(puzzle_input))

Puzzle  7.1:    0.4 msec, correct answer: 1681           
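
*For comparison, the same row-by-row simulation can be written more compactly with set operations. This is only a sketch under stated assumptions, not Gemini's code: the name `count_splits` is made up, the `S` is assumed to be on the first row, all rows are assumed to have the same width, and no splitter is assumed to sit in the first or last column.*

```python
def count_splits(grid):
    """Count how many times a beam hits a '^' splitter while moving down the grid."""
    beams = {grid[0].index('S')}             # columns that currently hold a beam
    splits = 0
    for row in grid[1:]:
        hits = {x for x in beams if row[x] == '^'}
        splits += len(hits)
        # Beams that hit a splitter are replaced by their left and right neighbors
        beams = (beams - hits) | {x + dx for x in hits for dx in (-1, 1)}
    return splits

toy_grid = ["..S..",
            ".....",
            "..^..",
            ".....",
            ".^.^.",
            "....."]
```

*On the toy grid the beam is split once on the third row and twice on the fifth, for a total of 3.*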

*For **Part 2**, we are asked how many ways the beams can get to the bottom.*

*Gemini gets the key ideas (although it is a bit old-fashioned to use a `defaultdict(int)` instead of a `Counter`).*

In [32]:
from collections import defaultdict

def solve_quantum_tachyon_manifold(puzzle_input):
    # 1. Parse the Grid
    grid = puzzle_input.splitlines()
    grid = [line for line in grid if line]
    
    if not grid:
        return 0

    height = len(grid)
    width = max(len(row) for row in grid)
    # Pad rows to ensure uniform width
    grid = [row.ljust(width, '.') for row in grid]

    # 2. Find Start Position 'S'
    # 'counts' maps column_index -> number_of_timelines_at_this_column
    counts = defaultdict(int)
    start_row = 0
    found = False
    
    for y in range(height):
        if 'S' in grid[y]:
            start_x = grid[y].index('S')
            counts[start_x] = 1 # We start with 1 timeline
            start_row = y
            found = True
            break
            
    if not found:
        return 0

    # 3. Simulate Row by Row
    # We process the grid from the start row down to the bottom.
    # Each iteration calculates where the particles end up in the *next* row.
    for y in range(start_row, height):
        next_counts = defaultdict(int)
        
        for x, num_timelines in counts.items():
            # Determine what is at this position
            # If the timeline has drifted out of bounds, treat it as empty space (.)
            if 0 <= x < width:
                char = grid[y][x]
            else:
                char = '.'

            if char == '^':
                # SPLITTER:
                # The timelines here split. The count moves to both left and right.
                # If 5 timelines hit this splitter, 5 go left AND 5 go right.
                next_counts[x - 1] += num_timelines
                next_counts[x + 1] += num_timelines
            else:
                # EMPTY SPACE (.) or START (S):
                # The timelines pass straight through.
                next_counts[x] += num_timelines
        
        # Move to the next row
        counts = next_counts

    # 4. Sum Total Timelines
    # After processing the final row, 'counts' holds the number of timelines 
    # exiting the bottom of the manifold at each column.
    return sum(counts.values())

*The answer is correct:*

In [33]:
answer(7.2, 422102272495018, lambda:
       solve_quantum_tachyon_manifold(puzzle_input))

Puzzle  7.2:    0.8 msec, correct answer: 422102272495018

*My friend Barry Hayes suggested I ask Gemini to compare its code to mine. So I asked "**How would you compare your code to this:**"*

In [34]:
def count_timelines(manifold: list[str]) -> int:
    """How many possible paths are there to the final line of the manifold?"""
    start = manifold[0].index('S')
    beams = Counter({start: 1})
    for line in manifold:
        for b, n in list(beams.items()):
            if line[b] == '^':
                beams[b]     -= n
                beams[b - 1] += n
                beams[b + 1] += n
    return sum(beams.values())

*where this is what I wrote for 7.2. Here is Gemini's reply:*

This is a great comparison. Your code is much more concise and "Pythonic," using collections.Counter and in-place updates to reduce verbosity. It strips away a lot of the boilerplate  that my solution included for robustness.

- **Your Code**: Assumes 'S' is always on the first line (manifold[0]). If 'S' is buried in the middle of the grid, your code crashes or misses it.
- **My Code**: Scans the whole grid for 'S', which handles cases where the entry point isn't at the very top.
- **Your Code**: Relies on Python's robust dictionary handling to avoid index errors (if b-1 is negative, it just adds a key to the dict). This is clever, but effectively allows particles to wrap around or exist in "negative space" if not careful (though typically harmless for this specific puzzle input).

***My rebuttal**: To me, the instructions make it clear that the 'S' must be on the top line. (Gemini has a point on one detail: if there is no 'S' on the top line, `manifold[0].index('S')` raises a `ValueError` rather than returning 0.) Similarly, the puzzle implies that beams can't go outside the borders, but Gemini is right that if the input file **did** have a splitter in the first or last column then there could be a problem that I don't check for.*
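
*The two assumptions at issue (an 'S' on the top line, and no splitter in the first or last column) could be made explicit with a small check run before the simulation. The sketch below is hypothetical; the name `manifold_problems` and the message strings are invented for illustration.*

```python
def manifold_problems(manifold):
    """Return a list of problems found; an empty list means both assumptions hold."""
    problems = []
    if not manifold or 'S' not in manifold[0]:
        problems.append("no 'S' on the top line")
    for r, row in enumerate(manifold):
        # A splitter on the border would send a beam outside the grid
        if row.startswith('^') or row.endswith('^'):
            problems.append(f"splitter in a border column on line {r}")
    return problems
```

*A caller could then decide whether to raise an error, return 0, or proceed, depending on how defensive the program should be.*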

*Next I asked Gemini: **Do you think your code could be improved with some more modularity/abstraction, such as having a separate function to get input?** It replied:*

Yes, absolutely. Breaking the code down makes it much easier to read, test, and debug. If the input format changes (e.g., reading from a file vs. a raw string) or the simulation rules change (e.g., different splitter types), you only have to modify one specific function.

Here is a refactored version that separates Input Parsing, State Initialization, and Simulation Logic.

In [35]:
from collections import Counter
from typing import List, Tuple

def parse_manifold(input_text: str) -> List[str]:
    """
    Cleans and converts raw input text into a grid of strings.
    Handles stripping empty lines and padding if necessary.
    """
    lines = [line for line in input_text.splitlines() if line]
    if not lines:
        return []
    
    # Optional: Pad lines to uniform width for safety
    width = max(len(line) for line in lines)
    return [line.ljust(width, '.') for line in lines]

def find_start_position(grid: List[str]) -> Tuple[int, int]:
    """
    Locates the 'S' character in the grid.
    Returns (row_index, col_index).
    """
    for r, row in enumerate(grid):
        if 'S' in row:
            return r, row.index('S')
    raise ValueError("Start position 'S' not found in manifold.")

def process_row(beams: Counter, row_string: str) -> None:
    """
    Updates the beam positions for a single row of the manifold.
    Modifies the 'beams' Counter in-place using the subtraction logic.
    """
    # Create a snapshot of items to iterate over, since we modify 'beams' in the loop
    current_states = list(beams.items())
    
    for col, count in current_states:
        # Boundary check ensures we don't access outside the string
        if col < 0 or col >= len(row_string):
            continue
            
        char = row_string[col]
        
        if char == '^':
            # Splitter logic: Remove current stack, add to left and right
            beams[col] -= count
            beams[col - 1] += count
            beams[col + 1] += count
        # Else: beam passes through (no change needed for '.')

def solve_quantum_manifold(input_text: str) -> int:
    """
    Main orchestrator function.
    """
    # 1. Get Input
    grid = parse_manifold(input_text)
    
    # 2. Initialize State
    try:
        start_row, start_col = find_start_position(grid)
    except ValueError:
        return 0

    beams = Counter({start_col: 1})

    # 3. Run Simulation
    # We slice the grid to start from the row containing 'S'
    for row in grid[start_row:]:
        process_row(beams, row)

    # 4. Output Result
    return sum(beams.values())

*I like this refactored version better (although the "`#1 #2 #3 #4`" comments don't add much). This version produces the correct answer:*

In [36]:
answer(7.2, 422102272495018, lambda:
       solve_quantum_manifold(puzzle_input))

Puzzle  7.2:    1.0 msec, correct answer: 422102272495018

# Day 8: Claude Opus 4.5

*In the [**Day 8**](https://adventofcode.com/2025/day/8) puzzle there is a collection of junction boxes with coordinates in 3D space. We are asked to connect them together, connecting the nearest two first, and continuing for 1000 connections, and then report the product of the sizes of the three largest connected circuits.*

*Here's Claude's code for **Part 1**:*

In [37]:
"""
Solution for the Junction Box puzzle.

Connect the 1000 pairs of junction boxes which are closest together.
Then multiply together the sizes of the three largest circuits.
"""

from collections import defaultdict

def parse_input(text):
    """Parse the input into a list of (x, y, z) coordinates."""
    boxes = []
    for line in text.strip().split('\n'):
        if line.strip():
            x, y, z = map(int, line.strip().split(','))
            boxes.append((x, y, z))
    return boxes

def distance_squared(p1, p2):
    """Calculate squared Euclidean distance between two points."""
    return (p1[0] - p2[0])**2 + (p1[1] - p2[1])**2 + (p1[2] - p2[2])**2

class UnionFind:
    """Union-Find data structure for tracking connected components."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
        self.size = [1] * n
    
    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]
    
    def union(self, x, y):
        px, py = self.find(x), self.find(y)
        if px == py:
            return False  # Already in same circuit
        if self.rank[px] < self.rank[py]:
            px, py = py, px
        self.parent[py] = px
        self.size[px] += self.size[py]
        if self.rank[px] == self.rank[py]:
            self.rank[px] += 1
        return True
    
    def get_component_sizes(self):
        """Get sizes of all connected components."""
        components = defaultdict(int)
        for i in range(len(self.parent)):
            root = self.find(i)
            components[root] = self.size[root]
        return list(components.values())

def solve(boxes, num_connections=1000):
    """Connect the num_connections closest pairs and return product of 3 largest circuit sizes."""
    n = len(boxes)
    
    # Calculate all pairwise distances
    distances = []
    for i in range(n):
        for j in range(i + 1, n):
            dist_sq = distance_squared(boxes[i], boxes[j])
            distances.append((dist_sq, i, j))
    
    # Sort by distance
    distances.sort()
    
    # Use Union-Find to track circuits
    uf = UnionFind(n)
    
    # Make the specified number of connections
    connections_made = 0
    for dist_sq, i, j in distances:
        if connections_made >= num_connections:
            break
        # Attempt to connect (even if already in same circuit, it counts as a connection attempt)
        uf.union(i, j)
        connections_made += 1
    
    # Get component sizes and find the three largest
    sizes = uf.get_component_sizes()
    sizes.sort(reverse=True)
    
    # Multiply the three largest
    result = 1
    for i in range(min(3, len(sizes))):
        result *= sizes[i]
    
    return result

*Claude's function computes the correct answer:*

In [38]:
input_text = get_text(8)

answer(8.1, 24360, lambda:
       solve(parse_input(input_text), 1000))

Puzzle  8.1:  293.3 msec, correct answer: 24360          

*Here is Claude's **Part 2** code, in which we keep connecting until all the boxes are in one circuit, and are asked about the X coordinates of the last two junction boxes connected:*

In [39]:
"""
Solution for the Junction Box puzzle - Part 2.

Connect junction boxes until they're all in one circuit.
Find the last two junction boxes connected and multiply their X coordinates.
"""

from collections import defaultdict

def parse_input(text):
    """Parse the input into a list of (x, y, z) coordinates."""
    boxes = []
    for line in text.strip().split('\n'):
        if line.strip():
            x, y, z = map(int, line.strip().split(','))
            boxes.append((x, y, z))
    return boxes

def distance_squared(p1, p2):
    """Calculate squared Euclidean distance between two points."""
    return (p1[0] - p2[0])**2 + (p1[1] - p2[1])**2 + (p1[2] - p2[2])**2

class UnionFind:
    """Union-Find data structure for tracking connected components."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
        self.size = [1] * n
        self.num_components = n
    
    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]
    
    def union(self, x, y):
        px, py = self.find(x), self.find(y)
        if px == py:
            return False  # Already in same circuit
        if self.rank[px] < self.rank[py]:
            px, py = py, px
        self.parent[py] = px
        self.size[px] += self.size[py]
        if self.rank[px] == self.rank[py]:
            self.rank[px] += 1
        self.num_components -= 1
        return True

def solve_part2(boxes):
    """Connect boxes until all in one circuit, return product of X coords of last connection."""
    n = len(boxes)
    
    # Calculate all pairwise distances
    distances = []
    for i in range(n):
        for j in range(i + 1, n):
            dist_sq = distance_squared(boxes[i], boxes[j])
            distances.append((dist_sq, i, j))
    
    # Sort by distance
    distances.sort()
    
    # Use Union-Find to track circuits
    uf = UnionFind(n)
    
    # Keep connecting until all in one circuit
    last_i, last_j = None, None
    for dist_sq, i, j in distances:
        if uf.union(i, j):
            # This connection actually merged two circuits
            last_i, last_j = i, j
            if uf.num_components == 1:
                # All boxes are now in one circuit!
                break
    
    # Return product of X coordinates
    x1 = boxes[last_i][0]
    x2 = boxes[last_j][0]
    print(f"Last connection: {boxes[last_i]} and {boxes[last_j]}")
    print(f"X coordinates: {x1} and {x2}")
    return x1 * x2

*This code is correct. It does some extraneous printing, which is not strictly incorrect. It is to Claude's credit that it recognized this as a Union-Find problem, but its implementation actually runs slower than my keep-it-simple approach in [**my notebook**](Advent-2025.ipynb).*

In [40]:
answer(8.2, 2185817796, lambda:
       solve_part2(parse_input(input_text)))

Last connection: (53148, 96207, 9295) and (41127, 88410, 10248)
X coordinates: 53148 and 41127


Puzzle  8.2:  284.5 msec, correct answer: 2185817796     
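
*For readers curious what a simpler alternative to a Union-Find class might look like for Part 1, here is one possible sketch. It is an assumption made for illustration, not the code from the other notebook: every box maps to the set of boxes in its circuit, and two circuits are merged by building their union. The name `product_of_largest_circuits` is hypothetical, and the boxes are assumed to be distinct coordinate tuples.*

```python
from itertools import combinations
from math import dist, prod

def product_of_largest_circuits(boxes, num_connections, top=3):
    """Connect the closest pairs of boxes, then multiply the sizes of the largest circuits."""
    # Each box maps to its circuit; boxes in the same circuit share one set object
    circuit = {box: {box} for box in boxes}
    closest_pairs = sorted(combinations(boxes, 2), key=lambda pair: dist(*pair))
    for a, b in closest_pairs[:num_connections]:
        if circuit[a] is not circuit[b]:
            merged = circuit[a] | circuit[b]
            for box in merged:
                circuit[box] = merged
    # Several boxes point at the same set, so deduplicate by object identity
    distinct = {id(c): c for c in circuit.values()}.values()
    sizes = sorted((len(c) for c in distinct), reverse=True)
    return prod(sizes[:top])
```

*Merging this way touches every box in the merged circuit, so it gives up Union-Find's near-constant-time guarantees; with the number of boxes in a puzzle input, sorting all the pairs is likely to dominate the running time in either approach.*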

# Day 9: ChatGPT 5.1 Auto

*In [**Day 9**](https://adventofcode.com/2025/day/9) we are given the (x, y) coordinates of a collection of red tiles on the floor, and asked what is the largest rectangle with two red tiles as corners.*

*For **Part 1**, I was getting tired of all the programs that have a `main` that reads from input and prints the answer, so I told ChatGPT: **Refactor to have a function that takes the points as input and returns the area** and got this:*

In [41]:
def largest_rectangle_area(points):
    """
    Given a list of (x, y) points representing red tiles,
    return the largest possible rectangle area using any
    two points as opposite corners.
    """
    max_area = 0
    n = len(points)

    for i in range(n):
        x1, y1 = points[i]
        for j in range(i + 1, n):
            x2, y2 = points[j]

            width = abs(x2 - x1) + 1
            height = abs(y2 - y1) + 1
            area = width * height

            if area > max_area:
                max_area = area

    return max_area

def parse_points(text: str):
    pts = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        x_str, y_str = line.split(",")
        pts.append((int(x_str), int(y_str)))
    return pts

*This is correct*:

In [42]:
text = get_text(9)

answer(9.1, 4772103936, lambda:
       largest_rectangle_area(parse_points(text)))

Puzzle  9.1:    9.3 msec, correct answer: 4772103936     
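
*The same brute-force search over all pairs of corners can be expressed in a single `max` over `itertools.combinations`. This is a sketch for comparison rather than ChatGPT's code; the name `largest_rectangle_area_compact` and the toy points are made up. Note the `+ 1` on each side: corners are tiles, so a rectangle from `(2, 5)` to `(11, 1)` is 10 tiles wide and 5 tiles tall.*

```python
from itertools import combinations

def largest_rectangle_area_compact(points):
    """Largest area (in tiles) of a rectangle with two of the points as opposite corners."""
    return max(((abs(x2 - x1) + 1) * (abs(y2 - y1) + 1)
                for (x1, y1), (x2, y2) in combinations(points, 2)),
               default=0)  # fewer than two points: no rectangle

toy_points = [(7, 1), (11, 1), (11, 7), (9, 7), (9, 5), (2, 5), (2, 3), (7, 3)]
```

*On the toy points the largest rectangle has area 50.*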

*For **Part 2**, we're told to consider the sequence of tiles as the vertexes of a polygon, and find the largest rectangle inside the polygon.*

*ChatGPT considered a [scanline algorithm](https://en.wikipedia.org/wiki/Scanline_rendering), and then went for a [flood fill](https://en.wikipedia.org/wiki/Flood_fill). It assumed "I'm working with a grid that can have dimensions up to 2k x 2k, or 4 million cells, which seems manageable." ChatGPT is correct that it would be manageable under those assumptions; however, the assumptions are wrong. I told it **The grid can have dimensions up to 100,000 x 100,000 and there are about 500 red tile points**. I also gave it my input file, so that it would have a chance of discovering the structure of the input. ChatGPT then came up with a [ray casting](https://en.wikipedia.org/wiki/Ray_casting) approach using [grid compression](https://stackoverflow.com/questions/29528934/coordinate-compression): define a grid where each red tile is in a separate cell, where the x and y extent of that cell spans an integer range (which must be recorded elsewhere). Now the grid is only of size roughly 500<sup>2</sup> instead of 100,000<sup>2</sup>.*

In [43]:
from typing import List, Tuple, Dict

Point = Tuple[int, int]
Edge = Tuple[int, int, int, int]


def build_edges(points: List[Point]) -> List[Edge]:
    """
    Build polygon edges from the red points in loop order.
    Each edge is (x1, y1, x2, y2) between consecutive points (wrapped).
    """
    n = len(points)
    edges: List[Edge] = []
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        edges.append((x1, y1, x2, y2))
    return edges


def point_in_or_on_polygon(x: int, y: int, edges: List[Edge]) -> bool:
    """
    Return True if lattice point (x, y) is inside the polygon or on its boundary.
    Polygon is rectilinear (axis-aligned), edges given by build_edges.
    Uses a ray-casting test along +x, counting intersections with vertical edges.
    """
    inside = False

    for (x1, y1, x2, y2) in edges:
        # Boundary check: point lies on this edge
        if x1 == x2:  # vertical edge
            if x == x1 and min(y1, y2) <= y <= max(y1, y2):
                return True
            # For ray cast, only vertical edges can intersect ray to +x
            # Use y-interval [min, max) to avoid double-counting vertices
            if (y1 > y) != (y2 > y) and x1 > x:
                inside = not inside
        else:  # horizontal edge
            if y == y1 and min(x1, x2) <= x <= max(x1, x2):
                return True
            # horizontal edges do not contribute to this vertical ray count

    return inside


def rect_crosses_polygon(
    xmin: int, xmax: int, ymin: int, ymax: int, edges: List[Edge]
) -> bool:
    """
    For a non-degenerate rectangle [xmin,xmax] x [ymin,ymax] (width>1,height>1),
    check if any polygon edge passes through the *interior* of that rectangle.
    Colinear edges on the boundary are allowed; only crossings inside are bad.
    """
    for (x1, y1, x2, y2) in edges:
        if x1 == x2:
            # vertical edge at x = x1
            x0 = x1
            if xmin < x0 < xmax:
                lo = max(min(y1, y2), ymin)
                hi = min(max(y1, y2), ymax)
                # If overlap has positive length, edge crosses interior
                if hi > lo:
                    return True
        else:
            # horizontal edge at y = y1
            y0 = y1
            if ymin < y0 < ymax:
                lo = max(min(x1, x2), xmin)
                hi = min(max(x1, x2), xmax)
                if hi > lo:
                    return True
    return False


def largest_rectangle_area_red_green(points: List[Point]) -> int:
    """
    Given the red tiles as vertices of an axis-aligned loop (in order),
    return the largest rectangle area such that:

    - Two opposite corners are red tiles (from `points`).
    - Every tile inside the rectangle is red or green, i.e., inside/on the loop.

    This works without building the full grid; it uses polygon geometry instead.
    """
    n = len(points)
    if n < 2:
        return 0

    edges = build_edges(points)

    # Cache point-in-polygon results so each lattice point is tested at most once
    point_cache: Dict[Point, bool] = {}

    def is_inside(x: int, y: int) -> bool:
        key = (x, y)
        if key in point_cache:
            return point_cache[key]
        val = point_in_or_on_polygon(x, y, edges)
        point_cache[key] = val
        return val

    max_area = 0

    for i in range(n):
        x1, y1 = points[i]
        for j in range(i + 1, n):
            x2, y2 = points[j]

            xmin, xmax = (x1, x2) if x1 <= x2 else (x2, x1)
            ymin, ymax = (y1, y2) if y1 <= y2 else (y2, y1)

            width = xmax - xmin + 1   # tiles
            height = ymax - ymin + 1  # tiles
            area = width * height

            if area <= max_area:
                continue

            # Thin rectangles (one row or one column): treat as 1D segments.
            if width == 1 or height == 1:
                if width == 1:
                    # Vertical segment at x = xmin from ymin..ymax
                    x = xmin
                    ok = True
                    for yy in range(ymin, ymax + 1):
                        if not is_inside(x, yy):
                            ok = False
                            break
                else:
                    # Horizontal segment at y = ymin from xmin..xmax
                    y = ymin
                    ok = True
                    for xx in range(xmin, xmax + 1):
                        if not is_inside(xx, y):
                            ok = False
                            break

                if ok:
                    max_area = area
                continue

            # Proper 2D rectangle: width>1 and height>1

            # The two given red corners are on the boundary by definition.
            # The other two corners must be inside/on polygon as well.
            if not is_inside(x1, y2):
                continue
            if not is_inside(x2, y1):
                continue

            # Ensure polygon boundary does not cut through the interior
            if rect_crosses_polygon(xmin, xmax, ymin, ymax, edges):
                continue

            max_area = area

    return max_area

*This approach works:*

In [44]:
answer(9.2, 1529675217, lambda:
       largest_rectangle_area_red_green(parse_points(text)))

Puzzle  9.2:  447.0 msec, correct answer: 1529675217     
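
*The grid compression idea described above can be illustrated with a simplified sketch that replaces each coordinate by its rank among the distinct values. The name `compress_points` is hypothetical, and this is not part of ChatGPT's solution; a full compressed grid would also need cells for the gaps between consecutive coordinates, which this sketch leaves out.*

```python
def compress_points(points):
    """Map each (x, y) to (rank of x, rank of y); also return the sorted originals
    so that a compressed index can be translated back to a real coordinate."""
    xs = sorted({x for x, _ in points})
    ys = sorted({y for _, y in points})
    x_rank = {x: i for i, x in enumerate(xs)}
    y_rank = {y: i for i, y in enumerate(ys)}
    compressed = [(x_rank[x], y_rank[y]) for x, y in points]
    return compressed, xs, ys
```

*With about 500 points there are at most about 500 distinct values per axis, so the compressed indices stay small no matter how large the original coordinates are; `xs[i]` and `ys[j]` recover the real coordinates when areas are computed.*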

# Day 10: Gemini 3 Pro

*For [**Day 10**](https://adventofcode.com/2025/day/10) we are given some descriptions of machines. See [AoC](https://adventofcode.com/2025/day/10) or [my other notebook](Advent-2025.ipynb) for details, but the description:*

     [#....] (0,2,3) (0,2,3,4) (2,3) (0,1,2) (0,3,4) (3) (1,2) {75,18,60,71,39}

*means that the machine has 5 lights, and the goal is to turn the first one on (`#....`) by pushing buttons. There are 7 buttons; the first one toggles lights 0, 2, and 3. We want to know the minimal number of button presses. The last 5 numbers are used only in Part 2, where they are the desired joltage of each light.*
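
*To make the format concrete, here is a small sketch that parses one machine description and finds the minimal number of presses for Part 1 by brute force. The names `parse_machine` and `min_presses` are hypothetical. It assumes that all lights start off, and it relies on the fact that pressing a toggle button twice cancels out, so each button is pressed at most once.*

```python
import re
from itertools import combinations

def parse_machine(line):
    """Split a description into (light pattern, list of buttons, list of joltages)."""
    lights = re.search(r'\[([.#]+)\]', line).group(1)
    buttons = [tuple(int(i) for i in group.split(','))
               for group in re.findall(r'\(([\d,]+)\)', line)]
    joltages = [int(j) for j in re.search(r'\{([\d,]+)\}', line).group(1).split(',')]
    return lights, buttons, joltages

def min_presses(lights, buttons):
    """Try subsets of buttons in order of size; return the size of the first that works."""
    target = {i for i, ch in enumerate(lights) if ch == '#'}
    for k in range(len(buttons) + 1):
        for subset in combinations(buttons, k):
            on = set()
            for button in subset:
                on ^= set(button)      # toggling is a symmetric difference
            if on == target:
                return k
    return None                        # no combination of buttons reaches the target

line = "[#....] (0,2,3) (0,2,3,4) (2,3) (0,1,2) (0,3,4) (3) (1,2) {75,18,60,71,39}"
lights, buttons, joltages = parse_machine(line)
```

*For the machine above, `min_presses(lights, buttons)` is 2: pressing `(0,2,3)` and `(2,3)` leaves only light 0 on. Trying all subsets is exponential in the number of buttons, which is fine for 7 buttons but motivates smarter searches for larger machines.*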



*Gemini had no problem with **Part 1:***

In [45]:
import re

def solve(input_text):
    
    # If you have the input in a file, uncomment the lines below:
    # with open("input.txt", "r") as f:
    #     input_text = f.read()

    machines = input_text.strip().split('\n')
    total_presses = 0

    print(f"Processing {len(machines)} machines...")

    for line_idx, line in enumerate(machines):
        if not line.strip():
            continue

        # --- 1. PARSING ---
        # Extract target state inside brackets [...]
        target_match = re.search(r'\[([.#]+)\]', line)
        if not target_match:
            continue
        
        target_str = target_match.group(1)
        
        # Create integer bitmask for target (Bit 0 = Index 0)
        target_mask = 0
        for i, char in enumerate(target_str):
            if char == '#':
                target_mask |= (1 << i)

        # Extract buttons inside parentheses (...)
        # We ignore the {...} joltage part completely
        button_matches = re.findall(r'\(([\d,]+)\)', line)
        buttons = []
        
        for b_str in button_matches:
            indices = [int(x) for x in b_str.split(',')]
            b_mask = 0
            for idx in indices:
                b_mask |= (1 << idx)
            buttons.append(b_mask)

        # --- 2. SOLVER (Meet-in-the-Middle) ---
        # We need to find a subset of buttons that XOR to target_mask
        # minimizing the subset size.
        
        n = len(buttons)
        mid = n // 2
        
        left_buttons = buttons[:mid]
        right_buttons = buttons[mid:]

        # Helper to generate all reachable states and min cost to get there
        def get_reachable_states(btn_list):
            # Dictionary { mask_value : min_presses }
            states = {0: 0}
            
            for btn_mask in btn_list:
                # Create a copy of current states to iterate over
                # We update states by adding the current button to all existing states
                next_batch = {}
                for mask, count in states.items():
                    new_mask = mask ^ btn_mask
                    new_count = count + 1
                    
                    # If we found a new state, or a shorter way to reach an existing state
                    if new_mask not in states or new_count < states[new_mask]:
                        # Check if we already have a better way in the current batch processing
                        if new_mask not in next_batch or new_count < next_batch[new_mask]:
                            next_batch[new_mask] = new_count
                
                # Update main dictionary
                for m, c in next_batch.items():
                    if m not in states or c < states[m]:
                        states[m] = c
            return states

        # Generate maps for both halves
        left_map = get_reachable_states(left_buttons)
        right_map = get_reachable_states(right_buttons)

        # Find the combination of Left + Right that equals Target
        # Logic: Left_Mask ^ Right_Mask = Target  =>  Left_Mask = Target ^ Right_Mask
        min_presses_for_machine = float('inf')
        found = False

        for r_mask, r_count in right_map.items():
            needed = target_mask ^ r_mask
            
            if needed in left_map:
                current_total = left_map[needed] + r_count
                if current_total < min_presses_for_machine:
                    min_presses_for_machine = current_total
                    found = True

        if found:
            total_presses += min_presses_for_machine
        else:
            print(f"Warning: No solution found for machine {line_idx+1}")
    return total_presses

*It works:*

In [46]:
input_text = get_text(10)

answer(10.1, 441, lambda:
       solve(input_text))

Processing 165 machines...


Puzzle 10.1:    1.9 msec, correct answer: 441            

***Part 2*** *was trickier: now each button press increases the joltage of each numbered light by 1, and we want to know the minimal number of presses to reach the joltage requirements.  Gemini's first solution used z3 (it even gave nice instructions for how to `pip install z3`), but I responded with the prompt **Can you do it without using z3?**, to which Gemini wrote its own Gaussian elimination code:*

In [47]:
import re
from fractions import Fraction

def solve(input_text):

    # with open("input.txt", "r") as f:
    #     input_text = f.read()

    machines = input_text.strip().split('\n')
    total_presses = 0

    for i, line in enumerate(machines):
        if not line.strip(): continue

        # --- 1. PARSING ---
        # Extract targets
        joltage_match = re.search(r'\{([\d,]+)\}', line)
        if not joltage_match: continue
        b_vec = [int(x) for x in joltage_match.group(1).split(',')]
        num_eq = len(b_vec) # Number of counters (rows)

        # Extract buttons
        button_matches = re.findall(r'\(([\d,]+)\)', line)
        buttons = []
        for b_str in button_matches:
            buttons.append([int(x) for x in b_str.split(',')])
        num_vars = len(buttons) # Number of buttons (columns)

        # Build Matrix A
        # A[row][col] = 1 if button col affects counter row
        matrix = [[Fraction(0) for _ in range(num_vars)] for _ in range(num_eq)]
        for col_idx, affected_rows in enumerate(buttons):
            for row_idx in affected_rows:
                matrix[row_idx][col_idx] = Fraction(1)
        
        # We also need the augmented vector B as fractions
        b_vec_frac = [Fraction(x) for x in b_vec]

        # --- 2. GAUSSIAN ELIMINATION ---
        # We perform row reduction to express Pivot variables in terms of Free variables.
        pivot_row = 0
        pivots = {} # Maps row_index -> col_index of the pivot
        pivot_cols = set()

        for col in range(num_vars):
            if pivot_row >= num_eq: break

            # Find a row with a non-zero entry in this column
            sel = -1
            for row in range(pivot_row, num_eq):
                if matrix[row][col] != 0:
                    sel = row
                    break
            
            if sel == -1: continue # This is a free variable column

            # Swap rows
            matrix[pivot_row], matrix[sel] = matrix[sel], matrix[pivot_row]
            b_vec_frac[pivot_row], b_vec_frac[sel] = b_vec_frac[sel], b_vec_frac[pivot_row]

            # Normalize pivot row
            scalar = matrix[pivot_row][col]
            for c in range(col, num_vars):
                matrix[pivot_row][c] /= scalar
            b_vec_frac[pivot_row] /= scalar

            # Eliminate other rows
            for row in range(num_eq):
                if row != pivot_row and matrix[row][col] != 0:
                    factor = matrix[row][col]
                    for c in range(col, num_vars):
                        matrix[row][c] -= factor * matrix[pivot_row][c]
                    b_vec_frac[row] -= factor * b_vec_frac[pivot_row]

            pivots[pivot_row] = col
            pivot_cols.add(col)
            pivot_row += 1

        # Identify Free Variables
        free_cols = [c for c in range(num_vars) if c not in pivot_cols]
        
        # --- 3. SEARCH / OPTIMIZATION ---
        
        # Pre-calculate upper bounds for free variables.
        # Since coefficients are all non-negative in the original problem,
        # a button cannot be pressed more times than the smallest target it affects.
        # (This is a heuristic bound to limit the search space).
        bounds = []
        for f_col in free_cols:
            limit = float('inf')
            # Check original button definition to find tightest constraint
            affected = buttons[f_col]
            if not affected: 
                limit = 0 # Button does nothing
            else:
                for r in affected:
                    limit = min(limit, b_vec[r])
            bounds.append(limit)

        min_total = float('inf')
        found_solution = False

        # Recursive search function for free variables
        def search(idx, current_free_vals):
            nonlocal min_total, found_solution

            # Base case: All free variables assigned
            if idx == len(free_cols):
                # Calculate Pivot variables based on these Free variables
                current_presses = list(current_free_vals) # stores just the free ones for now
                
                # We need to reconstruct the full 'x' vector to calculate sum
                # But calculating pivots is enough to validate.
                
                temp_pivots = {} # col -> value
                
                # Check each equation in RREF
                possible = True
                
                # RREF Equation: Pivot + sum(Coeff * Free) = Constant
                # Pivot = Constant - sum(Coeff * Free)
                
                # Calculate all Pivot values
                current_sum = sum(current_free_vals)
                
                for r in range(len(pivots)):
                    p_col = pivots[r]
                    val = b_vec_frac[r]
                    for i, f_col in enumerate(free_cols):
                        val -= matrix[r][f_col] * current_free_vals[i]
                    
                    # Validation: Must be integer and >= 0
                    if val.denominator != 1 or val < 0:
                        possible = False
                        break
                    temp_pivots[p_col] = int(val)
                    current_sum += int(val)

                if possible:
                    # Check any rows that were all-zero (redundant constraints)
                    # to ensure 0 = B_value (consistency check)
                    for r in range(len(pivots), num_eq):
                        if b_vec_frac[r] != 0:
                            possible = False
                            break
                    
                    if possible:
                        found_solution = True
                        if current_sum < min_total:
                            min_total = current_sum
                return

            # Recursive step: Iterate possible values for this free variable
            # Optimization: The cost function is linear, so we can try to be greedy or just bounds.
            # Given the problem size, full iteration 0..bound is safe.
            # (Usually bounds are small, like 0-10)
            f_col = free_cols[idx]
            limit = bounds[idx]
            
            # If current partial sum already exceeds best found, prune? 
            # (Hard to prune without knowing pivots, but we can try)
            
            for val in range(limit + 1):
                search(idx + 1, current_free_vals + [val])

        # Start search
        if not free_cols:
            # unique solution or no solution
            search(0, [])
        else:
            search(0, [])

        if found_solution:
            total_presses += min_total
        else:
            print(f"Machine {i+1}: No solution found.")

    return total_presses

*The part that says `if not free_cols: search(0, []) else: search(0, [])` is a bit unsettling, and I'm not a big fan of `nonlocal` in this context, but the code works; the only downside is that it takes about 3 seconds to run, a lot more than previous problems.*

In [48]:
answer(10.2, 18559, lambda:
       solve(input_text))

Puzzle 10.2: 3492.1 msec, correct answer: 18559          
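
*As an aside on the duplicated `search(0, [])` branch and the `nonlocal` variables: one possible alternative shape, sketched here under simplifying assumptions, is to let `itertools.product` enumerate the candidate press counts and let `min(..., default=None)` pick the best. This sketch enumerates every button (not just the free variables left after elimination), so it is only usable on tiny machines; it reuses the same bound as the code above (a button cannot be pressed more often than the smallest target it affects). The function name and the toy machine are made up.*

```python
from itertools import product

def min_presses_bruteforce(targets, buttons):
    """Minimum total presses that hit `targets` exactly, or None if impossible."""
    # Upper bound per button: the smallest target among the counters it affects.
    bounds = [min((targets[i] for i in b), default=0) for b in buttons]

    def reached(presses):
        counters = [0] * len(targets)
        for n, b in zip(presses, buttons):
            for i in b:
                counters[i] += n
        return counters == list(targets)

    # With no buttons, product() yields one empty tuple, so no special case is needed.
    totals = (sum(p) for p in product(*(range(b + 1) for b in bounds)) if reached(p))
    return min(totals, default=None)

# Toy machine: three counters, four buttons.
print(min_presses_bruteforce([2, 2, 2], [(0, 1), (1, 2), (0, 2), (0, 1, 2)]))  # 2
```

*Here pressing the last button twice is optimal; the infeasible case simply falls out as `None` rather than needing a `found_solution` flag.*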

*My next prompt was **Can you solve it with scipy?** and Gemini told me:*

    Since SciPy version 1.9.0, there is a dedicated Mixed-Integer Linear Programming (MILP) solver called scipy.optimize.milp
    
*which was something I had just figured out an hour or so earlier. The C++ milp solution is (unsurprisingly) faster than the Python Gaussian elimination code:*

In [49]:
import re
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

def solve(input_text):
    
    machines = input_text.strip().split('\n')
    total_presses = 0

    print(f"Processing {len(machines)} machines using SciPy MILP...")

    for i, line in enumerate(machines):
        if not line.strip(): continue

        # --- 1. PARSING ---
        
        # Targets (RHS of the equation: b)
        joltage_match = re.search(r'\{([\d,]+)\}', line)
        if not joltage_match: continue
        targets = np.array([int(x) for x in joltage_match.group(1).split(',')])
        num_counters = len(targets)

        # Buttons (Columns of matrix A)
        button_matches = re.findall(r'\(([\d,]+)\)', line)
        buttons = []
        for b_str in button_matches:
            buttons.append([int(x) for x in b_str.split(',')])
        num_buttons = len(buttons)

        # Build Matrix A
        # Rows = Counters, Cols = Buttons
        # A[r, c] = 1 if button c increments counter r
        A = np.zeros((num_counters, num_buttons))
        for col, affected_rows in enumerate(buttons):
            for row in affected_rows:
                if row < num_counters:
                    A[row, col] = 1

        # --- 2. SOLVER CONFIGURATION ---

        # Objective: Minimize sum of button presses
        # We want to minimize c @ x, so c is a vector of all 1s.
        c = np.ones(num_buttons)

        # Constraints: A @ x == targets
        # SciPy uses bounds for constraints: lb <= A @ x <= ub
        # For equality, we set lb = ub = targets
        constraints = LinearConstraint(A, lb=targets, ub=targets)

        # Variable Bounds: 0 <= x <= infinity
        # (Cannot press a button negative times)
        bounds = Bounds(lb=0, ub=np.inf)

        # Integrality: Force all variables to be integers
        # 1 means integer, 0 means continuous
        integrality = np.ones(num_buttons)

        # --- 3. RUN MILP SOLVER ---
        res = milp(c=c, constraints=constraints, bounds=bounds, integrality=integrality)

        if res.success:
            # Result values might be floats like 2.999999 or 3.000001, so we round them
            solution = np.round(res.x).astype(int)
            min_presses = np.sum(solution)
            total_presses += min_presses
            # print(f"Machine {i+1}: {min_presses} presses")
        else:
            print(f"Machine {i+1}: No integer solution found.")

    return total_presses

In [50]:
answer(10.2, 18559, lambda:
       solve(input_text))

Processing 165 machines using SciPy MILP...


Puzzle 10.2:   45.3 msec, correct answer: 18559          

*I had been unable to make a search-based approach feasible, and was wondering if Gemini had any good ideas, so I prompted **Can you solve it with search?** and Gemini gave a depth-first search function that correctly solves small machines. Gemini correctly identified the idea that operators are commutative, and applied the ordering of "always press a button that affects the first light that is not at the goal." Gemini confidently states*

     This drastically cuts down the search tree, making it solvable in milliseconds for typical puzzle inputs.

*and it is true that the idea cuts down the search tree, but it still takes seconds, not milliseconds, for even moderate-size machines, and would probably take hours for some of the machines in the input.  So in the end neither I nor Gemini can solve this efficiently with search, but both of us can figure out how to call **milp** to solve it (and Gemini's z3 code probably also works).*

In [51]:
import re
import sys

# Increase recursion depth just in case the targets are high
sys.setrecursionlimit(5000)

def solve_machine_with_search(targets, buttons):
    """
    Finds the minimum button presses to reduce 'targets' to all zeros
    using the available 'buttons'.
    """    
    # Memoization dictionary to store results of states we've already solved
    # Key: tuple(current_counts), Value: int(min_presses)
    memo = {}
    
    # Pre-calculate which buttons affect which indices to speed up the search
    # buttons_affecting[i] = list of button_indices that change counter i
    num_counters = len(targets)
    buttons_affecting = [[] for _ in range(num_counters)]
    for b_idx, b_indices in enumerate(buttons):
        for affected_counter in b_indices:
            if affected_counter < num_counters:
                buttons_affecting[affected_counter].append(b_idx)

    def dfs(current_state):
        # 1. Check Memo
        state_tuple = tuple(current_state)
        if state_tuple in memo:
            return memo[state_tuple]
        
        # 2. Base Case: If all counters are 0, we are done (0 cost)
        if all(x == 0 for x in current_state):
            return 0
        
        # 3. Optimization: Focus Strategy
        # Find the *first* counter that is non-zero. 
        # We MUST press a button that affects this counter eventually.
        # By forcing the search to address counters in order, we reduce branching.
        first_nonzero_idx = -1
        for i, val in enumerate(current_state):
            if val > 0:
                first_nonzero_idx = i
                break
        
        # If somehow we have negative numbers (shouldn't happen with checks below), return infinity
        if first_nonzero_idx == -1:
            # This means all were 0 (handled above) or negative (invalid)
            return float('inf')

        # 4. Try all valid moves
        # Only try buttons that help reduce the `first_nonzero_idx`
        best_cost = float('inf')
        
        candidates = buttons_affecting[first_nonzero_idx]
        
        for b_idx in candidates:
            # Calculate new state if we press this button (subtract it)
            button_effect = buttons[b_idx]
            
            # Check if this move is valid (doesn't drop any counter below 0)
            valid_move = True
            new_state = list(current_state)
            
            for affected_idx in button_effect:
                new_state[affected_idx] -= 1
                if new_state[affected_idx] < 0:
                    valid_move = False
                    break
            
            if valid_move:
                # Recurse
                res = dfs(new_state)
                if res != float('inf'):
                    best_cost = min(best_cost, 1 + res)
        
        # 5. Store and return
        memo[state_tuple] = best_cost
        return best_cost

    return dfs(targets)

def main(input_text):
    
    machines = input_text.strip().split('\n')
    total_presses = 0

    print(f"Processing {len(machines)} machines using Search...")

    for i, line in enumerate(machines):
        if not line.strip(): continue

        # --- Parsing ---
        # Get targets inside {}
        joltage_match = re.search(r'\{([\d,]+)\}', line)
        if not joltage_match: continue
        targets = [int(x) for x in joltage_match.group(1).split(',')]

        # Get buttons inside ()
        button_matches = re.findall(r'\(([\d,]+)\)', line)
        buttons = []
        for b_str in button_matches:
            buttons.append([int(x) for x in b_str.split(',')])

        # --- Solve ---
        min_presses = solve_machine_with_search(targets, buttons)
        if min_presses == float('inf'):
            print(f"Machine {i+1}: No solution possible.")
        else:
            print(f"Machine {i+1}: {min_presses} presses")
            total_presses += min_presses

    return total_presses

# Day 11: Claude Opus 4.5

*For [**Day 11**](https://adventofcode.com/2025/day/11) we are given inputs like `qxn: mow khk`, which means that device `qxn` outputs to `mow` and `khk`, and are asked how many distinct output paths there are from the device named `you` to the device named `out`.*

*Claude had no trouble solving **Part 1**. It even volunteered two possible implementations of `count_paths`. One thing was strange:*

     # Convert to frozenset for caching
     graph_tuple = {k: tuple(v) for k, v in graph.items()}

*I can see where this is coming from: lists are mutable and thus not hashable, so you can't have them in a cache, but frozensets and tuples are immutable and cacheable. Maybe Claude was first going to convert `v` to a frozenset and then decided on a tuple? However, a dict is not hashable, so this doesn't help. In the end, Claude decided to use the inner function `dp`, so the graph is not in the cache anyway, and this step is not needed. Maybe Claude changed its mind and forgot to update this part of the code.*
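
*The hashability point can be checked directly. The following is a small illustrative sketch on a made-up three-edge graph (the helper names `is_hashable` and `count_paths_plain` are hypothetical): the tuple values are hashable, but the dict that holds them is not, and an inner cached function only ever hashes its own argument.*

```python
from functools import lru_cache

def is_hashable(obj) -> bool:
    """True if hash(obj) succeeds, i.e. obj could be used as a cache key."""
    try:
        hash(obj)
        return True
    except TypeError:
        return False

graph = {'you': ['a', 'b'], 'a': ['out'], 'b': ['out']}   # made-up example graph
graph_tuple = {k: tuple(v) for k, v in graph.items()}

print(is_hashable(graph['you']))        # False: a list
print(is_hashable(graph_tuple['you']))  # True: a tuple of strings
print(is_hashable(graph_tuple))         # False: still a dict

def count_paths_plain(graph, start, end):
    """Count paths in a DAG; the graph is captured by closure, never hashed."""
    @lru_cache(maxsize=None)
    def dp(node):                        # only `node` (a str) is a cache key
        if node == end:
            return 1
        return sum(dp(nbr) for nbr in graph.get(node, ()))
    return dp(start)

print(count_paths_plain(graph, 'you', 'out'))  # 2
```

*So the plain dict of lists works unchanged with the inner-function design.*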

In [52]:
"""
Solution for the device path counting puzzle.
Count all paths from 'you' to 'out' in a directed graph.
"""

from collections import defaultdict
from functools import lru_cache

def parse_input(text: str) -> dict:
    """Parse the input into a graph (adjacency list)."""
    graph = defaultdict(list)
    
    for line in text.strip().split('\n'):
        line = line.strip()
        if not line or ':' not in line:
            continue
        
        parts = line.split(':')
        source = parts[0].strip()
        targets = parts[1].strip().split()
        
        for target in targets:
            graph[source].append(target)
    
    return graph


def count_paths(graph: dict, start: str, end: str) -> int:
    """
    Count all paths from start to end using memoization.
    Since this is a DAG (data only flows forward), we can use dynamic programming.
    """
    # Convert to frozenset for caching
    graph_tuple = {k: tuple(v) for k, v in graph.items()}
    
    @lru_cache(maxsize=None)
    def dp(node: str) -> int:
        if node == end:
            return 1
        
        if node not in graph_tuple:
            return 0
        
        total = 0
        for neighbor in graph_tuple[node]:
            total += dp(neighbor)
        
        return total
    
    return dp(start)


def count_paths_iterative(graph: dict, start: str, end: str) -> int:
    """
    Alternative: Count paths using DFS without memoization.
    Works for graphs with cycles too (though problem says data flows forward only).
    """
    count = 0
    stack = [start]
    
    def dfs(node, visited):
        nonlocal count
        if node == end:
            count += 1
            return
        
        if node not in graph:
            return
        
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                dfs(neighbor, visited)
                visited.remove(neighbor)
    
    dfs(start, {start})
    return count


def solve(input_text: str) -> int:
    """Solve the puzzle and return the number of paths."""
    graph = parse_input(input_text)
    return count_paths(graph, 'you', 'out')

In [53]:
input_text = get_text(11)

answer(11.1, 574, lambda:
       solve(input_text))

Puzzle 11.1:    0.3 msec, correct answer: 574            

*For **Part 2** we are asked how many paths from `svr` to `out` visit both `dac` and `fft`, and Claude again gets it right:*

In [54]:
"""
Solution for the device path counting puzzle (Part 2).
Count all paths from 'svr' to 'out' that visit both 'dac' and 'fft'.
"""

from collections import defaultdict
from functools import lru_cache

def parse_input(text: str) -> dict:
    """Parse the input into a graph (adjacency list)."""
    graph = defaultdict(list)
    
    for line in text.strip().split('\n'):
        line = line.strip()
        if not line or ':' not in line:
            continue
        
        parts = line.split(':')
        source = parts[0].strip()
        targets = parts[1].strip().split()
        
        for target in targets:
            graph[source].append(target)
    
    return graph


def count_paths_through_nodes(graph: dict, start: str, end: str, required: set) -> int:
    """
    Count all paths from start to end that visit all required nodes.
    
    We use a state-based DP where the state includes which required nodes have been visited.
    State = (current_node, frozenset of visited required nodes)
    """
    graph_tuple = {k: tuple(v) for k, v in graph.items()}
    required_frozenset = frozenset(required)
    
    @lru_cache(maxsize=None)
    def dp(node: str, visited_required: frozenset) -> int:
        # Update visited required nodes
        if node in required:
            visited_required = visited_required | {node}
        
        # If we reached the end
        if node == end:
            # Only count if we've visited all required nodes
            if visited_required == required_frozenset:
                return 1
            else:
                return 0
        
        # If node has no outgoing edges
        if node not in graph_tuple:
            return 0
        
        # Sum paths through all neighbors
        total = 0
        for neighbor in graph_tuple[node]:
            total += dp(neighbor, visited_required)
        
        return total
    
    # Start with empty set of visited required nodes (or with start if it's required)
    initial_visited = frozenset()
    if start in required:
        initial_visited = frozenset({start})
    
    return dp(start, initial_visited)


def solve(input_text: str) -> int:
    """Solve the puzzle and return the number of valid paths."""
    graph = parse_input(input_text)
    required_nodes = {'dac', 'fft'}
    return count_paths_through_nodes(graph, 'svr', 'out', required_nodes)

In [55]:
answer(11.2, 306594217920240, lambda:
       solve(input_text))

Puzzle 11.2:    0.8 msec, correct answer: 306594217920240
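
*For comparison, here is a sketch of a different technique from the one Claude used (this is not Claude's code). It assumes the graph is a DAG, as the Part 1 docstring states, and that the two required nodes are distinct from each other and from the start and end. A path that visits both required nodes must visit them in one of two orders, so the count is a sum of two products of plain path counts; in a DAG at most one of the two products is nonzero. The function names and the toy graph are made up.*

```python
from functools import lru_cache

def count_paths(graph, start, end):
    """Number of paths from start to end in a DAG given as {node: [neighbors]}."""
    @lru_cache(maxsize=None)
    def dp(node):
        if node == end:
            return 1
        return sum(dp(nbr) for nbr in graph.get(node, ()))
    return dp(start)

def count_paths_via_both(graph, start, end, a, b):
    """Paths from start to end that visit both a and b, in either order."""
    def chain(*stops):
        total = 1
        for u, v in zip(stops, stops[1:]):   # consecutive legs of the route
            total *= count_paths(graph, u, v)
        return total
    return chain(start, a, b, end) + chain(start, b, a, end)

toy = {'svr': ['a', 'b'], 'a': ['dac'], 'b': ['fft'],
       'dac': ['fft', 'out'], 'fft': ['out', 'c'], 'c': ['out']}
print(count_paths_via_both(toy, 'svr', 'out', 'dac', 'fft'))  # 2
```

*In the toy graph the two qualifying paths are `svr a dac fft out` and `svr a dac fft c out`.*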

# Day 12: ChatGPT 5.2 Auto

***Note:*** *ChatGPT has a new version today: 5.2. I don't notice a big difference from 5.1, but I only did this one interaction.*

*For [**Day 12**](https://adventofcode.com/2025/day/12) we are given some 3x3 grids describing the shapes of some oddly-shaped Christmas presents, then are given some regions with given width and length, and asked if a specified number of presents of each kind can fit in the region.*

*In my prompt I included my actual input, because that is key to the shortcut for solving the problem (which I covered in [**my notebook**](Advent-2025.ipynb)). ChatGPT didn't detect the shortcut and wrote code to rotate the shapes and search through possible placements. ChatGPT did have the check for `total_area > W * H`, so it is able to instantly reject the regions with too many presents (about half of them). But for the regions where there is a trivial fit into 3x3 squares, ChatGPT's code still tries to pack them in tightly rather than doing the simple layout.* 
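
*The two cheap checks described above can be sketched as follows. This is a hedged illustration, not ChatGPT's code and not necessarily how my other notebook does it: it assumes every shape fits in a 3x3 box, rejects a region when the total area of the presents exceeds the region's area, accepts it when each present can be given its own 3x3 box, and otherwise reports that it cannot decide. The name `quick_fit` and the `areas` argument (the number of `#` cells per shape) are hypothetical.*

```python
def quick_fit(W, H, counts, areas):
    """Return True or False when a cheap test decides the region, else None.

    counts[i] is how many presents of shape i are requested;
    areas[i] is the number of '#' cells in shape i.
    """
    total_area = sum(n * a for n, a in zip(counts, areas))
    if total_area > W * H:
        return False                       # too many cells to fit at all
    if sum(counts) <= (W // 3) * (H // 3):
        return True                        # one 3x3 box per present
    return None                            # undecided: would need a real search

print(quick_fit(12, 6, [1, 0, 1, 0, 2, 2], [7] * 6))  # True
print(quick_fit(3, 3, [2], [5]))                      # False
print(quick_fit(12, 5, [1, 0, 1, 0, 2, 2], [7] * 6))  # None
```

*A `None` result is where a search like the one below would still be needed; whether any regions in a given input land there depends on the input.*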

In [56]:
"""
Polyomino packing checker (rectangular regions, grid-aligned, rotations+flips allowed).

Input format (as described):
- Shapes section: repeated blocks like
    0:
    ### 
    ##.
    ...
  separated by blank lines.
- Then a blank line, then regions like:
    12x5: 1 0 1 0 2 2

Output: number of regions that can fit all requested presents.
"""

from dataclasses import dataclass
from functools import lru_cache
from typing import Dict, Iterable, List, Sequence, Tuple


Coord = Tuple[int, int]  # (x, y)


@dataclass(frozen=True)
class Shape:
    idx: int
    cells: Tuple[Coord, ...]          # canonical cells for the "base" orientation
    orientations: Tuple[Tuple[Coord, ...], ...]  # all unique orientations
    area: int


def _normalize(cells: Iterable[Coord]) -> Tuple[Coord, ...]:
    cells = list(cells)
    min_x = min(x for x, _ in cells)
    min_y = min(y for _, y in cells)
    norm = sorted((x - min_x, y - min_y) for x, y in cells)
    return tuple(norm)


def _rot90(cells: Iterable[Coord]) -> List[Coord]:
    # Rotate around origin: (x, y) -> (y, -x)
    return [(y, -x) for x, y in cells]


def _flip_x(cells: Iterable[Coord]) -> List[Coord]:
    # Mirror across y-axis: (x, y) -> (-x, y)
    return [(-x, y) for x, y in cells]


def _unique_orientations(cells: Tuple[Coord, ...]) -> Tuple[Tuple[Coord, ...], ...]:
    # Start from given cells (already normalized), generate 8 dihedral transforms.
    base = list(cells)
    seen = set()
    outs: List[Tuple[Coord, ...]] = []

    cur = base
    for _ in range(4):
        for variant in (cur, _flip_x(cur)):
            norm = _normalize(variant)
            if norm not in seen:
                seen.add(norm)
                outs.append(norm)
        cur = _rot90(cur)

    # Deterministic ordering: sort by lexicographic coords
    outs.sort()
    return tuple(outs)


def parse_input(text: str) -> Tuple[List[Shape], List[Tuple[int, int, List[int]]]]:
    lines = [ln.rstrip("\n") for ln in text.splitlines()]

    shapes_raw: Dict[int, List[str]] = {}
    regions: List[Tuple[int, int, List[int]]] = []

    i = 0
    # Parse shapes until we encounter a region line (contains 'x' and ':')
    while i < len(lines):
        ln = lines[i].strip()
        if not ln:
            i += 1
            continue
        # Region lines look like "12x5: ..."
        if "x" in ln and ":" in ln and ln.split(":", 1)[0].count("x") == 1 and ln.split(":", 1)[0].replace("x", "").isdigit() is False:
            # That heuristic is messy; instead: try parse as region
            pass

        # Prefer: shape header "N:"
        if ln.endswith(":") and ln[:-1].isdigit():
            idx = int(ln[:-1])
            i += 1
            grid: List[str] = []
            while i < len(lines) and lines[i].strip() != "":
                # Stop if we hit something that looks like a region (WxH:)
                if ":" in lines[i] and "x" in lines[i].split(":", 1)[0]:
                    break
                grid.append(lines[i].rstrip())
                i += 1
            shapes_raw[idx] = grid
            continue

        # If not a shape header, we assume we've reached regions section
        break

    # Parse regions from remaining lines
    while i < len(lines):
        ln = lines[i].strip()
        i += 1
        if not ln:
            continue
        if ":" not in ln or "x" not in ln.split(":", 1)[0]:
            continue
        dim, rest = ln.split(":", 1)
        w_str, h_str = dim.split("x", 1)
        w, h = int(w_str), int(h_str)
        counts = [int(x) for x in rest.strip().split()] if rest.strip() else []
        regions.append((w, h, counts))

    # Build Shape objects (ensure sorted by idx)
    shapes: List[Shape] = []
    for idx in sorted(shapes_raw.keys()):
        grid = shapes_raw[idx]
        cells: List[Coord] = []
        for y, row in enumerate(grid):
            for x, ch in enumerate(row):
                if ch == "#":
                    cells.append((x, y))
        if not cells:
            raise ValueError(f"Shape {idx} has no # cells")
        canon = _normalize(cells)
        orients = _unique_orientations(canon)
        shapes.append(Shape(idx=idx, cells=canon, orientations=orients, area=len(canon)))

    return shapes, regions


def build_placements(
    W: int, H: int, shape: Shape
) -> List[int]:
    """Return all placement bitmasks for this shape in a W x H rectangle."""
    placements: List[int] = []
    for orient in shape.orientations:
        max_x = max(x for x, _ in orient)
        max_y = max(y for _, y in orient)
        for oy in range(H - max_y):
            for ox in range(W - max_x):
                mask = 0
                for x, y in orient:
                    xx, yy = ox + x, oy + y
                    bit = yy * W + xx
                    mask |= 1 << bit
                placements.append(mask)
    return placements


def can_pack_region(
    W: int,
    H: int,
    shapes: List[Shape],
    counts: List[int],
) -> bool:
    n = len(shapes)
    if len(counts) < n:
        counts = counts + [0] * (n - len(counts))
    elif len(counts) > n:
        raise ValueError("Region counts list longer than number of shapes")

    total_area = sum(counts[i] * shapes[i].area for i in range(n))
    if total_area > W * H:
        return False

    placements_by_type: List[List[int]] = [build_placements(W, H, shapes[i]) for i in range(n)]

    # Only keep types that are actually requested
    types = [i for i in range(n) if counts[i] > 0]
    if not types:
        return True

    # Precompute remaining-area contribution array for fast pruning
    areas = [shapes[i].area for i in range(n)]

    # Use a compact counts tuple only over requested types (stable order)
    type_to_pos = {t: p for p, t in enumerate(types)}
    init_ct = tuple(counts[t] for t in types)

    @lru_cache(maxsize=None)
    def dfs(occ: int, ct: Tuple[int, ...], remaining_area: int) -> bool:
        # Prune by area vs free cells
        free = W * H - bin(occ).count("1")
        if remaining_area > free:
            return False
        if remaining_area == 0:
            return True

        # Choose next type with fewest available placements under current occupancy
        best_pos = -1
        best_type = -1
        best_moves: List[int] = []
        best_len = 10**18

        for pos, t in enumerate(types):
            k = ct[pos]
            if k <= 0:
                continue
            moves = [m for m in placements_by_type[t] if (m & occ) == 0]
            lm = len(moves)
            if lm == 0:
                return False
            if lm < best_len:
                best_len = lm
                best_pos = pos
                best_type = t
                best_moves = moves
                if lm == 1:
                    break

        # Try each placement
        new_ct_base = list(ct)
        new_ct_base[best_pos] -= 1
        new_ct_base = tuple(new_ct_base)
        new_remaining_area = remaining_area - areas[best_type]

        for m in best_moves:
            if dfs(occ | m, new_ct_base, new_remaining_area):
                return True
        return False

    return dfs(0, init_ct, total_area)


def solve(text: str) -> int:
    shapes, regions = parse_input(text)
    ok = 0
    for W, H, counts in regions:
        if can_pack_region(W, H, shapes, counts):
            ok += 1
    return ok

*Kudos to ChatGPT for writing code that works, and for quickly rejecting regions where `total_area > W * H`. But by failing to immediately detect the cases where all the presents trivially fit into 3x3 boxes, the program takes two minutes to run, when it could have been done in under a millisecond. I'm not going to make you wait two minutes, but if you want to you can uncomment the code below:*

In [57]:
# text = get_text(12)
# answer(12.1, 454, lambda:
#       solve(text))
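
*A minimal sketch of the quick checks discussed above, assuming (as in this puzzle's input) that every present fits inside a 3x3 bounding box. The name `quick_verdict`, its `areas` parameter, and its three-way return value are hypothetical and not part of ChatGPT's program: it answers at once when the total area is too large or when every present can be given its own 3x3 box, and otherwise signals that a search such as `can_pack_region` is still needed.*

```python
from typing import List, Optional

def quick_verdict(W: int, H: int, areas: List[int], counts: List[int]) -> Optional[bool]:
    """Return True/False when the answer is obvious, or None if a search is needed.

    Assumes every shape fits in a 3x3 bounding box (an assumption about the input).
    """
    total_area = sum(n * a for n, a in zip(counts, areas))
    if total_area > W * H:
        return False                  # more '#' cells than the region holds
    if sum(counts) <= (W // 3) * (H // 3):
        return True                   # each present gets its own 3x3 box
    return None                       # undecided: fall back to the full search

# Hypothetical data: six shape types of 7 cells each.
areas = [7] * 6
print(quick_verdict(12, 12, areas, [2, 2, 2, 2, 2, 2]))   # 12 presents, 16 boxes
print(quick_verdict(4, 4, areas, [1, 1, 1, 0, 0, 0]))     # 21 cells needed, 16 available
```

*Only regions for which this returns `None` would need the expensive search.*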

# AoC Utilities

*I showed this notebook at a [Hacker Dojo](https://hackerdojo.org/) meetup and one comment was that the lines-of-code comparison was unfair, since I used my utilities module, which saved a lot of lines, particularly with parsing the input. I completely agree with this comment, so I asked each of the LLMs **If you were going to do the Advent of Code programming contest, what set of utility functions would you define ahead of time?** In retrospect, I should have done this right from the start.*

*They all were quite familiar with AoC and had ideas for utilities. All three came up with very similar functions for these four areas:*
1) *Input parsing*
2) *Manipulating a 2D grid and (x, y) coordinate points*
3) *Graph search (Dijkstra and breadth-first search)*
4) *Various math functions*

*All three LLMs chose to implement grids as dicts rather than nested lists, perhaps because some AoC problems have grids of unbounded size. Since all three came up with similar code, I asked them **How did you come up with that?** and they each explained:*

***Gemini*** *gave credit to [betaveros](https://github.com/betaveros), [ecnerwala](https://github.com/ecnerwala/aoc-2019), [hyperneutrino](https://github.com/hyperneutrino/aoc-header), [Jonathan Paulson](https://github.com/jonathanpaulson/AdventOfCode), and some guy named [Peter Norvig](https://github.com/norvig/pytudes/tree/main) (for the `ints` function, about which Gemini says* "It is a brilliant "lazy" hack that saves 10 minutes of typing per day.") *Gemini  mentioned that some programmers at the top of the leaderboard use complex numbers for 2D coordinates, but that an `(x, y)` tuple is easier to read.   Gemini also included a complicated function, `solve_cycle`, designed to simulate several generations of a process (like the Game of Life), find a repeated state, and then look up the state for a generation far in the future. Gemini said*: this is a lifesaver for problems that ask "What will the grid look like after 1,000,000,000 steps?"
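
*To illustrate the trade-off Gemini describes, here is a small sketch that puts the tuple convention next to the complex-number convention. The variable names are illustrative only. With complex numbers a step is an addition and a 90-degree turn is a multiplication by `1j`, at some cost in readability.*

```python
# Tuple convention: explicit (x, y) arithmetic.
pos_t, dir_t = (2, 3), (1, 0)
pos_t = (pos_t[0] + dir_t[0], pos_t[1] + dir_t[1])   # take one step
dir_t = (-dir_t[1], dir_t[0])                         # 90-degree turn

# Complex convention: x is the real part, y is the imaginary part.
pos_c, dir_c = 2 + 3j, 1 + 0j
pos_c += dir_c            # take one step
dir_c *= 1j               # the same 90-degree turn

# Both conventions end up in the same place, facing the same way.
assert (pos_c.real, pos_c.imag) == pos_t
assert (dir_c.real, dir_c.imag) == dir_t
```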

***Claude*** *said its inspiration came from* "The AoC problem archetypes themselves" *and* "common frustrations:  The `ints()` regex approach exists because so many inputs look like "`Sensor at x=2, y=18: closest beacon is at x=-2, y=15`" and manually parsing that is tedious."
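
*A short sketch of the regex approach on the line Claude quotes, using the same pattern as the merged `ints` utility. The second call shows a caveat worth keeping in mind: because the pattern accepts a leading minus sign, a hyphenated range is read as a positive number followed by a negative one.*

```python
import re

def ints(s: str) -> list[int]:
    """Extract all integers from a string, including negatives."""
    return list(map(int, re.findall(r'-?\d+', s)))

print(ints("Sensor at x=2, y=18: closest beacon is at x=-2, y=15"))  # [2, 18, -2, 15]
print(ints("3-5"))  # [3, -5]: the hyphen is taken as a minus sign
```

*For inputs that use hyphens as range separators, an unsigned pattern such as `\d+` may be the safer choice.*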

***ChatGPT*** *had the most to say on how it developed its utilities:*

Over the years, reading:
- Reddit solution megathreads
- Writeups by top solvers
- GitHub repos after the contest

…you see the same shapes again and again, even when code differs. Examples:
- Everyone’s Dijkstra looks the same.
- Everyone eventually has a DIRS4. *(Note: mine is called `directions4`)*
- Everyone regrets not having ints() on Day 1.

A more “actionable” mapping (day number → utilities I’d expect)

- Day 1–3: lines, ints, Counter, simple loops
- Day 4–7: tokenization (words, split_strip), sets/maps, maybe regex
- Day 8–10: coordinate sets, manhattan, DIRS4, simple simulation
- Day 11–14: grid regions ⇒ grid_*, neighbors, flood fill (BFS/DFS)
- Day 15–17: shortest path ⇒ dijkstra (+ sometimes “state includes direction”)
- Day 18–20: graph building, BFS/DFS, sometimes union-find (add when needed)
- Day 21–24: DP/memoization ⇒ lru_cache, careful state encoding, sometimes modular math / cycle finding
- Day 25: usually a “build the thing / find the structure” day; helpers depend on puzzle


The template reflects those convergent solutions—but simplified to the core essence.

This is not a library. It’s a memory prosthetic.

*Below I merge the three utility libraries into one for the areas where they all wrote very similar code, and then I give the functions that were unique to each LLM:*

In [58]:
import re
from collections import deque
from functools import reduce
from heapq import heappop, heappush
from math import gcd

# =============================================================================
#  1. INPUT PARSING
# =============================================================================

def read_input(day: int) -> str:
    """Read raw input file."""
    with open(f"day{day:02d}.txt") as f:
        return f.read().strip()

def lines(s: str) -> list[str]:
    """Split into lines."""
    return s.strip().split('\n')

def paragraphs(s: str) -> list[str]:
    """Split on blank lines (common for grouped input)."""
    return s.strip().split('\n\n')

def ints(s: str) -> list[int]:
    """Extract all integers from a string, including negatives."""
    return list(map(int, re.findall(r'-?\d+', s)))

# =============================================================================
#  2. GRID & COORDINATES
# =============================================================================

DIRS4 = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # right, down, left, up
DIRS8 = DIRS4 + [(1, 1), (1, -1), (-1, 1), (-1, -1)]

def grid(s: str) -> dict[tuple[int, int], str]:
    """Parse a 2D grid into {(row, col): char} dict."""
    return {(r, c): ch 
            for r, line in enumerate(s.strip().split('\n')) 
            for c, ch in enumerate(line)}

def neighbors4(r, c):
    return [(r + dr, c + dc) for dr, dc in DIRS4]

def neighbors8(r, c):
    return [(r + dr, c + dc) for dr, dc in DIRS8]

def in_bounds(r, c, rows, cols):
    return 0 <= r < rows and 0 <= c < cols

def manhattan(p1, p2):
    """
    Manhattan distance between two points (tuples).
    """
    return sum(abs(a - b) for a, b in zip(p1, p2))

def print_grid(grid_dict, default='.'):
    """
    Visualizes a dictionary grid.
    """
    if not grid_dict:
        return
    rs = [r for r, c in grid_dict.keys()]
    cs = [c for r, c in grid_dict.keys()]
    min_r, max_r = min(rs), max(rs)
    min_c, max_c = min(cs), max(cs)

    for r in range(min_r, max_r + 1):
        line = ""
        for c in range(min_c, max_c + 1):
            line += str(grid_dict.get((r, c), default))
        print(line)

# =============================================================================
#  3. GRAPH ALGORITHMS
# =============================================================================

def bfs(start, neighbors_fn, goal_fn=None):
    """Generic BFS. Returns the distances dict, or the pair
    (distance_to_goal, distances) if goal_fn is given and a goal is reached."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if goal_fn and goal_fn(node):
            return dist[node], dist
        for neighbor in neighbors_fn(node):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def dijkstra(start, neighbors_fn, goal_fn=None):
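    """neighbors_fn returns [(neighbor, cost), ...]. Returns the distance to the
    first node satisfying goal_fn, or the distances dict if no goal is reached."""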
    dist = {start: 0}
    pq = [(0, start)]
    while pq:
        d, node = heappop(pq)
        if d > dist.get(node, float('inf')):
            continue
        if goal_fn and goal_fn(node):
            return d
        for neighbor, cost in neighbors_fn(node):
            nd = d + cost
            if nd < dist.get(neighbor, float('inf')):
                dist[neighbor] = nd
                heappush(pq, (nd, neighbor))
    return dist

# =============================================================================
#  4. MATH & HELPERS
# =============================================================================

def lcm(a, b):
    return a * b // gcd(a, b)

def lcm_many(nums):
    return reduce(lcm, nums)

def sign(x):
    return (x > 0) - (x < 0)

# =============================================================================
#  Functions unique to Gemini
# =============================================================================

def digits(text):
    """
    Extracts all single digits from a string.
    Example: "a1b2" -> [1, 2]
    """
    return [int(x) for x in re.findall(r'\d', text)]

def solve_cycle(start_state, step_fn, steps_needed):
    """
    Simulates a process until a cycle is found, then fast-forwards to the target step.
    
    :param start_state: The initial state (must be hashable, e.g., tuple or frozenset).
    :param step_fn: Function(state) -> next_state.
    :param steps_needed: The huge number (e.g., 1_000_000_000).
    :return: The state at `steps_needed`.
    """
    seen = {}
    history = []
    curr = start_state
    
    for i in range(steps_needed):
        # If we have seen this state before, we found a cycle!
        if curr in seen:
            first_seen_index = seen[curr]
            cycle_len = i - first_seen_index
            remaining_steps = steps_needed - i
            
            # Calculate where we land in the cycle
            final_index = first_seen_index + (remaining_steps % cycle_len)
            return history[final_index]
        
        # Record state
        seen[curr] = i
        history.append(curr)
        
        # Advance
        curr = step_fn(curr)
        
    return curr
    
# =============================================================================
#  Functions unique to Claude
# =============================================================================

def transpose(grid):
    return list(map(list, zip(*grid)))

def flatten(nested):
    return [item for sublist in nested for item in sublist]

# =============================================================================
#  Functions unique to ChatGPT
# =============================================================================

def words(s: str) -> list[str]:
    return re.findall(r"[A-Za-z]+", s)

def split_strip(s: str, sep: str = ",") -> list[str]:
    return [x.strip() for x in s.split(sep)]

def csv(s, cast=str):
    return [cast(x) for x in s.split(",")]

def windows(xs, n):
    for i in range(len(xs)-n+1):
        yield xs[i:i+n]

def pairwise(xs):
    for a,b in zip(xs, xs[1:]):
        yield a,b
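
*A minimal usage sketch showing how the dict-based grid and the generic `bfs` fit together. The `grid`, `DIRS4`, and `bfs` definitions are repeated from the cell above only so that the snippet runs on its own. The maze and the helper `open_neighbors` are made up for illustration. Note that `g.get(..., '#')` treats any off-grid cell as a wall, so no separate `in_bounds` check is needed, which may be one reason all three LLMs preferred dicts.*

```python
from collections import deque

DIRS4 = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # right, down, left, up

def grid(s):
    """Parse a 2D grid into a {(row, col): char} dict."""
    return {(r, c): ch
            for r, line in enumerate(s.strip().split('\n'))
            for c, ch in enumerate(line)}

def bfs(start, neighbors_fn, goal_fn=None):
    """Return the distances dict, or (distance_to_goal, distances) if a goal is reached."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if goal_fn and goal_fn(node):
            return dist[node], dist
        for neighbor in neighbors_fn(node):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

# Made-up maze: S = start, E = end, # = wall.
maze = """
S.#
.##
..E
"""
g = grid(maze)
start = next(p for p, ch in g.items() if ch == 'S')

def open_neighbors(p):
    """Orthogonal neighbors that are on the grid and not walls."""
    r, c = p
    return [(r + dr, c + dc) for dr, dc in DIRS4
            if g.get((r + dr, c + dc), '#') != '#']

steps, dist = bfs(start, open_neighbors, goal_fn=lambda p: g[p] == 'E')
print(steps)                             # 4
print(len(bfs(start, open_neighbors)))   # 6 reachable cells, including the start
```

*The two calls return different shapes (a pair versus a dict), so callers need to know whether they passed a `goal_fn`.*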

# Run Time Comparison


*The human-written code is roughly **three times faster** than the LLM code (for both total and median times).*

*(This is after throwing out 12.1, because the human interpreted it as "solve my particular input" and the LLM as "solve any possible input", so it is not fair to compare run times.)*

<table>
<tr><th>LLM<th>Human</tr>
<tr><td><pre>
Time in msecs: sum: 1215.3, mean: 55.2, median: 2.8

Puzzle  1.1:    0.7 msec, answer: 1182           
Puzzle  1.2:    0.8 msec, answer: 6907           
Puzzle  2.1:   36.7 msec, answer: 23560874270    
Puzzle  2.2:   38.2 msec, answer: 44143124633    
Puzzle  3.1:    1.7 msec, answer: 17085          
Puzzle  3.2:    2.6 msec, answer: 169408143086082
Puzzle  4.1:    8.5 msec, answer: 1569           
Puzzle  4.2:   33.8 msec, answer: 9280           
Puzzle  5.1:    3.1 msec, answer: 635            
Puzzle  5.2:    0.1 msec, answer: 369761800782619
Puzzle  6.1:    2.9 msec, answer: 5877594983578  
Puzzle  6.2:    2.3 msec, answer: 11159825706149 
Puzzle  7.1:    0.4 msec, answer: 1681           
Puzzle  7.2:    1.0 msec, answer: 422102272495018
Puzzle  8.1:  293.3 msec, answer: 24360          
Puzzle  8.2:  284.5 msec, answer: 2185817796     
Puzzle  9.1:    9.3 msec, answer: 4772103936     
Puzzle  9.2:  447.0 msec, answer: 1529675217     
Puzzle 10.1:    1.9 msec, answer: 441            
Puzzle 10.2:   45.3 msec, answer: 18559          
Puzzle 11.1:    0.3 msec, answer: 574            
Puzzle 11.2:    0.8 msec, answer: 306594217920240
</pre>
<td><pre>
Time in msecs: sum: 366.2, mean: 15.9, median: 0.9

Puzzle  1.1:    0.2 msec, answer: 1182           
Puzzle  1.2:    0.4 msec, answer: 6907           
Puzzle  2.1:    0.1 msec, answer: 23560874270    
Puzzle  2.2:    0.2 msec, answer: 44143124633    
Puzzle  3.1:    0.3 msec, answer: 17085          
Puzzle  3.2:    0.8 msec, answer: 169408143086082
Puzzle  4.1:   18.8 msec, answer: 1569           
Puzzle  4.2:   50.1 msec, answer: 9280           
Puzzle  5.1:    4.4 msec, answer: 635            
Puzzle  5.2:    0.0 msec, answer: 369761800782619
Puzzle  6.1:    0.9 msec, answer: 5877594983578  
Puzzle  6.2:    1.3 msec, answer: 11159825706149 
Puzzle  7.1:    0.4 msec, answer: 1681           
Puzzle  7.2:    0.7 msec, answer: 422102272495018
Puzzle  8.1:   93.0 msec, answer: 24360          
Puzzle  8.2:  113.5 msec, answer: 2185817796     
Puzzle  9.1:   11.7 msec, answer: 4772103936     
Puzzle  9.2:    2.0 msec, answer: 1529675217     
Puzzle 10.1:   21.5 msec, answer: 441            
Puzzle 10.2:   44.3 msec, answer: 18559          
Puzzle 11.1:    0.1 msec, answer: 574            
Puzzle 11.2:    0.9 msec, answer: 306594217920240
</pre>
</table>
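
*The "roughly three times" figure can be checked from the two summary lines in the table. The dict names in this small sketch are illustrative; the numbers are copied from the summaries.*

```python
# Summary figures (msecs) copied from the two columns of the table.
llm   = {"sum": 1215.3, "median": 2.8}
human = {"sum": 366.2,  "median": 0.9}

ratios = {key: llm[key] / human[key] for key in llm}
for key, ratio in ratios.items():
    print(f"{key}: LLM/human = {ratio:.1f}")   # 3.3 for sum, 3.1 for median
```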

# Lines of Code Comparison and Commentary

*The human-written code is about **five times more concise** than the LLM code.*

*The LOC numbers are total lines of code, including blank lines, comments, and doc strings.*

 |  DAY | LLM<br>Name | LLM<br>LOC | Human<br>LOC | Commentary |
 |---:|:------:|---:|-----:|---|
 |  1.1 | Gemini | 57 | 5 |  Nice code, a little over-commented. | 
 |  1.2 | Gemini |  80 | 12 | Both LLM and human erred on the distance from 0 to 0.  | 
 |  2.1 | Claude |  31 | 19 | No comments by Claude today | 
 |  2.2 | Claude |  36 | 19 | Both LLM and human found the more efficient half-digits approach  | 
 |  3.1 | ChatGPT |  24 | 12 | Nice use of type annotations by ChatGPT today | 
 |  3.2 | ChatGPT | 43 | 15 |  | 
 |  4.1 | Gemini |  46 | 10 |  | 
 |  4.2 | Gemini |  67| 7 | LLM chose the less efficient scan-whole-grid approach; when prompted, fixed it  | 
 |  5.1 | Claude |  47 | 12 | Nice modularization and doc strings; they were missing on Day 2 | 
 |  5.2 | Claude | 59 | 10|  | 
 |  6.1 | ChatGPT |  69 | 8 | Poor modularization; bad “if x: True else: False” idiom by ChatGPT  | 
 |  6.2 | ChatGPT |  88 | 27 | LLM overly verbose  | 
 |  7.1 | Gemini |  65 | 14 | When prompted, Gemini added type annotations and nice modularization | 
 |  7.2 | Gemini |  71 | 12 |  | 
 |  8.1 | Claude |  92 | 27 |  | 
 |  8.2 | Claude |  82 | 13| LLM's UnionFind class runs slower than my simpler code  | 
 |  9.1 | ChatGPT |  34 | 8 |  ChatGPT had no type annotations for 9.1, then added them for 9.2| 
 |  9.2 | ChatGPT |  158 | 48 | ChatGPT code a bit complicated; human uses “2 point” trick for speedup  | 
 |  10.1 | Gemini |  103 | 20|  Poor modularization; one 100-line function| 
 |  10.2 | Gemini |  71 | 14 | milp solutions similar; LLM offers other solutions  | 
 |  11.1 | Claude |  85 | 12 | Good modularization, but a bit of vestigial code  | 
 |  11.2 | Claude | 76 | 12 |  Claude used type annotations today| 
 |  12.1 | ChatGPT | 248 |  20 | Human used shortcut to avoid search; LLM wrote slow search (but with type annotations)   | 
 |  **TOTAL** | | **1732** | **356** |   | 
 |  **MEAN** | | **75** | **15.5** | |