## AOC 2024/09
### Part 1

--- Day 9: Disk Fragmenter ---
Another push of the button leaves you in the familiar hallways of some friendly amphipods! Good thing you each somehow got your own personal mini submarine. The Historians jet away in search of the Chief, mostly by driving directly into walls.

While The Historians quickly figure out how to pilot these things, you notice an amphipod in the corner struggling with his computer. He's trying to make more contiguous free space by compacting all of the files, but his program isn't working; you offer to help.

He shows you the disk map (your puzzle input) he's already generated. For example:

```
2333133121414131402
```

The disk map uses a dense format to represent the layout of files and free space on the disk. The digits alternate between indicating the length of a file and the length of free space.

So, a disk map like 12345 would represent a one-block file, two blocks of free space, a three-block file, four blocks of free space, and then a five-block file. A disk map like 90909 would represent three nine-block files in a row (with no free space between them).

Each file on disk also has an ID number based on the order of the files as they appear before they are rearranged, starting with ID 0. So, the disk map 12345 has three files: a one-block file with ID 0, a three-block file with ID 1, and a five-block file with ID 2. Using one character for each block where digits are the file ID and . is free space, the disk map 12345 represents these individual blocks:

```
0..111....22222
```

The first example above, 2333133121414131402, represents these individual blocks:

```
00...111...2...333.44.5555.6666.777.888899
```

The amphipod would like to move file blocks one at a time from the end of the disk to the leftmost free space block (until there are no gaps remaining between file blocks). For the disk map 12345, the process looks like this:

```
0..111....22222
02.111....2222.
022111....222..
0221112...22...
02211122..2....
022111222......
```

The first example requires a few more steps:

```
00...111...2...333.44.5555.6666.777.888899
009..111...2...333.44.5555.6666.777.88889.
0099.111...2...333.44.5555.6666.777.8888..
00998111...2...333.44.5555.6666.777.888...
009981118..2...333.44.5555.6666.777.88....
0099811188.2...333.44.5555.6666.777.8.....
009981118882...333.44.5555.6666.777.......
0099811188827..333.44.5555.6666.77........
00998111888277.333.44.5555.6666.7.........
009981118882777333.44.5555.6666...........
009981118882777333644.5555.666............
00998111888277733364465555.66.............
0099811188827773336446555566..............
```

The final step of this file-compacting process is to update the filesystem checksum. To calculate the checksum, add up the result of multiplying each of these blocks' position with the file ID number it contains. The leftmost block is in position 0. If a block contains free space, skip it instead.

Continuing the first example, the first few blocks' position multiplied by its file ID number are 0 * 0 = 0, 1 * 0 = 0, 2 * 9 = 18, 3 * 9 = 27, 4 * 8 = 32, and so on. In this example, the checksum is the sum of these, 1928.
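To make the checksum rule concrete, here is a minimal sketch that recomputes the stated value from the final layout shown above. It assumes one character per block, which only holds while every file ID is a single digit (true for this example, not necessarily for the real input). The name `checksum_from_layout` is illustrative and not part of the puzzle.

```python
def checksum_from_layout(layout: str) -> int:
    """Sum position * file ID over all non-free blocks of a rendered layout."""
    # '.' marks free space and is skipped; every other character is a one-digit file ID
    return sum(pos * int(ch) for pos, ch in enumerate(layout) if ch != ".")


final_layout = "0099811188827773336446555566.............."
print(checksum_from_layout(final_layout))  # 1928, the value stated in the puzzle
```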

Compact the amphipod's hard drive using the process he requested. What is the resulting filesystem checksum? (Be careful copy/pasting the input for this puzzle; it is a single, very long line.)

**AI Prompt**

I'm a skilled Python developer. Please help me consider various approaches for solving this problem and their pros and cons. I prefer simple, understandable solutions that are easy for other developers to read. It's OK to generate code with test cases without explanation; I'll ask if I don't understand it. Always wrap test code in a function so it doesn't clutter the global namespace.

**AI Response**

For this problem, I see a few potential approaches. Let's discuss them first:

1. String manipulation approach:
   - Parse input into lists of (length, is_file) tuples
   - Simulate moves using string operations
   - Simple but might be less efficient for large inputs

2. List-based simulation:
   - Convert to a list of blocks where each position contains file ID or -1 for empty
   - Easier to manipulate positions and calculate checksum
   - Handles file IDs of 10 and above, which a one-character-per-block string cannot represent
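One practical difference between the two approaches can be sketched as follows: once there are more than ten files, rendering each block as text no longer gives one character per block, so positions in the string stop matching positions on the disk. The helper name `render_blocks` is an assumption made for this illustration.

```python
def render_blocks(blocks: list[int]) -> str:
    """Render a block list as text, using '.' for free space (-1)."""
    return "".join("." if b == -1 else str(b) for b in blocks)


# Eleven one-block files with IDs 0..10 and no free space between them
blocks = list(range(11))
rendered = render_blocks(blocks)
print(len(blocks), len(rendered))  # 11 blocks, but 12 characters ("10" takes two)
```

With the list of integers, index `i` always refers to block `i`, whatever the size of the file ID stored there.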

Would you like me to create a test implementation of either approach first? Or would you prefer to explore a different strategy?

**AI Prompt**

Let's go with the list approach.

**AI Response**

I'll create a test implementation focusing on the list-based approach. Here's a basic structure with test cases:

```python
def parse_disk_map(s: str) -> list[int]:
    """Convert disk map string to list of blocks. -1 represents empty space."""
    result = []
    file_id = 0
    for i, length in enumerate(map(int, s)):
        if i % 2 == 0:  # File
            result.extend([file_id] * length)
            file_id += 1
        else:  # Empty space
            result.extend([-1] * length)
    return result

def test_parse():
    assert parse_disk_map("12345") == [0, -1, -1, 1, 1, 1, -1, -1, -1, -1, 2, 2, 2, 2, 2]
    assert parse_disk_map("2333") == [0, 0, -1, -1, -1, 1, 1, 1, -1, -1, -1]
```

Would you like me to continue with implementing the movement simulation and checksum calculation? Or would you prefer to implement those parts yourself based on this initial structure?

In [None]:
def parse_disk_map(s: str) -> list[int]:
    """Convert disk map string to list of blocks. -1 represents empty space."""
    result = []
    file_id = 0
    for i, length in enumerate(map(int, s)):
        if i % 2 == 0:  # File
            result.extend([file_id] * length)
            file_id += 1
        else:  # Empty space
            result.extend([-1] * length)
    return result

def test_parse():
    assert parse_disk_map("12345") == [0, -1, -1, 1, 1, 1, -1, -1, -1, -1, 2, 2, 2, 2, 2]
    assert parse_disk_map("2333") == [0, 0, -1, -1, -1, 1, 1, 1, -1, -1, -1]
    print("all tests passed")

test_parse()

all tests passed


In [None]:
sample_input = '2333133121414131402'
sample_disk = parse_disk_map(sample_input)
sample_disk = ''.join([str(i) if i>=0 else '.' for i in sample_disk])
sample_disk

'00...111...2...333.44.5555.6666.777.888899'

**AI Prompt**

load full data

**AI Response**

I'll help you load the full data using the advent-of-code-data library which is available:

```python
from aocd import get_data
data = get_data(day=9, year=2024)
print(f"Length of input: {len(data)}")
```

Would you like to see the parsed disk map or shall we proceed with implementing the movement simulation?

In [None]:
from aocd import get_data
full_input = get_data(day=9, year=2024)
full_disk = parse_disk_map(full_input)
len(full_disk)

93759

**AI Prompt**

continue with implementing the movement simulation

**AI Response**

Let's implement the movement simulation. Based on the problem description, we need to move file blocks one at a time from the right end of the disk into the leftmost free block. Here's a proposed implementation with tests:

```python
def compact_disk(disk: list[int]) -> list[int]:
    """Move file blocks one at a time from the right end into the leftmost free block."""
    disk = disk.copy()  # Don't modify input
    n = len(disk)
    
    for i in range(n):
        if disk[i] == -1:  # Found empty space
            # Look for rightmost file to move
            for j in range(n-1, i, -1):
                if disk[j] != -1:  # Found a file
                    disk[i] = disk[j]
                    disk[j] = -1
                    break
    return disk

def calculate_checksum(disk: list[int]) -> int:
    return sum(pos * file_id 
              for pos, file_id in enumerate(disk) 
              if file_id != -1)

def test_compact():
    # Test with example from problem
    disk = parse_disk_map("12345")
    final = compact_disk(disk)
    assert ''.join(str(x) if x != -1 else '.' for x in final) == "022111222......"
    
    # Test checksum
    disk = parse_disk_map("2333133121414131402")
    final = compact_disk(disk)
    assert calculate_checksum(final) == 1928

test_compact()
```

Would you like me to explain any part of this implementation, or shall we try it with the full input?

In [None]:
def calculate_checksum(disk: list[int]) -> int:
    return sum(pos * file_id 
              for pos, file_id in enumerate(disk) 
              if file_id != -1)

**AI Prompt**

Optimize it further by tracking the rightmost known file position, to the right of which only -1 remains.

**AI Response**

Ah yes, good optimization idea. We can avoid scanning the entire right side each time by keeping track of the rightmost non-empty position. Here's the optimized version:

```python
def compact_disk(disk: list[int]) -> list[int]:
    """Move file blocks one at a time from the right end into the leftmost free block."""
    disk = disk.copy()
    n = len(disk)
    right_pos = n - 1
    
    # Find initial rightmost file position
    while right_pos >= 0 and disk[right_pos] == -1:
        right_pos -= 1
    
    for i in range(n):
        if disk[i] == -1 and i < right_pos:  # Found empty space
            # Move file from right_pos
            disk[i] = disk[right_pos]
            disk[right_pos] = -1
            # Find next rightmost file
            right_pos -= 1
            while right_pos >= i and disk[right_pos] == -1:
                right_pos -= 1
    return disk

# Let's verify it still passes the tests
def test_compact():
    disk = parse_disk_map("12345")
    final = compact_disk(disk)
    assert ''.join(str(x) if x != -1 else '.' for x in final) == "022111222......"
    
    disk = parse_disk_map("2333133121414131402")
    final = compact_disk(disk)
    assert calculate_checksum(final) == 1928

test_compact()
```

Would you like to compare the performance with the full input, or would you like to optimize it further?

In [None]:
def compact_disk(disk: list[int]) -> list[int]:
    """Move file blocks one at a time from the right end into the leftmost free block."""
    disk = disk.copy()
    n = len(disk)
    right_pos = n - 1
    
    # Find initial rightmost file position
    while right_pos >= 0 and disk[right_pos] == -1:
        right_pos -= 1
    
    for i in range(n):
        if disk[i] == -1 and i < right_pos:  # Found empty space
            # Move file from right_pos
            disk[i] = disk[right_pos]
            disk[right_pos] = -1
            # Find next rightmost file
            right_pos -= 1
            while right_pos >= i and disk[right_pos] == -1:
                right_pos -= 1
    return disk

# Let's verify it still passes the tests
def test_compact():
    disk = parse_disk_map("12345")
    final = compact_disk(disk)
    assert ''.join(str(x) if x != -1 else '.' for x in final) == "022111222......"
    
    disk = parse_disk_map("2333133121414131402")
    final = compact_disk(disk)
    assert calculate_checksum(final) == 1928
    print("all tests passed")

test_compact()

all tests passed
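Beyond the checksum, the Part 1 goal ("no gaps remaining between file blocks") can be checked directly. The sketch below is a hypothetical helper, `is_compacted`, which is not part of the session above; it only assumes the same convention that -1 marks a free block.

```python
def is_compacted(disk: list[int]) -> bool:
    """Return True if no free block (-1) appears before any file block."""
    # Everything from the first free block onward must also be free
    first_free = disk.index(-1) if -1 in disk else len(disk)
    return all(block == -1 for block in disk[first_free:])


# Final and initial layouts of the "12345" example from the puzzle text
print(is_compacted([0, 2, 2, 1, 1, 1, 2, 2, 2, -1, -1, -1, -1, -1, -1]))  # True
print(is_compacted([0, -1, -1, 1, 1, 1, -1, -1, -1, -1, 2, 2, 2, 2, 2]))  # False
```

A check like this could be applied to the result of `compact_disk` on the full input, where the expected layout is not known in advance.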


In [None]:
def solve_part1():
    cdisk = compact_disk(full_disk)
    return calculate_checksum(cdisk)

solve_part1()

6201130364722

That was the correct answer.

### Part 2

--- Part Two ---
Upon completion, two things immediately become clear. First, the disk definitely has a lot more contiguous free space, just like the amphipod hoped. Second, the computer is running much more slowly! Maybe introducing all of that file system fragmentation was a bad idea?

The eager amphipod already has a new plan: rather than move individual blocks, he'd like to try compacting the files on his disk by moving whole files instead.

This time, attempt to move whole files to the leftmost span of free space blocks that could fit the file. Attempt to move each file exactly once in order of decreasing file ID number starting with the file with the highest file ID number. If there is no span of free space to the left of a file that is large enough to fit the file, the file does not move.

The first example from above now proceeds differently:

```
00...111...2...333.44.5555.6666.777.888899
0099.111...2...333.44.5555.6666.777.8888..
0099.1117772...333.44.5555.6666.....8888..
0099.111777244.333....5555.6666.....8888..
00992111777.44.333....5555.6666.....8888..
```

The process of updating the filesystem checksum is the same; now, this example's checksum would be 2858.

Start over, now compacting the amphipod's hard drive using this new method instead. What is the resulting filesystem checksum?

**AI Prompt**

analyze updated problem

**AI Response**

Let's analyze the key differences in Part 2:

1. Movement Rules Changed:
   - Instead of moving individual blocks, we move entire files at once
   - Files move to the leftmost possible free space that can fit them
   - Each file moves at most once
   - Files are processed in decreasing order of file ID

2. Movement Constraints:
   - We need to find continuous spans of free space that can fit the entire file
   - If no suitable space exists to the left, the file stays put

3. Data Structure Needs:
   - We'll need to track file sizes and their current positions
   - We'll need to identify continuous spans of free space
   - The basic list representation can still work, but we might want to add helper functions for:
     - Finding continuous free spaces
     - Moving entire files at once
     - Identifying file sizes and positions
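As a rough illustration of the "finding continuous free spaces" helper mentioned above, the following sketch groups equal neighbouring blocks with `itertools.groupby` and reports each free run as a `(start, length)` pair. The function name `free_spans` and the tuple layout are assumptions made for this example.

```python
from itertools import groupby


def free_spans(disk: list[int]) -> list[tuple[int, int]]:
    """Return (start, length) for every run of free blocks (-1), left to right."""
    spans = []
    pos = 0
    for value, run in groupby(disk):
        length = len(list(run))
        if value == -1:
            spans.append((pos, length))
        pos += length  # advance past this run, whether file or free space
    return spans


# Blocks for the layout "00...111...2"
print(free_spans([0, 0, -1, -1, -1, 1, 1, 1, -1, -1, -1, 2]))  # [(2, 3), (8, 3)]
```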

Would you like me to propose a specific implementation approach, or would you prefer to explore one of these aspects in more detail first?

**AI Prompt**

propose improved data structure

**AI Response**

For Part 2, I suggest a data structure that tracks files and gaps more efficiently. Here's a proposal:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class FileSpan:
    id: int
    start: int
    length: int

@dataclass
class DiskState:
    files: List[FileSpan]    # sorted by position
    gaps: List[FileSpan]     # free spaces, sorted by position
    total_length: int
    
    def find_leftmost_fit(self, file_size: int) -> int:
        """Find leftmost gap that can fit given file size, return start position or -1"""
        for gap in self.gaps:
            if gap.length >= file_size:
                return gap.start
        return -1
```

This structure would give us:
1. Direct access to each file's size and position, without rescanning the block list
2. Easy iteration through gaps to find fitting spaces
3. Simpler movement operations
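To show what a leftmost-fit lookup returns in practice, here is a small standalone sketch that uses plain `(start, length)` tuples instead of the dataclasses above. The gap list is read off the example layout `00...111...2...333.44.5555.6666.777.888899`; the function name `leftmost_fit` is illustrative.

```python
def leftmost_fit(gaps: list[tuple[int, int]], size: int) -> int:
    """Return the start of the first gap with at least `size` blocks, or -1."""
    for start, length in gaps:  # gaps are assumed sorted by start position
        if length >= size:
            return start
    return -1


# Free spans of the example layout, as (start, length)
gaps = [(2, 3), (8, 3), (12, 3), (18, 1), (21, 1), (26, 1), (31, 1), (35, 1)]
print(leftmost_fit(gaps, 2))  # 2: where file 9 (two blocks) lands first
print(leftmost_fit(gaps, 4))  # -1: no gap can hold a four-block file
```

Note that the lookup is a linear scan over the gaps, so its cost grows with the number of gaps rather than being constant.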

Would you like me to implement the conversion from our current list representation to this structure, or would you prefer to explore a different approach?

In [None]:
from dataclasses import dataclass
from typing import List

@dataclass
class FileSpan:
    id: int
    start: int
    length: int

@dataclass
class DiskState:
    files: List[FileSpan]    # sorted by position
    gaps: List[FileSpan]     # free spaces, sorted by position
    total_length: int
    
    def find_leftmost_fit(self, file_size: int) -> int:
        """Find leftmost gap that can fit given file size, return start position or -1"""
        for gap in self.gaps:
            if gap.length >= file_size:
                return gap.start
        return -1

**AI Prompt**

implement conversion

**AI Response**

Here's the implementation of the conversion function with tests using the example from the problem:

```python
def list_to_disk_state(disk: list[int]) -> DiskState:
    """Convert list representation to DiskState"""
    files = []
    gaps = []
    n = len(disk)
    
    # Find continuous spans
    start = 0
    while start < n:
        if disk[start] == -1:
            # Found start of a gap
            gap_start = start
            while start < n and disk[start] == -1:
                start += 1
            gaps.append(FileSpan(id=-1, start=gap_start, length=start-gap_start))
        else:
            # Found start of a file
            file_id = disk[start]
            file_start = start
            while start < n and disk[start] == file_id:
                start += 1
            files.append(FileSpan(id=file_id, start=file_start, length=start-file_start))
    
    return DiskState(files=files, gaps=gaps, total_length=n)

def test_conversion():
    # Test with example "2333133121414131402"
    disk = parse_disk_map("2333133121414131402")
    state = list_to_disk_state(disk)
    
    # Verify the conversion matches the example visualization
    # "00...111...2...333.44.5555.6666.777.888899"
    expected_files = [
        FileSpan(id=0, start=0, length=2),
        FileSpan(id=1, start=5, length=3),
        FileSpan(id=2, start=11, length=1),
        FileSpan(id=3, start=15, length=3),
        FileSpan(id=4, start=19, length=2),
        FileSpan(id=5, start=22, length=4),
        FileSpan(id=6, start=27, length=4),
        FileSpan(id=7, start=32, length=3),
        FileSpan(id=8, start=36, length=4),
        FileSpan(id=9, start=40, length=2),
    ]
    
    assert len(state.files) == len(expected_files)
    for actual, expected in zip(state.files, expected_files):
        assert actual == expected, f"Expected {expected}, got {actual}"
    
    print("All conversion tests passed")

test_conversion()
```

Would you like me to implement the movement logic next, or would you prefer to review and modify this conversion implementation first?

In [None]:
def list_to_disk_state(disk: list[int]) -> DiskState:
    """Convert list representation to DiskState"""
    files = []
    gaps = []
    n = len(disk)
    
    # Find continuous spans
    start = 0
    while start < n:
        if disk[start] == -1:
            # Found start of a gap
            gap_start = start
            while start < n and disk[start] == -1:
                start += 1
            gaps.append(FileSpan(id=-1, start=gap_start, length=start-gap_start))
        else:
            # Found start of a file
            file_id = disk[start]
            file_start = start
            while start < n and disk[start] == file_id:
                start += 1
            files.append(FileSpan(id=file_id, start=file_start, length=start-file_start))
    
    return DiskState(files=files, gaps=gaps, total_length=n)

def test_conversion():
    # Test with example "2333133121414131402"
    disk = parse_disk_map("2333133121414131402")
    state = list_to_disk_state(disk)
    
    # Verify the conversion matches the example visualization
    # "00...111...2...333.44.5555.6666.777.888899"
    expected_files = [
        FileSpan(id=0, start=0, length=2),
        FileSpan(id=1, start=5, length=3),
        FileSpan(id=2, start=11, length=1),
        FileSpan(id=3, start=15, length=3),
        FileSpan(id=4, start=19, length=2),
        FileSpan(id=5, start=22, length=4),
        FileSpan(id=6, start=27, length=4),
        FileSpan(id=7, start=32, length=3),
        FileSpan(id=8, start=36, length=4),
        FileSpan(id=9, start=40, length=2),
    ]
    
    assert len(state.files) == len(expected_files)
    for actual, expected in zip(state.files, expected_files):
        assert actual == expected, f"Expected {expected}, got {actual}"
    
    print("All conversion tests passed")

test_conversion()

All conversion tests passed


**AI Prompt**

Let's implement the updated compacting.

**AI Response**

I'll implement the new compacting logic according to Part 2 rules. Here's the implementation with test cases:

```python
def compact_disk_v2(state: DiskState) -> DiskState:
    """Compact disk by moving whole files, in order of decreasing file ID"""
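    # Copy each FileSpan too: a shallow list copy would share the objects,
    # so moving a file would also mutate the caller's state
    state = DiskState(
        files=[FileSpan(f.id, f.start, f.length) for f in state.files],
        gaps=state.gaps.copy(),
        total_length=state.total_length
    )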
    
    # Process files in decreasing order of ID
    for file in sorted(state.files, key=lambda f: f.id, reverse=True):
        # Find leftmost gap that can fit this file
        new_pos = state.find_leftmost_fit(file.length)
        
        if new_pos != -1 and new_pos < file.start:  # Can move left
            # Update gaps
            gap_idx = next(i for i, g in enumerate(state.gaps) if g.start == new_pos)
            old_gap = state.gaps[gap_idx]
            # Create new gap where file was
            state.gaps.append(FileSpan(-1, file.start, file.length))
            # Update the gap we're moving into
            state.gaps[gap_idx] = FileSpan(-1, new_pos + file.length, 
                                         old_gap.length - file.length)
            if state.gaps[gap_idx].length == 0:
                state.gaps.pop(gap_idx)
            # Move file
            file.start = new_pos
            
    # Sort gaps and files by position
    state.gaps.sort(key=lambda x: x.start)
    state.files.sort(key=lambda x: x.start)
    return state

def disk_state_to_list(state: DiskState) -> list[int]:
    """Convert DiskState back to list representation"""
    result = [-1] * state.total_length
    for file in state.files:
        for i in range(file.length):
            result[file.start + i] = file.id
    return result

def test_compact_v2():
    # Test with example from Part 2
    disk = parse_disk_map("2333133121414131402")
    state = list_to_disk_state(disk)
    final = compact_disk_v2(state)
    
    # Convert back to list and calculate checksum
    final_list = disk_state_to_list(final)
    assert calculate_checksum(final_list) == 2858
    print("Compact v2 test passed")

test_compact_v2()
```

Would you like me to explain any part of the implementation, or shall we try it with the full input?

In [None]:
def compact_disk_v2(state: DiskState) -> DiskState:
    """Compact disk by moving whole files, in order of decreasing file ID"""
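    # Copy each FileSpan too: a shallow list copy would share the objects,
    # so moving a file would also mutate the caller's state
    state = DiskState(
        files=[FileSpan(f.id, f.start, f.length) for f in state.files],
        gaps=state.gaps.copy(),
        total_length=state.total_length
    )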
    
    # Process files in decreasing order of ID
    for file in sorted(state.files, key=lambda f: f.id, reverse=True):
        # Find leftmost gap that can fit this file
        new_pos = state.find_leftmost_fit(file.length)
        
        if new_pos != -1 and new_pos < file.start:  # Can move left
            # Update gaps
            gap_idx = next(i for i, g in enumerate(state.gaps) if g.start == new_pos)
            old_gap = state.gaps[gap_idx]
            # Create new gap where file was
            state.gaps.append(FileSpan(-1, file.start, file.length))
            # Update the gap we're moving into
            state.gaps[gap_idx] = FileSpan(-1, new_pos + file.length, 
                                         old_gap.length - file.length)
            if state.gaps[gap_idx].length == 0:
                state.gaps.pop(gap_idx)
            # Move file
            file.start = new_pos
            
    # Sort gaps and files by position
    state.gaps.sort(key=lambda x: x.start)
    state.files.sort(key=lambda x: x.start)
    return state

def disk_state_to_list(state: DiskState) -> list[int]:
    """Convert DiskState back to list representation"""
    result = [-1] * state.total_length
    for file in state.files:
        for i in range(file.length):
            result[file.start + i] = file.id
    return result

def test_compact_v2():
    # Test with example from Part 2
    disk = parse_disk_map("2333133121414131402")
    state = list_to_disk_state(disk)
    final = compact_disk_v2(state)
    
    # Convert back to list and calculate checksum
    final_list = disk_state_to_list(final)
    assert calculate_checksum(final_list) == 2858
    print("Compact v2 test passed")

test_compact_v2()

Compact v2 test passed
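As an independent cross-check of the span-based version, the Part 2 rules can also be applied directly to the block list. The sketch below is a deliberately simple brute-force variant written for this note, not the implementation used above; the names `parse_blocks`, `compact_whole_files_bruteforce`, and `render` are assumptions, and the approach is likely too slow to be pleasant on the full input.

```python
def parse_blocks(disk_map: str) -> list[int]:
    """Expand a disk map into blocks; -1 marks free space."""
    blocks, file_id = [], 0
    for i, length in enumerate(map(int, disk_map)):
        if i % 2 == 0:  # even positions describe files
            blocks.extend([file_id] * length)
            file_id += 1
        else:           # odd positions describe free space
            blocks.extend([-1] * length)
    return blocks


def compact_whole_files_bruteforce(disk: list[int]) -> list[int]:
    """Try to move each file once, highest ID first, into the leftmost fitting free run."""
    disk = disk.copy()
    for file_id in range(max(disk), -1, -1):
        length = disk.count(file_id)
        if length == 0:  # a zero-length file occupies no blocks
            continue
        start = disk.index(file_id)
        run = 0  # length of the current run of free blocks
        for pos in range(start):  # only look to the left of the file
            run = run + 1 if disk[pos] == -1 else 0
            if run == length:
                target = pos - length + 1
                disk[target:target + length] = [file_id] * length
                disk[start:start + length] = [-1] * length
                break
    return disk


def render(disk: list[int]) -> str:
    """Text view, valid only while all file IDs are single digits."""
    return "".join("." if b == -1 else str(b) for b in disk)


result = compact_whole_files_bruteforce(parse_blocks("2333133121414131402"))
print(render(result))  # 00992111777.44.333....5555.6666.....8888..
```

Comparing this output with `disk_state_to_list(compact_disk_v2(...))` on small inputs is one way to gain confidence in the faster version.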


In [None]:
def solve_part2():
    state = list_to_disk_state(full_disk)
    final = compact_disk_v2(state)
    final_list = disk_state_to_list(final)
    return calculate_checksum(final_list) 

solve_part2()

6221662795602
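A possible simplification, sketched here under the assumption that files are kept as `(id, start, length)` spans: the checksum can be computed per span without converting back to a block list, because the positions of a span form an arithmetic series. For a file with ID `id` occupying positions `start` to `start + length - 1`, the contribution is `id * (length * start + length * (length - 1) / 2)`. The function name `checksum_from_spans` is hypothetical.

```python
def checksum_from_spans(spans: list[tuple[int, int, int]]) -> int:
    """Checksum from (file_id, start, length) spans using the arithmetic-series sum."""
    total = 0
    for file_id, start, length in spans:
        # start + (start + 1) + ... + (start + length - 1)
        position_sum = length * start + length * (length - 1) // 2
        total += file_id * position_sum
    return total


# Spans read off the final Part 2 layout "00992111777.44.333....5555.6666.....8888.."
final_spans = [
    (0, 0, 2), (9, 2, 2), (2, 4, 1), (1, 5, 3), (7, 8, 3),
    (4, 12, 2), (3, 15, 3), (5, 22, 4), (6, 27, 4), (8, 36, 4),
]
print(checksum_from_spans(final_spans))  # 2858, matching the puzzle's Part 2 example
```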