# 2024 Day 9: Disk Fragmenter

## Part 1

Another push of the button leaves you in the familiar hallways of some friendly amphipods! Good thing you each somehow got your own personal mini submarine. The Historians jet away in search of the Chief, mostly by driving directly into walls.

While The Historians quickly figure out how to pilot these things, you notice an amphipod in the corner struggling with his computer. He's trying to make more contiguous free space by compacting all of the files, but his program isn't working; you offer to help.

He shows you the disk map (your puzzle input) he's already generated. For example:

```
2333133121414131402
```

The disk map uses a dense format to represent the layout of files and free space on the disk. The digits alternate between indicating the length of a file and the length of free space.

So, a disk map like 12345 would represent a one-block file, two blocks of free space, a three-block file, four blocks of free space, and then a five-block file. A disk map like 90909 would represent three nine-block files in a row (with no free space between them).

Each file on disk also has an ID number based on the order of the files as they appear before they are rearranged, starting with ID 0. So, the disk map 12345 has three files: a one-block file with ID 0, a three-block file with ID 1, and a five-block file with ID 2. Using one character for each block where digits are the file ID and . is free space, the disk map 12345 represents these individual blocks:

```
0..111....22222
```

The first example above, 2333133121414131402, represents these individual blocks:

```
00...111...2...333.44.5555.6666.777.888899
```
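The expansion from the dense format to individual blocks can be sketched as below. This is a minimal illustration rather than the notebook's own parser: the names `expand_disk_map`, `render`, and `FREE` are hypothetical, and `render` is only unambiguous while every file ID is a single digit.

```python
FREE = '.'  # marker for a free block (assumed here to match the puzzle's drawings)


def expand_disk_map(dense: str) -> list:
    """Expand a dense disk map into one entry per block (file ID or FREE)."""
    blocks = []
    for index, digit in enumerate(dense.strip()):
        length = int(digit)
        if index % 2 == 0:
            # Even positions hold file lengths; the file ID is index // 2
            blocks += [index // 2] * length
        else:
            # Odd positions hold free-space lengths
            blocks += [FREE] * length
    return blocks


def render(blocks: list) -> str:
    """Draw the blocks as in the puzzle text (one character per block)."""
    return ''.join(str(block) for block in blocks)


print(render(expand_disk_map('12345')))                # 0..111....22222
print(render(expand_disk_map('2333133121414131402')))  # 00...111...2...333.44.5555.6666.777.888899
```

Keeping the blocks in a list, rather than a string, means a file ID of 10 or more still occupies exactly one position per block.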

The amphipod would like to move file blocks one at a time from the end of the disk to the leftmost free space block (until there are no gaps remaining between file blocks). For the disk map 12345, the process looks like this:

```
0..111....22222
02.111....2222.
022111....222..
0221112...22...
02211122..2....
022111222......
```

The first example requires a few more steps:

```
00...111...2...333.44.5555.6666.777.888899
009..111...2...333.44.5555.6666.777.88889.
0099.111...2...333.44.5555.6666.777.8888..
00998111...2...333.44.5555.6666.777.888...
009981118..2...333.44.5555.6666.777.88....
0099811188.2...333.44.5555.6666.777.8.....
009981118882...333.44.5555.6666.777.......
0099811188827..333.44.5555.6666.77........
00998111888277.333.44.5555.6666.7.........
009981118882777333.44.5555.6666...........
009981118882777333644.5555.666............
00998111888277733364465555.66.............
0099811188827773336446555566..............
```
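The traces above can be reproduced with a small generator that yields the layout after every single-block move. This is a sketch under stated assumptions: `compaction_steps` and `FREE` are hypothetical names, and the string rendering only works for single-digit file IDs.

```python
FREE = '.'  # marker for a free block


def compaction_steps(blocks: list):
    """Yield the starting layout, then the layout after each one-block move."""
    blocks = list(blocks)  # work on a copy so the caller's list is untouched
    left, right = 0, len(blocks) - 1
    yield ''.join(map(str, blocks))
    while left < right:
        if blocks[left] != FREE:
            left += 1    # already a file block, nothing to fill here
        elif blocks[right] == FREE:
            right -= 1   # nothing to move from here
        else:
            # Move the last file block into the leftmost free block
            blocks[left], blocks[right] = blocks[right], FREE
            yield ''.join(map(str, blocks))


# Blocks for the disk map 12345
start = [0, FREE, FREE, 1, 1, 1, FREE, FREE, FREE, FREE, 2, 2, 2, 2, 2]
for layout in compaction_steps(start):
    print(layout)
```

For the disk map 12345 this prints the same six lines as the shorter trace above, ending in `022111222......`.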

The final step of this file-compacting process is to update the filesystem checksum. To calculate the checksum, add up the result of multiplying each of these blocks' position with the file ID number it contains. The leftmost block is in position 0. If a block contains free space, skip it instead.

Continuing the first example, the first few blocks' position multiplied by its file ID number are 0 * 0 = 0, 1 * 0 = 0, 2 * 9 = 18, 3 * 9 = 27, 4 * 8 = 32, and so on. In this example, the checksum is the sum of these, 1928.
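The checksum rule translates directly into a sum over positions. The sketch below uses a hypothetical helper named `checksum` and rebuilds the compacted example from its drawing, which is only possible because every file ID in the example is a single digit.

```python
FREE = '.'  # marker for a free block


def checksum(blocks: list) -> int:
    """Sum position * file ID over every block that holds part of a file."""
    return sum(
        position * file_id
        for position, file_id in enumerate(blocks)
        if file_id != FREE
    )


# The final layout of the first example, one list entry per block
drawing = '0099811188827773336446555566..............'
compacted = [FREE if char == '.' else int(char) for char in drawing]

print(checksum(compacted))  # 1928
```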

Compact the amphipod's hard drive using the process he requested. What is the resulting filesystem checksum? (Be careful copy/pasting the input for this puzzle; it is a single, very long line.)

In [1]:
# Importing the necessary Python libraries
from copy import deepcopy
from collections import deque

In [2]:
# Loading the disk map (puzzle input) from file
with open('aoc-2024-day-09.txt') as f:
    disk_map = f.read().splitlines()
    
# Setting the sample data
sample_data = '''2333133121414131402'''.splitlines()

# Overwriting the input for testing purposes
# (Note: Comment this line out when ready to use the full puzzle input)
disk_map = sample_data

In [3]:
# Instantiating a value to represent the current ID
current_id = 0

# Instantiating a list to represent the parsed disk map
parsed_disk_map = []

# Setting a boolean to indicate if we are working with a file
is_file = True

# Iterating over the disk map (first line only since it's a single line input)
for char in disk_map[0]:

    # Converting the character to an integer
    char = int(char)

    # Handling the situation where the character represents a file
    if is_file:
        
        # Appending the current file ID once for each block of the file
        parsed_disk_map += [current_id] * char

        # Incrementing the current ID
        current_id += 1

        # Setting the is_file flag to False
        is_file = False

    # Handling the situation where the character represents empty disk space
    else:

        # Appending one '.' for each block of free space
        parsed_disk_map += ['.'] * char

        # Setting the is_file flag to True
        is_file = True

In [4]:
def is_integers_then_periods(lst):
    '''
    Returns True if `lst` consists of zero or more integers 
    followed by zero or more periods ('.').
    Returns False otherwise.
    '''
    # Have we encountered a period yet?
    periods_started = False
    
    for item in lst:
        if isinstance(item, int):
            # If we've already started seeing periods, we can't see an integer now
            if periods_started:
                return False
        elif item == '.':
            # Once we see a period, periods_started is True
            periods_started = True
        else:
            # Neither int nor '.', so fail immediately
            return False
    
    return True



def reorder_disk_space(parsed_disk_map: list) -> list:
    '''
    Reorders the disk map

    Inputs:
        - parsed_disk_map (list): The parsed disk map

    Returns:
        - list: The reordered disk map
    '''
    left_pointer = 0
    right_pointer = len(parsed_disk_map) - 1

    # Walk the two pointers toward each other. Comparing the pointers directly
    # avoids rescanning the list on every iteration, which gets slow on the
    # full puzzle input.
    while left_pointer < right_pointer:

        if parsed_disk_map[left_pointer] != '.':
            # If left_pointer is pointing at a file block, move it right
            left_pointer += 1

        elif parsed_disk_map[right_pointer] == '.':
            # If right_pointer is pointing at free space, move it left
            right_pointer -= 1

        else:
            # left_pointer is at free space and right_pointer is at a file
            # block, so swap them
            parsed_disk_map[left_pointer], parsed_disk_map[right_pointer] = (
                parsed_disk_map[right_pointer], 
                parsed_disk_map[left_pointer]
            )
            right_pointer -= 1
            left_pointer += 1

    return parsed_disk_map

# Reordering the disk space (note: this modifies parsed_disk_map in place)
reordered_disk_map = reorder_disk_space(parsed_disk_map)

# Sanity check: every file block now sits to the left of every free block
assert is_integers_then_periods(reordered_disk_map)

In [5]:
# Instantiating a value to represent final score
final_score = 0

# Iterating over the reordered disk map to produce the final score
# (free-space blocks, stored as '.', contribute nothing)
for i, block in enumerate(reordered_disk_map):
    if block != '.':
        final_score += i * block

print(f'Final score: {final_score}')

Final score: 1928


## Part 2

Upon completion, two things immediately become clear. First, the disk definitely has a lot more contiguous free space, just like the amphipod hoped. Second, the computer is running much more slowly! Maybe introducing all of that file system fragmentation was a bad idea?

The eager amphipod already has a new plan: rather than move individual blocks, he'd like to try compacting the files on his disk by moving whole files instead.

This time, attempt to move whole files to the leftmost span of free space blocks that could fit the file. Attempt to move each file exactly once in order of decreasing file ID number starting with the file with the highest file ID number. If there is no span of free space to the left of a file that is large enough to fit the file, the file does not move.

The first example from above now proceeds differently:

```
00...111...2...333.44.5555.6666.777.888899
0099.111...2...333.44.5555.6666.777.8888..
0099.1117772...333.44.5555.6666.....8888..
0099.111777244.333....5555.6666.....8888..
00992111777.44.333....5555.6666.....8888..
```

The process of updating the filesystem checksum is the same; now, this example's checksum would be 2858.
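One way to sketch the whole-file method is to track each file and each free span as a `[start, length]` pair instead of expanding every block. This is an alternative illustration, not the approach taken in the cells below, and `compact_whole_files` is a hypothetical name. It assumes that space freed by a moved file never needs to be recorded as a new gap: files are processed in decreasing ID order and only ever move left, so freed space always lies to the right of every file still waiting to be processed.

```python
def compact_whole_files(dense: str) -> int:
    """Return the checksum after attempting to move each whole file once."""
    files = []  # files[file_id] = [start, length]
    gaps = []   # [start, length] for each free span, in disk order
    position = 0
    for index, digit in enumerate(dense.strip()):
        length = int(digit)
        if index % 2 == 0:
            files.append([position, length])
        else:
            gaps.append([position, length])
        position += length

    # Try each file exactly once, highest file ID first
    for file_id in range(len(files) - 1, -1, -1):
        start, length = files[file_id]
        for gap in gaps:
            if gap[0] >= start:
                break  # only spans to the left of the file qualify
            if gap[1] >= length:
                files[file_id][0] = gap[0]  # move the file into the span
                gap[0] += length            # the leftover span starts after it
                gap[1] -= length
                break

    # Sum position * file ID over every block of every file
    return sum(
        file_id * block_position
        for file_id, (start, length) in enumerate(files)
        for block_position in range(start, start + length)
    )


print(compact_whole_files('2333133121414131402'))  # 2858
```

Because file IDs are stored as integers and positions are computed arithmetically, this sketch is unaffected by file IDs that need more than one digit.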

Start over, now compacting the amphipod's hard drive using this new method instead. What is the resulting filesystem checksum?

In [6]:
# Loading the disk map (puzzle input) from file
with open('aoc-2024-day-09.txt') as f:
    disk_map = f.read().splitlines()
    
# Setting the sample data
sample_data = '''2333133121414131402'''.splitlines()

# Overwriting the input for testing purposes
# (Note: Comment this line out when ready to use the full puzzle input)
disk_map = sample_data

In [7]:
# Instantiating a value to represent the current ID
current_id = 0

# Instantiating a list to represent the parsed disk map
preparsed_disk_map = []

# Setting a boolean to indicate if we are working with a file
is_file = True

# Iterating over the disk map (first line only since it's a single line input)
for char in disk_map[0]:

    # Converting the character to an integer
    char = int(char)

    # Handling the situation where the character represents a file
    if is_file:
        
        # Appending the current file ID once for each block of the file
        preparsed_disk_map += [current_id] * char

        # Incrementing the current ID
        current_id += 1

        # Setting the is_file flag to False
        is_file = False

    # Handling the situation where the character represents empty disk space
    else:

        # Appending one '.' for each block of free space
        preparsed_disk_map += ['.'] * char

        # Setting the is_file flag to True
        is_file = True

In [8]:
# Initializing the parsed disk map and counters
parsed_disk_map = []
current = preparsed_disk_map[0]
count = 1

# Iterating over the preparsed disk map
for item in preparsed_disk_map[1:]:

    # Incrementing the count for the current item
    if item == current:
        count += 1

    # Moving on to the next step
    else:
        parsed_disk_map.append({current: count})
        current = item
        count = 1

# Appending the final group
parsed_disk_map.append({current: count})
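The grouping step above is a run-length encoding of the block list. As a hedged alternative, the same grouping can be sketched with `itertools.groupby` from the standard library. The name `run_length_encode` is hypothetical, and this sketch returns `(value, count)` tuples rather than the single-key dictionaries used in this notebook.

```python
from itertools import groupby


def run_length_encode(blocks: list) -> list:
    """Collapse each run of consecutive equal blocks into a (value, count) pair."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(blocks)]


blocks = [0, 0, '.', '.', '.', 1, 1, 1, '.', '.', '.', 2]
print(run_length_encode(blocks))
# [(0, 2), ('.', 3), (1, 3), ('.', 3), (2, 1)]
```

Tuples unpack directly (`for value, count in ...`), which avoids the repeated `list(entry.keys())[0]` lookups that single-key dictionaries require.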

In [9]:
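def merge_free_spaces(disk_map_list):
    '''
    Merges adjacent free-space entries into a single entry

    Inputs:
        - disk_map_list (list): Single-key dicts mapping a file ID or '.' to a block count

    Returns:
        - list: The same entries, with each run of adjacent '.' entries combined
    '''
    merged_list = []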

    for entry in disk_map_list:
        key = list(entry.keys())[0]
        value = list(entry.values())[0]
        if merged_list and key == '.' and list(merged_list[-1].keys())[0] == '.':
            merged_list[-1] = {'.': list(merged_list[-1].values())[0] + value}
        else:
            merged_list.append(entry)
    
    return merged_list

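# Working on a deep copy so that parsed_disk_map itself is left untouched
# (Note: indexing and inserting in the middle of a deque is O(n), so the deque offers no speed advantage over a list here)
sorted_parsed_disk_map = deque(deepcopy(parsed_disk_map))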

# Extract files to move (descending file ID order)
files_to_move = sorted(
    [item for item in parsed_disk_map if list(item.keys())[0] != '.'],
    key=lambda x: list(x.keys())[0],
    reverse=True
)

# Iterating over the files to move
for file_to_move in files_to_move:

    # Getting the file ID and file size
    file_id = list(file_to_move.keys())[0]
    file_size = list(file_to_move.values())[0]

    # Finding source index
    source_index = next((i for i, item in enumerate(sorted_parsed_disk_map) if list(item.keys())[0] == file_id), None)

    # If file_id not found, skip this file
    if source_index is None:
        continue

    # Finding destination gap
    destination_index = None
    destination_gap_size = None
    for i in range(source_index):
        if list(sorted_parsed_disk_map[i].keys())[0] == '.' and list(sorted_parsed_disk_map[i].values())[0] >= file_size:
            destination_index = i
            destination_gap_size = list(sorted_parsed_disk_map[i].values())[0]
            break

    # Continuing if no destination found
    if destination_index is None:
        continue

    # Moving file to destination
    sorted_parsed_disk_map[destination_index] = {file_id: file_size}

    # Addressing if the destination gap size is larger than the file size
    leftover = destination_gap_size - file_size
    if leftover > 0:
        # Always insert leftover gap after the moved file
        sorted_parsed_disk_map.insert(destination_index + 1, {'.': leftover})
        if destination_index + 1 <= source_index:
            source_index += 1

    # Replace old file location with gap
    sorted_parsed_disk_map[source_index] = {'.': file_size}

    # Merging any adjacent gaps
    sorted_parsed_disk_map = deque(merge_free_spaces(list(sorted_parsed_disk_map)))

In [10]:
# Unpack the dictionaries in sorted_parsed_disk_map into a flat list of blocks
# (Note: A list is used instead of a string because a file ID of 10 or more
# spans several characters, which would shift the position of every later block)
unpacked = []
for entry in sorted_parsed_disk_map:
    key, length = next(iter(entry.items()))
    unpacked += [key] * length

# Instantiating a value to represent final score
final_score = 0

# Iterating over the reordered disk map to produce the final score
# (free-space blocks, stored as '.', contribute nothing)
for i, block in enumerate(unpacked):
    if block != '.':
        final_score += i * block

print(f'Final score: {final_score}')

Final score: 2858
