# Day 9: Disk Fragmenter
```
Another push of the button leaves you in the familiar hallways of some friendly amphipods! Good thing you each somehow got your own personal mini submarine. The Historians jet away in search of the Chief, mostly by driving directly into walls.

While The Historians quickly figure out how to pilot these things, you notice an amphipod in the corner struggling with his computer. He's trying to make more contiguous free space by compacting all of the files, but his program isn't working; you offer to help.

He shows you the disk map (your puzzle input) he's already generated. For example:

2333133121414131402

The disk map uses a dense format to represent the layout of files and free space on the disk. The digits alternate between indicating the length of a file and the length of free space.

So, a disk map like 12345 would represent a one-block file, two blocks of free space, a three-block file, four blocks of free space, and then a five-block file. A disk map like 90909 would represent three nine-block files in a row (with no free space between them).

Each file on disk also has an ID number based on the order of the files as they appear before they are rearranged, starting with ID 0. So, the disk map 12345 has three files: a one-block file with ID 0, a three-block file with ID 1, and a five-block file with ID 2. Using one character for each block where digits are the file ID and . is free space, the disk map 12345 represents these individual blocks:

0..111....22222

The first example above, 2333133121414131402, represents these individual blocks:

00...111...2...333.44.5555.6666.777.888899

The amphipod would like to move file blocks one at a time from the end of the disk to the leftmost free space block (until there are no gaps remaining between file blocks). For the disk map 12345, the process looks like this:

0..111....22222
02.111....2222.
022111....222..
0221112...22...
02211122..2....
022111222......

The first example requires a few more steps:

00...111...2...333.44.5555.6666.777.888899
009..111...2...333.44.5555.6666.777.88889.
0099.111...2...333.44.5555.6666.777.8888..
00998111...2...333.44.5555.6666.777.888...
009981118..2...333.44.5555.6666.777.88....
0099811188.2...333.44.5555.6666.777.8.....
009981118882...333.44.5555.6666.777.......
0099811188827..333.44.5555.6666.77........
00998111888277.333.44.5555.6666.7.........
009981118882777333.44.5555.6666...........
009981118882777333644.5555.666............
00998111888277733364465555.66.............
0099811188827773336446555566..............

The final step of this file-compacting process is to update the filesystem checksum. To calculate the checksum, add up the result of multiplying each of these blocks' position with the file ID number it contains. The leftmost block is in position 0. If a block contains free space, skip it instead.

Continuing the first example, the first few blocks' position multiplied by its file ID number are 0 * 0 = 0, 1 * 0 = 0, 2 * 9 = 18, 3 * 9 = 27, 4 * 8 = 32, and so on. In this example, the checksum is the sum of these, 1928.

Compact the amphipod's hard drive using the process he requested. What is the resulting filesystem checksum? (Be careful copy/pasting the input for this puzzle; it is a single, very long line.)
```

In [75]:
import re
import sys
sys.path.append("..")

from common_processing import read_lines, read_as_character_array,\
    read_columns, read_diagonals

test_path = "test.txt"
data_path = "data.txt"

disk_map = read_lines(test_path)[0]
disk_map

'2333133121414131402'

In [76]:
# Approach:
"""
Parse the disk map into a block-by-block layout string, compact that layout,
then compute the checksum from the result.
"""

def map2memory(disk_map: str) -> str:
    """
    Expand the disk map into a string with one character per block:
    the file ID for file blocks and "." for free space.

    Note: IDs of 10 or more are written with several characters, so the
    one-character-per-block layout only holds for inputs with at most 10 files.
    """
    disk_memory = ""
    memory_id = 0
    for i, n in enumerate(disk_map):

        # Even indices (0, 2, 4, ...) are file lengths
        if i % 2 == 0:
            disk_memory += str(memory_id) * int(n)
            memory_id += 1

        # Odd indices (1, 3, 5, ...) are free-space lengths
        else:
            disk_memory += "." * int(n)

    return disk_memory

disk_memory = map2memory(disk_map)
disk_memory

'00...111...2...333.44.5555.6666.777.888899'
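
The string layout works for this example because every file ID fits in a single character. As a hedged alternative, the sketch below keeps one list entry per block, so an ID such as 10 stays a single value. The name `expand_disk_map` and the `FREE = -1` sentinel are assumptions made for illustration; neither is part of the original notebook.

```python
FREE = -1  # assumed sentinel for a free block (hypothetical, not from the notebook)


def expand_disk_map(disk_map: str) -> list[int]:
    """Expand a dense disk map into one list entry per block."""
    blocks: list[int] = []
    for i, digit in enumerate(disk_map.strip()):
        length = int(digit)
        if i % 2 == 0:
            # Even indices hold file lengths; the file ID is i // 2.
            blocks.extend([i // 2] * length)
        else:
            # Odd indices hold free-space lengths.
            blocks.extend([FREE] * length)
    return blocks


print(expand_disk_map("12345"))
# [0, -1, -1, 1, 1, 1, -1, -1, -1, -1, 2, 2, 2, 2, 2]
```

With this layout, `len(blocks)` is always the number of blocks on disk, regardless of how many digits an ID has.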

In [77]:
# Now, move blocks from the back of the string into the free spaces ".":
import re

def compact_last(disk_memory: str) -> str:
    """
    Move the last file block (the last character that is not ".")
    into the leftmost free space.
    """
    matches = list(re.finditer(r"\d", disk_memory))
    first_space_place = disk_memory.find(".")

    # Nothing to do without at least one file block and one free space
    if not matches or first_space_place == -1:
        return disk_memory

    last_number_position = matches[-1].start()

    # If the first free space is already past the last file block, we are done
    if first_space_place > last_number_position:
        return disk_memory

    # Swap the two characters by slicing the string
    char_to_move = disk_memory[last_number_position]

    new_memory = (
        disk_memory[:first_space_place] + char_to_move +
        disk_memory[first_space_place + 1: last_number_position] + "." +
        disk_memory[last_number_position + 1:]
    )
    return new_memory


disk_memory = compact_last(disk_memory)
disk_memory

'009..111...2...333.44.5555.6666.777.88889.'

In [78]:
# Running the swap once per free block is always enough; extra calls leave the layout unchanged
disk_map = read_lines(test_path)[0]
disk_memory = map2memory(disk_map)
for i in range(disk_memory.count(".")):
    print(disk_memory)
    disk_memory = compact_last(disk_memory)


00...111...2...333.44.5555.6666.777.888899
009..111...2...333.44.5555.6666.777.88889.
0099.111...2...333.44.5555.6666.777.8888..
00998111...2...333.44.5555.6666.777.888...
009981118..2...333.44.5555.6666.777.88....
0099811188.2...333.44.5555.6666.777.8.....
009981118882...333.44.5555.6666.777.......
0099811188827..333.44.5555.6666.77........
00998111888277.333.44.5555.6666.7.........
009981118882777333.44.5555.6666...........
009981118882777333644.5555.666............
00998111888277733364465555.66.............
0099811188827773336446555566..............
0099811188827773336446555566..............


In [79]:
# Compute the checksum: sum of (position * file ID) over the compacted blocks
result = 0
for i, char in enumerate(disk_memory):
    # After compaction all free space is at the end, so stop at the first "."
    if char == ".":
        break

    result += i * int(char)
print(result)

1928
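
The puzzle text says to skip free blocks rather than stop at the first one. Below is a minimal sketch of the same checksum on a list of integer IDs. It assumes a hypothetical `checksum` helper and a `FREE = -1` sentinel, neither of which appears in the notebook.

```python
FREE = -1  # assumed sentinel for a free block (hypothetical)


def checksum(blocks: list[int]) -> int:
    """Sum position * file ID over all file blocks, skipping free blocks."""
    return sum(pos * file_id for pos, file_id in enumerate(blocks) if file_id != FREE)


# Final layout of the first example, taken from the puzzle text.
layout = "0099811188827773336446555566.............."
blocks = [FREE if c == "." else int(c) for c in layout]
print(checksum(blocks))  # 1928, matching the puzzle text
```

Because free blocks are filtered out instead of ending the loop, the same helper also works on layouts where gaps remain between files.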


In [88]:
# New, more efficient approach:

"""
Collect the positions of all free spaces and of all file blocks,
and work out how many blocks need to be swapped.

Then build the compacted memory in one pass, once the positions are known.
"""
disk_map = read_lines(test_path)[0]
disk_memory = map2memory(disk_map)
disk_memory

'00...111...2...333.44.5555.6666.777.888899'

In [89]:
disk_map = read_lines(test_path)[0]
disk_memory = map2memory(disk_map)
print("Initial state: \n", disk_memory)

def get_memory_indeces(disk_memory) -> tuple:
    """Return the positions of free spaces and of file blocks, in order."""
    spaces_id = []
    digits_id = []

    for i, c in enumerate(disk_memory):
        if c == ".":
            spaces_id.append(i)
        elif c.isdigit():
            digits_id.append(i)

    return spaces_id, digits_id


def compact_memory(disk_memory: str) -> str:
    """
    Find where the file blocks and free spaces are, then pair the
    leftmost spaces with the rightmost file blocks and swap them.
    """
    memory = list(disk_memory)

    spaces_id, digits_id = get_memory_indeces(memory)

    # File-block positions from right to left (reversed once, not per iteration)
    digits_id_reverse = digits_id[::-1]

    nOfSwaps = min(len(spaces_id), len(digits_id))

    for s in range(nOfSwaps):
        space_pos = spaces_id[s]
        digit_pos = digits_id_reverse[s]

        # Once the next free space lies to the right of the next file block,
        # everything is compacted
        if space_pos > digit_pos:
            break

        # Fill the space with the block; the block's old position becomes free
        memory[space_pos] = memory[digit_pos]
        memory[digit_pos] = "."

    return "".join(memory)

disk_memory = compact_memory(disk_memory)
print("Compacted memory: \n", disk_memory)

Initial state: 
 00...111...2...333.44.5555.6666.777.888899
Compacted memory: 
 0099811188827773336446555566..............
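
The same pairing of leftmost spaces with rightmost blocks can be written with two moving indices and no position lists. This is a sketch under stated assumptions, not the notebook's method: `compact_blocks`, `parse`, and the `FREE = -1` sentinel are hypothetical names, and the layout is a list of integers rather than a string.

```python
FREE = -1  # assumed sentinel for a free block (hypothetical)


def compact_blocks(blocks: list[int]) -> list[int]:
    """Return a compacted copy: rightmost file blocks fill leftmost free blocks."""
    result = list(blocks)
    left, right = 0, len(result) - 1
    while True:
        # Advance left to the next free block.
        while left < right and result[left] != FREE:
            left += 1
        # Move right back to the last file block.
        while left < right and result[right] == FREE:
            right -= 1
        if left >= right:
            break
        # Move the file block into the gap and free its old position.
        result[left], result[right] = result[right], FREE
    return result


def parse(layout: str) -> list[int]:
    """Turn a single-digit example layout string into a list of IDs."""
    return [FREE if c == "." else int(c) for c in layout]


compacted = compact_blocks(parse("00...111...2...333.44.5555.6666.777.888899"))
print("".join("." if b == FREE else str(b) for b in compacted))
# 0099811188827773336446555566..............
```

Each index only moves in one direction, so the loop visits every block at most once.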


In [90]:
# Compute the checksum: sum of (position * file ID) over the compacted blocks
result = 0
for i, char in enumerate(disk_memory):
    # After compaction all free space is at the end, so stop at the first "."
    if char == ".":
        break

    result += i * int(char)
    
print(result)

1928


In [91]:
# Try this method with all the data
disk_map = read_lines(data_path)[0]
disk_memory = map2memory(disk_map)

width = 100
for i in range(0,len(disk_memory), width):
    print(disk_memory[i:i+width])

000000.11..2222222..3333....44444.....55....66666666.7777777........88888...99999.......10101010...1
11111111111.........12121212121212..1313........14.........15151515151515....161616161616......17171
71717......18.1919191919......202020202020...2121212121.222222222222222222.........2323232323232323.
.......2424242424...252525252525252525....26262626262626...2727.....2828282828.........2929292929303
030........31313131.........323232323232323232......33333333.........343434..3535.....363636363636..
...37373737383838383838...39393939........404040..41..424242424242.........434343434343434343...44..
......454545454545454545.46464647......4848484848484848....494949.......50505050505050......51515151
5151515151....5252525353535353535353........5454545454..555555..56565656..5757575758...595959595959.
.....6060..616161.......6262........6363636363636363........6464646464........6565......666666666666
6666...67676767....686868.6969696969696969697070....717171717171717171....72727272727373737

In [92]:
disk_memory = compact_memory(disk_memory)

In [93]:
width = 100
for i in range(0,len(disk_memory), width):
    print(disk_memory[i:i+width])

0000009119922222229933339999444449999955998966666666977777779899989988888989999999989998101010109991
1111111111189997999712121212121212991313979997991497999799915151515151515799916161616161669996917171
7171799699918519191919199995992020202020209592121212121922222222222222222294999499923232323232323234
9994999242424242449925252525252525252593992626262626262693927279939928282828289299929992929292929303
0302999299931313131299929992323232323232323232999199333333339199919993434341935359919936363636363691
9993737373738383838383809939393939909999894040409941894242424242429989988994343434343434343438894498
8997894545454545454545459464646477899784848484848484848997849494999789975050505050505089978951515151
5151515151978952525253535353535353539689968954545454549655555589565656569657575757588995959595959596
8996860609961616168996896262968995896363636363636363958995896464646464948994896565948994666666666666
6666899676767673899686868369696969696969696970708993717171717171717171899372727272727373737

In [95]:
# Compute the checksum: sum of (position * file ID) over the compacted blocks
result = 0
for i, char in enumerate(disk_memory):
    # After compaction all free space is at the end, so stop at the first "."
    if char == ".":
        break

    result += i * int(char)
    
print(result)

89859464970


# I read up on this on the subreddit, but I don't know why I'm not getting the right answer...
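
One likely cause is visible in the printed layout above: file IDs from 10 upward are written as several characters (for example `10101010`), so the character-by-character checksum treats each digit as its own block and every later position drifts. The sketch below illustrates this on a made-up disk map with eleven one-block files and no free space. The map and the variable names are assumptions chosen for the demonstration.

```python
# Hypothetical disk map: eleven one-block files (IDs 0..10) and no free space.
disk_map = "10" * 10 + "1"

# String layout, built the same way as map2memory: str(ID) repeated per block.
as_string = "".join(str(i // 2) * int(n) for i, n in enumerate(disk_map) if i % 2 == 0)

# Integer layout: one list entry per block, so ID 10 stays a single value.
as_ids = [i // 2 for i, n in enumerate(disk_map) if i % 2 == 0 for _ in range(int(n))]

char_checksum = sum(pos * int(ch) for pos, ch in enumerate(as_string))
id_checksum = sum(pos * file_id for pos, file_id in enumerate(as_ids))

print(as_string)      # 012345678910 (12 characters for 11 blocks)
print(char_checksum)  # 295
print(id_checksum)    # 385
```

The two results already disagree with eleven files, so keeping one integer per block (for instance in a list) is probably needed before the full input can give the right checksum.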

In [100]:
data = read_lines("data.txt")
data[0]

'612272445524817853574369722819746656115663519988539473255950384996493225654063483212699318913016843776943088523242401366223728885826834431902494509383618227271743585640756981714125189330625873401335224762669837316165395723901459914046797577896622933035329273764334326838251918668431127926357041461995575856277146709570645630653145583082915830589118642945642068741858554052818773992318574750659660648959871729366094358228465057694637866643296136991054185572854715232059767185401069507091928017611770313254136520786824208412274824666034596313268158492239848748264336379235865937369289381214229869705528907430565216543272556057765577432753764562517279922354506181103180654984672534334192142796484566503885716676667192318816134649134951721978762298619919186231414152157386585340261253715796106830722166699416508724332962249136672211305743165917598829455424336458581614773416433563148568169568261946894872834159151818167253929184805188497855517938912939154463553320866884531160993228784318993530835928377