# Day 14: Disk Defragmentation

Suddenly, a scheduled job activates the system's [disk defragmenter](https://en.wikipedia.org/wiki/Defragmentation). Were the situation different, you might [sit and watch it for a while](https://www.youtube.com/watch?v=kPv1gQ5Rs8A&t=37), but today, you just don't have that kind of time. It's soaking up valuable system resources that are needed elsewhere, and so the only option is to help it finish its task as soon as possible.

The disk in question consists of a 128x128 grid; each square of the grid is either *free* or *used*. On this disk, the state of the grid is tracked by the bits in a sequence of [knot hashes](aoc10.ipynb).

A total of 128 knot hashes are calculated, each corresponding to a single row in the grid; each hash contains 128 bits which correspond to individual grid squares. Each bit of a hash indicates whether that square is *free* (`0`) or *used* (`1`).

The hash inputs are a key string (your puzzle input), a dash, and a number from `0` to `127` corresponding to the row. For example, if your key string were `flqrgnkx`, then the first row would be given by the bits of the knot hash of `flqrgnkx-0`, the second row from the bits of the knot hash of `flqrgnkx-1`, and so on until the last row, `flqrgnkx-127`.

The output of a knot hash is traditionally represented by 32 hexadecimal digits; each of these digits corresponds to 4 bits, for a total of `4 * 32 = 128` bits. To convert to bits, turn each hexadecimal digit to its equivalent binary value, high-bit first: `0` becomes `0000`, `1` becomes `0001`, `e` becomes `1110`, `f` becomes `1111`, and so on; a hash that begins with `a0c2017`... in hexadecimal would begin with `10100000110000100000000101110000`... in binary.
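
As a quick check of the conversion rule, the following minimal sketch expands the seven digits `a0c2017` from the example. The helper name `hex_to_bits` is an illustrative assumption and is not part of the puzzle text.

```python
def hex_to_bits(hexdigits):
    """Expand each hexadecimal digit into its 4-bit binary form, high bit first."""
    return ''.join(format(int(digit, 16), '04b') for digit in hexdigits)


# Seven hexadecimal digits give 7 * 4 = 28 bits.
print(hex_to_bits('a0c2017'))  # 1010000011000010000000010111
```

The 28 bits printed match the start of the 32-bit string quoted in the puzzle text; the last four bits of that string come from the hexadecimal digit that follows.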

Continuing this process, the *first 8 rows and columns* for key `flqrgnkx` appear as follows, using `#` to denote used squares, and `.` to denote free ones:

    ##.#.#..-->
    .#.#.#.#   
    ....#.#.   
    #.#.##.#   
    .##.#...   
    ##..#..#   
    .#...#..   
    ##.#.##.-->
    |      |   
    V      V   

In this example, `8108` squares are used across the entire 128x128 grid.

Given your actual key string, *how many squares are used*?

Your puzzle input is `nbysizxe`.

## Solution for part 1

First, import the code from day 10, which contains an implementation of the knot hash algorithm:

In [1]:
%run aoc10.ipynb

Next, we define functions to convert a hexadecimal string to a binary string and to hash all 128 row keys derived from the key string. Finally, a convenience function counts the used squares.

In [2]:
def hexstr2binstr(string):
    return ''.join('{:04b}'.format(int(c, 16)) for c in string)

def create_bitmap(key):
    return [
        hexstr2binstr(knothash(key + b'-' + str(i).encode('utf-8'))) for
        i in range(128)
    ]

def sum_used(bitmap):
    # Each row is a string of '0' and '1' characters, so count the '1's
    return sum(row.count('1') for row in bitmap)

Test using the supplied example key: `flqrgnkx`

Inspect the first 8 columns of the first 8 rows (to see if we're on the right track), and then sum all used squares to ensure we get the right sum.

In [3]:
testcase = create_bitmap(b'flqrgnkx')
for row in testcase[:8]:
    print(row[:8].replace('1', '#').replace('0', '.'))
assert sum_used(testcase) == 8108

##.#.#..
.#.#.#.#
....#.#.
#.#.##.#
.##.#...
##..#..#
.#...#..
##.#.##.


Now we're ready to solve part 1 of the challenge:

In [4]:
challenge = create_bitmap(b'nbysizxe')
sum_used(challenge)

8216

## Part Two

Now, all the defragmenter needs to know is the number of *regions*. A region is a group of used *squares* that are all *adjacent*, not including diagonals. Every used square is in exactly one region: lone used squares form their own isolated regions, while several adjacent squares all count as a single region.

In the example above, the following nine regions are visible, each marked with a distinct digit:

    11.2.3..-->
    .1.2.3.4   
    ....5.6.   
    7.8.55.9   
    .88.5...   
    88..5..8   
    .8...8..   
    88.8.88.-->
    |      |   
    V      V   

Of particular interest is the region marked `8`; while it does not appear contiguous in this small view, all of the squares marked `8` are connected when considering the whole 128x128 grid. In total, in this example, `1242` regions are present.
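
To see why the whole grid matters, here is a minimal sketch that counts regions using only the 8x8 window shown above. It uses an iterative flood fill, which is one possible way to group adjacent squares; the function name `count_regions_in` and the `window` list are assumptions made for this illustration.

```python
# The visible 8x8 corner of the example grid, copied from the puzzle text.
window = [
    "##.#.#..",
    ".#.#.#.#",
    "....#.#.",
    "#.#.##.#",
    ".##.#...",
    "##..#..#",
    ".#...#..",
    "##.#.##.",
]


def count_regions_in(grid):
    """Count groups of '#' squares joined horizontally or vertically."""
    seen = set()
    regions = 0
    for r, row in enumerate(grid):
        for c, val in enumerate(row):
            if val != '#' or (r, c) in seen:
                continue
            # An unseen used square starts a new region; flood-fill it.
            regions += 1
            seen.add((r, c))
            stack = [(r, c)]
            while stack:
                cr, cc = stack.pop()
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[nr])
                    if inside and grid[nr][nc] == '#' and (nr, nc) not in seen:
                        seen.add((nr, nc))
                        stack.append((nr, nc))
    return regions


print(count_regions_in(window))  # 12
```

The window on its own contains 12 groups rather than 9, because the squares marked `8` form four separate pieces that only join up outside the visible corner.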

How many regions are present given your key string?

A first attempt labels each used square with the region of an already-labeled neighbor and starts a new region when no neighbor has one:

In [5]:
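def neighbors_wo_diag(bitmap, r, c):
    """Return the used squares directly above, left of, right of and below (r, c)."""
    neigh = []
    length = len(bitmap) - 1  # last valid index; the grid is square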

    if r > 0 and bitmap[r - 1][c] == '1':
        neigh.append((r - 1, c))

    if c > 0 and bitmap[r][c - 1] == '1':
        neigh.append((r, c - 1))
    if c < length and bitmap[r][c + 1] == '1':
        neigh.append((r, c + 1))

    if r < length and bitmap[r + 1][c] == '1':
        neigh.append((r + 1, c))

    return neigh

def count_regions(bitmap):
    regionmap = {}
    count = 0
    
    for row, row_val in enumerate(bitmap):
        for col, col_val in enumerate(row_val):
            if bitmap[row][col] == '1':
                neighbors = neighbors_wo_diag(bitmap, row, col)
                
                # Look for existing regions
                if any(cell in regionmap for cell in neighbors):
                    coords = [cell for cell in neighbors if cell in regionmap]
                    region = regionmap[coords[0]]
                else:
                    count += 1
                    region = count
                
                regionmap[(row, col)] = region
                
                # Update neighbors
                for nrow, ncol in neighbors:
                    regionmap[nrow, ncol] = region

    return regionmap, count

Then, run the test:

In [6]:
mapping, count = count_regions(testcase)
for row in range(9):
    for col in range(8):
        if (row, col) in mapping:
            print('{: 3}|'.format(mapping[row, col]), end='')
        else:
            print(' . |', end='')
    print()
    print('-' * (4 * 8))

count

  1|  1| . |  2| . |  3| . | . |
--------------------------------
 . |  1| . |  2| . |  3| . |  4|
--------------------------------
 . | . | . | . | 39| . | 40| . |
--------------------------------
 48| . | 49| . | 39| 39| . | 50|
--------------------------------
 . | 49| 49| . | 39| . | . | . |
--------------------------------
 49| 49| . | . | 39| . | . | 68|
--------------------------------
 . | 49| . | . | . | 81| . | . |
--------------------------------
 49| 49| . | 89| . | 81| 81| . |
--------------------------------
 49| 49| 89| 89| . | 81| 81| 81|
--------------------------------


1652
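
The total of 1652 is higher than the 1242 regions stated in the puzzle. A tiny U-shaped grid reproduces the overcount; the sketch below re-implements the single-pass labeling in compact form, and the names `neighbors`, `naive_count` and `u_shape` are assumptions made for this illustration.

```python
def neighbors(grid, r, c):
    """Return used ('1') squares above, left of, right of and below (r, c)."""
    cells = [(r - 1, c), (r, c - 1), (r, c + 1), (r + 1, c)]
    return [
        (nr, nc) for nr, nc in cells
        if 0 <= nr < len(grid) and 0 <= nc < len(grid[nr]) and grid[nr][nc] == '1'
    ]


def naive_count(grid):
    """Single row-by-row pass that only looks at direct neighbors."""
    regionmap = {}
    count = 0
    for r, row in enumerate(grid):
        for c, val in enumerate(row):
            if val != '1':
                continue
            near = neighbors(grid, r, c)
            known = [cell for cell in near if cell in regionmap]
            if known:
                region = regionmap[known[0]]
            else:
                # No labeled neighbor yet, so a new region is started
                count += 1
                region = count
            regionmap[r, c] = region
            for cell in near:
                regionmap[cell] = region
    return count


# Both arms of the U are connected through the bottom row: one region.
u_shape = ['101',
           '111']
print(naive_count(u_shape))  # 2
```

The two arms of the U are each given a new label on the first row, before the scan reaches the bottom row that connects them, and the counter is never decremented when two labels turn out to belong to the same region.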

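The first attempt yields the wrong result (1652 regions instead of the expected 1242), because it only considers the nearest neighbors when updating the region map, so a region reached from two directions can be counted more than once. Let's instead update the region map by visiting all connected squares: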

In [7]:
def update_mapping(bitmap, row, col, region, mapping):
    """Flood-fill `region` from (row, col) into `mapping`, modifying it in place."""
    mapping[row, col] = region
    visit = neighbors_wo_diag(bitmap, row, col)  # stack of squares still to visit
    visited = set()
    
    while visit:
        nrow, ncol = visit.pop()

        # Skip squares that already belong to a region
        if (nrow, ncol) in visited or (nrow, ncol) in mapping:
            continue

        visited.add((nrow, ncol))
        mapping[nrow, ncol] = region
        for orow, ocol in neighbors_wo_diag(bitmap, nrow, ncol):
            if (orow, ocol) not in visited:
                visit.append((orow, ocol))

    return mapping


def count_regions(bitmap):
    regionmap = {}
    count = 0
    
    for row, row_val in enumerate(bitmap):
        for col, col_val in enumerate(row_val):
            if bitmap[row][col] == '1':
                neighbors = neighbors_wo_diag(bitmap, row, col)
                
                # Look for existing regions
                if any(cell in regionmap for cell in neighbors):
                    coords = [cell for cell in neighbors if cell in regionmap]
                    region = regionmap[coords[0]]
                else:
                    count += 1
                    region = count
                
                # Flood-fill the whole connected region; update_mapping
                # modifies regionmap in place, so no further update is needed
                update_mapping(bitmap, row, col, region, regionmap)

    return regionmap, count

Run the test to see whether we get the expected result:

In [8]:
mapping, count = count_regions(testcase)
for row in range(9):
    for col in range(8):
        if (row, col) in mapping:
            print('{: 3}|'.format(mapping[row, col]), end='')
        else:
            print(' . |', end='')
    print()
    print('-' * (4 * 8))

assert count == 1242

  1|  1| . |  2| . |  3| . | . |
--------------------------------
 . |  1| . |  2| . |  3| . |  4|
--------------------------------
 . | . | . | . | 30| . | 31| . |
--------------------------------
 38| . |  5| . | 30| 30| . | 39|
--------------------------------
 . |  5|  5| . | 30| . | . | . |
--------------------------------
  5|  5| . | . | 30| . | . |  5|
--------------------------------
 . |  5| . | . | . |  5| . | . |
--------------------------------
  5|  5| . |  5| . |  5|  5| . |
--------------------------------
  5|  5|  5|  5| . |  5|  5|  5|
--------------------------------


Now we should be ready to count all contiguous regions for the challenge input:

In [9]:
mapping, count = count_regions(challenge)
count

1139