# Binary Diagnostic

## Part 1

The submarine has been making some odd creaking noises, so you ask it to produce a diagnostic report just in case.

The diagnostic report (your puzzle input) consists of a list of binary numbers which, when decoded properly, can tell you many useful things about the conditions of the submarine. The first parameter to check is the power consumption.

You need to use the binary numbers in the diagnostic report to generate two new binary numbers (called the gamma rate and the epsilon rate). The power consumption can then be found by multiplying the gamma rate by the epsilon rate.

Each bit in the gamma rate can be determined by finding the most common bit in the corresponding position of all numbers in the diagnostic report. For example, given the following diagnostic report:

```
00100
11110
10110
10111
10101
01111
00111
11100
10000
11001
00010
01010
```

Considering only the first bit of each number, there are five 0 bits and seven 1 bits. Since the most common bit is 1, the first bit of the gamma rate is 1.

The most common second bit of the numbers in the diagnostic report is 0, so the second bit of the gamma rate is 0.

The most common value of the third, fourth, and fifth bits are 1, 1, and 0, respectively, and so the final three bits of the gamma rate are 110.

So, the gamma rate is the binary number 10110, or 22 in decimal.

The epsilon rate is calculated in a similar way; rather than use the most common bit, the least common bit from each position is used. So, the epsilon rate is 01001, or 9 in decimal. Multiplying the gamma rate (22) by the epsilon rate (9) produces the power consumption, 198.
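
The worked example can be checked with a few lines of plain Python. This is only a minimal sketch, not the notebook's solution: the names `report`, `gamma_bits`, and `epsilon_bits` are made up for illustration, and it assumes no bit position is tied (true for this example).

```python
from collections import Counter

# The example diagnostic report from the puzzle text.
report = [
    "00100", "11110", "10110", "10111", "10101", "01111",
    "00111", "11100", "10000", "11001", "00010", "01010",
]

# zip(*report) yields one tuple of characters per bit position.
gamma_bits = "".join(
    Counter(column).most_common(1)[0][0] for column in zip(*report)
)
# The epsilon rate uses the opposite bit in every position.
epsilon_bits = "".join("1" if bit == "0" else "0" for bit in gamma_bits)

gamma_rate = int(gamma_bits, 2)
epsilon_rate = int(epsilon_bits, 2)
print(gamma_rate, epsilon_rate, gamma_rate * epsilon_rate)
```

For the example report this prints `22 9 198`, matching the values above.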

Use the binary numbers in your diagnostic report to calculate the gamma rate and epsilon rate, then multiply them together. What is the power consumption of the submarine? (Be sure to represent your answer in decimal, not binary.)

## Numpy Code (Byte Packing - Scrapped)
`np.packbits` and `np.unpackbits` only work with `ubyte` (`uint8`) arrays, and 8 bits are too few for the 12-digit binary numbers in the input data.

In [64]:
import numpy as np

# TODO: Shouldn't we use ushort instead of ubyte?
def binary_diagnostic(file: str) -> int:
    # Parse data.
    with open(file, "r") as f:
        bin_list = []
        for line in f.readlines():
            bin_list.append(np.ushort(line))
        
    # print(bin_list)
    bin_array = np.array([np.unpackbits(item) for item in bin_list])
    non_zero = np.count_nonzero(bin_array, axis=0)
    print(non_zero)
    most_common_bit = np.bool8(np.round(np.divide(non_zero, 1000)))
    least_common_bit = ~most_common_bit
    
    epsilon_rate = np.packbits(most_common_bit)
    gamma_rate = np.packbits(least_common_bit)
        
    # Return power consumption.
    power_consumption = epsilon_rate * gamma_rate
    return power_consumption


binary_diagnostic("03_input.txt")

TypeError: Expected an input array of unsigned byte data type
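
The error comes from `np.unpackbits`, which only accepts `uint8` input. A separate issue in the cell above is that `np.ushort(line)` appears to read the text as a decimal number rather than as binary. One possible workaround, sketched here under the assumption that every value fits in 16 bits, is to parse with `int(text, 2)` and expand the bits with shifts instead of `unpackbits`. The helper name `unpack_bits` and its `width` parameter are hypothetical.

```python
import numpy as np

def unpack_bits(values, width):
    """Expand integers into a (len(values), width) array of 0/1, MSB first.

    Hypothetical helper: assumes every value fits in 16 bits.
    """
    values = np.asarray(values, dtype=np.uint16)
    # Shift amounts width-1, ..., 1, 0 move each bit into the lowest position.
    shifts = np.arange(width - 1, -1, -1, dtype=np.uint16)
    return ((values[:, None] >> shifts) & 1).astype(np.uint8)

# Parse the text as base 2, not as decimal.
rows = [int(text, 2) for text in ["00100", "11110", "10110"]]
bits = unpack_bits(rows, width=5)
print(bits)
```

Each row of `bits` holds the digits of one input line, so column sums can be taken with `bits.sum(axis=0)` as in the next section.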

## Numpy Code
Works, but a bit wonky to look at.

In [None]:
import numpy as np

def binary_diagnostic(file: str) -> int:
    # Parse data.
    with open(file, "r") as f:
        data = []
        for line in f.readlines():
            data.append([int(c) for c in line.strip()])
    
    # Create column sums
    data = np.array(data)
    non_zero = [np.sum(data, axis=0)]
    count = data.shape[0]
    
    # Divide by total row count and fix shape
    most_common_array = np.uint8(np.round(np.divide(non_zero, count)).reshape(-1))
    # "Invert" 
    least_common_array = np.array([int(not bool(i)) for i in most_common_array])
    
    # Convert to string, strip the `[]` and convert back to int (base 2)
    gamma_rate = int(
        np.array2string(most_common_array, separator="")[1:-1], 
        base=2
    )
    
    epsilon_rate = int(
        np.array2string(least_common_array, separator="")[1:-1], 
        base=2
    )
    
    # Return power consumption.
    power_consumption = epsilon_rate * gamma_rate
    return power_consumption


In [None]:
%%timeit
binary_diagnostic("03_input.txt")

4 ms ± 136 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
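
The string round-trip through `np.array2string` can be avoided by weighting each bit with its power of two. The following is a sketch of that alternative, not a measured improvement. `bits_to_int` is a hypothetical helper, and the `2 * ones >= count` comparison is an assumption about ties (it resolves them to 1, whereas `np.round` rounds a ratio of exactly 0.5 down to 0).

```python
import numpy as np

def bits_to_int(bits):
    """Convert a 1-D array of 0/1 values (most significant bit first) to an int."""
    bits = np.asarray(bits, dtype=np.int64)
    # Weights 2**(n-1), ..., 2, 1 for an n-bit number.
    weights = 1 << np.arange(bits.size - 1, -1, -1, dtype=np.int64)
    return int(bits @ weights)

example = [
    "00100", "11110", "10110", "10111", "10101", "01111",
    "00111", "11100", "10000", "11001", "00010", "01010",
]
data = np.array([[int(c) for c in row] for row in example])

ones = data.sum(axis=0)  # number of 1 bits per column
most_common = (2 * ones >= data.shape[0]).astype(np.int64)

gamma_rate = bits_to_int(most_common)
epsilon_rate = bits_to_int(1 - most_common)  # flip every bit
print(gamma_rate * epsilon_rate)
```

On the example report from Part 1 this gives a gamma rate of 22 and an epsilon rate of 9.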


## String Eval Code
A more straightforward approach that sums the digits directly from the strings, without NumPy.

In [None]:
def binary_diagnostic(file: str) -> int:
    # Parse data.
    with open(file, "r") as f:
        linecount = 0
        first_line = f.readline()
        linecount += 1
        non_zero = [int(i) for i in first_line.strip()]
        
        # Create column sums
        for line in f.readlines():
            linecount += 1
            for i, character in enumerate(line.strip()):
                non_zero[i] += int(character)
    
    # Divide by the actual row count instead of a hard-coded 1000
    most_common_array = [round(i / linecount) for i in non_zero]
    # "Invert" 
    least_common_array = [0 if i else 1 for i in most_common_array]
    
    # Convert list to string and back to int (base 2)
    gamma_rate = int("".join([str(i) for i in most_common_array]), base=2)
    epsilon_rate = int("".join([str(i) for i in least_common_array]), base=2)
    
    # Return power consumption.
    power_consumption = epsilon_rate * gamma_rate
    return power_consumption


In [None]:
%%timeit
binary_diagnostic("03_input.txt")

4.46 ms ± 657 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [None]:
binary_diagnostic("03_input.txt")

1092896

## Part 2

Next, you should verify the life support rating, which can be determined by multiplying the oxygen generator rating by the CO2 scrubber rating.

Both the oxygen generator rating and the CO2 scrubber rating are values that can be found in your diagnostic report - finding them is the tricky part. Both values are located using a similar process that involves filtering out values until only one remains. Before searching for either rating value, start with the full list of binary numbers from your diagnostic report and consider just the first bit of those numbers. Then:

- Keep only numbers selected by the bit criteria for the type of rating value for which you are searching. Discard numbers which do not match the bit criteria.
- If you only have one number left, stop; this is the rating value for which you are searching.
- Otherwise, repeat the process, considering the next bit to the right.

The bit criteria depends on which type of rating value you want to find:

- To find oxygen generator rating, determine the most common value (0 or 1) in the current bit position, and keep only numbers with that bit in that position. If 0 and 1 are equally common, keep values with a 1 in the position being considered.
- To find CO2 scrubber rating, determine the least common value (0 or 1) in the current bit position, and keep only numbers with that bit in that position. If 0 and 1 are equally common, keep values with a 0 in the position being considered.
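
The two bit criteria can be written as one small function. This is a sketch under assumptions: the name `bit_to_keep` and the `rating` values `"oxygen"` and `"co2"` are invented here and do not appear in the solution further down.

```python
def bit_to_keep(numbers, position, rating):
    """Return the bit ("0" or "1") to keep at `position`.

    `rating` is "oxygen" (most common bit, ties keep "1") or
    "co2" (least common bit, ties keep "0").
    """
    ones = sum(number[position] == "1" for number in numbers)
    zeros = len(numbers) - ones
    # Ties count as "1" being most common, per the oxygen rule.
    most_common = "1" if ones >= zeros else "0"
    if rating == "oxygen":
        return most_common
    # The CO2 rule keeps the opposite bit, so a tie keeps "0".
    return "0" if most_common == "1" else "1"

report = [
    "00100", "11110", "10110", "10111", "10101", "01111",
    "00111", "11100", "10000", "11001", "00010", "01010",
]
print(bit_to_keep(report, 0, "oxygen"), bit_to_keep(report, 0, "co2"))
```

Because the CO2 choice is always the inverse of the oxygen choice, both tie-breaking rules fall out of the single `ones >= zeros` comparison.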

For example, to determine the oxygen generator rating value using the same example diagnostic report from above:

- Start with all 12 numbers and consider only the first bit of each number. There are more 1 bits (7) than 0 bits (5), so keep only the 7 numbers with a 1 in the first position: 11110, 10110, 10111, 10101, 11100, 10000, and 11001.
- Then, consider the second bit of the 7 remaining numbers: there are more 0 bits (4) than 1 bits (3), so keep only the 4 numbers with a 0 in the second position: 10110, 10111, 10101, and 10000.
- In the third position, three of the four numbers have a 1, so keep those three: 10110, 10111, and 10101.
- In the fourth position, two of the three numbers have a 1, so keep those two: 10110 and 10111.
- In the fifth position, there are an equal number of 0 bits and 1 bits (one each). So, to find the oxygen generator rating, keep the number with a 1 in that position: 10111.
- As there is only one number left, stop; the oxygen generator rating is 10111, or 23 in decimal.

Then, to determine the CO2 scrubber rating value from the same example above:

- Start again with all 12 numbers and consider only the first bit of each number. There are fewer 0 bits (5) than 1 bits (7), so keep only the 5 numbers with a 0 in the first position: 00100, 01111, 00111, 00010, and 01010.
- Then, consider the second bit of the 5 remaining numbers: there are fewer 1 bits (2) than 0 bits (3), so keep only the 2 numbers with a 1 in the second position: 01111 and 01010.
- In the third position, there are an equal number of 0 bits and 1 bits (one each). So, to find the CO2 scrubber rating, keep the number with a 0 in that position: 01010.
- As there is only one number left, stop; the CO2 scrubber rating is 01010, or 10 in decimal.

Finally, to find the life support rating, multiply the oxygen generator rating (23) by the CO2 scrubber rating (10) to get 230.
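
The whole filtering process can also be written as a loop instead of recursion. The sketch below is one possible formulation, assuming all report lines are distinct so that exactly one number survives. `find_rating` and its `keep_most_common` flag are hypothetical names.

```python
def find_rating(numbers, keep_most_common):
    """Filter `numbers` bit by bit until a single value remains."""
    remaining = list(numbers)
    position = 0
    while len(remaining) > 1:
        ones = sum(number[position] == "1" for number in remaining)
        zeros = len(remaining) - ones
        if keep_most_common:
            # Oxygen rule: most common bit, ties keep "1".
            wanted = "1" if ones >= zeros else "0"
        else:
            # CO2 rule: least common bit, ties keep "0".
            wanted = "0" if zeros <= ones else "1"
        remaining = [n for n in remaining if n[position] == wanted]
        position += 1
    return int(remaining[0], 2)

report = [
    "00100", "11110", "10110", "10111", "10101", "01111",
    "00111", "11100", "10000", "11001", "00010", "01010",
]
oxygen = find_rating(report, keep_most_common=True)
co2 = find_rating(report, keep_most_common=False)
print(oxygen, co2, oxygen * co2)
```

For the example report this prints `23 10 230`, the values derived step by step above.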

Use the binary numbers in your diagnostic report to **calculate the oxygen generator rating and CO2 scrubber rating, then multiply them together.** What is the life support rating of the submarine? (Be sure to represent your answer in decimal, not binary.)

In [None]:
debug = False

def debug_print(*msg):
    if debug:
        print(*msg)

def recurse_and_discard(data, depth=0, mode="most"):
    if len(data) <= 1:
        return data
    debug_print([entry[depth] for entry in data])
    
    current_sum = sum(int(entry[depth]) for entry in data)
    debug_print(current_sum, len(data) / 2)
    
    filter_digit = int(current_sum >= len(data) / 2)
    if mode == "least":
        filter_digit = 0 if filter_digit else 1
        
    debug_print(f"keeping {filter_digit}s")
    data = [entry for entry in data 
            if int(entry[depth]) == filter_digit]
    debug_print([entry[depth] for entry in data])
    debug_print()
    
    return recurse_and_discard(data, depth+1, mode)

In [None]:
test = [
    "101000001100",
    "011111100111",
    "111100001110",
    "110000011001",
    "001001001011",
    "010011101000",
    "011001110011",
    "010100010000",
    "101110110111",
    "110110111111"
]

test2 = [
    "1000",
    "1100",
    "1110",
    "1111",
]

recurse_and_discard(test, mode="least")

['001001001011']

In [None]:
def calculate_o2_and_co2(file: str) -> int:
    with open(file, "r") as f:
        data = [line.strip() for line in f.readlines()]

    o2 = int(recurse_and_discard(data[:], mode="most")[0], base=2)
    co2 = int(recurse_and_discard(data[:], mode="least")[0], base=2)
    
    return o2 * co2

In [None]:
calculate_o2_and_co2("03_input.txt")

4672151