In [1]:
import pandas as pd

--- Day 3: Binary Diagnostic ---

The submarine has been making some odd creaking noises, so you ask it to produce a diagnostic report just in case.

The diagnostic report (your puzzle input) consists of a list of binary numbers which, when decoded properly, can tell you many useful things about the conditions of the submarine. The first parameter to check is the power consumption.

You need to use the binary numbers in the diagnostic report to generate two new binary numbers (called the gamma rate and the epsilon rate). The power consumption can then be found by multiplying the gamma rate by the epsilon rate.

Each bit in the gamma rate can be determined by finding the most common bit in the corresponding position of all numbers in the diagnostic report. For example, given the following diagnostic report:

00100
11110
10110
10111
10101
01111
00111
11100
10000
11001
00010
01010

Considering only the first bit of each number, there are five 0 bits and seven 1 bits. Since the most common bit is 1, the first bit of the gamma rate is 1.

The most common second bit of the numbers in the diagnostic report is 0, so the second bit of the gamma rate is 0.

The most common value of the third, fourth, and fifth bits are 1, 1, and 0, respectively, and so the final three bits of the gamma rate are 110.

So, the gamma rate is the binary number 10110, or 22 in decimal.

The epsilon rate is calculated in a similar way; rather than use the most common bit, the least common bit from each position is used. So, the epsilon rate is 01001, or 9 in decimal. Multiplying the gamma rate (22) by the epsilon rate (9) produces the power consumption, 198.

Use the binary numbers in your diagnostic report to calculate the gamma rate and epsilon rate, then multiply them together. What is the power consumption of the submarine? (Be sure to represent your answer in decimal, not binary.)
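
The rule above can be sketched without pandas. This is only an illustrative sketch: the names `EXAMPLE_REPORT` and `gamma_epsilon` are made up here, and breaking ties in favour of `1` is an assumption, since the Part One text does not say what to do when both bits are equally common.

```python
from collections import Counter

# The example report from the puzzle text.
EXAMPLE_REPORT = [
    "00100", "11110", "10110", "10111", "10101", "01111",
    "00111", "11100", "10000", "11001", "00010", "01010",
]

def gamma_epsilon(report):
    """Return (gamma, epsilon) as bit strings for equal-length bit strings."""
    gamma_bits = []
    epsilon_bits = []
    # zip(*report) yields one tuple per bit position (one "column").
    for column in zip(*report):
        counts = Counter(column)
        # Assumption: a tie goes to '1'; Part One does not define ties.
        most = "1" if counts["1"] >= counts["0"] else "0"
        least = "0" if most == "1" else "1"
        gamma_bits.append(most)
        epsilon_bits.append(least)
    return "".join(gamma_bits), "".join(epsilon_bits)

gamma, epsilon = gamma_epsilon(EXAMPLE_REPORT)
print(gamma, epsilon, int(gamma, 2) * int(epsilon, 2))  # 10110 01001 198
```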

In [2]:
test1 = pd.read_csv('test_input_1.txt', header=None)

In [3]:
test1

        0
0     100
1   11110
2   10110
3   10111
4   10101
5    1111
6     111
7   11100
8   10000
9   11001
10     10
11   1010


In [4]:
type(test1[0][0])

numpy.int64
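
The `numpy.int64` result shows that `read_csv` parsed each line as an integer, which is why `00100` appears as `100`. One possible way to keep the leading zeros while still using `read_csv` is to pass `dtype=str`. The sketch below uses an in-memory string as a stand-in for the input file, and the names `raw`, `as_int` and `as_str` are illustrative only.

```python
import io

import pandas as pd

# Stand-in for test_input_1.txt: the first three lines of the example report.
raw = "00100\n11110\n10110\n"

# Default parsing infers integers, so leading zeros are lost.
as_int = pd.read_csv(io.StringIO(raw), header=None)

# dtype=str keeps every value exactly as written in the file.
as_str = pd.read_csv(io.StringIO(raw), header=None, dtype=str)

print(as_int[0].tolist())  # [100, 11110, 10110]
print(as_str[0].tolist())  # ['00100', '11110', '10110']
```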

In [5]:
with open('test_input_1.txt', 'r') as f:
    test1 = f.readlines()

In [8]:
test1 = [i.strip('\n') for i in test1]

In [9]:
test1

['00100',
 '11110',
 '10110',
 '10111',
 '10101',
 '01111',
 '00111',
 '11100',
 '10000',
 '11001',
 '00010',
 '01010']

In [10]:
df = pd.DataFrame(test1)

In [13]:
for i in range(1,6):
    df[i] = df[0].apply(lambda x: x[i-1])

In [14]:
df

        0  1  2  3  4  5
0   00100  0  0  1  0  0
1   11110  1  1  1  1  0
2   10110  1  0  1  1  0
3   10111  1  0  1  1  1
4   10101  1  0  1  0  1
5   01111  0  1  1  1  1
6   00111  0  0  1  1  1
7   11100  1  1  1  0  0
8   10000  1  0  0  0  0
9   11001  1  1  0  0  1
10  00010  0  0  0  1  0
11  01010  0  1  0  1  0


In [23]:
gamma = "".join([df[i].mode().values[0] for i in range(1,6)])

In [24]:
gamma

'10110'

In [33]:
bytes(gamma, 'utf-8') or b'11111', gamma

(b'10110', '10110')

In [37]:
not bytes(gamma,'utf-8')

False

In [38]:
bytes(gamma, 'utf-8')

b'10110'

In [50]:
epsilon = "".join(['1' if bit == '0' else '0' for bit in gamma])  # flip every bit of gamma

In [51]:
gamma, epsilon

('10110', '01001')

In [54]:
int(bytes(gamma, 'utf-8'), 2) * int(bytes(epsilon, 'utf-8'), 2)

198
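
As a side note on the conversion step, `int` also accepts a `str` with a base, and flipping every bit is the same as XOR with an all-ones mask of the same width. The sketch below is only an alternative under the same assumption the notebook makes, namely that epsilon is the exact bitwise complement of gamma (true when no position is tied). The variable names are illustrative.

```python
gamma = "10110"

# int() parses a str in base 2 directly.
gamma_value = int(gamma, 2)

# An all-ones mask as wide as gamma; XOR with it flips every bit.
mask = (1 << len(gamma)) - 1
epsilon_value = gamma_value ^ mask

# Format back to a zero-padded bit string of the same width.
epsilon = format(epsilon_value, f"0{len(gamma)}b")

print(gamma_value, epsilon_value, epsilon, gamma_value * epsilon_value)
# 22 9 01001 198
```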

In [72]:
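def compute_power_consumption(input_file='test_input_1.txt'):
    """Return gamma rate * epsilon rate for the diagnostic report in input_file."""
    # Read the report as strings so leading zeros are preserved.
    with open(input_file, 'r') as f:
        data = [line.strip('\n') for line in f.readlines()]
    df = pd.DataFrame(data)
    n_bits = len(data[0])
    # Debug output: first number, its width, and its individual bits.
    print(data[0], n_bits, [i for i in data[0]])
    # Columns 1..n_bits each hold one bit position of column 0.
    for i in range(1, n_bits + 1):
        df[i] = df[0].apply(lambda x: x[i-1])

    # gamma takes the most common bit per column; epsilon flips every bit of gamma.
    gamma = "".join([df[i].mode().values[0] for i in range(1, n_bits + 1)])
    epsilon = "".join(['1' if bit == '0' else '0' for bit in gamma])

    # int() parses a base-2 string directly.
    power_consumption = int(gamma, 2) * int(epsilon, 2)

    return power_consumption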

In [73]:
compute_power_consumption()

00100 5 ['0', '0', '1', '0', '0']


198

In [74]:
compute_power_consumption('input.txt')

010100110111 12 ['0', '1', '0', '1', '0', '0', '1', '1', '0', '1', '1', '1']


4001724

--- Part Two ---

Next, you should verify the life support rating, which can be determined by multiplying the oxygen generator rating by the CO2 scrubber rating.

Both the oxygen generator rating and the CO2 scrubber rating are values that can be found in your diagnostic report - finding them is the tricky part. Both values are located using a similar process that involves filtering out values until only one remains. Before searching for either rating value, start with the full list of binary numbers from your diagnostic report and consider just the first bit of those numbers. Then:

Keep only numbers selected by the bit criteria for the type of rating value for which you are searching. Discard numbers which do not match the bit criteria.
If you only have one number left, stop; this is the rating value for which you are searching.
Otherwise, repeat the process, considering the next bit to the right.

The bit criteria depends on which type of rating value you want to find:

To find oxygen generator rating, determine the most common value (0 or 1) in the current bit position, and keep only numbers with that bit in that position. If 0 and 1 are equally common, keep values with a 1 in the position being considered.
To find CO2 scrubber rating, determine the least common value (0 or 1) in the current bit position, and keep only numbers with that bit in that position. If 0 and 1 are equally common, keep values with a 0 in the position being considered.

For example, to determine the oxygen generator rating value using the same example diagnostic report from above:

Start with all 12 numbers and consider only the first bit of each number. There are more 1 bits (7) than 0 bits (5), so keep only the 7 numbers with a 1 in the first position: 11110, 10110, 10111, 10101, 11100, 10000, and 11001.
Then, consider the second bit of the 7 remaining numbers: there are more 0 bits (4) than 1 bits (3), so keep only the 4 numbers with a 0 in the second position: 10110, 10111, 10101, and 10000.
In the third position, three of the four numbers have a 1, so keep those three: 10110, 10111, and 10101.
In the fourth position, two of the three numbers have a 1, so keep those two: 10110 and 10111.
In the fifth position, there are an equal number of 0 bits and 1 bits (one each). So, to find the oxygen generator rating, keep the number with a 1 in that position: 10111.
As there is only one number left, stop; the oxygen generator rating is 10111, or 23 in decimal.

Then, to determine the CO2 scrubber rating value from the same example above:

Start again with all 12 numbers and consider only the first bit of each number. There are fewer 0 bits (5) than 1 bits (7), so keep only the 5 numbers with a 0 in the first position: 00100, 01111, 00111, 00010, and 01010.
Then, consider the second bit of the 5 remaining numbers: there are fewer 1 bits (2) than 0 bits (3), so keep only the 2 numbers with a 1 in the second position: 01111 and 01010.
In the third position, there are an equal number of 0 bits and 1 bits (one each). So, to find the CO2 scrubber rating, keep the number with a 0 in that position: 01010.
As there is only one number left, stop; the CO2 scrubber rating is 01010, or 10 in decimal.

Finally, to find the life support rating, multiply the oxygen generator rating (23) by the CO2 scrubber rating (10) to get 230.

Use the binary numbers in your diagnostic report to calculate the oxygen generator rating and CO2 scrubber rating, then multiply them together. What is the life support rating of the submarine? (Be sure to represent your answer in decimal, not binary.)
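
The filtering process described above can be sketched in plain Python. This is a hedged sketch, not the notebook's implementation: `find_rating` and `keep_most_common` are hypothetical names, the numbers are assumed to be unique, and skipping a position where only one bit value is present is an added assumption so the candidate list can never become empty.

```python
# The example report from the puzzle text.
EXAMPLE_REPORT = [
    "00100", "11110", "10110", "10111", "10101", "01111",
    "00111", "11100", "10000", "11001", "00010", "01010",
]

def find_rating(report, keep_most_common):
    """Filter bit strings position by position until one candidate remains."""
    candidates = list(report)
    position = 0
    while len(candidates) > 1 and position < len(candidates[0]):
        ones = sum(1 for number in candidates if number[position] == "1")
        zeros = len(candidates) - ones
        # Assumption: if only one bit value occurs here, nothing is discarded.
        if ones > 0 and zeros > 0:
            if keep_most_common:
                # Oxygen criteria: most common bit, ties keep '1'.
                wanted = "1" if ones >= zeros else "0"
            else:
                # CO2 criteria: least common bit, ties keep '0'.
                wanted = "0" if zeros <= ones else "1"
            candidates = [n for n in candidates if n[position] == wanted]
        position += 1
    return candidates[0]

oxygen = find_rating(EXAMPLE_REPORT, keep_most_common=True)
co2 = find_rating(EXAMPLE_REPORT, keep_most_common=False)
print(oxygen, co2, int(oxygen, 2) * int(co2, 2))  # 10111 01010 230
```

The counts are recomputed on the remaining candidates at every position, which is the step that distinguishes Part Two from the per-column counts used in Part One.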

In [183]:
def compute_life_support_rating(input_file='test_input_1.txt'):
    """Print the oxygen, CO2 and life support ratings; return the bit-column DataFrame."""
    # Relies on find_more_1_if_same and find_less_0_if_same from the next cell.
    with open(input_file, 'r') as f:
        data = [line.strip('\n') for line in f.readlines()]
    df = pd.DataFrame(data)
    n_bits = len(data[0])

    # Columns 1..n_bits each hold one bit position of column 0.
    for i in range(1, n_bits + 1):
        df[i] = df[0].apply(lambda x: x[i-1])

    # Oxygen: keep rows with the most common bit (ties keep '1'),
    # recounting on the remaining rows at every position.
    print('computing oxygen rating...')
    df_oxygen = df
    for i in range(1, n_bits + 1):
        if len(df_oxygen) > 1:
            val = find_more_1_if_same(df_oxygen, i)
            df_oxygen = df_oxygen[df_oxygen[i] == val]
    oxygen_rating = df_oxygen[0].values[0]
    oxygen_rating_int = int(oxygen_rating, 2)
    print('oxygen rating: ', oxygen_rating, oxygen_rating_int)

    # CO2: keep rows with the least common bit (ties keep '0').
    print('computing co2 rating...')
    df_co2 = df
    for i in range(1, n_bits + 1):
        if len(df_co2) > 1:
            val = find_less_0_if_same(df_co2, i)
            df_co2 = df_co2[df_co2[i] == val]
    co2_rating = df_co2[0].values[0]
    co2_rating_int = int(co2_rating, 2)
    print('co2 rating: ', co2_rating, co2_rating_int)

    life_support_rating = oxygen_rating_int * co2_rating_int
    print('life_support_rating: ', life_support_rating)
    # Return the frame (not the rating) so later cells can inspect the bit columns.
    return df

In [184]:
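def find_more_1_if_same(df, i):
    """Return the most common bit in column i; a tie returns '1'."""
    zeros = len(df[df[i]=='0'])
    ones = len(df[df[i]=='1'])
    if zeros > ones:
        return '0'
    else:
        return '1'

def find_less_0_if_same(df, i):
    """Return the least common bit in column i; a tie returns '0'."""
    zeros = len(df[df[i]=='0'])
    ones = len(df[df[i]=='1'])
    if zeros > ones:
        return '1'
    else:
        return '0'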

In [185]:
df = compute_life_support_rating()

computing oxygen rating...
oxygen rating:  10111 23
computing co2 rating...
co2 rating:  01010 10
life_support_rating:  230


In [186]:
df = compute_life_support_rating('input.txt')

computing oxygen rating...
oxygen rating:  100111110001 2545
computing co2 rating...
co2 rating:  000011100111 231
life_support_rating:  587895


In [81]:
df[1].mode().values[0]

'1'

In [92]:
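# Exploratory check: narrow the mask using each column's most common bit.
# Note: mode() is taken over the full column, not over the rows still selected
# by mask, so the surviving row need not equal the oxygen generator rating.
mask = (df[1] == df[1].mode().values[0])
print(1, '\t', df[1].mode().values[0], '\t', len(df[mask]))
for i in range(2,13):
    if len(df[mask]) > 1:
        mask = mask & (df[i] == df[i].mode().values[0])
        print(i, '\t', df[i].mode().values[0], '\t', len(df[mask]))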

1 	 1 	 528
2 	 0 	 281
3 	 0 	 143
4 	 1 	 74
5 	 1 	 38
6 	 0 	 18
7 	 1 	 9
8 	 1 	 4
9 	 0 	 3
10 	 1 	 1


In [93]:
df[mask]

                0  1  2  3  4  5  6  7  8  9  10  11  12
577  100110110111  1  0  0  1  1  0  1  1  0   1   1   1


In [105]:
int(False)

0

In [142]:
df

        0  1  2  3  4  5
0   00100  0  0  1  0  0
1   11110  1  1  1  1  0
2   10110  1  0  1  1  0
3   10111  1  0  1  1  1
4   10101  1  0  1  0  1
5   01111  0  1  1  1  1
6   00111  0  0  1  1  1
7   11100  1  1  1  0  0
8   10000  1  0  0  0  0
9   11001  1  1  0  0  1
10  00010  0  0  0  1  0
11  01010  0  1  0  1  0


In [148]:
for i in range(1,6):
    
    print(i, len(df[df[i]=='0']), len(df[df[i]=='1']))

1 5 7
2 7 5
3 4 8
4 5 7
5 7 5
