# Advent of Code: Day 2

## Part I

 - Destination reached. Current Year: 1518. Current Location: North Pole Utility Closet 83N10
 - Count the box IDs that contain exactly two of any letter, then separately count those that contain exactly three of any letter
 - Multiply those two counts together to get a rudimentary checksum and compare it to what your device predicts

## Example

In the list below, x marks an ID that counts toward the twos and y marks an ID that counts toward the threes.

* abcdef contains no letters that appear exactly two or three times.
* bababc contains two a and three b, so it counts for both. x, y
* abbcde contains two b, but no letter appears exactly three times. x
* abcccd contains three c, but no letter appears exactly two times. y
* aabcdd contains two a and two d, but it only counts once. x
* abcdee contains two e. x
* ababab contains three a and three b, but it only counts once. y
* Four IDs are marked x and three are marked y, so checksum = 4 * 3 = 12
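
The counting rule in this example can be sketched with `collections.Counter`. The function name `box_checksum` and the hard-coded `example_ids` list are illustrative and not part of the original notebook.

```python
from collections import Counter

def box_checksum(box_ids):
    """Multiply the number of IDs with a letter seen exactly twice
    by the number of IDs with a letter seen exactly three times."""
    twos = 0
    threes = 0
    for box_id in box_ids:
        # Distinct per-letter counts, so an ID counts at most once per category
        counts = set(Counter(box_id).values())
        if 2 in counts:
            twos += 1
        if 3 in counts:
            threes += 1
    return twos * threes

example_ids = ["abcdef", "bababc", "abbcde", "abcccd", "aabcdd", "abcdee", "ababab"]
print(box_checksum(example_ids))  # 12
```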

In [63]:
# Dependencies
import pandas as pd
import numpy as np
from collections import Counter
from fuzzywuzzy import fuzz
from fuzzywuzzy import process

In [6]:
# Read CSV
filepath = 'ac_day2_boxids.csv'
df = pd.read_csv(filepath, header=None)
df = df.rename(columns={0:'id'})
df.head()

| | id |
|---|---|
| 0 | umdryebvlapkozostecnihjexg |
| 1 | amdryebalapkozfstwcnrhjqxg |
| 2 | umdcyebvlapaozfstwcnihjqgg |
| 3 | ymdryrbvlapkozfstwcuihjqxg |
| 4 | umdrsebvlapkozxstwcnihjqig |


In [47]:
# testing the solution on one instance
counter = Counter(df.iloc[0,0])
converter = pd.DataFrame.from_dict(counter, orient='index')
c_list = converter[0].tolist()
print(c_list)

twos = c_list.count(2)
threes = c_list.count(3)

print(f'twos: {twos}')
print(f'threes: {threes}')
print(len(df))

[1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
twos: 1
threes: 1
250


In [51]:
# calculating the values for the entire list of ids
twos = 0
threes = 0

for box_id in df.iloc[:, 0]:
    # Distinct per-letter counts for this ID
    counts = set(Counter(box_id).values())
    if 2 in counts:
        twos += 1
    if 3 in counts:
        threes += 1

print(f'twos: {twos}')
print(f'threes: {threes}')

twos: 248
threes: 23


In [52]:
# Checksum solution
checksum = twos * threes
print(checksum)

5704


## Part II

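What letters are common between the two correct box IDs? The correct IDs are the two that differ by exactly one character at the same position. (In the puzzle's example, fghij and fguij are that pair; removing the differing character from either ID produces fgij.)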
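
As a minimal sketch of this rule, the IDs can be compared position by position rather than with a fuzzy ratio. This is a different technique from the one the notebook uses below. The `sample_ids` list is assumed from the puzzle statement's example (it is not reproduced in this notebook), and `common_letters` is a hypothetical helper name.

```python
from itertools import combinations

def common_letters(box_ids):
    """Return the shared letters of the first pair of equal-length IDs
    that differ at exactly one position, or None if no such pair exists."""
    for a, b in combinations(box_ids, 2):
        if len(a) != len(b):
            continue
        # Characters that agree at the same index in both IDs
        same = [ca for ca, cb in zip(a, b) if ca == cb]
        if len(same) == len(a) - 1:
            return "".join(same)
    return None

sample_ids = ["abcde", "fghij", "klmno", "pqrst", "fguij", "axcye", "wvxyz"]
print(common_letters(sample_ids))  # fgij
```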

In [99]:
# Length of string
splitstr = [x for x in df.iloc[0,0]]
len(splitstr)

26

In [67]:
# Testing fuzzywuzzy
fuzz.ratio("atlantabraves", "atlantabrbves")

92
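
The score of 92 matches the ratio 2 * M / T, where M is the number of matching characters and T is the combined length of both strings: 2 * 12 / 26 ≈ 0.923. The sketch below reproduces that number with the standard library's `difflib`, on the assumption that `fuzz.ratio` scores strings the same way. The helper name `similarity` is hypothetical.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Similarity of two strings on a 0-100 scale: 2 * matches / total length."""
    return round(100 * SequenceMatcher(None, a, b).ratio())

print(similarity("atlantabraves", "atlantabrbves"))  # 92
```

By the same formula, two 26-character IDs that differ in a single position would score 2 * 25 / 52 ≈ 0.96.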

In [84]:
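# Compare every ID with every ID (including itself) and record the ratio
y = 0
z = 0  # start at 0 so the result has the full 250 x 250 = 62500 rows shown below
total = []

while y < len(df):
    while z < len(df):
        str1 = df.iloc[y,0]
        str2 = df.iloc[z,0]
        ratio = fuzz.ratio(str1,str2)
        entry = [str1, str2, ratio]
        total.append(entry)
        z += 1
    z = 0  # reset the inner index for the next outer row
    y += 1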

compare_df = pd.DataFrame(total)
compare_df = compare_df.sort_values(by=[2], ascending=False)
print(len(compare_df))
print(len(total))
compare_df.head(20)

62500
62500


| | 0 | 1 | 2 |
|---|---|---|---|
| 0 | umdryebvlapkozostecnihjexg | umdryebvlapkozostecnihjexg | 100 |
| 19829 | umdryebveapkozfstwcnthjqgg | umdryebveapkozfstwcnthjqgg | 100 |
| 23092 | umdryesvnapkozestwcnihjqxg | umdryesvnapkozestwcnihjqxg | 100 |
| 22841 | umdryebvmapkozfstichihjqxg | umdryebvmapkozfstichihjqxg | 100 |
| 22590 | umdryebvrapkozfstmcndhjqxg | umdryebvrapkozfstmcndhjqxg | 100 |
| 22339 | umdrnebvlkpkozfstwcnihjnxg | umdrnebvlkpkozfstwcnihjnxg | 100 |
| 22088 | gmkryebvlapkozfstwcnihjmxg | gmkryebvlapkozfstwcnihjmxg | 100 |
| 21837 | umdryebvlapkosfstfcnihjqxe | umdryebvlapkosfstfcnihjqxe | 100 |
| 21586 | umqryebvlaphozfstwcnihjqxn | umqryebvlaphozfstwcnihjqxn | 100 |
| 21335 | amdryhbvlapkozfstwcnifjqxg | amdryhbvlapkozfstwcnifjqxg | 100 |


In [86]:
# Export to CSV 
# compare_df.to_csv('compare_df.csv')

In [100]:
# Query winners?
compare_df[(compare_df[2] >= 95) & (compare_df[2] < 100)]

| | 0 | 1 | 2 |
|---|---|---|---|
| 17612 | umdryabviapkozistwcnihjqxg | umdryabviapkozistwcnihjqxd | 96 |
| 28070 | umdryabviapkozistwcnihjqxd | umdryabviapkozistwcnihjqxg | 96 |


In [None]:
# umdryabviapkozistwcnihjqxg
# umdryabviapkozistwcnihjqxd
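
The notebook stops at the two candidate IDs. As a minimal sketch, the common letters can be extracted by keeping only the characters that match at the same position. The variable names are illustrative, and the notebook does not confirm whether this string was accepted as the puzzle answer.

```python
id_a = "umdryabviapkozistwcnihjqxg"
id_b = "umdryabviapkozistwcnihjqxd"

# Keep only the characters that match at the same position in both IDs
common = "".join(ca for ca, cb in zip(id_a, id_b) if ca == cb)
print(common)       # umdryabviapkozistwcnihjqx
print(len(common))  # 25
```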