In [72]:
import utils
from tqdm import tnrange, tqdm_notebook
from collections import Counter
from difflib import SequenceMatcher

# Day 2: Inventory Management System

Look at the list of box names and make two counts; the checksum is the product of the two:

- names in which some letter appears exactly 2 times
- names in which some letter appears exactly 3 times

In [5]:
inp = utils.get_input(2).splitlines()
inp[:5]

['zihrtxagncfpbsnolxydujjmqv',
 'zihrtxagwcfpbsoolnydukjyqv',
 'aihrtxagwcfpbsnoleybmkjmqv',
 'zihrtxagwcfpbsnolgyduajmrv',
 'zihrtxgmwcfpbunoleydukjmqv']

I'm using `Counter` to count characters here, since it's built in to Python 3 - though it's quite likely not the fastest way to do this:

In [74]:
%%time

two = 0
three = 0

for box in inp:
    count = Counter(box)
    # does any letter appear exactly twice?
    if 2 in count.values():
        two += 1
    
    # does any letter appear exactly three times?
    if 3 in count.values():
        three += 1
    
print(f"checksum is {two*three}")

checksum is 8892
CPU times: user 3.12 ms, sys: 0 ns, total: 3.12 ms
Wall time: 3.14 ms
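The same counting step can be wrapped in a small function and checked on a handful of short names. This is only a sketch: `checksum` is a name I made up for illustration, and the sample names are invented rather than taken from the puzzle input.

```python
from collections import Counter

def checksum(box_ids):
    """Product of (# names with a letter seen exactly twice)
    and (# names with a letter seen exactly three times)."""
    twos = threes = 0
    for box_id in box_ids:
        # The set of distinct per-letter counts in this name
        counts = set(Counter(box_id).values())
        twos += 2 in counts      # True counts as 1
        threes += 3 in counts
    return twos * threes

# Illustrative names, not part of the real input
sample = ["abcdef", "bababc", "abbcde", "abcccd", "aabcdd", "abcdee", "ababab"]
print(checksum(sample))  # 4 names with a double * 3 names with a triple = 12
```

A name such as `bababc` counts towards both totals, but only once each, no matter how many letters are doubled or tripled.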


# Part 2

The right box names differ by only one letter - all the letters are the same in the same positions, except for one. So let's scan our input to find the two similar boxes.

I was going to write my own function, but it turns out Python 3 has a built-in [diff library](https://docs.python.org/3/library/difflib.html) with something called a [SequenceMatcher](https://docs.python.org/3/library/difflib.html#difflib.SequenceMatcher), which sounds like it was designed for this problem, so let's take a look at it.
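As a quick feel for what `SequenceMatcher` reports, here is a minimal sketch on two short made-up strings (they are not from the input). According to the `difflib` docs, `ratio()` returns `2.0 * M / T`, where `M` is the number of matched characters and `T` is the combined length of both sequences.

```python
from difflib import SequenceMatcher

# Two illustrative 5-letter strings that differ only in the middle position
a, b = "fghij", "fguij"

ratio = SequenceMatcher(a=a, b=b).ratio()

# M = 4 matching characters ("fg" and "ij"), T = 5 + 5 = 10,
# so the ratio is 2 * 4 / 10 = 0.8
print(ratio)
```

For two strings of equal length `n` that differ in exactly one position, this works out to `(n - 1) / n`.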

First up, since `SequenceMatcher` gives us a ratio of the similarity, let's see what ratio we are looking for:

In [59]:
print(len(inp[0]))
[len(name) for name in inp if len(name)!=len(inp[0])]

26


[]

Every name is 26 letters long, so we need to find the two names with a matching ratio of 25/26:

In [77]:
def find_similar(ratio=25/26):
    print(f"We're looking for this ratio: {ratio}")
    for i, name in enumerate(tqdm_notebook(inp)):
        # seq2 is set for each candidate in the inner loop
        s = SequenceMatcher(a=name)

        for name2 in inp[i+1:]:
            s.set_seq2(name2)
            # 25 of 26 characters match
            if s.ratio() == ratio:
                print(f"found {name}, {name2}")
                return name, name2
            
n1, n2 = find_similar()

We're looking for this ratio: 0.9615384615384616


HBox(children=(IntProgress(value=0, max=250), HTML(value='')))

found zihwtxagsifpbsnwleydukjmqv, zihwtxagwifpbsnwleydukjmqv
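As a cross-check, the "differ in exactly one position" rule can also be tested directly, without going through a similarity ratio. The sketch below is an alternative, not what the cell above does; `differs_by_one` and `find_pair` are hypothetical helper names and the demo names are invented. One reason to consider it: the ratio only counts matched characters, so a pair like `abcd` / `bcda` also appears to score (n - 1) / n even though every position differs.

```python
from itertools import combinations

def differs_by_one(a, b):
    """True if a and b have equal length and differ in exactly one position."""
    return len(a) == len(b) and sum(c1 != c2 for c1, c2 in zip(a, b)) == 1

def find_pair(names):
    """Return the first pair of names differing in exactly one position, or None."""
    for a, b in combinations(names, 2):
        if differs_by_one(a, b):
            return a, b
    return None

# Illustrative names, not part of the real input
demo = ["abcde", "fghij", "klmno", "pqrst", "fguij", "axcye", "wvxyz"]
print(find_pair(demo))  # ('fghij', 'fguij')
```

Like the `SequenceMatcher` loop, this compares every pair, which is fine for a few hundred names.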



Now to find the common letters between the two names:

In [78]:
n = ""
for c1, c2 in zip(n1,n2):
    if c1 == c2:
        n += c1
n

'zihwtxagifpbsnwleydukjmqv'