# --- Day 14: One-Time Pad ---

In order to communicate securely with Santa while you're on this mission, you've been using a one-time pad that you generate using a pre-agreed algorithm. Unfortunately, you've run out of keys in your one-time pad, and so you need to generate some more.

To generate keys, you first get a stream of random data by taking the MD5 of a pre-arranged salt (your puzzle input) and an increasing integer index (starting with 0, and represented in decimal); the resulting MD5 hash should be represented as a string of lowercase hexadecimal digits.

However, not all of these MD5 hashes are keys, and you need 64 new keys for your one-time pad. A hash is a key only if:

- It contains three of the same character in a row, like 777. Only consider the first such triplet in a hash.
- One of the next 1000 hashes in the stream contains that same character five times in a row, like 77777.

Considering future hashes for five-of-a-kind sequences does not cause those hashes to be skipped; instead, regardless of whether the current hash is a key, always resume testing for keys starting with the very next hash.

For example, if the pre-arranged salt is abc:

- The first index which produces a triple is 18, because the MD5 hash of abc18 contains ...cc38887a5.... However, index 18 does not count as a key for your one-time pad, because none of the next thousand hashes (index 19 through index 1018) contain 88888.
- The next index which produces a triple is 39; the hash of abc39 contains eee. It is also the first key: one of the next thousand hashes (the one at index 816) contains eeeee.
- None of the next six triples are keys, but the one after that, at index 92, is: it contains 999 and index 200 contains 99999.
- Eventually, index 22728 meets all of the criteria to generate the 64th key.

So, using our example salt of abc, index 22728 produces the 64th key.
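
The first two bullets of the example can be checked directly. This is a minimal, unoptimized sketch: the helper names `md5_hex`, `first_triple_char` and `is_key` are made up for illustration and are not part of the puzzle. It rehashes freely, so it is only suitable for small checks like this one.

```python
import hashlib
import re

TRIPLE = re.compile(r"(.)\1\1")  # first run of three identical characters


def md5_hex(salt, index):
    """Lowercase hex MD5 of the salt followed by the decimal index."""
    return hashlib.md5(f"{salt}{index}".encode()).hexdigest()


def first_triple_char(h):
    """Character of the first triplet in h, or None if there is none."""
    m = TRIPLE.search(h)
    return m.group(1) if m else None


def is_key(salt, index):
    """True if the first triplet at index is matched by five in a row
    somewhere in the next 1000 hashes."""
    char = first_triple_char(md5_hex(salt, index))
    if char is None:
        return False
    return any(char * 5 in md5_hex(salt, j) for j in range(index + 1, index + 1001))


# Per the example above, these should come out as 18 and 39 for salt abc
first_triple = next(i for i in range(1000) if first_triple_char(md5_hex("abc", i)))
first_key = next(i for i in range(1000) if is_key("abc", i))
print(first_triple, first_key)
```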

Given the actual salt in your puzzle input, **what index produces your 64th one-time pad key?**

In [4]:
puzzle_salt = "jlmsuwbz"
test_salt = "abc"

First up, a function to hash a salt plus an index. I later updated it for part two so that it can apply extra rounds of hashing:

In [90]:
import hashlib
import re
from functools import lru_cache

@lru_cache(1000) # to save the last 1000 hashes
def hashval(salt=test_salt, i=0, multiple_hashes=None):
    """takes in a salt and num to append to it, returns hash"""
    h = hashlib.md5(f"{salt}{i}".encode()).hexdigest()
    
    if multiple_hashes:
        for _ in range(multiple_hashes):
            h = hashlib.md5(h.encode()).hexdigest()
    
    return h

hashval(multiple_hashes=2016)

'a107ff634856bb300138cac6568c0f24'

Next, let's find 3 repeated characters in a string:

In [56]:
s = '83501e9109999965af11270ef8d61a4f' # string with 5 repeats
rx3 = re.compile(r"(.)\1{2}") # regex to find 3 repeated chars
rx3.search(s)

<_sre.SRE_Match object; span=(9, 12), match='999'>

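The above regex matches only the first 3 repeated characters, even though there are more 9s after them. This is not greediness: `{2}` is an exact count, so the match is always 3 characters long, and `search` returns the leftmost match it finds. That suits the puzzle, which only considers the first triplet in a hash.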
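
To see how the quantifier controls the match length, here is a small comparison on the same string. Only the first pattern is the one used in this notebook; the open-ended `{2,}` variants are added purely for illustration.

```python
import re

s = "83501e9109999965af11270ef8d61a4f"  # same string as above, five 9s in a row

exact = re.search(r"(.)\1{2}", s)    # exactly 2 more copies: always 3 chars
greedy = re.search(r"(.)\1{2,}", s)  # 2 or more copies, as many as possible
lazy = re.search(r"(.)\1{2,}?", s)   # 2 or more copies, as few as possible

print(exact.group(), greedy.group(), lazy.group())
```

Greedy versus lazy only matters when the quantifier allows a range of lengths. A fixed count like `{2}` has nothing to be greedy about.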

Now to use regex to find 5 repeats of the above character. 

This is where I learned there is such a thing as a raw f string, which is handy as I need to insert the char from above into a raw string for the second regex. 

The main thing here is that single curly brackets are "processed" by the f string, so brackets to pass onto the raw string have to be doubled:

In [57]:
char = rx3.search(s).group()[0] 
print("char to find 5 in a row of: ", char)

rx5 = re.compile(rf"({char})\1{{4}}") # regex to find 5 repeated chars
rx5.search(s)

char to find 5 in a row of:  9


<_sre.SRE_Match object; span=(9, 14), match='99999'>
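
Since the character is already known at this point, a second regex is not strictly needed. As an alternative sketch (not what the rest of this notebook uses), five in a row of a known character is just a fixed substring, so a plain `in` test works. The snippet also shows what the doubled braces in the raw f-string turn into.

```python
s = "83501e9109999965af11270ef8d61a4f"
char = "9"

# The raw f-string fills in {char} and turns {{4}} into a literal {4}
pattern = rf"({char})\1{{4}}"
print(pattern)

# Five in a row of a known character is a fixed substring
has_five = char * 5 in s
print(has_five)
```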

In [70]:
import time
from tqdm import tqdm  # imported here so this progress bar test runs on its own

pbar = tqdm(total=100)

for i in range(10):
    pbar.update(10)
    time.sleep(0.2)
    
pbar.close()

100%|██████████| 100/100 [00:01<00:00, 50.59it/s]


In [84]:
from tqdm import tqdm


def solve(salt=test_salt):
    """takes in a salt and returns the index that produces the 64th key"""
    
    i = 0
    keys = []
    
    rx3 = re.compile(r"(.)\1{2}") # regex to find 3 repeated chars
    
    pbar = tqdm(total=64)
    while len(keys) < 64:
        h = hashval(salt, i)
        
        # find the first triplet in the hash, if any
        m = rx3.search(h)
        if m:
            char = m.group(1)
            
            rx5 = re.compile(rf"({char})\1{{4}}") # regex to find 5 repeated chars
            
            # now to look at the next 1K hashes:
            for j in range(i+1, i+1001):
                if rx5.search(hashval(salt, j)):
                    keys.append(h)
                    pbar.update(1)
                    break # one match is enough, so stop scanning
        i += 1
    pbar.close()
    print(f"{len(keys)}th key found at the {i-1} loop for salt {salt}")
    return i-1
        
solve(puzzle_salt)

100%|██████████| 64/64 [00:02<00:00, 22.45it/s]

64th key found at the 35186 loop for salt jlmsuwbz





35186

# --- Part Two ---

Of course, in order to make this process even more secure, you've also implemented key stretching.

Key stretching forces attackers to spend more time generating hashes. Unfortunately, it forces everyone else to spend more time, too.

To implement key stretching, whenever you generate a hash, before you use it, you first find the MD5 hash of that hash, then the MD5 hash of that hash, and so on, a total of 2016 additional hashings. Always use lowercase hexadecimal representations of hashes.

For example, to find the stretched hash for index 0 and salt abc:

- Find the MD5 hash of abc0: 577571be4de9dcce85a041ba0410f29f.
- Then, find the MD5 hash of that hash: eec80a0c92dc8a0777c619d9bb51e910.
- Then, find the MD5 hash of that hash: 16062ce768787384c81fe17a7a60c7e3.
- ...repeat many times...
- Then, find the MD5 hash of that hash: a107ff634856bb300138cac6568c0f24.

So, the stretched hash for index 0 in this situation is a107ff.... In the end, you find the original hash (one use of MD5), then find the hash-of-the-previous-hash 2016 times, for a total of 2017 uses of MD5.
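
The chain above can be reproduced with a few lines. This is a minimal sketch; the function name `stretched_hash` and its `extra_rounds` parameter are made up for illustration. The values it prints should match the hashes listed in the example.

```python
import hashlib


def stretched_hash(salt, index, extra_rounds=2016):
    """MD5 of salt+index, then re-hash the lowercase hex digest extra_rounds times."""
    h = hashlib.md5(f"{salt}{index}".encode()).hexdigest()
    for _ in range(extra_rounds):
        h = hashlib.md5(h.encode()).hexdigest()
    return h


print(stretched_hash("abc", 0, 0))  # plain hash: one use of MD5
print(stretched_hash("abc", 0, 1))  # hash of that hash
print(stretched_hash("abc", 0, 2))  # and once more
print(stretched_hash("abc", 0))     # 2016 extra rounds, 2017 uses of MD5 in total
```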

The rest of the process remains the same, but now the keys are entirely different. Again for salt abc:

- The first triple (222, at index 5) has no matching 22222 in the next thousand hashes.
- The second triple (eee, at index 10) has a matching eeeee at index 89, and so it is the first key.
- Eventually, index 22551 produces the 64th key (triple fff with matching fffff at index 22859).

Given the actual salt in your puzzle input and using 2016 extra MD5 calls of key stretching, what index now produces your 64th one-time pad key?

---

This is straightforward: I just modified the `hashval` function to apply extra rounds of hashing when a `multiple_hashes` value is passed.

In [92]:
from tqdm import tqdm


def solve2(salt=test_salt):
    """takes in a salt and returns the index that produces the 64th stretched key"""
    
    i = 0
    keys = []
    
    rx3 = re.compile(r"(.)\1{2}") # regex to find 3 repeated chars
    
    pbar = tqdm(total=64)
    while len(keys) < 64:
        h = hashval(salt, i, 2016)
        
        # find the first triplet in the hash, if any
        m = rx3.search(h)
        if m:
            char = m.group(1)
            
            rx5 = re.compile(rf"({char})\1{{4}}") # regex to find 5 repeated chars
            
            # now to look at the next 1K hashes:
            for j in range(i+1, i+1001):
                if rx5.search(hashval(salt, j, 2016)):
                    keys.append(h)
                    pbar.update(1)
                    break # one match is enough, so stop scanning
        i += 1
    pbar.close()
    print(f"{len(keys)}th key found at the {i-1} loop for salt {salt}")
    return i-1
        
solve2()

100%|██████████| 64/64 [01:17<00:00,  1.71s/it]

64th key found at the 22551 loop for salt abc





22551

In [93]:
solve2(puzzle_salt)

100%|██████████| 64/64 [01:18<00:00,  1.22s/it]

64th key found at the 22429 loop for salt jlmsuwbz





22429

`22429` is the right puzzle answer.
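
Both parts could also be handled by one function. The sketch below is one possible variant, not the code that produced the answers above: the name `find_64th_key` and its parameters are made up for illustration. It keeps every hash in a plain list instead of a bounded `lru_cache`, so no hash is ever computed twice, and it uses a substring test instead of a second regex.

```python
import hashlib
import re
from itertools import count

TRIPLE = re.compile(r"(.)\1\1")  # first run of three identical characters


def find_64th_key(salt, extra_rounds=0, wanted=64):
    """Return the index that produces the wanted-th key for this salt."""
    hashes = []  # hashes[i] holds the (optionally stretched) hash for index i

    def get(i):
        # Extend the list on demand so each index is hashed exactly once
        while len(hashes) <= i:
            h = hashlib.md5(f"{salt}{len(hashes)}".encode()).hexdigest()
            for _ in range(extra_rounds):
                h = hashlib.md5(h.encode()).hexdigest()
            hashes.append(h)
        return hashes[i]

    found = 0
    for i in count():
        m = TRIPLE.search(get(i))
        if m is None:
            continue
        five = m.group(1) * 5
        # Look for five in a row anywhere in the next 1000 hashes
        if any(five in get(j) for j in range(i + 1, i + 1001)):
            found += 1
            if found == wanted:
                return i


print(find_64th_key("abc"))
```

With the example salt and no stretching this should print 22728, matching the puzzle text. Passing `extra_rounds=2016` gives the part two behaviour, which is much slower because every index then costs 2017 MD5 calls.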