# --- Day 4: High-Entropy Passphrases ---

A new system policy has been put in place that requires all accounts to use a passphrase instead of simply a password. A passphrase consists of a series of words (lowercase letters) separated by spaces.

To ensure security, a valid passphrase must contain no duplicate words.

For example:

- aa bb cc dd ee is valid.
- aa bb cc dd aa is not valid - the word aa appears more than once.
- aa bb cc dd aaa is valid - aa and aaa count as different words.

The system's full passphrase list is available as your puzzle input. How many passphrases are valid?

In [25]:
from collections import Counter

In [6]:
with open('day4_input.txt') as f:
    data = f.read().split("\n")
data[:4]

['una bokpr ftz ryw nau yknf fguaczl anu',
 'tvay wvco bcoblpt fwzg sfsys zvuqll mcbhwz ovcw fgdy',
 'ynsocz vid rfmsy essqt fpbjvvq sldje qfpvjvb',
 'yvh nxc kla vhy vkbq cxfzgr']

Python's [Counter](https://docs.python.org/3/library/collections.html) makes this a cinch:

In [27]:
[Counter(line.split()).most_common(1)[0][1] == 1 for line in data if len(line) > 0].count(True)

477

Or I can use sum, since True evaluates to 1 and False to 0:

In [28]:
sum([Counter(line.split()).most_common(1)[0][1] == 1 for line in data if len(line) > 0])

477

And that was pretty straightforward! There should be a faster way to do this, though: Counter tallies every word in the sentence, whereas we could stop looking at a sentence as soon as a duplicate is found.
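One way that early exit might look is sketched below. This is only an illustration, not something timed against the Counter version: `has_no_duplicates` is a hypothetical helper name, and it assumes words are separated by whitespace, as in the puzzle input.

```python
def has_no_duplicates(line):
    """Return True if no word repeats; stop at the first repeated word."""
    seen = set()
    for word in line.split():
        if word in seen:
            # Duplicate found, so the rest of the line is never examined.
            return False
        seen.add(word)
    return True


# The three examples from the puzzle description.
examples = ["aa bb cc dd ee", "aa bb cc dd aa", "aa bb cc dd aaa"]
print([has_no_duplicates(line) for line in examples])  # [True, False, True]
```

Whether this actually beats Counter on passphrases this short would need measuring, since Counter does its counting in optimized library code.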

## --- Part Two ---

For added security, yet another system policy has been put in place. Now, a valid passphrase must contain no two words that are anagrams of each other - that is, a passphrase is invalid if any word's letters can be rearranged to form any other word in the passphrase.

For example:

- abcde fghij is a valid passphrase.
- abcde xyz ecdab is not valid - the letters from the third word can be rearranged to form the first word.
- a ab abc abd abf abj is a valid passphrase, because all letters need to be used when forming another word.
- iiii oiii ooii oooi oooo is valid.
- oiii ioii iioi iiio is not valid - any of these words can be rearranged to form any other word.

Under this new system policy, how many passphrases are valid?

First, we need a function to check if two words are anagrams of each other. Again, the Counter class makes this easy:

In [32]:
def is_anagram(x, y):
    '''returns true if the two words are anagrams of each other'''
    return Counter(x) == Counter(y)
    
is_anagram("abcde", "ecdab")

True
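As a cross-check on the Counter comparison, the same test can be sketched by comparing the sorted letters of each word. The names `anagram_key` and `is_anagram_sorted` are hypothetical and are not used elsewhere in this notebook.

```python
def anagram_key(word):
    """Hypothetical helper: the letters of `word` in sorted order, as a string."""
    return "".join(sorted(word))


def is_anagram_sorted(x, y):
    """True if x and y contain exactly the same letters, the same number of times."""
    return anagram_key(x) == anagram_key(y)


print(is_anagram_sorted("abcde", "ecdab"))  # True
print(is_anagram_sorted("ab", "abc"))       # False: every letter must be used
```

The second call mirrors the puzzle's "a ab abc abd abf abj" example, where one word merely sharing letters with another is not enough to count as an anagram.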

Now, for a given sentence, we need to compare each word against every other word in that sentence. [itertools](https://docs.python.org/3/library/itertools.html) to the rescue:

In [40]:
from itertools import combinations
for x,y in combinations("abcde xyz ecdab".split(), 2):
    print(x,y)

abcde xyz
abcde ecdab
xyz ecdab


In [50]:
def check_anagrams(line):
    """takes in a line, returns True if no word is an anagram of another word
    false if an anagram is found"""
    for x, y in combinations(line.split(), 2):
        if is_anagram(x,y):
            return False
    return True
    
check_anagrams("abcde fghij"), check_anagrams("abcde xyz ecdab")

(True, False)

In [51]:
sum([check_anagrams(line) for line in data if len(line)>0])

167

Done! I could do this without using collections or itertools, but that would just make this significantly more verbose, without any gain in speed that I can think of.