# Day 4
## Part 1
A passphrase (one line of the input) is valid if no word appears in it more than once. We have to count the valid lines.

In [1]:
res = 0
with open('day04-part1-input.txt') as f:  # f rather than input, which would shadow the built-in
    for line in f:
        words = line.strip().split(' ')  # the strip removes the \n at the end of the line
        res += (len(words) == len(set(words)))  # equal only if all words are unique
res

325
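
The same check can be wrapped in a small function so it can be tried on individual passphrases. This is only a minimal sketch: the name `has_unique_words` and the sample passphrases are made up for illustration and are not taken from the puzzle input.

```python
def has_unique_words(passphrase):
    """Return True if no word appears more than once in the passphrase."""
    words = passphrase.split()
    # A set drops duplicates, so the lengths match only when every word is unique
    return len(words) == len(set(words))


print(has_unique_words('aa bb cc dd'))   # no repeated word
print(has_unique_words('aa bb cc aa'))   # 'aa' appears twice
print(has_unique_words('aa bb cc aaa'))  # 'aa' and 'aaa' are different words
```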

## Part 2
Now we have to check for anagrams, not only exactly equal words.

* abcde fghij is a valid passphrase.
* abcde xyz ecdab is not valid - the letters from the third word can be rearranged to form the first word.
* a ab abc abd abf abj is a valid passphrase, because all letters need to be used when forming another word.
* iiii oiii ooii oooi oooo is valid.
* oiii ioii iioi iiio is not valid - any of these words can be rearranged to form any other word.


In [2]:
from itertools import product

To check whether two words are anagrams, we have to check that each letter of one word can be matched with exactly one letter of the other. We cannot just check that every letter of one word is found in the other, because there could be repeated letters.

In [3]:
def are_anagram(a, b):
    '''
    Test if a and b are anagrams
    a, b : strings
    returns : boolean
    '''
    # Split the letters of b
    letters_b = list(b)
    
    for l_a in a:
        # if the letter is missing, it is not an anagram
        if l_a not in letters_b:
            return False
        # remove the letter from b
        else:
            letters_b.remove(l_a)
    # If all the letters of b have been found, it is an anagram
    return letters_b == []
    

In [4]:
assert are_anagram('aaaa', 'bbbb') == False
assert are_anagram('abcd', 'dbac') == True
assert are_anagram('aabd', 'baad') == True
assert are_anagram('oiii', 'ooii') == False
assert are_anagram('oiiii', 'ooii') == False
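
To illustrate why repeated letters matter, the sketch below compares a set-based check with a count-based one. Both helper names (`same_letter_set`, `same_letter_counts`) are hypothetical, and `collections.Counter` is used here as an alternative to the letter-removal loop in `are_anagram`, not as the method used above.

```python
from collections import Counter


def same_letter_set(a, b):
    """Naive check: compares only which letters occur, ignoring how often."""
    return set(a) == set(b)


def same_letter_counts(a, b):
    """Compares how many times each letter occurs in both words."""
    return Counter(a) == Counter(b)


# 'oiii' and 'ooii' use the same letters, but in different quantities
print(same_letter_set('oiii', 'ooii'))     # True: the naive check is fooled
print(same_letter_counts('oiii', 'ooii'))  # False: the counts differ
print(same_letter_counts('abcd', 'dbac'))  # True: a real anagram
```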

Given a list of words, we take the product of the list with itself to compare every pair of words. We first check that all the words are different, so that whenever the product yields two equal words we know it is the same word paired with itself and can skip it. If any pair of words are anagrams, we return True.

In [5]:
def anagram_list(some_list):
    '''
    Are any two words in the list anagrams of each other?
    Args:
        some_list: list of words
    Returns:
        Boolean
    '''
    # If not all words are unique, return True
    if len(some_list) != len(set(some_list)):
        return True
    # We know that all the words are different
    # We can use product from itertools to compare all the
    # words from the list to each other, and we can skip
    # when the words are equal because we compare the word
    # with itself
    return any(are_anagram(p1, p2) for p1, p2 in product(some_list, some_list) if p1 != p2)

In [6]:
assert anagram_list(['iiii', 'oiii', 'ooii', 'oooi', 'oooo']) == False
assert anagram_list(['oiii', 'ioii', 'iioi', 'iiio']) == True
assert anagram_list(['a', 'ab', 'abc', 'abd', 'abf', 'abj']) == False
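
The product of the list with itself compares every pair twice, as (p1, p2) and as (p2, p1), in addition to pairing each word with itself. As a possible variation, `itertools.combinations` yields each unordered pair once, which also makes the separate duplicate-word check unnecessary, since two equal words are anagrams of each other. In this sketch the name `anagram_list_pairs` is hypothetical, and a sorted-letter comparison stands in for `are_anagram` so that the block runs on its own.

```python
from itertools import combinations


def anagram_list_pairs(words):
    """Return True if any two words in the list are anagrams of each other."""
    # combinations(words, 2) pairs items by position, so each unordered pair
    # is produced exactly once and a word is never paired with itself
    return any(sorted(w1) == sorted(w2) for w1, w2 in combinations(words, 2))


print(anagram_list_pairs(['oiii', 'ioii', 'iioi', 'iiio']))          # True
print(anagram_list_pairs(['iiii', 'oiii', 'ooii', 'oooi', 'oooo']))  # False
print(anagram_list_pairs(['abcde', 'fghij', 'abcde']))               # True: repeated word
```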

In [7]:
res = 0
with open('day04-part1-input.txt') as f:  # f rather than input, which would shadow the built-in
    for line in f:
        words = line.strip().split(' ')  # the strip removes the \n at the end of the line
        res += not anagram_list(words)  # valid if no two words are anagrams
res

119

Another solution, inspired by Jenny Bryan's clever one (https://github.com/jennybc/2017_advent-of-code/blob/master/day04.R), for detecting the anagrams: sort the letters of each word, then check that the sorted words are all different, as in the first part.

In [8]:
res = 0
with open('day04-part1-input.txt') as f:  # f rather than input, which would shadow the built-in
    for line in f:
        words = line.strip().split(' ')  # the strip removes the \n at the end of the line
        sorted_words = [''.join(sorted(word)) for word in words]  # anagrams sort to the same string
        res += len(sorted_words) == len(set(sorted_words))  # valid if all sorted words are unique
res

119
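
As a quick check, the sorted-letter approach can be run on the five example passphrases listed at the start of Part 2. The function name `is_valid_passphrase` is an assumption introduced for this sketch.

```python
def is_valid_passphrase(passphrase):
    """Return True if no two words in the passphrase are anagrams of each other."""
    # Anagrams become identical strings once their letters are sorted
    sorted_words = [''.join(sorted(word)) for word in passphrase.split()]
    return len(sorted_words) == len(set(sorted_words))


# Expected validity of each example from the Part 2 description
examples = {
    'abcde fghij': True,
    'abcde xyz ecdab': False,
    'a ab abc abd abf abj': True,
    'iiii oiii ooii oooi oooo': True,
    'oiii ioii iioi iiio': False,
}
for passphrase, expected in examples.items():
    assert is_valid_passphrase(passphrase) == expected
```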