# puzzler
given the following graph of allowed pairs of characters:

![connected graph of allowed pairs of characters](puzzler.png)

write a program that can read a dictionary file (e.g. `/usr/share/dict/words`) and print out only the words that are expressable on this graph. characters do not connect to themselves, while the letters and connections can be reused.

find and report the longest dictionary word expressed this way.

In [1]:
nodes = 

In [2]:

# given a letter, this dict lists the allowed letters that may follow:
next_letter_dict = {
    'b': ['r','l','o','t','y'],
    'r': ['c','e','o','l','b'],
    'c': ['h','g','o','e','r'],
    'h': ['s','w','o','g','c'],
    's': ['y','n','o','w','h'],
    'y': ['b','t','o','n','s'],
    't': ['b','l','o','n','y'],
    'l': ['r','e','o','t','b'],
    'e': ['r','c','g','o','l'],
    'g': ['e','c','h','w','o'],
    'w': ['o','g','h','s','n'],
    'n': ['t','o','w','s','y'],
    'o': ['b','r','c','h','s', 
          'y','l','e','g','w', 
          'n','t'],
}

allowed_tokens = set(next_letter_dict.keys())


In [20]:
# word = 'wrongness'
# word = 'wongness'
word = 'wonoss'

tokens = set(word)
word_length = len(word)
print('input is', word)
print('which contains', tokens)
if tokens.issubset(allowed_tokens):
    print('all letters allowed, searching for path')
    if len(word) > 1:
        print('testing one letter pair at a time')
        allowed = True
        first_letter = word[0]
        for second_letter in word[1:]:
            if second_letter not in next_letter_dict[first_letter]:
                allowed = False
                print('the pair', first_letter, '->', second_letter, 'is not allowed')
                break
            first_letter = second_letter
    else:
        allowed = True
        print('word', word, 'is allowed')
else:
    allowed = False
    print('word', word, 'contains characters not included in graph')
    print('missing letters:', tokens - allowed_tokens)
if allowed:
    print('>>>>>>the word', word, 'is allowed')
else:
    print('>>>>>>the word', word, 'is impossible')

input is wonoss
which contains {'n', 'o', 's', 'w'}
all letters allowed, searching for path
testing one letter pair at a time
the pair s -> s is not allowed
>>>>>>the word wonoss is impossible


In [11]:
def check_if_possible(word, next_letter_dict=next_letter_dict, allowed_tokens=allowed_tokens):
    """
    given a string and a graph of allowed token pairings, this function checks whether the string can 
    be formed by traversing the graph of allowed pairs. 
    input: 
        word, a character string.
        next_letter_dict, a dict whose keys are all the allowed tokens, the values are each key's (token's) allowed pairings.
        allowed_tokens, a set of tokens, equal to the keys of the next letter dict.
    output:
        allowed, boolean. true if word can be generated by traversing the graph.
    """
    if not set(word).issubset(allowed_tokens):
        # input word contains tokens not found in graph of allowed pairs.
        allowed = False
#         print('word', word, 'contains characters not included in graph')
#         print('missing letters:', set(word) - allowed_tokens)
    else:
        if len(word) > 1:
            # need to check pairs of tokens
            allowed =True
            first_letter = word[0]
            for second_letter in word[1:]:
#                 print(word, 'checking the pairs', first_letter, second_letter)
                if second_letter not in next_letter_dict[first_letter]:
                    allowed = False
#                     print('the pair', first_letter, '->', second_letter, 'is not allowed')
                    break
                first_letter = second_letter
        else:
            allowed = True
#     if allowed:
#         print('>>>>>>the word', word, 'is allowed')
#     else:
#         print('>>>>>>the word', word, 'is impossible')
    return allowed


def read_word_list(filepath):
    word_list = []
    with open(filepath, 'r') as file:
        for line in file:
            word_list.append(file.readline().strip())
    return word_list

def scan_words(word_list):
    return [
        word 
        for word in word_list
        if check_if_possible(word)
    ]
            
    

In [12]:
check_if_possible('wongness')

False

In [21]:
%time
import random
words = read_word_list('words.txt')
print('word list `words` contains', len(words), 'words')
allowed_words = scan_words(words)
print('of which', len(allowed_words), 'can be formed using graph')
long_allowed = [word for word in allowed_words if len(word)>6]
print('of which', len(long_allowed), 'are long')
print('for example:', random.sample(long_allowed, 5))
max_length = 5
for word in long_allowed:
    word_length = len(word)
    if word_length > max_length:
        longest_word = word
        max_length = word_length
print('and the longest word found was', longest_word, '(',  max_length, ')')

CPU times: user 3 µs, sys: 0 ns, total: 3 µs
Wall time: 5.96 µs
word list `words` contains 117943 words
of which 210 can be formed using graph
of which 20 are long
for example: ['syntony', 'gercrow', 'reconsole', 'tocogony', 'recolor']
and the longest word found was horologer ( 9 )


In [22]:
# allowed_words
long_allowed

['borecole',
 'bowshot',
 'coercer',
 'creosol',
 'geologer',
 'gercrow',
 'geronto',
 'gorcrow',
 'holotony',
 'honorer',
 'horologer',
 'oloroso',
 'oronoco',
 'recolor',
 'reconsole',
 'recrown',
 'snowshoe',
 'syntony',
 'tocogony',
 'tocororo']