## Spelling Bee Solver

Using two word lists (`big.txt` and `words-1.txt`), a set of letters, and a designated center letter, find every word that uses only letters from the set and contains the center letter. The big file includes many words that Spelling Bee probably won't accept, but it seems worthwhile to cast a wide net, since we don't know how accurate `words-1` is.

In [None]:
# Spelling Bee Solver Practice

# Load word lists
with open('big.txt') as f:
    big_file = f.read()
with open('words-1.txt') as f:
    word_file = f.read()

# Combine and clean words
words = set()
big_file_tokens = big_file.split()
word_file_tokens = word_file.split()

big_file_lc = [w.lower() for w in big_file_tokens if w.isalpha()]
word_file_lc = [w.lower() for w in word_file_tokens if w.isalpha()]

words.update(big_file_lc)
words.update(word_file_lc)

# Only keep words with 4 or more letters
words = {w for w in words if len(w) >= 4}

# Example puzzle setup
optional_letters = "soqdul"
required_letter = "i"
letters = optional_letters + required_letter
letter_set = set(letters)

# Find solutions: words using only allowed letters and containing the required letter
solutions = [
    word for word in words
    if required_letter in word and all(c in letter_set for c in word)
]

# Show solutions sorted by length
for word in sorted(solutions, key=len):
    print(word)

In [1]:
words = set()

# Read both word lists; the context managers close the files afterwards
with open('big.txt') as f:
    big_file = f.read()
with open('words-1.txt') as f:
    word_file = f.read()

In [2]:
# Fill the set of words with individual words from these two files.
# Tokens containing anything other than letters (e.g. trailing punctuation) are dropped.
# Make everything lowercase too (called "case folding")

big_file_tokens = big_file.split()
word_file_tokens = word_file.split()

big_file_lc = [word.lower() for word in big_file_tokens
              if word.isalpha()]

word_file_lc = [word.lower() for word in word_file_tokens
               if word.isalpha()]

words.update(big_file_lc)
words.update(word_file_lc)
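One consequence of splitting on whitespace and filtering with `isalpha()` is that any token with punctuation attached is discarded rather than cleaned. The sketch below illustrates this on a made-up sentence (`sample` is hypothetical, not taken from either word list) and compares it with a regex-based alternative. The regex variant is an assumption about what one might try instead, not what this notebook does.

```python
import re

# Hypothetical sample text standing in for a chunk of big.txt
sample = "The squid's ink is liquid. Solid, odious stuff!"

# Approach used in the notebook: split on whitespace, keep purely alphabetic tokens
split_tokens = {w.lower() for w in sample.split() if w.isalpha()}

# Alternative (assumption): extract every run of letters, ignoring punctuation
regex_tokens = set(re.findall(r"[a-z]+", sample.lower()))

print(sorted(split_tokens))  # words with attached punctuation are missing
print(sorted(regex_tokens))  # those words are recovered, plus fragments like 's'
```

If `big.txt` is running prose, the regex variant would likely recover more words, at the cost of fragments from contractions and possessives. The length filter in the next cell removes most of those fragments.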

Spelling Bee only accepts words of four or more letters, so we can drop anything shorter.

In [3]:
# cut down to just words four letters or longer

word_list = [word for word in words if len(word) >= 4]

words = set(word_list)

In [7]:
lst2 = sorted(word_list, key=len)
lst2

['pelt',
 'tubs',
 'ball',
 'hope',
 'poor',
 'jobs',
 'grin',
 'lure',
 'lynn',
 'bulb',
 'poll',
 'dark',
 'boys',
 'tula',
 'beau',
 'luke',
 'peak',
 'huge',
 'tory',
 'hull',
 'wild',
 'near',
 'ford',
 'fast',
 'xxiv',
 'quit',
 'rare',
 'more',
 'dose',
 'beak',
 'ages',
 'give',
 'king',
 'have',
 'fore',
 'side',
 'burr',
 'zone',
 'urea',
 'tool',
 'soap',
 'shot',
 'acme',
 'ooze',
 'twin',
 'love',
 'huts',
 'fled',
 'fare',
 'afar',
 'body',
 'deaf',
 'sale',
 'nous',
 'deah',
 'wire',
 'furs',
 'fuss',
 'swim',
 'kopf',
 'hoar',
 'vish',
 'roar',
 'real',
 'baby',
 'agra',
 'time',
 'halt',
 'bowl',
 'arch',
 'rubs',
 'icon',
 'bush',
 'rosy',
 'shod',
 'pads',
 'wink',
 'pulp',
 'sets',
 'meet',
 'judy',
 'high',
 'nuns',
 'walt',
 'nail',
 'sued',
 'sewn',
 'erza',
 'haul',
 'roof',
 'mold',
 'born',
 'deem',
 'fair',
 'date',
 'rack',
 'copy',
 'boot',
 'heel',
 'peas',
 'dive',
 'fury',
 'arts',
 'nova',
 'rags',
 'adds',
 'luck',
 'sire',
 'swan',
 'base',
 'mein',
 

In [8]:
len(words)

215728

Okay, now that we've read in the words, let's do some solving.

In [10]:
# From 2021-10-05

optional_letters = "soqdul"
required_letter = "i"
letters = optional_letters + required_letter
letter_set = set(letters)

In [11]:
letter_set

{'d', 'i', 'l', 'o', 'q', 's', 'u'}

In [12]:
# Fill in the solutions list with words that use *only* letters from our letter set
# and that include the required_letter. `words` is already a set of words with
# four or more letters, so no duplicate or length check is needed here.

solutions = [
    word for word in words
    if required_letter in word and all(letter in letter_set for letter in word)
]

In [14]:
sorted(solutions, key=len)

['slid',
 'idol',
 'suis',
 'soil',
 'lids',
 'oils',
 'quid',
 'sill',
 'ills',
 'sodio',
 'squid',
 'iodol',
 'ossis',
 'soldi',
 'quill',
 'solid',
 'dooli',
 'dosis',
 'uloid',
 'dusio',
 'dilli',
 'solio',
 'dildo',
 'soils',
 'louis',
 'dossil',
 'solidi',
 'liquid',
 'iodous',
 'odious',
 'solids',
 'solodi',
 'iodoso',
 'sissoo',
 'diiodo',
 'dulosis',
 'oidioid',
 'isidoid',
 'solidus',
 'idolous',
 'dissoul',
 'illiquid',
 'lulliloo',
 'squillid',
 'isidioid',
 'squilloid',
 'siliquous',
 'quisquous',
 'quisquilious']
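The filter above can also be written with set operations and wrapped in a function so that it is easy to rerun for a new puzzle. This is a minimal sketch rather than the notebook's own method: the `solve` function, its `min_length` parameter, and the tiny `sample_words` set are all hypothetical names introduced for illustration.

```python
def solve(words, optional_letters, required_letter, min_length=4):
    """Return words built only from the allowed letters that contain the required one."""
    allowed = set(optional_letters) | {required_letter}
    matches = (
        w for w in words
        if len(w) >= min_length
        and required_letter in w
        and set(w) <= allowed  # every letter of w is in the allowed set
    )
    # Sort by length, then alphabetically, so the order is the same on every run
    return sorted(matches, key=lambda w: (len(w), w))


# Hypothetical stand-in for the full word set loaded from the two files
sample_words = {"squid", "liquid", "solid", "idol", "sold", "lid", "quilt", "odious"}

result = solve(sample_words, "soqdul", "i")
print(result)  # ['idol', 'solid', 'squid', 'liquid', 'odious']
```

Here `sold` is rejected for lacking the center letter, `lid` for being too short, and `quilt` for using a letter outside the set. The alphabetical tiebreak matters because iteration order over a set of strings can differ between Python sessions, so sorting by length alone does not give a stable listing.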