<a href="https://colab.research.google.com/github/LukeANewton/word-generator/blob/master/word_generator.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Word generator for Scrabble, Words with Friends, etc.

This notebook contains a script that can be used to generate valid words from a sequence of characters.

## Step 1: Install dependencies

This notebook uses pyenchant, a spell-checking library, to check whether a character sequence is a valid English word.

In [1]:
!pip install pyenchant
!apt-get install python-enchant

Collecting pyenchant
  Downloading https://files.pythonhosted.org/packages/c9/66/9fe32edef9c56d9397ea7ab5853bc96082cda2770d3437ea0656758fd6d4/pyenchant-3.0.1-py3-none-any.whl (56kB)
Installing collected packages: pyenchant
Successfully installed pyenchant-3.0.1
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following additional packages will be installed:
  aspell aspell-en dictionaries-common emacsen-common enchant hunspell-en-us
  libaspell15 libenchant1c2a libhunspell-1.6-0 libtext-iconv-perl
Suggested packages:
  aspell-doc s

## Step 2: Import required libraries and load the English dictionary

### Import statements:
* enchant: the spell-checking library provided by pyenchant, used to check whether words are valid
* copy: used to take a snapshot of a set before modifying it, since Python variables refer to the same object rather than holding independent copies
* permutations: a function from itertools that generates all the permutations of a collection of elements
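
As a quick illustration of the two standard-library helpers listed above, the following sketch shows what `permutations` yields for a short string and why a copy is taken before a set is modified. The sample strings and variable names are made up for the example and do not appear in the notebook.

```python
import copy
from itertools import permutations

# permutations yields ordered arrangements; no element is used twice in one arrangement
pairs = [''.join(p) for p in permutations("abc", 2)]
print(pairs)  # ['ab', 'ac', 'ba', 'bc', 'ca', 'cb']

# copy.copy creates a separate set, so later additions to the original do not show up in it
original = {"ab", "ba"}
snapshot = copy.copy(original)
original.add("abc")
print(sorted(snapshot))  # ['ab', 'ba']
```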

### Dictionaries used:
* US English dictionary (`en_US`)



In [0]:
import enchant
import copy
from itertools import permutations
d = enchant.Dict("en_US")

## Step 3: Function to generate possible letter combinations

The approach used here is simple. We generate all permutations of the provided sequence of characters, up to and including permutations of length upper_bound. For wildcards, we generate the letter sequences that the wildcards could stand for (up to one letter per wildcard) and insert each sequence at successive positions in every candidate string.

The result of this function is a very large set of possible character combinations for the input sequence. The function does not provide valid words, only candidate character combinations.
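
To get a feel for how quickly the candidate set grows, the sketch below counts the ordered arrangements of length 1 up to upper_bound that can be drawn from n distinct tiles. The helper name `count_arrangements` is hypothetical, and the count is an upper bound: repeated tiles (such as two wildcards) produce duplicate strings that a set collapses, and wildcard expansion adds further candidates on top.

```python
from math import factorial

def count_arrangements(n, upper_bound):
    """Count ordered selections of length 1..upper_bound from n distinct tiles."""
    return sum(factorial(n) // factorial(n - k) for k in range(1, upper_bound + 1))

# number of arrangements when every tile may be used, for racks of 2 to 7 tiles
for n in range(2, 8):
    print(n, count_arrangements(n, n))
```

For a rack of six distinct tiles this gives 1,956 arrangements before any wildcard letters are filled in, which is why the filtering step in the next section matters.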

In [0]:
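def getCombinataions(characters, upper_bound):
  # start with the single letters (a lone wildcard is not a candidate)
  words = set(characters.replace('*', ''))
  # add every ordering of 2..upper_bound characters; '*' is kept as a literal here
  for length in range(2, upper_bound + 1):
    for tup in permutations(characters, length):
      words.add(''.join(tup))
  if '*' in characters:
    # snapshot the set so it can be iterated while new strings are added to it
    temp = copy.copy(words)
    # sequences of distinct letters the wildcards could stand for, up to one per wildcard
    to_insert = getCombinataions("abcdefghijklmnopqrstuvwxyz", characters.count('*'))
    for insert in to_insert:
      for word in temp:
        # insert the letters before each character of the word (positions 0..len(word)-1)
        for i in range(len(word)):
          words.add(word[:i] + insert + word[i:])
  return words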
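
The function above inserts the wildcard letters as one contiguous block of distinct letters and never after the last character of a candidate. As a point of comparison, here is a minimal sketch of a different technique: keep each `*` where the permutation put it and substitute every wildcard independently using `itertools.product`. The names `expand_wildcards` and `get_candidates` and the `alphabet` parameter are hypothetical, and this is not the notebook's method, so its results will differ from the output shown later.

```python
from itertools import permutations, product
from string import ascii_lowercase

def expand_wildcards(pattern, alphabet=ascii_lowercase):
    """Return every string formed by replacing each '*' in pattern with a letter."""
    slots = pattern.count('*')
    if slots == 0:
        return {pattern}
    results = set()
    # product allows the same letter to fill more than one wildcard
    for letters in product(alphabet, repeat=slots):
        fill = iter(letters)
        results.add(''.join(next(fill) if ch == '*' else ch for ch in pattern))
    return results

def get_candidates(characters, upper_bound, alphabet=ascii_lowercase):
    """Collect candidate strings of length 1..upper_bound, wildcards substituted in place."""
    candidates = set()
    for length in range(1, upper_bound + 1):
        # set() removes duplicate tuples caused by repeated tiles
        for tup in set(permutations(characters, length)):
            candidates |= expand_wildcards(''.join(tup), alphabet)
    return candidates

# tiny demonstration with a two-letter alphabet to keep the output readable
print(sorted(get_candidates("a*", 2, alphabet="ab")))
```

Because each wildcard stays in the slot the permutation gave it, this variant can place a wildcard at the end of a word and can use the same letter for two wildcards, at the cost of one substitution pass per permutation.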

## Step 4: The script to get valid words and print them

This script performs the actual functionality of the notebook. The character sequence is provided as a string in the variable 'characters' and is used to get all possible character combinations.

After this, the combinations that are not valid words are filtered out, and the valid words are grouped by length and sorted alphabetically within each group so they can be displayed clearly.
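
The filter-and-group step described above can be sketched in isolation. To keep the example self-contained, it uses a small made-up word set in place of the pyenchant dictionary; the helper `group_valid_words`, its `is_word` parameter, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict

def group_valid_words(candidates, is_word):
    """Keep candidates accepted by is_word, grouped by length and sorted alphabetically."""
    grouped = defaultdict(list)
    for candidate in candidates:
        if is_word(candidate):
            grouped[len(candidate)].append(candidate)
    # order the groups by length and alphabetize the words inside each group
    return {length: sorted(found) for length, found in sorted(grouped.items())}

# stand-in lexicon and candidates (hypothetical data, not taken from the dictionary)
toy_lexicon = {"he", "oh", "hoe", "hole", "hello"}
candidates = {"he", "eh", "oh", "hoe", "ohe", "hole", "leho"}

grouped = group_valid_words(candidates, toy_lexicon.__contains__)
for length, found in grouped.items():
    print('words of length ' + str(length) + ':')
    for word in found:
        print('    ' + word)
```

In the notebook itself, the bound method `d.check` from the dictionary loaded in Step 2 plays the role of `is_word`.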

In [14]:
# '*' stands for a wildcard character
characters = "helo**"

# get all possible letter combinations
words = getCombinataions(characters, len(characters))

# initialize a dictionary that groups words by length
valid_words = {}
for i in range(1, len(characters) + 1):
  valid_words[i] = []

# keep only the character sequences that are valid words, grouped by length
for word in words:
  if d.check(word):
    valid_words[len(word)].append(word)

# alphabetize each list of valid words
for word_list in valid_words.values():
  word_list.sort()

# print the words of each length, skipping single letters and empty groups
for i in range(2, len(characters) + 1):
  if valid_words[i]:
    print('words of length ' + str(i) + ':')
    for word in valid_words[i]:
      print('    ' + word)
    print()

words of length 2:
    ah
    be
    bl
    ch
    cl
    co
    do
    eh
    fl
    go
    he
    ho
    kl
    ll
    lo
    me
    ml
    mo
    no
    oh
    pl
    re
    sh
    so
    to
    uh
    we
    ye
    yo

words of length 3:
    ace
    ado
    age
    ago
    ail
    ale
    all
    ape
    are
    ash
    ate
    ave
    awe
    awl
    aye
    bah
    bee
    bio
    boo
    bro
    bye
    cal
    col
    coo
    cpl
    cue
    dbl
    die
    doe
    due
    duh
    duo
    dye
    eel
    ego
    eke
    ell
    emo
    ere
    eve
    ewe
    eye
    fee
    fie
    foe
    fol
    foo
    fro
    gal
    gee
    gel
    goo
    hie
    hoe
    hue
    huh
    ice
    ill
    ire
    isl
    kph
    lee
    lie
    loo
    lye
    meh
    mil
    moo
    mph
    nae
    nah
    nee
    nil
    nth
    och
    ode
    oho
    oil
    ole
    one
    ooh
    ope
    ore
    owe
    owl
    pah
    pal
    pee
    pie
    pol
    poo
    pro
    quo
    rah
    re