** Anagrams **

This program reads a word dictionary from a text file and uses that dictionary to find anagrams for words.

An anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.

For example, the word 'lives' is an anagram of 'elvis'.

The file containing English words can be downloaded from GitHub at https://github.com/dwyl/english-words


In [15]:
# The open() function opens a text file and returns a file object, which we store in a variable.
# It does not read any data yet; reading happens in the next cell.
# Here the function takes two parameters:
# 1: the path to the file you wish to open, and
# 2: a flag denoting how you want to open the file. In this case, 'r' indicates that we are opening the file
#    as read-only.

words = open('words.txt', 'r')
print(words)

<_io.TextIOWrapper name='words.txt' mode='r' encoding='cp1252'>
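
The cell above leaves the file open. A common alternative is a with block, which closes the file automatically. The sketch below is only an illustration: it writes a small hypothetical sample file (sample_words.txt in a temporary folder) instead of using words.txt, and it passes an explicit encoding, which is an assumption and not something the original cell does.

```python
import os
import tempfile

# Hypothetical sample file standing in for words.txt
sample_path = os.path.join(tempfile.mkdtemp(), "sample_words.txt")
with open(sample_path, "w", encoding="utf-8") as f:
    f.write("Elvis\nlives\nexpel\n")

# The with block closes the file automatically when the block ends
with open(sample_path, "r", encoding="utf-8") as f:
    sample_lines = f.readlines()

print(sample_lines)  # ['Elvis\n', 'lives\n', 'expel\n']
print(f.closed)      # True
```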


In [16]:
# Now we need to read individual lines of the text file.  Each line contains a single word. 
# The result of the following statement is a list (an array) of individual words read from the text file
wordlist = words.readlines()

print(wordlist[0:10])

['2\n', '1080\n', '&c\n', '10-point\n', '10th\n', '11-point\n', '12-point\n', '16-point\n', '18-point\n', '1st\n']


In [17]:
# You will notice that each word is followed by '\n' - a new line character.  
# In order to be able to find anagrams, we need to do two things - (1) remove the new line character from each word
# and (2) convert each word to lower case
# NOTE: The statement below uses a Python list comprehension instead of a standard for loop.
# If you are up to the challenge, can you rewrite this comprehension as a for loop?
wordclean = [word.strip().lower() for word in wordlist]

print(wordclean[:10])

['2', '1080', '&c', '10-point', '10th', '11-point', '12-point', '16-point', '18-point', '1st']
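
For readers who want to check their answer to the challenge above, here is one possible for-loop version. It is a sketch that runs on a small hypothetical list (sample_wordlist) so that it works on its own; the names cleaned_loop and cleaned_comprehension are also made up for this illustration.

```python
# Hypothetical sample standing in for wordlist
sample_wordlist = ['Elvis\n', 'LIVES\n', '10-point\n']

# For-loop version: build the cleaned list one word at a time
cleaned_loop = []
for word in sample_wordlist:
    cleaned_loop.append(word.strip().lower())

# Comprehension version, for comparison
cleaned_comprehension = [word.strip().lower() for word in sample_wordlist]

print(cleaned_loop)                           # ['elvis', 'lives', '10-point']
print(cleaned_loop == cleaned_comprehension)  # True
```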


In [18]:
# While this particular list only contains unique words, in real life we have to be concerned with duplicates.
# The easiest way to de-dupe a list in Python is to use a 'set'.  Sets are mathematical constructs that only 
# allow unique values.  Converting a list to a set will automatically remove all duplicates.
wordunique = set(wordclean)

# Now we need to convert our set back into a list
wordunique = list(wordunique)

# NOTE:  The same thing could be done in a single statement:
# wordunique = list(set(wordclean))
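
To see the de-duplication on a small scale, the sketch below uses a hypothetical sample list (sample_words). It also shows dict.fromkeys() as an alternative that keeps the first occurrence of each word in its original order; that technique is not used elsewhere in this notebook and is included only for comparison.

```python
# Hypothetical sample with duplicates
sample_words = ['lives', 'elvis', 'lives', 'expel', 'elvis']

# A set removes duplicates but does not keep the original order
unique_unordered = list(set(sample_words))

# dict.fromkeys() removes duplicates and keeps first-seen order (Python 3.7+)
unique_ordered = list(dict.fromkeys(sample_words))

print(sorted(unique_unordered))  # ['elvis', 'expel', 'lives']
print(unique_ordered)            # ['lives', 'elvis', 'expel']
```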

In [19]:
# Converting our list to a set and back to a list created an unsorted list.  
# We need to sort the list in lexicographic order
wordunique.sort()

# NOTE: Another way to sort a list is with the sorted() function, which returns a new sorted list
# and leaves the original unchanged:
# wordunique = sorted(wordunique)

print(wordunique[:10])

['&c', "'d", "'em", "'ll", "'m", "'mid", "'midst", "'mongst", "'prentice", "'re"]


In [20]:
# Sorting a string is very similar to sorting a list.  sorted() takes the individual characters
# that compose the original string and returns them as a list in lexicographic order
sorted('lives')

['e', 'i', 'l', 's', 'v']

In [21]:
sorted('elvis')

['e', 'i', 'l', 's', 'v']
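
Because both calls return the same list of letters, comparing the two sorted results is enough to test whether two words are anagrams. A minimal sketch of that check follows; the helper name is_anagram is hypothetical and does not appear elsewhere in this notebook.

```python
def is_anagram(first, second):
    """Return True if the two words are made of exactly the same letters."""
    return sorted(first.lower()) == sorted(second.lower())

print(is_anagram('lives', 'elvis'))  # True
print(is_anagram('lives', 'live'))   # False

# join() turns the sorted list of letters back into a string
print(''.join(sorted('lives')))      # eilsv
```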

In [22]:
# 1: get input
# 2: convert input into an ordered sequence of letters
# 3: iterate through list of words
# 4: convert each word into an ordered sequence of letters
# 5: compare each word's sequence of letters with the input's sequence and print the words that match

In [35]:
user_input = input("Please input word:")

Please input word:leepx


In [36]:
sorted_user_input = sorted(user_input)

In [37]:
for word in wordunique:
    if sorted(word) == sorted_user_input:
        print(word)

expel
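
The loop above sorts every word in the dictionary each time we search. If you plan to look up many words, one possible alternative is to build an index once, keyed by each word's sorted letters, and then look up the input's sorted letters directly. The sketch below is a different technique from the loop above. It uses a small hypothetical word list (sample_words) so that it runs on its own, and the names anagram_index and find_anagrams are made up for this illustration.

```python
from collections import defaultdict

# Hypothetical sample standing in for wordunique
sample_words = ['elvis', 'evils', 'expel', 'levis', 'lives', 'veils']

# Build the index once: sorted letters -> list of words with those letters
anagram_index = defaultdict(list)
for word in sample_words:
    key = ''.join(sorted(word))
    anagram_index[key].append(word)

def find_anagrams(query):
    """Return every indexed word that uses the same letters as query."""
    key = ''.join(sorted(query.lower()))
    return anagram_index.get(key, [])

print(find_anagrams('leepx'))  # ['expel']
print(find_anagrams('lives'))  # ['elvis', 'evils', 'levis', 'lives', 'veils']
```

Note that the result includes the input word itself when it is in the list, just as the loop above would.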
