** Anagrams **

This program reads a word dictionary from a text file and uses that dictionary to find anagrams for words.

An anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.

For example, the word 'lives' is an anagram of 'elvis'.

The file containing English words can be downloaded from GitHub at https://github.com/dwyl/english-words


In [1]:
# The open() function opens a text file and returns a file object. Nothing is read from the file yet.
# Here it is called with two arguments:
# 1: the path to the file you wish to read, and
# 2: a flag denoting how you want to open the file.  In this case, 'r' indicates that we are opening the file as
#    read-only
words = open('words.txt', 'r')
print(words)

<_io.TextIOWrapper name='words.txt' mode='r' encoding='UTF-8'>
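
The cell above leaves the file open until it is closed explicitly. A minimal sketch of the same kind of read inside a `with` block, which closes the file automatically. The temporary file, its three sample words, and the names `sample_path`, `sample_words` and `sample_lines` are stand-ins invented for this example so that it runs without words.txt.

```python
import os
import tempfile

# Stand-in for words.txt (assumption: one word per line, as in the real file).
sample_path = os.path.join(tempfile.mkdtemp(), 'words_sample.txt')
with open(sample_path, 'w', encoding='utf-8') as f:
    f.write('Elvis\nlives\npython\n')

# The with block closes the file as soon as the indented code finishes.
with open(sample_path, 'r', encoding='utf-8') as sample_words:
    sample_lines = sample_words.readlines()

print(sample_lines)         # ['Elvis\n', 'lives\n', 'python\n']
print(sample_words.closed)  # True
```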


In [2]:
# Now we need to read individual lines of the text file.  Each line contains a single word. 
# The result of the following statement is a list (an array) of individual words read from the text file
wordlist = words.readlines()

print(wordlist[0:10])

['2\n', '1080\n', '&c\n', '10-point\n', '10th\n', '11-point\n', '12-point\n', '16-point\n', '18-point\n', '1st\n']


In [3]:
# You will notice that each word is followed by '\n', a newline character.
# In order to find anagrams, we need to do two things: (1) remove the newline character from each word
# and (2) convert each word to lowercase.
# NOTE: The statement below uses a Python list comprehension instead of a standard for loop.
# If you are up to the challenge, can you rewrite this comprehension as a for loop?
wordclean = [word.strip().lower() for word in wordlist]

print(wordclean[:10])

['2', '1080', '&c', '10-point', '10th', '11-point', '12-point', '16-point', '18-point', '1st']
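
One possible answer to the challenge in the cell above, as a minimal sketch. `sample_wordlist` is a made-up stand-in for `wordlist` so that the snippet runs on its own.

```python
# Small stand-in for wordlist (assumption: the real list comes from readlines()).
sample_wordlist = ['2\n', '&c\n', '10-point\n', 'Elvis\n']

# Same result as: [word.strip().lower() for word in sample_wordlist]
sample_clean = []
for word in sample_wordlist:
    # strip() removes leading and trailing whitespace, including '\n'.
    sample_clean.append(word.strip().lower())

print(sample_clean)  # ['2', '&c', '10-point', 'elvis']
```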


In [4]:
# While this particular list only contains unique words, in real life we have to be concerned with duplicates.
# The easiest way to de-dupe a list in Python is to use a 'set'.  Sets are mathematical constructs that only 
# allow unique values.  Converting a list to a set will automatically remove all duplicates.
wordunique = set(wordclean)

# Now we need to convert our set back into a list
wordunique = list(wordunique)

# NOTE:  The same thing could be done in a single statement:
# wordunique = list(set(wordclean))
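
A set does not remember the order in which words were added. If the original order matters, one alternative (not the approach used in this notebook) is `dict.fromkeys()`, which keeps the first occurrence of each item in order on Python 3.7 and later. A small sketch with a made-up list, `sample_clean`:

```python
sample_clean = ['python', 'elvis', 'python', 'lives', 'elvis']

# set() drops duplicates but makes no promise about order.
print(len(set(sample_clean)))  # 3

# dict.fromkeys() keeps the first occurrence of each word, in order.
deduped_in_order = list(dict.fromkeys(sample_clean))
print(deduped_in_order)  # ['python', 'elvis', 'lives']

# sorted() accepts a set directly and returns a sorted list.
deduped_sorted = sorted(set(sample_clean))
print(deduped_sorted)  # ['elvis', 'lives', 'python']
```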

In [5]:
# Converting our list to a set and back to a list created an unsorted list.
# We need to sort the list in lexicographic order
wordunique.sort()

# NOTE: Another way to sort a list is with the sorted() function, which returns a new
# sorted list and leaves the original unchanged:
# sorted(wordunique)

print(wordunique[:10])

['&c', "'d", "'em", "'ll", "'m", "'mid", "'midst", "'mongst", "'prentice", "'re"]
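
The difference between `list.sort()` and `sorted()` can be seen on a small made-up list (`sample` is not part of the notebook's data). Strings are compared character by character by code point, which is why entries starting with `&` or `'` come before the letters.

```python
sample = ['python', "'em", 'elvis', '&c']

# sorted() returns a new list and leaves the original untouched.
ordered_copy = sorted(sample)
print(sample)        # ['python', "'em", 'elvis', '&c']
print(ordered_copy)  # ['&c', "'em", 'elvis', 'python']

# list.sort() sorts in place and returns None.
result = sample.sort()
print(result)        # None
print(sample)        # ['&c', "'em", 'elvis', 'python']
```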


In [6]:
# Sorting a string is very similar to sorting a list.  Python takes the individual characters
# that compose the original string and returns them as a list in lexicographic order
sorted('lives')

['e', 'i', 'l', 's', 'v']

In [7]:
sorted_keyword = sorted('elvis')
print(sorted_keyword)

['e', 'i', 'l', 's', 'v']


In [8]:
sorted_string = ''.join(sorted_keyword)
print(sorted_string)

eilsv
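
The last three cells can be folded into one small helper. This is only a sketch: the name `letter_signature` is made up for this example and is not used elsewhere in the notebook.

```python
def letter_signature(word):
    """Return the letters of word, lowercased and sorted, as one string."""
    return ''.join(sorted(word.lower()))

print(letter_signature('elvis'))  # eilsv
print(letter_signature('Lives'))  # eilsv

# Two words are anagrams of each other when their signatures match.
print(letter_signature('elvis') == letter_signature('lives'))  # True
```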


In [9]:
# Prompt the user for a word,
# convert that input to lowercase for the comparison later on,
# sort its letters,
# and join the sorted letters back into a string.
user_input = input('Please enter a word: \n')
user_input_m = user_input.lower()
user_input_m_1 = sorted(user_input_m)
user_input_m_s = ''.join(user_input_m_1)

Please enter a word: 
python


In [10]:
# Create a counter to count the number of anagrams (if any).
# Loop through every word in the de-duplicated word list,
# compare the sorted letters of the user input with the sorted letters of each word,
# and print both the sorted and unsorted version of every match.
# NOTE: the input word itself is reported if it is in the dictionary.
counter = 0
for word in wordunique:
    sorted_strings = ''.join(sorted(word))

    if user_input_m_s == sorted_strings:
        counter += 1
        print('Anagram ' + str(counter) + ' ' + sorted_strings)
        print('Unsorted' + ': ' + word)
        print(' ')

if counter == 0:
    print('No anagrams found')

Anagram 1 hnopty
Unsorted: phyton
 
Anagram 2 hnopty
Unsorted: python
 
Anagram 3 hnopty
Unsorted: typhon
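
The loop above re-sorts every dictionary word for each lookup. If many words need to be checked, one alternative is to build a dictionary once that maps each sorted-letter string to the words sharing it. The sketch below is an assumption-laden illustration, not the notebook's method: `sample_words`, `anagram_index` and `find_anagrams` are made-up names, and the short list stands in for `wordunique`.

```python
from collections import defaultdict

# Small stand-in for wordunique (assumption: the real list has far more words).
sample_words = ['elvis', 'lives', 'levis', 'phyton', 'python', 'typhon', 'stone']

# Map each sorted-letter signature to every word that shares it.
anagram_index = defaultdict(list)
for word in sample_words:
    anagram_index[''.join(sorted(word))].append(word)

def find_anagrams(word):
    """Return the indexed words that use exactly the same letters as word."""
    return anagram_index.get(''.join(sorted(word.lower())), [])

print(find_anagrams('Python'))  # ['phyton', 'python', 'typhon']
print(find_anagrams('notes'))   # ['stone']
print(find_anagrams('zebra'))   # []
```

Building the index costs one pass over the word list; after that, each lookup sorts only the input word.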
 

