In [4]:
import string
import itertools

## Chapter 9 - Case Study: Word Play

For the exercises in this chapter we need a list of English words.

There are lots of word lists available on the Web, but the one most suitable for our purpose is one of the word lists collected and contributed to the public domain by Grady Ward as part of the Moby lexicon project (see http://wikipedia.org/wiki/Moby_Project). It is a list of 113,809 official crosswords; that is, words that are considered valid in crossword puzzles and other word games.

In [5]:
fin = open('data/words.txt')

*fin* is a common name for a file object used for input. The file object provides several methods for reading, including readline, which reads characters from the file until it gets to a newline and returns the result as a string:

In [6]:
fin.readline()

'aa\n'

The first word in this particular list is “aa”, which is a kind of lava. The sequence \n represents the newline character that separates this word from the next.

The file object keeps track of where it is in the file, so if you call readline again, you get the next word:

In [7]:
fin.readline()

'aah\n'

If it's the newline character that’s bothering you, we can get rid of it with the string method strip:

In [8]:
fin.readline().strip() #in the book this is done in two lines, this seems fine though

'aahed'

You can also use a file object as part of a for loop. This loop would print each word, one per line; I break out after the first iteration so it doesn't print the whole list, but it would keep going if I let it!

In [9]:
for line in fin:
    word = line.strip()
    print(word)
    break

aahing
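
The cells above leave the file open. Here is a minimal sketch of the same loop using a `with` block, which closes the file automatically. To keep the example self-contained, `io.StringIO` stands in for `data/words.txt`, and `read_words` is a hypothetical helper name, not something from the book.

```python
import io

def read_words(file_obj):
    """Return the stripped lines of an open file-like object as a list."""
    return [line.strip() for line in file_obj]

# StringIO stands in for open('data/words.txt'); swap in the real file to use it.
with io.StringIO("aa\naah\naahed\naahing\n") as fin:
    words = read_words(fin)

print(words)  # ['aa', 'aah', 'aahed', 'aahing']
```

Collecting the words into a list also means later cells can loop over them without reopening the file.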


### Exercises

**Exercise 9.1.** Write a program that reads *words.txt* and prints only the words with more than 20 characters (not counting whitespace).

In [10]:
def more_than_twenty(text):
    for line in text: #loop over the file object that was passed in, not the global fin
        word = line.strip()
        if len(word) > 20:
            print(word)

more_than_twenty(fin) #less than I thought tbh

counterdemonstrations
hyperaggressivenesses
microminiaturizations


**Exercise 9.2.** In 1939 Ernest Vincent Wright published a 50,000 word novel called Gadsby that *does not contain the letter “e”*. Since “e” is the most common letter in English, that’s not easy to do. In fact, it is difficult to construct a solitary thought without using that most common symbol. It is slow going at first, but with caution and hours of training you can gradually gain facility.


All right, I’ll stop now. Write a function called has_no_e that returns True if the given word doesn’t have the letter “e” in it.


Modify your program from the previous section to print only the words that have no “e” and compute the percentage of the words in the list that have no “e”.

In [11]:
def has_no_e(word):
    return "e" not in word #True when the word has no "e", False otherwise
    
def words_without_e(word_list):
    no_e_words = 0
    for word in word_list:
        if has_no_e(word):
            print(word)
            no_e_words += 1
    return 100 * (no_e_words/len(word_list))

words_without_e(['Feris', 'Cam', 'Sloane'])

Cam


33.33333333333333
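
As a compact alternative, the percentage can probably be computed without the explicit counter. This is only a sketch: `percent_without_e` is a hypothetical name, the sample words are made up for illustration, and this version lowercases the word first (an assumption the original does not make) so that a capital "E" also counts.

```python
def has_no_e(word):
    """Return True if word contains no letter 'e' in either case."""
    return "e" not in word.lower()

def percent_without_e(words):
    """Return the percentage of words that contain no 'e' (0.0 for an empty list)."""
    words = list(words)
    if not words:
        return 0.0
    # True counts as 1 when summed, so this counts the words without an 'e'.
    return 100 * sum(has_no_e(w) for w in words) / len(words)

print(percent_without_e(["lava", "crossword", "lexicon", "puzzle"]))  # 50.0
```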

**Exercise 9.3.** Write a function named avoids that takes a word and a string of forbidden letters, and that returns True if the word doesn’t use any of the forbidden letters. 


Modify your program to prompt the user to enter a string of forbidden letters and then print the number of words that don’t contain any of them. Can you find a combination of 5 forbidden letters that excludes the smallest number of words?

In [13]:
def avoids(word, forbidden):
    for letter in word:
        if letter in forbidden:
            return False
    return True

avoids('lobster', 'mayo')
avoids('lobster', 'may0')

def user_forbids(text, user_input=True, forbidden='e'):
    if user_input:
        forbidden = input("Enter a list of forbidden letters: \n")
    good_words = 0 #number of words that don't contain the forbidden letters
    with open(text) as fin: #the with block closes the file when the loop finishes
        for word in fin:
            word = word.strip()
            if avoids(word, forbidden):
                good_words +=1
    return good_words

#user_forbids('words.txt') #this works 22717 with user input ebg
user_forbids('data/words.txt', user_input=False, forbidden='ebg')

22717

In [16]:
fewest_remaining = 150000 #starts above the 113,809 words in the list
best_set = ''

all_letters = string.ascii_lowercase

subsets = list(itertools.combinations(all_letters, 5))

#user_forbids returns the number of words that AVOID the forbidden letters,
#so minimizing it finds the set that excludes the MOST words.
#To answer the exercise (fewest words excluded), track the maximum instead.
for sub in subsets[::-1]:
    forbidden_letters = ''.join(sub)
    remaining = user_forbids('data/words.txt', user_input=False, forbidden=forbidden_letters)
    if remaining < fewest_remaining:
        fewest_remaining = remaining
        best_set = sub
        
#1679
#('e', 'i', 'o', 's', 'u')

KeyboardInterrupt: 
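
The search above reopens the file for each of the 65,780 five-letter combinations, which is probably why it had to be interrupted. The sketch below reads the words once, stores each as a set of letters, and looks for the combination that excludes the fewest words, which is what the exercise asks for. `least_excluding` and the sample list are hypothetical. The full search is still a lot of work in pure Python, so it may still take a long time on the real word list.

```python
import itertools
import string

def least_excluding(words, k=5, alphabet=string.ascii_lowercase):
    """Return (letters, n_excluded) for the k-letter set that excludes the fewest words."""
    letter_sets = [set(word) for word in words]  # computed once, reused for every combination
    best_combo, best_excluded = None, len(letter_sets) + 1
    for combo in itertools.combinations(alphabet, k):
        forbidden = set(combo)
        # A word is excluded if it shares at least one letter with the forbidden set.
        excluded = sum(1 for letters in letter_sets if letters & forbidden)
        if excluded < best_excluded:
            best_combo, best_excluded = ''.join(combo), excluded
    return best_combo, best_excluded

# Tiny sample and a reduced alphabet so the example runs instantly.
sample = ['aa', 'aah', 'aahed', 'aahing', 'lava']
print(least_excluding(sample, k=2, alphabet='aehqz'))  # ('qz', 0)
```

With the real data, `words` would be the stripped lines of `data/words.txt` and the defaults (`k=5`, the full lowercase alphabet) would apply.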

**Exercise 9.4.** Write a function named uses_only that takes a word and a string of letters, and that returns True if the word contains only letters in the list. Can you make a sentence using only the letters acefhlo? Other than “Hoe alfalfa?”

In [17]:
def uses_only(word, use_letters):
    for letter in word.lower():
        if letter not in use_letters:
            return False
    return True

uses_only('Hello', 'hello')

True

In [18]:
fin = open('data/words.txt')
hoe_alfalfa = []
for line in fin:
    word = line.strip()
    if uses_only(word, 'acefhlo'):
        hoe_alfalfa.append(word)
        
len(hoe_alfalfa) #that's a lot of words that contain only acefhlo, so I think we could figure out another sentence

188

**Exercise 9.5.** Write a function named uses_all that takes a word and a string of required letters, and that returns True if the word uses all the required letters at least once. How many words are there that use all the vowels aeiou? How about aeiouy?

In [19]:
def uses_all(word, req_letters):
    for letter in req_letters:
        if letter not in word: #a required letter is missing
            return False
    return True
        
fin = open('data/words.txt')

aeiou_count = 0
aeiouy_count = 0

for line in fin:
    word = line.strip()
    if uses_all(word, 'aeiou'):
        aeiou_count += 1
    if uses_all(word,'aeiouy'):
        aeiouy_count += 1
        
print(aeiou_count,aeiouy_count) #pretty big gap between the two

598 42


**Exercise 9.6.** Write a function called is_abecedarian that returns True if the letters in a word appear in alphabetical order (double letters are ok). How many abecedarian words are there?

In [20]:
def is_abecedarian(word):
    for i in range(len(word)-1): #compare each letter with the one after it
        if word[i] > word[i+1]:
            return False
    return True

fin = open('data/words.txt')

abecedarian_word = 0

for line in fin:
    word = line.strip()
    if is_abecedarian(word):
        abecedarian_word += 1
        
abecedarian_word

596
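
The same check can also be written by pairing each letter with its successor. This is just one possible sketch of an equivalent test, not the book's solution.

```python
def is_abecedarian(word):
    """Return True if the letters of word are in non-decreasing alphabetical order."""
    # zip pairs each letter with the next one; an empty or one-letter word has no pairs,
    # so all() returns True for it.
    return all(a <= b for a, b in zip(word, word[1:]))

print(is_abecedarian('almost'))  # True
print(is_abecedarian('zebra'))   # False
```

Using `<=` rather than `<` is what allows double letters, as the exercise requires.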

If you were really thinking like a computer scientist, you would have recognized that uses_all was an instance of a previously solved problem, and you would have written:

In [21]:
def uses_all(word, required):
    return uses_only(required, word)

This is an example of a program development plan called **reduction to a previously solved problem**, which means that you recognize the problem you are working on as an instance of a solved problem and apply an existing solution.
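
To see the reduction at work, here is a self-contained sketch that puts the two functions side by side and checks them by hand. The example words are my own choice for illustration.

```python
def uses_only(word, available):
    """Return True if every letter of word appears in available."""
    for letter in word:
        if letter not in available:
            return False
    return True

def uses_all(word, required):
    """Return True if word contains every letter in required."""
    # "word uses all of required" is the same as "required uses only letters of word",
    # so the arguments are simply swapped.
    return uses_only(required, word)

print(uses_all('sequoia', 'aeiou'))  # True
print(uses_all('lobster', 'aeiou'))  # False
```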

**Debugging**

Testing programs is hard. The functions in this chapter are relatively easy to test because you can check the results by hand. Even so, it is somewhere between difficult and impossible to choose a set of words that test for all possible errors.

In addition to the test cases you generate, you can also test your program with a word list like words.txt. By scanning the output, you might be able to catch errors, but be careful: you might catch one kind of error (words that should not be included, but are) and not another (words that should be included, but aren’t).
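
One way to make the hand checks repeatable is to write them as assertions, covering both kinds of error along with a few special cases such as the empty string. This is a sketch only: `check_avoids` is a hypothetical helper, and `avoids` is redefined here so the block runs on its own.

```python
def avoids(word, forbidden):
    """Return True if word uses none of the forbidden letters."""
    return not any(letter in forbidden for letter in word)

def check_avoids():
    """Run a few hand-checked cases; raises AssertionError on the first failure."""
    assert avoids('lobster', 'mayo') is False   # should be rejected: shares the letter 'o'
    assert avoids('lobster', 'may0') is True    # should be accepted: '0' is a digit, not 'o'
    assert avoids('', 'abc') is True            # special case: empty word
    assert avoids('abc', '') is True            # special case: nothing is forbidden
    return True

print(check_avoids())  # True
```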

### Glossary

**file object:** A value that represents an open file.


**reduction to a previously solved problem:** A way of solving a problem by expressing it as an instance of a previously solved problem.


**special case:** A test case that is atypical or non-obvious (and less likely to be handled correctly).

### Exercises

**Exercise 9.7.** This question is based on a Puzzler that was broadcast on the radio program Car Talk (http://www.cartalk.com/content/puzzlers):

Give me a word with three consecutive double letters. I’ll give you a couple of words that almost qualify, but don’t. For example, the word committee, c-o-m-m-i-t-t-e-e. It would be great except for the ‘i’ that sneaks in there. Or Mississippi: M-i-s-s-i-s-s-i-p-p-i. If you could take out those i’s it would work. But there is a word that has three consecutive pairs of letters and to the best of my knowledge this may be the only word. Of course there are probably 500 more but I can only think of one. What is the word?


Write a program to find it.

In [22]:
def has_3_consecutive_doubles(word):
    conc_count = 0
    i = 0
    while i < len(word)-1:
        if word[i] == word[i+1]:
            i += 2
            conc_count +=1
            if conc_count == 3:
                return True
        else:
            conc_count = 0
            i += 1
    return False

def find_3_consecutive_doubles():
    fin = open('data/words.txt')
    for line in fin:
        word = line.strip()
        if has_3_consecutive_doubles(word):
            print(word)

In [23]:
find_3_consecutive_doubles()

bookkeeper
bookkeepers
bookkeeping
bookkeepings
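
For comparison, the same test can be sketched with a regular expression that uses backreferences. This is an alternative to the index-walking loop above, not the book's approach, and the name `TRIPLE_DOUBLE` is my own.

```python
import re

# (.)\1 matches any character followed by itself; three such pairs must be adjacent.
TRIPLE_DOUBLE = re.compile(r'(.)\1(.)\2(.)\3')

def has_3_consecutive_doubles(word):
    """Return True if word contains three double letters in a row."""
    return TRIPLE_DOUBLE.search(word) is not None

print(has_3_consecutive_doubles('bookkeeper'))   # True
print(has_3_consecutive_doubles('committee'))    # False
print(has_3_consecutive_doubles('mississippi'))  # False
```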


**Exercise 9.8.** Here’s another Car Talk Puzzler:

“I was driving on the highway the other day and I happened to notice my odometer. Like most odometers, it shows six digits, in whole miles only. So, if my car had 300,000 miles, for example, I’d see 3-0-0-0-0-0. 

“Now, what I saw that day was very interesting. I noticed that the last 4 digits were palindromic; that is, they read the same forward as backward. For example, 5-4-4-5 is a palindrome, so my odometer could have read 3-1-5-4-4-5. 

“One mile later, the last 5 numbers were palindromic. For example, it could have read 3-6-5-4-5-6. One mile after that, the middle 4 out of 6 numbers were palindromic. And you ready for this? One mile later, all 6 were palindromic! 

“The question is, what was on the odometer when I first looked?”

In [24]:
def is_reverse(word1, word2):
    if len(word1) != len(word2):
        return False
    i = 0
    j = len(word2)-1 #fixed
    while j >= 0: #fixed
        if word1[i] != word2[j]:
            return False
        i = i+1
        j = j-1
    return True

def is_palindrome(word):
    return is_reverse(word, word)

I worked on this for a while and finally took a peek at the solution. It ignores readings that start with 0, which makes the problem much easier. I'm going to solve it that way for now, and hopefully I'll come back eventually and handle the leading-0 case.

In [25]:
def palin_odometer():
    for odometer in range(100000, 1000000): #six-digit readings that do not begin with a 0
        if (is_palindrome(str(odometer)[2:]) and  is_palindrome(str(odometer+1)[1:]) and
            is_palindrome(str(odometer+2)[1:5]) and is_palindrome(str(odometer+3))):
            print(odometer)

In [26]:
print('The following are the possible odometer readings (that do not begin with a 0):')
palin_odometer() #hurrah!

The following are the possible odometer readings (that do not begin with a 0):
198888
199999
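
A possible way to cover readings that begin with 0 is to pad every reading to six digits with `zfill` before slicing. This is an untested-on-paper sketch of that idea: `odometer_candidates` is a hypothetical name, and `is_palindrome` is redefined with a slice so the block runs on its own.

```python
def is_palindrome(text):
    """Return True if text reads the same forward and backward."""
    return text == text[::-1]

def odometer_candidates():
    """Return every starting reading, leading zeros included, that fits the puzzle."""
    found = []
    for n in range(0, 999997):  # n + 3 must still fit in six digits
        # Pad the four consecutive readings to six digits so the slices line up.
        a, b, c, d = (str(n + k).zfill(6) for k in range(4))
        if (is_palindrome(a[2:]) and is_palindrome(b[1:])
                and is_palindrome(c[1:5]) and is_palindrome(d)):
            found.append(a)
    return found

candidates = odometer_candidates()
print(candidates)
```

The readings are returned as strings so that any leading zeros stay visible.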


**Exercise 9.9.** Here’s another Car Talk Puzzler you can solve with a search.

“Recently I had a visit with my mom and we realized that the two digits that make up my age when reversed resulted in her age.  For example, if she’s 73, I’m 37. We wondered how often this has happened over the years but we got sidetracked with other topics and we never came up with an answer.


“When I got home I figured out that the digits of our ages have been reversible six times so far. I also figured out that if we’re lucky it would happen again in a few years, and if we’re really lucky it would happen one more time after that. In other words, it would have happened 8 times over all. So the question is, how old am I now?”


Write a Python program that searches for solutions to this Puzzler. Hint: you might find the string method zfill useful.

In [27]:
def reverse_ages():
    solutions = [] #ages at the 6th reversal, where two more reversals fit before age 100
    for mom_age in range(18,100): #mom's age when the son is born, from 18 to 99
        for i,j in zip(range(mom_age,100), range(0,(100-mom_age))): #i is mom's age, j is the son's age
            if is_reverse(str(i),str(j).zfill(2)): #first time the ages are reversed
                if i+77 > 100: #assuming reversals repeat every 11 years, the 8th is 77 years later
                    break #mom would be too old by the 8th reversal
                else:
                    solutions.append((i+55,j+55)) #the 6th reversal is 55 years after the first
                    break
    return solutions

In [28]:
reverse_ages()

[(75, 57)]

In [29]:
from __future__ import print_function, division


def str_fill(i, n):
    """Returns i as a string with at least n digits.

    i: int
    n: int length

    returns: string
    """
    return str(i).zfill(n)


def are_reversed(i, j):
    """Checks if i and j are the reverse of each other.

    i: int
    j: int

    returns:bool
    """
    return str_fill(i, 2) == str_fill(j, 2)[::-1]


def num_instances(diff, flag=False):
    """Counts the number of palindromic ages.

    Returns the number of times the mother and daughter have
    palindromic ages in their lives, given the difference in age.

    diff: int difference in ages
    flag: bool, if True, prints the details
    """
    daughter = 0
    count = 0
    while True:
        mother = daughter + diff

        # assuming that mother and daughter don't have the same birthday,
        # they have two chances per year to have palindromic ages.
        if are_reversed(daughter, mother) or are_reversed(daughter, mother+1):
            count = count + 1
            if flag:
                print(daughter, mother)
        if mother > 120:
            break
        daughter = daughter + 1
    return count
    

def check_diffs():
    """Finds age differences that satisfy the problem.

    Enumerates the possible differences in age between mother
    and daughter, and for each difference, counts the number of times
    over their lives they will have ages that are the reverse of
    each other.
    """
    diff = 10
    while diff < 70:
        n = num_instances(diff)
        if n > 0:
            print(diff, n)
        diff = diff + 1

print('diff  #instances')
check_diffs()

print()
print('daughter  mother')
num_instances(18, True)

diff  #instances
17 8
18 8
26 7
27 7
35 6
36 6
44 5
45 5
53 4
54 4
62 3
63 3

daughter  mother
2 20
13 31
24 42
35 53
46 64
57 75
68 86
79 97


8

**I apparently need to break up my functions into smaller chunks.**