# Word Exercises

## Downloading a Word List

These exercises all require a list of words. Let's download one from the internet.

In [1]:
import requests

In [2]:
WORD_FILE = 'https://www.gutenberg.org/files/3201/files/COMMON.TXT'

In [3]:
def get_words(url: str) -> set[str]:
    """Downloads a word file found at <url> and creates a set for quick lookups"""
    response = requests.get(url)
    return {word for word in response.text.split('\r\n')}

In [4]:
words = get_words(WORD_FILE)
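
Splitting on `'\r\n'` can leave an empty string in the set, for example when the text ends with a line terminator (an `''` entry does appear in later results). A minimal sketch of a parsing step that drops blank lines, shown on an in-memory sample rather than the real download. The name `parse_words` and the sample text are assumptions for illustration, not part of the original notebook.

```python
def parse_words(text: str) -> set[str]:
    """Split raw text into a set of non-empty lines (hypothetical helper)."""
    # str.splitlines() handles '\r\n', '\n', and '\r' line endings alike.
    return {line for line in text.splitlines() if line}


sample = 'aa\r\naah\r\n\r\naardvark\r\n'  # made-up sample, not the real file

print('' in set(sample.split('\r\n')))  # True: plain split keeps an empty entry
print(sorted(parse_words(sample)))      # ['aa', 'aah', 'aardvark']
```

Swapping this into `get_words` would change the size of `words` by the number of blank entries, so percentages computed later would shift slightly.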

## General Framework Functions

Most of these exercises involve finding the words that meet some criterion, so let's write a general function that filters our word list with a boolean function (a predicate). We should also write a function that cleans up words for comparison without altering the word list itself.

In [5]:
from collections.abc import Callable

def find_certain_words(words: set[str], func: Callable[[str], bool]) -> set[str]:
    """Returns all members of words for which func returns True"""
    return {word for word in words if func(word)}

In [6]:
def alpha(word: str) -> str:
    """Clean up words for comparisons by stripping case and non-alphabetic characters"""
    return ''.join([character.casefold() for character in word if character.isalpha()])
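
To see what the cleaning step does, here is a small sketch that applies the same rule to one entry from the word list. The function is redefined so the snippet runs on its own; the variable name `entry` is only for illustration.

```python
def alpha(word: str) -> str:
    """Same cleaning rule as above: keep letters only, case-folded."""
    return ''.join(ch.casefold() for ch in word if ch.isalpha())


entry = 'Bose-Einstein statistics'

print(len(entry))         # 24: counts the hyphen and the space
print(alpha(entry))       # boseeinsteinstatistics
print(len(alpha(entry)))  # 22: letters only
print(entry)              # the original entry is unchanged
```

Because `alpha` returns a new string, the filter can compare cleaned words while the result set still holds the original entries.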

## Long Words

- Write a program that reads the words file and prints only the words with more than 20 characters (not counting whitespace).

In [7]:
find_certain_words(words, lambda word: len(alpha(word)) > 20)

{'American Expeditionary Forces',
 'American Revised Version',
 'American Standard Version',
 'American Stock Exchange',
 'American trypanosomiasis',
 'Articles of Confederation',
 'Atlantic Intracoastal Waterway',
 'Atomic Energy Commission',
 'Australian Capital Territory',
 'Bose-Einstein statistics',
 'Bretton Woods Conference',
 'British Antarctic Territory',
 'British Commonwealth of Nations',
 'Browning automatic rifle',
 'Cassegrainian telescope',
 'Central African Federation',
 'Central Intelligence Agency',
 'Central Treaty Organization',
 'Chancellor of the Exchequer',
 'Chesapeake Bay retriever',
 'Commonwealth of Nations',
 'Communist International',
 'Congressional Medal of Honor',
 'Congressional district',
 "D'Entrecasteaux Islands",
 'Declaration of Independence',
 'Democratic-Republican Party',
 'Department of Agriculture',
 'Dionysius of Halicarnassus',
 'Distinguished Conduct Medal',
 'Distinguished Flying Cross',
 'Distinguished Service Cross',
 'Distinguished Serv

## Words without 'e'

- Write a function called has_no_e that returns True if the given word doesn’t have the letter “e”
in it. Modify your program from the previous section to print only the
words that have no “e” and compute the percentage of the words in the
list that have no “e”.

In [8]:
def has_no_e(word):
    for letter in word:
        if letter in 'eE':
            return False
    return True

In [9]:
f'{len(find_certain_words(words, has_no_e)) / len(words):%} of the words in Moby Project Common Words List have no "e".'

'37.255033% of the words in Moby Project Common Words List have no "e".'

## Forbidden Letters

- Write a function named avoids that takes a word and a string of forbidden letters, and that returns True if the word doesn’t use any of the
forbidden letters. Modify your program to prompt the user to enter a
string of forbidden letters and then print the number of words that don’t
contain any of them.

In [10]:
def avoids(word: str, forbidden_letters: str) -> bool:
    for letter in alpha(word):
        if letter in forbidden_letters:
            return False
    return True

In [11]:
len(find_certain_words(words, lambda word: avoids(word, 'e')))

27774

...Can you find a combination of five forbidden letters
that excludes the smallest number of words?

In [12]:
from itertools import combinations
from string import ascii_lowercase as alphabet

**Don't Actually Run the Following!**

This is a naive approach to the problem...

In [None]:
forbidden = {}
for letters in [''.join(letters) for letters in combinations(alphabet, 5)]:
    forbidden[letters] = len(find_certain_words(words, lambda word: avoids(word, letters)))
number_excluded = sorted(forbidden.items(), key=lambda x: x[1])
smallest = number_excluded[0][1]
for letters, number in number_excluded:
    if number > smallest:
        break
    print(letters)

The problem with this straightforward approach is that the number of combinations of five letters is `26! / (5! * (26 - 5)!)`, or 65780, and each combination requires a full pass over the word list. While this is certainly doable, it will take quite a while even on a moderately powerful computer. We can stack the deck in our favor by looking only at combinations of the letters that occur less often than average. This reduces our search space to 462 combinations and is almost certain to give the correct answer.
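
The two counts above can be checked with `math.comb`. The sketch below also shows one possible way to pick below-average letters. The notebook does not say how `rare_letters` was chosen, so the helper `below_average_letters` and its criterion (total letter occurrences below the mean) are assumptions.

```python
from collections import Counter
from math import comb
from string import ascii_lowercase

# Size of each search space: n choose 5.
print(comb(26, 5))  # 65780 combinations over the whole alphabet
print(comb(11, 5))  # 462 combinations over 11 rare letters


def below_average_letters(words: set[str]) -> str:
    """Letters used less often than the mean letter count (hypothetical helper)."""
    counts = Counter(ch for word in words for ch in word.casefold()
                     if ch in ascii_lowercase)
    mean = sum(counts.values()) / len(ascii_lowercase)
    rare = [ch for ch in ascii_lowercase if counts[ch] < mean]
    # Most frequent of the rare letters first.
    return ''.join(sorted(rare, key=lambda ch: counts[ch], reverse=True))
```

Calling `below_average_letters(words)` on the full word list would give a candidate string to compare against the hand-written `rare_letters`.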

In [13]:
forbidden = {}

rare_letters = 'gypwbvkjxzq'

for letters in [''.join(letters) for letters in combinations(rare_letters, 5)]:
    forbidden[letters] = len(find_certain_words(words, lambda word: avoids(word, letters)))
number_excluded = sorted(forbidden.items(), key=lambda x: x[1])
smallest = number_excluded[0][1]
for letters, number in number_excluded:
    if number > smallest:
        break
    print(letters)

gypbv
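
Note that `forbidden[letters]` in the cell above stores how many words avoid the letters, so its smallest value marks the combination that excludes the most words among the candidates. A minimal sketch that ranks by the number of excluded words instead, run on a small made-up word set. The names `count_excluded`, `toy_words`, and `candidates` are hypothetical, and results on the full word list are not shown here.

```python
from itertools import combinations


def avoids(word: str, forbidden_letters: str) -> bool:
    """True if word contains none of the forbidden letters."""
    return not any(letter in forbidden_letters for letter in word.casefold())


def count_excluded(words: set[str], letters: str) -> int:
    """Number of words that contain at least one of the given letters."""
    return sum(1 for word in words if not avoids(word, letters))


toy_words = {'jazz', 'quiz', 'box', 'cat', 'dog', 'wave'}  # made-up sample
candidates = 'jqxzwv'                                      # made-up candidates

# Pairs instead of five-letter combinations keep the toy example readable.
excluded = {''.join(pair): count_excluded(toy_words, ''.join(pair))
            for pair in combinations(candidates, 2)}
fewest = min(excluded.values())
print(sorted(letters for letters, n in excluded.items() if n == fewest))  # ['wv']
```

On the real data the same idea applies with `combinations(rare_letters, 5)`: pick the combination with the smallest excluded count, or equivalently the largest count of words that avoid it.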


## Use Only, Use All

- Write a function named uses_only that takes a word and a string of letters,
and that returns True if the word contains only letters in the list. Can
you make a sentence using only the letters acefhlo? Other than “Hoe
alfalfa?”

In [14]:
def uses_only(word: str, letters: str) -> bool:
    for letter in word:
        if letter not in letters:
            return False
    return True

In [15]:
find_certain_words(words, lambda word: uses_only(word, 'acefhlo'))

{'',
 'a',
 'aa',
 'ace',
 'ache',
 'ah',
 'aha',
 'ala',
 'alcohol',
 'ale',
 'alee',
 'alfalfa',
 'all',
 'allele',
 'allheal',
 'aloe',
 'aloha',
 'aloof',
 'c',
 'cacao',
 'cache',
 'calf',
 'call',
 'calla',
 'cell',
 'cella',
 'cello',
 'chafe',
 'chaff',
 'challah',
 'chef',
 'chela',
 'cholla',
 'clef',
 'cloaca',
 'cloche',
 'coach',
 'coal',
 'coca',
 'cochlea',
 'coco',
 'cocoa',
 'coff',
 'coffee',
 'coffle',
 'col',
 'cola',
 'cole',
 'coo',
 'cooee',
 'cool',
 'e',
 'each',
 'echo',
 'eel',
 'efface',
 'eh',
 'el',
 'elf',
 'ell',
 'f',
 'fa',
 'face',
 'fall',
 'fallal',
 'feal',
 'fecal',
 'fee',
 'feel',
 'felafel',
 'fell',
 'fellah',
 'felloe',
 'feoff',
 'feoffee',
 'flea',
 'flee',
 'fleece',
 'floc',
 'floe',
 'foal',
 'focal',
 'foe',
 'fool',
 'h',
 'ha',
 'haaf',
 'hae',
 'hah',
 'hale',
 'half',
 'hall',
 'hallah',
 'hallo',
 'halloo',
 'halo',
 'he',
 'heal',
 'heel',
 'hell',
 'hellhole',
 'hello',
 'hl',
 'ho',
 'hoe',
 'hole',
 'hollo',
 'hoo',
 'hooch',
 

- Write a function named uses_all that takes a word and a string of required
letters, and that returns True if the word uses all the required letters at
least once. How many words are there that use all the vowels aeiou? How
about aeiouy?

In [16]:
def uses_all(word: str, letters: str) -> bool:
    for letter in letters:
        if letter not in word:
            return False
    return True
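
Both checks are subset tests, so they can also be written with set operations. The sketch below is an alternative formulation, not the notebook's implementation. It redefines the two functions so it runs on its own.

```python
def uses_only(word: str, letters: str) -> bool:
    """True if every character of word is among letters."""
    return set(word) <= set(letters)


def uses_all(word: str, letters: str) -> bool:
    """True if every required letter occurs in word."""
    return set(letters) <= set(word)


print(uses_only('alfalfa', 'acefhlo'))   # True
print(uses_only('', 'acefhlo'))          # True: the empty string passes trivially
print(uses_all('abstemious', 'aeiou'))   # True
print(uses_all('alfalfa', 'aeiou'))      # False
```

The second line shows why an empty entry in the word list ends up in the `uses_only` results.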

In [17]:
find_certain_words(words, lambda word: uses_all(word, 'aeiou'))

{'account receivable',
 'unskilled labor',
 'radio source',
 'telecommunication',
 'scintillation counter',
 'reaction turbine',
 'Deucalion',
 'Fourier analysis',
 'molecular distillation',
 'abstemious',
 'Communist International',
 'sounding lead',
 'occupational therapy',
 'potassium ferricyanide',
 'Mothering Sunday',
 'monosodium glutamate',
 'interstitial-cell-stimulating hormone',
 'superannuation',
 'potassium chlorate',
 'Pontius Pilate',
 'potassium nitrate',
 'quantity surveyor',
 'leprosarium',
 'rational number',
 'medium of exchange',
 'bouquet garni',
 'pneumonic plague',
 'gourmandise',
 'rectangular coordinates',
 'subordinate',
 "Toussaint L'Ouverture",
 'mercaptopurine',
 'insubordinate',
 'sodium perborate',
 'potassium cyanide',
 'amaryllidaceous',
 'equivocation',
 'molecular film',
 'hollandaise sauce',
 'multiflora rose',
 'intercolumniation',
 'House of Representatives',
 'calcariferous',
 'laniferous',
 'harlequin opal',
 'surface-to-air',
 'instantaneous',
 

## In Order

- Write a function called is_abecedarian that returns True if the letters in a
word appear in alphabetical order (double letters are okay). How many
abecedarian words are there?

In [18]:
from string import ascii_lowercase as alphabet

def is_abecedarian(word: str) -> bool:
    position = 0
    for letter in word:
        if letter not in alphabet[position:]:
            return False
        else:
            position = alphabet.index(letter)
    return True

In [19]:
len(find_certain_words(words, is_abecedarian))

437
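
For comparison, the same test can be phrased as "the word equals its own sorted letters". This is a sketch of an alternative, under the assumption that, like the function above, only lowercase ASCII letters should qualify. The name `is_abecedarian_sorted` is hypothetical.

```python
from string import ascii_lowercase


def is_abecedarian_sorted(word: str) -> bool:
    """True if word is all lowercase ASCII letters in non-decreasing order."""
    return (all(ch in ascii_lowercase for ch in word)
            and list(word) == sorted(word))


print(is_abecedarian_sorted('almost'))  # True
print(is_abecedarian_sorted('abbey'))   # True: double letters are okay
print(is_abecedarian_sorted('zebra'))   # False
print(is_abecedarian_sorted('Abbey'))   # False: the capital letter is rejected
```

Sorting costs a little more than a single pass, but for dictionary-length words the difference is negligible.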

## Car Talk Puzzlers

- Give me a word with three consecutive double letters.

In [20]:
def three_consecutive_double_letters(word: str) -> bool:
    if len(word) < 6:
        return False
    for index in range(len(word) - 5):
        if (word[index] == word[index + 1]
            and word[index + 2] == word[index + 3]
            and word[index + 4] == word[index + 5]):
            return True
    return False

In [21]:
find_certain_words(words, three_consecutive_double_letters)

{'bookkeeper', 'bookkeeping'}

- Looking at my odometer, I noticed that the last 4 digits
were palindromic. One mile later, the last 5 numbers were palindromic. One mile after that, the middle 4 out of 6 numbers were palindromic. And you ready for this? One mile later, all
6 were palindromic! The question is, what was on the odometer when I
first looked?

In [22]:
def is_palindrome(word: str) -> bool:
    return word == word[::-1]

In [23]:
for mile in range(999997):
    if (is_palindrome(str(mile).zfill(6)[2:])
        and is_palindrome(str(mile + 1).zfill(6)[1:])
        and is_palindrome(str(mile + 2).zfill(6)[1:5])
        and is_palindrome(str(mile + 3).zfill(6))):
        print([mile + x for x in range(4)])

[198888, 198889, 198890, 198891]
[199999, 200000, 200001, 200002]


- Recently I had a visit with my mom and we realized that the two digits
that make up my age when reversed resulted in her age. For example,
if she’s 73, I’m 37. We wondered how often this has happened over the
years but we got sidetracked with other topics and we never came up
with an answer. When I got home I figured out that the digits of our
ages have been reversible six times so far. I also figured out that if we’re
lucky it would happen again in a few years, and if we’re really lucky it
would happen one more time after that. In other words, it would have
happened 8 times over all. So the question is, how old am I now?

What we know:
- ages were reversed six times before
- ages currently reversed
- ages will be reversed maybe once or twice more

What we don't know:
- current ages
- difference between ages

In [24]:
def reversed_ages(a, b):
    return str(a) == str(b)[::-1]

print('Age Gap  Instances  Ages')
print('-------  ---------  -----------------------------------------------------------------------')
for difference in range(1,50):
    points = []
    for x, y in zip(range(120 - difference), range(difference, 120)):
        if reversed_ages(x, y):
            points.append((x,y))
    if len(points) > 0:
        print(f'{difference:7}  {len(points):9}  {points}')

Age Gap  Instances  Ages
-------  ---------  -----------------------------------------------------------------------
      9          8  [(12, 21), (23, 32), (34, 43), (45, 54), (56, 65), (67, 76), (78, 87), (89, 98)]
     18          7  [(13, 31), (24, 42), (35, 53), (46, 64), (57, 75), (68, 86), (79, 97)]
     27          6  [(14, 41), (25, 52), (36, 63), (47, 74), (58, 85), (69, 96)]
     36          5  [(15, 51), (26, 62), (37, 73), (48, 84), (59, 95)]
     45          4  [(16, 61), (27, 72), (38, 83), (49, 94)]
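
The table above compares ages without leading zeros, so a pair such as 2 and 20 is not counted. One possible reading of the puzzle pads single-digit ages to two digits ("02" and "20"). A minimal sketch under that assumption follows. The names `reversed_ages_padded` and `reversible_pairs` are hypothetical, and the search bounds mirror the cell above.

```python
def reversed_ages_padded(a: int, b: int) -> bool:
    """Compare ages as two-digit strings, so 2 and 20 become '02' and '20'."""
    return str(a).zfill(2) == str(b).zfill(2)[::-1]


def reversible_pairs(difference: int, max_age: int = 120) -> list[tuple[int, int]]:
    """All (younger, older) pairs with the given age gap and reversed digits."""
    return [(x, x + difference)
            for x in range(max_age - difference)
            if reversed_ages_padded(x, x + difference)]


# Age gaps (1 to 49, as above) that produce exactly eight reversible pairs.
print([d for d in range(1, 50) if len(reversible_pairs(d)) == 8])  # [18]

pairs = reversible_pairs(18)
print(pairs[5])  # (57, 75): the sixth occurrence
```

Under this reading, a gap of 18 is the only one with eight occurrences, and if the current occurrence is the sixth, the ages would be 57 and 75.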
