<div style="text-align: right" align="right"><i>Peter Norvig, 3 Jan 2020</i></div>

# Spelling Bee Puzzle

The [3 Jan. 2020 edition of the **538 Riddler**](https://fivethirtyeight.com/features/can-you-solve-the-vexing-vexillology/) concerns the popular NYTimes  [**Spelling Bee**](https://www.nytimes.com/puzzles/spelling-bee) puzzle. In this game, seven letters are arranged in a **honeycomb** lattice, with one letter in the center:

<img src="https://fivethirtyeight.com/wp-content/uploads/2020/01/Screen-Shot-2019-12-24-at-5.46.55-PM.png" width="150">

The goal is to identify as many valid words as possible. A valid word uses only letters from the honeycomb, may use a letter multiple times, must use the center letter, and must be at least 4 letters long. For example, AMALGAM is acceptable, but neither GAP (too short) nor PALM (no G) is allowed. Four-letter words are worth 1 point each, while longer words score one point per letter. Words that use all seven letters in the honeycomb are known as **pangrams** and earn 7 bonus points in addition to the points for the length of the word. So in the above example, MEGAPLEX is worth 8 + 7 = 15 points. A valid honeycomb must contain at least one pangram, and must not contain the letter S (that would make it too easy, with all the plural words).

***The puzzle is: Which seven-letter honeycomb results in the highest possible score?*** 

The 538 Riddler referenced a [word list](https://norvig.com/ngrams/enable1.txt) from my [web site](https://norvig.com/ngrams), so I felt compelled to solve the puzzle. (Note that the word list is a standard public domain Scrabble® dictionary that I happen to host a copy of. I didn't curate it; Mendel Cooper and Alan Beale did.)

I'll show you how I address the problem. First some imports:

In [1]:
from collections import defaultdict, Counter
from dataclasses import dataclass
from itertools   import combinations, chain
from typing      import Iterable 

# Letters, Words, Lettersets, and Pangrams

Let's start by defining the most basic vocabulary terms:

- **Letter**: the **valid letters** are uppercase 'A' to 'Z', but not 'S'.
- **Word**: A string of letters.
- **Letterset**: the distinct letters in a word; e.g. letterset('BOOBOO') = 'BO'.
- **Word list**: a list of valid words.
- **Valid word**: a word of at least 4 valid letters and not more than 7 distinct letters.
- **Pangram**: a valid word with exactly 7 distinct letters.

In [2]:
valid_letters = set('ABCDEFGHIJKLMNOPQR' + 'TUVWXYZ')
Letter    = str # A string of one letter
Word      = str # A string of 4 or more letters
Letterset = str # A sorted string of distinct letters

def word_list(text: str) -> list[Word]: 
    """All the valid words in a text."""
    return [w for w in text.upper().split() if is_valid(w)]

def is_valid(word) -> bool: 
    """A word with at least 4 letters, at most 7 distinct letters, and no 'S'."""
    return len(word) >= 4 and len(set(word)) <= 7 and valid_letters.issuperset(word) 

def letterset(word) -> Letterset:
    """The set of distinct letters in a word, represented as a sorted str."""
    return ''.join(sorted(set(word)))

def is_pangram(word) -> bool: return len(set(word)) == 7

I chose to  represent a `Letterset` as a sorted string of distinct letters, and not as a `set`. Why? Because:
- A `set` can't be the key of a dict, and we'll need that capability.
- A `frozenset` could be a key, and would be a good choice for `Letterset`, but a frozenset:
  - Takes up 2 to 4 times as much space in memory.
  - Is harder to read when debugging: `frozenset({'A', 'G', 'L', 'M'})` versus `'AGLM'`.
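
These trade-offs can be sanity-checked with a small sketch like the one below. It assumes CPython, and `sys.getsizeof` reports only the shallow size of each object, so the byte counts (and their ratio) will vary by Python version and platform; treat them as indicative only.

```python
import sys

as_str       = 'AGLM'
as_frozenset = frozenset(as_str)

# Both a str and a frozenset are hashable, so both can be dict keys.
table = {as_str: 8, as_frozenset: 8}

# A plain set is not hashable, so using one as a key raises TypeError.
try:
    {set(as_str): 8}
    set_is_hashable = True
except TypeError:
    set_is_hashable = False

# Shallow sizes in bytes; the exact numbers depend on the Python build.
str_size       = sys.getsizeof(as_str)
frozenset_size = sys.getsizeof(as_frozenset)
print(str_size, frozenset_size, set_is_hashable)
```

The readability point shows up as soon as you print `table`: the string key reads as `'AGLM'`, while the frozenset key prints its elements in arbitrary order.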

Here's a mini word list to experiment with:

In [3]:
mini = word_list('amalgam amalgamation cacciatore erotica em game gem gems glam megaplex')
mini

['AMALGAM', 'CACCIATORE', 'EROTICA', 'GAME', 'GLAM', 'MEGAPLEX']

Note that `em` and `gem` are too short, `gems` has an `s`, and `amalgamation` has 8 distinct letters. We're left with six valid words out of the ten candidate words. Three of them are pangrams:

In [4]:
{w for w in mini if is_pangram(w)}

{'CACCIATORE', 'EROTICA', 'MEGAPLEX'}

# Honeycombs and Scoring

Here are the main concepts for defining a honeycomb and determining a score:

- A **honeycomb** lattice consists of two attributes: a letterset of seven distinct letters, and a single center letter.
- The **word score** is 1 point for a 4-letter word, or the word length for longer words, plus 7 bonus points for a pangram.
- The **total score** for a honeycomb is the sum of the word scores for the words that the honeycomb **can make**. 
- A honeycomb **can make** a word if the word contains the honeycomb's center, and every letter in the word is in the honeycomb. 

In [5]:
@dataclass(frozen=True, order=True)
class Honeycomb:
    """A Honeycomb lattice, with 7 letters, 1 of which is the center."""
    letters: Letterset 
    center:  Letter
        
def word_score(word) -> int: 
    """The points for this word, including bonus for pangram."""
    return 1 if len(word) == 4 else (len(word) + 7 * is_pangram(word))

def total_score(honeycomb, wordlist) -> int:
    """The total score for this honeycomb."""
    return sum(word_score(w) for w in wordlist if can_make(honeycomb, w))

def can_make(honeycomb, word) -> bool:
    """Can the honeycomb make this word?"""
    return honeycomb.center in word and all(L in honeycomb.letters for L in word)

Here is the honeycomb from the diagram at the top of the notebook:

In [6]:
hc = Honeycomb(letterset('LAPGEMX'), 'G')
hc

Honeycomb(letters='AEGLMPX', center='G')

The word scores, makeable words, and total score  for this honeycomb on the `mini` word list are as follows:

In [7]:
{w: word_score(w) for w in mini}

{'AMALGAM': 7,
 'CACCIATORE': 17,
 'EROTICA': 14,
 'GAME': 1,
 'GLAM': 1,
 'MEGAPLEX': 15}

In [8]:
{w for w in mini if can_make(hc, w)}

{'AMALGAM', 'GAME', 'GLAM', 'MEGAPLEX'}

In [9]:
total_score(hc, mini) # 7 + 1 + 1 + (8+7)

24

# Finding the Top-Scoring Honeycomb

A simple strategy for finding the top-scoring honeycomb is:
 - Compile a list of all valid candidate honeycombs.
 - For each honeycomb, compute the total score.
 - Return a (score, honeycomb) tuple for a honeycomb with the maximum score.

In [10]:
def top_honeycomb(wordlist) -> tuple[int, Honeycomb]: 
    """Find a (score, honeycomb) pair with a highest-scoring honeycomb."""
    return max((total_score(h, wordlist), h) 
               for h in candidate_honeycombs(wordlist))

What are the possible candidate honeycombs? We could try all letters in all slots, but that's a **lot** of honeycombs:
- The center can be any valid letter (25 choices, because 'S' is not allowed).
- The outside can be any six of the remaining 24 letters.
- All together, that's 25 × (24 choose 6) = 3,364,900 candidate honeycombs.
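
The arithmetic in that last bullet can be checked with `math.comb` from the standard library (Python 3.8 or later). The variable names here are just for illustration:

```python
from math import comb

centers  = 25            # valid letters: 26 minus 'S'
outsides = comb(24, 6)   # ways to choose 6 of the remaining 24 letters

print(f'{centers * outsides:,d}')   # 3,364,900
```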

Fortunately, we can use the constraint that **a valid honeycomb must contain at least one pangram**.  So the letters of any valid honeycomb must ***be*** the letterset of some pangram (and the center can be any one of the seven letters):

In [11]:
def candidate_honeycombs(wordlist) -> list[Honeycomb]:
    """Valid honeycombs have pangram letters, with any center."""
    return [Honeycomb(letters, center) 
            for letters in pangram_lettersets(wordlist)
            for center in letters]

def pangram_lettersets(wordlist: list[Word]) -> set[Letterset]:
    """All lettersets from the pangrams in wordlist."""
    return {letterset(word) for word in wordlist if is_pangram(word)}

In [12]:
candidate_honeycombs(mini) # 7 candidates for each of the 2 pangram lettersets

[Honeycomb(letters='ACEIORT', center='A'),
 Honeycomb(letters='ACEIORT', center='C'),
 Honeycomb(letters='ACEIORT', center='E'),
 Honeycomb(letters='ACEIORT', center='I'),
 Honeycomb(letters='ACEIORT', center='O'),
 Honeycomb(letters='ACEIORT', center='R'),
 Honeycomb(letters='ACEIORT', center='T'),
 Honeycomb(letters='AEGLMPX', center='A'),
 Honeycomb(letters='AEGLMPX', center='E'),
 Honeycomb(letters='AEGLMPX', center='G'),
 Honeycomb(letters='AEGLMPX', center='L'),
 Honeycomb(letters='AEGLMPX', center='M'),
 Honeycomb(letters='AEGLMPX', center='P'),
 Honeycomb(letters='AEGLMPX', center='X')]

Now we're ready to find the highest-scoring honeycomb with respect to the `mini` word list:

In [13]:
top_honeycomb(mini)

(31, Honeycomb(letters='ACEIORT', center='T'))

The program appears to work. But that's just the mini word list. 

# Big Word List

Here's the big word list:

In [14]:
! [ -e  enable1.txt ] || curl -O http://norvig.com/ngrams/enable1.txt
! head  enable1.txt

aa
aah
aahed
aahing
aahs
aal
aalii
aaliis
aals
aardvark


How big is it?

In [15]:
! wc -w enable1.txt

  172820 enable1.txt


172,820 words.

Let's load it up and print some statistics:

In [16]:
file = 'enable1.txt'
big  = word_list(open(file).read())

print(f"""\
{len(big):7,d} valid Spelling Bee words
{sum(map(is_pangram, big)):7,d} pangram words
{len(pangram_lettersets(big)):7,d} distinct pangram lettersets
{len(candidate_honeycombs(big)):7,d} candidate honeycombs""")

 44,585 valid Spelling Bee words
 14,741 pangram words
  7,986 distinct pangram lettersets
 55,902 candidate honeycombs


How long will it take to run `top_honeycomb(big)`? Most of the computation time is in `total_score`, which is called once for each of the 55,902 candidate honeycombs, so let's estimate the total time by first checking how long it takes to compute the total score of a single honeycomb:

In [17]:
%timeit total_score(hc, big)

3.62 ms ± 38.6 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)


Roughly 3.5 milliseconds for one honeycomb. For all 55,902 candidate honeycombs, how many seconds would that be?

In [18]:
.0035 * 55902

195.657

A little over 3 minutes. I could run `top_honeycomb(big)`, get a coffee, come back, and declare victory. 

But I think that a puzzle like this deserves a more elegant solution.  

# Faster Scoring: Points Table

Here's an idea to make `total_score` faster by doing some precomputation:

- Do the following computation only once:
   - Compute the `letterset` and `word_score` for each word in the word list. 
   - Make a table of `{letterset: sum_of_word_scores}` giving the total score for each letterset. 
   - I call this a **points table**.
- For each of the 55,902 candidate honeycombs, do the following:
   - Consider every **letter subset** of the honeycomb's 7 letters that includes the center letter.
   - Sum the points table entries for each of these letter subsets.

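The resulting algorithm, `fast_total_score`, iterates over just 2<sup>6</sup> – 1 = 63 letter subsets (each of the six outer letters is either in or out, and the subset containing only the center is skipped because no valid word is made of a single repeated letter); far fewer than the 44,585 valid words. The function `top_honeycomb2` creates the points table and calls `fast_total_score`: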

In [19]:
def top_honeycomb2(wordlist) -> tuple[int, Honeycomb]: 
    """Find a (score, honeycomb) tuple with a highest-scoring honeycomb."""
    table = points_table(wordlist)
    return max((fast_total_score(h, table), h) 
               for h in candidate_honeycombs(wordlist))

def points_table(wordlist) -> Counter:
    """A Counter of {letterset: sum_of_word_scores} from all the words in wordlist."""
    table = Counter()
    for word in wordlist:
        table[letterset(word)] += word_score(word)
    return table

def fast_total_score(honeycomb, points_table) -> int:
    """The total score for this honeycomb, using a points table."""
    return sum(points_table[s] for s in letter_subsets(honeycomb))

def letter_subsets(honeycomb) -> list[Letterset]:
    """The 63 subsets of the letters in the honeycomb, each including the center letter."""
    # range starts at 2, not 1, because (e.g.) 'MAMMA' is valid, but (e.g.) 'AAAA' is not.
    subsets = chain.from_iterable(combinations(honeycomb.letters, n) for n in range(2, 8))
    return [letters for letters in map(''.join, subsets) 
            if honeycomb.center in letters]

Here is the points table for the mini word list:

In [20]:
table = points_table(mini)
table

Counter({'ACEIORT': 31, 'AEGLMPX': 15, 'AGLM': 8, 'AEGM': 1})

The letterset `'ACEIORT'` gets 31 points (17 for CACCIATORE and 14 for EROTICA), 
`'AEGLMPX'` gets 15 for MEGAPLEX,
`'AGLM'` gets 8 points (7 for AMALGAM and 1 for GLAM), and
`'AEGM'` gets 1 for GAME. 

Here is the honeycomb `hc` again, and its 63 letter subsets:

In [21]:
hc

Honeycomb(letters='AEGLMPX', center='G')

In [22]:
print(letter_subsets(hc))

['AG', 'EG', 'GL', 'GM', 'GP', 'GX', 'AEG', 'AGL', 'AGM', 'AGP', 'AGX', 'EGL', 'EGM', 'EGP', 'EGX', 'GLM', 'GLP', 'GLX', 'GMP', 'GMX', 'GPX', 'AEGL', 'AEGM', 'AEGP', 'AEGX', 'AGLM', 'AGLP', 'AGLX', 'AGMP', 'AGMX', 'AGPX', 'EGLM', 'EGLP', 'EGLX', 'EGMP', 'EGMX', 'EGPX', 'GLMP', 'GLMX', 'GLPX', 'GMPX', 'AEGLM', 'AEGLP', 'AEGLX', 'AEGMP', 'AEGMX', 'AEGPX', 'AGLMP', 'AGLMX', 'AGLPX', 'AGMPX', 'EGLMP', 'EGLMX', 'EGLPX', 'EGMPX', 'GLMPX', 'AEGLMP', 'AEGLMX', 'AEGLPX', 'AEGMPX', 'AGLMPX', 'EGLMPX', 'AEGLMPX']


The total from `fast_total_score` is the sum of the scores from its letter subsets (only 3 of which are in `points_table(mini)`):

In [23]:
assert fast_total_score(hc, table) == 24 == table['AGLM'] + table['AEGM'] + table['AEGLMPX']

We can now solve the puzzle on the big word list:

In [24]:
%time top_honeycomb2(big)

CPU times: user 702 ms, sys: 3.74 ms, total: 706 ms
Wall time: 707 ms


(3898, Honeycomb(letters='AEGINRT', center='R'))

**Wow! 3898 is a high score!** And the whole computation took **less than a second**!

# Scoring Fewer Honeycombs: Branch and Bound

A run time of less than a second to find the top possible honeycomb is pretty good! Can we do even better?

The program would run faster if we scored fewer honeycombs. But if we want to be guaranteed of finding the top-scoring honeycomb, how can we skip any? Consider the pangram **JUKEBOX**. With the unusual letters  **J**, **K**,  and **X**, it scores poorly, regardless of the choice of center:

In [25]:
for C in 'JUKEBOX':
    h = Honeycomb(letterset('JUKEBOX'), C)
    print(h, total_score(h, big), 'points')

Honeycomb(letters='BEJKOUX', center='J') 26 points
Honeycomb(letters='BEJKOUX', center='U') 32 points
Honeycomb(letters='BEJKOUX', center='K') 26 points
Honeycomb(letters='BEJKOUX', center='E') 37 points
Honeycomb(letters='BEJKOUX', center='B') 49 points
Honeycomb(letters='BEJKOUX', center='O') 39 points
Honeycomb(letters='BEJKOUX', center='X') 15 points


We might be able to dismiss **JUKEBOX** in one call to `fast_total_score`, rather than seven, with this approach:
- Keep track of the top score found so far, on any previous pangram.
- For each pangram letterset, ask "if we weren't required to use the center letter, what would this letterset score?" Since word scores are never negative, that centerless score is an upper bound on the score with any of the seven centers.
- Check if that upper bound is higher than the top score so far.
  - If yes, then try the pangram letterset with each of the seven centers; 
  - If not, then dismiss it without trying *any* of the centers.
- This is called a [**branch and bound**](https://en.wikipedia.org/wiki/Branch_and_bound) algorithm: prune a  **branch** of 7 honeycombs if an upper **bound** can't beat the top score.

*Note*: To represent a honeycomb with no center, I can just use `Honeycomb(p, '')`. This works because of a quirk of Python:  `letter_subsets` checks if `honeycomb.center in letters`; normally in Python the expression `e in s` means "*is* `e` *an element of the collection* `s`", but when `s` is a string it means "*is* `e` *a substring of* `s`", and the empty string is a substring of every string. 
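
Here is a minimal standalone sketch of that quirk, using the letters of the example honeycomb. It contrasts `in` on a string (a substring test) with `in` on a list or set (an element test):

```python
letters = 'AEGLMPX'

in_string = '' in letters         # True: '' is a substring of every string
in_list   = '' in list(letters)   # False: '' is not an element of the list
in_set    = '' in set(letters)    # False: '' is not an element of the set

print(in_string, in_list, in_set)   # True False False
```

So the trick depends on `Letterset` being a `str`; with a `frozenset` representation the centerless case would need separate handling.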

I can rewrite `top_honeycomb2` as follows:

In [26]:
def top_honeycomb3(wordlist) -> tuple[int, Honeycomb]: 
    """Find a (score, honeycomb) tuple with a highest-scoring honeycomb."""
    table = points_table(wordlist)
    top_score, top_honeycomb = 0, None
    pangrams = [s for s in table if len(s) == 7]
    for p in pangrams:
        if fast_total_score(Honeycomb(p, ''), table) > top_score:
            for center in p:
                honeycomb = Honeycomb(p, center)
                score = fast_total_score(honeycomb, table)
                if score > top_score:
                    top_score, top_honeycomb = score, honeycomb
    return top_score, top_honeycomb

In [27]:
%time top_honeycomb3(big)

CPU times: user 176 ms, sys: 2.31 ms, total: 179 ms
Wall time: 178 ms


(3898, Honeycomb(letters='AEGINRT', center='R'))

Awesome! We get the correct answer, and it runs four times faster than `top_honeycomb2`.
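
Why is it safe to prune? The subsets that contain a given center are a subset of all the subsets, and every points-table entry is non-negative, so the centerless score can never be lower than the score with any center. The sketch below illustrates this on the mini points table shown earlier. The helper `subsets_score` is a hypothetical standalone stand-in for `fast_total_score`, written so that the block runs on its own:

```python
from collections import Counter
from itertools import chain, combinations

# Points table for the mini word list, copied from the output earlier in the notebook.
table = Counter({'ACEIORT': 31, 'AEGLMPX': 15, 'AGLM': 8, 'AEGM': 1})

def subsets_score(letters: str, center: str, table: Counter) -> int:
    """Sum the table entries for subsets of `letters` (sizes 2 to 7) containing `center`.
    With center == '' every subset qualifies, which gives the upper bound."""
    subsets = chain.from_iterable(combinations(letters, n) for n in range(2, 8))
    return sum(table[s] for s in map(''.join, subsets) if center in s)

letters = 'AEGLMPX'
bound   = subsets_score(letters, '', table)                       # no center required
scores  = {c: subsets_score(letters, c, table) for c in letters}  # one score per center

print(bound, scores)
```

On this table the bound is 24, and no center does better than that; if the top score so far were already 24 or more, all seven honeycombs could be skipped after a single scoring call.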

# How many honeycombs does top_honeycomb3 examine? 

We can wrap `Honeycomb` in `functools.lru_cache` so that every distinct honeycomb constructed is recorded as a cache miss:

In [28]:
import functools
Honeycomb = functools.lru_cache(None)(Honeycomb)
top_honeycomb3(big)
Honeycomb.cache_info()

CacheInfo(hits=0, misses=8084, maxsize=None, currsize=8084)

`top_honeycomb3`  examined 8,084 honeycombs; a 6.9× reduction from the 55,902 examined by `top_honeycomb2`. Since there are 7,986 pangram lettersets, that means we had to look at all 7 centers for only (8084-7986)/7 = 14 of them.
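
To see why the `misses` field counts distinct honeycombs, here is a small standalone sketch of the same counting trick. `Point` and `CountedPoint` are hypothetical stand-ins for `Honeycomb`, not names from the notebook:

```python
import functools
from dataclasses import dataclass

@dataclass(frozen=True)
class Point:                 # hypothetical stand-in for Honeycomb
    x: int
    y: int

# Wrapping the class in lru_cache: each distinct argument tuple is constructed
# once (a "miss"); a repeated argument tuple is served from the cache (a "hit").
CountedPoint = functools.lru_cache(maxsize=None)(Point)

for args in [(0, 0), (0, 1), (0, 0), (1, 1)]:
    CountedPoint(*args)

info = CountedPoint.cache_info()
print(info)   # CacheInfo(hits=1, misses=3, maxsize=None, currsize=3)
```

One caveat of this approach: after wrapping, the name refers to a cached function rather than the class itself, so it is best kept as a quick measurement device in a notebook.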

How much faster is `fast_total_score` than `total_score`?

In [29]:
table = points_table(big)

%timeit fast_total_score(hc, table)

10.8 μs ± 88.7 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)


In [30]:
%timeit total_score(hc, big)

3.67 ms ± 49.5 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)


We see that `fast_total_score` is about 300 times faster.

# Fancy Report

Here I show the top-scoring  honeycomb, all the pangrams and other words it can make, and the counts:

In [31]:
from textwrap import fill

def report(wordlist: list[Word]) -> None:
    """Print stats, words, and word scores for the top-scoring honeycomb."""
    score, honeycomb = top_honeycomb3(wordlist)
    made = {w for w in wordlist if can_make(honeycomb, w)}
    pangrams = {w for w in made if is_pangram(w)}
    print(f'Top {honeycomb}:\n\nWords: {len(made):,d}, Pangrams: {len(pangrams)}, Points: {score:,d}.')
    for (title, words) in [('Pangrams:', pangrams), ('Other words:', made - pangrams)]:
        print('\n' + title)
        print(fill(', '.join(sorted(words)), width=114))

Here is the big report:

In [32]:
report(wordlist=big)

Top Honeycomb(letters='AEGINRT', center='R'):

Words: 537, Pangrams: 50, Points: 3,898.

Pangrams:
AERATING, AGGREGATING, ARGENTINE, ARGENTITE, ENTERTAINING, ENTRAINING, ENTREATING, GARNIERITE, GARTERING,
GENERATING, GNATTIER, GRANITE, GRATINE, GRATINEE, GRATINEEING, GREATENING, INGRATE, INGRATIATE, INTEGRATE,
INTEGRATING, INTENERATING, INTERAGE, INTERGANG, INTERREGNA, INTREATING, ITERATING, ITINERATING, NATTERING,
RATTENING, REAGGREGATING, REATTAINING, REGENERATING, REGRANTING, REGRATING, REINITIATING, REINTEGRATE,
REINTEGRATING, REITERATING, RETAGGING, RETAINING, RETARGETING, RETEARING, RETRAINING, RETREATING, TANGERINE,
TANGIER, TARGETING, TATTERING, TEARING, TREATING

Other words:
AERATE, AERIE, AERIER, AGAR, AGER, AGGER, AGGREGATE, AGINNER, AGRARIAN, AGREE, AGREEING, AGRIA, AIGRET, AIGRETTE,
AIRER, AIRIER, AIRING, AIRN, AIRT, AIRTING, ANEAR, ANEARING, ANERGIA, ANGARIA, ANGER, ANGERING, ANGRIER, ANTEATER,
ANTIAIR, ANTIAR, ANTIARIN, ANTRA, ANTRE, AREA, AREAE, ARENA, ARENITE, ARETE, 

# 'S' Words

What if we allowed honeycombs to have an 'S' in them? I'll make a new word list that doesn't exclude the 'S'-words, and report on it:

In [33]:
valid_letters.add('S') # Make 'S' a legal letter

big_s = word_list(open(file).read())

report(wordlist=big_s)

Top Honeycomb(letters='AEINRST', center='E'):

Words: 1,179, Pangrams: 86, Points: 8,681.

Pangrams:
ANESTRI, ANTISERA, ANTISTRESS, ANTSIER, ARENITES, ARSENITE, ARSENITES, ARTINESS, ARTINESSES, ATTAINERS,
ENTERTAINERS, ENTERTAINS, ENTRAINERS, ENTRAINS, ENTREATIES, ERRANTRIES, INERTIAS, INSTANTER, INTENERATES,
INTERSTATE, INTERSTATES, INTERSTRAIN, INTERSTRAINS, INTRASTATE, INTREATS, IRATENESS, IRATENESSES, ITINERANTS,
ITINERARIES, ITINERATES, NASTIER, NITRATES, RAINIEST, RATANIES, RATINES, REATTAINS, REINITIATES, REINSTATE,
REINSTATES, RESINATE, RESINATES, RESISTANT, RESISTANTS, RESTRAIN, RESTRAINER, RESTRAINERS, RESTRAINS, RESTRAINT,
RESTRAINTS, RETAINERS, RETAINS, RETINAS, RETIRANTS, RETRAINS, RETSINA, RETSINAS, SANITARIES, SEATRAIN, SEATRAINS,
STAINER, STAINERS, STANNARIES, STEARIN, STEARINE, STEARINES, STEARINS, STRAINER, STRAINERS, STRAITEN, STRAITENS,
STRAITNESS, STRAITNESSES, TANISTRIES, TANNERIES, TEARSTAIN, TEARSTAINS, TENANTRIES, TERNARIES, TERRAINS, TERTIANS,
TRAINEES, TRAINE

Allowing 'S' more than doubles the length of the word list and the score of the top honeycomb!

# Summary

Here are the highest-scoring honeycombs (with and without an S) with their stats and a pangram to remember them by:

<img src="http://norvig.com/honeycombs.png" width="350">
<pre>
  537 words            1,179 words 
   50 pangrams            86 pangrams
3,898 points           8,681 points
      RETAINING              ENTERTAINERS
</pre>

This notebook explored four approaches to finding the highest-scoring honeycomb, with three big efficiency gains:

1. **Brute Force Enumeration**: Compute total score for every possible honeycomb letter combination; return the highest-scoring. 
2. **Pangram Lettersets**: Compute total score only for pangram lettersets. **Reduces candidates by 60x.**
3. **Points Table**: Precompute score for each letterset once; for each honeycomb, sum 63 letter subset scores. **Speeds up scoring time by 300x.**
4. **Branch and Bound**: Try all 7 centers only for lettersets that score better than the top score so far. **Reduces candidates by nearly 7x.**

All together that is about a 100,000-fold speedup!
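
As a rough check on that figure, the sketch below multiplies the three gains using the numbers reported earlier in this notebook. The timing ratio comes from `%timeit` runs on one machine, and simply multiplying the factors ignores overheads such as building the points table, so the product should be read as an order-of-magnitude estimate only:

```python
from math import comb

brute_force_candidates = 25 * comb(24, 6)   # 3,364,900 possible letter combinations
pangram_candidates     = 55_902             # 7 centers x 7,986 pangram lettersets
examined_by_bound      = 8_084              # honeycombs counted via lru_cache
slow_us, fast_us       = 3_670, 10.8        # %timeit results, in microseconds

gains = [brute_force_candidates / pangram_candidates,   # about 60x
         slow_us / fast_us,                             # about 340x
         pangram_candidates / examined_by_bound]        # about 6.9x

total = gains[0] * gains[1] * gains[2]
print([round(g, 1) for g in gains], f'{total:,.0f}')
```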

