<div style="text-align: right" align="right"><i>Peter Norvig, 3 Jan 2020</i></div>

# Spelling Bee

The [Jan. 3 2020 Riddler](https://fivethirtyeight.com/features/can-you-solve-the-vexing-vexillology/) concerns the popular NYTimes  [Spelling Bee](https://www.nytimes.com/puzzles/spelling-bee) puzzle:

*In this game, seven letters are arranged in a honeycomb lattice, with one letter in the center. Here’s the lattice from December 24, 2019:*

<img src="https://fivethirtyeight.com/wp-content/uploads/2020/01/Screen-Shot-2019-12-24-at-5.46.55-PM.png?w=1136" width=150>


*The goal is to identify as many words that meet the following criteria:*
1. *The word must be at least four letters long.*
2. *The word must include the central letter.*
3. *The word cannot include any letter beyond the seven given letters.*

*Note that letters can be repeated. For example, the words GAME and AMALGAM are both acceptable words. Four-letter words are worth 1 point each, while five-letter words are worth 5 points, six-letter words are worth 6 points, seven-letter words are worth 7 points, etc. Words that use all of the seven letters in the honeycomb are known as “pangrams” and earn 7 bonus points (in addition to the points for the length of the word). So in the above example, MEGAPLEX is worth 15 points.*

***Which seven-letter honeycomb results in the highest possible game score?*** *To be a valid choice of seven letters, no letter can be repeated, it must not contain the letter S (that would be too easy) and there must be at least one pangram.*

*For consistency, please use [this word list](https://norvig.com/ngrams/enable1.txt) to check your game score.*

# Approach to a Solution

Since the referenced [word list](https://norvig.com/ngrams/enable1.txt) was on my web site (it is a standard Scrabble word list that I happen to host a copy of), I felt somewhat compelled to submit an answer. I had worked on word puzzles before, like Scrabble and Boggle. My first thought is that this puzzle is rather different because it deals with *unordered sets* of letters, not *ordered permutations* of letters. That makes things much easier. When I tried to find the optimal 5×5 Boggle board, I couldn't exhaustively try all $26^{5 \times 5} \approx 10^{35}$ possibilities; I had to do hillclimbing to find a local maximum solution. But for Spelling Bee, it is feasible to try every possibility and get a guaranteed highest-scoring honeycomb. Here's a sketch of my approach:
 
- Since order and repetition don't count, we can represent a word as a **set** of letters, which I will call a `letterset`. For simplicity I'll choose to implement that as a sorted string (not as a Python `set` or `frozenset`). For example:
      letterset("GLAM") == letterset("AMALGAM") == "AGLM"
- A word is a **pangram** if and only if its letterset has exactly 7 letters.
- A honeycomb can be represented by a `(letterset, center)` pair, for example `('AEGLMPX', 'G')`.
- Since the rules say every valid honeycomb must contain a pangram, it must be the case that the letters of every valid honeycomb *are* the letterset of some pangram. That means I can find the highest-scoring honeycomb by considering all possible pangram lettersets and all possible centers for each pangram, computing the game score for each one, and taking the maximum.
- So it all comes down to having an efficient-enough `game_score` function. We'll know how efficient it has to be once we figure out how many pangram lettersets there are (1,000? 100,000?). Note that it will be less than the number of pangrams, because, for example, the pangrams CACCIATORE and EROTICA both have the same letterset, ACEIORT.
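
To put rough numbers on the feasibility claim above, here is a back-of-the-envelope sketch. The honeycomb figure is only an upper bound: it counts every choice of 7 distinct letters from the 25 letters other than S, times 7 centers, and ignores the pangram requirement. The variable names are mine and are not used elsewhere in this notebook.

```python
from math import comb

# Ordered 5x5 Boggle boards: one of 26 letters in each of 25 cells.
boggle_boards = 26 ** 25

# Upper bound on honeycombs: choose 7 distinct letters from the 25
# letters other than S, then pick one of the 7 as the center.
# (This ignores the pangram requirement, so the true count is smaller.)
honeycomb_bound = comb(25, 7) * 7

print(f'{boggle_boards:.1e}')   # about 2.4e+35
print(f'{honeycomb_bound:,d}')  # 3,364,900
```

Even the loose bound of a few million honeycombs is small enough to enumerate, which is what makes an exhaustive search plausible here.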

# Words, Word Scores, Pangrams, and Lettersets

I'll start by loading some modules and defining four basic functions about words:

In [1]:
from collections import Counter, defaultdict
from itertools import combinations

In [2]:
def Words(text) -> set:
    """The set of all the valid space-separated words in a str."""
    return {w for w in text.upper().split() 
            if len(w) >= 4 and 'S' not in w and len(set(w)) <= 7}

def word_score(word) -> int: 
    """The points for this word, including bonus for pangram."""
    N = len(word)
    bonus = (7 if is_pangram(word) else 0)
    return (1 if N == 4 else N + bonus)

def is_pangram(word) -> bool: 
    """Does a word use all 7 letters (some maybe more than once)?"""
    return len(set(word)) == 7

def letterset(word) -> str:
    """The set of letters in a word, represented as a sorted str of letters."""
    return ''.join(sorted(set(word)))

I'll make a tiny word list to experiment with: 

In [3]:
words = Words('amalgam amalgamation game games gem glam megaplex cacciatore erotica I me')
words

{'AMALGAM', 'CACCIATORE', 'EROTICA', 'GAME', 'GLAM', 'MEGAPLEX'}

Note that `I`, `me` and `gem` are too short, `games` has an `s` which is not allowed, and `amalgamation` has too many distinct letters. We're left with six valid words out of the original eleven.

Here are examples of the functions in action:

In [4]:
{w: word_score(w) for w in words}

{'GAME': 1,
 'CACCIATORE': 17,
 'AMALGAM': 7,
 'GLAM': 1,
 'EROTICA': 14,
 'MEGAPLEX': 15}

In [5]:
{w for w in words if is_pangram(w)}

{'CACCIATORE', 'EROTICA', 'MEGAPLEX'}

In [6]:
{w: letterset(w) for w in words}

{'GAME': 'AEGM',
 'CACCIATORE': 'ACEIORT',
 'AMALGAM': 'AGLM',
 'GLAM': 'AGLM',
 'EROTICA': 'ACEIORT',
 'MEGAPLEX': 'AEGLMPX'}

# The enable1 Word List

Now I will load in the `enable1` word list and see what we have:

In [7]:
! [ -e enable1.txt ] || curl -O http://norvig.com/ngrams/enable1.txt
! wc -w enable1.txt

  172820 enable1.txt


In [8]:
enable1 = Words(open('enable1.txt').read())
len(enable1)

44585

In [9]:
pangrams = [w for w in enable1 if is_pangram(w)]
pangrams[:10] # Just sample some of them

['MODIFIER',
 'FLUORIC',
 'PENTANOL',
 'COMPLECT',
 'COVERTURE',
 'GNOTOBIOTIC',
 'INTREATED',
 'COMMUTATOR',
 'PREPLANT',
 'PRINTERY']

In [10]:
len(pangrams)

14741

So: we start with 172,820 words in the word list, reduce that to 44,585 valid words, and find that 14,741 of those words are pangrams. 

I'm  curious: what's the highest-scoring individual word?

In [11]:
max((word_score(w), w) for w in enable1)

(23, 'ANTITOTALITARIAN')

And what's the breakdown of reasons why words are invalid?


In [12]:
Counter(('s' if 's' in w else 'short' if len(w) < 4 else 'long' if len(set(w)) > 7 else 'valid')
        for w in open('enable1.txt').read().split())

Counter({'short': 922, 'valid': 44585, 's': 103913, 'long': 23400})

About 60% of the words have an 's' in them.
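
As a quick check of that percentage, here is a small sketch that turns the counts above into shares of the whole list. The counts are copied by hand from the `Counter` output, so nothing here reads `enable1.txt`; `reasons` and `total` are names I introduce for the illustration.

```python
from collections import Counter

# Counts copied from the Counter output above.
reasons = Counter({'short': 922, 'valid': 44585, 's': 103913, 'long': 23400})
total = sum(reasons.values())  # 172,820, the size of the word list

# Print each reason with its count and its share of all words.
for reason, count in reasons.most_common():
    print(f'{reason:>5}: {count:7,d} ({count / total:.1%})')
```

Note that the categories are tested in order, so a word that is both short and contains an 's' is counted under `'s'`.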

# Game Score

The game score for a honeycomb is the sum of the word scores for all the words that the honeycomb can make. How do we know if a honeycomb can make a word? Well, a honeycomb can make a word if the word contains the honeycomb's center and every letter in the word is in the honeycomb. Another way of saying the second condition is that the letters in the word must be a subset of the letters in the honeycomb.
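
The "can make" test can be written down directly with Python's set operators, which mirrors the subset wording above. This is only an illustrative sketch; `can_make` is a helper name I'm introducing here and it is not used in the rest of the notebook.

```python
def can_make(honeycomb, word) -> bool:
    """Can this honeycomb make this word? (Hypothetical helper, for illustration.)"""
    (letters, center) = honeycomb
    # The word must use the center, and its letters must be a subset of the honeycomb's.
    return center in word and set(word) <= set(letters)

honeycomb = ('AEGLMPX', 'G')
candidates = ['GAME', 'AMALGAM', 'MEGAPLEX', 'MAPLE', 'GAMES']
print([w for w in candidates if can_make(honeycomb, w)])
# ['GAME', 'AMALGAM', 'MEGAPLEX']
```

MAPLE is rejected because it lacks the center letter G, and GAMES because S is not in the honeycomb.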

So the brute-force approach to `game_score` is:

In [13]:
def game_score(honeycomb, words):
    """The total score for this honeycomb."""
    (letters, center) = honeycomb
    return sum(word_score(word) for word in words 
               if center in word and all(c in letters for c in word))

Let's try it, and see how long it takes to get the game score for one honeycomb:

In [14]:
honeycomb = ('AEGLMPX', 'G')

%time game_score(honeycomb, enable1)

CPU times: user 9.86 ms, sys: 343 µs, total: 10.2 ms
Wall time: 10 ms


153

About 10 milliseconds. No problem if we only want to do it a few times. But to find the best honeycomb we're going to have to go through 14,741 pangrams, and try each of the 7 possible letters as the center. Note that 14,741 × 7 × 10 milliseconds is about 17 minutes. I could leave it at that, but, for these kinds of puzzles, you don't feel like you're done until you get the runtime under one minute.
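
The estimate above is simple arithmetic, sketched here with the figures quoted in the text. It treats every pangram word as a separate honeycomb letterset, so it slightly overstates the work, since several pangrams can share one letterset. The variable names are mine.

```python
pangram_count = 14_741   # number of pangram words found above
centers = 7              # each of the 7 letters can be the center
seconds_each = 0.010     # about 10 ms per game_score call, from the timing above

total_minutes = pangram_count * centers * seconds_each / 60
print(f'{total_minutes:.1f} minutes')  # 17.2 minutes
```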

# Efficient Game Score

Here's my idea:

1. Go through all the words, compute the `letterset` and `word_score` for each one, and make a table of `{letterset: points}` giving the total number of points that can be made with that letterset. I call this a `points_table`.
2. The above calculations are independent of the honeycomb, so they only need to be done once, not 14,741 × 7 times. Nice saving!
3. Now for each honeycomb, generate every valid **subset** of the letters in the honeycomb. A valid subset must include the center letter, and it may or may not include each of the other 6 letters, so there are exactly $2^6 = 64$ subsets. The function `letter_subsets(honeycomb)` returns these.
4. To compute `game_score`, just take the sum of the 64 entries in the points table.
5. So we're only iterating over 64 lettersets in `game_score` rather than over 44,585 words. That's a nice improvement!

Here's the code:

In [15]:
def game_score(honeycomb, pts_table) -> int:
    """The total score for this honeycomb, given a points_table."""
    return sum(pts_table[s] for s in letter_subsets(honeycomb))

def letter_subsets(honeycomb) -> list:
    """All 64 subsets of the letters in the honeycomb that contain the center letter."""
    (letters, center) = honeycomb
    return [''.join(subset) 
            for n in range(1, 8) 
            for subset in combinations(letters, n)
            if center in subset]

def points_table(words) -> dict:
    """Return a dict of {letterset: points} from words."""
    table = Counter()
    for w in words:
        table[letterset(w)] += word_score(w)
    return table

Let's look into how this works. First the `letter_subsets`:

In [16]:
len(letter_subsets(honeycomb)) # It will always be 64, for any honeycomb

64

In [17]:
letter_subsets(('ABCDE', 'C')) # A small `honeycomb` with only 5 letters gives 2**4 = 16 subsets

['C',
 'AC',
 'BC',
 'CD',
 'CE',
 'ABC',
 'ACD',
 'ACE',
 'BCD',
 'BCE',
 'CDE',
 'ABCD',
 'ABCE',
 'ACDE',
 'BCDE',
 'ABCDE']
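
Note that `letter_subsets` generates every non-empty subset of the letters (127 of them for a 7-letter honeycomb) and keeps only those that contain the center. As an alternative sketch, not the version used in this notebook, the same subsets can be built directly by choosing a subset of the non-center letters and adding the center back in. The name `letter_subsets_direct` is made up for this illustration.

```python
from itertools import combinations

def letter_subsets_direct(honeycomb) -> list:
    """Subsets of the honeycomb's letters that contain the center, each as a sorted str.
    (Illustrative alternative; assumes the honeycomb's letters are distinct.)"""
    (letters, center) = honeycomb
    others = [c for c in letters if c != center]
    # Choose any subset of the other letters, then add the center and re-sort.
    return [''.join(sorted(subset + (center,)))
            for n in range(len(others) + 1)
            for subset in combinations(others, n)]

print(letter_subsets_direct(('ABCDE', 'C')))          # the same 16 subsets as above
print(len(letter_subsets_direct(('AEGLMPX', 'G'))))   # 64
```

Either way there are only 64 table lookups per honeycomb, so the difference is a matter of taste more than speed.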

Now the `points_table` (but first a reminder of our honeycomb and our words and their scores):

In [18]:
honeycomb

('AEGLMPX', 'G')

In [19]:
{w: word_score(w) for w in words}

{'GAME': 1,
 'CACCIATORE': 17,
 'AMALGAM': 7,
 'GLAM': 1,
 'EROTICA': 14,
 'MEGAPLEX': 15}

In [20]:
points_table(words)

Counter({'AEGM': 1, 'ACEIORT': 31, 'AGLM': 8, 'AEGLMPX': 15})

The letterset `'ACEIORT'` gets 31 points, 17 for CACCIATORE and 14 for EROTICA, and the letterset `'AGLM'` gets 8 points, 7 for AMALGAM and 1 for GLAM. The other lettersets represent one word each. Now, finally, we can compute the game score:

In [21]:
game_score(honeycomb, points_table(words))

24

That's 15 points for MEGAPLEX, 7 for AMALGAM, 1 for GLAM and 1 for GAME.
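
Since the table-based score is meant to be a faster way of computing the same number as the brute-force version, a quick consistency check on the tiny word list seems worthwhile. The sketch below re-declares compact versions of the helpers so that it runs on its own; `brute_force_score` and `table_score` are names I chose to keep the two variants apart, and they stand in for the two definitions of `game_score` above.

```python
from collections import Counter
from itertools import combinations

def word_score(word) -> int:
    """Points for a word: 1 for four letters, else its length, plus 7 for a pangram."""
    bonus = 7 if len(set(word)) == 7 else 0
    return 1 if len(word) == 4 else len(word) + bonus

def letterset(word) -> str:
    return ''.join(sorted(set(word)))

def brute_force_score(honeycomb, words) -> int:
    """Score by testing every word against the honeycomb."""
    (letters, center) = honeycomb
    return sum(word_score(w) for w in words
               if center in w and all(c in letters for c in w))

def table_score(honeycomb, words) -> int:
    """Score by summing points-table entries over the letter subsets."""
    (letters, center) = honeycomb
    table = Counter()
    for w in words:
        table[letterset(w)] += word_score(w)
    return sum(table[''.join(s)]
               for n in range(1, len(letters) + 1)
               for s in combinations(letters, n)
               if center in s)

words = {'AMALGAM', 'CACCIATORE', 'EROTICA', 'GAME', 'GLAM', 'MEGAPLEX'}
for honeycomb in [('AEGLMPX', 'G'), ('AEGLMPX', 'X'), ('ACEIORT', 'T')]:
    # The two methods should always agree.
    assert brute_force_score(honeycomb, words) == table_score(honeycomb, words)
    print(honeycomb, table_score(honeycomb, words))
```

With X as the center only MEGAPLEX qualifies, so that honeycomb scores 15 rather than 24.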


The following calculation says that there are about twice as many words as lettersets: on average about two words have the same letterset.



In [22]:
len(enable1) / len(points_table(enable1))

2.058307557361156

# The Solution: The Best Honeycomb

Now that we have an efficient `game_score` function, I can define `best_honeycomb` to search through every possible pangram and center and find the honeycomb that gives the highest game score:

In [23]:
def best_honeycomb(words) -> list:
    """Return [score, honeycomb] for the honeycomb with highest score on these words."""
    pts_table = points_table(words)
    pangrams = [s for s in pts_table if len(s) == 7]
    honeycombs = ((pangram, center) for pangram in pangrams for center in pangram)
    return max([game_score(h, pts_table), h]
               for h in honeycombs)

First the solution for the tiny `words` list:

In [24]:
best_honeycomb(words)

[31, ('ACEIORT', 'T')]
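
Why `'T'` as the center? On the tiny list, both CACCIATORE and EROTICA contain all seven letters of ACEIORT, so every choice of center scores 31 and the result comes down to how `max` breaks ties. The sketch below illustrates this with hand-built candidates (the list `candidates` is mine, written out rather than computed by `best_honeycomb`).

```python
# Candidate [score, honeycomb] pairs for the ACEIORT letterset on the tiny word list:
# every center scores 31, because both words use all seven letters.
candidates = [[31, ('ACEIORT', center)] for center in 'ACEIORT']

# Lists compare element by element, so equal scores fall through to
# comparing the honeycomb tuples, and the alphabetically last center wins.
print(max(candidates))                      # [31, ('ACEIORT', 'T')]

# To break ties differently, compare on the score alone;
# max then returns the first of the tied candidates.
print(max(candidates, key=lambda c: c[0]))  # [31, ('ACEIORT', 'A')]
```

So a tie in score does not mean the reported center is special; it is just the largest one in sort order.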

Now the solution for the problem that The Riddler posed, the big `enable1` word list:

In [25]:
%time best_honeycomb(enable1)

CPU times: user 2.04 s, sys: 7.02 ms, total: 2.05 s
Wall time: 2.06 s


[3898, ('AEGINRT', 'R')]

**Wow! 3898 is a high score!** And it took only 2 seconds to find it!

# Fancier Report

I'd like to see the actual words in addition to the total score, and I'm curious about how the words are divided up by letterset. Here's a function to provide such a report. I remembered that there is a `fill` function in Python (it is in the `textwrap` module) but this all turned out to be more complicated than I expected.

In [26]:
from textwrap import fill

def report(words, honeycomb=None):
    """Print stats and word scores for the given honeycomb (or for the best honeycomb
    if no honeycomb is given) on the given word list."""
    optimal = ("" if honeycomb else "optimal ")
    if not honeycomb:
        _, honeycomb = best_honeycomb(words)
    (letters, center) = honeycomb
    subsets = letter_subsets(honeycomb)
    bins = group_by(words, letterset)
    score = sum(word_score(w) for w in words if letterset(w) in subsets)
    N = sum(len(bins[s]) for s in subsets)
    print(f'For this list of {len(words):,d} words:')
    print(f'The {optimal}honeycomb ({letters}, {center}) forms '
          f'{N} words for {score:,d} points.')
    print('Here are the words formed, with pangrams first:\n')
    for s in sorted(subsets, key=lambda s: (-len(s), s)):
        if bins[s]:
            pts = sum(word_score(w) for w in bins[s])
            print(f'{s} forms {len(bins[s])} words for {pts:,d} points:')
            entries = [f'{w}({word_score(w)})' for w in sorted(bins[s])]  # don't shadow the `words` parameter
            print(fill(' '.join(entries), width=80,
                       initial_indent='    ', subsequent_indent='    '))

def group_by(items, key):
    "Group items into bins of a dict, each bin keyed by key(item)."
    bins = defaultdict(list)
    for item in items:
        bins[key(item)].append(item)
    return bins
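
To see what `group_by` produces, here is a small self-contained usage sketch on four of the tiny-list words. It repeats `group_by` and a compact `letterset` so that it can run on its own.

```python
from collections import defaultdict

def group_by(items, key):
    "Group items into bins of a dict, each bin keyed by key(item)."
    bins = defaultdict(list)
    for item in items:
        bins[key(item)].append(item)
    return bins

def letterset(word) -> str:
    return ''.join(sorted(set(word)))

bins = group_by(['AMALGAM', 'GAME', 'GLAM', 'MEGAPLEX'], letterset)
print(dict(bins))
# {'AGLM': ['AMALGAM', 'GLAM'], 'AEGM': ['GAME'], 'AEGLMPX': ['MEGAPLEX']}

# Because bins is a defaultdict, looking up a missing key returns an empty
# list (and adds that key), which is why report can simply test `if bins[s]`.
print(bins['AEG'])  # []
```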

In [27]:
report(words, honeycomb)

For this list of 6 words:
The honeycomb (AEGLMPX, G) forms 4 words for 24 points.
Here are the words formed, with pangrams first:

AEGLMPX forms 1 words for 15 points:
    MEGAPLEX(15)
AEGM forms 1 words for 1 points:
    GAME(1)
AGLM forms 2 words for 8 points:
    AMALGAM(7) GLAM(1)


In [28]:
report(words)

For this list of 6 words:
The optimal honeycomb (ACEIORT, T) forms 2 words for 31 points.
Here are the words formed, with pangrams first:

ACEIORT forms 2 words for 31 points:
    CACCIATORE(17) EROTICA(14)


In [29]:
report(enable1)

For this list of 44,585 words:
The optimal honeycomb (AEGINRT, R) forms 537 words for 3,898 points.
Here are the words formed, with pangrams first:

AEGINRT forms 50 words for 832 points:
    AERATING(15) AGGREGATING(18) ARGENTINE(16) ARGENTITE(16) ENTERTAINING(19)
    ENTRAINING(17) ENTREATING(17) GARNIERITE(17) GARTERING(16) GENERATING(17)
    GNATTIER(15) GRANITE(14) GRATINE(14) GRATINEE(15) GRATINEEING(18)
    GREATENING(17) INGRATE(14) INGRATIATE(17) INTEGRATE(16) INTEGRATING(18)
    INTENERATING(19) INTERAGE(15) INTERGANG(16) INTERREGNA(17) INTREATING(17)
    ITERATING(16) ITINERATING(18) NATTERING(16) RATTENING(16) REAGGREGATING(20)
    REATTAINING(18) REGENERATING(19) REGRANTING(17) REGRATING(16)
    REINITIATING(19) REINTEGRATE(18) REINTEGRATING(20) REITERATING(18)
    RETAGGING(16) RETAINING(16) RETARGETING(18) RETEARING(16) RETRAINING(17)
    RETREATING(17) TANGERINE(16) TANGIER(14) TARGETING(16) TATTERING(16)
    TEARING(14) TREATING(15)
AEGINR forms 35 words for 270 points