# **State names**

## [Riddler Classic](https://fivethirtyeight.com/features/somethings-fishy-in-the-state-of-the-riddler/), May 22, 2020

### solution by [Laurent Lessard](https://laurentlessard.com)

Ohio is the only state whose name doesn’t share any letters with the word "mackerel." It’s strange, but it’s true.

But that isn’t the only pairing of a state and a word you can say that about — it’s not even the only fish! Kentucky has "goldfish" to itself, Montana has "jellyfish" and Delaware has "monkfish," just to name a few.

What is the longest "mackerel?" That is, what is the longest word that doesn’t share any letters with exactly one state? (If multiple "mackerels" are tied for being the longest, can you find them all?)

_Extra credit:_ Which state has the _most_ "mackerels?" That is, which state has the most words for which it is the only state without any letters in common with those words?

(For both the Riddler and the extra credit, please refer to Friend of the Riddler™ Peter Norvig’s [word list](https://norvig.com/ngrams/enable1.txt).)

---

### Solution approach

Our approach will be one of brute force! Here is the process:
1. For each word in the list, find the states that share no letters with that word.
2. Keep the words that have exactly one such state. These are our "mackerels".
3. Now we can answer the first question: what is the longest mackerel?
4. Rearrange the data so that instead of one state for each word, we have a list of words for each state.
5. Now we can answer the extra credit question: which state has the most mackerels?
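
As a quick sanity check on steps 1 and 2, here is a minimal sketch that applies the "no letters in common" test to two of the examples from the puzzle statement. The names `STATES`, `states_without_shared_letters`, and `is_mackerel` are hypothetical, introduced only for this illustration; they are not part of the notebook cells that follow.

```python
# Sketch of steps 1-2 applied to single words. All names defined here
# are hypothetical and used only for this illustration.
STATES = ["Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado",
          "Connecticut", "Delaware", "Florida", "Georgia", "Hawaii", "Idaho",
          "Illinois", "Indiana", "Iowa", "Kansas", "Kentucky", "Louisiana",
          "Maine", "Maryland", "Massachusetts", "Michigan", "Minnesota",
          "Mississippi", "Missouri", "Montana", "Nebraska", "Nevada",
          "New Hampshire", "New Jersey", "New Mexico", "New York",
          "North Carolina", "North Dakota", "Ohio", "Oklahoma", "Oregon",
          "Pennsylvania", "Rhode Island", "South Carolina", "South Dakota",
          "Tennessee", "Texas", "Utah", "Vermont", "Virginia", "Washington",
          "West Virginia", "Wisconsin", "Wyoming"]


def states_without_shared_letters(word, states=STATES):
    """Return the states whose names share no letters with `word`."""
    word_letters = set(word.lower())
    return [s for s in states
            if set(s.lower().replace(" ", "")).isdisjoint(word_letters)]


def is_mackerel(word, states=STATES):
    """A word is a "mackerel" if exactly one state shares no letters with it."""
    return len(states_without_shared_letters(word, states)) == 1


for word in ["mackerel", "goldfish"]:
    print(word, states_without_shared_letters(word), is_mackerel(word))
```

Running this on the full word list instead of two hand-picked words is exactly the brute-force search described above.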

In [6]:
states = ["Alabama","Alaska","Arizona","Arkansas","California","Colorado",
  "Connecticut","Delaware","Florida","Georgia","Hawaii","Idaho","Illinois",
  "Indiana","Iowa","Kansas","Kentucky","Louisiana","Maine","Maryland",
  "Massachusetts","Michigan","Minnesota","Mississippi","Missouri","Montana",
  "Nebraska","Nevada","New Hampshire","New Jersey","New Mexico","New York",
  "North Carolina","North Dakota","Ohio","Oklahoma","Oregon","Pennsylvania",
  "Rhode Island","South Carolina","South Dakota","Tennessee","Texas","Utah",
  "Vermont","Virginia","Washington","West Virginia","Wisconsin","Wyoming"]

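# assemble word list (one word per line)
# strip() removes the line ending without clipping the last word
# if the file has no trailing newline
with open("enable1.txt","r") as f:
    words = [ w.strip() for w in f ]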
N = len(words)
print('The word list contains', N, 'words.')

# words = words[:100]

The word list contains 172820 words.


In [8]:
%%time
# for each word, figure out which states have no letters in common.
# keep only the words for which this is true for exactly one state.
mackerels = dict()
# letters of each state name, lowercased, with spaces removed
state_letter_list = [ set(state.lower().replace(" ","")) for state in states ]
for word in words:
    word_letters = set(word)
    # states that have no letters in common with this word
    state_candidates = [ state for state,letters in zip(states,state_letter_list)
                         if letters.isdisjoint(word_letters) ]
    if len(state_candidates) == 1:
        mackerels[word] = state_candidates[0]

Wall time: 7.22 s
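
The same disjointness test can also be written with integer bitmasks instead of Python sets. This is a different technique from the one used in the cell above, shown only as a sketch: each letter a-z is assigned one bit, and two strings share no letters exactly when the bitwise AND of their masks is zero. The names `letter_mask`, `find_mackerels`, `demo_states`, and `demo_words` are hypothetical, and no timing comparison has been made here.

```python
def letter_mask(text):
    """Encode the set of letters a-z appearing in `text` as a 26-bit integer."""
    mask = 0
    for ch in text.lower():
        if "a" <= ch <= "z":  # skips spaces and any other non-letter
            mask |= 1 << (ord(ch) - ord("a"))
    return mask


def find_mackerels(words, states):
    """Map each word to its state when exactly one state shares no letters with it."""
    state_masks = [(state, letter_mask(state)) for state in states]
    result = {}
    for word in words:
        word_mask = letter_mask(word)
        # a zero AND means the two letter sets are disjoint
        candidates = [s for s, m in state_masks if (m & word_mask) == 0]
        if len(candidates) == 1:
            result[word] = candidates[0]
    return result


# Tiny demo with only three states, so the result is relative to this
# short list and not to all fifty states.
demo_states = ["Ohio", "Utah", "Kentucky"]
demo_words = ["mackerel", "goldfish", "zzz"]
print(find_mackerels(demo_words, demo_states))
```

In the demo, "zzz" is dropped because it shares no letters with any of the three states, so it has three candidate states, not one.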


In [9]:
# show the longest mackerels and their associated state
longest_mack_length = max( len(mack) for mack in mackerels )
longest_mackerels = [ mack for mack in mackerels if len(mack) == longest_mack_length ]
[(mack, mackerels[mack]) for mack in longest_mackerels]

[('counterconditionings', 'Alabama'),
 ('expressionlessnesses', 'Utah'),
 ('hyperconsciousnesses', 'Alabama'),
 ('hypersensitivenesses', 'Alabama'),
 ('interconnectednesses', 'Alabama'),
 ('microelectrophoretic', 'Kansas'),
 ('nondestructivenesses', 'Alabama'),
 ('overprotectivenesses', 'Alabama')]

In [11]:
# rearrange so we have a list of mackerels for each state
# print the states with the most mackerels
mackerels_per_state = dict()

for state in states:
    mackerels_per_state[state] = []

for mack,state in mackerels.items():
    mackerels_per_state[state].append(mack)
    
number_of_mackerels = [ (state, len(mackerels_per_state[state])) for state in states ]
number_of_mackerels_sorted = sorted( number_of_mackerels, key=lambda x: x[1], reverse=True )
number_of_mackerels_sorted[:5]

[('Ohio', 7523),
 ('Alabama', 5181),
 ('Utah', 4300),
 ('Mississippi', 3339),
 ('Hawaii', 1175)]
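
The regrouping in steps 4 and 5 can also be sketched with the standard library's `collections.defaultdict` and `collections.Counter`. The small `sample_mackerels` dictionary below is a hand-picked subset of the word-state pairs quoted in the puzzle statement and in the longest-mackerel output, used only for illustration; the names `sample_mackerels`, `words_by_state`, and `counts` are hypothetical.

```python
from collections import Counter, defaultdict

# Hand-picked word -> state pairs taken from the text above (illustration only).
sample_mackerels = {
    "mackerel": "Ohio",
    "goldfish": "Kentucky",
    "jellyfish": "Montana",
    "monkfish": "Delaware",
    "counterconditionings": "Alabama",
    "hyperconsciousnesses": "Alabama",
    "expressionlessnesses": "Utah",
}

# Step 4: invert the mapping so each state points to its list of words.
words_by_state = defaultdict(list)
for word, state in sample_mackerels.items():
    words_by_state[state].append(word)

# Step 5: count words per state and pick the state with the most.
counts = Counter(sample_mackerels.values())
top_state, top_count = counts.most_common(1)[0]
print(top_state, top_count)
```

One difference from the cell above: `Counter` only contains states that have at least one mackerel, whereas `mackerels_per_state` starts with an empty list for every state, so states with zero mackerels still appear there.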