# Mackerels

FiveThirtyEight's Riddler asks "[what is the longest word that has no letters in common with the name of *exactly one* U.S. state](https://fivethirtyeight.com/features/somethings-fishy-in-the-state-of-the-riddler/)?" The examples given are the words "mackerel" (which has a letter in common with every state except Ohio), "goldfish" (Kentucky), "jellyfish" (Montana), and "monkfish" (Delaware). Given a word list, we can find each state's "mackerels" using set intersections. For any given word, we find the state names that have no letters in common with that word; if there is exactly one such state, the word is a mackerel for that state.

I will use Julia to work this problem, in an attempt to learn more of the language. We start by getting the provided word list:

In [1]:
using HTTP, Gumbo

In [2]:
res = HTTP.get("https://norvig.com/ngrams/word.list")
html = parsehtml(String(res.body));

In [3]:
wordlist = split(html.root[2][1].text)

263533-element Array{SubString{String},1}:
 "aa"          
 "aah"         
 "aahed"       
 "aahing"      
 "aahs"        
 "aal"         
 "aalii"       
 "aaliis"      
 "aals"        
 "aardvark"    
 "aardvarks"   
 "aardwolf"    
 "aardwolves"  
 ⋮             
 "zymotechnics"
 "zymotic"     
 "zymotically" 
 "zymotics"    
 "zymurgic"    
 "zymurgies"   
 "zymurgy"     
 "zythum"      
 "zythums"     
 "zyzzyva"     
 "zyzzyvas"    
 "zzz"         

We also need a list of states (preferably in lowercase, so we don't have to deal with case-sensitivity later):

In [4]:
statelist=["alabama"; "alaska"; "arizona"; "arkansas"; "california"; "colorado"; "connecticut"; "delaware"; "florida"; "georgia"; "hawaii"; "idaho"; "illinois"; "indiana"; "iowa"; "kansas"; "kentucky"; "louisiana"; "maine"; "maryland"; "massachusetts"; "michigan"; "minnesota"; "mississippi"; "missouri"; "montana"; "nebraska"; "nevada"; "new hampshire"; "new jersey"; "new mexico"; "new york"; "north carolina"; "north dakota"; "ohio"; "oklahoma"; "oregon"; "pennsylvania"; "rhode island"; "south carolina"; "south dakota"; "tennessee"; "texas"; "utah"; "vermont"; "virginia"; "washington"; "west virginia"; "wisconsin"; "wyoming"];
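
Before running anything over the full word list, the set-intersection check can be tried on the Riddler's own examples. This is only a minimal sketch: `disjointstates` is a hypothetical helper name, and `samplestates` is a hand-picked subset of the state list rather than all fifty.

```julia
# Hypothetical helper: the states that share no letters with `word`.
# Spaces in names like "new york" are harmless because the words contain none.
disjointstates(word, states) =
    filter(state -> isempty(intersect(Set(word), Set(state))), states)

# A small illustrative subset of the state list, not the full fifty.
samplestates = ["ohio", "kentucky", "montana", "delaware", "texas"]

for word in ["mackerel", "goldfish", "jellyfish", "monkfish"]
    println(word, " => ", disjointstates(word, samplestates))
end
```

On this subset, each example word should single out the state named in the riddle; a word is a mackerel only if that also holds against the complete list.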

## Build the list of mackerels

We start by creating a dictionary with one entry per state, each holding that state's (initially empty) list of mackerels. We'll also save the length of the longest mackerel; this will come in handy later.

In [5]:
mackerels = Dict(state => Dict("maxlength" => 0, "words" => []) for state in statelist)

Dict{String,Dict{String,Any}} with 50 entries:
  "maryland"      => Dict{String,Any}("maxlength"=>0,"words"=>Any[])
  "missouri"      => Dict{String,Any}("maxlength"=>0,"words"=>Any[])
  "new mexico"    => Dict{String,Any}("maxlength"=>0,"words"=>Any[])
  "wyoming"       => Dict{String,Any}("maxlength"=>0,"words"=>Any[])
  "delaware"      => Dict{String,Any}("maxlength"=>0,"words"=>Any[])
  "washington"    => Dict{String,Any}("maxlength"=>0,"words"=>Any[])
  "kentucky"      => Dict{String,Any}("maxlength"=>0,"words"=>Any[])
  "iowa"          => Dict{String,Any}("maxlength"=>0,"words"=>Any[])
  "colorado"      => Dict{String,Any}("maxlength"=>0,"words"=>Any[])
  "oklahoma"      => Dict{String,Any}("maxlength"=>0,"words"=>Any[])
  "california"    => Dict{String,Any}("maxlength"=>0,"words"=>Any[])
  "rhode island"  => Dict{String,Any}("maxlength"=>0,"words"=>Any[])
  "illinois"      => Dict{String,Any}("maxlength"=>0,"words"=>Any[])
  "pennsylvania"  => Dict{String,Any}("maxlength"=>0,"wo

Now, loop through the word list. Any word that has no letters in common with *exactly* one state is added to that state's list of mackerels. In theory, we could do this faster by looping through the state list separately, and breaking out of that loop as soon as we found a second state with no common letters, but modern processor speeds mean we don't need to prematurely optimize, amirite?

In [6]:
for word in wordlist
    # states that share no letters with this word
    disjoint = filter(state -> isempty(intersect(word, state)), statelist)
    if length(disjoint) == 1
        state = disjoint[1]
        push!(mackerels[state]["words"], word)
        # track the longest mackerel seen so far for this state
        if length(word) > mackerels[state]["maxlength"]
            mackerels[state]["maxlength"] = length(word)
        end
    end
end
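
For comparison, the early-exit idea mentioned above might look something like the following. It is a sketch, not what the notebook actually ran: `solestate` is a hypothetical helper, and `samplestates` is again a small illustrative subset of the state list.

```julia
# Hypothetical helper: return the single state sharing no letters with `word`,
# or `nothing` if no state qualifies or more than one does.
function solestate(word, states)
    letters = Set(word)
    found = nothing
    for state in states
        if !any(c -> c in letters, state)
            # a second qualifying state means the word is not a mackerel
            found === nothing || return nothing
            found = state
        end
    end
    return found
end

samplestates = ["ohio", "kentucky", "montana", "delaware", "texas"]
solestate("mackerel", samplestates)   # only "ohio" shares no letters with it
solestate("zzz", samplestates)        # every sample state qualifies, so `nothing`
```

The loop stops scanning states as soon as a second match turns up, which is where any speedup would come from.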

In [7]:
mackerels

Dict{String,Dict{String,Any}} with 50 entries:
  "maryland"      => Dict{String,Any}("maxlength"=>13,"words"=>Any["beechiest",…
  "missouri"      => Dict{String,Any}("maxlength"=>12,"words"=>Any["acalephan",…
  "new mexico"    => Dict{String,Any}("maxlength"=>13,"words"=>Any["ashtray", "…
  "wyoming"       => Dict{String,Any}("maxlength"=>15,"words"=>Any["abashed", "…
  "delaware"      => Dict{String,Any}("maxlength"=>17,"words"=>Any["bionts", "b…
  "washington"    => Dict{String,Any}("maxlength"=>0,"words"=>Any[])
  "kentucky"      => Dict{String,Any}("maxlength"=>15,"words"=>Any["abhors", "a…
  "iowa"          => Dict{String,Any}("maxlength"=>16,"words"=>Any["bedrenches"…
  "colorado"      => Dict{String,Any}("maxlength"=>16,"words"=>Any["begunking",…
  "oklahoma"      => Dict{String,Any}("maxlength"=>19,"words"=>Any["becrusting"…
  "california"    => Dict{String,Any}("maxlength"=>0,"words"=>Any[])
  "rhode island"  => Dict{String,Any}("maxlength"=>0,"words"=>Any[])
  "illinois"     

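The first thing you might notice is that several states have no mackerels. This probably isn't surprising: "Nebraska", for example, contains every letter of "Kansas", so any word that avoids all of Nebraska's letters also avoids all of Kansas's and can never single out Nebraska alone.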
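
One way to spot states that can never have a mackerel is to look for names whose letters include every letter of some other state's name. The sketch below assumes that reasoning; `lettersof` and `shadowedstates` are hypothetical helper names, and `samplestates` is a small illustrative subset of the state list.

```julia
# Letters of a state name, ignoring the space in two-word names.
lettersof(state) = Set(c for c in state if c != ' ')

# Hypothetical helper: states whose letters include every letter of another state.
function shadowedstates(states)
    filter(states) do s
        any(t -> t != s && issubset(lettersof(t), lettersof(s)), states)
    end
end

samplestates = ["kansas", "nebraska", "ohio", "washington", "texas"]
shadowedstates(samplestates)   # "nebraska" covers "kansas"; "washington" covers "ohio"
```

This condition is sufficient but not necessary: a state can pass the check and still end up with no mackerels in a particular word list.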

### Find the longest words
Since we've saved the length of the longest mackerel for each state, we can find the longest overall by looking for the largest value(s), then grabbing only words of that length from the corresponding word list(s).

In [8]:
maxlength = maximum(mackerels[state]["maxlength"] for state in statelist)
maxstates = filter(state -> mackerels[state]["maxlength"] == maxlength, statelist)
maxentries = Dict(state => filter(word -> length(word) == maxlength,
    mackerels[state]["words"]) for state in maxstates)

Dict{String,Array{Any,1}} with 2 entries:
  "mississippi" => Any["hydrochlorofluorocarbon"]
  "alabama"     => Any["counterproductivenesses"]

Look at that! There was a tie, and between neighboring states no less.

### Extra Credit
The extra credit question asks which states have the *most* mackerels. Again, we've already got the wordlists, so we can just look for the longest one.

In [9]:
maxcount = maximum(length(mackerels[state]["words"]) for state in statelist)
moststates = filter(entry -> length(last(entry)["words"]) == maxcount, mackerels)

Dict{String,Dict{String,Any}} with 1 entry:
  "ohio" => Dict{String,Any}("maxlength"=>20,"words"=>Any["abamperes", "abands"…

In [10]:
mackerels["ohio"]["words"]

11342-element Array{Any,1}:
 "abamperes" 
 "abands"    
 "abasedly"  
 "abasement" 
 "abasements"
 "abatement" 
 "abatements"
 "abbeys"    
 "abducens"  
 "abducentes"
 "abends"    
 "aberrances"
 "aberrants" 
 ⋮           
 "zelants"   
 "zemstva"   
 "zenanas"   
 "zettabytes"
 "zeugmas"   
 "zugzwangs" 
 "zupan"     
 "zupans"    
 "zygantrum" 
 "zygantrums"
 "zymase"    
 "zymases"   

Holy cow! Did not expect to see quite so long a list. Some excellent "z" words there.

### More extra credit
Of the states with at least one mackerel, which has the fewest? And what is/are the *shortest* mackerel(s)?

In [11]:
# need to filter out the zero length counts this time, so use a conditional
mincount = minimum(length(mackerels[state]["words"]) for state in statelist if mackerels[state]["maxlength"] > 0)
leaststates = filter(entry -> length(last(entry)["words"]) == mincount, mackerels)

Dict{String,Dict{String,Any}} with 1 entry:
  "michigan" => Dict{String,Any}("maxlength"=>13,"words"=>Any["bestrowed", "out…

In [12]:
mackerels["michigan"]["words"]

4-element Array{Any,1}:
 "bestrowed"    
 "outwrestled"  
 "overwrestled" 
 "woodburytypes"

In [13]:
# we didn't save the minimum length mackerels, so this requires extra steps
# (named minwordlength rather than findmin, which would shadow Base.findmin)
function minwordlength(words)
    return minimum(length(word) for word in words)
end

minwordlength (generic function with 1 method)

In [14]:
minlength = minimum(minwordlength(mackerels[state]["words"])
    for state in statelist if mackerels[state]["maxlength"] > 0)

minstates = filter(state -> mackerels[state]["maxlength"] > 0 &&
    minwordlength(mackerels[state]["words"]) == minlength, statelist)

minentries = Dict(state => filter(word -> length(word) == minlength,
    mackerels[state]["words"]) for state in minstates)

Dict{String,Array{Any,1}} with 1 entry:
  "ohio" => Any["man", "mna", "nam", "nas", "san"]

Since Ohio has the largest list of mackerels, it's not surprising that it would also have the shortest ones, though being the only state with mackerels that short is a surprise.