# Anagramic squares

### Problem 98

By replacing each of the letters in the word CARE with 1, 2, 9, and 6 respectively, we form a square number: 1296 = 36^2. What is remarkable is that, by using the same digital substitutions, the anagram, RACE, also forms a square number: 9216 = 96^2. We shall call CARE (and RACE) a square anagram word pair and specify further that leading zeroes are not permitted, neither may a different letter have the same digital value as another letter.

Using words.txt (right click and 'Save Link/Target As...'), a 16K text file containing nearly two-thousand common English words, find all the square anagram word pairs (a palindromic word is NOT considered to be an anagram of itself).

What is the largest square number formed by any member of such a pair?

NOTE: All anagrams formed must be contained in the given text file.

In [67]:
using CBD, IterTools, Combinatorics

In [111]:
#download("https://projecteuler.net/project/resources/p098_words.txt", "$(pwd())/euler98.txt" )

As expected, 1786 words in file: euler98.txt.

In [112]:
using DelimitedFiles  # readcsv was removed in Julia 1.0; readdlm with ',' replaces it
words = vec(readdlm("euler98.txt", ','))

1786-element Array{Any,1}:
 "A"         
 "ABILITY"   
 "ABLE"      
 "ABOUT"     
 "ABOVE"     
 "ABSENCE"   
 "ABSOLUTELY"
 "ACADEMIC"  
 "ACCEPT"    
 "ACCESS"    
 "ACCIDENT"  
 "ACCOMPANY" 
 "ACCORDING" 
 ⋮           
 "WRONG"     
 "YARD"      
 "YEAH"      
 "YEAR"      
 "YES"       
 "YESTERDAY" 
 "YET"       
 "YOU"       
 "YOUNG"     
 "YOUR"      
 "YOURSELF"  
 "YOUTH"     

In [121]:
[a for a in groupby(length, sort(words,by=length))]

14-element Array{Array{Any,1},1}:
 Any["A", "I"]                                                                                                                                                                                                                                                                                                     
 Any["AN", "AS", "AT", "BE", "BY", "DO", "GO", "HE", "IF", "IN"  …  "MY", "NO", "OF", "ON", "OR", "SO", "TO", "UP", "US", "WE"]                                                                                                                                                                                    
 Any["ACT", "ADD", "AGE", "AGO", "AID", "AIM", "AIR", "ALL", "AND", "ANY"  …  "USE", "VIA", "WAR", "WAY", "WHO", "WHY", "WIN", "YES", "YET", "YOU"]                                                                                                                                                                
 Any["ABLE", "ACID", "ALSO", "AREA", "ARMY

Haven't lost any . . .

In [122]:
sum(map(e->size(e)[1], [a for a in groupby(length, sort(words,by=length))]))

Find all the anagrams in a list of words that all have the same length.

In [123]:
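function is_anagram(w1, w2)
    # Two words are anagrams when their sorted letters match.
    sort(split(w1,"")) == sort(split(w2,""))
end

function find_anagrams(words)
    # words are all of equal length
    anagrams = []
    for (idx, w) in enumerate(words)
        anagram_of = String[w]
        # Only compare w with itself and later words, so each pair is reported once.
        for i in words[idx:end]
            if w != i && is_anagram(w, i)
                push!(anagram_of, i)
            end
        end
        # Keep the entry only if at least one anagram of w was found.
        if anagram_of != [w]
            push!(anagrams, anagram_of)
        end
    end
    anagrams
end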

find_anagrams (generic function with 1 method)
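The pairwise comparison above is quadratic in the number of words. As a minimal sketch of an alternative, assuming Julia 1.x, words can be grouped by their sorted letters so that every anagram set falls out of a single pass. The name `anagram_groups` is hypothetical and not part of the notebook.

```julia
# Sketch only: `anagram_groups` is a hypothetical helper, not the notebook's find_anagrams.
# Words that share the same sorted-letter signature are anagrams of each other.
function anagram_groups(words)
    groups = Dict{String,Vector{String}}()
    for w in words
        signature = join(sort(collect(w)))   # e.g. "CARE" -> "ACER"
        push!(get!(groups, signature, String[]), w)
    end
    # Keep only signatures shared by at least two words.
    [g for g in values(groups) if length(g) > 1]
end

sample = ["CARE", "RACE", "NO", "ON", "YOUTH"]
for g in anagram_groups(sample)
    println(g)
end
```

Unlike find_anagrams, this returns each anagram set once, however many words it contains. The order of the returned groups is not guaranteed, because it follows the Dict's iteration order.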

Now let's find all of the anagrams in euler98.txt, grouped by length of anagram.

In [137]:
anagrams = [find_anagrams(g) for g in groupby(length, sort(words, by=length))]

14-element Array{Array{Any,1},1}:
 Any[]                                                                                                                                                                                                                                                                                                                                                                                                                              
 Any[String["NO", "ON"]]                                                                                                                                                                                                                                                                                                                                                                                                            
 Any[String["ACT", "CAT"], String["DOG", "GOD"], String["EAT", "TEA"], String["HOW", "WHO"], String["ITS", "SIT"], String["N

Since no leading zeros are permitted, a square with more digits is always larger than one with fewer, so the largest square must come from the longest anagram pair that has a solution. Therefore, when searching for solutions, start with the longest pairs and work down. In this case start with pairs of length=9, i.e., "INTRODUCE", "REDUCTION", and then try length=8, etc.

By the constraints of the problem, we are looking for encodings, i.e., numbers that 1) are squares and 2) have all-distinct digits. (Requiring distinct digits covers words whose letters are all different; a word with a repeated letter would map to a number with a repeated digit.) This function finds them by number of digits. For example, unique_squaresbypow(3) returns all of the three-digit numbers that satisfy the constraints above.

In [316]:
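function unique_squaresbypow(pow)
    # Smallest and largest n whose square has exactly `pow` digits.
    (from, to) = ceil(Integer,sqrt(10^(pow-1))),floor(Integer, sqrt((10^pow)-1))
    # from:to avoids the positional range(start, length) form, which newer Julia
    # versions no longer accept with that meaning.
    [n^2 for n = from:to if allunique(digits(n^2))]
end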

unique_squaresbypow (generic function with 1 method)

Let's try it out on something simple. It seems to work.

In [317]:
@time unique_squaresbypow(3)

  0.113601 seconds (22.25 k allocations: 1.251 MiB)


13-element Array{Int64,1}:
 169
 196
 256
 289
 324
 361
 529
 576
 625
 729
 784
 841
 961

Now something harder. Still very fast. (An earlier version that built up permutations of combinations took 12 seconds. This is much better.)

In [318]:
@time unique_squaresbypow(9)

  0.034730 seconds (108.13 k allocations: 13.200 MiB, 26.43% gc time)


83-element Array{Int64,1}:
 102495376
 102576384
 102738496
 104325796
 105637284
 139854276
 152843769
 157326849
 158306724
 158407396
 172843609
 176039824
 176305284
         ⋮
 740329681
 743816529
 783104256
 793605241
 798401536
 803495716
 816930724
 825470361
 842973156
 847159236
 853107264
 923187456

And now, let's see how many there are in total.

In [319]:
@time for p in 2:9
    println("$p: $(length(unique_squaresbypow(p)))")
end

2: 6
3: 13
4: 36
5: 66
6: 96
7: 123
8: 97
9: 83
  0.035376 seconds (158.56 k allocations: 19.164 MiB, 14.89% gc time)
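The bounds in unique_squaresbypow go through floating-point `sqrt`. As a hedged variant for Julia 1.x, the same range can be computed with the integer `isqrt`, which sidesteps any rounding concerns. `unique_squares_exact` is a hypothetical name used only for this sketch.

```julia
# Sketch only: hypothetical variant of unique_squaresbypow using integer arithmetic.
function unique_squares_exact(ndigits::Integer)
    lo = 10^(ndigits - 1)          # smallest number with ndigits digits
    hi = 10^ndigits - 1            # largest number with ndigits digits
    from = isqrt(lo - 1) + 1       # smallest n with n^2 >= lo
    to = isqrt(hi)                 # largest n with n^2 <= hi
    [n^2 for n in from:to if allunique(digits(n^2))]
end

println(unique_squares_exact(2))   # two-digit squares with distinct digits
```

For two and three digits this should give the same counts as the table above (6 and 13).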


In [320]:
function encode(word, cipher::Integer)
   Dict(k=>v for (k,v) in zip(split(reverse(word),""), digits(cipher)))
end

encode (generic function with 1 method)

In [321]:
function decode(word, encoding)
    str = ""
    for l in split(word,"")
        str = string(str,encoding[l])
    end
    parse(Int, str)
end

decode (generic function with 1 method)

In [322]:
encode("care", 1296)

Dict{SubString{String},Int64} with 4 entries:
  "c" => 1
  "e" => 6
  "r" => 9
  "a" => 2

In [323]:
decode("care", encode("race", 9216))

In [324]:
encode("race", 9216 )

Dict{SubString{String},Int64} with 4 entries:
  "c" => 1
  "e" => 6
  "r" => 9
  "a" => 2

In [325]:
encode("care", 1296) == encode("race", 9216 )

Here is where all the work is done. Take a list of anagram pairs (all of the same length) and loop through them one pair at a time. Using the ciphers, which are the output of unique_squaresbypow(), encode each word of the pair with every cipher and test whether the two words ever share the same encoding. (Remember that an encoding is a Dict mapping each letter to a digit.)

In [326]:
function sameencoding(anagram_pairs, ciphers)
    # Only the first two words of each entry are compared.
    for (anagram1, anagram2) in anagram_pairs
        for e1 in map(c->encode(anagram1, c), ciphers)
            for e2 in map(c->encode(anagram2, c), ciphers)
                # Equal Dicts mean both words are squares under one substitution.
                if e1 == e2
                    println(" $e1, $anagram1->$(decode(anagram1,e1)), $anagram2->$(decode(anagram2,e2))")
                end
            end
        end
    end
end

sameencoding (generic function with 2 methods)
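The nested loop above compares every encoding of one word against every encoding of the other. A possible shortcut, sketched here on the assumption that the words have no repeated letters: build the letter-to-digit map from the first word and one candidate square, apply it to the second word, and test whether the result is a square without a leading zero. `square_pairs` and `is_square` are hypothetical names, not part of the notebook.

```julia
# Sketch only: hypothetical helpers, assuming words whose letters are all distinct.
is_square(n::Integer) = isqrt(n)^2 == n

function square_pairs(w1::AbstractString, w2::AbstractString)
    len = length(w1)
    found = Tuple{Int,Int}[]
    for n in (isqrt(10^(len - 1) - 1) + 1):isqrt(10^len - 1)
        sq = n^2
        ds = reverse(digits(sq))            # most significant digit first
        allunique(ds) || continue           # distinct letters need distinct digits
        mapping = Dict(zip(w1, ds))         # letter => digit, taken from w1
        ds2 = [mapping[c] for c in w2]      # same substitution applied to w2
        ds2[1] == 0 && continue             # no leading zero allowed
        m = foldl((acc, d) -> 10 * acc + d, ds2)
        is_square(m) && push!(found, (sq, m))
    end
    found
end

println(square_pairs("CARE", "RACE"))
```

This does one pass over the candidate squares per pair instead of comparing all encodings against each other, at the cost of one `isqrt` check per candidate.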

Working down by trial and error from length 9, the first length that produces a match is 5. The solution is therefore BROAD->18769, the larger square of that pair, since any match among shorter words has fewer digits.

In [327]:
@time sameencoding(anagrams[5],unique_squaresbypow(5))

 Dict("B"=>1,"A"=>6,"D"=>9,"R"=>8,"O"=>7), BOARD->17689, BROAD->18769
  0.260644 seconds (1.38 M allocations: 73.887 MiB, 6.44% gc time)


So, just to be able to time it, I put all of this together in a little script. Note: I haven't spent the time to clean this up so that it stops once the longest length with a solution has been handled. Moreover, I still need to examine the output by eye to find the largest square. Easy to spend more time, but why?

In [328]:
function euler98()
    anagrams = [find_anagrams(g) for g in groupby(length, sort(words, by=length))]
    for i = 9:-1:2
        sameencoding(anagrams[i], unique_squaresbypow(i))
    end
end

euler98 (generic function with 1 method)
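To avoid reading the answer off the printed output, one option is to return the maximum directly and stop at the first length that yields a match. The sketch below is self-contained, assumes Julia 1.x, and skips words with repeated letters. `largest_anagram_square` is a hypothetical name, and the tiny word list is for illustration only.

```julia
# Sketch only: hypothetical function; words with repeated letters are skipped.
function largest_anagram_square(words)
    # Group words by sorted-letter signature; anagrams share a signature.
    groups = Dict{String,Vector{String}}()
    for w in words
        push!(get!(groups, join(sort(collect(w))), String[]), w)
    end
    sets = [g for g in values(groups) if length(g) > 1]
    # Longest words first: more digits always means a larger number.
    for len in sort(unique(length.(first.(sets))), rev=true)
        best = 0
        squares = [n^2 for n in (isqrt(10^(len - 1) - 1) + 1):isqrt(10^len - 1)
                   if allunique(digits(n^2))]
        for g in sets, w1 in g, w2 in g
            (length(w1) == len && w1 != w2 && allunique(w1)) || continue
            for sq in squares
                mapping = Dict(zip(w1, reverse(digits(sq))))
                m = foldl((acc, c) -> 10 * acc + mapping[c], w2; init=0)
                # m must be a square with the full number of digits (no leading zero).
                if m >= 10^(len - 1) && isqrt(m)^2 == m
                    best = max(best, sq, m)
                end
            end
        end
        best > 0 && return best   # stop at the first length with a match
    end
    return 0                      # no square anagram pair found
end

println(largest_anagram_square(["CARE", "RACE", "BOARD", "BROAD", "NO", "ON"]))
```

For this sample the function should return 18769, matching the BOARD/BROAD result found above.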

In [314]:
@time euler98()

 Dict("B"=>1,"A"=>6,"D"=>9,"R"=>8,"O"=>7), BOARD->17689, BROAD->18769
 Dict("A"=>2,"C"=>9,"E"=>6,"R"=>1), CARE->9216, RACE->1296
 Dict("A"=>2,"C"=>1,"E"=>6,"R"=>9), CARE->1296, RACE->9216
 Dict("A"=>6,"D"=>4,"L"=>1,"E"=>7), DEAL->4761, LEAD->1764
 Dict("A"=>6,"D"=>1,"L"=>4,"E"=>7), DEAL->1764, LEAD->4761
 Dict("S"=>1,"A"=>9,"T"=>6,"E"=>2), EAST->2916, SEAT->1296
 Dict("I"=>2,"L"=>1,"E"=>6,"F"=>9), FILE->9216, LIFE->1296
 Dict("I"=>2,"L"=>9,"E"=>6,"F"=>1), FILE->1296, LIFE->9216
 Dict("A"=>3,"T"=>6,"E"=>9,"H"=>1), HATE->1369, HEAT->1936
 Dict("A"=>3,"M"=>1,"L"=>6,"E"=>9), MALE->1369, MEAL->1936
 Dict("A"=>0,"M"=>9,"N"=>4,"E"=>6), MEAN->9604, NAME->4096
 Dict("A"=>0,"M"=>2,"N"=>1,"E"=>4), MEAN->2401, NAME->1024
 Dict("T"=>1,"N"=>9,"E"=>6,"O"=>2), NOTE->9216, TONE->1296
 Dict("T"=>9,"N"=>1,"E"=>6,"O"=>2), NOTE->1296, TONE->9216
 Dict("S"=>1,"T"=>6,"P"=>2,"O"=>9), POST->2916, SPOT->1296
 Dict("A"=>0,"T"=>9,"E"=>6,"R"=>4), RATE->4096, TEAR->9604
 Dict("A"=>0,"T"=>2,"E"=>4,"R"=>1), RATE->102