# F# Advent Calendar 2022

My contribution to the F# Advent Calendar 2022 is a relatively short post on using F# to help solve Wordle.
For those who've been living under a rock this year, Wordle is a word-guessing game created by Josh Wardle and now owned by the NY Times.

Users iteratively attempt to guess a secret word, and each guess is 'scored' after submission. A letter is scored green if it's in the correct position in the secret word, yellow if it's in the word but not in the correct position, and grey if it's not contained in the word.  The user continues to guess until the scored guess contains all green letters.

The first thing to do is some basic analysis of five-letter words, to see if there are any patterns.  I found a dictionary on the internet, which is not necessarily the same five-letter word list that the official Wordle app uses.

### Part 2 - Initial Investigation
First of all, some useful functions to load the words from the file.

In [14]:
open System.IO

let rec listToString l =
    match l with
    | [] -> ""
    | head :: tail -> head.ToString() + listToString tail

let sortString s =
    s |> Seq.sort |> Seq.toList |> listToString

// load given word list and ensure 5 letter words selected
let fiveLetterWords fileName = 
    fileName 
    |> File.ReadAllLines 
    |> Seq.filter (fun w -> w.Length = 5)

In [15]:
let words = fiveLetterWords @"/Users/ioanwilliams/github/furry-palm-tree/words.txt"

In [16]:
let frequencies = 
    words 
    |> Seq.collect id 
    |> Seq.countBy id 
    |> Seq.sortByDescending snd

let frequenciesMap =
    frequencies
    |> Map.ofSeq

frequencies |> Seq.take 10


index,Item1,Item2
0,s,4331
1,e,4303
2,a,3665
3,r,2733
4,o,2712
5,i,2428
6,l,2293
7,t,2154
8,n,1867
9,d,1611


Unsurprisingly, the letter e features near the top of the list, although I didn't expect the letter s to be so frequent.
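To make the counting pipeline concrete, here is a minimal sketch of the same `Seq.collect id |> Seq.countBy id` idea on a made-up three-word list. The names `sampleWords`, `sampleFrequencies` and `countFor` are hypothetical and only exist for this illustration; the words are not taken from the dictionary file.

```fsharp
// Hypothetical toy word list, for illustration only.
let sampleWords = [ "arose"; "unlit"; "stare" ]

let sampleFrequencies =
    sampleWords
    |> Seq.collect id            // flatten every word into its characters
    |> Seq.countBy id            // count how often each character occurs
    |> Seq.sortByDescending snd  // most frequent letters first
    |> Seq.toList

// Look up the count for one letter, defaulting to 0 when it never appears.
let countFor letter =
    sampleFrequencies
    |> List.tryFind (fun (c, _) -> c = letter)
    |> Option.map snd
    |> Option.defaultValue 0
```

In this toy list 'a' appears twice (in "arose" and "stare"), 'o' once, and 'z' not at all, so `countFor` gives 2, 1 and 0 for those letters.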

Next we want to score each word by the frequency of its individual letters, to see if there is a word that contains all the most frequent letters.

The five most frequent letters, in frequency order, are 's', 'e', 'a', 'r', 'o'.  Let's try to find a word that uses these letters.

In [20]:
let letterScores =
    words
    |> Seq.map (fun word ->
        // this penalises words with repeated letters: the score is divided by
        // the product of each distinct letter's count (1 when nothing repeats)
        let multiplier = 
            word 
            |> Seq.groupBy id 
            |> Seq.fold (fun state (_, s) ->  state * Seq.length s) 1
        let freq = 
            word 
            |> Seq.fold (fun state letter -> state + Map.find letter frequenciesMap) 0
        word, freq / multiplier)
    |> Seq.sortByDescending snd
letterScores |> Seq.take 10

index,Item1,Item2
0,arose,17744
1,arise,17460
2,raise,17460
3,serai,17460
4,arles,17325
5,earls,17325
6,lares,17325
7,laser,17325
8,lears,17325
9,rales,17325


Above, I've listed the top 10 words, i.e. words that use the most frequent letters.
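As a check on the arithmetic, the sketch below recomputes the score for "arose" from the five counts in the frequency table, and shows how a repeated letter reduces the score. `sampleCounts` and `scoreWord` are hypothetical names for this illustration, and "erase" is just an example of a word with a repeated letter; I haven't checked whether it is in the dictionary file.

```fsharp
// Counts for the five letters involved, copied from the frequency table.
let sampleCounts =
    Map.ofList [ ('s', 4331); ('e', 4303); ('a', 3665); ('r', 2733); ('o', 2712) ]

// Same rule as letterScores: sum the letter counts, then divide by the
// product of how many times each distinct letter occurs in the word.
let scoreWord (counts: Map<char, int>) (word: string) =
    let multiplier =
        word
        |> Seq.countBy id
        |> Seq.fold (fun state (_, n) -> state * n) 1
    let freq = word |> Seq.sumBy (fun letter -> Map.find letter counts)
    freq / multiplier

let aroseScore = scoreWord sampleCounts "arose"  // 3665 + 2733 + 2712 + 4331 + 4303
let eraseScore = scoreWord sampleCounts "erase"  // 'e' occurs twice, so the sum is halved
```

`aroseScore` comes out at 17744, matching the table, while `eraseScore` is 19335 / 2 = 9667 under integer division, which is why words with repeated letters sink down the list.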

So in this case, I've chosen "AROSE" as the first word that I will use in Wordle.
Now there is a choice. If you play in 'hard mode', which means that hints revealed on one turn must be reused on the next, then there is little point in finding a generic second word.

However, if you are not playing in 'hard mode', then it may well be worthwhile finding a second word.  My thinking here is that I should choose the word that scores highly but doesn't use any of the same letters as AROSE.

In [21]:
let aroseSet = "arose" |> Set.ofSeq

letterScores
|> Seq.find (fun (word, _) ->
    let wSet = 
        word 
        |> Set.ofSeq
    aroseSet 
    |> Seq.fold (fun state l -> state || Set.contains l wSet) false |> not)

Item1,Item2
unlit,10300


Therefore my next word will be "UNLIT".
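The "shares no letters" test can also be written with `Set.intersect`, which some may find easier to read than the fold. This is only an alternative sketch: `sampleScores` is a hypothetical three-entry list using scores quoted in this post, and `sharesNoLetters` is a name I've made up.

```fsharp
// Hypothetical (word, score) pairs, already sorted by descending score.
let sampleScores = [ ("arise", 17460); ("laser", 17325); ("unlit", 10300) ]

// True when the two words have no letter in common.
let sharesNoLetters (first: string) (second: string) =
    Set.intersect (Set.ofSeq first) (Set.ofSeq second) |> Set.isEmpty

// First (highest-scoring) word that has no overlap with "arose", if any.
let secondWord =
    sampleScores
    |> List.tryFind (fun (word, _) -> sharesNoLetters "arose" word)
```

Using `List.tryFind` rather than `find` means an empty result shows up as `None` instead of an exception if no such word exists.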
Using "AROSE" and "UNLIT" in 'easy mode' did well for me.  After about 100 games, I have a mode of 3 guesses (see stats page), although I didn't have any luck solving in fewer than 3.

But the WhatsApp group that I play Wordle with decided to move to 'hard mode', so I could no longer use "UNLIT" as my second word, as I have to choose a second word that contains the letters from "AROSE" in the right positions, as given by the scoring algorithm.
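To illustrate the constraint, here is a rough sketch of a hard mode check, assuming the rule is "green letters must stay in place and yellow letters must be reused somewhere". The function name `allowedInHardMode` and the string encoding of the hints ('G' for green, 'Y' for yellow, '-' for grey) are assumptions made for this example, and the sketch ignores how many times a repeated letter must appear.

```fsharp
// Hypothetical encoding for this sketch: 'G' = green, 'Y' = yellow, '-' = grey.
let allowedInHardMode (previous: string) (hints: string) (next: string) =
    Seq.zip previous hints
    |> Seq.indexed
    |> Seq.forall (fun (i, (letter, hint)) ->
        match hint with
        | 'G' -> next.[i] = letter          // green letters must stay in place
        | 'Y' -> Seq.contains letter next   // yellow letters must be reused somewhere
        | _ -> true)                        // grey letters impose no rule in this sketch

// "arose" scored as yellow, yellow, yellow, grey, grey:
let unlitAllowed = allowedInHardMode "arose" "YYY--" "unlit"  // no 'a', 'r' or 'o'
let radioAllowed = allowedInHardMode "arose" "YYY--" "radio"  // reuses 'a', 'r' and 'o'
```

With those hints "unlit" is rejected, because it reuses none of the yellow letters, whereas a word like "radio" would be accepted.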

### Simulation

The next step in our investigation is to choose a strategy and run a simulation over all 2305 Wordle answer words, to see how that strategy fares each day.

For those not familiar with Wordle, it has a fixed list of words (2305 when I tried earlier in the year) that will be the daily secret word.  Each day a specific word is chosen from the list and everyone in the world tries to guess the same word.

What we can do is take the word list from Wordle (you can scrape it from the JavaScript source, or just search the internet for one) and run our simulation to see whether we are able to solve the Wordle each day, taking note of the average score across the whole set of 2305.
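As a rough idea of the shape such a simulation could take, here is a minimal sketch, not the strategy used in this post. It assumes a deliberately naive strategy (always play the first remaining candidate, then keep only the words that would have produced the same feedback) and takes the scoring function as a parameter. The names `guessesNeeded`, `averageGuesses`, `positionsOnly` and `toyAnswers` are all hypothetical, and `positionsOnly` is a stand-in scorer that only reports exact-position matches rather than Wordle's full green/yellow/grey rule.

```fsharp
// Number of turns the naive strategy needs to find `secret`, or None if the
// candidate list runs out. `score secret guess` can return any comparable value.
let guessesNeeded (score: string -> string -> 'Mask) (candidates: string list) (secret: string) =
    let rec play remaining turn =
        match remaining with
        | [] -> None
        | guess :: _ when guess = secret -> Some turn
        | guess :: rest ->
            let seen = score secret guess
            // keep only the words that would have given the same feedback
            let consistent = rest |> List.filter (fun word -> score word guess = seen)
            play consistent (turn + 1)
    play candidates 1

// Average number of turns when every word in the list takes a turn as the secret.
let averageGuesses score (answers: string list) =
    answers
    |> List.choose (guessesNeeded score answers)
    |> List.averageBy float

// Stand-in scorer (an assumption for this sketch): true where the letters match exactly.
let positionsOnly (secret: string) (guess: string) =
    Seq.map2 (=) secret guess |> Seq.toList

// Hypothetical four-word answer list.
let toyAnswers = [ "arose"; "prose"; "unlit"; "favor" ]
let toyAverage = averageGuesses positionsOnly toyAnswers
```

On this toy list the four secrets take 1, 2, 2 and 3 turns respectively, so `toyAverage` is 2.0. Passing the scorer in as a function means a fuller scoring rule can be swapped in without touching the simulation loop.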

The first thing we need is a scorer function, i.e. a function that, given a guess and the secret word, will score each letter grey, green or yellow.

In [24]:
type answerMask = Green | Yellow | Grey

module Counter =
    let createCounter items =
        items
        |> List.filter (fun (a, g) -> a <> g)
        |> List.map fst
        |> List.countBy id
        |> Map.ofList

    let countOf counter item =
        match Map.tryFind item counter with
        | Some c -> c
        | None -> 0

    let updateCount counter item =
        match Map.tryFind item counter with
        | Some c -> Map.add item (c - 1) counter
        | None -> counter

let scoreGuess actual guess =

    // materialise the pairs as a list, since createCounter and List.fold expect lists
    let letters = Seq.zip actual guess |> Seq.toList
    
    // count holds the secret word's letters not yet matched; mask is built in reverse
    let folder ((count, mask): Map<'a,int> * answerMask list) (a, g) =
        if a = g then
            count, Green :: mask
        elif Seq.contains g actual && Counter.countOf count g > 0 then
            Counter.updateCount count g, Yellow :: mask
        else
            count, Grey :: mask

    List.fold folder (Counter.createCounter letters, []) letters 
    |> snd 
    |> List.rev

Now, scoring the guess 'arose' against the secret word 'favor', we see that the answer mask returned is Yellow, Yellow, Yellow, Grey, Grey.

In [28]:
scoreGuess "favor" "arose"

index,value
0,Yellow
1,Yellow
2,Yellow
3,Grey
4,Grey
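Repeated letters are where the counter earns its keep, so here is a compact, standalone sketch of the same counting idea that returns the mask as a string ('G' green, 'Y' yellow, '-' grey). The name `scoreAsString` and the string encoding are assumptions for this example, chosen so the block runs on its own without clashing with the `answerMask` type.

```fsharp
let scoreAsString (actual: string) (guess: string) =
    let pairs = Seq.zip actual guess |> Seq.toList
    // Letters of the secret word that were not matched exactly, with counts.
    let unmatched =
        pairs
        |> List.filter (fun (a, g) -> a <> g)
        |> List.countBy fst
        |> Map.ofList
    let step (remaining: Map<char, int>, marks: char list) (a: char, g: char) =
        let left = remaining |> Map.tryFind g |> Option.defaultValue 0
        if a = g then remaining, 'G' :: marks
        elif left > 0 then Map.add g (left - 1) remaining, 'Y' :: marks
        else remaining, '-' :: marks
    pairs
    |> List.fold step (unmatched, [])
    |> snd
    |> List.rev
    |> List.map string
    |> String.concat ""

let plain = scoreAsString "favor" "arose"     // no repeated letters
let extraRs = scoreAsString "favor" "rarer"   // only the last 'r' matches
let extraBs = scoreAsString "abbey" "bobby"   // one spare 'b' to hand out
```

Here `plain` is "YYY--", agreeing with the mask above. `extraRs` is "-G--G": the only 'r' in "favor" is already matched in the last position, so the other two are grey. `extraBs` is "Y-G-G": the first 'b' uses up the one unmatched 'b' in "abbey", leaving the fourth letter grey.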
