## Acquiring the list of all possible words

The list of all possible words was downloaded from the source code of the [Wordle](https://www.powerlanguage.co.uk/wordle/) webpage on Jan. 10th, 2022. Excerpt from `wordle_solver.py`:
```python
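import json

def load_possible_words(self, fname='data/main.814b2d17.js', start='Ta=', stop=',Ca='):
    # The word list is a JSON array located between the `start` and `stop`
    # markers on the first line of the downloaded JavaScript file.
    with open(fname, 'r') as file:
        line = file.readline()
        line = line.split(start)[1].split(stop)[0]
        self.words = json.loads(line)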
```

## Ranking words

Our approach ranks words by their potential to reduce the number of possible answers at the _next_ guess. The score of word $w$ after the first $i$ guesses, denoted $S(w|W_i)$, is the average number of answers that remain possible after guessing $w$, assuming that every word is equally likely to be the answer (lower is better):
$$
S(w|W_i) = \frac{1}{|W_i|} \sum_{t \in W_i} \sum_{n \in W_i} I(n|w,t)
$$
where
- $W_i$: set of words compatible with the knowledge gained after the first $i$ guesses ($W_0$ is the set of _all_ words).
- $I(n|w,t)$: indicator function returning 1 if word $n$ is still a possible answer given the knowledge acquired after trying word $w$ when word $t$ is the answer, and 0 otherwise. In other words, if word $t$ is the answer, then $\sum_{n \in W_i} I(n|w,t)$ is the number of words compatible with the knowledge gained after guessing word $w$.
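
As a rough illustration of these definitions, the sketch below computes a compatible set and the unnormalized score (the inner double sum). The function names `feedback`, `filter_compatible` and `sum_score` are hypothetical and are not taken from `wordle_solver.py`. The sketch assumes that $I(n|w,t) = 1$ exactly when $n$ and $t$ would produce the same color pattern for guess $w$, and that repeated letters are colored following the usual Wordle rules.

```python
from collections import Counter


def feedback(guess, answer):
    """Color pattern for `guess` if `answer` is the solution: g/y/b per letter."""
    result = ['b'] * len(guess)
    unmatched = Counter()  # answer letters that were not matched exactly
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = 'g'
        else:
            unmatched[a] += 1
    for i, g in enumerate(guess):
        if result[i] != 'g' and unmatched[g] > 0:
            result[i] = 'y'
            unmatched[g] -= 1
    return ''.join(result)


def filter_compatible(words, guess, result):
    """Words that would have produced `result` for `guess` (the next W_i)."""
    return [w for w in words if feedback(guess, w) == result]


def sum_score(guess, candidates):
    """Sum over answers t of the number of candidates n with I(n|w,t) = 1."""
    # Candidates sharing a pattern cannot be told apart after the guess, so a
    # group of size c contributes c * c to the double sum.
    groups = Counter(feedback(guess, t) for t in candidates)
    return sum(c * c for c in groups.values())


candidates = filter_compatible(['strap', 'solar', 'sonar', 'sugar'], 'strap', 'gbygb')
scores = {w: sum_score(w, candidates) for w in candidates}
print(candidates)  # ['solar', 'sonar', 'sugar']
print(scores)      # {'solar': 3, 'sonar': 3, 'sugar': 5}
```

Dividing each value by `len(candidates)` gives the average score $S(w|W_i)$.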

**Note**: Our approach assumes that every word is equally likely to be the answer.  A [blogpost](http://estebanmoro.org/post/2022-01-10-wordle/) by Esteban Moro suggests that a nonuniform prior should improve the performance of the approach.
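
One way such a prior could enter the score, given here only as a sketch and not as the method of either author, is to replace the uniform weight $1/|W_i|$ with a probability $p(t|W_i)$ that word $t$ is the answer. The symbols $S_p$ and $p$ are introduced here for illustration only:
$$
S_p(w|W_i) = \sum_{t \in W_i} p(t|W_i) \sum_{n \in W_i} I(n|w,t), \qquad \sum_{t \in W_i} p(t|W_i) = 1
$$
Setting $p(t|W_i) = 1/|W_i|$ recovers the uniform score $S(w|W_i)$.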

**Note**: Our approach is similar to the one used by [Laurent Lessard](https://github.com/LaurentLessard/wordlesolver).

## A priori score for all words

Computing the score of all words without prior knowledge takes quite some time (and stays the same as long as the list of possible words is not modified), so we did it in advance and saved the results into the file `data/a_priori_scores.txt`.
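
A minimal sketch of how such a file could be written is shown below. The helper `save_scores` is hypothetical, and the layout (a header line starting with `#`, then whitespace-separated columns) is only inferred from the loading code in this section. The two `SumScore` values are taken from the table of a priori scores, and the word count of 2315 is inferred from the ratio `SumScore / AvgScore` in that table.

```python
import os
import tempfile


def save_scores(sum_scores, n_words, fname):
    """Write 'Word SumScore AvgScore' rows, lowest (best) score first."""
    with open(fname, 'w') as f:
        f.write('# Word SumScore AvgScore\n')  # header line starts with '#'
        for word, total in sorted(sum_scores.items(), key=lambda kv: kv[1]):
            f.write(f'{word} {total} {total / n_words:.3f}\n')


# Write two rows to a temporary file and read them back as plain text.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'a_priori_scores.txt')
    save_scores({'arise': 147525, 'raise': 141217}, 2315, path)
    with open(path) as f:
        lines = f.read().splitlines()

print(lines)
# ['# Word SumScore AvgScore', 'raise 141217 61.001', 'arise 147525 63.726']
```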

```python
import pandas

with open('data/a_priori_scores.txt', 'r') as f:
    header = f.readline().replace('#', ' ').split()
    df = pandas.read_table(f, names=header, comment='#', delimiter=r"\s+")
    df.set_index('Word', inplace=True)
df.iloc[:9]
```

| Word  | SumScore | AvgScore |
|:------|---------:|---------:|
| raise | 141217 | 61.001 |
| arise | 147525 | 63.726 |
| irate | 147649 | 63.779 |
| arose | 152839 | 66.021 |
| alter | 162031 | 69.992 |
| saner | 162341 | 70.126 |
| later | 162567 | 70.223 |
| snare | 164591 | 71.098 |
| stare | 165047 | 71.295 |


## Example

This example shows how to use the solver class to help you solve Wordle puzzles, using the puzzle from Jan. 16th, 2022.

First, we need to load and initialize a solver object.

```python
from wordle_solver import wordle_solver

ws = wordle_solver()
```

### First guess: using the word with the lowest a priori score

A sound strategy is to start with the word that will, on average, reduce the number of compatible words the most once Wordle reveals its feedback. As seen above, a promising candidate is `raise`.

The feedback provided by Wordle is passed to the solver object via the `result` keyword, using one of the following codes per letter:
- `'g'`: correct letter (green)
- `'y'`: misplaced letter (yellow)
- `'b'`: incorrect letter (gray/dark)
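
If the feedback is copied from Wordle's share text, it can be converted into this code with a small helper. The function `emoji_to_result` below is hypothetical and not part of `wordle_solver.py`. It assumes that the share text uses 🟩 and 🟨 for green and yellow, and ⬜ or ⬛ (depending on the theme) for incorrect letters.

```python
# Map each share-text square to the solver's one-letter code.
EMOJI_TO_CODE = {'🟩': 'g', '🟨': 'y', '⬜': 'b', '⬛': 'b'}


def emoji_to_result(row):
    """Convert one row of share-text squares into a g/y/b string."""
    # Skip surrounding whitespace and any emoji variation selectors.
    squares = [ch for ch in row.strip() if ch != '\ufe0f']
    return ''.join(EMOJI_TO_CODE[ch] for ch in squares)


print(emoji_to_result('🟨🟨⬜🟨⬜'))  # yybyb
print(emoji_to_result('🟩⬛🟨🟩⬛'))  # gbygb
```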

```python
ws.update_information(word='raise', result='yybyb')
ws.get_compatible_word_scores().iloc[:5]
```

| Word  | SumScore | AvgScore |
|:------|---------:|---------:|
| strap | 61 | 2.904762 |
| scrap | 73 | 3.47619 |
| stray | 75 | 3.571429 |
| straw | 79 | 3.761905 |
| scram | 83 | 3.952381 |


### Second guess: using the word with the lowest score based on newly acquired knowledge

The word which will, on average, reduce the number of compatible words the most is `strap`, which we pick as our second guess.

```python
ws.update_information(word='strap', result='gbygb')
ws.get_compatible_word_scores()
```

| Word  | SumScore | AvgScore |
|:------|---------:|---------:|
| solar | 3 | 1.0 |
| sonar | 3 | 1.0 |
| sugar | 5 | 1.666667 |


### Third guess: just got lucky

`solar` and `sonar` tie for the lowest score, so choosing between them is a matter of gut feeling, and we got lucky this time.

```python
ws.update_information(word='solar', result='ggggg')
ws.get_compatible_word_scores()
```

| Word  | SumScore | AvgScore |
|:------|---------:|---------:|
| solar | 1 | 1.0 |


Wordle 211 3/6<br>
🟨🟨⬜🟨⬜<br>
🟩⬜🟨🟩⬜<br>
🟩🟩🟩🟩🟩