# Let's take a stab at Wordle

## What is the best first word?

There are many ways to answer. [@tokbyzeb on TikTok](https://www.tiktok.com/@tokbyzeb/video/7058690677634256134) observed that many heuristics, or measures, can be used to evaluate words, but that the actual measure of a word's effectiveness is how many words it eliminates (maximize) or, equivalently, how many words remain valid guesses afterwards (minimize).

**Let's first try to replicate his result of "roate" minimizing the number of remaining possible words to ≈60.4288.**

To do that, we need to define what it means to minimize valid words after each guess. Wordle shows correct letters in the correct position (green), correct letters in the wrong position (yellow), and letters that are not in the answer (gray).

NOTE: I use colorblind mode (orange, blue, gray), but let's use the standard coloring for discussion.

### n=1

Let's pretend our words are 1 letter long. Then it is trivial: our letter is either correct or not, so there are 2 possibilities:
- Green
- Gray

### n=2

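Now things are tricky. Our guess may have the following evaluations:
- ``Green Green`` - We solved it!
- ``Yellow Yellow`` - Wrong order
- ``Gray Gray`` - Do better
- ``Green Gray`` or ``Gray Green`` - One letter is placed, the other is not in the answer
- ``Yellow Gray`` or ``Gray Yellow`` - One letter is in the answer, but in the other position

The following are not possible, because with only two positions the only other spot is already taken by the green letter:
- ``Green Yellow``
- ``Yellow Green``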

### n=3

The possibilities are:

- ``0 Green 0 Yellow 3 Gray``
- ``0 Green 1 Yellow 2 Gray``
- ``1 Green 0 Yellow 2 Gray``
- ``0 Green 2 Yellow 1 Gray``
- ``1 Green 1 Yellow 1 Gray``
- ``2 Green 0 Yellow 1 Gray``
- ``0 Green 3 Yellow 0 Gray``
- ``1 Green 2 Yellow 0 Gray``
- ``2 Green 1 Yellow 0 Gray`` - NOT POSSIBLE (the one remaining letter has no other position to belong to)
- ``3 Green 0 Yellow 0 Gray``
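
The n=1, n=2, and n=3 cases suggest a pattern: the only count that can never occur is "all but one green, plus one yellow". Here is a minimal sketch that lists the counts for any word length, assuming that rule generalizes. The helper name `feasible_counts` is made up for this illustration.

```python
def feasible_counts(n):
    """List the (green, yellow, gray) counts an n-letter guess can receive."""
    combos = []
    for green in range(n + 1):
        for yellow in range(n - green + 1):
            gray = n - green - yellow
            # If every letter but one is green, the last letter's only free
            # slot is its own position, so it cannot be yellow.
            if green == n - 1 and yellow == 1:
                continue
            combos.append((green, yellow, gray))
    return combos


for n in (1, 2, 3):
    print(n, feasible_counts(n))
```

Note that this counts colors only. Several positional patterns (for example ``Green Gray`` and ``Gray Green``) collapse into the same count.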

### Duplicate letters

A letter will only be green or yellow as many times as it occurs in the answer.

Thanks [/u/Humdrumbee on reddit](https://www.reddit.com/r/wordle/comments/ry49ne/illustration_of_what_happens_when_your_guess_has/)!
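
One way to honor the duplicate-letter rule is to score in two passes: mark greens first, then hand out yellows only while unmatched copies of the letter are left in the answer. This is a sketch of that idea, not necessarily how `wordle_sim` does it. The function name `feedback` and the `G`/`Y`/`-` encoding are my own choices for illustration.

```python
from collections import Counter


def feedback(guess, answer):
    """Score a guess: 'G' green, 'Y' yellow, '-' gray, one mark per letter."""
    marks = ["-"] * len(guess)
    # Answer letters that were not matched exactly are the only ones
    # that can still earn a yellow.
    remaining = Counter(a for g, a in zip(guess, answer) if g != a)

    # Pass 1: greens.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            marks[i] = "G"

    # Pass 2: yellows, left to right, using up the remaining copies.
    for i, g in enumerate(guess):
        if marks[i] == "-" and remaining[g] > 0:
            marks[i] = "Y"
            remaining[g] -= 1

    return "".join(marks)


print(feedback("alloy", "banal"))  # YY--- (only one 'l' in the answer)
print(feedback("annal", "union"))  # -GY--
```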



## How does elimination work?
Every time we guess a word, the number of possible solutions goes down, but how?

- If a letter is green, every word that does not have the same letter in that position is pruned
- If a letter is yellow, every word that does not contain that letter is pruned
- If a letter is yellow, every word that has that letter in the same position is pruned
- If a letter is gray, every word that has the letter, in any position, is pruned (unless the guess repeats that letter and another copy is green or yellow; see "Duplicate letters" above)
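
The four rules translate almost word for word into a filter. The sketch below applies them literally, so it assumes the guess has no repeated letters (the duplicate-letter case needs more care). The name `prune` and the `G`/`Y`/`-` marks are illustrative choices, not part of `wordle_sim`.

```python
def prune(candidates, guess, marks):
    """Keep the candidates that survive the four elimination rules.

    marks is a string with one character per guess letter:
    'G' green, 'Y' yellow, '-' gray.
    Simplification: assumes the guess has no repeated letters.
    """
    kept = []
    for word in candidates:
        ok = True
        for i, (letter, mark) in enumerate(zip(guess, marks)):
            if mark == "G" and word[i] != letter:
                ok = False  # green: letter must be in this position
            elif mark == "Y" and (letter not in word or word[i] == letter):
                ok = False  # yellow: letter must be present, but elsewhere
            elif mark == "-" and letter in word:
                ok = False  # gray: letter must be absent
        if ok:
            kept.append(word)
    return kept


words = ["react", "crate", "trace", "stare", "slate", "crane"]
print(prune(words, "stare", "-YGYY"))  # ['react']
```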

This implies we should test our word against every possible solution; for each one, we can then evaluate the word's performance.
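
The internals of `wordle_sim.score` are not shown here, so the following is only a sketch of one way to do that evaluation. It groups the possible answers by the coloring the guess would receive: if the true answer sits in a group of size s, then s candidates remain, which makes the average number remaining sum(s²) / N. The names `feedback` and `expected_remaining` are assumptions for illustration.

```python
from collections import Counter


def feedback(guess, answer):
    """Two-pass scoring: 'G' green, 'Y' yellow, '-' gray."""
    marks = ["-"] * len(guess)
    remaining = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            marks[i] = "G"
    for i, g in enumerate(guess):
        if marks[i] == "-" and remaining[g] > 0:
            marks[i] = "Y"
            remaining[g] -= 1
    return "".join(marks)


def expected_remaining(guess, answers):
    """Average number of answers still possible after playing guess."""
    # Answers that produce the same coloring cannot be told apart.
    groups = Counter(feedback(guess, answer) for answer in answers)
    return sum(size * size for size in groups.values()) / len(answers)


toy_answers = ["union", "banal", "annal", "alloy"]
print(expected_remaining("banal", toy_answers))  # 1.0: every answer colors differently
print(expected_remaining("merge", toy_answers))  # 4.0: all gray, nothing eliminated
```

Lower is better: a score of 1.0 means the guess always pins down the answer, while a score equal to the list length means it eliminates nothing.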

In [1]:
import wordle_sim
import wordle_lists

#wordle_sim.score("roate", wordle_lists.valid, wordle_lists.answers)

In [2]:
# This is taking a while (12.2 seconds per word). Let's use parallelism.
import multiprocess
import time
import sys
import csv

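# Small sample for quick timing runs (swap it in for wordle_lists.valid below).
test_set = ["union", "banal", "annal", "alloy"]

# Leave one core free; max() guards against a pool of size 0 on single-core machines.
PROCESSES = max(1, multiprocess.cpu_count() - 1)
with multiprocess.Pool(PROCESSES) as pool:
    # Queue one scoring task per valid guess word.
    results = [
        pool.apply_async(wordle_sim.score, args=(word, wordle_lists.valid, wordle_lists.answers))
        for word in wordle_lists.valid
    ]

    # Stop accepting tasks, then block until every word has been scored.
    pool.close()
    pool.join()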




In [None]:
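# newline='' stops the csv module from writing blank rows on Windows.
with open('scores.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    print(len(results))
    for result in results:
        # .get() returns the finished task's value (or re-raises its exception).
        writer.writerow(result.get())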

4
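
Once `scores.csv` exists, finding the best first word is just a sort. The sketch below assumes each row is `word,score` with lower scores being better. That layout is a guess, since the return value of `wordle_sim.score` is not shown. The sample rows are placeholders, not real results.

```python
import csv
import io


def best_words(csv_file, top=3):
    """Return the top (word, score) rows with the lowest score.

    Assumes two columns per row: the word, then a numeric score.
    """
    rows = [(word, float(score)) for word, score in csv.reader(csv_file)]
    return sorted(rows, key=lambda row: row[1])[:top]


# Placeholder data standing in for scores.csv.
sample = io.StringIO("aaaaa,3.5\nbbbbb,1.25\nccccc,2.0\n")
print(best_words(sample, top=2))  # [('bbbbb', 1.25), ('ccccc', 2.0)]
```

With the real file, this would be called as `best_words(open('scores.csv', newline=''))`.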
