# Information Theory and the Game Wordle (Part 2 of 2)

## Goal of this lecture (Part 2 of 2):

Now that we know how the game Wordle works and have a basic understanding of Information Theory, we will apply these ideas to find the optimal first guess for Wordle!

## Setting up our infrastructure

I've uploaded a file called Wordle.py that includes all of the possible guesses that Wordle will accept. This list of words is identical to the one that we saw in Part 1.

Now, we are going to need to write some code to help make this process easier. Something that we'll need is a way to represent the results given to us for each guess. Since each of the 5 boxes can be gray, yellow, or green, there are a total of $3^5 = 243$ different results that each word can give us.

In [8]:
import enum
import itertools
import Wordle
import math

# The Results class is a special type of class called an Enum.
# I am using this just to make later code easier to read!
# Basically, it just assigns each of the labels gray, yellow, and green a unique numerical value.
class Results(enum.Enum):
    gray = 0
    yellow = 1
    green = 2

# words and answers are disjoint, so combining them gives us all possible inputs
possibleInputs = Wordle.words + Wordle.answers

# outcomes is a tuple of all tuples of length 5 where the values are Results.gray, Results.yellow, or Results.green
# it basically just holds every possible outcome
outcomes = tuple(itertools.product(Results, repeat=5)) 

print("Num outcomes: " + str(len(outcomes))) # This tells us that we have all 243 possible outcomes
print("Num possibleInputs: " + str(len(possibleInputs)))

Num outcomes: 243
Num possibleInputs: 12972


## Comparing words!
Now we want a function that tells us the outcome of the guess compared to the answer.

In [9]:
def getOutcome(guess, answer):
    outcome = [Results.gray, Results.gray, Results.gray, Results.gray, Results.gray]
    
    for i in range(0, 5):
        # If the letter at the same index is identical, then we set the result at that index to be green
        if guess[i] == answer[i]:
            outcome[i] = Results.green
            
        # Otherwise, if the letter is in the word, we will set the result to yellow    
        elif guess[i] in answer:
            outcome[i] = Results.yellow
    
    return tuple(outcome)

print(getOutcome("dried", "dodge"))

(<Results.green: 2>, <Results.gray: 0>, <Results.gray: 0>, <Results.yellow: 1>, <Results.yellow: 1>)


Here, we are comparing the guess "dried" to the answer "dodge". Since the first letter is correct, the first position of the tuple is Results.green. Since the next 2 letters do not appear in the answer at all, the next 2 tuple entries are Results.gray. Then, since the last 2 letters exist in the answer but aren't in the right spots, we get Results.yellow.

Next, we'll also create a function that returns the number of times each possible outcome occurred so that we can calculate probabilities later.
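
One caveat: getOutcome marks a letter yellow whenever it appears anywhere in the answer, so it can report extra yellows when the guess repeats a letter. The real game is generally understood to account for how many copies of each letter the answer contains. The sketch below is one possible way to handle that. The name getOutcomeWithDuplicates is hypothetical, it is not used anywhere else in this notebook, and the Results enum is repeated only so the cell runs on its own.

```python
import enum
from collections import Counter

class Results(enum.Enum):
    gray = 0
    yellow = 1
    green = 2

# Hypothetical duplicate-aware variant of getOutcome (an assumption, not the notebook's method)
def getOutcomeWithDuplicates(guess, answer):
    outcome = [Results.gray] * 5

    # Count the answer letters that are NOT matched by a green
    unmatched = Counter()
    for i in range(5):
        if guess[i] != answer[i]:
            unmatched[answer[i]] += 1

    for i in range(5):
        if guess[i] == answer[i]:
            outcome[i] = Results.green
        # Only give a yellow while an unmatched copy of the letter is still available
        elif unmatched[guess[i]] > 0:
            outcome[i] = Results.yellow
            unmatched[guess[i]] -= 1

    return tuple(outcome)

# Same result as getOutcome for this pair
print(getOutcomeWithDuplicates("dried", "dodge"))
# "crate" has only one "e", and the green at the end uses it up,
# so the first two "e"s in "eerie" stay gray instead of turning yellow
print(getOutcomeWithDuplicates("eerie", "crate"))
```

Switching to a rule like this would change the outcome counts, so the entropy values in the rest of the notebook would likely shift a little.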

In [10]:
def getOutcomeDict(guess, wordList):
    # A dictionary that contains the number of times each outcome appears
    outcomeDict = {}
    for outcome in outcomes:
        outcomeDict[outcome] = 0
        
    for answer in wordList:
        outcomeDict[getOutcome(guess, answer)] += 1
        
    return outcomeDict

## Implementing Entropy!
Now, we want to create a function that returns the entropy (the expected information) that a guess gives us for a given list of words. To do so, we will define a function that takes in a word and a word list. We compare that word to every word in the list and record the outcome each time. Then we can calculate the probability of each outcome from these counts.

Notice that, instead of just always using every Wordle word, I allow you to pass in a word list so that you can remove words from the list and get a different answer. This will allow us to find the changes in entropy after subsequent guesses.

In [11]:
def entropy(guess, wordList):
    # A dictionary that contains the number of times each outcome appears
    outcomeDict = getOutcomeDict(guess, wordList)
    totalOutcomes = len(wordList)
    
    expected = 0
    for outcome in outcomes:
        p = outcomeDict[outcome]/totalOutcomes #calculating probability
        if p != 0:
            expected += p * math.log(1/p, 2) #Entropy equation
            
    return expected

print(entropy("caulk", possibleInputs))

4.490289479097055


The value our entropy function gives us for a word tells us how many times we can expect that word to cut our list of possible words in half! Remember, an entropy value of 4 means that, on average, we cut our possibilities in half 4 times!

Now that we have a function that returns the entropy of each word, we can find the word with the highest entropy value by iterating through every single word and getting its entropy!
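
To make the "cutting in half" idea concrete, here is a small sketch using made-up outcome counts instead of real Wordle words. The helper entropyFromCounts is a hypothetical name. It applies the same entropy equation as the entropy function above, but takes the outcome counts directly.

```python
import math

# Hypothetical helper: counts[i] is how many candidate words produce outcome i
def entropyFromCounts(counts):
    total = sum(counts)
    expected = 0.0
    for count in counts:
        if count > 0:
            p = count / total
            expected += p * math.log2(1 / p)  # Entropy equation
    return expected

# 16 words that each give a different outcome: the guess always leaves 1 word,
# which is 16 cut in half 4 times
evenSplit = entropyFromCounts([1] * 16)

# 16 words where 13 give the same outcome: most of the time we learn very little
unevenSplit = entropyFromCounts([13, 1, 1, 1])

print(evenSplit)
print(unevenSplit)
```

The even split comes out to 4 bits, while the uneven split comes out to just under 1 bit, even though both guesses are being tested against 16 words.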

In [12]:
def getHighestEntropyIndex(wordList):
    highestEntropyIndex = 0
    highestEntropy = 0
    for i in range(len(wordList)):
        curEntropy = entropy(wordList[i], wordList)
        if curEntropy > highestEntropy:
            highestEntropy = curEntropy
            highestEntropyIndex = i
    return highestEntropyIndex
        
highestEntropyIndex = getHighestEntropyIndex(Wordle.answers)

The previous code compares every word in the list against every other word, so it might take a while to finish. However, once it does, we can get some information from it!

In [13]:
print("The word with the highest entropy is at index " + str(highestEntropyIndex))
print("The word with the highest entropy is " + Wordle.answers[highestEntropyIndex])
print("The word's entropy is " + str(entropy(Wordle.answers[highestEntropyIndex], possibleInputs)))

The word with the highest entropy is at index 1668
The word with the highest entropy is raise
The word's entropy is 5.9197368425161985


The result of this code tells us that the word with the highest entropy is "raise", with an entropy value of about 5.92 bits, or almost 6. This means that, on average, guessing "raise" will cut the possible remaining words in half almost 6 times! So, if we have 12,972 possible words at the beginning, we can expect only roughly 200 possible words to be left after the first guess ($12972 / 2^{5.92} \approx 214$)!

Now, if you have watched the video by 3Blue1Brown, you might have noticed that my word is different from his. That is because of a few reasons:
1. He added code to consider other factors such as word frequency
2. He actually looks deeper into the search to see how he could optimize later guesses, not just the first guess
3. He considered all possible guesses, while I only considered all possible answers

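If you were paying close attention, you might have noticed that I passed "Wordle.answers" to the search this time instead of the usual "possibleInputs". That's because "Wordle.answers" is the subset of all words that could possibly be answers. Searching this shorter list makes the code run faster, and since the real answer always comes from it, it actually gives you a more accurate answer! The whole list of words is also too long for this code to run on CoCalc. Note that the entropy printed above for "raise" is still measured against "possibleInputs", which is why the 12,972 figure is used.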
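
The third difference above, letting the guess come from a bigger list than the answers, can be sketched by keeping two separate lists: one of words we are allowed to guess and one of words that could be the answer. Everything in this cell is a toy illustration. The word lists are made up, getPattern, entropyAgainst, and getBestGuess are hypothetical names, and patterns are written as strings ("g" green, "y" yellow, "-" gray) rather than Results tuples to keep the cell short.

```python
import math
from collections import Counter

def getPattern(guess, answer):
    # Same rule as getOutcome, written with letters instead of the Results enum
    pattern = ""
    for i in range(5):
        if guess[i] == answer[i]:
            pattern += "g"
        elif guess[i] in answer:
            pattern += "y"
        else:
            pattern += "-"
    return pattern

def entropyAgainst(guess, answerList):
    # Count how often each pattern occurs, then apply the entropy equation
    counts = Counter(getPattern(guess, answer) for answer in answerList)
    total = len(answerList)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

def getBestGuess(guessList, answerList):
    # Guesses come from guessList, but are always scored against answerList
    return max(guessList, key=lambda guess: entropyAgainst(guess, answerList))

toyAnswers = ["crate", "grate", "irate", "slate"]
toyGuesses = toyAnswers + ["girls"]  # "girls" is allowed as a guess but is not an answer

print(getBestGuess(toyAnswers, toyAnswers))
print(getBestGuess(toyGuesses, toyAnswers))
```

In this toy example, no answer word separates all four answers, but the extra guess "girls" gives a different pattern for each one, so it wins once it is allowed as a guess. Running the same idea on the real lists would take much longer, since every accepted guess has to be scored.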

So, now that we can find the word with the highest entropy, we need a function that will return us back the list of remaining words!

In [15]:
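def getRemainingWordsListWithOutcome(guess, wordList, outcome):
    # Note: only green and yellow results are used to filter; gray results are ignored
    newWordList = []
    for word in wordList:
        add = True
        for i in range(0, 5):
            # A green letter must be in the same position in the word
            if outcome[i] == Results.green and word[i] != guess[i]:
                add = False
                break  # no need to check the remaining letters
            # A yellow letter must appear somewhere in the word
            elif outcome[i] == Results.yellow and guess[i] not in word:
                add = False
                break
        if add:
            newWordList.append(word)
    return newWordList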

print(getRemainingWordsListWithOutcome("raise", possibleInputs, (Results.green, Results.green, Results.green, Results.gray, Results.green)))

['raile', 'raine', 'raise']
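
Notice that "raise" itself is still in the list, even though guessing "raise" when the answer is "raise" would have given all green. That happens because the function above only checks green and yellow results. A stricter filter, sketched below under the hypothetical name getConsistentWords, keeps a word only if it would have produced exactly the outcome we saw. The enum and getOutcome are repeated so the cell runs on its own, and the short word list is made up for the example.

```python
import enum

class Results(enum.Enum):
    gray = 0
    yellow = 1
    green = 2

def getOutcome(guess, answer):
    # Same rule as the getOutcome function defined earlier
    outcome = [Results.gray] * 5
    for i in range(5):
        if guess[i] == answer[i]:
            outcome[i] = Results.green
        elif guess[i] in answer:
            outcome[i] = Results.yellow
    return tuple(outcome)

# Hypothetical stricter filter: keep words that would give exactly this outcome
def getConsistentWords(guess, wordList, outcome):
    return [word for word in wordList if getOutcome(guess, word) == outcome]

observed = (Results.green, Results.green, Results.green, Results.gray, Results.green)
remaining = getConsistentWords("raise", ["raile", "raine", "raise", "rinse"], observed)
print(remaining)  # ['raile', 'raine']
```

This version uses the gray result too: the gray "s" rules out "raise", which the looser filter keeps.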


Now that we can find the word with the highest Entropy and we can get the remaining words left after a guess, we can actually write a function that solves the game for us! I won't actually do that because it requires me to add user input in CoCalc, and I am not sure how to do that. But the code would work something like this:
1. Get the first optimal word (which we know is "raise")
2. Input the word into Wordle and get the outcome
3. Get the remaining words list with the outcome
4. Then get the next optimal word from the remaining list
5. Go back to step 2, and repeat until the outcome is all green
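
The steps above can be sketched without any user input by simulating the game: we pick a hidden answer ourselves and let the code compute the outcome that Wordle would show. This is only a rough sketch under that assumption. The function solve, its helpers, and the tiny word list are all hypothetical, and patterns are written as strings ("g" green, "y" yellow, "-" gray) to keep the cell short.

```python
import math
from collections import Counter

def getPattern(guess, answer):
    # Same rule as getOutcome, written with letters instead of the Results enum
    pattern = ""
    for i in range(5):
        if guess[i] == answer[i]:
            pattern += "g"
        elif guess[i] in answer:
            pattern += "y"
        else:
            pattern += "-"
    return pattern

def entropyAgainst(guess, wordList):
    counts = Counter(getPattern(guess, word) for word in wordList)
    total = len(wordList)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

def solve(hiddenAnswer, wordList, firstGuess, maxGuesses=6):
    # Assumes hiddenAnswer is one of the words in wordList
    remaining = list(wordList)
    guess = firstGuess                                # step 1
    history = []
    for _ in range(maxGuesses):
        pattern = getPattern(guess, hiddenAnswer)     # step 2 (simulated)
        history.append((guess, pattern))
        if pattern == "ggggg":
            break
        # step 3: keep only the words that would have given this pattern
        remaining = [w for w in remaining if getPattern(guess, w) == pattern]
        # step 4: pick the highest-entropy word from what is left
        guess = max(remaining, key=lambda w: entropyAgainst(w, remaining))
    return history                                    # step 5 is the loop itself

toyWords = ["crate", "grate", "irate", "slate", "raise"]
history = solve("irate", toyWords, "raise")
print(history)  # [('raise', 'yyy-g'), ('irate', 'ggggg')]
```

To play against the real game, the simulated line in step 2 would be replaced by whatever way you have of entering the colors that Wordle shows you.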

## Conclusion
With a little bit of math and coding, we have written a program that can find us the word with the highest entropy so that, on average, it reduces the number of possible words as much as possible. However, I would like to point out that this program can be further improved to give even better guesses. First, we can give higher weights to words that are used more frequently and to words that have more common letters. We can also make the algorithm look deeper to see if different words give better results in later guesses.

The program is also quite slow. The YouTuber 3Blue1Brown made a video where he showed that an optimization he tried created errors in the results. So I wanted to avoid those issues by making the code do all the work without trying to skip things. I'm sure that it would run much better on my local machine though. I'm not sure what limitations CoCalc has.

However, in the end, it works! And it probably would work a lot better if it were running on my local machine than on CoCalc. But I definitely enjoyed making this and I am glad that it works. Now I am going to start all my games with the word "raise".

In [16]:
print("Goodbye, world!")

Goodbye, world!
