# Demonstrating a foolproof Wordle solution -- no luck involved!

This demonstration is a fun timewaster to contribute to the annals of Wordle theory. The takeaway? **There is at least one "foolproof" method of solving any Wordle puzzle within six guesses.** Of course, there are probably many more -- including a bunch that are way more efficient. But I'll save that for another day.

### Import required packages

In [1]:
import numpy as np
import pandas as pd
import toyplot

# Get our data -- possible target words

### Read in the list of target words

In [2]:
# read list as np array
possible_targets = np.array(pd.read_csv('./official_answer_list.csv')['words'])
# put the list in alphabetical order a -> z
possible_targets.sort()

In [3]:
# split each word into an array of individual letters
possible_targets_split = np.array([[i for i in q] for q in possible_targets])
possible_targets_split

array([['a', 'b', 'a', 'c', 'k'],
       ['a', 'b', 'a', 's', 'e'],
       ['a', 'b', 'a', 't', 'e'],
       ...,
       ['z', 'e', 'b', 'r', 'a'],
       ['z', 'e', 's', 't', 'y'],
       ['z', 'o', 'n', 'a', 'l']], dtype='<U1')

In [4]:
# how many possible target words are there?
len(possible_targets_split)

2315

# Define functions for "playing" Wordle:

The "get_hints" function was taking me a while to code, so rather than reinventing the wheel I copied in code by Dominic Fox. Thanks Dominic!  

**See Dominic's blog post on "Playing Wordle with Python": https://poetix.medium.com/playing-wordle-with-python-6750185ac168**

In [5]:
def get_letter_counts(word):
    '''
    ~utility function~
    simply counts the number of times each letter occurs in a word
    '''
    result = dict()
    for c in word:
        result[c] = result.get(c, 0) + 1
    return result


def get_hints(target_word, guess):
    '''
    gets wordle-coded hints for each submission!
    "g": letter is in the word and in the right position
    "y": letter is in the word but in the wrong position
    ".": letter is either not in the word, or is used more times in the guess than in the target.
    '''
    letter_counts = get_letter_counts(target_word)
    green_counts = dict()
    for position, guess_letter in enumerate(guess):
        if target_word[position] == guess_letter:
            green_counts[guess_letter] = green_counts.get(guess_letter, 0) + 1

    available_for_yellow = {letter: count - green_counts.get(letter, 0) for letter, count in letter_counts.items()}

    for position, guess_letter in enumerate(guess):
        if guess_letter in target_word:
            if target_word[position] == guess_letter:
                yield 'g', position, guess_letter
            else:
                if available_for_yellow[guess_letter] > 0:
                    available_for_yellow[guess_letter] -= 1
                    yield 'y', position, guess_letter
                else:
                    yield '.', position, guess_letter
        else:
            yield '.', position, guess_letter

            
def concise_hints(target_word, guess):
    '''
    just simplifying the output of Dominic Fox's "get_hints" function
    '''
    return(np.array([i[0] for i in get_hints(target_word,guess)]))


def reduce_list(remaining_answers_starter, guess, hints):
    '''
    Accepts the array of possible remaining target words, the current guess, and 
        the wordle hints from the current guess.
    Returns a new array of remaining answers that are compatible with the hints from the current guess.
    '''
    guess_split = np.array([i for i in guess])
    remaining_answers = remaining_answers_starter.copy()
    for hintidx in range(5):
        guesslet = guess_split[hintidx]
        guesshint = hints[hintidx]
        if guesshint == 'g':
            remaining_answers = remaining_answers[remaining_answers[:,hintidx] == guesslet]
    for hintidx in range(5):
        guesslet = guess_split[hintidx]
        guesshint = hints[hintidx]
        if guesshint == 'y':
            remaining_answers = remaining_answers[remaining_answers[:,hintidx] != guesslet]
            remaining_answers = remaining_answers[[guesslet in i for i in remaining_answers[:,hints!='g']]]
    for hintidx in range(5):
        guesslet = guess_split[hintidx]
        guesshint = hints[hintidx]
        if guesshint == '.':
            num_more_than = np.sum(guess_split[hints!='.'] == guesslet)
            remaining_answers = remaining_answers[~(np.sum([i == guesslet for i in remaining_answers],axis=1) > num_more_than)]
    return(remaining_answers)
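
To see how the hint rules treat repeated letters, here is a compact, self-contained sketch of the same two-pass idea (greens first, then yellows limited by the letters still unaccounted for). The name `score_guess` is hypothetical and is not part of Dominic's code; it just returns the hints as a string.

```python
from collections import Counter


def score_guess(target_word, guess):
    """Hypothetical helper: return the hints for `guess` as a string of 'g', 'y', '.'."""
    hints = ['.'] * len(guess)
    # Target letters not matched in place; only these can still earn a yellow.
    leftover = Counter(t for t, g in zip(target_word, guess) if t != g)
    # First pass: greens.
    for pos, (t, g) in enumerate(zip(target_word, guess)):
        if t == g:
            hints[pos] = 'g'
    # Second pass: yellows, left to right, while unmatched copies remain.
    for pos, g in enumerate(guess):
        if hints[pos] != 'g' and leftover[g] > 0:
            hints[pos] = 'y'
            leftover[g] -= 1
    return ''.join(hints)


print(score_guess("point", "taper"))  # y.y..
print(score_guess("point", "putty"))  # g.y..
```

In the second call, "putty" contains two t's but "point" has only one, so the first t earns a yellow and the second gets a "." even though t is in the target.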

# Demonstrate "playing" Wordle:

First, set a target word (which we pretend not to know) -- this is what we are trying to guess.

In [6]:
target_word = "point"

Now initialize a list of "remaining possible target words" that we will pare down.

In [7]:
remaining_targets_arr = possible_targets_split.copy()
len(remaining_targets_arr)

2315

### First guess:

In [8]:
# input the guess to get hints from Wordle
guess = "taper"
hint = concise_hints(target_word,guess)
hint

array(['y', '.', 'y', '.', '.'], dtype='<U1')

This output tells us that the first and third letters -- "t" and "p" -- are both in the target word but at different positions, and that the other letters -- "a", "e", and "r" -- are not in the target word.

Now we can use the `reduce_list` function to pare down our array of possible target words to only those that meet these criteria:

In [9]:
remaining_targets_arr = reduce_list(remaining_targets_arr,guess,hint)
remaining_targets_arr

array([['o', 'p', 't', 'i', 'c'],
       ['p', 'h', 'o', 't', 'o'],
       ['p', 'i', 'l', 'o', 't'],
       ['p', 'i', 'n', 't', 'o'],
       ['p', 'i', 't', 'c', 'h'],
       ['p', 'i', 't', 'h', 'y'],
       ['p', 'i', 'v', 'o', 't'],
       ['p', 'o', 'i', 'n', 't'],
       ['p', 'o', 's', 'i', 't'],
       ['p', 'o', 'u', 't', 'y'],
       ['p', 'u', 't', 't', 'y'],
       ['s', 'p', 'i', 'l', 't'],
       ['s', 'p', 'l', 'i', 't'],
       ['s', 'p', 'o', 'u', 't'],
       ['s', 't', 'o', 'm', 'p'],
       ['s', 't', 'o', 'o', 'p'],
       ['s', 't', 'u', 'm', 'p']], dtype='<U1')

So after our first guess, we are left with only 17 possible words!

### Second guess:

In [10]:
guess = "pitch"
hint = concise_hints(target_word,guess)
hint

array(['g', 'y', 'y', '.', '.'], dtype='<U1')

In [11]:
remaining_targets_arr = reduce_list(remaining_targets_arr,guess,hint)
remaining_targets_arr

array([['p', 'o', 'i', 'n', 't'],
       ['p', 'o', 's', 'i', 't']], dtype='<U1')

### Third guess:

In [12]:
guess = "posit"
hint = concise_hints(target_word,guess)
hint

array(['g', 'g', '.', 'y', 'g'], dtype='<U1')

In [13]:
remaining_targets_arr = reduce_list(remaining_targets_arr,guess,hint)
remaining_targets_arr

array([['p', 'o', 'i', 'n', 't']], dtype='<U1')

### Once the `remaining_targets_arr` is down to just a single element, we have won the game!

# Demonstrate the foolproof method:

For any given puzzle, we will guess: 'sprug', 'blawn', 'mythi', and 'coved'. Then, if there is more than one word left in the array of remaining possible targets -- which is **already alphabetically sorted** -- then we will simply guess the first word.
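
One likely reason these four starter words work well together is that they share no letters, so between them they test 20 different letters of the alphabet. A quick check, with illustrative variable names (this is only an observation; the exhaustive test over all target words is what actually shows the method works):

```python
import string

starters = ["sprug", "blawn", "mythi", "coved"]

# Every distinct letter that appears in at least one starter word.
covered = set("".join(starters))

# Letters of the alphabet that none of the starters test.
missing = sorted(set(string.ascii_lowercase) - covered)

print(len(covered))  # 20
print(missing)       # ['f', 'j', 'k', 'q', 'x', 'z']
```

Only f, j, k, q, x, and z are never tested by the four starters.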

Let's try it on the target word from January 9, 2022: **GORGE**

### First, guess all four starter words:

In [14]:
# initialize with the target word and full list of possible solutions
target_word = "gorge"
remaining_targets_arr = possible_targets_split.copy()
counter = 5 # number of remaining guesses after first
for guess in ['sprug','blawn','mythi','coved']:
    hint = concise_hints(target_word,guess)
    remaining_targets_arr = reduce_list(remaining_targets_arr,guess,hint)
    print("Number of possible solutions remaining: " + str(len(remaining_targets_arr)))
    print("    Guesses left: " + str(counter))
    counter -= 1

Number of possible solutions remaining: 13
    Guesses left: 5
Number of possible solutions remaining: 8
    Guesses left: 4
Number of possible solutions remaining: 4
    Guesses left: 3
Number of possible solutions remaining: 2
    Guesses left: 2


In [15]:
remaining_targets_arr

array([['f', 'o', 'r', 'g', 'e'],
       ['g', 'o', 'r', 'g', 'e']], dtype='<U1')

### Now, guess the first word in the remaining, alphabetically-ordered array of possible solutions:

In [16]:
# our fifth guess is now the first word in the remaining targets
guess = remaining_targets_arr[0]
hint = concise_hints(target_word,guess)
remaining_targets_arr = reduce_list(remaining_targets_arr,guess,hint)
print("Number of possible solutions remaining: " + str(len(remaining_targets_arr)))
print("    Guesses left: " + str(counter))
counter -= 1

Number of possible solutions remaining: 1
    Guesses left: 1


Even though our fifth guess was still wrong in this case, it brought our list of remaining possible solutions down to just one word -- gorge -- which we can guess on our sixth try!
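
The whole procedure can be wrapped into one function. The sketch below is a simplified stand-in, not the notebook's code: `score_guess` and `solve` are hypothetical names, the pool is a tiny hand-picked sample rather than the full answer list, and candidates are filtered by keeping only the words that would have produced the same hints (a simpler technique than `reduce_list`). It assumes the target is in the pool.

```python
from collections import Counter

STARTERS = ["sprug", "blawn", "mythi", "coved"]


def score_guess(target_word, guess):
    """Hypothetical helper: hints for `guess` as a string of 'g', 'y', '.'."""
    # Target letters not matched in place; only these can earn a yellow.
    leftover = Counter(t for t, g in zip(target_word, guess) if t != g)
    hints = []
    for t, g in zip(target_word, guess):
        if t == g:
            hints.append('g')
        elif leftover[g] > 0:
            hints.append('y')
            leftover[g] -= 1
        else:
            hints.append('.')
    return ''.join(hints)


def solve(target_word, pool, starters=STARTERS, max_guesses=6):
    """Play the fixed starters, then always guess the first remaining word.

    Returns the turn on which the target was guessed, or None on failure.
    """
    remaining = sorted(pool)
    for turn in range(1, max_guesses + 1):
        guess = starters[turn - 1] if turn <= len(starters) else remaining[0]
        if guess == target_word:
            return turn
        hint = score_guess(target_word, guess)
        # Keep only candidates that would have produced the same hints.
        remaining = [w for w in remaining if score_guess(w, guess) == hint]
    return None


sample_pool = ["forge", "gorge", "point", "posit", "zebra"]  # toy sample only
print(solve("gorge", sample_pool))  # 6
print(solve("forge", sample_pool))  # 5
```

"forge" is solved a turn earlier than "gorge" simply because it comes first alphabetically, so it is the fifth guess.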

# Show that the method works on all 2315 possible target words:

We can take what we just did and loop it across all target words, recording the number of tries it takes to reduce the list of possible solutions down to just one.

In [17]:
num_remaining_arr = np.zeros((len(possible_targets),5),dtype=int)

# loop through all possible target words
for idx, target_word in enumerate(possible_targets):
    # initialized with the target word and full list of possible solutions
    remaining_targets_arr = possible_targets_split.copy()
    
    # initialize list to track the number of remaining solution options after each guess
    current_num_remaining = []
    
    # loop through the starter words
    for guess in ['sprug','blawn','mythi','coved']:
        # get the wordle hint for the guess
        hint = concise_hints(target_word,guess)
        
        # reduce the pool of possible solutions
        remaining_targets_arr = reduce_list(remaining_targets_arr,guess,hint)
        
        # record the number of remaining possible solutions
        current_num_remaining.append(len(remaining_targets_arr))

    # our fifth guess is now the first word in the remaining targets
    guess = remaining_targets_arr[0]
    
    # get the wordle hint for the guess
    hint = concise_hints(target_word,guess)
    
    # reduce the pool of possible solutions
    remaining_targets_arr = reduce_list(remaining_targets_arr,guess,hint)
    
    # record the number of remaining possible solutions
    current_num_remaining.append(len(remaining_targets_arr))
    
    # update the 2315-row array with the number of solutions left per-guess for this target word
    num_remaining_arr[idx] = current_num_remaining

In [18]:
num_remaining_arr[:10]

array([[569,   2,   1,   1,   1],
       [100,   1,   1,   1,   1],
       [569,   2,   1,   1,   1],
       [569,  14,   2,   1,   1],
       [569,  14,   2,   2,   1],
       [351,  11,   2,   1,   1],
       [569,  14,   1,   1,   1],
       [569,   7,   5,   1,   1],
       [569,  14,   4,   2,   1],
       [351,  11,   1,   1,   1]])

We are left with a 2315 x 5 array. Each row starts at the number of remaining possible solutions after the first guess. The numbers decrease across the columns as possible solutions are eliminated with each guess.  

The fifth column tells us the number of remaining possible solutions after the fifth guess. **In order to say we have "solved" this Wordle puzzle in six guesses, this fifth column should equal one -- implying that there is only one possible choice for the sixth guess.**

Let's see if that is the case:

In [19]:
# grab the last column of the array
last_column = num_remaining_arr[:, 4]

In [20]:
# ask if values are greater than 1 (if True, then we failed to solve this word)
failed_to_solve = last_column > 1

In [21]:
# count how many we failed to solve:
np.sum(failed_to_solve)

0

# There it is! This method gives the correct solution for every possible Wordle target word.

# While this method definitely ensures success, the distribution of the number of required guesses is pretty lousy... Tradeoffs, I guess?

In [22]:
num_guesses_required = (7 - np.sum(num_remaining_arr == 1,axis=1))
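
To unpack that formula: each column equal to 1 means the answer was already pinned down after that guess. If `k` of the five columns equal 1, the pool first shrank to one word after guess `6 - k`, so the answer can be entered on guess `7 - k` (assuming we guess it as soon as only one candidate remains). A small worked sketch, using four rows copied from the output above and illustrative variable names:

```python
import numpy as np

# Four rows taken from the num_remaining_arr[:10] output shown earlier.
sample_rows = np.array([
    [569, 14, 2, 1, 1],
    [100,  1, 1, 1, 1],
    [569, 14, 4, 2, 1],
    [569,  2, 1, 1, 1],
])

# Count the columns where only one candidate remains, then convert to guesses.
ones_per_row = np.sum(sample_rows == 1, axis=1)
guesses = 7 - ones_per_row
print(guesses)  # [5 3 6 4]

# Tabulate how often each guess count occurs.
values, counts = np.unique(guesses, return_counts=True)
print(values, counts)  # [3 4 5 6] [1 1 1 1]
```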

In [23]:
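# use explicit bin edges so each bar is centered on a whole number of guesses (2 through 6)
guess_bin_edges = [1.5,2.5,3.5,4.5,5.5,6.5]
toyplot.bars((np.histogram(num_guesses_required,bins=guess_bin_edges)[0],
              guess_bin_edges),
             xlabel="Num Guesses Required",
             label="Distribution of Number of Guesses Required to Solve"
            );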

# 4 or 5 guesses are usually required... But never more than 6!