<a href="https://colab.research.google.com/github/isaacmattern/wordle/blob/main/wordle.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Wordle Solver**
### Python project by Isaac Mattern

[Wordle](https://www.powerlanguage.co.uk/wordle/) is a game where the objective is to correctly guess a 5-letter word in 6 tries or less. Each time a player submits a guess, the game highlights each letter in the word in

*   green, if the letter is in the word and in the correct location in the word
*   yellow, if the letter is in the word, but not in the position where the user guessed
*   gray, if the word does not contain the letter at all

This project is an attempt to use a list of the most 15,918 5-letter words and some Python magic to solve some Wordles.



##Getting Words
First thing's first, Wordle is all about 5-letter words. So let's find a large dataset of 5-letter words! [This repository](https://github.com/dwyl/english-words) contains the 466,000 most common English words. We'll select the [words_aplpha.txt](https://raw.githubusercontent.com/dwyl/english-words/master/words_alpha.txt) file and place all words that aren't of length 5 into a list. Printing the length of this list tells us we have the 15,918 most common 5-letter English words. So, unless the people at Wordle are feeling particulary cruel, we probably have the answer here somewhere. 

In [85]:
words_file = open("466k.txt", "r")

all_words = []
for line in words_file:
  word = line.strip()
  if len(word) == 5:
    all_words.append(word)

# print(all_words)
print(len(all_words))

15918


## Defining Some Important Functions

Wordle will tell us certain information about our guess which we can use to eliminate many words from our list of possible words.

*   A **green**
*   A **yellow**
*   A **gray** letter means that the letter we guessed is not in the word at all. 

Let's define two functions which will be useful regardless of what algorithm we are using.


1.   An **update_possible** function will allow us to trim down our list of possible Wordle solutions.
2.   A **get_colors** function will allow us to simulate what Wordle's program does each time you submit a guess. Thus, this function will only be used when we're running simulations to test the efficiency of our word-guessing algorithms

In [84]:
def update_possible(guess, possible, colors):
  """
  Uses a list of colors (equal in length
  to the length of our words) to eliminate words
  which could not possibly be correct. 
  """
  for i in range(len(guess)):
    if colors[i] == 0:
      # Eliminate all words which do not have a correct letter in a correct spot
      for word in possible[:]:
        if guess[i] != word[i]:
          possible.remove(word)
    elif colors[i] == 1:
      # Eliminate all words which do not contain a correct letter
      for word in possible[:]:
        if guess[i] not in word:
          possible.remove(word)
    else:
      # Eliminate all words which contain an incorrect letter
      for word in possible[:]:
        if guess[i] in word:
          possible.remove(word)

def get_colors(guess, answer) -> list:
  """
  Compares a guess to an answer and
  returns a list of numbers which signifies
  colors returned by a wordle guess.
  0 = Green
  1 = Yellow
  2 = Gray
  """
  colors = []
  for i in range(len(guess)):
    if guess[i] == answer[i]:
      colors.append(0)
    elif guess[i] in answer:
      colors.append(1)
    else:
      colors.append(2)

  return colors

## Approach number 1: Random Guess

We will first randomly select a word using *random.choice*. A completely random guess probably isn't the greatest strategy, so we shouldn't expect an amazing result. 

After selecting a random 100 words from the list and running a simulation for each of them, it took, on average, 6.37 guesses, which is kind of garbage, since more than 6 guesses is considered a loss by Wordle. 

In [88]:
import random

def random_guess() -> int:
  # Set up answer, word list of possible answers, and generate our first guess
  answer = random.choice(all_words)
  possible = all_words.copy()
  guess = random.choice(possible)
  num_guesses = 1

  # Randomly select a possible answer and use color information to eliminate
  # wrong solutions until we have found our word
  while guess != answer:
    # print(f"Guess #{num_guesses}: {guess} (incorrect)")
    colors = get_colors(guess, answer)
    update_possible(guess, possible, colors)
    guess = random.choice(possible)
    num_guesses = num_guesses + 1

  # print(f"{num_guesses} guesses for the solution \"{guess}\"")
  return num_guesses

In [89]:
simulations = 100
total_guesses = 0

for i in range(simulations):
  total_guesses = total_guesses + random_guess()

average = (float) (total_guesses / simulations)
print(f"Average # of guesses for random: {average}")

Average # of guesses for random: 6.37
