<img src="../../images/banners/python-modular.png" width="600"/>

# <img src="../../images/logos/python.png" width="23"/> Project: Wordle


## <img src="../../images/logos/toc.png" width="20"/> Table of Contents 
* [Rules](#rules)
* [TODO](#todo)
* [Tips](#tips)
* [Test](#test)

---

Wordle is a word game that recently became very popular and was added to the NYT Games website. It was developed by Josh Wardle. You can find the original game [here](https://www.nytimes.com/games/wordle/index.html). However, you can only play it once a day.

Luckily, in the version of Wordle that you are going to program, you can play as many times a day as you want. You will also be able to see which words could still be the right answer. Finally, you will use a bigger data set than the actual Wordle: essentially all the 5-letter words in a Scrabble dictionary.

<a class="anchor" id="rules"></a>

## Rules

- The player enters a 5-letter word as a guess.
- If the guess is the word to be guessed, the game is over and the player receives a congratulations message.
- If the guess isn’t the word to be guessed, the player is told which letters are in the right place and which letters are in the word but in the wrong place.
- Based on this feedback, the player has 6 tries to guess the word.
- If the player fails to guess the word within the 6 attempts, the word is revealed.

<a class="anchor" id="todo"></a>

## TODO

1. Read the possible words from the txt file and save them in a list.
2. Make sure that the user can enter input exactly 6 times.
3. Make sure that regardless of the case, the input is processed correctly.
4. Make sure that you use appropriate data structures for valid characters, invalid positions, and invalid characters.
5. Use the random module to make sure the word to be guessed is randomly chosen.
6. Cluster the potential words accordingly and reveal them to the player each round.
7. If the player first guesses a right letter at the wrong place and later gets the place right, remove that letter from your record of valid characters at invalid positions.
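
One way to approach items 4 and 7 is to keep three small structures and update them after every guess. The sketch below is only an illustration under assumptions: the names `new_knowledge` and `update_knowledge`, and the choice of a dict and sets, are hypothetical and not prescribed by the exercise.

```python
def new_knowledge():
    """Empty record of what the guesses have revealed so far."""
    return {
        'correct': {},     # position -> letter known to be at that position
        'misplaced': {},   # letter -> set of positions where it is known NOT to be
        'invalid': set(),  # letters that are not in the word at all
    }


def update_knowledge(knowledge, guess, answer):
    """Fold the feedback for one guess into the knowledge record."""
    for pos, letter in enumerate(guess):
        if answer[pos] == letter:
            knowledge['correct'][pos] = letter
            # Item 7: the letter is now placed, so drop its wrong-position record.
            knowledge['misplaced'].pop(letter, None)
        elif letter in answer:
            # Only track wrong positions for letters that are not placed yet.
            if letter not in knowledge['correct'].values():
                knowledge['misplaced'].setdefault(letter, set()).add(pos)
        else:
            knowledge['invalid'].add(letter)
    return knowledge


knowledge = new_knowledge()
update_knowledge(knowledge, 'MILKY', 'BUIST')
print(knowledge['misplaced'])  # {'I': {1}}
```

Sets make membership checks cheap, and keying the misplaced letters by letter makes the clean-up in item 7 a single `pop`.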

<a class="anchor" id="tips"></a>

## Tips

- At the very beginning, every word has a chance of being the word to be guessed.
- A word is invalid when it contains an invalid letter or has a valid letter in a position already known to be wrong.
- A word is possible when it isn’t invalid and has every correctly guessed letter in the right place.
- You can seed the random number generator with:

```python
from random import randint, seed
seed()
```
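
The "invalid" and "possible" tips can be turned into a small filter. This is a minimal sketch, assuming the three structures are passed in explicitly; `is_possible` and its parameter names are hypothetical. It also assumes that a letter known to be in the word but misplaced must appear somewhere in a possible word, which is one reading of the tips.

```python
def is_possible(word, correct, misplaced, invalid):
    """Return True if `word` is consistent with everything learned so far.

    correct:   dict mapping position -> letter known to be there
    misplaced: dict mapping letter -> set of positions where it is NOT
    invalid:   set of letters that are not in the word
    """
    # Any invalid letter rules the word out.
    if any(letter in invalid for letter in word):
        return False
    # Every correctly placed letter must be in its place.
    if any(word[pos] != letter for pos, letter in correct.items()):
        return False
    # A misplaced letter must be present, but not at a known wrong position.
    for letter, bad_positions in misplaced.items():
        if letter not in word:
            return False
        if any(word[pos] == letter for pos in bad_positions):
            return False
    return True


candidates = ['BUIST', 'BUSTI', 'QUIST', 'TRIES', 'MILKY']
still_possible = [
    w for w in candidates
    if is_possible(w, correct={}, misplaced={'T': {2}, 'S': {4}}, invalid={'R', 'A', 'E'})
]
print(still_possible)  # ['BUIST', 'BUSTI', 'QUIST']
```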

<a class="anchor" id="test"></a>

## Test

If the word to be guessed is _BUIST_ and if I guess first _MILKY_, my cluster of potential words should consist of 1127 words. If I then go ahead and guess _POUND_, my cluster of potential words should consist of only 52 words. If I go ahead and guess _RATES_, my cluster should consist of only 3 words and they should be `['BUIST', 'BUSTI', 'QUIST']`.

## Solution

### Build Data

First, we need to download a corpus of English words (or words in any language you are interested in). A simple Google search turns up many datasets. However, we want a dataset that also includes frequency information. Word frequency data tells us how popular a word is, so we can control the difficulty of the game: use the most popular words to make it easy, or rare words to make it hard to guess.

For this project, we use the [Kaggle English Word Frequency](https://www.kaggle.com/datasets/rtatman/english-word-frequency) dataset. It contains the counts of the 333,333 most commonly used single words on the English-language web, as derived from the Google Web Trillion Word Corpus.

The dataset comes as a `.csv` file, which you may not know how to work with yet, so we convert it to a `.txt` file with one comma-separated `word, frequency` pair per line.
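
The conversion step is not shown here, so the following is only a sketch of how it could be done with the standard `csv` module. It assumes the downloaded file has a header row followed by two columns (word, count); the function name `csv_to_txt` and the file names in the usage comment are hypothetical.

```python
import csv


def csv_to_txt(csv_path, txt_path):
    """Rewrite a two-column CSV as `word, frequency` lines in a text file."""
    with open(csv_path, newline='', encoding='utf-8') as src, \
            open(txt_path, 'w', encoding='utf-8') as dst:
        reader = csv.reader(src)
        next(reader, None)  # skip the header row (assumed to be present)
        for row in reader:
            if len(row) != 2:
                continue  # ignore blank or malformed rows
            word, count = row
            dst.write(f'{word}, {count}\n')


# Example usage (paths are assumptions):
# csv_to_txt('data/unigram_freq.csv', 'data/words_frequency.txt')
```

Each output line then has the `word, frequency` shape that the reading code in the next cells splits on.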

In [1]:
# Downloaded file is located in data/words_frequency.txt
file_path = 'data/words_frequency.txt'

We need to filter the top N words that have M letters. We choose 10_000 and 5 for N and M respectively, but you can select any values to make your game more or less fun/difficult.

In [2]:
def generate_word_frequency(file_path, word_len: int = 5, limit: int = 1000):
    """
    Generate top words (top `limit` words) that have word_len letters.

    :param file_path: Words frequency data txt file
    :param word_len: Word length (M)
    :param limit: Top N words
    :return: List of words
    """
    # Build data
    words_freq = []
    with open(file_path) as f:
        for line in f:
            word, frequency = line.split(', ')
            frequency = int(frequency)
            words_freq.append((word, frequency))

    # `word_len` letters words
    words_freq = list(filter(
        lambda w_freq: len(w_freq[0]) == word_len, words_freq
    ))
            
    # Sort data
    words_freq = sorted(words_freq, key=lambda w_freq: w_freq[1], reverse=True)

    # Limit data
    words_freq = words_freq[:limit]

    # Drop frequency data and only keep the words
    words = [w_freq[0] for w_freq in words_freq]
    
    return words

In [3]:
word_len = 5
limit = 10_000

words = generate_word_frequency(file_path, word_len, limit)

In [4]:
words[:10]

['about',
 'other',
 'which',
 'their',
 'there',
 'first',
 'would',
 'these',
 'click',
 'price']

### Select a Random Word

To select a random word from a list of words, we can use the `random.choice` function from the `random` module.

In [5]:
import random

In [6]:
random.choice([1, 2, 3])

1

`random.choice()` returns a randomly selected element from a non-empty sequence, such as a string, a range, a list, or a tuple. Seeding the generator first makes the selection reproducible:

In [7]:
random.seed(3)

In [8]:
word = random.choice(words)

In [9]:
word = word.upper()

In [10]:
word

'DOLOR'

Because the generator is seeded with a fixed value, restarting the kernel and running these cells again gives the same result.
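
To see the effect of the seed without restarting the kernel, you can give each draw its own generator. This is a small sketch, not part of the solution itself; `pick_word` and `sample_words` are hypothetical names, and the short word list is a placeholder.

```python
import random

sample_words = ['about', 'other', 'which', 'their', 'there']  # placeholder list


def pick_word(words, seed=None):
    """Pick one word in upper case; the same seed always picks the same word."""
    rng = random.Random(seed)  # private generator, leaves the global one untouched
    return rng.choice(words).upper()


first = pick_word(sample_words, seed=3)
second = pick_word(sample_words, seed=3)
print(first == second)  # True
```

Leaving `seed` as `None` seeds the generator from the operating system, so the word changes from game to game.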

### Wordle Run

To implement the Wordle game, we need to do the following:

1. Create a list of words.
2. Select a random word from the list.
3. Ask the user to guess the word.
4. Check if the user's guess is correct.
5. If the user's guess is correct, print a message and end the game.
6. If the user's guess is incorrect, print a message and let the user try again.
7. The message should tell the user which letters are valid and in the correct position, which letters are valid but in the wrong position, and which letters are invalid (not in the word).
8. If the user has guessed incorrectly 6 times, print a message and end the game.


So far we have done steps 1 and 2. Step 3 uses the `input` function and step 4 the `==` operator. Steps 5 and 6 use an `if` statement, step 7 a `for` loop over the letters, and step 8 a `while` loop.

To make the game colorful, we can use `termcolor` to print the letters in different colors: green for correct letters in the correct position, yellow for correct letters in the wrong position, and red for incorrect letters. To keep the printing simple, we wrap each color in a function:

In [11]:
!pip install termcolor

from termcolor import colored



In [12]:
def print_success(text, end='\n'):
    print(colored(text, 'green', attrs=['reverse']), end=end)

def print_warning(text, end='\n'):
    print(colored(text, 'yellow', attrs=['reverse']), end=end)

def print_error(text, end='\n'):
    print(colored(text, 'red', attrs=['reverse']), end=end)

In [13]:
print_error('Error')

[7m[31mError[0m


In [14]:
print_success('Success')

[7m[32mSuccess[0m


In [15]:
print_warning('Warning')



In [16]:
def check_word(word, guess_word):
    for w_letter, g_letter in zip(word, guess_word):
        if w_letter == g_letter:
            print_success(f' {g_letter} ', end='')
            print(' ', end='')
        elif g_letter in word:
            print_warning(f' {g_letter} ', end='')
            print(' ', end='')
        else:
            print_error(f' {g_letter} ', end='')
            print(' ', end='')
    print()
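
Note that `check_word` marks a guessed letter yellow whenever it occurs anywhere in the word, so a letter that is guessed twice can be highlighted twice even if the word contains it only once. If you want to account for repeated letters, one option is a two-pass scoring function. The sketch below is an alternative under assumptions, not part of the solution above; `score_guess` and its three labels are hypothetical names.

```python
from collections import Counter


def score_guess(word, guess_word):
    """Label each guessed letter as 'correct', 'misplaced', or 'absent'."""
    result = ['absent'] * len(guess_word)
    remaining = Counter()  # letters of `word` not matched exactly

    # First pass: exact matches, and count the unmatched letters of the word.
    for i, (w_letter, g_letter) in enumerate(zip(word, guess_word)):
        if w_letter == g_letter:
            result[i] = 'correct'
        else:
            remaining[w_letter] += 1

    # Second pass: a letter is misplaced only while unmatched copies remain.
    for i, g_letter in enumerate(guess_word):
        if result[i] != 'correct' and remaining[g_letter] > 0:
            result[i] = 'misplaced'
            remaining[g_letter] -= 1

    return result


print(score_guess('DOLOR', 'LLAMA'))
# ['misplaced', 'absent', 'absent', 'absent', 'absent']
```

Returning labels instead of printing keeps the scoring testable; the colored printing can then map each label to `print_success`, `print_warning`, or `print_error`.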

In [None]:
# Start Game
num_try = 6
success = False

while num_try:
    guess_word = input(f'Enter a {word_len} letter word (or q to exit): ')
    if guess_word.lower() == 'q':
        break
    guess_word = guess_word.upper()

    # Word length
    if len(guess_word) != word_len:
        print(f'Word must have {word_len} letters. You entered {len(guess_word)}!')
        continue

    # Check valid word
    if guess_word.lower() not in words:
        print_warning('Word is not valid!')
        continue

    # Check valid, invalid positions, invalid characters
    check_word(word, guess_word)

    if word == guess_word:
        print()
        print_success(' Congratulations! ')
        success = True
        break

    num_try -= 1

if not success:
    print_warning(f'Game over: The word was "{word}".')

Enter a 5 letter word (or q to exit):  apple

 A (red)  P (red)  P (red)  L (yellow)  E (red)

Enter a 5 letter word (or q to exit):  color

 C (red)  O (green)  L (green)  O (green)  R (green)

Enter a 5 letter word (or q to exit):  dolor

 D (green)  O (green)  L (green)  O (green)  R (green)

 Congratulations! (green)


### Modularization

We can make the code easier to read and reuse by putting it into a `Wordle` class and functions. We can also make the code more flexible by letting the user specify the number of guesses and the list of words as parameters (try this as an exercise).

The modularized code is available on GitHub at [Pytopia Wordle Repo](https://github.com/pytopia/wordle).
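
As a starting point for the exercise, here is one possible shape for such a class. It is a sketch under assumptions and not the code from the repository: the class layout, the `max_tries` and `rng` parameters, and the method names are all hypothetical, and printing is left out so the logic stays easy to test.

```python
import random


class Wordle:
    """Minimal game state: a word list, a secret word, and a try counter."""

    def __init__(self, words, max_tries=6, rng=None):
        self.words = [w.upper() for w in words]
        self.max_tries = max_tries
        # An injected generator (e.g. random.Random(3)) makes games reproducible.
        self.rng = rng if rng is not None else random.Random()
        self.answer = self.rng.choice(self.words)
        self.tries_left = max_tries
        self.won = False

    @property
    def finished(self):
        return self.won or self.tries_left == 0

    def is_valid(self, guess):
        """A guess is valid if it is one of the known words (case-insensitive)."""
        return guess.upper() in self.words

    def guess(self, guess):
        """Consume one try and return True if the guess is the answer."""
        if self.finished:
            raise RuntimeError('The game is already over.')
        if not self.is_valid(guess):
            raise ValueError('Word is not valid!')
        self.tries_left -= 1
        self.won = guess.upper() == self.answer
        return self.won


game = Wordle(['about', 'other', 'which'], max_tries=3, rng=random.Random(3))
print(game.is_valid('Other'))  # True
```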