
# 👋 Welcome!

This notebook demonstrates Monte Carlo with a familiar example: **Wordle!**

You'll see how we:
1. Model a minimal Wordle environment
2. Simulate many possible hidden words
3. Summarize how well a specific opener word performs

**The key idea:** We are *not* guessing today's word; we are measuring the quality of an opener in an uncertain environment.

## 1. Setup and Data
Wordle word lists can be found in several places. I got these lists from https://github.com/Kinkelin/WordleCompetition/tree/main/data/official.

We sort them into *answers.txt* and *guesses.txt*, which the loading code below reads from the `data` folder of this repository.

In [1]:
# Set up the global variables and housekeeping.

from collections import Counter
from random import Random
import matplotlib.pyplot as plt

# --- Load Word Lists ---
# Answers: every possible Wordle answer (about 2000 words).
with open("./data/answers.txt") as f:
    ANSWERS = [w.strip().upper() for w in f if len(w.strip()) == 5]

# Guesses: every valid Wordle guess, including the possible answers (about 12000 words).
with open("./data/guesses.txt") as f:
    GUESSES = [w.strip().upper() for w in f if len(w.strip()) == 5]

*Note:* We split the words into valid answers and valid guesses because this is what Wordle does: not every word you can guess is a possible answer to the puzzle.

## 2. Wordle Environment
This section scores each guess, marking which letters are good and which are bad while handling duplicate letters correctly. It then narrows the list of words to those consistent with the feedback.

In [2]:
# --- Feedback Function ---
def feedback(secret: str, guess: str):
    """Return pattern as tuple of G/Y/- (greens, yellows, grays)."""
    secret = secret.upper()
    guess = guess.upper()
    res = ["-"] * 5
    counts = Counter(secret)

    # Greens
    for i in range(5):
        if guess[i] == secret[i]:
            res[i] = "G"
            counts[guess[i]] -= 1

    # Yellows
    for i in range(5):
        if res[i] == "G":
            continue
        if counts[guess[i]] > 0:
            res[i] = "Y"
            counts[guess[i]] -= 1

    return tuple(res)

def filter_candidates(cands, guess, patt):
    """Keep only words consistent with guess & feedback pattern."""
    return [w for w in cands if feedback(w, guess) == patt]
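
To see the duplicate-letter handling in action, here is a small worked check. The snippet restates `feedback` and `filter_candidates` in compact form so it runs on its own; the secret `CRANE`, the guess `EERIE`, and the four-word candidate list are illustrative choices, not drawn from the word lists.

```python
from collections import Counter

def feedback(secret, guess):
    """Same two-pass logic as above: greens first, then yellows."""
    res = ["-"] * 5
    counts = Counter(secret)
    for i in range(5):
        if guess[i] == secret[i]:
            res[i] = "G"
            counts[guess[i]] -= 1
    for i in range(5):
        if res[i] != "G" and counts[guess[i]] > 0:
            res[i] = "Y"
            counts[guess[i]] -= 1
    return tuple(res)

def filter_candidates(cands, guess, patt):
    return [w for w in cands if feedback(w, guess) == patt]

# CRANE contains a single E, and the final E of the guess claims it as a green.
# The two earlier E's in EERIE therefore stay gray instead of turning yellow.
patt = feedback("CRANE", "EERIE")
print("".join(patt))  # --Y-G

# Hypothetical mini candidate list, for illustration only.
words = ["CRANE", "GRAPE", "TRADE", "EAGLE"]
remaining = filter_candidates(words, "EERIE", patt)
print(remaining)  # ['CRANE', 'GRAPE', 'TRADE']
```

`EAGLE` is dropped because it would have produced a green in the first position, which the observed pattern rules out.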

## 3. Strategy
To keep the focus on the Monte Carlo simulation, we use a simple strategy: after the opener, always guess the first remaining "eligible" word in alphabetical order.

In [3]:
# --- The actual gameplay strategy ---
def play_one(secret, opener, max_guesses=6):
    """
    Play one game:
    - Start with `opener`
    - Then always pick the first candidate from remaining ANSWERS
    """
    cands = ANSWERS[:]  # secrets are always from the ANSWERS list
    guess = opener.upper()

    for turn in range(1, max_guesses + 1):
        patt = feedback(secret, guess)
        if patt == ("G","G","G","G","G"):
            return turn, True

        cands = filter_candidates(cands, guess, patt)
        if not cands:
            return turn, False

        # Next guess: naive (first candidate in sorted order)
        guess = sorted(cands)[0]

    return max_guesses, False

## 4. Monte Carlo
**Goal:** estimate how good the opener is on *average*, rather than simulate one specific game.

- Randomly sample a set of words from answers.txt
- Play a game where the answer is that word
- Measure the success rate within 6 guesses

*This* is what Monte Carlo is about: random sampling + repeated trials + aggregation.

In [4]:
# --- Monte Carlo Simulation ---
def run_monte_carlo(opener, trials, seed=42):
    """
    Run Monte Carlo simulation:
    - Randomly sample `trials` secrets from ANSWERS
    - Play each game with the strategy
    - Summarize results
    """
    rng = Random(seed)
    secrets = rng.sample(ANSWERS, min(trials, len(ANSWERS)))

    results = [play_one(s, opener=opener) for s in secrets]
    solved = [t for (t, ok) in results if ok]

    success_rate = len(solved) / len(secrets)
    avg_guesses = sum(solved) / len(solved) if solved else float('nan')

    print(f"Opener: {opener}")
    print(f"Secrets tested: {len(secrets)}")
    print(f"Success within 6: {success_rate:.1%}")
    print(f"Average guesses (solved only): {avg_guesses:.2f}")

In [5]:
# Example run
# Sub in your starting word
run_monte_carlo(opener="SOARE", trials=500)

Opener: SOARE
Secrets tested: 500
Success within 6: 97.6%
Average guesses (solved only): 3.94
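
Because only 500 secrets were sampled, the 97.6% figure is itself an estimate. The sketch below gives a rough sense of its uncertainty, assuming a simple binomial (normal-approximation) model. That model ignores that secrets are drawn without replacement, so it likely overstates the spread a little, and it is less reliable for rates this close to 100%. `success_rate_interval` is a hypothetical helper, not part of the notebook.

```python
from math import sqrt

def success_rate_interval(successes, n, z=1.96):
    """Normal-approximation interval for a proportion (assumes independent trials)."""
    p = successes / n
    se = sqrt(p * (1 - p) / n)  # standard error of the estimated proportion
    return p, max(0.0, p - z * se), min(1.0, p + z * se)

# 97.6% of 500 secrets corresponds to 488 solved games.
p, lo, hi = success_rate_interval(488, 500)
print(f"estimate {p:.1%}, approx. 95% interval {lo:.1%} to {hi:.1%}")
```

Under these assumptions the interval spans roughly 96% to 99%, so differences of a percentage point between two openers at 500 trials should be read with caution.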


### 👉 Try:
- Change the opener
- Increase the trials
- Make a smarter strategy
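
As a starting point for the last suggestion, here is one possible heuristic, sketched under assumptions: instead of taking the first candidate alphabetically, pick the candidate whose feedback would split the remaining words into the smallest groups on average. `best_guess` is a hypothetical helper, the five-word list is illustrative, and `feedback` is restated so the snippet runs on its own. This is not the strategy used in the notebook above.

```python
from collections import Counter

def feedback(secret, guess):
    """Same two-pass logic as in section 2: greens first, then yellows."""
    res = ["-"] * 5
    counts = Counter(secret)
    for i in range(5):
        if guess[i] == secret[i]:
            res[i] = "G"
            counts[guess[i]] -= 1
    for i in range(5):
        if res[i] != "G" and counts[guess[i]] > 0:
            res[i] = "Y"
            counts[guess[i]] -= 1
    return tuple(res)

def best_guess(cands):
    """Return the candidate with the smallest expected number of survivors."""
    def expected_remaining(guess):
        # Group the possible secrets by the pattern this guess would produce.
        buckets = Counter(feedback(secret, guess) for secret in cands)
        # A secret in a bucket of size n leaves n candidates after the guess.
        return sum(n * n for n in buckets.values()) / len(cands)
    # Sorting first makes tie-breaking deterministic (alphabetical).
    return min(sorted(cands), key=expected_remaining)

# Hypothetical remaining candidates, for illustration only.
cands = ["STARE", "STORE", "SHORE", "SNORE", "ABACK"]
print(sorted(cands)[0])   # ABACK (the naive pick)
print(best_guess(cands))  # SHORE
```

On this list the naive pick `ABACK` leaves 2.2 candidates on average, while `SHORE` leaves 1.4. The scoring is quadratic in the number of candidates, so it may be slow when the candidate list is still large.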