# Hangman Solver using Hidden Markov Models (HMM)

## Part A: HMM Implementation & Validation

This notebook takes a probabilistic approach to Hangman: it trains a separate Hidden Markov Model (HMM) on letter sequences for each word length and uses the learned emission probabilities to choose the next letter to guess.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from hmmlearn import hmm
import string
from collections import Counter
import pickle
import json

# ============================================================================
# DATA PREPROCESSING
# ============================================================================

def load_corpus(file_path):
    """Load words from corpus file"""
    with open(file_path, 'r') as f:
        words = f.read().splitlines()
    return words

def preprocess_words(words):
    """Preprocess words: lowercase and filter"""
    processed_words = []
    for word in words:
        word = word.lower()
        if word.isalpha():
            processed_words.append(word)
    return processed_words


## Step 1: Setup & Data Preprocessing

**Algorithm:** Text Preprocessing with Filtering
- Load corpus and test files
- Normalize text (lowercase conversion)
- Drop any word that contains non-alphabetic characters (`str.isalpha`)
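
To make the filter concrete, here is a small sketch of what the lowercase-and-`isalpha` rule keeps and drops. The sample words are made up for illustration and are not taken from the corpus, and `keep_ascii_alphabetic` is a hypothetical stricter variant that is not part of this notebook.

```python
import string

ASCII_LETTERS = set(string.ascii_lowercase)

def keep_alphabetic(words):
    """Same rule as preprocess_words: lowercase, keep words made only of letters."""
    return [w.lower() for w in words if w.isalpha()]

def keep_ascii_alphabetic(words):
    """Stricter variant (an assumption, not the notebook's rule): a-z only."""
    lowered = (w.lower() for w in words)
    return [w for w in lowered if w and set(w) <= ASCII_LETTERS]

# "caf\u00e9" is "cafe" with an accented final letter
sample = ["Apple", "e-mail", "rock", "R2D2", "", "caf\u00e9"]

print(keep_alphabetic(sample))        # keeps apple, rock and the accented word
print(keep_ascii_alphabetic(sample))  # ['apple', 'rock']
```

`str.isalpha` accepts any Unicode letter, so an accented word would pass the filter while `word_to_sequence` later skips characters outside `a`-`z`. Whether that matters depends on the corpus.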

In [2]:

# Load training corpus
corpus_path = 'Data/corpus.txt'
raw_words = load_corpus(corpus_path)
processed_words = preprocess_words(raw_words)

print(f"✓ Total words in corpus: {len(raw_words)}")
print(f"✓ Words after preprocessing: {len(processed_words)}")

# Load test data
test_path = 'Data/test.txt'
test_words_raw = load_corpus(test_path)
test_words = preprocess_words(test_words_raw)
print(f"✓ Test words loaded: {len(test_words)}")


✓ Total words in corpus: 50000
✓ Words after preprocessing: 49979
✓ Test words loaded: 2000


## Step 2: Load Training & Test Data

**Operation:** File I/O and Data Loading
- Load corpus words for training HMM models
- Load test set for validation
- Display dataset statistics
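
Because a separate model is later trained for each word length, a per-length count is a natural statistic to show next to the totals. A minimal sketch on a made-up word list (the helper name `length_histogram` is hypothetical):

```python
from collections import Counter

def length_histogram(words):
    """Map word length -> number of words of that length, sorted by length."""
    counts = Counter(len(w) for w in words)
    return dict(sorted(counts.items()))

toy_words = ["cat", "dog", "bird", "horse", "zebra", "ox"]
hist = length_histogram(toy_words)
print(hist)  # {2: 1, 3: 2, 4: 1, 5: 2}
```

Applied to `processed_words`, the same helper would show which lengths have too few examples to train on.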

In [3]:

# ============================================================================
# HMM MODEL CLASS
# ============================================================================

class HangmanHMM:
    """Hidden Markov Model for Hangman letter prediction"""
    
    def __init__(self):
        self.models = {}
        self.alphabet = string.ascii_lowercase
        self.letter_to_idx = {letter: idx for idx, letter in enumerate(self.alphabet)}
        self.idx_to_letter = {idx: letter for idx, letter in enumerate(self.alphabet)}
    
    def word_to_sequence(self, word):
        """Convert word to sequence of letter indices"""
        return [[self.letter_to_idx[letter]] for letter in word.lower() if letter in self.alphabet]
    
    def train_for_length(self, words_of_length, n_states=None):
        """Train one HMM on words that all share the same length"""
        if not words_of_length or len(words_of_length) < 2:
            return None

        word_length = len(words_of_length[0])

        # Convert words to sequences of letter indices, stacked into one array
        sequences = [self.word_to_sequence(word) for word in words_of_length]
        lengths = [len(seq) for seq in sequences]
        X = np.concatenate(sequences)

        # Compute letter frequencies for this length
        letter_counts = Counter(''.join(words_of_length))
        total_letters = sum(letter_counts.values())
        freq_vec = np.array([letter_counts.get(ch, 0) / total_letters for ch in self.alphabet])

        # Additive smoothing so that no letter starts with zero probability
        freq_vec = freq_vec + 1e-6
        freq_vec = freq_vec / np.sum(freq_vec)

        # Choose the number of states from the word length unless the caller fixed it
        if n_states is None:
            n_states = min(max(3, word_length // 2), 10)

        # Initialize HMM
        model = hmm.MultinomialHMM(n_components=n_states, n_iter=100, random_state=42, tol=0.01)

        # Initial guesses; with the default init_params, fit() re-initializes
        # startprob_ and transmat_ (hmmlearn warns about this)
        model.startprob_ = np.ones(n_states) / n_states
        model.transmat_ = np.ones((n_states, n_states)) / n_states
        model.emissionprob_ = np.tile(freq_vec, (n_states, 1))
        
        # Train
        try:
            model.fit(X, lengths=lengths)
            return model
        except Exception as e:
            return None
    
    def train(self, words):
        """Train separate HMMs for EACH word length"""
        words_by_length = {}
        for word in words:
            length = len(word)
            if length not in words_by_length:
                words_by_length[length] = []
            words_by_length[length].append(word)
        
        print("\n" + "="*70)
        print("TRAINING INDIVIDUAL HMM MODELS FOR EACH WORD LENGTH")
        print("="*70)
        
        total_lengths = len(words_by_length)
        trained_count = 0
        
        for i, (length, words_of_length) in enumerate(sorted(words_by_length.items()), 1):
            print(f"[{i:2d}/{total_lengths}] Length {length:2d}: {len(words_of_length):5d} words", end=" ... ")
            
            if len(words_of_length) < 2:
                print("✗ SKIP (too few)")
                continue
            
            model = self.train_for_length(words_of_length)
            if model is not None:
                self.models[length] = model
                trained_count += 1
                print("✓ TRAINED")
            else:
                print("✗ FAILED")
        
        print("\n" + "="*70)
        print("Training Summary:")
        print(f"  - Successfully trained: {trained_count}/{total_lengths}")
        print(f"  - Covered word lengths: {sorted(self.models.keys())}")
        print("="*70)
        
        return self.models


## Step 3: HMM Model Class

**Algorithm:** Hidden Markov Model (Baum-Welch Training)
- Separate models for each word length
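- Emission probabilities initialized from the letter frequencies of each length group
- Additive smoothing (ε = 1e-6) so that no letter starts with zero probability
- Emission probabilities model which letters each hidden state tends to produce
- `MultinomialHMM` with 3 to 10 hidden states, set to `min(max(3, length // 2), 10)`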
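
The state-count rule and the smoothing step can be checked in isolation. The sketch below reimplements both in plain Python on toy input; `states_for_length` and `smoothed_letter_freqs` are hypothetical helper names that mirror the expressions used inside `train_for_length`.

```python
import string
from collections import Counter

def states_for_length(word_length):
    """Half the word length, clamped to the range [3, 10]."""
    return min(max(3, word_length // 2), 10)

def smoothed_letter_freqs(words, eps=1e-6):
    """Relative letter frequencies with a small constant added, then renormalized."""
    counts = Counter("".join(words))
    total = sum(counts.values())
    raw = [counts.get(ch, 0) / total + eps for ch in string.ascii_lowercase]
    norm = sum(raw)
    return [p / norm for p in raw]

print([states_for_length(n) for n in (1, 5, 6, 12, 20, 24)])  # [3, 3, 3, 6, 10, 10]

freqs = smoothed_letter_freqs(["abba", "cab"])
print(min(freqs) > 0)                # True: unseen letters keep a tiny probability
print(abs(sum(freqs) - 1.0) < 1e-9)  # True: still a probability distribution
```

Words of up to 7 letters all get 3 states and words of 20 or more letters get 10, so the rule only varies in between.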

In [4]:

# ============================================================================
# PREDICTION CLASS
# ============================================================================

class HangmanPredictor:
    """Predict next letter using trained HMM"""
    
    def __init__(self, hmm_model):
        self.hmm_model = hmm_model
        self.alphabet = string.ascii_lowercase
        self.letter_to_idx = {letter: idx for idx, letter in enumerate(self.alphabet)}
        self.idx_to_letter = {idx: letter for idx, letter in enumerate(self.alphabet)}
    
    def get_letter_probabilities(self, word_length):
        """Compute letter probabilities for each position"""
        if word_length not in self.hmm_model.models:
            return np.ones((26, word_length)) / 26
        
        model = self.hmm_model.models[word_length]
        
        try:
            emission_probs = model.emissionprob_
            avg_emission = np.mean(emission_probs, axis=0)
            
            avg_emission = np.maximum(avg_emission, 1e-10)
            emission_probs_normalized = avg_emission / np.sum(avg_emission)
            
            position_probs = np.tile(emission_probs_normalized, (word_length, 1)).T
            return position_probs
        except Exception as e:
            return np.ones((26, word_length)) / 26
    
    def predict_next_letter(self, masked_word, guessed_letters):
        """Predict next best letter to guess"""
        word_length = len(masked_word)
        
        probs = self.get_letter_probabilities(word_length)
        
        if probs.shape[0] != 26 or probs.shape[1] != word_length:
            probs = np.ones((26, word_length)) / 26
        
        unknown_positions = [i for i, c in enumerate(masked_word) if c == '_']
        if unknown_positions:
            avg_probs = np.mean(probs[:, unknown_positions], axis=1)
        else:
            avg_probs = np.mean(probs, axis=1)
        
        if len(avg_probs) != 26:
            avg_probs = np.ones(26) / 26
        
        for letter in guessed_letters:
            if letter in self.letter_to_idx:
                idx = self.letter_to_idx[letter]
                avg_probs[idx] = 0
        
        total_prob = np.sum(avg_probs)
        if total_prob > 1e-10:
            avg_probs = avg_probs / total_prob
        else:
            avg_probs = np.ones(26) / 26
        
        best_idx = np.argmax(avg_probs)
        best_letter = self.alphabet[best_idx]
        best_prob = avg_probs[best_idx]
        
        return best_letter, best_prob

# ============================================================================
# TRAIN AND SAVE MODELS
# ============================================================================

print("\nTraining models...")
hmm_model = HangmanHMM()
hmm_model.train(processed_words)

predictor = HangmanPredictor(hmm_model)

# Save models
print("\n" + "="*70)
print("SAVING MODELS")
print("="*70)

model_data = {
    'hmm_model': hmm_model,
    'predictor': predictor,
    'trained_lengths': list(hmm_model.models.keys())
}

with open('hmm_hangman_model.pkl', 'wb') as f:
    pickle.dump(model_data, f)

print(f"Saved {len(hmm_model.models)} trained models")
print(f"Model file: 'hmm_hangman_model.pkl'")



MultinomialHMM has undergone major changes. The previous version was implementing a CategoricalHMM (a special case of MultinomialHMM). This new implementation follows the standard definition for a Multinomial distribution (e.g. as in https://en.wikipedia.org/wiki/Multinomial_distribution). See these issues for details:
https://github.com/hmmlearn/hmmlearn/issues/335
https://github.com/hmmlearn/hmmlearn/issues/340
Even though the 'startprob_' attribute is set, it will be overwritten during initialization because 'init_params' contains 's'
Even though the 'transmat_' attribute is set, it will be overwritten during initialization because 'init_params' contains 't'
Some rows of transmat_ have zero sum because no transition from the state was ever observed.
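
The warning above is about how observations are encoded. As it states, older hmmlearn releases treated `MultinomialHMM` input as categorical symbols, whereas the current class expects each row to be a vector of counts. `word_to_sequence` produces the first form (one symbol index per row). The sketch below contrasts the two encodings in plain Python; `categorical_rows` and `multinomial_rows` are hypothetical helper names, and which encoding is appropriate depends on the installed hmmlearn version and the model class (hmmlearn's `CategoricalHMM` takes symbol indices).

```python
import string

ALPHABET = string.ascii_lowercase
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}

def categorical_rows(word):
    """One row per letter holding a single symbol index (len(word) x 1)."""
    return [[INDEX[ch]] for ch in word]

def multinomial_rows(word):
    """One row per letter holding a 26-wide count vector with a single 1."""
    rows = []
    for ch in word:
        counts = [0] * len(ALPHABET)
        counts[INDEX[ch]] = 1  # one trial per position
        rows.append(counts)
    return rows

print(categorical_rows("cab"))                            # [[2], [0], [1]]
print([row.index(1) for row in multinomial_rows("cab")])  # [2, 0, 1]
```

If index-style rows are passed to a count-based model, the model may treat them as a single feature, so it is worth checking that `model.emissionprob_.shape[1]` equals 26 after fitting.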


Training models...

TRAINING INDIVIDUAL HMM MODELS FOR EACH WORD LENGTH
[ 1/24] Length  1:    46 words ... ✓ TRAINED
[ 2/24] Length  2:    84 words ... ✓ TRAINED
[ 3/24] Length  3:   388 words ... ✓ TRAINED
[ 4/24] Length  4:  1169 words ... ✓ TRAINED
[ 5/24] Length  5:  2340 words ... ✓ TRAINED
[ 6/24] Length  6:  3755 words ... ✓ TRAINED
[ 7/24] Length  7:  5111 words ... ✓ TRAINED
[ 8/24] Length  8:  6348 words ... ✓ TRAINED
[ 9/24] Length  9:  6787 words ... ✓ TRAINED
[10/24] Length 10:  6465 words ... ✓ TRAINED
[11/24] Length 11:  5452 words ... ✓ TRAINED
[12/24] Length 12:  4292 words ... ✓ TRAINED
[13/24] Length 13:  3094 words ... ✓ TRAINED
[14/24] Length 14:  2019 words ... ✓ TRAINED
[15/24] Length 15:  1226 words ... ✓ TRAINED
[16/24] Length 16:   698 words ... ✓ TRAINED
[17/24] Length 17:   375 words ... ✓ TRAINED
[18/24] Length 18:   174 words ... ✓ TRAINED
[19/24] Length 19:    88 words ... ✓ TRAINED
[20/24] Length 20:    40 words ... ✓ TRAINED
[21/24] Length 21:    16 words ... ✓ TRAINED
[22/24] Length 22:     8 words ... ✓ TRAINED
[23/24] Length 23:     3 words ... ✓ TRAINED
[24/24] Length 24:     1 words ... ✗ SKIP (too few)

Training Summary:
  - Successfully trained: 23/24
  - Covered word lengths: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]

SAVING MODELS
Saved 23 trained models
Model file: 'hmm_hangman_model.pkl'
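
Reloading the saved file in a later session is the mirror image of the `pickle.dump` call. The sketch below shows the round trip on a placeholder dictionary and an in-memory buffer, so it touches no files; the dictionary contents are invented for illustration.

```python
import io
import pickle

# Stand-in for the notebook's model_data dictionary (values are placeholders)
model_data = {"trained_lengths": [1, 2, 3], "note": "toy example"}

buffer = io.BytesIO()            # in-memory file object
pickle.dump(model_data, buffer)  # serialize
buffer.seek(0)                   # rewind before reading back
restored = pickle.load(buffer)   # deserialize

print(restored == model_data)    # True
```

With the real file the load would be `pickle.load(f)` inside `with open('hmm_hangman_model.pkl', 'rb') as f:`. Pickle stores references to classes rather than their code, so `HangmanHMM` and `HangmanPredictor` need to be defined or importable before loading, and only trusted pickle files should be loaded.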


## Step 4: Prediction Engine & Model Training

**Algorithm:** Letter Probability Inference + Model Persistence
- Extract emission probabilities from trained HMM
- Predict best letter using argmax on probability distribution
- Exclude already-guessed letters
- Save serialized models using pickle for later use
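
The selection step (zero out guessed letters, renormalize, take the argmax) can be illustrated without a trained model. This is a plain-Python sketch of that logic under a made-up distribution; `best_unguessed_letter` is a hypothetical name and the probabilities are invented.

```python
import string

ALPHABET = string.ascii_lowercase

def best_unguessed_letter(letter_probs, guessed):
    """Zero out guessed letters, renormalize, and return (letter, probability)."""
    masked = [0.0 if ch in guessed else p for ch, p in zip(ALPHABET, letter_probs)]
    total = sum(masked)
    if total <= 1e-10:  # nothing left: fall back to a uniform distribution
        masked, total = [1.0 / 26] * 26, 1.0
    masked = [p / total for p in masked]
    best = max(range(26), key=masked.__getitem__)  # first index wins ties
    return ALPHABET[best], masked[best]

# Toy distribution: 'e' most likely, then 't', remaining mass spread evenly
probs = [0.5 / 24] * 26
probs[ALPHABET.index("e")] = 0.3
probs[ALPHABET.index("t")] = 0.2

print(best_unguessed_letter(probs, set())[0])           # e
print(best_unguessed_letter(probs, {"e"})[0])           # t
print(best_unguessed_letter([1 / 26] * 26, {"a"})[0])   # b
```

The last call shows a property worth keeping in mind: on a uniform distribution the argmax returns the first unguessed letter in alphabetical order (`np.argmax` also returns the first maximum on ties). Any time `get_letter_probabilities` takes one of its uniform fallback branches, the predictor therefore guesses alphabetically.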

In [5]:
# ============================================================================
# VALIDATION AND SUCCESS RATE
# ============================================================================

print("\n" + "="*70)
print("VALIDATING MODEL ON TEST DATA")
print("="*70)

def simulate_hangman_game(word, hmm_model, predictor, num_reveals=1):
    """Simulate a single Hangman turn: reveal random positions, then check whether
    the predicted letter occurs in a still-hidden position"""
    if len(word) < 3:
        return None
    
    word = word.lower()
    masked = ['_'] * len(word)
    guessed_letters = set()
    
    # Randomly reveal initial letters
    reveal_indices = np.random.choice(len(word), min(num_reveals, len(word)), replace=False)
    for idx in reveal_indices:
        masked[idx] = word[idx]
        guessed_letters.add(word[idx])
    
    masked_word = ''.join(masked)
    
    try:
        next_letter, probability = predictor.predict_next_letter(masked_word, guessed_letters)
    except Exception as e:
        return None
    
    # Check if prediction is correct
    remaining_letters = set(c for i, c in enumerate(word) 
                           if c not in guessed_letters and masked[i] == '_')
    is_correct = next_letter in remaining_letters
    
    return {
        'word': word,
        'masked_word': masked_word,
        'prediction': next_letter,
        'probability': probability,
        'is_correct': is_correct,
        'word_length': len(word),
        'has_model': len(word) in hmm_model.models
    }

np.random.seed(42)
results = []

print("\nRunning predictions on test set...")
for i, word in enumerate(test_words):
    if i % 500 == 0:
        print(f"  Progress: {i}/{len(test_words)}", end='\r')
    
    result = simulate_hangman_game(word, hmm_model, predictor, num_reveals=1)
    if result is not None:
        results.append(result)

print(f"  Progress: {len(test_words)}/{len(test_words)} ✓")



VALIDATING MODEL ON TEST DATA

Running predictions on test set...
  Progress: 2000/2000 ✓


## Step 5: Model Validation & Testing

**Algorithm:** Single-Guess Accuracy Testing
- For each test word of at least 3 letters, reveal one randomly chosen position
- Ask the trained HMM predictor for the next letter
- Count the prediction as correct if the letter occurs in a still-hidden position
- Stratified analysis by word length
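
The correctness rule is easy to misread, so here is a small sketch of it on a toy word. `is_correct_guess` is a hypothetical helper that mirrors the `remaining_letters` check in `simulate_hangman_game`; it is not a function from this notebook.

```python
def is_correct_guess(word, revealed_positions, guess):
    """True if `guess` occurs at a hidden position and was not already revealed."""
    revealed_letters = {word[i] for i in revealed_positions}
    hidden_letters = {
        ch for i, ch in enumerate(word)
        if i not in revealed_positions and ch not in revealed_letters
    }
    return guess in hidden_letters

# "banana" with only position 1 ('a') revealed: _a____
print(is_correct_guess("banana", {1}, "n"))  # True
print(is_correct_guess("banana", {1}, "a"))  # False: 'a' counts as already guessed
print(is_correct_guess("banana", {1}, "e"))  # False: not in the word
```

The reported success rate is therefore the share of words for which one guess hits a hidden letter, which is not the same as the share of full games won.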

In [6]:

# ============================================================================
# CALCULATE SUCCESS METRICS
# ============================================================================

print("\n" + "="*70)
print("SUCCESS RATE ANALYSIS")
print("="*70)

correct = sum(1 for r in results if r['is_correct'])
total = len(results)
success_rate = (correct / total) * 100 if total > 0 else 0

print(f"\n{'OVERALL SUCCESS RATE':.<50} {success_rate:.2f}%")
print(f"{'Total Predictions':.<50} {total}")
print(f"{'Correct Predictions':.<50} {correct}")
print(f"{'Incorrect Predictions':.<50} {total - correct}")

# Success rate by word length
print("\n" + "-"*70)
print("SUCCESS RATE BY WORD LENGTH")
print("-"*70)

success_by_length = {}
for r in results:
    length = r['word_length']
    if length not in success_by_length:
        success_by_length[length] = {'correct': 0, 'total': 0, 'with_model': 0}
    success_by_length[length]['total'] += 1
    if r['has_model']:
        success_by_length[length]['with_model'] += 1
    if r['is_correct']:
        success_by_length[length]['correct'] += 1

for length in sorted(success_by_length.keys()):
    data = success_by_length[length]
    rate = (data['correct'] / data['total'] * 100) if data['total'] > 0 else 0
    model_status = "✓" if data['with_model'] > 0 else "✗"
    print(f"Length {length:2d} {model_status}: {rate:6.2f}% ({data['correct']:3d}/{data['total']:3d})")

# Confidence analysis
correct_probs = [r['probability'] for r in results if r['is_correct']]
incorrect_probs = [r['probability'] for r in results if not r['is_correct']]

print("\n" + "-"*70)
print("PREDICTION CONFIDENCE METRICS")
print("-"*70)
if len(correct_probs) > 0:
    print(f"{'Avg Confidence (Correct)':.<50} {np.mean(correct_probs):.4f}")
if len(incorrect_probs) > 0:
    print(f"{'Avg Confidence (Incorrect)':.<50} {np.mean(incorrect_probs):.4f}")
if len(correct_probs) > 0 and len(incorrect_probs) > 0:
    print(f"{'Confidence Gap':.<50} {np.mean(correct_probs) - np.mean(incorrect_probs):.4f}")

print(f"{'Overall Avg Confidence':.<50} {np.mean([r['probability'] for r in results]):.4f}")

# ============================================================================
# SAVE RESULTS
# ============================================================================

results_summary = {
    'overall_success_rate': success_rate,
    'total_predictions': total,
    'correct_predictions': correct,
    'incorrect_predictions': total - correct,
    'by_word_length': {},
    'trained_lengths': list(hmm_model.models.keys()),
    'total_trained_models': len(hmm_model.models)
}

for length in sorted(success_by_length.keys()):
    data = success_by_length[length]
    rate = (data['correct'] / data['total'] * 100) if data['total'] > 0 else 0
    results_summary['by_word_length'][str(length)] = {
        'success_rate': rate,
        'correct': data['correct'],
        'total': data['total'],
        'has_model': data['with_model'] > 0
    }

# Save results to JSON
with open('hmm_validation_results.json', 'w') as f:
    json.dump(results_summary, f, indent=4)

print("\n✓ Results saved to 'hmm_validation_results.json'")



SUCCESS RATE ANALYSIS

OVERALL SUCCESS RATE.............................. 54.95%
Total Predictions................................. 1998
Correct Predictions............................... 1098
Incorrect Predictions............................. 900

----------------------------------------------------------------------
SUCCESS RATE BY WORD LENGTH
----------------------------------------------------------------------
Length  3 ✓:  44.44% (  4/  9)
Length  4 ✓:  43.24% ( 16/ 37)
Length  5 ✓:  40.66% ( 37/ 91)
Length  6 ✓:  40.58% ( 56/138)
Length  7 ✓:  46.83% ( 96/205)
Length  8 ✓:  49.19% (121/246)
Length  9 ✓:  51.09% (140/274)
Length 10 ✓:  56.38% (159/282)
Length 11 ✓:  54.87% (124/226)
Length 12 ✓:  64.63% (106/164)
Length 13 ✓:  68.75% ( 88/128)
Length 14 ✓:  73.26% ( 63/ 86)
Length 15 ✓:  70.21% ( 33/ 47)
Length 16 ✓:  87.88% ( 29/ 33)
Length 17 ✓:  64.71% ( 11/ 17)
Length 18 ✓: 100.00% (  8/  8)
Length 19 ✓: 100.00% (  3/  3)
Length 20 ✓: 100.00% (  2/  2)
Length 21 ✓: 100.00% (

## Step 6: Performance Metrics & Analysis

**Algorithms:** Descriptive Statistics + Confidence Analysis
- Compute success rates (overall and by word length)
- Compare the mean predicted probability ("confidence") of correct and incorrect guesses
- Report the confidence gap, i.e. the difference between those two means
- Serialize results to JSON for downstream analysis
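
The per-length success rates and the confidence gap reduce to a few lines of bookkeeping. The sketch below computes both on four invented result records that use the same keys as the dictionaries returned by `simulate_hangman_game`; `summarize` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

def summarize(results):
    """Return ({word_length: success rate in %}, confidence gap or None)."""
    by_length = defaultdict(lambda: {"correct": 0, "total": 0})
    for r in results:
        bucket = by_length[r["word_length"]]
        bucket["total"] += 1
        bucket["correct"] += int(r["is_correct"])
    rates = {n: 100 * b["correct"] / b["total"] for n, b in sorted(by_length.items())}

    right = [r["probability"] for r in results if r["is_correct"]]
    wrong = [r["probability"] for r in results if not r["is_correct"]]
    gap = mean(right) - mean(wrong) if right and wrong else None
    return rates, gap

toy = [
    {"word_length": 3, "is_correct": True, "probability": 0.5},
    {"word_length": 3, "is_correct": False, "probability": 0.25},
    {"word_length": 4, "is_correct": True, "probability": 0.75},
    {"word_length": 4, "is_correct": True, "probability": 0.5},
]
rates, gap = summarize(toy)
print(rates)          # {3: 50.0, 4: 100.0}
print(round(gap, 4))  # 0.3333
```

A gap close to zero would suggest that the reported probability says little about whether a guess is right.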