# Trexquant Interview Project (The Hangman Game)

* Copyright Trexquant Investment LP. All Rights Reserved. 
* Redistribution of this question without written consent from Trexquant is prohibited

## Instruction:
For this coding test, your mission is to write an algorithm that plays the game of Hangman through our API server. 

When a user plays Hangman, the server first selects a secret word at random from a list. The server then returns a row of space-separated underscores, one for each letter in the secret word, and asks the user to guess a letter. If the user guesses a letter that is in the word, the word is redisplayed with all instances of that letter shown in the correct positions, along with any letters correctly guessed on previous turns. If the letter does not appear in the word, the user is charged with an incorrect guess. The user keeps guessing letters until either (1) the user has correctly guessed all the letters in the word or (2) the user has made six incorrect guesses.
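
The rules above can be rehearsed offline before touching the API. The following is a minimal local sketch, not the server's implementation: `play_hangman`, `frequency_order_guess`, and the fixed letter order are hypothetical names and choices made for illustration, and the handling of a repeated guess is an assumption, since the rules above do not specify it.

```python
def play_hangman(secret_word, guess_fn, max_wrong=6):
    """Play one local game; return True if the word is fully revealed."""
    guessed = set()
    wrong = 0
    while wrong < max_wrong:
        # Mimic the server display: space-separated, '_' for hidden letters.
        masked = " ".join(c if c in guessed else "_" for c in secret_word)
        if "_" not in masked:
            return True
        letter = guess_fn(masked, guessed)
        if letter in guessed:
            # Assumption: a repeated guess counts as an incorrect guess.
            wrong += 1
            continue
        guessed.add(letter)
        if letter not in secret_word:
            wrong += 1
    return False


def frequency_order_guess(masked, guessed):
    """Toy strategy: walk a fixed letter order, skipping letters already tried."""
    for letter in "etaoinshrdlucmfwypvbgkjqxz":
        if letter not in guessed:
            return letter
    return None


won_easy = play_hangman("tea", frequency_order_guess)
won_hard = play_hangman("jazz", frequency_order_guess)
```

Any function with the same `(masked, guessed)` signature can be dropped in as `guess_fn`, which makes it easy to compare strategies on the same set of words.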

You are required to write a "guess" function that takes the current word (with underscores) as input and returns a letter to guess. You will use the API code below to play 1,000 Hangman games. You may play practice games before you start recording your results.

Your algorithm is permitted to use a training set of approximately 250,000 dictionary words. Your algorithm will be tested on an entirely disjoint set of 250,000 dictionary words. Please note that this means the words that you will ultimately be tested on do NOT appear in the dictionary that you are given. You are not permitted to use any dictionary other than the training dictionary we provided. This requirement will be strictly enforced by code review.
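
Because the test words never appear in the training dictionary, accuracy measured on training words will tend to overstate real performance. One way to approximate the test condition is to hold out part of the training list and evaluate only on that part. The sketch below assumes this approach; `split_dictionary`, `holdout_fraction`, and `seed` are hypothetical names, and the tiny word list is a placeholder.

```python
import random


def split_dictionary(words, holdout_fraction=0.1, seed=0):
    """Split a word list into disjoint (train, holdout) lists."""
    unique_words = sorted(set(words))  # de-duplicate; sort for reproducibility
    rng = random.Random(seed)
    rng.shuffle(unique_words)
    n_holdout = int(len(unique_words) * holdout_fraction)
    return unique_words[n_holdout:], unique_words[:n_holdout]


# Placeholder data; in practice this would be the provided training dictionary.
sample_words = ["apple", "banana", "orange", "grape", "melon",
                "lemon", "peach", "plum", "cherry", "mango"]
train_words, holdout_words = split_dictionary(sample_words, holdout_fraction=0.2)
```

Fitting on `train_words` and playing games only on `holdout_words` gives an estimate that is closer to the disjoint-test setting described above.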

You are provided with a basic, working algorithm. It matches the provided masked string (e.g. a _ _ l e) against all possible words in the dictionary, tabulates the frequency of letters appearing in these possible words, and then guesses the most frequent letter that has not already been guessed. If no remaining words match, it falls back to the character frequency distribution of the entire dictionary.
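
The benchmark strategy described above can be sketched as follows. This is a reconstruction from the description, not the provided code: `baseline_guess` and its parameters are hypothetical names, and counting every occurrence of a letter (rather than once per word) is an assumption.

```python
import collections
import re


def baseline_guess(masked_word, guessed_letters, dictionary):
    """Return the most frequent unguessed letter among matching words."""
    # "a _ _ l e" -> regex "a..le"; '.' matches any single letter.
    pattern = re.compile(masked_word.replace(" ", "").replace("_", "."))
    candidates = [w for w in dictionary if pattern.fullmatch(w)]
    if not candidates:
        # No word matches: fall back to the whole dictionary's letter counts.
        candidates = dictionary
    counts = collections.Counter("".join(candidates))
    for letter, _ in counts.most_common():
        if letter not in guessed_letters:
            return letter
    return None  # every letter seen in the candidates was already guessed


toy_dictionary = ["apple", "ample", "angle", "maple"]
next_letter = baseline_guess("a _ _ l e", {"a", "l", "e"}, toy_dictionary)
fallback_letter = baseline_guess("_ _ z", {"z"}, toy_dictionary)
```

In the toy example, three words match `a..le`, and `p` is the most frequent letter among them that has not been guessed yet. The second call matches nothing, so it uses letter counts from all four words.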

This benchmark strategy is successful approximately 18% of the time. Your task is to design an algorithm that significantly outperforms this benchmark.

In [1]:
import json
import requests
import random
import string
import secrets
import time
import re
import collections
import os
import torch
from urllib.parse import parse_qs, urlencode, urlparse

from requests.packages.urllib3.exceptions import InsecureRequestWarning

requests.packages.urllib3.disable_warnings(InsecureRequestWarning)

In [4]:
# import re
# from nltk.corpus import words as nltk_words

# def load_words_with_nltk(filepath, output_filepath="words_nltk_filtered.txt"):
#     """
#     Loads words, cleans them, and uses the NLTK library's word corpus
#     to filter for real English words. The cleaned list is then saved.
#     """
#     # Create a set of valid English words from the NLTK corpus for fast lookups.
#     print("Loading NLTK English word corpus...")
#     valid_english_words = set(nltk_words.words())
    
#     with open(filepath, 'r', encoding='utf-8') as f:
#         word_list = f.readlines()

#     # --- Step 1: Initial Cleaning ---
#     print("Performing initial cleaning...")
#     cleaned_words = [re.sub(r'[^a-z]', '', word.lower().strip()) for word in word_list]
#     unique_words = set(word for word in cleaned_words if len(word) > 1)
    
#     # --- Step 2: Filter Using the NLTK Word List ---
#     print(f"Checking {len(unique_words)} unique words against the NLTK dictionary...")
#     final_words = sorted([word for word in unique_words if word in valid_english_words])
    
#     # --- Step 3: Save the Final Words to a File ---
#     print(f"\nFound {len(final_words)} real words. Saving them to {output_filepath}...")
#     with open(output_filepath, 'w', encoding='utf-8') as f_out:
#         for word in final_words:
#             f_out.write(word + "\n")
#     print("File saved successfully.")

#     return final_words

# # --- How to use it ---
# # This will create your new, highly accurate word list for training.
# DATA_PATH = 'words_250000_train.txt'
# filtered_list = load_words_with_nltk(DATA_PATH)

In [None]:
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from torch.nn.utils.rnn import pad_sequence
import torch.optim as optim
import numpy as np
import random
from tqdm import tqdm
import string
import re
import math
import os
import requests
import time
import collections
from urllib.parse import parse_qs

# --- 1. Data Loading and Preprocessing ---
def load_words(filepath):
    """Loads and cleans words from a file."""
    # Create a dummy file if it doesn't exist for demonstration
    if not os.path.exists(filepath):
        print(f"File '{filepath}' not found. Creating a dummy word file.")
        dummy_words = ['apple', 'banana', 'orange', 'grape', 'strawberry', 'blueberry', 'python', 'pytorch', 'tensor', 'notebook']
        with open(filepath, 'w') as f:
            for word in dummy_words:
                f.write(word + '\n')

    with open(filepath, 'r', encoding='utf-8') as f:
        words = f.readlines()
    # Clean words: lowercase, strip whitespace, remove non-alpha characters
    words = [re.sub(r'[^a-z]', '', word.lower().strip()) for word in words]
    # Filter out very short words or empty strings resulting from cleaning
    words = [word for word in words if len(word) > 1]
    return words

def create_char_mappings():
    """Creates character to index and index to character mappings."""
    all_chars = string.ascii_lowercase + '_#' # '#' is our padding token
    char_to_idx = {char: i for i, char in enumerate(all_chars)}
    idx_to_char = {i: char for i, char in enumerate(all_chars)}
    vocab_size = len(all_chars)
    mask_token_idx = char_to_idx['_']
    padding_idx = char_to_idx['#']
    return char_to_idx, idx_to_char, vocab_size, mask_token_idx, padding_idx

# --- 2. Dataset and Dataloader Preparation ---

class MaskedWordDataset(Dataset):
    """Generates a randomly masked version of a word on-the-fly."""
    def __init__(self, word_list, char_to_idx, mask_token_idx):
        self.word_list = word_list
        self.char_to_idx = char_to_idx
        self.mask_token_idx = mask_token_idx

    def __len__(self):
        return len(self.word_list)

    def __getitem__(self, idx):
        word = self.word_list[idx]
        word_indices = torch.tensor([self.char_to_idx[char] for char in word], dtype=torch.long)
        n = len(word)
        if n <= 2:
            mask_positions = [random.randint(0, n - 1)]
        else:
            num_masks = random.randint(1, max(1, n - 1))
            mask_positions = sorted(random.sample(range(n), num_masks))
        input_word = word_indices.clone()
        input_word[mask_positions] = self.mask_token_idx
        target_word = torch.full_like(word_indices, -100) # ignore_index
        target_word[mask_positions] = word_indices[mask_positions]
        return input_word, target_word

def pad_collate_fn(batch, padding_value):
    """Pads sequences in a batch."""
    inputs, targets = zip(*batch)
    padded_inputs = pad_sequence(inputs, batch_first=True, padding_value=padding_value)
    padded_targets = pad_sequence(targets, batch_first=True, padding_value=-100)
    return padded_inputs, padded_targets

# --- 3. Unified Training Loop ---

def train_model(model, dataloader, optimizer, scheduler, criterion, device, num_epochs, model_path):
    """Trains a given model with gradient clipping."""
    model.train()
    best_loss = float('inf')
    print(f"\n--- Starting Training for {model.__class__.__name__} ---")
    for epoch in range(num_epochs):
        epoch_loss = 0
        progress_bar = tqdm(dataloader, desc=f"Epoch {epoch+1}/{num_epochs}")
        for inputs, targets in progress_bar:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs.view(-1, outputs.shape[2]), targets.view(-1))
            loss.backward()
            torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
            optimizer.step()
            epoch_loss += loss.item()
            progress_bar.set_postfix({'Loss': loss.item()})
        avg_epoch_loss = epoch_loss / len(dataloader)
        print(f"Epoch {epoch+1} Completed | Average Loss: {avg_epoch_loss:.4f}")
        scheduler.step(avg_epoch_loss)
        if avg_epoch_loss < best_loss:
            best_loss = avg_epoch_loss
            torch.save(model.state_dict(), model_path)
            print(f"Model improved and saved to {model_path} (Loss: {best_loss:.4f})")
    print("--- Training Finished ---\n")



In [8]:
# Path to the training word list (change this to train the models on a different file)
DATA_PATH = 'words_250000_train.txt'

In [10]:
# --- Helper Class: Positional Encoding (Used by BiLSTM and Transformer) ---
class PositionalEncoding(nn.Module):
    def __init__(self, d_model, dropout=0.1, max_len=50):
        super(PositionalEncoding, self).__init__()
        self.dropout = nn.Dropout(p=dropout)
        pe = torch.zeros(max_len, d_model)
        position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        pe = pe.unsqueeze(0)
        self.register_buffer('pe', pe)

    def forward(self, x):
        x = x + self.pe[:, :x.size(1), :]
        return self.dropout(x)

# --- Model 1: BiLSTM Solver ---
class BiLSTMSolver(nn.Module):
    def __init__(self, vocab_size, embedding_dim, hidden_dim, n_layers, padding_idx, dropout=0.3):
        super(BiLSTMSolver, self).__init__()
        self.embedding_dim = embedding_dim
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=padding_idx)
        self.pos_encoder = PositionalEncoding(embedding_dim, dropout)
        self.lstm = nn.LSTM(
            embedding_dim, hidden_dim, num_layers=n_layers,
            bidirectional=True, batch_first=True, dropout=dropout if n_layers > 1 else 0
        )
        self.fc = nn.Linear(hidden_dim * 2, vocab_size)

    def forward(self, x):
        embedded = self.embedding(x) * math.sqrt(self.embedding_dim)
        pos_encoded = self.pos_encoder(embedded)
        lstm_out, _ = self.lstm(pos_encoded)
        return self.fc(lstm_out)

# --- Model 2: Transformer Solver ---
class TransformerSolver(nn.Module):
    def __init__(self, vocab_size, embedding_dim, n_heads, n_encoder_layers, dim_feedforward, padding_idx, dropout=0.3):
        super(TransformerSolver, self).__init__()
        self.embedding_dim = embedding_dim
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=padding_idx)
        self.pos_encoder = PositionalEncoding(embedding_dim, dropout)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embedding_dim, nhead=n_heads,
            dim_feedforward=dim_feedforward, dropout=dropout, batch_first=True
        )
        self.transformer_encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_encoder_layers)
        self.fc = nn.Linear(embedding_dim, vocab_size)
        self.padding_idx = padding_idx

    def forward(self, src):
        src_key_padding_mask = (src == self.padding_idx)
        embedded = self.embedding(src) * math.sqrt(self.embedding_dim)
        pos_encoded = self.pos_encoder(embedded)
        transformer_out = self.transformer_encoder(pos_encoded, src_key_padding_mask=src_key_padding_mask)
        return self.fc(transformer_out)

# --- Model 3: CharCNN Solver ---
class CharCNNSolver(nn.Module):
    def __init__(self, vocab_size, embedding_dim, num_filters, kernel_sizes, padding_idx, dropout=0.5):
        super(CharCNNSolver, self).__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=padding_idx)
        self.convs = nn.ModuleList([
            nn.Conv1d(in_channels=embedding_dim, out_channels=num_filters, kernel_size=k)
            for k in kernel_sizes
        ])
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(len(kernel_sizes) * num_filters, vocab_size)
        
    def forward(self, x):
        embedded = self.embedding(x).permute(0, 2, 1)
        conved = [torch.relu(conv(embedded)) for conv in self.convs]
        pooled = [torch.max_pool1d(conv, conv.shape[2]).squeeze(2) for conv in conved]
        cat = self.dropout(torch.cat(pooled, dim=1))
        # Max-pooling collapses the sequence into one vector per word, so the same
        # logits are repeated at every position to match the (batch, seq_len, vocab) shape
        return self.fc(cat).unsqueeze(1).repeat(1, x.shape[1], 1)
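
For reference, the `PositionalEncoding` class above fills its buffer `pe` with the standard sinusoidal encoding. Writing $d$ for `d_model`, $pos$ for the position index, and $i$ for the index of a sine/cosine pair (the symbols are introduced here for illustration), the code computes:

$$
PE_{(pos,\,2i)} = \sin\!\left(\frac{pos}{10000^{2i/d}}\right), \qquad
PE_{(pos,\,2i+1)} = \cos\!\left(\frac{pos}{10000^{2i/d}}\right)
$$

The `div_term` tensor is the factor $10000^{-2i/d}$, written as $\exp\!\left(-2i \cdot \ln(10000)/d\right)$. With `max_len=50`, inputs longer than 50 characters would not have a matching encoding row.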

In [None]:
# --- BiLSTM Training Configuration ---
MODEL_PATH_BILSTM = 'best_bilstm_solver.pth'
TARGET_GPU_ID = 3

# Hyperparameters
LEARNING_RATE = 0.001
BATCH_SIZE = 512
NUM_EPOCHS = 50
EMBEDDING_DIM = 768
HIDDEN_DIM = 512
N_LAYERS = 4
DROPOUT = 0.4

# --- Setup ---
device = torch.device(f'cuda:{TARGET_GPU_ID}' if torch.cuda.is_available() else 'cpu')
char_to_idx, _, vocab_size, mask_token_idx, padding_idx = create_char_mappings()
all_words = load_words(DATA_PATH)
train_dataset = MaskedWordDataset(all_words, char_to_idx, mask_token_idx)
train_dataloader = DataLoader(
    train_dataset, batch_size=BATCH_SIZE, shuffle=True,
    collate_fn=lambda b: pad_collate_fn(b, padding_idx),
    num_workers=2
)

# --- Model Initialization ---
bilstm_model = BiLSTMSolver(
    vocab_size, EMBEDDING_DIM, HIDDEN_DIM, N_LAYERS, padding_idx, DROPOUT
).to(device)

optimizer = optim.Adam(bilstm_model.parameters(), lr=LEARNING_RATE, weight_decay=1e-5)
criterion = nn.CrossEntropyLoss(ignore_index=-100)
scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, 'min', factor=0.2, patience=3)

# --- Run Training ---
train_model(bilstm_model, train_dataloader, optimizer, scheduler, criterion, device, NUM_EPOCHS, MODEL_PATH_BILSTM)


--- Starting Training for BiLSTMSolver ---


Epoch 1/50: 100%|██████████| 444/444 [01:34<00:00,  4.72it/s, Loss=2.29]


Epoch 1 Completed | Average Loss: 2.4847
Model improved and saved to best_bilstm_solver.pth (Loss: 2.4847)


Epoch 2/50:  35%|███▍      | 154/444 [00:33<01:02,  4.62it/s, Loss=2.28]


KeyboardInterrupt: 

In [None]:
# --- Transformer Training Configuration ---
MODEL_PATH_TRANSFORMER = 'best_transformer_solver.pth'
TARGET_GPU_ID = 3 # Or a different GPU if you have one

# Hyperparameters
LEARNING_RATE = 0.0001
BATCH_SIZE = 256 # Transformers might need smaller batch sizes
NUM_EPOCHS = 60
EMBEDDING_DIM = 768
N_HEADS = 12
N_LAYERS = 6
DIM_FEEDFORWARD = 1024
DROPOUT = 0.2

# --- Setup ---
device = torch.device(f'cuda:{TARGET_GPU_ID}' if torch.cuda.is_available() else 'cpu')
char_to_idx, _, vocab_size, mask_token_idx, padding_idx = create_char_mappings()
all_words = load_words(DATA_PATH)
train_dataset = MaskedWordDataset(all_words, char_to_idx, mask_token_idx)
train_dataloader = DataLoader(
    train_dataset, batch_size=BATCH_SIZE, shuffle=True,
    collate_fn=lambda b: pad_collate_fn(b, padding_idx),
    num_workers=2
)

# --- Model Initialization ---
transformer_model = TransformerSolver(
    vocab_size, EMBEDDING_DIM, N_HEADS, N_LAYERS, DIM_FEEDFORWARD, padding_idx, DROPOUT
).to(device)

optimizer = optim.Adam(transformer_model.parameters(), lr=LEARNING_RATE, weight_decay=1e-5)
criterion = nn.CrossEntropyLoss(ignore_index=-100)
scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, 'min', factor=0.2, patience=3)

# --- Run Training ---
train_model(transformer_model, train_dataloader, optimizer, scheduler, criterion, device, NUM_EPOCHS, MODEL_PATH_TRANSFORMER)


--- Starting Training for TransformerSolver ---


Epoch 1/60: 100%|██████████| 888/888 [02:05<00:00,  7.10it/s, Loss=2.89]


Epoch 1 Completed | Average Loss: 2.8936
Model improved and saved to best_transformer_solver.pth (Loss: 2.8936)


Epoch 2/60: 100%|██████████| 888/888 [02:06<00:00,  7.02it/s, Loss=2.9] 


Epoch 2 Completed | Average Loss: 2.8765
Model improved and saved to best_transformer_solver.pth (Loss: 2.8765)


Epoch 3/60: 100%|██████████| 888/888 [02:06<00:00,  7.03it/s, Loss=2.84]


Epoch 3 Completed | Average Loss: 2.8711
Model improved and saved to best_transformer_solver.pth (Loss: 2.8711)


Epoch 4/60: 100%|██████████| 888/888 [02:06<00:00,  7.02it/s, Loss=2.88]


Epoch 4 Completed | Average Loss: 2.8676
Model improved and saved to best_transformer_solver.pth (Loss: 2.8676)


Epoch 5/60: 100%|██████████| 888/888 [02:06<00:00,  7.03it/s, Loss=2.87]


Epoch 5 Completed | Average Loss: 2.8657
Model improved and saved to best_transformer_solver.pth (Loss: 2.8657)


Epoch 6/60: 100%|██████████| 888/888 [02:06<00:00,  7.03it/s, Loss=2.91]


Epoch 6 Completed | Average Loss: 2.8649
Model improved and saved to best_transformer_solver.pth (Loss: 2.8649)


Epoch 7/60: 100%|██████████| 888/888 [02:06<00:00,  7.02it/s, Loss=2.88]


Epoch 7 Completed | Average Loss: 2.8636
Model improved and saved to best_transformer_solver.pth (Loss: 2.8636)


Epoch 8/60: 100%|██████████| 888/888 [02:06<00:00,  7.04it/s, Loss=2.84]


Epoch 8 Completed | Average Loss: 2.8619
Model improved and saved to best_transformer_solver.pth (Loss: 2.8619)


Epoch 9/60: 100%|██████████| 888/888 [02:06<00:00,  7.02it/s, Loss=2.86]


Epoch 9 Completed | Average Loss: 2.8583
Model improved and saved to best_transformer_solver.pth (Loss: 2.8583)


Epoch 10/60: 100%|██████████| 888/888 [02:06<00:00,  7.03it/s, Loss=2.86]


Epoch 10 Completed | Average Loss: 2.8560
Model improved and saved to best_transformer_solver.pth (Loss: 2.8560)


Epoch 11/60: 100%|██████████| 888/888 [02:06<00:00,  7.03it/s, Loss=2.83]


Epoch 11 Completed | Average Loss: 2.8522
Model improved and saved to best_transformer_solver.pth (Loss: 2.8522)


Epoch 12/60: 100%|██████████| 888/888 [02:06<00:00,  7.03it/s, Loss=2.86]


Epoch 12 Completed | Average Loss: 2.8506
Model improved and saved to best_transformer_solver.pth (Loss: 2.8506)


Epoch 13/60: 100%|██████████| 888/888 [02:06<00:00,  7.04it/s, Loss=2.84]


Epoch 13 Completed | Average Loss: 2.8477
Model improved and saved to best_transformer_solver.pth (Loss: 2.8477)


Epoch 14/60: 100%|██████████| 888/888 [02:06<00:00,  7.02it/s, Loss=2.84]


Epoch 14 Completed | Average Loss: 2.8452
Model improved and saved to best_transformer_solver.pth (Loss: 2.8452)


Epoch 15/60: 100%|██████████| 888/888 [02:06<00:00,  7.04it/s, Loss=2.85]


Epoch 15 Completed | Average Loss: 2.8415
Model improved and saved to best_transformer_solver.pth (Loss: 2.8415)


Epoch 16/60: 100%|██████████| 888/888 [02:06<00:00,  7.02it/s, Loss=2.8] 


Epoch 16 Completed | Average Loss: 2.8359
Model improved and saved to best_transformer_solver.pth (Loss: 2.8359)


Epoch 17/60: 100%|██████████| 888/888 [02:06<00:00,  7.03it/s, Loss=2.86]


Epoch 17 Completed | Average Loss: 2.8300
Model improved and saved to best_transformer_solver.pth (Loss: 2.8300)


Epoch 18/60: 100%|██████████| 888/888 [02:06<00:00,  7.04it/s, Loss=2.82]


Epoch 18 Completed | Average Loss: 2.8215
Model improved and saved to best_transformer_solver.pth (Loss: 2.8215)


Epoch 19/60: 100%|██████████| 888/888 [02:05<00:00,  7.05it/s, Loss=2.83]


Epoch 19 Completed | Average Loss: 2.8119
Model improved and saved to best_transformer_solver.pth (Loss: 2.8119)


Epoch 20/60: 100%|██████████| 888/888 [02:06<00:00,  7.03it/s, Loss=2.74]


Epoch 20 Completed | Average Loss: 2.8033
Model improved and saved to best_transformer_solver.pth (Loss: 2.8033)


Epoch 21/60: 100%|██████████| 888/888 [02:06<00:00,  7.02it/s, Loss=2.79]


Epoch 21 Completed | Average Loss: 2.7953
Model improved and saved to best_transformer_solver.pth (Loss: 2.7953)


Epoch 22/60: 100%|██████████| 888/888 [02:05<00:00,  7.06it/s, Loss=2.78]


Epoch 22 Completed | Average Loss: 2.7865
Model improved and saved to best_transformer_solver.pth (Loss: 2.7865)


Epoch 23/60: 100%|██████████| 888/888 [02:06<00:00,  7.03it/s, Loss=2.81]


Epoch 23 Completed | Average Loss: 2.7769
Model improved and saved to best_transformer_solver.pth (Loss: 2.7769)


Epoch 24/60: 100%|██████████| 888/888 [02:06<00:00,  7.03it/s, Loss=2.79]


Epoch 24 Completed | Average Loss: 2.7684
Model improved and saved to best_transformer_solver.pth (Loss: 2.7684)


Epoch 25/60: 100%|██████████| 888/888 [02:06<00:00,  7.03it/s, Loss=2.75]


Epoch 25 Completed | Average Loss: 2.7591
Model improved and saved to best_transformer_solver.pth (Loss: 2.7591)


Epoch 26/60: 100%|██████████| 888/888 [02:06<00:00,  7.00it/s, Loss=2.71]


Epoch 26 Completed | Average Loss: 2.7494
Model improved and saved to best_transformer_solver.pth (Loss: 2.7494)


Epoch 27/60: 100%|██████████| 888/888 [02:06<00:00,  7.03it/s, Loss=2.72]


Epoch 27 Completed | Average Loss: 2.7387
Model improved and saved to best_transformer_solver.pth (Loss: 2.7387)


Epoch 28/60: 100%|██████████| 888/888 [02:06<00:00,  7.04it/s, Loss=2.76]


Epoch 28 Completed | Average Loss: 2.7265
Model improved and saved to best_transformer_solver.pth (Loss: 2.7265)


Epoch 29/60: 100%|██████████| 888/888 [02:06<00:00,  7.04it/s, Loss=2.73]


Epoch 29 Completed | Average Loss: 2.7126
Model improved and saved to best_transformer_solver.pth (Loss: 2.7126)


Epoch 30/60: 100%|██████████| 888/888 [02:06<00:00,  7.02it/s, Loss=2.68]


Epoch 30 Completed | Average Loss: 2.7018
Model improved and saved to best_transformer_solver.pth (Loss: 2.7018)


Epoch 31/60: 100%|██████████| 888/888 [02:06<00:00,  7.01it/s, Loss=2.67]


Epoch 31 Completed | Average Loss: 2.6896
Model improved and saved to best_transformer_solver.pth (Loss: 2.6896)


Epoch 32/60: 100%|██████████| 888/888 [02:06<00:00,  7.04it/s, Loss=2.62]


Epoch 32 Completed | Average Loss: 2.6780
Model improved and saved to best_transformer_solver.pth (Loss: 2.6780)


Epoch 33/60: 100%|██████████| 888/888 [02:06<00:00,  7.02it/s, Loss=2.63]


Epoch 33 Completed | Average Loss: 2.6663
Model improved and saved to best_transformer_solver.pth (Loss: 2.6663)


Epoch 34/60: 100%|██████████| 888/888 [02:06<00:00,  7.02it/s, Loss=2.61]


Epoch 34 Completed | Average Loss: 2.6539
Model improved and saved to best_transformer_solver.pth (Loss: 2.6539)


Epoch 35/60: 100%|██████████| 888/888 [02:06<00:00,  7.03it/s, Loss=2.68]


Epoch 35 Completed | Average Loss: 2.6431
Model improved and saved to best_transformer_solver.pth (Loss: 2.6431)


Epoch 36/60: 100%|██████████| 888/888 [02:06<00:00,  7.02it/s, Loss=2.66]


Epoch 36 Completed | Average Loss: 2.6296
Model improved and saved to best_transformer_solver.pth (Loss: 2.6296)


Epoch 37/60: 100%|██████████| 888/888 [02:06<00:00,  7.02it/s, Loss=2.6] 


Epoch 37 Completed | Average Loss: 2.6180
Model improved and saved to best_transformer_solver.pth (Loss: 2.6180)


Epoch 38/60: 100%|██████████| 888/888 [02:05<00:00,  7.05it/s, Loss=2.64]


Epoch 38 Completed | Average Loss: 2.6045
Model improved and saved to best_transformer_solver.pth (Loss: 2.6045)


Epoch 39/60: 100%|██████████| 888/888 [02:06<00:00,  7.03it/s, Loss=2.6] 


Epoch 39 Completed | Average Loss: 2.5885
Model improved and saved to best_transformer_solver.pth (Loss: 2.5885)


Epoch 40/60: 100%|██████████| 888/888 [02:06<00:00,  7.03it/s, Loss=2.61]


Epoch 40 Completed | Average Loss: 2.5755
Model improved and saved to best_transformer_solver.pth (Loss: 2.5755)


Epoch 41/60: 100%|██████████| 888/888 [02:06<00:00,  7.04it/s, Loss=2.59]


Epoch 41 Completed | Average Loss: 2.5608
Model improved and saved to best_transformer_solver.pth (Loss: 2.5608)


Epoch 42/60: 100%|██████████| 888/888 [02:05<00:00,  7.06it/s, Loss=2.58]


Epoch 42 Completed | Average Loss: 2.5444
Model improved and saved to best_transformer_solver.pth (Loss: 2.5444)


Epoch 43/60: 100%|██████████| 888/888 [02:06<00:00,  7.04it/s, Loss=2.53]


Epoch 43 Completed | Average Loss: 2.5315
Model improved and saved to best_transformer_solver.pth (Loss: 2.5315)


Epoch 44/60: 100%|██████████| 888/888 [02:06<00:00,  7.01it/s, Loss=2.54]


Epoch 44 Completed | Average Loss: 2.5156
Model improved and saved to best_transformer_solver.pth (Loss: 2.5156)


Epoch 45/60: 100%|██████████| 888/888 [02:06<00:00,  7.02it/s, Loss=2.47]


Epoch 45 Completed | Average Loss: 2.5011
Model improved and saved to best_transformer_solver.pth (Loss: 2.5011)


Epoch 46/60: 100%|██████████| 888/888 [02:06<00:00,  7.03it/s, Loss=2.46]


Epoch 46 Completed | Average Loss: 2.4859
Model improved and saved to best_transformer_solver.pth (Loss: 2.4859)


Epoch 47/60: 100%|██████████| 888/888 [02:06<00:00,  7.04it/s, Loss=2.45]


Epoch 47 Completed | Average Loss: 2.4715
Model improved and saved to best_transformer_solver.pth (Loss: 2.4715)


Epoch 48/60: 100%|██████████| 888/888 [02:05<00:00,  7.05it/s, Loss=2.49]


Epoch 48 Completed | Average Loss: 2.4595
Model improved and saved to best_transformer_solver.pth (Loss: 2.4595)


Epoch 49/60: 100%|██████████| 888/888 [02:05<00:00,  7.06it/s, Loss=2.56]


Epoch 49 Completed | Average Loss: 2.4441
Model improved and saved to best_transformer_solver.pth (Loss: 2.4441)


Epoch 50/60: 100%|██████████| 888/888 [02:06<00:00,  7.03it/s, Loss=2.47]


Epoch 50 Completed | Average Loss: 2.4315
Model improved and saved to best_transformer_solver.pth (Loss: 2.4315)


Epoch 51/60: 100%|██████████| 888/888 [02:06<00:00,  7.03it/s, Loss=2.46]


Epoch 51 Completed | Average Loss: 2.4184
Model improved and saved to best_transformer_solver.pth (Loss: 2.4184)


Epoch 52/60: 100%|██████████| 888/888 [02:06<00:00,  7.04it/s, Loss=2.4] 


Epoch 52 Completed | Average Loss: 2.4069
Model improved and saved to best_transformer_solver.pth (Loss: 2.4069)


Epoch 53/60: 100%|██████████| 888/888 [02:05<00:00,  7.05it/s, Loss=2.4] 


Epoch 53 Completed | Average Loss: 2.3945
Model improved and saved to best_transformer_solver.pth (Loss: 2.3945)


Epoch 54/60: 100%|██████████| 888/888 [02:06<00:00,  7.04it/s, Loss=2.29]


Epoch 54 Completed | Average Loss: 2.3829
Model improved and saved to best_transformer_solver.pth (Loss: 2.3829)


Epoch 55/60: 100%|██████████| 888/888 [02:05<00:00,  7.06it/s, Loss=2.35]


Epoch 55 Completed | Average Loss: 2.3719
Model improved and saved to best_transformer_solver.pth (Loss: 2.3719)


Epoch 56/60: 100%|██████████| 888/888 [02:06<00:00,  7.04it/s, Loss=2.36]


Epoch 56 Completed | Average Loss: 2.3642
Model improved and saved to best_transformer_solver.pth (Loss: 2.3642)


Epoch 57/60: 100%|██████████| 888/888 [02:06<00:00,  7.04it/s, Loss=2.39]


Epoch 57 Completed | Average Loss: 2.3553
Model improved and saved to best_transformer_solver.pth (Loss: 2.3553)


Epoch 58/60: 100%|██████████| 888/888 [02:06<00:00,  7.03it/s, Loss=2.39]


Epoch 58 Completed | Average Loss: 2.3474
Model improved and saved to best_transformer_solver.pth (Loss: 2.3474)


Epoch 59/60: 100%|██████████| 888/888 [02:06<00:00,  7.02it/s, Loss=2.26]


Epoch 59 Completed | Average Loss: 2.3406
Model improved and saved to best_transformer_solver.pth (Loss: 2.3406)


Epoch 60/60: 100%|██████████| 888/888 [02:06<00:00,  7.04it/s, Loss=2.34]


Epoch 60 Completed | Average Loss: 2.3347
Model improved and saved to best_transformer_solver.pth (Loss: 2.3347)
--- Training Finished ---



In [None]:
# --- CharCNN Training Configuration ---
MODEL_PATH_CHARCNN = 'best_charcnn_solver.pth'
TARGET_GPU_ID = 3 # Or a different GPU if you have one

# Hyperparameters
LEARNING_RATE = 0.001
BATCH_SIZE = 512
NUM_EPOCHS = 50
EMBEDDING_DIM = 768
NUM_FILTERS = 128
KERNEL_SIZES = [2, 3, 4, 5]
DROPOUT = 0.5

# --- Setup ---
device = torch.device(f'cuda:{TARGET_GPU_ID}' if torch.cuda.is_available() else 'cpu')
char_to_idx, _, vocab_size, mask_token_idx, padding_idx = create_char_mappings()
all_words = load_words(DATA_PATH)
train_dataset = MaskedWordDataset(all_words, char_to_idx, mask_token_idx)
train_dataloader = DataLoader(
    train_dataset, batch_size=BATCH_SIZE, shuffle=True,
    collate_fn=lambda b: pad_collate_fn(b, padding_idx),
    num_workers=2
)

# --- Model Initialization ---
charcnn_model = CharCNNSolver(
    vocab_size, EMBEDDING_DIM, NUM_FILTERS, KERNEL_SIZES, padding_idx, DROPOUT
).to(device)

optimizer = optim.Adam(charcnn_model.parameters(), lr=LEARNING_RATE, weight_decay=1e-5)
criterion = nn.CrossEntropyLoss(ignore_index=-100)
scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, 'min', factor=0.2, patience=3)

# --- Run Training ---
train_model(charcnn_model, train_dataloader, optimizer, scheduler, criterion, device, NUM_EPOCHS, MODEL_PATH_CHARCNN)


--- Starting Training for CharCNNSolver ---


Epoch 1/50: 100%|██████████| 444/444 [00:22<00:00, 20.13it/s, Loss=2.85]


Epoch 1 Completed | Average Loss: 2.8944
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8944)


Epoch 2/50: 100%|██████████| 444/444 [00:21<00:00, 20.18it/s, Loss=2.87]


Epoch 2 Completed | Average Loss: 2.8586
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8586)


Epoch 3/50: 100%|██████████| 444/444 [00:21<00:00, 20.27it/s, Loss=2.85]


Epoch 3 Completed | Average Loss: 2.8530
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8530)


Epoch 4/50: 100%|██████████| 444/444 [00:21<00:00, 20.19it/s, Loss=2.84]


Epoch 4 Completed | Average Loss: 2.8481
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8481)


Epoch 5/50: 100%|██████████| 444/444 [00:21<00:00, 20.21it/s, Loss=2.87]


Epoch 5 Completed | Average Loss: 2.8446
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8446)


Epoch 6/50: 100%|██████████| 444/444 [00:22<00:00, 20.18it/s, Loss=2.85]


Epoch 6 Completed | Average Loss: 2.8419
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8419)


Epoch 7/50: 100%|██████████| 444/444 [00:22<00:00, 20.17it/s, Loss=2.84]


Epoch 7 Completed | Average Loss: 2.8409
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8409)


Epoch 8/50: 100%|██████████| 444/444 [00:21<00:00, 20.19it/s, Loss=2.83]


Epoch 8 Completed | Average Loss: 2.8378
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8378)


Epoch 9/50: 100%|██████████| 444/444 [00:21<00:00, 20.20it/s, Loss=2.82]


Epoch 9 Completed | Average Loss: 2.8355
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8355)


Epoch 10/50: 100%|██████████| 444/444 [00:21<00:00, 20.23it/s, Loss=2.83]


Epoch 10 Completed | Average Loss: 2.8340
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8340)


Epoch 11/50: 100%|██████████| 444/444 [00:21<00:00, 20.19it/s, Loss=2.82]


Epoch 11 Completed | Average Loss: 2.8337
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8337)


Epoch 12/50: 100%|██████████| 444/444 [00:22<00:00, 20.05it/s, Loss=2.82]


Epoch 12 Completed | Average Loss: 2.8317
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8317)


Epoch 13/50: 100%|██████████| 444/444 [00:22<00:00, 20.13it/s, Loss=2.81]


Epoch 13 Completed | Average Loss: 2.8311
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8311)


Epoch 14/50: 100%|██████████| 444/444 [00:22<00:00, 20.06it/s, Loss=2.82]


Epoch 14 Completed | Average Loss: 2.8303
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8303)


Epoch 15/50: 100%|██████████| 444/444 [00:22<00:00, 20.10it/s, Loss=2.83]


Epoch 15 Completed | Average Loss: 2.8275
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8275)


Epoch 16/50: 100%|██████████| 444/444 [00:22<00:00, 20.09it/s, Loss=2.85]


Epoch 16 Completed | Average Loss: 2.8281


Epoch 17/50: 100%|██████████| 444/444 [00:22<00:00, 20.09it/s, Loss=2.84]


Epoch 17 Completed | Average Loss: 2.8272
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8272)


Epoch 18/50: 100%|██████████| 444/444 [00:22<00:00, 20.12it/s, Loss=2.83]


Epoch 18 Completed | Average Loss: 2.8267
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8267)


Epoch 19/50: 100%|██████████| 444/444 [00:22<00:00, 20.03it/s, Loss=2.84]


Epoch 19 Completed | Average Loss: 2.8255
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8255)


Epoch 20/50: 100%|██████████| 444/444 [00:22<00:00, 20.12it/s, Loss=2.81]


Epoch 20 Completed | Average Loss: 2.8248
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8248)


Epoch 21/50: 100%|██████████| 444/444 [00:22<00:00, 20.09it/s, Loss=2.85]


Epoch 21 Completed | Average Loss: 2.8236
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8236)


Epoch 22/50: 100%|██████████| 444/444 [00:22<00:00, 20.12it/s, Loss=2.83]


Epoch 22 Completed | Average Loss: 2.8227
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8227)


Epoch 23/50: 100%|██████████| 444/444 [00:22<00:00, 20.11it/s, Loss=2.78]


Epoch 23 Completed | Average Loss: 2.8216
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8216)


Epoch 24/50: 100%|██████████| 444/444 [00:22<00:00, 20.03it/s, Loss=2.85]


Epoch 24 Completed | Average Loss: 2.8221


Epoch 25/50: 100%|██████████| 444/444 [00:22<00:00, 20.08it/s, Loss=2.82]


Epoch 25 Completed | Average Loss: 2.8212
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8212)


Epoch 26/50: 100%|██████████| 444/444 [00:22<00:00, 20.06it/s, Loss=2.84]


Epoch 26 Completed | Average Loss: 2.8222


Epoch 27/50: 100%|██████████| 444/444 [00:22<00:00, 20.18it/s, Loss=2.81]


Epoch 27 Completed | Average Loss: 2.8210
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8210)


Epoch 28/50: 100%|██████████| 444/444 [00:22<00:00, 20.07it/s, Loss=2.82]


Epoch 28 Completed | Average Loss: 2.8201
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8201)


Epoch 29/50: 100%|██████████| 444/444 [00:22<00:00, 20.11it/s, Loss=2.85]


Epoch 29 Completed | Average Loss: 2.8193
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8193)


Epoch 30/50: 100%|██████████| 444/444 [00:22<00:00, 20.16it/s, Loss=2.81]


Epoch 30 Completed | Average Loss: 2.8198


Epoch 31/50: 100%|██████████| 444/444 [00:22<00:00, 20.13it/s, Loss=2.81]


Epoch 31 Completed | Average Loss: 2.8191
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8191)


Epoch 32/50: 100%|██████████| 444/444 [00:22<00:00, 20.14it/s, Loss=2.81]


Epoch 32 Completed | Average Loss: 2.8189
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8189)


Epoch 33/50: 100%|██████████| 444/444 [00:21<00:00, 20.23it/s, Loss=2.8] 


Epoch 33 Completed | Average Loss: 2.8191


Epoch 34/50: 100%|██████████| 444/444 [00:21<00:00, 20.21it/s, Loss=2.83]


Epoch 34 Completed | Average Loss: 2.8176
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8176)


Epoch 35/50: 100%|██████████| 444/444 [00:21<00:00, 20.21it/s, Loss=2.83]


Epoch 35 Completed | Average Loss: 2.8176
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8176)


Epoch 36/50: 100%|██████████| 444/444 [00:22<00:00, 20.10it/s, Loss=2.81]


Epoch 36 Completed | Average Loss: 2.8175
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8175)


Epoch 37/50: 100%|██████████| 444/444 [00:22<00:00, 20.12it/s, Loss=2.79]


Epoch 37 Completed | Average Loss: 2.8168
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8168)


Epoch 38/50: 100%|██████████| 444/444 [00:22<00:00, 20.14it/s, Loss=2.85]


Epoch 38 Completed | Average Loss: 2.8154
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8154)


Epoch 39/50: 100%|██████████| 444/444 [00:22<00:00, 20.16it/s, Loss=2.83]


Epoch 39 Completed | Average Loss: 2.8160


Epoch 40/50: 100%|██████████| 444/444 [00:22<00:00, 20.12it/s, Loss=2.81]


Epoch 40 Completed | Average Loss: 2.8157


Epoch 41/50: 100%|██████████| 444/444 [00:22<00:00, 20.11it/s, Loss=2.8] 


Epoch 41 Completed | Average Loss: 2.8149
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8149)


Epoch 42/50: 100%|██████████| 444/444 [00:21<00:00, 20.21it/s, Loss=2.83]


Epoch 42 Completed | Average Loss: 2.8155


Epoch 43/50: 100%|██████████| 444/444 [00:21<00:00, 20.30it/s, Loss=2.81]


Epoch 43 Completed | Average Loss: 2.8150


Epoch 44/50: 100%|██████████| 444/444 [00:21<00:00, 20.27it/s, Loss=2.81]


Epoch 44 Completed | Average Loss: 2.8154


Epoch 45/50: 100%|██████████| 444/444 [00:21<00:00, 20.22it/s, Loss=2.82]


Epoch 45 Completed | Average Loss: 2.8134
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8134)


Epoch 46/50: 100%|██████████| 444/444 [00:21<00:00, 20.19it/s, Loss=2.82]


Epoch 46 Completed | Average Loss: 2.8140


Epoch 47/50: 100%|██████████| 444/444 [00:21<00:00, 20.21it/s, Loss=2.82]


Epoch 47 Completed | Average Loss: 2.8140


Epoch 48/50: 100%|██████████| 444/444 [00:21<00:00, 20.38it/s, Loss=2.81]


Epoch 48 Completed | Average Loss: 2.8124
Model improved and saved to best_charcnn_solver.pth (Loss: 2.8124)


Epoch 49/50: 100%|██████████| 444/444 [00:21<00:00, 20.34it/s, Loss=2.81]


Epoch 49 Completed | Average Loss: 2.8131


Epoch 50/50: 100%|██████████| 444/444 [00:21<00:00, 20.20it/s, Loss=2.81]

Epoch 50 Completed | Average Loss: 2.8136
--- Training Finished ---






In [2]:
import requests
import time
import collections
import re
import string
import math
import os
import json
from urllib.parse import parse_qs
import torch
import torch.nn as nn

from model import BiLSTMSolver, TransformerSolver, CharCNNSolver

class HangmanAPI(object):
    def __init__(self, access_token=None, session=None, timeout=None):
        self.hangman_url = self.determine_hangman_url()
        self.access_token = access_token
        self.session = session or requests.Session()
        self.timeout = timeout
        self.guessed_letters = []
        
        # --- Ensemble Weights ---
        self.W_BILSTM = 0.5
        self.W_TRANSFORMER = 0.3
        self.W_CHARCNN = 0.2

        # --- Shared Parameters & Device ---
        self.char_to_idx, self.idx_to_char, self.vocab_size, self.mask_token_idx, self.padding_idx = self._create_char_mappings()
        TARGET_GPU_ID = 3 # You can change this ID
        self.device = self._get_device(TARGET_GPU_ID)

        # --- Load All Three Models ---
        print("--- Initializing Ensemble Hangman Solver ---")
        self.bilstm_model = self._load_model('bilstm')
        self.transformer_model = self._load_model('transformer')
        self.charcnn_model = self._load_model('charcnn')
        print("--- Initialization Complete ---")

    def _create_char_mappings(self):
        """Creates character-to-index mappings for the models."""
        all_chars = string.ascii_lowercase + '_#' # '_' is mask, '#' is padding
        char_to_idx = {char: i for i, char in enumerate(all_chars)}
        idx_to_char = {i: char for i, char in enumerate(all_chars)}
        return char_to_idx, idx_to_char, len(all_chars), char_to_idx['_'], char_to_idx['#']
        
    def _get_device(self, gpu_id):
        """Selects the appropriate device (GPU or CPU) for PyTorch models."""
        if torch.cuda.is_available() and gpu_id < torch.cuda.device_count():
            device = torch.device(f'cuda:{gpu_id}')
        else:
            device = torch.device('cpu')
        print(f"All models will be loaded onto device: {device}")
        return device

    def _load_model(self, model_type):
        """Loads a specified pre-trained model."""
        MODEL_PATH = f'best_{model_type}_solver.pth'
        if not os.path.exists(MODEL_PATH):
            print(f"Warning: Model file not found at {MODEL_PATH}. This model will be skipped.")
            return None
        try:
            if model_type == 'bilstm':
                model = BiLSTMSolver(self.vocab_size, 768, 512, 4, self.padding_idx, dropout=0.4)
            elif model_type == 'transformer':
                model = TransformerSolver(self.vocab_size, 768, 12, 6, 1024, self.padding_idx, dropout=0.2)
            elif model_type == 'charcnn':
                model = CharCNNSolver(self.vocab_size, 768, 128, [2,3,4,5], self.padding_idx, dropout=0.5)
            else:
                raise ValueError(f"Unknown model type: {model_type}")
            
            state_dict = torch.load(MODEL_PATH, map_location=self.device)
            model.load_state_dict(state_dict)
            model.to(self.device)
            model.eval()
            print(f"Successfully loaded {model_type} model from {MODEL_PATH}")
            return model
        except Exception as e:
            print(f"Could not load {model_type} model from {MODEL_PATH}. Reason: {e}")
            return None

    def _get_aggregated_prediction(self, model, input_tensor):
        """Gets the probability distribution for letters in blank spaces from a model."""
        if model is None: return None
        with torch.no_grad():
            output = model(input_tensor)
            # Apply softmax to get probabilities
            probabilities = torch.softmax(output, dim=2)[0]
            
            # Identify blank positions in the original word
            clean_word = "".join([self.idx_to_char[idx.item()] for idx in input_tensor[0] if idx.item() != self.padding_idx])
            blank_indices = [i for i, char in enumerate(clean_word) if char == '_']
            
            if not blank_indices: return torch.zeros(self.vocab_size, device=self.device)
            
            # Get probabilities only at blank positions
            blank_probs = probabilities[blank_indices]
            
            prob_not_in_any_blank = torch.prod(1 - blank_probs, dim=0)
            return 1 - prob_not_in_any_blank

    ################################################
    # Implement your "guess" function here         #
    ################################################
    def guess(self, word):
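        # 1. Strip the separating spaces and map each character to its index
        #    (any character outside the vocabulary is treated as the mask token)
        clean_word = word.replace(" ", "")
        input_indices = [self.char_to_idx.get(c, self.mask_token_idx) for c in clean_word]
        input_tensor = torch.tensor([input_indices], dtype=torch.long).to(self.device)

        # 2. Right-pad short words so the sequence is at least as long as the
        #    largest CharCNN kernel (kernel sizes are [2, 3, 4, 5])
        seq_len = input_tensor.shape[1]
        MAX_KERNEL_SIZE = 5
        if seq_len < MAX_KERNEL_SIZE:
            padding_needed = MAX_KERNEL_SIZE - seq_len
            padding = torch.full((1, padding_needed), self.padding_idx, dtype=torch.long, device=self.device)
            input_tensor = torch.cat([input_tensor, padding], dim=1)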

        # 3. Get predictions from all available models
        bilstm_probs = self._get_aggregated_prediction(self.bilstm_model, input_tensor)
        transformer_probs = self._get_aggregated_prediction(self.transformer_model, input_tensor)
        charcnn_probs = self._get_aggregated_prediction(self.charcnn_model, input_tensor)
        
        # 4. Combine predictions using ensemble weights
        final_probs = torch.zeros(self.vocab_size, device=self.device)
        total_weight = 0
        
        if bilstm_probs is not None:
            final_probs += self.W_BILSTM * bilstm_probs
            total_weight += self.W_BILSTM
        if transformer_probs is not None:
            final_probs += self.W_TRANSFORMER * transformer_probs
            total_weight += self.W_TRANSFORMER
        if charcnn_probs is not None:
            final_probs += self.W_CHARCNN * charcnn_probs
            total_weight += self.W_CHARCNN

        # 5. Handle case where no models are loaded
        if total_weight < 1e-5:
            print("Warning: No models were loaded. Using basic fallback guess.")
            # Fallback to guessing most common letters in English
            return next((c for c in "etaoinshrdlu" if c not in self.guessed_letters), 'e')

        # 6. Normalize probabilities and mask already guessed letters
        final_probs /= total_weight
        
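        # Exclude the mask and padding tokens (never valid guesses) as well as
        # every letter that has already been guessed
        final_probs[self.mask_token_idx] = -1.0
        final_probs[self.padding_idx] = -1.0
        for letter in self.guessed_letters:
            if letter in self.char_to_idx:
                final_probs[self.char_to_idx[letter]] = -1.0 # Below any probability, so argmax never picks it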

        # 7. Return the letter with the highest probability
        best_guess_idx = torch.argmax(final_probs).item()
        return self.idx_to_char[best_guess_idx]

    ##########################################################
    # You'll likely not need to modify any of the code below #
    ##########################################################
    
    @staticmethod
    def determine_hangman_url():
        links = ['https://trexsim.com']
        data = {link: 0 for link in links}
        for link in links:
            requests.get(link)
            for i in range(10):
                s = time.time()
                requests.get(link)
                data[link] = time.time() - s
        link = sorted(data.items(), key=lambda x: x[1])[0][0]
        link += '/trexsim/hangman'
        return link
            
    def start_game(self, practice=True, verbose=True):
        # Reset guessed letters to an empty list
        self.guessed_letters = []
                                
        response = self.request("/new_game", {"practice":practice})
        if response.get('status')=="approved":
            game_id = response.get('game_id')
            word = response.get('word')
            tries_remains = response.get('tries_remains')
            if verbose:
                print("Successfully started a new game! Game ID: {0}. # of tries remaining: {1}. Word: {2}.".format(game_id, tries_remains, word))
            while tries_remains>0:
                # Get guessed letter from user code
                guess_letter = self.guess(word)
                        
                # Append guessed letter to guessed letters field in hangman object
                self.guessed_letters.append(guess_letter)
                if verbose:
                    print("Guessed letters: {}".format(self.guessed_letters))
                    print("Guessing letter: {0}".format(guess_letter))
                    
                try:    
                    res = self.request("/guess_letter", {"request":"guess_letter", "game_id":game_id, "letter":guess_letter})
                except HangmanAPIError:
                    print('HangmanAPIError exception caught on request.')
                    continue
                except Exception as e:
                    print('Other exception caught on request.')
                    raise e
                
                if verbose:
                    print("Server response: {0}".format(res))
                status = res.get('status')
                tries_remains = res.get('tries_remains')
                if status=="success":
                    if verbose:
                        print("Successfully finished game: {0}".format(game_id))
                    return True
                elif status=="failed":
                    reason = res.get('reason', '# of tries exceeded!')
                    if verbose:
                        print("Failed game: {0}. Because of: {1}".format(game_id, reason))
                    return False
                elif status=="ongoing":
                    word = res.get('word')
        else:
            if verbose:
                print("Failed to start a new game")
        return False  # every win returns True inside the loop; this also avoids a NameError when no game was started
        
    def my_status(self):
        return self.request("/my_status", {})
    
    def request(
            self, path, args=None, post_args=None, method=None):
        if args is None:
            args = dict()
        if post_args is not None:
            method = "POST"

        if self.access_token:
            if post_args and "access_token" not in post_args:
                post_args["access_token"] = self.access_token
            elif "access_token" not in args:
                args["access_token"] = self.access_token

        time.sleep(0.2)

        num_retry, time_sleep = 50, 2
        for it in range(num_retry):
            try:
                response = self.session.request(
                    method or "GET",
                    self.hangman_url + path,
                    timeout=self.timeout,
                    params=args,
                    data=post_args,
                    verify=False
                )
                break
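            except requests.HTTPError as e:
                # requests.HTTPError has no read(); the server reply, if any, is on e.response
                response = e.response.json() if e.response is not None else str(e)
                raise HangmanAPIError(response)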
            except requests.exceptions.SSLError as e:
                if it + 1 == num_retry:
                    raise
                time.sleep(time_sleep)

        headers = response.headers
        if 'json' in headers['content-type']:
            result = response.json()
        elif "access_token" in parse_qs(response.text):
            query_str = parse_qs(response.text)
            if "access_token" in query_str:
                result = {"access_token": query_str["access_token"][0]}
                if "expires" in query_str:
                    result["expires"] = query_str["expires"][0]
            else:
                raise HangmanAPIError(response.json())
        else:
            raise HangmanAPIError('Maintype was not text, or querystring')

        if result and isinstance(result, dict) and result.get("error"):
            raise HangmanAPIError(result)
        return result
    
class HangmanAPIError(Exception):
    def __init__(self, result):
        self.result = result
        self.code = None
        try:
            self.type = result["error_code"]
        except (KeyError, TypeError):
            self.type = ""

        try:
            self.message = result["error_description"]
        except (KeyError, TypeError):
            try:
                self.message = result["error"]["message"]
                self.code = result["error"].get("code")
                if not self.type:
                    self.type = result["error"].get("type", "")
            except (KeyError, TypeError):
                try:
                    self.message = result["error_msg"]
                except (KeyError, TypeError):
                    self.message = result

        Exception.__init__(self, self.message)
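
The aggregation in `_get_aggregated_prediction` and the weighting in `guess` can be illustrated without PyTorch. The sketch below is a plain-Python restatement, not the notebook's implementation: the function names `prob_letter_in_any_blank` and `weighted_ensemble` and all numbers are made up for illustration. It treats the blank positions as independent, so the chance that a letter appears in at least one blank is one minus the product of the per-position "not this letter" probabilities, and it renormalizes the ensemble weights over whichever models are available.

```python
from math import prod

def prob_letter_in_any_blank(blank_probs):
    """Chance that each letter fills at least one blank (blanks treated as independent).

    blank_probs: one row per blank position, one probability per letter.
    """
    n_letters = len(blank_probs[0])
    return [1.0 - prod(1.0 - row[k] for row in blank_probs) for k in range(n_letters)]

def weighted_ensemble(predictions, weights):
    """Weighted average over the models that returned a prediction.

    A prediction of None stands for a model that could not be loaded; its
    weight is dropped and the remaining weights are renormalized.
    """
    available = [(p, w) for p, w in zip(predictions, weights) if p is not None]
    total = sum(w for _, w in available)
    if total == 0:
        return None  # caller should fall back to a fixed letter order
    n = len(available[0][0])
    return [sum(w * p[k] for p, w in available) / total for k in range(n)]

# Toy alphabet of three letters and two blank positions (illustrative numbers only).
blank_probs = [[0.5, 0.3, 0.2],
               [0.1, 0.6, 0.3]]
model_a = prob_letter_in_any_blank(blank_probs)
model_b = [0.2, 0.9, 0.1]

# The middle model is "missing", so weights 0.5 and 0.2 are rescaled to sum to 1.
combined = weighted_ensemble([model_a, None, model_b], [0.5, 0.3, 0.2])
best_letter_index = max(range(len(combined)), key=combined.__getitem__)
print(model_a, combined, best_letter_index)
```

The independence assumption is a simplification: neighbouring blanks in a real word are correlated, so the aggregated value is best read as a ranking score rather than a calibrated probability.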

# API Usage Examples

## To start a new game:
1. Make sure you have implemented your own "guess" method.
2. Use the access_token that we sent you to create your HangmanAPI object. 
3. Start a game by calling the "start_game" method.
4. If you wish to test your function without being recorded, set the "practice" parameter to 1.
5. Note: You have a rate limit of 20 new games per minute. DO NOT start more than 20 new games within one minute.
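
One way to respect the limit in step 5 is a small sliding-window limiter, sketched below. This is an illustration rather than part of the provided API: the class `GameRateLimiter` and its parameters are hypothetical, and the commented usage assumes the `api` object created in the next cell.

```python
import time
from collections import deque

class GameRateLimiter:
    """Blocks until starting another game keeps us within max_games per window."""

    def __init__(self, max_games=20, window_seconds=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_games = max_games
        self.window_seconds = window_seconds
        self._clock = clock      # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._starts = deque()   # start times of games inside the current window

    def wait_for_slot(self):
        while True:
            now = self._clock()
            # Forget game starts that have aged out of the window.
            while self._starts and now - self._starts[0] >= self.window_seconds:
                self._starts.popleft()
            if len(self._starts) < self.max_games:
                self._starts.append(now)
                return
            # Window is full: sleep until the oldest start leaves it, then re-check.
            self._sleep(self.window_seconds - (now - self._starts[0]))

# Hypothetical usage with the notebook's HangmanAPI object:
# limiter = GameRateLimiter()
# for _ in range(100):
#     limiter.wait_for_slot()
#     api.start_game(practice=1, verbose=False)
```

The fixed `time.sleep(0.5)` used in the cells below gives no such guarantee on its own. It only stays under the limit because each game also takes several round trips to the server.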

In [3]:
api = HangmanAPI(access_token="70786437cf11ea37b8a7599cccfd20", timeout=2000)


All models will be loaded onto device: cuda:3
--- Initializing Ensemble Hangman Solver ---




Successfully loaded bilstm model from best_bilstm_solver.pth
Successfully loaded transformer model from best_transformer_solver.pth
Successfully loaded charcnn model from best_charcnn_solver.pth
--- Initialization Complete ---


## Playing practice games:
You can use the command below to play up to 100,000 practice games.

In [4]:
for i in range(1000):
     api.start_game(practice=1,verbose=True)
     time.sleep(0.5)
[total_practice_runs,total_recorded_runs,total_recorded_successes,total_practice_successes] = api.my_status() # Get my game stats: (# of tries, # of wins)
practice_success_rate = total_practice_successes / total_practice_runs
print('run %d practice games out of an allotted 100,000. practice success rate so far = %.5f' % (total_practice_runs, practice_success_rate))


HangmanAPIError: {'error': 'Your account has been deactivated!'}

## Playing recorded games:
Please finalize your code before running the cell below. Once this code has executed successfully, your submission is final. Our system will not allow you to run any additional games.

Note that after you successfully run this block of code, subsequent runs are expected to fail with the error message "Your account has been deactivated".

Once you've run this section of the code your submission is complete. Please send us your source code via email.

In [18]:
for i in range(1000):
    print('Playing ', i, ' th game')
    # Uncomment the following line to execute your final runs. Do not do this until you are satisfied with your submission
    api.start_game(practice=0,verbose=False)
    
    # DO NOT REMOVE as otherwise the server may lock you out for too high frequency of requests
    time.sleep(0.5)

Playing  0  th game
Playing  1  th game
Playing  2  th game
Playing  3  th game
Playing  4  th game
Playing  5  th game
Playing  6  th game
Playing  7  th game
Playing  8  th game
Playing  9  th game
Playing  10  th game
Playing  11  th game
Playing  12  th game
Playing  13  th game
Playing  14  th game
Playing  15  th game
Playing  16  th game
Playing  17  th game
Playing  18  th game
Playing  19  th game
Playing  20  th game
Playing  21  th game
Playing  22  th game
Playing  23  th game
Playing  24  th game
Playing  25  th game
Playing  26  th game
Playing  27  th game
Playing  28  th game
Playing  29  th game
Playing  30  th game
Playing  31  th game
Playing  32  th game
Playing  33  th game
Playing  34  th game
Playing  35  th game
Playing  36  th game
Playing  37  th game
Playing  38  th game
Playing  39  th game
Playing  40  th game
Playing  41  th game
Playing  42  th game
Playing  43  th game
Playing  44  th game
Playing  45  th game
Playing  46  th game
Playing  47  th game
Pl

HangmanAPIError: {'error': 'You have reached 1000 of games', 'status': 'denied'}

## To check your game statistics
1. Simply use the "my_status" method.
2. It returns your total number of games and your number of wins.

In [19]:
[total_practice_runs,total_recorded_runs,total_recorded_successes,total_practice_successes] = api.my_status() # Get my game stats: (# of tries, # of wins)
success_rate = total_recorded_successes/total_recorded_runs
print('overall success rate = %.5f' % success_rate)

overall success rate = 0.60100
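
A single success rate from a finite number of games carries sampling noise. The sketch below estimates it with the binomial standard error, and guards against the division by zero that the cell above would hit before any recorded game has been played. The helper `success_rate_with_error` is hypothetical, and the inputs assume that 0.60100 corresponds to 601 wins in 1,000 recorded games. The run count is inferred from the server's 1,000-game limit and is not printed in the notebook.

```python
import math

def success_rate_with_error(successes, runs):
    """Return (rate, standard error) for a win count, or None when no games were played."""
    if runs <= 0:
        return None
    rate = successes / runs
    # Binomial standard error: sqrt(p * (1 - p) / n)
    std_err = math.sqrt(rate * (1.0 - rate) / runs)
    return rate, std_err

# Assumed counts (see the note above): 601 wins out of 1,000 recorded games.
rate, std_err = success_rate_with_error(601, 1000)
print('success rate = %.3f +/- %.3f (one standard error)' % (rate, std_err))
```

With roughly 1,000 games the standard error is about one and a half percentage points, so differences of that size between two solver variants are within noise.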
