# Trexquant Interview Project (The Hangman Game)

* Copyright Trexquant Investment LP. All Rights Reserved. 
* Redistribution of this question without written consent from Trexquant is prohibited

## Instruction:
For this coding test, your mission is to write an algorithm that plays the game of Hangman through our API server. 

When a user plays Hangman, the server first selects a secret word at random from a list. The server then returns a row of space-separated underscores, one for each letter in the secret word, and asks the user to guess a letter. If the user guesses a letter that is in the word, the word is redisplayed with all instances of that letter shown in the correct positions, along with any letters correctly guessed on previous turns. If the letter does not appear in the word, the user is charged with an incorrect guess. The user keeps guessing letters until either (1) the user has correctly guessed all the letters in the word or (2) the user has made six incorrect guesses.
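
The rules above can be reproduced offline with a small simulator, which may be handy for trying a guessing strategy without calling the server. This is a minimal sketch, not the server's implementation: `play_hangman`, `guess_fn`, `max_wrong`, and `fixed_order_guess` are illustrative names, the masked-word format (each character followed by a space) is assumed from the `"_ p p _ e "` example in the provided code, and charging a repeated guess as incorrect is also an assumption.

```python
def play_hangman(secret, guess_fn, max_wrong=6):
    """Play one local game; return True if every letter is revealed in time."""
    guessed = []   # letters guessed so far, in order
    wrong = 0      # number of incorrect guesses charged
    while wrong < max_wrong:
        # Build the masked word, e.g. "_ p p _ e " (assumed server format).
        masked = "".join((c if c in guessed else "_") + " " for c in secret)
        if "_" not in masked:
            return True
        letter = guess_fn(masked, guessed)
        # Assumption: a repeated guess is charged like an incorrect one.
        if letter in guessed or letter not in secret:
            wrong += 1
        if letter not in guessed:
            guessed.append(letter)
    return False


def fixed_order_guess(masked, guessed):
    """Toy strategy: ignore the mask and guess letters in a fixed order."""
    for letter in "etaoinshrdlucmfwypvbgkjqxz":
        if letter not in guessed:
            return letter


print(play_hangman("tea", fixed_order_guess))
print(play_hangman("jazz", fixed_order_guess))
```

Any strategy with the same `(masked, guessed)` signature can be plugged in as `guess_fn`.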

You are required to write a "guess" function that takes the current word (with underscores) as input and returns a guessed letter. You will use the API code below to play 1,000 Hangman games. You can play practice games before you start recording your game results.

Your algorithm is permitted to use a training set of approximately 250,000 dictionary words. Your algorithm will be tested on an entirely disjoint set of 250,000 dictionary words. Please note that this means the words that you will ultimately be tested on do NOT appear in the dictionary that you are given. You are not permitted to use any dictionary other than the training dictionary we provided. This requirement will be strictly enforced by code review.
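
Because the test words never appear in the training dictionary, success measured on training words may be optimistic. One way to approximate the real setting locally is to hold out a disjoint slice of the training dictionary and keep it away from the algorithm while fitting. The following is a minimal sketch under that assumption; `split_disjoint`, `holdout_fraction`, and `seed` are illustrative names and are not part of the provided code.

```python
import random


def split_disjoint(words, holdout_fraction=0.1, seed=0):
    """Split a word list into two sets that share no words."""
    unique = sorted(set(words))        # drop duplicates, fix a stable order
    rng = random.Random(seed)          # seeded so the split is reproducible
    rng.shuffle(unique)
    n_holdout = int(len(unique) * holdout_fraction)
    return unique[n_holdout:], unique[:n_holdout]


words = ["apple", "banana", "grape", "orange", "watermelon",
         "apple", "cherry", "lemon", "mango", "peach", "plum"]
fit_words, holdout_words = split_disjoint(words, holdout_fraction=0.2)
print(len(fit_words), len(holdout_words))
```

Splitting at the word level, before any masked samples are generated, keeps every masked variant of a word on one side of the split.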

You are provided with a basic, working algorithm. This algorithm matches the provided masked string (e.g. a _ _ l e) against all possible words in the dictionary, tabulates the frequency of letters appearing in these candidate words, and then guesses the letter with the highest frequency of appearance that has not already been guessed. If no remaining words match, it defaults back to the character frequency distribution of the entire dictionary.

This benchmark strategy is successful approximately 18% of the time. Your task is to design an algorithm that significantly outperforms this benchmark.

In [None]:
import json
import requests
import random
import string
import secrets
import time
import re
import collections

try:
    from urllib.parse import parse_qs, urlencode, urlparse
except ImportError:
    from urlparse import parse_qs, urlparse
    from urllib import urlencode

from requests.packages.urllib3.exceptions import InsecureRequestWarning

requests.packages.urllib3.disable_warnings(InsecureRequestWarning)

## added
import torch



In [2]:
class HangmanAPI(object):
    def __init__(self, model, access_token=None, session=None, timeout=None):
        self.hangman_url = self.determine_hangman_url()
        self.access_token = access_token
        self.session = session or requests.Session()
        self.timeout = timeout
        self.guessed_letters = []
        
        full_dictionary_location = "words_250000_train.txt"
        self.full_dictionary = self.build_dictionary(full_dictionary_location)        
        self.full_dictionary_common_letter_sorted = collections.Counter("".join(self.full_dictionary)).most_common()
        
        self.current_dictionary = []

        ## added
        self.model = model
        
        all_chars = list(string.ascii_lowercase)
        self.all_chars_len = len(all_chars)
        self.char_to_idx = {c: i for i, c in enumerate(all_chars)}
        self.idx_to_char = {i: c for i, c in enumerate(all_chars)}
        
    @staticmethod
    def determine_hangman_url():
        links = ['https://trexsim.com']

        data = {link: 0 for link in links}

        for link in links:

            requests.get(link)

            for i in range(10):
                s = time.time()
                requests.get(link)
                data[link] = time.time() - s

        link = sorted(data.items(), key=lambda x: x[1])[0][0]
        link += '/trexsim/hangman'
        return link

    '''
    def guess(self, word): # word input example: "_ p p _ e "
        ###############################################
        # Replace with your own "guess" function here #
        ###############################################

        # clean the word so that we strip away the space characters
        # replace "_" with "." as "." indicates any character in regular expressions
        clean_word = word[::2].replace("_",".")
        
        # find length of passed word
        len_word = len(clean_word)
        
        # grab current dictionary of possible words from self object, initialize new possible words dictionary to empty
        current_dictionary = self.current_dictionary
        new_dictionary = []
        
        # iterate through all of the words in the old plausible dictionary
        for dict_word in current_dictionary:
            # continue if the word is not of the appropriate length
            if len(dict_word) != len_word:
                continue
                
            # if dictionary word is a possible match then add it to the current dictionary
            if re.match(clean_word,dict_word):
                new_dictionary.append(dict_word)
        
        # overwrite old possible words dictionary with updated version
        self.current_dictionary = new_dictionary
        
        
        # count occurrence of all characters in possible word matches
        full_dict_string = "".join(new_dictionary)
        
        c = collections.Counter(full_dict_string)
        sorted_letter_count = c.most_common()                   
        
        guess_letter = '!'
        
        # return most frequently occurring letter in all possible words that hasn't been guessed yet
        for letter,instance_count in sorted_letter_count:
            if letter not in self.guessed_letters:
                guess_letter = letter
                break
            
        # if no word matches in training dictionary, default back to ordering of full dictionary
        if guess_letter == '!':
            sorted_letter_count = self.full_dictionary_common_letter_sorted
            for letter,instance_count in sorted_letter_count:
                if letter not in self.guessed_letters:
                    guess_letter = letter
                    break            
        
        return guess_letter
    '''
    

    def encode_word(self, word, all_chars_len=26):
        encoded_word = torch.zeros((len(word), all_chars_len))

        for i, char in enumerate(word):
            if char == '_':
                continue  # Skip if masked character
            encoded_word[i, self.char_to_idx[char]] = 1
        return encoded_word


    def guess(self, masked_word):

        stripped = masked_word.replace(" ", "")
        if all(c == '_' for c in stripped):
            # First guess: prioritize vowels not yet guessed
            vowels = ['e', 'a', 'o', 'i', 'u']
            for v in vowels:
                if v not in self.guessed_letters:
                    return v
            # If all vowels are wrong, fall back to consonants (ETAOIN SHRDLU consonants)
            sorted_consonants = ['t', 'n', 's', 'h', 'r', 'd', 'l', 'b', 'c', 'f', 'g', 'j', 'k', 'm', 'p', 'q', 'v', 'w', 'x', 'y', 'z']
            for c in sorted_consonants:
                if c not in self.guessed_letters:
                    return c

        word_len = len(stripped)
        x_input = self.encode_word(stripped)

        self.model.eval()
        with torch.no_grad():
            lengths = torch.tensor([word_len])  # batch of 1
            logits = self.model(x_input.unsqueeze(0), lengths)
            probs = torch.softmax(logits[0], dim=-1)    # [word_len, all_chars_len]
            
            # 1. Zero out positions already known (non-zero in input)
            known_positions_mask = x_input.sum(dim=1) > 0  # [T]
            probs[known_positions_mask] = 0.0

            # 2. Zero out previously guessed letters
            if self.guessed_letters:
                guessed_char_idx = torch.tensor([self.char_to_idx.get(x) for x in self.guessed_letters])
                unknown_positions_mask = torch.tensor([i for i in range(word_len) if not known_positions_mask[i]])
                probs[unknown_positions_mask[:, None], guessed_char_idx] = 0

            # 3. normalize probabilities within each position
            row_sums = probs.sum(dim=1, keepdim=True) + 1e-8  # avoid division by zero
            probs_normalized = probs / row_sums

            # 4. pick max probability among all positions and characters
            pos, char = torch.where(probs_normalized == probs_normalized.max())
            guessed_char = self.idx_to_char[char[0].item()]

        return guessed_char


    ##########################################################
    # You'll likely not need to modify any of the code below #
    ##########################################################
    
    def build_dictionary(self, dictionary_file_location):
        text_file = open(dictionary_file_location,"r")
        full_dictionary = text_file.read().splitlines()
        text_file.close()
        return full_dictionary
                
    def start_game(self, practice=True, verbose=True):
        # reset guessed letters to empty set and current plausible dictionary to the full dictionary
        self.guessed_letters = []
        self.current_dictionary = self.full_dictionary
                         
        response = self.request("/new_game", {"practice":practice})
        if response.get('status')=="approved":
            game_id = response.get('game_id')
            word = response.get('word')
            tries_remains = response.get('tries_remains')
            if verbose:
                print("Successfully start a new game! Game ID: {0}. # of tries remaining: {1}. Word: {2}.".format(game_id, tries_remains, word))
            while tries_remains>0:
                # get guessed letter from user code
                guess_letter = self.guess(word)
                    
                # append guessed letter to guessed letters field in hangman object
                self.guessed_letters.append(guess_letter)
                if verbose:
                    print("Guessing letter: {0}".format(guess_letter))
                    
                try:    
                    res = self.request("/guess_letter", {"request":"guess_letter", "game_id":game_id, "letter":guess_letter})
                except HangmanAPIError:
                    print('HangmanAPIError exception caught on request.')
                    continue
                except Exception as e:
                    print('Other exception caught on request.')
                    raise e
               
                if verbose:
                    print("Sever response: {0}".format(res))
                status = res.get('status')
                tries_remains = res.get('tries_remains')
                if status=="success":
                    if verbose:
                        print("Successfully finished game: {0}".format(game_id))
                    return True
                elif status=="failed":
                    reason = res.get('reason', '# of tries exceeded!')
                    if verbose:
                        print("Failed game: {0}. Because of: {1}".format(game_id, reason))
                    return False
                elif status=="ongoing":
                    word = res.get('word')
        else:
            if verbose:
                print("Failed to start a new game")
            return False  # `status` is never assigned on this path
        return status=="success"
        
    def my_status(self):
        return self.request("/my_status", {})
    
    def request(
            self, path, args=None, post_args=None, method=None):
        if args is None:
            args = dict()
        if post_args is not None:
            method = "POST"

        # Add `access_token` to post_args or args if it has not already been
        # included.
        if self.access_token:
            # If post_args exists, we assume that args either does not exists
            # or it does not need `access_token`.
            if post_args and "access_token" not in post_args:
                post_args["access_token"] = self.access_token
            elif "access_token" not in args:
                args["access_token"] = self.access_token

        time.sleep(0.2)

        num_retry, time_sleep = 50, 2
        for it in range(num_retry):
            try:
                response = self.session.request(
                    method or "GET",
                    self.hangman_url + path,
                    timeout=self.timeout,
                    params=args,
                    data=post_args,
                    verify=False
                )
                break
            except requests.HTTPError as e:
                response = json.loads(e.read())
                raise HangmanAPIError(response)
            except requests.exceptions.SSLError as e:
                if it + 1 == num_retry:
                    raise
                time.sleep(time_sleep)

        headers = response.headers
        if 'json' in headers['content-type']:
            result = response.json()
        elif "access_token" in parse_qs(response.text):
            query_str = parse_qs(response.text)
            if "access_token" in query_str:
                result = {"access_token": query_str["access_token"][0]}
                if "expires" in query_str:
                    result["expires"] = query_str["expires"][0]
            else:
                raise HangmanAPIError(response.json())
        else:
            raise HangmanAPIError('Maintype was not text, or querystring')

        if result and isinstance(result, dict) and result.get("error"):
            raise HangmanAPIError(result)
        return result
    
class HangmanAPIError(Exception):
    def __init__(self, result):
        self.result = result
        self.code = None
        try:
            self.type = result["error_code"]
        except (KeyError, TypeError):
            self.type = ""

        try:
            self.message = result["error_description"]
        except (KeyError, TypeError):
            try:
                self.message = result["error"]["message"]
                self.code = result["error"].get("code")
                if not self.type:
                    self.type = result["error"].get("type", "")
            except (KeyError, TypeError):
                try:
                    self.message = result["error_msg"]
                except (KeyError, TypeError):
                    self.message = result

        Exception.__init__(self, self.message)

#### Load previously saved model

In [1]:
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence


class HangmanRNN(nn.Module):
    def __init__(self, chars_len=26, embed_dim=16, hidden_dim=128, num_layers=2, dropout=0.2):
        super().__init__()
        self.embedding = nn.Linear(chars_len, embed_dim)
        self.rnn = nn.GRU(
            embed_dim,
            hidden_dim,
            num_layers=num_layers,
            batch_first=True,
            dropout=dropout,
            bidirectional=True
        )
        self.norm = nn.LayerNorm(hidden_dim * 2)
        self.fc = nn.Linear(hidden_dim * 2, chars_len)
        self.embed_dropout = nn.Dropout(dropout)

    def forward(self, x, lengths):
        x = self.embedding(x)
        packed = pack_padded_sequence(x, lengths.cpu(), batch_first=True, enforce_sorted=True)
        packed_out, _ = self.rnn(packed)
        out, _ = pad_packed_sequence(packed_out, batch_first=True)
        out = self.norm(out)
        logits = self.fc(out)
        return logits


model = HangmanRNN(chars_len=26, embed_dim=16, hidden_dim=128, num_layers=2, dropout=0.2)
model.load_state_dict(torch.load("model7.pth"))
model.eval()

HangmanRNN(
  (embedding): Linear(in_features=26, out_features=16, bias=True)
  (rnn): GRU(16, 128, num_layers=2, batch_first=True, dropout=0.2, bidirectional=True)
  (norm): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
  (fc): Linear(in_features=256, out_features=26, bias=True)
  (embed_dropout): Dropout(p=0.2, inplace=False)
)

# API Usage Examples

## To start a new game:
1. Make sure you have implemented your own "guess" method.
2. Use the access_token that we sent you to create your HangmanAPI object. 
3. Start a game by calling "start_game" method.
4. If you wish to test your function without being recorded, set "practice" parameter to 1.
5. Note: You have a rate limit of 20 new games per minute. DO NOT start more than 20 new games within one minute.
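
One way to stay under the limit of 20 new games per minute is to space out calls to `start_game`. The following is a minimal sketch rather than part of the provided API: `GameRateLimiter` and its parameters are hypothetical names, and the clock and sleep functions are injectable only so the behaviour can be checked without actually waiting.

```python
import time
from collections import deque


class GameRateLimiter:
    """Block until another game can start within `max_games` per `period` seconds."""

    def __init__(self, max_games=20, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_games = max_games
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.starts = deque()   # start times of recent games, oldest first

    def wait(self):
        now = self.clock()
        # Forget games that started at least `period` seconds ago.
        while self.starts and now - self.starts[0] >= self.period:
            self.starts.popleft()
        if len(self.starts) >= self.max_games:
            # Sleep until the oldest recorded start leaves the window.
            self.sleep(self.period - (now - self.starts[0]))
            now = self.clock()
            self.starts.popleft()
        self.starts.append(now)


limiter = GameRateLimiter()
# Intended use (hypothetical): call limiter.wait() before each api.start_game(...)
```

Setting `max_games` a little below 20 would leave some margin for clock differences between client and server.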

In [23]:
api = HangmanAPI(model, access_token="186d71b7aaf4caa6b2f2408478095b", timeout=2000)

## Playing practice games:
You can use the command below to play up to 100,000 practice games.

In [25]:
num_games = 100

for g in range(num_games):
    print(f"\n\nGame {g+1}")
    api.start_game(practice=True, verbose=True)
    [total_practice_runs,total_recorded_runs,total_recorded_successes,total_practice_successes] = api.my_status() # Get my game stats: (# of tries, # of wins)
    practice_success_rate = total_practice_successes / total_practice_runs
    print('run %d practice games out of an allotted 100,000. practice success rate so far = %.3f' % (total_practice_runs, practice_success_rate))



Game 1
Successfully start a new game! Game ID: cc9ed2c9aadb. # of tries remaining: 6. Word: _ _ _ _ _ _ _ _ _ .
Guessing letter: e
Sever response: {'game_id': 'cc9ed2c9aadb', 'status': 'ongoing', 'tries_remains': 6, 'word': '_ _ _ _ e _ _ e _ '}
Guessing letter: s
Sever response: {'game_id': 'cc9ed2c9aadb', 'status': 'ongoing', 'tries_remains': 5, 'word': '_ _ _ _ e _ _ e _ '}
Guessing letter: d
Sever response: {'game_id': 'cc9ed2c9aadb', 'status': 'ongoing', 'tries_remains': 5, 'word': '_ _ _ _ e _ d e d '}
Guessing letter: a
Sever response: {'game_id': 'cc9ed2c9aadb', 'status': 'ongoing', 'tries_remains': 5, 'word': '_ _ _ _ e a d e d '}
Guessing letter: u
Sever response: {'game_id': 'cc9ed2c9aadb', 'status': 'ongoing', 'tries_remains': 4, 'word': '_ _ _ _ e a d e d '}
Guessing letter: h
Sever response: {'game_id': 'cc9ed2c9aadb', 'status': 'ongoing', 'tries_remains': 4, 'word': '_ _ _ h e a d e d '}
Guessing letter: o
Sever response: {'game_id': 'cc9ed2c9aadb', 'status': 'ongoing'

KeyboardInterrupt: 

#### 1. Clean list of words

In [6]:
def clean_word_list(words, min_len=2, max_len=30):
    import re

    cleaned = set()
    for w in words:
        w = w.strip().lower()
        # rule 1: alphabetic only
        if not re.fullmatch(r'[a-z]+', w):
            continue
        # rule 2: length constraints
        if not (min_len <= len(w) <= max_len):
            continue
        # rule 3: skip words of all identical letters
        if len(set(w)) == 1:
            continue
        cleaned.add(w)

    print(f"Number of words removed: {len(words) - len(cleaned)}")
    return sorted(cleaned)


max_word_len = 30

raw_words = api.full_dictionary
clean_words = clean_word_list(raw_words)

Number of words removed: 41


#### 2. Prepare training data

In [7]:
import numpy as np
import torch

## List of words
# words = ["apple", "banana", "grape", "orange", "watermelon"]
words = clean_words


## Alphabets dictionary
all_chars = list(string.ascii_lowercase)
all_chars_len = len(all_chars)
char_to_idx = {c: i for i, c in enumerate(all_chars)}
idx_to_char = {i: c for i, c in enumerate(all_chars)}


## Encoding and decoding functions
def encode_word(word, all_chars_len=26):
    encoded_word = torch.zeros((len(word), all_chars_len))

    for i, char in enumerate(word):
        if char == '_':
            continue  # Skip if masked character
        encoded_word[i, char_to_idx[char]] = 1
    return encoded_word


def decode_actual_word(encoded_word):
    row_sums = encoded_word.sum(dim=1)
    last_nonzero_idx = (row_sums != 0).nonzero(as_tuple=True)[0].max().item() + 1
    # remove padding
    trimmed = encoded_word[:last_nonzero_idx]
    char_indices = trimmed.argmax(dim=1)

    return ''.join(idx_to_char[i.item()] for i in char_indices)


def decode_masked_word(masked_word, word_len):
    trimmed = masked_word[:word_len]
    row_sums = trimmed.sum(dim=1)

    decoded_chars = []
    for i in range(trimmed.size(0)):
        if row_sums[i] == 0:
            decoded_chars.append('_')
        else:
            idx = trimmed[i].argmax().item()
            decoded_chars.append(idx_to_char[idx])
    return ''.join(decoded_chars)


## Generate training data from words
def convert_word_to_training_data(word, all_chars_len=26):
    encoded_word = encode_word(word, all_chars_len=all_chars_len)

    # create random masking, consistent per unique character
    unique_chars = sorted(set(word))
    while True:
        char_mask_map = {c: np.random.randint(0, 2) for c in unique_chars}  # 0=shown, 1=masked
        if len(set(char_mask_map.values())) != 1:
            break
    mask = np.array([char_mask_map[c] for c in word])

    # Apply masking directly to word length only
    x_input = encoded_word.clone()
    mask_tensor = torch.tensor(mask, dtype=torch.float32)
    mask_bool = mask_tensor.bool()
    x_input[mask_bool] = 0.0

    y_target = encoded_word  # same length as x_input

    return x_input, y_target, mask_tensor


## Multithreading to generate training data
def process_all_words(words, all_chars_len=26, cache_file="processed_data.pkl", force_process=False, multiplier=2):
    from multiprocessing.dummy import Pool
    import pickle
    from tqdm import tqdm

    if not force_process:
        try:
            with open(cache_file, "rb") as f:
                print(f"Loading cached processed data from {cache_file}...")
                return pickle.load(f)
        except FileNotFoundError:
            print("No cached data found — preprocessing...")

    # repeat each word `multiplier` times for additional training
    # (build a new list so the caller's list is not modified in place)
    words = words * multiplier

    # Worker to apply the conversion
    def worker(w):
        return convert_word_to_training_data(w, all_chars_len)

    # Parallel processing
    with Pool() as pool:
        processed_data = list(
            tqdm(pool.imap(worker, words), total=len(words), desc="Processing words")
        )

    # Note: tuples of tensors hash by object identity, so set() does not remove
    # duplicate samples here; it only changes their order
    processed_data = list(set(processed_data))

    # Cache results
    with open(cache_file, "wb") as f:
        pickle.dump(processed_data, f)
        print(f"Saved preprocessed data to {cache_file}")

    return processed_data

In [None]:
from torch.utils.data import Dataset

class HangmanDataset(Dataset):
    def __init__(self, processed_data):
        self.data = processed_data
    
    def __len__(self):
        return len(self.data)
    
    def __getitem__(self, idx):
        return self.data[idx]


processed_data = process_all_words(
    words, 
    multiplier=3, 
    cache_file="processed_data.pkl", 
    force_process=False, 
)

dataset = HangmanDataset(processed_data)

Processing words: 100%|██████████| 681777/681777 [01:04<00:00, 10601.60it/s]


Saved preprocessed data to processed_data.pkl
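
The masking in `convert_word_to_training_data` hides or reveals every occurrence of a letter together, which mirrors how Hangman reveals letters. The same idea can be shown on plain strings. This is a minimal sketch: `mask_word` is a hypothetical helper that uses Python's `random` module instead of NumPy, and it is not the function used to build the dataset above.

```python
import random


def mask_word(word, rng=random):
    """Hide a random, non-empty, non-complete subset of the word's letters."""
    letters = sorted(set(word))
    if len(letters) < 2:
        raise ValueError("need at least two distinct letters")
    while True:
        # One coin flip per distinct letter: True means hidden.
        hidden = {c: rng.random() < 0.5 for c in letters}
        # Reject the all-hidden and all-shown cases.
        if len(set(hidden.values())) == 2:
            break
    return "".join("_" if hidden[c] else c for c in word)


print(mask_word("banana", random.Random(0)))
```

Words made of a single repeated letter cannot be partially masked this way, which is consistent with the cleaning step dropping them.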


#### 3. Inspect training data

In [9]:
## Preview
num_previews = 30

for i in range(num_previews):
    actual_word = decode_actual_word(processed_data[i][1])
    masked_word = decode_masked_word(processed_data[i][0], len(actual_word))
    print(f'Actual: {actual_word} \nMasked: {masked_word} \n')

Actual: ricey 
Masked: ri_ey 

Actual: resubmitting 
Masked: _esub__tt_n_ 

Actual: ruel 
Masked: ru_l 

Actual: heterosomati 
Masked: _e_er___m___ 

Actual: orthotomous 
Masked: o_t_oto_ou_ 

Actual: electrobiological 
Masked: e_ect_o__o_og_c__ 

Actual: dispiteous 
Masked: d_sp__eo_s 

Actual: tardity 
Masked: tar__t_ 

Actual: spiritualminded 
Masked: ______u________ 

Actual: barrat 
Masked: _arra_ 

Actual: oleums 
Masked: oleum_ 

Actual: ereuthalion 
Masked: _r__thali__ 

Actual: geobiont 
Masked: _eobiont 

Actual: hirundo 
Masked: _i_u_d_ 

Actual: drawoff 
Masked: _r__o__ 

Actual: tallowberries 
Masked: ta__o_b___i__ 

Actual: soleless 
Masked: __l_l___ 

Actual: isouric 
Masked: i_ou_ic 

Actual: kunmiut 
Masked: kun_iut 

Actual: wincopipe 
Masked: _in___i_e 

Actual: prooestrous 
Masked: pr__es_r__s 

Actual: blazing 
Masked: ___zi__ 

Actual: lunations 
Masked: lu__t____ 

Actual: superappreciation 
Masked: su___a____cia_io_ 

Actual: culicine 
Masked: c_l_c_n_ 

Actual:

In [10]:
## Validate quality of encoded training data

idx = 3

print(decode_actual_word(dataset[idx][1]))

# x_input
print(dataset[idx][0])

# y_target
print(dataset[idx][1])

# mask
print(dataset[idx][2])

heterosomati
tensor([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1.,
         0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 0., 0., 0., 0., 0., 0.],
   

#### 4. Split data into training and validation sets

In [11]:
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence
from torch.utils.data import random_split, DataLoader


def collate_fn_dynamic_padding(batch):
    xs, ys, masks = zip(*batch)
    lengths = [x.shape[0] for x in xs]
    lengths = torch.tensor(lengths)

    xs = pad_sequence(xs, batch_first=True) # [B, T_max, feature_dim]
    ys = pad_sequence(ys, batch_first=True)
    masks = pad_sequence(masks, batch_first=True)

    lengths, perm_idx = lengths.sort(0, descending=True)    # sort by descending length for pack_padded_sequence
    xs, ys, masks = xs[perm_idx], ys[perm_idx], masks[perm_idx]
    return xs, ys, masks, lengths


dataset_size = len(dataset)
val_size = int(0.1 * dataset_size)
train_size = dataset_size - val_size
train_dataset, val_dataset = random_split(dataset, [train_size, val_size])

train_loader = DataLoader(train_dataset, batch_size=256, collate_fn=collate_fn_dynamic_padding, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=256, collate_fn=collate_fn_dynamic_padding, shuffle=False)

#### 5. Build model

In [None]:
import torch
import torch.nn as nn
from tqdm import tqdm


## RNN architecture
class HangmanRNN(nn.Module):
    def __init__(self, chars_len=26, embed_dim=16, hidden_dim=128, num_layers=2, dropout=0.2):
        super().__init__()
        self.embedding = nn.Linear(chars_len, embed_dim)
        self.rnn = nn.GRU(
            embed_dim,
            hidden_dim,
            num_layers=num_layers,
            batch_first=True,
            dropout=dropout,
            bidirectional=True
        )
        self.norm = nn.LayerNorm(hidden_dim * 2)    # bidirectional doubles hidden size
        self.fc = nn.Linear(hidden_dim * 2, chars_len)
        self.embed_dropout = nn.Dropout(dropout)    # optional dropout after embedding; not applied in forward()

    def forward(self, x, lengths):
        x = self.embedding(x)               # [B, T, embed_dim]
        packed = pack_padded_sequence(x, lengths.cpu(), batch_first=True, enforce_sorted=True)
        packed_out, _ = self.rnn(packed)
        out, _ = pad_packed_sequence(packed_out, batch_first=True)
        out = self.norm(out)                # normalize bidirectional output
        logits = self.fc(out)               # [B, T, chars_len]
        return logits


## Define model, optimizer, loss, learning rate
model7 = HangmanRNN(chars_len=26, embed_dim=16, hidden_dim=128, num_layers=2, dropout=0.2)
optimizer = torch.optim.Adam(model7.parameters(), lr=1e-3)
criterion = torch.nn.BCEWithLogitsLoss()
# reduces LR by factor of 0.5 if val loss stagnant for 2 epochs
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.5, patience=2)


## Train
num_epochs = 10
for epoch in range(num_epochs):
    model7.train()
    total_train_loss = 0

    for x, y, mask, lengths in tqdm(train_loader, desc=f"Epoch {epoch+1}/{num_epochs} [Train]"):
        logits = model7(x, lengths)
        loss = criterion(logits[mask == 1], y[mask == 1])

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        total_train_loss += loss.item()

    avg_train_loss = total_train_loss / len(train_loader)

    # compute validation loss
    model7.eval()
    total_val_loss = 0
    with torch.no_grad():
        for x, y, mask, lengths in val_loader:
            logits = model7(x, lengths)
            val_loss = criterion(logits[mask == 1], y[mask == 1])
            total_val_loss += val_loss.item()

    avg_val_loss = total_val_loss / len(val_loader)
    scheduler.step(avg_val_loss)

    print(f"Epoch {epoch+1}: Train Loss = {avg_train_loss:.4f}, Val Loss = {avg_val_loss:.4f}")

Epoch 1/10 [Train]: 100%|██████████| 2397/2397 [04:51<00:00,  8.21it/s]


Epoch 1: Train Loss = 0.1241, Val Loss = 0.1129


Epoch 2/10 [Train]: 100%|██████████| 2397/2397 [05:22<00:00,  7.44it/s]


Epoch 2: Train Loss = 0.1113, Val Loss = 0.1084


Epoch 3/10 [Train]: 100%|██████████| 2397/2397 [04:48<00:00,  8.30it/s]


Epoch 3: Train Loss = 0.1079, Val Loss = 0.1059


Epoch 4/10 [Train]: 100%|██████████| 2397/2397 [04:44<00:00,  8.43it/s]


Epoch 4: Train Loss = 0.1060, Val Loss = 0.1047


Epoch 5/10 [Train]: 100%|██████████| 2397/2397 [04:44<00:00,  8.42it/s]


Epoch 5: Train Loss = 0.1047, Val Loss = 0.1036


Epoch 6/10 [Train]: 100%|██████████| 2397/2397 [04:49<00:00,  8.28it/s]


Epoch 6: Train Loss = 0.1038, Val Loss = 0.1030


Epoch 7/10 [Train]: 100%|██████████| 2397/2397 [04:48<00:00,  8.31it/s]


Epoch 7: Train Loss = 0.1031, Val Loss = 0.1025


Epoch 8/10 [Train]: 100%|██████████| 2397/2397 [04:46<00:00,  8.37it/s]


Epoch 8: Train Loss = 0.1024, Val Loss = 0.1021


Epoch 9/10 [Train]: 100%|██████████| 2397/2397 [04:46<00:00,  8.37it/s]


Epoch 9: Train Loss = 0.1019, Val Loss = 0.1016


Epoch 10/10 [Train]: 100%|██████████| 2397/2397 [04:38<00:00,  8.59it/s]


Epoch 10: Train Loss = 0.1015, Val Loss = 0.1013
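
The training loop applies `BCEWithLogitsLoss` only where `mask == 1`, so revealed positions contribute nothing to the loss. The arithmetic for a single word can be sketched in plain Python as follows. `masked_bce_with_logits` is an illustrative helper written for this explanation, not the PyTorch implementation, although it uses the standard numerically stable form of the loss.

```python
import math


def masked_bce_with_logits(logits, targets, mask):
    """Mean binary cross-entropy over the positions where mask[i] == 1.

    logits, targets: lists of rows (one row per position, one column per letter).
    mask: one 0/1 entry per position; 1 means the position was hidden.
    """
    total, count = 0.0, 0
    for row_logits, row_targets, m in zip(logits, targets, mask):
        if m != 1:
            continue  # revealed position: excluded from the loss
        for x, y in zip(row_logits, row_targets):
            # Stable form of -[y*log(sigmoid(x)) + (1-y)*log(1-sigmoid(x))]
            total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
            count += 1
    return total / count if count else 0.0


logits = [[0.0, 0.0], [5.0, -5.0]]
targets = [[1.0, 0.0], [0.0, 1.0]]
print(masked_bce_with_logits(logits, targets, mask=[1, 0]))
```

With all logits at zero the per-element loss is `log(2)`, which gives a rough reference point when reading the loss values printed above.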


#### 6. Inference

In [19]:
## Inference
def guess(model, masked_word, guessed_letters):
    stripped = masked_word.replace(" ", "")
    if all(c == '_' for c in stripped):
        # First guess: prioritize vowels not yet guessed
        vowels = ['e', 'a', 'o', 'i', 'u']
        for v in vowels:
            if v not in guessed_letters:
                return v, None, None
        # If all vowels are wrong, fall back to consonants (ETAOIN SHRDLU consonants)
        sorted_consonants = ['t', 'n', 's', 'h', 'r', 'd', 'l', 'b', 'c', 'f', 'g', 'j', 'k', 'm', 'p', 'q', 'v', 'w', 'x', 'y', 'z']
        for c in sorted_consonants:
            if c not in guessed_letters:
                return c, None, None

    word_len = len(stripped)
    x_input = encode_word(stripped)

    model.eval()
    with torch.no_grad():
        lengths = torch.tensor([word_len])  # batch of 1
        logits = model(x_input.unsqueeze(0), lengths)
        probs = torch.softmax(logits[0], dim=-1)    # [word_len, all_chars_len]
        
        # 1. Zero out positions already known (non-zero in input)
        known_positions_mask = x_input.sum(dim=1) > 0  # [T]
        probs[known_positions_mask] = 0.0

        # 2. Zero out previously guessed letters
        if guessed_letters:
            guessed_char_idx = torch.tensor([char_to_idx.get(x) for x in guessed_letters])
            unknown_positions_mask = torch.tensor([i for i in range(word_len) if not known_positions_mask[i]])
            probs[unknown_positions_mask[:, None], guessed_char_idx] = 0

        # 3. normalize probabilities within each position
        row_sums = probs.sum(dim=1, keepdim=True) + 1e-8  # avoid division by zero
        probs_normalized = probs / row_sums

        # 4. pick max probability among all positions and characters
        guessed_pos, char = torch.where(probs_normalized == probs_normalized.max())
        guessed_char = idx_to_char[char[0].item()]

    return guessed_char, guessed_pos, probs_normalized
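
The masking and renormalisation steps of `guess` can be illustrated without the model. The sketch below is a plain-Python stand-in, not the notebook's code: `pick_letter` and its dictionary-based `position_probs` input are hypothetical, chosen only to make the logic easy to follow.

```python
def pick_letter(position_probs, guessed_letters):
    """Return the (position, letter) pair with the highest renormalised probability.

    position_probs maps each still-masked position to a {letter: probability}
    dict. This input format is an assumption made for the illustration.
    """
    best = None  # (probability, position, letter)
    for pos, probs in position_probs.items():
        # Drop letters that were already guessed.
        remaining = {ch: p for ch, p in probs.items() if ch not in guessed_letters}
        total = sum(remaining.values())
        if total == 0:
            continue  # nothing left to choose from at this position
        for ch, p in remaining.items():
            # Renormalise within the position before comparing across positions.
            candidate = (p / total, pos, ch)
            if best is None or candidate[0] > best[0]:
                best = candidate
    if best is None:
        return None
    return best[1], best[2]


example = {
    1: {"a": 0.5, "e": 0.3, "o": 0.2},
    3: {"l": 0.6, "e": 0.3, "r": 0.1},
}
print(pick_letter(example, guessed_letters={"e"}))  # (3, 'l')
```

Because each position is renormalised after the guessed letters are removed, a position with few remaining candidates can outrank one where the raw probability was higher.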

#### 7. Test model

In [20]:
## Example 1
masked_word = 'app_e'
hangman_input = ' '.join(masked_word)
guessed_letters = ['a', 'p', 'e', 'b', 'c', 'd', 'h']


char, pos, probs = guess(model, hangman_input, guessed_letters)

if len(set(masked_word)) == 1:
    print(f"Next guess: '{char}'")
else:
    print(f"Next guess: '{char}' at position {pos}")
    print("Probability matrix (masked positions only):")
    masked_positions = [idx for idx, char in enumerate(masked_word) if char == '_']
    for i in masked_positions:
        probs_dict = {idx_to_char[j]: float(probs[i, j]) for j in range(all_chars_len) if probs[i,j]>0}
        print(f"Position {i}: ", probs_dict)
        print(sorted(probs_dict, key=probs_dict.get, reverse=True))

Next guess: 'l' at position tensor([3])
Probability matrix (masked positions only):
Position 3:  {'f': 0.00012836948735639453, 'g': 0.001108763855881989, 'i': 0.19128884375095367, 'j': 0.0002790417929645628, 'k': 0.00015477229317184538, 'l': 0.6618263721466064, 'm': 0.00042593933176249266, 'n': 0.001427668146789074, 'o': 0.023885030299425125, 'q': 2.3676307137066033e-06, 'r': 0.05030466243624687, 's': 0.029816830530762672, 't': 0.00936949159950018, 'u': 0.024286340922117233, 'v': 0.0003797242825385183, 'w': 0.00012386047455947846, 'x': 1.4550921150657814e-05, 'y': 0.005062872543931007, 'z': 0.00011447598080849275}
['l', 'i', 'r', 's', 'u', 'o', 't', 'y', 'n', 'g', 'm', 'v', 'j', 'k', 'f', 'w', 'z', 'x', 'q']


In [21]:
## Example 2
masked_word = '____'
hangman_input = ' '.join(masked_word)
guessed_letters = []


char, pos, probs = guess(model, hangman_input, guessed_letters)

if len(set(masked_word)) == 1:
    print(f"Next guess: '{char}'")
else:
    print(f"Next guess: '{char}' at position {pos}")
    print("Probability matrix (masked positions only):")
    masked_positions = [idx for idx, char in enumerate(masked_word) if char == '_']
    for i in masked_positions:
        probs_dict = {idx_to_char[j]: float(probs[i, j]) for j in range(all_chars_len) if probs[i,j]>0}
        print(f"Position {i}: ", probs_dict)
        print(sorted(probs_dict, key=probs_dict.get, reverse=True))

Next guess: 'e'
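
Single examples like the two above can be complemented by playing whole games offline. The following is a minimal sketch of a local simulator under the rules stated in the instructions (six incorrect guesses allowed); `play_game` and `frequency_guesser` are hypothetical helpers, not part of the provided API, and counting a repeated guess as an error is an assumption of this sketch.

```python
def play_game(secret, guess_fn, max_wrong=6):
    """Simulate one Hangman game locally; return True if the word is solved."""
    guessed = []
    wrong = 0
    while wrong < max_wrong:
        # Same display format as the server: symbols separated by spaces.
        masked = " ".join(ch if ch in guessed else "_" for ch in secret)
        if "_" not in masked:
            return True
        letter = guess_fn(masked, guessed)
        if letter in guessed or letter not in secret:
            wrong += 1  # repeated guesses are counted as errors (assumption)
        guessed.append(letter)
    return False


def frequency_guesser(masked, guessed):
    """Baseline guesser: fixed letter order, ignoring the masked word."""
    for ch in "etaoinshrdlucmfwypvbgkjqxz":
        if ch not in guessed:
            return ch
    return "e"


print(play_game("tea", frequency_guesser))   # True
print(play_game("jazz", frequency_guesser))  # False
```

To evaluate the model instead of the baseline, one could pass `lambda masked, guessed: guess(model, masked, guessed)[0]` as `guess_fn`. Since the recorded games use words that do not appear in the training dictionary, such a simulation is only informative on words held out from training.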


#### 8. Save model

In [None]:
## save model

if False:
    torch.save(model7.state_dict(), "model7.pth")

## Playing recorded games:
Please finalize your code before running the cell below. Once this code has executed successfully, your submission is final: our system will not allow you to run any additional games.

Please note that after you successfully run this block of code, subsequent runs are expected to fail with the error message "Your account has been deactivated".

Once you have run this section of the code, your submission is complete. Please send us your source code via email.

In [32]:
for i in range(1000):
    print('Playing ', i, ' th game')
    # Uncomment the following line to execute your final runs. Do not do this until you are satisfied with your submission
    api.start_game(practice=0,verbose=False)
    
    # DO NOT REMOVE as otherwise the server may lock you out for too high frequency of requests
    time.sleep(0.5)

Playing  0  th game
Playing  1  th game
Playing  2  th game
Playing  3  th game
Playing  4  th game
Playing  5  th game
Playing  6  th game
Playing  7  th game
Playing  8  th game
Playing  9  th game
Playing  10  th game
Playing  11  th game
Playing  12  th game
Playing  13  th game
Playing  14  th game
Playing  15  th game
Playing  16  th game
Playing  17  th game
Playing  18  th game
Playing  19  th game
Playing  20  th game
Playing  21  th game
Playing  22  th game
Playing  23  th game
Playing  24  th game
Playing  25  th game
Playing  26  th game


HangmanAPIError: {'error': 'You have reached 1000 of games', 'status': 'denied'}

## To check your game statistics
1. Call the `my_status` method.
2. It returns your total numbers of practice and recorded games, and the number of wins for each.

In [33]:
[total_practice_runs,total_recorded_runs,total_recorded_successes,total_practice_successes] = api.my_status() # Get my game stats: (# of tries, # of wins)
success_rate = total_recorded_successes/total_recorded_runs
print('overall success rate = %.3f' % success_rate)

overall success rate = 0.517
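
The division in the cell above fails with `ZeroDivisionError` if no recorded games have been played yet. A small guard, sketched below, avoids that; `success_rate` is a hypothetical helper and the counts in the usage lines are made up purely to show the call.

```python
def success_rate(successes, runs):
    """Return successes / runs, or None when no games have been played yet."""
    if runs == 0:
        return None
    return successes / runs


# Made-up counts, only to show the call; real values come from api.my_status().
print(success_rate(0, 0))  # None
print(success_rate(3, 4))  # 0.75
```

The same helper could be applied to the practice counters to compare practice and recorded performance.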


## ANNEX

In [None]:
## Model 1 - basic

class HangmanRNN(nn.Module):
    def __init__(self, chars_len=26, embed_dim=16, hidden_dim=256, num_layers=2, dropout=0.2):
        super().__init__()
        self.embedding = nn.Linear(chars_len, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, num_layers=num_layers, 
                          batch_first=True, dropout=dropout)#, bidirectional=True)
        self.norm = nn.LayerNorm(hidden_dim)
        self.fc = nn.Linear(hidden_dim, chars_len)

    def forward(self, x):
        x = self.embedding(x)   # num_batches x max_word_len x embed_dim
        out, _ = self.rnn(x)    # num_batches x max_word_len x hidden_dim
        out = self.norm(out)
        logits = self.fc(out)   # num_batches x max_word_len x all_chars_len
        return logits



## Model 2 - added bidirectional training

class HangmanRNN(nn.Module):
    def __init__(self, chars_len=26, embed_dim=16, hidden_dim=256, num_layers=2, dropout=0.2):
        super().__init__()
        self.embedding = nn.Linear(chars_len, embed_dim)
        self.rnn = nn.GRU(
            embed_dim,
            hidden_dim,
            num_layers=num_layers,
            batch_first=True,
            dropout=dropout,
            bidirectional=True
        )
        self.norm = nn.LayerNorm(hidden_dim * 2)    # bidirectional doubles hidden size
        self.fc = nn.Linear(hidden_dim * 2, chars_len)
        self.embed_dropout = nn.Dropout(dropout)    # Optional dropout after embedding

    def forward(self, x):
        x = self.embedding(x)
        x = self.embed_dropout(x)
        out, _ = self.rnn(x)
        out = self.norm(out)
        logits = self.fc(out)
        return logits



## Model 3 - added training-validation split

dataset_size = len(dataset)
val_size = int(0.1 * dataset_size)
train_size = dataset_size - val_size
train_dataset, val_dataset = random_split(dataset, [train_size, val_size])

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=64, shuffle=False)



## Model 4 - added learning rate scheduler

model4 = HangmanRNN(chars_len=26, embed_dim=16, hidden_dim=128, num_layers=2, dropout=0.2)
optimizer = torch.optim.Adam(model4.parameters(), lr=1e-3)
criterion = torch.nn.BCEWithLogitsLoss()
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.5, patience=2)



## Model 5 - added dynamic padding so max_seq_len depends on max word length within each batch

def collate_fn_dynamic_padding(batch):
    xs, ys, masks = zip(*batch)
    lengths = [x.shape[0] for x in xs]
    lengths = torch.tensor(lengths)

    # pad_sequence expects a list of tensors [seq_len, feature_dim]
    xs = pad_sequence(xs, batch_first=True)
    ys = pad_sequence(ys, batch_first=True)
    masks = pad_sequence(masks, batch_first=True)

    # sort by descending length for pack_padded_sequence
    lengths, perm_idx = lengths.sort(0, descending=True)
    xs, ys, masks = xs[perm_idx], ys[perm_idx], masks[perm_idx]
    return xs, ys, masks, lengths
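
The idea behind Model 5's collate function is that each batch is padded only up to its own longest word, and sequences are ordered longest first. The plain-Python sketch below illustrates this on lists instead of tensors; `pad_batch` is a hypothetical name used only for this illustration.

```python
def pad_batch(sequences, pad_value=0):
    """Pad variable-length lists to the longest one in the batch, longest first."""
    ordered = sorted(sequences, key=len, reverse=True)
    lengths = [len(seq) for seq in ordered]
    max_len = lengths[0] if lengths else 0
    # Append pad_value until every sequence has max_len entries.
    padded = [seq + [pad_value] * (max_len - len(seq)) for seq in ordered]
    return padded, lengths


padded, lengths = pad_batch([[1, 2], [3, 4, 5, 6], [7]])
print(padded)   # [[3, 4, 5, 6], [1, 2, 0, 0], [7, 0, 0, 0]]
print(lengths)  # [4, 2, 1]
```

Keeping the true lengths alongside the padded batch is what lets the model ignore the padding later, for example through `pack_padded_sequence`. Calling it with `enforce_sorted=False` would make the explicit sort unnecessary.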

In [None]:
## Data processing with max word length

def encode_word(word, max_word_len=30, all_chars_len=26):
    word_len = len(word)
    encoded_word = torch.zeros((max_word_len, all_chars_len))

    for i, char in enumerate(word):
        if char == '_':
            continue  # skip if masked character
        encoded_word[i, char_to_idx[char]] = 1
    return encoded_word, word_len


def convert_word_to_training_data(word, max_word_len=30, all_chars_len=26):
    encoded_word, word_len = encode_word(word, max_word_len, all_chars_len)

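    # create random masking, but ensure consistent masking per unique character
    unique_chars = sorted(set(word))
    if len(unique_chars) == 1:
        # only one distinct letter: a mixed mask is impossible, so mask it
        # (without this guard the loop below would never terminate)
        char_mask_map = {unique_chars[0]: 1}
    else:
        while True:
            char_mask_map = {c: np.random.randint(0, 2) for c in unique_chars}  # 0=shown, 1=masked
            if len(set(char_mask_map.values())) != 1:   # ensure not all masked/unmasked
                break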
    mask = np.array([char_mask_map[c] for c in word])

    # apply masking
    mask_tensor = torch.zeros(max_word_len, dtype=torch.float32)
    mask_tensor[:word_len] = torch.tensor(mask, dtype=torch.float32)

    # zero-out masked positions (mask_tensor already spans max_word_len)
    x_input = encoded_word.clone()
    x_input[mask_tensor.bool()] = 0.0

    y_target = encoded_word  # target is the full, unmasked encoding

    return x_input, y_target, mask_tensor


def process_all_words(words, max_word_len=30, all_chars_len=26, cache_file="processed.pkl", force_process=False):
    from multiprocessing.dummy import Pool
    import pickle
    
    if not force_process:
        try:
            with open(cache_file, "rb") as f:
                print(f"Loading cached preprocessed data from {cache_file}...")
                return pickle.load(f)
        except FileNotFoundError:
            print("No cached data found — preprocessing...")
 
    def worker(w):
        return convert_word_to_training_data(w, max_word_len, all_chars_len)

    with Pool() as pool:
        processed_data = list(
            tqdm(pool.imap(worker, words), total=len(words), desc="Preprocessing words")
        )

    # cache results for future runs
    with open(cache_file, "wb") as f:
        pickle.dump(processed_data, f)
        print(f"Saved preprocessed data to {cache_file}")

    return processed_data
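
The per-letter masking used in `convert_word_to_training_data` mirrors how Hangman reveals letters: every occurrence of a letter is either shown or hidden. The sketch below shows the same idea on plain strings. `mask_word` is a hypothetical helper, and it samples the hidden set directly rather than by the retry loop above, so the distribution over masks is not identical to the notebook's.

```python
import random


def mask_word(word, rng=random):
    """Hide a random subset of the word's distinct letters, all occurrences at once."""
    letters = sorted(set(word))
    if len(letters) < 2:
        hidden = set(letters)  # one distinct letter: hide it entirely
    else:
        # Hide between 1 and len(letters) - 1 distinct letters, so the result
        # is never fully shown or fully hidden.
        k = rng.randint(1, len(letters) - 1)
        hidden = set(rng.sample(letters, k))
    masked = "".join("_" if ch in hidden else ch for ch in word)
    return masked, hidden


masked, hidden = mask_word("banana", random.Random(0))
print(masked, hidden)
```

Passing a seeded `random.Random` instance keeps the masks reproducible between runs, which is useful when comparing model variants on the same training data.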

In [None]:
## Model

class HangmanRNN(nn.Module):
    def __init__(self, chars_len=26, embed_dim=16, hidden_dim=128, num_layers=2, dropout=0.2):
        super().__init__()
        self.embedding = nn.Linear(chars_len, embed_dim)
        self.rnn = nn.GRU(
            embed_dim,
            hidden_dim,
            num_layers=num_layers,
            batch_first=True,
            dropout=dropout,
            bidirectional=True
        )
        self.norm = nn.LayerNorm(hidden_dim * 2)    # bidirectional doubles hidden size
        self.fc = nn.Linear(hidden_dim * 2, chars_len)
        self.embed_dropout = nn.Dropout(dropout)

    def forward(self, x):
        x = self.embedding(x)
        x = self.embed_dropout(x)
        out, _ = self.rnn(x)
        out = self.norm(out)
        logits = self.fc(out)
        return logits



## Split into training and val
dataset_size = len(dataset)
val_size = int(0.1 * dataset_size)
train_size = dataset_size - val_size
train_dataset, val_dataset = random_split(dataset, [train_size, val_size])

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=64, shuffle=False)


## Define model, optimizer, loss, learning rate
model4 = HangmanRNN(chars_len=26, embed_dim=16, hidden_dim=128, num_layers=2, dropout=0.2)
optimizer = torch.optim.Adam(model4.parameters(), lr=1e-3)
criterion = torch.nn.BCEWithLogitsLoss()
# halves the LR once val loss has gone more than 2 consecutive epochs without improving
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.5, patience=2)


## Train
num_epochs = 6
for epoch in range(num_epochs):
    model4.train()
    total_train_loss = 0

    for x, y, mask in tqdm(train_loader, desc=f"Epoch {epoch+1}/{num_epochs} [Train]"):
        logits = model4(x)
        loss = criterion(logits[mask == 1], y[mask == 1])

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        total_train_loss += loss.item()

    avg_train_loss = total_train_loss / len(train_loader)

    # compute validation loss
    model4.eval()
    total_val_loss = 0
    with torch.no_grad():
        for x, y, mask in val_loader:
            logits = model4(x)
            val_loss = criterion(logits[mask == 1], y[mask == 1])
            total_val_loss += val_loss.item()

    avg_val_loss = total_val_loss / len(val_loader)
    scheduler.step(avg_val_loss)

    print(f"Epoch {epoch+1}: Train Loss = {avg_train_loss:.4f}, Val Loss = {avg_val_loss:.4f}")


## Inference
def guess(model, masked_word, guessed_letters):

    stripped = masked_word.replace(" ", "")
    if all(c == '_' for c in stripped):
        # No letters revealed yet, so the model has no context: try unguessed vowels first
        vowels = ['e', 'a', 'o', 'i', 'u']
        for v in vowels:
            if v not in guessed_letters:
                return v, None, None
        # If all vowels are wrong, fall back to consonants (ETAOIN SHRDLU consonants)
        sorted_consonants = ['t', 'n', 's', 'h', 'r', 'd', 'l', 'b', 'c', 'f', 'g', 'j', 'k', 'm', 'p', 'q', 'v', 'w', 'x', 'y', 'z']
        for c in sorted_consonants:
            if c not in guessed_letters:
                return c, None, None

    x_input, word_len = encode_word(stripped)

    model.eval()
    with torch.no_grad():
        logits = model(x_input.unsqueeze(0))        # 1 x max_word_len x all_chars_len
        probs = torch.softmax(logits[0], dim=-1)    # max_word_len x all_chars_len
        
        # 1. Remove padding rows so only the word's own positions remain
        x_input = x_input[:word_len]
        probs = probs[:word_len]

        # 2. Zero out positions already known (non-zero in input)
        known_positions_mask = x_input.sum(dim=1) > 0  # [T]
        probs[known_positions_mask] = 0.0

        # 3. Zero out previously guessed letters
        if guessed_letters:
            guessed_char_idx = torch.tensor([char_to_idx.get(x) for x in guessed_letters])
            unknown_positions_mask = torch.tensor([i for i in range(word_len) if not known_positions_mask[i]])
            probs[unknown_positions_mask[:, None], guessed_char_idx] = 0

        # 4. normalize probabilities within each position
        row_sums = probs.sum(dim=1, keepdim=True) + 1e-8  # avoid division by zero
        probs_normalized = probs / row_sums

        # 5. pick max probability among all positions and characters
        pos, char = torch.where(probs_normalized == probs_normalized.max())
        guessed_char = idx_to_char[char[0].item()]
        guessed_pos = pos[0].item()

    return guessed_char, guessed_pos, probs_normalized



## Example
masked_word = 'app_e'
hangman_input = ' '.join(masked_word)
guessed_letters = ['a', 'p', 'e', 'b', 'c', 'd', 'h']


char, pos, probs = guess(model4, hangman_input, guessed_letters)

if len(set(masked_word)) == 1:
    print(f"Next guess: '{char}'")
else:
    print(f"Next guess: '{char}' at position {pos}")
    print("Probability matrix (masked positions only):")
    masked_positions = [idx for idx, char in enumerate(masked_word) if char == '_']
    for i in masked_positions:
        probs_dict = {idx_to_char[j]: float(probs[i, j]) for j in range(all_chars_len) if probs[i,j]>0}
        print(f"Position {i}: ", probs_dict)
        print(sorted(probs_dict, key=probs_dict.get, reverse=True))
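
The `ReduceLROnPlateau` scheduler configured above (`factor=0.5`, `patience=2`) can be approximated by a few lines of plain Python. This is a simplified sketch of the rule, not PyTorch's implementation: `PlateauHalver` is a hypothetical class, it ignores the improvement threshold and cooldown options of the real scheduler, and the loss values are made up.

```python
class PlateauHalver:
    """Simplified plateau rule: cut the LR after more than `patience` bad epochs."""

    def __init__(self, lr, factor=0.5, patience=2):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best:
            # Improvement: remember it and reset the counter.
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        if self.bad_epochs > self.patience:
            self.lr *= self.factor
            self.bad_epochs = 0
        return self.lr


sched = PlateauHalver(lr=1e-3)
history = [sched.step(loss) for loss in [0.110, 0.105, 0.106, 0.107, 0.108, 0.104]]
print(history)
```

With `patience=2`, the first two epochs without improvement are tolerated and the learning rate is halved on the third, which in this made-up sequence happens at the fifth epoch.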