<a href="https://colab.research.google.com/github/gothchico/Hangman-RL/blob/main/hangman_api_user.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Trexquant Interview Project (The Hangman Game)

* Copyright Trexquant Investment LP. All Rights Reserved. 
* Redistribution of this question without written consent from Trexquant is prohibited

## Instructions
For this coding test, your mission is to write an algorithm that plays the game of Hangman through our API server. 

When a user plays Hangman, the server first selects a secret word at random from a list. The server then returns a row of space-separated underscores, one for each letter in the secret word, and asks the user to guess a letter. If the user guesses a letter that is in the word, the word is redisplayed with all instances of that letter shown in the correct positions, along with any letters correctly guessed on previous turns. If the letter does not appear in the word, the user is charged with an incorrect guess. The user keeps guessing letters until either (1) the user has correctly guessed all the letters in the word or (2) the user has made six incorrect guesses.
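The rules above can be tried out locally without the API server. The following is a minimal sketch, not the server's implementation: `play_hangman` and `frequency_order` are hypothetical names, and charging a try for a repeated guess is an assumption, since the rules do not say how the server treats repeats.

```python
def play_hangman(secret, guess_fn, max_wrong=6):
    """Play one local game; return True if the whole word is revealed."""
    guessed = []
    wrong = 0
    while wrong < max_wrong:
        # Server-style display: letters or underscores, space separated
        masked = " ".join(c if c in guessed else "_" for c in secret)
        if "_" not in masked:
            return True
        letter = guess_fn(masked, guessed)
        # Assumption: a repeated guess is charged like an incorrect one
        if letter in guessed or letter not in secret:
            wrong += 1
        if letter not in guessed:
            guessed.append(letter)
    return False


def frequency_order(masked, guessed):
    """Toy strategy: walk through a fixed letter ordering."""
    for letter in "etaoinshrdlucmfwypvbgkjqxz":
        if letter not in guessed:
            return letter


print(play_hangman("tea", frequency_order))
print(play_hangman("apple", frequency_order))
```

Any function with the same `(masked, guessed)` signature can be plugged in as `guess_fn` to compare strategies offline before spending practice games on the server.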

You are required to write a "guess" function that takes the current word (with underscores) as input and returns a letter to guess. You will use the API code below to play 1,000 Hangman games. You can practice before you start recording your game results.

Your algorithm is permitted to use a training set of approximately 250,000 dictionary words. Your algorithm will be tested on an entirely disjoint set of 250,000 dictionary words. Please note that this means the words that you will ultimately be tested on do NOT appear in the dictionary that you are given. You are not permitted to use any dictionary other than the training dictionary we provided. This requirement will be strictly enforced by code review.
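Because the test words never appear in the training dictionary, a success rate measured on words the algorithm has already seen is likely to be optimistic. One way to approximate the real setting is to hold out part of the training dictionary and evaluate only on that part. The sketch below assumes a plain list of words; `split_dictionary` and the toy word list are illustrative names and data, not part of the provided code.

```python
import random


def split_dictionary(words, val_fraction=0.2, seed=0):
    """Shuffle a copy of the word list and split it into disjoint parts."""
    shuffled = list(words)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    # First n_val shuffled words are held out; the rest are used for training
    return shuffled[n_val:], shuffled[:n_val]


words = ["apple", "angle", "ample", "maple", "eagle",
         "ankle", "amble", "apply", "agile", "adobe"]
train_words, val_words = split_dictionary(words)
print(len(train_words), len(val_words))
```

Shuffling before splitting matters when the dictionary file is sorted alphabetically, since an unshuffled tail would contain only words from the end of the alphabet.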

You are provided with a basic, working algorithm. This algorithm will match the provided masked string (e.g., a _ _ l e) to all possible words in the dictionary, tabulate the frequency of letters appearing in these possible words, and then guess the letter with the highest frequency of appearance that has not already been guessed. If no remaining words match, it falls back to the character frequency distribution of the entire dictionary.

This benchmark strategy is successful approximately 18% of the time. Your task is to design an algorithm that significantly outperforms this benchmark.
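The benchmark described above can be illustrated on a toy word list. This is a compact sketch rather than the provided implementation: `baseline_guess` is a hypothetical name, and it uses `re.fullmatch` in place of a separate length check.

```python
import collections
import re


def baseline_guess(masked, dictionary, guessed):
    """Most frequent unguessed letter among words matching the mask."""
    # "a _ _ l e" -> "a..le"; "." matches any single character
    pattern = masked[::2].replace("_", ".")
    candidates = [w for w in dictionary if re.fullmatch(pattern, w)]
    # Try the matching words first, then fall back to the full dictionary
    for pool in (candidates, dictionary):
        counts = collections.Counter("".join(pool))
        for letter, _ in counts.most_common():
            if letter not in guessed:
                return letter
    return None


toy_dictionary = ["apple", "angle", "ample", "ankle", "maple"]
print(baseline_guess("a _ _ l e", toy_dictionary, ["a", "l", "e"]))
```

Here four words match the mask, and among the letters not yet guessed `p` occurs most often in them, so `p` is returned. When the mask matches no word at all, the counts come from the whole toy dictionary instead.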

In [2]:
import json
import requests
import random
import string
import secrets
import time
import re
import collections
import torch
import numpy
import pandas
import os
from torch.utils.data import TensorDataset, DataLoader
# Python 3 only: all three helpers live in urllib.parse
from urllib.parse import parse_qs, urlencode, urlparse

In [3]:
print(torch.__version__)
# torch.cuda.is_available() checks and returns a Boolean True if a GPU is available, else it'll return False
is_cuda = torch.cuda.is_available()

# If we have a GPU available, we'll set our device to GPU. We'll use this device variable later in our code.
if is_cuda:
    device = torch.device("cuda:0")
else:
    device = torch.device("cpu")

print(device)

1.12.1+cu113
cpu


In [4]:
torch.version.cuda

'11.3'

In [5]:
!pip install wandb
!wandb login

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting wandb
  Downloading wandb-0.13.5-py2.py3-none-any.whl (1.9 MB)
Collecting pathtools
  Downloading pathtools-0.1.2.tar.gz (11 kB)
Collecting docker-pycreds>=0.4.0
  Downloading docker_pycreds-0.4.0-py2.py3-none-any.whl (9.0 kB)
Collecting sentry-sdk>=1.0.0
  Downloading sentry_sdk-1.11.0-py2.py3-none-any.whl (168 kB)
Collecting shortuuid>=0.5.0
  Downloading shortuuid-1.0.11-py3-none-any.whl (10 kB)
Collecting setproctitle
  Downloading setproctitle-1.3.2-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (30 kB)
Collecting GitPython>=1.0.0
  Downloading GitPython-3.1.29-py3-none-any.whl (182 kB)
Collecting gitdb<5,>=4.0.1
  Downloading gitdb-4.0.9-

In [6]:
import wandb

wandb.init(project="hangman", entity="gothchico")

ERROR:wandb.jupyter:Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.
wandb: Currently logged in as: gothchico. Use `wandb login --relogin` to force relogin


In [7]:
import numpy
import torch
from torch.autograd import Variable
x = numpy.zeros(26)
print(x.shape)
y = torch.from_numpy(x).unsqueeze(0)
print(y.shape)


output = Variable(torch.randn(10, 120).float())
target = Variable(torch.FloatTensor(10).uniform_(0, 120).long())

print(output)
print(target)
# embedding_dim = MAX_NUM_INPUTS*27

got_em_right = 0

(26,)
torch.Size([1, 26])
tensor([[-0.0712, -0.2696, -0.6908,  ..., -1.0930, -1.3589, -1.7767],
        [-0.2336, -0.2340,  0.7080,  ...,  1.7974, -0.0369,  1.1163],
        [ 0.9953, -0.1802, -0.3438,  ...,  0.3147, -1.2167, -0.6586],
        ...,
        [-2.6683, -2.5680,  0.9873,  ...,  0.3147,  0.3944, -0.5959],
        [ 1.1478,  0.1586,  0.6630,  ..., -0.2713,  2.1713, -0.8966],
        [-0.4004, -0.1192,  0.3251,  ..., -0.0725, -0.1774,  0.0656]])
tensor([ 23,   8,  12, 118,  80,  71, 112,  52,  32,  62])


In [8]:
class GuessNet(torch.nn.Module):
  def __init__(self , output_size, embedding_dim, hidden_dim, n_layers=3):
        super(GuessNet, self).__init__()
        self.output_size = output_size
        self.n_layers = n_layers
        self.hidden_dim = hidden_dim
        
        self.embedding = torch.nn.Embedding(28, embedding_dim, padding_idx=0)
        self.lstm = torch.nn.LSTM(embedding_dim, hidden_dim, n_layers, batch_first=True)
        # self.dropout = torch.nn.Dropout(drop_prob)
        self.fc = torch.nn.Linear(hidden_dim+26*embedding_dim, output_size)
        # self.softmax = torch.nn.LogSoftmax()
        
  def forward(self, a, b, hidden):
        batch_size = a.size(0)
        # a = a.long()
        print(f'obscured input looks like : {a} with shape {a.shape}')
        embeds = self.embedding(a.long())
        # print(f'embeds look like : {embeds} with shape{embeds.shape}')
        lstm_out, hidden = self.lstm(embeds, hidden)
        # print(f'lstm_out looks like : {lstm_out} with shape{lstm_out.shape}')
        # print(f'hidden look like : {hidden}')
       
        # # Index hidden state of last time step
        # 1x135x50
        # lstm_out = lstm_out.contiguous().view(-1, self.hidden_dim)
        out = lstm_out[:, -1, :]

        # print(f'out looks like : {out} with shape{out.shape}')

        # 1x50
        # out = self.dropout(out)
        # Flatten the guessed-letter embeddings per sample (works for any batch size)
        b = self.embedding(b.long()).reshape(batch_size, -1)
        # print(f'b looks like : {b} with shape{b.shape}')
        concat_inp = torch.cat((out, b), -1)
        # print(f'concat_inp looks like : {concat_inp} with shape{concat_inp.shape}')
        # print(concat_inp.shape)
        out = self.fc(concat_inp)      #use concat hereeee
        # print(f'out looks like : {out} with shape{out.shape}')
        # print(f'result of fc is : {out} with shape {out.shape}')


        # out = self.sigmoid(out)
        
        # out = out.view(batch_size, -1)
        # out = out[:,-1]
        return out, hidden

    
  def init_hidden(self, batch_size = 1):
        weight = next(self.parameters()).data
        hidden = (weight.new(self.n_layers, batch_size, self.hidden_dim).zero_().to(device),
                      weight.new(self.n_layers, batch_size, self.hidden_dim).zero_().to(device))
        return hidden

class SmartPlayer(object):

  def __init__(self, input, model, criterion, optimiser, batched = False, chances = 6):

        self.guessed_letters = set([])
        self.batched = batched
        if batched is False :
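          # list.remove() returns None, so filter the padding zeros explicitly
          # and shift codes from 1..26 to 0..25, as set_word does
          self.encoded_word = [i - 1 for i in input[0][0].detach().cpu().numpy().tolist() if i != 0]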
          self.original_word = [chr(i+97) for i in self.encoded_word]
          self.remaining_letters_set = set(self.encoded_word)
        else :
          self.batched_input = input
          self.encoded_word = []

        self.correct_responses = torch.empty((0,26))
        self.obscured_words_seen = []
        self.letters_previously_guessed = []
        self.guesses = []
        self.out_tensors = torch.empty((0,26))
        self.batched_out_tensors = torch.empty((0,chances,26)).to(device)
        self.batched_correct_tensors = torch.empty((0,chances,26)).to(device)

        self.chances_left = chances

        self.model = model
        self.optim = optimiser
        self.criterion = criterion

        full_dictionary_location = "words_250000_train.txt"
        self.full_dictionary = self.build_dictionary(full_dictionary_location)
        self.max_word_len = max([len(i) for i in self.full_dictionary])
        self.full_dictionary_common_letter_sorted = collections.Counter("".join(self.full_dictionary)).most_common()

        self.current_dictionary = self.full_dictionary

        # print(f'encoded word is : {self.encoded_word}')
        # print(f'remaining_letters_set is : {self.remaining_letters_set}')

  def set_word(self,enc_word):
    
        self.encoded_word = enc_word.detach().cpu().numpy().tolist()
        self.encoded_word = list(filter((0).__ne__, self.encoded_word))
        self.encoded_word = [ i-1 for i in self.encoded_word]
        self.original_word = [chr(i+97) for i in self.encoded_word]
        print(f'encoded and original words : {self.encoded_word} and {self.original_word}')
        self.remaining_letters_set = set(self.encoded_word)
        self.correct_responses = torch.empty((0,26)).to(device)
        self.obscured_words_seen = []
        self.letters_previously_guessed = []
        self.guesses = []
        self.out_tensors = torch.empty((0,26)).to(device)
        self.guessed_letters = set([])

        return

  def reset_chances(self):
      self.chances_left = 6

  def padded_for_output(self, x_tensor):
    if(x_tensor.shape[0]==0):
        return torch.zeros((6,26), dtype=torch.float32).to(device)
    dim = x_tensor.shape[0]
    for i in range(6-dim):
        x_tensor = torch.cat((x_tensor,torch.zeros((1,26), dtype=torch.float32).to(device)),0)
    assert(x_tensor.shape[0]==6)

    return x_tensor.to(device)

  def run_1(self):
    # if self.batched is True:
      for enc_word in self.batched_input[0]:
          self.reset_chances()
          self.set_word(enc_word)
          while (self.chances_left > 0) and (len(self.remaining_letters_set) > 0):
              encoded_guessed_letters, obscured_input = self.prepare_inputs() 
              self.model.zero_grad()
              guessed_encoding_tensor, _ = self.model(obscured_input.to(device),encoded_guessed_letters.to(device), None)
              # print(f'guessed_encoding_tensor is : {guessed_encoding_tensor} with shape : {guessed_encoding_tensor.shape}')
              # store_result is commented out below; store_new_result is the live version
              guessed_encoding = self.store_new_result(guessed_encoding_tensor)
              # print(f'The guessed output num is {guessed_encoding} and letter is {chr(guessed_encoding+97)}') 
              
              # print(f'chances left are : {self.chances_left} and no. of remaining letters left are : {len(self.remaining_letters_set)}')
              # print(f'remaining_letters_set is : {self.remaining_letters_set}')
              # print(f'shapes : {self.batched_out_tensors.shape} and {self.out_tensors.unsqueeze(0).shape}')
          self.batched_out_tensors = torch.cat((self.batched_out_tensors, self.padded_for_output(self.out_tensors).unsqueeze(0)),0)
          self.batched_correct_tensors = torch.cat((self.batched_correct_tensors, self.padded_for_output(self.correct_responses).unsqueeze(0)),0)
        # Return the observations for use in training (both inputs, predictions, and losses)
        # print(type(torch.tensor(self.correct_responses)), torch.tensor(self.correct_responses).shape)
      return(self.batched_out_tensors, self.batched_correct_tensors)

  def run(self):
    # if self.batched is True:
      got_em_right = 0
      loss = 0
      for enc_word in self.batched_input[0]:
          self.reset_chances()
          self.set_word(enc_word)
          while (self.chances_left > 0) and (len(self.remaining_letters_set) > 0):
              encoded_guessed_letters, obscured_input = self.prepare_inputs() 
              self.model.zero_grad()
              self.optim.zero_grad()
              guessed_encoding_tensor, _ = self.model(obscured_input.to(device),encoded_guessed_letters.to(device), None)
              # print(f'guessed_encoding_tensor is : {guessed_encoding_tensor} with shape : {guessed_encoding_tensor.shape}')
              guessed_encoding = self.store_new_result(guessed_encoding_tensor)
              print(f'The guessed output num is {guessed_encoding} and letter is {chr(guessed_encoding+97)}') 
              y_pred = self.padded_for_output(self.out_tensors).to(device)
              y_target = self.padded_for_output(self.correct_responses).to(device)
              loss = self.criterion(y_pred, y_target)
          # print(f'out_tensors : {y_pred} and shape : {y_pred.shape}')
          # print(f'correct_responses : {y_target} and shape : {y_target.shape}')
          if (self.chances_left == 6):
              got_em_right+=1
          if (self.model.training is True) and (self.chances_left < 6):
              wandb.log({"loss": loss})
              loss.backward()
              torch.nn.utils.clip_grad_norm_(self.model.parameters(), 5)
              self.optim.step()

      return(loss, got_em_right)

  def encode_correct_responses(self):
        # To be used with cross_entropy_with_softmax, this vector must be normalized
        # response = numpy.zeros(26, dtype=numpy.float32)
        # for i in self.remaining_letters_set:
        #     response[i] = 1.0
        # response /= response.sum()
        # return torch.tensor(response).to(device)
        
        # response = numpy.zeros(26, dtype=numpy.float32)
        # for ii, x in enumerate(self.remaining_letters_set):
        #     response[ii] = x
        # response[numpy.nonzero(response)[-1]+1] = -1

        labels = torch.tensor(list(self.remaining_letters_set))
        labels = labels.unsqueeze(0)
        target = torch.zeros(labels.size(0), 26).scatter_(1, labels, 1.)

        # print(target)

        return target.to(device)
  
  # def store_result(self, guessed_enc_tensor, obscured_input, encoded_guessed_letters):

  #       # Record what the model saw as input: an obscured word and a list of previously-guessed letters
  #       self.obscured_words_seen.append(obscured_input.detach().cpu().numpy())
  #       self.letters_previously_guessed.append(encoded_guessed_letters.detach().cpu().numpy())
        
  #       # Find the index of the guessed letter
  #       guessed_encoding = numpy.argmax(numpy.squeeze(guessed_enc_tensor.detach().cpu().numpy()))

        
  #       # Record the letter that the model guessed, and add that guess to the list of previous guesses
  #       self.guesses.append(guessed_encoding).sort()
  #       self.guessed_letters.add(guessed_encoding)


  #       # Determine an appropriate reward, and reduce # of chances left if appropriate
  #       if guessed_encoding in self.remaining_letters_set:
  #           self.remaining_letters_set.remove(guessed_encoding)
  #       else : 
  #           self.chances_left -= 1
  #           #Saving the out tensors
  #           self.out_tensors = torch.cat((self.out_tensors, guessed_enc_tensor),0)
  #           # Store the "correct responses"
  #           correct_responses = self.encode_correct_responses()
  #           self.correct_responses = torch.cat((self.correct_responses, correct_responses.unsqueeze(0)),0)

  #       # if self.correct_responses[-1][guessed_encoding] < 0.00001:
  #       #     self.chances_left -= 1
  #       return guessed_encoding

  def store_new_result(self, guessed_enc_tensor):
        
        # Find the index of the guessed letter
        guessed_encoding = torch.argmax(torch.squeeze(guessed_enc_tensor))
        guessed_encoding = guessed_encoding.item()
        # Record the letter that the model guessed, and add that guess to the list of previous guesses
        self.guesses.append(guessed_encoding)
        self.guessed_letters.add(guessed_encoding)  
        
        # print(f'guessed_letters are : {self.guessed_letters}')
        print(f'guesses are : {self.guesses}')
         
        # Determine an appropriate reward, and reduce # of chances left if appropriate
        if guessed_encoding in self.remaining_letters_set:
            self.remaining_letters_set.remove(guessed_encoding)
        else : 
            self.chances_left -= 1
            #Saving the out tensors
            self.out_tensors = torch.cat((self.out_tensors, guessed_enc_tensor),0).to(device)
            # Store the "correct responses"
            correct_responses = self.encode_correct_responses()
            self.correct_responses = torch.cat((self.correct_responses, correct_responses),0).to(device)

        # if self.correct_responses[-1][guessed_encoding] < 0.00001:
        #     self.chances_left -= 1
        return guessed_encoding

  def build_dictionary(self, dictionary_file_location):
      text_file = open(dictionary_file_location,"r")
      full_dictionary = text_file.read().splitlines()
      text_file.close()
      return full_dictionary


  def prepare_inputs(self):

        encoded_obscured_word = []
        encoded_guessed_letters_vector = torch.zeros(26, dtype=torch.float32)
        # for ii, x in enumerate(self.guessed_letters):
        #     if x in self.encoded_word:
        #         encoded_guessed_letters_vector[ii] = x
        #     else:
        #         encoded_guessed_letters_vector[i] = -1

        for ii, x in enumerate(self.guesses):
            encoded_guessed_letters_vector[ii] = x+1

        
        encoded_obscured_word = [i+1 if i in self.guessed_letters else 27 for i in self.encoded_word]
        # encoded_obscured_word_vector = numpy.zeros((len(encoded_obscured_word), 27), dtype=numpy.float32)
        # for i, j in enumerate(encoded_obscured_word):
        #     encoded_obscured_word_vector[i, j] = 1

        encoded_obscured_word_vector = torch.zeros(self.max_word_len, dtype=torch.float32)
        for ii, x in enumerate(encoded_obscured_word):
            encoded_obscured_word_vector[-len(encoded_obscured_word)+ii] = x

        # flattened_obscured_word = encoded_obscured_word_vector.reshape(1,-1)

        return encoded_guessed_letters_vector.unsqueeze(0), encoded_obscured_word_vector.unsqueeze(0)
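The encoding built by `prepare_inputs` (letters as 1..26, hidden positions as 27, zeros as left padding) can also be produced directly from the masked string that the server returns. The sketch below shows that, plus one possible way to stop the network from repeating a letter. Both helpers, `encode_masked_word` and `mask_guessed`, are hypothetical names that do not appear in the notebook, and masking guessed letters is an added suggestion rather than something the code above does.

```python
import torch

UNKNOWN = 27  # index for hidden positions; 0 is reserved for padding


def encode_masked_word(masked, max_len):
    """Encode e.g. "_ p p _ e" as a left-padded LongTensor of shape (1, max_len)."""
    clean = masked[::2]  # drop the separating spaces
    codes = [UNKNOWN if ch == "_" else ord(ch) - 96 for ch in clean][-max_len:]
    out = torch.zeros(1, max_len, dtype=torch.long)
    if codes:
        out[0, -len(codes):] = torch.tensor(codes)
    return out


def mask_guessed(logits, guessed_letters):
    """Return a copy of logits with already-guessed letters set to -inf."""
    masked_logits = logits.clone()
    for letter in guessed_letters:
        masked_logits[..., ord(letter) - 97] = float("-inf")
    return masked_logits


x = encode_masked_word("_ p p _ e", max_len=8)
print(x)

# Stand-in logits: 'e' scores highest, 'a' second
logits = torch.zeros(1, 26)
logits[0, 4] = 2.0
logits[0, 0] = 1.0
best = chr(int(torch.argmax(mask_guessed(logits, ["e", "p"]))) + 97)
print(best)
```

With `e` and `p` masked out, the argmax falls through to the next-best letter. Without such a mask, nothing in the network's output prevents it from proposing the same letter on consecutive turns.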

 

In [9]:
HANGMAN_URL = "https://www.trexsim.com/trexsim/hangman"
model_filename = './state_dict.pt'




# fill up guessed_letters = []
class HangmanAPI(object):
    def __init__(self, access_token=None, session=None, timeout=None):
        self.access_token = access_token
        self.session = session or requests.Session()
        self.timeout = timeout
        self.guessed_letters = []

        full_dictionary_location = "words_250000_train.txt"
        self.full_dictionary = self.build_dictionary(full_dictionary_location)
        self.vocab_size = len(self.full_dictionary)
        self.max_word_len = max([len(i) for i in self.full_dictionary])
        self.avg_word_len = numpy.array([len(i) for i in self.full_dictionary]).mean()

        self.encoded_dictionary = [[ord(i)-96 for i in word] for word in self.full_dictionary]
        # print(f'printing : {type(self.encoded_dictionary)} and {self.encoded_dictionary[:5]}')
        
        self.full_dictionary_common_letter_sorted = collections.Counter("".join(self.full_dictionary)).most_common()
        if os.path.exists(model_filename) :
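            self.model = GuessNet(26, 8, 26)
            # map_location lets a checkpoint saved on GPU load on a CPU-only machine
            self.model.load_state_dict(torch.load(model_filename, map_location=device))
            self.model.to(device)
            self.model.eval()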
        else :
            self.perform_training_and_save_model()
            

    # Defining a function that pads words with 0 to a fixed length
    def pad_input(self, words, max_len):
        data = numpy.zeros((self.vocab_size, max_len),dtype= numpy.int32)
        for ii, word in enumerate(words):
            if len(word) != 0:
                data[ii, -len(word):] = numpy.array(word)[:max_len]
        return data

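    def guess(self, word): # word input example: "_ p p _ e "
        # Strip the separating spaces: "_ p p _ e " -> "_pp_e"
        clean_word = word[::2]

        if getattr(self, "model", None) is None:
            if os.path.exists(model_filename):
                self.model = GuessNet(26, 8, 26)
                self.model.load_state_dict(torch.load(model_filename, map_location=device))
                self.model.to(device)
            else:
                raise RuntimeError("pre-trained model isn't available, check model.load and model.save")

        # Encode the masked word as SmartPlayer.prepare_inputs does:
        # letters -> 1..26, hidden positions -> 27, left-padded with 0
        encoded_word = [27 if ch == "_" else ord(ch) - 96 for ch in clean_word][-self.max_word_len:]
        obscured_input = torch.zeros(1, self.max_word_len, dtype=torch.long)
        if encoded_word:
            obscured_input[0, -len(encoded_word):] = torch.tensor(encoded_word)

        # Encode previous guesses in order (1..26), zero-padded to 26 slots
        encoded_guessed_letters = torch.zeros(1, 26, dtype=torch.long)
        for ii, letter in enumerate(self.guessed_letters[:26]):
            encoded_guessed_letters[0, ii] = ord(letter) - 96

        self.model.eval()
        with torch.no_grad():
            guessed_encoding_tensor, _ = self.model(obscured_input.to(device), encoded_guessed_letters.to(device), None)

        # Output index 0..25 maps to 'a'..'z'
        guessed_encoding = int(torch.argmax(guessed_encoding_tensor.squeeze()))
        return chr(guessed_encoding + 97)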

    def perform_training_and_save_model(self):

        # During training, the model will only see words below this index.
        # The remainder of the words can be used as a validation set.
        train_val_split_idx = int(len(self.full_dictionary) * 0.8)
        print('Training with {} words'.format(train_val_split_idx))

        padded_words = self.pad_input(self.encoded_dictionary, self.max_word_len)

        # print(f'first 3 padded words : {padded_words[:3]}')

        # train_words = padded_words[:train_val_split_idx]
        # val_words = padded_words[train_val_split_idx:]

        train_words = padded_words[:200]
        val_words = padded_words[train_val_split_idx:train_val_split_idx+200]
        
        train_data = TensorDataset(torch.from_numpy(train_words))
        val_data = TensorDataset(torch.from_numpy(val_words))

        batch_size = 10

        train_loader = DataLoader(train_data, shuffle=True, batch_size=batch_size)
        val_loader = DataLoader(val_data, shuffle=True, batch_size=batch_size)
        
        print('Max word length: {}, average word length: {:0.1f}'.format(self.max_word_len, self.avg_word_len))

        self.model = GuessNet(26, 8, 26)
        self.model.to(device)
        lr=0.0001
        # criterion = torch.nn.CrossEntropyLoss()
        # criterion = torch.nn.MultiLabelMarginLoss()
        criterion = torch.nn.BCEWithLogitsLoss(reduction = 'mean').to(device)
        optimizer = torch.optim.Adam(self.model.parameters(), lr=lr, weight_decay=1e-6)
        # optimizer = torch.optim.Adagrad(self.model.parameters(), 
        #                                 lr=lr, 
        #                                 lr_decay=0, 
        #                                 eps=1e-10)
        epochs = 50
        counter = 0
        print_every = 100
        clip = 5
        valid_loss_min = numpy.inf
        # batch_size = 1

        wandb.config.epochs = epochs
        wandb.config.batch_size = batch_size
        wandb.config.learning_rate = lr
        wandb.config.weight_decay = 1e-6
        wandb.config.architecture = "GuessNet"
        
        print("--------------------STARTING TRAINING--------------------")

        self.model.train()
        total_words_processed = 0
        RIGHT = 0

        for epoch in range(epochs):
            h = self.model.init_hidden(batch_size)
            for train_input in train_loader :

                counter += 1
                # print(f'train_input is {len(train_input)} with shape {train_input[0].shape}')
                # print(f'The first 3 words are : {train_input[0][:3]}')
                h = tuple([e.data for e in h])
                player = SmartPlayer(train_input, self.model, criterion, optimizer, batched = True)
                loss, right = player.run()
                RIGHT +=right
                print("Epoch: {}/{}...".format(epoch+1,epochs),"Step: {}...".format(counter),
                          "Loss: {:.6f}...".format(loss.item())," Right : {}/{}...".format(RIGHT,counter*batch_size))
                
                if counter%print_every == 0:
                    val_h = self.model.init_hidden(batch_size)
                    val_losses = []
                    total_words_validated = 0
                    self.model.eval()
                    print('----------------------validating----------------------')
                    for val_input in val_loader :
                        total_words_validated+=1
                        val_player = SmartPlayer(val_input,self.model, criterion, optimizer, batched = True)
                        val_loss, right = val_player.run()
                        RIGHT +=right
                        print(type(val_loss))
                        val_losses.append(val_loss.item())
                        wandb.log({"val_loss": val_loss})
                    self.model.train()
                    print("Loss: {:.6f}...".format(loss.item()),
                          "Val Loss: {:.6f}".format(numpy.mean(numpy.array(val_losses)))
                          )
                    runs,_,_,success = self.my_status()
                    # Guard against division by zero before any practice game has been played
                    print('%d practice games run. practice success rate so far = %.3f' % (runs, success/runs if runs else 0.0))
                    if numpy.mean(val_losses) <= valid_loss_min:
                          torch.save(self.model.state_dict(), model_filename)
                          print('Validation loss decreased ({:.6f} --> {:.6f}).  Saving model ...'.format(valid_loss_min,numpy.mean(val_losses)))
                          valid_loss_min = numpy.mean(val_losses)
                    
                total_words_processed+=1

        # TODO: also save the final model here
        # RIGHT is the running count kept by this method
        print(f'Got em right!!!!!!! :{RIGHT}')
        return

    def guess_1(self, word): # word input example: "_ p p _ e "
        ###############################################
        # Replace with your own "guess" function here #
        ###############################################

        # clean the word so that we strip away the space characters
        # replace "_" with "." as "." indicates any character in regular expressions
        clean_word = word[::2].replace("_",".")
        
        # find length of passed word
        len_word = len(clean_word)
        
        # grab current dictionary of possible words from self object, initialize new possible words dictionary to empty
        current_dictionary = self.current_dictionary
        new_dictionary = []
        
        # iterate through all of the words in the old plausible dictionary
        for dict_word in current_dictionary:
            # continue if the word is not of the appropriate length
            if len(dict_word) != len_word:
                continue
                
            # if dictionary word is a possible match then add it to the current dictionary
            if re.match(clean_word,dict_word):
                new_dictionary.append(dict_word)
        
        # overwrite old possible words dictionary with updated version
        self.current_dictionary = new_dictionary
        
        
        # count occurrence of all characters in possible word matches
        full_dict_string = "".join(new_dictionary)
        
        c = collections.Counter(full_dict_string)
        sorted_letter_count = c.most_common()                   
        
        guess_letter = '!'
        
        # return most frequently occurring letter in all possible words that hasn't been guessed yet
        for letter,instance_count in sorted_letter_count:
            if letter not in self.guessed_letters:
                guess_letter = letter
                break
            
        # if no word matches in training dictionary, default back to ordering of full dictionary
        if guess_letter == '!':
            sorted_letter_count = self.full_dictionary_common_letter_sorted
            for letter,instance_count in sorted_letter_count:
                if letter not in self.guessed_letters:
                    guess_letter = letter
                    break            
        
        return guess_letter

    ##########################################################
    # You'll likely not need to modify any of the code below #
    ##########################################################
    
    def build_dictionary(self, dictionary_file_location):
        text_file = open(dictionary_file_location,"r")
        full_dictionary = text_file.read().splitlines()
        text_file.close()
        return full_dictionary
                
    def start_game(self, practice=True, verbose=True):
        # reset guessed letters to empty set and current plausible dictionary to the full dictionary
        self.guessed_letters = []
        self.current_dictionary = self.full_dictionary
                         
        response = self.request("/new_game", {"practice":practice})
        if response.get('status')=="approved":
            game_id = response.get('game_id')
            word = response.get('word')
            tries_remains = response.get('tries_remains')
            if verbose:
                print("Successfully started a new game! Game ID: {0}. # of tries remaining: {1}. Word: {2}.".format(game_id, tries_remains, word))
            while tries_remains>0:
                # get guessed letter from user code
                guess_letter = self.guess(word)
                    
                # append guessed letter to guessed letters field in hangman object
                self.guessed_letters.append(guess_letter)
                if verbose:
                    print("Guessing letter: {0}".format(guess_letter))
                    
                try:    
                    res = self.request("/guess_letter", {"request":"guess_letter", "game_id":game_id, "letter":guess_letter})
                except HangmanAPIError:
                    print('HangmanAPIError exception caught on request.')
                    continue
                except Exception as e:
                    print('Other exception caught on request.')
                    raise e
               
                if verbose:
                    print("Server response: {0}".format(res))
                status = res.get('status')
                tries_remains = res.get('tries_remains')
                if status=="success":
                    if verbose:
                        print("Successfully finished game: {0}".format(game_id))
                    return True
                elif status=="failed":
                    reason = res.get('reason', '# of tries exceeded!')
                    if verbose:
                        print("Failed game: {0}. Because of: {1}".format(game_id, reason))
                    return False
                elif status=="ongoing":
                    word = res.get('word')
        else:
            if verbose:
                print("Failed to start a new game")
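        # A successful game has already returned True above; `status` is
        # undefined here when the game failed to start, so return False directly
        return False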
        
    def my_status(self):
        return self.request("/my_status", {})
    
    def request(
            self, path, args=None, post_args=None, method=None):
        if args is None:
            args = dict()
        if post_args is not None:
            method = "POST"

        # Add `access_token` to post_args or args if it has not already been
        # included.
        if self.access_token:
            # If post_args exists, we assume that args either does not exists
            # or it does not need `access_token`.
            if post_args and "access_token" not in post_args:
                post_args["access_token"] = self.access_token
            elif "access_token" not in args:
                args["access_token"] = self.access_token

        time.sleep(0.2)

        num_retry, time_sleep = 50, 2
        for it in range(num_retry):
            try:
                response = self.session.request(
                    method or "GET",
                    HANGMAN_URL + path,
                    timeout=self.timeout,
                    params=args,
                    data=post_args
                )
                break
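            except requests.HTTPError as e:
                # requests exceptions have no read(); the server reply, if any, is on e.response
                try:
                    payload = e.response.json()
                except (AttributeError, ValueError):
                    payload = str(e)
                raise HangmanAPIError(payload)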
            except requests.exceptions.SSLError as e:
                if it + 1 == num_retry:
                    raise
                time.sleep(time_sleep)

        headers = response.headers
        if 'json' in headers['content-type']:
            result = response.json()
        elif "access_token" in parse_qs(response.text):
            # The branch condition already guarantees that the key is present
            query_str = parse_qs(response.text)
            result = {"access_token": query_str["access_token"][0]}
            if "expires" in query_str:
                result["expires"] = query_str["expires"][0]
        else:
            raise HangmanAPIError('Maintype was not text, or querystring')

        if result and isinstance(result, dict) and result.get("error"):
            raise HangmanAPIError(result)
        return result
    
class HangmanAPIError(Exception):
    def __init__(self, result):
        self.result = result
        self.code = None
        try:
            self.type = result["error_code"]
        except (KeyError, TypeError):
            self.type = ""

        try:
            self.message = result["error_description"]
        except (KeyError, TypeError):
            try:
                self.message = result["error"]["message"]
                self.code = result["error"].get("code")
                if not self.type:
                    self.type = result["error"].get("type", "")
            except (KeyError, TypeError):
                try:
                    self.message = result["error_msg"]
                except (KeyError, TypeError):
                    self.message = result

        Exception.__init__(self, self.message)
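
The `HangmanAPIError` constructor above tries several payload shapes in turn to find a human-readable message. As a rough illustration of that lookup order only, here is a standalone sketch; `extract_error_message` is a hypothetical name, not part of the provided code, and it ignores the `type` and `code` fields.

```python
def extract_error_message(result):
    """Hypothetical helper mirroring the message lookup order in HangmanAPIError."""
    if isinstance(result, dict):
        # 1. A flat "error_description" field wins if present.
        if "error_description" in result:
            return result["error_description"]
        # 2. Otherwise look for a nested {"error": {"message": ...}} payload.
        error = result.get("error")
        if isinstance(error, dict) and "message" in error:
            return error["message"]
        # 3. Otherwise fall back to a flat "error_msg" field.
        if "error_msg" in result:
            return result["error_msg"]
    # 4. Nothing matched: the raw payload itself becomes the message.
    return result
```

Any payload that matches none of the first three shapes, including a plain string, is passed through unchanged.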

# API Usage Examples

## To start a new game:
1. Make sure you have implemented your own "guess" method.
2. Use the access_token that we sent you to create your HangmanAPI object. 
3. Start a game by calling the "start_game" method.
4. If you wish to test your function without being recorded, set the "practice" parameter to 1.
5. Note: You have a rate limit of 20 new games per minute. DO NOT start more than 20 new games within one minute.
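
One way to respect the limit in step 5 is to pace game starts on the client side. The sketch below is an illustration only: `GameRateLimiter` and its parameters are hypothetical names that are not part of the provided API, and it assumes the limit applies to a sliding 60-second window, which the instructions do not specify.

```python
import time
from collections import deque


class GameRateLimiter:
    """Hypothetical helper: allow at most `max_games` starts per `window` seconds."""

    def __init__(self, max_games=20, window=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_games = max_games
        self.window = window
        self.clock = clock      # injectable so the class can be tested without waiting
        self.sleep = sleep
        self.starts = deque()   # timestamps of recent game starts, oldest first

    def wait_for_slot(self):
        """Block until another game may start, then record the start time."""
        while True:
            now = self.clock()
            # Forget starts that have already left the window.
            while self.starts and now - self.starts[0] >= self.window:
                self.starts.popleft()
            if len(self.starts) < self.max_games:
                break
            # Window is full: sleep until the oldest start expires.
            self.sleep(self.window - (now - self.starts[0]))
        self.starts.append(now)
```

A possible usage is to create one `GameRateLimiter()` and call its `wait_for_slot()` immediately before each `api.start_game(...)` call.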

In [None]:
api = HangmanAPI(access_token="4252c648598f8d531973074650c262", timeout=2000)


Streaming output truncated to the last 5000 lines.
         27.]]) with shape torch.Size([1, 29])
guesses are : [0, 18, 19]
The guessed output num is 19 and letter is t
obscured input looks like : tensor([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0., 19., 27.,  1., 27., 27., 27.,  1.,
         27.]]) with shape torch.Size([1, 29])
guesses are : [0, 18, 19, 3]
The guessed output num is 3 and letter is d
obscured input looks like : tensor([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0., 19., 27.,  1., 27., 27., 27.,  1.,
         27.]]) with shape torch.Size([1, 29])
guesses are : [0, 18, 19, 3, 19]
The guessed output num is 19 and letter is t
obscured input looks like : tensor([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0., 19., 27.,  1., 27., 27., 27.,  1.,
         27.]]

In [None]:
runs,_,_,win = api.my_status()  # practice runs, recorded runs, recorded successes, practice successes
print('%d practice games played. practice success rate so far = %.3f' % (runs, win/max(runs, 1)))


## Playing practice games:
You can use the command below to play up to 100,000 practice games.

In [None]:
api.start_game(practice=1,verbose=True)
[total_practice_runs,total_recorded_runs,total_recorded_successes,total_practice_successes] = api.my_status() # Get my game stats: (# of tries, # of wins)
practice_success_rate = total_practice_successes / total_practice_runs
print('run %d practice games out of an allotted 100,000. practice success rate so far = %.3f' % (total_practice_runs, practice_success_rate))
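
To evaluate a guessing strategy over more than one game, the single call above can be wrapped in a loop. The following is a minimal sketch, assuming only that `start_game` returns `True` on a win and `False` otherwise (as the method above does); `play_practice_batch` is a hypothetical helper name and takes any zero-argument callable so it can be tried without the server.

```python
def play_practice_batch(start_game, n_games):
    """Play `n_games` via the callable `start_game` and return (wins, win_rate).

    `start_game` is any zero-argument callable that returns True on a win.
    """
    wins = 0
    for _ in range(n_games):
        if start_game():
            wins += 1
    # Avoid dividing by zero when no games were requested.
    win_rate = wins / n_games if n_games else 0.0
    return wins, win_rate
```

With the real API this might be called as `play_practice_batch(lambda: api.start_game(practice=1, verbose=False), 50)`, keeping the limit of 20 new games per minute in mind.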


## Playing recorded games:
Please finalize your code before running the cell below. Once this code has executed successfully, your submission is final and our system will not allow you to run any additional games.

After you successfully run this block of code, subsequent runs are expected to fail with the error message "Your account has been deactivated".

Once you've run this section of the code your submission is complete. Please send us your source code via email.

In [None]:
# for i in range(1000):
#     print('Playing ', i, ' th game')
#     # Uncomment the following line to execute your final runs. Do not do this until you are satisfied with your submission
#     #api.start_game(practice=0,verbose=False)
    
#     # DO NOT REMOVE as otherwise the server may lock you out for too high frequency of requests
#     time.sleep(0.5)

## To check your game statistics
1. Simply call the "my_status" method.
2. It returns four counts: practice runs, recorded runs, recorded successes, and practice successes.

In [None]:
[total_practice_runs,total_recorded_runs,total_recorded_successes,total_practice_successes] = api.my_status() # Get my game stats: (# of tries, # of wins)
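# Guard against division by zero before any recorded games have been played
success_rate = total_recorded_successes/total_recorded_runs if total_recorded_runs else 0.0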
print('overall success rate = %.3f' % success_rate)
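
The four values can also be given names so that practice and recorded rates are computed together. This is a sketch under the assumption that `my_status` returns the counts in the order unpacked in the cell above; `Status` and `success_rates` are hypothetical names, not part of the provided API.

```python
from collections import namedtuple

# Field order follows the unpacking used with api.my_status() in this notebook.
Status = namedtuple(
    "Status",
    ["practice_runs", "recorded_runs", "recorded_successes", "practice_successes"],
)


def success_rates(status_values):
    """Return (practice_rate, recorded_rate); a rate is None when no games were run."""
    s = Status(*status_values)
    practice = s.practice_successes / s.practice_runs if s.practice_runs else None
    recorded = s.recorded_successes / s.recorded_runs if s.recorded_runs else None
    return practice, recorded
```

Returning `None` rather than `0.0` for an empty category keeps "no games played yet" distinguishable from "every game lost".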