# Trexquant Interview Project (The Hangman Game)

* Copyright Trexquant Investment LP. All Rights Reserved. 
* Redistribution of this question without written consent from Trexquant is prohibited

## Instruction:
For this coding test, your mission is to write an algorithm that plays the game of Hangman through our API server. 

When a user plays Hangman, the server first selects a secret word at random from a list. The server then returns a row of space-separated underscores, one for each letter in the secret word, and asks the user to guess a letter. If the user guesses a letter that is in the word, the word is redisplayed with all instances of that letter shown in the correct positions, along with any letters correctly guessed on previous turns. If the letter does not appear in the word, the user is charged with an incorrect guess. The user keeps guessing letters until either (1) the user has correctly guessed all the letters in the word or (2) the user has made six incorrect guesses.
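
The rules above can be sketched as a small local simulator. This is an illustration only, not the server's implementation: `play_hangman`, `guess_fn`, and `max_wrong` are hypothetical names, and charging a repeated letter as an incorrect guess is an assumption made here to keep the loop finite.

```python
def play_hangman(secret, guess_fn, max_wrong=6):
    """Play one game locally; return True if the word is fully revealed."""
    guessed = []
    wrong = 0
    while wrong < max_wrong:
        # The masked word is shown as space-separated characters, e.g. "_ p p _ e"
        masked = " ".join(c if c in guessed else "_" for c in secret)
        if "_" not in masked:
            return True
        letter = guess_fn(masked, guessed)
        # Assumption: a repeated letter is charged like an absent one
        if letter in guessed or letter not in secret:
            wrong += 1
        guessed.append(letter)
    return False
```

Any function that accepts the masked string and the list of letters guessed so far can be passed as `guess_fn`, which makes it possible to compare strategies offline before using the API.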

You are required to write a "guess" function that takes the current word (with underscores) as input and returns a letter to guess. You will use the API code below to play 1,000 Hangman games. You can play practice games before you start recording your results.

Your algorithm is permitted to use a training set of approximately 250,000 dictionary words. Your algorithm will be tested on an entirely disjoint set of 250,000 dictionary words. Please note that this means the words that you will ultimately be tested on do NOT appear in the dictionary that you are given. You are not permitted to use any dictionary other than the training dictionary we provided. This requirement will be strictly enforced by code review.
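
Because the test words do not appear in the training dictionary, it can be informative to measure a strategy on words it has never seen. A minimal sketch, assuming the training words are already loaded into a list; `split_dictionary` and the 10% holdout fraction are illustrative choices, not part of the assignment.

```python
import random

def split_dictionary(words, holdout_fraction=0.1, seed=0):
    """Shuffle the unique words and split them into disjoint train/validation lists."""
    unique = sorted(set(words))
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(unique)
    n_holdout = int(len(unique) * holdout_fraction)
    return unique[n_holdout:], unique[:n_holdout]
```

Statistics would then be built from the first list only, with the second list serving as a stand-in for the unseen test words.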

You are provided with a basic, working algorithm. This algorithm matches the provided masked string (e.g. `a _ _ l e`) against all possible words in the dictionary, tabulates the frequency of letters appearing in these possible words, and then guesses the letter with the highest frequency of appearance that has not already been guessed. If no remaining words match, it falls back to the character frequency distribution of the entire dictionary.
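
The benchmark described above can be sketched roughly as follows. This is a reconstruction from the description, not the provided code: `baseline_guess` and its parameters are hypothetical, and counting a letter once per occurrence (rather than once per word) is an assumption.

```python
import re
from collections import Counter

def baseline_guess(masked, guessed, dictionary):
    """Guess the most frequent unguessed letter among dictionary words matching the mask."""
    # "a _ _ l e" -> "a..le": drop the spaces and turn blanks into regex wildcards
    pattern = re.compile(masked.replace(" ", "").replace("_", "."))
    candidates = [w for w in dictionary if pattern.fullmatch(w)]
    # Fall back to the whole dictionary when no word matches
    pool = candidates or dictionary
    counts = Counter("".join(pool))
    for letter, _ in counts.most_common():
        if letter not in guessed:
            return letter
    # Every letter seen in the pool was already guessed; pick any remaining letter
    return next(c for c in "abcdefghijklmnopqrstuvwxyz" if c not in guessed)
```

Since the test words are absent from the training dictionary, whole-word matching of this kind often ends up in the fallback branch, which is one plausible reason the benchmark's success rate is low.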

This benchmark strategy is successful approximately 18% of the time. Your task is to design an algorithm that significantly outperforms this benchmark.

In [1]:
import json
import requests
import random
import string
import secrets
import time
import re
import collections
from collections import defaultdict, Counter
try:
    from urllib.parse import parse_qs, urlencode, urlparse
except ImportError:
    from urlparse import parse_qs, urlparse
    from urllib import urlencode

In [5]:
HANGMAN_URL = "https://www.trexsim.com/trexsim/hangman"

class HangmanAPI(object):
    def __init__(self, access_token=None, session=None, timeout=None):
        self.access_token = access_token
        self.session = session or requests.Session()
        self.timeout = timeout
        self.guessed_letters = []
        
        full_dictionary_location = "words_250000_train.txt"
        self.full_dictionary = self.build_dictionary(full_dictionary_location)        
        self.full_dictionary_common_letter_sorted = collections.Counter("".join(self.full_dictionary)).most_common()
        
        self.current_dictionary = []
        
        ## Initializing the n-gram counter dictionaries
        self.unigram_counts = defaultdict(Counter)
        self.bigram_counts_second = defaultdict(Counter); self.bigram_counts_first = defaultdict(Counter)
        self.trigram_counts_third = defaultdict(Counter); self.trigram_counts_second = defaultdict(Counter); self.trigram_counts_first = defaultdict(Counter)    
        self.fourgram_counts_first = defaultdict(Counter); self.fourgram_counts_second = defaultdict(Counter)     
        self.fourgram_counts_third = defaultdict(Counter); self.fourgram_counts_fourth = defaultdict(Counter) 
        self.fivegram_counts_first = defaultdict(Counter); self.fivegram_counts_second = defaultdict(Counter); self.fivegram_counts_third = defaultdict(Counter)  
        self.fivegram_counts_fourth = defaultdict(Counter); self.fivegram_counts_fifth = defaultdict(Counter)
        self.sixgram_counts_first = defaultdict(Counter); self.sixgram_counts_second = defaultdict(Counter); self.sixgram_counts_third = defaultdict(Counter)  
        self.sixgram_counts_fourth = defaultdict(Counter); self.sixgram_counts_fifth = defaultdict(Counter); self.sixgram_counts_sixth = defaultdict(Counter)
        self.sevengram_counts_first = defaultdict(Counter); self.sevengram_counts_second = defaultdict(Counter); self.sevengram_counts_third = defaultdict(Counter)
        self.sevengram_counts_fourth = defaultdict(Counter); self.sevengram_counts_fifth = defaultdict(Counter); self.sevengram_counts_sixth = defaultdict(Counter)
        self.sevengram_counts_seventh = defaultdict(Counter)
        
        ## Build ngram model
        self.sevengram()

    def sevengram(self):
        
        # Read the training words, one per line
        with open("words_250000_train.txt") as word_file:
            corpus = [line.strip() for line in word_file]
        
        # Count letter occurrences, grouped by word length
        for word in corpus:
            length = len(word)
            for char in word:
                # indexed as [word length][character]
                self.unigram_counts[length][char] += 1

        for key in self.unigram_counts.keys():
            if not len(self.unigram_counts[key]) == 26:
                add_char = set(string.ascii_lowercase) - set(list(self.unigram_counts[key].keys()))

                for char in add_char:
                    self.unigram_counts[key][char] = 0


        for word in corpus:
            word = "$$$$$$" + word + "######"

            # generate a list of bigrams
            bigram_list = zip(word, word[1:])
            # generate a list of trigrams
            trigram_list = zip(word, word[1:], word[2:])
            # generate a list of fourgrams
            fourgram_list = zip(word, word[1:], word[2:], word[3:])
            # generate a list of fivegrams
            fivegram_list = zip(word, word[1:], word[2:], word[3:], word[4:])
            # generate a list of sixgrams
            sixgram_list = zip(word, word[1:], word[2:], word[3:], word[4:], word[5:])
            # generate a list of sevengrams
            sevengram_list = zip(word, word[1:], word[2:], word[3:], word[4:], word[5:], word[6:])

            # iterate over bigrams
            for bigram in bigram_list:
                first, second = bigram
                self.bigram_counts_second[first][second] += 1
                self.bigram_counts_first[second][first] += 1

            # iterate over trigrams
            for trigram in trigram_list:
                first, second, third = trigram
                self.trigram_counts_third[first+second][third] += 1
                self.trigram_counts_second[first+third][second] += 1
                self.trigram_counts_first[second+third][first] += 1

            # iterate over fourgrams
            for fourgram in fourgram_list:
                first, second, third, fourth = fourgram
                self.fourgram_counts_fourth[first+second+third][fourth] += 1
                self.fourgram_counts_third[first+second+fourth][third] += 1
                self.fourgram_counts_second[first+third+fourth][second] += 1
                self.fourgram_counts_first[second+third+fourth][first] += 1
                
            # iterate over fivegrams
            for fivegram in fivegram_list:
                first, second, third, fourth, fifth = fivegram
                self.fivegram_counts_fifth[first+second+third+fourth][fifth] += 1
                self.fivegram_counts_fourth[first+second+third+fifth][fourth] += 1
                self.fivegram_counts_third[first+second+fourth+fifth][third] += 1
                self.fivegram_counts_second[first+third+fourth+fifth][second] += 1
                self.fivegram_counts_first[second+third+fourth+fifth][first] += 1
                
            # iterate over sixgrams
            for sixgram in sixgram_list:
                first, second, third, fourth, fifth, sixth = sixgram
                self.sixgram_counts_sixth[first+second+third+fourth+fifth][sixth] += 1
                self.sixgram_counts_fifth[first+second+third+fourth+sixth][fifth] += 1
                self.sixgram_counts_fourth[first+second+third+fifth+sixth][fourth] += 1
                self.sixgram_counts_third[first+second+fourth+fifth+sixth][third] += 1
                self.sixgram_counts_second[first+third+fourth+fifth+sixth][second] += 1
                self.sixgram_counts_first[second+third+fourth+fifth+sixth][first] += 1
                
            # iterate over sevengrams
            for sevengram in sevengram_list:
                first, second, third, fourth, fifth, sixth, seventh = sevengram
                self.sevengram_counts_seventh[first+second+third+fourth+fifth+sixth][seventh] += 1
                self.sevengram_counts_sixth[first+second+third+fourth+fifth+seventh][sixth] += 1
                self.sevengram_counts_fifth[first+second+third+fourth+sixth+seventh][fifth] += 1
                self.sevengram_counts_fourth[first+second+third+fifth+sixth+seventh][fourth] += 1
                self.sevengram_counts_third[first+second+fourth+fifth+sixth+seventh][third] += 1
                self.sevengram_counts_second[first+third+fourth+fifth+sixth+seventh][second] += 1
                self.sevengram_counts_first[second+third+fourth+fifth+sixth+seventh][first] += 1
    
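    # Calculate the n-gram probability of char given the context key
    def ngram_prob(self, key, char, ngram_counts):
        # .get() avoids inserting an empty Counter into the defaultdict for unseen contexts
        counts = ngram_counts.get(key)
        if not counts:
            return 0
        total = sum(counts.values())
        if total == 0:
            return 0
        return counts[char] / float(total)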

    def sevengram_guesser(self, mask, guessed):
        # available holds the letters that have not been guessed yet, in alphabetical
        # order so that ties between equally scored letters are broken deterministically
        available = [c for c in string.ascii_lowercase if c not in guessed]

        # The probabilities of available character
        sevengram_probs = []
        n = len(mask)

        # Pad both ends with six boundary markers so that lookups from mask[index - 6]
        # to mask[index + 6] stay in range, even for a one-letter word. The markers match
        # the '$' and '#' padding used when the n-gram counts were built.
        mask = ['$'] * 6 + mask + ['#'] * 6

        for char in available:
            char_prob = 0
            for index in range(6,n+6):
                prob1, prob2, prob3, prob4, prob5, prob6, prob7 = 0, 0, 0, 0, 0, 0, 0

                # This position has not been revealed yet
                if mask[index] == '_':

                    # Case 1
                    if not mask[index+1] == '_':
                        if not mask[index+2] == '_':
                            if not mask[index+3] == '_':
                                if not mask[index+4] == '_':
                                    if not mask[index+5] == '_':
                                        if not mask[index+6] == '_':
                                            prob1 = self.ngram_prob(mask[index+1]+mask[index+2]+mask[index+3]+mask[index+4]+mask[index+5]+mask[index+6], char, self.sevengram_counts_first)
                                        else:
                                            prob1 = self.ngram_prob(mask[index+1]+mask[index+2]+mask[index+3]+mask[index+4]+mask[index+5], char, self.sixgram_counts_first)
                                    else:
                                        prob1 = self.ngram_prob(mask[index+1]+mask[index+2]+mask[index+3]+mask[index+4], char, self.fivegram_counts_first)
                                else:
                                    prob1 = self.ngram_prob(mask[index+1]+mask[index+2]+mask[index+3], char, self.fourgram_counts_first)
                            else:
                                prob1 = self.ngram_prob(mask[index+1]+mask[index+2], char, self.trigram_counts_first)
                        else:
                            prob1 = self.ngram_prob(mask[index+1], char, self.bigram_counts_first)
                    else:
                        prob1 = self.ngram_prob(n, char, self.unigram_counts)

                    # Case 2
                    if not mask[index-1] == '_':
                        if not mask[index+1] == '_':
                            if not mask[index+2] == '_':
                                if not mask[index+3] == '_':
                                    if not mask[index+4] == '_':
                                        if not mask[index+5] == '_':
                                            prob2 = self.ngram_prob(mask[index-1]+mask[index+1]+mask[index+2]+mask[index+3]+mask[index+4]+mask[index+5], char, self.sevengram_counts_second)
                                        else:
                                            prob2 = self.ngram_prob(mask[index-1]+mask[index+1]+mask[index+2]+mask[index+3]+mask[index+4], char, self.sixgram_counts_second)
                                    else:
                                        prob2 = self.ngram_prob(mask[index-1]+mask[index+1]+mask[index+2]+mask[index+3], char, self.fivegram_counts_second)
                                else:
                                    prob2 = self.ngram_prob(mask[index-1]+mask[index+1]+mask[index+2], char, self.fourgram_counts_second)
                            else:
                                prob2 = self.ngram_prob(mask[index-1]+mask[index+1], char, self.trigram_counts_second)
                        else:
                            prob2 = self.ngram_prob(mask[index-1], char, self.bigram_counts_second)
                    else:
                        if not mask[index+1] == '_':
                            if not mask[index+2] == '_':
                                if not mask[index+3] == '_':
                                    if not mask[index+4] == '_':
                                        if not mask[index+5] == '_':
                                            prob2 = self.ngram_prob(mask[index+1]+mask[index+2]+mask[index+3]+mask[index+4]+mask[index+5], char, self.sixgram_counts_first)
                                        else:
                                            prob2 = self.ngram_prob(mask[index+1]+mask[index+2]+mask[index+3]+mask[index+4], char, self.fivegram_counts_first)
                                    else:
                                        prob2 = self.ngram_prob(mask[index+1]+mask[index+2]+mask[index+3], char, self.fourgram_counts_first)
                                else:
                                    prob2 = self.ngram_prob(mask[index+1]+mask[index+2], char, self.trigram_counts_first)
                            else:
                                prob2 = self.ngram_prob(mask[index+1], char, self.bigram_counts_first)
                        else:
                            prob2 = self.ngram_prob(n, char, self.unigram_counts)

                    # Case 3
                    if not mask[index-2] == '_':
                        if not mask[index-1] == '_':
                            if not mask[index+1] == '_':
                                if not mask[index+2] == '_':
                                    if not mask[index+3] == '_':
                                        if not mask[index+4] == '_':
                                            prob3 = self.ngram_prob(mask[index-2]+mask[index-1]+mask[index+1]+mask[index+2]+mask[index+3]+mask[index+4], char, self.sevengram_counts_third)
                                        else:
                                            prob3 = self.ngram_prob(mask[index-2]+mask[index-1]+mask[index+1]+mask[index+2]+mask[index+3], char, self.sixgram_counts_third)
                                    else:
                                        prob3 = self.ngram_prob(mask[index-2]+mask[index-1]+mask[index+1]+mask[index+2], char, self.fivegram_counts_third)
                                else:
                                    prob3 = self.ngram_prob(mask[index-2]+mask[index-1]+mask[index+1], char, self.fourgram_counts_third)
                            else:
                                prob3 = self.ngram_prob(mask[index-2]+mask[index-1], char, self.trigram_counts_third)
                        else:
                            if not mask[index+1] == '_':
                                if not mask[index+2] == '_':
                                    if not mask[index+3] == '_':
                                        if not mask[index+4] == '_':
                                            prob3 = self.ngram_prob(mask[index+1]+mask[index+2]+mask[index+3]+mask[index+4], char, self.fivegram_counts_first)
                                        else:
                                            prob3 = self.ngram_prob(mask[index+1]+mask[index+2]+mask[index+3], char, self.fourgram_counts_first)
                                    else:
                                        prob3 = self.ngram_prob(mask[index+1]+mask[index+2], char, self.trigram_counts_first)
                                else:
                                    prob3 = self.ngram_prob(mask[index+1], char, self.bigram_counts_first)
                            else:
                                prob3 = self.ngram_prob(n, char, self.unigram_counts)
                    else:
                        if not mask[index-1] == '_':
                            if not mask[index+1] == '_':
                                if not mask[index+2] == '_':
                                    if not mask[index+3] == '_':
                                        if not mask[index+4] == '_':
                                            prob3 = self.ngram_prob(mask[index-1]+mask[index+1]+mask[index+2]+mask[index+3]+mask[index+4], char, self.sixgram_counts_second)
                                        else:
                                            prob3 = self.ngram_prob(mask[index-1]+mask[index+1]+mask[index+2]+mask[index+3], char, self.fivegram_counts_second)
                                    else:
                                        prob3 = self.ngram_prob(mask[index-1]+mask[index+1]+mask[index+2], char, self.fourgram_counts_second)
                                else:
                                    prob3 = self.ngram_prob(mask[index-1]+mask[index+1], char, self.trigram_counts_second)
                            else:
                                prob3 = self.ngram_prob(mask[index-1], char, self.bigram_counts_second)
                        else:
                            if not mask[index+1] == '_':
                                if not mask[index+2] == '_':
                                    if not mask[index+3] == '_':
                                        if not mask[index+4] == '_':
                                            prob3 = self.ngram_prob(mask[index+1]+mask[index+2]+mask[index+3]+mask[index+4], char, self.fivegram_counts_first)
                                        else:
                                            prob3 = self.ngram_prob(mask[index+1]+mask[index+2]+mask[index+3], char, self.fourgram_counts_first)
                                    else:
                                        prob3 = self.ngram_prob(mask[index+1]+mask[index+2], char, self.trigram_counts_first)
                                else:
                                    prob3 = self.ngram_prob(mask[index+1], char, self.bigram_counts_first)
                            else:
                                prob3 = self.ngram_prob(n, char, self.unigram_counts)

                    # Case 4
                    if not mask[index-1] == '_':
                        if not mask[index-2] == '_':
                            if not mask[index-3] == '_':
                                if not mask[index+1] == '_':
                                    if not mask[index+2] == '_':
                                        if not mask[index+3] == '_':
                                            prob4 = self.ngram_prob(mask[index-3]+mask[index-2]+mask[index-1]+mask[index+1]+mask[index+2]+mask[index+3], char, self.sevengram_counts_fourth)
                                        else:
                                            prob4 = self.ngram_prob(mask[index-3]+mask[index-2]+mask[index-1]+mask[index+1]+mask[index+2], char, self.sixgram_counts_fourth)
                                    else:
                                        prob4 = self.ngram_prob(mask[index-3]+mask[index-2]+mask[index-1]+mask[index+1], char, self.fivegram_counts_fourth)
                                else:
                                    prob4 = self.ngram_prob(mask[index-3]+mask[index-2]+mask[index-1], char, self.fourgram_counts_fourth)
                            else:
                                if not mask[index+1] == '_':
                                    if not mask[index+2] == '_':
                                        if not mask[index+3] == '_':
                                            prob4 = self.ngram_prob(mask[index-2]+mask[index-1]+mask[index+1]+mask[index+2]+mask[index+3], char, self.sixgram_counts_third)
                                        else:
                                            prob4 = self.ngram_prob(mask[index-2]+mask[index-1]+mask[index+1]+mask[index+2], char, self.fivegram_counts_third)
                                    else:
                                        prob4 = self.ngram_prob(mask[index-2]+mask[index-1]+mask[index+1], char, self.fourgram_counts_third)
                                else:
                                    prob4 = self.ngram_prob(mask[index-2]+mask[index-1], char, self.trigram_counts_third)
                        else:
                            if not mask[index+1] == '_':
                                if not mask[index+2] == '_':
                                    if not mask[index+3] == '_':
                                        prob4 = self.ngram_prob(mask[index-1]+mask[index+1]+mask[index+2]+mask[index+3], char, self.fivegram_counts_second)
                                    else:
                                        prob4 = self.ngram_prob(mask[index-1]+mask[index+1]+mask[index+2], char, self.fourgram_counts_second)
                                else:
                                    prob4 = self.ngram_prob(mask[index-1]+mask[index+1], char, self.trigram_counts_second)
                            else:
                                prob4 = self.ngram_prob(mask[index-1], char, self.bigram_counts_second)
                    else:
                        if not mask[index+1] == '_':
                            if not mask[index+2] == '_':
                                if not mask[index+3] == '_':
                                    prob4 = self.ngram_prob(mask[index+1]+mask[index+2]+mask[index+3], char, self.fourgram_counts_first)
                                else:
                                    prob4 = self.ngram_prob(mask[index+1]+mask[index+2], char, self.trigram_counts_first)
                            else:
                                prob4 = self.ngram_prob(mask[index+1], char, self.bigram_counts_first)
                        else:
                            prob4 = self.ngram_prob(n, char, self.unigram_counts)                                

                    # Case 5
                    if not mask[index+2] == '_':
                        if not mask[index+1] == '_':
                            if not mask[index-1] == '_':
                                if not mask[index-2] == '_':
                                    if not mask[index-3] == '_':
                                        if not mask[index-4] == '_':
                                            prob5 = self.ngram_prob(mask[index-4]+mask[index-3]+mask[index-2]+mask[index-1]+mask[index+1]+mask[index+2], char, self.sevengram_counts_fifth)
                                        else:
                                            prob5 = self.ngram_prob(mask[index-3]+mask[index-2]+mask[index-1]+mask[index+1]+mask[index+2], char, self.sixgram_counts_fourth)
                                    else:
                                        prob5 = self.ngram_prob(mask[index-2]+mask[index-1]+mask[index+1]+mask[index+2], char, self.fivegram_counts_third)
                                else:
                                    prob5 = self.ngram_prob(mask[index-1]+mask[index+1]+mask[index+2], char, self.fourgram_counts_second)
                            else:
                                prob5 = self.ngram_prob(mask[index+1]+mask[index+2], char, self.trigram_counts_first)
                        else:
                            if not mask[index-1] == '_':
                                if not mask[index-2] == '_':
                                    if not mask[index-3] == '_':
                                        if not mask[index-4] == '_':
                                            prob5 = self.ngram_prob(mask[index-4]+mask[index-3]+mask[index-2]+mask[index-1], char, self.fivegram_counts_fifth)
                                        else:
                                            prob5 = self.ngram_prob(mask[index-3]+mask[index-2]+mask[index-1], char, self.fourgram_counts_fourth)
                                    else:
                                        prob5 = self.ngram_prob(mask[index-2]+mask[index-1], char, self.trigram_counts_third)
                                else:
                                    prob5 = self.ngram_prob(mask[index-1], char, self.bigram_counts_second)
                            else:
                                prob5 = self.ngram_prob(n, char, self.unigram_counts)
                    else:
                        if not mask[index+1] == '_':
                            if not mask[index-1] == '_':
                                if not mask[index-2] == '_':
                                    if not mask[index-3] == '_':
                                        if not mask[index-4] == '_':
                                            prob5 = self.ngram_prob(mask[index-4]+mask[index-3]+mask[index-2]+mask[index-1]+mask[index+1], char, self.sixgram_counts_fifth)
                                        else:
                                            prob5 = self.ngram_prob(mask[index-3]+mask[index-2]+mask[index-1]+mask[index+1], char, self.fivegram_counts_fourth)
                                    else:
                                        prob5 = self.ngram_prob(mask[index-2]+mask[index-1]+mask[index+1], char, self.fourgram_counts_third)
                                else:
                                    prob5 = self.ngram_prob(mask[index-1]+mask[index+1], char, self.trigram_counts_second)
                            else:
                                prob5 = self.ngram_prob(mask[index+1], char, self.bigram_counts_first)
                        else:
                            if not mask[index-1] == '_':
                                if not mask[index-2] == '_':
                                    if not mask[index-3] == '_':
                                        if not mask[index-4] == '_':
                                            prob5 = self.ngram_prob(mask[index-4]+mask[index-3]+mask[index-2]+mask[index-1], char, self.fivegram_counts_fifth)
                                        else:
                                            prob5 = self.ngram_prob(mask[index-3]+mask[index-2]+mask[index-1], char, self.fourgram_counts_fourth)
                                    else:
                                        prob5 = self.ngram_prob(mask[index-2]+mask[index-1], char, self.trigram_counts_third)
                                else:
                                    prob5 = self.ngram_prob(mask[index-1], char, self.bigram_counts_second)
                            else:
                                prob5 = self.ngram_prob(n, char, self.unigram_counts)

                    # Case 6
                    if not mask[index+1]  == '_':
                        if not mask[index-1] == '_':
                            if not mask[index-2] == '_':
                                if not mask[index-3] == '_':
                                    if not mask[index-4] == '_':
                                        if not mask[index-5] == '_':
                                            prob6 = self.ngram_prob(mask[index-5]+mask[index-4]+mask[index-3]+mask[index-2]+mask[index-1]+mask[index+1], char, self.sevengram_counts_sixth)
                                        else:
                                            prob6 = self.ngram_prob(mask[index-4]+mask[index-3]+mask[index-2]+mask[index-1]+mask[index+1], char, self.sixgram_counts_fifth)
                                    else:
                                        prob6 = self.ngram_prob(mask[index-3]+mask[index-2]+mask[index-1]+mask[index+1], char, self.fivegram_counts_fourth)
                                else:
                                    prob6 = self.ngram_prob(mask[index-2]+mask[index-1]+mask[index+1], char, self.fourgram_counts_third)
                            else:
                                prob6 = self.ngram_prob(mask[index-1]+mask[index+1], char, self.trigram_counts_second)
                        else:
                            prob6 = self.ngram_prob(mask[index+1], char, self.bigram_counts_first)
                    else:
                        if not mask[index-1] == '_':
                            if not mask[index-2] == '_':
                                if not mask[index-3] == '_':
                                    if not mask[index-4] == '_':
                                        if not mask[index-5] == '_':
                                            prob6 = self.ngram_prob(mask[index-5]+mask[index-4]+mask[index-3]+mask[index-2]+mask[index-1], char, self.sixgram_counts_sixth)
                                        else:
                                            prob6 = self.ngram_prob(mask[index-4]+mask[index-3]+mask[index-2]+mask[index-1], char, self.fivegram_counts_fifth)
                                    else:
                                        prob6 = self.ngram_prob(mask[index-3]+mask[index-2]+mask[index-1], char, self.fourgram_counts_fourth)
                                else:
                                    prob6 = self.ngram_prob(mask[index-2]+mask[index-1], char, self.trigram_counts_third)
                            else:
                                prob6 = self.ngram_prob(mask[index-1], char, self.bigram_counts_second)
                        else:
                            prob6 = self.ngram_prob(n, char, self.unigram_counts)

                    # Case 7
                    if not mask[index-1] == '_':
                        if not mask[index-2] == '_':
                            if not mask[index-3] == '_':
                                if not mask[index-4] == '_':
                                    if not mask[index-5] == '_':
                                        if not mask[index-6] == '_':
                                            prob7 = self.ngram_prob(mask[index-6]+mask[index-5]+mask[index-4]+mask[index-3]+mask[index-2]+mask[index-1], char, self.sevengram_counts_seventh)
                                        else:
                                            prob7 = self.ngram_prob(mask[index-5]+mask[index-4]+mask[index-3]+mask[index-2]+mask[index-1], char, self.sixgram_counts_sixth)
                                    else:
                                        prob7 = self.ngram_prob(mask[index-4]+mask[index-3]+mask[index-2]+mask[index-1], char, self.fivegram_counts_fifth)
                                else:
                                    prob7 = self.ngram_prob(mask[index-3]+mask[index-2]+mask[index-1], char, self.fourgram_counts_fourth)
                            else:
                                prob7 = self.ngram_prob(mask[index-2]+mask[index-1], char, self.trigram_counts_third)
                        else:
                            prob7 = self.ngram_prob(mask[index-1], char, self.bigram_counts_second)
                    else:
                        prob7 = self.ngram_prob(n, char, self.unigram_counts)                

                    # Choose max prob of all 7 cases
                    char_prob += max(prob1, prob2, prob3, prob4, prob5, prob6, prob7)

                # The case is that the character is guessed so we skip this position 
                else:
                    continue

            sevengram_probs.append(char_prob)

        # Return the character that has the maximum probability 
        return available[sevengram_probs.index(max(sevengram_probs))]       
        
    def guess(self, word): # word input example: "_ p p _ e "
        ###############################################
        # Replace with your own "guess" function here #
        ###############################################

        # clean the word so that we strip away the space characters
        clean_word = list(word[::2])
        
        guess_letter = self.sevengram_guesser(clean_word, self.guessed_letters)
        
        return guess_letter

    ##########################################################
    # You'll likely not need to modify any of the code below #
    ##########################################################
    
    def build_dictionary(self, dictionary_file_location):
        # Read the training dictionary, one word per line
        with open(dictionary_file_location, "r") as text_file:
            full_dictionary = text_file.read().splitlines()
        return full_dictionary
                
    def start_game(self, practice=True, verbose=True):
        # reset guessed letters to empty set and current plausible dictionary to the full dictionary
        self.guessed_letters = []
        self.current_dictionary = self.full_dictionary
                         
        response = self.request("/new_game", {"practice":practice})
        if response.get('status')=="approved":
            game_id = response.get('game_id')
            word = response.get('word')
            tries_remains = response.get('tries_remains')
            if verbose:
                print("Successfully started a new game! Game ID: {0}. # of tries remaining: {1}. Word: {2}.".format(game_id, tries_remains, word))
            while tries_remains>0:
                # get guessed letter from user code
                guess_letter = self.guess(word)
                    
                # append guessed letter to guessed letters field in hangman object
                self.guessed_letters.append(guess_letter)
                if verbose:
                    print("Guessing letter: {0}".format(guess_letter))
                    
                try:    
                    res = self.request("/guess_letter", {"request":"guess_letter", "game_id":game_id, "letter":guess_letter})
                except HangmanAPIError:
                    print('HangmanAPIError exception caught on request.')
                    continue
                except Exception as e:
                    print('Other exception caught on request.')
                    raise e
               
                if verbose:
                    print("Server response: {0}".format(res))
                status = res.get('status')
                tries_remains = res.get('tries_remains')
                if status=="success":
                    if verbose:
                        print("Successfully finished game: {0}".format(game_id))
                    return True
                elif status=="failed":
                    reason = res.get('reason', '# of tries exceeded!')
                    if verbose:
                        print("Failed game: {0}. Because of: {1}".format(game_id, reason))
                    return False
                elif status=="ongoing":
                    word = res.get('word')
        else:
            if verbose:
                print("Failed to start a new game")
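        # Every won game returns True inside the loop, so reaching this point
        # means the game was not won or never started (`status` may be unbound here).
        return False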
        
    def my_status(self):
        return self.request("/my_status", {})
    
    def request(
            self, path, args=None, post_args=None, method=None):
        if args is None:
            args = dict()
        if post_args is not None:
            method = "POST"

        # Add `access_token` to post_args or args if it has not already been
        # included.
        if self.access_token:
            # If post_args exists, we assume that args either does not exist
            # or does not need `access_token`.
            if post_args and "access_token" not in post_args:
                post_args["access_token"] = self.access_token
            elif "access_token" not in args:
                args["access_token"] = self.access_token

        num_retry, time_sleep = 5, 2
        for it in range(num_retry):
            try:
                response = self.session.request(
                    method or "GET",
                    HANGMAN_URL + path,
                    timeout=self.timeout,
                    params=args,
                    data=post_args
                )
                break
            except requests.HTTPError as e:
                # requests exceptions have no .read(); the error body lives on e.response
                response = e.response.json()
                raise HangmanAPIError(response)
            except requests.exceptions.SSLError as e:
                # Retry transient SSL failures, re-raising on the final attempt
                if it + 1 == num_retry:
                    raise
                time.sleep(time_sleep)

        headers = response.headers
        if 'json' in headers['content-type']:
            result = response.json()
        elif "access_token" in parse_qs(response.text):
            query_str = parse_qs(response.text)
            if "access_token" in query_str:
                result = {"access_token": query_str["access_token"][0]}
                if "expires" in query_str:
                    result["expires"] = query_str["expires"][0]
            else:
                raise HangmanAPIError(response.json())
        else:
            raise HangmanAPIError('Maintype was not text, or querystring')

        if result and isinstance(result, dict) and result.get("error"):
            raise HangmanAPIError(result)
        return result
    
class HangmanAPIError(Exception):
    def __init__(self, result):
        self.result = result
        self.code = None
        try:
            self.type = result["error_code"]
        except (KeyError, TypeError):
            self.type = ""

        try:
            self.message = result["error_description"]
        except (KeyError, TypeError):
            try:
                self.message = result["error"]["message"]
                self.code = result["error"].get("code")
                if not self.type:
                    self.type = result["error"].get("type", "")
            except (KeyError, TypeError):
                try:
                    self.message = result["error_msg"]
                except (KeyError, TypeError):
                    self.message = result

        Exception.__init__(self, self.message)
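
The nested fallbacks in `HangmanAPIError.__init__` are easier to follow on concrete payloads. The sketch below mirrors that lookup order in a standalone helper. The name `extract_error_message` and the sample payloads are hypothetical, chosen only to exercise each branch; they are not part of the provided API code and are not actual server responses.

```python
def extract_error_message(result):
    """Mirror the message lookup order used by HangmanAPIError (illustrative helper)."""
    try:
        return result["error_description"]
    except (KeyError, TypeError):
        pass
    try:
        return result["error"]["message"]
    except (KeyError, TypeError):
        pass
    try:
        return result["error_msg"]
    except (KeyError, TypeError):
        # Nothing matched: fall back to the raw payload itself.
        return result


# Hypothetical payloads, one per branch.
print(extract_error_message({"error_description": "bad token"}))
print(extract_error_message({"error": {"message": "rate limited", "code": 429}}))
print(extract_error_message({"error_msg": "try again"}))
print(extract_error_message({"error": "account deactivated"}))
print(extract_error_message("plain string"))
```

Note the fourth case: when the `"error"` value is a plain string rather than a dict, indexing it with `"message"` raises `TypeError`, so the lookup falls through and the exception message ends up being the whole payload.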

# API Usage Examples

## To start a new game:
1. Make sure you have implemented your own "guess" method.
2. Use the access_token that we sent you to create your HangmanAPI object.
3. Start a game by calling the "start_game" method.
4. If you wish to test your function without being recorded, set the "practice" parameter to 1.
5. Note: You have a rate limit of 20 new games per minute. DO NOT start more than 20 new games within one minute.
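
The rate limit in step 5 works out to one new game every 60 / 20 = 3 seconds. A minimal sketch of a throttle that enforces such a gap is shown below. `GameThrottle` is a hypothetical helper, not part of the provided API code, and how the server actually measures its one-minute window is not stated here, so a slightly larger interval may be safer.

```python
import time


class GameThrottle:
    """Enforce a minimum gap between new-game requests (illustrative helper)."""

    def __init__(self, max_per_minute=20, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute  # 3.0 s for 20 games per minute
        self._clock = clock
        self._sleep = sleep
        self._last_start = None

    def wait(self):
        """Block until at least min_interval has passed since the previous start."""
        now = self._clock()
        if self._last_start is not None:
            remaining = self.min_interval - (now - self._last_start)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_start = now
```

One possible use is to create `throttle = GameThrottle()` once and call `throttle.wait()` immediately before each `api.start_game(...)`, in addition to the `time.sleep(2)` calls already present in the loops.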

In [6]:
api = HangmanAPI(access_token="5fc72eea9f2b2f2acdc7e0ccafeaeb", timeout=2000)

## Playing practice games:
You can use the command below to play up to 100,000 practice games.

In [13]:
tot = 0
for i in range(3000):
    res = api.start_game(practice=1,verbose=False)
    if i==2999:
        [total_practice_runs,total_recorded_runs,total_recorded_successes] = api.my_status() # Get my game stats: (practice runs, recorded runs, recorded wins)
        print('run %d practice games out of an allotted 100,000' %total_practice_runs)
    if res: 
        tot+=1
    time.sleep(2)
print("Accuracy: ",tot/3000)


Accuracy:  1.0


## Playing recorded games:
Please finalize your code before running the cell below. Once this code has executed successfully, your submission is final: our system will not allow you to run any additional games.

After a successful run, any subsequent run is expected to fail with the error message "Your account has been deactivated".

Once you have run this section of the code, your submission is complete. Please send us your source code via email.

In [None]:
for i in range(1000):
    print('Playing ', i, ' th game')
    # Uncomment the following line to execute your final runs. Do not do this until you are satisfied with your submission
    res = api.start_game(practice=0,verbose=False)    
    
    # DO NOT REMOVE as otherwise the server may lock you out for too high frequency of requests
    time.sleep(2)

## To check your game statistics
1. Simply call the "my_status" method.
2. It returns your total number of practice games, your total number of recorded games, and your number of recorded wins.

In [17]:
[total_practice_runs,total_recorded_runs,total_recorded_successes] = api.my_status() # Get my game stats: (practice runs, recorded runs, recorded wins)
success_rate = total_recorded_successes/total_recorded_runs
print('overall success rate = %.3f' % success_rate)

overall success rate = 0.625
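
The division in the cell above fails with `ZeroDivisionError` if no recorded games have been played yet. A minimal sketch of a guarded version follows. It assumes `my_status` returns the three values unpacked above, in that order; `success_rate` is a hypothetical helper and the numbers are made up for illustration.

```python
def success_rate(status):
    """Return recorded wins / recorded runs, or None if no recorded games exist."""
    total_practice_runs, total_recorded_runs, total_recorded_successes = status
    if total_recorded_runs == 0:
        return None
    return total_recorded_successes / total_recorded_runs


# Made-up status lists for illustration only.
print(success_rate([100, 0, 0]))   # no recorded games yet
print(success_rate([100, 10, 6]))  # 6 wins out of 10 recorded games
```

With the real API, the call would be `success_rate(api.my_status())`, with a check for `None` before formatting the result.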
