# Hat game demo
In this notebook we:
1. download a sample corpus
1. train a few models on it
1. write a class that follows the AbstractPlayer conventions
1. finally, play the game between the local models (players) and one remote player :)

Note that the exact output may not be reproducible, since the remote player could change over time or even fail or time out at some point.

In [1]:
from pathlib import Path

import numpy as np
import pandas as pd
import fasttext
from sklearn.metrics.pairwise import cosine_similarity
from tqdm import tqdm

from the_hat_game.game import Game
from the_hat_game.players import PlayerDefinition, AbstractPlayer, RemotePlayer

pd.set_option('display.max_colwidth', 200)

[nltk_data] Downloading package wordnet to
[nltk_data]     /Users/olegpolivin/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package punkt to
[nltk_data]     /Users/olegpolivin/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


## Download sample text corpus

In [2]:
%%bash
cd texts
wget --quiet http://qwone.com/~jason/20Newsgroups/20news-19997.tar.gz
tar -zxf 20news-19997.tar.gz
rm 20news-19997.tar.gz

In [3]:
file_path = Path('texts/20-newsgroups.txt')
folder = Path('texts/20_newsgroups/')

In [4]:
with open(file_path, 'w', encoding='utf-8') as f_write:
    files = list(folder.rglob('*'))
    for object_path in tqdm(files):
        if object_path.is_dir():
            continue
        with open(object_path, encoding='latin-1') as stream:
            for line in stream:
                f_write.write(line)

100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 20017/20017 [00:02<00:00, 7542.79it/s]


In [5]:
!wc -w {file_path}

 6046669 texts/20-newsgroups.txt


In [6]:
!head -5 {file_path}

Newsgroups: talk.politics.mideast
Path: cantaloupe.srv.cs.cmu.edu!crabapple.srv.cs.cmu.edu!bb3.andrew.cmu.edu!news.sei.cmu.edu!cis.ohio-state.edu!zaphod.mps.ohio-state.edu!cs.utexas.edu!uunet!brunix!doorknob!hm
From: hm@cs.brown.edu (Harry Mamaysky)
Subject: Heil Hernlem 
In-Reply-To: hernlem@chess.ncsu.edu's message of Wed, 14 Apr 1993 12:58:13 GMT


## Train several models

In [7]:
%%time

model_skipgram = fasttext.train_unsupervised(str(file_path), model='skipgram', dim=5)
model_cbow = fasttext.train_unsupervised(str(file_path), model='cbow', dim=16)
model_skipgram2 = fasttext.train_unsupervised(str(file_path), model='skipgram', dim=10)

Read 7M words
Number of words:  72228
Number of labels: 0
Progress: 100.0% words/sec/thread:  178952 lr:  0.000000 avg.loss:  1.811777 ETA:   0h 0m 0s
Read 7M words
Number of words:  72228
Number of labels: 0
Progress: 100.0% words/sec/thread:  310643 lr:  0.000000 av

CPU times: user 6min 10s, sys: 5.4 s, total: 6min 16s
Wall time: 2min 50s



Progress: 100.0% words/sec/thread:  187530 lr: -0.000000 avg.loss:  1.744667 ETA:   0h 0m 0s
Progress: 100.0% words/sec/thread:  187529 lr:  0.000000 avg.loss:  1.744667 ETA:   0h 0m 0s


In [8]:
model_skipgram.words[1]

'the'

In [9]:
len(model_skipgram.words)

72228

In [10]:
model_skipgram['song']

array([-1.5538737 , -0.27101302,  0.41967082, -0.797175  ,  0.45186338],
      dtype=float32)
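
A single vector like the one above is hard to read on its own; what matters for the game is how close two vectors are. The sketch below computes cosine similarity with plain NumPy. The `song` values are copied from the output above, while `other` is a made-up vector used only for illustration (it does not come from the trained model), and the helper name `cosine` is an assumption of this sketch.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


# vector for 'song' as printed by model_skipgram['song'] above
song = np.array([-1.5538737, -0.27101302, 0.41967082, -0.797175, 0.45186338])
# hypothetical second vector, NOT taken from the model
other = np.array([-1.2, -0.3, 0.5, -0.6, 0.4])

print(cosine(song, song))   # a vector compared with itself
print(cosine(song, other))  # similarity to the made-up vector
```

With real models, the same comparison could be run on `model_skipgram['song']` and the vector of any other word.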

In [11]:
!mkdir -p models
model_skipgram.save_model('models/skipgram.model')
model_skipgram2.save_model('models/skipgram2.model')
model_cbow.save_model('models/cbow.model')


In [12]:
!ls -lh models

total 2190872
-rw-r--r--  1 olegpolivin  staff   787M Jun 10 14:47 2021_06_05_processed.model
-rw-r--r--  1 olegpolivin  staff   132M Nov  9 00:03 cbow.model
-rw-r--r--  1 olegpolivin  staff    42M Nov  9 00:03 skipgram.model
-rw-r--r--  1 olegpolivin  staff    83M Nov  9 00:03 skipgram2.model


## Players' classes for fasttext models

In [13]:
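class LocalFasttextPlayer(AbstractPlayer):
    """Player backed by a locally trained fastText model."""

    def __init__(self, model):
        self.model = model

    def find_words_for_sentence(self, sentence, n_closest):
        # get_nearest_neighbors returns (similarity, word) pairs, closest first
        neighbours = self.model.get_nearest_neighbors(sentence, k=n_closest)
        words = [word for similarity, word in neighbours]
        return words

    def explain(self, word, n_words):
        # explain the target word with its nearest neighbours
        return self.find_words_for_sentence(word, n_words)

    def guess(self, words, n_words):
        # treat the clue words as a single space-joined query
        return self.find_words_for_sentence(' '.join(words), n_words)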
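
The player above passes the space-joined clue words to the model as one query. One possible alternative, shown here only as a sketch, is to average the clue vectors and rank the vocabulary by cosine similarity to that mean. To stay self-contained the example uses a tiny hand-written vocabulary (`TOY_VECTORS`, entirely made up) instead of a fastText model, and the class `AveragingPlayer` is a hypothetical name that does not subclass `AbstractPlayer`; in the notebook it would do so and read its vectors from a trained model.

```python
import numpy as np

# made-up 3-d vectors for illustration only, NOT produced by any trained model
TOY_VECTORS = {
    'job': [1.0, 0.1, 0.0],
    'office': [0.9, 0.2, 0.0],
    'work': [0.95, 0.15, 0.05],
    'song': [0.0, 1.0, 0.1],
    'music': [0.05, 0.9, 0.2],
}


class AveragingPlayer:
    """Toy player: ranks words by cosine similarity to the mean query vector."""

    def __init__(self, vectors):
        self.words = list(vectors)
        matrix = np.array([vectors[w] for w in self.words], dtype=float)
        # normalise rows so that a dot product equals cosine similarity
        self.matrix = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)

    def _closest(self, query_words, n_words):
        known = [w for w in query_words if w in self.words]
        if not known:
            return []  # nothing to compare against
        rows = [self.words.index(w) for w in known]
        mean = self.matrix[rows].mean(axis=0)
        scores = self.matrix @ mean  # proportional to cosine similarity
        order = np.argsort(-scores)  # best match first
        ranked = [self.words[i] for i in order if self.words[i] not in known]
        return ranked[:n_words]

    def explain(self, word, n_words):
        return self._closest([word], n_words)

    def guess(self, words, n_words):
        return self._closest(words, n_words)


player = AveragingPlayer(TOY_VECTORS)
print(player.explain('song', 1))
print(player.guess(['job', 'office'], 2))
```

Query words are excluded from the result so that a player never explains a word with itself; whether averaging actually beats the joined-string query on this corpus is not something this notebook measures.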

In [14]:
# check remotely deployed service
remote_player = RemotePlayer('https://obscure-everglades-02893.herokuapp.com')

print(remote_player.explain('work', 10))
print(remote_player.guess(['job', 'employee', 'office'], 5))

['work', 'discontent', 'probably:', 'lopid', 'gives', 'putty', 'refund', 'strangest', 'enuff', 'inovative']
{'word_list': ['bars;', 'earnings', 'appellate', 'discoverd', 'phage'], 'time': 1.49043, 'code': 200}


In [15]:
local_player = LocalFasttextPlayer(model_skipgram)
print(local_player.explain('work', 10))
print(local_player.guess(['job', 'employee', 'office'], 5))

['outlets.', 'outlets,', 'moot', 'perm', 'floating', 'fixing', 'PMP', 'suspension,', 'labels,', 'expensive']
{'word_list': ['et.', '+|>', 'skeletons', 'echoes', 'nn'], 'time': 0, 'code': 200}


## Playing the game!

In [None]:
N_EXPLAIN_WORDS = 10
N_GUESSING_WORDS = 5
N_ROUNDS = 1
CRITERIA = 'soft'

PLAYERS = [
    PlayerDefinition('HerokuOrg team', RemotePlayer('https://obscure-everglades-02893.herokuapp.com')),
    PlayerDefinition('skipgram team', LocalFasttextPlayer(model_skipgram)),
    PlayerDefinition('skipgram2 team', LocalFasttextPlayer(model_skipgram2)),
    PlayerDefinition('cbow team', LocalFasttextPlayer(model_cbow))
]

WORDS = ['dollar', 'percent', 'billion', 'money']

game = Game(PLAYERS, WORDS, CRITERIA, N_ROUNDS, N_EXPLAIN_WORDS, N_GUESSING_WORDS, random_state=0)
game.run(verbose='print_logs', complete=False)

HOST to EXPLAINING PLAYER (HerokuOrg team): the word is "billion"
EXPLAINING PLAYER (HerokuOrg team) to HOST: my wordlist is ['billion', 'dings', '>near', 'pellets', 'prelude', '100th', 'ammo', 'calibre', 'mile', 'corollas']
HOST TO EXPLAINING PLAYER (HerokuOrg team): cleaning your word list. Now the list is ['dings', 'near', 'pellets', 'prelude', 'th', 'ammo', 'calibre', 'mile', 'corollas']

===ROUND 1===

HOST: ['dings']
GUESSING PLAYER (skipgram2 team) to HOST: ['headings', 'joints', 'astronauts.', 'launchers.', 'launchers']
HOST: False
GUESSING PLAYER (skipgram team) to HOST: ['probes', 'es', 'steroid', 'extern', 'district']
HOST: False
GUESSING PLAYER (cbow team) to HOST: ['headings', 'earnings', 'stunts', 'hinges', 'rings']
HOST: False

===ROUND 2===

HOST: ['dings', 'near']
GUESSING PLAYER (skipgram2 team) to HOST: ['launchers', 'rocket', 'plaques', 'windings', 'landings,']
HOST: False
GUESSING PLAYER (skipgram team) to HOST: ['airport', 'Moon,', '(west', 'launched', 'coast']
HO

| | Explanation for "billion" (HerokuOrg team) | Guess (skipgram2 team) | Guess (skipgram team) | Guess (cbow team) |
|---|---|---|---|---|
| 0 | [dings] | [headings, joints, astronauts., launchers., launchers] | [probes, es, steroid, extern, district] | [headings, earnings, stunts, hinges, rings] |
| 1 | [dings, near] | [launchers, rocket, plaques, windings, landings,] | [airport, Moon,, (west, launched, coast] | [colliding, buildings,, fielding, delivering, buildings] |
| 2 | [dings, near, pellets] | ["SCHWARZENEGGER", housekeeping, (carrying, navy, lanterns.] | [2.5million, gland, IgA, Five, actor] | [putouts, stacks, puts, hotels, bulletins] |
| 3 | [dings, near, pellets, prelude] | ["SCHWARZENEGGER", >rules, barracks, parachute, F-holder] | [RBC, "SCHWARZENEGGER", >Niles, Passenger, Losing] | [infringed.', disclose, brownbladerunnersugarcubeselectronicblaylockpowersspikeleekatebushhamcornpizza, diarrhea, elephant] |
| 4 | [dings, near, pellets, prelude, th] | [tinnitus, parachute, kc>, F-holder, Scandinavians] | [spectacular,, Losing, later),, mins, 'Karla',] | [shortcomings, colliding, barehanded, syringe, barracks] |
| 5 | [dings, near, pellets, prelude, th, ammo] | ["SCHWARZENEGGER", turbocharged, housekeeping, gallon, Oxygen] | [later),, Pentagon, plywood/carpet, >>Red, spectacular,] | [shortcomings, infringed.', diarrhea, disclose, deluxe] |
| 6 | [dings, near, pellets, prelude, th, ammo, calibre] | [turbocharged, barbeque, landmark, MR\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+, tailgate] | [later;, close-out, Pentagon, Caps,, IVF-ET] | [semi, deluxe, deploy, disclosure, disclose] |
| 7 | [dings, near, pellets, prelude, th, ammo, calibre, mile] | [turbocharged, MR\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+, barbeque, landmark, tailgate] | [later;, IVF-ET, Euclidean, ndet_loop.c:687:, 'Karla',] | [deluxe, deploy, seminar, brownbladerunnersugarcubeselectronicblaylockpowersspikeleekatebushhamcornpizza, semi] |
| 8 | [dings, near, pellets, prelude, th, ammo, calibre, mile, corollas] | [turbocharged, MR\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+R\O+, gallon, 'B', axel] | [km,, "Power, Hasbani, Euclidean, Mk] | [brownbladerunnersugarcubeselectronicblaylockpowersspikeleekatebushhamcornpizza, mass-market, bulletin, seminar, tin] |


HOST to EXPLAINING PLAYER (skipgram2 team): the word is "money"
EXPLAINING PLAYER (skipgram2 team) to HOST: my wordlist is ['taxes.', 'pay.', 'paying', 'money.', 'spend', 'spending.', 'guards', 'insure', 'pay', 'bank.']
HOST TO EXPLAINING PLAYER (skipgram2 team): cleaning your word list. Now the list is ['taxes', 'pay', 'paying', 'spend', 'spending', 'guards', 'insure', 'bank']

===ROUND 1===

HOST: ['taxes']
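
In the logs above the host "cleans" each explanation list before the rounds start: the target word disappears, punctuation and digits are stripped, and duplicates are dropped. The sketch below is only an approximation inferred from those two logged lists, not the library's actual implementation (which may apply further rules, for example around related word forms). The function name `clean_word_list` is hypothetical.

```python
import re


def clean_word_list(words, target):
    """Approximate the host's cleaning step as seen in the game logs."""
    cleaned = []
    for word in words:
        # keep ASCII letters only: '100th' -> 'th', '>near' -> 'near'
        letters = re.sub(r'[^A-Za-z]', '', word)
        if not letters:
            continue  # nothing left after stripping
        if letters.lower() == target.lower():
            continue  # the explanation must not contain the target itself
        if letters in cleaned:
            continue  # drop duplicates, keeping the first occurrence
        cleaned.append(letters)
    return cleaned


raw = ['taxes.', 'pay.', 'paying', 'money.', 'spend',
       'spending.', 'guards', 'insure', 'pay', 'bank.']
print(clean_word_list(raw, 'money'))
```

Running a player's explanations through such a filter before a game gives a rough idea of how many usable clue words will survive.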


## View final game report

In [None]:
game.report_results(each_game=True)