# Just one

<div class="alert alert-info" role="alert">
Description from boardgamegeek.com: Just One is a cooperative party game in which you play together to discover as many mystery words as possible. Find the best clue to help your teammate. <b>Be unique, as all identical clues will be cancelled!</b>

<br>

A complete game is played over 13 cards. The goal is to get a score as close to 13 as possible. In case of a right answer, the players score 1 point. In case of wrong answer, they lose the current card as well as the top card of the deck. Thus losing 2 points. In case of lack of answer, the players only lose the current card, and therefore only 1 point.
</div>

## Set up
Heavily borrowed from Sebastian Theiler's [tutorial](https://medium.com/analytics-vidhya/basics-of-using-pre-trained-glove-vectors-in-python-d38905f356db)

### Importing libraries for general functioning of the project

In [1]:
# typical imports
import numpy as np
import pandas as pd
import matplotlib as plt
import random
import math
# for embedding distances and plotting
from scipy import spatial
from sklearn.manifold import TSNE

### Now getting word embeddings
Using Wikipedia 2014 + Gigaword 5: (http://nlp.stanford.edu/data/glove.6B.zip)

In [2]:
# dictionary to store embeddings
embeddings_dict = {}
# looping over each line of the glove file
with open("../glove.6B/glove.6B.50d.txt", 'r') as f:
    for line in f:
        values = line.split()
        word = values[0]
        vector = np.asarray(values[1:], "float32")
        embeddings_dict[word] = vector

In [3]:
# test that it works
embeddings_dict['counterfactual'][:4]

array([-0.22282 ,  0.078798, -1.1952  ,  0.072751], dtype=float32)

### Simple processing

In [4]:
# input:  word embedding
# output: sorted list of closest words
def find_closest_embeddings(embedding):
    return sorted(embeddings_dict.keys(), 
                  key=lambda word: spatial.distance.euclidean(embeddings_dict[word], 
                                                              embedding))

In [5]:
# test that it works
find_closest_embeddings(embeddings_dict['cause'])[:10]

['failure',
 'serious',
 'result',
 'risk',
 'danger',
 'fear',
 'prevent',
 'damage',
 'suffer']

## The game
The game involves two phases:
1. Generation of hints from the "hinters"
2. Guess of the secret word by the "guesser"

In [6]:
# takes in a secret word, passes to arbitrary hint function that generates hints, eliminates duplicates, then to arbitrary guess function
# outputs 1 if the guess matches the hint, 0 otherwise
def just_one(secret_word, hint_fn, guess_fn, n_hinters):
    # gathering hints
    hints = []
    for hinter in range(n_hinters):
        hints.append(hint_fn(secret_word, n_hinters))
    
    # removing duplicates
    surviving_hints = []
    for hint in hints:
        tmp = hints.count(hint)
        if tmp == 1: surviving_hints.append(hint)
            
    # just for now so i can see what's going on
    print(hints)
    print(surviving_hints)
    
    # passing to the guesser
    guess = guess_fn(surviving_hints, n_hinters)
    return(guess)

### Simple functions to get the ball rolling

#### Hinters

In [7]:
# returns a random word
def random_hinter(secret_word, n_hinters):
    return(random.sample(embeddings_dict.keys(),1))

# returns a random selection from the top 10 closest words
def sort_hinter(secret_word, n_hinters):
    tmp = find_closest_embeddings(embeddings_dict[secret_word])[:10]
    return(random.choice(tmp))

#### Guessers

In [8]:
# guesses a random word
def random_guesser(surviving_hints, n_hinters):
    return(random.sample(embeddings_dict.keys(),1))

# returns nearest neighbor to mean
# ZD note: I'm not sure this is the right way to do this
def nn_guesser(surviving_hints, n_hinter):
    # 50-D embeddings
    hints_array = np.empty((len(surviving_hints),50))
    for ix in range(len(surviving_hints)):
        hints_array[ix,:] = embeddings_dict[surviving_hints[ix]]
    # finding the center of the hints and returning nearest neighbor
    hint_mean = np.mean(hints_array, 0)
    nearest_neighbors = find_closest_embeddings(hint_mean)
    # making sure the guess isn't one of the hints
    safe_guess = next(guess for guess in nearest_neighbors if guess not in surviving_hints)
    return(safe_guess)

### Running the game

In [13]:
random.seed(2020)
secret_word = ''.join(random.sample(embeddings_dict.keys(),1))
secret_word = 'cat'
print(secret_word)
a = just_one(secret_word, sort_hinter, nn_guesser, 8)
a

cat
['snake', 'rabbit', 'monster', 'monster', 'monster', 'rat', 'beast', 'pet']
['snake', 'rabbit', 'rat', 'beast', 'pet']


'cat'

## Next steps

What is the right thing to do?

- compute probability of other people giving words, to have an expected distribution over mean word embeddings
- hinter should factor in the probability of words getting deleted
- build-up of norms? For example might be cool as a start to have hinter 1 always give the closest on dimensions 1-10, 2 on dimensions 11-20, etc. Later this could be learned

Experiments

- free responses in a group?
- give a dictionary of, say, 100 words and let them choose from them?