#### How does Contexto.me work?

Contexto.me is a word game in which players try to guess a secret word. Instead of letter hints, each guess receives a rank that reflects how close it is in meaning to the secret word. The game uses a concept called **semantic distance** to determine the closeness or similarity between words.

1. Players enter any word as a guess.
2. The game reports how close the guess is in meaning to the secret word, as a position in a list of words ranked by similarity.
3. The game uses algorithms and natural language processing techniques to analyze the relationships between words and calculate the semantic distance between them.

> "The closer two words are in the graph, the more closely related they are in meaning."

Players use the ranks of their earlier guesses to narrow in on the secret word. For example, if the secret word is **"cat,"** guesses such as **"kitten,"** **"feline,"** **"pet,"** and **"whiskers"** would rank close to it.
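
Semantic distance between word vectors is commonly measured with cosine similarity. The sketch below uses made-up three-dimensional vectors (real embeddings have hundreds of dimensions) to show how words could be ranked against a secret word. The vector values and the `rank_guesses` helper are illustrative assumptions, not a description of how Contexto.me itself is implemented.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors, invented for illustration only
toy_vectors = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.75, 0.2],
    "pet": [0.7, 0.5, 0.3],
    "car": [0.1, 0.2, 0.9],
}

def rank_guesses(secret, vectors):
    """Return the other words sorted from most to least similar to `secret`."""
    others = [w for w in vectors if w != secret]
    return sorted(
        others,
        key=lambda w: cosine_similarity(vectors[secret], vectors[w]),
        reverse=True,
    )

print(rank_guesses("cat", toy_vectors))  # ['kitten', 'pet', 'car']
```

A higher cosine similarity means a smaller semantic distance, so the first word in the returned list is the closest guess.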

Overall, Contexto.me is a challenging and engaging game that tests players' vocabulary, knowledge, and ability to make connections between words. It can be a fun way to **improve your language skills** and **expand your vocabulary**.





![Contexto](img/contexto.png)

# My approach to coding a version of Contexto

I'm currently working on coding my own version of Contexto, inspired by the Brazilian game of the same name. While the game mechanics are similar to the original, I'm approaching the development process in my own unique way.

1. **Customized word sets:** Rather than using pre-determined word sets, I'm creating my own custom sets to add a personal touch to the game. This way, I can tailor the game to specific themes or interests.

2. **Flexibility in game play:** While the original game uses a fixed structure for the game play, I'm building in flexibility to allow for different types of games. This way, players can choose the game play that best suits their preferences.

3. **Simplified user interface:** To make the game more accessible to all players, I'm working on a simplified user interface that is easy to navigate and understand.

4. **Scoring system:** In order to add an element of competition to the game, I'm developing a scoring system that will allow players to compare their scores with other players.

Overall, my goal is to create a fun, engaging, and personalized version of Contexto that players will enjoy playing. While I'm drawing inspiration from the original game, I'm excited to put my own unique spin on it.


## Imports and Reading the Initial Model Data

In [136]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from gensim.models import KeyedVectors
from gensim.scripts.glove2word2vec import glove2word2vec
from sklearn.manifold import TSNE

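# Load the raw GloVe vectors from the inputs folder (one token per row, space-separated).
# Note: with the default header handling, pandas uses the first row of the file as column names.
df = pd.read_csv('inputs/_glove.840B.300d.txt', sep=' ')
df.head()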

Unnamed: 0,",",-0.082752,0.67204,-0.14987,-0.064983,0.056491,0.40228,0.0027747,-0.3311,-0.30691,...,-0.14331,0.018267,-0.18643,0.20709,-0.35598,0.05338,-0.050821,-0.1918,-0.37846,-0.06589
0,.,0.012001,0.20751,-0.12578,-0.59325,0.12525,0.15975,0.13748,-0.33157,-0.13694,...,0.16165,-0.066737,-0.29556,0.022612,-0.28135,0.0635,0.14019,0.13871,-0.36049,-0.035
1,the,0.27204,-0.06203,-0.1884,0.023225,-0.018158,0.006719,-0.13877,0.17708,0.17709,...,-0.4281,0.16899,0.22511,-0.28557,-0.1028,-0.018168,0.11407,0.13015,-0.18317,0.1323
2,and,-0.18567,0.066008,-0.25209,-0.11725,0.26513,0.064908,0.12291,-0.093979,0.024321,...,-0.59396,-0.097729,0.20072,0.17055,-0.004736,-0.039709,0.32498,-0.023452,0.12302,0.3312
3,to,0.31924,0.06316,-0.27858,0.2612,0.079248,-0.21462,-0.10495,0.15495,-0.03353,...,-0.12977,0.3713,0.18888,-0.004274,-0.10645,-0.2581,-0.044629,0.082745,0.097801,0.25045
4,of,0.060216,0.21799,-0.04249,-0.38618,-0.15388,0.034635,0.22243,0.21718,0.006848,...,-0.42484,0.11606,0.004813,-0.39629,-0.26823,0.3292,-0.17597,0.11709,-0.16692,-0.094085
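
The preview above shows that pandas consumed the first row of the file as column names, so one token is lost from the data. The sketch below is one hedged alternative for reading GloVe-style text (a token followed by its vector components on each line) without pandas. The `read_glove_text` helper and its `dim` parameter are hypothetical names. It splits from the right so that a token containing spaces or quote characters is not broken apart.

```python
import os
import tempfile

def read_glove_text(path, dim, limit=None):
    """Read up to `limit` (token, vector) pairs from a GloVe-style text file."""
    vectors = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if limit is not None and len(vectors) >= limit:
                break
            # Split off exactly `dim` numbers from the right; the rest is the token
            parts = line.rstrip("\n").rsplit(" ", dim)
            if len(parts) != dim + 1:
                continue  # skip malformed lines
            token, values = parts[0], parts[1:]
            vectors[token] = [float(v) for v in values]
    return vectors

# Tiny demo file with 3-dimensional vectors (values invented for illustration)
demo = ', 0.1 0.2 0.3\n" 0.4 0.5 0.6\nthe 0.7 0.8 0.9\n'
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write(demo)
demo_vectors = read_glove_text(tmp.name, dim=3)
os.remove(tmp.name)
print(list(demo_vectors))
```

For the real file, `dim` would be 300, and `limit` can keep memory use manageable while experimenting.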


## Helper Functions to Convert and Load the Model

In [139]:
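def transfer(gloveFile, word2vecFile):
    # Convert a GloVe text file into the word2vec text format
    glove2word2vec(gloveFile, word2vecFile)

def load_model(word2vecFile):
    # Load the converted vectors; the limit relies on the global df read earlier
    model = KeyedVectors.load_word2vec_format(word2vecFile, binary=False, limit=len(df))
    return model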

## Convert the GloVe Model to the Word2Vec Format

In [17]:
transfer('inputs/_glove.840B.300d.txt', 'inputs/_glove.840B.300d.word2vec.txt')



In [140]:
model = load_model('inputs/_glove.840B.300d.word2vec.txt')
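
The conversion step mainly prepends a header line with the vocabulary size and vector dimension, which is what the word2vec text format expects. The following is a minimal sketch of that idea in plain Python, not gensim's actual implementation; `add_word2vec_header` is a hypothetical helper. Recent gensim releases (4.x) can also read GloVe text directly with `KeyedVectors.load_word2vec_format(path, binary=False, no_header=True)`, which would make the separate conversion unnecessary.

```python
import os
import tempfile

def add_word2vec_header(glove_path, word2vec_path):
    """Copy a GloVe text file, prepending the '<vocab_size> <dim>' header line."""
    # First pass: count rows and infer the dimension from the first row
    num_rows, dim = 0, 0
    with open(glove_path, encoding="utf-8") as src:
        for line in src:
            if num_rows == 0:
                dim = len(line.rstrip("\n").split(" ")) - 1
            num_rows += 1
    # Second pass: write the header followed by the unchanged rows
    with open(glove_path, encoding="utf-8") as src, \
            open(word2vec_path, "w", encoding="utf-8") as dst:
        dst.write("{} {}\n".format(num_rows, dim))
        for line in src:
            dst.write(line)
    return num_rows, dim

# Demo with a two-word, three-dimensional file (values invented)
workdir = tempfile.mkdtemp()
glove_file = os.path.join(workdir, "toy_glove.txt")
w2v_file = os.path.join(workdir, "toy_word2vec.txt")
with open(glove_file, "w", encoding="utf-8") as f:
    f.write("cat 0.1 0.2 0.3\ndog 0.4 0.5 0.6\n")
shape = add_word2vec_header(glove_file, w2v_file)
with open(w2v_file, encoding="utf-8") as f:
    header = f.readline().strip()
print(shape, header)  # (2, 3) 2 3
```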

In [141]:
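def get_similar_words(model, word_of_day):
    # Rank the entire vocabulary by similarity to the word of the day
    similar_words = model.similar_by_word(word_of_day, topn=len(model.key_to_index))
    return similar_words

def get_distance_between_words(similar_words, word_of_day, guess):
    # Do not accept empty strings
    if guess == '':
        return -1

    # An exact match has distance 0
    if guess == word_of_day:
        return 0
    # Otherwise the distance is the 1-based rank of the guess in the similarity list
    for i, (word, sim) in enumerate(similar_words):
        if word == guess:
            return i + 1
    # Guesses outside the vocabulary are treated as the farthest possible
    return len(similar_words)

print(len(model.key_to_index))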

2036775
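
`get_distance_between_words` scans the full similarity list on every guess, which can be slow with a vocabulary of about two million words. One possible optimization, sketched here with a toy list, is to build a word-to-rank dictionary once per game. `build_rank_lookup` and `rank_of` are hypothetical helpers, and the return conventions mirror the function above.

```python
def build_rank_lookup(similar_words):
    """Map each word to its 1-based position in the similarity list."""
    return {word: i + 1 for i, (word, _sim) in enumerate(similar_words)}

def rank_of(lookup, word_of_day, guess):
    """Same conventions as the list scan: -1 for empty, 0 for exact, else the rank."""
    if guess == '':
        return -1
    if guess == word_of_day:
        return 0
    # Unknown words fall back to the list length, i.e. the farthest rank
    return lookup.get(guess, len(lookup))

# Toy stand-in for the (word, similarity) pairs returned by the model
toy_similar = [('sport', 0.85), ('football', 0.69), ('tennis', 0.65)]
lookup = build_rank_lookup(toy_similar)
print(rank_of(lookup, 'sports', 'football'))  # 2
```

The dictionary costs extra memory, but each guess then becomes a constant-time lookup instead of a linear scan.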


## Choosing the Word of the Day

In [163]:
themes = ['sports', 'animals', 'fruits', 'vegetables', 'food', 'clothes', 'colors', 'body', 'family', 'jobs', 'transport', 'weather', 'house', 'furniture', 'kitchen', 'school', 'office', 'holidays']
regional_set = ['countries', 'cities']
education_set = ['school', 'science', 'history']


# Let the user pick a game mode
gamemode = input('Choose a gamemode: \n 1. Themes \n 2. Regional \n 3. Education \n')

if gamemode == '1':
    gamemode = themes
elif gamemode == '2':
    gamemode = regional_set
elif gamemode == '3':
    gamemode = education_set
else:
    print('Invalid input')
    exit()

# Let the user pick a difficulty; each level maps to a half-open index range
# [difficultyStart, difficultyEnd), since np.random.randint excludes the upper bound
difficulty = input('Choose a difficulty: \n 1. Easy \n 2. Medium \n 3. Hard \n')

if difficulty == '1':
    difficultyStart = 0
    difficultyEnd = 50
elif difficulty == '2':
    difficultyStart = 50
    difficultyEnd = 100
elif difficulty == '3':
    difficultyStart = 100
    difficultyEnd = 200
else:
    print('Invalid input')
    exit()

theme = np.random.choice(gamemode)

similar_words = get_similar_words(model, theme)

# Pick the word of the day at random from the 200 words most similar to the theme.
# The difficulty level selects the band: roughly the first 50 words for easy,
# the next 50 for medium, and the last 100 for hard.

word_of_day = similar_words[np.random.randint(difficultyStart, difficultyEnd)][0]
word_of_day

'birds'
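
The difficulty logic can also be written as a lookup table of half-open index ranges. This is a sketch with hypothetical names (`DIFFICULTY_BANDS`, `pick_word_of_day`), using Python's `random` module rather than NumPy and a toy list in place of the model output.

```python
import random

# Half-open [start, end) index ranges into the similarity-ranked list
DIFFICULTY_BANDS = {'1': (0, 50), '2': (50, 100), '3': (100, 200)}

def pick_word_of_day(similar_words, difficulty, rng=random):
    """Pick a random word from the band that matches the difficulty key."""
    if difficulty not in DIFFICULTY_BANDS:
        raise ValueError('Invalid difficulty: {!r}'.format(difficulty))
    start, end = DIFFICULTY_BANDS[difficulty]
    end = min(end, len(similar_words))  # guard against short lists
    if start >= end:
        raise ValueError('Not enough similar words for this difficulty')
    index = rng.randrange(start, end)
    return similar_words[index][0]

# Toy stand-in for model.similar_by_word output: 200 (word, similarity) pairs
toy_similar = [('word{}'.format(i), 1.0 - i / 200) for i in range(200)]
print(pick_word_of_day(toy_similar, '2', random.Random(0)))
```

Raising `ValueError` instead of calling `exit()` keeps the notebook kernel alive when the input is invalid.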

# Playing the Contexto Game

In [155]:
# Rank every vocabulary word by similarity to the word of the day
similar_words = get_similar_words(model, word_of_day)
words_guessed = []

# Keep asking until the user guesses the word of the day (distance 0)
while True:
    guess = input("Guess another word:")
    d = get_distance_between_words(similar_words, word_of_day, guess)
    words_guessed.append(guess)
    print("{} ---------------------- {}".format(guess, d))
    if d == 0:
        print("Correct!")
        break
similar_words

sports ---------------------- 0
Correct!


[('sport', 0.8484779596328735),
 ('sporting', 0.7160251140594482),
 ('football', 0.6938873529434204),
 ('Sports', 0.6796543598175049),
 ('athletics', 0.6743817925453186),
 ('soccer', 0.6695200204849243),
 ('basketball', 0.6565209031105042),
 ('tennis', 0.6506130695343018),
 ('baseball', 0.647473156452179),
 ('athletic', 0.6409682631492615),
 ('hockey', 0.6343894600868225),
 ('athletes', 0.5989591479301453),
 ('golf', 0.5830652713775635),
 ('racing', 0.5783706903457642),
 ('volleyball', 0.5780236124992371),
 ('rugby', 0.567213773727417),
 ('recreation', 0.555584728717804),
 ('entertainment', 0.5529596209526062),
 ('fitness', 0.5486113429069519),
 ('athlete', 0.5477192997932434),
 ('boxing', 0.5408746004104614),
 ('motorsports', 0.5350232124328613),
 ('Sport', 0.5262331962585449),
 ('softball', 0.5181825160980225),
 ('recreational', 0.5152878165245056),
 ('gymnastics', 0.50932776927948),
 ('league', 0.5080278515815735),
 ('lacrosse', 0.5031997561454773),
 ('leisure', 0.5022414922714233),
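
Because the game loop above reads from `input()`, it is awkward to test. One option, sketched below with hypothetical names (`play`, `rank_fn`, `toy_rank`), is to pass the guesses in as an iterable and return the history. This version also strips whitespace and skips empty guesses instead of recording them, which is a design choice rather than something the original loop does.

```python
def play(word_of_day, guesses, rank_fn):
    """Run a game over `guesses`; return (guess, rank) pairs up to the first hit."""
    history = []
    for raw in guesses:
        guess = raw.strip()
        if not guess:
            continue  # ignore empty input rather than recording it
        rank = 0 if guess == word_of_day else rank_fn(guess)
        history.append((guess, rank))
        if rank == 0:
            break
    return history

# Toy ranking: 1-based position in a tiny list, or one past the end if unknown
toy_order = ['sport', 'football', 'tennis']

def toy_rank(guess):
    if guess in toy_order:
        return toy_order.index(guess) + 1
    return len(toy_order) + 1

history = play('sports', ['tennis', ' ', 'chair', 'sports', 'sport'], toy_rank)
for guess, rank in history:
    print("{} ---------------------- {}".format(guess, rank))
```

In the notebook, `rank_fn` could wrap `get_distance_between_words` with the current `similar_words` and `word_of_day`.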

# References:

- https://contexto.me/
- https://nlp.stanford.edu/projects/glove/ 
- https://nlp.stanford.edu/pubs/glove.pdf 
- https://towardsdatascience.com/t-distributed-stochastic-neighbor-embedding-t-sne-bb60ff109561 
- https://github.com/stanfordnlp/GloVe 
