Tutorial on using GloVe in Python from https://medium.com/analytics-vidhya/basics-of-using-pre-trained-glove-vectors-in-python-d38905f356db

## Brief introduction to GloVe
**Glo**bal **Ve**ctors for Word Representation (GloVe) is an "unsupervised learning algorithm for obtaining vector representations for words." 

## Downloading pre-trained vectors
Download at https://nlp.stanford.edu/projects/glove/

## Import packages

In [2]:
import numpy as np
from scipy import spatial
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

## Load the pretrained vectors

In [13]:
embeddings_dict = {}

with open("glove.6B.50d.txt", 'r', encoding = "utf-8") as f:
    for line in f:
        values = line.split()
        word = values[0]
        vector = np.asarray(values[1:], "float32")
        embeddings_dict[word] = vector
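
The loop above relies on the layout of the GloVe text files: each line holds a token followed by its vector components, separated by spaces. As a minimal sketch of that parsing step, the snippet below runs the same logic on two made-up lines held in memory. The names `sample_file` and `toy_embeddings` and the three-dimensional values are illustrative assumptions, not part of the real download.

```python
import io

import numpy as np

# Two invented lines in the GloVe text layout: token, then vector components.
sample_file = io.StringIO(
    "cat 0.1 0.2 0.3\n"
    "dog 0.4 0.5 0.6\n"
)

toy_embeddings = {}
for line in sample_file:
    values = line.split()
    word = values[0]
    # Convert the remaining fields from strings to a float32 vector.
    vector = np.array([float(v) for v in values[1:]], dtype="float32")
    toy_embeddings[word] = vector

print(toy_embeddings["cat"].shape)
```

With the real `glove.6B.50d.txt` file, each vector would have 50 components instead of 3.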

## Finding similar vectors

* `sorted()` takes an iterable as input and sorts it using a key.
* The iterable we're passing is all the words, `embeddings_dict.keys()`.
* By default, Python would sort the list alphabetically, so we need to specify a ***key*** to sort it the way we want. In our case, the ***key*** is a lambda function that takes a word as input and returns the distance between that word's embedding and the embedding we gave the function: `lambda word: spatial.distance.euclidean(embeddings_dict[word], embedding)`.
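
To see what the ***key*** changes, here is a small sketch on invented two-dimensional vectors. The dictionary `toy_vectors` and the `query` point are assumptions made up for illustration, and `np.linalg.norm` of the difference is used as a stand-in that computes the same Euclidean distance.

```python
import numpy as np

# Invented 2-d "embeddings", only for illustrating the sort order.
toy_vectors = {
    "apple": np.array([1.0, 0.0]),
    "banana": np.array([0.9, 0.1]),
    "car": np.array([0.0, 1.0]),
}
query = np.array([0.0, 0.9])

# Without a key, the words are sorted alphabetically.
alphabetical = sorted(toy_vectors.keys())

# With a key, each word is ranked by its distance to the query vector.
by_distance = sorted(
    toy_vectors.keys(),
    key=lambda word: np.linalg.norm(toy_vectors[word] - query),
)

print(alphabetical)  # ['apple', 'banana', 'car']
print(by_distance)   # ['car', 'banana', 'apple']
```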


In [16]:
# Define a function to find similar vectors
def find_closest_embeddings(embedding):
    return sorted(embeddings_dict.keys(), key=lambda word: spatial.distance.euclidean(embeddings_dict[word], embedding))

# How to use this function
find_closest_embeddings(embeddings_dict["king"])[:5]

['king', 'prince', 'queen', 'uncle', 'ii']
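
Sorting the whole vocabulary calls the distance function once per word from Python, which can be slow on a large vocabulary. One possible alternative, sketched here under the assumption that all vectors fit in memory as a single matrix, is to compute every distance in one NumPy operation. The function `find_closest_vectorized`, its parameters, and the toy data are hypothetical names introduced for this example.

```python
import numpy as np


def find_closest_vectorized(embedding, words, matrix, k=5):
    """Return the k words whose rows in `matrix` are nearest to `embedding`."""
    # Euclidean distance from the query to every row at once.
    distances = np.linalg.norm(matrix - embedding, axis=1)
    # Indices of the k smallest distances, nearest first.
    nearest = np.argsort(distances)[:k]
    return [words[i] for i in nearest]


# Invented 2-d vectors standing in for the real embeddings.
toy_vectors = {
    "king": np.array([1.0, 1.0]),
    "queen": np.array([1.0, 0.9]),
    "apple": np.array([-1.0, 0.2]),
}
words = list(toy_vectors)
matrix = np.stack([toy_vectors[w] for w in words])

print(find_closest_vectorized(toy_vectors["king"], words, matrix, k=2))
# ['king', 'queen']
```

With the real data, `words` and `matrix` would be built once from `embeddings_dict` and reused for every query.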

## Math with Words

In [17]:
print(find_closest_embeddings(
    embeddings_dict["twig"] - embeddings_dict["branch"] + embeddings_dict["hand"]
)[:5])

['fingernails', 'toenails', 'stringy', 'peeling', 'shove']
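
As the `king` example shows, the nearest neighbour of a vector can be one of the query words themselves, so it can help to drop the input words from the results of an analogy. The sketch below shows one way to do that on invented two-dimensional vectors. The helper `analogy` and the toy values are assumptions for illustration and are constructed so that the arithmetic works out exactly, which real embeddings will not do.

```python
import numpy as np

# Invented vectors arranged so that king - man + woman lands on queen.
toy_vectors = {
    "man": np.array([1.0, 0.0]),
    "woman": np.array([1.0, 1.0]),
    "king": np.array([3.0, 0.0]),
    "queen": np.array([3.0, 1.0]),
    "apple": np.array([-2.0, -2.0]),
}


def analogy(a, b, c, vectors, k=1):
    """Rank words near vectors[b] - vectors[a] + vectors[c], skipping a, b, c."""
    target = vectors[b] - vectors[a] + vectors[c]
    candidates = [w for w in vectors if w not in (a, b, c)]
    ranked = sorted(
        candidates,
        key=lambda word: np.linalg.norm(vectors[word] - target),
    )
    return ranked[:k]


# "man is to king as woman is to ...?"
print(analogy("man", "king", "woman", toy_vectors))  # ['queen']
```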

