<a href="https://colab.research.google.com/github/Bosy-Ayman/IR/blob/main/week_9_Term_Representations%26Embeddings.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>



The **learning outcomes** of this notebook are:


*   Get word embeddings using the Skip-Gram, CBOW, and GloVe models.
*   Compare the similarity of the word embeddings produced by those models.


### **Last Lab Recap:**


A **word embedding** is a distributed (dense) vector representation for each word of a vocabulary. It can capture semantic and syntactic properties of the words in the input texts.

Interestingly, even arithmetic operations on the word vectors are meaningful: e.g. King - Queen ≈ Man - Woman.
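
To make the arithmetic concrete, here is a minimal sketch that uses hand-made 2-dimensional toy vectors. The values are hypothetical (one axis loosely stands for royalty, the other for gender) and are not learned embeddings; `toy_vectors`, `cosine`, and `analogy` are illustrative names.

```python
import math

# Hypothetical 2-D toy vectors: (royalty, gender). Not learned embeddings.
toy_vectors = {
    "king":  [1.0,  1.0],
    "queen": [1.0, -1.0],
    "man":   [0.0,  1.0],
    "woman": [0.0, -1.0],
    "apple": [-1.0, 0.0],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(a, b, c, vectors):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(query, vectors[w]))

# King - Man + Woman should land nearest to Queen in this toy space
answer = analogy("king", "man", "woman", toy_vectors)
print(answer)

# The two difference vectors are equal here by construction
king_minus_queen = [k - q for k, q in zip(toy_vectors["king"], toy_vectors["queen"])]
man_minus_woman = [m - w for m, w in zip(toy_vectors["man"], toy_vectors["woman"])]
print(king_minus_queen, man_minus_woman)
```

With real embeddings the equality only holds approximately, which is why the nearest neighbour of the query vector is used instead of an exact match.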

**The two most popular approaches to learn a word embedding from raw text are:**

- **Skip-Gram Negative Sampling** by Mikolov et al. (2013) (Word2Vec)
- **Global Vectors for Word Representation** by Pennington, Socher, and Manning (2014) (GloVe).

## **Skip-Gram Word2Vec — Intuition**

<img src= "https://raw.githubusercontent.com/gaoisbest/NLP-Projects/master/0_Word2vec/Word2vec_Skip_gram.png" alt="drawing" width="750"/>

**The skip-gram model**  aims to generate word embeddings, which are dense vector representations, where similar words are closer together in the vector space.

In the skip-gram model, each word is represented by a **unique vector**. The model takes a target word as input and tries to predict the context words within a certain window size. By training on large amounts of text data, it learns vectors in which words that appear in similar contexts end up close together.



- Skip-gram **predicts the context words** that surround a given target word.

- The **input** to the skip-gram model is a **target word**, and the **output** is a **set of context words**.

- The target word is typically represented as **a one-hot encoded vector**.

- Skip-gram learns to map the input target word to the context words by **training a neural network**, with an input layer, a hidden layer, and an output layer.

- During training, the weights of the neural network are adjusted to minimize the prediction error between the predicted context words and the actual context words.

- Skip-gram is computationally **more expensive** than CBOW because it makes a prediction for each context word, but it tends to **perform better on larger datasets** and captures the semantics of infrequent words well.
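
The role of the one-hot input can be sketched in a few lines: multiplying a one-hot vector by the input weight matrix simply selects one row, and that row is the word's embedding. The vocabulary and the matrix `W_in` below are hypothetical toy values chosen for illustration.

```python
# Toy vocabulary and a hypothetical 4 x 3 input weight matrix (one row per word)
vocab = ["the", "quick", "brown", "fox"]
W_in = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
    [1.0, 1.1, 1.2],
]

def one_hot(word, vocab):
    """Return a vector with 1.0 at the word's index and 0.0 elsewhere."""
    return [1.0 if w == word else 0.0 for w in vocab]

def hidden_layer(x, W):
    """Multiply a one-hot row vector x (1 x |V|) by W (|V| x d)."""
    dim = len(W[0])
    return [sum(x[i] * W[i][j] for i in range(len(W))) for j in range(dim)]

x = one_hot("brown", vocab)
h = hidden_layer(x, W_in)

print(x)
print(h)  # same values as the row of W_in that belongs to "brown"
```

In practice libraries skip the multiplication and index the row directly, which gives the same result at a much lower cost.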



###  **Example with a window of two words on each side:**

$$\underbrace{\textrm{The quick}}_{\textrm{left } n}\underbrace{\textrm{ brown }}_{target} \underbrace{\textrm{fox jumps}}_{\textrm{right } n}$$
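
A minimal sketch of how such (target, context) training pairs could be generated for a window of two words on each side. The helper name `skipgram_pairs` is hypothetical and the sentence is lower-cased for simplicity.

```python
def skipgram_pairs(tokens, window):
    """Return (target, context) pairs for every token and its neighbours."""
    pairs = []
    for i, target in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((target, tokens[j]))
    return pairs

tokens = ["the", "quick", "brown", "fox", "jumps"]
pairs = skipgram_pairs(tokens, window=2)

# Pairs whose target is "brown": one pair per context word
brown_pairs = [p for p in pairs if p[0] == "brown"]
print(brown_pairs)
print(len(pairs))
```

Words near the sentence boundary have fewer neighbours, so they contribute fewer pairs than words in the middle.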

<img src="https://nbviewer.jupyter.org/github/DSKSD/DeepNLP-models-Pytorch/blob/master/images/01.skipgram-prepare-data.png">


\begin{equation}\large \prod_{c} P(V_c \mid V_t) \rightarrow \sum_{c} \log P(V_c \mid V_t) \rightarrow \sum_{c} \log\frac{\exp(u_{t,c})}{\sum_{k=1}^{|V|}\exp(u_{t,k})}\end{equation}
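
The objective above can be evaluated numerically for a single target word. The sketch below assumes hypothetical scores $u_{t,k}$ for a 4-word vocabulary and treats the first two vocabulary entries as the true context words; `log_softmax` is an illustrative helper, not a library function.

```python
import math

def log_softmax(scores, index):
    """log( exp(scores[index]) / sum_k exp(scores[k]) ), computed stably."""
    m = max(scores)  # subtract the maximum to avoid overflow in exp
    log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
    return scores[index] - log_norm

# Hypothetical scores u_{t,k} of one target word against a 4-word vocabulary
scores = [2.0, 1.0, 0.1, -1.0]
context_indices = [0, 1]  # vocabulary positions of the true context words

# Sum of log-probabilities of the true context words (to be maximized)
log_likelihood = sum(log_softmax(scores, c) for c in context_indices)

# The softmax probabilities over the whole vocabulary sum to 1
probs = [math.exp(log_softmax(scores, k)) for k in range(len(scores))]
print(probs)
print(log_likelihood)
```

The denominator sums over the whole vocabulary, which is expensive for large vocabularies; this cost is what negative sampling is meant to avoid.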


In [None]:

# Skip-gram / Example 1:

import gensim
from gensim.models import Word2Vec

# Example corpus
corpus = [
    ['apple', 'banana', 'orange', 'fruit'],
    ['car', 'bus', 'train', 'vehicle'],
    ['cat', 'dog', 'pet', 'animal'],
    ['computer', 'keyboard', 'mouse', 'technology']
]

In [None]:
# Train skip-gram model
model = Word2Vec(sentences=corpus,
                 sg=1,                  # 1 selects skip-gram (0 selects CBOW)
                 vector_size=100,       # each word is represented as a 100-dimensional vector
                 window=2,              # context window: up to two words to the left and right of the target word
                 min_count=1,           # minimum frequency a word needs to be included in the vocabulary
                 workers=4,             # number of worker threads used for training
                 epochs=20)             # number of passes over the entire corpus

# Get word embeddings
word_embeddings = model.wv

# Example usage
print(word_embeddings['apple'])  # Get the embedding vector for the word 'apple'

[-7.1907109e-03  4.2323531e-03  2.1672160e-03  7.4456348e-03
 -4.8877355e-03 -4.5667640e-03 -6.0967854e-03  3.3006475e-03
 -4.5027859e-03  8.5193124e-03 -4.2901128e-03 -9.1109248e-03
 -4.8195980e-03  6.4187679e-03 -6.3732183e-03 -5.2567786e-03
 -7.3009222e-03  6.0254098e-03  3.3571543e-03  2.8429865e-03
 -3.1370206e-03  6.0324781e-03 -6.1509013e-03 -1.9880428e-03
 -5.9824740e-03 -9.9266437e-04 -2.0254641e-03  8.4903855e-03
  7.5093449e-05 -8.5716275e-03 -5.4278509e-03 -6.8761348e-03
  2.6935276e-03  9.4525181e-03 -5.8173346e-03  8.2637137e-03
  8.5330810e-03 -7.0587411e-03 -8.8811945e-03  9.4740056e-03
  8.3817160e-03 -4.6954509e-03 -6.7312834e-03  7.8448728e-03
  3.7651756e-03  8.0972137e-03 -7.5730658e-03 -9.5266094e-03
  1.5822783e-03 -9.8044751e-03 -4.8857513e-03 -3.4626198e-03
  9.6272007e-03  8.6237462e-03 -2.8371222e-03  5.8311294e-03
  8.2379077e-03 -2.2612293e-03  9.5285922e-03  7.1662096e-03
  2.0402709e-03 -3.8501143e-03 -5.0848583e-03 -3.0499455e-03
  7.8895157e-03 -6.18569

In [None]:
print(word_embeddings['banana'])

[ 7.6964465e-03  9.1211973e-03  1.1375165e-03 -8.3227847e-03
  8.4265033e-03 -3.6962493e-03  5.7440475e-03  4.3913233e-03
  9.6879499e-03 -9.2945024e-03  9.2078838e-03 -9.2838397e-03
 -6.9083665e-03 -9.1018500e-03 -5.5472087e-03  7.3707197e-03
  9.1653587e-03 -3.3237957e-03  3.7222467e-03 -3.6280856e-03
  7.8822952e-03  5.8679120e-03  2.2938245e-06 -3.6323101e-03
 -7.2242739e-03  4.7704726e-03  1.4511441e-03 -2.6123982e-03
  7.8363018e-03 -4.0482013e-03 -9.1480883e-03 -2.2557415e-03
  1.2593227e-04 -6.6412883e-03 -5.4865568e-03 -8.5008750e-03
  9.2294523e-03  7.4265879e-03 -2.9409389e-04  7.3685488e-03
  7.9536289e-03 -7.8567682e-04  6.6098371e-03  3.7691286e-03
  5.0779395e-03  7.2527640e-03 -4.7397711e-03 -2.1858814e-03
  8.7507279e-04  4.2361082e-03  3.3043462e-03  5.0948756e-03
  4.5897844e-03 -8.4397728e-03 -3.1843686e-03 -7.2352518e-03
  9.6811568e-03  5.0071445e-03  1.7087515e-04  4.1145915e-03
 -7.6567428e-03 -6.2962854e-03  3.0757205e-03  6.5357955e-03
  3.9496026e-03  6.02048

In [None]:
print(word_embeddings.similarity('apple', 'banana'))  # Compute similarity between two words

-0.098008834


In [None]:
print(word_embeddings.most_similar('car'))  # Find most similar words to 'car'

[('keyboard', 0.17271797358989716), ('pet', 0.16696742177009583), ('cat', 0.11119077354669571), ('bus', 0.1094270721077919), ('technology', 0.07960997521877289), ('computer', 0.04131349176168442), ('vehicle', 0.037712957710027695), ('apple', 0.013257332146167755), ('mouse', 0.008347253315150738), ('animal', -0.0058680507354438305)]


In [None]:
## Skip-gram / Example 2:
from gensim.models import Word2Vec
from gensim.models import KeyedVectors  # Gensim class that stores and queries trained word vectors

In [None]:
collection = [["king", "is", "to", "queen", "as", "man", "is", "to", "woman"],
    ["london", "is", "the", "capital", "of", "egland"],
    ["paris", "is", "the", "capital", "of", "france"]]

In [None]:
# Train Word2Vec model
# Note: sg is not passed here, so gensim uses its default sg=0 (CBOW);
# pass sg=1 to train skip-gram. The outputs below come from this default call.
model = Word2Vec(collection, vector_size=100, # each word is represented as a 100-dimensional vector
                             window=5,        # up to 5 words before and 5 words after the target word
                             min_count=1,     # keep every word, regardless of its frequency
                             workers=4)       # use 4 worker threads for training

In [None]:
# Access word vectors
man_vector = model.wv['man']
print(man_vector)

[ 7.0887972e-03 -1.5679300e-03  7.9474989e-03 -9.4886590e-03
 -8.0294991e-03 -6.6403709e-03 -4.0034545e-03  4.9892161e-03
 -3.8135587e-03 -8.3199050e-03  8.4117772e-03 -3.7470020e-03
  8.6086961e-03 -4.8957514e-03  3.9185942e-03  4.9220170e-03
  2.3926091e-03 -2.8188038e-03  2.8491246e-03 -8.2562361e-03
 -2.7655398e-03 -2.5911583e-03  7.2490061e-03 -3.4634031e-03
 -6.5997029e-03  4.3404270e-03 -4.7448516e-04 -3.5975564e-03
  6.8824720e-03  3.8723124e-03 -3.9002013e-03  7.7188847e-04
  9.1435025e-03  7.7546560e-03  6.3618720e-03  4.6673026e-03
  2.3844899e-03 -1.8416261e-03 -6.3712932e-03 -3.0181051e-04
 -1.5653884e-03 -5.7228567e-04 -6.2628710e-03  7.4340473e-03
 -6.5914928e-03 -7.2392775e-03 -2.7571463e-03 -1.5154004e-03
 -7.6357173e-03  6.9824100e-04 -5.3261113e-03 -1.2755442e-03
 -7.3651113e-03  1.9605684e-03  3.2731986e-03 -2.3138524e-05
 -5.4483581e-03 -1.7260861e-03  7.0849168e-03  3.7362587e-03
 -8.8810492e-03 -3.4135508e-03  2.3541022e-03  2.1380198e-03
 -9.4640078e-03  4.57116

In [None]:
# Find the words most similar to 'man' and 'paris' based on the vectors learned during training
similar_to_man = model.wv.most_similar(positive=['man'], topn=3)
similar_to_paris = model.wv.most_similar(positive=['paris'], topn=3)

# Print results
print("Words similar to man:", similar_to_man)
print("Words similar to Paris:", similar_to_paris)

Words similar to man: [('queen', 0.12813478708267212), ('as', 0.10944210737943649), ('egland', 0.1088901236653328)]
Words similar to Paris: [('capital', 0.14593380689620972), ('man', 0.05048326402902603), ('the', 0.04167982563376427)]


In [None]:
# Calculate word similarity (cosine similarity)
king_queen_sim = model.wv.similarity('king', 'queen')
paris_london_sim = model.wv.similarity('paris', 'london')

print("Similarity between king and queen:", king_queen_sim)
print("Similarity between Paris and London:", paris_london_sim)

Similarity between king and queen: -0.076410025
Similarity between Paris and London: 0.008817988


In [None]:
model.wv.most_similar('king')

[('to', 0.25285014510154724),
 ('of', 0.1372225135564804),
 ('france', 0.04404586926102638),
 ('paris', 0.012902338989078999),
 ('the', 0.006779309827834368),
 ('man', -0.001192279625684023),
 ('egland', -0.025496706366539),
 ('is', -0.04117467626929283),
 ('queen', -0.07641002535820007),
 ('woman', -0.10560528188943863)]

In [None]:
model.wv.most_similar(positive=["king", "woman"], negative=["man"])

[('to', 0.20373943448066711),
 ('of', 0.10580124706029892),
 ('is', 0.06748092174530029),
 ('paris', -0.021386904641985893),
 ('capital', -0.03990182653069496),
 ('france', -0.04878506809473038),
 ('london', -0.06293847411870956),
 ('the', -0.10544881224632263),
 ('egland', -0.1370125561952591),
 ('queen', -0.15060536563396454)]

## **2- CBOW**

<img src= "https://raw.githubusercontent.com/gaoisbest/NLP-Projects/master/0_Word2vec/Word2vec_CBOW.png" width=750>
   
**CBOW (Continuous Bag of Words):**

- CBOW predicts the target word based on its **surrounding** context words. It treats **the context words as input** and tries to predict the target word in the middle.

- The input to the CBOW model is a set of context words, and the output is the target word.
- The context words are usually represented as one-hot encoded vectors.
- CBOW learns to map the input context words to the target word by training a neural network. The network typically consists of an **input layer, a hidden layer, and an output layer**.
- CBOW is **computationally efficient** as it aggregates the context words to predict the target word, making it **faster to train** compared to skip-gram.
- CBOW is more suitable for **small to medium**-sized datasets and performs well when the frequency of words is balanced.
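
The aggregation step can be sketched as follows: the context vectors are averaged into a single hidden vector, which is then used to predict the target. The 2-dimensional embeddings are hypothetical toy values and the helper names are illustrative; averaging is assumed here (gensim's `cbow_mean` parameter switches between mean and sum).

```python
def cbow_examples(tokens, window):
    """Return (context_words, target) training examples."""
    examples = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context:
            examples.append((context, target))
    return examples

def average_vectors(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[j] for v in vectors) / len(vectors) for j in range(dim)]

# Hypothetical 2-D input embeddings, for illustration only
toy_embeddings = {
    "the":   [1.0, 0.0],
    "quick": [0.0, 1.0],
    "brown": [0.5, 0.5],
    "fox":   [1.0, 1.0],
    "jumps": [2.0, 2.0],
}

tokens = ["the", "quick", "brown", "fox", "jumps"]
examples = cbow_examples(tokens, window=2)

# Hidden vector used to predict "brown": the mean of its four context vectors
context, target = examples[2]
hidden = average_vectors([toy_embeddings[w] for w in context])
print(context, "->", target)
print(hidden)
```

Because the whole context collapses into one vector, each target position yields a single prediction, which is where the speed advantage comes from.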

<img src= "https://machinelearninginterview.com/wp-content/uploads/2019/02/CBOW_eta_Skipgram.png" width=750>


### Comparisons of CBOW and Skip-gram
- speed
    - CBOW: **faster**, skip-gram: **slower**.
- infrequent words
    - CBOW: **worse**, skip-gram: **better** ([why?](https://stats.stackexchange.com/questions/180548/why-is-skip-gram-better-for-infrequent-words-than-cbow))
- training data
    - CBOW: **smaller datasets**, skip-gram: **larger datasets**.
    - CBOW smooths over much of the distributional information (by treating an entire context as one observation), which helps on **smaller datasets**. Skip-gram treats each context-target pair as a new observation and tends to do better on **larger datasets**.
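
The difference in the number of observations can be counted directly. This is a rough sketch with a hypothetical helper, `count_observations`, that assumes one CBOW observation per target position and one skip-gram observation per (target, context) pair.

```python
def count_observations(num_tokens, window):
    """Count training observations for CBOW and skip-gram on one sentence."""
    cbow = 0
    skipgram = 0
    for i in range(num_tokens):
        # number of context words available to the left and right of position i
        context_size = min(i, window) + min(num_tokens - 1 - i, window)
        if context_size > 0:
            cbow += 1             # the whole context counts as one observation
        skipgram += context_size  # every context word counts separately
    return cbow, skipgram

cbow_count, skipgram_count = count_observations(num_tokens=5, window=2)
print(cbow_count, skipgram_count)
```

For the same text, skip-gram therefore performs several times more updates, which is consistent with it being slower to train.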




### The key difference from the skip-gram example is that we set **sg=0** to indicate that we want to train a CBOW model.

In [None]:
# CBOW / Example 1:
import gensim
from gensim.models import Word2Vec

# Example corpus
corpus = [
    ['apple', 'banana', 'orange', 'fruit'],
    ['car', 'bus', 'train', 'vehicle'],
    ['cat', 'dog', 'pet', 'animal'],
    ['computer', 'keyboard', 'mouse', 'technology']
]

In [None]:
# Train CBOW model
model = Word2Vec(sentences=corpus,
                 sg=0,                 # 0 selects CBOW (1 selects skip-gram)
                 vector_size=100,
                 window=2,
                 min_count=1,
                 workers=4,
                 epochs=20)

# Get word embeddings
word_embeddings = model.wv

# Example usage
print(word_embeddings['apple'])  # Get the embedding vector for the word 'apple'

[-7.1908082e-03  4.2325766e-03  2.1617643e-03  7.4409596e-03
 -4.8884735e-03 -4.5629623e-03 -6.0976543e-03  3.2995059e-03
 -4.5010662e-03  8.5241403e-03 -4.2908899e-03 -9.1059618e-03
 -4.8195105e-03  6.4160223e-03 -6.3749091e-03 -5.2612112e-03
 -7.3046330e-03  6.0232445e-03  3.3607727e-03  2.8443786e-03
 -3.1379368e-03  6.0308878e-03 -6.1522140e-03 -1.9797122e-03
 -5.9856055e-03 -9.9437870e-04 -2.0221702e-03  8.4889755e-03
  7.5262338e-05 -8.5753957e-03 -5.4289075e-03 -6.8746139e-03
  2.6933060e-03  9.4538145e-03 -5.8159786e-03  8.2620177e-03
  8.5308459e-03 -7.0604258e-03 -8.8847755e-03  9.4710710e-03
  8.3781090e-03 -4.6936437e-03 -6.7288997e-03  7.8465613e-03
  3.7670960e-03  8.0935322e-03 -7.5730742e-03 -9.5260078e-03
  1.5809102e-03 -9.8075606e-03 -4.8859590e-03 -3.4576450e-03
  9.6275695e-03  8.6216936e-03 -2.8370069e-03  5.8246474e-03
  8.2333554e-03 -2.2656773e-03  9.5290951e-03  7.1597849e-03
  2.0436456e-03 -3.8520712e-03 -5.0809868e-03 -3.0489094e-03
  7.8873020e-03 -6.18963

In [None]:
print(word_embeddings.similarity('apple', 'banana'))  # Compute similarity between two words
print(word_embeddings.most_similar('car'))  # Find most similar words to 'car'

-0.098114535
[('keyboard', 0.17277397215366364), ('pet', 0.1669520139694214), ('cat', 0.11120960861444473), ('bus', 0.10944785177707672), ('technology', 0.07959192991256714), ('computer', 0.04128841683268547), ('vehicle', 0.037712957710027695), ('apple', 0.013238158077001572), ('mouse', 0.008328912779688835), ('animal', -0.005901669152081013)]


## **3- GloVe**

**GloVe (Global Vectors for Word Representation):**

- GloVe aims to capture the global **co-occurrence statistics** of words in a corpus. It considers the overall word co-occurrence patterns, rather than just the local context of individual words.
- GloVe constructs a co-occurrence matrix that represents the frequency of word co-occurrences in the corpus. Each entry in the matrix represents **how often two words co-occur** within a certain context window.

- The key idea behind GloVe is that the ratio of co-occurrence probabilities of two words should encode meaningful information about their relationship.

- The model optimizes the embeddings by minimizing a weighted squared difference between the dot product of two word vectors (plus bias terms) and the logarithm of their co-occurrence count.
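
The two ingredients described above, the co-occurrence counts and the per-pair loss, can be sketched in plain Python. The corpus and vectors are hypothetical toy values. The weighting function with `x_max=100` and `alpha=0.75` follows the values reported in the GloVe paper; the counts here are left unweighted by distance, which is a simplification.

```python
import math
from collections import Counter

def cooccurrence_counts(sentences, window):
    """Count how often each ordered word pair co-occurs within the window."""
    counts = Counter()
    for tokens in sentences:
        for i, word in enumerate(tokens):
            for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                if j != i:
                    counts[(word, tokens[j])] += 1
    return counts

def glove_weight(x, x_max=100.0, alpha=0.75):
    """Weighting function f(x): down-weights rare pairs, caps frequent ones at 1."""
    return (x / x_max) ** alpha if x < x_max else 1.0

def glove_pair_loss(w_i, w_j, b_i, b_j, x_ij):
    """f(X_ij) * (w_i . w_j + b_i + b_j - log X_ij)^2 for one word pair."""
    dot = sum(a * b for a, b in zip(w_i, w_j))
    return glove_weight(x_ij) * (dot + b_i + b_j - math.log(x_ij)) ** 2

sentences = [["ice", "is", "cold"], ["steam", "is", "hot"], ["ice", "is", "solid"]]
counts = cooccurrence_counts(sentences, window=2)
print(counts[("ice", "is")], counts[("steam", "is")])

# Loss for the pair ("ice", "is") with hypothetical 2-D vectors and zero biases
loss = glove_pair_loss([0.5, 0.5], [0.5, 0.5], 0.0, 0.0, counts[("ice", "is")])
print(loss)
```

The full objective sums this term over all pairs with a non-zero count and is minimized with respect to the vectors and biases.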


# **GloVe using the FLAIR Library**

We will use [FLAIR: An Easy-to-Use Framework for State-of-the-Art NLP](https://www.aclweb.org/anthology/N19-4010/). FLAIR makes it easy to get word and document embeddings using a large number of state-of-the-art (SOTA) models.

In [None]:
#install FLAIR
!pip install flair



In [None]:
from flair.data import Sentence # represent a sentence
from flair.embeddings import WordEmbeddings

from termcolor import colored #add color to text output

# initialize embedding by specifying which model we want to use
glove_embedding = WordEmbeddings('glove')


In [None]:
# Create a sentence. The Sentence class holds the tokens and metadata of a text
glove_sentence = Sentence('We are travelling to Italy to watch a famous play')
print(glove_sentence)
print(glove_sentence.tokens)

# Sentence splits the text into tokens. Let's access the first token
print(glove_sentence[0])

Sentence[10]: "We are travelling to Italy to watch a famous play"
[Token[0]: "We", Token[1]: "are", Token[2]: "travelling", Token[3]: "to", Token[4]: "Italy", Token[5]: "to", Token[6]: "watch", Token[7]: "a", Token[8]: "famous", Token[9]: "play"]
Token[0]: "We"


In [None]:
# Print each token's embedding. The tensors are empty because the sentence has not been embedded yet
for token in glove_sentence:
    print(colored(token,attrs=['bold']))

    #print the embedding for each token
    print(token.embedding)

In [None]:
# embed a sentence using glove.
glove_embedding.embed(glove_sentence)

# now check out the embedded tokens.
for token in glove_sentence:
    print(colored(token,attrs=['bold']))
    #print the embedding for each token
    print(token.embedding)

Token[0]: "We"
tensor([-0.1779,  0.6267,  0.4787, -0.5530, -0.8493, -0.0708, -0.3472,  0.4628,
         0.1261, -0.2488,  0.4688,  0.0836,  0.5606, -0.2193,  0.0156, -0.5581,
        -0.2074,  0.9123, -1.2034,  0.3011,  0.4668,  0.4830, -0.1020, -0.5680,
        -0.0271,  0.4057, -0.1406, -0.5548,  0.0946, -0.6221, -0.3034,  0.6064,
         0.0498,  0.2220,  0.4855,  0.1763, -0.0905,  0.5371,  0.2755, -0.7883,
        -0.7095, -0.1668,  0.1121, -0.4849, -0.6664,  0.0840,  0.3289, -0.4585,
        -0.3721, -1.5315,  0.1299, -0.2409, -0.1722,  1.3740, -0.2231, -2.6150,
         0.3520,  0.3360,  1.6117,  0.9295, -0.3753,  0.8203, -1.0677, -0.4533,
         1.2332,  0.2375,  0.6352,  0.8286, -0.1744, -0.5853,  0.5634, -0.7309,
         0.3081, -1.0888,  0.4614,  0.0454, -0.1783, -0.0541, -0.8831,  0.0339,
         0.6308, -0.1974, -0.9905,  0.2002, -1.9266, -0.2588,  0.1037, -0.3413,
        -0.9351, -0.5467, -0.4017, -0.3778, -0.0658, -0.1384, -0.9187, -0.0556,
        -0.0806, -0.1953,

In [None]:
#print the embedding for the word "play"

print(colored("The embedding of the word play",attrs=['bold']))
print(glove_sentence[9].embedding)

The embedding of the word play
tensor([-0.2408,  0.0247,  0.6461, -0.4000, -0.3512,  0.7456,  0.2530,  0.1407,
        -0.9319, -0.3551, -0.0583, -0.4629, -0.3528,  0.1506, -0.1548,  0.2209,
         0.1969,  0.9385, -0.3012,  0.6651,  0.0238,  0.1202,  0.4089,  0.3576,
         0.7272, -0.3942, -0.3571, -0.5079,  0.7247,  0.5239, -1.4761,  0.9837,
         0.1517, -0.2047,  0.4378, -0.3446, -0.5340,  0.5334, -0.6866, -0.5667,
         0.3157, -0.0532, -0.1194, -0.1369, -0.1898, -0.1227,  0.1451, -0.6482,
         0.2514, -1.2370, -0.6425,  0.4000, -0.0588,  0.7735,  0.2392, -2.9341,
        -0.3087, -0.4429,  0.6963,  0.9167, -0.6856,  0.9386, -0.7600, -0.1033,
         0.5508, -0.0460,  0.2931,  0.6355, -0.6446, -0.0816, -0.0425, -0.6625,
         0.5626, -0.4048,  0.2786, -0.1148, -0.4131, -0.0099,  0.1606,  0.1285,
         0.4992, -0.0717, -0.5237, -0.0472, -1.7793, -0.1810, -0.3947,  0.1824,
        -0.1078, -0.2051, -0.5151,  0.1035, -0.3436,  0.1991, -0.4173,  0.0461,
        -

In [None]:
#print the length of the embedding vector
print(colored("The size of the embedding vector of the word play",attrs=['bold']))
len(glove_sentence[9].embedding)

The size of the embedding vector of the word play


100

Let's create another sentence that contains the word **"play"** but with a different meaning.

In [None]:
# create sentence.
glove_sentence2 = Sentence('They play tennis on their break')

# embed a sentence using glove.
glove_embedding.embed(glove_sentence2)

[Sentence[6]: "They play tennis on their break"]

In [None]:
#print the embedding of the word "play" in the first sentence
print(colored("The embedding of the word play in the first sentence",attrs=['bold']))
print(glove_sentence[9].embedding)

#print the embedding of the word "play" in the second sentence; you will notice it is identical to the embedding of "play" in the first sentence
print(colored("The embedding of the word play in the second sentence",attrs=['bold']))
print(glove_sentence2[1].embedding)

The embedding of the word play in the first sentence
tensor([-0.2408,  0.0247,  0.6461, -0.4000, -0.3512,  0.7456,  0.2530,  0.1407,
        -0.9319, -0.3551, -0.0583, -0.4629, -0.3528,  0.1506, -0.1548,  0.2209,
         0.1969,  0.9385, -0.3012,  0.6651,  0.0238,  0.1202,  0.4089,  0.3576,
         0.7272, -0.3942, -0.3571, -0.5079,  0.7247,  0.5239, -1.4761,  0.9837,
         0.1517, -0.2047,  0.4378, -0.3446, -0.5340,  0.5334, -0.6866, -0.5667,
         0.3157, -0.0532, -0.1194, -0.1369, -0.1898, -0.1227,  0.1451, -0.6482,
         0.2514, -1.2370, -0.6425,  0.4000, -0.0588,  0.7735,  0.2392, -2.9341,
        -0.3087, -0.4429,  0.6963,  0.9167, -0.6856,  0.9386, -0.7600, -0.1033,
         0.5508, -0.0460,  0.2931,  0.6355, -0.6446, -0.0816, -0.0425, -0.6625,
         0.5626, -0.4048,  0.2786, -0.1148, -0.4131, -0.0099,  0.1606,  0.1285,
         0.4992, -0.0717, -0.5237, -0.0472, -1.7793, -0.1810, -0.3947,  0.1824,
        -0.1078, -0.2051, -0.5151,  0.1035, -0.3436,  0.1991, -0.41

Check whether the word **"play"** has the same embedding in both sentences when **GloVe** is used.

In [None]:
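from scipy import spatial

# Convert the torch tensors to NumPy arrays (on the CPU) before passing them to SciPy
play_1 = glove_sentence[9].embedding.detach().cpu().numpy()
play_2 = glove_sentence2[1].embedding.detach().cpu().numpy()
similarity = 1 - spatial.distance.cosine(play_1, play_2)
similarity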

1

**Notice that the similarity is equal to 1 when GloVe is used, which means the two embeddings are identical. GloVe is a static embedding: each word has a single vector regardless of the sentence it appears in.**
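
Why the similarity is 1 can be illustrated without any library: a static embedding is a lookup table, so the same word always maps to the same vector. The table below contains hypothetical 3-dimensional values and the helper names are illustrative.

```python
import math

# Hypothetical static lookup table: one fixed vector per word, as in GloVe
static_table = {
    "play":   [0.2, -0.4, 0.7],
    "tennis": [0.9, 0.1, -0.3],
    "famous": [-0.5, 0.6, 0.2],
}

def embed(tokens, table):
    """Look up each token; the surrounding words play no role."""
    return [table[t] for t in tokens]

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

sentence_1 = ["famous", "play"]   # "play" as a theatre piece
sentence_2 = ["play", "tennis"]   # "play" as a verb

play_1 = embed(sentence_1, static_table)[1]
play_2 = embed(sentence_2, static_table)[0]

similarity = cosine_similarity(play_1, play_2)
print(play_1 == play_2)
print(similarity)
```

Contextual models differ precisely here: they compute the vector from the whole sentence, so the two occurrences would generally receive different embeddings.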

### **Exercise1**
Choose either the word "rose" or "tie" and create two sentences that use the same word with different meanings. Use GloVe to get the word embeddings, then check the similarity between the embeddings of the shared word in the two sentences.

In [None]:
#add your solution here

### **References**

* https://radimrehurek.com/gensim/index.html
* https://radimrehurek.com/gensim/models/word2vec.html
* Word embeddings introduction: https://www.analyticsvidhya.com/blog/2017/06/word-embeddings-count-word2veec/
* Word2Vec introduction: http://mccormickml.com/2016/04/19/word2vec-tutorial-the-skip-gram-model/
* A great Gensim implementation tutorial: http://kavita-ganesan.com/gensim-word2vec-tutorial-starter-code/#.W467ScBjM2x
* Original articles from Mikolov et al.: https://arxiv.org/abs/1301.3781 and https://arxiv.org/abs/1310.4546
* [Flair word embeddings tutorial.](https://github.com/flairNLP/flair/blob/master/resources/docs/TUTORIAL_3_WORD_EMBEDDING.md)
* [Flair ELMo embedding tutorial.](https://github.com/flairNLP/flair/blob/master/resources/docs/embeddings/ELMO_EMBEDDINGS.md)
* [Flair document embeddings tutorial.](https://github.com/flairNLP/flair/blob/master/resources/docs/TUTORIAL_5_DOCUMENT_EMBEDDINGS.md)

