# Embeddings using Gensim Pre-trained Models
In this notebook, we look at the pre-trained word embeddings that are available through Gensim. Gensim provides access to a variety of pre-trained word embedding models that are useful for NLP tasks such as text classification, clustering, and semantic analysis. These models are trained on large datasets and can be used out of the box.

In [None]:
import gensim.downloader as api

First, we can have a look at all the models that are available through Gensim.

In [None]:
# api.info() returns a dictionary of metadata; its 'models' entry is keyed by model name
models = list(api.info()['models'].keys())
print(models)

We will load the model `fasttext-wiki-news-subwords-300`.

In [None]:
ft_model = api.load('fasttext-wiki-news-subwords-300')

The loaded model offers several methods. For instance, we can look up the embedding vector of a given word.

In [None]:
vector = ft_model['queen']
print(f"Vector for 'queen':\n{vector}")

We can find the most similar words to a word we specify.

In [None]:
similar_words = ft_model.most_similar('queen', topn=5)
print(f"Most similar words to 'queen':\n{similar_words}")

We can calculate the similarity between two words.

In [None]:
similarity = ft_model.similarity('woman', 'man')
print(f"Similarity between 'woman' and 'man':\n{similarity}")

Or the distance (dissimilarity) between two words.

In [None]:
distance = ft_model.distance('woman', 'man')
print(f"Distance between 'woman' and 'man':\n{distance}")
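
For intuition, the two numbers above are related: `similarity` is the cosine of the angle between the two word vectors, and `distance` is one minus that value. The sketch below reproduces this relationship in plain Python. The vectors in `toy_vectors` are made up for illustration and are not taken from the model, and the helper names `cosine_similarity` and `cosine_distance` are our own.

```python
import math

# Made-up 3-dimensional vectors, for illustration only
# (the real embeddings in this notebook have 300 dimensions).
toy_vectors = {
    "woman": [0.9, 0.1, 0.4],
    "man": [0.8, 0.3, 0.3],
    "car": [0.1, 0.9, 0.0],
}

def cosine_similarity(u, v):
    """Cosine of the angle between u and v: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_distance(u, v):
    """Cosine distance, defined as 1 - cosine similarity."""
    return 1.0 - cosine_similarity(u, v)

sim = cosine_similarity(toy_vectors["woman"], toy_vectors["man"])
dist = cosine_distance(toy_vectors["woman"], toy_vectors["man"])
print(f"similarity = {sim:.3f}, distance = {dist:.3f}")
```

With the real model, `ft_model.similarity('woman', 'man')` and `ft_model.distance('woman', 'man')` should likewise add up to 1, apart from floating-point rounding.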

We can specify a list and find the word that does not match the group.

In [None]:
odd_word = ft_model.doesnt_match(['breakfast', 'lunch', 'dinner', 'car'])
print(f"Odd word out: {odd_word}")
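
Roughly speaking, `doesnt_match` scales every word vector to unit length, averages them, and returns the word whose vector is least similar to that average. The sketch below is a simplified re-implementation under that assumption, not Gensim's actual code. The vectors in `toy_vectors` are made up, and `unit` and `odd_one_out` are hypothetical helper names.

```python
import math

# Made-up 2-dimensional vectors, for illustration only.
toy_vectors = {
    "breakfast": [0.9, 0.1],
    "lunch": [0.8, 0.2],
    "dinner": [0.85, 0.15],
    "car": [0.1, 0.9],
}

def unit(v):
    """Scale v to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def odd_one_out(words, vectors):
    """Return the word whose unit vector is least similar to the group mean."""
    unit_vecs = [unit(vectors[w]) for w in words]
    dim = len(unit_vecs[0])
    # Component-wise mean of the unit vectors, scaled back to unit length.
    mean = unit([sum(vec[i] for vec in unit_vecs) / len(unit_vecs) for i in range(dim)])
    # For unit vectors, the dot product equals the cosine similarity.
    scores = [sum(a * b for a, b in zip(vec, mean)) for vec in unit_vecs]
    return words[scores.index(min(scores))]

print(odd_one_out(["breakfast", "lunch", "dinner", "car"], toy_vectors))
```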

We can also find words by analogy (e.g., "man is to king as woman is to ___"):

In [None]:
analogy = ft_model.most_similar(positive=['king', 'woman'], negative=['man'], topn=1)
print(f"Man is to king as woman is to: {analogy}")

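To better understand what is going on, we do the math behind the above function call ourselves. The embedding of `man` is subtracted from the embedding of `king` to give the "royalty" direction. If we add that to the embedding of `woman`, we get the `queen_vector`, which we expect to be close to the embedding of `queen`. Note that the ranking can differ from the call above: `most_similar` with `positive` and `negative` works on length-normalized vectors and leaves the input words out of the results, whereas here we pass a raw vector, so `king` or `woman` may appear in the list. Let's see: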

In [None]:
queen_vector = ft_model['king'] - ft_model['man'] + ft_model['woman']
ft_model.most_similar(queen_vector, topn=5)
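
The same arithmetic can be followed by hand on a tiny example. The sketch below uses made-up 3-dimensional vectors whose components loosely stand for "royalty", "male", and "female". The values are invented and the helpers `cosine` and `nearest` are hypothetical, so this only illustrates the idea and does not reproduce the model's results. The `exclude` argument mimics leaving the input words out of the ranking.

```python
import math

# Made-up vectors; components loosely mean (royalty, male, female).
toy_vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.1, 0.5, 0.5],
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def nearest(target, vectors, exclude=(), topn=3):
    """Rank words by cosine similarity to target, skipping excluded words."""
    scored = [(word, cosine(target, vec))
              for word, vec in vectors.items() if word not in exclude]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]

# king - man + woman, computed component by component.
target = [k - m + w for k, m, w in
          zip(toy_vectors["king"], toy_vectors["man"], toy_vectors["woman"])]

ranked = nearest(target, toy_vectors, exclude={"king", "man", "woman"})
print(ranked)
```

In this toy setup the target vector keeps the "royalty" component of `king` and swaps the "male" component for the "female" one, which is why it lands closest to the toy `queen` vector.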

**Exercise:**

Now it is your turn. Complete the following tasks:
1.  Find the 3 most similar words to the word 'university'.
2.  Calculate the similarity between the words 'computer' and 'robot'. Compare it to the similarity between 'computer' and 'tablet'.
3.  Calculate the distance between 'queen' and 'king'. Compare it to the distance between 'prince' and 'princess'.
4.  Find the odd word from the list `['Monday', 'Tuesday', 'March', 'Wednesday']`.
5.  Find the analogy for: "man is to father as woman is to ___".