1\. Linguistic features
-----------------------

00:00 - 00:09

Welcome! In this video, we will cover more details on POS tagging and introduce dependency parsing.


2\. POS tagging
---------------

00:09 - 00:44

We have learned how to use spaCy to extract part-of-speech tags. Each word receives a POS tag that depends on its context: the surrounding words and their POS tags. For example, given a tricky sentence such as "My cat will fish for a fish tomorrow in a fishy way.", the spaCy tagger predicts the correct tags for the words fish and fishy: it tags the first fish as VERB, the second fish as NOUN, and fishy as ADJ.

POS tags depend on the context, surrounding words and their tags

```python
import spacy
nlp = spacy.load("en_core_web_sm")
text = "My cat will fish for a fish tomorrow in a fishy way."
print([(token.text, token.pos_, spacy.explain(token.pos_))
      for token in nlp(text)])
```
```
[('My', 'PRON', 'pronoun'), ('cat', 'NOUN', 'noun'), ('will', 'AUX', 'auxiliary'), ('fish', 'VERB', 'verb'), ('for', 'ADP', 'adposition'), ('a', 'DET', 'determiner'), ('fish', 'NOUN', 'noun'), ('tomorrow', 'NOUN', 'noun'), ('in', 'ADP', 'adposition'), ('a', 'DET', 'determiner'), ('fishy', 'ADJ', 'adjective'), ('way', 'NOUN', 'noun'), ('.', 'PUNCT', 'punctuation')]
```
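To make the context dependence concrete, the tagger output above can be grouped by surface form. This is a minimal sketch that hard-codes the (text, tag) pairs from the printed output instead of calling spaCy, so it runs without a model; `tags_by_word` is a hypothetical helper name, not part of spaCy.

```python
from collections import defaultdict

# (text, POS tag) pairs copied from the tagger output above.
tagged = [("My", "PRON"), ("cat", "NOUN"), ("will", "AUX"), ("fish", "VERB"),
          ("for", "ADP"), ("a", "DET"), ("fish", "NOUN"), ("tomorrow", "NOUN"),
          ("in", "ADP"), ("a", "DET"), ("fishy", "ADJ"), ("way", "NOUN"),
          (".", "PUNCT")]

def tags_by_word(pairs):
    """Collect the distinct POS tags observed for each lowercased surface form."""
    seen = defaultdict(list)
    for text, tag in pairs:
        if tag not in seen[text.lower()]:
            seen[text.lower()].append(tag)
    return dict(seen)

# Keep only the words that received more than one tag in this sentence.
ambiguous = {w: t for w, t in tags_by_word(tagged).items() if len(t) > 1}
print(ambiguous)
```

Only "fish" appears with more than one tag here, which is the case the sentence was built to show.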

3\. What is the importance of POS?
----------------------------------

00:44 - 01:05

Now, the question we might ask is what is the importance of POS tags? Many applications need to know the word type for better accuracy. For example, in translation systems, the word fish as a verb and as a noun will map to different words in Spanish.

Better accuracy for many NLP tasks

```
I will fish tomorrow.
I ate fish.
```

Translation system use case

```
verb -> pescaré
noun -> pescado
```

4\. What is the importance of POS?
----------------------------------

01:05 - 01:46

Syntactic information such as POS tags can help many tasks further down the pipeline, such as word-sense disambiguation (WSD). WSD is the classical problem of deciding in which sense a word is used in a sentence. Determining the sense of a word can be crucial in search engines, machine translation, and question-answering systems. For example, for the word "play", the POS tagger can help with WSD by labeling each occurrence as a NOUN or a VERB depending on its context.

Word-sense disambiguation (WSD) is the problem of deciding in which sense a word is used in a sentence.

Determining the sense of the word can be crucial in machine translation, etc.

| Word | POS tag | Description |
|------|---------|-------------|
| Play | VERB | engage in activity for enjoyment and recreation |
| Play | NOUN | a dramatic work for the stage or to be broadcast |
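One way to read the table is as a lookup keyed by word and POS tag. The sketch below assumes a hand-written sense inventory (`SENSES` and `lookup_sense` are hypothetical names) and reuses the glosses from the table; it does not call spaCy.

```python
# Hypothetical sense inventory keyed by (lowercased word, coarse POS tag).
# The glosses are the ones listed in the table above.
SENSES = {
    ("play", "VERB"): "engage in activity for enjoyment and recreation",
    ("play", "NOUN"): "a dramatic work for the stage or to be broadcast",
}

def lookup_sense(word, pos_tag, inventory=SENSES):
    """Return the gloss for (word, POS), or None if the pair is not listed."""
    return inventory.get((word.lower(), pos_tag))

print(lookup_sense("Play", "VERB"))
print(lookup_sense("play", "NOUN"))
print(lookup_sense("play", "ADJ"))  # not in the inventory, so None
```

A lookup like this can only separate senses that differ in word class; two noun senses of the same word would need more context than the POS tag.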

5\. Word-sense disambiguation
-----------------------------

01:46 - 02:27

Let's use POS tagging for WSD. We loop over each token in the Doc container and, if "fish" is in the token text, create a tuple of the token text and its `.pos_` tag. The word fish in "I will fish tomorrow" has a `.pos_` tag of VERB, which correctly identifies its sense as "to catch fish". In the sentence "I ate fish", the word fish has a `.pos_` tag of NOUN, which identifies the sense as "an animal".

```python
import spacy
nlp = spacy.load("en_core_web_sm")

verb_text = "I will fish tomorrow."
noun_text = "I ate fish."

print([(token.text, token.pos_) for token in nlp(verb_text) if "fish" in token.text], "\n")
print([(token.text, token.pos_) for token in nlp(noun_text) if "fish" in token.text])
```
```
[('fish', 'VERB')]
[('fish', 'NOUN')]
```

6\. Dependency parsing
----------------------

02:27 - 03:31

We have learned about POS tags, which are grammatical categories of words. POS tags do not reveal any relation between distant words in a given sentence. This is where dependency parsing comes in. It provides a structured way of exploring sentence syntax by analyzing sentence structure via dependencies between tokens. A dependency, or dependency relation, is a directed link between two tokens. The result of this procedure is always a tree. For example, for the sentence "We understand the differences.", spaCy assigns a dependency label to each token, such as "nsubj", "dobj" and "det". The first arc, labeled nsubj, shows the subject-verb relationship between "We" and "understand".

- Explores a sentence syntax
- Links between two tokens
- Results in a tree

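Figure: dependency tree for "We understand the differences." with arcs nsubj (We, understand), dobj (understand, differences) and det (the, differences). POS tags: We PRON, understand VERB, the DET, differences NOUN.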
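The claim that the result is always a tree can be checked on the example sentence. This is a sketch under stated assumptions: the arcs are written by hand as (head position, label, dependent position) triples, punctuation is left out as in the figure, and `is_tree` is a hypothetical helper rather than a spaCy function.

```python
# Arcs for "We understand the differences." as (head, label, dependent),
# using token positions so repeated words would not clash.
tokens = ["We", "understand", "the", "differences"]
arcs = [(1, "nsubj", 0),   # understand -> We
        (1, "dobj", 3),    # understand -> differences
        (3, "det", 2)]     # differences -> the

def is_tree(n_tokens, arcs):
    """True if exactly one token has no head and no token lies on a cycle."""
    heads = {}
    for head, _label, dep in arcs:
        if dep in heads:          # a token may have only one head
            return False
        heads[dep] = head
    roots = [i for i in range(n_tokens) if i not in heads]
    if len(roots) != 1:
        return False
    # Walk up from every token; revisiting a token means there is a cycle.
    for i in range(n_tokens):
        seen = set()
        while i in heads:
            if i in seen:
                return False
            seen.add(i)
            i = heads[i]
    return True

# The root is the only token that never appears as a dependent.
root = next(i for i in range(len(tokens)) if i not in {d for _, _, d in arcs})
print(tokens[root], is_tree(len(tokens), arcs))
```

The single token without a head is the root of the sentence, here the verb "understand".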

7\. Dependency parsing and spaCy
--------------------------------

03:31 - 03:55

A dependency label describes the type of syntactic relation between two tokens. A few of the most common dependency labels are listed in the table: nsubj (nominal subject), root, det (determiner), dobj (direct object) and aux (auxiliary).

A dependency label describes the type of syntactic relation between two tokens

| Dependency label | Description |
|-----------------|-------------|
| nsubj | Nominal subject |
| root | Root |
| det | Determiner |
| dobj | Direct object |
| aux | Auxiliary |

8\. Dependency parsing and displaCy
-----------------------------------

03:55 - 04:40

Let's draw our first dependency tree using displaCy. We call `spacy.displacy.serve()` with two arguments: a Doc container for the text and `style="dep"` (dependency) to display a dependency tree. In a dependency relation, one of the tokens is the parent and the other is its dependent. For example, in the dependency relation between the words "the" and "differences", "differences" is the parent, "the" is the dependent, and the dependency label is "det", which stands for determiner.

displaCy can draw dependency trees

```python
doc = nlp("We understand the differences.")
spacy.displacy.serve(doc, style="dep")
```

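Figure: displaCy rendering of the dependency tree for "We understand the differences.", with arcs labeled nsubj, dobj and det above the tokens and the POS tags PRON, VERB, DET and NOUN below them.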

9\. Dependency parsing and spaCy
--------------------------------

04:40 - 04:58

We use the `.text` and `.dep_` attributes of a token to access its text and dependency label. We can also use the `spacy.explain()` function to view the definition of each dependency label.

`.dep_` attribute to access the dependency label of a token

```python
doc = nlp("We understand the differences.")
print([(token.text, token.dep_, spacy.explain(token.dep_)) for token in doc])
```
```
[('We', 'nsubj', 'nominal subject'), ('understand', 'ROOT', 'root'),
('the', 'det', 'determiner'), ('differences', 'dobj', 'direct object'),
('.', 'punct', 'punctuation')]
```
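Once each token carries a label, picking out the subject and object is a simple filter. The sketch below works on the (text, label) pairs copied from the output above rather than on a live `Doc`; `find_by_label` is a hypothetical helper name.

```python
# (text, dependency label) pairs copied from the output above.
parsed = [("We", "nsubj"), ("understand", "ROOT"), ("the", "det"),
          ("differences", "dobj"), (".", "punct")]

def find_by_label(pairs, label):
    """Return the texts of all tokens carrying the given dependency label."""
    return [text for text, dep in pairs if dep == label]

print("subject:", find_by_label(parsed, "nsubj"))
print("verb:   ", find_by_label(parsed, "ROOT"))
print("object: ", find_by_label(parsed, "dobj"))
```

With a real `Doc`, the same filter would read `token.dep_` instead of the second tuple element.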

10\. Let's practice!
--------------------

04:58 - 05:02

Great job! Let's practice.

#### Linguistic annotations in spaCy

Linguistic annotations that are available in spaCy, such as POS tagging, named-entity recognition, and dependency parsing, can be used to better understand writing quality and learn about many aspects of human language, including words, sentences, and meaning. In this exercise, you'll practice your learnings on different linguistic features.

##### Instructions

-   Drag and drop a definition or spaCy token attribute into the correct bucket of the corresponding linguistic feature.


| POS tagging | Dependency parsing | Named-entity recognition |
|-------------|-------------------|------------------------|
| Categorizes words based on their function and context within a sentence | `.dep_` attribute of a Token object | `.label_` attribute of a Span object |
| `.pos_` attribute of a Token object | Analyzing sentence structure via dependencies between the tokens | Classify entities into corresponding categories |

Word-sense disambiguation with spaCy
====================================

WSD is a classical problem of deciding in which sense a word is used in a sentence. Determining the sense of the word can be crucial in search engines, machine translation, and question-answering systems. In this exercise, you will practice using POS tagging for word-sense disambiguation. 

There are two sentences containing the word **jam**, with two different senses and you are tasked to identify the POS tags to help you determine the corresponding sense of the word in a given sentence. 

The two sentences are available in the `texts` list. The `en_core_web_sm` model is already loaded and available for your use as `nlp`.

Instructions 1/2
----------------

-   Create a `documents` list containing the `Doc` containers of each element in the `texts` list.
-   Print a tuple of the token's text and POS tags per each `Doc` container only if the word **jam** is in the token text.

```python
texts = ["This device is used to jam the signal.",
         "I am stuck in a traffic jam"]

# Create a list of Doc containers in the texts list
documents = [nlp(t) for t in texts]

# Print a token's text and POS tag if the word jam is in the token's text
for i, doc in enumerate(documents):
    print(f"Sentence {i+1}: ", [(token.text, token.pos_) for token in doc if "jam" in token.text], "\n")
```

Instructions 2/2
----------------

Question
--------

The word **jam** has multiple senses. In the sentence **"This device is used to jam the signal."**, what is the sense of the word **jam**?

### Possible answers

[/] VERB: become or make unable to move or work due to a part seizing or becoming stuck.

NOUN: an instance of a machine or thing seizing or becoming stuck.

NOUN: an awkward situation or predicament.

Dependency parsing with spaCy
=============================

Dependency parsing analyzes the grammatical structure in a sentence and finds out related words as well as the type of relationship between them. An application of dependency parsing is to identify a sentence object and subject. In this exercise, you will practice extracting dependency labels for given texts. 

Three comments from the Airline Travel Information System (ATIS) dataset have been provided for you in a list called `texts`. The `en_core_web_sm` model is already loaded and available for your use as `nlp`.

Instructions
------------

-   Create a `documents` list containing the `Doc` containers of each element in the `texts` list.
-   Print a tuple of (the token's text, dependency label, and label's explanation) for each `Doc` container.

```python
# Create a list of Doc containers of texts list
documents = [nlp(t) for t in texts]

# Print each token's text, dependency label and its explanation
for doc in documents:
    print([(token.text, token.dep_, spacy.explain(token.dep_)) for token in doc], "\n")
```

1\. Introduction to word vectors
--------------------------------

00:00 - 00:04

Welcome! Let's learn about word vectors.

2\. Word vectors (embeddings)
-----------------------------

00:04 - 01:41

Word vectors, or word embeddings, are numerical representations of words that allow computers to perform complex tasks using text data. The purpose of word vectors is to allow a computer to understand words. Computers cannot understand text as is, but they can process numbers efficiently. For this reason, we convert words into numbers. Traditional methods, such as the "bag-of-words" method, assign each word in a corpus a unique number when creating word vectors. These words are then stored in a dictionary where "I" can be mapped to one, "got" can be mapped to two, and so on. The older methods allow a computer to represent words numerically; however, they do not capture the meaning of the words. Consider an example with two sentences: "I got covid" and "I got coronavirus". With a bag-of-words model, these sentences are represented as the numerical arrays [1, 2, 3] and [1, 2, 4] respectively. The two sentences have the same meaning, but they have different numerical representations. The computer has no way of knowing that the words "covid" and "coronavirus" refer to the same thing. The model just sees these as two different words represented by two different numbers. Hence, the model is oblivious to context and semantics.

- Numerical representations of words
- Bag of words method:
```python
{"I": 1, "got": 2, ...}
```

- Older methods do not capture the meaning of words:

| Sentences | I | got | covid | coronavirus |
|-----------|---|-----|-------|-------------|
| I got covid | 1 | 2 | 3 | |
| I got coronavirus | 1 | 2 | | 4 |

3\. Word vectors
----------------

01:41 - 02:33

But all hope is not lost. We can use recent methodologies to find word vectors that can be used to teach a computer if two words have similar meanings. Word vectors have a pre-defined number of dimensions. Statistical and machine learning models take into account word frequencies in a corpus and the presence of other words in similar contexts. A computer can then use this information to understand the similarity of words numerically by using vectors. For example, the table shows 7-dimensional word vectors that can help distinguish animals from houses or cats from dogs by capturing different aspects of these words from their surrounding context in a large corpus of text.

- A pre-defined number of dimensions
- Considers word frequencies and the presence of other words in similar contexts

| Word/Dimension | living being | feline | human | gender | royalty | verb | plural |
|---------------|--------------|--------|-------|--------|---------|------|--------|
| cat | 0.6 | 0.9 | 0.1 | 0.4 | -0.7 | -0.3 | -0.2 |
| kitten | 0.5 | 0.8 | -0.1 | 0.2 | -0.6 | -0.5 | -0.1 |
| dog | 0.7 | -0.1 | 0.4 | 0.3 | -0.4 | -0.1 | -0.3 |
| houses | -0.8 | -0.4 | -0.5 | 0.1 | -0.9 | 0.3 | 0.8 |
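To see how such vectors let a computer compare words numerically, the rows of the table can be compared directly. The sketch below assumes cosine similarity as the measure, a common choice that the table itself does not prescribe.

```python
import math

# Vectors copied from the table above (dimensions in the same column order).
vectors = {
    "cat":    [0.6, 0.9, 0.1, 0.4, -0.7, -0.3, -0.2],
    "kitten": [0.5, 0.8, -0.1, 0.2, -0.6, -0.5, -0.1],
    "dog":    [0.7, -0.1, 0.4, 0.3, -0.4, -0.1, -0.3],
    "houses": [-0.8, -0.4, -0.5, 0.1, -0.9, 0.3, 0.8],
}

def cosine(u, v):
    """Cosine of the angle between u and v: 1 = same direction, -1 = opposite."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

for other in ("kitten", "dog", "houses"):
    print(f"cat vs {other}: {cosine(vectors['cat'], vectors[other]):.2f}")
```

With these numbers, "cat" scores highest against "kitten", lower against "dog", and lowest against "houses".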

4\. Word vectors
----------------

02:33 - 03:12

There are multiple approaches to produce word vectors. Some of the most well-known algorithms are word2vec, GloVe, fastText, and transformer-based models. To train n-dimensional word vectors, word2vec and fastText use neural network architectures, GloVe uses a word co-occurrence matrix, and transformer-based models use more complex architectures. spaCy uses some of these methodologies to enable access to word vectors.

- Multiple approaches to produce word vectors:
  - word2vec, GloVe, fastText and transformer-based architectures

- An example of a word vector:
```python
array([ 2.2407   ,  1.0389   ,  1.3092   , -1.7335   , -0.78466  ,
       -0.29269  , -1.8859   , -2.5223   ,  0.78025  ,  2.4899   ,
       -0.091849 ,  0.28755  , -1.5057   ,  2.6337   ,  2.5252   ,
       -0.22432  , -2.2068   , -0.57895  , -0.56551  , -1.9338   ,
        1.4973   ,  0.85889  ,  3.3559   , -3.7527   ,  0.22585  ,
       -0.16969  ,  0.51389  ,  0.46073  , -0.28248  , -2.6048   ,
       -3.5896   , -1.0971   , -1.5517   , -0.12185  ,  2.8633   ,
       -1.2525   , -1.6924   , -2.2917   ,  0.97793  ,  0.46954  ,
       -3.595    , -0.17357  ,  0.9805   , -1.8044   , -0.72183  ,
       -0.40709  , -3.0943   ,  0.13095  , -2.9015   ,  1.4768   ,
       -1.0588   , -2.8123   ,  1.2936   , -0.0075977,  2.9975   ,
       -2.4438   ,  0.12348  ,  1.8322   ,  0.35869  , -0.018335 ,
        1.9534   ,  1.4417   ,  0.99895  , -2.8209   , -0.75846  ,
       -1.8438   , -3.2658   , -0.46574  ,  0.90322  ,  0.79868  ,
       -1.6134   , -0.33082  ,  1.1541   , -4.7334   ,  1.4964   ,
       -2.4014   , -1.3461   , -0.95551  ,  0.29671  , -1.4506   ,
       -0.87128  , -3.0714   ,  1.3597   , -0.038133 ,  1.6414   ,
       -0.90879  ,  2.7406   ,  2.2951   , -3.1423   , -3.7525   ,
        0.74033  ,  1.4921   ,  0.47422  , -1.8337   , -1.8168   ,
        0.66901  , -1.3612   , -2.2729   , -1.7656   , -0.73968  ],
      dtype=float32)
```

5\. spaCy vocabulary
--------------------

03:12 - 03:49

Word vectors are part of many spaCy models; however, a few models do not have them. For instance, the small spaCy model, en_core_web_sm, does not have any word vectors, while the medium-sized model, en_core_web_md, has 20,000 word vectors. We can learn the size of the vocabulary and the word vector dimensions by checking the "vectors" key of `nlp.meta`.

- A part of many spaCy models.
- `en_core_web_md` has 300-dimensional vectors for 20,000 words.

```python
import spacy
nlp = spacy.load("en_core_web_md")
print(nlp.meta["vectors"])
```

```
>>> {'width': 300, 'vectors': 20000, 'keys': 514157, 
'name': 'en_vectors', 'mode': 'default'}
```
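The metadata numbers also give a rough sense of the size of the vector table. This back-of-the-envelope sketch copies the printed dictionary into a literal and assumes each value is stored as a 4-byte float32, which matches the `dtype=float32` of the example vector shown earlier.

```python
# Metadata copied from the printed output above.
vectors_meta = {"width": 300, "vectors": 20000, "keys": 514157,
                "name": "en_vectors", "mode": "default"}

BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as float32

n_rows = vectors_meta["vectors"]
width = vectors_meta["width"]
table_bytes = n_rows * width * BYTES_PER_FLOAT32

print(f"{n_rows} vectors x {width} dimensions")
print(f"approx. table size: {table_bytes / 1_000_000:.0f} MB")
print(f"keys per stored vector: {vectors_meta['keys'] / n_rows:.1f}")
```

The "keys" count is larger than the "vectors" count, which suggests that several keys can point to the same stored vector.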

6\. Word vectors in spaCy
-------------------------

03:49 - 04:51

When using spaCy, we can only extract vectors of words that exist in a model's vocabulary. We use the `nlp.vocab` attribute of a spaCy model to access its Vocab object. The `nlp.vocab.strings` attribute can then be used to access word IDs in the vocabulary, and `nlp.vocab.vectors` can be used to access the word vector of a word given its ID. For example, given the word "like", we first look up its ID using `nlp.vocab.strings["like"]`, then use this ID to access the corresponding word vector using `nlp.vocab.vectors[like_id]`.

- `nlp.vocab`: to access vocabulary (Vocab class)
- `nlp.vocab.strings`: to access word IDs in a vocabulary

```python
import spacy
nlp = spacy.load("en_core_web_md")
like_id = nlp.vocab.strings["like"]
print(like_id)
```

```
>>> 18194338103975822726
```

- `.vocab.vectors`: to access the word vectors of a model, or of a single word given its ID

```python
print(nlp.vocab.vectors[like_id])
```

```
>>> array([-2.3334e+00, -1.3695e+00, -1.1330e+00, -6.8461e-01, ...])
```
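The two-step lookup (word to ID, then ID to vector) can be imitated with plain dictionaries. This is a toy sketch, not spaCy's implementation: `ToyStringStore` is a hypothetical class, the SHA-256 based ID is an assumption made only to obtain a stable 64-bit integer, and unlike spaCy the toy stores a string the first time it is looked up.

```python
import hashlib

class ToyStringStore:
    """Toy two-way map between strings and 64-bit integer IDs (not spaCy's implementation)."""

    def __init__(self):
        self._by_id = {}

    def add(self, text):
        # Assumption for illustration: derive a stable 64-bit ID from a SHA-256 digest.
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        key = int.from_bytes(digest[:8], "big")
        self._by_id[key] = text
        return key

    def __getitem__(self, item):
        # A string returns its ID; an integer ID returns the stored string.
        if isinstance(item, str):
            return self.add(item)
        return self._by_id[item]

strings = ToyStringStore()
# First four values of the "like" vector from the output above.
vectors = {strings["like"]: [-2.3334, -1.3695, -1.1330, -0.68461]}

like_id = strings["like"]
print(strings[like_id], vectors[like_id])
```

The same object answers both directions of the lookup, which mirrors how `nlp.vocab.strings` is indexed with either a word or a word ID.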

7\. Let's practice!
-------------------

04:51 - 04:55

Great, let's exercise our learnings!

spaCy vocabulary
================

Word vectors, or word embeddings, are numerical representations of words that allow computers to perform complex tasks using text data. Word vectors are a part of many spaCy models, however, a few of the models do not have word vectors. 

In this exercise, you will practice accessing `spaCy` vocabulary information. Some meta information about word vectors is stored in each `spaCy` model. You can access this information to learn more about the vocabulary size, word vector dimensions, etc.

The `spaCy` package is already imported for your use. In a `spaCy` model's metadata, the number of words is stored as an element with the "**vectors**" key and the dimension of word vectors is stored as an element with the "**width**" key.

Instructions
------------

-   Load the `en_core_web_md` model.
-   Print the number of words in the `en_core_web_md` model's vocabulary.
-   Print the dimensions of word vectors in the `en_core_web_md` model.

```python
# Load the en_core_web_md model
md_nlp = spacy.load("en_core_web_md")

# Print the number of words in the model's vocabulary
print("Number of words: ", md_nlp.meta["vectors"]["vectors"], "\n")

# Print the dimensions of word vectors in en_core_web_md model
print("Dimension of word vectors: ", md_nlp.meta["vectors"]["width"])
```

Word vectors in spaCy vocabulary
================================

The purpose of word vectors is to allow a computer to understand words. In this exercise, you will practice extracting word vectors for a given list of words. 

A list of words is compiled as `words`. The `en_core_web_md` model is already imported and available as `nlp`. 

The vocabulary of `en_core_web_md` model contains 20,000 words. If a word does not exist in the vocabulary, you will not be able to extract its corresponding word vector. In this exercise, for simplicity, it is ensured that all the given words exist in this model's vocabulary.

Instructions
------------

-   Extract the IDs of all the given `words` and store them in an `ids` list.
-   For each ID from `ids`, store the first ten elements of the word vector in the `word_vectors` list.
-   Print the first ten elements of the first word vector from `word_vectors`.

```python
words = ["like", "love"]

# IDs of all the given words
ids = [nlp.vocab.strings[w] for w in words]

# Store the first ten elements of the word vectors for each word
word_vectors = [nlp.vocab.vectors[i][:10] for i in ids]

# Print the first ten elements of the first word vector
print(word_vectors[0])
```

1\. Word vectors and spaCy
--------------------------

00:00 - 00:08

Welcome! Let's learn how to visualize word vectors and utilize them to find similar contexts.

2\. Word vectors visualization
------------------------------

00:08 - 01:09

We can visualize word vectors in a scatter plot to help us understand how the vocabulary words are grouped. In order to visualize word vectors, we need to project them into a two-dimensional space. We can project vectors by extracting the two principal components via Principal Component Analysis (PCA). We won't go into further details on PCA, but it is a way to reduce a high-dimensional dataset into a dataset of fewer dimensions (two in this case). By applying PCA and projecting word vectors of words such as wonderful, horrible, apple, banana, orange, watermelon, dog, and cat in two-dimensional space, we see these words are grouped into three semantic classes (animals, fruits and emotional context). This shows that we are moving closer to finding the meaning of the words.

- Visualizing word vectors helps us understand how words are grouped
- Principal Component Analysis projects word vectors into a two-dimensional space

```python
array([ 2.2407   ,  1.0389   ,  1.3092   , -1.7335   , -0.78466  ,
       -0.29269  , -1.8859   , -2.5223   ,  0.78025  ,  2.4899   ,
       -0.091849 ,  0.28755  , -1.5057   ,  2.6337   ,  2.5252   ,
       -0.22432  , -2.2068   , -0.57895  , -0.56551  , -1.9338   ,
       1.4973   ,  0.85889  ,  3.3559   , -3.7527   ,  0.22585  ,
       -0.16969  ,  0.51389  ,  0.46073  , -0.28248  , -2.6048   ,
       -3.5896   , -1.0971   , -1.5517   , -0.12185  ,  2.8633   ,
       -1.2525   , -1.6924   , -2.2917   ,  0.97793  ,  0.46954  ,
       -3.595    , -0.17357  ,  0.9805   , -1.8044   , -0.72183  ,
       -0.40709  , -3.0943   ,  0.13095  , -2.9015   ,  1.4768   ,
       -1.0588   , -2.8123   ,  1.2936   , -0.0075977,  2.9975   ,
       -2.4438   ,  0.12348  ,  1.8322   ,  0.35869  , -0.018335 ,
       1.9534   ,  1.4417   ,  0.99895  , -2.8209   , -0.75846  ,
       -1.8438   , -3.2658   , -0.46574  ,  0.90322  ,  0.79868  ,
       -1.6134   , -0.33082  ,  1.1541   , -4.7334   ,  1.4964   ,
       -2.4014   , -1.3461   , -0.95551  ,  0.29671  , -1.4506   ,
       -0.87128  , -3.0714   ,  1.3597   , -0.038133 ,  1.6414   ,
       -0.90879  ,  2.7406   ,  2.2951   , -3.1423   , -3.7525   ,
        0.74033  ,  1.4921   ,  0.47422  , -1.8337   , -1.8168   ,
        0.66901  , -1.3612   , -2.2729   , -1.7656   , -0.73968  ],
      dtype=float32)
```

Figure: scatter plot of the word vectors projected into two dimensions with PCA. The words form three groups: wonderful and horrible; apple, banana, orange and watermelon; cat and dog.

3\. Word vectors visualization
------------------------------

01:09 - 01:44

Now that we have seen the projected word vectors, let us learn how to use the matplotlib, spaCy and sklearn packages to create such a visualization. First, we import the required libraries (matplotlib, PCA and numpy) and load a spaCy model. Then we extract word vectors for a given list of words by using `nlp.vocab.strings` and `nlp.vocab.vectors`. Finally, we stack these vectors vertically using the `np.vstack()` function for the PCA calculations.

- Import required libraries and a spaCy model.

```python
import spacy
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
import numpy as np
nlp = spacy.load("en_core_web_md")
```

- Extract word vectors for a given list of words and stack them vertically.

```python
words = ["wonderful", "horrible",
         "apple", "banana", "orange", "watermelon",
         "dog", "cat"]
word_vectors = np.vstack([nlp.vocab.vectors[nlp.vocab.strings[w]] for w in words])
```

4\. Word vectors visualizations
-------------------------------

01:44 - 02:20

Since the word vectors are 300-dimensional, we need to project them into two-dimensional space. We use the PCA class from sklearn and extract two principal components using the `pca.fit_transform()` method. We can then use these two components as the x and y coordinates of each word by accessing indices 0 and 1 of the transformed word vectors, and visualize a scatter plot using the `plt.scatter()`, `plt.text()` and `plt.show()` functions.

- Extract two principal components using PCA.

```python
pca = PCA(n_components=2)
word_vectors_transformed = pca.fit_transform(word_vectors)
```

- Visualize the scatter plot of transformed vectors.

```python
plt.figure(figsize=(10, 8))
plt.scatter(word_vectors_transformed[:, 0], word_vectors_transformed[:, 1])
for word, coord in zip(words, word_vectors_transformed):
    x, y = coord
    plt.text(x, y, word, size=10)
plt.show()
```

5\. Analogies and vector operations
-----------------------------------

02:20 - 03:25

Now on to analogies and the semantic understanding of words! Word vectors can capture semantics and can also support vector operations, such as vector addition and subtraction. A word analogy is a semantic relationship between a pair of words. There are many types of relationships, such as synonymy, antonymy, and whole-part relations. Some example pairs are (king - man, queen - woman) and (walked - walking, swam - swimming). Word vectors can capture remarkable analogies such as gender and tense. For example, we can represent the gender mapping between queen and king as queen - woman + man = king. We subtract woman from queen and add man instead, and we get king. This analogy reads as: queen is to king as woman is to man.

- A semantic relationship between a pair of words.
- Word embeddings generate analogies such as gender and tense:
  - queen - woman + man = king

Figure: two scatter plots of projected word vectors. In the gender analogy plot, the offset from woman to queen runs parallel to the offset from man to king. In the tense analogy plot, the offset from swimming to swam runs parallel to the offset from walking to walked.
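The arithmetic behind queen - woman + man = king can be tried on a toy example. The two-dimensional vectors below are hypothetical values chosen so that one axis loosely stands for royalty and the other for gender; real word vectors have hundreds of dimensions, and `analogy` is an illustrative helper rather than a spaCy function.

```python
import math

# Hypothetical 2-D vectors: first axis ~ "royalty", second axis ~ "gender".
vectors = {
    "man":   [0.0,  1.0],
    "woman": [0.0, -1.0],
    "king":  [1.0,  1.0],
    "queen": [1.0, -1.0],
    "apple": [-1.0, 0.0],
}

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def analogy(a, b, c, table):
    """Return the word closest to a - b + c, skipping the three input words."""
    target = [x - y + z for x, y, z in zip(table[a], table[b], table[c])]
    candidates = (w for w in table if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(table[w], target))

print(analogy("queen", "woman", "man", vectors))
```

Excluding the input words from the candidates is a common convention; without it, the nearest vector to the result is often one of the inputs.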

6\. Similar words in a vocabulary
---------------------------------

03:25 - 04:45

Now let us learn how we can use spaCy and its vocabulary to find words similar to a given term, such as "covid". For this purpose, we first extract the word vector of covid using `nlp.vocab.vectors` and `nlp.vocab.strings` as discussed before and convert it to a NumPy array using the `np.asarray()` function. Then we can use the `nlp.vocab.vectors.most_similar()` method to search the vectors of all the words in the vocabulary for the five most similar terms. We can see similar words such as COVID-19, corona and Covi in the output. These words are commonly present in the same context as the word "covid" and are semantically similar. The `most_similar()` method returns the word IDs of the most similar terms in the vocabulary by finding the word vectors that have the minimum distance to the word vector of covid. We then use `nlp.vocab.strings[<extracted word ID>]` to look up the similar words.

- spaCy finds terms that are semantically similar to a given term

```python
import numpy as np
import spacy
nlp = spacy.load("en_core_web_md")

word = "covid"
most_similar_words = nlp.vocab.vectors.most_similar(
    np.asarray([nlp.vocab.vectors[nlp.vocab.strings[word]]]), n=5)

words = [nlp.vocab.strings[w] for w in most_similar_words[0][0]]
print(words)
```

```
>>> ['Covi', 'CoVid', 'Covici', 'COVID-19', 'corona']
```
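The `[0][0]` indexing is easier to follow with a toy version of the search. The sketch below is not spaCy's implementation: it assumes cosine similarity as the ranking measure, returns a (keys, rows, scores) tuple with one list per query, and uses hypothetical IDs and two-dimensional vectors.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def most_similar(queries, keys, rows, n=1):
    """Toy top-n search. Returns (keys, row indices, scores), one list per query."""
    out_keys, out_rows, out_scores = [], [], []
    for q in queries:
        ranked = sorted(range(len(rows)), key=lambda i: cosine(q, rows[i]), reverse=True)[:n]
        out_keys.append([keys[i] for i in ranked])
        out_rows.append(ranked)
        out_scores.append([cosine(q, rows[i]) for i in ranked])
    return out_keys, out_rows, out_scores

# Hypothetical IDs and 2-D vectors, for illustration only.
id_to_word = {101: "cat", 102: "kitten", 103: "car"}
keys = list(id_to_word)
rows = [[0.9, 0.1], [0.8, 0.3], [-0.2, 1.0]]

result = most_similar([[0.9, 0.1]], keys, rows, n=2)
# result[0] holds the keys; result[0][0] holds the keys for the first query.
print([id_to_word[k] for k in result[0][0]])
```

Because the query vector is itself in the table, the query word comes back as its own best match, followed by its nearest neighbour.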

7\. Let's practice!
-------------------

04:45 - 04:49

Let's practice our learnings!

#### Analogies and vector operations

Word vectors can capture semantics and can also support vector operations and word analogies. A word analogy is a semantic relationship between a pair of words. In this exercise, you will practice your understanding of analogies and vector operations.

##### Instructions

-   Drag and drop relevant components into the correct bucket to complete the vector operation.


| Equation 1 | Equation 2 | Equation 3 |
|------------|------------|------------|
| airplane - air + ship = sea | fish - sea + bird = air | arm - human + branch = tree |


Word vectors projection
=======================

You can visualize word vectors in a scatter plot to help you understand how the vocabulary words are grouped. In order to visualize word vectors, you need to project them into a two-dimensional space. You can project vectors by extracting the two principal components via Principal Component Analysis (PCA). 

In this exercise, you will practice how to extract word vectors and project them into two-dimensional space using the `PCA` library from `sklearn`.

A short list of words is stored in the `words` list, and the `en_core_web_md` model is loaded as `nlp`. All necessary libraries and packages are already imported for your use (`PCA`, `numpy` as `np`).

Instructions
------------

-   Extract the word IDs from the given words and store them in the `word_ids` list.
-   Extract the first five elements of the word vectors of the words and then stack them vertically using `np.vstack()` in `word_vectors`.
-   Given a `pca` object, calculate the transformed word vectors using the `.fit_transform()` function of the `pca` class.
-   Print the first component of the transformed word vectors using `[:, 0]` indexing.

```python
words = ["tiger", "bird"]

# Extract word IDs of given words
word_ids = [nlp.vocab.strings[w] for w in words]

# Extract word vectors and stack the first five elements vertically
word_vectors = np.vstack([nlp.vocab.vectors[i][:5] for i in word_ids])

# Calculate the transformed word vectors using the pca object
pca = PCA(n_components=2)
word_vectors_transformed = pca.fit_transform(word_vectors)

# Print the first component of the transformed word vectors
print(word_vectors_transformed[:, 0])
```

Similar words in a vocabulary
=============================

Finding semantically similar terms has various applications in information retrieval. In this exercise, you will practice finding the most semantically similar term to the word **computer** from the `en_core_web_md` model's vocabulary. 

The **computer** word vector is already extracted and stored as `word_vector`. The `en_core_web_md` model is already loaded as `nlp`, and the NumPy package is loaded as `np`.

You can use the `.most_similar()` function of the `nlp.vocab.vectors` object to find the most semantically similar terms. Using `[0][0]` to index the output of this function will return the word IDs of the semantically similar terms. `nlp.vocab.strings[<a given word>]` can be used to find the word ID of a given word and it can similarly return the word associated with a given word ID.
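
To make the idea of a nearest-neighbor lookup concrete, here is a minimal sketch that ranks a tiny vocabulary by cosine similarity in plain Python. It is not how spaCy implements `.most_similar()`; the `most_similar` helper and the three-dimensional toy vectors are hypothetical and only illustrate the ranking.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(query, vocab_vectors, n=1):
    """Return the n vocabulary keys whose vectors are closest to the query."""
    ranked = sorted(
        vocab_vectors,
        key=lambda key: cosine_similarity(query, vocab_vectors[key]),
        reverse=True,
    )
    return ranked[:n]

# Hypothetical toy vocabulary (illustrative values, not real word vectors)
toy_vocab = {
    "computer": [0.9, 0.8, 0.1],
    "laptop": [0.8, 0.9, 0.2],
    "banana": [0.1, 0.2, 0.9],
}

print(most_similar(toy_vocab["computer"], toy_vocab, n=2))
```

Because the query vector is itself in the toy vocabulary, it is its own closest match, followed by the next most similar word.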

Instructions
------------

-   Find the most semantically similar term from the `en_core_web_md` vocabulary.
-   Find the list of similar words given the word IDs of the similar terms.

In [None]:
# Find the most similar word to the word computer
most_similar_words = nlp.vocab.vectors.most_similar(np.asarray([word_vector]), n = 1)

# Find the list of similar words given the word IDs
words = [nlp.vocab.strings[w] for w in most_similar_words[0][0]]
print(words)

1\. Measuring semantic similarity with spaCy
--------------------------------------------

00:00 - 00:07

Welcome! Let's determine how to find semantically similar contexts using spaCy.

2\. The semantic similarity method
----------------------------------

00:07 - 00:49

Semantic similarity is the process of analyzing multiple sentences to identify similarities between them. Determining semantic similarity can help us categorize texts into predefined categories, detect relevant texts, or flag duplicate content. Suppose we need to find the customer questions that are relevant to the word "price". In the list of example queries, only the first sentence, "What is the cheapest flight from Boston to Seattle?", is related, because it contains the word "cheapest". To measure how similar two pieces of text are, we need to calculate their similarity scores.

Process of analyzing texts to identify similarities

Categorizes texts into predefined categories or detects relevant texts

Similarity score measures how similar two pieces of text are

```python
# Example airline queries
text_queries = [
    "What is the cheapest flight from Boston to Seattle?",
    "Which airline serves Denver, Pittsburgh and Atlanta?", 
    "What kinds of planes are used by American Airlines?"
]
```

3\. Similarity score
--------------------

00:49 - 01:26

The semantic similarity score is a metric defined over texts, where the similarity between two texts is measured using their representative word vectors. We will use cosine similarity and word vectors to measure the similarity between two pieces of text. The cosine similarity of two vectors is the cosine of the angle between them, and it always takes a value between -1 and 1. A larger cosine similarity (closer to one) indicates more similar word vectors.

A metric defined over texts

To measure similarity, use cosine similarity and word vectors

Cosine similarity is any number between -1 and 1

```
Case 1:                     Case 2:                     Case 3:
   ^                          ^                          ^   
   |    x↗                    |    x↗                    |
   |     ↗                    |     ↗                    |     y↗
   |      ↗θ                  |      |                   |      ↗
   |       ↗                  |      |θ                  |       ↗θ
   |     y↗                   |      |                   |     x↙
   |                          |    y→                    |
---+---------------------->  -+------------>            -+--------->
   |                          |                          |
   |                          |                          |
   |                          |                          |

- Angle θ close to 0         - Angle θ close to 90       - Angle θ close to 180
- Cos(θ) close to 1         - Cos(θ) close to 0         - Cos(θ) close to -1  
- Similar vectors           - Orthogonal vectors         - Opposite vectors
```
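
The three cases in the diagram can be reproduced with a few lines of plain Python. This is a minimal sketch of the cosine formula (dot product divided by the product of the vector lengths); the `cosine_similarity` helper and the two-dimensional vectors are illustrative and are not real word vectors.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 2-D vectors (illustrative values only)
x = [1.0, 2.0]
similar = [2.0, 4.0]       # same direction as x
orthogonal = [-2.0, 1.0]   # at 90 degrees to x
opposite = [-1.0, -2.0]    # opposite direction to x

print(round(cosine_similarity(x, similar), 3))     # Case 1: close to 1
print(round(cosine_similarity(x, orthogonal), 3))  # Case 2: close to 0
print(round(cosine_similarity(x, opposite), 3))    # Case 3: close to -1
```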

4\. Token similarity
--------------------

01:26 - 02:18

We can calculate similarity scores between Token objects using their word vectors. Let's say we want to find out whether the words "pizza" and "pasta" from the sentences "We eat pizza" and "We like to eat pasta" are similar, and what their similarity score is. We first create one Doc container per sentence and extract the tokens for pizza and pasta by indexing each Doc. Then we call the first token's similarity method, token1-dot-similarity(token2), to calculate the similarity score between pizza and pasta. According to the word vectors, the tokens "pizza" and "pasta" are somewhat similar, and receive a similarity score of 0-point-685.

spaCy calculates similarity scores between Token objects

```python
import spacy

nlp = spacy.load("en_core_web_md")
doc1 = nlp("We eat pizza")
doc2 = nlp("We like to eat pasta")
token1 = doc1[2]
token2 = doc2[4]
print(f"Similarity between {token1} and {token2} = ", round(token1.similarity(token2), 3))
```

```
>>> Similarity between pizza and pasta = 0.685
```

5\. Span similarity
-------------------

02:18 - 03:08

Similarly, spaCy can calculate the similarity score of two Spans of text. We previously learned that a Span is a slice of a Doc container, so subsetting a Doc container results in a Span object. Like the Token class, the Span class has a span-dot-similarity() method that can be used to calculate the similarity score between two spans. We can see that the Span objects "eat pizza" and "eat pasta" have a much higher cosine similarity score of 0-dot-936, and hence are a lot more similar than the "eat pizza" and "like to eat pasta" spans, which have a similarity score of 0-dot-588.

spaCy calculates semantic similarity of two given Span objects

```python
doc1 = nlp("We eat pizza")
doc2 = nlp("We like to eat pasta")

span1 = doc1[1:]
span2 = doc2[1:]
print(f"Similarity between \"{span1}\" and \"{span2}\" = ",
      round(span1.similarity(span2), 3))

print(f"Similarity between \"{doc1[1:]}\" and \"{doc2[3:]}\" = ",
      round(doc1[1:].similarity(doc2[3:]), 3))
```

```
>>> Similarity between "eat pizza" and "like to eat pasta" = 0.588
>>> Similarity between "eat pizza" and "eat pasta" = 0.936
```

6\. Doc similarity
------------------

03:08 - 03:49

We can also determine whether two documents are similar using spaCy. First, we create a Doc container for each document and use the first document's -dot-similarity() method to compare it to the second document. The cosine similarity for "I like to play basketball" and "I love to play basketball" is 0-dot-975, which is close to 1. This shows the strength of word vectors in capturing the meanings of words and semantic similarity. spaCy Doc vectors default to an average of the word vectors in a document.

spaCy calculates the similarity scores between two documents

```python
nlp = spacy.load("en_core_web_md")

doc1 = nlp("I like to play basketball")
doc2 = nlp("I love to play basketball") 
print("Similarity score :", round(doc1.similarity(doc2), 3))
```

```
>>> Similarity score : 0.975
```

High cosine similarity shows highly semantically similar contents

Doc vectors default to an average of word vectors
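
To see why averaging leads to a high score when two documents differ by a single, similar word, here is a small sketch in plain Python. The three-dimensional `toy_vectors` are hypothetical stand-ins for real word vectors, and the helpers are illustrative rather than spaCy's implementation.

```python
import math

def mean_vector(vectors):
    """Element-wise average of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 3-D word vectors (illustrative values only)
toy_vectors = {
    "I": [0.1, 0.3, 0.2],
    "like": [0.7, 0.2, 0.1],
    "love": [0.8, 0.3, 0.1],
    "to": [0.2, 0.1, 0.4],
    "play": [0.3, 0.8, 0.2],
    "basketball": [0.2, 0.9, 0.5],
}

doc1_words = ["I", "like", "to", "play", "basketball"]
doc2_words = ["I", "love", "to", "play", "basketball"]

# A document vector is the average of its word vectors
doc1_vector = mean_vector([toy_vectors[w] for w in doc1_words])
doc2_vector = mean_vector([toy_vectors[w] for w in doc2_words])

print(round(cosine_similarity(doc1_vector, doc2_vector), 3))
```

Four of the five word vectors are shared and the fifth pair is close, so the two averages point in almost the same direction.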

7\. Sentence similarity
-----------------------

03:49 - 04:32

Lastly, we can use spaCy to find sentences that are relevant to a given keyword. For example, given a list of customer questions, we can find the sentence most relevant to a given keyword, such as price. Similar to Token, Span, and Doc objects, a spaCy sentence (from sentences-dot-sents) also has a -dot-similarity() method that can be used to compare the sentence with the word vector of a keyword. We observe that the similarity score of the first question, "What is the cheapest flight from Boston to Seattle", is the highest, and hence it is the most relevant to the keyword price.

spaCy finds relevant content to a given keyword

Finding similar customer questions to the word price:

```python
sentences = nlp("What is the cheapest flight from Boston to Seattle? \
                Which airline serves Denver, Pittsburgh and Atlanta? \
                What kinds of planes are used by American Airlines?")

keyword = nlp("price")
for i, sentence in enumerate(sentences.sents):
    print(f"Similarity score with sentence {i+1}: ", round(sentence.similarity(keyword), 5))
```

```
>>> Similarity score with sentence 1: 0.26136
>>> Similarity score with sentence 2: 0.14021
>>> Similarity score with sentence 3: 0.13885
```
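
Once we have the scores, picking the most relevant question is a simple ranking step. The sketch below reuses the three scores printed above and pairs them with their questions; the variable names are illustrative.

```python
# Questions and the similarity scores reported above for the keyword "price"
questions = [
    "What is the cheapest flight from Boston to Seattle?",
    "Which airline serves Denver, Pittsburgh and Atlanta?",
    "What kinds of planes are used by American Airlines?",
]
scores = [0.26136, 0.14021, 0.13885]

# Sort (score, question) pairs from most to least similar
ranked = sorted(zip(scores, questions), reverse=True)
best_score, best_question = ranked[0]

print(f"Most relevant question: {best_question} (score {best_score})")
```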

8\. Let's practice!
-------------------

04:32 - 04:35

Let's practice!

Doc similarity with spaCy
=========================

Semantic similarity is the process of analyzing multiple sentences to identify similarities between them. In this exercise, you will practice calculating semantic similarities of documents to a given document. The goal is to categorize a list of given reviews that are relevant to **canned dog food**. 

The **canned dog food** category is stored in `category`. A sample of five food reviews has been provided for you in a list called `texts`. `en_core_web_md` is loaded as `nlp`.

Instructions
------------

-   Create a `documents` list containing `Doc` containers of all `texts`.
-   Create a `Doc` container of the `category` and store it as `category_document`.
-   Iterate through `documents` and print the similarity scores of each `Doc` container and the `category_document`, rounded to three digits.

In [None]:
# Create a documents list containing Doc containers
documents = [nlp(t) for t in texts]

# Create a Doc container of the category
category = "canned dog food"
category_document = nlp(category)

# Print similarity scores of each Doc container and the category_document
for i, doc in enumerate(documents):
  print(f"Semantic similarity with document {i+1}:", round(doc.similarity(category_document), 3))

Span similarity with spaCy
==========================

Determining semantic similarity can help you categorize texts into predefined categories, detect relevant texts, or flag duplicate content. In this exercise, you will practice calculating the semantic similarity of spans of a document to a given document. The goal is to find the `Span` of three tokens that is most relevant to **canned dog food**.

The given category of **canned dog food** is stored in `category`. A text string is already stored in the `text` object and the `en_core_web_md` model is loaded as `nlp`. The `Doc` container of the `text` is also already created and stored as `document`.

Instructions
------------

-   Create a `Doc` container for the `category` and store it as `category_document`.
-   Print the similarity score of a given `Span` and the `category_document`, rounded to three digits.

In [None]:
# Create a Doc container for the category
category = "canned dog food"
category_document = nlp(category)

# Print similarity score of a given Span and category_document
document_span = document[0:3]
print("Semantic similarity with", document_span.text, ":", round(document_span.similarity(category_document), 3))

Semantic similarity for categorizing text
=========================================

The main objective of semantic similarity is to measure the distance between the semantic meanings of a pair of words, phrases, sentences, or documents. For example, the word "car" is more similar to "bus" than it is to "cat". In this exercise, you will find sentences similar to the word **sauce** in an example text from Amazon Fine Food Reviews. You can use `spacy` to calculate the similarity score of the word `sauce` and any of the sentences in a given `texts` string and report the most similar sentence's score.

A `texts` string is pre-loaded that contains all reviews' `Text` data. You'll use the `en_core_web_md` English model for this exercise, which is already available as `nlp`.

Instructions
------------

-   Use `nlp` to generate `Doc` containers for the word `sauce` and for `texts` and store them at `key` and `sentences` respectively.
-   Calculate similarity scores of the word `sauce` with each sentence in the `texts` string (rounded to two digits).

In [None]:
# Populate Doc containers for the word "sauce" and for "texts" string
key = nlp("sauce")
sentences = nlp(texts)

# Calculate similarity score of each sentence and a Doc container for the word sauce
semantic_scores = []
for sent in sentences.sents:
    semantic_scores.append({"score": round(sent.similarity(key), 2)})
print(semantic_scores)