# Part of Speech
Within a sentence, each word plays a grammatical role, such as noun, verb, or adjective. That role is the word's part of speech (POS).

## Coarse-grained Part-of-speech Tags
spaCy assigns every token one coarse-grained POS tag from the following list:


<table><tr><th>POS</th><th>DESCRIPTION</th><th>EXAMPLES</th></tr>
    
<tr><td>ADJ</td><td>adjective</td><td>*big, old, green, incomprehensible, first*</td></tr>
<tr><td>ADP</td><td>adposition</td><td>*in, to, during*</td></tr>
<tr><td>ADV</td><td>adverb</td><td>*very, tomorrow, down, where, there*</td></tr>
<tr><td>AUX</td><td>auxiliary</td><td>*is, has (done), will (do), should (do)*</td></tr>
<tr><td>CONJ</td><td>conjunction</td><td>*and, or, but*</td></tr>
<tr><td>CCONJ</td><td>coordinating conjunction</td><td>*and, or, but*</td></tr>
<tr><td>DET</td><td>determiner</td><td>*a, an, the*</td></tr>
<tr><td>INTJ</td><td>interjection</td><td>*psst, ouch, bravo, hello*</td></tr>
<tr><td>NOUN</td><td>noun</td><td>*girl, cat, tree, air, beauty*</td></tr>
<tr><td>NUM</td><td>numeral</td><td>*1, 2017, one, seventy-seven, IV, MMXIV*</td></tr>
<tr><td>PART</td><td>particle</td><td>*'s, not*</td></tr>
<tr><td>PRON</td><td>pronoun</td><td>*I, you, he, she, myself, themselves, somebody*</td></tr>
<tr><td>PROPN</td><td>proper noun</td><td>*Mary, John, London, NATO, HBO*</td></tr>
<tr><td>PUNCT</td><td>punctuation</td><td>*., (, ), ?*</td></tr>
<tr><td>SCONJ</td><td>subordinating conjunction</td><td>*if, while, that*</td></tr>
<tr><td>SYM</td><td>symbol</td><td>*$, %, §, ©, +, −, ×, ÷, =, :), 😝*</td></tr>
<tr><td>VERB</td><td>verb</td><td>*run, runs, running, eat, ate, eating*</td></tr>
<tr><td>X</td><td>other</td><td>*sfpksdpsxmsa*</td></tr>
<tr><td>SPACE</td><td>space</td><td></td></tr>
</table>

For more information about POS tagging, see https://spacy.io/api/annotation#pos-tagging

In [1]:
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The quick brown fox jumped over the lazy dog's back.")
print(doc)

The quick brown fox jumped over the lazy dog's back.


For each token in the document, we can read the predicted POS tag:

In [6]:
def print_pos_tag(doc, token_index):
    # pos_ holds the coarse-grained tag, tag_ the fine-grained tag
    token = doc[token_index]
    print(f"POS-tag: {token.pos_}, Fine grained POS-tag: {token.tag_}")

In [5]:
print_pos_tag(doc, 4)

POS-tag: VERB, Fine grained POS-tag: VBD


In [18]:
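def print_all_pos_tags(doc):
    print("{:20}{:20}{:20}{:20}".format("Text", "POS", "TAG", "Explanation"))
    for token in doc:
        # spacy.explain returns None for a tag it does not know, and None cannot be padded
        explanation = spacy.explain(token.tag_) or ""
        print(f"{token.text:20}{token.pos_:20}{token.tag_:20}{explanation:20}")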

In [19]:
print_all_pos_tags(doc)

Text                POS                 TAG                 Explanation         
The                 DET                 DT                  determiner          
quick               ADJ                 JJ                  adjective           
brown               ADJ                 JJ                  adjective           
fox                 NOUN                NN                  noun, singular or mass
jumped              VERB                VBD                 verb, past tense    
over                ADP                 IN                  conjunction, subordinating or preposition
the                 DET                 DT                  determiner          
lazy                ADJ                 JJ                  adjective           
dog                 NOUN                NN                  noun, singular or mass
's                  PART                POS                 possessive ending   
back                NOUN                NN                  noun, singular or mass
.                   PUNCT               .                   punctuation mark, sentence closer
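
With a tag on every token, a common next step is to pull out the tokens of one word class. The sketch below is only an illustration: `tokens_with_pos` and the `Tok` stand-in are hypothetical names, not part of spaCy. `Tok` mimics the `text`, `pos_` and `tag_` attributes so the snippet runs without a model, and its values are copied from the table printed above. Passing a real `Doc` should work the same way, since the helper only reads `text` and `pos_`.

```python
from collections import namedtuple

# Hypothetical stand-in for a spaCy Token: only the attributes used here.
Tok = namedtuple("Tok", ["text", "pos_", "tag_"])


def tokens_with_pos(tokens, pos):
    """Return the text of every token whose coarse-grained tag equals `pos`."""
    return [token.text for token in tokens if token.pos_ == pos]


# Values copied from the printed table for the example sentence.
tokens = [
    Tok("The", "DET", "DT"),
    Tok("quick", "ADJ", "JJ"),
    Tok("brown", "ADJ", "JJ"),
    Tok("fox", "NOUN", "NN"),
    Tok("jumped", "VERB", "VBD"),
    Tok("over", "ADP", "IN"),
    Tok("the", "DET", "DT"),
    Tok("lazy", "ADJ", "JJ"),
    Tok("dog", "NOUN", "NN"),
    Tok("'s", "PART", "POS"),
    Tok("back", "NOUN", "NN"),
]

nouns = tokens_with_pos(tokens, "NOUN")
adjectives = tokens_with_pos(tokens, "ADJ")
print("Nouns:", nouns)
print("Adjectives:", adjectives)
```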

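spaCy can identify verb tense from context. The two sentences below use the same word form, "read", but it receives a different fine-grained tag in each: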

In [21]:
doc1 = nlp("I read books about NLP.")
doc2 = nlp("I read a book on NLP.")

print(f"First sentence verb tag: {doc1[1].tag_}: {spacy.explain(doc1[1].tag_)}")
print(f"Second sentence verb tag: {doc2[1].tag_}: {spacy.explain(doc2[1].tag_)}")

First sentence verb tag: VBP: verb, non-3rd person singular present
Second sentence verb tag: VBD: verb, past tense
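
The fine-grained tags in this output follow the Penn Treebank style, in which the verb tag itself encodes the verb form. A minimal sketch of turning such a tag into a readable label is shown below. The `VERB_FORMS` table and the `describe_verb_tag` helper are hypothetical names for illustration; the descriptions mirror the Penn Treebank verb tag definitions and do not come from a spaCy call.

```python
# Penn Treebank verb tags and the verb form each one marks.
VERB_FORMS = {
    "VB": "base form",
    "VBD": "past tense",
    "VBG": "gerund or present participle",
    "VBN": "past participle",
    "VBP": "non-3rd person singular present",
    "VBZ": "3rd person singular present",
}


def describe_verb_tag(tag):
    """Return the verb form for a fine-grained tag, or None if it is not a verb tag."""
    return VERB_FORMS.get(tag)


# Two verb tags from the output above, plus a noun tag to show the fallback.
for tag in ("VBP", "VBD", "NN"):
    print(f"{tag}: {describe_verb_tag(tag)}")
```

In a notebook, the helper could be called as `describe_verb_tag(doc1[1].tag_)`, which keeps the tense lookup separate from the model call.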


Counting the tokens for each part of speech in the first document:

In [23]:
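# count_by returns a dict keyed by integer attribute IDs
POS_counts = doc.count_by(spacy.attrs.POS)
# Look each ID up in the vocabulary to recover the tag name
POS_counts = {doc.vocab[key].text: val for key, val in POS_counts.items()}
POS_counts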

{'PUNCT': 1, 'ADJ': 3, 'VERB': 1, 'ADP': 1, 'DET': 2, 'NOUN': 3, 'PART': 1}
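
As an alternative sketch, the same tally can be built with `collections.Counter` over `token.pos_`, which skips the round trip through integer IDs and `doc.vocab`. The `count_pos` helper and the `Tok` stand-in are illustrative names, not spaCy API. `Tok` only mimics the `text` and `pos_` attributes so the snippet runs without a model, and its tags are hand-copied to match the counts shown above. A real `Doc` should be usable in place of the list.

```python
from collections import Counter, namedtuple

# Hypothetical stand-in for a spaCy Token: only the attributes used here.
Tok = namedtuple("Tok", ["text", "pos_"])


def count_pos(tokens):
    """Tally coarse-grained POS tags for any iterable of token-like objects."""
    return Counter(token.pos_ for token in tokens)


# Hand-written tags for the example sentence, matching the counts above.
tokens = [
    Tok("The", "DET"), Tok("quick", "ADJ"), Tok("brown", "ADJ"),
    Tok("fox", "NOUN"), Tok("jumped", "VERB"), Tok("over", "ADP"),
    Tok("the", "DET"), Tok("lazy", "ADJ"), Tok("dog", "NOUN"),
    Tok("'s", "PART"), Tok("back", "NOUN"), Tok(".", "PUNCT"),
]

pos_counts = count_pos(tokens)

# most_common() orders the tags from most to least frequent.
for pos, count in pos_counts.most_common():
    print(f"{pos:6}{count}")
```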

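Visualizing POS tags with displaCy. The dependency view (`style='dep'`) prints each token's coarse-grained POS tag beneath it: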

In [27]:
from spacy import displacy

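# 'compact' expects a boolean; the string 'True' only worked because any non-empty string is truthy
options = {'distance': 110, 'compact': True, 'color': 'yellow', 'bg': '#09a3d5', 'font': 'Arial'}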
displacy.render(doc, style='dep', jupyter=True, options=options)