# **Part-of-speech tagging**

Part-of-speech (POS) tagging helps to resolve syntactic ambiguity. It annotates each word in a
given text with its grammatical category and function [[1]](#scrollTo=op-j6UywUt5i).

In the sentence “Our dogs bark all day,” the word “bark” appears as a verb
(word category) taking the function of the predicate (word function). In “The bark of the
old oak tree was wet,” the word “bark” is a noun (word category) in the function of the
subject (word function). This example illustrates that context plays an important role in
POS tagging [[1]](#scrollTo=op-j6UywUt5i).


This notebook shows some basic examples of POS tagging with spaCy.


## **Part-of-speech tagging in spaCy**

spaCy is a free, open-source library for advanced Natural Language Processing (NLP) in Python. It can be used to build information extraction or natural language understanding systems, or to pre-process text for deep learning [[2]](https://spacy.io/usage/spacy-101). For example, it supports tasks such as sentiment analysis, chatbots, text summarization, and intent and entity extraction [[1]](#scrollTo=op-j6UywUt5i). For more information about spaCy, please refer to [[3]](https://spacy.io/).



For POS tagging, we will apply the following steps:
1. Import the spaCy library
2. Load the language model (English)
3. Create a spaCy document
4. Access the POS tags by iterating over the document object
5. Print the POS tags

### Import spaCy library

In [None]:
# Import spaCy library to process the text
import spacy

### Load language model
We will load the "en_core_web_sm" English language model with the spaCy library.
It is a small English pipeline trained on written web text (blogs, news, comments) that includes vocabulary, syntax and entities [[4]](https://spacy.io/models).
It is optimized for CPU, and its components are: tok2vec, tagger, parser, senter, ner, attribute_ruler, lemmatizer [[5]](https://spacy.io/models/en).

In [None]:
# Load the "en_core_web_sm" English language model
sp = spacy.load('en_core_web_sm')
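
To check which components the loaded pipeline actually contains, the `pipe_names` attribute of the pipeline object can be inspected. The following is a minimal sketch: the variable name `nlp` and the try/except guard (which simply skips the check when spaCy or the model is not installed) are assumptions added for illustration. Components that are disabled by default are not expected to appear in `pipe_names`, so `component_names` is printed as well.

```python
# List the components of the "en_core_web_sm" pipeline.
try:
    import spacy
    nlp = spacy.load("en_core_web_sm")
except (ImportError, OSError):
    # spaCy or the "en_core_web_sm" model is not installed
    nlp = None

if nlp is not None:
    # Active components, in processing order
    print(nlp.pipe_names)
    # All components, including those that are disabled
    print(nlp.component_names)
```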

### Create spaCy document and perform POS tagging

When creating a Doc object, spaCy automatically produces POS tags (tagger) for an input text. The following figure demonstrates the processing pipeline of a given text to produce a Doc object [[6]](https://spacy.io/usage/processing-pipelines).

![spaCy](https://spacy.io/pipeline-fde48da9b43661abcdf62ab70a546d71.svg)

For POS tagging, we will simply read the "pos_" attribute of each token in the Doc object. The documentation of the "Morphologizer" class lists "pos_" among the token attributes that a pipeline can assign; for more details, please refer to [[7]](https://spacy.io/api/morphologizer#section-assigned-attributes).

In [None]:
# Create a sample spaCy document
doc_POS = sp("I am going to complete this book by this weekend")

### Print POS tags
To print the POS tags, we iterate over the Doc object and read the "pos_" attribute of each token.
This attribute holds the coarse-grained POS tag that spaCy predicted for the token [[7]](https://spacy.io/api/morphologizer#section-assigned-attributes).

In [None]:
# Print each token with its POS tag.
# The tag is read from the token's "pos_" attribute [7].
for word in doc_POS:
    print(word.text + '-->' + word.pos_)

I-->PRON
am-->AUX
going-->VERB
to-->PART
complete-->VERB
this-->DET
book-->NOUN
by-->ADP
this-->DET
weekend-->NOUN
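
The same loop can be applied to the two "bark" sentences from the introduction to see how context affects the predicted tag. This is a minimal sketch: the helper `pos_of` and the variable name `nlp` are hypothetical names introduced here, and the try/except guard only skips the demonstration when spaCy or the model is not installed. We would expect "bark" to be tagged as a verb in the first sentence and as a noun in the second, but a small statistical model can make mistakes, so no output is shown.

```python
def pos_of(doc, word):
    """Return the coarse-grained POS tag of every token whose text equals `word`."""
    return [token.pos_ for token in doc if token.text == word]


try:
    import spacy
    nlp = spacy.load("en_core_web_sm")
except (ImportError, OSError):
    # spaCy or the "en_core_web_sm" model is not installed
    nlp = None

sentences = [
    "Our dogs bark all day.",
    "The bark of the old oak tree was wet.",
]

if nlp is not None:
    for sentence in sentences:
        # Tag the whole sentence, then look only at the word "bark"
        print(sentence, "-->", pos_of(nlp(sentence), "bark"))
```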


### Print POS tags and explanations

To improve readability, we can align the output in columns. The numbers in curly brackets set the minimum width of each column in characters; shorter values are padded with spaces [[8]](https://stackabuse.com/python-for-nlp-parts-of-speech-tagging-and-named-entity-recognition/).
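
The column widths can be tried in isolation with plain strings, without spaCy. The sketch below uses a hypothetical helper `format_row`; the widths 12 and 10 match the ones used in this notebook. Writing `{text:{12}}` with nested brackets is equivalent to `{text:12}`; the nested form is mainly useful when the width comes from a variable.

```python
def format_row(text, pos, text_width=12, pos_width=10):
    """Pad `text` and `pos` to fixed minimum widths, separated by one space."""
    # Nested brackets let the width come from a variable
    return f"{text:{text_width}} {pos:{pos_width}}"


for text, pos in [("book", "NOUN"), ("this", "DET")]:
    # The trailing "|" makes the padding visible
    print(format_row(text, pos) + "|")
```

Values longer than the given width are not truncated; they simply push the following column to the right.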

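To add explanations, "spacy.explain" returns a description for a given POS tag, dependency label or entity type [[9]](https://spacy.io/api/top-level). In the cell below, it is called with the fine-grained tag ("tag_"), so the third column describes that tag rather than the coarse-grained "pos_" value shown in the second column.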

In [None]:
# Print each word with its coarse-grained POS tag ("pos_") and an explanation of its fine-grained tag ("tag_"):
for word in doc_POS:
    print(f'{word.text:{12}} {word.pos_:{10}} {spacy.explain(word.tag_)}')

I            PRON       pronoun, personal
am           AUX        verb, non-3rd person singular present
going        VERB       verb, gerund or present participle
to           PART       infinitival "to"
complete     VERB       verb, base form
this         DET        determiner
book         NOUN       noun, singular or mass
by           ADP        conjunction, subordinating or preposition
this         DET        determiner
weekend      NOUN       noun, singular or mass
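
Because the table above combines the coarse-grained "pos_" value with an explanation of the fine-grained "tag_" value, it can help to print both tags next to their own explanations. The following is a minimal sketch: the helper `describe` and the variable name `nlp` are hypothetical, and the try/except guard only skips the demonstration when spaCy or the model is not installed.

```python
def describe(token, explain):
    """Return the text, both tags and their explanations for one token."""
    return (
        token.text,
        token.pos_, explain(token.pos_),  # coarse-grained tag
        token.tag_, explain(token.tag_),  # fine-grained tag
    )


try:
    import spacy
    nlp = spacy.load("en_core_web_sm")
except (ImportError, OSError):
    # spaCy or the "en_core_web_sm" model is not installed
    nlp = None

if nlp is not None:
    for token in nlp("I am going to complete this book by this weekend"):
        text, pos, pos_desc, tag, tag_desc = describe(token, spacy.explain)
        # str() guards against labels for which no explanation is available
        print(f"{text:12} {pos:6} {str(pos_desc):24} {tag:6} {tag_desc}")
```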


# **References**

- [1] NLP and Computer Vision_DLMAINLPCV01 Lecture Book
- [2] https://spacy.io/usage/spacy-101
- [3] https://spacy.io/
- [4] https://spacy.io/models
- [5] https://spacy.io/models/en
- [6] https://spacy.io/usage/processing-pipelines
- [7] https://spacy.io/api/morphologizer#section-assigned-attributes
- [8] https://stackabuse.com/python-for-nlp-parts-of-speech-tagging-and-named-entity-recognition/
- [9] https://spacy.io/api/top-level


Copyright © 2022 IU International University of Applied Sciences