# POS-Tagging and Its Applications

## What is POS-tagging?


The obvious first step in understanding POS-tagging is to expand the acronym: Part-Of-Speech tagging. As the name suggests, it is the process of tagging each word in a textual input with its appropriate part of speech. We discussed this briefly before, particularly when dealing with spaCy and its language models. So, while we know that POS-tagging means tagging words with their POS, we haven't said much about what exactly a part of speech in natural language (and in particular, English) is, or why it might be relevant to us in text analysis. The traditional parts of speech in English are listed below.

- Noun - The name of a person, place, thing, or idea

- Verb - The action or being

- Adjective - This modifies or describes a noun or a pronoun

- Adverb - This modifies or describes a verb, adjective, or another adverb

- Pronoun - The word to be used in place of a noun

- Preposition - The word placed before a noun or pronoun to form a phrase modifying another word in the sentence

- Conjunction - This joins words, phrases, or clauses

- Interjection - A word used to express emotion

## POS tagging in Python

In [1]:
import nltk
text = nltk.word_tokenize("And now for something completely different")
nltk.pos_tag(text)

[('And', 'CC'),
 ('now', 'RB'),
 ('for', 'IN'),
 ('something', 'NN'),
 ('completely', 'RB'),
 ('different', 'JJ')]
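
The tags in this output come from the Penn Treebank tagset, which is finer-grained than the traditional parts of speech listed earlier. As a rough illustration, the sketch below collapses a handful of Penn Treebank tags into those traditional categories. The `COARSE_CATEGORY` table and the `coarse_category` helper are hypothetical names introduced here, and the table is deliberately incomplete.

```python
# Hypothetical helper: collapse a few Penn Treebank tags into the
# traditional parts of speech. The table covers only common tags.
COARSE_CATEGORY = {
    "NN": "noun", "NNS": "noun", "NNP": "noun", "NNPS": "noun",
    "VB": "verb", "VBD": "verb", "VBG": "verb",
    "VBN": "verb", "VBP": "verb", "VBZ": "verb",
    "JJ": "adjective", "JJR": "adjective", "JJS": "adjective",
    "RB": "adverb", "RBR": "adverb", "RBS": "adverb",
    "PRP": "pronoun", "PRP$": "pronoun",
    # IN also covers subordinating conjunctions; simplified here.
    "IN": "preposition",
    "CC": "conjunction",
    "UH": "interjection",
}


def coarse_category(tag):
    """Return the traditional part of speech for a tag, or 'other'."""
    return COARSE_CATEGORY.get(tag, "other")


# The (word, tag) pairs printed by nltk.pos_tag above.
tagged = [("And", "CC"), ("now", "RB"), ("for", "IN"),
          ("something", "NN"), ("completely", "RB"), ("different", "JJ")]

for word, tag in tagged:
    print(word, tag, coarse_category(tag))
```

Tags that are missing from the table, such as determiners (`DT`), fall through to `other` rather than raising an error.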

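In [2]:
# train_sents must be defined first; here we assume NLTK's treebank sample (nltk.download('treebank'))
from nltk.corpus import treebank
train_sents = treebank.tagged_sents()
bigram_tagger = nltk.BigramTagger(train_sents)
bigram_tagger.tag(text)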
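
`nltk.BigramTagger` expects a list of tagged sentences and chooses each word's tag from the word itself together with the tag assigned to the previous word. The idea can be sketched in plain Python. The tiny hand-made `train_sents` and the `train_bigram_tagger` and `tag` functions below are illustrative assumptions, not NLTK's implementation.

```python
from collections import Counter, defaultdict

# Tiny hand-made training set (hypothetical, for illustration only).
train_sents = [
    [("they", "PRP"), ("refuse", "VBP"), ("to", "TO"), ("go", "VB")],
    [("the", "DT"), ("refuse", "NN"), ("was", "VBD"), ("collected", "VBN")],
]


def train_bigram_tagger(sents):
    """Map each (previous tag, word) context to its most frequent tag."""
    counts = defaultdict(Counter)
    for sent in sents:
        prev_tag = None  # None marks the start of a sentence
        for word, tag in sent:
            counts[(prev_tag, word)][tag] += 1
            prev_tag = tag
    return {ctx: c.most_common(1)[0][0] for ctx, c in counts.items()}


def tag(model, words):
    """Tag words left to right; contexts never seen in training get None."""
    tagged, prev_tag = [], None
    for word in words:
        guess = model.get((prev_tag, word))
        tagged.append((word, guess))
        prev_tag = guess
    return tagged


model = train_bigram_tagger(train_sents)
print(tag(model, ["the", "refuse"]))
print(tag(model, ["they", "refuse"]))
```

Because the context includes the previous tag, the same word can receive different tags in different positions. The sketch also shows the weakness of a bare bigram model: any context it has not seen yields `None`, which is why such taggers are usually combined with a simpler fallback.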

## POS tagging with spaCy

In [4]:
import spacy

# 'en' is a spaCy 2.x shortcut; on spaCy 3+ load a full model name such as 'en_core_web_sm'
nlp = spacy.load('en')

In [6]:
sent_0 = nlp(u'Mathieu and I went to the park.')
sent_1 = nlp(u'If Clement was asked to take out the garbage, he would refuse.')
sent_2 = nlp(u'Baptiste was in charge of the refuse treatment center.')
sent_3 = nlp(u'Marie took out her rather suspicious and fishy cat to go fish for fish.')

sentence 0

In [9]:
for token in sent_0:
    print('{} | {} | {}'.format(token.text, token.pos_, token.tag_))

Mathieu | PROPN | NNP
and | CCONJ | CC
I | PRON | PRP
went | VERB | VBD
to | ADP | IN
the | DET | DT
park | NOUN | NN
. | PUNCT | .


Let's look at a few of the tags here: Mathieu is a name and is correctly marked as a proper noun, went is a verb, and park is a noun, all as we would expect. We previously talked about the word refuse and how it can be both a noun and a verb.

---

In [10]:
for token in sent_1:
    print('{} | {} | {}'.format(token.text, token.pos_, token.tag_))

If | ADP | IN
Clement | PROPN | NNP
was | VERB | VBD
asked | VERB | VBN
to | PART | TO
take | VERB | VB
out | PART | RP
the | DET | DT
garbage | NOUN | NN
, | PUNCT | ,
he | PRON | PRP
would | VERB | MD
refuse | VERB | VB
. | PUNCT | .


Here, the word refuse is a verb, as we expect it to be. The word garbage is a noun and is the object which our friend Clement is refusing to take out. Our next sentence also involves garbage, but here the word refuse is the substance being treated at the center.

---

In [11]:
for token in sent_2:
    print('{} | {} | {}'.format(token.text, token.pos_, token.tag_))

Baptiste | PROPN | NNP
was | VERB | VBD
in | ADP | IN
charge | NOUN | NN
of | ADP | IN
the | DET | DT
refuse | ADJ | JJ
treatment | NOUN | NN
center | NOUN | NN
. | PUNCT | .


In this sentence refuse names something Baptiste is in charge of, so we would want it tagged as a noun. The output above shows that the tagger no longer treats it as a verb, but this model labels it an adjective (ADJ, JJ) rather than a noun, presumably reading it as a modifier of treatment. The two words that follow, treatment and center, are tagged as nouns, and together the last three words form what we call a noun phrase. We will deal with this term in more detail in the chapter on dependency parsing. Let's now have a look at our last sentence:

---

In [12]:
for token in sent_3:
    print('{} | {} | {}'.format(token.text, token.pos_, token.tag_))

Marie | PROPN | NNP
took | VERB | VBD
out | PART | RP
her | PRON | PRP
rather | ADV | RB
suspicious | ADJ | JJ
and | CCONJ | CC
fishy | ADJ | JJ
cat | NOUN | NN
to | PART | TO
go | VERB | VB
fish | NOUN | NN
for | ADP | IN
fish | NOUN | NN
. | PUNCT | .


The purpose of this sentence was to attempt to fool our tagger with different variations of the word fish. The tagger handles most of them in context: fishy is tagged as an adjective and the final fish as a noun. In the output above, however, the fish in go fish is also tagged as a noun (NN) although it is used as a verb there, so the tagger was partly fooled. Our model is a machine learning model which, among other training features, uses the tags of the previous words and upcoming words to decide the new tag: the word fishy was tagged as an adjective partly because a noun comes right after it, partly because a conjunction comes before it, and possibly also because it ends with the letter y. Most machine learning models take multiple features into account when deciding a new label.
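
To make the idea of combining several features concrete, here is a minimal sketch of the kind of feature dictionary a statistical tagger might build for one token. The `token_features` function and its feature names are hypothetical, and this is not the feature set spaCy actually uses.

```python
def token_features(words, i, prev_tag):
    """Build an illustrative feature dict for words[i]."""
    word = words[i]
    return {
        "word": word.lower(),
        "suffix1": word[-1:].lower(),   # last letter, e.g. 'y'
        "suffix2": word[-2:].lower(),   # last two letters
        "prev_tag": prev_tag,           # tag already assigned to the previous word
        "prev_word": words[i - 1].lower() if i > 0 else "<START>",
        "next_word": words[i + 1].lower() if i + 1 < len(words) else "<END>",
        "is_capitalized": word[:1].isupper(),
    }


words = "Marie took out her rather suspicious and fishy cat".split()

# Features for 'fishy', assuming the preceding 'and' was tagged CC.
features = token_features(words, words.index("fishy"), prev_tag="CC")
print(features)
```

A classifier trained on such dictionaries can weigh the preceding conjunction, the following word, and the `y` suffix together instead of relying on any single cue.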