### Part of Speech Tagging 
##### Part-of-speech (POS) tagging assigns a part-of-speech label, such as noun or verb, to each word in a sentence.

Reference: https://stackabuse.com/python-for-nlp-parts-of-speech-tagging-and-named-entity-recognition/

In [14]:
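# Uncomment to install spaCy and the small English model used below:
# !pip install spacy
# !python -m spacy download en_core_web_sm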
import spacy

In [16]:
import en_core_web_sm
nlp = en_core_web_sm.load()

In [28]:
sen = nlp(u"I like to play football. I hated it in my childhood though")
sen.text

'I like to play football. I hated it in my childhood though'

In [37]:
for token in sen:
    print(token, token.pos_)

I PRON
like VERB
to PART
play VERB
football NOUN
. PUNCT
I PRON
hated VERB
it PRON
in ADP
my DET
childhood NOUN
though ADV


In [43]:
print(sen[1],sen[1].tag_)
print(spacy.explain(sen[1].tag_))
print(20*'%')
print(sen[3], sen[3].tag_)
print(spacy.explain(sen[3].tag_))
print(20*'%')
print(sen[7], sen[7].tag_)
print(spacy.explain(sen[7].tag_))

like VBP
verb, non-3rd person singular present
%%%%%%%%%%%%%%%%%%%%
play VB
verb, base form
%%%%%%%%%%%%%%%%%%%%
hated VBD
verb, past tense


In [48]:
for word in sen:
    print(f'{word.text:{12}} {word.pos_:{10}} {word.tag_:{8}} {spacy.explain(word.tag_)}')

I            PRON       PRP      pronoun, personal
like         VERB       VBP      verb, non-3rd person singular present
to           PART       TO       infinitival "to"
play         VERB       VB       verb, base form
football     NOUN       NN       noun, singular or mass
.            PUNCT      .        punctuation mark, sentence closer
I            PRON       PRP      pronoun, personal
hated        VERB       VBD      verb, past tense
it           PRON       PRP      pronoun, personal
in           ADP        IN       conjunction, subordinating or preposition
my           DET        PRP$     pronoun, possessive
childhood    NOUN       NN       noun, singular or mass
though       ADV        RB       adverb


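In the script above, the f-string format specifiers improve readability: the token text is left-aligned and padded to a width of 12 characters, the coarse-grained POS tag to 10 characters, and the fine-grained POS tag to 8 characters, so the columns line up. The padding is a minimum width, not a fixed number of added spaces, so longer values are printed in full.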
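
The padding behaviour can be checked without spaCy. The following is a minimal sketch that applies the same format specifiers to a few rows copied from the output above; the helper name format_row is an assumption introduced for illustration only.

```python
# Rows copied from the output above: (text, coarse-grained tag, fine-grained tag).
rows = [
    ("I", "PRON", "PRP"),
    ("football", "NOUN", "NN"),
    ("childhood", "NOUN", "NN"),
]


def format_row(text, pos, tag):
    """Hypothetical helper: pad each field to a minimum width (12, 10, 8).

    {text:{12}} is a nested format spec and is equivalent to {text:12}.
    Strings are left-aligned by default, so the padding goes on the right.
    """
    return f"{text:{12}} {pos:{10}} {tag:{8}}"


for row in rows:
    print(format_row(*row))
```

Because every field is padded to the same minimum width, the second column always starts at the same character position regardless of how long the token is.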

#### Why Is POS Tagging Useful?
POS tagging is particularly useful when a word or token can take more than one POS tag. For instance, the word "google" can be used as either a noun or a verb, depending on the context, and it is important to identify this difference when processing natural language. The spaCy library ships with pre-trained statistical models that use the context (the surrounding words) to predict the most likely POS tag for each word.

In [49]:
sen = nlp(u'Can you google it?')
word = sen[2]

print(f'{word.text:{12}} {word.pos_:{10}} {word.tag_:{8}} {spacy.explain(word.tag_)}')

google       VERB       VB       verb, base form


In [51]:
sen = nlp(u'Can you search it on google?')
word = sen[5]

print(f'{word.text:{12}} {word.pos_:{10}} {word.tag_:{8}} {spacy.explain(word.tag_)}')

google       PROPN      NNP      noun, proper singular


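You can count the occurrences of each POS tag by calling the count_by method on the spaCy document object and passing spacy.attrs.POS as the attribute to count. The method returns a dictionary that maps integer POS IDs to counts; the tag name for each ID can be looked up through the document's vocab, as the second cell below does.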


In [52]:
sen = nlp(u"I like to play football. I hated it in my childhood though")

num_pos = sen.count_by(spacy.attrs.POS)
num_pos

{95: 3, 100: 3, 94: 1, 92: 2, 97: 1, 85: 1, 90: 1, 86: 1}

In [53]:
for k,v in sorted(num_pos.items()):
    print(f'{k}. {sen.vocab[k].text:{8}}: {v}')

85. ADP     : 1
86. ADV     : 1
90. DET     : 1
92. NOUN    : 2
94. PART    : 1
95. PRON    : 3
97. PUNCT   : 1
100. VERB    : 3
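
As a cross-check on the counts above, the same tally can be sketched with collections.Counter from the standard library. This is not how count_by is implemented; it is only an equivalent count over the (token, tag) pairs printed earlier, which are hard-coded here so the block runs without spaCy. With a real document, the same idea would be Counter(token.pos_ for token in sen).

```python
from collections import Counter

# (token, coarse-grained POS) pairs copied from the output earlier in this section.
tagged = [
    ("I", "PRON"), ("like", "VERB"), ("to", "PART"), ("play", "VERB"),
    ("football", "NOUN"), (".", "PUNCT"), ("I", "PRON"), ("hated", "VERB"),
    ("it", "PRON"), ("in", "ADP"), ("my", "DET"), ("childhood", "NOUN"),
    ("though", "ADV"),
]

# Count how often each POS tag name occurs.
pos_counts = Counter(pos for _, pos in tagged)

# Print the counts sorted alphabetically by tag name.
for pos, count in sorted(pos_counts.items()):
    print(f"{pos:{8}}: {count}")
```

Counting by tag name avoids the integer IDs altogether, at the cost of building the tally in Python rather than inside spaCy.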


#### Visualizing Parts of Speech Tags
The displacy module from the spaCy library renders POS tags graphically. To display them inside a Jupyter notebook, call the render function from the displacy module and pass it the spaCy document, the style of the visualization, and jupyter=True, as shown below. The 'dep' style draws the dependency parse with each token's POS tag beneath it, and the distance option sets the spacing between words.

In [54]:
from spacy import displacy

sen = nlp(u"I like to play football. I hated it in my childhood though")
displacy.render(sen, style='dep', jupyter=True, options={'distance': 85})

In [57]:
# To visualize the POS tags outside a Jupyter notebook, call displacy.serve instead. It starts a local web server and serves the visualization as an HTML page that you can open in your browser. Uncomment the following line to run it:
# displacy.serve(sen, style='dep', options={'distance': 120})