## Part-of-Speech Tagging

In [1]:
s1 = 'Apple is looking at buying U.K. startup for $1 billion'

In [3]:
# Import spaCy and load its small pre-trained English model
import spacy
nlp = spacy.load('en_core_web_sm')

In [4]:
doc = nlp(s1)

In [8]:
# spaCy stores strings as hash values to reduce memory usage and improve
# efficiency, so token.pos and token.tag return integer IDs. Appending an
# underscore (token.pos_, token.tag_) returns the readable string instead.
# pos_ is the coarse-grained tag; tag_ is the fine-grained tag.
for token in doc:
    print(token.text, " ----- ", token.pos_, " ----- ", token.tag_, " ----- ", spacy.explain(token.tag_))

Apple  -----  PROPN  -----  NNP  -----  noun, proper singular
is  -----  AUX  -----  VBZ  -----  verb, 3rd person singular present
looking  -----  VERB  -----  VBG  -----  verb, gerund or present participle
at  -----  ADP  -----  IN  -----  conjunction, subordinating or preposition
buying  -----  VERB  -----  VBG  -----  verb, gerund or present participle
U.K.  -----  PROPN  -----  NNP  -----  noun, proper singular
startup  -----  NOUN  -----  NN  -----  noun, singular or mass
for  -----  ADP  -----  IN  -----  conjunction, subordinating or preposition
$  -----  SYM  -----  $  -----  symbol, currency
1  -----  NUM  -----  CD  -----  cardinal number
billion  -----  NUM  -----  CD  -----  cardinal number
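
The output above pairs each coarse-grained tag (`pos_`) with a fine-grained tag (`tag_`). As a minimal sketch of how the two levels relate, the snippet below groups the fine-grained tags under their coarse-grained tag. So that it runs without a model, it uses a hand-copied list of the rows printed above; the names `tagged` and `fine_by_coarse` are illustrative and not part of spaCy. With a live `doc`, the same triples could be built from `token.text`, `token.pos_` and `token.tag_`.

```python
from collections import defaultdict

# (text, pos_, tag_) rows copied by hand from the output above.
# `tagged` is an illustrative name, not part of spaCy.
tagged = [
    ("Apple", "PROPN", "NNP"),
    ("is", "AUX", "VBZ"),
    ("looking", "VERB", "VBG"),
    ("at", "ADP", "IN"),
    ("buying", "VERB", "VBG"),
    ("U.K.", "PROPN", "NNP"),
    ("startup", "NOUN", "NN"),
    ("for", "ADP", "IN"),
    ("$", "SYM", "$"),
    ("1", "NUM", "CD"),
    ("billion", "NUM", "CD"),
]

# Map each coarse-grained tag to the set of fine-grained tags seen with it.
fine_by_coarse = defaultdict(set)
for text, pos, tag in tagged:
    fine_by_coarse[pos].add(tag)

for pos, tags in fine_by_coarse.items():
    print(pos, " ----- ", sorted(tags))
```

In this sentence every coarse-grained tag happens to map to a single fine-grained tag; longer texts would typically show several fine-grained tags per coarse-grained one.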


In [10]:
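# Count how many tokens carry each coarse-grained POS tag.
# count_by returns a dict keyed by integer IDs, so look each ID up
# in the vocab to recover its readable label.
for key, val in doc.count_by(spacy.attrs.POS).items():
    print(key, doc.vocab[key].text, " ----- ", val)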

96 PROPN  -----  2
87 AUX  -----  1
100 VERB  -----  2
85 ADP  -----  2
92 NOUN  -----  1
99 SYM  -----  1
93 NUM  -----  2
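
The integer keys above are spaCy's internal IDs. If only the readable labels matter, counting the `pos_` strings directly is one possible alternative. The following is a minimal sketch using `collections.Counter`; the `pos_labels` list is hand-copied from the tagging output earlier in this section so that the snippet runs without a model, and both `pos_labels` and `pos_counts` are illustrative names. With a live `doc`, the list could be replaced by `[token.pos_ for token in doc]`.

```python
from collections import Counter

# pos_ values copied by hand, in token order, from the tagging output.
pos_labels = ["PROPN", "AUX", "VERB", "ADP", "VERB", "PROPN",
              "NOUN", "ADP", "SYM", "NUM", "NUM"]

# Counter keys are the readable labels, so no vocab lookup is needed.
pos_counts = Counter(pos_labels)
for label, count in pos_counts.most_common():
    print(label, " ----- ", count)
```

The trade-off is that this compares strings rather than integer IDs, which is unlikely to matter for a single short sentence.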


In [16]:
# Visualize the tagging result. displaCy's 'dep' style draws the dependency
# parse and shows each token's POS tag beneath it.
from spacy import displacy
displacy.render(docs=doc, style='dep', options={'distance': 120}, jupyter=True)