Part-of-speech (POS) tagging is the process of assigning each word in a text its corresponding part of speech (e.g., noun, verb, adjective), often with fine-grained subcategories. It’s a core step in many NLP pipelines because it adds syntactic information that downstream tasks (parsing, information extraction, etc.) can leverage.

In [44]:
# CC coordinating conjunction
# CD cardinal number
# DT determiner
# EX existential there (as in “there is”; think of it as “there exists”)
# FW foreign word
# IN preposition/subordinating conjunction
# JJ adjective – ‘big’
# JJR adjective, comparative – ‘bigger’
# JJS adjective, superlative – ‘biggest’
# LS list item marker – ‘1)’
# MD modal – could, will
# NN noun, singular – ‘desk’
# NNS noun, plural – ‘desks’
# NNP proper noun, singular – ‘Harrison’
# NNPS proper noun, plural – ‘Americans’
# PDT predeterminer – ‘all’ in ‘all the kids’
# POS possessive ending – ’s in ‘parent’s’
# PRP personal pronoun – I, he, she
# PRP$ possessive pronoun – my, his, hers
# RB adverb – very, silently
# RBR adverb, comparative – better
# RBS adverb, superlative – best
# RP particle – ‘up’ in ‘give up’
# TO to – ‘to’ in ‘go to the store’
# UH interjection – errrrrrrrm
# VB verb, base form – take
# VBD verb, past tense – took
# VBG verb, gerund/present participle – taking
# VBN verb, past participle – taken
# VBP verb, non-3rd person singular present – take
# VBZ verb, 3rd person singular present – takes
# WDT wh-determiner – which
# WP wh-pronoun – who, what
# WP$ possessive wh-pronoun – whose
# WRB wh-adverb – where, when
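
Many of the tags listed above fall into families that share a two-letter prefix (NN*, VB*, JJ*, RB*). As a minimal sketch, the hypothetical helper `coarse_class` below (not part of NLTK) collapses a fine-grained tag into a coarse class. The class names, and the choice to group every remaining tag under "other", are assumptions made purely for illustration.

```python
# Hypothetical helper (not an NLTK function): map a Penn Treebank tag
# to a coarse word class using its two-letter prefix.
TAG_FAMILIES = {
    "NN": "noun",       # NN, NNS, NNP, NNPS
    "VB": "verb",       # VB, VBD, VBG, VBN, VBP, VBZ
    "JJ": "adjective",  # JJ, JJR, JJS
    "RB": "adverb",     # RB, RBR, RBS
}


def coarse_class(tag: str) -> str:
    """Return the coarse class for a tag, or 'other' if it is in no family."""
    return TAG_FAMILIES.get(tag[:2], "other")


for tag in ["NNPS", "VBZ", "JJR", "RBS", "PRP$", "RP"]:
    print(tag, "->", coarse_class(tag))
```

Matching on the first two characters keeps RP (particle) and WRB (wh-adverb) out of the adverb family, since neither begins with "RB".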

In [3]:
pargraph= """
I am indeed delighted to participate in the 21st Convocation of Sri Sathya Sai Institute of Higher Learning.
I take this opportunity to congratulate the young graduates for their achievement.
I greet the Vice Chancellor, Professors, teachers and staff for the excellent contribution in shaping young minds to contribute to the nation in multiple fields.
It is a great honour for me that the Chancellor, Swamiji, has given me this opportunity to share my thoughts at this Convocation.

Is value based education possible? Sri Sathya Sai Institute of Higher Learning has given an answer in the affirmative.
Our ultimate goal is: all human beings should be prosperous and should have all forms of security like food security, social security and future security of their children.
How to achieve them? How can a nation be secured from external and internal problems?
National security and economic prosperity are interconnected.

Sathyam, Dharma, Shanti and Prema are the eternal human values.
Efforts and endeavour are man's duty. Success or failure is God's domain.
I can see in this campus high calibre graduates bubbling with creativity.
There is a virtual presence of divine blessings all around.
I could sense intervention to alleviate the people's pain, difficulties and problems.
The integrated effect of this place is how a Guru can integrate both spiritual and material wealth.

When I was thinking what thoughts I could share with you, young graduates, a beautiful divine message was ringing in me:
"Where there is righteousness in the heart
 There is a beauty in the character.
 When there is beauty in the character,
 There is harmony in the home.
 When there is harmony in the home,
 There is order in the nation.
 When there is order in the nation,
 There is peace in the world."

Thinking is progress. Non-thinking is destruction to the individual, organization and the country.
Thinking leads to action. Knowledge without action is useless and irrelevant.
Knowledge with action brings prosperity.

I would like you, dear youth, to have a mind to explore every aspect of human life.
Look at the sky. We are not alone. The whole universe is friendly to us and conspires to give the best to those who dream.
Like Chandrasekhar Subramaniam discovered the black hole using Chandrasekhar's limit.
Like Sir C.V. Raman looked at the sea and questioned why it is blue, leading to the Raman Effect.
Like Albert Einstein, armed with the complexity of the universe, asked questions about its nature and arrived at E = mc².

To become a developed India, the essential needs are:
(a) India has to be economically and commercially powerful, aiming for 9% annual GDP growth and near-zero poverty.
(b) Near self-reliance in defence equipment with no umbilical attached to the outside world.
(c) India should have a right place in world forums.

Technology Vision 2020 is a pathway to realise this cherished mission.
We have identified five areas for integrated action:
  1. Agriculture and food processing
  2. Reliable and quality electric power for all parts of the country
  3. Education and Healthcare
  4. Information Communication Technology
  5. Strategic sectors (nuclear, space, defence, advanced sensors and materials)

These five areas are closely inter-related and will lead to national, food, and economic security.
A strong partnership among R&D, academia, industry, the community, and government will be essential to accomplish the vision.
"""

In [4]:
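import nltk
from nltk.corpus import stopwords

# Split the text into sentences (requires the 'punkt' tokenizer model, downloaded in a later cell).
sentences = nltk.sent_tokenize(pargraph)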

In [5]:
sentences

['\nI am indeed delighted to participate in the 21st Convocation of Sri Sathya Sai Institute of Higher Learning.',
 'I take this opportunity to congratulate the young graduates for their achievement.',
 'I greet the Vice Chancellor, Professors, teachers and staff for the excellent contribution in shaping young minds to contribute to the nation in multiple fields.',
 'It is a great honour for me that the Chancellor, Swamiji, has given me this opportunity to share my thoughts at this Convocation.',
 'Is value based education possible?',
 'Sri Sathya Sai Institute of Higher Learning has given an answer in the affirmative.',
 'Our ultimate goal is: all human beings should be prosperous and should have all forms of security like food security, social security and future security of their children.',
 'How to achieve them?',
 'How can a nation be secured from external and internal problems?',
 'National security and economic prosperity are interconnected.',
 'Sathyam, Dharma, Shanti and Prem

In [6]:
import nltk
from nltk.corpus import stopwords

# Download the required resources into your nltk_data folder:
nltk.download('punkt')                           # sentence/word tokenizer
nltk.download('stopwords')                       # stop-word lists
nltk.download('averaged_perceptron_tagger_eng')  # POS tagger model


[nltk_data] Downloading package punkt to C:\Users\PC/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to C:\Users\PC/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger_eng to
[nltk_data]     C:\Users\PC/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger_eng is already up-to-
[nltk_data]       date!


True
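
The loop in the next cell removes stop words before tagging. Because a tagger uses neighbouring words as context, an alternative worth trying is to tag the complete token list first and filter afterwards. The sketch below assumes a hypothetical helper, `tag_then_filter`, that accepts any tagger callable (for example `nltk.pos_tag`); lower-casing before the stop-word comparison is also an assumption and differs from the case-sensitive check used in the notebook.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

TaggedTokens = List[Tuple[str, str]]


def tag_then_filter(
    tokens: Sequence[str],
    tagger: Callable[[List[str]], TaggedTokens],
    stop_words: Iterable[str],
) -> TaggedTokens:
    """Tag the full token list, then drop stop words from the tagged result."""
    tagged = tagger(list(tokens))                 # tagger sees the whole sentence
    stop = {word.lower() for word in stop_words}  # build the lookup set once
    return [(word, tag) for word, tag in tagged if word.lower() not in stop]


# Possible usage with NLTK, assuming the resources above are installed:
#   tokens = nltk.word_tokenize(sentences[0])
#   print(tag_then_filter(tokens, nltk.pos_tag, stopwords.words('english')))
```

Passing the tagger in as an argument keeps the helper independent of any one library, so the same filtering step can be reused with a different tagger.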

In [7]:
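# Find the POS tags for each sentence.
# Note: stop words are removed *before* tagging here, which strips context the
# tagger relies on and likely contributes to mis-tags in the output
# (e.g. 'participate' tagged as JJ). The comparison is case-sensitive, so
# capitalised words such as 'I' and 'It' are kept.
stop_words = set(stopwords.words('english'))  # build the set once, not once per word

for sentence in sentences:
    words = nltk.word_tokenize(sentence)
    words = [word for word in words if word not in stop_words]
    tagged = nltk.pos_tag(words)
    print(tagged)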

[('I', 'PRP'), ('indeed', 'RB'), ('delighted', 'VBD'), ('participate', 'JJ'), ('21st', 'CD'), ('Convocation', 'NNP'), ('Sri', 'NNP'), ('Sathya', 'NNP'), ('Sai', 'NNP'), ('Institute', 'NNP'), ('Higher', 'NNP'), ('Learning', 'NNP'), ('.', '.')]
[('I', 'PRP'), ('take', 'VBP'), ('opportunity', 'NN'), ('congratulate', 'NN'), ('young', 'JJ'), ('graduates', 'NNS'), ('achievement', 'NN'), ('.', '.')]
[('I', 'PRP'), ('greet', 'VBP'), ('Vice', 'NNP'), ('Chancellor', 'NNP'), (',', ','), ('Professors', 'NNP'), (',', ','), ('teachers', 'VBZ'), ('staff', 'NN'), ('excellent', 'JJ'), ('contribution', 'NN'), ('shaping', 'VBG'), ('young', 'JJ'), ('minds', 'NNS'), ('contribute', 'JJ'), ('nation', 'NN'), ('multiple', 'JJ'), ('fields', 'NNS'), ('.', '.')]
[('It', 'PRP'), ('great', 'JJ'), ('honour', 'JJ'), ('Chancellor', 'NNP'), (',', ','), ('Swamiji', 'NNP'), (',', ','), ('given', 'VBN'), ('opportunity', 'NN'), ('share', 'NN'), ('thoughts', 'NNS'), ('Convocation', 'NNP'), ('.', '.')]
[('Is', 'VBZ'), ('valu