#### Part-of-Speech Tagging

Part-of-speech (POS) tagging is the process of assigning a grammatical label (tag) to each word in a sentence based on its syntactic role and its relationship to the other words in the sentence. Each tag represents the word's lexical category, such as noun, verb, adjective, or adverb.

Here's a brief explanation of some common parts of speech, shown with their Penn Treebank tags (the tagset used by NLTK's default English tagger):

1. **Noun (NN)**: A word that represents a person, place, thing, or idea. Examples: "cat," "house," "love."

2. **Verb (VB)**: A word that expresses an action or state of being. Examples: "run," "eat," "is."

3. **Adjective (JJ)**: A word that describes or modifies a noun. Examples: "red," "happy," "tall."

4. **Adverb (RB)**: A word that modifies a verb, adjective, or other adverb, often indicating manner, time, place, degree, etc. Examples: "quickly," "very," "here."

5. **Pronoun (PRP)**: A word that substitutes for a noun or noun phrase. Examples: "he," "she," "they."

6. **Preposition (IN)**: A word that shows the relationship between a noun (or pronoun) and other words in a sentence. Examples: "in," "on," "at." In the Penn Treebank tagset, IN also covers subordinating conjunctions such as "because."

7. **Coordinating conjunction (CC)**: A word that connects words, phrases, or clauses of equal rank. Examples: "and," "but," "or."

8. **Interjection (UH)**: A word or phrase that expresses emotion or exclamation. Examples: "wow," "ouch," "hey."

POS tagging is a fundamental task in natural language processing (NLP) and is used in applications such as text analysis, information extraction, machine translation, and sentiment analysis. It exposes the grammatical structure of a sentence, which later processing steps can build on.
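As a rough illustration of how these tags can be read in practice, the sketch below maps the tag prefixes from the list above to coarse labels. The `COARSE_LABELS` table and the `coarse_label` helper are hypothetical names introduced here, the input pairs are hand-written rather than real tagger output, and grouping by prefix is a simplification (for example, it lumps `NNS` and `NNP` together as nouns).

```python
# Coarse labels for the Penn Treebank tag prefixes listed above.
# Grouping by prefix is an illustrative simplification, not an NLTK feature.
COARSE_LABELS = {
    "NN": "noun",
    "VB": "verb",
    "JJ": "adjective",
    "RB": "adverb",
    "PRP": "pronoun",
    "IN": "preposition",
    "CC": "conjunction",
    "UH": "interjection",
}


def coarse_label(tag):
    """Return the coarse label for a tag, or 'other' if no prefix matches."""
    # Try longer prefixes first so that 'PRP$' is matched by 'PRP'.
    for prefix in sorted(COARSE_LABELS, key=len, reverse=True):
        if tag.startswith(prefix):
            return COARSE_LABELS[prefix]
    return "other"


# Hand-written (word, tag) pairs in the format that nltk.pos_tag returns.
tagged = [("I", "PRP"), ("envision", "VBP"), ("three", "CD"), ("paths", "NNS")]
labels = [(word, coarse_label(tag)) for word, tag in tagged]
print(labels)
# [('I', 'pronoun'), ('envision', 'verb'), ('three', 'other'), ('paths', 'noun')]
```

Tags outside the eight categories above, such as `CD` (cardinal number), fall through to `"other"`.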

In [28]:
## Speech of Dr. APJ Abdul Kalam
paragraph = """I envision three paths for India's future. Over our 3000-year history, numerous civilizations have invaded our land, imposing their rule and ideologies. Despite this, we have refrained from imposing our will on others, respecting their freedom.

My first vision is centered around preserving this freedom. Our struggle for independence in 1857 marked the beginning of our journey towards this ideal. It is imperative that we safeguard and nurture this freedom, as it is the cornerstone of our identity and dignity.

Moving forward, I envision India as a developed nation. Despite being one of the top five economies globally, we have hesitated to recognize our potential. It is time to shed this doubt and embrace our status as a developed, self-reliant nation.

Lastly, I believe India must assert itself on the global stage. Only by demonstrating strength—both militarily and economically—will we earn respect from the international community. I draw inspiration from the remarkable individuals I have had the privilege to work with, such as Dr. Vikram Sarabhai, Professor Satish Dhawan, and Dr. Brahm Prakash, who have shaped my perspective on leadership and progress."""

In [29]:
import nltk
from nltk.corpus import stopwords

# Split the paragraph into sentences (requires NLTK's Punkt tokenizer data)
sentences = nltk.sent_tokenize(paragraph)

In [30]:
sentences


["I envision three paths for India's future.",
 'Over our 3000-year history, numerous civilizations have invaded our land, imposing their rule and ideologies.',
 'Despite this, we have refrained from imposing our will on others, respecting their freedom.',
 'My first vision is centered around preserving this freedom.',
 'Our struggle for independence in 1857 marked the beginning of our journey towards this ideal.',
 'It is imperative that we safeguard and nurture this freedom, as it is the cornerstone of our identity and dignity.',
 'Moving forward, I envision India as a developed nation.',
 'Despite being one of the top five economies globally, we have hesitated to recognize our potential.',
 'It is time to shed this doubt and embrace our status as a developed, self-reliant nation.',
 'Lastly, I believe India must assert itself on the global stage.',
 'Only by demonstrating strength—both militarily and economically—will we earn respect from the international community.',
 'I draw insp

In [31]:
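# Download the tagger model (newer NLTK releases may name this resource 'averaged_perceptron_tagger_eng')
nltk.download('averaged_perceptron_tagger')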

[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     C:\Users\mistr\AppData\Roaming\nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!


True

In [32]:
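## Find the POS tags for each sentence

# Build the stop-word set once instead of once per word
stop_words = set(stopwords.words('english'))

for sentence in sentences:
  words = nltk.word_tokenize(sentence)
  # Note: the stop-word list is lowercase, so capitalized words such as "I" or "Over" are kept.
  # Removing stop words before tagging also strips context that the tagger relies on.
  words = [word for word in words if word not in stop_words]
  pos_tag = nltk.pos_tag(words)
  print(pos_tag)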
 

[('I', 'PRP'), ('envision', 'VBP'), ('three', 'CD'), ('paths', 'NNS'), ('India', 'NNP'), ("'s", 'POS'), ('future', 'NN'), ('.', '.')]
[('Over', 'IN'), ('3000-year', 'JJ'), ('history', 'NN'), (',', ','), ('numerous', 'JJ'), ('civilizations', 'NNS'), ('invaded', 'VBD'), ('land', 'NN'), (',', ','), ('imposing', 'VBG'), ('rule', 'NN'), ('ideologies', 'NNS'), ('.', '.')]
[('Despite', 'IN'), (',', ','), ('refrained', 'VBD'), ('imposing', 'VBG'), ('others', 'NNS'), (',', ','), ('respecting', 'VBG'), ('freedom', 'NN'), ('.', '.')]
[('My', 'PRP$'), ('first', 'JJ'), ('vision', 'NN'), ('centered', 'VBN'), ('around', 'IN'), ('preserving', 'VBG'), ('freedom', 'NN'), ('.', '.')]
[('Our', 'PRP$'), ('struggle', 'NN'), ('independence', 'NN'), ('1857', 'CD'), ('marked', 'VBD'), ('beginning', 'VBG'), ('journey', 'NN'), ('towards', 'NNS'), ('ideal', 'NN'), ('.', '.')]
[('It', 'PRP'), ('imperative', 'JJ'), ('safeguard', 'JJ'), ('nurture', 'NN'), ('freedom', 'NN'), (',', ','), ('cornerstone', 'NN'), ('ident
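In the output above, "towards" is tagged NNS and "safeguard" is tagged JJ. This may be a side effect of removing stop words before tagging, since the tagger then sees sentences with words missing. One possible alternative, sketched below, is to tag the full token list first and filter afterwards, comparing in lowercase so that capitalized stop words are dropped too. The `drop_stop_words` helper and the small `STOP_WORDS` set are hypothetical stand-ins, and the tagged pairs are hand-written so the snippet runs without NLTK.

```python
# A tiny hypothetical stop-word set; a real run would use stopwords.words('english').
STOP_WORDS = {"i", "for", "our", "in", "the", "of", "this"}


def drop_stop_words(tagged_tokens, stop_words):
    """Remove stop words from already-tagged (word, tag) pairs, ignoring case."""
    return [(word, tag) for word, tag in tagged_tokens
            if word.lower() not in stop_words]


# Hand-written tags for illustration, not actual tagger output.
tagged = [("Our", "PRP$"), ("struggle", "NN"), ("for", "IN"),
          ("independence", "NN"), ("in", "IN"), ("1857", "CD")]

filtered = drop_stop_words(tagged, STOP_WORDS)
print(filtered)
# [('struggle', 'NN'), ('independence', 'NN'), ('1857', 'CD')]
```

With NLTK available, `tagged` would come from `nltk.pos_tag(nltk.word_tokenize(sentence))` on the unfiltered sentence.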

In [43]:
# Tokenize the input string into sentences
sentences = nltk.sent_tokenize("Taj Mahal is a beautiful Monument")

# Tokenize each sentence into words
tokenized_sentence = [nltk.word_tokenize(sentence) for sentence in sentences]

# Perform POS tagging on the tokenized sentences
pos_tagged_sentences = nltk.pos_tag_sents(tokenized_sentence)

print(pos_tagged_sentences)

[[('Taj', 'NNP'), ('Mahal', 'NNP'), ('is', 'VBZ'), ('a', 'DT'), ('beautiful', 'JJ'), ('Monument', 'NN')]]
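Because `pos_tag_sents` returns one list of pairs per sentence, the result usually needs flattening before it can be summarized. The sketch below shows one way to count tags and pull out proper nouns. The tagged data is copied by hand from the output above so the snippet runs without NLTK, and the names `all_pairs`, `tag_counts`, and `proper_nouns` are introduced here for illustration.

```python
from collections import Counter

# Output of the cell above, copied by hand so the snippet runs without NLTK.
pos_tagged_sentences = [[("Taj", "NNP"), ("Mahal", "NNP"), ("is", "VBZ"),
                         ("a", "DT"), ("beautiful", "JJ"), ("Monument", "NN")]]

# Flatten the list of sentences into a single list of (word, tag) pairs.
all_pairs = [pair for sentence in pos_tagged_sentences for pair in sentence]

# Count how often each tag occurs and collect the proper nouns (NNP).
tag_counts = Counter(tag for _, tag in all_pairs)
proper_nouns = [word for word, tag in all_pairs if tag == "NNP"]

print(tag_counts["NNP"])  # 2
print(proper_nouns)       # ['Taj', 'Mahal']
```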


In [41]:
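# Pitfall: pos_tag expects a list of tokens, so a whole sentence in a one-item list is tagged as a single token
nltk.pos_tag(["Taj Mahal is a beautiful Monument"])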

[('Taj Mahal is a beautiful Monument', 'NN')]

In [44]:
"Taj Mahal is a beautiful Monument".split()

['Taj', 'Mahal', 'is', 'a', 'beautiful', 'Monument']

In [48]:
print(nltk.pos_tag("Taj Mahal is a beautiful Monument".split()))

[('Taj', 'NNP'), ('Mahal', 'NNP'), ('is', 'VBZ'), ('a', 'DT'), ('beautiful', 'JJ'), ('Monument', 'NN')]
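Splitting on whitespace works for this sentence only because it has no punctuation. The sketch below shows what happens once a full stop is added, along with a rough regular-expression tokenizer for comparison. The regex is an assumption chosen for illustration and is not how `nltk.word_tokenize` works internally; for real text, `nltk.word_tokenize` remains the safer choice.

```python
import re

text = "Taj Mahal is a beautiful Monument."

# str.split() only splits on whitespace, so the full stop stays attached to the last word.
whitespace_tokens = text.split()
print(whitespace_tokens)
# ['Taj', 'Mahal', 'is', 'a', 'beautiful', 'Monument.']

# Rough stand-in tokenizer: runs of word characters, or single punctuation marks.
regex_tokens = re.findall(r"\w+|[^\w\s]", text)
print(regex_tokens)
# ['Taj', 'Mahal', 'is', 'a', 'beautiful', 'Monument', '.']
```

A token such as `'Monument.'` is unlikely to match anything the tagger has seen, so separating punctuation before tagging generally gives more reliable tags.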
