# Part-of-speech tagging

Part-of-speech (POS) tagging labels each word in a sentence with its part of speech, such as noun, verb, adjective, or adverb. It is a fundamental task in natural language processing (NLP): the tags give later steps such as parsing, information extraction, and machine translation syntactic information about each word.

The NLTK tagger used in the cells below assigns tags from the Penn Treebank tag set:

- **CC**: Coordinating conjunction  
- **CD**: Cardinal number  
- **DT**: Determiner  
- **EX**: Existential there  
- **FW**: Foreign word  
- **IN**: Preposition or subordinating conjunction  
- **JJ**: Adjective  
- **JJR**: Adjective, comparative  
- **JJS**: Adjective, superlative  
- **LS**: List item marker  
- **MD**: Modal  
- **NN**: Noun, singular or mass  
- **NNS**: Noun, plural  
- **NNP**: Proper noun, singular  
- **NNPS**: Proper noun, plural  
- **PDT**: Predeterminer  
- **POS**: Possessive ending  
- **PRP**: Personal pronoun  
- **PRP$**: Possessive pronoun  
- **RB**: Adverb  
- **RBR**: Adverb, comparative  
- **RBS**: Adverb, superlative  
- **RP**: Particle  
- **SYM**: Symbol  
- **TO**: to  
- **UH**: Interjection  
- **VB**: Verb, base form  
- **VBD**: Verb, past tense  
- **VBG**: Verb, gerund or present participle  
- **VBN**: Verb, past participle  
- **VBP**: Verb, non-3rd person singular present  
- **VBZ**: Verb, 3rd person singular present  
- **WDT**: Wh-determiner  
- **WP**: Wh-pronoun  
- **WP$**: Possessive wh-pronoun  
- **WRB**: Wh-adverb
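
To make tagged output easier to read, the codes above can be kept in a lookup table. The following is a minimal sketch, not part of NLTK: `TAG_DESCRIPTIONS` (a hand-copied subset of the list above) and `describe` are hypothetical names introduced here for illustration.

```python
# Hypothetical helper: a hand-copied subset of the tag list above.
TAG_DESCRIPTIONS = {
    "DT": "Determiner",
    "JJ": "Adjective",
    "NN": "Noun, singular or mass",
    "NNP": "Proper noun, singular",
    "VBD": "Verb, past tense",
    "VBZ": "Verb, 3rd person singular present",
}


def describe(tagged_words):
    """Return (word, tag, description) triples for (word, tag) pairs.

    Tags missing from TAG_DESCRIPTIONS (punctuation, for example) get
    the placeholder description "unknown".
    """
    return [
        (word, tag, TAG_DESCRIPTIONS.get(tag, "unknown"))
        for word, tag in tagged_words
    ]


# A few pairs in the (word, tag) shape that nltk.pos_tag returns.
sample = [("a", "DT"), ("curious", "JJ"), ("rabbit", "NN"), (".", ".")]
for word, tag, description in describe(sample):
    print(f"{word:<8} {tag:<4} {description}")
```

Extending the dictionary to all of the tags listed above is mechanical; the subset only keeps the sketch short.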

In [5]:
corpus = """Once upon a time, in a colorful forest, there lived a curious rabbit named Ruby. Ruby loved to explore and make new friends. One sunny morning, she hopped out of her burrow and met a wise old owl named Oliver. Oliver told Ruby about a hidden garden filled with magical flowers.

Excited, Ruby invited her friends—Benny the bear and Sally the squirrel—to join the adventure. Together, they followed clues, crossed a sparkling stream, and solved riddles left by the clever fox, Felix. At last, they found the secret garden, where the flowers glowed in every color of the rainbow.

The friends danced and played among the blossoms, promising to always help each other and share their discoveries. From that day on, Ruby, Benny, Sally, and Oliver explored the forest together, making every day a new adventure."""

In [17]:
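import nltk
from nltk.tokenize import sent_tokenize

nltk.download('punkt_tab', quiet=True)  # tokenizer models (resource name assumes NLTK 3.9+)
sentences = sent_tokenize(corpus)  # split the story into sentences
print(sentences)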

['Once upon a time, in a colorful forest, there lived a curious rabbit named Ruby.', 'Ruby loved to explore and make new friends.', 'One sunny morning, she hopped out of her burrow and met a wise old owl named Oliver.', 'Oliver told Ruby about a hidden garden filled with magical flowers.', 'Excited, Ruby invited her friends—Benny the bear and Sally the squirrel—to join the adventure.', 'Together, they followed clues, crossed a sparkling stream, and solved riddles left by the clever fox, Felix.', 'At last, they found the secret garden, where the flowers glowed in every color of the rainbow.', 'The friends danced and played among the blossoms, promising to always help each other and share their discoveries.', 'From that day on, Ruby, Benny, Sally, and Oliver explored the forest together, making every day a new adventure.']


In [19]:
import nltk
from nltk import word_tokenize, pos_tag

# Download the English perceptron tagger model (needed once)
nltk.download('averaged_perceptron_tagger_eng')

# Tokenize every sentence and flatten the tokens into one list
all_words = [word for sentence in sentences for word in word_tokenize(sentence)]

# Tag the flat token list; the result is a list of (word, tag) pairs
pos_tags = pos_tag(all_words)
print(pos_tags)

[nltk_data] Downloading package averaged_perceptron_tagger_eng to
[nltk_data]     C:\Users\Prithivi\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping taggers\averaged_perceptron_tagger_eng.zip.


[('Once', 'RB'), ('upon', 'IN'), ('a', 'DT'), ('time', 'NN'), (',', ','), ('in', 'IN'), ('a', 'DT'), ('colorful', 'JJ'), ('forest', 'NN'), (',', ','), ('there', 'EX'), ('lived', 'VBD'), ('a', 'DT'), ('curious', 'JJ'), ('rabbit', 'NN'), ('named', 'VBN'), ('Ruby', 'NNP'), ('.', '.'), ('Ruby', 'NNP'), ('loved', 'VBD'), ('to', 'TO'), ('explore', 'VB'), ('and', 'CC'), ('make', 'VB'), ('new', 'JJ'), ('friends', 'NNS'), ('.', '.'), ('One', 'CD'), ('sunny', 'JJ'), ('morning', 'NN'), (',', ','), ('she', 'PRP'), ('hopped', 'VBD'), ('out', 'IN'), ('of', 'IN'), ('her', 'PRP$'), ('burrow', 'NN'), ('and', 'CC'), ('met', 'VBD'), ('a', 'DT'), ('wise', 'NN'), ('old', 'JJ'), ('owl', 'NN'), ('named', 'VBN'), ('Oliver', 'NNP'), ('.', '.'), ('Oliver', 'NNP'), ('told', 'VBD'), ('Ruby', 'NNP'), ('about', 'IN'), ('a', 'DT'), ('hidden', 'JJ'), ('garden', 'NN'), ('filled', 'VBN'), ('with', 'IN'), ('magical', 'JJ'), ('flowers', 'NNS'), ('.', '.'), ('Excited', 'NNP'), (',', ','), ('Ruby', 'NNP'), ('invited', 'VBD

In [20]:
pos_tags

[('Once', 'RB'),
 ('upon', 'IN'),
 ('a', 'DT'),
 ('time', 'NN'),
 (',', ','),
 ('in', 'IN'),
 ('a', 'DT'),
 ('colorful', 'JJ'),
 ('forest', 'NN'),
 (',', ','),
 ('there', 'EX'),
 ('lived', 'VBD'),
 ('a', 'DT'),
 ('curious', 'JJ'),
 ('rabbit', 'NN'),
 ('named', 'VBN'),
 ('Ruby', 'NNP'),
 ('.', '.'),
 ('Ruby', 'NNP'),
 ('loved', 'VBD'),
 ('to', 'TO'),
 ('explore', 'VB'),
 ('and', 'CC'),
 ('make', 'VB'),
 ('new', 'JJ'),
 ('friends', 'NNS'),
 ('.', '.'),
 ('One', 'CD'),
 ('sunny', 'JJ'),
 ('morning', 'NN'),
 (',', ','),
 ('she', 'PRP'),
 ('hopped', 'VBD'),
 ('out', 'IN'),
 ('of', 'IN'),
 ('her', 'PRP$'),
 ('burrow', 'NN'),
 ('and', 'CC'),
 ('met', 'VBD'),
 ('a', 'DT'),
 ('wise', 'NN'),
 ('old', 'JJ'),
 ('owl', 'NN'),
 ('named', 'VBN'),
 ('Oliver', 'NNP'),
 ('.', '.'),
 ('Oliver', 'NNP'),
 ('told', 'VBD'),
 ('Ruby', 'NNP'),
 ('about', 'IN'),
 ('a', 'DT'),
 ('hidden', 'JJ'),
 ('garden', 'NN'),
 ('filled', 'VBN'),
 ('with', 'IN'),
 ('magical', 'JJ'),
 ('flowers', 'NNS'),
 ('.', '.'),
 ('Excit
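
Because related tags share a prefix (all noun tags start with `NN`, all verb tags with `VB`), a tagged list can be filtered by tag prefix. The sketch below assumes nothing beyond the standard library: the pairs are copied from the third sentence of the output above so that it runs without NLTK, and `sentence_tags` and `words_with_prefix` are hypothetical names chosen for illustration.

```python
# (word, tag) pairs copied from the third sentence of the output above.
sentence_tags = [
    ('One', 'CD'), ('sunny', 'JJ'), ('morning', 'NN'), (',', ','),
    ('she', 'PRP'), ('hopped', 'VBD'), ('out', 'IN'), ('of', 'IN'),
    ('her', 'PRP$'), ('burrow', 'NN'), ('and', 'CC'), ('met', 'VBD'),
    ('a', 'DT'), ('wise', 'NN'), ('old', 'JJ'), ('owl', 'NN'),
    ('named', 'VBN'), ('Oliver', 'NNP'), ('.', '.'),
]


def words_with_prefix(pairs, prefix):
    """Return the words whose tag starts with the given prefix."""
    return [word for word, tag in pairs if tag.startswith(prefix)]


nouns = words_with_prefix(sentence_tags, 'NN')  # NN, NNS, NNP, NNPS
verbs = words_with_prefix(sentence_tags, 'VB')  # VB, VBD, VBG, VBN, VBP, VBZ
print(nouns)  # ['morning', 'burrow', 'wise', 'owl', 'Oliver']
print(verbs)  # ['hopped', 'met', 'named']
```

The noun list includes "wise", which the tagger labeled `NN` in the output above even though it acts as an adjective in this sentence. Taggers make occasional errors like this, so filtered results are worth spot-checking.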

In [21]:
print(nltk.pos_tag('taj mahal is a beautiful monument'.split()))

[('taj', 'NN'), ('mahal', 'NN'), ('is', 'VBZ'), ('a', 'DT'), ('beautiful', 'JJ'), ('monument', 'NN')]
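
Once a sentence is tagged, a common next step is to summarize the tags. This is a minimal sketch using only the standard library, with the pairs copied from the output above so that it runs without NLTK; `tagged`, `tag_counts`, and `words_by_tag` are names chosen here for illustration.

```python
from collections import Counter, defaultdict

# (word, tag) pairs copied from the output above.
tagged = [('taj', 'NN'), ('mahal', 'NN'), ('is', 'VBZ'),
          ('a', 'DT'), ('beautiful', 'JJ'), ('monument', 'NN')]

# Count how often each tag occurs.
tag_counts = Counter(tag for _, tag in tagged)

# Group the words under their tags, preserving sentence order.
words_by_tag = defaultdict(list)
for word, tag in tagged:
    words_by_tag[tag].append(word)

print(tag_counts.most_common(1))  # [('NN', 3)]
print(words_by_tag['NN'])         # ['taj', 'mahal', 'monument']
```

Both "taj" and "mahal" fall under `NN` here because that is how the tagger labeled the lowercase input, whereas the capitalized names in the story were tagged `NNP`.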
