# POS-Tagging

POS stands for Part of Speech.

POS tagging is the process of assigning each word in a sentence its grammatical category, such as:

- Noun (NN): dog, computer
- Verb (VB): run, eat
- Adjective (JJ): big, beautiful
- Adverb (RB): quickly, very

It is one of the fundamental steps in NLP for understanding sentence structure.
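
To make the idea of "assigning each word its category" concrete, here is a minimal sketch of a dictionary-lookup tagger built only from the example words listed above. The names `LEXICON`, `lookup_tag`, and the `"UNK"` fallback are hypothetical, chosen for illustration; real taggers such as the one used later in this notebook also rely on context, which this toy ignores.

```python
# Toy lexicon built from the example words above (Penn Treebank-style tags).
LEXICON = {
    "dog": "NN", "computer": "NN",
    "run": "VB", "eat": "VB",
    "big": "JJ", "beautiful": "JJ",
    "quickly": "RB", "very": "RB",
}

def lookup_tag(words, default="UNK"):
    """Pair each word with its lexicon tag, or `default` if the word is unknown."""
    return [(w, LEXICON.get(w.lower(), default)) for w in words]

print(lookup_tag(["big", "dog", "eat", "very", "quickly"]))
# [('big', 'JJ'), ('dog', 'NN'), ('eat', 'VB'), ('very', 'RB'), ('quickly', 'RB')]
```

A pure lookup cannot disambiguate words that belong to several categories (for example, "run" as a noun), which is why practical taggers use the surrounding words as well.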

## Why POS Tagging Is Important

POS tagging helps in syntactic parsing and chunking. It is also useful for:

- Named Entity Recognition (NER)
- Text-to-speech systems
- Sentiment analysis
- Machine translation
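
As a rough illustration of how tags support chunking, the sketch below groups consecutive `NOUN` tokens into candidate noun chunks. It is a deliberately simplified stand-in for a real chunker, not NLTK's chunking API: the function name `noun_chunks` and the hand-written tagged tokens are assumptions made for this example.

```python
from itertools import groupby

# Hand-written (word, tag) pairs using universal-style tags; illustrative only.
tagged = [
    ("highly", "ADV"), ("valued", "VERB"), ("in", "ADP"),
    ("data", "NOUN"), ("science", "NOUN"), ("and", "CONJ"),
    ("analytics", "NOUN"), (".", "."),
]

def noun_chunks(tagged_tokens):
    """Join each maximal run of NOUN-tagged tokens into one chunk string."""
    chunks = []
    for is_noun, group in groupby(tagged_tokens, key=lambda pair: pair[1] == "NOUN"):
        if is_noun:
            chunks.append(" ".join(word for word, _ in group))
    return chunks

print(noun_chunks(tagged))
# ['data science', 'analytics']
```

Without the tags there would be no signal that "data science" forms a single unit, which is the kind of structure downstream tasks such as NER build on.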

import nltk
from collections import Counter
from nltk.tokenize import word_tokenize
from nltk.util import bigrams

nltk.download('punkt')
nltk.download('punkt_tab')
nltk.download('averaged_perceptron_tagger_eng')
nltk.download('universal_tagset')

text = """Your knowledge might feel useless because you're in an environment
that doesn't value your skills, but they’re highly valued in data science and analytics."""

print("=== INPUT TEXT ===")
print(text)

# POS Tagging
words = word_tokenize(text)
pos_tags = nltk.pos_tag(words, tagset='universal')

print("\n=== (a) POS TAGGING ===")
for w, t in pos_tags:
    print(f"{w:15} -> {t}")

# Tag Frequency
tags = [t for _, t in pos_tags]
print("\n=== (b) TAG FREQUENCY ===")
for tag, freq in Counter(tags).most_common():
    print(f"{tag:5} -> {freq}")

# Tags that immediately follow a NOUN
tag_bigrams = list(bigrams(tags))
after_noun = Counter(t2 for t1, t2 in tag_bigrams if t1 == 'NOUN')

print("\n=== (c) MOST COMMON TAGS AFTER NOUN ===")
for tag, freq in after_noun.most_common():
    print(f"{tag:5} -> {freq}")

=== INPUT TEXT ===
Your knowledge might feel useless because you're in an environment
that doesn't value your skills, but they’re highly valued in data science and analytics.

=== (a) POS TAGGING ===
Your            -> PRON
knowledge       -> NOUN
might           -> VERB
feel            -> VERB
useless         -> ADV
because         -> ADP
you             -> PRON
're             -> VERB
in              -> ADP
an              -> DET
environment     -> NOUN
that            -> DET
does            -> VERB
n't             -> ADV
value           -> NOUN
your            -> PRON
skills          -> NOUN
,               -> .
but             -> CONJ
they            -> PRON
’               -> VERB
re              -> ADJ
highly          -> ADV
valued          -> VERB
in              -> ADP
data            -> NOUN
science         -> NOUN
and             -> CONJ
analytics       -> NOUN
.               -> .

=== (b) TAG FREQUENCY ===
NOUN  -> 7
VERB  -> 6
PRON  -> 4
ADV   -> 3
ADP   -> 3
DET   -> 2
.     -> 2
CONJ  -> 2
ADJ   -> 1
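
The tag frequencies in (b), and the tags-after-NOUN tally that part (c) of the code computes, can be reproduced from the (a) listing alone, without rerunning NLTK. The sketch below hard-codes the tagged pairs from that listing and uses `zip` over the tag sequence in place of `nltk.util.bigrams`; that substitution is a choice made here so the example has no dependencies.

```python
from collections import Counter

# (word, tag) pairs copied from the "(a) POS TAGGING" output above.
pos_tags = [
    ("Your", "PRON"), ("knowledge", "NOUN"), ("might", "VERB"), ("feel", "VERB"),
    ("useless", "ADV"), ("because", "ADP"), ("you", "PRON"), ("'re", "VERB"),
    ("in", "ADP"), ("an", "DET"), ("environment", "NOUN"), ("that", "DET"),
    ("does", "VERB"), ("n't", "ADV"), ("value", "NOUN"), ("your", "PRON"),
    ("skills", "NOUN"), (",", "."), ("but", "CONJ"), ("they", "PRON"),
    ("\u2019", "VERB"), ("re", "ADJ"), ("highly", "ADV"), ("valued", "VERB"),
    ("in", "ADP"), ("data", "NOUN"), ("science", "NOUN"), ("and", "CONJ"),
    ("analytics", "NOUN"), (".", "."),
]

tags = [tag for _, tag in pos_tags]

# (b) How often each tag occurs.
tag_freq = Counter(tags)

# (c) Pair each tag with its successor and count the successors of NOUN.
after_noun = Counter(nxt for cur, nxt in zip(tags, tags[1:]) if cur == "NOUN")

for tag, freq in tag_freq.most_common():
    print(f"{tag:5} -> {freq}")
print()
for tag, freq in after_noun.most_common():
    print(f"{tag:5} -> {freq}")
```

On this sentence, punctuation (`.`) follows a noun twice, while `VERB`, `DET`, `PRON`, `NOUN`, and `CONJ` each follow a noun once.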

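
One detail in the (a) listing deserves attention: the straight-apostrophe contraction "you're" is split into `you` and `'re` and tagged `PRON` / `VERB`, whereas "they’re", written with a curly apostrophe (U+2019), is split into `’` and `re`, which are then tagged `VERB` and `ADJ`. A common remedy is to normalize typographic quotes before tokenizing. The following is a minimal sketch of that preprocessing step; `QUOTE_MAP` and `normalize_quotes` are hypothetical names, and the effect on the tagger's output is not shown here because it was not part of the original run.

```python
# Map typographic quotes to their ASCII equivalents before tokenizing.
QUOTE_MAP = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark (used as apostrophe)
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})

def normalize_quotes(text: str) -> str:
    """Return `text` with curly quotes replaced by straight ASCII quotes."""
    return text.translate(QUOTE_MAP)

sample = "but they\u2019re highly valued"
print(normalize_quotes(sample))
# but they're highly valued
```

In the notebook above, this would be applied as `words = word_tokenize(normalize_quotes(text))`.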
