# 🔹 Step 6: POS Tagging (Part-of-Speech)

Concept: POS tagging assigns each token a grammatical category (noun, verb, adjective, etc.).

- Goal: Identify the grammatical role each word plays in its sentence.
- Use case: Supports syntactic analysis, sentiment analysis, and entity extraction.

✅ We’ll cover:

- POS tagging with NLTK and spaCy
- How POS tags help in tasks like NER, lemmatization, and syntax parsing

# Downloading the necessary packages for POS tagging

In [3]:
import nltk
from nltk import word_tokenize, pos_tag

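# Tokenizer data used by word_tokenize; uncomment if it is not installed yet
# (recent NLTK releases look for 'punkt_tab' rather than 'punkt')
# nltk.download('punkt')
# nltk.download('punkt_tab')

# Tagger models; which of the two names is loaded depends on the NLTK version
nltk.download('averaged_perceptron_tagger')
nltk.download('averaged_perceptron_tagger_eng')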


[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     c:\Users\jsril\anaconda3\envs\nlp_env\nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
[nltk_data] Downloading package averaged_perceptron_tagger_eng to
[nltk_data]     c:\Users\jsril\anaconda3\envs\nlp_env\nltk_data...
[nltk_data]   Unzipping taggers\averaged_perceptron_tagger_eng.zip.


Each word token is tagged with its part of speech (noun, adjective, etc.).

In [4]:

text = "Apple is looking at buying U.K. startup for $1 billion"

# Tokenize and POS tag
tokens = word_tokenize(text)
pos_tags = pos_tag(tokens)

print(pos_tags)


[('Apple', 'NNP'), ('is', 'VBZ'), ('looking', 'VBG'), ('at', 'IN'), ('buying', 'VBG'), ('U.K.', 'NNP'), ('startup', 'NN'), ('for', 'IN'), ('$', '$'), ('1', 'CD'), ('billion', 'CD')]
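The tagged list is plain Python data, so it can be summarized without further NLTK calls. The sketch below copies the output above into a literal, counts how often each tag occurs, and pulls out the proper nouns as a rough first pass at entity candidates. The variable names `tag_counts` and `proper_nouns` are illustrative, and filtering on `NNP`/`NNPS` is only a simple heuristic, not a substitute for a real NER model.

```python
from collections import Counter

# Tagged output copied from the cell above, so no NLTK data is needed here.
pos_tags = [
    ("Apple", "NNP"), ("is", "VBZ"), ("looking", "VBG"), ("at", "IN"),
    ("buying", "VBG"), ("U.K.", "NNP"), ("startup", "NN"), ("for", "IN"),
    ("$", "$"), ("1", "CD"), ("billion", "CD"),
]

# Count how often each tag occurs in the sentence.
tag_counts = Counter(tag for _, tag in pos_tags)

# Proper nouns (singular NNP, plural NNPS) are rough entity candidates.
proper_nouns = [word for word, tag in pos_tags if tag in {"NNP", "NNPS"}]

print(tag_counts)
print(proper_nouns)  # ['Apple', 'U.K.']
```

Here both `Apple` and `U.K.` surface as candidates, which illustrates why POS tags are often used as an input feature for entity extraction.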


In [6]:
nltk.download('tagsets_json')



[nltk_data] Downloading package tagsets_json to
[nltk_data]     c:\Users\jsril\anaconda3\envs\nlp_env\nltk_data...
[nltk_data]   Unzipping help\tagsets_json.zip.


In [7]:
nltk.help.upenn_tagset('NNP')

NNP: noun, proper, singular
    Motown Venneboerger Czestochwa Ranzer Conchita Trumplane Christos
    Oceanside Escobar Kreisler Sawyer Cougar Yvette Ervin ODI Darryl CTCA
    Shannon A.K.C. Meltex Liverpool ...


In [8]:
nltk.help.upenn_tagset('VBZ')

VBZ: verb, present tense, 3rd person singular
    bases reconstructs marks mixes displeases seals carps weaves snatches
    slumps stretches authorizes smolders pictures emerges stockpiles
    seduces fizzes uses bolsters slaps speaks pleads ...


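Tags that appear in the tagged sentence:

- NNP → Proper noun, singular
- VBZ → Verb, present tense, 3rd person singular
- VBG → Verb, gerund or present participle
- IN → Preposition or subordinating conjunction
- NN → Noun, singular or mass
- CD → Cardinal number
- $ → Dollar sign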
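One way these tags feed into lemmatization: WordNet-based lemmatizers expect a coarse part of speech (`n`, `v`, `a`, `r`) rather than a Penn Treebank tag. The sketch below shows one common way to bridge the two. `penn_to_wordnet` is a hypothetical helper written for this example, not an NLTK function, and it assumes that anything other than an adjective, verb, or adverb can be treated as a noun.

```python
def penn_to_wordnet(tag: str) -> str:
    """Map a Penn Treebank tag to a WordNet POS letter (illustrative helper)."""
    if tag.startswith("JJ"):
        return "a"  # adjective: JJ, JJR, JJS
    if tag.startswith("VB"):
        return "v"  # verb: VB, VBD, VBG, VBN, VBP, VBZ
    if tag.startswith("RB"):
        return "r"  # adverb: RB, RBR, RBS
    return "n"      # assumption: fall back to noun for every other tag


# A few tagged tokens from the sentence used in this section.
tagged = [("is", "VBZ"), ("looking", "VBG"), ("startup", "NN"), ("Apple", "NNP")]

# Pair each word with the coarse POS a WordNet lemmatizer would expect.
wordnet_ready = [(word, penn_to_wordnet(tag)) for word, tag in tagged]
print(wordnet_ready)
# [('is', 'v'), ('looking', 'v'), ('startup', 'n'), ('Apple', 'n')]
```

Each pair could then be passed to `WordNetLemmatizer().lemmatize(word, pos=...)` from `nltk.stem`, which needs the `wordnet` corpus downloaded first; that step is left out here so the sketch runs without extra data.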