✅ Lemmatization in NLP
Lemmatization is a process in Natural Language Processing (NLP) that reduces a word to its base or dictionary form (lemma) while ensuring that it remains a valid word.

Unlike stemming, which simply strips suffixes by rule, lemmatization looks each word up in a lexical database such as WordNet and uses the word's part of speech to pick its dictionary form.
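To make the contrast concrete, here is a minimal sketch with two toy functions. Neither reproduces NLTK's actual algorithms: `SUFFIXES`, `LEMMA_TABLE`, `toy_stem`, and `toy_lemmatize` are hypothetical names invented for this illustration, and the tiny lookup table only stands in for a real lexical database.

```python
# Toy illustration only: neither function reproduces NLTK's actual algorithms.

# Hypothetical, deliberately crude suffix list for the "stemmer".
SUFFIXES = ("ies", "ing", "ed", "s")

# Hypothetical mini-dictionary standing in for a lexical database such as WordNet.
LEMMA_TABLE = {
    ("studies", "v"): "study",
    ("ran", "v"): "run",
    ("better", "a"): "good",
}


def toy_stem(word):
    """Strip the first matching suffix without checking that the result is a word."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def toy_lemmatize(word, pos):
    """Look up the (word, pos) pair; fall back to the word itself if it is unknown."""
    return LEMMA_TABLE.get((word, pos), word)


for word, pos in [("studies", "v"), ("ran", "v"), ("better", "a")]:
    print(f"{word}: stem={toy_stem(word)} lemma={toy_lemmatize(word, pos)}")
```

The rule-based function turns "studies" into the non-word "stud" and cannot relate "ran" to "run", while the lookup-based one returns dictionary forms for the entries it knows.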


When to Use Lemmatization?
✔ When accurate dictionary-based normalization is required

✔ When words must remain meaningful (unlike stemming)

✔ When working on machine learning and NLP tasks like sentiment analysis


✅ Lemmatization is preferred over stemming for advanced NLP tasks since it ensures the output remains a valid word. 🚀

In [8]:
import nltk
nltk.download('wordnet')


[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\naman\AppData\Roaming\nltk_data...


True

In [10]:
from nltk.stem import WordNetLemmatizer

In [24]:
words = [
    "running", "runs", "ran", "easily", "fairly", "happily", "studying",
    "studies", "arguing", "argued", "flies", "crying", "cries", "playing",
    "played", "playing", "dancing", "danced", "better", "faster", "largest",
    "beautifully", "organization", "organizing", "organized", "happiness"
]


In [27]:
lemma = WordNetLemmatizer()

🔹 Improving Accuracy with POS Tags
To get better results, specify the Part of Speech (POS) tag:

pos='n' → Noun (default)

pos='v' → Verb

pos='a' → Adjective

pos='r' → Adverb
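In practice the POS tag usually comes from a tagger rather than being typed by hand. As a hedged sketch, assuming the tags follow the Penn Treebank convention (the tag set used by `nltk.pos_tag`), a small helper can translate them into the four letters listed above. The helper name `penn_to_wordnet_pos` and its noun fallback are assumptions made for this example, not part of NLTK.

```python
def penn_to_wordnet_pos(tag, default="n"):
    """Map a Penn Treebank tag (e.g. 'VBG', 'JJR') to a WordNet POS letter.

    Hypothetical helper: tags outside the four groups fall back to `default`
    (noun), which mirrors the lemmatizer's own default.
    """
    prefix_map = {
        "JJ": "a",  # adjectives: JJ, JJR, JJS
        "VB": "v",  # verbs: VB, VBD, VBG, VBN, VBP, VBZ
        "NN": "n",  # nouns: NN, NNS, NNP, NNPS
        "RB": "r",  # adverbs: RB, RBR, RBS
    }
    return prefix_map.get(tag[:2].upper(), default)


for tag in ["VBG", "JJR", "RB", "NNS", "DT"]:
    print(tag, penn_to_wordnet_pos(tag))
```

With the lemmatizer created above, the result would be passed along as `lemma.lemmatize(word, pos=penn_to_wordnet_pos(tag))`.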

In [31]:
# Verb
for word in words:
    print(f"{word} → {lemma.lemmatize(word, pos='v')}")

running → run
runs → run
ran → run
easily → easily
fairly → fairly
happily → happily
studying → study
studies → study
arguing → argue
argued → argue
flies → fly
crying → cry
cries → cry
playing → play
played → play
playing → play
dancing → dance
danced → dance
better → better
faster → faster
largest → largest
beautifully → beautifully
organization → organization
organizing → organize
organized → organize
happiness → happiness


In [46]:
# Adjective
for word in words:
    print(f"{word} → {lemma.lemmatize(word, pos='a')}")


running → running
runs → runs
ran → ran
easily → easily
fairly → fairly
happily → happily
studying → studying
studies → studies
arguing → arguing
argued → argued
flies → flies
crying → crying
cries → cries
playing → playing
played → played
playing → playing
dancing → dancing
danced → danced
better → good
faster → fast
largest → large
beautifully → beautifully
organization → organization
organizing → organizing
organized → organized
happiness → happiness


In [47]:
# Default (noun)
for word in words:
    print(f"{word} → {lemma.lemmatize(word)}")


running → running
runs → run
ran → ran
easily → easily
fairly → fairly
happily → happily
studying → studying
studies → study
arguing → arguing
argued → argued
flies → fly
crying → cry
cries → cry
playing → playing
played → played
playing → playing
dancing → dancing
danced → danced
better → better
faster → faster
largest → largest
beautifully → beautifully
organization → organization
organizing → organizing
organized → organized
happiness → happiness


In [48]:
# Adverb
for word in words:
    print(f"{word} → {lemma.lemmatize(word, pos='r')}")


running → running
runs → runs
ran → ran
easily → easily
fairly → fairly
happily → happily
studying → studying
studies → studies
arguing → arguing
argued → argued
flies → flies
crying → crying
cries → cries
playing → playing
played → played
playing → playing
dancing → dancing
danced → danced
better → well
faster → faster
largest → largest
beautifully → beautifully
organization → organization
organizing → organizing
organized → organized
happiness → happiness
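The four runs show that each POS setting normalizes a different subset of the list. When no tag is available for a word, one possible heuristic is to try each tag in turn and keep the first result that differs from the input. The sketch below assumes a callable with the same `(word, pos)` shape as `lemma.lemmatize`; `lemmatize_any_pos`, `RECORDED`, and `recorded_lemmatize` are hypothetical names, and the stub table only replays a few results printed above so the example runs without WordNet data.

```python
def lemmatize_any_pos(word, lemmatize, pos_order=("n", "v", "a", "r")):
    """Try each POS in order and return the first lemma that differs from `word`.

    Heuristic sketch, not an NLTK feature: without context it can pick the
    wrong reading (for example 'better' as adjective versus adverb).
    """
    for pos in pos_order:
        candidate = lemmatize(word, pos)
        if candidate != word:
            return candidate
    return word


# Stub replaying a few results from the printed outputs above.
RECORDED = {
    ("running", "v"): "run",
    ("studies", "n"): "study",
    ("better", "a"): "good",
    ("better", "r"): "well",
}


def recorded_lemmatize(word, pos):
    """Stand-in for lemma.lemmatize(word, pos) backed by the RECORDED table."""
    return RECORDED.get((word, pos), word)


for word in ["running", "studies", "better", "happiness"]:
    print(f"{word} → {lemmatize_any_pos(word, recorded_lemmatize)}")
```

With the real lemmatizer, the call would be `lemmatize_any_pos(word, lemma.lemmatize)`. The order of `pos_order` decides ties, so a tagger-supplied POS remains the more reliable option whenever one is available.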
