## WordNet Lemmatizer
Lemmatization is similar to stemming, but its output, called a ‘lemma’, is a root word rather than a root stem. A stem may be a truncated string that is not a real word, whereas a lemma is a valid dictionary word that carries the same meaning as the original.

NLTK provides the WordNetLemmatizer class, a thin wrapper around the WordNet corpus. The class uses the morphy() function of the WordNet CorpusReader class to find a lemma; when no lemma is found, the input word is returned unchanged. The cells after the note below walk through examples.


Note:
The WordNet Lemmatizer is a tool in the Natural Language Toolkit (NLTK) that performs lemmatization using WordNet, a lexical database of the English language. WordNet groups words into sets of synonyms called synsets and provides short definitions and usage examples for them. It also records relations among words, such as hypernyms (broader terms) and hyponyms (more specific terms).

The WordNet Lemmatizer reduces words to their lemma (base form) based on part of speech (POS) information, using WordNet’s lexicon to ensure that the result is an actual, valid word in the language. If no POS is given, the word is treated as a noun. Unlike stemming, which mostly strips suffixes by rule and can leave a non-word behind, the WordNet Lemmatizer returns dictionary words, so its results are usually more meaningful.
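The contrast between rule-based suffix stripping and a dictionary lookup can be sketched without any library. The example below is a toy illustration only: `naive_stem`, `toy_lemmatize`, the suffix list, and the five-entry lexicon are all hypothetical and far smaller than a real stemmer or WordNet.

```python
# Toy illustration: neither function reflects how NLTK is implemented.

SUFFIXES = ("ies", "ing", "ed", "es", "s")  # hypothetical suffix rules

def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# Hypothetical mini-lexicon keyed by (word, POS), standing in for WordNet.
TOY_LEXICON = {
    ("studies", "n"): "study",
    ("studies", "v"): "study",
    ("children", "n"): "child",
    ("better", "a"): "good",
    ("better", "r"): "well",
}

def toy_lemmatize(word, pos="n"):
    """Look the word up for the given POS; return it unchanged on a miss."""
    return TOY_LEXICON.get((word, pos), word)

for word, pos in [("studies", "n"), ("children", "n"), ("better", "a")]:
    print(word, "stem:", naive_stem(word), "lemma:", toy_lemmatize(word, pos))
```

The stemmer turns "studies" into the non-word "stud" and cannot handle irregular forms such as "children", while the lookup returns valid words but only for the POS it is asked about.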

In [19]:
# Common use cases for lemmatization: Q&A, chatbots, text summarization
from nltk.stem import WordNetLemmatizer

In [2]:
lemmatizer=WordNetLemmatizer()

In [8]:
# POS tags accepted by lemmatize():
#   noun      - 'n'
#   verb      - 'v'
#   adjective - 'a'
#   adverb    - 'r'
lemmatizer.lemmatize("going",pos='v')

'go'

In [9]:
words=["eating","eats","eaten","writing","writes","programming","programs","history","finally","finalized"]

In [14]:
for word in words:
    print(word+"---->"+lemmatizer.lemmatize(word,pos='v'))

eating---->eat
eats---->eat
eaten---->eat
writing---->write
writes---->write
programming---->program
programs---->program
history---->history
finally---->finally
finalized---->finalize


In [16]:
lemmatizer.lemmatize("goes",pos='v')

'go'

In [18]:
lemmatizer.lemmatize("fairly",pos='v'),lemmatizer.lemmatize("sportingly")

('fairly', 'sportingly')

In [3]:
# More examples of lemmatization

import nltk
from nltk.stem import WordNetLemmatizer
from nltk.corpus import wordnet

# Download necessary NLTK resources
nltk.download('wordnet')
nltk.download('omw-1.4')  # for additional wordnet support
nltk.download('averaged_perceptron_tagger')  # for POS tagging (newer NLTK releases may instead ask for 'averaged_perceptron_tagger_eng')

# Initialize WordNetLemmatizer
lemmatizer = WordNetLemmatizer()

# Lemmatize a few words
print("Lemmatized 'running':", lemmatizer.lemmatize('running', pos='v'))
print("Lemmatized 'better':", lemmatizer.lemmatize('better', pos='a'))   
print("Lemmatized 'children':", lemmatizer.lemmatize('children', pos='n')) 


[nltk_data] Downloading package wordnet to
[nltk_data]     /Users/koushikdev/nltk_data...
[nltk_data] Downloading package omw-1.4 to
[nltk_data]     /Users/koushikdev/nltk_data...
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /Users/koushikdev/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.


Lemmatized 'running': run
Lemmatized 'better': good
Lemmatized 'children': child


## Lemmatization with Automatic Part of Speech Detection

In [6]:
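def get_wordnet_pos(word):
    """Tag a single word and map its POS tag to the constant WordNetLemmatizer expects."""
    # pos_tag returns [(word, tag)]; keep only the first letter of the Penn Treebank tag
    tag = nltk.pos_tag([word])[0][1][0].upper()
    tag_dict = {"J": wordnet.ADJ, "N": wordnet.NOUN, "V": wordnet.VERB, "R": wordnet.ADV}
    # Fall back to noun, the lemmatizer's own default, for any other tag
    return tag_dict.get(tag, wordnet.NOUN)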


In [8]:
# Lemmatize words with automatically detected POS
words = ['running', 'better', 'children', 'studies']
lemmatized_words = [lemmatizer.lemmatize(word, get_wordnet_pos(word)) for word in words]
print("Lemmatized Words:", lemmatized_words)

Lemmatized Words: ['run', 'well', 'child', 'study']
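The helper above tags each word in isolation, so the tagger sees no surrounding context. When tags are already available from a sentence-level tagger, the mapping step can be separated out. The sketch below assumes Penn Treebank style tags; `penn_to_wordnet` and the hand-written `tagged` list are hypothetical and stand in for real tagger output.

```python
# WordNet's single-letter POS codes (the values of wordnet.ADJ, wordnet.NOUN, ...).
PENN_PREFIX_TO_WORDNET = {"J": "a", "N": "n", "V": "v", "R": "r"}

def penn_to_wordnet(tag, default="n"):
    """Map a Penn Treebank tag such as 'VBG' to a WordNet POS letter."""
    return PENN_PREFIX_TO_WORDNET.get(tag[:1].upper(), default)

# Hand-written (word, tag) pairs standing in for sentence-level tagger output.
tagged = [
    ("The", "DT"),
    ("children", "NNS"),
    ("were", "VBD"),
    ("running", "VBG"),
    ("faster", "RBR"),
]

# Convert every Penn tag to the POS letter the lemmatizer accepts.
wordnet_tagged = [(word, penn_to_wordnet(tag)) for word, tag in tagged]
print(wordnet_tagged)
```

Each resulting pair can then be passed to `lemmatizer.lemmatize(word, pos)`; tags with no WordNet counterpart, such as the determiner here, fall back to noun.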


## Conclusion
The WordNet Lemmatizer in NLTK is a useful tool for reducing words to their base forms. It goes beyond simple string manipulation (as in stemming) and uses a lexical database to ensure that the resulting words are valid and meaningful. Its accuracy depends on the part of speech it is given: with the right POS tag it returns the expected lemma (for example, 'better' becomes 'good' as an adjective), while with the wrong tag or the default noun tag a word may come back unchanged. Paired with a POS tagger, it is well suited to NLP tasks such as question answering, chatbots, and text summarization.