## WordNet Lemmatizer
Lemmatization is similar to stemming, but its output, called a lemma, is a valid dictionary word rather than a truncated stem. The lemma carries the same core meaning as the original word.

NLTK provides the `WordNetLemmatizer` class, a thin wrapper around the WordNet corpus. The class uses the `morphy()` function of the WordNet `CorpusReader` class to find a lemma. Let us understand it with an example:


### What is Lemmatization?

Lemmatization is the process of reducing a word to its base or dictionary form (called a lemma) by considering context such as part of speech (POS) and morphological analysis. For example:

"running" (verb) → "run"

"better" (adjective) → "good"

"mice" (noun) → "mouse"

Lemmatization requires linguistic knowledge, unlike stemming, which strips suffixes by rule without consulting a dictionary.
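To make the contrast concrete, here is a minimal sketch that puts a naive suffix stripper next to a dictionary lookup. It does not use NLTK: `SUFFIXES`, `LEMMA_TABLE`, `naive_stem`, and `toy_lemmatize` are made-up names and data for illustration only.

```python
# Toy contrast between suffix stripping and dictionary lookup.
# SUFFIXES and LEMMA_TABLE are illustrative assumptions, not NLTK data.

SUFFIXES = ("ing", "es", "s")          # checked in this order
LEMMA_TABLE = {                        # (word, pos) -> lemma
    ("running", "v"): "run",
    ("better", "a"): "good",
    ("mice", "n"): "mouse",
}


def naive_stem(word: str) -> str:
    """Strip the first matching suffix; no dictionary is consulted."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def toy_lemmatize(word: str, pos: str) -> str:
    """Look the (word, pos) pair up; fall back to the word itself."""
    return LEMMA_TABLE.get((word, pos), word)


for word, pos in [("running", "v"), ("better", "a"), ("mice", "n")]:
    print(f"{word}: stem={naive_stem(word)!r}, lemma={toy_lemmatize(word, pos)!r}")
```

The stripper has no way to relate "better" to "good" or "mice" to "mouse", because neither pair shares a removable suffix. Only a lookup keyed on the word and its part of speech can make that connection.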

### Why is Lemmatization Important?

| Benefit                           | Explanation                                                        |
| --------------------------------- | ------------------------------------------------------------------ |
| ✅ Accurate root words             | Ensures valid dictionary forms (e.g., “was” → “be”)                |
| ✅ Context-aware normalization     | Considers word meaning and part of speech                          |
| ✅ Useful for downstream NLP tasks | Improves accuracy in parsing, tagging, NER, and sentiment analysis |




In [1]:
## Typical applications: Q&A, chatbots, text summarization

import nltk
nltk.download('wordnet')
from nltk.stem import WordNetLemmatizer

[nltk_data] Downloading package wordnet to C:\Users\Suraj
[nltk_data]     Khodade\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


In [2]:
lemmatizer=WordNetLemmatizer()

In [3]:

'''
POS tags accepted by lemmatize():
noun      - n (default)
verb      - v
adjective - a
adverb    - r
'''
lemmatizer.lemmatize("going", pos='v')

'go'

In [4]:
words = ["eating", "eats", "eaten", "writing", "writes", "programming", "programs", "historical", "finally", "finalized"]

In [5]:
for word in words:
    print(word + "---->" + lemmatizer.lemmatize(word, pos='v'))

eating---->eat
eats---->eat
eaten---->eat
writing---->write
writes---->write
programming---->program
programs---->program
historical---->historical
finally---->finally
finalized---->finalize


In [6]:
lemmatizer.lemmatize("goes",pos='v')

'go'

In [7]:
lemmatizer.lemmatize("fairly",pos='v'),lemmatizer.lemmatize("sportingly")

('fairly', 'sportingly')

## Interview Questions – Lemmatization

### Q1. What is lemmatization and how does it differ from stemming?
#### Lemmatization maps words to their canonical dictionary form using linguistic analysis, while stemming strips affixes without understanding context.

### Q2. Why is POS tagging important for lemmatization?
#### Many words have different lemmas based on their part of speech. For instance:

"better" as adjective → "good"  
"better" as verb → "better"

Providing correct POS improves lemmatization accuracy.
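In practice the POS usually comes from a tagger rather than being typed by hand. The sketch below shows one common way to map Penn Treebank tags (the tag set that `nltk.pos_tag` returns) onto the single-letter codes that `lemmatize()` accepts. The helper name `penn_to_wordnet` and the hand-written tagged sentence are assumptions for illustration.

```python
# Map Penn Treebank tags onto the POS codes used by lemmatize().

def penn_to_wordnet(tag: str) -> str:
    """Return 'a', 'v', 'r', or 'n'; fall back to noun, lemmatize()'s default."""
    if tag.startswith("J"):
        return "a"   # JJ, JJR, JJS -> adjective
    if tag.startswith("V"):
        return "v"   # VB, VBD, VBG, ... -> verb
    if tag.startswith("RB"):
        return "r"   # RB, RBR, RBS -> adverb
    return "n"       # NN* and everything else -> noun


# Hand-written (word, tag) pairs standing in for a tagger's output.
tagged = [("She", "PRP"), ("was", "VBD"), ("feeling", "VBG"), ("better", "JJR")]
print([(word, penn_to_wordnet(tag)) for word, tag in tagged])
```

With NLTK and its tagger data installed, each pair from `nltk.pos_tag(tokens)` could then be passed along as `lemmatizer.lemmatize(word, pos=penn_to_wordnet(tag))`.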

### Q3. Can you explain how WordNetLemmatizer works?
#### It looks a word up in the WordNet lexical database using the word and an optional POS tag (noun by default). It is slower than stemming, but it returns valid dictionary words.
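The lookup can be pictured with a toy, morphy-style sketch: check an exception list first, then try suffix "detachment" rules and accept a candidate only if it is a known word. This is not NLTK's implementation. The rules are loosely modeled on WordNet's documented detachment rules, and `VOCAB`, `EXCEPTIONS`, `toy_morphy`, and `toy_lemmatize` are tiny made-up stand-ins.

```python
# Toy morphy-style lookup: exception list, then suffix detachment rules,
# keeping a candidate only if it appears in the vocabulary.
# VOCAB and EXCEPTIONS are small made-up subsets, not real WordNet data.

DETACHMENT_RULES = {
    "n": [("s", ""), ("ses", "s"), ("xes", "x"), ("zes", "z"),
          ("ches", "ch"), ("shes", "sh"), ("men", "man"), ("ies", "y")],
    "v": [("s", ""), ("ies", "y"), ("es", "e"), ("es", ""),
          ("ed", "e"), ("ed", ""), ("ing", "e"), ("ing", "")],
    "a": [("er", ""), ("est", ""), ("er", "e"), ("est", "e")],
    "r": [],
}
EXCEPTIONS = {"n": {"mice": "mouse"}, "v": {"was": "be", "eaten": "eat"},
              "a": {"better": "good"}, "r": {}}
VOCAB = {"n": {"mouse", "program", "study"},
         "v": {"be", "eat", "write", "go", "finalize", "better"},
         "a": {"good", "historical"}, "r": {"finally"}}


def toy_morphy(word, pos="n"):
    """Return a lemma for `word`, or None if no candidate is in VOCAB."""
    if word in EXCEPTIONS[pos]:
        return EXCEPTIONS[pos][word]
    if word in VOCAB[pos]:
        return word
    for old, new in DETACHMENT_RULES[pos]:
        if word.endswith(old):
            candidate = word[: len(word) - len(old)] + new
            if candidate in VOCAB[pos]:
                return candidate
    return None


def toy_lemmatize(word, pos="n"):
    """Mirror the notebook's behaviour: unknown words come back unchanged."""
    lemma = toy_morphy(word, pos)
    return lemma if lemma is not None else word


for w in ["eating", "writes", "finalized", "historical", "going"]:
    print(w, "---->", toy_lemmatize(w, pos="v"))
```

The sketch also suggests why "historical" comes back unchanged with `pos='v'` in the notebook: no verb rule or verb entry matches it, so the word is returned as is.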

### Q4. In which NLP scenarios would you prefer lemmatization over stemming?
#### In tasks like named entity recognition (NER), POS tagging, machine translation, and text classification where linguistic correctness matters.

### Q5. What are the limitations of lemmatization?
#### Lemmatization has three main limitations:

It requires language-specific resources (e.g., WordNet)  
It is slower than stemming  
It may not capture domain-specific root forms unless customized
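One possible way to handle the domain-specific case is to consult a custom dictionary before falling back to a general lemmatizer. The sketch below assumes a hypothetical `CustomLemmatizer` wrapper and made-up `DOMAIN_LEMMAS` entries. The fallback here is a placeholder identity function, so the example runs without NLTK.

```python
from typing import Callable, Dict, Tuple


class CustomLemmatizer:
    """Consult a domain dictionary first, then a general-purpose fallback."""

    def __init__(self, fallback: Callable[[str, str], str],
                 overrides: Dict[Tuple[str, str], str]):
        self.fallback = fallback      # any (word, pos) -> lemma function
        self.overrides = overrides    # (lowercased word, pos) -> lemma

    def lemmatize(self, word: str, pos: str = "n") -> str:
        key = (word.lower(), pos)
        if key in self.overrides:
            return self.overrides[key]
        return self.fallback(word, pos)


# Hypothetical domain entries, made up for illustration.
DOMAIN_LEMMAS = {("k8s", "n"): "kubernetes", ("dockerized", "v"): "dockerize"}


def identity_fallback(word: str, pos: str) -> str:
    """Placeholder fallback that leaves the word unchanged."""
    return word


lem = CustomLemmatizer(identity_fallback, DOMAIN_LEMMAS)
print(lem.lemmatize("K8s"), lem.lemmatize("dockerized", pos="v"), lem.lemmatize("mice"))
```

In the notebook, the fallback could instead be `lambda w, p: lemmatizer.lemmatize(w, pos=p)`, so that words outside the custom dictionary still go through WordNet.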

### ✅ Summary
| When to Use           | Recommendation                                 |
|-----------------------|------------------------------------------------|
| Search Engines        | Stemming (performance > precision)             |
| Language Understanding| Lemmatization (accuracy > speed)               |
| Domain-specific tasks | Use lemmatizer + custom dictionaries           |


### Lemmatization vs. Stemming

| Feature           | **Lemmatization**                      | **Stemming**                        |
| ----------------- | -------------------------------------- | ----------------------------------- |
| Output            | Valid dictionary words                 | May produce non-words               |
| Context-sensitive | Yes (POS-aware)                        | No                                  |
| Accuracy          | High                                   | Low to moderate                     |
| Speed             | Slower (due to linguistic analysis)    | Faster (rule-based)                 |
| Library           | `WordNetLemmatizer`                    | `PorterStemmer`, `LancasterStemmer` |
| Example           | `"studies"` → `"study"`                | `"studies"` → `"studi"`             |
| Use case          | NLP tasks needing grammatical accuracy | IR, search indexing, high-speed NLP |
