---

## **Notes on the Code**

### **Concept of Lemmatization**
Lemmatization reduces a word to its **base or dictionary form** (its "lemma"), taking the word's **part of speech (POS)** into account. Unlike stemming, which strips suffixes by rule and can produce strings that are not real words, lemmatization returns base forms that exist in the dictionary.

---

### **Step-by-Step Explanation**

#### **1. Importing Required Libraries**
```python
from nltk.stem import WordNetLemmatizer
import nltk
nltk.download('wordnet')
```
- **`WordNetLemmatizer`**: A lemmatizer provided by NLTK that uses the WordNet lexical database to find the base form of words.
- **`nltk.download('wordnet')`**: Downloads the WordNet corpus, which is necessary for the lemmatizer to function.

---

#### **2. Lemmatization with POS**
```python
lemmatizer = WordNetLemmatizer()
# POS codes accepted by lemmatize():
#   'n' = noun, 'v' = verb, 'a' = adjective, 'r' = adverb
lemmatizer.lemmatize("going", pos='v')
```
- **Part of Speech (POS)**: `lemmatize()` looks the word up under the POS passed in `pos`, so the same string can yield different lemmas under different tags. The POS codes used in these notes are:
  - **'n'**: Noun
  - **'v'**: Verb
  - **'a'**: Adjective
  - **'r'**: Adverb
- Example:
  - `"going"` with `pos='v'` → "go" (base verb form).
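In practice the POS tag often comes from a tagger rather than being typed by hand. The sketch below assumes tags in the Penn Treebank style (such as `VBG` or `NNS`, the style produced by `nltk.pos_tag`) and maps them to the four single-letter codes listed above. The helper name `penn_to_wordnet` is hypothetical, not part of NLTK, and falling back to noun simply mirrors the lemmatizer's own default.

```python
def penn_to_wordnet(tag: str) -> str:
    """Map a Penn Treebank tag to a WordNet POS code ('n', 'v', 'a', 'r').

    Hypothetical helper for illustration; unknown tags fall back to noun.
    """
    if tag.startswith("VB"):   # VB, VBD, VBG, VBN, VBP, VBZ
        return "v"
    if tag.startswith("JJ"):   # JJ, JJR, JJS
        return "a"
    if tag.startswith("RB"):   # RB, RBR, RBS
        return "r"
    return "n"                 # nouns and everything else


for tag in ["VBG", "NNS", "JJR", "RB"]:
    print(tag, "->", penn_to_wordnet(tag))
```

The returned code can be passed straight through as the `pos` argument, for example `lemmatizer.lemmatize(word, pos=penn_to_wordnet(tag))`.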

---

#### **3. Lemmatizing a List of Words**
```python
words = ["eating", "eats", "eaten", "writing", "writes", "programming", "programs", "history", "finally", "finalized"]
for word in words:
    print(word + "----->" + lemmatizer.lemmatize(word, pos='v'))
```
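- Here, every word in the list is lemmatized with the **verb ('v')** POS tag, whether or not it is actually a verb.
- Example transformations:
  - `"eating"` → "eat"
  - `"writing"` → "write"
  - `"programming"` → "program"
  - `"finalized"` → "finalize"
- `"history"` and `"finally"` come back unchanged: they are not verb forms, so the lemmatizer finds no verb lemma and returns the input as-is.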
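One common reason to lemmatize a list is to collapse inflected forms into a single key. The sketch below groups words by lemma. To stay runnable without the WordNet download, it uses a small lookup table copied from the output recorded in these notes as a stand-in for `lemmatizer.lemmatize(word, pos='v')`. The names `NOTEBOOK_VERB_LEMMAS` and `group_by_lemma` are illustrative, not part of NLTK.

```python
from collections import defaultdict

# Stand-in for lemmatizer.lemmatize(word, pos='v'): values copied from the
# output recorded in these notes, so no WordNet download is needed here.
NOTEBOOK_VERB_LEMMAS = {
    "eating": "eat", "eats": "eat", "eaten": "eat",
    "writing": "write", "writes": "write",
    "programming": "program", "programs": "program",
    "history": "history", "finally": "finally", "finalized": "finalize",
}


def group_by_lemma(words, lemmatize):
    """Return {lemma: [words that reduce to it]}, preserving input order."""
    groups = defaultdict(list)
    for word in words:
        groups[lemmatize(word)].append(word)
    return dict(groups)


# Iterating the dict yields its keys (the original words) in insertion order.
groups = group_by_lemma(NOTEBOOK_VERB_LEMMAS, NOTEBOOK_VERB_LEMMAS.get)
for lemma, members in groups.items():
    print(f"{lemma}: {members}")
```

With NLTK and WordNet available, passing `lambda w: lemmatizer.lemmatize(w, pos='v')` as the `lemmatize` argument should give the same grouping for this list.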

---

#### **4. Lemmatization Without Specifying POS**
```python
lemmatizer.lemmatize("goes")
```
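- If `pos` is omitted, the lemmatizer treats the word as a **noun ('n')**.
- Example:
  - `"goes"` (default POS: noun) → "go". The result matches the verb lemma here, but that is not guaranteed: a verb form such as `"going"` generally needs `pos='v'` to be reduced to "go".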
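When the POS is unknown, one heuristic is to try each POS code in turn and keep the first result that differs from the input. The sketch below shows that logic with the lemmatizer passed in as a function. `first_changed_lemma` is a hypothetical helper, and `stub_lemmatize` stands in for `WordNetLemmatizer.lemmatize` so the example runs without WordNet. The stub's three entries are taken from the outputs in these notes; returning the word unchanged in every other case is an assumption of the stub, not a claim about WordNet.

```python
# Stub standing in for WordNetLemmatizer().lemmatize(word, pos).
# Entries come from the outputs recorded in these notes; anything else is
# returned unchanged (an assumption made only for this illustration).
KNOWN = {("going", "v"): "go", ("goes", "n"): "go", ("eaten", "v"): "eat"}


def stub_lemmatize(word, pos="n"):
    return KNOWN.get((word, pos), word)


def first_changed_lemma(word, lemmatize, pos_order=("n", "v", "a", "r")):
    """Try each POS in order; return (lemma, pos) for the first change.

    Returns (word, None) when no POS changes the word.
    """
    for pos in pos_order:
        lemma = lemmatize(word, pos)
        if lemma != word:
            return lemma, pos
    return word, None


print(first_changed_lemma("going", stub_lemmatize))   # ('go', 'v')
print(first_changed_lemma("goes", stub_lemmatize))    # ('go', 'n')
print(first_changed_lemma("fairly", stub_lemmatize))  # ('fairly', None)
```

With the real lemmatizer, the call would be `first_changed_lemma(word, lemmatizer.lemmatize)`. This is only a fallback: for words that are valid under several POS tags, the order in `pos_order` decides the answer, so a real tagger is usually the better choice.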

---

#### **5. Lemmatizing Adverbs and Other POS**
```python
lemmatizer.lemmatize("fairly", pos='v'), lemmatizer.lemmatize("sportingly", pos='v')
```
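- Both words here are adverbs, but they are passed with `pos='v'`. When no lemma is found under the given POS, `lemmatize()` returns the input unchanged.
- Adverbs like these are typically already in their dictionary form, so little change should be expected even with `pos='r'`.
- Examples:
  - `"fairly"` with `pos='v'` → "fairly" (no change).
  - `"sportingly"` with `pos='v'` → "sportingly" (no change).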

---

### **Comparison: Stemming vs. Lemmatization**
| **Feature**           | **Stemming**                          | **Lemmatization**                                            |
|-----------------------|---------------------------------------|--------------------------------------------------------------|
| **Approach**          | Rule-based suffix truncation.         | Dictionary-based normalization.                              |
| **POS Consideration** | Does not consider POS.                | Uses the POS tag; a wrong tag can leave the word unchanged.  |
| **Output**            | May produce stems that are not words. | Produces dictionary base words (lemmas).                     |
| **Accuracy**          | Less accurate.                        | More accurate when the correct POS is given.                 |
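To make the contrast between truncation and dictionary lookup concrete, here is a deliberately crude sketch. `toy_stem` strips a few suffixes by rule and `toy_lemmatize` looks the word up in a tiny hand-written table. Neither is NLTK's actual algorithm; the suffix list and the table are assumptions chosen purely for illustration.

```python
# Toy suffix list, checked in order (an assumption, not a real stemmer's rules).
SUFFIXES = ("ing", "ed", "es", "ly", "s")


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


# Tiny hand-written dictionary standing in for a real lexical database.
TOY_LEMMAS = {
    "eating": "eat", "eaten": "eat",
    "writing": "write", "finalized": "finalize",
}


def toy_lemmatize(word):
    """Return the dictionary form if known, otherwise the word itself."""
    return TOY_LEMMAS.get(word, word)


for word in ["eating", "eaten", "writing", "finalized", "history"]:
    print(f"{word:10} stem={toy_stem(word):10} lemma={toy_lemmatize(word)}")
```

The rule-based version yields "writ" and "finaliz", which are not words, and misses the irregular form "eaten" entirely, while the lookup returns "write", "finalize", and "eat".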

---

### **Key Points**
1. Lemmatization returns **valid dictionary base forms**, unlike stemming, which may produce truncated strings that are not words.
2. Specifying the correct **POS tag** enhances lemmatization accuracy.
3. Use lemmatization when you need **semantically valid base forms** for tasks like information retrieval, machine translation, and linguistic analysis.

---

This code demonstrates how the **WordNetLemmatizer** can effectively normalize text, considering part-of-speech tags for precise processing.

---

### **Notebook Run and Output**

```python
from nltk.stem import WordNetLemmatizer
import nltk
nltk.download('wordnet')
```
```text
[nltk_data] Downloading package wordnet to /root/nltk_data...
True
```

```python
lemmatizer = WordNetLemmatizer()
# POS codes accepted by lemmatize():
#   'n' = noun, 'v' = verb, 'a' = adjective, 'r' = adverb
lemmatizer.lemmatize("going", pos='v')
```
```text
'go'
```

```python
words = ["eating", "eats", "eaten", "writing", "writes", "programming", "programs", "history", "finally", "finalized"]
```

```python
for word in words:
    print(word + "----->" + lemmatizer.lemmatize(word, pos='v'))
```
```text
eating----->eat
eats----->eat
eaten----->eat
writing----->write
writes----->write
programming----->program
programs----->program
history----->history
finally----->finally
finalized----->finalize
```

```python
lemmatizer.lemmatize("goes")
```
```text
'go'
```

```python
lemmatizer.lemmatize("fairly", pos='v'), lemmatizer.lemmatize("sportingly", pos='v')
```
```text
('fairly', 'sportingly')
```