### What is **Lemmatization**?

**Lemmatization** is the process of reducing a word to its **base or dictionary form**, called a **lemma**. Unlike stemming, lemmatization uses **grammar and vocabulary** to find the proper word form.

---

### In Simple Words:

It turns:
- **“running” → “run”**
- **“better” → “good”**
- **“ate” → “eat”**

### ✅ Advantages:
- Gives **real words** as output.
- More **accurate** than stemming.
- **Uses the part of speech (POS)** when you provide it.

### ❌ Disadvantages:
- **Slower** than stemming.
- Needs **POS tagging** for best results.
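To make the contrast with stemming concrete, here is a minimal toy sketch in plain Python. Everything in it (`TOY_LEMMAS`, `TOY_SUFFIXES`, `toy_stem`, `toy_lemmatize`) is hypothetical and written only for illustration; it is not how NLTK implements either technique. The stemmer blindly strips suffixes, while the lemmatizer looks up a (word, POS) pair in a small hand-written dictionary.

```python
# Toy illustration only: real lemmatizers such as NLTK's WordNetLemmatizer
# consult a full lexical database, not a hand-written table.

# Hypothetical mini-dictionary keyed by (word, part of speech).
TOY_LEMMAS = {
    ("running", "v"): "run",
    ("ate", "v"): "eat",
    ("better", "a"): "good",
    ("cars", "n"): "car",
}

# Suffixes the toy stemmer strips, checked in this order.
TOY_SUFFIXES = ("ing", "ed", "s")


def toy_stem(word):
    """Chop the first matching suffix, with no knowledge of vocabulary."""
    for suffix in TOY_SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def toy_lemmatize(word, pos="n"):
    """Look up the (word, pos) pair; return the word unchanged if unknown."""
    return TOY_LEMMAS.get((word, pos), word)


for word, pos in [("running", "v"), ("ate", "v"), ("better", "a"), ("cars", "n")]:
    print(word, "| stem:", toy_stem(word), "| lemma:", toy_lemmatize(word, pos))
```

The stemmer turns "running" into the non-word "runn" and leaves "ate" and "better" untouched, while the dictionary lookup returns real base forms. The lookup only works when the POS matches an entry, which is why the default noun POS leaves "better" unchanged.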


### What is the **WordNet Lemmatizer**?

The **WordNet Lemmatizer** is a tool in **NLTK** that uses the **WordNet database** (a large lexical database of English) to convert words to their **base/dictionary form** (called the **lemma**).

---

### ✅ In Simple Words:

It changes:
- **"running" → "run"**
- **"better" → "good"** (only if correct POS is given)
- **"cars" → "car"**

It doesn’t just chop endings; it looks the word up in WordNet's vocabulary using the **part of speech** you give it.

### Notes:
- Default POS is **noun** if not provided.
- Much smarter than basic stemmers.
- Requires **POS tagging** for full accuracy.


In [1]:
# Requires the WordNet corpus; download it once with nltk.download('wordnet')

from nltk.stem import WordNetLemmatizer

In [3]:
lemmatizer = WordNetLemmatizer()

In [6]:
lemmatizer.lemmatize('going',pos='n')

'going'

### What are **POS Tags** (Part-of-Speech Tags)? 
**POS tags** tell us **what role a word plays** in a sentence, such as whether it’s a **noun**, **verb**, **adjective**, etc.

---

### ✅ Examples:
- **Noun (n)** → *dog, car, happiness*  
- **Verb (v)** → *run, eat, is*  
- **Adjective (a)** → *beautiful, big*  
- **Adverb (r)** → *quickly, very*  
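- **Pronoun (PRP)** → *he, she, they* (`PRP` is a Penn Treebank tag; WordNet has no pronoun category, and the lemmatizer's `pos` argument accepts only `n`, `v`, `a`, `r`, and `s` for satellite adjectives)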
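POS taggers usually emit Penn Treebank tags such as `VBG` or `JJR`, while the lemmatizer expects the single letters listed above. The sketch below shows one common way to bridge the two. The helper names `penn_to_wordnet_pos`, `lemmatize_tagged`, and `echo_lemmatize` are hypothetical, and the lemmatizer is passed in as a plain callable so the example runs without NLTK installed.

```python
def penn_to_wordnet_pos(tag):
    """Map a Penn Treebank tag (e.g. 'VBG') to a WordNet POS letter."""
    if tag.startswith("JJ"):
        return "a"  # adjectives: JJ, JJR, JJS
    if tag.startswith("VB"):
        return "v"  # verbs: VB, VBD, VBG, VBN, VBP, VBZ
    if tag.startswith("RB"):
        return "r"  # adverbs: RB, RBR, RBS
    return "n"      # nouns and everything else, mirroring the noun default


def lemmatize_tagged(tagged_tokens, lemmatize):
    """Apply lemmatize(word, pos) to each (word, penn_tag) pair."""
    return [lemmatize(word, penn_to_wordnet_pos(tag)) for word, tag in tagged_tokens]


# Stand-in lemmatizer (an assumption for this demo) that only echoes its
# arguments, so the chosen POS letter is visible in the output.
def echo_lemmatize(word, pos="n"):
    return f"{word}/{pos}"


tagged = [("cars", "NNS"), ("going", "VBG"), ("better", "JJR"),
          ("easily", "RB"), ("they", "PRP")]
print(lemmatize_tagged(tagged, echo_lemmatize))
```

With NLTK available, the same function should work with `WordNetLemmatizer().lemmatize` as the callable and `nltk.pos_tag(tokens)` as the tagged input, provided the tagger data has been downloaded. Falling back to `n` for unmapped tags such as `PRP` is a design choice in this sketch, not a rule of the library.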

In [7]:
lemmatizer.lemmatize('going',pos='v')

'go'

In [10]:
words = ["running", "runs", "runner", "easily", "fairly", "played", "playing", "play", "better", "faster", "going", "gone", "ate", "eating","programming", "programs"]

In [12]:
print(words)

['running', 'runs', 'runner', 'easily', 'fairly', 'played', 'playing', 'play', 'better', 'faster', 'going', 'gone', 'ate', 'eating', 'programming', 'programs']


In [23]:
for word in words:
    print(word,'---->',lemmatizer.lemmatize(word,pos='v'))

running ----> run
runs ----> run
runner ----> runner
easily ----> easily
fairly ----> fairly
played ----> play
playing ----> play
play ----> play
better ----> better
faster ----> faster
going ----> go
gone ----> go
ate ----> eat
eating ----> eat
programming ----> program
programs ----> program


In [25]:
print('fairly','---->',lemmatizer.lemmatize('fairly',pos='v'))

fairly ----> fairly


### Best for:
- Q&A systems
- Chatbots
- Text summarisation