**Lemmatization** is the process of reducing a word to its base (dictionary) form, known as its lemma.

The difference between **stemming** and **lemmatization** is that **lemmatization** takes the word's context (such as its part of speech) into account and returns a meaningful base form, whereas **stemming** simply strips the last few characters, which often produces non-words or changes the meaning.
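To make the contrast concrete, here is a minimal sketch that puts suffix stripping next to a dictionary lookup. Both `naive_stem` and the small `LEMMAS` table are hypothetical illustrations written for this comparison, not NLTK functions; real stemmers such as Porter's apply many more rules, and a real lemmatizer consults a full lexicon.

```python
# Hypothetical toy example: nothing below is part of NLTK.
SUFFIXES = ("ies", "ing", "ed", "es", "ly", "s")

# A hand-written stand-in for the dictionary a real lemmatizer consults.
LEMMAS = {"studies": "study", "flies": "fly", "dogs": "dog"}


def naive_stem(word):
    """Strip the first matching suffix, with no check that the result is a word."""
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            return word[: -len(suffix)]
    return word


def toy_lemmatize(word):
    """Look the word up in the table; return it unchanged if it is not listed."""
    return LEMMAS.get(word, word)


# Print each word with its stripped form and its looked-up form.
for word in ["studies", "flies", "dogs"]:
    print(word, naive_stem(word), toy_lemmatize(word))
```

Stripping alone turns `studies` into `stud`, while the lookup returns the real word `study`.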

In [22]:
from nltk.stem import WordNetLemmatizer

#### Install wordnet


In [23]:
import nltk
nltk.download('wordnet')

[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\krdhe\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


True

In [24]:
words = ["running", "jumping", "flies", "happily", "agreed",
    "studies", "boxes", "talking", "argued", "dancing",
    "crying", "flying", "happiness", "played", "singer",
    "fishing", "dogs", "loving", "terribly", "working",
    "watching", "coded", "married", "swimming", "typing"
]

In [25]:
lemmatizer = WordNetLemmatizer()

In [26]:
for word in words:
    print(word + "------------------>" + lemmatizer.lemmatize(word))

running------------------>running
jumping------------------>jumping
flies------------------>fly
happily------------------>happily
agreed------------------>agreed
studies------------------>study
boxes------------------>box
talking------------------>talking
argued------------------>argued
dancing------------------>dancing
crying------------------>cry
flying------------------>flying
happiness------------------>happiness
played------------------>played
singer------------------>singer
fishing------------------>fishing
dogs------------------>dog
loving------------------>loving
terribly------------------>terribly
working------------------>working
watching------------------>watching
coded------------------>coded
married------------------>married
swimming------------------>swimming
typing------------------>typing


#### Understanding POS Tags in Lemmatization

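**Lemmatization** depends heavily on the part of speech (POS) assigned to a word. When no tag is given, `lemmatize` treats the word as a noun (`pos='n'`), which is why verb forms such as `running` and `agreed` came back unchanged in the output above. Providing the correct POS tag gives more accurate results.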

Here's a quick reference for common POS tags used in lemmatization:

- `'n'` → Noun  
- `'v'` → Verb  
- `'a'` → Adjective  
- `'r'` → Adverb

The tag tells the lemmatizer which part of speech to look the word up under: as a verb, `running` reduces to `run`, while as a noun it stays `running`.

In [27]:
# pos='v' makes the lemmatizer treat every word in the list as a verb
for word in words:
    print(word + "------------------>" + lemmatizer.lemmatize(word, pos='v'))

running------------------>run
jumping------------------>jump
flies------------------>fly
happily------------------>happily
agreed------------------>agree
studies------------------>study
boxes------------------>box
talking------------------>talk
argued------------------>argue
dancing------------------>dance
crying------------------>cry
flying------------------>fly
happiness------------------>happiness
played------------------>play
singer------------------>singer
fishing------------------>fish
dogs------------------>dog
loving------------------>love
terribly------------------>terribly
working------------------>work
watching------------------>watch
coded------------------>cod
married------------------>marry
swimming------------------>swim
typing------------------>type
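
Applying `pos='v'` to the whole list treats every word as a verb, including the nouns and adverbs. One way to choose a tag per word is to translate Penn Treebank tags (the kind a tagger such as `nltk.pos_tag` produces) into the four letters listed above. The sketch below assumes that approach; `penn_to_wordnet` and `lemmatize_tagged` are hypothetical helper names, and falling back to noun for any other tag is an assumption that mirrors the lemmatizer's default.

```python
def penn_to_wordnet(tag):
    """Map a Penn Treebank tag (e.g. 'VBG') to a WordNet POS letter."""
    if tag.startswith("J"):
        return "a"  # adjective
    if tag.startswith("V"):
        return "v"  # verb
    if tag.startswith("R"):
        return "r"  # adverb
    return "n"      # nouns and everything else: assumed noun fallback


def lemmatize_tagged(tagged_words, lemmatizer):
    """Lemmatize (word, Penn tag) pairs with any object offering lemmatize(word, pos=...)."""
    return [
        lemmatizer.lemmatize(word, pos=penn_to_wordnet(tag))
        for word, tag in tagged_words
    ]


# Show the mapping for one tag of each kind.
for tag in ["NNS", "VBG", "JJ", "RB"]:
    print(tag, "->", penn_to_wordnet(tag))
```

With the `lemmatizer` created earlier, `lemmatize_tagged(nltk.pos_tag(words), lemmatizer)` would apply one tag per word. Note that `nltk.pos_tag` needs a tagger model downloaded first, and the resource name depends on the NLTK version.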
