## Lemmatization
Lemmatization reduces a word to its lemma, the linguistically correct base (dictionary) form. It does this using a vocabulary and morphological analysis of the word.

Lemmatization maps different forms of a word to the same base form:
- Input : (Sings, Sung, Sang)
- Output : Sing
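
To make "vocabulary and morphological analysis" concrete, here is a minimal toy sketch. It is not how NLTK or WordNet is implemented: `VOCABULARY`, `EXCEPTIONS`, `SUFFIX_RULES`, and `toy_lemmatize` are hypothetical names invented for illustration, and the tables are deliberately tiny. The idea is only that irregular forms are looked up, regular forms are reduced by suffix rules, and a candidate is accepted only if it is a known word.

```python
# Toy illustration only: hand-written tables, not WordNet's actual data.
VOCABULARY = {"sing", "rock", "bus", "student", "fly", "educate"}

# Irregular forms cannot be derived by stripping suffixes, so they are looked up.
EXCEPTIONS = {"sang": "sing", "sung": "sing", "had": "have"}

# (suffix, replacement) pairs, tried in order.
SUFFIX_RULES = [
    ("ies", "y"),
    ("es", "e"),
    ("es", ""),
    ("s", ""),
    ("ing", "e"),
    ("ing", ""),
    ("ed", "e"),
    ("ed", ""),
]


def toy_lemmatize(word: str) -> str:
    """Return a base form for `word`, or the lowercased word if none is found."""
    word = word.lower()
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]
    if word in VOCABULARY:
        return word
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            candidate = word[: len(word) - len(suffix)] + replacement
            # Accept a candidate only if it is a known vocabulary word.
            if candidate in VOCABULARY:
                return candidate
    return word


for form in ["Sings", "Sung", "Sang", "buses", "flies", "educated", "tables"]:
    print(form, "->", toy_lemmatize(form))
```

Because "tables" reduces to nothing in the toy vocabulary, it comes back unchanged, which mirrors how a dictionary-backed lemmatizer leaves unknown words alone.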


NLTK's lemmatizer is backed by WordNet: https://wordnet.princeton.edu/

In [1]:
import nltk
nltk.download('wordnet')

[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\User\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


True

In [2]:
from nltk.stem.wordnet import WordNetLemmatizer
lemmatizer=WordNetLemmatizer()

## Strip the plural -s or -es to find the base form

In [3]:
print("rocks :", lemmatizer.lemmatize("rocks"))
print("buses :", lemmatizer.lemmatize("buses"))
print("students :", lemmatizer.lemmatize("students"))
print("computers :", lemmatizer.lemmatize("computers"))
print("goes :", lemmatizer.lemmatize("goes"))
print("languages :", lemmatizer.lemmatize("languages"))

rocks : rock
buses : bus
students : student
computers : computer
goes : go
languages : language


## Lemmatize verbs with the POS argument

In [4]:
word="flying"
print("Lemmatization : "+word+" -> ",lemmatizer.lemmatize(word,'v'))

Lemmatization : flying ->  fly


In [5]:
word="swimming"
print("Lemmatization : "+word+" -> ",lemmatizer.lemmatize(word,'v'))

Lemmatization : swimming ->  swim


In [6]:
word="educated"
print("Lemmatization : "+word+" -> ",lemmatizer.lemmatize(word,'v'))

Lemmatization : educated ->  educate


In [7]:
word="had"
print("Lemmatization : "+word+" -> ",lemmatizer.lemmatize(word,'v'))

Lemmatization : had ->  have


## Lemmatize adjectives with the POS argument

In [8]:
# good -> better -> best
word="better"
print("Lemmatization : "+word+" -> ",lemmatizer.lemmatize(word,'a'))

Lemmatization : better ->  good


In [9]:
word="worst"
print("Lemmatization : "+word+" -> ",lemmatizer.lemmatize(word,'a'))

Lemmatization : worst ->  bad


In [10]:
# tall -> taller -> tallest
word="taller"
print("Lemmatization : "+word+" -> ",lemmatizer.lemmatize(word,'a'))

Lemmatization : taller ->  tall


In [12]:
word="beautiful"
print("Lemmatization : "+word+" -> ",lemmatizer.lemmatize(word,'a'))

Lemmatization : beautiful ->  beautiful


In [13]:
word="beauty"
print("Lemmatization : "+word+" -> ",lemmatizer.lemmatize(word,'a'))

Lemmatization : beauty ->  beauty
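
"beauty" comes back unchanged because it is a noun, and the lemmatizer found no adjective entry for it. When the part of speech is not known in advance, one simple heuristic is to try each POS in turn and keep the first result that differs from the input. The sketch below is an assumption-laden illustration, not an NLTK feature: `lemmatize_any_pos` and `demo_lemmatize` are hypothetical helpers, and the demo table stands in for a real lemmatizer so the block runs on its own.

```python
from typing import Callable, Sequence


def lemmatize_any_pos(
    word: str,
    lemmatize: Callable[[str, str], str],
    pos_order: Sequence[str] = ("n", "v", "a", "r"),
) -> str:
    """Try each POS in `pos_order`; return the first lemma that differs from `word`."""
    for pos in pos_order:
        lemma = lemmatize(word, pos)
        if lemma != word:
            return lemma
    # No POS produced a different form, so return the word as given.
    return word


# Stand-in lemmatizer for demonstration (hypothetical data, not WordNet).
_DEMO_TABLE = {("better", "a"): "good", ("flying", "v"): "fly"}


def demo_lemmatize(word: str, pos: str) -> str:
    return _DEMO_TABLE.get((word, pos), word)


for w in ["better", "flying", "beauty"]:
    print(w, "->", lemmatize_any_pos(w, demo_lemmatize))
```

With NLTK, the same helper could be called as `lemmatize_any_pos(word, lemmatizer.lemmatize)`. The order of `pos_order` matters for ambiguous words, so this is a fallback, not a substitute for proper POS tagging.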


In [14]:
from nltk.stem import WordNetLemmatizer
lemmatizer = WordNetLemmatizer()

# Regular comparatives and superlatives
words = ["happier", "happiest", "stronger", "strongest", "weaker"]
for word in words:
    print(f"Lemmatization: {word} -> {lemmatizer.lemmatize(word, 'a')}")


Lemmatization: happier -> happy
Lemmatization: happiest -> happy
Lemmatization: stronger -> strong
Lemmatization: strongest -> strong
Lemmatization: weaker -> weak


In [15]:
# Irregular comparatives and superlatives
words = ["worse", "worst", "lesser", "least"]
for word in words:
    print(f"Lemmatization: {word} -> {lemmatizer.lemmatize(word, 'a')}")

Lemmatization: worse -> bad
Lemmatization: worst -> bad
Lemmatization: lesser -> less
Lemmatization: least -> least


### Non-Adjectives with Different POS Tags

In [16]:
# Noun examples
words = ["running", "children", "feet"]
for word in words:
    print(f"Lemmatization (noun): {word} -> {lemmatizer.lemmatize(word, 'n')}")

# Verb examples
words = ["running", "swimming", "flies"]
for word in words:
    print(f"Lemmatization (verb): {word} -> {lemmatizer.lemmatize(word, 'v')}")

Lemmatization (noun): running -> running
Lemmatization (noun): children -> child
Lemmatization (noun): feet -> foot
Lemmatization (verb): running -> run
Lemmatization (verb): swimming -> swim
Lemmatization (verb): flies -> fly
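
As the "running" example shows, the result depends on the POS passed in. In practice the POS often comes from a tagger such as `nltk.pos_tag`, which emits Penn Treebank tags (for example `VBG`, `NNS`, `JJR`) that must be converted to the single-letter codes used above. The helper below is a minimal sketch of that conversion; `treebank_to_wordnet_pos` is a hypothetical name, and the mapping is deliberately coarse (it looks only at the first letter of the tag and falls back to noun).

```python
def treebank_to_wordnet_pos(tag: str, default: str = "n") -> str:
    """Map a Penn Treebank tag to a WordNet POS code: 'a', 'v', 'n', or 'r'."""
    prefix_map = {
        "J": "a",  # adjectives: JJ, JJR, JJS
        "V": "v",  # verbs: VB, VBD, VBG, VBN, VBP, VBZ
        "N": "n",  # nouns: NN, NNS, NNP, NNPS
        "R": "r",  # adverbs: RB, RBR, RBS (also catches RP; acceptable for a sketch)
    }
    # tag[:1] is safe for an empty string; unknown tags fall back to `default`.
    return prefix_map.get(tag[:1].upper(), default)


for tag in ["VBG", "NNS", "JJR", "RB", "DT"]:
    print(tag, "->", treebank_to_wordnet_pos(tag))
```

The returned code can then be passed as the second argument, for example `lemmatizer.lemmatize(word, treebank_to_wordnet_pos(tag))`.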
