## Wordnet Lemmatizer
Lemmatization is similar to stemming, but its output, called a 'lemma', is a valid dictionary word rather than a root stem, which is what stemming produces. The lemma is the base form of the input word and carries the same meaning.

NLTK provides the WordNetLemmatizer class, a thin wrapper around the WordNet corpus. It uses the morphy() function of the WordNet CorpusReader class to find a lemma. Let us understand it with an example:


In [1]:
import nltk
nltk.download('wordnet')
nltk.download('omw-1.4')   # optional but useful (multilingual WordNet)

from nltk.stem import WordNetLemmatizer
lemmatizer = WordNetLemmatizer()

[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\bhara\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package omw-1.4 to
[nltk_data]     C:\Users\bhara\AppData\Roaming\nltk_data...
[nltk_data]   Package omw-1.4 is already up-to-date!


In [2]:
'''
POS tags for lemmatize():
n - noun (default)
v - verb
a - adjective
r - adverb
'''

'\nPOS tags for lemmatize():\nn - noun (default)\nv - verb\na - adjective\nr - adverb\n'
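The one-letter codes above are what lemmatize() expects, but part-of-speech taggers commonly emit Penn Treebank tags such as NNS, VBD, JJR or RB. The sketch below shows one way to translate between the two. The helper name `penn_to_wordnet` and the hand-written `tagged` pairs are assumptions made for illustration; they are not part of NLTK or of this notebook.

```python
def penn_to_wordnet(tag, default="n"):
    """Map a Penn Treebank tag (e.g. 'VBD') to a WordNet POS letter.

    Hypothetical helper: falls back to `default` (noun) for any tag
    that is not a noun, verb, adjective or adverb tag.
    """
    if tag.startswith("JJ"):   # JJ, JJR, JJS
        return "a"
    if tag.startswith("VB"):   # VB, VBD, VBG, VBN, VBP, VBZ
        return "v"
    if tag.startswith("RB"):   # RB, RBR, RBS
        return "r"
    if tag.startswith("NN"):   # NN, NNS, NNP, NNPS
        return "n"
    return default

# Hand-written (word, Penn tag) pairs standing in for tagger output.
tagged = [("mice", "NNS"), ("ran", "VBD"), ("better", "JJR"), ("slowly", "RB")]

wordnet_tags = [(word, penn_to_wordnet(tag)) for word, tag in tagged]
print(wordnet_tags)
```

Each resulting pair can then be passed on as `lemmatizer.lemmatize(word, pos=pos)`.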

In [3]:
lemmatizer.lemmatize("consciousness",pos='a')

'consciousness'

In [4]:
verbs = [
    "running",
    "ran",
    "eating",
    "ate",
    "swimming",
    "driving",
    "writing",
    "spoken",
    "walking",
    "studied"
]


In [5]:
for verb in verbs:
    print(verb+"--->"+lemmatizer.lemmatize(verb,pos='v'))

running--->run
ran--->run
eating--->eat
ate--->eat
swimming--->swim
driving--->drive
writing--->write
spoken--->speak
walking--->walk
studied--->study


In [6]:
nouns = [
    "dogs",
    "children",
    "cities",
    "buses",
    "boxes",
    "mice",
    "geese",
    "leaves",
    "wolves",
    "men"
]


In [7]:
for noun in nouns:
    print(noun+"--->"+lemmatizer.lemmatize(noun,pos='n'))

dogs--->dog
children--->child
cities--->city
buses--->bus
boxes--->box
mice--->mouse
geese--->goose
leaves--->leaf
wolves--->wolf
men--->men


In [8]:
adjective = [
    "better",
    "best",
    "worse",
    "happier",
    "happiest",
    "bigger",
    "biggest",
    "faster",
    "slowly"
]


In [9]:
for adj in adjective:
    print(adj+"--->"+lemmatizer.lemmatize(adj,pos='a'))

better--->good
best--->best
worse--->bad
happier--->happy
happiest--->happy
bigger--->big
biggest--->big
faster--->fast
slowly--->slowly


In [10]:
adverbs = [
    "quickly",
    "slowly",
    "happily",
    "sadly",
    "carefully",
    "beautifully",
    "loudly",
    "silently",
    "easily",
    "rarely",
    "frequently",
    "usually",
    "suddenly",
    "nearly",
    "barely"
]


In [11]:
for adr in adverbs:
    # Adverbs use the POS tag 'r', not 'v'
    print(adr+"--->"+lemmatizer.lemmatize(adr,pos='r'))

quickly--->quickly
slowly--->slowly
happily--->happily
sadly--->sadly
carefully--->carefully
beautifully--->beautifully
loudly--->loudly
silently--->silently
easily--->easily
rarely--->rarely
frequently--->frequently
usually--->usually
suddenly--->suddenly
nearly--->nearly
barely--->barely
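The runs above suggest that a word comes back unchanged when WordNet has no reduction for it under the given POS. When the POS is unknown, one rough fallback is to try each tag in turn and keep the first result that differs from the input. This is a sketch only: `lemmatize_any_pos` is a hypothetical helper, and `toy_lemmatize` with its small lookup table is a stand-in so the block runs without NLTK data.

```python
def lemmatize_any_pos(word, lemmatize, pos_order=("n", "v", "a", "r")):
    """Try each POS tag in order; return (lemma, pos) for the first change.

    `lemmatize` is any callable taking (word, pos). Returns (word, None)
    when no POS tag changes the word.
    """
    for pos in pos_order:
        lemma = lemmatize(word, pos)
        if lemma != word:
            return lemma, pos
    return word, None

# Toy stand-in for a real lemmatizer (assumption, for illustration only).
_TOY_LEMMAS = {("ran", "v"): "run", ("mice", "n"): "mouse", ("better", "a"): "good"}

def toy_lemmatize(word, pos):
    return _TOY_LEMMAS.get((word, pos), word)

for w in ["ran", "mice", "better", "quickly"]:
    print(w, "--->", lemmatize_any_pos(w, toy_lemmatize))
```

With NLTK loaded, the bound method `lemmatizer.lemmatize` can be passed in place of `toy_lemmatize`. The first match is not always the intended one: "leaves" reduces to "leaf" as a noun before the verb reading "leave" is ever tried, so a tagger-supplied POS remains the more reliable option.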


In [12]:
from nltk.stem import PorterStemmer, SnowballStemmer
stemmer = PorterStemmer()
snow = SnowballStemmer('english')

In [13]:
for verb in verbs:
    print(verb+"--->"+lemmatizer.lemmatize(verb,pos='v'))
    print(verb+"--->"+stemmer.stem(verb))
    print(verb+"--->"+snow.stem(verb))
    print("===========")

running--->run
running--->run
running--->run
===========
ran--->run
ran--->ran
ran--->ran
===========
eating--->eat
eating--->eat
eating--->eat
===========
ate--->eat
ate--->ate
ate--->ate
===========
swimming--->swim
swimming--->swim
swimming--->swim
===========
driving--->drive
driving--->drive
driving--->drive
===========
writing--->write
writing--->write
writing--->write
===========
spoken--->speak
spoken--->spoken
spoken--->spoken
===========
walking--->walk
walking--->walk
walking--->walk
===========
studied--->study
studied--->studi
studied--->studi
===========


In [14]:
for noun in nouns:
    print(noun+"--->"+lemmatizer.lemmatize(noun,pos='n'))
    print(noun+"--->"+stemmer.stem(noun))
    print(noun+"--->"+snow.stem(noun))
    print("=====")
    

dogs--->dog
dogs--->dog
dogs--->dog
=====
children--->child
children--->children
children--->children
=====
cities--->city
cities--->citi
cities--->citi
=====
buses--->bus
buses--->buse
buses--->buse
=====
boxes--->box
boxes--->box
boxes--->box
=====
mice--->mouse
mice--->mice
mice--->mice
=====
geese--->goose
geese--->gees
geese--->gees
=====
leaves--->leaf
leaves--->leav
leaves--->leav
=====
wolves--->wolf
wolves--->wolv
wolves--->wolv
=====
men--->men
men--->men
men--->men
=====


In [15]:
lemmatizer.lemmatize('laziness',pos='n')

'laziness'