**Stemming is the process of removing a suffix from a word to extract its base form. The result is not always a valid word.  
e.g. ability -> abil**
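
To make the idea of suffix removal concrete, here is a minimal sketch of a rule-based stripper. It is an illustration only and not the Porter algorithm; the suffix list, the `toy_stem` name, and the `min_stem` parameter are all assumptions made for this example.

```python
# Toy suffix stripper: an illustration only, NOT the Porter algorithm.
# SUFFIXES is a hypothetical, hand-picked list; real stemmers use many more rules.
SUFFIXES = ("ing", "ity", "es", "s")


def toy_stem(word, min_stem=3):
    """Strip the first matching suffix, keeping at least `min_stem` characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    # No suffix matched (or the remainder would be too short): leave the word alone
    return word


for word in ["eating", "eats", "ability", "ate"]:
    print(word, "---->", toy_stem(word))
```

Blind stripping like this turns `running` into `runn`. Handling cases such as doubled consonants is what the extra rules in a real stemmer are for.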

**Lemmatization is the process of deriving the base form of a word (its lemma) by using linguistic knowledge of the language.  
e.g. ability -> ability**
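
As a rough sketch of what "using linguistic knowledge" can mean, the snippet below uses a small hand-written lookup table. The table entries and the `toy_lemmatize` name are assumptions made for illustration. Real lemmatizers rely on far larger resources and, in spaCy's case, on part-of-speech information as well.

```python
# Toy lemmatizer: a lookup table standing in for "linguistic knowledge".
# The entries are hand-written for illustration and are not taken from any library.
LEMMA_TABLE = {
    "ate": "eat",
    "eats": "eat",
    "eating": "eat",
    "children": "child",
    "better": "well",
}


def toy_lemmatize(word):
    """Return the table entry for `word`, or the word unchanged if it is unknown."""
    return LEMMA_TABLE.get(word.lower(), word)


for word in ["ate", "children", "ability"]:
    print(word, "---->", toy_lemmatize(word))
```

Unlike suffix stripping, a lookup can handle irregular forms such as `ate` and `children`, and it never produces a non-word, because unknown words are returned unchanged.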

In [1]:
import nltk
import spacy

In [3]:
from nltk.stem import PorterStemmer

In [4]:
stemmer = PorterStemmer()

In [12]:
words = ['eating', 'eats', 'eat', 'ate', 'adjustable', 'running', 'rafting', 'ability', 'meeting', 'better']

for word in words:
    print(word, '------->', stemmer.stem(word))

eating -------> eat
eats -------> eat
eat -------> eat
ate -------> ate
adjustable -------> adjust
running -------> run
rafting -------> raft
ability -------> abil
meeting -------> meet
better -------> better


In [13]:
# Load the small English pipeline (install once with: python -m spacy download en_core_web_sm)
nlp = spacy.load('en_core_web_sm')

In [14]:
doc = nlp('eating eats eat ate adjustable running rafting ability meeting better')

for token in doc:
    print(token, '---->', token.lemma_)

eating ----> eat
eats ----> eat
eat ----> eat
ate ----> eat
adjustable ----> adjustable
running ----> running
rafting ----> raft
ability ----> ability
meeting ----> meeting
better ----> well


In [16]:
doc = nlp("Mandalorian talked for 3 hours although talking isn't his thing. He became talkative.")

for token in doc:
    print(token, '----->', token.lemma_)

Mandalorian -----> Mandalorian
talked -----> talk
for -----> for
3 -----> 3
hours -----> hour
although -----> although
talking -----> talk
is -----> be
n't -----> not
his -----> his
thing -----> thing
. -----> .
He -----> he
became -----> become
talkative -----> talkative
. -----> .


In [17]:
nlp.pipe_names

['tok2vec', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner']

In [20]:
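# The attribute_ruler component can override token attributes, including lemmas
attr = nlp.get_pipe('attribute_ruler')

# Map each slang term (exact, case-sensitive text match) to the lemma 'brother'
broCode = ['Bro', 'brah', 'brudda', 'mandem', 'homie', 'homeboy']
attr.add([[{'TEXT': word}] for word in broCode],
         {'LEMMA': 'brother'})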

doc = nlp("Bro brah brudda mandem homie homeboy")

for token in doc:
    print(token, '--->', token.lemma_)

Bro ---> brother
brah ---> brother
brudda ---> brother
mandem ---> brother
homie ---> brother
homeboy ---> brother


### Exercise

In [21]:
wordList = ['running', 'painting', 'walking', 'dressing', 'likely', 'children', 'whom', 'good', 'ate', 'fishing']

In [22]:
stemmer = PorterStemmer()
for word in wordList:
    print(word, '---->', stemmer.stem(word))

running ----> run
painting ----> paint
walking ----> walk
dressing ----> dress
likely ----> like
children ----> children
whom ----> whom
good ----> good
ate ----> ate
fishing ----> fish


In [29]:
doc = nlp(' '.join(wordList))
for token in doc:
    print(token, '---->', token.lemma_)

running ----> run
painting ----> paint
walking ----> walk
dressing ----> dress
likely ----> likely
children ----> child
whom ----> whom
good ----> good
ate ----> eat
fishing ----> fishing


In [44]:
text = """Latha is very multi talented girl She is good at many skills like dancing running singing playing
She also likes eating Pav Bhaji She has a habit of fishing and swimming too Besides all this she is wonderful at cooking too"""

In [45]:
for word in text.split():
    print(word, '---->', stemmer.stem(word))

Latha ----> latha
is ----> is
very ----> veri
multi ----> multi
talented ----> talent
girl ----> girl
She ----> she
is ----> is
good ----> good
at ----> at
many ----> mani
skills ----> skill
like ----> like
dancing ----> danc
running ----> run
singing ----> sing
playing ----> play
She ----> she
also ----> also
likes ----> like
eating ----> eat
Pav ----> pav
Bhaji ----> bhaji
She ----> she
has ----> ha
a ----> a
habit ----> habit
of ----> of
fishing ----> fish
and ----> and
swimming ----> swim
too ----> too
Besides ----> besid
all ----> all
this ----> thi
she ----> she
is ----> is
wonderful ----> wonder
at ----> at
cooking ----> cook
too ----> too


In [46]:
doc = nlp(text)
for token in doc:
    print(token, '---->', token.lemma_)

Latha ----> Latha
is ----> be
very ----> very
multi ----> multi
talented ----> talented
girl ----> girl
She ----> she
is ----> be
good ----> good
at ----> at
many ----> many
skills ----> skill
like ----> like
dancing ----> dance
running ----> run
singing ----> singing
playing ----> playing

 ----> 

She ----> she
also ----> also
likes ----> like
eating ----> eat
Pav ----> Pav
Bhaji ----> Bhaji
She ----> she
has ----> have
a ----> a
habit ----> habit
of ----> of
fishing ----> fishing
and ----> and
swimming ----> swim
too ----> too
Besides ----> besides
all ----> all
this ----> this
she ----> she
is ----> be
wonderful ----> wonderful
at ----> at
cooking ----> cook
too ----> too
