
**Stemming** and **Lemmatization** are both techniques used to reduce words to their root form, but they differ in their approach:

**Stemming:** Strips suffixes (and sometimes prefixes) using fixed heuristic rules, without looking at context. The resulting stem is often not a valid word.

**Lemmatization:** Uses context and part of speech to find the true lemma (dictionary form), so irregular forms such as "ran" and "mice" map to "run" and "mouse".
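
To make the contrast concrete, here is a minimal pure-Python sketch. Both `naive_stem` and `toy_lemma` are hypothetical helpers written for illustration only: the suffix list and the tiny lookup table are assumptions, not how spaCy or any real stemmer works internally.

```python
def naive_stem(word):
    """Toy rule-based stemmer: strip the first matching suffix (illustrative only)."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters would remain
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Hypothetical lookup table keyed by (word, part of speech).
# A real lemmatizer uses a far larger vocabulary plus morphological rules.
TOY_LEMMAS = {
    ("ran", "VERB"): "run",
    ("mice", "NOUN"): "mouse",
    ("meeting", "VERB"): "meet",
    ("meeting", "NOUN"): "meeting",
}


def toy_lemma(word, pos):
    """Return the dictionary form for (word, pos), or the lowercased word if unknown."""
    return TOY_LEMMAS.get((word.lower(), pos), word.lower())


for word, pos in [("running", "VERB"), ("ran", "VERB"), ("mice", "NOUN"),
                  ("meeting", "VERB"), ("meeting", "NOUN")]:
    print(f"{word:10} {pos:5} stem={naive_stem(word):8} lemma={toy_lemma(word, pos)}")
```

In this sketch the stemmer turns "running" into "runn" and leaves "ran" and "mice" unchanged, because no suffix rule applies to them. The part-of-speech-aware lookup resolves "ran" to "run" and "mice" to "mouse", and it keeps the noun "meeting" distinct from the verb "meeting".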

In [1]:
import spacy
nlp = spacy.load("en_core_web_sm")

In [2]:
doc1 = nlp("I am a runner running in a race because I Love to run since  I ran today")

for token in doc1:
    print(token.text,'\t', '\t', token.pos_, '\t','\t', '\t' , token.lemma_)

I 	 	 PRON 	 	 	 I
am 	 	 AUX 	 	 	 be
a 	 	 DET 	 	 	 a
runner 	 	 NOUN 	 	 	 runner
running 	 	 VERB 	 	 	 run
in 	 	 ADP 	 	 	 in
a 	 	 DET 	 	 	 a
race 	 	 NOUN 	 	 	 race
because 	 	 SCONJ 	 	 	 because
I 	 	 PRON 	 	 	 I
Love 	 	 VERB 	 	 	 love
to 	 	 PART 	 	 	 to
run 	 	 VERB 	 	 	 run
since 	 	 SCONJ 	 	 	 since
  	 	 SPACE 	 	 	  
I 	 	 PRON 	 	 	 I
ran 	 	 VERB 	 	 	 run
today 	 	 NOUN 	 	 	 today


In [3]:
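def show_lemmas(doc):
    """Print each token's text, coarse POS tag, lemma hash, and lemma string."""
    for token in doc:
        # token.lemma is the integer hash ID; token.lemma_ is the readable string
        print(f'{token.text:{12}} {token.pos_:{6}} {token.lemma:<{22}} {token.lemma_}')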

In [4]:
doc2 = nlp("I saw eighteen mice today! ")
show_lemmas(doc2)

I            PRON   4690420944186131903    I
saw          VERB   11925638236994514241   see
eighteen     NUM    9609336664675087640    eighteen
mice         NOUN   1384165645700560590    mouse
today        NOUN   11042482332948150395   today
!            PUNCT  17494803046312582752   !


In [5]:
doc3 = nlp("I am meeting him tomorrow at the meeting ")
show_lemmas(doc3)

I            PRON   4690420944186131903    I
am           AUX    10382539506755952630   be
meeting      VERB   6880656908171229526    meet
him          PRON   1655312771067108281    he
tomorrow     NOUN   3573583789758258062    tomorrow
at           ADP    11667289587015813222   at
the          DET    7425985699627899538    the
meeting      NOUN   14798207169164081740   meeting


In [6]:
doc4 = nlp("That's an enormous automobile")
show_lemmas(doc4)

That         PRON   4380130941430378203    that
's           AUX    10382539506755952630   be
an           DET    15099054000809333061   an
enormous     ADJ    17917224542039855524   enormous
automobile   NOUN   7211811266693931283    automobile


In [7]:
doc5 = nlp("Once upon a time, there was a little fox named Pip. Pip loved to play hide-and-seek in the woods. One day, he hid behind a very big tree. He waited and waited, but no one found him. Pip giggled to himself and thought he was the best hider ever!")
show_lemmas(doc5)

Once         ADV    18381768081115421630   once
upon         SCONJ  12776617025319584140   upon
a            DET    11901859001352538922   a
time         NOUN   8885804376230376864    time
,            PUNCT  2593208677638477497    ,
there        PRON   2112642640949226496    there
was          VERB   10382539506755952630   be
a            DET    11901859001352538922   a
little       ADJ    9778055143417507723    little
fox          NOUN   4333436952782779665    fox
named        VERB   18309932012808971453   name
Pip          PROPN  4906171895756551431    Pip
.            PUNCT  12646065887601541794   .
Pip          PROPN  4906171895756551431    Pip
loved        VERB   3702023516439754181    love
to           PART   3791531372978436496    to
play         VERB   8228585124152053988    play
hide         VERB   12499326223551782790   hide
-            PUNCT  9153284864653046197    -
and          CCONJ  2283656566040971221    and
-            PUNCT  9153284864653046197    -
seek         VE