# Lemmatization
* In contrast to stemming, lemmatization goes beyond simple word reduction: it considers a language's full vocabulary and applies morphological analysis to each word.

* For example, the lemma of "was" is "be" and the lemma of "mice" is "mouse". The lemma of "meeting" may be "meet" or "meeting", depending on how the word is used in the sentence.

* Lemmatization is typically seen as much more informative than simple stemming, which is why spaCy offers only lemmatization and does not include a stemmer.
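
To make the contrast concrete, here is a minimal pure-Python sketch. It is not how spaCy works internally: `naive_stem`, `toy_lemmatize` and the `TOY_LEMMAS` table are hypothetical names invented for this illustration, and the table holds only the few words mentioned above. The point is that a stemmer strips suffixes blindly, while a lemmatizer looks a word up together with its part of speech.

```python
# Toy illustration only: a real lemmatizer relies on a full vocabulary
# and morphological rules, not on a four-entry table.

def naive_stem(word):
    """Strip a few common suffixes without consulting any vocabulary."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

# Hypothetical miniature lookup table keyed on (word, part of speech).
TOY_LEMMAS = {
    ("was", "VERB"): "be",
    ("mice", "NOUN"): "mouse",
    ("meeting", "VERB"): "meet",
    ("meeting", "NOUN"): "meeting",
}

def toy_lemmatize(word, pos):
    """Look the word up by part of speech; fall back to the lowercased word."""
    return TOY_LEMMAS.get((word.lower(), pos), word.lower())

samples = [("was", "VERB"), ("mice", "NOUN"), ("meeting", "VERB"), ("meeting", "NOUN")]
for word, pos in samples:
    print(f"{word:10} {pos:6} stem={naive_stem(word):10} lemma={toy_lemmatize(word, pos)}")
```

The naive stemmer leaves "was" and "mice" unchanged and reduces both uses of "meeting" to "meet", whereas the lookup keeps the noun "meeting" intact and maps the irregular forms to "be" and "mouse".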

In [1]:
# Import spacy
import spacy
import en_core_web_sm

In [3]:
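# Load the small English model
nlp = en_core_web_sm.load()
doc1 = nlp(u"I am a runner who runs daily in a race because I love running, today i ran almost 10 miles")

# token.lemma is the hash ID of the lemma; token.lemma_ is its string form
for token in doc1:
    print(token.text, "\t", token.pos_, "\t", token.lemma, "\t", token.lemma_)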

I 	 PRON 	 561228191312463089 	 -PRON-
am 	 VERB 	 10382539506755952630 	 be
a 	 DET 	 11901859001352538922 	 a
runner 	 NOUN 	 12640964157389618806 	 runner
who 	 NOUN 	 3876862883474502309 	 who
runs 	 VERB 	 12767647472892411841 	 run
daily 	 ADV 	 7369875328865446693 	 daily
in 	 ADP 	 3002984154512732771 	 in
a 	 DET 	 11901859001352538922 	 a
race 	 NOUN 	 8048469955494714898 	 race
because 	 ADP 	 16950148841647037698 	 because
I 	 PRON 	 561228191312463089 	 -PRON-
love 	 VERB 	 3702023516439754181 	 love
running 	 VERB 	 12767647472892411841 	 run
, 	 PUNCT 	 2593208677638477497 	 ,
today 	 NOUN 	 11042482332948150395 	 today
i 	 PRON 	 5097672513440128799 	 i
ran 	 VERB 	 12767647472892411841 	 run
almost 	 ADV 	 9970931496028849525 	 almost
10 	 NUM 	 6572986864102252890 	 10
miles 	 NOUN 	 15996833532744392865 	 mile


### Function to display lemmas
Since the display above is staggered and hard to read, let's write a function that displays the same information in neatly aligned columns.

In [6]:
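def show_lemma(text):
    """Print each token's text, POS tag, lemma hash and lemma in aligned columns."""
    # `text` is a spaCy Doc; the hash is an integer, so '<' left-aligns it
    for token in text:
        print(f"{token.text:{13}}  {token.pos_:{8}}  {token.lemma:<{24}} {token.lemma_}")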
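
The format specifiers in `show_lemma` can be checked in isolation, without spaCy. The sketch below uses a hypothetical helper, `format_row`, that takes plain values instead of a token. It illustrates why only the hash column carries an explicit `<`: strings are left-aligned by default, while integers are right-aligned.

```python
def format_row(text, pos, lemma_hash, lemma):
    """Build one aligned row; the widths mirror those used in show_lemma."""
    return f"{text:{13}}  {pos:{8}}  {lemma_hash:<{24}} {lemma}"

# Strings are left-aligned by default, so an explicit '<' changes nothing.
assert f"{'be':{13}}" == f"{'be':<13}"

# Integers are right-aligned by default, so the hash column needs '<'.
assert f"{42:{5}}" == "   42"
assert f"{42:<{5}}" == "42   "

# Values taken from the output shown earlier for the token "am".
print(format_row("am", "VERB", 10382539506755952630, "be"))
```

Writing the width as `{13}` inside the specifier is equivalent to writing `13` directly; the nested braces only become necessary when the width comes from a variable.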

In [7]:
doc2=nlp(u"Apple is the biggest company in the world")
show_lemma(doc2)

Apple          PROPN     8566208034543834098      apple
is             VERB      10382539506755952630     be
the            DET       7425985699627899538      the
biggest        ADJ       15511632813958231649     big
company        NOUN      6905553075311563409      company
in             ADP       3002984154512732771      in
the            DET       7425985699627899538      the
world          NOUN      1703489418272052182      world


In [8]:
doc3=nlp(u"I have ate eleven apple yesterday")
show_lemma(doc3)

I              PRON      561228191312463089       -PRON-
have           VERB      14692702688101715474     have
ate            VERB      9837207709914848172      eat
eleven         NUM       2577106820672012207      eleven
apple          NOUN      8566208034543834098      apple
yesterday      NOUN      1756787072497230782      yesterday


In [9]:
doc4=nlp(u"We will be meeting tommorow be ready for the meeting")
show_lemma(doc4)

We             PRON      561228191312463089       -PRON-
will           VERB      18307573501153647118     will
be             VERB      10382539506755952630     be
meeting        VERB      6880656908171229526      meet
tommorow       NOUN      14881451523362505806     tommorow
be             VERB      10382539506755952630     be
ready          ADJ       16376148581985464650     ready
for            ADP       16037325823156266367     for
the            DET       7425985699627899538      the
meeting        NOUN      14798207169164081740     meeting
