# STEMMING

In [1]:
paragraph = """
              When I was 17, I read a quote that went something like: “If you live each day as if it was your last, 
              someday you’ll most certainly be right.” It made an impression on me, and since then, for the past 33 years, 
              I have looked in the mirror every morning and asked myself: “If today were the last day of my life, 
              would I want to do what I am about to do today?” And whenever the answer has been “No” for too many days in a row,
               I know I need to change something.
               No one wants to die. Even people who want to go to heaven don’t want to 
               die to get there. 
               And yet death is the destination we all share. 
               No one has ever escaped it. And that is as it should be, because 
               Death is very likely the single best invention of Life. It is Life’s change agent. 
               It clears out the old to make way for the new. Right now the new is you, but 
               someday not too long from now, 
               you will gradually become the old and be cleared away. Sorry to be so dramatic, 
               but it is quite true.
              Your time is limited, so don’t waste it living someone else’s life. 
              Don’t be trapped by dogma — which is living with the results of other people’s thinking. 
              Don’t let the noise of others’ opinions drown out your own inner voice. 
              And most important, have the courage to follow your heart and intuition. 
              They somehow already know what you truly want to become. Everything else is secondary.
              When I was young, there was an amazing publication called The Whole Earth Catalog,
               which was one of the bibles of my generation. 
               It was created by a fellow named Stewart Brand not far from here in Menlo Park, and 
               he brought it to life with his poetic touch. This was in the late 1960s, before
                personal computers and desktop publishing, so it was all made with typewriters, 
                scissors and Polaroid cameras. 
                It was sort of like Google in paperback form, 35 years before Google came along: 
                It was idealistic, and overflowing with neat tools and great notions.
              """

In [5]:
import nltk 
from nltk.stem import PorterStemmer
from nltk.corpus import stopwords
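# Recent NLTK releases look for the 'punkt_tab' resource instead; download that
# one if sent_tokenize reports a missing resource.
nltk.download('punkt')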


[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


True

In [8]:
nltk.download('stopwords')

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.


True

In [6]:
#tokenize into sentences
sentences = nltk.sent_tokenize(paragraph)
print(len(sentences))

# define the object for stemming
stemmer = PorterStemmer()

20


In [9]:
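# Stemming
stop_words = set(stopwords.words('english'))  # build the set once, not once per word
for i in range(len(sentences)):
    words = nltk.word_tokenize(sentences[i])
    # drop stop words (exact, case-sensitive match), then stem what remains
    words = [stemmer.stem(word) for word in words if word not in stop_words]
    sentences[i] = ' '.join(words)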
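
The filter in the loop above compares each token against NLTK's stop-word list exactly as written. That list is lowercase, so capitalized tokens such as 'No', 'It' and 'I' are not removed and still appear in the stemmed sentences. A minimal sketch of a case-insensitive filter follows. `STOP_WORDS` is a small hypothetical stand-in for `stopwords.words('english')` so the snippet runs without any downloads, and `remove_stop_words` is an illustrative helper, not part of NLTK.

```python
# Hypothetical, tiny stand-in for NLTK's (lowercase) English stop-word list.
STOP_WORDS = {"i", "no", "it", "is", "to", "the", "and"}

def remove_stop_words(tokens, case_sensitive=False):
    """Return the tokens that are not stop words."""
    if case_sensitive:
        # Same comparison as the loop above: 'No' != 'no', so 'No' survives.
        return [t for t in tokens if t not in STOP_WORDS]
    # Compare lowercased copies, but keep the original spelling of survivors.
    return [t for t in tokens if t.lower() not in STOP_WORDS]

tokens = ["No", "one", "wants", "to", "die", "."]
print(remove_stop_words(tokens, case_sensitive=True))
print(remove_stop_words(tokens))
```

Whether to lowercase before filtering is a design choice: it removes more function words, but it also treats sentence-initial words and proper nouns the same way.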

In [10]:
sentences

['when I 17 , I read quot went someth like : “ If live day last , someday ’ certainli right. ” It made impress , sinc , past 33 year , I look mirror everi morn ask : “ If today last day life , would I want I today ? ” and whenev answer “ No ” mani day row , I know I need chang someth .',
 'No one want die .',
 'even peopl want go heaven ’ want die get .',
 'and yet death destin share .',
 'No one ever escap .',
 'and , death like singl best invent life .',
 'It life ’ chang agent .',
 'It clear old make way new .',
 'right new , someday long , gradual becom old clear away .',
 'sorri dramat , quit true .',
 'your time limit , ’ wast live someon els ’ life .',
 'don ’ trap dogma — live result peopl ’ think .',
 'don ’ let nois other ’ opinion drown inner voic .',
 'and import , courag follow heart intuit .',
 'they somehow alreadi know truli want becom .',
 'everyth els secondari .',
 'when I young , amaz public call the whole earth catalog , one bibl gener .',
 'It creat fellow name st
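
Several stems in the output above, such as 'someth' and 'certainli', are not dictionary words: a stemmer rewrites suffixes by rule and does not check the result against a vocabulary. The sketch below applies two simplified rules in the spirit of the Porter algorithm (drop a trailing 'ing', turn a trailing 'y' into 'i', each only when the remaining stem contains a vowel). It is a toy illustration under those assumptions, not the Porter stemmer itself, and `toy_stem` and `has_vowel` are hypothetical names.

```python
VOWELS = set("aeiou")

def has_vowel(stem):
    """True if the stem contains at least one vowel."""
    return any(ch in VOWELS for ch in stem)

def toy_stem(word):
    """Apply two simplified suffix rules; no dictionary lookup is involved."""
    word = word.lower()
    if word.endswith("ing") and has_vowel(word[:-3]):
        word = word[:-3]           # 'something' -> 'someth'
    if word.endswith("y") and has_vowel(word[:-1]):
        word = word[:-1] + "i"     # 'certainly' -> 'certainli'
    return word

for w in ["something", "certainly", "sing"]:
    print(w, "->", toy_stem(w))
```

The vowel check is what keeps a short word like 'sing' intact: stripping 'ing' would leave only 's'.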

# LEMMATIZATION

In [13]:
from nltk.stem import WordNetLemmatizer
nltk.download('wordnet')

[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Unzipping corpora/wordnet.zip.


True

In [16]:
sentences_lem = nltk.sent_tokenize(paragraph)

In [17]:
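#Define the object for lemmatization
lemmatizer = WordNetLemmatizer()
stop_words = set(stopwords.words('english'))  # build the set once, not once per word

# Lemmatization
for i in range(len(sentences_lem)):
    words = nltk.word_tokenize(sentences_lem[i])
    # lemmatize() treats every word as a noun unless a pos argument is given
    words = [lemmatizer.lemmatize(word) for word in words if word not in stop_words]
    sentences_lem[i] = ' '.join(words)  # write back to sentences_lem, not sentences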
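
`lemmatizer.lemmatize(word)` looks the word up as a noun by default and returns it unchanged when no lemma is found, so a verb form like 'looked' passes through untouched unless `pos='v'` is supplied. The sketch below mimics that lookup behaviour with a small hand-written table. `LEMMA_TABLE` and `toy_lemmatize` are hypothetical stand-ins for WordNet and `WordNetLemmatizer`, used so the snippet runs without the WordNet download.

```python
# Hypothetical miniature lemma table keyed by (word, part of speech).
LEMMA_TABLE = {
    ("days", "n"): "day",
    ("years", "n"): "year",
    ("looked", "v"): "look",
    ("asked", "v"): "ask",
}

def toy_lemmatize(word, pos="n"):
    """Look up a lemma for the given part of speech; fall back to the word itself."""
    return LEMMA_TABLE.get((word, pos), word)

print(toy_lemmatize("days"))             # noun lookup succeeds
print(toy_lemmatize("looked"))           # no noun entry, returned unchanged
print(toy_lemmatize("looked", pos="v"))  # verb lookup succeeds
```

With NLTK itself, the corresponding call is `lemmatizer.lemmatize('looked', pos='v')`.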

In [18]:
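# Lemmas are dictionary words, unlike truncated stems such as 'someth'.
# The output recorded below is the untouched sentence-tokenized text: the run that
# produced it wrote the loop's results to `sentences` instead of `sentences_lem`.
sentences_lem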

['\n              When I was 17, I read a quote that went something like: “If you live each day as if it was your last, \n              someday you’ll most certainly be right.” It made an impression on me, and since then, for the past 33 years, \n              I have looked in the mirror every morning and asked myself: “If today were the last day of my life, \n              would I want to do what I am about to do today?” And whenever the answer has been “No” for too many days in a row,\n               I know I need to change something.',
 'No one wants to die.',
 'Even people who want to go to heaven don’t want to die to get there.',
 'And yet death is the destination we all share.',
 'No one has ever escaped it.',
 'And that is as it should be, because \n               Death is very likely the single best invention of Life.',
 'It is Life’s change agent.',
 'It clears out the old to make way for the new.',
 'Right now the new is you, but someday not too long from now, \n             