## Stopwords

In [31]:
corpus=""" We expect that many readers of this book have heard of deep learning as an exciting new technology, and are surprised to see a mention of “history” in a book about an emerging ﬁeld. In fact, deep learning dates back to the 1940s. Deep learning only appears to be new, because it was relatively unpopular for several years preceding its current popularity, and because it has gone through many diﬀerent names, and has only recently become called “deep learning.” The ﬁeld has been rebranded many times, reﬂecting the inﬂuence of diﬀerent researchers and diﬀerent perspectives.
A comprehensive history of deep learning is beyond the scope of this textbook. However, some basic context is useful for understanding deep learning. Broadly speaking, there have been three waves of development of deep learning: deep learning known as cybernetics in the 1940s–1960s, deep learning known as connectionism in the 1980s–1990s, and the current resurgence under the name deep learning beginning in 2006. This is quantitatively illustrated in ﬁgure 1.7.
Some of the earliest learning algorithms we recognize today were intended to be computational models of biological learning, i.e. models of how learning happens or could happen in the brain. As a result, one of the names that deep learning has gone by is artiﬁcial neural networks (ANNs). The corresponding perspective on deep learning models is that they are engineered systems inspired by the biological brain (whether the human brain or the brain of another animal). While the kinds of neural networks used for machine learning have sometimes been used to understand brain function (Hinton and Shallice, 1991), they are generally not designed to be realistic models of biological function. The neural perspective on deep learning is motivated by two main ideas. One idea is that the brain provides a proof by example that intelligent behavior is possible, and a conceptually straightforward path to building intelligence is to reverse engineer the computational principles behind the brain and duplicate its functionality. Another perspective is that it would be deeply interesting to understand the brain and the principles that underlie human intelligence, so machine learning models that shed light on these basic scientiﬁc questions are useful apart from their ability to solve engineering applications. 
"""

In [10]:
from nltk.stem import PorterStemmer
from nltk.corpus import stopwords

In [11]:
import nltk
nltk.download('stopwords')

[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\Intel\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


True

In [12]:
stopwords.words('english')

['i',
 'me',
 'my',
 'myself',
 'we',
 'our',
 'ours',
 'ourselves',
 'you',
 "you're",
 "you've",
 "you'll",
 "you'd",
 'your',
 'yours',
 'yourself',
 'yourselves',
 'he',
 'him',
 'his',
 'himself',
 'she',
 "she's",
 'her',
 'hers',
 'herself',
 'it',
 "it's",
 'its',
 'itself',
 'they',
 'them',
 'their',
 'theirs',
 'themselves',
 'what',
 'which',
 'who',
 'whom',
 'this',
 'that',
 "that'll",
 'these',
 'those',
 'am',
 'is',
 'are',
 'was',
 'were',
 'be',
 'been',
 'being',
 'have',
 'has',
 'had',
 'having',
 'do',
 'does',
 'did',
 'doing',
 'a',
 'an',
 'the',
 'and',
 'but',
 'if',
 'or',
 'because',
 'as',
 'until',
 'while',
 'of',
 'at',
 'by',
 'for',
 'with',
 'about',
 'against',
 'between',
 'into',
 'through',
 'during',
 'before',
 'after',
 'above',
 'below',
 'to',
 'from',
 'up',
 'down',
 'in',
 'out',
 'on',
 'off',
 'over',
 'under',
 'again',
 'further',
 'then',
 'once',
 'here',
 'there',
 'when',
 'where',
 'why',
 'how',
 'all',
 'any',
 'both',
 'each
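
Every entry in the list above is lowercase, so a plain membership test will not match capitalized tokens such as "We". The following is a minimal sketch of a case-insensitive filter. It uses a small hand-picked subset of the words listed above rather than the full NLTK list, and `SAMPLE_STOPWORDS` and `remove_stopwords` are illustrative names that are not part of the notebook.

```python
# A small subset of the stopwords shown above, hard-coded so the example
# runs without downloading any NLTK data.
SAMPLE_STOPWORDS = {"we", "that", "of", "this", "have", "as", "an"}

def remove_stopwords(tokens, stop_words=SAMPLE_STOPWORDS):
    """Drop tokens whose lowercase form is in stop_words."""
    return [tok for tok in tokens if tok.lower() not in stop_words]

tokens = ["We", "expect", "that", "many", "readers", "of", "this", "book"]

# Case-insensitive check: "We" is removed along with the other stopwords
print(remove_stopwords(tokens))

# Case-sensitive check: "We" survives because only "we" is in the set
print([tok for tok in tokens if tok not in SAMPLE_STOPWORDS])
```

Comparing on `tok.lower()` keeps the original casing of the tokens that survive, which matters if a later step (for example named-entity recognition) depends on capitalization.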

In [49]:
stemmer=PorterStemmer()
# sent_tokenize needs the Punkt data: nltk.download('punkt') ('punkt_tab' on newer NLTK releases)
sentences=nltk.sent_tokenize(corpus)

In [50]:
sentences

[' We expect that many readers of this book have heard of deep learning as an exciting new technology, and are surprised to see a mention of “history” in a book about an emerging ﬁeld.',
 'In fact, deep learning dates back to the 1940s.',
 'Deep learning only appears to be new, because it was relatively unpopular for several years preceding its current popularity, and because it has gone through many diﬀerent names, and has only recently become called “deep learning.” The ﬁeld has been rebranded many times, reﬂecting the inﬂuence of diﬀerent researchers and diﬀerent perspectives.',
 'A comprehensive history of deep learning is beyond the scope of this textbook.',
 'However, some basic context is useful for understanding deep learning.',
 'Broadly speaking, there have been three waves of development of deep learning: deep learning\xa0known\xa0as cybernetics in\xa0the\xa01940s–1960s,\xa0deep\xa0learning\xa0known as connectionism in the 1980s–1990s, and the current resurgence under the na
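
The corpus appears to have been copied from a PDF: the sentences above contain typographic ligatures (`ﬁ`, `ﬀ`, `ﬂ`) and non-breaking spaces (`\xa0`), which then flow into every token. A minimal sketch of one possible cleanup step, assuming Unicode NFKC normalization is acceptable for this text. The helper name `normalize_text` is hypothetical and not part of the notebook.

```python
import unicodedata

def normalize_text(text):
    """Fold compatibility characters into their plain equivalents."""
    # NFKC maps ligatures such as U+FB01 to "fi" and the
    # non-breaking space U+00A0 to an ordinary space.
    return unicodedata.normalize("NFKC", text)

sample = "The ﬁeld has many diﬀerent names, reﬂecting\xa0history."
print(normalize_text(sample))
```

Curly quotes and en dashes have no compatibility mapping, so NFKC leaves them as they are. If this step were used, it would run on `corpus` before `sent_tokenize`.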

In [21]:
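# Build the stopword set once instead of once per word
stop_words=set(stopwords.words('english'))
for i in range(len(sentences)):
    words=nltk.word_tokenize(sentences[i])
    # Stem only the words that are not stopwords. The check is case-sensitive,
    # so capitalized stopwords such as "We" are kept (the stemmer then lowercases them).
    words=[stemmer.stem(word) for word in words if word not in stop_words]
    # Join the stemmed words back into a sentence. This overwrites `sentences`,
    # so re-run sent_tokenize(corpus) before trying another stemmer.
    sentences[i]=' '.join(words)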

In [22]:
sentences

['we expect mani reader book heard deep learn excit new technolog , surpris see mention “ histori ” book emerg ﬁeld .',
 'in fact , deep learn date back 1940 .',
 'deep learn appear new , rel unpopular sever year preced current popular , gone mani diﬀer name , recent becom call “ deep learning. ” the ﬁeld rebrand mani time , reﬂect inﬂuenc diﬀer research diﬀer perspect .',
 'a comprehens histori deep learn beyond scope textbook .',
 'howev , basic context use understand deep learn .',
 'broadli speak , three wave develop deep learn : deep learn known cybernet 1940s–1960 , deep learn known connection 1980s–1990 , current resurg name deep learn begin 2006 .',
 'thi quantit illustr ﬁgure 1.7 .',
 'some earliest learn algorithm recogn today intend comput model biolog learn , i.e .',
 'model learn happen could happen brain .',
 'as result , one name deep learn gone artiﬁci neural network ( ann ) .',
 'the correspond perspect deep learn model engin system inspir biolog brain ( whether human 
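
The loop that produced this output writes its results back into `sentences`, so comparing a second stemmer on the same text means re-tokenizing the corpus first. One way to avoid that is sketched below under stated assumptions: a helper that returns a new list and takes the normalizer as a parameter. `preprocess` and `strip_s` are hypothetical names, `str.split` stands in for `nltk.word_tokenize`, and the tiny stopword set is a placeholder so the block runs without NLTK. Unlike the notebook loop, the stopword check here is case-insensitive.

```python
def preprocess(sentences, normalize, stop_words, tokenize=str.split):
    """Return new, cleaned sentences; the input list is left unchanged."""
    cleaned = []
    for sentence in sentences:
        tokens = tokenize(sentence)
        # Keep non-stopwords only, then normalize each surviving token
        kept = [normalize(tok) for tok in tokens if tok.lower() not in stop_words]
        cleaned.append(" ".join(kept))
    return cleaned

def strip_s(word):
    """Toy stand-in for a stemmer: lowercase and drop one trailing 's'."""
    word = word.lower()
    return word[:-1] if word.endswith("s") else word

raw = ["We expect many readers", "Deep learning dates back"]
stop_words = {"we", "many"}

print(preprocess(raw, strip_s, stop_words))
print(raw)  # the original sentences are untouched
```

With NLTK available, `stemmer.stem` or `snowball.stem` could be passed as `normalize` and `nltk.word_tokenize` as `tokenize`, so each normalizer runs on the same untouched input.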

In [23]:
from nltk.stem import SnowballStemmer
snowball=SnowballStemmer('english')

In [39]:
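# Build the stopword set once instead of once per word
stop_words=set(stopwords.words('english'))
for i in range(len(sentences)):
    words=nltk.word_tokenize(sentences[i])
    # Stem only the words that are not stopwords (case-sensitive check,
    # so capitalized stopwords such as "We" are kept and then lowercased)
    words=[snowball.stem(word) for word in words if word not in stop_words]
    # Join the stemmed words back into a sentence (overwrites `sentences` in place)
    sentences[i]=' '.join(words)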

In [40]:
sentences

['we expect mani reader book heard deep learn excit new technolog , surpris see mention “ histori ” book emerg ﬁeld .',
 'in fact , deep learn date back 1940s .',
 'deep learn appear new , relat unpopular sever year preced current popular , gone mani diﬀer name , recent becom call “ deep learning. ” the ﬁeld rebrand mani time , reﬂect inﬂuenc diﬀer research diﬀer perspect .',
 'a comprehens histori deep learn beyond scope textbook .',
 'howev , basic context use understand deep learn .',
 'broad speak , three wave develop deep learn : deep learn known cybernet 1940s–1960s , deep learn known connection 1980s–1990s , current resurg name deep learn begin 2006 .',
 'this quantit illustr ﬁgure 1.7 .',
 'some earliest learn algorithm recogn today intend comput model biolog learn , i.e .',
 'model learn happen could happen brain .',
 'as result , one name deep learn gone artiﬁci neural network ( ann ) .',
 'the correspond perspect deep learn model engin system inspir biolog brain ( whether hu
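
The Porter and Snowball outputs differ on only a handful of tokens. A short side-by-side check of words taken from the corpus makes the differences easier to see. This is a sketch that assumes NLTK is installed; neither stemmer needs downloaded data, and the variable names are illustrative.

```python
from nltk.stem import PorterStemmer, SnowballStemmer

porter = PorterStemmer()
snowball = SnowballStemmer("english")

# Words from the corpus on which the two outputs above disagree
words = ["this", "relatively", "broadly", "1940s"]

for word in words:
    # Print both stems next to each other for comparison
    print(f"{word:12} porter={porter.stem(word):10} snowball={snowball.stem(word)}")
```

In the outputs above, Snowball keeps `this` and `1940s` intact where Porter strips the trailing `s`, which is one reason Snowball stems are often easier to read.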

In [27]:
from nltk.stem import WordNetLemmatizer
# WordNetLemmatizer needs the WordNet data; run nltk.download('wordnet') once if it is missing
Lemmatizer=WordNetLemmatizer()

In [51]:
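# Build the stopword set once instead of once per word
stop_words=set(stopwords.words('english'))
for i in range(len(sentences)):
    # Lowercase first so that capitalized stopwords match the lowercase stopword list
    sentences[i]=sentences[i].lower()
    words=nltk.word_tokenize(sentences[i])
    # Lemmatize only the words that are not stopwords. pos='v' treats every token
    # as a verb, so nouns with no verb reading (e.g. "readers") stay unchanged.
    words=[Lemmatizer.lemmatize(word, pos='v') for word in words if word not in stop_words]
    # Join the lemmatized words back into a sentence (overwrites `sentences` in place)
    sentences[i]=' '.join(words)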

In [52]:
sentences

['expect many readers book hear deep learn excite new technology , surprise see mention “ history ” book emerge ﬁeld .',
 'fact , deep learn date back 1940s .',
 'deep learn appear new , relatively unpopular several years precede current popularity , go many diﬀerent name , recently become call “ deep learning. ” ﬁeld rebranded many time , reﬂecting inﬂuence diﬀerent researchers diﬀerent perspectives .',
 'comprehensive history deep learn beyond scope textbook .',
 'however , basic context useful understand deep learn .',
 'broadly speak , three wave development deep learn : deep learn know cybernetics 1940s–1960s , deep learn know connectionism 1980s–1990s , current resurgence name deep learn begin 2006 .',
 'quantitatively illustrate ﬁgure 1.7 .',
 'earliest learn algorithms recognize today intend computational model biological learn , i.e .',
 'model learn happen could happen brain .',
 'result , one name deep learn go artiﬁcial neural network ( anns ) .',
 'correspond perspective d