## Import Libraries

In [1]:
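# NLTK supplies the tokenizers, the stopword list, and the Porter stemmer.
# Its tokenizer models and stopwords corpus must be fetched once with nltk.download().
import nltk
from nltk.stem import PorterStemmer
from nltk.corpus import stopwords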

## Stopwords

Stop words are very common words that carry little meaning on their own, so they are usually filtered out before or after processing natural language data.

In English, examples of stop words include "a", "an", "the", "is", "are", "and", "but", "or", "in", "on", "at", "with", "he", "she", "it", and "they".
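As a minimal sketch of the idea, the snippet below filters a sentence against a tiny hand-picked stopword set. The set is built from the examples above and is not the NLTK list; `remove_stopwords` is a hypothetical helper name, and `str.split()` stands in for a real tokenizer.

```python
# Illustrative stopword set built from the examples above; NOT the NLTK list.
EXAMPLE_STOPWORDS = {"a", "an", "the", "is", "are", "and", "but", "or",
                     "in", "on", "at", "with", "he", "she", "it", "they"}

def remove_stopwords(text, stop_words=EXAMPLE_STOPWORDS):
    """Return the words of `text` that are not stopwords (case-insensitive)."""
    # str.split() is a stand-in for a real tokenizer such as nltk.word_tokenize
    return [word for word in text.split() if word.lower() not in stop_words]

print(remove_stopwords("The cat is on the mat and it purrs"))
# ['cat', 'mat', 'purrs']
```

Only the content-bearing words survive, which is the effect the NLTK-based cells in this notebook aim for on a larger scale.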

In [2]:
paragraph = """Nelson Mandela was born on July 18, 1918 in a town called Umtata in the Transkei area of South Africa. 
Mandela grew up as any other young, South African black boy in an environment of poverty and oppression. As a young man he witnessed 
the white South African government imposing more and more restrictions on an already down trodden, uneducated, black majority. 
During his years at the University College of Fort Hare and the University of South Africa, where he studied law, he became even 
more aware of the atrocities and injustices committed in the name of apartheid. In 1944, Mandela joined the National African Congress (ANC) 
and became an outspoken, activist against the laws of apartheid. "Dangers and difficulties have not deterred us in the past, they will not 
frighten us now. But we must be prepared for them like men in business who do not waste energy in vain talk and idle action. The way of 
preparation (for action) lies in our rooting out all impurity and indiscipline from our organization and making it the bright and shining 
instrument that will cleave its way to freedom"( "No Easy Road to Freedom Speech by Nelson Mandela."). Mandela's charismatic speeches 
triggered an investigation by the ruling National Party Government, and in 1962 they arrested and charged Mandela with treason. 
The judge found him guilty, and sentenced him to life in prison. The first eighteen years of his incarceration he spent in 
Robben Island Prison, often in solitary confinement. Up until his release on February 11, 1990, he was held in Pollsmoor Prison. 
After his release, Mandela worked tirelessly towards a peaceful, democratic South Africa. He received The Nobel Peace Prize in 1993, 
and on April 27, 1994, South Africa held its first free election. The people elected Mandela as president. Mandela's strong, inimitable 
spirit allowed him to not only survive incredible hardships, but transformed him into an international symbol of peace and reconciliation. 
"I have cherished the ideal of a democratic and free society in which all persons live together in harmony and with equal opportunities. 
It is an ideal which I hope to live for and to achieve"("Nelson Mandela " I Am Prepared to Die".") He never once wavered in his convictions 
or his dreams and he has lived to see them all come to pass. Nelson Mandela, known to many as the "Grandfather" of South Africa, embodies 
all the characteristics of a true hero. In the face of seemingly insurmountable obstacles, he facilitated a peaceful transition to a 
democratic South Africa."""

### Tokenization

In [3]:
sentences = nltk.sent_tokenize(paragraph)

In [4]:
sentences

['Nelson Mandela was born on July 18, 1918 in a town called Umtata in the Transkei area of South Africa.',
 'Mandela grew up as any other young, South African black boy in an environment of poverty and oppression.',
 'As a young man he witnessed \nthe white South African government imposing more and more restrictions on an already down trodden, uneducated, black majority.',
 'During his years at the University College of Fort Hare and the University of South Africa, where he studied law, he became even \nmore aware of the atrocities and injustices committed in the name of apartheid.',
 'In 1944, Mandela joined the National African Congress (ANC) \nand became an outspoken, activist against the laws of apartheid.',
 '"Dangers and difficulties have not deterred us in the past, they will not \nfrighten us now.',
 'But we must be prepared for them like men in business who do not waste energy in vain talk and idle action.',
 'The way of \npreparation (for action) lies in our rooting out all 

### Removing Stopwords and Applying Stemming

In [5]:
stemmer = PorterStemmer()

In [6]:
stopwords.words('english')

['i',
 'me',
 'my',
 'myself',
 'we',
 'our',
 'ours',
 'ourselves',
 'you',
 "you're",
 "you've",
 "you'll",
 "you'd",
 'your',
 'yours',
 'yourself',
 'yourselves',
 'he',
 'him',
 'his',
 'himself',
 'she',
 "she's",
 'her',
 'hers',
 'herself',
 'it',
 "it's",
 'its',
 'itself',
 'they',
 'them',
 'their',
 'theirs',
 'themselves',
 'what',
 'which',
 'who',
 'whom',
 'this',
 'that',
 "that'll",
 'these',
 'those',
 'am',
 'is',
 'are',
 'was',
 'were',
 'be',
 'been',
 'being',
 'have',
 'has',
 'had',
 'having',
 'do',
 'does',
 'did',
 'doing',
 'a',
 'an',
 'the',
 'and',
 'but',
 'if',
 'or',
 'because',
 'as',
 'until',
 'while',
 'of',
 'at',
 'by',
 'for',
 'with',
 'about',
 'against',
 'between',
 'into',
 'through',
 'during',
 'before',
 'after',
 'above',
 'below',
 'to',
 'from',
 'up',
 'down',
 'in',
 'out',
 'on',
 'off',
 'over',
 'under',
 'again',
 'further',
 'then',
 'once',
 'here',
 'there',
 'when',
 'where',
 'why',
 'how',
 'all',
 'any',
 'both',
 'each

In [7]:
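# Build the stopword set once instead of rebuilding it for every word
stop_words = set(stopwords.words('english'))

for i in range(len(sentences)):
    words = nltk.word_tokenize(sentences[i])
    # Keep only non-stopwords, then reduce each remaining word to its stem
    words = [stemmer.stem(word) for word in words if word not in stop_words]
    sentences[i] = ' '.join(words)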
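Note that the membership test in this loop is case-sensitive, while the NLTK stopword list is all lowercase. Capitalized sentence-initial words such as "As", "In", "But", and "The" therefore pass the filter and appear in the stemmed output as 'as', 'in', 'but', and 'the'. One possible variant, sketched here under assumptions, lowercases each token before the check. `clean_tokens`, its parameters, and the tiny demo stopword set are hypothetical names introduced only for illustration.

```python
def clean_tokens(tokens, stop_words, stem=lambda w: w):
    """Drop stopwords case-insensitively, then stem what remains.

    Hypothetical helper for this sketch; pass a stemmer's `stem` method
    as `stem` to apply real stemming (the default leaves words unchanged).
    """
    kept = [tok for tok in tokens if tok.lower() not in stop_words]
    return [stem(tok) for tok in kept]

# Tiny stand-in stopword set so the example runs without NLTK data
demo_stop_words = {"the", "him", "and", "to", "in"}
tokens = ["The", "judge", "found", "him", "guilty"]

# Case-insensitive check removes the capitalized "The"
print(clean_tokens(tokens, demo_stop_words))
# ['judge', 'found', 'guilty']

# Case-sensitive check, as in the loop above, lets "The" through
print([t for t in tokens if t not in demo_stop_words])
# ['The', 'judge', 'found', 'guilty']
```

In the notebook itself, a call along the lines of `clean_tokens(nltk.word_tokenize(sentence), set(stopwords.words('english')), stemmer.stem)` would apply the same idea with the real stopword list and the Porter stemmer.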

In [8]:
sentences

['nelson mandela born juli 18 , 1918 town call umtata transkei area south africa .',
 'mandela grew young , south african black boy environ poverti oppress .',
 'as young man wit white south african govern impos restrict alreadi trodden , uneduc , black major .',
 'dure year univers colleg fort hare univers south africa , studi law , becam even awar atroc injustic commit name apartheid .',
 'in 1944 , mandela join nation african congress ( anc ) becam outspoken , activist law apartheid .',
 '`` danger difficulti deter us past , frighten us .',
 'but must prepar like men busi wast energi vain talk idl action .',
 "the way prepar ( action ) lie root impur indisciplin organ make bright shine instrument cleav way freedom '' ( `` no easi road freedom speech nelson mandela . `` ) .",
 "mandela 's charismat speech trigger investig rule nation parti govern , 1962 arrest charg mandela treason .",
 'the judg found guilti , sentenc life prison .',
 'the first eighteen year incarcer spent robben i