### STOPWORDS

In [1]:
paragraph = """
Dr. APJ Abdul Kalam's speeches often emphasized the importance of dreaming big, working hard, and contributing to India's development. He frequently encouraged young people to develop courage, think differently, and strive for excellence. He also highlighted the need for self-reliance and a strong sense of national identity. 
Here are some key themes and ideas often found in his speeches:
Dreaming Big and Setting Ambitious Goals:
Kalam frequently urged individuals, especially young people, to dream big and set ambitious goals for themselves and for the nation. He believed that dreams, coupled with hard work and perseverance, could lead to remarkable achievements. 
Importance of Education and Learning:
He stressed the significance of education, both formal and informal, and encouraged continuous learning and skill development. He believed that education empowers individuals to contribute meaningfully to society. 
Courage and Perseverance:
Kalam's speeches often celebrated courage, the willingness to challenge conventional thinking, and the ability to persevere in the face of challenges and setbacks. 
National Development and Self-Reliance:
He was a strong advocate for India's development and self-reliance. He believed that India had the potential to become a developed nation and urged citizens to work towards this goal. 
Ethical Leadership and Values:
Kalam emphasized the importance of ethical leadership, integrity, and the need to uphold values in all aspects of life. 
Contribution to Society:
He encouraged individuals to find ways to contribute to society, whether through scientific advancements, social service, or other means. 
Inspiration from Nature:
Kalam often used examples from nature, like the flight of birds, to illustrate complex concepts and inspire his audience. 
"""

In [2]:
import nltk
from nltk.stem import PorterStemmer

In [3]:
from nltk.corpus import stopwords

In [4]:
nltk.download('stopwords')

[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\Admin\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


True

In [5]:
stopwords.words('english')

['a',
 'about',
 'above',
 'after',
 'again',
 'against',
 'ain',
 'all',
 'am',
 'an',
 'and',
 'any',
 'are',
 'aren',
 "aren't",
 'as',
 'at',
 'be',
 'because',
 'been',
 'before',
 'being',
 'below',
 'between',
 'both',
 'but',
 'by',
 'can',
 'couldn',
 "couldn't",
 'd',
 'did',
 'didn',
 "didn't",
 'do',
 'does',
 'doesn',
 "doesn't",
 'doing',
 'don',
 "don't",
 'down',
 'during',
 'each',
 'few',
 'for',
 'from',
 'further',
 'had',
 'hadn',
 "hadn't",
 'has',
 'hasn',
 "hasn't",
 'have',
 'haven',
 "haven't",
 'having',
 'he',
 "he'd",
 "he'll",
 'her',
 'here',
 'hers',
 'herself',
 "he's",
 'him',
 'himself',
 'his',
 'how',
 'i',
 "i'd",
 'if',
 "i'll",
 "i'm",
 'in',
 'into',
 'is',
 'isn',
 "isn't",
 'it',
 "it'd",
 "it'll",
 "it's",
 'its',
 'itself',
 "i've",
 'just',
 'll',
 'm',
 'ma',
 'me',
 'mightn',
 "mightn't",
 'more',
 'most',
 'mustn',
 "mustn't",
 'my',
 'myself',
 'needn',
 "needn't",
 'no',
 'nor',
 'not',
 'now',
 'o',
 'of',
 'off',
 'on',
 'once',
 'on

In [6]:
from nltk.stem import PorterStemmer
from nltk.tokenize import sent_tokenize
from nltk.tokenize import word_tokenize

In [7]:
stemmer = PorterStemmer()

In [8]:
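# sent_tokenize needs the Punkt tokenizer data: nltk.download('punkt') (newer NLTK releases use 'punkt_tab')
sentences = nltk.sent_tokenize(paragraph)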

In [9]:
type(sentences)

list

In [10]:
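# Filter out stopwords, then stem the remaining tokens
stop_words = set(stopwords.words('english'))  # build the set once, not once per token
porter_stemmer_sentences = []
for sentence in sentences:
    words = nltk.word_tokenize(sentence)
    # The stopword check is case-sensitive, so capitalized tokens such as "He" pass through;
    # PorterStemmer.stem() lowercases its output by default
    words = [stemmer.stem(word) for word in words if word not in stop_words]
    porter_stemmer_sentences.append(' '.join(words))  # rejoin the stems into one string per sentence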

porter_stemmer_sentences

["dr. apj abdul kalam 's speech often emphas import dream big , work hard , contribut india 's develop .",
 'he frequent encourag young peopl develop courag , think differ , strive excel .',
 'he also highlight need self-reli strong sens nation ident .',
 'here key theme idea often found speech : dream big set ambiti goal : kalam frequent urg individu , especi young peopl , dream big set ambiti goal nation .',
 'he believ dream , coupl hard work persever , could lead remark achiev .',
 'import educ learn : he stress signific educ , formal inform , encourag continu learn skill develop .',
 'he believ educ empow individu contribut meaning societi .',
 "courag persever : kalam 's speech often celebr courag , willing challeng convent think , abil persever face challeng setback .",
 "nation develop self-reli : he strong advoc india 's develop self-reli .",
 'he believ india potenti becom develop nation urg citizen work toward goal .',
 'ethic leadership valu : kalam emphas import ethic lead
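
Several sentences in the stemmed output above still begin with `he` or `here`. The membership test compares the original-case token (`He`, `Here`) against a lowercase stopword list, so capitalized stopwords are not removed. A minimal sketch of case-insensitive filtering follows. It assumes a small hand-picked subset of the stopword list and uses `str.split` in place of `nltk.word_tokenize` so that it runs without NLTK data; `STOP_WORDS` and `remove_stopwords` are illustrative names, not part of NLTK.

```python
# Hand-picked subset of the English stopword list (assumption: illustration only).
STOP_WORDS = {"he", "here", "the", "for", "and", "to", "of", "a"}


def remove_stopwords(tokens, stop_words=STOP_WORDS, ignore_case=True):
    """Return the tokens that are not stopwords."""
    if ignore_case:
        # Lowercase each token only for the comparison; the token itself is kept as-is.
        return [tok for tok in tokens if tok.lower() not in stop_words]
    return [tok for tok in tokens if tok not in stop_words]


# str.split stands in for nltk.word_tokenize here (assumption).
tokens = "He also highlighted the need for self-reliance".split()

case_sensitive = remove_stopwords(tokens, ignore_case=False)
case_insensitive = remove_stopwords(tokens)

print(case_sensitive)    # 'He' survives the case-sensitive check
print(case_insensitive)  # 'He' is dropped once the comparison ignores case
```

In the notebook, the same effect could be had by testing `word.lower() not in stop_words` inside the list comprehension.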

In [11]:
from nltk.stem import SnowballStemmer

snowballstemmer = SnowballStemmer('english')

In [12]:
snowball_stemmer_sentences = []
stop_words = set(stopwords.words('english'))  # build the set once for fast lookups

for sentence in sentences:
    words = nltk.word_tokenize(sentence)
    words = [snowballstemmer.stem(word) for word in words if word not in stop_words]
    snowball_stemmer_sentences.append(' '.join(words))
snowball_stemmer_sentences

["dr. apj abdul kalam 's speech often emphas import dream big , work hard , contribut india 's develop .",
 'he frequent encourag young peopl develop courag , think differ , strive excel .',
 'he also highlight need self-reli strong sens nation ident .',
 'here key theme idea often found speech : dream big set ambiti goal : kalam frequent urg individu , especi young peopl , dream big set ambiti goal nation .',
 'he believ dream , coupl hard work persever , could lead remark achiev .',
 'import educ learn : he stress signific educ , formal inform , encourag continu learn skill develop .',
 'he believ educ empow individu contribut meaning societi .',
 "courag persever : kalam 's speech often celebr courag , willing challeng convent think , abil persever face challeng setback .",
 "nation develop self-reli : he strong advoc india 's develop self-reli .",
 'he believ india potenti becom develop nation urg citizen work toward goal .',
 'ethic leadership valu : kalam emphas import ethic lead
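
The Porter and Snowball outputs above look the same for the sentences shown. Rather than comparing them by eye, the two lists can be checked position by position. A minimal sketch, assuming both lists have the same length; `differing_indices` is a hypothetical helper, and the two short sample lists are copied from the outputs above.

```python
def differing_indices(first, second):
    """Return the positions at which two equal-length lists of strings differ."""
    if len(first) != len(second):
        raise ValueError("lists must have the same length")
    return [i for i, (a, b) in enumerate(zip(first, second)) if a != b]


# Two sentences copied from each stemmer's output above.
porter_sample = [
    'he frequent encourag young peopl develop courag , think differ , strive excel .',
    'he also highlight need self-reli strong sens nation ident .',
]
snowball_sample = [
    'he frequent encourag young peopl develop courag , think differ , strive excel .',
    'he also highlight need self-reli strong sens nation ident .',
]

mismatches = differing_indices(porter_sample, snowball_sample)
print(mismatches)  # an empty list means the two stemmers agree on every sentence
```

In the notebook this would be called as `differing_indices(porter_stemmer_sentences, snowball_stemmer_sentences)`.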

In [13]:
from nltk.stem import WordNetLemmatizer

# WordNetLemmatizer needs the WordNet corpus: nltk.download('wordnet')
lemmatizer = WordNetLemmatizer()

In [17]:
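lemmatized_sentences = []  # keep `sentences` intact so the cell can be re-run safely
stop_words = set(stopwords.words('english'))

for sentence in sentences:
    words = nltk.word_tokenize(sentence)
    # pos='v' treats every token as a verb, so noun forms are not reliably reduced
    words = [lemmatizer.lemmatize(word.lower(), pos='v') for word in words if word not in stop_words]
    lemmatized_sentences.append(' '.join(words))

lemmatized_sentences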

["dr. apj abdul kalam 's speech often emphasize importance dream big , work hard , contribute india 's development .",
 'he frequently encourage young people develop courage , think differently , strive excellence .',
 'he also highlight need self-reliance strong sense national identity .',
 'here key theme idea often find speech : dream big set ambitious goals : kalam frequently urge individual , especially young people , dream big set ambitious goal nation .',
 'he believe dream , couple hard work perseverance , could lead remarkable achievement .',
 'importance education learn : he stress significance education , formal informal , encourage continuous learn skill development .',
 'he believe education empower individual contribute meaningfully society .',
 "courage perseverance : kalam 's speech often celebrate courage , willingness challenge conventional think , ability persevere face challenge setback .",
 "national development self-reliance : he strong advocate india 's developme