Stemming
Stemming is a process in Natural Language Processing (NLP) that aims to reduce words to their base or root form, called a stem. 
It works by stripping affixes, most commonly suffixes such as "-ing", "-ed", or "-s", so that only the core form of the word remains. 
The resulting stems may not always be valid words, but they allow for grouping together different variations of the same word.
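
To make the idea of suffix stripping concrete, here is a deliberately naive sketch in plain Python. It is not the Porter algorithm: the suffix list, the `naive_stem` name, and the `min_stem_len` parameter are all assumptions made up for this illustration.

```python
# A toy suffix stripper (hypothetical, NOT the Porter algorithm).
# Suffixes are checked in order, so "es" is tried before "s".
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word, min_stem_len=3):
    """Strip the first matching suffix if at least min_stem_len characters remain."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


for w in ["jumping", "jumped", "jumps", "studies", "running"]:
    print(w, "->", naive_stem(w))
# jumping -> jump
# jumped -> jump
# jumps -> jump
# studies -> studi
# running -> runn
```

The three forms of "jump" collapse to one stem, while "studi" and "runn" show that a stem does not have to be a dictionary word. Real stemmers add further rules on top of plain stripping, for example to undo the doubled consonant in "runn".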

Stemming Example: Let's consider a sentence: "I am running in the park."

To stem this sentence, we can use the Porter stemming algorithm, one of the most widely used stemmers for English. 
Here's an example using the NLTK library in Python:

from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

sentence = "I am running in the park."

# Tokenize the sentence
tokens = word_tokenize(sentence)

# Initialize PorterStemmer
stemmer = PorterStemmer()

# Apply stemming to each token
stems = [stemmer.stem(token) for token in tokens]

# Print the stems
print(stems)

['i', 'am', 'run', 'in', 'the', 'park', '.']


In this example, we first import two names from NLTK (installed in a previous chapter, 01_Tokenization): PorterStemmer for stemming and word_tokenize for tokenization. 
The sentence "I am running in the park." is then split into tokens: the individual words plus the final period.

Next, we initialize a PorterStemmer object, which implements the Porter stemming algorithm. 
We iterate over each token and apply stemming using the stem() method of the stemmer object.

Finally, we print the resulting stems, which are ['i', 'am', 'run', 'in', 'the', 'park', '.']. 
Notice that "running" has been reduced to its stem "run", and that "I" has become "i" because the stemmer lowercases its input by default. The other tokens are left unchanged.
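
The grouping effect is easier to see with several variants of the same word. The following is a minimal sketch that assumes NLTK is installed; the word list and the `groups` variable are chosen only for illustration.

```python
from collections import defaultdict

from nltk.stem import PorterStemmer

stemmer = PorterStemmer()

# Illustrative word list: variants of two base words.
words = [
    "run", "running", "runs",
    "connect", "connected", "connecting", "connection", "connections",
]

# Map each stem to the original words that reduce to it.
groups = defaultdict(list)
for word in words:
    groups[stemmer.stem(word)].append(word)

for stem, variants in groups.items():
    print(stem, "<-", variants)
# run <- ['run', 'running', 'runs']
# connect <- ['connect', 'connected', 'connecting', 'connection', 'connections']
```

All eight words end up under just two stems, which is the normalization that downstream tasks rely on.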

Stemming is useful for various NLP tasks, such as information retrieval, text mining, and language modeling, 
where the focus is on the general meaning of words rather than their specific variations.
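
As a rough illustration of the information retrieval case, the sketch below matches a query against a few documents by comparing stems instead of exact words. The documents, the query, and the `stem_set` helper are hypothetical, and tokenization is simplified to whitespace splitting so that no extra NLTK data is needed.

```python
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()


def stem_set(text):
    """Split on whitespace, trim surrounding punctuation, and return the set of stems."""
    tokens = (tok.strip(".,!?") for tok in text.split())
    return {stemmer.stem(tok) for tok in tokens if tok}


documents = [
    "She runs every morning.",
    "They were running in the park.",
    "The park was closed.",
]
query = "run"

# A document matches if it shares at least one stem with the query.
query_stems = stem_set(query)
matches = [doc for doc in documents if query_stems & stem_set(doc)]
print(matches)
# ['She runs every morning.', 'They were running in the park.']
```

An exact-word search for "run" would find none of these documents, whereas matching on stems retrieves the two that mention "runs" and "running".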