# Stemming

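Stemming is the process of reducing inflected (or sometimes derived) words to their word stem, base or root form. The stem does not have to be a valid dictionary word: it only needs to be the same for related word forms, so that they can be treated as one term.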

In [17]:
words = ['eating', 'eats', 'sleeping', 'sleeps', 'playing', 'played', 'working',
         'studying', 'reading', 'writing', 'running', 'walking', 'walked', 'talking']

## Porter Stemmer

The Porter stemming algorithm (or 'Porter stemmer') is a process for removing the commoner morphological and inflexional endings from words in English. Its main use is as part of a term normalisation process that is usually done when setting up Information Retrieval systems.

In [18]:
from nltk.stem import PorterStemmer

In [19]:
stemming = PorterStemmer()


In [20]:
for word in words:
    print(word + ' : ' + stemming.stem(word))

eating : eat
eats : eat
sleeping : sleep
sleeps : sleep
playing : play
played : play
working : work
studying : studi
reading : read
writing : write
running : run
walking : walk
walked : walk
talking : talk


## RegexpStemmer

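The RegexpStemmer removes every part of a word that matches a regular expression you supply. With a pattern anchored at the end of the word, as below, it strips common inflexional endings such as `ing`, `ed`, `es` and `s`. It is a simple stemmer: it applies no spelling rules after removing the suffix, which is why `writing` becomes `writ` and `running` becomes `runn` in the output below.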

In [21]:
from nltk.stem import RegexpStemmer

In [22]:
reg_stemmer = RegexpStemmer('ing$|ed$|es$|s$')

In [23]:
for word in words:
    print(word + ' : ' + reg_stemmer.stem(word))

eating : eat
eats : eat
sleeping : sleep
sleeps : sleep
playing : play
played : play
working : work
studying : study
reading : read
writing : writ
running : runn
walking : walk
walked : walk
talking : talk
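
To make the behaviour above concrete, here is a minimal sketch of the same suffix stripping written with only the standard `re` module. The function name `strip_suffix` and its `min_length` parameter are hypothetical names chosen for this illustration; NLTK's `RegexpStemmer` accepts a comparable `min` argument that leaves words shorter than the given length unchanged.

```python
import re

# Same suffix pattern as the RegexpStemmer cell above.
SUFFIXES = re.compile(r'ing$|ed$|es$|s$')

def strip_suffix(word, min_length=0):
    """Remove a matching suffix; leave words shorter than min_length untouched."""
    if len(word) < min_length:
        return word
    return SUFFIXES.sub('', word)

# Compare the unguarded result with a length guard of 4 characters.
for word in ['running', 'writing', 'bed', 'is']:
    print(word + ' : ' + strip_suffix(word) + ' / ' + strip_suffix(word, min_length=4))
```

Without the guard, short words such as `bed` and `is` lose their endings as well, which is the kind of over-stemming a minimum length is meant to limit.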


## Snowball Stemmer

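The Snowball stemmer for English, also known as Porter2, is a revised version of the Porter algorithm. Snowball itself is a small language for writing stemming algorithms, and NLTK's `SnowballStemmer` provides stemmers for several languages besides English; the language is selected by the name passed to the constructor.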

In [24]:
from nltk.stem import SnowballStemmer

In [25]:
snow_stemmer = SnowballStemmer('english')

In [26]:
for word in words:
    print(word + ' : ' + snow_stemmer.stem(word))

eating : eat
eats : eat
sleeping : sleep
sleeps : sleep
playing : play
played : play
working : work
studying : studi
reading : read
writing : write
running : run
walking : walk
walked : walk
talking : talk
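
To see where the three stemmers disagree on this word list, the sketch below compares the stems printed in the three output listings above. The stems are copied by hand from those listings rather than recomputed, so the snippet runs without NLTK; the variable names are chosen for this illustration only.

```python
# Word list and stems copied from the three output listings above.
words = ['eating', 'eats', 'sleeping', 'sleeps', 'playing', 'played', 'working',
         'studying', 'reading', 'writing', 'running', 'walking', 'walked', 'talking']
porter = 'eat eat sleep sleep play play work studi read write run walk walk talk'.split()
regexp = 'eat eat sleep sleep play play work study read writ runn walk walk talk'.split()
snowball = 'eat eat sleep sleep play play work studi read write run walk walk talk'.split()

# Keep only the words on which the three stemmers do not all agree.
differing = [(w, p, r, s) for w, p, r, s in zip(words, porter, regexp, snowball)
             if len({p, r, s}) > 1]

for w, p, r, s in differing:
    print(f'{w} : porter={p}, regexp={r}, snowball={s}')
```

On these 14 words the Porter and Snowball stemmers agree everywhere, and the regular-expression stemmer differs only on `studying`, `writing` and `running`.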


## Conclusion

Stemming reduces the number of distinct word forms that need to be processed in natural language processing tasks. It can help to improve the performance of text classification, information retrieval, and other tasks that involve large amounts of text. Stemmers are rule-based, however, and their output is not always correct: a stem may not be a real word (`studi`), and crude rules can leave malformed stems (`runn`). Evaluate the stemmer on your own task and dataset before relying on its results.
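
One simple way to inspect a stemmer on your own data is to group the words by the stem they produce and look at which forms end up merged or kept apart. The following is a minimal sketch, not a full evaluation: `group_by_stem` and `toy_stem` are hypothetical helpers written for this illustration, and `toy_stem` is only a stand-in so the snippet runs without NLTK. Any callable that maps a word to a stem, such as the `stem` method of one of the stemmers above, could be passed instead.

```python
import re
from collections import defaultdict

def group_by_stem(words, stem):
    """Map each stem to the sorted list of distinct words that reduce to it."""
    groups = defaultdict(set)
    for word in words:
        groups[stem(word)].add(word)
    return {key: sorted(members) for key, members in groups.items()}

def toy_stem(word):
    """Hypothetical stand-in stemmer: strip a few common endings."""
    return re.sub(r'ing$|ed$|es$|s$', '', word)

sample = ['eating', 'eats', 'running', 'runs', 'walked', 'walking']
groups = group_by_stem(sample, toy_stem)

print(len(set(sample)), 'distinct words ->', len(groups), 'stems')
for key, members in sorted(groups.items()):
    print(key + ' : ' + ', '.join(members))
```

Here `running` and `runs` land in different groups (`runn` and `run`), which shows the stand-in stemmer failing to merge two forms of the same word.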