## Stemming

Stemming is the process of reducing a word to its **stem** by stripping affixes (suffixes and prefixes). The stem is not always a dictionary word; the dictionary form of a word is its **lemma**, which lemmatization produces instead. Stemming is a common preprocessing step in natural language understanding (NLU) and natural language processing (NLP).

**Classification Problem**

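Decide whether a product comment is a positive review or a negative review. Stemming maps the inflected forms of a word to one root, so the classifier treats them as a single feature.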

| Review words        | Root Word |
|---------------------|-----------|
| eating, eats, eaten | eat       |
| going, gone, goes   | go        |
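
To make the motivation concrete, here is a minimal sketch of how mapping review words to their roots merges separate features into one. The `ROOTS` dictionary is copied from the table above and only stands in for a real stemmer; the `normalize` helper and the sample review are hypothetical, used purely for illustration.

```python
from collections import Counter

# Root-word lookup copied from the table above (a stand-in for a real stemmer).
ROOTS = {
    "eating": "eat", "eats": "eat", "eaten": "eat",
    "going": "go", "gone": "go", "goes": "go",
}


def normalize(tokens):
    """Replace each token with its root word when one is known."""
    return [ROOTS.get(token, token) for token in tokens]


# Hypothetical review text, already lowercased and split on whitespace.
review = "she eats here and keeps going back after eating".split()

raw_counts = Counter(review)
root_counts = Counter(normalize(review))

# Before normalization, "eats" and "eating" are counted as separate features.
print(raw_counts["eats"], raw_counts["eating"])
# After normalization, both contribute to the single feature "eat".
print(root_counts["eat"], root_counts["go"])
```

With fewer distinct features, a classifier needs fewer examples to learn that any form of a word carries the same signal.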

In [28]:
words = ["eating", "eats", "eaten", "writing", "writes", "programming", "programs", "history", "finally", "finalized", "congratulations", "sitting", "fairly", "sportingly", "going", "goes"]

## PorterStemmer

In [29]:
from nltk.stem import PorterStemmer

# Apply the Porter algorithm to each word and print "word --> stem"
stemmer = PorterStemmer()
for word in words:
    print(word, "-->", stemmer.stem(word))

eating --> eat
eats --> eat
eaten --> eaten
writing --> write
writes --> write
programming --> program
programs --> program
history --> histori
finally --> final
finalized --> final
congratulations --> congratul
sitting --> sit
fairly --> fairli
sportingly --> sportingli
going --> go
goes --> goe


## RegexpStemmer class

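NLTK provides a `RegexpStemmer` class for building **regular-expression stemmers**. It takes a single regular expression and removes every substring that matches it. Anchoring an alternative with `$` restricts it to suffixes, while an unanchored alternative such as `ing` matches anywhere in the word. The `min` argument sets the minimum word length for stemming; shorter words are returned unchanged. For example: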

In [32]:
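from nltk.stem import RegexpStemmer

# The extra words show that the unanchored "ing" is also stripped from the start of a word
new_words = words + ["ingeating", "ingeat"]
# Words shorter than min=4 characters are returned unchanged
reg_stemmer = RegexpStemmer("ing$|s$|e$|able$|ing|es$", min=4)
for word in new_words:
    print(word, "-->", reg_stemmer.stem(word))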

eating --> eat
eats --> eat
eaten --> eaten
writing --> writ
writes --> writ
programming --> programm
programs --> program
history --> history
finally --> finally
finalized --> finalized
congratulations --> congratulation
sitting --> sitt
fairly --> fairly
sportingly --> sportly
going --> go
goes --> go
ingeating --> eat
ingeat --> eat
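
The output above is consistent with a stemmer that simply deletes every regex match from words of at least `min` characters. A minimal sketch of that behavior using only the standard `re` module follows; `regexp_stem` and `min_length` are hypothetical names written for illustration, not NLTK's own implementation.

```python
import re

# Same pattern as the RegexpStemmer example above.
PATTERN = re.compile(r"ing$|s$|e$|able$|ing|es$")


def regexp_stem(word, min_length=4):
    """Delete every match of PATTERN; leave words shorter than min_length alone."""
    if len(word) < min_length:
        return word
    return PATTERN.sub("", word)


for word in ["writes", "sportingly", "ingeating", "goes", "its"]:
    print(word, "-->", regexp_stem(word))
```

Because the pattern contains an unanchored `ing`, the `ing$` alternative is redundant, and `sportingly` loses the `ing` in the middle of the word. Dropping the unanchored alternative would limit the stemmer to suffixes.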


## Snowball Stemmer

In [31]:
from nltk.stem import SnowballStemmer

# The Snowball stemmer is language-specific, so the language is passed by name
snow_stemmer = SnowballStemmer("english")
for word in words:
    print(word, "-->", snow_stemmer.stem(word))

eating --> eat
eats --> eat
eaten --> eaten
writing --> write
writes --> write
programming --> program
programs --> program
history --> histori
finally --> final
finalized --> final
congratulations --> congratul
sitting --> sit
fairly --> fair
sportingly --> sport
going --> go
goes --> goe
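
A quick way to see where the Porter and Snowball stemmers disagree is to diff their results. The sketch below hard-codes the outputs printed in the two listings above rather than calling NLTK, so it runs without any dependencies; the variable names are illustrative.

```python
# Input words and the stems copied from the Porter and Snowball output listings.
sample_words = [
    "eating", "eats", "eaten", "writing", "writes", "programming",
    "programs", "history", "finally", "finalized", "congratulations",
    "sitting", "fairly", "sportingly", "going", "goes",
]
porter_stems = [
    "eat", "eat", "eaten", "write", "write", "program",
    "program", "histori", "final", "final", "congratul",
    "sit", "fairli", "sportingli", "go", "goe",
]
snowball_stems = [
    "eat", "eat", "eaten", "write", "write", "program",
    "program", "histori", "final", "final", "congratul",
    "sit", "fair", "sport", "go", "goe",
]

# Keep only the words for which the two stemmers produced different stems.
differences = [
    (word, porter, snowball)
    for word, porter, snowball in zip(sample_words, porter_stems, snowball_stems)
    if porter != snowball
]

for word, porter, snowball in differences:
    print(f"{word}: porter={porter}, snowball={snowball}")
```

On this word list the two stemmers differ only for the `-ly` adverbs, which Snowball strips and Porter rewrites to `-li`. Neither recovers `eat` from `eaten` or `go` from `goes`.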
