## Stemming
In Natural Language Processing (NLP), stemming is the process of reducing a word to its stem (a root or base form) by stripping affixes, usually suffixes. The result is not necessarily a valid dictionary word.
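To make the idea concrete, here is a minimal sketch of suffix stripping in plain Python. It is a toy, not the Porter algorithm: the rule list `SUFFIXES`, the function `naive_stem`, and the `min_stem_len` parameter are all made up for illustration.

```python
# Toy suffix stripper: illustrative only, NOT the Porter algorithm.
SUFFIXES = ("ing", "ed", "es", "s")  # hypothetical rule list, checked in order

def naive_stem(word, min_stem_len=3):
    """Strip the first matching suffix, keeping at least min_stem_len characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word

for w in ["eating", "eats", "writes", "finalized", "history"]:
    print(w, "---->", naive_stem(w))
```

Even this crude version maps "eating" and "eats" to "eat", but it also turns "writes" into "writ" and "finalized" into "finaliz", which shows why a stem need not be a real word.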

#### Common stemming algorithms
1. Porter Stemmer (most common, simple, rule-based)
2. Snowball Stemmer (an improved version of Porter)
3. Lancaster Stemmer (more aggressive, sometimes to the point of over-stemming)

In [7]:
## Classification problem:
## decide whether a product review is positive or negative.
## Reviews contain inflected forms that we want to map to one base form:
## eating, eats, eaten ----> eat; going, gone, goes ----> go

words=["eating","eats","eaten","writing","writes","programming","programs","history","finally","finalized", "happy", "studing"]

#### PorterStemmer

In [2]:
from nltk.stem import PorterStemmer

stemming=PorterStemmer()

In [8]:
for word in words:
    print(word+" ----> "+stemming.stem(word))

eating ----> eat
eats ----> eat
eaten ----> eaten
writing ----> write
writes ----> write
programming ----> program
programs ----> program
history ----> histori
finally ----> final
finalized ----> final
happy ----> happi
studing ----> stude


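Notice how "histori", "happi", and "stude" are not valid words: stemming does not care about correct spelling, it just chops words down to a base form. Note also that "eaten" is left unchanged, and that the misspelled input "studing" is stemmed without complaint.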
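One reason this matters for the review-classification task is that inflected forms collapse into shared stems, which shrinks the vocabulary a model has to learn. The sketch below groups the words by stem. The word-to-stem pairs are copied by hand from the Porter output above rather than recomputed, so it runs without NLTK. The names `porter_output` and `groups` are illustrative.

```python
from collections import defaultdict

# word -> stem pairs copied from the Porter output printed above
porter_output = {
    "eating": "eat", "eats": "eat", "eaten": "eaten",
    "writing": "write", "writes": "write",
    "programming": "program", "programs": "program",
    "history": "histori",
    "finally": "final", "finalized": "final",
    "happy": "happi", "studing": "stude",
}

# Group the original words under their shared stem
groups = defaultdict(list)
for word, stem in porter_output.items():
    groups[stem].append(word)

print(len(porter_output), "words ->", len(groups), "distinct stems")
for stem, members in groups.items():
    print(stem, "<-", members)
```

The twelve input words reduce to eight stems. "eaten" stays in a group of its own, so stemming does not unify every inflected form.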

In [9]:
stemming.stem('congratulations'), stemming.stem("sitting")

('congratul', 'sit')

#### RegexpStemmer
RegexpStemmer removes any part of a word that matches a regular expression you supply, typically suffixes (or even prefixes), instead of relying on predefined linguistic rules as the Porter stemmer does. Words shorter than `min` characters are returned unchanged.

In [None]:
from nltk.stem import RegexpStemmer

reg_stemmer=RegexpStemmer('ing$|s$|e$|able$', min=4)

In [13]:
reg_stemmer.stem('eating')


'eat'

In [12]:
reg_stemmer.stem('ingeating')

'ingeat'
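The two results above follow from the `$` anchor: each alternative in the pattern only matches at the end of the word, so the leading "ing" in "ingeating" survives. A rough pure-Python approximation of this behaviour, assuming the stemmer simply deletes regex matches and skips short words (the function `regexp_stem` and its parameter names are illustrative, not NLTK's API):

```python
import re

def regexp_stem(word, pattern=r"ing$|s$|e$|able$", min_length=4):
    """Remove every match of `pattern`; leave words shorter than min_length alone."""
    if len(word) < min_length:
        return word
    return re.sub(pattern, "", word)

print(regexp_stem("eating"))     # eat
print(regexp_stem("ingeating"))  # ingeat: the leading "ing" is not at the end
print(regexp_stem("was"))        # was: shorter than min_length, left unchanged
```

Dropping the `$` anchors would remove matches anywhere in the word, so `regexp_stem("ingeating", pattern="ing")` gives "eat".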

#### SnowballStemmer
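SnowballStemmer in NLTK implements the Snowball (Porter2) algorithm, a refinement of the original Porter stemmer. Unlike PorterStemmer, it takes a language name and supports several languages besides English.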

In [14]:
from nltk.stem import SnowballStemmer
snowballsstemmer=SnowballStemmer('english')

In [None]:
for word in words:
    print(word+"---->"+snowballsstemmer.stem(word))

eating---->eat
eats---->eat
eaten---->eaten
writing---->write
writes---->write
programming---->program
programs---->program
history---->histori
finally---->final
finalized---->final
happy---->happi
studing---->stude


In [None]:
# Compare the Porter and Snowball stemmers on the same words
stemming.stem('fairly'), stemming.stem('sportingly')

('fairli', 'sportingli')

In [16]:
snowballsstemmer.stem("fairly"),snowballsstemmer.stem("sportingly")

('fair', 'sport')

In [17]:
snowballsstemmer.stem('goes')

'goe'