## Stemming

Stemming strips suffixes from a word to reduce it to a base form, or stem. The stem does not have to be a valid dictionary word. It's helpful in tasks like text classification, search, and information retrieval, where different forms of a word (e.g., "connect", "connected", "connection") should be treated as equivalent.

In [13]:
words = ['write','eaten','ate','eating','writing','written','find','found','findings']

In [14]:
## Porter Stemmer
from nltk.stem import PorterStemmer
stemming = PorterStemmer()
for word in words:
    print(word,'---->',stemming.stem(word))

write ----> write
eaten ----> eaten
ate ----> ate
eating ----> eat
writing ----> write
written ----> written
find ----> find
found ----> found
findings ----> find
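
In the output above, irregular forms such as "ate" and "found" pass through unchanged, because the Porter algorithm only strips suffixes. For regular inflections, the equivalence described at the start of this section can be checked with a small sketch. The helper name `group_by_stem` is an assumption made for illustration and is not part of NLTK.

```python
from collections import defaultdict
from nltk.stem import PorterStemmer

def group_by_stem(words):
    """Group words that share the same Porter stem (hypothetical helper)."""
    stemmer = PorterStemmer()
    groups = defaultdict(list)
    for word in words:
        # Words with the same stem land in the same bucket.
        groups[stemmer.stem(word)].append(word)
    return dict(groups)

groups = group_by_stem(['connect', 'connected', 'connection', 'eating', 'eat'])
for stem, members in groups.items():
    print(stem, '<----', members)
```

A search index built on the stems rather than the raw words would then match "connection" when the query is "connected".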


## Regexp Stemmer

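RegexpStemmer is a simple stemmer that removes every part of a word matching a user-supplied regular expression, which is usually written to match suffixes (or prefixes). It doesn’t use linguistic rules or a large set of suffix-stripping rules like PorterStemmer, so it’s more manual and customizable. Words shorter than `min` characters are returned unchanged.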

In [None]:
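from nltk.stem import RegexpStemmer
# 'ing' has no '$' anchor, so it is removed anywhere in the word, not only at the end.
# Words shorter than min=4 characters are returned unchanged.
stemming = RegexpStemmer('ing|able$|en$|ed$', min=4)
stemming.stem('ingeating')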

'eat'
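
The result `'eat'` comes from the unanchored `ing` alternative matching at both the start and the end of the word. The sketch below reproduces that behavior with only the standard `re` module, so the effect of the `$` anchor is easy to see. The function `regexp_stem` is a hypothetical stand-in that approximates what `RegexpStemmer.stem` does. It is not NLTK's own code.

```python
import re

def regexp_stem(word, pattern, min_length=0):
    """Hypothetical stand-in approximating RegexpStemmer.stem()."""
    # Words shorter than min_length are returned unchanged.
    if len(word) < min_length:
        return word
    # Delete every match of the pattern, wherever it occurs in the word.
    return re.sub(pattern, '', word)

# Unanchored 'ing' is stripped from both the start and the end.
unanchored = regexp_stem('ingeating', 'ing|able$|en$|ed$', min_length=4)
# Anchored 'ing$' is stripped only from the end.
anchored = regexp_stem('ingeating', 'ing$|able$|en$|ed$', min_length=4)
# A three-letter word is below the minimum length and is left alone.
short = regexp_stem('ing', 'ing$', min_length=4)

print(unanchored)  # eat
print(anchored)    # ingeat
print(short)       # ing
```

Anchoring every alternative with `$` is usually the safer choice when the goal is suffix stripping only.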

## Snowball Stemmer

The Snowball stemmer (its English algorithm is also known as Porter2) is a refinement of the Porter stemmer and is widely used for stemming. It also supports several languages other than English.

In [None]:
from nltk.stem import SnowballStemmer
# Create an English Snowball stemmer and keep a reference so it can be reused
snowball_stemmer = SnowballStemmer('english')
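
To see how the two stemmers differ in practice, a minimal sketch like the one below runs both on the same words and lists the languages Snowball supports. The word list is an assumption chosen for illustration. The variable names `porter` and `snowball` are likewise illustrative.

```python
from nltk.stem import PorterStemmer, SnowballStemmer

porter = PorterStemmer()
snowball = SnowballStemmer('english')

# Tuple of language names that SnowballStemmer accepts
print(SnowballStemmer.languages)

# Print the Porter stem and the Snowball stem side by side
for word in ['generously', 'fairly', 'eating']:
    print(word, '---->', porter.stem(word), '|', snowball.stem(word))
```

Adverbs ending in "-ly" are a typical case where the two algorithms can disagree, while simple inflections such as "eating" usually come out the same.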