## Import Libraries

In [1]:
from nltk.stem import PorterStemmer, RegexpStemmer, SnowballStemmer

## Stemming 

Stemming is a text preprocessing technique in natural language processing (NLP) that reduces a word to its root form, or "stem", by removing affixes. This normalizes related word forms to a common token. The stem is not always a valid dictionary word.

In [2]:
words = ["eating", "eats", "eaten", "writing", "writes", "programming", "programs",
        "history", "finally", "finalize"]

### Porter Stemming

The Porter stemmer is a suffix-stripping algorithm. It applies a fixed sequence of predefined rules to reduce words to their stems.
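To make "predefined rules" concrete, here is a minimal sketch of step 1a from Porter's original algorithm, which handles plural endings. The function name `porter_step_1a` is hypothetical and this is only one small step of the algorithm, not NLTK's implementation, so its results can differ from `PorterStemmer().stem`.

```python
def porter_step_1a(word):
    """Illustrative sketch of Porter step 1a: the first matching suffix rule wins."""
    # (suffix, replacement) pairs, checked from longest to shortest suffix
    rules = [("sses", "ss"), ("ies", "i"), ("ss", "ss"), ("s", "")]
    for suffix, replacement in rules:
        if word.endswith(suffix):
            return word[: len(word) - len(suffix)] + replacement
    return word  # no rule applies, so the word is left unchanged


for w in ["caresses", "ponies", "caress", "cats"]:
    print(f"{w} -----> {porter_step_1a(w)}")
```

The full algorithm chains several such steps, and later steps also check the shape of the remaining stem before stripping a suffix.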

In [3]:
porter_stemming = PorterStemmer()

In [4]:
for word in words:
    print(f"{word} -----> {porter_stemming.stem(word)}")

eating -----> eat
eats -----> eat
eaten -----> eaten
writing -----> write
writes -----> write
programming -----> program
programs -----> program
history -----> histori
finally -----> final
finalize -----> final


In [5]:
porter_stemming.stem("congratulations")

'congratul'

In [6]:
porter_stemming.stem("sitting")

'sit'

### RegexpStemmer

RegexpStemmer is a stemmer that uses a regular expression to identify morphological affixes. Any substring that matches the regular expression is removed. Words shorter than the min argument are returned unchanged.
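The same idea can be sketched with the standard `re` module alone. The helper `regexp_stem` and its parameter names are hypothetical and only approximate the behaviour described above; this is not NLTK's own code.

```python
import re


def regexp_stem(word, pattern=r"ing$|s$|e$|able$", min_length=4):
    """Illustrative sketch: delete whatever the pattern matches, skipping short words."""
    if len(word) < min_length:
        return word  # too short to stem safely
    return re.sub(pattern, "", word)


for w in ["writing", "eats", "sitting", "the"]:
    print(f"{w} -----> {regexp_stem(w)}")
```

Because the pattern only strips characters and knows nothing about spelling, it leaves doubled consonants in place ("sitting" becomes "sitt"). The minimum length keeps short words such as "the" intact.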

In [7]:
regular_expression = "ing$|s$|e$|able$"
minimum_length = 4

In [8]:
regular_expression_stemming = RegexpStemmer(
    regexp=regular_expression,
    min=minimum_length
)

In [9]:
for word in words:
    print(f"{word} -----> {regular_expression_stemming.stem(word)}")

eating -----> eat
eats -----> eat
eaten -----> eaten
writing -----> writ
writes -----> write
programming -----> programm
programs -----> program
history -----> history
finally -----> finally
finalize -----> finaliz


In [10]:
regular_expression_stemming.stem("congratulations")

'congratulation'

In [11]:
regular_expression_stemming.stem("sitting")

'sitt'

### Snowball Stemmer

The English Snowball stemmer is also known as the Porter2 stemming algorithm. It is a revised version of the Porter stemmer that fixes some of the original algorithm's issues.

In [12]:
snowball_stemmer = SnowballStemmer(language='english', ignore_stopwords=False)

In [13]:
for word in words:
    print(f"{word} -----> {snowball_stemmer.stem(word)}")

eating -----> eat
eats -----> eat
eaten -----> eaten
writing -----> write
writes -----> write
programming -----> program
programs -----> program
history -----> histori
finally -----> final
finalize -----> final


In [14]:
snowball_stemmer.stem("congratulations")

'congratul'

In [15]:
snowball_stemmer.stem("sitting")

'sit'

### Comparison of Porter and Snowball

In [16]:
porter_stemming.stem("fairly"), porter_stemming.stem("sportingly")

('fairli', 'sportingli')

In [17]:
snowball_stemmer.stem("fairly"), snowball_stemmer.stem("sportingly")

('fair', 'sport')
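To compare the two stemmers on a longer word list, a small helper can report only the words on which they disagree. This is a sketch: `stem_disagreements` is a hypothetical helper name, and it assumes NLTK is installed.

```python
from nltk.stem import PorterStemmer, SnowballStemmer


def stem_disagreements(words):
    """Return {word: (porter_stem, snowball_stem)} for words the two stemmers treat differently."""
    porter = PorterStemmer()
    snowball = SnowballStemmer("english")
    differences = {}
    for word in words:
        porter_stem = porter.stem(word)
        snowball_stem = snowball.stem(word)
        if porter_stem != snowball_stem:
            differences[word] = (porter_stem, snowball_stem)
    return differences


print(stem_disagreements(["eating", "history", "fairly", "sportingly"]))
```

For these four words, only "fairly" and "sportingly" should appear in the result, matching the outputs above.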