# ✅ **What is Stemming?**

**Stemming = cutting a word down to its base/root form by applying simple rules.**

It does NOT check grammar or dictionary meaning.
It only chops endings like:

* *ing*
* *ed*
* *s*
* *ly*

So it's fast, but the output is sometimes messy (the stem is not always a real word).
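To make the "just chop the ending" idea concrete, here is a toy suffix stripper. It is only a sketch: the `SUFFIXES` tuple, the `naive_stem` name, and the three-character minimum are assumptions made for illustration, and this is not the Porter algorithm, which applies many more rules and conditions.

```python
# Toy suffix stripper: illustrates rule-based chopping, NOT a real stemmer.
SUFFIXES = ("ing", "ed", "ly", "s")  # hypothetical rule list, checked in this order


def naive_stem(word: str) -> str:
    """Lowercase the word and strip the first matching suffix, if any."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words like "is" survive
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


for w in ["playing", "played", "plays", "quickly", "caring", "bus"]:
    print(w, "->", naive_stem(w))
```

Notice that `caring` comes out as `car`, not `care`. That kind of mistake is what "messy" means here, and real stemmers add extra rules to reduce (not eliminate) it.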

### Examples:

| Original | Stem output (Porter)                      |
| -------- | ----------------------------------------- |
| playing  | play                                      |
| studies  | studi                                     |
| movement | movement (left unchanged)                 |
| better   | better (unchanged, not mapped to "good")  |
| caring   | care                                      |

---

## Why do we use stemming? 

* To reduce different forms of a word into a single root
* To make text processing simpler
* Useful for **search, quick text matching, keyword extraction**

But it is not a good fit for grammar-sensitive NLP tasks.
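As a rough sketch of the search use case, the snippet below matches a query against a few documents by comparing stems instead of exact words. The `stem_set` helper, the sample `docs`, and the whitespace tokenization are assumptions for illustration only.

```python
from nltk.stem import PorterStemmer

porter = PorterStemmer()


def stem_set(text):
    """Return the set of Porter stems for a whitespace-tokenized text."""
    return {porter.stem(token) for token in text.lower().split()}


# Hypothetical mini "corpus"
docs = [
    "She plays the piano",
    "He played football yesterday",
    "The weather is nice",
]

query = "playing"
query_stems = stem_set(query)

# A document matches if it shares at least one stem with the query
matches = [doc for doc in docs if query_stems & stem_set(doc)]
print(matches)
```

The first two documents match even though neither contains the exact word `playing`, because `plays`, `played`, and `playing` all reduce to `play`.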

---

## Commonly used Stemmers

1. **Porter Stemmer**: most common
2. **Snowball Stemmer**: improved version of Porter
3. **Lancaster Stemmer**: aggressive (cuts too much)
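If you want to see how the three differ, one option is to run them side by side on the same words. This is a minimal sketch: the `sample_words` list is an arbitrary choice, and the outputs are printed rather than listed here because they depend on each stemmer's rules.

```python
from nltk.stem import LancasterStemmer, PorterStemmer, SnowballStemmer

stemmers = {
    "Porter": PorterStemmer(),
    "Snowball": SnowballStemmer("english"),
    "Lancaster": LancasterStemmer(),
}

# Arbitrary sample words, chosen only for illustration
sample_words = ["fairly", "history", "maximum", "organization"]

# Map each stemmer name to its list of stems
results = {
    name: [stemmer.stem(word) for word in sample_words]
    for name, stemmer in stemmers.items()
}

for name, stems in results.items():
    print(f"{name:<10}", stems)
```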

For now, we will focus on Porter and Snowball.

---

# **Stemming with NLTK**

### ▶ Install NLTK (if not installed)

In [None]:
pip install nltk

### ▶ Import and use stemmers

In [None]:
from nltk.stem import PorterStemmer, SnowballStemmer

# Create stemmer objects
porter = PorterStemmer()
snowball = SnowballStemmer("english")

words = ["playing", "studies", "movement", "caring", "happiest", "better", "history", "Fairly", "finalized"]

# Using Porter Stemmer
porter_stems = [porter.stem(word) for word in words]
print("Porter:", porter_stems)
print()

# Using Snowball Stemmer
snowball_stems = [snowball.stem(word) for word in words]
print("Snowball:", snowball_stems)

Porter: ['play', 'studi', 'movement', 'care', 'happiest', 'better', 'histori', 'fairli', 'final']

Snowball: ['play', 'studi', 'movement', 'care', 'happiest', 'better', 'histori', 'fair', 'final']


### Stemming a sentence

In [9]:
sentence = "I was playing and enjoying the beautiful moments."
print(sentence)

I was playing and enjoying the beautiful moments.


In [10]:
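import nltk
from nltk.tokenize import word_tokenize

nltk.download("punkt")  # one-time tokenizer data download; newer NLTK releases may need "punkt_tab"

tokens = word_tokenize(sentence)

print(tokens)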

['I', 'was', 'playing', 'and', 'enjoying', 'the', 'beautiful', 'moments', '.']


In [11]:
from nltk.stem import PorterStemmer

porter = PorterStemmer()

stems = [porter.stem(word) for word in tokens]
print(stems)

['i', 'wa', 'play', 'and', 'enjoy', 'the', 'beauti', 'moment', '.']
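The tokenize-then-stem steps can also be wrapped in one small helper. The sketch below swaps `word_tokenize` for a plain regular expression, so it needs no extra NLTK data, but it also drops punctuation and handles edge cases less carefully. The `stem_sentence` name and the regex are assumptions for illustration.

```python
import re

from nltk.stem import PorterStemmer

porter = PorterStemmer()


def stem_sentence(text):
    """Split text into word tokens with a regex, then Porter-stem each one."""
    # Keeps runs of letters and apostrophes; punctuation is discarded
    tokens = re.findall(r"[A-Za-z']+", text)
    return [porter.stem(token) for token in tokens]


print(stem_sentence("I was playing and enjoying the beautiful moments."))
```

The result is the same list of stems as above, minus the final `'.'` token.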



## 🟩 When to use stemming?

Use stemming if:

✔ You don't care about grammar

✔ You want high speed

✔ You're building something simple (search, keywords)


Avoid stemming if:

✘ You need clean and correct words

✘ You are doing grammar-sensitive NLP modeling

✘ You want dictionary forms (then you need **lemmatization**)
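One quick way to see why stemming can hurt when you need correct words is to check whether different words collapse into the same stem. The sketch below uses three related but distinct words; the word list is an assumption picked for illustration, and the stems are printed rather than listed here.

```python
from nltk.stem import PorterStemmer

porter = PorterStemmer()

# Related words with clearly different meanings
related = ["universe", "university", "universal"]

stems = {word: porter.stem(word) for word in related}
print(stems)

# True when at least two different words share one stem
collisions = len(set(stems.values())) < len(related)
print("Stem collisions:", collisions)
```

When different words share a stem like this, a search or model can no longer tell them apart, which is the trade-off behind the "avoid" list above.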

---
