## Stemming
Stemming is the process of reducing a word to its stem, the base form to which suffixes and prefixes attach. Unlike a lemma, a stem is not guaranteed to be a valid dictionary word (for example, "history" becomes "histori" below). Stemming is a common preprocessing step in **natural language understanding (NLU)** and **natural language processing (NLP)**.
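
To make the idea of stripping affixes concrete, here is a minimal sketch of a naive suffix stripper. It is not the Porter algorithm: the suffix list and the `naive_stem` helper are hypothetical and only illustrate why stems can come out as non-words.

```python
# Toy suffix stripper: illustrative only, NOT the Porter algorithm.
# The suffix list is a hypothetical choice, ordered longest first.
SUFFIXES = ("ing", "ed", "es", "s")

def naive_stem(word: str, min_stem: int = 3) -> str:
    """Strip the first matching suffix, keeping at least min_stem characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word  # no suffix matched, or the stem would be too short

for w in ["eating", "eats", "programs", "writes", "history"]:
    print(w + "---->" + naive_stem(w))
```

Note that "writes" comes out as "writ" here. Real stemmers such as the Porter stemmer add many extra rules to handle cases like this.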

In [5]:
words=["eating","eats","eaten","writing","writes","programming","programs","history","finally","finalized"]

In [6]:
from nltk.stem import PorterStemmer

In [7]:
stemming = PorterStemmer()

In [8]:
for word in words:
    print(word+"---->"+stemming.stem(word))

eating---->eat
eats---->eat
eaten---->eaten
writing---->write
writes---->write
programming---->program
programs---->program
history---->histori
finally---->final
finalized---->final


In [9]:
stemming.stem('congratulation')

'congratul'

In [10]:
stemming.stem('sitting')

'sit'

## RegexpStemmer class
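NLTK provides the RegexpStemmer class for building simple regular-expression stemmers. It takes a single regular expression and removes every part of the word that matches it, so whether prefixes or suffixes are stripped depends on how the pattern is anchored. The min argument sets the minimum word length to stem; shorter words are returned unchanged. In the example below every alternative ends in $, so only suffixes are removed.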

In [11]:
from nltk.stem import RegexpStemmer

In [12]:
reg_stemmer=RegexpStemmer('ing$|s$|e$|able$', min=4)

In [13]:
reg_stemmer.stem('eating')

'eat'

In [14]:
reg_stemmer.stem('ingeating')

'ingeat'
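
The behaviour above can be sketched with the standard re module. This is a rough approximation, assuming the stemmer amounts to a minimum-length check followed by a regex substitution; `regexp_stem` is a hypothetical helper, not part of NLTK.

```python
import re

def regexp_stem(word: str, pattern: str, min_len: int = 0) -> str:
    """Delete every match of pattern; words shorter than min_len are left alone."""
    if len(word) < min_len:
        return word
    return re.sub(pattern, "", word)

# Same pattern as the cell above: every alternative is anchored to the end of the word.
PATTERN = r"ing$|s$|e$|able$"

for w in ["eating", "ingeating", "readable", "is"]:
    print(w + "---->" + regexp_stem(w, PATTERN, min_len=4))
```

Because of the $ anchors, the leading "ing" in "ingeating" is kept, and "is" is returned unchanged because it is shorter than the minimum length.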

## Snowball Stemmer
The Snowball stemmer for English, also known as the Porter2 stemmer, is a revised version of the Porter stemmer that fixes several of its known issues. NLTK's SnowballStemmer also supports languages other than English, which is why the language name is passed to the constructor.

In [15]:
from nltk.stem import SnowballStemmer

In [16]:
snowballsstemmer=SnowballStemmer('english')

In [17]:
for word in words:
    print(word+"---->"+snowballsstemmer.stem(word))

eating---->eat
eats---->eat
eaten---->eaten
writing---->write
writes---->write
programming---->program
programs---->program
history---->histori
finally---->final
finalized---->final


In [18]:
stemming.stem("fairly"),stemming.stem("sportingly")

('fairli', 'sportingli')

In [19]:
snowballsstemmer.stem("fairly"),snowballsstemmer.stem("sportingly")

('fair', 'sport')

In [20]:
snowballsstemmer.stem('goes')

'goe'

In [21]:
stemming.stem('goes')

'goe'
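
When comparing stemmers by hand as above, a small helper that lists only the words on which they disagree can be convenient. The sketch below is one way to do it. `disagreements` is a hypothetical helper, and the two stand-in stemmers exist only so the block runs without NLTK.

```python
from typing import Callable, Dict, Iterable

Stemmer = Callable[[str], str]

def disagreements(words: Iterable[str],
                  stemmers: Dict[str, Stemmer]) -> Dict[str, Dict[str, str]]:
    """Return {word: {stemmer_name: stem}} for words the stemmers disagree on."""
    result = {}
    for word in words:
        stems = {name: stem(word) for name, stem in stemmers.items()}
        if len(set(stems.values())) > 1:  # more than one distinct stem
            result[word] = stems
    return result

# Stand-in stemmers (hypothetical), used only to keep the sketch self-contained.
def strip_ly(word: str) -> str:
    return word[:-2] if word.endswith("ly") else word

def strip_s(word: str) -> str:
    return word[:-1] if word.endswith("s") else word

toy = {"strip_ly": strip_ly, "strip_s": strip_s}
for word, stems in disagreements(["fairly", "goes", "eat"], toy).items():
    print(word, stems)
```

In this notebook the same helper could be called with {"porter": stemming.stem, "snowball": snowballsstemmer.stem}. Going by the outputs above, it would report "fairly" and "sportingly" but not "goes".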

## WordNet Lemmatizer
Lemmatization is similar to stemming, but its output, called a lemma, is a valid dictionary word rather than a truncated stem. The lemma is the base form of the word and carries the same meaning.

NLTK provides the WordNetLemmatizer class, a thin wrapper around the WordNet corpus. It uses the morphy() function of the WordNet CorpusReader class to find a lemma. Because the lookup depends on the part of speech, lemmatize() accepts a pos argument that defaults to noun ('n'). Let us look at an example:

In [2]:
from nltk.stem import WordNetLemmatizer

In [3]:
lemmatizer=WordNetLemmatizer()

In [7]:
import nltk
nltk.download('wordnet')      # Download WordNet corpus
nltk.download('omw-1.4')      # Open Multilingual WordNet data; some NLTK releases need it to load WordNet


[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\ACER\AppData\Roaming\nltk_data...
[nltk_data] Downloading package omw-1.4 to
[nltk_data]     C:\Users\ACER\AppData\Roaming\nltk_data...


True

In [8]:
# POS codes accepted by lemmatize():
#   noun      -> 'n'
#   verb      -> 'v'
#   adjective -> 'a'
#   adverb    -> 'r'
lemmatizer.lemmatize("going",pos='v')

'go'
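
The one-letter POS codes listed above are what lemmatize() expects, whereas taggers such as nltk.pos_tag return Penn Treebank tags like 'VBG' or 'JJ'. A common way to bridge the two is a small mapping function. The sketch below uses a hypothetical helper, `to_wordnet_pos`, and assumes that looking at the first letter of the tag is good enough.

```python
def to_wordnet_pos(treebank_tag: str) -> str:
    """Map a Penn Treebank tag to a WordNet POS code ('n', 'v', 'a' or 'r')."""
    if treebank_tag.startswith("J"):   # JJ, JJR, JJS: adjectives
        return "a"
    if treebank_tag.startswith("V"):   # VB, VBD, VBG, ...: verbs
        return "v"
    if treebank_tag.startswith("R"):   # RB, RBR, RBS: adverbs
        return "r"
    return "n"                         # fall back to noun, the lemmatizer's default

for tag in ["VBG", "JJ", "RB", "NNS"]:
    print(tag + "---->" + to_wordnet_pos(tag))
```

The result can then be passed as the pos argument, for example lemmatizer.lemmatize("going", pos=to_wordnet_pos("VBG")).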

In [9]:
words=["eating","eats","eaten","writing","writes","programming","programs","history","finally","finalized"]


In [10]:
for word in words:
    print(word+"---->"+lemmatizer.lemmatize(word,pos='v'))

eating---->eat
eats---->eat
eaten---->eat
writing---->write
writes---->write
programming---->program
programs---->program
history---->history
finally---->finally
finalized---->finalize


In [11]:
lemmatizer.lemmatize('goes',pos='v')

'go'

In [12]:
lemmatizer.lemmatize("fairly",pos='v'),lemmatizer.lemmatize("sportingly")


('fairly', 'sportingly')