# Stemming
Often when searching text for a certain keyword, it helps if the search returns variations of the word. For instance, searching for "boat" might also return "boats" and "boating". Here, "boat" would be the **stem** for [boat, boater, boating, boats].

Stemming is a somewhat crude method for cataloging related words; it essentially chops off letters from the end until the stem is reached. This works fairly well in most cases, but unfortunately English has many exceptions where a more sophisticated process is required. In fact, spaCy doesn't include a stemmer, opting instead to rely entirely on lemmatization. For those interested, there's some background on this decision [here](https://github.com/explosion/spaCy/issues/327). We discuss the virtues of *lemmatization* in the next section.
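To make the "chop off letters from the end" idea concrete, here is a minimal sketch of a naive suffix stripper. It is not how NLTK's stemmers work internally; the suffix list, the `naive_stem` name, and the `min_stem_len` parameter are all assumptions made up for this illustration.

```python
# Hypothetical suffix list, chosen only for this illustration.
SUFFIXES = ("ing", "er", "s")


def naive_stem(word, suffixes=SUFFIXES, min_stem_len=3):
    """Strip the first matching suffix, keeping at least min_stem_len letters."""
    for suffix in suffixes:
        # Only strip when enough of the word is left to be a plausible stem.
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


for word in ["boat", "boater", "boating", "boats", "ran", "easily"]:
    print(word + "  ----->  " + naive_stem(word))
```

With this toy rule set, "boater", "boating" and "boats" all reduce to "boat", while "ran" and "easily" come back unchanged. Irregular forms like "ran" are exactly the exceptions where plain suffix stripping falls short.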

Since spaCy has no stemmer, we'll use another popular NLP tool called **NLTK**, which stands for *Natural Language Toolkit*. For more information, see the NLTK site: https://www.nltk.org/

In [1]:
import nltk


In [2]:
from nltk.stem.porter import PorterStemmer

In [3]:
p_stemmer = PorterStemmer()

In [4]:
words = ['run', 'runner', 'ran', 'runs', 'easily', 'fairly']

![stemming1.png](../stemming1.png)

In [11]:
for word in words:
    print(word + '  ----->  ' + p_stemmer.stem(word))

run  ----->  run
runner  ----->  runner
ran  ----->  ran
runs  ----->  run
easily  ----->  easili
fairly  ----->  fairli


In [7]:
from nltk.stem.snowball import SnowballStemmer

In [8]:
s_stemmer = SnowballStemmer(language='english')

In [12]:
for word in words:
    print(word + '  ----->  ' + s_stemmer.stem(word))

run  ----->  run
runner  ----->  runner
ran  ----->  ran
runs  ----->  run
easily  ----->  easili
fairly  ----->  fair


In [13]:
words = ['generous', 'generation', 'generously', 'generate']

In [14]:
for word in words:
    print(word + '  ----->  ' + s_stemmer.stem(word))

generous  ----->  generous
generation  ----->  generat
generously  ----->  generous
generate  ----->  generat
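
The Snowball output above places "generous" and "generously" under one stem and "generation" and "generate" under another. As a rough sketch of how a keyword search might use this, the snippet below groups words by their stem. `group_by_stem` is a hypothetical helper written for this example, not part of NLTK; it only assumes the stemmer object exposes a `stem()` method, as `SnowballStemmer` does.

```python
from collections import defaultdict

from nltk.stem.snowball import SnowballStemmer


def group_by_stem(words, stemmer):
    """Map each stem to the list of input words that reduce to it."""
    groups = defaultdict(list)
    for word in words:
        groups[stemmer.stem(word)].append(word)
    return dict(groups)


s_stemmer = SnowballStemmer(language='english')
words = ['generous', 'generation', 'generously', 'generate']
groups = group_by_stem(words, s_stemmer)

for stem, members in groups.items():
    print(stem + '  ----->  ' + ', '.join(members))
```

A search for any word in a group could then return every other word stored under the same stem.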
