# Stop Words
Words like "a" and "the" appear so frequently that they don't require tagging as thoroughly as nouns, verbs and modifiers. We call these *stop words*, and they can be filtered from the text to be processed. spaCy holds a built-in list of some 326 English stop words; the exact count depends on the installed spaCy version.

In [1]:
# Perform standard imports:
import spacy
nlp = spacy.load('en_core_web_sm')

In [2]:
# Print the set of spaCy's default stop words (remember that sets are unordered):
print(nlp.Defaults.stop_words)

{'get', 'onto', 'meanwhile', 'anything', 'beside', 'yet', 'again', 'all', 'why', 'whatever', 'take', 'their', 'both', 'almost', 'neither', 'then', 'first', 'part', 'full', 'when', 'it', 'the', 'only', 'here', 'unless', 'become', '‘d', 'whereupon', 'via', 'whether', 'not', 'very', 'can', 'around', 'beyond', 'by', 'since', 'will', 'everything', 'now', 'same', 'therefore', 'did', 'your', 'forty', 'thus', 'more', 'those', 'former', 'might', 'nine', 'whereby', 'no', 'has', 'often', 'an', 'there', 'hereupon', 'were', 'bottom', 'how', 'two', 'fifty', 'anyhow', 'its', 'others', 'own', 'however', 'seeming', 'into', 'latterly', 'that', 'herself', 'every', 'to', 'could', "'s", 'was', 'whereas', '’d', 'anywhere', 'are', 'eight', 'sometime', "'m", 'even', 'as', 'empty', 'enough', 'whenever', 'ours', 'and', 'my', 'yourselves', 'among', "'ve", 'what', 'but', 'or', 'any', 'perhaps', "'re", 'above', 'most', '’s', 'behind', 'made', 'her', 'few', 'for', 'must', 'everywhere', 'upon', 'twelve', 'name', 'be

In [3]:
len(nlp.Defaults.stop_words)

326

## To see if a word is a stop word

In [4]:
nlp.vocab['is'].is_stop

True

In [5]:
nlp.vocab['mystery'].is_stop

False
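
The same `is_stop` flag is available on every token of a processed document, which is how stop words are usually filtered out of a text. The following is a minimal sketch rather than the only approach. It assumes a blank English pipeline (`spacy.blank('en')`) so that it runs without a downloaded model, and the sample sentence and the names `blank_nlp` and `content_words` are made up for illustration.

```python
import spacy

# Assumption: a blank English pipeline is enough here, because stop-word
# flags are lexical attributes and do not depend on a trained model.
blank_nlp = spacy.blank('en')

# Hypothetical sample sentence, all lowercase.
doc = blank_nlp('the mystery is in the lighthouse')

# Keep only the tokens that are not flagged as stop words.
content_words = [token.text for token in doc if not token.is_stop]
print(content_words)
```

With the default stop word set, this should leave only `mystery` and `lighthouse`.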

## To add a stop word
There may be times when you wish to add a stop word to the default set. Perhaps you decide that `'btw'` (common shorthand for "by the way") should be considered a stop word.

In [6]:
nlp.Defaults.stop_words.add('btw')

In [7]:
nlp.vocab['btw'].is_stop = True

In [8]:
len(nlp.Defaults.stop_words)

327

<font color=green>When adding stop words, always use lowercase. In recent spaCy versions, the stop word check lowercases a token's text before looking it up in the stop word set, so entries in the set should be lowercase as well.</font>
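
The two steps used above (updating the default set and flagging the lexeme) can be wrapped in a small helper that also enforces the lowercase rule. This is only a sketch: `add_stop_words` is a hypothetical helper, not part of spaCy, and the blank English pipeline is an assumption made so that the example runs without a downloaded model.

```python
import spacy

def add_stop_words(nlp, words):
    """Hypothetical helper: register each word, lowercased, as a stop word."""
    for word in words:
        word = word.lower()                # stop words are stored in lowercase
        nlp.Defaults.stop_words.add(word)  # update the default set
        nlp.vocab[word].is_stop = True     # flag the lexeme in the vocab

# Assumption: a blank pipeline stands in for the loaded model.
blank_nlp = spacy.blank('en')
add_stop_words(blank_nlp, ['BTW', 'imo'])

print(blank_nlp.vocab['btw'].is_stop)
print(blank_nlp.vocab['imo'].is_stop)
```

Passing the pipeline in as an argument keeps the helper usable with the notebook's own `nlp` object as well.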

## To remove a stop word
Alternatively, you may decide that `'beyond'` should not be considered a stop word.

In [9]:
nlp.Defaults.stop_words.remove('beyond')

In [10]:
nlp.vocab['beyond'].is_stop = False
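
One detail worth noting: `set.remove` raises a `KeyError` when the word is not in the set, for example if the cell is run twice. A possible variant, sketched here with a hypothetical helper `remove_stop_word` and a blank English pipeline (both are assumptions, not part of the original notebook), uses `set.discard` so that repeated calls are harmless.

```python
import spacy

def remove_stop_word(nlp, word):
    """Hypothetical helper: unregister a stop word without raising if absent."""
    word = word.lower()
    nlp.Defaults.stop_words.discard(word)  # no KeyError if already removed
    nlp.vocab[word].is_stop = False        # clear the flag on the lexeme

# Assumption: a blank pipeline stands in for the loaded model.
blank_nlp = spacy.blank('en')
remove_stop_word(blank_nlp, 'beyond')
remove_stop_word(blank_nlp, 'beyond')      # second call does not raise

print(blank_nlp.vocab['beyond'].is_stop)
```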