# Stop Words
Words like "a" and "the" appear so frequently that they don't need to be tagged as thoroughly as nouns, verbs and modifiers. We call these *stop words*, and they can be filtered out of the text before further processing. spaCy ships with a built-in list of 326 English stop words (as of 05/27/2020).
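To make the filtering idea concrete before loading any model, here is a minimal plain-Python sketch. It assumes a tiny hand-picked stop word set (`SAMPLE_STOP_WORDS`, hypothetical and much smaller than spaCy's list) and naive whitespace tokenization; the helper name `remove_stop_words` is also an illustration, not part of spaCy.

```python
# Hypothetical, hand-picked stop words for illustration only; not spaCy's list.
SAMPLE_STOP_WORDS = {"a", "the", "on", "is", "and"}


def remove_stop_words(text, stop_words=SAMPLE_STOP_WORDS):
    """Return the whitespace-separated tokens of `text` that are not stop words."""
    # Compare in lowercase so that "The" and "the" are treated alike.
    return [tok for tok in text.split() if tok.lower() not in stop_words]


filtered = remove_stop_words("The cat sat on a mat")
print(filtered)  # ['cat', 'sat', 'mat']
```

With spaCy, the same kind of filter is usually written against each token's `is_stop` attribute rather than a hand-built set.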

In [56]:
# Perform standard imports:
import spacy
nlp = spacy.load('en_core_web_sm')

In [57]:
# Print the set of spaCy's default stop words (remember that sets are unordered):
print(nlp.Defaults.stop_words)

{'’d', 'as', 'when', 'put', 'though', 'beside', 'three', 'either', 'being', 'had', 'our', 'wherever', 'us', 'but', 'my', 'this', 'also', 'we', 'sometime', 'her', 'no', 'might', 'was', 'amount', 'five', '’ll', 'who', 'yourselves', 'everyone', 'doing', 'almost', 'empty', 'himself', 'must', 'noone', 'except', 'more', 'mostly', 'i', 'whatever', 'beyond', 'anything', 'they', 'indeed', 'seem', 'although', 'afterwards', 'done', 'per', 'back', 'namely', 'these', 'all', 'toward', 'above', 'those', 'them', 'down', 'there', 'call', 'four', '’re', 'already', 'anywhere', '‘ve', "'re", 'using', 'because', 'last', 'used', 'neither', 'fifteen', 'nothing', 'top', 'part', 'whereby', 'somehow', 'up', 'elsewhere', 'every', 'first', 'hereupon', 'however', 'thus', 'besides', 'below', 'none', 'via', 'well', 'by', 'whether', 'itself', 'move', 'not', 'everywhere', 'their', 'serious', 'an', 'herein', 'myself', 'over', 'ever', 'if', 'least', 'out', 'latter', 'something', 'still', 'until', 'him', 'am', 'fifty', '

In [58]:
len(nlp.Defaults.stop_words)

326

## To see if a word is a stop word

In [59]:
nlp.vocab['myself'].is_stop

True

In [60]:
nlp.vocab['mystery'].is_stop

False

## To add a stop word
There may be times when you wish to add a stop word to the default set. Perhaps you decide that `'btw'` (common shorthand for "by the way") should be considered a stop word.

In [61]:
# Add the word to the set of stop words. Use lowercase!
nlp.Defaults.stop_words.add('btw')

# Set the is_stop flag on the lexeme
nlp.vocab['btw'].is_stop = True

In [62]:
len(nlp.Defaults.stop_words)

327

In [63]:
nlp.vocab['btw'].is_stop

True

<font color=red>When adding stop words, always use lowercase. spaCy's default stop word check lowercases the text before looking it up in the stop word set, so an entry containing uppercase letters would never match.</font>
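The lowercase rule can be illustrated with a simplified stand-in for a case-insensitive lookup. This is a sketch, assuming the check lowercases the text before testing set membership; the function `is_stop` below is a hypothetical helper written for this example, not spaCy's own implementation.

```python
def is_stop(text, stop_words):
    """Simplified stand-in for a case-insensitive stop word check."""
    # The text is lowercased first, so only lowercase entries can ever match.
    return text.lower() in stop_words


lowercase_entry = {"btw"}
uppercase_entry = {"BTW"}

print(is_stop("BTW", lowercase_entry))  # True: "BTW" is lowercased to "btw"
print(is_stop("btw", uppercase_entry))  # False: "btw" is not in {"BTW"}
print(is_stop("BTW", uppercase_entry))  # False: the lowercased "btw" still misses "BTW"
```

Under that assumption, an uppercase entry is dead weight in the set: no input can match it.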

## To remove a stop word
Alternatively, you may decide that `'beyond'` should not be considered a stop word.

In [64]:
# Remove the word from the set of stop words
nlp.Defaults.stop_words.remove('beyond')

# Clear the is_stop flag on the lexeme
nlp.vocab['beyond'].is_stop = False

In [65]:
len(nlp.Defaults.stop_words)

326

In [66]:
nlp.vocab['beyond'].is_stop

False

#### Reset to the original version

In [55]:
nlp.Defaults.stop_words.add('beyond')
nlp.vocab['beyond'].is_stop = True
nlp.Defaults.stop_words.remove('btw')
nlp.vocab['btw'].is_stop = False
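Undoing each change by hand is easy to get wrong once several words have been added or removed. One alternative, sketched here on a plain Python set, is to snapshot the set and restore it automatically. The context manager `temporary_stop_words` and its `add`/`remove` parameters are hypothetical names introduced for this example; they are not part of spaCy.

```python
from contextlib import contextmanager


@contextmanager
def temporary_stop_words(stop_words, add=(), remove=()):
    """Temporarily modify a mutable set and restore its contents on exit."""
    snapshot = set(stop_words)            # copy of the original contents
    stop_words.update(add)                # add the extra words in place
    stop_words.difference_update(remove)  # drop the unwanted words in place
    try:
        yield stop_words
    finally:
        # Restore in place so other references to the same set see the reset.
        stop_words.clear()
        stop_words.update(snapshot)


words = {"beyond", "the", "a"}
with temporary_stop_words(words, add={"btw"}, remove={"beyond"}) as active:
    print(sorted(active))  # ['a', 'btw', 'the']
print(sorted(words))  # ['a', 'beyond', 'the']
```

If this pattern were applied to `nlp.Defaults.stop_words`, the `is_stop` flags on the affected lexemes would still need to be set and reset separately, as in the cells above.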