# Stop Words
Words like "a" and "the" appear so frequently that they don't require tagging as thoroughly as nouns, verbs and modifiers. We call these *stop words*, and they can be filtered from the text to be processed. spaCy ships with a built-in set of English stop words; the version used here has 326, and the exact count varies between releases.

In [0]:
# Perform standard imports:
import spacy
nlp = spacy.load('en_core_web_sm')

In [2]:
# Print the set of spaCy's default stop words (remember that sets are unordered):
print(nlp.Defaults.stop_words)

{'other', 'top', 'had', 'at', 'her', 'otherwise', 'still', 'further', 'less', 'my', 'along', 'nine', 'really', 'throughout', 'as', 'various', 'will', 'even', 'anyway', 'ten', 'became', 'this', 'thru', 'except', 'formerly', 'nobody', 'hereafter', 'elsewhere', 'anyhow', 'by', 'yours', 'why', 'off', 'please', 'because', 'just', 'therein', 'without', 'so', 'does', 'never', 'twenty', 'us', 'wherever', 'they', 'more', 'anywhere', 'could', 'nevertheless', 'than', 'moreover', 'per', 'put', 'two', 'yourself', 'fifty', 'he', 'beyond', 'ca', 'or', 'quite', 'beforehand', "'ve", 'seem', 'him', 'hundred', 'empty', 'then', 'himself', 'during', 'with', 'was', 'all', 'hers', 'me', 'yet', 'nothing', 'show', 'whenever', 'whereby', 'your', "'re", 'any', 'sixty', 'anyone', 'bottom', 'cannot', 'keep', 'latter', 'seems', 'which', 'about', 'would', 'often', 'beside', 'one', 'we', 'n‘t', 'same', 'an', 'while', 'enough', 'see', 'around', "n't", 'such', 'made', "'d", 'always', 'myself', 'first', 'several', 'alth
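
Because a set has no defined order, the printout above can differ from run to run. A minimal sketch of one way to get a stable, alphabetical view is shown here. It assumes a blank English pipeline (`spacy.blank("en")`) instead of `en_core_web_sm`, purely so that no model download is needed, and the variable name `ordered` is illustrative.

```python
import spacy

# Blank English pipeline: carries the default stop word set,
# but needs no downloaded model.
nlp = spacy.blank("en")

# sorted() returns a new list and leaves the underlying set untouched
ordered = sorted(nlp.Defaults.stop_words)

# Show only the first few entries to keep the output short
print(ordered[:10])
```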

In [3]:
len(nlp.Defaults.stop_words)

326

## To see if a word is a stop word

In [4]:
nlp.vocab['myself'].is_stop

True

In [5]:
nlp.vocab['Mystery'].is_stop

False
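
Tokens in a processed `Doc` expose the same `is_stop` attribute as lexemes, so filtering stop words out of a text can be done with a list comprehension. The following is a minimal sketch, not the only way to do it. It assumes a blank English pipeline so that no model download is needed, and the sample sentence and the name `content_words` are made up for illustration.

```python
import spacy

# Blank English pipeline: tokenizer plus default stop words, no model needed
nlp = spacy.blank("en")

doc = nlp("a cat sat on the mat")

# Keep only the tokens that are not flagged as stop words
content_words = [token.text for token in doc if not token.is_stop]

print(content_words)  # ['cat', 'sat', 'mat']
```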

## To add a stop word
There may be times when you wish to add a stop word to the default set. Perhaps you decide that `'btw'` (common shorthand for "by the way") should be considered a stop word.

In [6]:
# Add the word to the set of stop words. Use lowercase!
nlp.Defaults.stop_words.add('btw')
# Set the stop_word tag on the lexeme
nlp.vocab['btw'].is_stop = True
# Confirm that the flag is now set
nlp.vocab['btw'].is_stop

True

In [7]:
len(nlp.Defaults.stop_words)

327


<font color=green>When adding stop words, always use lowercase. Recent spaCy versions lowercase a token's text before checking it against the stop word set, so an entry containing uppercase letters would never match.</font>

## To remove a stop word
Alternatively, you may decide that a word should no longer be considered a stop word. Here we remove `'btw'`, the word added in the previous section.

In [0]:
# Remove the word from the set of stop words
nlp.Defaults.stop_words.remove('btw')
# Remove the stop_word tag from the lexeme
nlp.vocab['btw'].is_stop = False

In [12]:
len(nlp.Defaults.stop_words)

326

In [20]:
nlp.vocab['btw'].is_stop

False
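
The add and remove steps each touch two places: the shared stop word set and the `is_stop` flag on the lexeme. One way to keep the two in sync is to wrap them in small helpers. This is a sketch under stated assumptions: `add_stop_words` and `remove_stop_words` are hypothetical names, not part of spaCy, and a blank English pipeline is used only to keep the example self-contained.

```python
import spacy

# Blank English pipeline, used only so the sketch needs no model download
nlp = spacy.blank("en")


def add_stop_words(nlp, words):
    """Hypothetical helper: register each word as a stop word."""
    for word in words:
        word = word.lower()                  # store stop words in lowercase
        nlp.Defaults.stop_words.add(word)    # update the shared set
        nlp.vocab[word].is_stop = True       # flag the lexeme


def remove_stop_words(nlp, words):
    """Hypothetical helper: undo add_stop_words for each word."""
    for word in words:
        word = word.lower()
        nlp.Defaults.stop_words.discard(word)  # discard() is a no-op if absent
        nlp.vocab[word].is_stop = False        # clear the flag on the lexeme


add_stop_words(nlp, ["btw"])
added = nlp.vocab["btw"].is_stop

remove_stop_words(nlp, ["btw"])
removed = nlp.vocab["btw"].is_stop

print(added, removed)  # True False
```

Using `discard()` rather than `remove()` means that asking to remove a word that is not in the set does not raise a `KeyError`.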

Great! Now you should be able to access spaCy's default set of stop words, and add or remove stop words as needed.
## Next up: Vocabulary and Matching