# Stop Words
Words like "a" and "the" appear so frequently that they don't require tagging as thoroughly as nouns, verbs and modifiers. We call these *stop words*, and they can be filtered from the text to be processed. spaCy holds a built-in list of English stop words (326 in the version used here, as the count below shows).

In [1]:
# Perform standard imports:
import spacy
nlp = spacy.load('en_core_web_sm')

In [3]:
print(nlp.Defaults.stop_words)


{'everything', 'nevertheless', 'again', 'onto', 'then', 'me', '‘ll', 'against', 'give', 'such', 'same', 'seem', 'go', 'further', 'where', 'thence', 'part', 'sixty', "'ll", 'still', 'had', 'never', 'amongst', '‘re', 'get', '’s', 'sometimes', 'bottom', 'she', '‘m', 'n’t', 'too', 'since', 'whence', 'be', 'serious', 'but', 'indeed', 'to', 'without', 'himself', 'can', 'least', 'any', 'put', 'about', 'am', '’d', 'noone', 'were', 'alone', 'another', 'beside', 'could', 'those', 'behind', 'if', 'from', 'someone', 'its', 'no', 'here', 'although', 'when', 'third', 'do', 'before', 'is', 'perhaps', 'for', 'own', 'via', 're', 'other', 'their', 'not', 'always', 'because', 'some', 'fifteen', 'done', 'whither', "'m", 'off', 'yourselves', 'so', 'did', 'either', 'down', 'may', 'already', 'seeming', 'three', 'a', 'beyond', 'at', 'these', 'the', 'he', 'only', 'how', 'also', 'them', 'out', 'all', 'and', 'whom', 'namely', 'after', 'together', 'cannot', 'moreover', 'whole', 'none', 'ours', 'with', 'thru', 'af

In [4]:
len(nlp.Defaults.stop_words)

326

## To see if a word is a stop word

In [5]:
nlp.vocab['myself'].is_stop

True

In [6]:
nlp.vocab['mystery'].is_stop

False
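
The introduction mentions that stop words can be filtered from a text before further processing. Here is a minimal sketch of that idea, using the same `is_stop` attribute on each token. It assumes a blank English pipeline (`spacy.blank('en')`) rather than `en_core_web_sm`, since the stop word list comes from the language defaults and no trained model is needed. The helper name `remove_stop_words` and the sample sentence are made up for illustration.

```python
import spacy

# Assumption: a blank English pipeline is enough here, because the stop word
# list belongs to the language defaults, not to a trained model.
nlp = spacy.blank("en")


def remove_stop_words(text):
    """Return the tokens of `text` that are neither stop words nor punctuation."""
    doc = nlp(text)
    return [token.text for token in doc if not token.is_stop and not token.is_punct]


# Hypothetical sample sentence, kept lowercase on purpose.
sample = "the mystery of the missing keys is still unsolved."
content_words = remove_stop_words(sample)
print(content_words)
```

The exact result depends on the stop word list shipped with your spaCy version, so treat the printed list as something to inspect rather than a fixed answer.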

## To add a stop word
There may be times when you wish to add a stop word to the default set. Perhaps you decide that `'btw'` (common shorthand for "by the way") should be considered a stop word.

In [7]:
# Add the word to the set of stop words. Use lowercase!
nlp.Defaults.stop_words.add('btw')

# Set the is_stop flag on the lexeme
nlp.vocab['btw'].is_stop = True

In [8]:
len(nlp.Defaults.stop_words)

327

In [9]:
nlp.vocab['btw'].is_stop

True

<font color=green>When adding stop words, always use lowercase, matching the entries already in the default set.</font>
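
If you have several words to add, the two steps above can be wrapped in a small helper that also applies the lowercase advice for you. This is only a sketch: `add_stop_words` is a hypothetical name, not part of spaCy, and the example again assumes a blank English pipeline so that it runs without a downloaded model.

```python
import spacy

# Assumption: blank English pipeline; the stop word set lives in the defaults.
nlp = spacy.blank("en")


def add_stop_words(nlp, words):
    """Add each word (lowercased) to the default set and flag its lexeme."""
    for word in words:
        word = word.lower()
        nlp.Defaults.stop_words.add(word)
        nlp.vocab[word].is_stop = True


# Mixed-case input is normalised to lowercase before being added.
add_stop_words(nlp, ["BTW", "imho"])
print(nlp.vocab["btw"].is_stop, nlp.vocab["imho"].is_stop)
```

Because `nlp.Defaults.stop_words` is an ordinary Python set, adding a word that is already present has no effect, so the helper is safe to call more than once.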

## To remove a stop word
Alternatively, you may decide that `'beyond'` should not be considered a stop word.

In [10]:
# Remove the word from the set of stop words
nlp.Defaults.stop_words.remove('beyond')

# Clear the is_stop flag on the lexeme
nlp.vocab['beyond'].is_stop = False

In [11]:
len(nlp.Defaults.stop_words)

326

In [12]:
nlp.vocab['beyond'].is_stop

False
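
One thing to watch for: `set.remove()` raises a `KeyError` if the word is not in the set, which can happen if you run the removal cell twice. A hedged alternative is to use `set.discard()`, which silently does nothing when the word is absent. The helper name `drop_stop_words` below is hypothetical, and the sketch assumes a blank English pipeline.

```python
import spacy

# Assumption: blank English pipeline; no trained model is required.
nlp = spacy.blank("en")


def drop_stop_words(nlp, words):
    """Remove each word from the default set and clear its lexeme flag."""
    for word in words:
        word = word.lower()
        # discard() does not raise if the word is already absent
        nlp.Defaults.stop_words.discard(word)
        nlp.vocab[word].is_stop = False


# 'notastopword' was never in the set; discard() simply ignores it.
drop_stop_words(nlp, ["beyond", "notastopword"])
print(nlp.vocab["beyond"].is_stop)
```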

Great! You should now be able to access spaCy's default set of stop words, and add or remove stop words as needed.

## Next up: Vocabulary and Matching