In [1]:
from pathlib import Path
import re
from re import match
import pandas as pd
import spacy

The directory `data` contains a subdirectory named `sotu` with five [State of the Union](https://en.wikipedia.org/wiki/State_of_the_Union) speeches from various presidents of the United States, which are stored as UTF-8 encoded plain text files.

Import the `pathlib` module and use it to read the contents of each text file into a string object.

Then import the `spacy` library and load a [small language model](https://spacy.io/models/en#en_core_web_sm) for English. Assign the model to the variable `nlp`.

Process the texts using the language model and store the resulting *Doc* objects in a list named `speeches`.

In [3]:
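# Load the small English language model
nlp = spacy.load("en_core_web_sm")

# Collect the paths of all plain text files in the corpus directory
corpus_dir = Path('data/sotu')
files = list(corpus_dir.glob('*.txt'))

sotu_articles = []  # raw text of each speech
speeches = []       # processed Doc objects
for file in files:
    # Read the file contents into a string
    text = file.read_text(encoding='utf-8')
    sotu_articles.append(text)
    # Feed the string to the language model to get a Doc object
    nlp_sotu = nlp(text)
    speeches.append(nlp_sotu)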
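
The loop above keeps the texts and the *Doc* objects in plain lists, so the link between a speech and the file it came from is lost. The following is a minimal sketch of one way to keep that link using only `pathlib`. The helper name `read_corpus` and the demo file names are hypothetical and not part of the exercise, and the demo writes to a temporary directory instead of assuming that `data/sotu` exists.

```python
from pathlib import Path
import tempfile


def read_corpus(directory, pattern="*.txt"):
    """Return a dict mapping each matching file name to its UTF-8 decoded text."""
    texts = {}
    # sorted() makes the order reproducible; glob() itself promises no ordering
    for path in sorted(Path(directory).glob(pattern)):
        texts[path.name] = path.read_text(encoding="utf-8")
    return texts


# Demo on throwaway files (hypothetical names and contents)
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "b_speech.txt").write_text("Second speech.", encoding="utf-8")
    Path(tmp, "a_speech.txt").write_text("First speech.", encoding="utf-8")
    Path(tmp, "notes.md").write_text("Not a speech.", encoding="utf-8")
    corpus = read_corpus(tmp)

print(list(corpus))
```

With the notebook's data, `read_corpus('data/sotu')` should return the speeches keyed by file name, and the values could then be passed to `nlp` as before.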

Retrieve named entities from each *Doc* object in the list `speeches` and store the named entities into a list named `entities`.

In [4]:
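entities = []
for speech in speeches:
    # Doc.ents holds the named entities as Span objects;
    # extend() adds them one by one to a single flat list
    entities.extend(speech.ents)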
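
Once the entities are in one flat list, a natural next step is to see which entity types occur most often. The sketch below counts labels with `collections.Counter`. The function name `count_entity_labels` is hypothetical, and the sample items are stand-in objects with invented labels rather than real spaCy spans, so that the block runs without a language model. It only assumes that each item exposes a `label_` attribute, as spaCy's *Span* objects do.

```python
from collections import Counter
from types import SimpleNamespace


def count_entity_labels(ents):
    """Count how often each entity label occurs in an iterable of spans."""
    # Only the `label_` attribute is read from each item
    return Counter(ent.label_ for ent in ents)


# Stand-in objects (NOT real spaCy spans), with made-up labels for illustration
sample = [
    SimpleNamespace(text="Congress", label_="ORG"),
    SimpleNamespace(text="United States", label_="GPE"),
    SimpleNamespace(text="America", label_="GPE"),
]
label_counts = count_entity_labels(sample)
print(label_counts.most_common())
```

In the notebook, the same function could be called as `count_entity_labels(entities)`.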

Retrieve all *Tokens* that have the coarse part-of-speech tag `VERB` from each *Doc* object in the list `speeches` and store their lemmas in a list named `verbs`.

In [5]:
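verbs = []
for speech in speeches:
    for token in speech:
        # pos_ holds the coarse part-of-speech tag
        if token.pos_ == 'VERB':
            # Store the lemma (base form) rather than the Token itself
            verbs.append(token.lemma_)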

In [6]:
verbs

['HARRY',
 'stand',
 'lay',
 'rest',
 'move',
 'give',
 'see',
 'take',
 'love',
 'belove',
 'fill',
 'leave',
 'ease',
 'ache',
 'know',
 'lose',
 'thrust',
 'carry',
 'depart',
 'look',
 'look',
 'move',
 'want',
 'do',
 'do',
 'shed',
 'cherish',
 'live',
 'die',
 'permit',
 'look',
 'require',
 'provide',
 'call',
 'help',
 'keep',
 'unite',
 'proclaim',
 'want',
 'assure',
 'love',
 'support',
 'defend',
 'shirk',
 'continue',
 'remain',
 'have',
 'pay',
 'make',
 'become',
 'settle',
 'jeopardize',
 'remain',
 'traffic',
 'make',
 'rest',
 'wish',
 'see',
 'violate',
 'go',
 'shake',
 'punish',
 'pursue',
 'last',
 'secure',
 'permit',
 'plot',
 'shrink',
 'seek',
 'find',
 'labor',
 'achieve',
 'make',
 'let',
 'assure',
 'look',
 'improve',
 'face',
 'fear',
 'face',
 'win',
 'cease',
 'preserve',
 'maintain',
 'pay',
 'assist',
 'break',
 'pray',
 'delay',
 'cost',
 'bring',
 'dominate',
 'rock',
 'determine',
 'depart',
 'carry',
 'want',
 'know',
 'remain',
 'repay',
 'earn'
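
The list printed above contains many repeated lemmas, so counting them is an easy way to summarise it. A minimal sketch with `collections.Counter` follows. The variable names `sample_verbs`, `verb_counts` and `top_verbs` are hypothetical, and the input is a short slice copied from the output above rather than the full `verbs` list.

```python
from collections import Counter

# A short slice of the lemmas printed above, used here as sample input
sample_verbs = ['look', 'look', 'move', 'want', 'do', 'do', 'shed']

# Counter maps each lemma to the number of times it occurs
verb_counts = Counter(sample_verbs)

# most_common(n) returns the n most frequent (lemma, count) pairs
top_verbs = verb_counts.most_common(2)
print(top_verbs)
```

In the notebook, `Counter(verbs).most_common(10)` would give the ten most frequent verb lemmas across all the speeches.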