The learning objectives of this assignment are to:
- extract and clean up text from an HTML document
- run and customize spaCy for text pre-processing
First, please follow the General Instructions for Programming.
To install the libraries required for this assignment, run:
pip install -r requirements.txt
To download the spaCy English pipeline, run:
python -m spacy download en_core_web_sm
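
If you want to confirm that both steps succeeded, a quick check along the following lines may help. This is only a sketch and not part of the assignment: the helper name `is_installed` is made up for illustration, and it assumes the pipeline is installed as a regular Python package named `en_core_web_sm`, which is how `spacy download` normally installs it.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return find_spec(module_name) is not None


if __name__ == "__main__":
    # "spacy" is the library itself; "en_core_web_sm" is the English pipeline package.
    for name in ("spacy", "en_core_web_sm"):
        status = "found" if is_installed(name) else "MISSING"
        print(f"{name}: {status}")
```

If either line reports `MISSING`, rerun the corresponding command above in the same Python environment.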