wasnikh0/Data-preprocessing

Objectives

The learning objectives of this assignment are to:

  1. extract and clean up text from an HTML document
  2. run and customize spaCy for text pre-processing
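As a rough illustration of the first objective, the sketch below pulls visible text out of an HTML string using only Python's standard library. It is not the assignment's reference solution: the names `TextExtractor` and `html_to_text` are hypothetical, and the choice to skip `script` and `style` content and to collapse whitespace is an assumption about what "clean up" means here.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, ignoring script and style content (hypothetical helper)."""

    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()  # convert_charrefs=True by default, so &amp; becomes &
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join chunks with a space, then collapse runs of whitespace.
        # Note: this also inserts a space where a tag splits a single word.
        return re.sub(r"\s+", " ", " ".join(self._chunks)).strip()


def html_to_text(html):
    """Return the cleaned-up visible text of an HTML string (assumed behaviour)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<html><head><style>p{}</style></head>"
    "<body><p>Hello &amp; <b>world</b></p><script>var x=1;</script></body></html>"
)
print(html_to_text(sample))
```

The resulting string is the kind of input that could then be passed to a spaCy pipeline for the second objective.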

Set up your environment

First, please follow the General Instructions for Programming.

To install the libraries required for this assignment, run:

pip install -r requirements.txt

To download the spaCy English pipeline, run:

python -m spacy download en_core_web_sm
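After running both commands, a quick check like the one below can confirm that the packages are importable. This is only a sketch: `missing_packages` is a hypothetical helper, and the assumption that `spacy` is among the entries in requirements.txt is not confirmed by this README. The downloaded pipeline is installed as a regular Python package named `en_core_web_sm`, so it can be looked up the same way.

```python
import importlib.util


def missing_packages(names):
    """Return the names that cannot be found as importable top-level packages."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed package names; adjust to match requirements.txt.
missing = missing_packages(["spacy", "en_core_web_sm"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("Environment looks ready.")
```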
