This is one of the test pipelines included in Pimlico's repository. See the test pipelines
documentation (test-pipelines) for more details.
The complete config file for this test pipeline:
[pipeline]
name=spacy_parse_text
release=latest

# Prepared tarred corpus
[europarl]
type=pimlico.datatypes.corpora.GroupedCorpus
data_point_type=RawTextDocumentType
dir=%(test_data_dir)s/datasets/text_corpora/europarl

[tokenize]
type=pimlico.modules.spacy.parse_text
model=en_core_web_sm
The following Pimlico module types are used in this pipeline:
pimlico.modules.spacy.parse_text
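
To make the structure of the config above easier to inspect, the following is a minimal sketch that reads the same text with Python's standard-library configparser. It is only an illustration of the file layout: Pimlico loads pipelines with its own config loader, which is not reproduced here. The value given for test_data_dir is a hypothetical placeholder, since the real value is supplied when the test pipelines are run.

```python
import configparser

# The pipeline config from this page, reproduced as a string.
CONFIG_TEXT = """\
[pipeline]
name=spacy_parse_text
release=latest

# Prepared tarred corpus
[europarl]
type=pimlico.datatypes.corpora.GroupedCorpus
data_point_type=RawTextDocumentType
dir=%(test_data_dir)s/datasets/text_corpora/europarl

[tokenize]
type=pimlico.modules.spacy.parse_text
model=en_core_web_sm
"""

# Standard-library parser, used here only to inspect the file.
# This is NOT Pimlico's own loader.
parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# Hypothetical placeholder for the variable referenced as %(test_data_dir)s.
substitutions = {"test_data_dir": "/tmp/pimlico_test_data"}

# Every section other than [pipeline] describes one module in the pipeline.
module_sections = [s for s in parser.sections() if s != "pipeline"]

# Look up each module's type, and the corpus directory with the
# placeholder substituted in.
module_types = {
    s: parser.get(s, "type", vars=substitutions) for s in module_sections
}
corpus_dir = parser.get("europarl", "dir", vars=substitutions)

for name in module_sections:
    print(f"{name}: {module_types[name]}")
print(f"corpus dir: {corpus_dir}")
```

The sketch shows that the pipeline has two module sections: europarl, which reads the prepared corpus, and tokenize, which, despite its name, has the type pimlico.modules.spacy.parse_text.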