### Create config file

Generate a `config.cfg` file for an English-language pipeline that contains a Named Entity Recognition (`ner`) component.

In [1]:
!python -m spacy init config ./config.cfg --lang en --pipeline ner --force

⚠ To generate a more effective transformer-based config (GPU-only),
install the spacy-transformers package and re-run this command. The config
generated now does not use transformers.
ℹ Generated config template specific for your use case
- Language: en
- Pipeline: ner
- Optimize for: efficiency
- Hardware: CPU
- Transformer: None
✔ Auto-filled config with all values
✔ Saved config
config.cfg
You can now add your data and train your pipeline:
python -m spacy train config.cfg --paths.train ./train.spacy --paths.dev ./dev.spacy
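
Before training, it can help to confirm which language and components the generated file declares. The sketch below reads the `[nlp]` section with the standard library's `configparser`. The inline excerpt is an assumption about what the generated `config.cfg` contains (values in this format are JSON-encoded), and `read_nlp_section` is a hypothetical helper. For the real file, spaCy's own loader `spacy.util.load_config` is the more faithful option.

```python
import configparser
import json

# Assumed excerpt of a generated config.cfg; the real file has many more sections.
CONFIG_EXCERPT = """
[paths]
train = null
dev = null

[nlp]
lang = "en"
pipeline = ["tok2vec","ner"]
"""


def read_nlp_section(config_text: str) -> dict:
    """Return the [nlp] section with every value decoded from JSON."""
    # Interpolation is disabled so that ${...} references are left untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    return {key: json.loads(value) for key, value in parser["nlp"].items()}


nlp_settings = read_nlp_section(CONFIG_EXCERPT)
print(nlp_settings["lang"])
print(nlp_settings["pipeline"])
```

To apply the same check to the file on disk, pass `open("config.cfg").read()` instead of the excerpt.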


### Create and train the model

Note: if no GPU is available on the local machine, remove the `--gpu-id 0` parameter from the command below.
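
The choice described in the note can also be made in code. The following is a minimal sketch, not part of the original notebook: `build_train_command` is a hypothetical helper, and checking for `nvidia-smi` on the `PATH` is only a rough heuristic for "a GPU is available".

```python
import shutil


def build_train_command(use_gpu: bool) -> list[str]:
    """Assemble the spaCy training command, adding --gpu-id 0 only when requested."""
    command = ["python", "-m", "spacy", "train", "config.cfg"]
    if use_gpu:
        command += ["--gpu-id", "0"]
    command += [
        "--output", "./output",
        "--paths.train", "./train_detect_ml_models.spacy",
        "--paths.dev", "./test_detect_ml_models.spacy",
    ]
    return command


# Assumption: nvidia-smi being on PATH is treated as a sign that a GPU is usable.
gpu_available = shutil.which("nvidia-smi") is not None
print(" ".join(build_train_command(gpu_available)))
```

The returned list can be passed to `subprocess.run` to start training from Python instead of a `!` shell cell.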

In [3]:
!python -m spacy train config.cfg --gpu-id 0 --output ./output --paths.train ./train_detect_ml_models.spacy --paths.dev ./test_detect_ml_models.spacy

ℹ Saving to output directory: output
ℹ Using GPU: 0
✔ Initialized pipeline
ℹ Pipeline: ['tok2vec', 'ner']
ℹ Initial learn rate: 0.001
E    #       LOSS TOK2VEC  LOSS NER  ENTS_F  ENTS_P  ENTS_R  SCORE 
---  ------  ------------  --------  ------  ------  ------  ------
  0       0          0.00     81.83    1.38    0.82    4.51    0.01
  0     200         76.96   2209.05   88.31   90.35   86.36    0.88
  0     400        105.76    386.21   94.54   96.10   93.04    0.95
  0     600        126.80    275.77   94.58   95.91   93.29    0.95
  0     800         96.86    191.79   96.69   97.49   95.90    0.97
  0    1000        156.95    197.60   96.78   97.07   96.50    0.97
  0    1200        129.72    130.11   97.14   98.24   96.07    0.97
  0    1400        282.98    181.33   97.17   98.81   95.59    0.97
  0    1600        146.59     94.86   97.44   97.97   96.92    0.97
  0    1800        167.21    142.10   97.38
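
To compare checkpoints in a log like the one above, the rows can be parsed into records and ranked. This is a small sketch using only the complete rows shown in this notebook. The lower-case column names and the `parse_rows` helper are illustrative choices, and ranking by `ENTS_F` is just one way to compare steps.

```python
# Rows copied from the training output above; the last one is truncated in the notebook.
LOG_ROWS = """\
  0       0          0.00     81.83    1.38    0.82    4.51    0.01
  0     200         76.96   2209.05   88.31   90.35   86.36    0.88
  0     400        105.76    386.21   94.54   96.10   93.04    0.95
  0     600        126.80    275.77   94.58   95.91   93.29    0.95
  0     800         96.86    191.79   96.69   97.49   95.90    0.97
  0    1000        156.95    197.60   96.78   97.07   96.50    0.97
  0    1200        129.72    130.11   97.14   98.24   96.07    0.97
  0    1400        282.98    181.33   97.17   98.81   95.59    0.97
  0    1600        146.59     94.86   97.44   97.97   96.92    0.97
  0    1800        167.21    142.10   97.38
"""

COLUMNS = ["epoch", "step", "loss_tok2vec", "loss_ner",
           "ents_f", "ents_p", "ents_r", "score"]


def parse_rows(text: str) -> list[dict]:
    """Turn whitespace-separated log rows into dicts, skipping incomplete rows."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS):
            continue  # truncated or empty line
        rows.append(dict(zip(COLUMNS, map(float, fields))))
    return rows


rows = parse_rows(LOG_ROWS)
best = max(rows, key=lambda row: row["ents_f"])
print(int(best["step"]), best["ents_f"])
```

Among the complete rows, the highest `ENTS_F` is 97.44 at step 1600. The truncated step-1800 row is ignored because its remaining columns are not available.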

### Package the best model

In [3]:
!pip install build -q

In [2]:
!mkdir -p ./packages

In [3]:
!python -m spacy package ./output/model-best ./packages --name ml_method_pipeline --version 1.0.0

ℹ Building package artifacts: sdist
✔ Including 1 package requirement(s) from meta and config
spacy>=3.8.2,<3.9.0
✔ Loaded meta.json from file
output/model-best/meta.json
✔ Generated README.md from meta.json
✔ Successfully created package directory
'en_ml_method_pipeline-1.0.0'
packages/en_ml_method_pipeline-1.0.0
* Creating isolated environment: venv+pip...
* Installing packages in isolated environment:
  - setuptools >= 40.8.0
* Getting build dependencies for sdist...
running egg_info
creating en_ml_method_pipeline.egg-info
writing en_ml_method_pipeline.egg-info/PKG-INFO
writing dependency_links to en_ml_method_pipeline.egg-info/dependency_links.txt
writing entry points to en_ml_method_pipeline.egg-info/entry_points.txt
writing requirements to en_ml_method_pipeline.egg-info/requires.txt
writing top-level names to en_ml_method_pipeline.egg-info/top_level.txt
writing manifest file 'en_ml_method_pip

### Install the new package

In [4]:
%%bash
cd ./packages/en_ml_method_pipeline-1.0.0/dist/
pip install en_ml_method_pipeline-1.0.0.tar.gz -q

### Test loading of new pipeline package

In [1]:
import spacy

In [2]:
nlp = spacy.load('en_ml_method_pipeline')
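
Loading without an error shows that the package is importable. A short smoke test on real text shows whether the `ner` component produces entities. The sketch below is an illustration under stated assumptions: `pipeline_is_installed` is a hypothetical helper, the sample sentence is made up, and the entity labels depend on how the training data was annotated.

```python
import importlib.util

PACKAGE_NAME = "en_ml_method_pipeline"


def pipeline_is_installed(package_name: str) -> bool:
    """Return True if a top-level package with this name can be imported."""
    return importlib.util.find_spec(package_name) is not None


if pipeline_is_installed(PACKAGE_NAME):
    import spacy

    nlp = spacy.load(PACKAGE_NAME)
    # Hypothetical sample sentence; replace it with text from your own domain.
    doc = nlp("We compared a random forest with a support vector machine.")
    for ent in doc.ents:
        print(ent.text, ent.label_)
else:
    print(f"{PACKAGE_NAME} is not installed in this environment")
```

If the loop prints nothing, the pipeline loaded but found no entities in the sample sentence, which is worth checking against text similar to the training data.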