# BentoML Example: spaCy named entity recognizer  


BentoML is an open-source framework for machine learning **model serving**, aiming to **bridge the gap between Data Science and DevOps.**

Data Scientists can easily package their models trained with any ML framework using BentoMl and reproduce the model for serving in production. BentoML helps with managing packaged models in the BentoML format, and allows DevOps to deploy them as online API serving endpoints or offline batch inference jobs, on any cloud platform.

Before reading this example project, be sure to check out the [Getting started guide](https://github.com/bentoml/BentoML/blob/master/guides/quick-start/bentoml-quick-start-guide.ipynb) to learn about the basic concepts in BentoML.


Make sure to __use GPU runtime when running this notebook in Google Colab__, you can set it in top menu: `Runtime > Change Runtime Type > Hardware accelerator`.

![Impression](https://www.google-analytics.com/collect?v=1&tid=UA-112879361-3&cid=555&t=event&ec=spacy&ea=spacy-ner&dt=spacy-ner)

In [11]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

In [1]:
!pip install -q bentoml spacy>=2.3.0

You should consider upgrading via the '/usr/local/anaconda3/envs/dev-py3/bin/python -m pip install --upgrade pip' command.[0m


In [5]:
!python3 -m spacy download en_core_web_sm

Collecting en_core_web_sm==2.1.0
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.1.0/en_core_web_sm-2.1.0.tar.gz (11.1 MB)
[K     |████████████████████████████████| 11.1 MB 621 kB/s eta 0:00:01
[?25hBuilding wheels for collected packages: en-core-web-sm
  Building wheel for en-core-web-sm (setup.py) ... [?25ldone
[?25h  Created wheel for en-core-web-sm: filename=en_core_web_sm-2.1.0-py3-none-any.whl size=11074433 sha256=ff129b3c2e08aa5b9555bc2e2287f0a3aca5ae92b920065887151b7bb2347c94
  Stored in directory: /private/var/folders/kn/xnc9k74x03567n1mx2tfqnpr0000gn/T/pip-ephem-wheel-cache-36sq9_n2/wheels/59/4f/8c/0dbaab09a776d1fa3740e9465078bfd903cc22f3985382b496
Successfully built en-core-web-sm
Installing collected packages: en-core-web-sm
Successfully installed en-core-web-sm-2.1.0
You should consider upgrading via the '/usr/local/anaconda3/envs/dev-py3/bin/python3 -m pip install --upgrade pip' command.[0m
[38;5;2m✔ Download and installati

In [1]:
import en_core_web_sm

nlp = en_core_web_sm.load()

# Getting the pipeline component
ner=nlp.get_pipe("ner")

In [2]:
# training data
TRAIN_DATA = [
              ("Walmart is a leading e-commerce company", {"entities": [(0, 7, "ORG")]}),
              ("I reached Chennai yesterday.", {"entities": [(19, 28, "GPE")]}),
              ("I recently ordered a book from Amazon", {"entities": [(24,32, "ORG")]}),
              ("I was driving a BMW", {"entities": [(16,19, "PRODUCT")]}),
              ("I ordered this from ShopClues", {"entities": [(20,29, "ORG")]}),
              ("Fridge can be ordered in Amazon ", {"entities": [(0,6, "PRODUCT")]}),
              ("I bought a new Washer", {"entities": [(16,22, "PRODUCT")]}),
              ("I bought a old table", {"entities": [(16,21, "PRODUCT")]}),
              ("I bought a fancy dress", {"entities": [(18,23, "PRODUCT")]}),
              ("I rented a camera", {"entities": [(12,18, "PRODUCT")]}),
              ("I rented a tent for our trip", {"entities": [(12,16, "PRODUCT")]}),
              ("I rented a screwdriver from our neighbour", {"entities": [(12,22, "PRODUCT")]}),
              ("I repaired my computer", {"entities": [(15,23, "PRODUCT")]}),
              ("I got my clock fixed", {"entities": [(16,21, "PRODUCT")]}),
              ("I got my truck fixed", {"entities": [(16,21, "PRODUCT")]}),
              ("Flipkart started it's journey from zero", {"entities": [(0,8, "ORG")]}),
              ("I recently ordered from Max", {"entities": [(24,27, "ORG")]}),
              ("Flipkart is recognized as leader in market",{"entities": [(0,8, "ORG")]}),
              ("I recently ordered from Swiggy", {"entities": [(24,29, "ORG")]})
              ]

In [3]:
for _, annotations in TRAIN_DATA:
  for ent in annotations.get("entities"):
    ner.add_label(ent[2])
    
# Disable pipeline components you dont need to change
pipe_exceptions = ["ner", "trf_wordpiecer", "trf_tok2vec"]
unaffected_pipes = [pipe for pipe in nlp.pipe_names if pipe not in pipe_exceptions]

In [6]:
# Import requirements
import random
from spacy.util import minibatch, compounding
from pathlib import Path

# TRAINING THE MODEL
with nlp.disable_pipes(*unaffected_pipes):

  # Training for 30 iterations
  for iteration in range(300):

    # shuufling examples  before every iteration
    random.shuffle(TRAIN_DATA)
    losses = {}
    # batch up the examples using spaCy's minibatch
    batches = minibatch(TRAIN_DATA, size=compounding(4.0, 32.0, 1.001))
    for batch in batches:
        texts, annotations = zip(*batch)
        nlp.update(
                    texts,  # batch of texts
                    annotations,  # batch of annotations
                    drop=0.5,  # dropout - make it harder to memorise data
                    losses=losses,
                )
        print("Losses", losses)

Losses {'ner': 0.004536969179435968}
Losses {'ner': 0.004546050815906144}
Losses {'ner': 0.025669093115227526}
Losses {'ner': 0.03411956118854484}
Losses {'ner': 0.03412825152052374}
Losses {'ner': 0.009122331338935474}
Losses {'ner': 0.7829139325543504}
Losses {'ner': 0.7829374250377971}
Losses {'ner': 0.8027441355587118}
Losses {'ner': 0.8028533373785098}
Losses {'ner': 0.06220998885687429}
Losses {'ner': 0.065771993306111}
Losses {'ner': 0.06577205994675647}
Losses {'ner': 0.09421975443447955}
Losses {'ner': 0.09969790003680326}
Losses {'ner': 0.0002805721257755911}
Losses {'ner': 0.0003179105598114438}
Losses {'ner': 0.15430579701299763}
Losses {'ner': 1.0006281762037488}
Losses {'ner': 1.3332326435204318}
Losses {'ner': 3.6815314428395585e-06}
Losses {'ner': 0.8758659695773919}
Losses {'ner': 0.878305612639057}
Losses {'ner': 0.878330049155932}
Losses {'ner': 0.9103632920625356}
Losses {'ner': 1.6874938412439158}
Losses {'ner': 1.6904258969040495}
Losses {'ner': 1.6929271873195455

Losses {'ner': 2.675922140425377e-07}
Losses {'ner': 2.912788776078251e-07}
Losses {'ner': 4.33417887981451e-07}
Losses {'ner': 0.0027446311229834064}
Losses {'ner': 0.0027446314388635496}
Losses {'ner': 7.507198626551534e-09}
Losses {'ner': 6.200635189219786e-08}
Losses {'ner': 0.00015820934826849296}
Losses {'ner': 0.00015843956024385138}
Losses {'ner': 0.00015857005856521048}
Losses {'ner': 1.524618139169768e-08}
Losses {'ner': 0.15727115923384272}
Losses {'ner': 0.1585237923672903}
Losses {'ner': 0.15852379259225552}
Losses {'ner': 0.15852384788700005}
Losses {'ner': 1.0503046192114184e-06}
Losses {'ner': 1.6048574642190867e-06}
Losses {'ner': 0.00015187160894708428}
Losses {'ner': 0.00015212282444433897}
Losses {'ner': 0.00016422041565620197}
Losses {'ner': 3.915418532568539e-07}
Losses {'ner': 3.9177150942235783e-07}
Losses {'ner': 5.725248494695859e-05}
Losses {'ner': 1.812888251995458}
Losses {'ner': 1.8128905797110755}
Losses {'ner': 4.529802811292837e-06}
Losses {'ner': 0.002

Losses {'ner': 8.512383342641951e-08}
Losses {'ner': 8.512718135357298e-08}
Losses {'ner': 3.1662201903764304e-06}
Losses {'ner': 7.967988137978128e-12}
Losses {'ner': 6.670390262462802e-11}
Losses {'ner': 4.708142035955484e-08}
Losses {'ner': 9.46719701050122e-07}
Losses {'ner': 9.467197568755813e-07}
Losses {'ner': 6.073478108333844e-12}
Losses {'ner': 1.2389870419370935e-11}
Losses {'ner': 1.619774198160979e-09}
Losses {'ner': 1.6277907954236972e-09}
Losses {'ner': 1.9292793658786926e-05}
Losses {'ner': 3.425172981958448e-12}
Losses {'ner': 7.208564106640355e-09}
Losses {'ner': 1.3825880200888334e-05}
Losses {'ner': 6.747715122522286e-05}
Losses {'ner': 6.747715135828744e-05}
Losses {'ner': 3.87194505574461e-09}
Losses {'ner': 0.8942452127118534}
Losses {'ner': 0.8977369887772906}
Losses {'ner': 0.8977370806676557}
Losses {'ner': 0.8977372089598187}
Losses {'ner': 4.0659336923079137e-07}
Losses {'ner': 4.0677998285204943e-07}
Losses {'ner': 0.00041004043534019973}
Losses {'ner': 0.0

Losses {'ner': 2.2754721592204523e-05}
Losses {'ner': 2.2754721592233556e-05}
Losses {'ner': 1.7700541963811338e-09}
Losses {'ner': 1.9343706784540097e-09}
Losses {'ner': 2.8165622438463114e-09}
Losses {'ner': 2.8274583247537743e-09}
Losses {'ner': 2.914443866410991e-09}
Losses {'ner': 2.656789296716592e-10}
Losses {'ner': 0.0007469212720326684}
Losses {'ner': 0.0007469212750852647}
Losses {'ner': 0.0007538405867740352}
Losses {'ner': 0.0007689076825693904}
Losses {'ner': 7.324801665111896e-13}
Losses {'ner': 3.2693313918818276e-06}
Losses {'ner': 3.3029561699517856e-06}
Losses {'ner': 1.2693536760217044e-05}
Losses {'ner': 1.2711169645092215e-05}
Losses {'ner': 1.7843633558570215e-09}
Losses {'ner': 5.675168270276143e-08}
Losses {'ner': 5.6790749844159916e-08}
Losses {'ner': 5.7056111765539005e-08}
Losses {'ner': 5.711101396173784e-08}
Losses {'ner': 7.229488525264548e-13}
Losses {'ner': 0.05921484416271677}
Losses {'ner': 0.05921484416275569}
Losses {'ner': 0.05921485896499099}
Losse

Losses {'ner': 3.854150726434757e-06}
Losses {'ner': 2.4836104756947345e-09}
Losses {'ner': 9.583287393059652e-08}
Losses {'ner': 0.00025474894503264}
Losses {'ner': 0.0002547490078526133}
Losses {'ner': 0.0002547502195048238}
Losses {'ner': 7.251837240807394e-13}
Losses {'ner': 2.2529457919942813e-08}
Losses {'ner': 2.2538671834064382e-08}
Losses {'ner': 2.2615796779757855e-08}
Losses {'ner': 2.5922137357414793e-08}
Losses {'ner': 0.004659505137783196}
Losses {'ner': 0.004659505148542115}
Losses {'ner': 0.004712004834306389}
Losses {'ner': 0.004712005034276777}
Losses {'ner': 0.004712005034315628}
Losses {'ner': 1.3020435911418863e-06}
Losses {'ner': 1.302043879184802e-06}
Losses {'ner': 2.9825898084421566e-06}
Losses {'ner': 3.110396875605646e-06}
Losses {'ner': 3.1104384039251014e-06}
Losses {'ner': 3.329361166862065e-10}
Losses {'ner': 3.3591369410708986e-10}
Losses {'ner': 6.966081465921589e-10}
Losses {'ner': 6.974348499184351e-10}
Losses {'ner': 9.898838008911066e-10}
Losses {'n

Losses {'ner': 8.415964948054669e-10}
Losses {'ner': 1.0754304602505561e-07}
Losses {'ner': 1.0833576245015928e-07}
Losses {'ner': 1.0834817111114673e-07}
Losses {'ner': 1.1470859612387332e-09}
Losses {'ner': 1.1562151388107906e-09}
Losses {'ner': 1.1562332451597481e-09}
Losses {'ner': 3.5156105316765326e-08}
Losses {'ner': 3.5157529161595756e-08}
Losses {'ner': 5.901853108000323e-12}
Losses {'ner': 5.926692586361701e-12}
Losses {'ner': 6.7647037581345335e-12}
Losses {'ner': 3.6653331029572507e-10}
Losses {'ner': 3.665368504899307e-10}
Losses {'ner': 1.615168462591677e-12}
Losses {'ner': 1.54615325009875e-11}
Losses {'ner': 1.6056384582565806e-11}
Losses {'ner': 3.900569924616159e-08}
Losses {'ner': 9.686606729837248e-07}
Losses {'ner': 2.8499057133378076e-11}
Losses {'ner': 1.99997258189197}
Losses {'ner': 1.9999725818921374}
Losses {'ner': 1.9999725896084788}
Losses {'ner': 1.9999725896085}
Losses {'ner': 6.811014943261421e-10}
Losses {'ner': 7.004710929248925e-10}
Losses {'ner': 1.3

Losses {'ner': 3.208256979562539e-08}
Losses {'ner': 3.208276562500754e-08}
Losses {'ner': 3.217791000373682e-08}
Losses {'ner': 2.219445036083539e-11}
Losses {'ner': 5.224270523195646e-11}
Losses {'ner': 5.2245907516439536e-11}
Losses {'ner': 5.5651241397749405e-11}
Losses {'ner': 5.575680190821096e-11}
Losses {'ner': 6.971314346064923e-10}
Losses {'ner': 5.766429214217932e-06}
Losses {'ner': 5.7664431643148615e-06}
Losses {'ner': 5.766443857867826e-06}
Losses {'ner': 5.7664637359690676e-06}
Losses {'ner': 4.687326692884447e-06}
Losses {'ner': 5.044549042122341e-06}
Losses {'ner': 5.0445490493275435e-06}
Losses {'ner': 8.491733935478595e-06}
Losses {'ner': 8.491821927307508e-06}
Losses {'ner': 1.3318096122768894e-10}
Losses {'ner': 1.3318097489809817e-10}
Losses {'ner': 1.3318097865940805e-10}
Losses {'ner': 1.3320221332745937e-10}
Losses {'ner': 1.3320222265172534e-10}
Losses {'ner': 2.365575567172023e-13}
Losses {'ner': 3.0104325253840117e-13}
Losses {'ner': 2.3183169792097804e-10}


In [10]:
# Testing the model
doc = nlp("I was driving a Ford")
print(doc.ents)
print("Entities", [(ent.text, ent.label_) for ent in doc.ents])

(Ford,)
Entities [('Ford', 'PRODUCT')]


## Create BentoService for model serving

In [22]:
%%writefile spacy_ner.py


from bentoml import BentoService, api, env, artifacts
from bentoml.frameworks.spacy import SpacyModelArtifact
from bentoml.adapters import JsonInput


@env(infer_pip_packages=True)
@artifacts([SpacyModelArtifact('nlp')])
class SpacyNERService(BentoService):
    @api(input=JsonInput())
    def predict(self, parsed_json_list):
        result = []
        for index, parsed_json in enumerate(parsed_json_list):
            doc = self.artifacts.nlp(parsed_json['text'])
            result.append([{'entity': ent.text, 'label': ent.label_} for ent in doc.ents])
        return result

Overwriting spacy_ner.py


In [18]:
from spacy_ner import SpacyNERService

svc = SpacyNERService()
svc.pack('nlp', nlp)

saved_path = svc.save()

[2020-09-15 16:17:01,530] INFO - Detected non-PyPI-released BentoML installed, copying local BentoML modulefiles to target saved bundle path..


no previously-included directories found matching 'e2e_tests'
no previously-included directories found matching 'tests'
no previously-included directories found matching 'benchmark'


UPDATING BentoML-0.8.6+43.g53afaa73/bentoml/_version.py
set BentoML-0.8.6+43.g53afaa73/bentoml/_version.py to '0.8.6+43.g53afaa73'
[2020-09-15 16:17:06,326] INFO - BentoService bundle 'SpacyNERService:20200915161701_4475B2' saved to: /Users/bozhaoyu/bentoml/repository/SpacyNERService/20200915161701_4475B2


## REST API Model Serving


To start a REST API model server with the BentoService saved above, use the bentoml serve command:

In [14]:
!bentoml serve SpacyNERService:latest

[2020-09-15 16:13:38,999] INFO - Getting latest version SpacyNERService:20200915161253_DC0550
[2020-09-15 16:13:38,999] INFO - Starting BentoML API server in development mode..
 * Serving Flask app "SpacyNERService" (lazy loading)
 * Environment: production
[2m   Use a production WSGI server instead.[0m
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
^C


If you are running this notebook from Google Colab, you can start the dev server with `--run-with-ngrok` option, to gain acccess to the API endpoint via a public endpoint managed by [ngrok](https://ngrok.com/):

In [23]:
!bentoml serve SpacyNERService:latest --run-with-ngrok

[2020-09-15 16:39:23,654] INFO - Getting latest version SpacyNERService:20200915161701_4475B2
[2020-09-15 16:39:23,655] INFO - Starting BentoML API server in development mode..
 * Serving Flask app "SpacyNERService" (lazy loading)
 * Environment: production
[2m   Use a production WSGI server instead.[0m
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
7[?47h[?1h=[H[2J[m[38;5;6m[48;5;16m[1m[1;1Hngrok[m[38;5;16m[48;5;16m [m[38;5;7m[48;5;16mby[m[38;5;16m[48;5;16m [m[38;5;6m[48;5;16m[1m@inconshreveable[m[38;5;16m[48;5;16m                                       [m[38;5;7m[48;5;16m(Ctrl+C to quit)[m[38;5;16m[48;5;16m[2;1H                                                                                [m[38;5;6m[48;5;16m[3;1HSession Status                connecting[m[38;5;16m[48;5;16m                                        [m[38;5;7m[48;5;16m[4;1HVersion                       2.3.35[m[38;5;16m[48;5;16m            

Open http://127.0.0.1:5000 to see more information about the REST APIs server in your
browser.


### Send prediction requeset to the REST API server

Navigate to parent directory of the notebook(so you have reference to the `test.jpg` image), and run the following `curl` command to send the image to REST API server and get a prediction result:

```bash
curl -i \
    --request POST \
    --header "Content-Type: application/json" \
    --data "{\"text\":\"I am driving BMW\"}" \
    localhost:5000/predict
```

## Containerize model server with Docker


One common way of distributing this model API server for production deployment, is via Docker containers. And BentoML provides a convenient way to do that.

Note that docker is **not available in Google Colab**. You will need to download and run this notebook locally to try out this containerization with docker feature.

If you already have docker configured, simply run the follow command to product a docker container serving the IrisClassifier prediction service created above:

In [None]:
!bentoml containerize SpacyNERService:latest

In [None]:
!docker run -p 5000:5000 spacynerservice

## Load saved BentoService

bentoml.load is the API for loading a BentoML packaged model in python:

In [25]:
from bentoml import load

service = load(saved_path)

print(service.predict([{'text': 'I am driving BMW'}]))

[[{'entity': 'BMW', 'label': 'PRODUCT'}]]


## Launch inference job from CLI

BentoML cli supports loading and running a packaged model from CLI. With the DataframeInput adapter, the CLI command supports reading input Dataframe data from CLI argument or local csv or json files:

In [27]:
!bentoml run SpacyNERService:latest predict --input "{\"text\":\"I am driving BMW\"}"

[2020-09-15 16:44:08,832] INFO - Getting latest version SpacyNERService:20200915161701_4475B2
[{"entity": "BMW", "label": "PRODUCT"}]


# Deployment Options

If you are at a small team with limited engineering or DevOps resources, try out automated deployment with BentoML CLI, currently supporting AWS Lambda, AWS SageMaker, and Azure Functions:
- [AWS Lambda Deployment Guide](https://docs.bentoml.org/en/latest/deployment/aws_lambda.html)
- [AWS SageMaker Deployment Guide](https://docs.bentoml.org/en/latest/deployment/aws_sagemaker.html)
- [Azure Functions Deployment Guide](https://docs.bentoml.org/en/latest/deployment/azure_functions.html)

If the cloud platform you are working with is not on the list above, try out these step-by-step guide on manually deploying BentoML packaged model to cloud platforms:
- [AWS ECS Deployment](https://docs.bentoml.org/en/latest/deployment/aws_ecs.html)
- [Google Cloud Run Deployment](https://docs.bentoml.org/en/latest/deployment/google_cloud_run.html)
- [Azure container instance Deployment](https://docs.bentoml.org/en/latest/deployment/azure_container_instance.html)
- [Heroku Deployment](https://docs.bentoml.org/en/latest/deployment/heroku.html)

Lastly, if you have a DevOps or ML Engineering team who's operating a Kubernetes or OpenShift cluster, use the following guides as references for implementating your deployment strategy:
- [Kubernetes Deployment](https://docs.bentoml.org/en/latest/deployment/kubernetes.html)
- [Knative Deployment](https://docs.bentoml.org/en/latest/deployment/knative.html)
- [Kubeflow Deployment](https://docs.bentoml.org/en/latest/deployment/kubeflow.html)
- [KFServing Deployment](https://docs.bentoml.org/en/latest/deployment/kfserving.html)
- [Clipper.ai Deployment Guide](https://docs.bentoml.org/en/latest/deployment/clipper.html)

