# **BentoML Example: PaddleOCR with PaddlePaddle**
**BentoML makes moving trained ML models to production easy:**

* Package models trained with any ML framework and reproduce them for model serving in production
* **Deploy anywhere** for online API serving or offline batch serving
* High-performance API model server with adaptive micro-batching support
* Central hub for managing models and the deployment process via Web UI and APIs
* Modular and flexible design that adapts to your infrastructure

BentoML is a framework for serving, managing, and deploying machine learning models. It aims to bridge the gap between Data Science and DevOps, and to enable teams to deliver prediction services in a fast, repeatable, and scalable way.

Before reading this example project, be sure to check out the [Getting started guide](https://github.com/bentoml/BentoML/blob/master/guides/quick-start/bentoml-quick-start-guide.ipynb) to learn about the basic concepts in BentoML.

This notebook demonstrates how to use BentoML to turn a PaddlePaddle model into a Docker image containing a REST API server that serves the model, how to use the ML service built with BentoML as a CLI tool, and how to distribute it as a PyPI package.

The example is based on [this tutorial](https://www.paddlepaddle.org.cn/documentation/docs/en/1.5/beginners_guide/basics/fit_a_line/README.html), using a dataset from the [UCI Machine Learning Repository](https://www.kaggle.com/schirmerchad/bostonhoustingmlnd).

In [1]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

In [2]:
!python3 -m pip install paddlepaddle==2.0.0 -i https://mirror.baidu.com/pypi/simple
!git clone https://github.com/PaddlePaddle/PaddleOCR

Looking in indexes: https://mirror.baidu.com/pypi/simple
fatal: destination path 'PaddleOCR' already exists and is not an empty directory.


In [3]:
# Install PaddleOCR's dependencies and create a folder to store inference files and results
%cd PaddleOCR
!pip3 install -r requirements.txt

%mkdir inference
%cd inference
%ls

/content/PaddleOCR
mkdir: cannot create directory ‘inference’: File exists
/content/PaddleOCR/inference
ch_ppocr_mobile_v2.0_cls_infer/       ch_ppocr_mobile_v2.0_det_infer.tar.2
ch_ppocr_mobile_v2.0_cls_infer.tar    ch_ppocr_mobile_v2.0_det_infer.tar.3
ch_ppocr_mobile_v2.0_cls_infer.tar.1  ch_ppocr_mobile_v2.0_det_infer.tar.4
ch_ppocr_mobile_v2.0_cls_infer.tar.2  ch_ppocr_mobile_v2.0_rec_infer/
ch_ppocr_mobile_v2.0_cls_infer.tar.3  ch_ppocr_mobile_v2.0_rec_infer.tar
ch_ppocr_mobile_v2.0_det_infer/       ch_ppocr_mobile_v2.0_rec_infer.tar.1
ch_ppocr_mobile_v2.0_det_infer.tar    ch_ppocr_mobile_v2.0_rec_infer.tar.2
ch_ppocr_mobile_v2.0_det_infer.tar.1  ch_ppocr_mobile_v2.0_rec_infer.tar.3


In [4]:
# Download the detection model of the ultra-lightweight Chinese OCR model and uncompress it
!wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_ppocr_mobile_v2.0_det_infer.tar
# Download the recognition model of the ultra-lightweight Chinese OCR model and uncompress it
!wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar
# Download the angle classifier model of the ultra-lightweight Chinese OCR model and uncompress it
!wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar && tar xf ch_ppocr_mobile_v2.0_cls_infer.tar
%cd /content/PaddleOCR

--2021-04-23 22:59:08--  https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar
Resolving paddleocr.bj.bcebos.com (paddleocr.bj.bcebos.com)... 103.235.46.61, 2409:8c00:6c21:10ad:0:ff:b00e:67d
Connecting to paddleocr.bj.bcebos.com (paddleocr.bj.bcebos.com)|103.235.46.61|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3164160 (3.0M) [application/x-tar]
Saving to: ‘ch_ppocr_mobile_v2.0_det_infer.tar.5’

_det_infer.tar.5     15%[==>                 ] 474.97K  43.0KB/s    eta 70s    ^C
--2021-04-23 22:59:25--  https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar
Resolving paddleocr.bj.bcebos.com (paddleocr.bj.bcebos.com)... ^C
--2021-04-23 22:59:25--  https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar
Resolving paddleocr.bj.bcebos.com (paddleocr.bj.bcebos.com)... ^C
/content/PaddleOCR
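
Re-running the download cell leaves numbered copies such as `ch_ppocr_mobile_v2.0_det_infer.tar.5` behind. A minimal sketch of an idempotent alternative in Python follows; it fetches only the models whose extracted directory is missing. The helper names `missing_models` and `fetch_models` are hypothetical, and the sketch assumes each archive unpacks into a directory named after the archive, as the `%ls` listing above suggests.

```python
import os
import tarfile
import urllib.request

# Base URL and model names are taken from the wget commands in this notebook.
BASE_URL = "https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/"
MODELS = [
    "ch_ppocr_mobile_v2.0_det_infer",
    "ch_ppocr_mobile_v2.0_rec_infer",
    "ch_ppocr_mobile_v2.0_cls_infer",
]


def missing_models(inference_dir, models=MODELS):
    """Return the model names whose extracted directory does not exist yet."""
    return [m for m in models if not os.path.isdir(os.path.join(inference_dir, m))]


def fetch_models(inference_dir):
    """Download and extract only the missing models (hypothetical helper)."""
    os.makedirs(inference_dir, exist_ok=True)
    for name in missing_models(inference_dir):
        archive = os.path.join(inference_dir, name + ".tar")
        urllib.request.urlretrieve(BASE_URL + name + ".tar", archive)
        with tarfile.open(archive) as tar:
            tar.extractall(inference_dir)
```

Calling `fetch_models("/content/PaddleOCR/inference")` a second time should then be a no-op once all three directories exist.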


In [5]:
%cd /content/PaddleOCR
!pip install bentoml
!pip install imageio

/content/PaddleOCR


# **Prepare Custom BentoService Artifact to handle PaddleOCR's pretrained models**

In [15]:
%%writefile paddleOCRArtifact.py

import os, shutil, json
from bentoml.exceptions import InvalidArgument
from bentoml.service.artifacts import BentoServiceArtifact

class PaddleOCRArtifact(BentoServiceArtifact):
    """Bundles the PaddleOCR inference model directory with the BentoService."""

    def __init__(self, name):
        super(PaddleOCRArtifact, self).__init__(name)
        self._predictor = None
        self._inference_path = None
        self._model_type = None

    def _saved_inference_file_path(self, base_path):
        return os.path.join(base_path, self.name)

    def load(self, path):
        # Nothing is loaded here: predict_system.py reads the model directories itself
        pass

    def pack(self, path, model_type, metadata=None):
        # Remember where the inference models live so that save() can copy them
        self._inference_path = path
        self._model_type = model_type
        return self

    def save(self, dst):
        print(self._inference_path, dst)
        if self._inference_path:
            shutil.copytree(self._inference_path, self._saved_inference_file_path(dst))

    @staticmethod
    def do_command(image_name, image_dir, det_model_dir, rec_model_dir, cls_model_dir, use_angle_cls=True, use_space_char=True):
        # Run PaddleOCR's end-to-end inference script through the shell
        cmd = (
            "python3 tools/infer/predict_system.py"
            + " --image_dir=" + image_dir
            + " --det_model_dir=" + det_model_dir
            + " --rec_model_dir=" + rec_model_dir
            + " --cls_model_dir=" + cls_model_dir
            + " --use_angle_cls=" + str(use_angle_cls)
            + " --use_space_char=" + str(use_space_char)
        )
        os.system(cmd)
        return './inference_results/' + image_name

    def get(self):
        pass

Overwriting paddleOCRArtifact.py
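
`do_command` above concatenates user-supplied values into a shell string. As a sketch of one alternative, the command can be built as an argument list and run without a shell, so spaces or shell metacharacters in a path are passed through literally. The function names `build_ocr_command` and `run_ocr` are hypothetical; the script path and flags are the ones used by `do_command`. Because no shell is involved, any quotes around the values would have to be stripped beforehand.

```python
import subprocess


def build_ocr_command(image_dir, det_model_dir, rec_model_dir, cls_model_dir,
                      use_angle_cls=True, use_space_char=True):
    """Build the predict_system.py invocation as an argument list (no shell)."""
    return [
        "python3", "tools/infer/predict_system.py",
        f"--image_dir={image_dir}",
        f"--det_model_dir={det_model_dir}",
        f"--rec_model_dir={rec_model_dir}",
        f"--cls_model_dir={cls_model_dir}",
        f"--use_angle_cls={use_angle_cls}",
        f"--use_space_char={use_space_char}",
    ]


def run_ocr(*args, **kwargs):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_ocr_command(*args, **kwargs), check=True)
```

Unlike `os.system`, `subprocess.run(..., check=True)` surfaces a failed inference run as an exception rather than silently returning a result path.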


# **Create BentoService for model serving**

In [16]:
%%writefile paddleOCR_service.py
import pandas as pd

import bentoml, imageio
from typing import List
from bentoml import env, artifacts, api, BentoService
from bentoml.adapters import StringInput
from paddleOCRArtifact import PaddleOCRArtifact

@env(infer_pip_packages=True)
@artifacts([PaddleOCRArtifact('model')])
class paddleOCRService(bentoml.BentoService):

    @api(input=StringInput(), batch=True)
    def predict(self, image_info):
        # Expects a space-separated line:
        # <image name> <image path> <det model dir> <rec model dir> <cls model dir>
        attribs = image_info.split(" ")
        image_name = attribs[0]
        image_dir = attribs[1]
        det_model_dir = attribs[2]
        rec_model_dir = attribs[3]
        cls_model_dir = attribs[4]
        return PaddleOCRArtifact.do_command(image_name, image_dir, det_model_dir, rec_model_dir, cls_model_dir)

Overwriting paddleOCR_service.py
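
`predict` splits its input on single spaces, which keeps any surrounding quotes and breaks on paths that contain spaces. A minimal sketch of a stricter parser, assuming the same five-field layout, is shown here; `OCRRequest` and `parse_image_info` are hypothetical names, not part of BentoML or PaddleOCR.

```python
import shlex
from typing import NamedTuple


class OCRRequest(NamedTuple):
    image_name: str
    image_dir: str
    det_model_dir: str
    rec_model_dir: str
    cls_model_dir: str


def parse_image_info(image_info: str) -> OCRRequest:
    """Split a request line shell-style, so quoted fields lose their quotes."""
    fields = shlex.split(image_info)
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}")
    return OCRRequest(*fields)
```

With this parser, a field such as `"11.jpg"` yields `11.jpg`, and a malformed line fails early with a `ValueError` instead of an `IndexError` deep inside the handler.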


In [17]:
# 1) import the custom BentoService defined above
from paddleOCR_service import paddleOCRService

# 2) `pack` it with required artifacts
bento_svc = paddleOCRService()
bento_svc.pack(name='model', path='/content/PaddleOCR/inference', model_type='ch_ppocr_mobile_v2.0_')

# 3) save your BentoService
saved_path = bento_svc.save()

from bentoml import load

svc = load(saved_path)


/content/PaddleOCR/inference /tmp/bentoml-temp-3432ud6l/paddleOCRService/artifacts
[2021-04-23 23:01:55,131] INFO - BentoService bundle 'paddleOCRService:20210423230154_54E746' saved to: /root/bentoml/repository/paddleOCRService/20210423230154_54E746


# **REST API Model Serving**

In [9]:
!bentoml serve paddleOCRService:latest

[2021-04-23 22:59:37,435] INFO - Getting latest version paddleOCRService:20210423225933_91A28E
[2021-04-23 22:59:37,453] INFO - Starting BentoML API proxy in development mode..
[2021-04-23 22:59:37,454] INFO - Starting BentoML API server in development mode..
[2021-04-23 22:59:37,600] INFO - Your system nofile limit is 1048576, which means each instance of microbatch service is able to hold this number of connections at same time. You can increase the number of file descriptors for the server process, or launch more microbatch instances to accept more concurrent connection.
(Press CTRL+C to quit)

Aborted!


If you are running this notebook from Google Colab, you can start the dev server with the `--run-with-ngrok` option to access the API through a public endpoint managed by ngrok:

In [10]:
!bentoml serve paddleOCRService:latest --run-with-ngrok


Aborted!


# **Make request to the REST server**

*After navigating to the location of this notebook, copy and paste the following command into your terminal and run it to make a request*

In [11]:
curl -i \
--request POST \
--header "Content-Type: text/csv" \
-d @test.csv \
localhost:5000/predict

SyntaxError: ignored
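
The `curl` command is shell syntax, which is why running it in a Python cell raises a `SyntaxError`. A sketch of the same POST from Python, using only the standard library, is given here. It assumes the dev server is listening on `localhost:5000` and that the request body is the space-separated line that `predict` expects; the `text/plain` content type and the function names are assumptions of this sketch, not something the notebook confirms.

```python
import urllib.request


def build_predict_request(image_info, url="http://localhost:5000/predict"):
    """Build a POST request carrying the space-separated input line."""
    return urllib.request.Request(
        url,
        data=image_info.encode("utf-8"),
        headers={"Content-Type": "text/plain"},  # assumed content type
        method="POST",
    )


def predict(image_info):
    """Send the request and return the response body as text (needs a running server)."""
    with urllib.request.urlopen(build_predict_request(image_info)) as resp:
        return resp.read().decode("utf-8")
```

Only `predict` touches the network; `build_predict_request` can be inspected without a server running.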

# **Containerize model server with Docker**

One common way of distributing this model API server for production deployment is via Docker containers, and BentoML provides a convenient way to do that.

Note that Docker is **not available in Google Colab**. You will need to download and run this notebook locally to try out containerization with Docker.

If you already have Docker configured, run the following command to produce a Docker image serving the paddleOCRService prediction service created above:

In [None]:
!bentoml containerize paddleOCRService:latest -t paddleocr-service:latest

In [None]:
# Docker repository names must be lowercase, so run the image by the tag set above
!docker run --rm -p 5000:5000 paddleocr-service:latest

# **Launch inference job from CLI**

In [20]:
%%writefile input.txt
"11.jpg" "./doc/imgs/11.jpg" "./inference/ch_ppocr_mobile_v2.0_det_infer/" "./inference/ch_ppocr_mobile_v2.0_rec_infer/" "./inference/ch_ppocr_mobile_v2.0_cls_infer/"

Overwriting input.txt


In [21]:
!bentoml run paddleOCRService:latest predict --input-file input.txt

[2021-04-23 23:02:10,853] INFO - Getting latest version paddleOCRService:20210423230154_54E746
"11.jpg" "./doc/imgs/11.jpg" "./inference/ch_ppocr_mobile_v2.0_det_infer/" "./inference/ch_ppocr_mobile_v2.0_rec_infer/" "./inference/ch_ppocr_mobile_v2.0_cls_infer/"
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_mobile_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True
E0423 23:02:14.118669  1311 analysis_config.cc:78] Please compile with gpu to EnableGpu()
E0423 23:02:14.285444  1311 analysis_config.cc:78] Please compile with gpu to EnableGpu()
E0423 23:02:14.402758  1311 analysis_config.cc:78] Please compile with gpu to EnableGpu()
[2021/04/23 23:02:15] root INFO: dt_boxes num : 16, elapse : 0.6765918731689453
[2021/04/23 23:02:15] root INFO: cls num  : 16, elapse : 0.13097333908081055
[2021

# **Deployment Options**

If you are on a small team with limited engineering or DevOps resources, try out automated deployment with the BentoML CLI, which currently supports AWS Lambda, AWS SageMaker, and Azure Functions:

* [AWS Lambda Deployment Guide](https://docs.bentoml.org/en/latest/deployment/aws_lambda.html)
* [AWS SageMaker Deployment Guide](https://docs.bentoml.org/en/latest/deployment/aws_sagemaker.html)
* [Azure Functions Deployment Guide](https://docs.bentoml.org/en/latest/deployment/azure_functions.html)

If the cloud platform you are working with is not on the list above, try these step-by-step guides on manually deploying a BentoML-packaged model to cloud platforms:

* [AWS ECS Deployment](https://docs.bentoml.org/en/latest/deployment/aws_ecs.html)
* [Google Cloud Run Deployment](https://docs.bentoml.org/en/latest/deployment/google_cloud_run.html)
* [Azure Container Instances Deployment](https://docs.bentoml.org/en/latest/deployment/azure_container_instance.html)
* [Heroku Deployment](https://docs.bentoml.org/en/latest/deployment/heroku.html)

Lastly, if you have a DevOps or ML Engineering team that operates a Kubernetes or OpenShift cluster, use the following guides as references for implementing your deployment strategy:

* [Kubernetes Deployment](https://docs.bentoml.org/en/latest/deployment/kubernetes.html)
* [Knative Deployment](https://docs.bentoml.org/en/latest/deployment/knative.html)
* [Kubeflow Deployment](https://docs.bentoml.org/en/latest/deployment/kubeflow.html)
* [KFServing Deployment](https://docs.bentoml.org/en/latest/deployment/kfserving.html)
* [Clipper.ai Deployment Guide](https://docs.bentoml.org/en/latest/deployment/clipper.html)