# **BentoML Example: Image Segmentation with PaddleHub**
**BentoML makes moving trained ML models to production easy:**

* Package models trained with any ML framework and reproduce them for model serving in production
* **Deploy anywhere** for online API serving or offline batch serving
* High-Performance API model server with adaptive micro-batching support
* Central hub for managing models and deployment process via Web UI and APIs
* Modular and flexible design making it adaptable to your infrastructure

BentoML is a framework for serving, managing, and deploying machine learning models. It aims to bridge the gap between Data Science and DevOps, enabling teams to deliver prediction services in a fast, repeatable, and scalable way.

Before reading this example project, be sure to check out the [Getting started guide](https://github.com/bentoml/BentoML/blob/master/guides/quick-start/bentoml-quick-start-guide.ipynb) to learn about the basic concepts in BentoML.

This notebook demonstrates how to use BentoML to turn a PaddleHub module into a Docker image containing a REST API server that serves the model, how to use your ML service built with BentoML as a CLI tool, and how to distribute it as a PyPI package.

This example notebook is based on the [Python quick guide from PaddleHub](https://github.com/PaddlePaddle/PaddleHub/blob/release/v2.0/docs/docs_en/quick_experience/python_use_hub_en.md).


In [None]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

In [None]:
!pip3 install -q bentoml paddlepaddle paddlehub

In [None]:
!hub install deeplabv3p_xception65_humanseg

## Prepare Input Data

In [None]:
!wget https://paddlehub.bj.bcebos.com/resources/test_image.jpg

## Create BentoService with PaddleHub Module Instantiation

In [None]:
%%writefile paddlehub_service.py
import bentoml
import imageio  # ImageInput uses imageio to decode uploaded images
import paddlehub as hub
from bentoml import api, env
from bentoml.adapters import ImageInput


@env(infer_pip_packages=True)
class PaddleHubService(bentoml.BentoService):
    def __init__(self):
        super().__init__()
        # Load the pretrained human segmentation module once per service instance
        self.module = hub.Module(name="deeplabv3p_xception65_humanseg")

    @api(input=ImageInput(), batch=True)
    def predict(self, images):
        # With batch=True, `images` is a list of arrays; return one mask per image
        results = self.module.segmentation(images=images, visualization=True)
        return [result['data'] for result in results]
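
Because the API is declared with `batch=True`, `predict` takes a list of images and should return a list of the same length. The sketch below checks that contract without loading PaddleHub. `StubSegmentationModule` and `predict_batch` are hypothetical names introduced for illustration only; the stub merely imitates the call shape of `segmentation(images=..., visualization=...)` used above.

```python
class StubSegmentationModule:
    """Hypothetical stand-in for hub.Module, used only to exercise the batch contract."""

    def segmentation(self, images, visualization=False):
        # One result dict per input image; "data" is an all-zero mask with the
        # same height and width as the input (nested lists instead of arrays).
        return [
            {"data": [[0 for _ in row] for row in image]}
            for image in images
        ]


def predict_batch(module, images):
    """Mirrors the body of PaddleHubService.predict, with the module passed in."""
    results = module.segmentation(images=images, visualization=True)
    return [result["data"] for result in results]
```

Calling `predict_batch(StubSegmentationModule(), images)` with a list of two images returns a list of two masks, one per input and in the same order.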


In [None]:
# Import the custom BentoService defined above
from paddlehub_service import PaddleHubService
import numpy as np
import cv2

# Instantiate the service; the PaddleHub module is loaded in __init__
bento_svc = PaddleHubService()

In [None]:
# Predict with the initialized module
image = cv2.imread("test_image.jpg")
images = [image]
segmentation_results = bento_svc.predict(images)

### Visualizing the result

In [None]:
# View the segmentation mask layer
from matplotlib import pyplot as plt

for result in segmentation_results:
    # The mask is treated as single-channel below, so display it as grayscale
    plt.imshow(result, cmap='gray')
    plt.axis('off')
    plt.show()

In [None]:
# Get the segmented image of the original image
for result, original in zip(segmentation_results, images):
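    # Expand the single-channel mask to three channels
    result = cv2.cvtColor(result, cv2.COLOR_GRAY2RGB)
    # cv2.imread returns BGR, so add the alpha channel as BGRA
    original_mod = cv2.cvtColor(original, cv2.COLOR_BGR2BGRA)
    # Scale the mask to [0, 1] and reuse one channel as the alpha weight
    mask = result / 255
    *_, alpha = cv2.split(mask)
    mask = cv2.merge((mask, alpha))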
    segmented_image = (original_mod * mask).clip(0, 255).astype(np.uint8)
    
    plt.imshow(cv2.cvtColor(segmented_image, cv2.COLOR_BGRA2RGBA))
    plt.axis('off')
    plt.show()
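
The masking step above can also be written with NumPy alone, which makes the arithmetic easier to follow. This is a minimal sketch, assuming the mask is a single-channel array with values in [0, 255] and the image is a three-channel `uint8` array; `apply_mask` is a hypothetical helper name, not part of PaddleHub or BentoML.

```python
import numpy as np


def apply_mask(image_bgr, mask):
    """Return a four-channel image whose color and alpha are scaled by the mask.

    image_bgr: uint8 array of shape (H, W, 3).
    mask: array of shape (H, W) with values in [0, 255] (assumed range).
    """
    # Per-pixel weight in [0, 1], shaped (H, W, 1) so it broadcasts over channels
    weight = np.clip(mask.astype(np.float32) / 255.0, 0.0, 1.0)[..., np.newaxis]
    color = image_bgr.astype(np.float32) * weight  # darken background pixels
    alpha = 255.0 * weight                         # background becomes transparent
    return np.concatenate([color, alpha], axis=2).astype(np.uint8)
```

Pixels where the mask is 0 come out fully transparent and black, while pixels where the mask is 255 keep their original color with full opacity.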

### Start dev server for testing

In [None]:
# Start a dev model server
bento_svc.start_dev_server()

In [None]:
!curl -i \
  -F image=@test_image.jpg \
  localhost:5000/predict
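
The same multipart upload can be sketched in Python with only the standard library, which may help when calling the dev server from a script rather than a shell. `build_multipart_request` is a hypothetical helper name; the form field `image`, the file name, and the URL are taken from the curl command above.

```python
import uuid
import urllib.request


def build_multipart_request(url, field_name, filename, payload,
                            content_type="image/jpeg"):
    """Build (but do not send) a multipart/form-data POST carrying one file."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        (
            f'Content-Disposition: form-data; name="{field_name}"; '
            f'filename="{filename}"\r\n'
        ).encode(),
        f"Content-Type: {content_type}\r\n\r\n".encode(),
        payload,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

To send it, read `test_image.jpg` in binary mode, pass the bytes as `payload` with the URL `http://localhost:5000/predict`, and hand the returned request to `urllib.request.urlopen` while the dev server is running.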

In [None]:
# Stop the dev model server
bento_svc.stop_dev_server()

### Save the BentoService for deployment

In [None]:
saved_path = bento_svc.save()

## REST API Model Serving

In [None]:
!bentoml serve PaddleHubService:latest

If you are running this notebook from Google Colab, you can start the dev server with the `--run-with-ngrok` option to access the API through a public endpoint managed by ngrok:

In [None]:
!bentoml serve PaddleHubService:latest --run-with-ngrok

## Make a request to the REST server

*After navigating to the location of this notebook, copy and paste the following code into your terminal and run it to make a request*

In [None]:
curl -i \
  --header "Content-Type: image/jpeg" \
  --request POST \
  --data-binary @test_image.jpg \
  localhost:5000/predict
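
For callers that prefer Python over curl, the raw-body request above can be sketched with `urllib.request`. The helper names `build_raw_image_request` and `send` are hypothetical; the URL and the `image/jpeg` content type come from the curl command.

```python
import urllib.request


def build_raw_image_request(url, image_bytes, content_type="image/jpeg"):
    """Build a POST whose body is the image bytes, like curl --data-binary."""
    return urllib.request.Request(
        url,
        data=image_bytes,
        headers={"Content-Type": content_type},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return (HTTP status, response body as bytes)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read()
```

With the server running, read `test_image.jpg` in binary mode, build the request for `http://localhost:5000/predict`, and pass it to `send`.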

## Launch inference job from CLI

In [None]:
!bentoml run PaddleHubService:latest predict --input-file test_image.jpg

## Containerize model server with Docker

One common way of distributing this model API server for production deployment is via Docker containers, and BentoML provides a convenient way to do that.

Note that Docker is **not available in Google Colab**. You will need to download and run this notebook locally to try out containerization with Docker.

If you already have Docker configured, simply run the following command to produce a Docker image serving the PaddleHub prediction service created above:

In [None]:
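# Docker image names must be lowercase, so pass an explicit tag with -t
!bentoml containerize PaddleHubService:latest -t paddlehub-service:latest

In [None]:
!docker run --rm -p 5000:5000 paddlehub-service:latest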

# **Deployment Options**

If you are on a small team with limited engineering or DevOps resources, try out automated deployment with the BentoML CLI, which currently supports AWS Lambda, AWS SageMaker, and Azure Functions:

* [AWS Lambda Deployment Guide](https://docs.bentoml.org/en/latest/deployment/aws_lambda.html)
* [AWS SageMaker Deployment Guide](https://docs.bentoml.org/en/latest/deployment/aws_sagemaker.html)
* [Azure Functions Deployment Guide](https://docs.bentoml.org/en/latest/deployment/azure_functions.html)

If the cloud platform you are working with is not on the list above, try out these step-by-step guides on manually deploying a BentoML-packaged model to cloud platforms:

* [AWS ECS Deployment](https://docs.bentoml.org/en/latest/deployment/aws_ecs.html)
* [Google Cloud Run Deployment](https://docs.bentoml.org/en/latest/deployment/google_cloud_run.html)
* [Azure Container Instances Deployment](https://docs.bentoml.org/en/latest/deployment/azure_container_instance.html)
* [Heroku Deployment](https://docs.bentoml.org/en/latest/deployment/heroku.html)

Lastly, if you have a DevOps or ML Engineering team that operates a Kubernetes or OpenShift cluster, use the following guides as references for implementing your deployment strategy:

* [Kubernetes Deployment](https://docs.bentoml.org/en/latest/deployment/kubernetes.html)
* [Knative Deployment](https://docs.bentoml.org/en/latest/deployment/knative.html)
* [Kubeflow Deployment](https://docs.bentoml.org/en/latest/deployment/kubeflow.html)
* [KFServing Deployment](https://docs.bentoml.org/en/latest/deployment/kfserving.html)
* [Clipper.ai Deployment Guide](https://docs.bentoml.org/en/latest/deployment/clipper.html)