# Predict on an InferenceService with BentoML


This notebook shows how to deploy a BentoML-packaged model as a KFServing InferenceService and send prediction requests to it. [BentoML](https://bentoml.org) is an open-source platform for high-performance ML model serving that supports all major machine learning frameworks, including Keras, TensorFlow, PyTorch, Fast.ai, and XGBoost.


The notebook trains a classification model on the iris data set, packages it with BentoML, and then deploys it for inference to a cluster with KFServing installed.


### Setup

* Your ~/.kube/config should point to a cluster with KFServing installed.
* Your cluster's Istio Ingress gateway must be network accessible.
* Docker and Docker Hub must be properly configured.

## Train and save model

In [None]:
from sklearn import svm
from sklearn import datasets


# Load training data
iris = datasets.load_iris()
X, y = iris.data, iris.target

# Model Training
clf = svm.SVC(gamma='scale')
clf.fit(X, y)

**Define ML service with BentoML**


BentoML creates a model API server through its prediction service abstraction.

The following code is saved to a file named `iris_classifier.py`. It defines a prediction service that requires a scikit-learn model and asks BentoML to work out the required PyPI packages automatically. It also defines an API, which is the entry point for accessing this prediction service. The API expects a `pandas.DataFrame` object as its input data.

In [None]:
%%writefile iris_classifier.py

from bentoml import env, artifacts, api, BentoService
from bentoml.handlers import DataframeHandler
from bentoml.artifact import SklearnModelArtifact


@env(auto_pip_dependencies=True)
@artifacts([SklearnModelArtifact('model')])
class IrisClassifier(BentoService):

    @api(DataframeHandler)
    def predict(self, df):
        return self.artifacts.model.predict(df)

Save the trained model to local disk with the BentoML prediction service defined above:

In [None]:
from iris_classifier import IrisClassifier

# Create an iris classifier service instance
iris_classifier_service = IrisClassifier()

# Pack the newly trained model artifact
iris_classifier_service.pack('model', clf)

# Save the prediction service to disk for model serving
saved_path = iris_classifier_service.save()

#### Validate prediction result with sample data using BentoML CLI

In [None]:
!bentoml run IrisClassifier:latest predict --input '[[5.1, 3.5, 1.4, 0.2]]'

## Deploy custom InferenceService


BentoML's REST interface differs from the TensorFlow V1 HTTP API that KFServing expects, so requests are sent directly to the prediction service and bypass the top-level InferenceService.
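To make the difference concrete, the sketch below builds both request bodies for the same iris sample. The `{"instances": [...]}` envelope is the shape used by the TensorFlow V1 HTTP API; the bare JSON array is the shape this notebook posts to the BentoML service. The helper names `tensorflow_v1_body` and `bentoml_body` are illustrative only and are not part of either library.

```python
import json

# One iris sample: sepal length, sepal width, petal length, petal width.
rows = [[5.1, 3.5, 1.4, 0.2]]


def tensorflow_v1_body(rows):
    """Wrap the rows in the {"instances": [...]} envelope of the TensorFlow V1 HTTP API."""
    return json.dumps({"instances": rows})


def bentoml_body(rows):
    """Serialize the rows as a bare JSON array, the shape posted to BentoML in this notebook."""
    return json.dumps(rows)


print(tensorflow_v1_body(rows))
print(bentoml_body(rows))
```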

*Note: Support for the KFServing V2 prediction protocol with BentoML is coming soon.*

BentoML automatically generates a Dockerfile for the API server when it saves the model.

In [None]:
%%bash

# Replace DOCKER_USERNAME with your Docker Hub username
docker_username=DOCKER_USERNAME
model_path=$(bentoml get IrisClassifier:latest -q | jq -r ".uri.uri")

# Quote the variables so that paths containing spaces are handled correctly
docker build -t "$docker_username/iris-classifier" "$model_path"

docker push "$docker_username/iris-classifier"
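The `jq -r ".uri.uri"` filter in the cell above pulls one nested field out of the JSON printed by `bentoml get`. A rough Python equivalent is sketched below. The sample metadata is a hypothetical, heavily trimmed stand-in (real output contains many more fields, and the path shown is made up), and `saved_bundle_path` is an illustrative helper name.

```python
import json


def saved_bundle_path(metadata_json):
    """Return the value at .uri.uri, mirroring the jq filter in the shell cell."""
    metadata = json.loads(metadata_json)
    return metadata["uri"]["uri"]


# Hypothetical, trimmed metadata used only to exercise the helper.
sample = '{"name": "IrisClassifier", "uri": {"uri": "/tmp/bentoml/IrisClassifier"}}'
print(saved_bundle_path(sample))
```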

*Update the Docker username in the InferenceService configuration and apply it to the cluster.*

In [None]:
%%bash

# Replace DOCKER_USERNAME with your Docker Hub username
docker_username=DOCKER_USERNAME

sed -i 's/{docker_username}/'"$docker_username"'/g' custom.yaml

kubectl apply -f custom.yaml
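Because `sed -i` edits `custom.yaml` in place, the `{docker_username}` placeholder is gone after the first run, and re-running the cell with a different username has no effect. The sketch below shows the same substitution done on a string in Python, leaving the template untouched. The one-line template is a hypothetical fragment (the real `custom.yaml` holds a full InferenceService spec), and `render_manifest` and `example-user` are illustrative names.

```python
def render_manifest(template_text, docker_username):
    """Replace every {docker_username} placeholder, as the sed command does."""
    return template_text.replace("{docker_username}", docker_username)


# Hypothetical fragment standing in for the image line of custom.yaml.
template = "image: {docker_username}/iris-classifier\n"
rendered = render_manifest(template, "example-user")
print(rendered, end="")
```

The rendered text could then be written to a separate file and passed to `kubectl apply -f`, so the original template can be reused.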

## Run prediction

*Note: Use `kfserving-ingressgateway` as your `INGRESS_GATEWAY` if you deployed KFServing as part of a Kubeflow install rather than independently.*

In [None]:
%%bash

MODEL_NAME=iris-classifier
INGRESS_GATEWAY=istio-ingressgateway
CLUSTER_IP=$(kubectl -n istio-system get service $INGRESS_GATEWAY -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d "/" -f 3)

curl -v -H "Host: ${SERVICE_HOSTNAME}" \
  --header "Content-Type: application/json" \
  --request POST \
  --data '[[5.1, 3.5, 1.4, 0.2]]' \
  http://$CLUSTER_IP/model/predict
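For readers who prefer to call the service from Python, the following is a minimal sketch of the same request using only the standard library. It assumes the cluster IP and the InferenceService URL have already been obtained (for example with the `kubectl` commands above). The IP address and hostname below are placeholders, and `hostname_from_url` and `build_predict_request` are illustrative helper names.

```python
import json
import urllib.request
from urllib.parse import urlparse


def hostname_from_url(status_url):
    """Extract the host part of a URL, like `cut -d "/" -f 3` in the shell cell."""
    return urlparse(status_url).netloc


def build_predict_request(cluster_ip, service_hostname, rows):
    """Build (but do not send) the POST request that the curl command issues."""
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{cluster_ip}/model/predict",
        data=body,
        headers={
            "Host": service_hostname,  # routes the call through the ingress gateway
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; substitute the ones reported by your own cluster.
service_hostname = hostname_from_url("http://iris-classifier.default.example.com")
request = build_predict_request("203.0.113.10", service_hostname, [[5.1, 3.5, 1.4, 0.2]])
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it. The sketch stops short of that so it can be run without a cluster.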

## Delete deployment

In [None]:
!kubectl delete -f custom.yaml