# Sample for KFServing SDK 

This is a sample for the KFServing SDK.

The notebook shows how to use the KFServing SDK to create, get, rollout_canary, promote, and delete an InferenceService.

In [None]:
!pip install kfserving==0.3.0.1 --user

In [None]:
# Restart the kernel to pick up pip installed libraries
from IPython.core.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")

In [None]:
from kubernetes import client

from kfserving import KFServingClient
from kfserving import constants
from kfserving import utils
from kfserving import V1alpha2EndpointSpec
from kfserving import V1alpha2PredictorSpec
from kfserving import V1alpha2TensorflowSpec
from kfserving import V1alpha2InferenceServiceSpec
from kfserving import V1alpha2InferenceService
from kubernetes.client import V1ResourceRequirements

Define the namespace where the InferenceService will be deployed. When the SDK runs inside the cluster, the function below returns the namespace it is currently running in; otherwise it falls back to the `default` namespace.

In [None]:
namespace = utils.get_default_target_namespace()
print(namespace)
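
The namespace lookup described above can be pictured roughly as follows. This is a minimal sketch, not the SDK's source: it assumes the standard Kubernetes service-account mount path, and the helper name `resolve_target_namespace` and its `sa_dir` parameter are hypothetical.

```python
import os

# Standard location where Kubernetes mounts service-account data inside a pod.
SERVICE_ACCOUNT_DIR = '/var/run/secrets/kubernetes.io/serviceaccount'


def resolve_target_namespace(sa_dir=SERVICE_ACCOUNT_DIR):
    """Return the pod's own namespace, or 'default' when not running in a pod.

    Hypothetical helper for illustration only.
    """
    ns_file = os.path.join(sa_dir, 'namespace')
    if not os.path.isfile(ns_file):
        # No service-account mount: assume we are outside the cluster.
        return 'default'
    with open(ns_file) as f:
        return f.read().strip()


print(resolve_target_namespace())
```

Run on a laptop, this falls back to `default`; run in a notebook pod, it would report the namespace of that pod.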

## Define InferenceService
![Flowers](img/iris_three_species.jpg)
### The Iris classification problem
Imagine you are a botanist seeking an automated way to categorize each Iris flower you find. Machine learning provides many algorithms to classify flowers statistically. For instance, a sophisticated machine learning program could classify flowers based on photographs. Our ambitions are more modest—we're going to classify Iris flowers based on the length and width measurements of their sepals and petals.
The Iris genus comprises about 300 species, but our program will classify only the following three:
* Iris setosa
* Iris virginica
* Iris versicolor


First define the default endpoint spec, then define the InferenceService based on that endpoint spec.

In [None]:
api_version = constants.KFSERVING_GROUP + '/' + constants.KFSERVING_VERSION
default_endpoint_spec = V1alpha2EndpointSpec(
                          predictor=V1alpha2PredictorSpec(
                            min_replicas=1,
                            tensorflow=V1alpha2TensorflowSpec(
                              storage_uri='gs://kfserving-samples/models/tensorflow/flowers',
                              resources=V1ResourceRequirements(
                                  requests={'cpu':'100m','memory':'0.5Gi'},
                                  limits={'cpu':'100m', 'memory':'0.5Gi'}))))
    
isvc = V1alpha2InferenceService(api_version=api_version,
                          kind=constants.KFSERVING_KIND,
                          metadata=client.V1ObjectMeta(
                              name='flowers-sample', namespace=namespace),
                          spec=V1alpha2InferenceServiceSpec(default=default_endpoint_spec))
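
For orientation, the object built above corresponds roughly to the following Kubernetes manifest, written here as a plain dictionary. This is a sketch under assumptions: the `apiVersion` string and the camelCase field names (`minReplicas`, `storageUri`) are taken to match the v1alpha2 resource, and `NAMESPACE` is a placeholder for the value printed earlier.

```python
import json

# Assumed values; in the notebook these come from kfserving.constants and
# utils.get_default_target_namespace().
API_VERSION = 'serving.kubeflow.org/v1alpha2'
NAMESPACE = 'default'

resources = {
    'requests': {'cpu': '100m', 'memory': '0.5Gi'},
    'limits': {'cpu': '100m', 'memory': '0.5Gi'},
}

manifest = {
    'apiVersion': API_VERSION,
    'kind': 'InferenceService',
    'metadata': {'name': 'flowers-sample', 'namespace': NAMESPACE},
    'spec': {
        # The "default" endpoint receives all traffic until a canary is added.
        'default': {
            'predictor': {
                'minReplicas': 1,
                'tensorflow': {
                    'storageUri': 'gs://kfserving-samples/models/tensorflow/flowers',
                    'resources': resources,
                },
            },
        },
    },
}

print(json.dumps(manifest, indent=2))
```

Seeing the nesting as data can make it easier to relate the SDK classes to the output of `kubectl get inferenceservice -o yaml`.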

## Create InferenceService to Receive 100% Traffic

Call KFServingClient to create InferenceService.

In [None]:
kf_serving = KFServingClient()
kf_serving.create(isvc)

### Check the Status of InferenceService

In [None]:
!kubectl get inferenceservices -n $namespace

In [None]:
!kubectl get pod -n $namespace

In [None]:
kf_serving.get('flowers-sample', 
               namespace=namespace, 
               watch=True, 
               timeout_seconds=120)
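
When not watching, the status can also be checked programmatically. The sketch below assumes that `kf_serving.get(...)` without `watch=True` returns the resource as a dictionary and that readiness is reported through a Kubernetes-style `status.conditions` list; the helper `is_ready` is hypothetical and the sample data is hand-written.

```python
def is_ready(isvc):
    """Return True if the resource dict reports a Ready condition with status 'True'."""
    conditions = isvc.get('status', {}).get('conditions', [])
    return any(c.get('type') == 'Ready' and c.get('status') == 'True'
               for c in conditions)


# Hand-written example data, not captured from a real cluster.
example = {'status': {'conditions': [{'type': 'Ready', 'status': 'True'}]}}
print(is_ready(example))
```

In the notebook this could be called as `is_ready(kf_serving.get('flowers-sample', namespace=namespace))`, provided the assumptions above hold.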

## Run a Prediction
*Please be patient, this might take a few minutes.*

In [None]:
%%bash
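MODEL_NAME=flowers-sample
INPUT_PATH=@./input.json
INGRESS_GATEWAY=istio-ingressgateway
# Namespace of the InferenceService; change it to match the namespace printed earlier.
NAMESPACE=anonymous

# IP of the node that hosts the Istio ingress gateway pod.
ISTIO_HOST_IP=$(kubectl get po -l istio=ingressgateway -n istio-system -o jsonpath='{.items[0].status.hostIP}')
echo "$ISTIO_HOST_IP"

# NodePort that exposes the gateway's http2 port.
ISTIO_NODE_PORT=$(kubectl -n istio-system get service "${INGRESS_GATEWAY}" -o jsonpath="{.spec.ports[?(@.name=='http2')].nodePort}")
echo "$ISTIO_NODE_PORT"

# Host part of the InferenceService URL, sent as the Host header so the gateway can route the request.
SERVICE_HOSTNAME=$(kubectl -n "${NAMESPACE}" get inferenceservice "${MODEL_NAME}" -o jsonpath='{.status.url}' | cut -d "/" -f 3)

curl -v -H "Host: ${SERVICE_HOSTNAME}" "http://${ISTIO_HOST_IP}:${ISTIO_NODE_PORT}/v1/models/${MODEL_NAME}:predict" -d "$INPUT_PATH"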
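
The same request can be assembled in Python. The following is a sketch using only the standard library; the helper `build_predict_request`, the host, IP, port, and payload values are all placeholders rather than values from a real cluster, and the JSON content type is a choice made here. It builds the request without sending it.

```python
import json
import urllib.request
from urllib.parse import urlparse


def build_predict_request(status_url, host_ip, node_port, model_name, payload):
    """Build (but do not send) a predict request routed through the ingress gateway."""
    # Equivalent of `cut -d "/" -f 3`: keep only the host part of status.url.
    service_hostname = urlparse(status_url).netloc
    url = 'http://{}:{}/v1/models/{}:predict'.format(host_ip, node_port, model_name)
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode('utf-8'),
        # The Host header tells the gateway which InferenceService is addressed.
        headers={'Host': service_hostname, 'Content-Type': 'application/json'},
        method='POST',
    )


# Placeholder values for illustration only.
req = build_predict_request(
    status_url='http://flowers-sample.anonymous.example.com/v1/models/flowers-sample',
    host_ip='10.0.0.1',
    node_port=31380,
    model_name='flowers-sample',
    payload={'instances': []},
)
print(req.full_url, req.get_header('Host'))
# To send it: urllib.request.urlopen(req).read()
```

In practice the payload would be the parsed contents of `input.json`, and the IP and port would come from the `kubectl` lookups in the cell above.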

## Add a New InferenceService (with a New Model) to Receive 10% Traffic
First define the canary endpoint spec, then roll out 10% of the traffic to the canary version and watch the rollout progress.

In [None]:
canary_endpoint_spec = V1alpha2EndpointSpec(
                         predictor=V1alpha2PredictorSpec(
                           min_replicas=1,
                           tensorflow=V1alpha2TensorflowSpec(
                             storage_uri='gs://kfserving-samples/models/tensorflow/flowers-2',
                             resources=V1ResourceRequirements(
                                 requests={'cpu':'100m','memory':'0.5Gi'},
                                 limits={'cpu':'100m', 'memory':'0.5Gi'}))))

In [None]:
kf_serving.rollout_canary('flowers-sample', 
                          canary=canary_endpoint_spec, 
                          percent=10,
                          namespace=namespace, 
                          watch=True, 
                          timeout_seconds=120)
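
Conceptually, a canary rollout adds a second endpoint and a traffic percentage to the existing resource. The sketch below models that change on a plain dictionary; it assumes the v1alpha2 field names `canary` and `canaryTrafficPercent`, and the helper `with_canary` is hypothetical rather than part of the SDK.

```python
import copy


def with_canary(manifest, canary_spec, percent):
    """Return a copy of the manifest with a canary endpoint and traffic share set."""
    if not 0 <= percent <= 100:
        raise ValueError('percent must be between 0 and 100')
    updated = copy.deepcopy(manifest)
    updated['spec']['canary'] = canary_spec
    updated['spec']['canaryTrafficPercent'] = percent
    return updated


base = {'spec': {'default': {'predictor': {'tensorflow': {
    'storageUri': 'gs://kfserving-samples/models/tensorflow/flowers'}}}}}
canary = {'predictor': {'tensorflow': {
    'storageUri': 'gs://kfserving-samples/models/tensorflow/flowers-2'}}}

rolled = with_canary(base, canary, 10)
print(rolled['spec']['canaryTrafficPercent'])
```

The default endpoint is left untouched, so the remaining 90% of requests continue to reach the original model.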

In [None]:
!kubectl get inferenceservices -n $namespace

In [None]:
%%bash
MODEL_NAME=flowers-sample
INPUT_PATH=@./input.json
INGRESS_GATEWAY=istio-ingressgateway
# Namespace of the InferenceService; change it to match the namespace printed earlier.
NAMESPACE=anonymous

ISTIO_HOST_IP=$(kubectl get po -l istio=ingressgateway -n istio-system -o jsonpath='{.items[0].status.hostIP}')
echo "$ISTIO_HOST_IP"

ISTIO_NODE_PORT=$(kubectl -n istio-system get service "${INGRESS_GATEWAY}" -o jsonpath="{.spec.ports[?(@.name=='http2')].nodePort}")
echo "$ISTIO_NODE_PORT"

SERVICE_HOSTNAME=$(kubectl -n "${NAMESPACE}" get inferenceservice "${MODEL_NAME}" -o jsonpath='{.status.url}' | cut -d "/" -f 3)

curl -v -H "Host: ${SERVICE_HOSTNAME}" "http://${ISTIO_HOST_IP}:${ISTIO_NODE_PORT}/v1/models/${MODEL_NAME}:predict" -d "$INPUT_PATH"

## Send More Traffic to the New InferenceService
Send 50% of the traffic to the new model.

In [None]:
kf_serving.rollout_canary('flowers-sample',
                         percent=50, 
                         namespace=namespace,
                         watch=True, 
                         timeout_seconds=120)

In [None]:
!kubectl get inferenceservices -n $namespace

In [None]:
%%bash
MODEL_NAME=flowers-sample
INPUT_PATH=@./input.json
INGRESS_GATEWAY=istio-ingressgateway
# Namespace of the InferenceService; change it to match the namespace printed earlier.
NAMESPACE=anonymous

ISTIO_HOST_IP=$(kubectl get po -l istio=ingressgateway -n istio-system -o jsonpath='{.items[0].status.hostIP}')
echo "$ISTIO_HOST_IP"

ISTIO_NODE_PORT=$(kubectl -n istio-system get service "${INGRESS_GATEWAY}" -o jsonpath="{.spec.ports[?(@.name=='http2')].nodePort}")
echo "$ISTIO_NODE_PORT"

SERVICE_HOSTNAME=$(kubectl -n "${NAMESPACE}" get inferenceservice "${MODEL_NAME}" -o jsonpath='{.status.url}' | cut -d "/" -f 3)

curl -v -H "Host: ${SERVICE_HOSTNAME}" "http://${ISTIO_HOST_IP}:${ISTIO_NODE_PORT}/v1/models/${MODEL_NAME}:predict" -d "$INPUT_PATH"
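
Once the canary has proven itself, the natural next step is promotion, which the introduction lists among the SDK operations. The sketch below only models the idea on a plain dictionary: the canary endpoint becomes the default and the traffic split is dropped. It assumes the v1alpha2 field names `canary` and `canaryTrafficPercent`; the helper `promote_canary` is hypothetical and does not reproduce the SDK's implementation.

```python
import copy


def promote_canary(manifest):
    """Return a copy in which the canary endpoint replaces the default endpoint."""
    if 'canary' not in manifest['spec']:
        raise ValueError('nothing to promote: spec has no canary endpoint')
    promoted = copy.deepcopy(manifest)
    promoted['spec']['default'] = promoted['spec'].pop('canary')
    # With a single endpoint left, the traffic split no longer applies.
    promoted['spec'].pop('canaryTrafficPercent', None)
    return promoted


before = {'spec': {
    'default': {'predictor': {'tensorflow': {
        'storageUri': 'gs://kfserving-samples/models/tensorflow/flowers'}}},
    'canary': {'predictor': {'tensorflow': {
        'storageUri': 'gs://kfserving-samples/models/tensorflow/flowers-2'}}},
    'canaryTrafficPercent': 50,
}}
after = promote_canary(before)
print(after['spec']['default']['predictor']['tensorflow']['storageUri'])
```

After promotion, all traffic would reach the model that was previously served by the canary endpoint.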

## Delete the InferenceService

In [None]:
#kf_serving.delete('flowers-sample', 
#                  namespace=namespace)