# About this Jupyter Notebook

@author: Yingding Wang\
@last_update: 19 April 2023

This notebook shows how to deploy a model with KServe on an on-prem Kubeflow cluster and how to call the inference backend via Istio.

* KServe tutorial (deprecated): https://www.kubeflow.org/docs/external-add-ons/kserve/first_isvc_kserve/
* KServe 0.7.0 Sklearn v2: https://kserve.github.io/website/modelserving/v1beta1/sklearn/v2/
* KServe 0.7 and 0.9 Schemas: https://kserve.github.io/website/master/get_started/first_isvc/#2-create-an-inferenceservice

Notice:\
Each KServe release (e.g. 0.7.0) comes with its own `sklearnserver`, which is preloaded with specific versions of `joblib` and `scikit-learn`. For KServe to serve a model with `sklearnserver`, the model must have been built and serialized with the same `joblib` and `scikit-learn` versions that the `sklearnserver` uses. The setup of each `sklearnserver` release can be found at:
* https://github.com/kserve/kserve/blob/release-0.7/python/sklearnserver/setup.py
* https://github.com/kserve/kserve/blob/release-0.8/python/sklearnserver/setup.py
* https://github.com/kserve/kserve/blob/release-0.9/python/sklearnserver/setup.py
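
One way to catch such a version mismatch before uploading a model is to compare the locally installed packages with the versions pinned by the target `sklearnserver`. The following is only a minimal sketch: the helper names `installed_version` and `find_version_mismatches` are illustrative, and the expected versions have to be copied by hand from the matching `setup.py` (the value `1.0.1` for `scikit-learn` is the one noted in this notebook for release 0.8).

```python
from importlib import metadata
from typing import Callable, Dict, Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_version_mismatches(
    expected: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Optional[str]]:
    """Map every package whose installed version differs from the expected
    one to the version actually found (None if it is not installed)."""
    mismatches = {}
    for package, pinned in expected.items():
        found = lookup(package)
        if found != pinned:
            mismatches[package] = found
    return mismatches


# Assumption: versions copied manually from the sklearnserver setup.py
expected_versions = {"scikit-learn": "1.0.1"}
print(find_version_mismatches(expected_versions))
```

An empty dictionary means the local environment matches the listed versions; any entry names a package that should be reinstalled before the model is serialized.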


Additional Resources:
* Working example for the KServe 0.7.0 backend from KF 1.5.1: https://kserve.github.io/website/master/get_started/first_isvc/#4-determine-the-ingress-ip-and-ports
* Connect with istio-dex (?): https://github.com/KServe/KServe/tree/master/docs/samples/istio-dex

KF 1.6.1 ships KServe 0.8.0
* sklearnserver with scikit-learn 1.0.1: https://github.com/kserve/kserve/blob/release-0.8/python/sklearnserver/setup.py

Issues:
* The dashboard of KF 1.6.1 shows the wrong model endpoint: https://github.com/kubeflow/manifests/issues/2180#issuecomment-1082045570

Use `https://<kubeflow_domain>/_/kserve-endpoints/?ns=<namespace>` to see the endpoints or models
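
The URL pattern above can be filled in programmatically. This is a small sketch; the function name `kserve_endpoints_url` and the domain `kubeflow.example.com` are hypothetical placeholders, and the namespace is the one that appears in the outputs of this notebook.

```python
from urllib.parse import urlencode


def kserve_endpoints_url(kubeflow_domain: str, namespace: str) -> str:
    """Build the URL of the KServe endpoints page for one namespace."""
    query = urlencode({"ns": namespace})
    return f"https://{kubeflow_domain}/_/kserve-endpoints/?{query}"


# Hypothetical domain; replace it with the domain of your Kubeflow installation.
print(kserve_endpoints_url("kubeflow.example.com", "kubeflow-kindfor"))
```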

In [1]:
import sys

In [2]:
# (optional) remove kfserving and any existing kserve/kfp before installing kserve
!{sys.executable} -m pip uninstall -y kfserving kserve kfp

Found existing installation: kfp 1.6.3
Uninstalling kfp-1.6.3:
  Successfully uninstalled kfp-1.6.3


In [3]:
# install pinned versions of the kfp and kserve SDKs
!{sys.executable} -m pip install --upgrade --user kfp==1.8.19 kserve==0.9.0

Collecting kfp==1.8.19
  Using cached kfp-1.8.19-py3-none-any.whl
Collecting kserve==0.9.0
  Using cached kserve-0.9.0-py3-none-any.whl (304 kB)
Collecting argparse>=1.4.0
  Using cached argparse-1.4.0-py2.py3-none-any.whl (23 kB)
Collecting ray[serve]==1.10.0
  Using cached ray-1.10.0-cp38-cp38-manylinux2014_x86_64.whl (59.3 MB)
Installing collected packages: ray, argparse, kserve, kfp
  Attempting uninstall: ray
    Found existing installation: ray 2.0.0
    Uninstalling ray-2.0.0:
      Successfully uninstalled ray-2.0.0
Successfully installed argparse-1.4.0 kfp-1.8.19 kserve-0.9.0 ray-1.10.0


In [4]:
from kubernetes import client 
from kserve import KServeClient
from kserve import constants
from kserve import utils
from kserve import V1beta1InferenceService
from kserve import V1beta1InferenceServiceSpec
from kserve import V1beta1PredictorSpec
from kserve import V1beta1SKLearnSpec

In [5]:
namespace = utils.get_default_target_namespace()
name='sklearn-iris'
kserve_version='v1beta1'
api_version = constants.KSERVE_GROUP + '/' + kserve_version
model_storage_uri = "gs://kfserving-examples/models/sklearn/1.0/model" # built with scikit-learn 1.0.1
# model_storage_uri = "gs://seldon-models/sklearn/iris" # built with scikit-learn 0.23.1
model_protocol_version = "v2"

In [6]:
# This cell still uses the old schema (framework-specific `sklearn` predictor) from KServe 0.7.0,
# taken from the KServe 0.7.0 example: https://kserve.github.io/website/modelserving/v1beta1/sklearn/v2/
# For the new schema used with KServe 0.9.0 see: https://kserve.github.io/website/master/get_started/first_isvc/
isvc = V1beta1InferenceService(
    api_version=api_version,
    kind=constants.KSERVE_KIND,
    metadata=client.V1ObjectMeta(
        name=name,
        namespace=namespace,
        annotations={'sidecar.istio.io/inject': 'false'},
    ),
    spec=V1beta1InferenceServiceSpec(
        predictor=V1beta1PredictorSpec(
            sklearn=V1beta1SKLearnSpec(
                protocol_version=model_protocol_version,
                storage_uri=model_storage_uri,
            )
        )
    ),
)
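
For comparison, the new schema replaces the framework-specific `sklearn` field with a generic `model` field that carries a `modelFormat`. The sketch below builds such a manifest as a plain Python dictionary, so it does not depend on the `kserve` SDK; the function name `build_isvc_manifest` is illustrative, and the field layout follows the linked first-InferenceService guide as far as I can tell, so treat it as an assumption to verify against your KServe version.

```python
import json
from typing import Any, Dict


def build_isvc_manifest(
    name: str,
    namespace: str,
    storage_uri: str,
    protocol_version: str = "v2",
    model_format: str = "sklearn",
) -> Dict[str, Any]:
    """Return an InferenceService manifest that uses the newer `model` schema."""
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {"sidecar.istio.io/inject": "false"},
        },
        "spec": {
            "predictor": {
                "model": {
                    # generic model spec instead of the old `sklearn` field
                    "modelFormat": {"name": model_format},
                    "protocolVersion": protocol_version,
                    "storageUri": storage_uri,
                }
            }
        },
    }


manifest = build_isvc_manifest(
    name="sklearn-iris",
    namespace="kubeflow-kindfor",
    storage_uri="gs://kfserving-examples/models/sklearn/1.0/model",
)
print(json.dumps(manifest, indent=2))
```

Because `kubectl apply -f` accepts JSON as well as YAML, the printed manifest can be saved to a file and applied directly.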

In [7]:
KServe = KServeClient()

In [10]:
# (optional) delete a previously deployed model with the same name
# KServe.delete(name, namespace=namespace)

{'apiVersion': 'serving.kserve.io/v1beta1',
 'kind': 'InferenceService',
 'metadata': {'annotations': {'sidecar.istio.io/inject': 'false'},
  'creationTimestamp': '2023-04-19T20:03:51Z',
  'deletionGracePeriodSeconds': 0,
  'deletionTimestamp': '2023-04-19T20:12:04Z',
  'finalizers': ['inferenceservice.finalizers'],
  'generation': 3,
  'labels': {'modelClass': 'mlserver_sklearn.SKLearnModel'},
  'managedFields': [{'apiVersion': 'serving.kserve.io/v1beta1',
    'fieldsType': 'FieldsV1',
    'fieldsV1': {'f:metadata': {'f:annotations': {'.': {},
       'f:sidecar.istio.io/inject': {}}},
     'f:spec': {'.': {},
      'f:predictor': {'.': {},
       'f:sklearn': {'.': {},
        'f:name': {},
        'f:protocolVersion': {},
        'f:storageUri': {}}}}},
    'manager': 'OpenAPI-Generator',
    'operation': 'Update',
    'time': '2023-04-19T20:03:48Z'},
   {'apiVersion': 'serving.kserve.io/v1beta1',
    'fieldsType': 'FieldsV1',
    'fieldsV1': {'f:metadata': {'f:finalizers': {'.': {},

In [11]:
KServe.create(isvc)

{'apiVersion': 'serving.kserve.io/v1beta1',
 'kind': 'InferenceService',
 'metadata': {'annotations': {'sidecar.istio.io/inject': 'false'},
  'creationTimestamp': '2023-04-19T20:12:26Z',
  'generation': 1,
  'labels': {'modelClass': 'mlserver_sklearn.SKLearnModel'},
  'managedFields': [{'apiVersion': 'serving.kserve.io/v1beta1',
    'fieldsType': 'FieldsV1',
    'fieldsV1': {'f:metadata': {'f:annotations': {'.': {},
       'f:sidecar.istio.io/inject': {}}},
     'f:spec': {'.': {},
      'f:predictor': {'.': {},
       'f:sklearn': {'.': {},
        'f:name': {},
        'f:protocolVersion': {},
        'f:storageUri': {}}}}},
    'manager': 'OpenAPI-Generator',
    'operation': 'Update',
    'time': '2023-04-19T20:12:22Z'}],
  'name': 'sklearn-iris',
  'namespace': 'kubeflow-kindfor',
  'resourceVersion': '1297956997',
  'uid': 'e2430a61-941e-4a4c-8d4f-aced3844fdc2'},
 'spec': {'predictor': {'model': {'env': [{'name': 'MLSERVER_MODEL_NAME',
      'value': 'sklearn-iris'},
     {'name'

In [24]:
# watch the InferenceService status for up to 120 seconds
KServe.get(name, namespace=namespace, watch=True, timeout_seconds=120)

NAME                 READY                           PREV                    LATEST URL                                                              
sklearn-iris         True                               0                       100 http://sklearn-iris.kubeflow-kindfor.example.com                 
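
The watch call prints a table but is awkward to use in scripts. A programmatic readiness check can instead inspect the `Ready` condition in the status of the returned object. The sketch below is an assumption-laden illustration: `is_ready` and `wait_until_ready` are hypothetical helpers, and `fetch` stands for any callable that returns the InferenceService as a dictionary, for example `lambda: KServe.get(name, namespace=namespace)`.

```python
import time
from typing import Any, Callable, Dict


def is_ready(isvc: Dict[str, Any]) -> bool:
    """True if the InferenceService has a condition of type Ready with status "True"."""
    conditions = isvc.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == "Ready" and c.get("status") == "True" for c in conditions
    )


def wait_until_ready(
    fetch: Callable[[], Dict[str, Any]],
    timeout_seconds: float = 120,
    poll_seconds: float = 5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `fetch` until the service is ready; return False on timeout."""
    deadline = clock() + timeout_seconds
    while True:
        if is_ready(fetch()):
            return True
        if clock() >= deadline:
            return False
        sleep(poll_seconds)
```

The `sleep` and `clock` parameters exist only so that the loop can be exercised without real waiting.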


### Call the inference service endpoint
* https://kserve.github.io/website/master/get_started/first_isvc/#5-perform-inference
* Setup knative gateway: https://knative.dev/docs/serving/setting-up-custom-ingress-gateway/

In [14]:
import requests

isvc_resp = KServe.get(name, namespace=namespace)
# status.url is the external URL, status.address.url is the cluster-internal URL
isvc_url_external = isvc_resp['status']['url']
isvc_url_internal = isvc_resp['status']['address']['url']
print(f"URL External {isvc_url_external}")
print(f"URL Internal {isvc_url_internal}")

URL External http://sklearn-iris.kubeflow-kindfor.example.com
URL Internal http://sklearn-iris.kubeflow-kindfor.svc.cluster.local/v2/models/sklearn-iris/infer
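
The cells below call the internal URL, which only works from inside the cluster. From outside, the linked KServe guide sends the request to the ingress gateway and passes the service hostname in the `Host` header. The sketch below only assembles such a request and does not send it; `build_ingress_request` is a hypothetical helper and `http://192.0.2.10:80` is a placeholder gateway address. A Kubeflow installation that protects the gateway with Dex would presumably also need authentication (for example a session cookie), which is not covered here.

```python
from typing import Dict, Tuple
from urllib.parse import urlparse


def build_ingress_request(
    isvc_url_external: str,
    ingress_base_url: str,
    model_name: str,
) -> Tuple[str, Dict[str, str]]:
    """Return (url, headers) for a V2 inference call through the ingress gateway."""
    # The gateway routes by hostname, so the external host goes into the Host header.
    service_hostname = urlparse(isvc_url_external).netloc
    url = f"{ingress_base_url.rstrip('/')}/v2/models/{model_name}/infer"
    return url, {"Host": service_hostname}


url, headers = build_ingress_request(
    "http://sklearn-iris.kubeflow-kindfor.example.com",
    "http://192.0.2.10:80",  # placeholder ingress gateway address
    "sklearn-iris",
)
print(url)
print(headers)
```

The resulting pair can be passed to `requests.post(url, headers=headers, json=...)`.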


### Inference with first data sample

In [15]:
# V2 request payload, following https://kserve.github.io/website/modelserving/v1beta1/sklearn/v2/
inference_input = {
  "inputs": [
    {
      "name": "input-0",
      "shape": [2, 4],
      "datatype": "FP32",
      "data": [
        [6.8, 2.8, 4.8, 1.4],
        [6.0, 3.4, 4.5, 1.6]
      ]
    }
  ]
}

response = requests.post(isvc_url_internal, json=inference_input)
print(response.text)
print(f"\nInference result: {response.json()['outputs'][0]['data']}")

{"model_name":"sklearn-iris","model_version":null,"id":"d18f30a9-4440-495d-8594-8c708220a13e","parameters":null,"outputs":[{"name":"predict","shape":[2],"datatype":"INT32","parameters":null,"data":[1,1]}]}

Inference result: [1, 1]
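
The request body has to state a `shape` that matches the nested `data`. To avoid keeping the two in sync by hand, the payload can be derived from the rows themselves. This is a minimal sketch; `build_v2_request` is a hypothetical helper, and the defaults `input-0` and `FP32` are simply the values used in this notebook.

```python
from typing import Any, Dict, Sequence


def build_v2_request(
    rows: Sequence[Sequence[float]],
    input_name: str = "input-0",
    datatype: str = "FP32",
) -> Dict[str, Any]:
    """Build a V2 inference request with a single 2-D input tensor."""
    if not rows:
        raise ValueError("rows must contain at least one sample")
    n_features = len(rows[0])
    if any(len(row) != n_features for row in rows):
        raise ValueError("all rows must have the same number of features")
    return {
        "inputs": [
            {
                "name": input_name,
                # shape is [number of samples, number of features]
                "shape": [len(rows), n_features],
                "datatype": datatype,
                "data": [list(row) for row in rows],
            }
        ]
    }


payload = build_v2_request([[6.8, 2.8, 4.8, 1.4], [6.0, 3.4, 4.5, 1.6]])
print(payload["inputs"][0]["shape"])
```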


### Inference with second data sample

In [16]:
inference_input = {
  "inputs": [
    {
      "name": "input-0",
      "shape": [2, 4],
      "datatype": "FP32",
      "data": [
        [9.0, 3.8, 6.8, 2.4],
        [6.0, 3.4, 4.5, 1.6]
      ]
    }
  ]
}
response = requests.post(isvc_url_internal, json=inference_input)
print(response.text)
print(f"\nInference result: {response.json()['outputs'][0]['data']}")

{"model_name":"sklearn-iris","model_version":null,"id":"1114b2c3-8bb8-48f8-8084-d1a80739acd6","parameters":null,"outputs":[{"name":"predict","shape":[2],"datatype":"INT32","parameters":null,"data":[2,1]}]}

Inference result: [2, 1]
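
The response lists its outputs by name, so the predictions can be looked up by name rather than by position and then mapped to labels. The sketch below parses the response shown above. The helper `extract_output` is hypothetical, and the label order in `IRIS_CLASS_NAMES` is an assumption: it is the class order of the iris dataset bundled with scikit-learn, which this model may or may not have been trained on.

```python
import json
from typing import Any, Dict, List

# Assumption: class order of the iris dataset bundled with scikit-learn
IRIS_CLASS_NAMES = ["setosa", "versicolor", "virginica"]


def extract_output(response_json: Dict[str, Any], output_name: str = "predict") -> List[Any]:
    """Return the data of the output tensor with the given name."""
    for output in response_json.get("outputs", []):
        if output.get("name") == output_name:
            return output["data"]
    raise KeyError(f"no output named {output_name!r} in response")


# Response body copied from the second inference call above
raw = (
    '{"model_name":"sklearn-iris","model_version":null,'
    '"id":"1114b2c3-8bb8-48f8-8084-d1a80739acd6","parameters":null,'
    '"outputs":[{"name":"predict","shape":[2],"datatype":"INT32",'
    '"parameters":null,"data":[2,1]}]}'
)
predictions = extract_output(json.loads(raw))
labels = [IRIS_CLASS_NAMES[i] for i in predictions]
print(predictions, labels)
```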


In [None]:
# optional
# KServe.delete(name, namespace=namespace)