## Run your first `InferenceService`

**In this tutorial, you will deploy a scikit-learn InferenceService.**

The inference service loads a simple iris ML model. You then send it a list of attributes and print the predicted class of iris plant.

Since your model is being deployed as an InferenceService, not a raw Kubernetes Service, you just need to provide the trained model and
it gets some **super powers out of the box** :rocket:.

## Install KServe SDK

In [1]:
!pip install kserve==0.7.0

Collecting argparse>=1.4.0
  Using cached argparse-1.4.0-py2.py3-none-any.whl (23 kB)
Installing collected packages: argparse
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
kfserving 0.5.1 requires azure-storage-blob<=2.1.0,>=1.3.0, but you have azure-storage-blob 12.8.1 which is incompatible.
Successfully installed argparse-1.4.0


## Import packages

In [2]:
from kubernetes import client 
from kserve import KServeClient
from kserve import constants
from kserve import utils
from kserve import V1beta1InferenceService
from kserve import V1beta1InferenceServiceSpec
from kserve import V1beta1PredictorSpec
from kserve import V1beta1SKLearnSpec

## Declare Namespace

In [3]:
namespace = utils.get_default_target_namespace()

## Define InferenceService

First, define the predictor spec, and then define the InferenceService based on that spec.

In [4]:
name = 'sklearn-iris'
kserve_version = 'v1beta1'
# Resolves to 'serving.kserve.io/v1beta1'
api_version = constants.KSERVE_GROUP + '/' + kserve_version

isvc = V1beta1InferenceService(
    api_version=api_version,
    kind=constants.KSERVE_KIND,
    metadata=client.V1ObjectMeta(
        name=name,
        namespace=namespace,
        # Disable Istio sidecar injection for this service
        annotations={'sidecar.istio.io/inject': 'false'}),
    spec=V1beta1InferenceServiceSpec(
        predictor=V1beta1PredictorSpec(
            # The predictor serves the scikit-learn model stored at this URI
            sklearn=V1beta1SKLearnSpec(
                storage_uri="gs://kfserving-samples/models/sklearn/iris"))))
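
For orientation, the object defined in this cell corresponds to a plain Kubernetes manifest. The following is a minimal sketch of that manifest as a Python dict, using the field names that appear in the output of the create call in the next section (`apiVersion`, `kind`, `metadata`, `storageUri`). The helper name `build_isvc_manifest` is hypothetical and is not part of the KServe SDK.

```python
import json


def build_isvc_manifest(name, namespace, storage_uri):
    """Hypothetical helper: return the InferenceService as a plain dict."""
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {"sidecar.istio.io/inject": "false"},
        },
        # Only the model location is supplied; the remaining predictor fields
        # are filled in by the cluster, as the create output shows.
        "spec": {"predictor": {"sklearn": {"storageUri": storage_uri}}},
    }


manifest = build_isvc_manifest(
    "sklearn-iris",
    "kubeflow-user-example-com",
    "gs://kfserving-samples/models/sklearn/iris",
)
print(json.dumps(manifest, indent=2))
```

Comparing this dict with the create output makes it easier to see which fields you supplied and which ones (resources, runtime version, protocol version) were defaulted.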

## Create InferenceService

Use `KServeClient` to create the InferenceService.

In [5]:
KServe = KServeClient()
KServe.create(isvc)

{'apiVersion': 'serving.kserve.io/v1beta1',
 'kind': 'InferenceService',
 'metadata': {'annotations': {'sidecar.istio.io/inject': 'false'},
  'creationTimestamp': '2022-02-27T07:54:22Z',
  'generation': 1,
  'managedFields': [{'apiVersion': 'serving.kserve.io/v1beta1',
    'fieldsType': 'FieldsV1',
    'fieldsV1': {'f:metadata': {'f:annotations': {'.': {},
       'f:sidecar.istio.io/inject': {}}},
     'f:spec': {'.': {},
      'f:predictor': {'.': {}, 'f:sklearn': {'.': {}, 'f:storageUri': {}}}}},
    'manager': 'OpenAPI-Generator',
    'operation': 'Update',
    'time': '2022-02-27T07:54:19Z'}],
  'name': 'sklearn-iris',
  'namespace': 'kubeflow-user-example-com',
  'resourceVersion': '6753817',
  'uid': '182227dd-7224-4111-8570-6fce80127540'},
 'spec': {'predictor': {'sklearn': {'name': 'kserve-container',
    'protocolVersion': 'v1',
    'resources': {'limits': {'cpu': '1', 'memory': '2Gi'},
     'requests': {'cpu': '1', 'memory': '2Gi'}},
    'runtimeVersion': 'v0.7.0',
    'stora

## Check the InferenceService

In [6]:
KServe.get('sklearn-iris', namespace=namespace, watch=True, timeout_seconds=120)

NAME                 READY                           PREV                    LATEST URL                                                              
sklearn-iris         Unknown                            0                       100                                                                  
sklearn-iris         Unknown                            0                       100                                                                  
sklearn-iris         Unknown                            0                       100                                                                  
sklearn-iris         Unknown                            0                       100                                                                  
sklearn-iris         True                               0                       100 http://sklearn-iris-kubeflow-user-example-com.pvaneck-iks-121-...
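
If you want to check readiness in code rather than by reading the table, the sketch below inspects a resource dict for a `Ready` condition. It assumes that calling `KServe.get(name, namespace=namespace)` without `watch=True` returns the resource as a dict with Kubernetes-style `status.conditions`; that layout is an assumption here, and the two sample dicts are hand-written for illustration, not captured cluster output. The helper name `is_ready` is hypothetical.

```python
def is_ready(isvc):
    """Return True if the resource dict has a Ready condition with status 'True'."""
    conditions = isvc.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == "Ready" and c.get("status") == "True"
        for c in conditions
    )


# Hand-written sample dicts (assumed layout), mirroring the READY column above
pending = {"status": {"conditions": [{"type": "Ready", "status": "Unknown"}]}}
ready = {"status": {"conditions": [{"type": "Ready", "status": "True"}]}}
no_status = {}  # a resource that has not reported any status yet

print(is_ready(pending), is_ready(ready), is_ready(no_status))
```

A resource with no `status` block is treated as not ready, which matches the `Unknown` rows printed while the service is still starting.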


## Get Predictions

### Get the InferenceService internal URL

In [7]:
url = "http://{}-predictor-default.{}.svc.cluster.local/v1/models/{}:predict".format(name, namespace, name)
print(url)

http://sklearn-iris-predictor-default.kubeflow-user-example-com.svc.cluster.local/v1/models/sklearn-iris:predict


### Curl the URL and pass the input data

In [8]:
!curl http://sklearn-iris.kubeflow-user-example-com.svc.cluster.local/v1/models/sklearn-iris:predict -d @./iris-input.json

{"predictions": [1, 1]}
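
The contents of `iris-input.json` are not shown in this notebook. As a rough sketch, the same call can be prepared from Python using only the standard library. It assumes the file holds a v1-protocol body of the form `{"instances": [...]}` with one four-feature row per flower; the feature values below are illustrative, and the helper names are hypothetical.

```python
import json
import urllib.request


def build_predict_url(name, namespace):
    """Build the cluster-internal predict URL, same pattern as the cell above."""
    return "http://{}-predictor-default.{}.svc.cluster.local/v1/models/{}:predict".format(
        name, namespace, name)


def build_predict_request(url, instances):
    """Prepare (but do not send) a POST request carrying the instances."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST")


def parse_predictions(response_text):
    """Extract the list of predicted classes from a response body."""
    return json.loads(response_text)["predictions"]


# Illustrative feature rows (assumption): two flowers, four measurements each
instances = [[6.8, 2.8, 4.8, 1.4], [6.0, 3.4, 4.5, 1.6]]
request = build_predict_request(
    build_predict_url("sklearn-iris", "kubeflow-user-example-com"), instances)
print(request.full_url)

# Parsing the response body shown above
print(parse_predictions('{"predictions": [1, 1]}'))
```

To actually send the request, pass it to `urllib.request.urlopen(request)` from a pod or notebook that can resolve the cluster-local hostname. The response contains one prediction per instance, which is why two rows yield a list of two classes.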

## Run Performance Test

In [10]:
!kubectl create -f https://raw.githubusercontent.com/kserve/kserve/release-0.7/docs/samples/v1beta1/sklearn/v1/perf.yaml -n kubeflow-user-example-com

job.batch/load-test4kxf6 created
configmap/vegeta-cfg created


### Get the Job's Pod Name

In [15]:
!kubectl get pods --namespace=kubeflow-user-example-com | grep load

load-test4kxf6-xfls4                                              0/1     Completed   0          10m


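**Job Output**

In the run recorded below, every request failed (0.00% success). The job posted to `sklearn-iris.kserve-test.svc.cluster.local`, but the InferenceService was deployed in the `kubeflow-user-example-com` namespace, so the DNS lookup returned `no such host`. For a meaningful result, the target URL used by the load test needs to match the namespace where the service is running.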

In [14]:
!kubectl logs load-test4kxf6-xfls4 -n kubeflow-user-example-com

Requests      [total, rate, throughput]         30000, 500.02, 0.00
Duration      [total, attack, wait]             1m0s, 59.998s, 2.921ms
Latencies     [min, mean, 50, 90, 95, 99, max]  2.536µs, 5.495ms, 2.661ms, 4.861ms, 7.556ms, 45.074ms, 567.662ms
Bytes In      [total, mean]                     0, 0.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           0.00%
Status Codes  [code:count]                      0:30000  
Error Set:
Post "http://sklearn-iris.kserve-test.svc.cluster.local/v1/models/sklearn-iris:predict": dial tcp: lookup sklearn-iris.kserve-test.svc.cluster.local on 172.21.0.10:53: no such host
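
Because a load test can complete while every request fails, it helps to check the success ratio before reading the latency figures. The sketch below pulls that number out of the report text; it assumes the report keeps the `Success [ratio]` line format printed above, and the helper name `success_ratio` is hypothetical.

```python
import re


def success_ratio(report):
    """Return the success percentage from a load-test report as a float."""
    match = re.search(r"^Success\s+\[ratio\]\s+([\d.]+)%", report, re.MULTILINE)
    if match is None:
        raise ValueError("no 'Success [ratio]' line found in report")
    return float(match.group(1))


# Lines copied from the report above
sample = "\n".join([
    "Requests      [total, rate, throughput]         30000, 500.02, 0.00",
    "Success       [ratio]                           0.00%",
    "Status Codes  [code:count]                      0:30000",
])

ratio = success_ratio(sample)
if ratio < 100.0:
    print("load test had failures: {:.2f}% success".format(ratio))
```

In a notebook, you could capture the `kubectl logs` output into a string and pass it to this function instead of the hard-coded sample.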
