# Sample for KFServing SDK with a custom image

This is a sample for the KFServing SDK using a custom image.

The notebook shows how to use the KFServing SDK to create, get, and delete an InferenceService that serves a custom image.

### Setup
- Your `~/.kube/config` should point to a cluster with KFServing installed.
- Your cluster's Istio Ingress gateway must be network accessible.

### Build the Docker image

Custom image support lets users wrap their own model in a container and serve it with KFServing. The container must also run a web server (for example, Flask) that exposes the model endpoints. This example extends `kfserving.KFModel`, which uses the Tornado web server.
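
To make the web server requirement concrete, here is a minimal sketch that uses only the Python standard library. It is not the `model.py` shipped with this sample, which extends `kfserving.KFModel`. The `predict` placeholder, the `make_server` helper, the `"instances"` request key, and port 8080 are assumptions made for illustration; only the `/v1/models/<name>:predict` path is taken from the prediction request used later in this notebook.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

MODEL_NAME = "kfserving-custom-model"
# Same path shape the notebook posts to when it runs a prediction.
PREDICT_PATH = f"/v1/models/{MODEL_NAME}:predict"


def predict(request: dict) -> dict:
    """Hypothetical placeholder model: reports how many instances arrived."""
    instances = request.get("instances", [])  # assumed request key
    return {"predictions": {"instances_received": len(instances)}}


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Only the predict endpoint of this one model is served.
        if self.path != PREDICT_PATH:
            self.send_error(404, "unknown model endpoint")
            return
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        payload = json.dumps(predict(body)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def make_server(port: int = 8080) -> HTTPServer:
    """Build the server; port 8080 is an assumed default, adjust as needed."""
    return HTTPServer(("0.0.0.0", port), PredictHandler)

# A container entrypoint would call: make_server().serve_forever()
```

Extending `kfserving.KFModel`, as this sample does, avoids writing the routing and serialization code by hand.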


To build and push with Docker Hub, set the `DOCKER_HUB_USERNAME` variable below to your Docker Hub username.

In [1]:
# Set this to your Docker Hub username
# It will be used when building your image and when creating the InferenceService for your image
DOCKER_HUB_USERNAME = "rzgry"

In [2]:
%%bash -s "$DOCKER_HUB_USERNAME"
docker build -t $1/kfserving-custom-model .

Sending build context to Docker daemon  162.8kB
Step 1/7 : FROM python:3.7-slim
 ---> 7e61acc68112
Step 2/7 : ENV APP_HOME /app
 ---> Using cache
 ---> 579e63347278
Step 3/7 : WORKDIR $APP_HOME
 ---> Using cache
 ---> 8c34334f7db1
Step 4/7 : COPY requirements.txt ./
 ---> Using cache
 ---> 11d60915380b
Step 5/7 : RUN pip install --no-cache-dir -r ./requirements.txt
 ---> Using cache
 ---> ef26a0319a6a
Step 6/7 : COPY model.py  imagenet_classes.txt ./
 ---> Using cache
 ---> e92fc584f413
Step 7/7 : CMD ["python", "model.py"]
 ---> Using cache
 ---> 0638ef45edc1
Successfully built 0638ef45edc1
Successfully tagged rzgry/kfserving-custom-model:latest


In [3]:
%%bash -s "$DOCKER_HUB_USERNAME"
docker push $1/kfserving-custom-model

The push refers to repository [docker.io/rzgry/kfserving-custom-model]
e20c8c6118ee: Preparing
46e1361fc807: Preparing
1cbb632188b9: Preparing
1a2672635cd4: Preparing
db870d9948f7: Preparing
7d9bd7f5b03a: Preparing
83d8f2f27444: Preparing
cb82f398d4bd: Preparing
488dfecc21b1: Preparing
7d9bd7f5b03a: Waiting
83d8f2f27444: Waiting
cb82f398d4bd: Waiting
488dfecc21b1: Waiting
db870d9948f7: Layer already exists
1cbb632188b9: Layer already exists
1a2672635cd4: Layer already exists
46e1361fc807: Layer already exists
e20c8c6118ee: Layer already exists
7d9bd7f5b03a: Layer already exists
83d8f2f27444: Layer already exists
cb82f398d4bd: Layer already exists
488dfecc21b1: Layer already exists
latest: digest: sha256:584a690653612987b268ed90fc742bcef0d5ee9380613ae3bb21d9262cdcabc1 size: 2206


### KFServing Client SDK

We will use the [KFServing client SDK](https://github.com/kubeflow/kfserving/blob/master/python/kfserving/README.md#kfserving-client) to create the InferenceService and deploy our custom image.

In [4]:
from kubernetes import client
from kubernetes.client import V1Container

from kfserving import KFServingClient
from kfserving import constants
from kfserving import utils
from kfserving import V1alpha2EndpointSpec
from kfserving import V1alpha2PredictorSpec
from kfserving import V1alpha2InferenceServiceSpec
from kfserving import V1alpha2InferenceService
from kfserving import V1alpha2CustomSpec

In [5]:
namespace = utils.get_default_target_namespace()
print(namespace)

default


### Define InferenceService

First define the default endpoint spec, then define the InferenceService using that endpoint spec.

To use a custom image, use `V1alpha2CustomSpec`, which takes a [V1Container](https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/V1Container.md) from the Kubernetes client library.


In [6]:
api_version = constants.KFSERVING_GROUP + '/' + constants.KFSERVING_VERSION

default_endpoint_spec = V1alpha2EndpointSpec(
                          predictor=V1alpha2PredictorSpec(
                              custom=V1alpha2CustomSpec(
                                  container=V1Container(
                                      name="kfserving-custom-model",
                                      image=f"{DOCKER_HUB_USERNAME}/kfserving-custom-model"))))

isvc = V1alpha2InferenceService(api_version=api_version,
                          kind=constants.KFSERVING_KIND,
                          metadata=client.V1ObjectMeta(
                              name='kfserving-custom-model', namespace=namespace),
                          spec=V1alpha2InferenceServiceSpec(default=default_endpoint_spec))
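
For orientation, the nested SDK objects above correspond to a plain manifest dictionary. The sketch below rebuilds that structure by hand, using the field names that appear in the output of the create call; `build_isvc_manifest` is a hypothetical helper for illustration and is not part of the KFServing SDK.

```python
def build_isvc_manifest(username: str,
                        name: str = "kfserving-custom-model",
                        namespace: str = "default") -> dict:
    """Return the InferenceService manifest as a plain dict (illustrative only)."""
    return {
        "apiVersion": "serving.kubeflow.org/v1alpha2",
        "kind": "InferenceService",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "default": {
                "predictor": {
                    "custom": {
                        # Mirrors the V1Container passed to V1alpha2CustomSpec
                        "container": {
                            "name": name,
                            "image": f"{username}/{name}",
                        }
                    }
                }
            }
        },
    }


manifest = build_isvc_manifest("rzgry")
print(manifest["spec"]["default"]["predictor"]["custom"]["container"]["image"])
```

The resource limits and requests shown in the create output are filled in by the cluster, so they are left out of this sketch.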

### Create the InferenceService

Call `KFServingClient` to create the InferenceService.

In [7]:
KFServing = KFServingClient()
KFServing.create(isvc)



{'apiVersion': 'serving.kubeflow.org/v1alpha2',
 'kind': 'InferenceService',
 'metadata': {'creationTimestamp': '2020-03-09T19:49:59Z',
  'generation': 1,
  'name': 'kfserving-custom-model',
  'namespace': 'default',
  'resourceVersion': '97586539',
  'selfLink': '/apis/serving.kubeflow.org/v1alpha2/namespaces/default/inferenceservices/kfserving-custom-model',
  'uid': 'c31bd271-6296-4c52-9abf-ef1750ac64cc'},
 'spec': {'default': {'predictor': {'custom': {'container': {'image': 'rzgry/kfserving-custom-model',
      'name': 'kfserving-custom-model',
      'resources': {'limits': {'cpu': '1', 'memory': '2Gi'},
       'requests': {'cpu': '1', 'memory': '2Gi'}}}}}}},
 'status': {}}

### Check the InferenceService

In [8]:
KFServing.get('kfserving-custom-model', namespace=namespace, watch=True, timeout_seconds=120)

NAME                 READY      DEFAULT_TRAFFIC CANARY_TRAFFIC  URL                                               
kfserving-custom-... True                   100                 http://kfserving-custom-model.default.example.c...


### Run a prediction

In [9]:
MODEL_NAME = "kfserving-custom-model"

In [10]:
%%bash --out CLUSTER_IP
echo "$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"

In [11]:
%%bash -s "$MODEL_NAME" --out SERVICE_HOSTNAME
echo "$(kubectl get inferenceservice $1 -o jsonpath='{.status.url}' | cut -d "/" -f 3)"
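
The `cut -d "/" -f 3` step above keeps only the host part of the service URL, which is later sent as the `Host` header. A rough Python equivalent, assuming the URL has the usual `scheme://host/...` shape, is sketched below; `service_hostname` and the example URL are hypothetical.

```python
from urllib.parse import urlparse


def service_hostname(status_url: str) -> str:
    """Return the host portion of a URL, like `cut -d "/" -f 3` in the cell above."""
    return urlparse(status_url).netloc


# Hypothetical URL, used only to illustrate the split.
print(service_hostname("http://my-model.my-namespace.example.org/v1/models/my-model"))
```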

In [12]:
import requests
import json

# Load the sample request payload
with open('input.json') as json_file:
    data = json.load(json_file)

# Send the request to the ingress gateway; the Host header routes it to the InferenceService
url = f"http://{CLUSTER_IP.strip()}/v1/models/{MODEL_NAME}:predict"
headers = {"Host": SERVICE_HOSTNAME.strip()}
result = requests.post(url, data=json.dumps(data), headers=headers)
print(result.content)

b'{"predictions": {"Labrador retriever": 0.4158518612384796, "golden retriever": 0.1659165322780609, "Saluki, gazelle hound": 0.16286855936050415, "whippet": 0.028539149090647697, "Ibizan hound, Ibizan Podenco": 0.023924754932522774}}'
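
The response body is JSON bytes that map class labels to scores. A small sketch for picking the highest-scoring label is shown below; `top_prediction` is a hypothetical helper, and the sample bytes are a shortened copy of the output above.

```python
import json


def top_prediction(content: bytes) -> tuple:
    """Return (label, score) for the highest-scoring entry in the response."""
    predictions = json.loads(content)["predictions"]
    label = max(predictions, key=predictions.get)
    return label, predictions[label]


sample = (b'{"predictions": {"Labrador retriever": 0.4158518612384796, '
          b'"golden retriever": 0.1659165322780609}}')
label, score = top_prediction(sample)
print(label, score)
```

In the notebook this could be called as `top_prediction(result.content)`.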


### Delete the InferenceService

In [13]:
KFServing.delete(MODEL_NAME, namespace=namespace)

{'kind': 'Status',
 'apiVersion': 'v1',
 'metadata': {},
 'status': 'Success',
 'details': {'name': 'kfserving-custom-model',
  'group': 'serving.kubeflow.org',
  'kind': 'inferenceservices',
  'uid': 'c31bd271-6296-4c52-9abf-ef1750ac64cc'}}