# Object Segmentation Inference With KFServing on ASH Clusters

In this tutorial, you will take the model trained in the [previous notebook](object_segmentation-ash.ipynb) and deploy it to a KFServing PytorchServer hosted on an Azure Stack Hub (ASH) cluster.

## Prerequisites

*  Follow [this notebook](object_segmentation-ash.ipynb) to train the model and register it in your AML workspace.

*  Install KFServing as described [here](KFServing-setup.md).
    

## Deploy to KFServing PytorchServer

KFServing [PytorchServer](https://github.com/kubeflow/kfserving/blob/master/python/pytorchserver/pytorchserver/model.py) needs two files: "model.pt", which contains the "state_dict" of the trained model, and a custom Python file that defines the network class of the trained model. Here, "model.pt" is the model registered in AML as described in [this notebook](object_segmentation-ash.ipynb); download it to your local file system. The Python file, named "score_model.py", is included in this repository. Both files are passed to the inference server through the [KFServing storageUri](https://github.com/kubeflow/kfserving/blob/master/python/kfserving/kfserving/storage.py), which supports Azure Blob Storage.

### Preparation of StorageUri

The two files (model.pt, score_model.py) can be uploaded to your storage account as blobs in a container using the Azure portal (on the upload page, click "Advanced" and choose "Upload to folder"). The Azure storage path containing these two files should look like:

```https://<storage_account_name>.blob.core.windows.net/<container_name>/<folder_name>```

Note: KFServing storageUri currently does not support Azure Stack Hub storage accounts, so the files must be placed in a public Azure storage account.
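
Before pasting the path into the InferenceService, it can help to check that it has the shape shown above. The following is a minimal sketch, not KFServing's own parsing code; the helper name `split_blob_uri` and the example account, container, and folder names are made up for illustration.

```python
from urllib.parse import urlparse


def split_blob_uri(uri):
    """Split an Azure blob URI into (account, container, folder).

    Illustrative helper only; it is not part of KFServing.
    """
    parsed = urlparse(uri)
    suffix = ".blob.core.windows.net"
    if parsed.scheme != "https" or not parsed.netloc.endswith(suffix):
        raise ValueError("expected https://<account>.blob.core.windows.net/...")
    account = parsed.netloc[: -len(suffix)]
    # The first path segment is the container; the rest is the folder prefix
    container, _, folder = parsed.path.lstrip("/").partition("/")
    if not account or not container or not folder:
        raise ValueError("URI must include account, container, and folder")
    return account, container, folder.rstrip("/")


# Hypothetical names, for illustration only
print(split_blob_uri("https://myaccount.blob.core.windows.net/models/objectseg"))
# ('myaccount', 'models', 'objectseg')
```

A URI that is missing the folder part, or that points at a different host suffix, raises `ValueError` in this sketch.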

### Create a Kubernetes Secret and Service Account for Client Credentials

   The values in a Secret's "data" field must be base64-encoded. Here is a simple way to encode a plain string in base64 on Linux:

<pre> $ echo -n "mystring" | base64 </pre>

   Make sure the "-n" option is used; without it, echo appends a newline that becomes part of the encoded value. To decode:

<pre> $ echo -n "base64-string" | base64 -d </pre>
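
If you prefer to do the encoding from Python, the standard `base64` module gives the same result as the shell command above. This is a small sketch; the helper name `encode_secret_values` is hypothetical, and the value `"mystring"` is the same placeholder used in the shell example, not a real credential.

```python
import base64


def encode_secret_values(credentials):
    """Base64-encode each value, as `echo -n "<value>" | base64` would."""
    return {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in credentials.items()
    }


# Placeholder value for illustration only; substitute your own credentials
encoded = encode_secret_values({"AZ_CLIENT_ID": "mystring"})
print(encoded["AZ_CLIENT_ID"])  # bXlzdHJpbmc=

# Without "-n", echo adds a trailing newline, which changes the encoding
print(base64.b64encode(b"mystring\n").decode("ascii"))  # bXlzdHJpbmcK
```

The encoded strings are what go into the `data` section of the Secret manifest.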


In [None]:
%%writefile azure_secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: azcreds
type: Opaque
data:
  AZ_CLIENT_ID: "<base_64 encoded>"
  AZ_CLIENT_SECRET: "<base_64 encoded>"
  AZ_SUBSCRIPTION_ID: "<base_64 encoded>"
  AZ_TENANT_ID: "<base_64 encoded>"

In [None]:
%%writefile service_account.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: azuresa
secrets:
- name: azcreds

### Install the Secret and Service Account

<pre>
kubectl apply -f azure_secret.yaml

kubectl apply -f service_account.yaml
</pre>

### Install InferenceService

In [None]:
%%writefile obj_seg_inferenceservice.yaml
apiVersion: "serving.kubeflow.org/v1alpha2"
kind: "InferenceService"
metadata:
  name: "obj-seg"
spec:
  default:
    predictor:
      pytorch:
        storageUri: "https://backupsli.blob.core.windows.net/kftorch/objectseg"  # replace with the storage path of your own model files
        modelClassName: "Net"
      serviceAccountName: azuresa

After writing the file, apply it the same way as the Secret and Service Account: `kubectl apply -f obj_seg_inferenceservice.yaml`.

### Get the xip.io URL for Testing

If you configured DNS with xip.io as described in the [KFServing installation guide](KFServing-setup.md), you can get the xip.io URL and use a web test tool such as Insomnia or Postman to test your service.

*  Get host url:

<pre>
$ kubectl get ksvc

NAME                        URL                                                              LATESTCREATED                     LATESTREADY                       READY   REASON
obj-seg-predictor-default   http://obj-seg-predictor-default.default.10.217.119.227.xip.io   obj-seg-predictor-default-00002   obj-seg-predictor-default-00002   True
</pre>  

As displayed in the URL column, the host URL is http://obj-seg-predictor-default.default.10.217.119.227.xip.io

*  The whole url:

    The whole URL is composed as {host_url}/v1/models/obj-seg:predict
    
    For this particular example, it is:

    http://obj-seg-predictor-default.default.10.217.119.227.xip.io/v1/models/obj-seg:predict
    

*  Test the inference service:

   Once the endpoint is identified as described above, you can run the same test code as in the AML deployment cases shown in previous cells.
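
The URL composition described in the list above can be sketched as a small helper. The function name `predict_url` is hypothetical; the path pattern and the example host are the ones given in this section.

```python
def predict_url(host_url, model_name):
    """Compose the predict URL from the host URL and the model name."""
    # Strip a trailing slash so the result never contains "//v1"
    return "{}/v1/models/{}:predict".format(host_url.rstrip("/"), model_name)


host = "http://obj-seg-predictor-default.default.10.217.119.227.xip.io"
print(predict_url(host, "obj-seg"))
# http://obj-seg-predictor-default.default.10.217.119.227.xip.io/v1/models/obj-seg:predict
```

The model name is the `metadata.name` of the InferenceService, "obj-seg" in this example.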

## Test the Service Using the RESTful Endpoint


### Create a Function to Call the URL Endpoint

Create a simple helper function to wrap the RESTful endpoint call:

In [None]:
import urllib.error
import urllib.request
import json

from PIL import Image
from torchvision.transforms import functional as F
import numpy as np


def service_infer(url, body, api_key):
    """POST the JSON body to the endpoint; return the raw response bytes, or None on an HTTP error."""
    headers = {'Content-Type':'application/json', 'Authorization':('Bearer '+ api_key)}

    req = urllib.request.Request(url, body, headers)

    try:
        response = urllib.request.urlopen(req)

        result = response.read()
        return result

    except urllib.error.HTTPError as error:
        print("The request failed with status code: " + str(error.code))

        # Print the headers - they include the request ID and the timestamp, which are useful for debugging the failure
        print(error.info())
        # Print the raw body, since an error response is not guaranteed to be valid JSON
        print(error.read().decode("utf8", 'ignore'))
        return None

### Test a Few Examples

In [None]:
url = 'http://obj-seg-predictor-default.default.10.217.119.227.xip.io/v1/models/obj-seg:predict' # replace with url from your deployment
api_key = ''  

img_nums = ["00001","00002"]
# Forward slashes work on both Windows and Linux
image_paths = ["PennFudanPed/PNGImages/FudanPed{}.png".format(item) for item in img_nums]
image_np_list = []
for image_path in image_paths:
    img = Image.open(image_path)
    img.show("input_image")
    img_rgb = img.convert("RGB")
    img_tensor = F.to_tensor(img_rgb)
    img_np = img_tensor.numpy()
    image_np_list.append(img_np.tolist())

request = {"instances": image_np_list}
inputs = json.dumps(request)

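body = inputs.encode("utf-8")
resp = service_infer(url, body, api_key)
if resp is None:
    raise RuntimeError("Inference request failed; see the error output above")
p_obj = json.loads(resp)

# Display the predicted masks for each input image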
for instance_pred in p_obj["predictions"]:
    image_data = instance_pred["masks"]
    img_np = np.array(image_data)
    output = Image.fromarray(img_np)
    output.show()
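
Depending on the model, the "masks" entry may not be directly displayable as a single image. The sketch below shows one way to collapse it into a single 8-bit grayscale array. It assumes each "masks" value is a nested list shaped (num_objects, 1, H, W) holding per-pixel probabilities in [0, 1]; that shape, the helper name `masks_to_uint8`, and the 0.5 threshold are assumptions to verify against your own deployment.

```python
import numpy as np


def masks_to_uint8(masks, threshold=0.5):
    """Collapse per-object masks into one 8-bit grayscale image array.

    Assumes `masks` is shaped (num_objects, 1, H, W) with values in [0, 1];
    this shape is an assumption, not something the service guarantees.
    """
    arr = np.asarray(masks, dtype=np.float32)
    if arr.size == 0:
        raise ValueError("no masks in prediction")
    # Drop the singleton channel axis: (N, 1, H, W) -> (N, H, W)
    arr = arr.reshape(arr.shape[0], arr.shape[-2], arr.shape[-1])
    # A pixel is foreground if any object's mask exceeds the threshold
    combined = (arr > threshold).any(axis=0)
    return combined.astype(np.uint8) * 255


# Tiny made-up example: two objects on a 2x2 image
example = [[[[0.9, 0.1], [0.2, 0.3]]], [[[0.0, 0.0], [0.0, 0.8]]]]
print(masks_to_uint8(example).tolist())  # [[255, 0], [0, 255]]
```

The returned array can be passed to `Image.fromarray(...)` and shown in the same way as the output in the loop above.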
