## Model serving with HSML

In this example, we serve the model that we created in the model training notebook.

For the example to work, serving must be enabled in your project. In the settings tab of your project, select Serving to enable it. The UI should then show a new tab called Model Serving.

A model deployment (also called "model serving") can be created directly in the Hopsworks UI by clicking on Model Serving and then on Create New Serving. In this example, however, we create it through code with the HSML library.

![tutorial-flow](images/end_to_end.png)

### About Model Serving

Models can be served via KFServing or via "default" serving, which is a Docker container exposing a Flask server. For KFServing models, or models written in TensorFlow, you do not need to write a prediction file (see the section below). For sklearn models using default serving, however, you do need to write a prediction file.

To use KFServing, you must have Kubernetes installed and enabled on your cluster.
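
For the default-serving case described above, a prediction file could look roughly like the sketch below. It assumes the serving container expects a class named `Predict` with a `predict(inputs)` method, which is the convention used in Hopsworks examples as far as we know; check the documentation for your Hopsworks version. `ThresholdModel` is a hypothetical stand-in so that the sketch runs on its own; a real prediction file would load the trained sklearn model from the deployment's artifact files instead.

```python
class ThresholdModel:
    """Hypothetical stand-in for a trained sklearn classifier."""

    def predict(self, rows):
        # Flag a row as fraud (1) when its first feature exceeds 0.5.
        return [1 if row[0] > 0.5 else 0 for row in rows]


class Predict(object):
    """Assumed prediction-file interface: one instance per serving container."""

    def __init__(self, model=None):
        # Assumption: a real prediction file would load the saved model here,
        # for example with joblib. The stand-in keeps the sketch self-contained.
        self.model = model if model is not None else ThresholdModel()

    def predict(self, inputs):
        # `inputs` is a list of feature vectors. Return a plain list so the
        # result can be serialized to JSON.
        return list(self.model.predict(inputs))
```

Calling `Predict().predict([[0.1], [0.9]])` with the stand-in model returns one label per input row.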

## Connect to the feature store

In [2]:
import hsfs

# connect to feature store
conn = hsfs.connection()
fs = conn.get_feature_store()

Connected. Call `.close()` to terminate connection gracefully.


## Get the feature view

In [None]:
feature_view = fs.get_feature_view("transactions_view", 1)

## Get a feature vector from the online feature store

In [4]:
# The training data version is required to apply transformations.
# Pass it with `feature_view.init_serving(version)`. Training data can be
# created with `feature_view.create_training_data` or `feature_view.get_training_data`.

feature_view.init_serving(1)

In [5]:
card_ids = [
    "4473593503484549",
    "4336399961348201",
    "4219785543443381",
    "4137709749259770",
    "4573366597272313",
    "4929411498746287",
    "4855787436134696"    
]

In [None]:
feature_view.get_feature_vector({"cc_num": "4473593503484549"})
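
The single lookup above can be extended to all cards in `card_ids`. The sketch below assumes that `FeatureView.get_feature_vectors` accepts a list of primary-key dictionaries, as in recent `hsfs` releases; the helper names `build_entries` and `fetch_feature_vectors` are ours, not part of the library.

```python
def build_entries(card_ids, key="cc_num"):
    """Turn a list of card numbers into primary-key dicts, one per lookup."""
    return [{key: card_id} for card_id in card_ids]


def fetch_feature_vectors(feature_view, card_ids):
    """Fetch one feature vector per card.

    Assumes `feature_view.init_serving(...)` has already been called.
    """
    return feature_view.get_feature_vectors(build_entries(card_ids))
```

In the notebook this would be called as `fetch_feature_vectors(feature_view, card_ids)`.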

### Use REST endpoint 

You can also use a REST endpoint for your model. To do this you need to create an API key with 'serving' enabled, and retrieve the endpoint URL from the Model Serving UI.

Go to the Model Serving UI and click on the eye icon next to a model to retrieve the endpoint URL. The shorter URL is an internal endpoint that you can only reach from within Hopsworks. To call the model from outside Hopsworks, use one of the longer URLs, and make sure to use https instead of http.


In [3]:
import os
import requests

import hsml

conn = hsml.connection()
mr = conn.get_model_registry()

# Use the model name from the previous notebook.
model = mr.get_model("fraud_tutorial_model", version=1)

API_KEY = ""  # Put your API key here.
MODEL_SERVING_ENDPOINT = ""  # Put the model serving endpoint here.
HOST_NAME = ""  # Put the host name of your Hopsworks model serving endpoint here.

Connected. Call `.close()` to terminate connection gracefully.


{'predictions': [0]}

In [None]:
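# Assumption: the model takes the feature vector served by the feature view as input.
test_inputs = [feature_view.get_feature_vector({"cc_num": card_ids[0]})]

data = {"inputs": test_inputs}
url = os.environ["REST_ENDPOINT"] + MODEL_SERVING_ENDPOINT
headers = {
    "Content-Type": "application/json",
    "Accept": "application/json",
    "Authorization": f"ApiKey {API_KEY}",
    "Host": "",  # Set this to the host name of your model serving endpoint.
}

# verify=False skips TLS certificate verification; use it only on test clusters.
response = requests.post(url, verify=False, headers=headers, json=data)
response.json()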
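
The request assembly above can also be wrapped in a small helper, which makes it easier to reuse and to check before anything is sent. This is a minimal sketch, not part of HSML: the function name `build_prediction_request` and its parameters are ours, and it only builds the keyword arguments for `requests.post`.

```python
def build_prediction_request(base_url, endpoint_path, api_key, host_name, inputs):
    """Build keyword arguments for `requests.post` against a serving endpoint."""
    # Join the two URL parts with exactly one slash between them.
    url = base_url.rstrip("/") + "/" + endpoint_path.lstrip("/")
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
        "Authorization": f"ApiKey {api_key}",
    }
    # Only send a Host header when a host name was actually provided.
    if host_name:
        headers["Host"] = host_name
    return {"url": url, "headers": headers, "json": {"inputs": inputs}}
```

With the values from the cells above, a call might look like `requests.post(**build_prediction_request(os.environ["REST_ENDPOINT"], MODEL_SERVING_ENDPOINT, API_KEY, "", test_inputs), timeout=10)`, followed by `response.raise_for_status()` so that HTTP errors surface early instead of failing later in `response.json()`.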

### Stop Deployment

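To stop the deployment, call `stop()` on the deployment object:

In [None]:
# `deployment` is the deployment object for this model,
# for example the value returned by `model.deploy()`.
deployment.stop()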

### Next Steps

In the next notebook we'll take a look at how to automate jobs in Hopsworks using Airflow.