# <span style="font-weight:bold; font-size: 3rem; color:#1EB182;"> **Hopsworks Feature Store** </span><span style="font-weight:bold; font-size: 3rem; color:#333;">- Part 03: Online Inference</span>

<span style="font-weight:bold; font-size: 1.4rem;">In this last notebook you will use your deployment for online inference.</span>

## **🗒️ This notebook is divided into the following sections:** 
1. **Deployment Retrieval**: Retrieve your model from the Model Registry and your deployment from Model Serving.
2. **Prediction using deployment**.
3. **REST endpoint usage for model serving**.

## <span style="color:#ff5f27;"> 📡 Connecting to Hopsworks Feature Store </span>

In [1]:
import hopsworks

project = hopsworks.login()

# Get the feature store handle for the project's feature store
fs = project.get_feature_store()

Connected. Call `.close()` to terminate connection gracefully.

Multiple projects found. 

	 (1) marco
	 (2) quickstart_shared



Enter project to access:  1



Logged in to project, explore it here https://c.app.hopsworks.ai:443/p/397461
Connected. Call `.close()` to terminate connection gracefully.


## <span style="color:#ff5f27;">🗄 Model Registry</span>


In [2]:
# Get the Model Registry
mr = project.get_model_registry()

Connected. Call `.close()` to terminate connection gracefully.


In [3]:
# Retrieve the "aml_model" from the model registry
model = mr.get_model(
    name="aml_model", 
    version=1,
)

## <span style='color:#ff5f27'>⚙️ Fetch Deployment</span>

In [4]:
# Access the Model Serving
ms = project.get_model_serving()

# Specify the deployment name
deployment_name = "amlmodeldeployment"

# Get the deployment with the specified name
deployment = ms.get_deployment(deployment_name)

# Start the deployment and wait for it to be in a running state for up to 300 seconds
deployment.start(await_running=300)

Connected. Call `.close()` to terminate connection gracefully.


  0%|          | 0/6 [00:00<?, ?it/s]

Start making predictions by using `.predict()`


## <span style='color:#ff5f27'>🔮 Predicting using deployment</span>


Finally, you can start making predictions with your model!

Send inference requests to the deployed model as follows:

In [None]:
# Prepare input data using the input example from the model
data = {
    "inputs": model.input_example,
}

# Make predictions using the deployed model
predictions = deployment.predict(data)

In [9]:
deployment.get_logs()

Explore all the logs and filters in the Kibana logs at https://c.app.hopsworks.ai:443/p/397461/deployments/201729

Instance name: amlmodeldeployment-predictor-default-00001-deployment-6b5d74nbt
2024-02-13 14:05:17.852486: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:213] Running initialization op on SavedModel bundle at path: /mnt/models/3
2024-02-13 14:05:18.106984: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:305] SavedModel load for tags { serve }; Status: success: OK. Took 991046 microseconds.
2024-02-13 14:05:18.123601: I tensorflow_serving/servables/tensorflow/saved_model_warmup_util.cc:62] No warmup data file found at /mnt/models/3/assets.extra/tf_serving_warmup_requests
2024-02-13 14:05:18.616640: I tensorflow_serving/core/loader_harness.cc:95] Successfully loaded servable version {name: amlmodeldeployment version: 3}
2024-02-13 14:05:18.617890: I tensorflow_serving/model_servers/server_core.cc:486] Finished adding/updating models
2024-02-13 14

In [None]:
# Now let's test feature vectors from the online store
ids_to_score = [
    "0016359b", 
    "001dcc27", 
    "0054a022", 
    "00d6b609", 
    "00e14860", 
    "00e39a1b", 
    "014ed5cb", 
    "01ce3306", 
    "01fa19ae", 
    "01fa1d01", 
    "036dce03", 
    "03e09be4", 
    "04b23f4b",
]

for node_id in ids_to_score:
    data = {"inputs": [node_id]}
    outputs = deployment.predict(data)["outputs"]
    print(f"Anomaly score for node_id {node_id}: {outputs}")
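The loop above prints each score as it arrives. As a minimal sketch, assuming you would rather collect the results for later inspection, the helper below gathers them in a dictionary. `score_nodes` is a hypothetical name, not part of the Hopsworks API. It only assumes an object with a `predict` method that returns a dictionary with an `"outputs"` key, as the deployment does in the loop above.

```python
from typing import Any, Dict, Iterable


def score_nodes(deployment: Any, node_ids: Iterable[str]) -> Dict[str, Any]:
    """Hypothetical helper: map each node ID to the deployment's outputs.

    Assumes `deployment.predict(data)` returns a dict with an "outputs" key,
    as in the loop above. A failed request is recorded as None so that one
    bad ID does not abort the whole batch.
    """
    scores: Dict[str, Any] = {}
    for node_id in node_ids:
        try:
            scores[node_id] = deployment.predict({"inputs": [node_id]})["outputs"]
        except Exception as exc:  # keep going if a single request fails
            print(f"Prediction failed for node_id {node_id}: {exc}")
            scores[node_id] = None
    return scores
```

With the objects defined in this notebook, the call would be `scores = score_nodes(deployment, ids_to_score)`.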

> For troubleshooting, you can use the `get_logs` method.

In [20]:
deployment.get_logs()

Explore all the logs and filters in the Kibana logs at https://c.app.hopsworks.ai:443/p/397461/deployments/200706

Instance name: amlmodeldeployment-predictor-default-00001-deployment-5fb7hrrr8
2024-02-13 13:14:29.034930: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:213] Running initialization op on SavedModel bundle at path: /mnt/models/1
2024-02-13 13:14:29.236935: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:305] SavedModel load for tags { serve }; Status: success: OK. Took 1020226 microseconds.
2024-02-13 13:14:29.254368: I tensorflow_serving/servables/tensorflow/saved_model_warmup_util.cc:62] No warmup data file found at /mnt/models/1/assets.extra/tf_serving_warmup_requests
2024-02-13 13:14:29.810802: I tensorflow_serving/core/loader_harness.cc:95] Successfully loaded servable version {name: amlmodeldeployment version: 1}
2024-02-13 13:14:29.812096: I tensorflow_serving/model_servers/server_core.cc:486] Finished adding/updating models
2024-02-13 1
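Each complete log line above follows the TensorFlow Serving layout: timestamp, a one-letter severity (`I`, `W`, `E` or `F`), source file and line number, then the message. As a minimal sketch, assuming you have the log text available as a string (for example, copied from the cell output or from Kibana), the hypothetical helper `parse_log` below keeps only entries of the severities you ask for. Lines that do not match the layout, such as truncated ones, are skipped.

```python
import re
from typing import List, NamedTuple, Sequence

# Layout of the log lines shown above:
# "<date> <time>: <severity> <source file>:<line>] <message>"
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"(?P<severity>[IWEF]) (?P<source>\S+):(?P<line>\d+)\] (?P<message>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: str
    severity: str
    source: str
    line: int
    message: str


def parse_log(
    log_text: str, severities: Sequence[str] = ("W", "E", "F")
) -> List[LogEntry]:
    """Hypothetical helper: return the log entries with the given severities."""
    entries: List[LogEntry] = []
    for raw in log_text.splitlines():
        match = LOG_LINE.match(raw.strip())
        if match is None:
            continue  # skip truncated or differently formatted lines
        if match["severity"] in severities:
            entries.append(
                LogEntry(
                    match["timestamp"],
                    match["severity"],
                    match["source"],
                    int(match["line"]),
                    match["message"],
                )
            )
    return entries
```

By default only warnings, errors and fatal messages are returned. Pass `severities=("I",)` to see the informational lines instead.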

## <span style='color:#ff5f27'>🚀 Use REST endpoint</span>

You can also use a REST endpoint for your model. To do this, you need to create an API key with 'serving' enabled and retrieve the endpoint URL from the Model Serving UI.

Go to the Model Serving UI and click on the eye icon next to a model to retrieve the endpoint URL. The shorter URL is an internal endpoint that you can only reach from within Hopsworks. If you want to call it from outside, you need one of the longer URLs. 


In [None]:
import os
import requests

mr = project.get_model_registry()

# Use the model name from the previous notebook.
model = mr.get_model(
    name="aml_model", 
    version=1,
)

test_inputs = model.input_example

API_KEY = "..."  # Put your API key here.
MODEL_SERVING_URL = "..."  # Put the model serving endpoint here.
HOST_NAME = "..."  # Put your Hopsworks model serving hostname here.

data = {"inputs": test_inputs}
headers = {
    "Content-Type": "application/json",
    "Accept": "application/json",
    "Authorization": f"ApiKey {API_KEY}",
    "Host": HOST_NAME,
}

# verify=False skips TLS certificate verification; use it only for testing.
response = requests.post(MODEL_SERVING_URL, verify=False, headers=headers, json=data)
response.json()

In [None]:
# Now let's test feature vectors from the online store
ids_to_score = [
    "0016359b", 
    "001dcc27", 
    "0054a022", 
    "00d6b609", 
    "00e14860", 
    "00e39a1b", 
    "014ed5cb", 
    "01ce3306", 
    "01fa19ae", 
    "01fa1d01", 
    "036dce03", 
    "03e09be4", 
    "04b23f4b",
]

for node_id in ids_to_score:
    data = {"inputs": [node_id]}
    outputs = deployment.predict(data)["outputs"]
    print(f"Anomaly score for node_id {node_id}: {outputs}")
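The loop above still goes through `deployment.predict`. As a minimal sketch, assuming you want to send the same per-node requests to the REST endpoint instead, the hypothetical helper `build_inference_request` below assembles the headers and JSON body used in the REST cell earlier in this section. It is an illustration rather than part of the Hopsworks API, and it performs no network call itself.

```python
import json
from typing import Any, Dict, Tuple


def build_inference_request(
    api_key: str, host_name: str, inputs: Any
) -> Tuple[Dict[str, str], str]:
    """Hypothetical helper: build headers and a JSON body for one request.

    Mirrors the header names and the {"inputs": ...} payload used in the
    REST cell above.
    """
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
        "Authorization": f"ApiKey {api_key}",
        "Host": host_name,
    }
    body = json.dumps({"inputs": inputs})
    return headers, body
```

For each `node_id`, you could then call `headers, body = build_inference_request(API_KEY, HOST_NAME, [node_id])` followed by `requests.post(MODEL_SERVING_URL, headers=headers, data=body, timeout=30)`. The 30-second timeout is an arbitrary choice for illustration. Calling `response.raise_for_status()` before reading `response.json()` surfaces HTTP errors early.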

## Stop Deployment
To stop the deployment, simply run:

In [23]:
deployment.stop()

  0%|          | 0/4 [00:00<?, ?it/s]
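A deployment that keeps running after an error in the notebook continues to hold resources. As a minimal sketch, assuming you want the deployment stopped even when a prediction cell fails, the hypothetical context manager `running_deployment` below wraps the `start(await_running=...)` and `stop()` calls used in this notebook. It is a convenience pattern, not part of the Hopsworks API.

```python
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def running_deployment(deployment: Any, await_running: int = 300) -> Iterator[Any]:
    """Hypothetical helper: start a deployment and always stop it afterwards.

    Relies only on `deployment.start(await_running=...)` and
    `deployment.stop()`, the two calls used in this notebook.
    """
    deployment.start(await_running=await_running)
    try:
        yield deployment
    finally:
        # Runs on normal exit and when the body raises an exception.
        deployment.stop()
```

Typical usage would be `with running_deployment(deployment) as d:` followed by an indented block that calls `d.predict(data)`.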

## <span style="color:#ff5f27;"> 🎁 Wrapping things up </span>

In this module, you performed feature engineering, created a feature view and training dataset, trained an adversarial anomaly detection model, and deployed it in production. To set up this pipeline in your enterprise setting, contact us.

<img src="images/contuct_us.png" width="400px"></img>