In [1]:
import time

notebook_start_time = time.time()

# Set up environment

In [2]:
import sys
from pathlib import Path


def is_google_colab() -> bool:
    # `get_ipython()` is injected by IPython/Jupyter; its repr mentions "google.colab" on Colab.
    return "google.colab" in str(get_ipython())


def clone_repository() -> None:
    !git clone https://github.com/decodingml/hands-on-recommender-system.git
    %cd hands-on-recommender-system/


def install_dependencies() -> None:
    !pip install --upgrade uv
    !uv pip install --all-extras --system --requirement pyproject.toml


if is_google_colab():
    clone_repository()
    install_dependencies()

    root_dir = str(Path().absolute())
    print("⛳️ Google Colab environment")
else:
    root_dir = str(Path().absolute().parent)
    print("⛳️ Local environment")

# Add the root directory to the `PYTHONPATH` to use the `recsys` Python module from the notebook.
if root_dir not in sys.path:
    print(f"Adding the following directory to the PYTHONPATH: {root_dir}")
    sys.path.append(root_dir)

⛳️ Local environment
Adding the following directory to the PYTHONPATH: /Users/pauliusztin/Documents/01_projects/hopsworks_recsys/hands-on-recommender-system


# Inference pipeline: Deploying and testing the LLM ranker inference pipeline

In this notebook, we will dig into the inference pipeline and deploy it to Hopsworks as a real-time service.

## 📝 Imports

In [3]:
import warnings

warnings.filterwarnings("ignore")

from loguru import logger

from recsys import hopsworks_integration

## <span style="color:#ff5f27">🔮 Connect to Hopsworks Feature Store </span>

In [4]:
project, fs = hopsworks_integration.get_feature_store()

2024-12-23 19:08:06.884 | INFO     | recsys.hopsworks_integration.feature_store:get_feature_store:13 - Loging to Hopsworks using HOPSWORKS_API_KEY env var.


2024-12-23 19:08:06,885 INFO: Initializing external client
2024-12-23 19:08:06,885 INFO: Base URL: https://c.app.hopsworks.ai:443
2024-12-23 19:08:08,252 INFO: Python Engine initialized.

Logged in to project, explore it here https://c.app.hopsworks.ai:443/p/1192098


# Deploying the ranking inference pipeline


First, register your LLM ranking model in the Hopsworks model registry:

In [5]:
ranking_model = hopsworks_integration.llm_ranking_serving.HopsworksLLMRankingModel()
ranking_model.register(project.get_model_registry())

Uploading: 100.000%|██████████| 5697/5697 elapsed<00:02 remaining<00:00:03,  1.54it/s]
Model export complete: 100%|██████████| 6/6 [00:08<00:00,  1.46s/it]                   

Model created, explore it at https://c.app.hopsworks.ai:443/p/1192098/models/llm_ranking_model/20





Then you can deploy your LLM ranking model, which implements a `Predict` class that tells Hopsworks how to perform inference on it:

In [6]:
ranking_deployment = hopsworks_integration.llm_ranking_serving.HopsworksLLMRankingModel.deploy()

2024-12-23 19:08:19,444 INFO: Closing external client and cleaning up certificates.
2024-12-23 19:08:19,447 INFO: Initializing external client
2024-12-23 19:08:19,447 INFO: Base URL: https://c.app.hopsworks.ai:443
2024-12-23 19:08:20,334 INFO: Closing external client and cleaning up certificates.
Connection closed.
2024-12-23 19:08:20,340 INFO: Initializing external client
2024-12-23 19:08:20,340 INFO: Base URL: https://c.app.hopsworks.ai:443
2024-12-23 19:08:21,701 INFO: Python Engine initialized.

Logged in to project, explore it here https://c.app.hopsworks.ai:443/p/1192098
Secret created successfully, explore it at https://c.app.hopsworks.ai:443/account/secrets
2024-12-23 19:08:24,087 INFO: Closing external client and cleaning up certificates.
Connection closed.
2024-12-23 19:08:24,091 INFO: Initializing external client
2024-12-23 19:08:24,092 INFO: Base URL: https://c.app.hopsworks.ai:443
2024-12-23 19:08:25,414 INFO: Python Engine initialized.

Logged in to project, explore it he

Uploading: 100.000%|██████████| 42/42 elapsed<00:02 remaining<00:00
Uploading: 100.000%|██████████| 4491/4491 elapsed<00:01 remaining<00:00
Uploading: 100.000%|██████████| 5697/5697 elapsed<00:01 remaining<00:00


Deployment with the same name already exists. Getting existing deployment...
To create a new deployment choose a different name.
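
To make the `Predict` class mentioned above more concrete, here is a minimal sketch of the shape such a predictor script usually takes: an `__init__` that loads whatever the model needs and a `predict(inputs)` method that returns the predictions. Everything inside it is an assumption for illustration: the in-memory candidate embeddings, the dot-product scoring, and the article names are hypothetical, and the repository's real predictor calls an LLM instead. Only the `{"ranking": [[score, article_id], ...]}` response shape is inferred from how this notebook reads the results.

```python
class Predict:
    """Toy stand-in for a predictor script; NOT the repository's LLM ranker."""

    def __init__(self) -> None:
        # Hypothetical candidates: article_id -> 2-dimensional embedding.
        self.candidates = {
            "article_a": [1.0, 0.0],
            "article_b": [0.0, 1.0],
            "article_c": [0.5, 0.5],
        }

    def predict(self, inputs: list) -> dict:
        # Assumption: the first input carries the query embedding.
        query = inputs[0]["query_emb"]
        # Score every candidate with a dot product (a stand-in for the LLM score).
        scored = [
            (sum(q * c for q, c in zip(query, emb)), article_id)
            for article_id, emb in self.candidates.items()
        ]
        # Highest score first; each entry ends with the article id.
        scored.sort(reverse=True)
        return {"ranking": [[score, article_id] for score, article_id in scored]}


response = Predict().predict([{"customer_id": "demo", "query_emb": [0.9, 0.1]}])
top_ids = [candidate[-1] for candidate in response["ranking"]]
print(top_ids)
```

Keeping the article id as the last element of each ranked entry is what lets a helper such as `get_top_recommendations` extract ids with `candidate[-1]`.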


Now, we have to explicitly start the deployment:

In [7]:
ranking_deployment.start()

Deployment is already running


## <span style="color:#ff5f27"> Test the ranking inference pipeline</span>


In [8]:
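def get_top_recommendations(ranked_candidates, k=3):
    """Return the article ids of the top-k ranked candidates.

    Each entry in `ranked_candidates["ranking"]` is expected to end with the article id.
    """
    return [candidate[-1] for candidate in ranked_candidates["ranking"][:k]]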
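
As a quick illustration of what this helper expects, the sketch below runs it on a hand-written payload. The payload is an assumption: the scores and the `"000000000"` id are made up, and the `[score, article_id]` layout is only inferred from the `candidate[-1]` indexing. The function is repeated so the snippet runs on its own.

```python
def get_top_recommendations(ranked_candidates, k=3):
    return [candidate[-1] for candidate in ranked_candidates["ranking"][:k]]


# Hypothetical response body: entries are already sorted by score, id comes last.
example_predictions = {
    "ranking": [
        [0.91, "899003002"],
        [0.84, "615192004"],
        [0.40, "398089004"],
        [0.12, "000000000"],  # placeholder id, not a real article
    ]
}

top_two = get_top_recommendations(example_predictions, k=2)
print(top_two)  # ['899003002', '615192004']
```

The helper does no sorting itself, so it relies on the deployment returning candidates in ranked order.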

Let's define a dummy input to test the ranking deployment. Only the `customer_id` has to match; the remaining values are placeholders:

In [10]:
test_ranking_input = [
    {
        "customer_id": "d327d0ad9e30085a436933dfbb7f77cf42e38447993a078ed35d93e3fd350ecf",
        "month_sin": 1.2246467991473532e-16,
        "query_emb": [
            0.214135289,
            0.571055949,
            0.330709577,
            -0.225899458,
            -0.308674961,
            -0.0115124583,
            0.0730511621,
            -0.495835781,
            0.625569344,
            -0.0438038409,
            0.263472944,
            -0.58485353,
            -0.307070434,
            0.0414443575,
            -0.321789205,
            0.966559,
        ],
        "month_cos": -1.0,
    }
]

# Test ranking deployment
ranked_candidates = ranking_deployment.predict(inputs=test_ranking_input)

# Retrieve article ids of the top recommended items
recommendations = get_top_recommendations(ranked_candidates["predictions"], k=3)
recommendations

['899003002', '615192004', '398089004']
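
The `month_sin` and `month_cos` values in the dummy input look like a cyclical encoding of the month: a sine of roughly zero together with a cosine of -1 corresponds to an angle of π. The sketch below assumes the common `2π · month / 12` formula, under which month 6 lands on that angle. Whether the repository's feature pipeline uses exactly this formula is an assumption, and `encode_month` is a hypothetical helper name.

```python
import math


def encode_month(month: int) -> tuple:
    """Map a month number (1-12) to a point on the unit circle."""
    angle = 2 * math.pi * month / 12
    return math.sin(angle), math.cos(angle)


month_sin, month_cos = encode_month(6)
print(month_sin, month_cos)  # sine is ~0 (floating-point noise), cosine is ~-1
```

Encoding the month this way keeps December and January close together, which a plain integer from 1 to 12 would not.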

Check logs in case of failure:

In [10]:
# ranking_deployment.get_logs(component="predictor", tail=200)

Explore all the logs and filters in the Kibana logs at https://c.app.hopsworks.ai:443/p/1192098/deployments/352292

DeployableComponentLogs(instance_name: 'llmranking-predictor-00001-deployment-869b4cc969-wpllw', date: datetime.datetime(2024, 12, 23, 18, 52, 34, 305517)) 
            You are a helpful assistant specialized in predicting customer behavior. Your task is to analyze the features of a product and predict the probability of it being purchased by a customer.

            ### Instructions:
            1. Use the provided features of the product to make your prediction.
            2. Consider the following numeric and categorical features:
               - Numeric features: These are quantitative attributes, such as numerical identifiers or measurements.
               - Categorical features: These describe qualitative aspects, like product category, color, and material.
            3. Your response should only include the probability of purchase for the positive class (e.g., 

# Deploying the query inference pipeline

In [11]:
query_model_deployment = (
    hopsworks_integration.two_tower_serving.HopsworksQueryModel.deploy(ranking_model_type="llmranking")
)

2024-12-23 19:12:24,634 INFO: Closing external client and cleaning up certificates.
Connection closed.
2024-12-23 19:12:24,637 INFO: Initializing external client
2024-12-23 19:12:24,637 INFO: Base URL: https://c.app.hopsworks.ai:443
2024-12-23 19:12:26,070 INFO: Python Engine initialized.

Logged in to project, explore it here https://c.app.hopsworks.ai:443/p/1192098
2024-12-23 19:12:27,338 INFO: Closing external client and cleaning up certificates.
2024-12-23 19:12:27,343 INFO: Initializing external client
2024-12-23 19:12:27,343 INFO: Base URL: https://c.app.hopsworks.ai:443
2024-12-23 19:12:28,235 INFO: Closing external client and cleaning up certificates.
Connection closed.
2024-12-23 19:12:28,239 INFO: Initializing external client
2024-12-23 19:12:28,240 INFO: Base URL: https://c.app.hopsworks.ai:443
2024-12-23 19:12:29,561 INFO: Python Engine initialized.

Logged in to project, explore it here https://c.app.hopsworks.ai:443/p/1192098
Secret created successfully, explore it at htt

Uploading: 100.000%|██████████| 2948/2948 elapsed<00:01 remaining<00:00


Deployment with the same name already exists. Getting existing deployment...
To create a new deployment choose a different name.
Before making predictions, start the deployment by using `.start()`


At this point, the deployment exists but is not running yet. To start it, run:

In [12]:
query_model_deployment.start()

Deployment is ready: 100%|██████████| 6/6 [00:26<00:00,  4.44s/it]    

Start making predictions by using `.predict()`





## <span style="color:#ff5f27"> Testing the inference pipeline </span>

Define a test input example:

In [13]:
data = [
    {
        "customer_id": "d327d0ad9e30085a436933dfbb7f77cf42e38447993a078ed35d93e3fd350ecf",
        "transaction_date": "2022-11-15T12:16:25.330916",
    }
]
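
The `transaction_date` above is an ISO 8601 timestamp string. If you start from a `datetime` object instead, a small helper can build the same payload. This is a sketch: `build_query_input` is a hypothetical name, not a function from the `recsys` module.

```python
from datetime import datetime


def build_query_input(customer_id: str, when: datetime) -> list:
    """Build the list-of-dicts payload used as `inputs` for the query deployment."""
    return [{"customer_id": customer_id, "transaction_date": when.isoformat()}]


data = build_query_input(
    "d327d0ad9e30085a436933dfbb7f77cf42e38447993a078ed35d93e3fd350ecf",
    datetime(2022, 11, 15, 12, 16, 25, 330916),
)
print(data[0]["transaction_date"])  # 2022-11-15T12:16:25.330916
```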

Test out the deployment:

In [14]:
ranked_candidates = query_model_deployment.predict(inputs=data)

# Retrieve article ids of the top recommended items
recommendations = get_top_recommendations(ranked_candidates["predictions"], k=3)
recommendations

['584631021', '549253002', '557248025']

# <span style="color:#ff5f27"> Stopping the Hopsworks deployment </span>

Stop both deployments when you are not using them.

In [15]:
ranking_deployment.stop()
query_model_deployment.stop()

Deployment is stopped: 100%|██████████| 4/4 [00:10<00:00,  2.68s/it]        
Deployment is stopped: 100%|██████████| 4/4 [00:10<00:00,  2.67s/it]        
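
If a test cell fails halfway through, the deployments keep running until someone stops them. One way to guard against that is a small context manager that always calls `.stop()`. The sketch below is an assumption-laden illustration: `running` is a hypothetical helper, and `FakeDeployment` is a stand-in that only mimics the `.start()` and `.stop()` methods used in this notebook.

```python
from contextlib import contextmanager


@contextmanager
def running(*deployments):
    """Start the given deployments and stop them again, even if an error occurs."""
    started = []
    try:
        for deployment in deployments:
            deployment.start()
            started.append(deployment)
        yield deployments
    finally:
        # Stop in reverse order, and only what was actually started.
        for deployment in reversed(started):
            deployment.stop()


class FakeDeployment:
    """Stand-in object so the sketch runs without a Hopsworks connection."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.is_running = False

    def start(self) -> None:
        self.is_running = True

    def stop(self) -> None:
        self.is_running = False


ranking, query = FakeDeployment("ranking"), FakeDeployment("query")
try:
    with running(ranking, query):
        raise RuntimeError("simulated prediction failure")
except RuntimeError:
    pass

states = (ranking.is_running, query.is_running)
print(states)  # (False, False)
```

With real deployments you would pass `ranking_deployment` and `query_model_deployment` instead of the fake objects.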


## <span style="color:#ff5f27"> Inspecting the deployments in Hopsworks UI </span>

Open the [Hopsworks UI](https://www.hopsworks.ai/), go to the **Data Science → Deployments** section, and inspect the newly created deployments.

---

In [16]:
notebook_end_time = time.time()
notebook_execution_time = notebook_end_time - notebook_start_time

logger.info(
    f"⌛️ Notebook Execution time: {notebook_execution_time:.2f} seconds ~ {notebook_execution_time / 60:.2f} minutes"
)

2024-12-23 19:14:11.118 | INFO     | __main__:<module>:4 - ⌛️ Notebook Execution time: 368.37 seconds ~ 6.14 minutes


# <span style="color:#ff5f27">→ Next Steps </span>

The last step is to schedule the materialization jobs.