![](../_resources/images/e2eai-4.jpg)


# MLOps on Databricks

Success with ML and AI takes more than building models. We also need to manage lineage, repeatability, model consumption, and ongoing performance, ideally through a repeatable and automated process. This is the idea behind MLOps.

In this notebook, we will use our registered model for batch inference through a Spark UDF, then deploy it to a live API endpoint managed by Databricks.

In [0]:
%pip install --quiet databricks-sdk==0.40.0 mlflow==2.22.0
dbutils.library.restartPython()

In [0]:
%run ../_resources/00-setup $reset_all_data=false

In [0]:

import numpy as np
import pandas as pd
import mlflow
from mlflow.models import infer_signature
from mlflow import MlflowClient
from mlflow.deployments import get_deploy_client
import os
import time  # used when polling the endpoint status
import requests
import json


In [0]:
mlflow.set_registry_uri('databricks-uc')

### Create a local UDF for our saved model
By creating a local UDF with an ML model, you can seamlessly integrate machine learning predictions into your data processing workflows in Spark. This method makes the model available (via the UDF) in your local notebook scope and is useful for batch inference.

Databricks spins up a virtual environment for the model so that it runs in the same environment in which it was built.

In [0]:
#### Change the model_name here if you changed it in the prior notebook ####
model_name = "turbine_maintenance"

In [0]:
# Creating a User-Defined Function (UDF) with an ML model in Spark allows you to apply the model to data within a Spark DataFrame. This means you can use the model to make predictions directly in your Spark SQL queries or DataFrame operations.

# spark_udf loads the model in a virtual environment which can take 15+ minutes to build.

# This UDF is available in the context of this notebook.

predict_maintenance = mlflow.pyfunc.spark_udf(spark, 
                                              f"models:/{catalog}.{db}.{model_name}@prod", 
                                              "float", #output
                                              env_manager='virtualenv'
                                              )


# This registers the UDF with Spark SQL, allowing you to use it in SQL queries.
spark.udf.register("predict_maintenance", predict_maintenance)

In [0]:
# Retrieve the names of the input columns that the model expects.
columns = predict_maintenance.metadata.get_input_schema().input_names()

columns

In [0]:
# Check the signature / expected input schema
predict_maintenance.metadata.get_input_schema()

In [0]:
# Apply the UDF to a Spark DataFrame, adding a new column with the model's predictions.

batch_pred_df = spark.table('turbine_hourly_features').withColumn("predict_turbine_maintenance", predict_maintenance(*columns))

display(batch_pred_df)

# create a table in the catalog
batch_pred_df.write.mode("overwrite").saveAsTable("turbine_hourly_predictions")


In [0]:
%sql
-- An example of using our UDF in a SQL query
SELECT turbine_id, 
    predict_maintenance(avg_energy, 
                        std_sensor_A, 
                        std_sensor_B, 
                        std_sensor_C, 
                        std_sensor_D, 
                        std_sensor_E, 
                        std_sensor_F) as prediction 
FROM turbine_hourly_features
LIMIT 10

### Create a serving endpoint for our model
By creating a serving endpoint for an ML model, you can seamlessly integrate machine learning predictions into external applications or agents.  This method makes the model available (via the endpoint) outside your local notebook scope and is useful for near real-time or on-demand inference.

Once again, Databricks builds an environment for the model that matches the one in which it was built. This environment stays up and running, waiting for inference requests. To help save cost, the endpoint created below sets `scale_to_zero_enabled`, so its compute scales down to zero after a period of idle time.

In [0]:
# Display the endpoint name defined by the setup notebook
MODEL_SERVING_ENDPOINT_NAME

In [0]:
# Delete the endpoint if it already exists
client = get_deploy_client("databricks")

for each in client.list_endpoints():
    if each['name'] == MODEL_SERVING_ENDPOINT_NAME:
        client.delete_endpoint(MODEL_SERVING_ENDPOINT_NAME)


In [0]:
# Endpoint creation spins up a container that will run the model for inference. This can take 15+ minutes to complete.

# The endpoint will be available via API or SDK outside this notebook context.

client = get_deploy_client("databricks")

try:
    endpoint = client.create_endpoint(
        name=MODEL_SERVING_ENDPOINT_NAME,
        config={
            "served_entities": [
                {
                    "name": "iot-maintenance-serving-endpoint",
                    "entity_name": f"{catalog}.{db}.{model_name}",
                    "entity_version": get_last_model_version(f"{catalog}.{db}.{model_name}"),
                    "workload_size": "Small",
                    "scale_to_zero_enabled": True
                }
            ]
        }
    )
except Exception as e:
    if "already exists" in str(e):
        print(f"Endpoint {MODEL_SERVING_ENDPOINT_NAME} already exists. Skipping creation.")
    else:
        raise

# Poll every 10 seconds until the configuration update finishes.
while client.get_endpoint(MODEL_SERVING_ENDPOINT_NAME)['state']['config_update'] == 'IN_PROGRESS':
    time.sleep(10)

if client.get_endpoint(MODEL_SERVING_ENDPOINT_NAME)['state']['ready'] != 'READY':
    print(f"Endpoint {MODEL_SERVING_ENDPOINT_NAME} creation failed.")
else:
    print(f"Endpoint {MODEL_SERVING_ENDPOINT_NAME} created successfully.")
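
The polling loop above waits indefinitely if the update never leaves `IN_PROGRESS`. A minimal sketch of the same wait with a timeout follows. The helper name `wait_for_endpoint`, its parameters, and the 30-minute default are assumptions for illustration and are not part of the Databricks client. `get_state` stands in for a call such as `lambda: client.get_endpoint(MODEL_SERVING_ENDPOINT_NAME)['state']`.

```python
import time


def wait_for_endpoint(get_state, timeout_s=1800, poll_interval_s=10, sleep=time.sleep):
    """Poll get_state() until the config update finishes or the timeout expires.

    get_state: hypothetical zero-argument callable returning the endpoint's
               'state' dict (with 'config_update' and 'ready' keys).
    Returns True if the endpoint reports READY, False otherwise.
    Raises TimeoutError if the update is still in progress at the deadline.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state()
        if state.get("config_update") != "IN_PROGRESS":
            return state.get("ready") == "READY"
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Endpoint still updating after {timeout_s} seconds")
        sleep(poll_interval_s)


# In the notebook this might be called as (assumption, not run here):
# ready = wait_for_endpoint(
#     lambda: client.get_endpoint(MODEL_SERVING_ENDPOINT_NAME)["state"])
```

Passing the state lookup in as a callable keeps the helper independent of the deployment client, which also makes it easy to exercise with stubbed states.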

### Test our serving endpoint

Python API method

In [0]:
# Get the API endpoint and token for the current notebook context
# No trailing slash here: score_model appends '/serving-endpoints/...' to API_ROOT
API_ROOT = f"https://{dbutils.notebook.entry_point.getDbutils().notebook().getContext().browserHostName().value()}"
API_TOKEN = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().getOrElse(None)

In [0]:
def create_tf_serving_json(data):
    return {'inputs': {name: data[name].tolist() for name in data.keys()} if isinstance(data, dict) else data.tolist()}

def score_model(dataset):

    url = f'{API_ROOT}/serving-endpoints/{MODEL_SERVING_ENDPOINT_NAME}/invocations'

    headers = {'Authorization': f'Bearer {API_TOKEN}', 
               'Content-Type': 'application/json'}


    ds_dict = {'dataframe_split': dataset.to_dict(orient='split')} if isinstance(dataset, pd.DataFrame) else create_tf_serving_json(dataset)

    data_json = json.dumps(ds_dict, allow_nan=True)

    response = requests.request(method='POST', headers=headers, url=url, data=data_json)
    
    if response.status_code != 200:
        raise Exception(f'Request failed with status {response.status_code}, {response.text}')
    return response.json()
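
To make the request body concrete, here is a rough sketch of the `dataframe_split` JSON that `score_model` sends for a DataFrame. The helper `to_dataframe_split` is hypothetical and uses plain Python lists so that it runs without pandas; `DataFrame.to_dict(orient='split')` additionally emits an `index` entry. The sample values are the ones used in the `ai_query` example later in this notebook.

```python
import json


def to_dataframe_split(columns, rows):
    """Build a 'dataframe_split' request body from column names and row values.

    Hypothetical helper for illustration: each row must supply one value per column.
    """
    columns = list(columns)
    data = []
    for row in rows:
        row = list(row)
        if len(row) != len(columns):
            raise ValueError(f"Expected {len(columns)} values per row, got {len(row)}")
        data.append(row)
    return {"dataframe_split": {"columns": columns, "data": data}}


feature_columns = ["avg_energy", "std_sensor_A", "std_sensor_B", "std_sensor_C",
                   "std_sensor_D", "std_sensor_E", "std_sensor_F"]
sample_rows = [[0.1889, 0.9644, 2.6558, 3.4528, 2.4851, 2.2884, 4.7021]]

payload = to_dataframe_split(feature_columns, sample_rows)
print(json.dumps(payload, indent=2))  # the JSON text that would be POSTed
```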

In [0]:
columns = ['avg_energy', 'std_sensor_A', 'std_sensor_B', 'std_sensor_C', 'std_sensor_D', 'std_sensor_E', 'std_sensor_F']

# Get 5 rows to test with (limit in Spark so only 5 rows are collected to the driver)
dataset = spark.table('turbine_hourly_features').select(*columns).limit(5).toPandas()

dataset

In [0]:
# Use our function to call the API of our model and get inferences live!
score_model(dataset)
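
The endpoint returns a JSON object, and for a model like this one the scores are typically found under a `predictions` key, one per input row. A small sketch of pairing each input row with its score follows. `attach_predictions` is a hypothetical helper, and the response shown is a made-up placeholder rather than real model output.

```python
def attach_predictions(rows, response):
    """Return a copy of each input row with its prediction added.

    rows: list of dicts, e.g. dataset.to_dict(orient="records") in the notebook.
    response: parsed JSON from the endpoint, assumed to contain a 'predictions' list.
    """
    predictions = response["predictions"]
    if len(predictions) != len(rows):
        raise ValueError(f"Got {len(predictions)} predictions for {len(rows)} rows")
    return [{**row, "prediction": pred} for row, pred in zip(rows, predictions)]


# Placeholder data for illustration only (not real model output).
example_rows = [{"avg_energy": 0.1889, "std_sensor_A": 0.9644}]
example_response = {"predictions": [0.0]}

scored_rows = attach_predictions(example_rows, example_response)
print(scored_rows)
```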

Using ai_query

In [0]:
%sql
-- Query our endpoint name using ai_query
-- ai_query is a powerful way to apply any ML or AI endpoint to a large dataset
SELECT ai_query('e2eai_iot_turbine_prediction_endpoint',
STRUCT(CAST(avg_energy AS DOUBLE) AS avg_energy, 
      CAST(std_sensor_A AS DOUBLE) AS std_sensor_A, 
      CAST(std_sensor_B AS DOUBLE) AS std_sensor_B, 
      CAST(std_sensor_C AS DOUBLE) AS std_sensor_C, 
      CAST(std_sensor_D AS DOUBLE) AS std_sensor_D, 
      CAST(std_sensor_E AS DOUBLE) AS std_sensor_E,
      CAST(std_sensor_F AS DOUBLE) AS std_sensor_F), 
returnType => 'FLOAT') AS prediction
FROM turbine_hourly_features
LIMIT 3
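
The `STRUCT` above repeats each column name twice, which is easy to get wrong as the feature list grows. As a sketch, the same statement can be generated from a list of column names in Python. `build_ai_query_sql` is a hypothetical helper, and it assumes the endpoint, table, and column names are trusted identifiers, since they are interpolated into the SQL text without escaping.

```python
def build_ai_query_sql(endpoint_name, columns, table, return_type="FLOAT", limit=3):
    """Generate an ai_query statement that casts every feature column to DOUBLE.

    Hypothetical helper: all names are inserted into the SQL text as-is,
    so only pass identifiers you control.
    """
    fields = ",\n         ".join(f"CAST({c} AS DOUBLE) AS {c}" for c in columns)
    return (
        f"SELECT ai_query('{endpoint_name}',\n"
        f"  STRUCT({fields}),\n"
        f"  returnType => '{return_type}') AS prediction\n"
        f"FROM {table}\n"
        f"LIMIT {limit}"
    )


feature_columns = ["avg_energy", "std_sensor_A", "std_sensor_B", "std_sensor_C",
                   "std_sensor_D", "std_sensor_E", "std_sensor_F"]
sql_text = build_ai_query_sql("e2eai_iot_turbine_prediction_endpoint",
                              feature_columns, "turbine_hourly_features")
print(sql_text)
```

In the notebook, the generated text could be run with `spark.sql(sql_text)`, and `MODEL_SERVING_ENDPOINT_NAME` could be passed in place of the hard-coded endpoint name.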

In [0]:
%sql
-- Query our endpoint name using ai_query with values we provide (not from a table)
SELECT ai_query('e2eai_iot_turbine_prediction_endpoint',
  STRUCT(0.1889 AS avg_energy, 
        0.9644 AS std_sensor_A, 
        2.6558 AS std_sensor_B, 
        3.4528 AS std_sensor_C, 
        2.4851 AS std_sensor_D, 
        2.2884 AS std_sensor_E, 
        4.7021 AS std_sensor_F),
  returnType => 'FLOAT') AS prediction

Try viewing your endpoint in the UI by going to Serving, then clicking your endpoint name. From this page, you can view usage statistics for your endpoint or click `Use` to see code examples of the different ways to call it.