![](../_resources/images/e2eai-4.jpg)


# MLOps on Databricks

Being successful with ML and AI is about more than just building models. We need to consider lineage, repeatability, model consumption, and ongoing performance, and all of this should be managed through a repeatable, ideally automated, process. This is the idea behind MLOps.

In this notebook, we will walk through model deployment to a live API Endpoint managed by Databricks.

In [0]:
%pip install --quiet databricks-sdk==0.40.0 mlflow==2.22.0
dbutils.library.restartPython()

Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages.


In [0]:
%run ../_resources/00-setup $reset_all_data=false

## Configuration file

Please change your catalog and schema here to run the demo on a different catalog.

 


# Technical Setup notebook. Hide this cell results
Initialize dataset to the current user and cleanup data when reset_all_data is set to true

Do not edit

USE CATALOG `main`
using catalog.database `main`.`e2eai_iot_turbine`


data already existing. Run with reset_all_data=true to force a data cleanup for your local demo.


In [0]:

import numpy as np
import pandas as pd
import mlflow
from mlflow.models import infer_signature
from mlflow import MlflowClient
from mlflow.deployments import get_deploy_client
import os
import requests
import json


In [0]:
mlflow.set_registry_uri('databricks-uc')

### Create a local UDF for our saved model
By creating a local UDF with an ML model, you can seamlessly integrate machine learning predictions into your data processing workflows in Spark. This method makes the model available (via the UDF) in your local notebook scope and is useful for batch inference.

Databricks spins up a virtual environment for the model so that it runs in the same environment in which it was built.

In [0]:
#### Change the model_name here if you changed it in the prior notebook ####
model_name = "turbine_maintenance"

In [0]:
# Creating a User-Defined Function (UDF) with an ML model in Spark allows you to apply the model to data within a Spark DataFrame. This means you can use the model to make predictions directly in your Spark SQL queries or DataFrame operations.

# spark_udf loads the model in a virtual environment which can take 15+ minutes to build.

# This UDF is available in the context of this notebook.

predict_maintenance = mlflow.pyfunc.spark_udf(spark, 
                                              f"models:/{catalog}.{db}.{model_name}@prod", 
                                              "float", #output
                                              env_manager='virtualenv'
                                              )


# This registers the UDF with Spark SQL, allowing you to use it in SQL queries.
spark.udf.register("predict_maintenance", predict_maintenance)

2025/10/15 20:54:46 INFO mlflow.pyfunc: This UDF will use virtualenv to recreate the model's software environment for inference. This may take extra time during execution.
2025/10/15 20:54:46 INFO mlflow.models.flavor_backend_registry: Selected backend for flavor 'python_function'
2025/10/15 20:54:46 INFO mlflow.models.flavor_backend_registry: Selected backend for flavor 'python_function'
2025/10/15 20:54:46 INFO mlflow.utils.virtualenv: Installing python 3.12.3 if it does not exist
Downloading Python-3.12.3.tar.xz...
-> https://www.python.org/ftp/python/3.12.3/Python-3.12.3.tar.xz
Installing Python-3.12.3...
Installed Python-3.12.3 to /tmp/mlflow-8057bbe0c34de1bfd598b2ffa30bcbe6a84cb715-client.3.6-aarch64/pyenv_root/versions/3.12.3
2025/10/15 20:57:35 INFO mlflow.utils.virtualenv: Creating a new environment in /tmp/mlflow-8057bbe0c34de1bfd598b2ffa30bcbe6a84cb715-client.3.6-aarch64/virtualenv_envs/mlflow-8057bbe0c34de1bfd598b2ffa30bcbe6a84cb715 with /tmp/mlflow-8057bbe0c34de1bfd598b2ff

created virtual environment CPython3.12.3.final.0-64 in 244ms
  creator CPython3Posix(dest=/tmp/mlflow-8057bbe0c34de1bfd598b2ffa30bcbe6a84cb715-client.3.6-aarch64/virtualenv_envs/mlflow-8057bbe0c34de1bfd598b2ffa30bcbe6a84cb715, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, via=copy, app_data_dir=/home/spark-1796c14d-0886-4668-b72e-b6/.local/share/virtualenv)
    added seed packages: pip==25.0.1
  activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator
Collecting setuptools==75.8.0 (from -r requirements.a648c3b451ff4a13a279761dfdbacf2b.txt (line 2))
  Downloading setuptools-75.8.0-py3-none-any.whl.metadata (6.7 kB)
Collecting wheel==0.45.1 (from -r requirements.a648c3b451ff4a13a279761dfdbacf2b.txt (line 3))
  Downloading wheel-0.45.1-py3-none-any.whl.metadata (2.3 kB)
Downloading setuptools-75.8.0-py3-none-any.whl (1.2 MB)

[notice] A new release of pip is available: 25.0.1 -> 25.2
[notice] To update, run: pip install --upgrade pip


Collecting mlflow==2.22.0 (from -r /tmp/tmp159dejvg/requirements.txt (line 1))
  Downloading mlflow-2.22.0-py3-none-any.whl.metadata (30 kB)
Collecting numpy==1.26.4 (from -r /tmp/tmp159dejvg/requirements.txt (line 2))
  Downloading numpy-1.26.4-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (62 kB)
Collecting pandas==1.5.3 (from -r /tmp/tmp159dejvg/requirements.txt (line 3))
  Downloading pandas-1.5.3.tar.gz (5.2 MB)
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: still running...
  Getting requirements to build wheel: finished with status 'done'
  Preparing metadata (pyproject.toml): starte

2025/10/15 21:10:46 INFO mlflow.utils.environment: === Running command '['bash', '-c', 'source /tmp/mlflow-8057bbe0c34de1bfd598b2ffa30bcbe6a84cb715-client.3.6-aarch64/virtualenv_envs/mlflow-8057bbe0c34de1bfd598b2ffa30bcbe6a84cb715/bin/activate && python -c ""']'


<function mlflow.pyfunc.spark_udf.<locals>.udf(iterator: Iterator[Tuple[Union[pandas.core.series.Series, pandas.core.frame.DataFrame], ...]]) -> Iterator[pandas.core.series.Series]>

In [0]:
# Retrieve the names of the input columns that the model expects.
columns = predict_maintenance.metadata.get_input_schema().input_names()

columns

['avg_energy',
 'std_sensor_A',
 'std_sensor_B',
 'std_sensor_C',
 'std_sensor_D',
 'std_sensor_E',
 'std_sensor_F']

In [0]:
# Check the signature / expected input schema
predict_maintenance.metadata.get_input_schema()

['avg_energy': double (required), 'std_sensor_A': double (required), 'std_sensor_B': double (required), 'std_sensor_C': double (required), 'std_sensor_D': double (required), 'std_sensor_E': double (required), 'std_sensor_F': double (required)]
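
Because every input is marked as required, it can help to confirm that a table exposes all of them before applying the UDF. The following is a minimal sketch in plain Python; `missing_inputs` is a hypothetical helper, and `table_columns` is a made-up list standing in for something like `spark.table(...).columns`.

```python
def missing_inputs(expected, available):
    """Return the expected model inputs that are absent from `available`, in order."""
    available = set(available)
    return [name for name in expected if name not in available]

# Input names copied from the schema printed above.
expected = ["avg_energy", "std_sensor_A", "std_sensor_B", "std_sensor_C",
            "std_sensor_D", "std_sensor_E", "std_sensor_F"]

# Hypothetical column list for illustration; std_sensor_F is deliberately left out.
table_columns = ["turbine_id", "hourly_timestamp", "avg_energy", "std_sensor_A",
                 "std_sensor_B", "std_sensor_C", "std_sensor_D", "std_sensor_E"]

print(missing_inputs(expected, table_columns))
```

An empty list means the table covers the model signature; any name returned points to a column that needs to be added or renamed before inference.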

In [0]:
# Apply the UDF to a Spark DataFrame, adding a new column with the model's predictions.

batch_pred_df = spark.table('turbine_hourly_features').withColumn("predict_turbine_maintenance", predict_maintenance(*columns))

display(batch_pred_df)

# create a table in the catalog
batch_pred_df.write.mode("overwrite").saveAsTable("turbine_hourly_predictions")

turbine_id,hourly_timestamp,avg_energy,std_sensor_A,std_sensor_B,std_sensor_C,std_sensor_D,std_sensor_E,std_sensor_F,location,model,state,abnormal_sensor,predict_turbine_maintenance
004a641f-e9e5-9fff-d421-1bf88319420b,2024-01-16T17:00:00.000Z,0.1889792040091697,0.9644652043128558,2.6558386572409103,3.4528106013576214,2.485158752607405,2.2884032468369284,4.702138990110717,Tupelo,EpicWind,America/Chicago,sensor_F,0.0
004a641f-e9e5-9fff-d421-1bf88319420b,2024-01-16T18:00:00.000Z,0.1921225762992177,1.0681855556261903,2.3848184303882847,3.303412042721332,2.172251292324001,2.342593019596896,4.870875418724548,Tupelo,EpicWind,America/Chicago,sensor_F,0.0
004a641f-e9e5-9fff-d421-1bf88319420b,2024-01-16T19:00:00.000Z,0.1735634457450677,1.1420887720146298,2.062708699095104,3.019329663712003,2.339552044868049,2.7306978700770164,4.237196637787606,Tupelo,EpicWind,America/Chicago,sensor_F,0.0
004a641f-e9e5-9fff-d421-1bf88319420b,2024-01-16T20:00:00.000Z,0.1034340926271473,1.0498727154061804,2.219216509159497,3.246726138931612,2.3204665834317817,2.662700177613455,4.289404582190178,Tupelo,EpicWind,America/Chicago,sensor_F,0.0
004a641f-e9e5-9fff-d421-1bf88319420b,2024-01-16T21:00:00.000Z,0.1548124352749333,1.0325552090494656,2.142101655549623,2.7298423212662217,2.3597486817214515,2.761466398058171,4.588788770497015,Tupelo,EpicWind,America/Chicago,sensor_F,0.0
004a641f-e9e5-9fff-d421-1bf88319420b,2024-01-16T22:00:00.000Z,0.0847723255024208,1.0021697211227565,2.0968943765292085,2.921547258775341,2.477840322666964,2.9466029618007314,4.357159925464822,Tupelo,EpicWind,America/Chicago,sensor_F,0.0
004a641f-e9e5-9fff-d421-1bf88319420b,2024-01-16T23:00:00.000Z,0.074818609038791,1.058048335093487,2.4852932716249665,2.8927160852893152,2.1567050955955853,2.2120358529793696,5.614526027139428,Tupelo,EpicWind,America/Chicago,sensor_F,0.0
00f27248-1f4f-e174-432c-53bd2a9158df,2024-01-16T17:00:00.000Z,0.1283965372105728,1.065608883199752,1.9263319253102171,3.3330563526547747,2.230040196141461,2.354626086386649,1.8913049031607985,Crystal Lake,EpicWind,America/Chicago,ok,1.0
00f27248-1f4f-e174-432c-53bd2a9158df,2024-01-16T18:00:00.000Z,0.8542245491303897,1.080309777815946,1.9618452098136363,2.9717426105145472,2.306627597988137,2.5166973688595817,1.980452948870913,Crystal Lake,EpicWind,America/Chicago,ok,1.0
00f27248-1f4f-e174-432c-53bd2a9158df,2024-01-16T19:00:00.000Z,0.4915535666395597,1.0646332592567709,2.2186746553400307,3.3459438407963438,2.2847856939507167,2.5560343320959498,1.9519204325253467,Crystal Lake,EpicWind,America/Chicago,ok,1.0


Databricks visualization. Run in Databricks to view.

In [0]:
%sql
-- An example of using our UDF in a SQL query
SELECT turbine_id, 
    predict_maintenance(avg_energy, 
                        std_sensor_A, 
                        std_sensor_B, 
                        std_sensor_C, 
                        std_sensor_D, 
                        std_sensor_E, 
                        std_sensor_F) as prediction 
FROM turbine_hourly_features
LIMIT 10

turbine_id,prediction
004a641f-e9e5-9fff-d421-1bf88319420b,0.0
004a641f-e9e5-9fff-d421-1bf88319420b,0.0
004a641f-e9e5-9fff-d421-1bf88319420b,0.0
004a641f-e9e5-9fff-d421-1bf88319420b,0.0
004a641f-e9e5-9fff-d421-1bf88319420b,0.0
004a641f-e9e5-9fff-d421-1bf88319420b,0.0
004a641f-e9e5-9fff-d421-1bf88319420b,0.0
00f27248-1f4f-e174-432c-53bd2a9158df,1.0
00f27248-1f4f-e174-432c-53bd2a9158df,1.0
00f27248-1f4f-e174-432c-53bd2a9158df,1.0


### Create a serving endpoint for our model
By creating a serving endpoint for an ML model, you can seamlessly integrate machine learning predictions into external applications or agents.  This method makes the model available (via the endpoint) outside your local notebook scope and is useful for near real-time or on-demand inference.

Once again, Databricks spins up a dedicated environment for the model so that it runs in the same environment in which it was built. This environment remains up and running, waiting for inference requests. To help save cost, the endpoint created below enables scale to zero, so the allocated compute is released after a period of inactivity.

In [0]:
MODEL_SERVING_ENDPOINT_NAME

'e2eai_iot_turbine_prediction_endpoint'

In [0]:
# Delete the endpoint if it already exists
client = get_deploy_client("databricks")

for each in client.list_endpoints():
    if each['name'] == MODEL_SERVING_ENDPOINT_NAME:
        client.delete_endpoint(MODEL_SERVING_ENDPOINT_NAME)


In [0]:
# Endpoint creation spins up a container that will run the model for inference. This can take 15+ minutes to complete.

# The endpoint will be available via API or SDK outside this notebook context.

import time  # imported explicitly; the polling loop below calls time.sleep

client = get_deploy_client("databricks")

try:
    endpoint = client.create_endpoint(
        name=MODEL_SERVING_ENDPOINT_NAME,
        config={
            "served_entities": [
                {
                    "name": "iot-maintenance-serving-endpoint",
                    "entity_name": f"{catalog}.{db}.{model_name}",
                    "entity_version": get_last_model_version(f"{catalog}.{db}.{model_name}"),
                    "workload_size": "Small",
                    "scale_to_zero_enabled": True
                }
            ]
        }
    )
except Exception as e:
    if "already exists" in str(e):
        print(f"Endpoint {catalog}.{db}.{MODEL_SERVING_ENDPOINT_NAME} already exists. Skipping creation.")
    else:
        raise

# Poll every 10 seconds until the configuration update is no longer in progress
while client.get_endpoint(MODEL_SERVING_ENDPOINT_NAME)['state']['config_update'] == 'IN_PROGRESS':
    time.sleep(10)

if client.get_endpoint(MODEL_SERVING_ENDPOINT_NAME)['state']['ready'] != 'READY':
    print(f"Endpoint {catalog}.{db}.{MODEL_SERVING_ENDPOINT_NAME} creation failed.")
else:
    print(f"Endpoint {catalog}.{db}.{MODEL_SERVING_ENDPOINT_NAME} created successfully.")

Endpoint main.e2eai_iot_turbine.e2eai_iot_turbine_prediction_endpoint created successfully.
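
The polling loop in the cell above has no upper bound, so a stuck deployment would keep the notebook waiting indefinitely. One possible variant adds a timeout. This is only a sketch: `wait_for_endpoint` is a hypothetical helper, the 30-minute default is an arbitrary choice, and the fake responses at the bottom are invented for illustration (they mirror the `state` keys read in the cell above).

```python
import time

def wait_for_endpoint(get_endpoint, name, timeout_s=1800, poll_s=10, sleep=time.sleep):
    """Poll until the config update leaves IN_PROGRESS or the timeout elapses.

    Returns True if the endpoint then reports READY, False otherwise.
    Raises TimeoutError if the update is still in progress at the deadline.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_endpoint(name)["state"]
        if state.get("config_update") != "IN_PROGRESS":
            return state.get("ready") == "READY"
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Endpoint {name} still updating after {timeout_s}s")
        sleep(poll_s)

# Illustration with canned responses instead of a real deployment client.
fake_states = iter([
    {"state": {"config_update": "IN_PROGRESS", "ready": "NOT_READY"}},
    {"state": {"config_update": "NOT_UPDATING", "ready": "READY"}},
])
ready = wait_for_endpoint(lambda name: next(fake_states), "demo-endpoint",
                          sleep=lambda seconds: None)
print(ready)
```

In the notebook, the call would presumably look like `wait_for_endpoint(client.get_endpoint, MODEL_SERVING_ENDPOINT_NAME)`. Passing `sleep` as a parameter keeps the helper easy to exercise without real waiting.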


### Test our serving endpoint

Python API method

In [0]:
# Get the API endpoint and token for the current notebook context
API_ROOT = f"https://{dbutils.notebook.entry_point.getDbutils().notebook().getContext().browserHostName().value()}"
API_TOKEN = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().getOrElse(None)

In [0]:
def create_tf_serving_json(data):
    """
    Prepares input data in the JSON format expected by TensorFlow Serving REST APIs. 
    It checks if the input data is a dictionary (such as a pandas DataFrame's .to_dict() output). If so, it converts each value (typically a numpy array or pandas Series) to a list and nests them under the "inputs" key. 
    If data is not a dictionary (e.g., a numpy array), it simply converts it to a list and nests it under "inputs".

    This ensures that the data structure matches what TensorFlow Serving expects for inference requests.
    """
    
    return {'inputs': {name: data[name].tolist() for name in data.keys()} if isinstance(data, dict) else data.tolist()}

def score_model(dataset):

    url = f'{API_ROOT}/serving-endpoints/{MODEL_SERVING_ENDPOINT_NAME}/invocations'

    headers = {'Authorization': f'Bearer {API_TOKEN}', 
               'Content-Type': 'application/json'}


    ds_dict = {'dataframe_split': dataset.to_dict(orient='split')} if isinstance(dataset, pd.DataFrame) else create_tf_serving_json(dataset)

    data_json = json.dumps(ds_dict, allow_nan=True)

    response = requests.request(method='POST', headers=headers, url=url, data=data_json)
    
    if response.status_code != 200:
        raise Exception(f'Request failed with status {response.status_code}, {response.text}')
    return response.json()
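
To make the request body concrete, the sketch below builds the same `dataframe_split` shape that `DataFrame.to_dict(orient='split')` produces (`index`, `columns`, `data`) using only the standard library. `to_dataframe_split` is a hypothetical helper and the two rows are abbreviated sample values. It also replaces NaN with `None`, because `json.dumps(..., allow_nan=True)` writes a bare `NaN` token, which is not valid JSON. Whether the endpoint accepts `null` for a required input is an assumption to verify against your model.

```python
import json
import math

def to_dataframe_split(columns, rows):
    """Build a 'dataframe_split' request body, replacing NaN values with None."""
    clean = [
        [None if isinstance(value, float) and math.isnan(value) else value
         for value in row]
        for row in rows
    ]
    return {
        "dataframe_split": {
            "index": list(range(len(clean))),  # mirrors a default pandas RangeIndex
            "columns": list(columns),
            "data": clean,
        }
    }

payload = to_dataframe_split(
    ["avg_energy", "std_sensor_A"],
    [[0.188979, 0.964465], [float("nan"), 1.068186]],
)

# allow_nan=False makes json.dumps raise if a NaN slipped through the cleaning step.
body = json.dumps(payload, allow_nan=False)
print(body)
```

Serializing with `allow_nan=False` is a deliberate choice here: a malformed payload fails in the notebook rather than at the endpoint.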

In [0]:
columns = ['avg_energy', 'std_sensor_A', 'std_sensor_B', 'std_sensor_C', 'std_sensor_D', 'std_sensor_E', 'std_sensor_F']

# Get 5 rows to test with (limit in Spark so only those rows are collected to the driver)
dataset = spark.table('turbine_hourly_features').select(*columns).limit(5).toPandas()

dataset

Unnamed: 0,avg_energy,std_sensor_A,std_sensor_B,std_sensor_C,std_sensor_D,std_sensor_E,std_sensor_F
0,0.188979,0.964465,2.655839,3.452811,2.485159,2.288403,4.702139
1,0.192123,1.068186,2.384818,3.303412,2.172251,2.342593,4.870875
2,0.173563,1.142089,2.062709,3.01933,2.339552,2.730698,4.237197
3,0.103434,1.049873,2.219217,3.246726,2.320467,2.6627,4.289405
4,0.154812,1.032555,2.142102,2.729842,2.359749,2.761466,4.588789


In [0]:
# Use our function to call the API of our model and get inferences live!
score_model(dataset)

{'predictions': [0, 0, 0, 0, 0]}
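
The response carries one prediction per input row, in order. A small sketch of pairing them back up is shown here, assuming the response has the `predictions` key seen above; `attach_predictions` is a hypothetical helper, and the rows are abbreviated to a single feature for readability.

```python
def attach_predictions(rows, response, key="prediction"):
    """Pair each input row (a dict) with its prediction, checking that counts match."""
    predictions = response["predictions"]
    if len(predictions) != len(rows):
        raise ValueError(f"Got {len(predictions)} predictions for {len(rows)} rows")
    return [{**row, key: pred} for row, pred in zip(rows, predictions)]

# Abbreviated rows taken from the test dataset above, with a matching response.
rows = [{"avg_energy": 0.188979}, {"avg_energy": 0.192123}]
response = {"predictions": [0, 0]}

for scored in attach_predictions(rows, response):
    print(scored)
```

The length check guards against silently misaligned results if the endpoint ever returns fewer predictions than rows sent.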

Using ai_query

---
We can test the endpoint in SQL using Databricks' built-in function [ai_query()](https://docs.databricks.com/aws/en/sql/language-manual/functions/ai_query).
This function lets you apply batch predictions directly from SQL. However, it is not yet available in the Free Edition. We show how to use it anyway.


In [0]:
%sql
-- Query our endpoint name using ai_query
-- ai_query is a powerful way to apply any ML or AI endpoint to a large dataset
SELECT ai_query('e2eai_iot_turbine_prediction_endpoint',
STRUCT(CAST(avg_energy AS DOUBLE) AS avg_energy, 
      CAST(std_sensor_A AS DOUBLE) AS std_sensor_A, 
      CAST(std_sensor_B AS DOUBLE) AS std_sensor_B, 
      CAST(std_sensor_C AS DOUBLE) AS std_sensor_C, 
      CAST(std_sensor_D AS DOUBLE) AS std_sensor_D, 
      CAST(std_sensor_E AS DOUBLE) AS std_sensor_E,
      CAST(std_sensor_F AS DOUBLE) AS std_sensor_F), 
returnType => 'FLOAT') AS prediction
FROM turbine_hourly_features
LIMIT 3

---------------------------------------------------------------------------
SparkException                            Traceback (most recent call last)
File <command-4610739455949034>, line 1
----> 1 get_ipython().run_cell_magic('sql', '', "-- Query our endpoint name using ai_query\n-- ai_query is a powerful way to apply any ML or AI endpoint to a large dataset\nSELECT ai_query('e2eai_iot_turbine_prediction_endpoint',\nSTRUCT(CAST(avg_energy AS DOUBLE) AS avg_energy, \n      CAST(std_sensor_A AS DOUBLE) AS std_sensor_A, \n      CAST(std_sensor_B AS DOUBLE) AS std_sensor_B,

In [0]:
%sql
-- Query our endpoint name using ai_query with values we provide (not from a table)
SELECT ai_query('e2eai_iot_turbine_prediction_endpoint',
  STRUCT(0.1889 AS avg_energy, 
        0.9644 AS std_sensor_A, 
        2.6558 AS std_sensor_B, 
        3.4528 AS std_sensor_C, 
        2.4851 AS std_sensor_D, 
        2.2884 AS std_sensor_E, 
        4.7021 AS std_sensor_F),
  returnType => 'FLOAT') AS prediction

---------------------------------------------------------------------------
SparkException                            Traceback (most recent call last)
File <command-4610739455949035>, line 1
----> 1 get_ipython().run_cell_magic('sql', '', "-- Query our endpoint name using ai_query with values we provide (not from a table)\nSELECT ai_query('e2eai_iot_turbine_prediction_endpoint',\n  STRUCT(0.1889 AS avg_energy, \n        0.9644 AS std_sensor_A, \n        2.6558 AS std_sensor_B, \n        3.4528 AS std_sensor_C, \n        2.4851 AS std_

Try viewing your endpoint in the UI by going to Serving, then clicking on your endpoint name. From this page, you can view usage statistics for your endpoint or click `Use` to see code examples of how to call your endpoint in several ways.